Shared Data

In addition to the sources listed below, be sure to look at the information about data formats.

Links to data and other network resources (Cambridge Networks Network)

Large network data sets maintained by Jure Leskovec

Million Song Data Set at Columbia University

Papers-Patents-Awards data sets hosted by SDB at Indiana University

Large-scale whole-brain structural connectivity (connectome) data derived from diffusion MRI (in Connectome File Format / GraphML)

Major League Baseball Batter-Pitcher Matchups (as MATLAB files)

GitHub data from Jeremy Corbett: tab-separated format, Pajek format

Facebook social graph, weighted random walks, and applications: datasets collected and hosted at the University of California, Irvine (explicit email request required)

Datasets used in Statistical Analysis of Network Data by Eric Kolaczyk

Radar measurements of the Internet topology

Measurements of P2P activity

Data from Renaud Lambiotte

The University of Florida Sparse Matrix Collection (from Tim Davis)

Voteview website (contains Congressional roll call data; maintained by Keith Poole)

Voteworld website (contains voting data from several countries)

Charles Stewart's Congressional Data Page

Committee Assignments in the U.S. House of Representatives (as MATLAB files)

Processed roll call voting matrices (102nd-107th Houses) (as MATLAB files)

NCAA Division I-A Football

CiteULike data

Public Data from Amazon (US census data, etc.)

Brain connectivity matrices (also contains .m files for quite a few network tools)

Santo Fortunato's community detection benchmarks (from this paper)

Dataverse (maintained by Harvard University's Institute for Quantitative Social Science)

Data on international trade between 1962 and 2000 (from Cesar Hidalgo)

Data posted by László Barabási's group (University of Notre Dame)

Data posted by the Uri Alon group (Weizmann Institute)

Data posted by Jure Leskovec (Stanford University starting fall 2009)

http://www.casos.cs.cmu.edu/computational_tools/data2.php

Data posted by Duncan Watts' Collective Dynamics group (Columbia University) (as raw edgelist)

Pajek datasets (in Pajek formats)

CAIDA AS datasets

Data posted by Mark Newman (in GML format)

http://data.un.org/

Free and large Irish social network data (there's a competition with this too)

Data posted by the School of Library and Information Science at Indiana University

Network Workbench

The Empirical Networks Project

http://www.insna.org/software/data.html and http://www.insna.org/software/public_data.html