Semantic Scholar Co-Authorship Sample

social, undirected, edge labels

Constructed from a subsample of the Semantic Scholar Open Research Corpus, this dataset is a simplicial complex in which nodes are authors and each simplex is a set of authors who co-authored a paper. The value on each simplex is the total citation count of all papers whose author list includes every author in that simplex.
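The rule can be written compactly as follows. The notation is introduced here for illustration and is not part of the dataset documentation: P is assumed to be the set of papers in the subsample, A(p) the author set of paper p, and c(p) its citation count.

```latex
v(\sigma) \;=\; \sum_{\substack{p \in P \\ \sigma \subseteq A(p)}} c(p)
```

A paper therefore contributes its citations to its full author set and to every non-empty subset of it.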

Example

Paper     Authors    Citations
Paper 1   A, B, C    100
Paper 2   A, B       50
Paper 3   A, D       10
Paper 4   C, D       4
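Figure: the simplicial complex built from these papers, with every simplex labelled by its value. Node values: A = 160, B = 150, C = 104, D = 14.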

Note that edge (A, B) has value 150 because both Paper 1 (100 citations) and Paper 2 (50 citations) were co-authored by at least A and B. Similarly, node C has value 104 because C co-authored Paper 1 (100 citations) and Paper 4 (4 citations).
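The values in this example can be reproduced with a short script. This is a minimal sketch, not the procedure used to build the dataset: the function name simplex_values and the (authors, citations) input format are assumptions made for illustration. It enumerates every non-empty subset of each paper's author set and adds that paper's citations to it.

```python
from collections import defaultdict
from itertools import combinations


def simplex_values(papers):
    """Return {simplex: value}, where a simplex is a sorted tuple of authors.

    `papers` is an iterable of (authors, citations) pairs (assumed format).
    """
    values = defaultdict(int)
    for authors, citations in papers:
        authors = sorted(set(authors))
        # Each paper contributes to every non-empty subset of its authors.
        for size in range(1, len(authors) + 1):
            for simplex in combinations(authors, size):
                values[simplex] += citations
    return dict(values)


# The four papers from the example table.
papers = [
    (["A", "B", "C"], 100),
    (["A", "B"], 50),
    (["A", "D"], 10),
    (["C", "D"], 4),
]

values = simplex_values(papers)
print(values[("A", "B")])  # 150
print(values[("C",)])      # 104
```

Enumerating all subsets grows exponentially with the number of authors per paper. That is acceptable here because the largest simplex reported for this dataset has 11 authors, but it would need rethinking for papers with very long author lists.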

Dataset Statistics

Nodes: 352
Node Type: Author
Node Degree
  Min: 1
  Q1: 63
  Median: 127
  Q3: 511
  Max: 3,787
Simplices: 24,200
Simplex Type: Co-Authorship
Maximal Simplex Size
  Min: 2
  Q1: 6
  Median: 7
  Q3: 9
  Max: 11