Dataset
DBLP Co-Authorship
Network Type
The DBLP co-authorship network is a hypergraph where nodes are authors from the DBLP computer science bibliography database and hyperedges are papers, representing sets of authors who collaborated on a publication. Each author is labeled with their primary research area, inferred from the majority research area of their publications. The research areas include Database, Data Mining, AI, Information Retrieval, Computer Vision, and Machine Learning. This dataset is commonly used as a benchmark in hypergraph learning tasks for author classification and link prediction.
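The structure described above can be sketched with plain Python sets. This is only an illustration under stated assumptions: the author IDs, the per-paper research areas, and the tie-breaking rule are hypothetical and do not come from the dataset or its loader.

```python
from collections import Counter

# Toy hypergraph: each hyperedge (paper) is a set of hypothetical author IDs.
papers = [
    {"a1", "a2"},
    {"a1", "a3", "a4"},
    {"a2", "a3"},
    {"a2", "a4"},
]
# Hypothetical research area of each paper, aligned with `papers`.
paper_areas = ["Database", "Data Mining", "Database", "Database"]


def majority_labels(papers, paper_areas):
    """Label each author with the most frequent area among their papers."""
    counts = {}
    for authors, area in zip(papers, paper_areas):
        for author in authors:
            counts.setdefault(author, Counter())[area] += 1
    # Assumption: ties are broken alphabetically by area name. The rule
    # used to build the actual dataset is not stated in this card.
    return {
        author: min(c.items(), key=lambda kv: (-kv[1], kv[0]))[0]
        for author, c in counts.items()
    }


labels = majority_labels(papers, paper_areas)
```

Here `a2` appears on three Database papers and is labeled Database, while `a3` has one paper in each area and falls back to the assumed tie-breaking rule.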
Dataset Statistics
Nodes
Nodes
22,363
Node Type
Author
Node Label
Research Area
Node Degree
Min: 1
Q1: 2
Median: 3
Q3: 4
Max: 197
Node Label Distribution
Hyperedges
Hyperedges
32,304
Hyperedge Type
Publication
Hyperedge Degree
Min: 2
Q1: 2
Median: 2
Q3: 3
Max: 18
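For readers who want to recompute summaries of this kind, here is a minimal sketch on a toy hypergraph. It assumes that node degree means the number of hyperedges an author belongs to and that hyperedge degree means the number of authors on a paper. The toy data and the quartile method are assumptions, so the numbers it produces are not the dataset's.

```python
import statistics
from collections import Counter

# Toy hyperedges: each paper is a set of hypothetical author IDs.
papers = [
    {"a1", "a2"},
    {"a1", "a3", "a4"},
    {"a2", "a3"},
]

# Hyperedge degree: number of authors on each paper.
hyperedge_degrees = [len(authors) for authors in papers]

# Node degree: number of papers each author appears on.
node_degree = Counter(author for authors in papers for author in authors)
node_degrees = list(node_degree.values())


def five_number_summary(values):
    """Return min, Q1, median, Q3, max for a list of degrees."""
    # Assumption: 'inclusive' quartiles; the card does not say which
    # quartile convention was used for the reported statistics.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": statistics.median(values),
        "q3": q3,
        "max": max(values),
    }


node_summary = five_number_summary(node_degrees)
edge_summary = five_number_summary(hyperedge_degrees)
```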
Changelog
Revision 2
- Update to format version 0.3.
- Drop papers with only one author.
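The single-author filter mentioned in the changelog could look roughly like the sketch below. The function name and the toy input are hypothetical, and this is not the maintainers' actual preprocessing code.

```python
def drop_single_author_papers(papers):
    """Keep only papers with at least two distinct authors."""
    return [authors for authors in papers if len(set(authors)) >= 2]


# Hypothetical raw input: author ID lists, one per paper.
raw_papers = [["a1"], ["a1", "a2"], ["a3", "a4", "a5"], ["a6"]]
kept = drop_single_author_papers(raw_papers)
```

After a filter like this, every remaining hyperedge has at least two nodes, which is consistent with the minimum hyperedge degree of 2 reported above.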