Dataset
DBLP Co-Authorship
Network Type
Overview
The DBLP co-authorship network is a hypergraph in which nodes are authors from the DBLP computer science bibliography database and each hyperedge is a paper, that is, the set of authors who collaborated on that publication. Each author is labeled with a primary research area, inferred as the majority research area of their publications. The six research areas are Database, Data Mining, AI, Information Retrieval, Computer Vision, and Machine Learning. This dataset is commonly used as a benchmark for hypergraph learning tasks, including author classification and link prediction.
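As a rough illustration of this structure, the sketch below stores a hypergraph as a list of author sets and derives both kinds of degree from it. The author names and labels are made-up toy data, not records from the dataset. The helper names `node_degrees` and `hyperedge_degrees` are hypothetical. The code assumes that node degree means the number of papers an author appears in and that hyperedge degree means the number of authors on a paper.

```python
from collections import Counter

# Toy data (hypothetical): each hyperedge is one paper, stored as the set of its authors.
hyperedges = [
    {"alice", "bob"},
    {"alice", "carol", "dave"},
    {"bob", "carol"},
]

# Toy node labels (hypothetical): one research area per author.
labels = {
    "alice": "Database",
    "bob": "Data Mining",
    "carol": "Database",
    "dave": "AI",
}


def node_degrees(edges):
    """Count how many hyperedges (papers) each node (author) belongs to."""
    return Counter(author for edge in edges for author in edge)


def hyperedge_degrees(edges):
    """Return the size (number of authors) of each hyperedge, in input order."""
    return [len(edge) for edge in edges]


deg = node_degrees(hyperedges)
sizes = hyperedge_degrees(hyperedges)
```

Keeping each paper as a set makes the hyperedge degree a plain `len` call and keeps an author from being counted twice within one paper.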
Statistics
Nodes
Nodes
22,363
Node Type
Author
Node Label
Research Area
Node Degree
Min: 1
Q1: 2
Median: 3
Q3: 4
Max: 197
Node Label Distribution
6 unique labels · imbalance degree: 2.27
Hyperedges
Hyperedges
32,304
Hyperedge Type
Publication
Hyperedge Degree
Min: 2
Q1: 2
Median: 2
Q3: 3
Max: 18
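The degree summaries above are five-number summaries (minimum, first quartile, median, third quartile, maximum). The following sketch shows one way to compute such a summary with the Python standard library. The function name `five_number_summary` is hypothetical and the input is toy data. The dataset card does not state which quantile method was used, so the `"inclusive"` method here is an assumption and other methods can give slightly different quartiles.

```python
import statistics


def five_number_summary(degrees):
    """Return (min, Q1, median, Q3, max) for a list with at least two values."""
    data = sorted(degrees)
    # Assumption: "inclusive" quartiles; the dataset's own method is not documented here.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


# Toy degree sequence (hypothetical), not taken from the dataset.
toy_degrees = [1, 2, 3, 4, 5]
summary = five_number_summary(toy_degrees)
```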
Changelog
Revision 2
- Updated the format version to 0.3.
- Dropped papers with only one author.
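A minimal sketch of the single-author filter described in Revision 2 is shown below, using toy data. The function name `drop_single_author_papers` is hypothetical. Recomputing the node set from the remaining papers, so that authors left without any paper disappear, is an assumption. It is consistent with the reported minimum node degree of 1, but the changelog does not confirm it.

```python
def drop_single_author_papers(edges):
    """Keep hyperedges with at least two authors and return them with the remaining nodes."""
    kept = [edge for edge in edges if len(edge) >= 2]
    # Assumption: authors who only had single-author papers are dropped as well.
    remaining_nodes = set().union(*kept)
    return kept, remaining_nodes


# Toy input (hypothetical): two single-author papers and one two-author paper.
papers = [{"alice"}, {"alice", "bob"}, {"carol"}]
kept, remaining_nodes = drop_single_author_papers(papers)
```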