Enron Graph Corpus
The Enron Graph Corpus was created from the Enron Email Corpus using the Ontea information extraction tool. The graph consists of nodes representing emails, interconnected with real-world entities such as people, telephone numbers, addresses, or companies. It also contains a lot of noisy data.
See the publication for more details:
- Michal Laclavík, Marek Ciglan, Štefan Dlugolinský, Martin Šeleng, Ladislav Hluchý: Emails as Graph: Relation Discovery in Email Archive. In Email2012 workshop, WWW 2012, April 16–20, 2012, Lyon, France, pages 841–846, 2012. [Slides]
The corpus can be used to experiment with relation search in a graph, in a way similar to the experiments described in the publication above.
See also the video and demo of our Email Social Network Search on a fraction of the email corpus.
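To give a rough feel for what a relation search over such a graph can look like, here is a minimal sketch that finds one shortest chain of emails and entities connecting two nodes. It uses plain breadth-first search as a stand-in and is not the method from the publication. The toy edges, the `type:value` node labels, and the function names `build_adjacency` and `shortest_relation` are all hypothetical and do not reflect the actual corpus file format.

```python
from collections import deque

# Hypothetical toy graph: email nodes linked to entities found in them.
# The node names and "type:value" labels are illustrative assumptions,
# not the format of the downloadable corpus.
EDGES = [
    ("email:1", "person:Alice"),
    ("email:1", "phone:555-0100"),
    ("email:2", "person:Bob"),
    ("email:2", "phone:555-0100"),
    ("email:3", "person:Bob"),
    ("email:3", "company:Acme"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def shortest_relation(adjacency, source, target):
    """Return one shortest path from source to target, or None if unconnected."""
    if source not in adjacency or target not in adjacency:
        return None
    parents = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(adjacency[node]):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


if __name__ == "__main__":
    graph = build_adjacency(EDGES)
    print(shortest_relation(graph, "person:Alice", "company:Acme"))
```

In this toy data, Alice and Acme are related only through a shared phone number and through Bob, so the returned path alternates between email nodes and entity nodes. On the full corpus, a search of this kind would likely need some filtering of noisy entity nodes.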
The size of the corpus is:
- Number of Nodes: 8,269,278
- Number of Edges: 20,383,709
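These two figures allow a quick back-of-the-envelope check before downloading. The sketch below computes the average node degree and the memory needed for a bare edge list. It assumes that edges are undirected and that node IDs fit in 4-byte integers; neither assumption is stated by the corpus description, and the function names are illustrative.

```python
# Corpus size figures as listed above.
NUM_NODES = 8_269_278
NUM_EDGES = 20_383_709


def average_degree(num_nodes: int, num_edges: int) -> float:
    """Mean number of edge endpoints per node, assuming undirected edges."""
    return 2 * num_edges / num_nodes


def edge_list_bytes(num_edges: int, bytes_per_id: int = 4) -> int:
    """Bytes for a plain edge list that stores two integer node IDs per edge."""
    return 2 * num_edges * bytes_per_id


if __name__ == "__main__":
    print(f"average degree: {average_degree(NUM_NODES, NUM_EDGES):.2f}")
    print(f"edge list size: {edge_list_bytes(NUM_EDGES) / 2**20:.1f} MiB")
```

Under these assumptions the graph is sparse, with roughly five edge endpoints per node, and the bare edge list fits comfortably in memory on a typical machine. Node labels and any per-node attributes would add to that footprint.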
It can be downloaded here: enron.graph.zip