NYTNews dataset

Dataset Information

The New York Times News corpus contains all articles published in the New York Times over 7.5 years (Jan 2000–July 2007) and is available from the LDC catalog as LDC2008T19 (https://catalog.ldc.upenn.edu/LDC2008T19). The named entities (people, places, organizations) are hand-annotated by editors. We construct weekly temporal graphs (390 time points) in which each node corresponds to a named entity and each edge depicts a co-mention relation in the articles. The data contains around 320,000 entities but no ground-truth events.
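
The weekly graph construction could be sketched roughly as follows. This is a minimal illustration, not the code used to build the dataset: the input format (a list of publication dates paired with entity lists), the function names `week_index` and `build_weekly_graphs`, the choice of 1 January 2000 as the start of the first weekly bin, and the use of co-mention counts as edge weights are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from datetime import date
from itertools import combinations

# Assumed start of the first weekly bin; the dataset's actual binning may differ.
START = date(2000, 1, 1)


def week_index(d, start=START):
    """Return the zero-based index of the 7-day bin that contains date d."""
    return (d - start).days // 7


def build_weekly_graphs(articles, start=START):
    """Build one co-mention graph per week.

    articles: iterable of (publication_date, list_of_entity_names) pairs
              (a hypothetical input format).
    Returns a dict mapping week index -> Counter that maps an undirected
    entity pair (u, v), with u < v, to the number of articles in that week
    mentioning both entities.
    """
    graphs = defaultdict(Counter)
    for pub_date, entities in articles:
        week = week_index(pub_date, start)
        # Deduplicate and sort so each unordered pair has one canonical key.
        for u, v in combinations(sorted(set(entities)), 2):
            graphs[week][(u, v)] += 1
    return graphs


# Toy input with made-up entity names, for illustration only.
articles = [
    (date(2000, 1, 3), ["Alice Example", "Example Corp", "Springfield"]),
    (date(2000, 1, 5), ["Example Corp", "Alice Example"]),
    (date(2000, 1, 10), ["Springfield", "Example Corp"]),
]
graphs = build_weekly_graphs(articles)
```

In this toy input the first two articles fall in week 0 and the third in week 1, so the pair ("Alice Example", "Example Corp") gets weight 2 in the week 0 graph. Dropping the counts and keeping only the set of pairs gives an unweighted variant.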