The Semantic API

The Semantic API complements the Articles API. With the Semantic API, you get access to the long list of people, places, organizations and other locations, entities and descriptors that make up the controlled vocabulary used as metadata by The New York Times (sometimes referred to as Times Tags and used for Times Topics pages).

The Semantic API uses concepts which are, by definition, terms in The New York Times controlled vocabulary. Like the way facets are used in the Articles API, concepts are a good way to uncover articles of interest in The New York Times archive, and at the same time, limit the scope and number of those articles. The Semantic API maps to external semantic data resources, in a fashion consistent with the idea of linked data. The Semantic API also provides combination and relationship information to other, similar concepts in The New York Times controlled vocabulary.

Note: In URI examples and field names, italics indicate placeholders for variables or values. Parentheses ( ) indicate optional items. Square brackets [ ] are not a convention — when URIs include brackets, interpret them literally.

The Semantic API at a Glance

Base URI

http://api.nytimes.com/svc/semantic/v2/concept

Scope

The New York Times controlled vocabulary (over 10,000 people, places, organizations and descriptors used as metadata to classify New York Times articles) and New York Times articles from 1981 to today (excludes wire services such as the Associated Press).

Getting Started

Key Concepts

URI: By linked data reference URI, something The New York Times has for many (and soon all) concepts in its controlled vocabulary, listed at http://data.nytimes.com

Article: By New York Times article

Search: By querying the complete list of New York Times concepts using the Article Search API's search functionality.

The first and simplest way to refer to a concept is by type. There are four concept types: people (nytd_per), places (nytd_geo), organizations (nytd_org) and descriptors (nytd_des). The concept_type is a mandatory parameter for accessing the Semantic API with a concept type request.

As part of its work on linked data, New York Times R&D created http://data.nytimes.com, a site where we publish the vocabulary used to index concepts. The site assigns a unique URI to each concept, in the form http://data.nytimes.com/concept_uri. This "concept_uri" is a mandatory parameter for accessing the Semantic API with a linked data reference.
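As a rough illustration of the URI form described above, the sketch below takes the final path segment of a data.nytimes.com reference URI as the concept_uri. The function name and the ID in the usage line are made up for illustration and do not refer to a real concept.

```python
from urllib.parse import urlsplit


def concept_uri_from_reference(reference_url: str) -> str:
    """Return the last path segment of a data.nytimes.com reference URI."""
    path = urlsplit(reference_url).path.rstrip("/")
    last_segment = path.rsplit("/", 1)[-1]
    if not last_segment:
        raise ValueError(f"no concept ID found in {reference_url!r}")
    return last_segment


# Hypothetical reference URI; the ID is a placeholder, not a real concept.
example_id = concept_uri_from_reference("http://data.nytimes.com/1234567890")
print(example_id)
```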

The final way to obtain concept information is to check what concepts are associated with a given article in The New York Times. The Semantic API takes advantage of the article's unique nytimes.com address to format an articleURI parameter. The articleURI is the part of the article's Web address that comes after "http://www.nytimes.com" minus its ".html" suffix. The articleURI is a mandatory parameter for accessing the Semantic API with an article.
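The derivation of the articleURI can be sketched in a few lines of Python. This is a minimal sketch, not an official client: the function name and the sample address are hypothetical, and dropping any query string from the address is an assumption. The leading slash is kept because the definition says "the part that comes after" the host.

```python
from urllib.parse import urlsplit

HTML_SUFFIX = ".html"


def article_uri_from_url(article_url: str) -> str:
    """Return the path after http://www.nytimes.com, without the .html suffix."""
    parts = urlsplit(article_url)
    if parts.netloc != "www.nytimes.com":
        raise ValueError(f"not a www.nytimes.com address: {article_url!r}")
    path = parts.path  # the query string, if any, is dropped (assumption)
    if path.endswith(HTML_SUFFIX):
        path = path[: -len(HTML_SUFFIX)]
    return path


# Hypothetical article address used only for illustration.
sample_uri = article_uri_from_url(
    "http://www.nytimes.com/2011/09/27/science/example-article.html"
)
print(sample_uri)  # /2011/09/27/science/example-article
```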

In addition to the mandatory parameters corresponding to the three ways of accessing the Semantic API, optional parameters can also be applied. The "fields" parameter can be given the value "all" to provide a global request that includes all optional parameters.

Optional Parameters

Optional fields are returned in result_set. They are briefly explained here:

pages: A list of topic pages associated with a specific concept.

ticker_symbol: If this concept is a publicly traded company, this field contains the ticker symbol.

links: A list of links from this concept to external data resources.

taxonomy: For descriptor concepts, this field returns a list of taxonomic relations to other concepts.

combinations: For descriptor concepts, this field returns a list of the specific meanings this concept takes on when combined with other concepts.

geocodes: For geographic concepts, the full GIS record from geonames.

article_list: A list of up to 10 articles associated with this concept.

scope_notes: Scope notes contain clarifications and meaning definitions that explicate the relationship between the concept and an article.

search_api_query: Returns the request one would need to submit to the Article Search API to obtain a list of articles annotated with this concept.

query

Precedes the search term string. Used in a Search Query. In addition to query=<query_term>, a Search Query accepts the required parameters of the other request types (<concept_type>, <concept_uri>, <article_uri>) as optional parameters. The exception is <specific_concept_name>, which a Search Query does not take.

offset

Integer value for the index count from the first concept to the last concept, sorted alphabetically. Used in a Search Query. A Search Query will return up to 10 concepts in its results.

http://api.nytimes.com/svc/semantic/v2/concept/name/nytd_des/Baseball.xml?fields=all&api-key=your-API-key
http://api.nytimes.com/svc/semantic/v2/concept/name/nytd_per/Obama, Barack.json?fields=pages,links,scope_notes&api-key=your-API-key (you need the space after the comma separating last name and first name)
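Both example URIs follow the same pattern: the base URI, then /name/, the concept type, the concept name, a format extension, and the fields and api-key parameters. The sketch below assembles such a URI. The function name is hypothetical, and percent-encoding the space in the concept name is an assumption (the examples show a literal space).

```python
from urllib.parse import quote

BASE_URI = "http://api.nytimes.com/svc/semantic/v2/concept"
CONCEPT_TYPES = {"nytd_des", "nytd_geo", "nytd_per", "nytd_org"}
FORMATS = {"xml", "json"}


def concept_name_url(concept_type, concept_name, api_key,
                     fields=("all",), response_format="json"):
    """Build a request URI for a concept looked up by type and name."""
    if concept_type not in CONCEPT_TYPES:
        raise ValueError(f"unknown concept type: {concept_type!r}")
    if response_format not in FORMATS:
        raise ValueError(f"unsupported format: {response_format!r}")
    # Keep the comma literal; encode the space as %20 (assumption).
    encoded_name = quote(concept_name, safe=",")
    field_list = ",".join(fields)
    return (f"{BASE_URI}/name/{concept_type}/{encoded_name}.{response_format}"
            f"?fields={field_list}&api-key={api_key}")


url = concept_name_url("nytd_per", "Obama, Barack", "your-API-key",
                       fields=("pages", "links", "scope_notes"))
print(url)
```

The space after the comma in "Obama, Barack" is preserved (in encoded form) because the concept name requires it.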

Constructing a Semantic API Request by Linked Data Reference

<concept_uri> is the numerical ID that comes at the end of the URI for concept reference pages at http://data.nytimes.com. For example:

Constructing a Semantic API Request by Article

The <article_uri> refers to the part of the article's Web address that comes after "http://www.nytimes.com" minus its ".html" suffix. The article_uri is a mandatory parameter for accessing the Semantic API with an article. An example of the article_uri:

The optional parameters for a Search Query are <concept_type>, <concept_uri>, <article_uri>. These optional parameters are derived from the other types of Semantic API requests and are explained in the instructions for constructing those requests (above).

An example of a Search Query: it asks for all of the concepts of type 'nytd_per' that contain the substring "Evan" and returns the result formatted in XML.

Responses

Format and Result Sets

Currently, responses are in XML and JSON.

Top-level result_set fields are num_results, fields and results; results-level fields are concept_name, concept_type, concept_uri, concept_status, is_times_tag, first_use, last_use, use_count, searchApiQuery and a few more fields that have sub-fields: pages, tickerSymbol, links, taxonomy, combinations, scope_note, geocode and articleList. All of these fields are described in the Data Fields table.

The article_list field of the Semantic API returns as many as 10 article records.

A Search Query returns as many as 10 concept records.
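Because a Search Query returns at most 10 concepts per request, collecting a full result list means repeating the request with a growing offset until more_results is false. The loop below is a sketch under stated assumptions: fetch_page is a hypothetical callable that performs the request for a given offset and returns the parsed result set as a dict with "results" and "more_results" keys, and the offset is assumed to advance by 10 per page.

```python
from typing import Callable, Iterator

PAGE_SIZE = 10  # a Search Query returns up to 10 concepts


def iter_search_results(fetch_page: Callable[[int], dict],
                        max_pages: int = 50) -> Iterator[dict]:
    """Yield concept records page by page until more_results is false."""
    offset = 0
    for _ in range(max_pages):  # hard cap guards against endless loops
        result_set = fetch_page(offset)
        yield from result_set.get("results", [])
        if not result_set.get("more_results"):
            break
        offset += PAGE_SIZE  # assumed step: one full page


# Stand-in for a real request: serves 25 made-up concepts in pages of 10.
def fake_fetch(offset: int) -> dict:
    all_names = [f"Concept {i}" for i in range(25)]
    page = all_names[offset:offset + PAGE_SIZE]
    return {
        "results": [{"concept_name": name} for name in page],
        "more_results": offset + PAGE_SIZE < len(all_names),
    }


names = [c["concept_name"] for c in iter_search_results(fake_fetch)]
print(len(names))  # 25
```

Passing the fetch function in as an argument keeps the paging logic independent of how the request itself is made.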

Data Fields

This section summarizes the available search result fields. To control which fields are returned for each search result, use the optional fields parameter. For details, see Requests.

| Name | Data Type | Parent | Description |
| --- | --- | --- | --- |
| num_results | Integer | result_set | The total number of results returned by the search API. |
| more_results | Boolean | result_set | For the search request, this field is returned instead of num_results. It indicates whether the offset can be incremented further to return additional results. |
| fields | Array (Strings) | result_set | The list of user-requested fields. |
| results | Array (Objects) | result_set | The results of the API request. |
| concept_name | String | results | The label of the concept. |
| concept_type | Array (Strings) | results | The type of the concept: one of nytd_des, nytd_geo, nytd_per, nytd_org. |
| concept_uri | String | results | The URI of the concept on http://data.nytimes.com. Please note that some of the URIs returned for this field will not properly resolve to http://data.nytimes.com. This issue is being addressed and will likely be resolved by the second quarter of 2012. |
| concept_status | String | results | Indicates whether the concept is currently being applied to new articles: 'Active' if the concept is being applied, 'Deleted' if not. |
| is_times_tag | Boolean | results | True if this concept is returned by the TimesTags API. |
| first_use | String | results | The day on which this concept was first used to annotate an article on nytimes.com. |
| last_use | String | results | The day on which this concept was most recently used to annotate an article on nytimes.com. |
| use_count | Integer | results | The number of articles annotated with this concept. |
| searchApiQuery | Array (Strings) | results | The request one would need to submit to the NYT Article Search API to obtain a list of articles annotated with this concept. |
| tickerSymbol | | results | A number of our organization indexing concepts refer to publicly traded companies. We know the stock symbol and exchange associated with most of these companies, and this element expresses that knowledge. |
| ticker_symbol | Strings | tickerSymbol | A stock ticker symbol. |
| ticker_exchange | Strings | tickerSymbol | The exchange on which the stock is traded. |
| links | Array (Arrays) | results | The primary purpose of our Linked Data activity has been to link our internal indexing concepts with external data resources. This work has yielded a number of such links and the <links> list enumerates these links. |
| link | Array (Strings) | links | A container for links. |
| relation | String | link | The relation of this concept to the link (e.g. sameAs, broader, narrower). |
| link | String | link | The name of the linked item. Either a string or a URI. |
| link_type | String | link | The type of the link (e.g. Wikipedia, Freebase, DBPedia). |
| link_mapping_type | String | link | Indicates whether a person or a machine created this link. Either “manual” or “automatic”. |
| taxonomy | Array (Arrays) | results | As part of our linked data effort, we have created a small taxonomy for our descriptor indexing concepts. For example, we have declared that the concept “Anatomy and Physiology” is /narrower than/ the concept “Science”. The <taxonomy> list enumerates these taxonomic relations. |
| taxonomyItem | Array (Strings) | taxonomy | A taxonomic relation. |
| target_concept_name | String | taxonomyItem | The label for the concept. |
| target_concept_type | String | taxonomyItem | The type of the concept. |
| taxonomic_relations | String | taxonomyItem | The type of the relation. One of: NT, BT, RT, UF. |
| combinations | Array (Arrays) | results | Some of our indexing concepts are combined with other indexing concepts to indicate a specific meaning. For example, the concept “Cancer” and the concept “Lungs” can be combined to indicate that an article is about lung cancer. The <combinations> list enumerates all such combinations in which this concept participates. |
| combination | Array (Strings) | combinations | A combination container. |
| combination_target_concept_name | String | combination | The label for the concept. |
| combination_target_concept_type | String | combination | The type of the concept. |
| combination_note | String | combination | A brief plain-English note describing the combination. |
| scope_note | Array (Strings) | results | |
| scope_note_text | String | scope_note | The text of the scope note. |
| scope_note_type | String | scope_note | The type of the scope note. |
| geocodes | Array (Arrays) | results | |
| geocode | Array (Objects) | geocodes | The <geocode> element contains information about the geographic entity embodied by this concept. This information is drawn from the freely available geonames database. |
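
To show how the results-level fields might be consumed, the sketch below reads concept_name, concept_status and use_count from a parsed result set, keeps only concepts whose status is 'Active', and sorts them by use count. The class and function names are hypothetical, the sample data is invented, and the exact nesting of a real JSON response is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class ConceptSummary:
    name: str
    status: str
    use_count: int


def active_concepts(result_set: Dict[str, Any]) -> List[ConceptSummary]:
    """Return 'Active' concepts from a result set, most used first."""
    summaries = [
        ConceptSummary(
            name=record["concept_name"],
            status=record["concept_status"],
            use_count=int(record.get("use_count", 0)),
        )
        for record in result_set.get("results", [])
    ]
    active = [s for s in summaries if s.status == "Active"]
    return sorted(active, key=lambda s: s.use_count, reverse=True)


# Invented sample data; the counts and the "Old Tag" concept are not real.
sample = {
    "num_results": 3,
    "results": [
        {"concept_name": "Baseball", "concept_status": "Active", "use_count": 120},
        {"concept_name": "Old Tag", "concept_status": "Deleted", "use_count": 300},
        {"concept_name": "Softball", "concept_status": "Active", "use_count": 450},
    ],
}
print([c.name for c in active_concepts(sample)])  # ['Softball', 'Baseball']
```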