Article Search API Version 1

Note: As of June 1, 2014, version 1 of the Article Search API has been deprecated. Please use version 2.

With the Article Search API, you can search New York Times articles from 1981 to today, retrieving headlines, abstracts, lead paragraphs, links to associated multimedia and other article metadata. Along with standard keyword searching, the API also offers faceted searching. The available facets include Times-specific fields such as sections, taxonomic classifiers and controlled vocabulary terms (names of people, organizations and geographic locations). For details on keyword and faceted searching, see Constructing a Search Query.

Note: In URI examples and field names, italics indicate placeholders for variables or values. Parentheses ( ) indicate optional items. Square brackets [ ] are not a convention — when URIs include brackets, interpret them literally.

The Article Search API at a Glance

Base URI

http://api.nytimes.com/svc/search/v1/article

Scope

New York Times articles from 1981 to today (excludes wire services such as the Associated Press)
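A request is the base URI plus a query string. The following is a minimal Python sketch of assembling one, assuming the query and api-key parameter names used elsewhere on this page. The helper name build_request_url and the key value are placeholders, not part of the API.

```python
from urllib.parse import urlencode

BASE_URI = "http://api.nytimes.com/svc/search/v1/article"

def build_request_url(query, api_key, **params):
    """Assemble a request URL (illustrative helper, not part of the API)."""
    # "api-key" contains a hyphen, so it cannot be a Python keyword argument;
    # it is passed positionally and added to the parameter dict here.
    pairs = {"query": query, **params, "api-key": api_key}
    return BASE_URI + "?" + urlencode(pairs)

url = build_request_url("obama", "YOUR-API-KEY", rank="oldest")
print(url)
```

Optional parameters such as rank are passed as keyword arguments and land between query and api-key in the resulting URL.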

Getting Started

Key Concepts

To make the most of the Article Search API, you need to understand fields and facets.

A field is a piece of data. But the field type is as important as the data contained in the field. The type affects how you use the field: it may be a searchable field you can use to limit or expand your query, or it may just be useful information that's returned with your search results. When constructing a query, take the time to consider which fields you're searching against and which fields will be returned with your results. Be sure to review the table of Data Fields.

A facet is a special type of field. Facets may contain Times-specific data (such as descriptors assigned by Times indexers), or they may be aspects of the data (such as a date or a term mapped to an external database) that help you approach your search results from a different angle or perspective. For more about facets, be sure to read Faceted Searching and Working With Facets.

About This Document

You can get good results with a simple query. But to make the most of all the options and features of the API, be sure to read this entire page. Here's a recommended approach:

Optional Parameters

begin_date

YYYYMMDD

Sets the starting point (inclusive) of the range of publication dates to return. Must be used with end_date.

end_date

YYYYMMDD

Sets the end point (inclusive) of the range of publication dates to return. Must be used with begin_date.
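Because the two date parameters must be used together, it can help to build them as a pair. This is a small sketch; the function name date_range_params and the ordering check are assumptions added for illustration, not behavior documented for the API.

```python
from datetime import date

def date_range_params(begin, end):
    """Return begin_date and end_date together, formatted as YYYYMMDD."""
    # Client-side sanity check (an assumption, not a documented API rule).
    if begin > end:
        raise ValueError("begin date must not be later than end date")
    return {
        "begin_date": begin.strftime("%Y%m%d"),
        "end_date": end.strftime("%Y%m%d"),
    }

params = date_range_params(date(2009, 9, 1), date(2009, 12, 31))
print(params)
```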

facets

Comma-delimited list of up to 5 facets

Specifies the sets of facet values to include in the facets array at the beginning of the response, which collects the facet values from all the search results.

Note: The facets parameter does not specify search terms — use the query parameter to search against facets. Also note that the facets parameter does not control the facet values that are returned for each individual search result — use the fields parameter to specify which facets you want to see in each search result. For more information, see Faceted Searching and Responses.

If you omit this parameter, the response will not include any sets of facet values.
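The five-facet limit is easy to enforce before a request is sent. A sketch, with facets_param as a hypothetical helper name:

```python
MAX_FACETS = 5  # limit stated for the facets parameter

def facets_param(names):
    """Join facet names into the comma-delimited value of the facets parameter."""
    names = list(names)
    if len(names) > MAX_FACETS:
        raise ValueError(f"at most {MAX_FACETS} facets may be requested")
    return ",".join(names)

value = facets_param(["org_facet", "per_facet", "publication_year"])
print(value)
```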

fields

Comma-delimited list of fields (no limit)

Specifies the fields (including facets) you want returned for each search result. By default (unless you include a fields list in your request), the following fields are returned for each result: body, byline, date, title, url

To return only the array of facet values (no search results), set fields to a blank space (encoded: fields=+) and include the facets parameter.
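As an illustration of the blank-space encoding, Python's urlencode turns a single space into a plus sign, which produces the fields=+ form described above. The query and facet names in this sketch are arbitrary choices.

```python
from urllib.parse import urlencode

# A facets-only request: fields is a single blank space, facets is present.
params = {
    "query": "obama",
    "facets": "per_facet,publication_year",
    "fields": " ",
}
query_string = urlencode(params)
print(query_string)
```

Note that urlencode also percent-encodes the comma in the facets list (as %2C), which is an equivalent encoding of the same value.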

offset

The value of offset corresponds to a set of 10 results (it does not indicate the starting number of the result set). For example, offset=0 corresponds to records 0-9. To return records 10-19, set offset to 1, not 10.
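The arithmetic behind offset can be sketched as follows; the function names are hypothetical.

```python
PAGE_SIZE = 10  # each offset value corresponds to a set of 10 results

def offset_for_record(record_index):
    """Return the offset value whose result set contains the given record."""
    return record_index // PAGE_SIZE

def record_range(offset):
    """Return the (first, last) record numbers covered by an offset value."""
    first = offset * PAGE_SIZE
    return first, first + PAGE_SIZE - 1

print(offset_for_record(10))  # record 10 is in the second set
print(record_range(1))
```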

rank

newest (default) | oldest | closest

Use the rank parameter to set the order of the results. The default rank is newest.

Constructing a Search Query

In an Article Search API request, the value of query can consist of keywords only, facets only, or both keywords and facets. This section describes each type of search and provides examples to illustrate the differences.

Keyword Searching

By default, the keywords you specify are searched against the text in three fields: title, byline and body, combined in an OR search.

You can also apply your keywords to specific data fields, using the syntax field:keywords. Here are a few examples, with and without limiting field labels:

In the first example, the phrase absentee ballot will be searched against the text in the title, byline and body (in an OR search). The second example expands that search to include the term election — but only in the title field. Finally, the last example looks for two terms in two specific fields.
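One way to write queries matching the three descriptions above is sketched here in Python. The first two follow the descriptions directly; the third (title:obama byline:dowd) is a hypothetical choice of terms and fields, and field_term is an illustrative helper.

```python
from urllib.parse import quote_plus

def field_term(field, keywords):
    """Format a keyword search limited to one field: field:keywords."""
    return f"{field}:{keywords}"

queries = [
    # Keywords only: searched against title, byline and body.
    "absentee ballot",
    # Same keywords, plus a term limited to the title field.
    "absentee ballot " + field_term("title", "election"),
    # Two terms in two specific fields (hypothetical terms).
    field_term("title", "obama") + " " + field_term("byline", "dowd"),
]

# URL-encode each query for use as the value of the query parameter.
encoded = [quote_plus(q) for q in queries]
for q in encoded:
    print(q)
```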

In many cases, you may find it easiest to build your search in this way: first get a set of general results against the default fields, and use those results to identify specific terms and fields to make your query more accurate.

The Data Fields table in the Responses section includes a column indicating whether the field can be searched using the field:keyword syntax.

Returning Fields

The fields you search against in your query are not automatically returned with your search results. Use the fields parameter to specify which fields to return in the response. Example:

If you do not specify any fields, the following data fields will be returned for each result: body, byline, date, title, url.

Getting Facets via Keywords

Keyword searches can also be used to return sets of facet values (even if you do not actually search on any facet values). The following request returns a set of values for the org_facet, allowing you to get a quick idea of the organizations mentioned in articles with the word bailout in the title:

You might also want to try the searchable facet_terms field, which collects terms from des_facet, geo_facet, org_facet and per_facet. The following request looks for articles that have the word "chocolate" in at least one of those facets. This kind of request can help you zero in on the standardized terms that best suit your search.
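The two requests described in this section could be expressed as parameter sets like these. This is a sketch based only on the descriptions above; the variable names are arbitrary and the api-key parameter is omitted.

```python
from urllib.parse import urlencode

# Title keyword search that also returns the set of org_facet values.
bailout = {"query": "title:bailout", "facets": "org_facet"}

# Search the combined facet_terms field for a word.
chocolate = {"query": "facet_terms:chocolate"}

query_strings = [urlencode(p) for p in (bailout, chocolate)]
for qs in query_strings:
    print(qs)
```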

Faceted Searching

The Article Search API supports faceted searching, which can be much more powerful than standard keyword searching. Facets (whose field names are appended with _facet for easy identification) can help you explore Times-specific categories and subjects. For example, the desk_facet allows you to search by Times desk (such as Business), and the org_facet allows you to search by normalized organization name (such as ITT Corp). Dates and date segments also serve as facets.

When searching on a facet, separate the facet name and value with a colon, and surround the value with square brackets: facet-name:[value].

To narrow a keyword search by facet, include a facet field in your query parameter along with your search terms. To explore a facet by itself (without search keywords), specify only that facet in your query. You can specify several facets in one query in order to explore their intersection.
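The facet-name:[value] syntax and its combination with keywords can be sketched as below. The helper names are hypothetical, and joining terms with spaces is an assumption based on the title:afghanistan comments:y query shown later on this page. The des_facet value is taken from the sample response.

```python
def facet_term(name, value):
    """Format a facet search term as facet-name:[value]."""
    return f"{name}:[{value}]"

def build_query(keywords="", facets=()):
    """Combine optional keywords with any number of (name, value) facet pairs."""
    parts = [keywords] if keywords else []
    parts += [facet_term(name, value) for name, value in facets]
    return " ".join(parts)

# Keyword search narrowed by one facet.
narrowed = build_query("obama", [("des_facet", "AFGHANISTAN WAR (2001- )")])
# Facets only: the intersection of two facets.
intersection = build_query(facets=[
    ("des_facet", "AFGHANISTAN WAR (2001- )"),
    ("publication_year", "2009"),
])
print(narrowed)
print(intersection)
```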

Each Article Search response can include sets of facet values, drawn from all search results. These sets are collected in a facets array at the beginning of the response. Use the facets parameter to control which facets are included in that array. (Each individual search result record can also include facet values for that specific result; these are controlled with the fields parameter.)

In each of these examples, the facet helps you look at the general obama keyword search from a different angle. For more on this idea, see Working With Facets.

Facet Values

Some facet values must be uppercase, while others must be mixed case. For facets that correspond to Times controlled vocabularies, you must specify the exact standardized term that you want to match. To get standardized terms, try one of these methods:

The first example excludes some terms from the search. The second example searches for a specific phrase. The third example does both, and throws in a couple of facets for extra power.

Working With Facets

Facets can be thought of as search "perspectives." With facets, you can look at search results from different perspectives, and you can approach your search queries from different angles. Each facet can be seen as representing a property or characteristic of Times article data.

Facets can reveal points of commonality and distinction that are not immediately apparent. For example, two articles with the word "bicycle" in the title may have two very different nytd_section_facet (NYTimes.com section) values: "Movies" and "Health." Similarly, two articles that discuss seemingly disparate topics, such as cloud computing and auto shows, may share a des_facet (descriptive subject term) value: "NEW MODELS, DESIGN AND PRODUCTS." The following examples illustrate two ways of using facets in your requests.

Refining by Facet

The examples in this section are linked to the Times API console tool; click each example to see the results without coding.

You might want to begin by doing a simple keyword search, but including several sets of facet values in the response, using the optional facets parameter:

When you refine by facet in this way, you can retrieve precise results with Times-specific data.

Exploring With Facets

Along with using facets to refine an existing query, you can use them to explore subjects and identify trends. For example, the following request asks, "Which recent months saw the most articles about layoffs?"

As these examples suggest, it's best not to try to formulate a single, perfect search query. Instead, create several queries, examine the intersections of the results, and learn from the patterns and gaps.

Responses

Format and Result Sets

Currently, responses are in JSON only.

Each Article Search response can include sets of facet values, drawn from all search results. These sets are collected in a facets array at the beginning of the response. Use the optional facets parameter to control which sets of facet values are included in the response.

Note: the sets of facet values (collected in the facets array) are returned in addition to the search results. Each search result record can also include facet values for that specific result, if facets are specified with the fields parameter.

The API returns 10 records at a time. However, the facets array at the beginning of the response is drawn from all results, not just the current set of 10.
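Since records arrive 10 at a time, retrieving a full result set means stepping through offset values. The sketch below assumes the total field in the response (27 in the sample response on this page) reports the number of matching records; offsets_for_total is a hypothetical helper.

```python
PAGE_SIZE = 10  # the API returns 10 records at a time

def offsets_for_total(total):
    """Return every offset value needed to page through `total` records."""
    page_count = (total + PAGE_SIZE - 1) // PAGE_SIZE  # ceiling division
    return list(range(page_count))

offsets = offsets_for_total(27)
print(offsets)
```

With 27 matches, three requests are needed; the last one returns only seven records.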

Web Site Data and Print Data

Certain Article Search API response fields are specific to the printed Times newspaper, while others are specific to NYTimes.com. For example, the title field corresponds to the headline as it appeared in the printed paper, and nytd_title reflects the NYTimes.com headline. The two title fields may or may not have the same value.

In each print-web pair of fields, the NYTimes.com field is prepended with nytd_ to distinguish it. For examples, see the Data Fields table.
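When processing results, the nytd_ naming pattern makes it simple to prefer one member of a print-web pair and fall back to the other. A sketch, assuming each result record is a dictionary; the helper name and the nytd_title value below are hypothetical.

```python
def web_or_print(record, field):
    """Prefer the NYTimes.com value (nytd_ prefix); fall back to the print value."""
    web_value = record.get("nytd_" + field)
    return web_value if web_value else record.get(field)

# The nytd_title value here is invented for illustration.
both = {"title": "Print Headline", "nytd_title": "Web Headline"}
print_only = {"title": "Print Headline"}

print(web_or_print(both, "title"))
print(web_or_print(print_only, "title"))
```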

Data Fields

This section summarizes the available search result fields. To control which fields are returned for each search result, use the optional fields parameter. For details, see Requests.

Field Types Legend:

S = Searchable: can be searched using the field:keyword syntax
F = Facet: must be specified in the facet:[value] syntax
I = Information: returned in responses

In this table, fields are listed in alphabetical order.

Name

Data Type

Field Types

Description

abstract

String

S, I

A summary of the article, written by Times indexers. Note: there can be a significant gap (as much as a year) between the publication of the article and the writing of the abstract. As an alternative, use body.

author

String

S, I

An author note, such as an e-mail address or short biography (compare byline)

body

String

S, I

A portion of the beginning of the article. Note: Only a portion of the article body is included in responses. But when you search against the body field, you search the full text of the article.

byline

String

S, I

The article byline, including the author's name

classifiers_facet

Array (Strings)

F, I

Taxonomic classifiers that reflect Times content categories, such as Top/News/Sports

column_facet

String

F, I

A Times column title (if applicable), such as Weddings or Ideas & Trends

comments

Boolean

S, I

Indicates whether user comments are associated with the article (for articles from 2007 to present). To retrieve comments, use the Community API.

des_facet

Array (Strings)

F, I

Descriptive subject terms assigned by Times indexers. This facet is included in the facet_terms field (see the description in this table).

When used in a request, values must be UPPERCASE

desk_facet

String

F, I

The Times desk that produced the story (e.g., Business/Financial Desk)

facet_terms

String

S

Combines des_facet, geo_facet, org_facet and per_facet. Search facet_terms to find your query in any of those facets (essentially a combined OR search).

This field is case insensitive (but individual facets are case sensitive).

fee

Boolean

S, I

Indicates whether users must pay a fee to retrieve the full article

geo_facet

Array (Strings)

F, I

Standardized names of geographic locations, assigned by Times indexers. This facet is included in the facet_terms field (see the description in this table).

When used in a request, values must be UPPERCASE

lead_paragraph

String

S, I

The first paragraph of the article (as it appeared in the printed newspaper)

material_type_facet

Array (Strings)

F, I

The general article type, such as Biography, Editorial or Review

multimedia

Array

I

Associated multimedia features (interactive graphics, slideshows, etc.), including URLs (see also the related_multimedia field). "Multimedia" does not include photos; use the small_image fields for photo metadata.

nytd_byline

String

S, I

The article byline, formatted for NYTimes.com

nytd_des_facet

Array (Strings)

F, I

Descriptive subject terms, assigned for use on NYTimes.com (to get standardized terms, use the TimesTags API)

When used in a request, values must be Mixed Case

nytd_geo_facet

Array (Strings)

F, I

Standardized names of geographic locations, assigned for use on NYTimes.com (to get standardized terms, use the TimesTags API)

When used in a request, values must be Mixed Case

nytd_lead_paragraph

String

S, I

The first paragraph of the article (as it appears on NYTimes.com)

nytd_org_facet

Array (Strings)

F, I

Standardized names of organizations, assigned for use on NYTimes.com (to get standardized terms, use the TimesTags API)

When used in a request, values must be Mixed Case

nytd_per_facet

Array (Strings)

F, I

Standardized names of people, assigned for use on NYTimes.com (to get standardized terms, use the TimesTags API)

When used in a request, values must be Mixed Case

nytd_section_facet

Array (Strings)

F, I

The section the article appears in (on NYTimes.com)

nytd_title

String

S, I

The article title on NYTimes.com (this field may or may not match the title field; headlines may be shortened and edited for the web)

nytd_works_mentioned_facet

Array (Strings)

F, I

Literary works mentioned (titles formatted for use on NYTimes.com)

org_facet

Array (Strings)

F, I

Standardized names of organizations, assigned by Times indexers. This facet is included in the facet_terms field (see the description in this table).

When used in a request, values must be UPPERCASE

page_facet

String

F, I

The page the article appeared on (in the printed paper)

per_facet

Array (Strings)

F, I

Standardized names of people, assigned by Times indexers. This facet is included in the facet_terms field (see the description in this table).

When used in a request, values must be UPPERCASE

publication_day, publication_month, publication_year

Date (each)

F, I (each)

The day (DD), month (MM) and year (YYYY) segments of date, separated for use as facets

related_multimedia

Boolean

S, I

Indicates whether multimedia features are associated with this article. Additional metadata for each related multimedia feature appears in the multimedia array. "Multimedia" does not include photos; use the small_image fields for photo metadata.

section_page_facet

String

F, I

The full page number of the printed article (e.g., D00002).

This data is added to the API 7 days after the article is published. Some articles that appear on NYTimes.com do not appear in the printed newspaper and thus do not have a value for section_page_facet.

small_image, small_image_url, small_image_height, small_image_width

Boolean, String, Integer, Integer

S, I (small_image); I (small_image_url, small_image_height, small_image_width)

The small_image field indicates whether a thumbnail image is associated with the article. The small_image_url field provides the URL of the image on NYTimes.com. The small_image_height and small_image_width fields provide the image dimensions.

source_facet

String

F, I

The originating body (e.g., The New York Times or The International Herald Tribune)

text

String

S

The text field consists of title + byline + body (combined in an OR search) and is the default field for keyword searches. For more information, see Constructing a Search Query.

title

String

S, I

The article title (headline); corresponds to the headline that appeared in the printed newspaper

tokens

Array (Strings)

I

Your query terms, returned for your reference

url

String

S, I

The URL of the article on NYTimes.com

word_count

Integer

I

The full article word count

works_mentioned_facet

Array (Strings)

F, I

Literary works mentioned in the article
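The case rules noted in the table (UPPERCASE for des_facet, geo_facet, org_facet and per_facet; Mixed Case for their nytd_ counterparts) can be applied with a small helper like this sketch. The function name is hypothetical, and uppercasing alone does not guarantee that a value matches a standardized term.

```python
# Facets whose request values must be UPPERCASE, per the table above.
UPPERCASE_FACETS = {"des_facet", "geo_facet", "org_facet", "per_facet"}

def normalize_facet_value(name, value):
    """Uppercase values for the indexer-assigned facets; leave others unchanged."""
    if name in UPPERCASE_FACETS:
        return value.upper()
    return value

print(normalize_facet_value("per_facet", "Obama, Barack"))
print(normalize_facet_value("nytd_per_facet", "Obama, Barack"))
```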

Examples

These examples do not include the required api-key parameter. Be sure to include your API key in your request.

Requests

Search for lakes and ice as keywords, retrieving only those records that include the EPA as an org_facet, and rank the results by closest match:
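A sketch of such a request's parameters in Python. The exact standardized org_facet term for the EPA is assumed here (ENVIRONMENTAL PROTECTION AGENCY) and should be confirmed against actual facet values; search_params is a hypothetical helper that also checks rank against the three documented values.

```python
from urllib.parse import urlencode

ALLOWED_RANKS = ("newest", "oldest", "closest")

def search_params(query, rank="newest"):
    """Build query and rank parameters, rejecting undocumented rank values."""
    if rank not in ALLOWED_RANKS:
        raise ValueError(f"rank must be one of {ALLOWED_RANKS}")
    return {"query": query, "rank": rank}

# The org_facet term below is an assumption, not a confirmed standardized term.
params = search_params(
    "lakes ice org_facet:[ENVIRONMENTAL PROTECTION AGENCY]",
    rank="closest",
)
print(urlencode(params))
```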

The next example searches for all articles that discuss the state of New York, according to geographic terms assigned by Times indexers. Because such a broad search on a single facet is not likely to yield immediately useful results, you would probably want to retrieve additional facet values in the response, and then use those to narrow your search.

Responses

Here is a portion of a sample JSON response to the third example (title:afghanistan comments:y), with fields and facets parameters added:

{
  "facets":{
    "per_facet":[
      {
        "count":13,
        "term":"OBAMA, BARACK"
      },
      {
        "count":5,
        "term":"MCCHRYSTAL, STANLEY A"
      },
      {
        "count":4,
        "term":"GATES, ROBERT M"
      },
      {
        "count":4,
        "term":"KARZAI, HAMID"
      },
      {
        "count":4,
        "term":"MULLEN, MICHAEL G"
      },
      ...
      {
        "count":1,
        "term":"TILLMAN, PAT"
      }
    ],
    "publication_year":[
      {
        "count":21,
        "term":"2009"
      },
      {
        "count":5,
        "term":"2008"
      },
      {
        "count":1,
        "term":"2007"
      }
    ]
  },
  "offset":"0",
  "results":[
    {
      "body":"The most important line in President Obama’s Afghan speech was not about Afpak policy (so named by the White House) but about the U.S. domestic situation: “Our troop commitment in Afghanistan cannot be open-ended — because the nation that I am most interested in building is our own.” As military strategy for winning a war",
      "date":"20091208",
      "des_facet":[
        "UNITED STATES DEFENSE AND MILITARY FORCES",
        "UNITED STATES ECONOMY",
        "AFGHANISTAN WAR (2001- )"
      ],
      "title":"OP-ED COLUMNIST; Afghanistan on Main Street",
      "url":"http:\/\/www.nytimes.com\/2009\/12\/08\/opinion\/08iht-edcohen.html"
    },
    {
      "body":"AFTER the dramatic three-month buildup, you’d think that Barack Obama’s speech announcing his policy for Afghanistan would be the most significant news story of the moment. History may take a different view. When we look back at this turning point in America’s longest war, we may discover that a relatively trivial White House",
      "date":"20091206",
      "des_facet":[
        "UNITED STATES DEFENSE AND MILITARY FORCES",
        "AFGHANISTAN WAR (2001- )",
        "SPEECHES AND STATEMENTS"
      ],
      "title":"OP-ED COLUMNIST; Obama’s Logic Is No Match for Afghanistan",
      "url":"http:\/\/www.nytimes.com\/2009\/12\/06\/opinion\/06rich.html"
    },
    {
      "body":"WASHINGTON — Defense Secretary Robert M. Gates , Secretary of State Hillary Rodham Clinton and the nation’s top military officer on Wednesday laid out a muscular defense of President Obama ’s decision to send 30,000 additional troops to Afghanistan, but they made clear that his plan to begin withdrawing those forces by July 2011",
      "date":"20091203",
      "des_facet":[
        "UNITED STATES DEFENSE AND MILITARY FORCES",
        "AFGHANISTAN WAR (2001- )",
        "SPEECHES AND STATEMENTS"
      ],
      "title":"Obama Team Defends Policy on Afghanistan",
      "url":"http:\/\/www.nytimes.com\/2009\/12\/03\/world\/asia\/03policy.html"
    },
    {
      "body":"Americans have reason to be pessimistic, if not despairing, about the war in Afghanistan. After eight years of fighting, more than 800 American lives lost and more than 200 billion taxpayer dollars spent, the Afghan government is barely legitimate and barely hanging on in the face of an increasingly powerful Taliban insurgency. In his speech",
      "date":"20091202",
      "des_facet":[
        "UNITED STATES DEFENSE AND MILITARY FORCES",
        "AFGHANISTAN WAR (2001- )",
        "EDITORIALS",
        "SPEECHES AND STATEMENTS"
      ],
      "title":"EDITORIAL; The Afghanistan Speech",
      "url":"http:\/\/www.nytimes.com\/2009\/12\/02\/opinion\/02wed1.html"
    },
    ...
    {
      "body":"The top military commander in Afghanistan warns in a confidential assessment of the war there that he needs additional troops within the next year or else the conflict ''will likely result in failure.'' The grim assessment is contained in a 66-page report that the commander, Gen. Stanley A. McChrystal, submitted to Defense Secretary Robert M. Gates",
      "date":"20090921",
      "des_facet":[
        "UNITED STATES DEFENSE AND MILITARY FORCES",
        "AFGHANISTAN WAR (2001- )"
      ],
      "title":"General Calls for More Troops To Avoid Afghanistan Failure",
      "url":"http:\/\/www.nytimes.com\/2009\/09\/21\/world\/asia\/21afghan.html"
    }
  ],
  "tokens":[
    "title:afghanistan",
    "comments:y"
  ],
  "total":27
}
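A response like this one can be read with any JSON parser. The Python sketch below uses a trimmed copy of the sample (one facet set and two results, with most fields omitted) to show how the facets array, results and total relate. Note that offset is returned as a string in the sample.

```python
import json

# Trimmed copy of the sample response above; most fields are omitted.
SAMPLE = """
{
  "facets": {
    "publication_year": [
      {"count": 21, "term": "2009"},
      {"count": 5, "term": "2008"},
      {"count": 1, "term": "2007"}
    ]
  },
  "offset": "0",
  "results": [
    {"date": "20091203", "title": "Obama Team Defends Policy on Afghanistan"},
    {"date": "20090921", "title": "General Calls for More Troops To Avoid Afghanistan Failure"}
  ],
  "total": 27
}
"""

response = json.loads(SAMPLE)

# Facet sets are lists of {"count", "term"} objects drawn from all results.
year_counts = {
    item["term"]: item["count"]
    for item in response["facets"]["publication_year"]
}
# Result records carry only the fields that were requested.
titles = [record["title"] for record in response["results"]]
# offset arrives as a string, so convert it before doing arithmetic.
offset = int(response["offset"])

print(year_counts)
print(titles)
print(offset, response["total"])
```

In this sample the publication_year counts add up to total, which reflects that the facets array is drawn from all matching records rather than the current set of 10.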