Data is data, or are they?

The short answer: data can be singular or plural. In some formal and technical contexts the plural form is preferred, but the singular form is increasingly common and standard. In most contexts you can write these data or this data, data are or data is, and so on.

Data emerged in 1646 as the plural of the Latin datum, which according to the OED was the past participle of dare (“give”) and meant “a thing given or granted; a thing known or assumed as a fact, and made the basis of reasoning or calculation; a fixed starting point for a series of measurements etc.”

Datum remains standard and retains the general meaning of “a unit of information”, though it tends to appear mostly in academic and specialist disciplines such as philosophy, surveying, geodesy, topography, technical drawing, and cartography:

Several map datums were erroneous, which threw the hikers off-track.
“The principal datum input to any search algorithm is a description of its search space.” (Alan Hutchinson, Algorithmic learning)
“[T]he paper seen and the seeing of it are only two names for one indivisible fact which, properly named, is the datum, the phenomenon, or the experience.” (William James, The Meaning of Truth)

The meaning of the derived plural data has changed somewhat over the centuries. The OED definition from the late 19th century (“Facts, esp. numerical facts, collected together for reference or information”) seems to testify to the broadening influence of the hard sciences. In the 20th century, the rapidly expanding fields of information technology incorporated the word into a huge variety of IT- and computer-related compound nouns, such as database, data entry, data flow, data mining, data processing, data protection, and data stream.

The plural data is used in many scientific, technical, academic and other formal contexts, though different practices prevail in different places. Among the major news media, The Economist advises the plural usage; The Guardian, singular. The Times Style Guide expressly permits both. Here are some examples of plural usage found via the British National Corpus:

“Our data are too uncertain to draw firm conclusions” (Criminal Law Review)
“Most of the data are new” (Journal of Gastroenterology and Hepatology)
“These data are then used to calculate bond enthalpies.” (Michael Freemantle, Chemistry in Action)

In computing jargon, social sciences, and everyday use, data is often treated as an abstract mass noun, like information. It has the general meaning “mass of information” and takes a singular verb, singular pronoun (it) and singular modifiers (e.g. this, little, much):

“On this map the data is recorded by county and not by region” (Peter Hardy, A Right Approach to Economics?)
“All this data is then written up as a technical report” (Atkins & Atkins, An Introduction to Archaeology)
“The retina codes and combines the data so that it can be fed into the 1 million fibres entering the optic nerve” (Laszlo Solymar, Lectures on Electromagnetic Theory)

Few non-specialists who use the word data think of it as the plural of datum. Similarly, agenda has taken on a singular life of its own (distinct from the near-obsolete agendum) and has given rise to the standard plural agendas. Consider also media (from medium), criteria (from criterion), graffiti (from graffito), and stamina (from stamen). Each of these imported plurals has its own quirks of usage, with varying degrees of acceptance and acceptability. Agendas may be common and standard, but medias, datas and criterias are not – at least, not much and not yet.

A note of advice: try to be internally consistent, and be mindful of context. Sometimes one form is preferred: for example, most publishers have a house style to which your text must conform. Even in reputable publications, however, usage is decidedly mixed, and discrepancies can result in editorial mix-ups, as Merriam-Webster has shown. Readers who cling to the Latin origins of data may protest the singular form on principle, but this gripe is misguided. I should know: the singular form used to grate on me, but I wised up.

14 Responses to Data is data, or are they?

Money is pluralised in Danish. The Danish word for money is ‘penge’, though this doesn’t tell you much. But, for instance, when they want to say, “I’m not leaving you over the money, Øfuls. It doesn’t mean anything to me. I’m leaving you because you are a total pain in the ørse”, they say something like “I’m not leaving you over the money, Øfuls. They don’t mean anything to me…”.

Lucy: Thank you for my first ever lesson in Danish, and especially for the vivid illustration. A quick investigation reveals that a similar word, pengö (an onomatopoeic term for ringing or twanging), was the unit of currency in Hungary before the forint was introduced. Maybe there are others.

I appreciate you posting this clarification. This topic regularly comes up in the seminars I teach on writing technical documents and writing policies and procedures. I also plan to tweet on Twitter to call attention to your article.

Christine: You’re very welcome! Thank you for the kind words, and for the link from your own post on the subject.

Catherine: My pleasure; I’m glad you found it helpful, and I appreciate you spreading the word. There remains considerable uncertainty about whether data is singular or plural, when in fact it can be either, depending on context.