How are China, Estonia and Germany different from India, Greece and the UK? To an economist, one answer is obvious: savings rates. Germans save 10 percentage points more than the British do (as a fraction of GDP), while Estonians and Chinese save a whopping 20 percentage points more than Greeks and Indians. Economists think a lot about what drives people to save, but many of these international differences remain unexplained. In a recent paper of mine, I find that these countries differ not only in how much their residents save for the future, but also in how their native speakers talk about the future.

In late 2011, an idea struck me while reading several papers in psychology that link a person’s language with differences in how they think about space, color, and movement. As a behavioral economist, I am interested in understanding how people make decisions. Could a person’s language subtly affect his or her everyday decisions? In particular, could the way a person’s language marks the future affect their propensity to save for the future?

In a nutshell, this is precisely what I found. After scouring many datasets with millions of records on individual household savings behavior—along with a number of peculiar health performance metrics like grip strength and walking speed—I find that languages that oblige speakers to grammatically separate the future from the present lead them to invest less in the future. Speakers of such languages save less, retire with less wealth, smoke more, practice more unsafe sex and are more obese. Surprisingly, this effect persists even after controlling for a speaker’s education, income, family structure and religion.

Back when my first paper on this topic circulated, many linguists were appropriately skeptical of the work. Their concerns are concisely explained in two well-thought-out posts (here and here) by the linguists Mark Liberman and Geoffrey Pullum on the blog they founded, Language Log. Mark and Geoffrey also invited me to write a guest post explaining the work. In that post, I discuss which of their possible concerns are unlikely given the patterns I find across the world in people’s savings and health behaviors, and also try to clarify which of their concerns I was not yet able to address.

This exchange prompted a broad set of discussions as to what different types of data, analyses and experiments could, in principle, answer the questions raised by the patterns I find. Cross-disciplinary discussions took place in a subsequent post by Julie Sedivy and followup posts by Mark Liberman, and also at the Linguistic Data Consortium’s 20th Anniversary Workshop. Several new avenues of investigation and work came out of these interactions, three of which are now ongoing projects.

One new idea that I’ve begun to explore entails measuring a language’s time reference by scraping the web to search for natural patterns in language, in addition to using linguistic classifications. This led me to search the web for the simplest form of writing about the future I could find: weather forecasts. Why weather forecasts? Well, forecasts rarely talk about the past, so they’re a natural place to look for speech about the future. Weather forecasters also generally communicate in natural, straightforward language, and often convey similar content across different settings. Can patterns in weather forecasts measure how languages structure the future, and can these differences predict how people save for the future? Amazingly, they can.

A team of linguistics and economics students assisted with this analysis, and managed to scrape the web for weather forecasts in 39 languages from around the world. The figure below summarizes what we found: wide variation in how often, when talking about future weather, forecasts in a particular language grammatically mark the future as something distinct from the present. In English, for example, this comes down to the relative frequency of sentences like:

Rain is likely this weekend. (present tense “is”)

It will likely rain this weekend. (future tense “will rain”)
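To make the measurement concrete, here is a minimal sketch in Python of how the share of future-marked sentences could be computed for English forecasts. It is an illustration only, not the procedure used in the paper: the marker list, the sentence splitter and the function names are all assumptions, and a real analysis would need language-specific rules for each of the 39 languages.

```python
import re

# Hypothetical list of English future markers (an assumption for illustration;
# not the classification used in the paper).
FUTURE_MARKERS = re.compile(r"\b(?:will|shall|won't|going to)\b|'ll\b", re.IGNORECASE)


def split_sentences(text):
    """Naively split text into sentences on ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def future_marking_share(forecasts):
    """Return the fraction of forecast sentences that contain a future marker."""
    sentences = [s for text in forecasts for s in split_sentences(text)]
    if not sentences:
        raise ValueError("no sentences to analyze")
    marked = sum(1 for s in sentences if FUTURE_MARKERS.search(s))
    return marked / len(sentences)


# The two example sentences from the text: one unmarked, one marked.
examples = [
    "Rain is likely this weekend.",
    "It will likely rain this weekend.",
]
share = future_marking_share(examples)
```

On the two example sentences, one of the two contains a future marker, so the share is one half.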

What’s surprising is that when I repeat the statistical analysis I did in the paper, I find an incredibly strong relationship between how forecasters talk about weather and how much people choose to save. Essentially, a 20 percentage point increase in the frequency of future tenses results in 1% less of GDP saved. This finding holds even after taking into account a country’s level of development, rate of growth, demographics, social security protections and major religions.
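As a back-of-the-envelope reading of that figure, the sketch below treats the relationship as a straight line: each 20 percentage points of additional future-tense use corresponds to one point less of GDP saved. The linearity, the reading of "1%" as one percentage point of GDP, and the function name are assumptions made for illustration; the paper's actual regression includes the controls listed above.

```python
def implied_savings_change(delta_future_pp):
    """Implied change in savings (points of GDP) for a given change in
    future-tense frequency (percentage points), assuming a linear relationship
    of 1 point less saved per 20 points more future marking."""
    return -(delta_future_pp / 20.0)


# A language whose forecasts use future tenses 40 points more often
# would, under this assumption, be associated with 2 points less of GDP saved.
change = implied_savings_change(40)
```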

What does this mean? I don’t believe it demonstrates extreme weather forecaster persuasion. Rather, I think it shows that many different ways of measuring how languages mark time share a strong and striking relationship with how speakers of those languages save. In short, I believe more than ever that the data suggests a strong and robust relationship between linguistic and economic data, a relationship that leaves us at an exciting crossroads: one where economists have a tremendous amount to learn from linguists.

The figure below measures the percent of time weather forecasts use future vs. present tenses (download a larger version as a PDF). See the paper here for details.

Although I haven’t had the chance yet to read your papers, the graph presents Catalan and Basque as two future-intensive languages. Culturally, and judging from how the economies of these two regions in Spain fared until 2000, Catalan and Basque people were probably the ones saving the most in Spain. But even in Spain, people from Castile or Galicia were more into saving than people from Andalucia. I would have to find statistics by year and by Spanish region, and different saving patterns would emerge. And even saving trends have changed in comparison with the 1960s. Let’s suppose that your hypothesis is true. In the case of differences across Spanish regions, we would find regional differences in the frequency of the future tense correlating with saving behaviour. This could be extended to explain the differences, for example, among British ex-colonies. But what about Catalonia and the Basque provinces, where people have been culturally more austere and in your data appear as high-future users in their weather forecasts? What about bilingual or multilingual populations using languages with different future tense patterns? What about the U.S., where people are more obese on average than in countries where saving is higher? I’ll read (in the near future) your papers to see if some more questions arise.

Yes, it would make sense that the author didn’t have access to Brazilian Portuguese data, which explains why it has 0%. I got a bit confused because in the introduction of the shorter blog post about this topic, Chinese is described as a “futureless language.” Seeing as both it and Brazilian Portuguese had 0% in the graph, I thought a presumption was being made that it lacked a future verb tense. As you, and my Portuguese teacher, point out, it doesn’t lack one at all.

I was drawn to this article, but I had immediate doubts. I grew up in Malaysia, where the population demographics are: Malay, Chinese, Indian, others. Each ethnicity retains its original language. The most obvious problem in the country is the gap in wealth and savings behavior between Malays, Chinese and Indians. The Chinese, despite being a minority, hold the most wealth. Malay is also a non-futured language. There is no distinction between past, present and future. I’m not sure about Tamil, as I’m not familiar with it. Interesting theory, but totally not applicable when comparing between non-futured languages.