Which English?

The answer depends on where you live. Many people in Newfoundland find that sentence perfectly grammatical.

By taking this quiz, you will be helping train a machine algorithm that is mapping out the differences in English grammar around the world, both in traditionally English-speaking countries and also in countries like Mexico, China, and India.

At the end, you can see our algorithm’s best guess as to which English you speak as well as whether your first (native) language is English or something else.

It’s fun, it only takes a few minutes, and it’s for Science! My only quibble (as I told them in the box they provide for feedback) is that I don’t understand why there are so many sentences of bizarro Chomskyoid form (a la Is that the book you saw the man he said he told to hold it for?); throw one in for kicks, sure, but more is overkill. My results:

Our top three guesses for your English dialect:
1. American (Standard)
2. Singaporean
3. US Black Vernacular / Ebonics

Our top three guesses for your native (first) language:
1. English
2. Swedish
3. Norwegian

I’m sure it’s coincidence that my mother was Norwegian-American, but she would have been pleased and amused. (Via Anatoly.)

I’m a little sad it was so easy for it to identify me correctly. It correctly pegged me as a native speaker of standard American English. The other varieties it proposed were both North American, and the other possible native languages were both West Germanic.

I got Standard American, AAVE, and Singaporean on the one hand, and English, Norwegian, and Dutch on the other. I suspect the Norwegians, Swedes, and Dutch who’ve gone through the filter just have very good Standard English, rather than there being anything specific to Hat’s English or mine.

I associate Throw the cow over the fence some hay sentences not with Newfs but with Pennsylvania Dutch.

This is really weird, because I am a native speaker, I proofread for a living, and I’ve always thought of my English as “Standard” to the point of caricature. I guess it’s possible that heavy exposure to British literature has given me a case of Commonwealth English, hence the Australian and Singaporean results? (I wouldn’t say “the team were playing in London,” but I do instinctively acknowledge it as “correct.”)

I don’t know why I’m Norwegian, though. Maybe there’s more Norwegian ancestry in Upstate New York Eighties English than I think.

I’m from South West England and have never even been to Ireland (or Scotland really, because I don’t think a two week holiday could have had much of a long-term impact on dialect), so not finding that result terribly impressive… Right side of the Atlantic at least.

My dialects were located to 1) England, 2) Wales and 3) Scotland. My native languages were 1) English, 2) Swedish and 3) German. Placing me solely on the British isles is of course a compliment to the algorithm: Besides what I learned in school I have spent some time in pubs with Scottish football hooligans, easy to understand since those northerners know how to pronounce both “r” and “ch” :-).

The order between the Germanic languages can be a tricky thing to decide. English and Swedish grammar are very close and the easiest part for a Swede when learning English. And the German one isn’t that far away either.

They put me as NZ or Welsh or UK English, but I am in fact Irish, Dublin Irish to be exact. I don’t particularly consider correct punctuation to be a question of grammar, especially considering that they requested that I answer based on what would be acceptably correct, and not what would be prescriptively correct, but other than that, an interesting experiment.

Dialect: 1) American (standard) — correct; 2) AAVE; 3) Singaporean; L1: 1) English (correct); 2) Dutch; 3) Swedish. I have afaik no Swedish-speaking ancestors, and while I had some Dutch-speaking ancestors who probably continued the language in the Hudson valley under Anglophone political rule for a substantial period, it had died out some time prior to the birth (circa 1840) of my maternal grandfather’s maternal grandfather, so there’s no plausible influence from that on my own idiolect. The parts of the US I have spent the majority of my life in were under Swedish and/or Dutch rule prior to coming under Anglophone rule but I have never seen any convincing account attributing peculiarities of local dialect (except perhaps for a few lexical items, which is not what this test was measuring) to that 17th century history.

I of course construed words like “correct” or “grammatical” in the instructions with reference to my own native dialect, since checking the box for sentences I happen to know to be well-formed in varieties of English different from my own would have just messed up their data to no productive end.

Hey folks — glad you are all enjoying this. One note: we are in the process of training the algorithm. So it’s not going to get everybody correct. In testing so far, it correctly identifies native speakers as native speakers around 90%-95% of the time, but that means that sometimes it’s wrong. Of course it misidentifies a good percentage of non-native speakers as native simply because some people have learned English really well and the test is short!

As more data comes in, it will get more accurate. But ultimately we’ll want to switch from using the Euclidean distance algorithm we’re currently using to something more sophisticated. I’m not sure when that will be.
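For readers curious what a Euclidean-distance approach might look like, here is a minimal sketch. It is not the quiz's actual code: the dialect profiles, the four test sentences they stand for, and the names `DIALECT_PROFILES`, `euclidean`, and `top_guesses` are all made up for illustration. The idea is to treat a quiz-taker's answers as a vector of 1s (accepted) and 0s (rejected), compare it with an average acceptance vector for each dialect, and report the closest ones.

```python
import math

# Hypothetical profiles: for each dialect, the fraction of its speakers who
# accepted each of four made-up test sentences. These numbers are invented.
DIALECT_PROFILES = {
    "American (Standard)": [0.95, 0.10, 0.05, 0.20],
    "Irish": [0.90, 0.15, 0.85, 0.30],
    "New Zealand": [0.92, 0.80, 0.10, 0.60],
    "Singaporean": [0.85, 0.20, 0.15, 0.25],
}


def euclidean(a, b):
    """Straight-line distance between two equal-length answer vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_guesses(answers, profiles, k=3):
    """Return the k dialect names whose profiles lie closest to the answers."""
    ranked = sorted(profiles, key=lambda name: euclidean(answers, profiles[name]))
    return ranked[:k]


# One quiz-taker: accepted sentences 1 and 3, rejected sentences 2 and 4.
answers = [1, 0, 1, 0]
print(top_guesses(answers, DIALECT_PROFILES))
# → ['Irish', 'Singaporean', 'American (Standard)']
```

Note that a ranking like this always produces a second and third guess, however far away they are, which may explain some of the odder runners-up people report.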

Rather spookily it identified me correctly as essentially Scots. Spookily because I left when I was 11 and my English is normally pegged by others as English RP (I can hear differences myself, but they may be all in my imagination.) It’s probably more relevant that I retain many links with Scotland and am very used to hearing Scots speakers. After all the test is all about what you would accept as grammatical rather than what you might say. I certainly OK’d several things as correct that I don’t think I use myself (so I adopted the opposite strategy to JWB.)

It guessed that if I weren’t a native English speaker my native language was Swedish or Norwegian. Must be my clean-cut Nordic features. I once had a flatmate who spoke Danish …

Not sure what to make of

“2. Americans, Canadians, and South Africans accept I sent my mother a letter instead of to my mother.”

I’m sure they do, but don’t *all* native English speakers? It’d be the normal way for me to say it and I’ve only been to South Africa once …

Maybe they ought to be clearer about what strategy they want users to adopt? Leaving aside far-flung things (e.g. a sentence or two I suspect would be unremarkable in informal speech in Dublin, but remarkable in NY if uttered by someone without an Irish accent), there were a certain number of examples that would be well-formed in AAVE but not in my own standard AmEng. I understand the AAVE examples perfectly well and might on any given day in New York hear one of them uttered in my presence, but I wouldn’t say them myself and (this may be the more important point) treat them as presumptive evidence that the speaker has a different native variety of English than I do (and indeed would expect anyone who uttered them natively to have somewhat different pronunciation than I do). My intuition is that saying I found all of them acceptable would increase the odds the algorithm would peg my own native variety as AAVE, which would be incorrect, and/or (once I told them at the end that I did not self-identify as an AAVE speaker) make the algorithm more confused as to what the reliable indicia of having AAVE as ones native variety are. Now, if AAVE (or Scots) had been my childhood variety but because of the subsequent trajectory of my life I had lost many of its distinctive features from my own idiolect I suppose I might feel differently.

I didn’t check any of what I thought to be clearly AAVE (which I don’t use), but ‘U.S. Black Vernacular’ was their second guess after American Standard. I have lived most of my life in the American South, so there may be an influence of which I be unaware 😉

Various comments suggest the algorithm may be more sophisticated than one might think. It seems to be Americans who are closer to AAVE for example, even when trying the stick-to-my-own-vernacular approach.

In my own case it certainly isn’t impossible that my syntax has been less affected by my exile among the heathen English than my phonology, and that’s the sort of thing which is by no means so obvious to strangers trying to place your origin. Come to that, subtle syntactic differences are an area where naive introspection about your own idiolect tends not to get you very far either.

My native language was determined to be Scandinavian, which is OK, I am clearly not a native English speaker and “Scandinavians” are runner ups. But it bugs me what was the reason. I steadfastly refused to give any determiner to “President Barack Obama”, but that cannot be it, even if wrong, there were no other determiner-checkers (or I missed them?) and you cannot make much from an isolated mistake. I also doubted my choice about transitive properties of “donate”, but it also is an isolated case. What I’ve deliberately done is to allow simple past where formal tense agreement required more complicated constructions (I was/have been working until now -> I worked until now; I had finished dinner before she came -> I finished dinner before she came). Is that it? Or being labeled as Scandinavian is simply a substitute for “speaks standard English, but makes a few random mistakes”?

This sentence is misleading: “to” would not be right (at least to me) when “my mother” comes before “a letter”, but if “a letter” comes first, then “to” is needed before “my mother”. From what I can observe, the difference between the two structures is not grammatical but pragmatic: which part of the sentence is old information (stated first) or new information (stated last).

I sent my mother a letter answers What did you send (to) your mother? The new information is a letter. (Perhaps the presence or absence of “to” is dialectally relevant in the question rather than the answer).

I sent a letter to my mother answers Who did you send a letter to? – the new information is my mother.

Same with other “ditransitive” verbs, which need to specify two kinds of “objects” (“give” is the most typical one).

I still have not seen the quiz, but recent comments remind me of an exercise I found in one linguistics textbook. There were six or seven groups of words, phrases and short sentences, each group typical of a particular English-speaking area. The point was to try to identify each area. The comment was something like “if one of these groups sounds perfectly ordinary to you, that’s because it is typical of the area you are from”. Indeed, the group I found quite ordinary was Canadian English! All the other groups included unknown or exotic-sounding words as well as some bizarre sentences which were apparently quite normal in their native environment.

Top 3 guesses for your native English dialect:
1: New Zealand
2: English (England)
3: Australian

Correct, but why “English (England)” at #2, and what does that mean, anyway? I’d guess something southeastern if it can sit there between its Australasian cuzzies. And the question that included “she’ll be right” was surely a major Noo Zild giveaway.

Perhaps, like Steve Martin, you were born a poor black child
only in the Russian meaning of the word, which still wouldn’t give much headway with AAVE. I’m pretty much sure that it was my choices of chopped-up sentences with a period in the middle of a line. Cuz I know that my Russian sentences are mile-long, and in English, whenever I edit my own writing “for clarity”, I replace periods and semicolons with periods in a most generous way 😉

I think the Celtic element came in simply because I consciously approved of sentences which reflected those languages. Not sure about Hungarian, though I do know the language well. I’m a monoglot English speaker, though I’m close to native in several other (all European, but pretty varied: Romance, Slavic, Baltic, Uralic, Germanic, Celtic) languages. I have never lived anywhere but the UK for more than six months and have a pretty good to very good knowledge of a lot more, which allows me to ‘mug them up’ and get speaking after a few days. As far as I’m concerned, I’m pretty bad at speaking in any language, which makes them interesting!

It placed me as Irish, from the Republic, first language English, which is exactly correct. I didn’t find the Chomskybot sentences that bad, though maybe they would have been better with some more commas.

I marked those pronoun-free second sentences as OK. On reflection, I’m not sure their affect is part of my native dialect; they are colloquial, but artificially so (George HW Bush “Those were exciting days. Lived in a little shotgun house, one room for the three of us. Worked in the oil business, started my own.”)

“I’m after doing something” is grammatical in British English, it just doesn’t mean what it means in Irish English. Since the other options were synonymous with the Irish sense, I presume that’s what they were aiming for, but others might not.

I also lied by saying I had lived 10 years in Dublin; in fact I was sometimes in Dun Laoghaire-Rathdown, which is listed separately, a case of overprecision.

“She’ll be right” is Oz at least as much as NZ.

I also got Hungarian second, which is probably because something had to be second.

Is it? Can’t say I ever heard it from a Strine speaker. Then again, it seems to mostly crop up in mocking the national self-stereotype in NZ speakers. Still, considering how much we share, it certainly is not a surprise that Aussies say that too.

“She’ll be right”, also “She’ll be apples”, was standard in Oz in my youth, but may have gone out of fashion?

It would extend it vastly, but they might get more accurate results if they put in some slang terms and asked if the reader used them natively. For example, “bingle” for fender-bender (US – minor car accident) seems to me to be uniquely Australian, and entered the language after I left there in the late 1950s. The origin seems to be unknown.

I didn’t find the Chomskybot sentences that bad, though maybe they would have been better with some more commas.

I didn’t mean they were bad sentences, just that they didn’t seem to fit with the rest — to me, any sentence you have to parse and think about is not a good test sentence. But I was scarred for life by having to take a course in Transformational Grammar in grad school, so I may be overreacting.

Due to closely related grammatical systems it’s no wonder that so many of the respondents above get classified as Dutch or Scandinavian. But four of you have been guessed upon as possible native speakers of Hungarian! What makes the algorithm believe that?

Hungarian is about as far from English as you can get – at least in Europe. It’s agglutinating, lacks prepositions, has 18 noun cases, practically nothing of its basic lexicon in common with English and a varying so called “topic-prominence” word order, unlike the quite strict one in English. Had somebody approved a sentence like “Air-by they went Budapest-from London-to” I would have suspected a Magyar.

Either there is a bug in the algorithm or a hitherto undiscovered underlying common structure in Germanic and Uralic.

I’m from Melbourne, Australia and it picked my English variant as 1.NZ 2. Oz 3. Black vernacular/Ebonics
Might have been the “she’ll be right” but I agree that that phrase is every bit as Oz as NZ. I chose a sentence with “the man that…” but there was no option for my preferred “the man who…”. It correctly picked me as native English speaker and chose Swedish as 2.

Had somebody approved a sentence like “Air-by they went Budapest-from London-to” I would have suspected a Magyar.

Of course not. I interact with Hungarians all the time, and have never heard a native Hungarian speaker offer anything even close to that in English, and I would be shocked if one ever did – just as I don’t have to be particularly fluent in Japanese to know that “karera-ga itta kara London e Washington de sora” is utterly ridiculous. I assume the algorithms are sophisticated enough to look for more subtle deviations.

iching:
I too thought it was interesting that there was no ‘who’ option in ‘the-man-X’ sentence. I suspect that ‘that’ has become the more common relative pronoun in reference to an impersonal noun (like man, woman, person, teacher, etc.), at least in casual speech. But, it would clearly be ungrammatical to say, ‘*John, that came yesterday, is still here.’

P.S. It is highly unlikely that the absence of the ‘who’ option was an inadvertent omission. I wonder if this was designed to flush out non-native speakers who (that?) have been drilled on prescriptive grammar.

The fact that the Hungarian native speakers who have taken the test (who can’t possibly be representative of Hungarian native speakers generally, or even Hungarian native speakers who know English) came up with similar results. That’s it. It has nothing to do with similarities between Hungarian and English, only between a tiny subset of hungarophones and a tiny subset of anglophones.

They’re not disclosing the probabilities assigned to their guesses. Suppose their algorithm thinks it’s 98.0% likely I’m a native English speaker, 0.45% likely I’m a native Dutch speaker, 0.40% likely I’m a native Swedish speaker, 0.35% likely I’m a native Hungarian speaker, and so on, with increasingly smaller percentage chances assigned to other possibilities. The distances between all of the contenders for 2d and 3d place may well be significantly less than the margin of error, making the difference between an L1 Anglophone test-taker who gets “Norwegian” as second-best guess and one who instead gets “Hungarian” perhaps largely-to-entirely the result of random statistical noise.
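The point about noise can be illustrated with a small simulation. This is only a sketch: the percentages are the hypothetical ones from the comment above, and the noise level `NOISE_SD` and the function names are assumptions chosen for illustration, not anything known about the quiz. Each trial jitters the estimates a little and records who comes first and who comes second.

```python
import random

# Hypothetical percentages from the comment above, not real quiz output.
ESTIMATES = {"English": 98.0, "Dutch": 0.45, "Swedish": 0.40, "Hungarian": 0.35}
NOISE_SD = 0.2  # assumed margin of error, in percentage points


def noisy_ranking(estimates, noise_sd, rng):
    """Add Gaussian noise to every estimate and rank languages by the result."""
    jittered = {lang: p + rng.gauss(0, noise_sd) for lang, p in estimates.items()}
    return sorted(jittered, key=jittered.get, reverse=True)


def tally_ranks(trials=1000, seed=0):
    """Count how often each language lands in first and in second place."""
    rng = random.Random(seed)
    first, second = {}, {}
    for _ in range(trials):
        ranking = noisy_ranking(ESTIMATES, NOISE_SD, rng)
        first[ranking[0]] = first.get(ranking[0], 0) + 1
        second[ranking[1]] = second.get(ranking[1], 0) + 1
    return first, second


first, second = tally_ranks()
print("first place:", first)
print("second place:", second)
```

Under these assumptions English stays in first place in every trial, while second place is shared among Dutch, Swedish, and Hungarian, because the gaps between them are smaller than the noise.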

Our top three guesses for your English dialect:
1. American (Standard)
2. Australian
3. Singaporean

Our top three guesses for your native (first) language:
1. English
2. German
3. Romanian

Interesting. First, it’s true that my written English (as opposed to my sound system) is largely American thanks to science and teh intarwebz, and I deliberately avoided “the committee were divided”, but is that enough to put England as not even the third guess? Probably there isn’t much transatlantic difference in syntax in the first place.

Romanian, that’s intriguing. I made an effort to avoid sentences that would work in German. Maybe I should sulk now that Norwegian isn’t among the top 3.

Afterwards, when they ask about your native and currently primary languages, “German” and “Swiss German” are separate entries. Other kinds of “German” are not distinguished.

After you tell them about your dialect, clickable question marks appear next to the guesses:

“Determining dialect

Here are some ways the algorithm has learned to distinguish different English dialects:

Here are some of the ways the algorithm has learned to guess native language:

1. If you are a good match for one of the standard English dialects (American, Canadian, etc.), you are probably a native English-speaker.

2. Non-native English speakers rarely use Irishisms, Scottishisms, or other regionally-specific language.

The algorithm is only just starting to be able to guess native languages other than English. As it learns more, we’ll know more about what the distinguishing features of different language backgrounds are.”

In Russian, in the days when I wrote in online forums semi-anonymously, I took great care to introduce occasional grammatical and spelling errors, whenever I didn’t want to be identified as a foreign resident / a Jew / an intelligentsia living fossil. Native born sons shall have no fear of an occasional misspelling and misspunctuation!

My grandfather, who was a native speaker of Welsh and English, but did know German well, having done postgraduate study in Germany, really was told once by a German girl that the only thing that gave him away as a foreigner was that his German was too perfect.

She may have been blinded by his charm of course (hereditary with us.) Or just polite. I think she may merely have meant that he was given away as non-native by the fact that he was talking rather too much like a book.

I can’t say I’ve ever correctly identified someone as foreign because their English was too good.

I realize that my Hungarian example was exaggerated. But I imagined, that the algorithm judged from repeated syntactic errors done by non natives of a particular parentage. An example: Swedish shares with English the SVO word order but in expressions where an emphasized word is put first a ‘V2’ rule takes over, ie the verb comes in second place in the sentence. Thus Swedes could very well accept all the following four sentences as correct English but be revealed by the algorithm:

Now I don’t think I belong to a particular English dialect, but being Czech I would expect a Slavic language to appear in the second set, or at least a language that lacks articles, as patterns in article usage are often good evidence about the speaker’s origin.

Swedish shares with English the SVO word order but in expressions where an emphasized word is put first a ‘V2’ rule takes over, ie the verb comes in second place in the sentence. Thus Swedes could very well accept all the following four sentences as correct English

You’re assuming that people commonly insert the syntax of their own language when using a foreign one, which is probably the case only at a very low level of language ability. One of the first things teachers do is wean students away from that sort of thing (“wean,” in the case of some of the teachers of my long-ago youth, including things like “yell at” and “chase around the classroom brandishing a ruler”).

I posted a link to this thread elsewhere; from here on people post their quiz results, some accurate, some less.

I think she may merely have meant that he was given away as non-native by the fact that he was talking rather too much like a book.

Either that, or (which is the same at least in German) his German lacked regional features. Given how pluricentric even Standard German is, that’s extremely rare – the actor Sky du Mont manages it (I can’t place him more precisely than “Germany”; turns out he grew up in Argentina), and that’s it to the best of my knowledge.

Or would in any English speaking out-of-the-way-corner of the world the last three ones be considered grammatical?

The middle two, at least, are acceptable in poetry.

Why just the suggestion: Hungarian?

I bet one Hungarian has ever taken the quiz…

You’re assuming that people commonly insert the syntax of their own language when using a foreign one, which is probably the case only at a very low level of language ability.

Depends. You’re right on the simplest, most common things like SVO (though I’ve seen pretty fluent people forget to use “do” in questions and negations, or express the past tense twice in negations like “I didn’t drove there”); but the native language does often shine through in rarer, sentence-level problems.

Like RF, my wife is a native English speaker and wound up with Norwegian as the first choice for her first language (followed by Dutch and then English). Her first choice for dialect was US Black Vernacular as her dialect (followed by South African and then US Standard).

My suspicion is that this is an artifact of her Appalachian background.

I’ve certainly heard things like “I’m done my homework” from Canadians. It sounds only the slightest bit odd to my standard American ear, although I would never use it myself. “I’m finished my homework” sounds much stranger.

The test correctly nailed me as a Canadian. I grew up in Southern Ontario, and “I’m finished my homework” sounds fine to me. Although it’s been a while since I’ve said it, I believe it would be my first option.

I tried the test several times, over several days, but didn’t successfully finish it. Each time, it “hung” with the “next” button gr[a/e]yed out, and no poking had any further effect. The Back button took me to the “new tab” page. This was on a Macbook Pro with the latest OSX (Mavericks) installed. I used several browsers, including Safari, Chrome, Firefox, and Opera; all of them hung at some point.

Too bad; I was a bit curious about what it’d tell me. I’ve lived in several parts of the US, including all four corners (WA, AZ, FL, MA), and two middlish states (WI, AR), with lots of education, and several linguistic profs have told me that I’m useless for linguistics research because my dialect is too mixed. Thus, the one about whether a committee was/were divided, both sound correct to me, though I’d probably say “was”. (Unless I’d just been reading something from an English source and the effect hadn’t yet worn off. 😉

“I’m finished with my homework” suggests to me that there may be more homework to do, but I have no intention of doing it.
“I’m finished my homework” means the same as “I’ve finished my homework”, and almost the same as “I’ve my homework finished” or again “I’m after finishing my homework”.

DE: I can’t say I’ve ever correctly identified someone as foreign because their English was too good.

Some years ago when I was still teaching, there was a meeting of chairs of French departments in our (Canadian) province. We all travelled for a meeting at one small university on a Saturday (when there would be no classes). The meeting was conducted entirely in French, spoken in various accents both native and non-native. I was puzzled by the speech of the host, who spoke grammatically perfect French and whose delivery was so precise and phonetically accurate along conservative lines that he could almost have been taken for an actor trained for the classical French repertoire (eg Corneille or Racine in the 17C). Would an actual Frenchman have spoken that way? The giveaway came after several hours, when he spotted a student apparently looking for someone or something in the vicinity of the meeting, and said to him: Monsieur, …. qu’est-ce que tu veux? (Sir, what do you want?). On top of the “too good” French accent, the incongruous combination of the relatively formal Monsieur spoken to a student and the familiar tu form made it impossible to think that the speaker could be from France.

the incongruous combination of the relatively formal Monsieur spoken to a student and the familiar tu form made it impossible to think that the speaker could be from France.

That reminds me of what I was told about the great scholar Rudolf Thurneysen when I was studying Old Irish in Dublin: that he could identify the most obscure deuterotonic form, root hopelessly mashed by vowel reduction, at a glance, but he was shaky on when to use the two different verbs meaning ‘to be’ (is and tá), even though no one who’s studied the modern language for more than a week could possibly confuse them. (This was in the context of urging us to study Modern Irish as well, and it worked.)

Actually, royal princes called each other by the names of the territories they had title to, so Louis XIII would have called his brother first Anjou and later Orléans while when adult, others called him (and referred to him as) Monsieur.

This brother may have been the first to be called officially Monsieur but the term also applied to the next eldest brothers of later kings.

JC: Talking about Monsieur in the context of the royal family would be like saying Sa Majesté when referring to the king. If one of these people is replaced by the normal successor, each title within the hierarchy automatically gets changed accordingly. If there is a possibility of ambiguity, it is always possible to add the person’s name, as Sa Majesté Louis XIV.

During the early years of Louis XIV (1643-1660), his brother was known as le Petit Monsieur and his uncle as le Grand Monsieur. Other than that period, Monsieur was the brother of Henri III, Louis XIII, Louis XIV, Louis XVI (this Monsieur became Louis XVII), and Louis XVII. Louis XV and Louis-Philippe had no living brothers during their reigns.

Oops, I saved too soon. I meant to write “the title of Monsieur was held by the brothers of Henri III” etc., and to add that in modern times, perhaps because of his longevity, Monsieur unqualified generally means the brother of Louis XIII.

JC, There was never officially a Louis XVII: the name was given informally by royalists to the son of Louis XVI, who would normally have succeeded him but died as a child during the Revolution, after the execution of his parents. He did not have a younger brother, he had himself been the younger brother of the elder prince and heir, who had died before the revolution. Louis XVI did have two younger brothers, who went into exile and later became Louis XVIII (not Louis XVII, by respect for the child who would normally have inherited the kingship) and Charles X respectively (since Louis XVIII did not have a male heir).

Louis XVIII did have a living brother, who was his heir and did become king as Charles X. Louis XVIII did not do too bad a job as king, but Charles X was so reactionary and became so unpopular that he was toppled by another (but short-lived) revolution.
