Research into grammar by academics at Northumbria University suggests that a significant proportion of native English speakers are unable to understand some basic sentences.

The findings — which undermine the assumption that all speakers have a core ability to use grammatical cues — could have significant implications for education, communication and linguistic theory.

There hasn't been a lot of press uptake so far: just three articles indexed by Google News as of this moment, with UPI the most prominent, although the press release is a couple of days old. But half a dozen people have sent me email about it, so it's creating some buzz nevertheless.

Actives:
The boy photographed the girl.
The soldier grabbed the sailor.
The man carried the woman.
The girl fed the boy.
The sailor hit the soldier.
The soldier pushed the sailor.

Passives:
The girl was hugged by the boy.
The woman was chased by the man.
The woman was pulled by the man.
The soldier was frightened by the sailor.
The sailor was kicked by the soldier.
The man was kissed by the woman.

Q-is:
Every umbrella is in a stand.
Every feather is in a vase.
Every toothbrush is in a mug.
Every ball is in a box.
Every pencil is in a jar.
Every cake is in a tin.

Q-has:
Every shoe has a hamster in it.
Every bowl has a turtle in it.
Every cone has an ice cream in it.
Every pot has a windmill in it.
Every basket has a dog in it.
Every dish has an orange in it.

On each trial, the subject's task is to say whether the sentence is true of the picture. The pairings of picture-types and sentence-types were semi-randomized, so that two of the same sentence type were never adjacent. Each subject got six instances of each sentence type.
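The ordering constraint described above can be sketched in a few lines of Python. This is only an illustration of one way to get such an order (shuffle, then reject any order with adjacent repeats); the paper does not say how the authors actually did it, and the function name, seed, and attempt limit are assumptions of this sketch.

```python
import random

SENTENCE_TYPES = ["Active", "Passive", "Q-is", "Q-has"]

def semi_random_order(types, per_type=6, seed=None, max_attempts=100_000):
    """Return a shuffled trial list in which no two neighbours share a type.

    Hypothetical helper: plain rejection sampling, not the authors' procedure.
    """
    rng = random.Random(seed)
    trials = [t for t in types for _ in range(per_type)]
    for _ in range(max_attempts):
        rng.shuffle(trials)
        # Accept only if every adjacent pair differs in sentence type.
        if all(a != b for a, b in zip(trials, trials[1:])):
            return list(trials)
    raise RuntimeError("no valid ordering found within max_attempts")

order = semi_random_order(SENTENCE_TYPES, seed=0)
print(order)
```

With four types and six instances each, this yields 24 trials per subject. Only a small share of random shuffles meet the constraint, hence the generous attempt limit.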

The subjects were of two types: "High Academic Attainment" (basically graduate students) and "Low Academic Attainment" (who "had at most 11 years of formal education and were employed as shelf-stackers, packers, assemblers, or clerical workers"). Unsurprisingly, the HAA subjects never made a mistake. You may or may not be surprised to learn that the LAA subjects often got things wrong, and that their error rate varied systematically with the type of sentence:

Condition   Statistic   HAA (N=19)   LAA (N=31)
Active      Mean (SD)   100 (0)      97 (6)
            Median      100          100
            Min-Max     100-100      83-100
Passive     Mean (SD)   100 (0)      88 (18)
            Median      100          100
            Min-Max     100-100      33-100
Q-is        Mean (SD)   100 (0)      78 (24)
            Median      100          83
            Min-Max     100-100      0-100
Q-has       Mean (SD)   100 (0)      43 (30)
            Median      100          33
            Min-Max     100-100      0-100
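A quick way to read the individual scores in the table, assuming (as seems likely, but is my assumption) that they are percent correct out of the six items per condition: with six items, only seven scores are possible, so an 83 would be five of six correct and a 33 would be two of six. The variable names below are illustrative.

```python
ITEMS_PER_CONDITION = 6  # each subject got six instances of each sentence type

# Percent-correct scores available to a single subject, rounded to whole numbers.
possible_scores = [
    round(100 * k / ITEMS_PER_CONDITION) for k in range(ITEMS_PER_CONDITION + 1)
]
print(possible_scores)  # [0, 17, 33, 50, 67, 83, 100]
```

On that reading, the LAA median of 33 for Q-has means that the typical LAA subject got two of the six Q-has items right.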

Here's a bit more detail on the distribution of mistakes across LAA subjects in the passive, Q-is and Q-has cases:

(Note that the bars are plotted just to the left of the corresponding number at the high end of the scale, and just to the right of the corresponding number at the low end of the scale, somewhat confusingly…)

In a second experiment, Street and Dabrowska looked at the effects of systematic instruction in this particular task on subjects' performance.

Overall, I found the experiments fairly convincing. I do worry a bit about what some colleagues and I used to call "the paper airplane effect": At one point we thought we had discovered that a certain fraction of the population is surprisingly deaf to certain fairly easy speech-perception distinctions; the effect, noted in a population of high-school-student subjects, was replicable; but observing one group of subjects more closely, we observed that a similar fraction spent the experiment surreptitiously launching paper airplanes and spitballs at one another. In my opinion, Street & Dabrowska's learning experiment (not discussed here yet) lowers but doesn't eliminate the possibility that the effect is due as much to a difference in diligence and attentiveness as to a difference in linguistic knowledge.

But almost half a century after the work of Peter Wason (see here and here), I don't think anyone should find it shocking that significant numbers of people find it difficult to "understand" some fairly elementary sentences. I don't mean to say that there's nothing new here, just that Dabrowska seems to me to overstate the "consensus" about the distribution of linguistic (and in particular semantic) abilities of certain sorts.

Anyhow, I've got to check out of my hotel and head for the airport, so that's all I have time to say now. I'll take this up again after I get back to Philadelphia.

61 Comments

Stephen Jones said,

Seems to me a question of attention spans, practice in doing pointless exercises, and the totally artificial nature of the experiment.

[(myl) That's basically the "paper airplane" theory; and I agree that it's likely to be part of the story. The question is whether it's the whole story; the authors' learning experiment helps, I think. But mostly, it seems to me that the very-well-documented Wason selection task, where the grad students will mostly get it "wrong" (or "right", depending on the content of the material), shows that some very simple sentences can be very hard for people to interpret "correctly".]

If the experimenter had asked the participants to freely give a sentence describing the picture, and there had turned out to be a large number of inappropriate or incorrect sentences there might be some evidence for the proposition.

[(myl) This is a good point; and I guess if they had found that the LAA subjects had a non-trivial propensity for getting simple passive sentences backwards in their own spontaneous descriptions of pictures, that would be much more convincing and much more surprising.]

Adouma said,

@Tom D
Maybe it's just because they're unlikely to ever be uttered. I can't think of any situation in which I'd say "every cake is in a tin." Maybe "all the cakes are in tins," or "I put all the cakes away." But I can't say why those examples are so unnatural either.

[(myl) As I observe here, a corpus of 400 million words has no instances of the pattern "Every NOUN is in a|an NOUN". So I think your evaluation of unnaturalness is correct.]

Could it be some degree of misunderstanding was introduced by the pictures? To me at least, they aren't clear. The first pair assumes you know how a US sailor dresses, and can deduce that the guy in unsoldierly vivid green is some kind of soldier. The second pair involves some indeterminate woolly animals in what look like red cat's-food bowls. Is the effort to identify them part of the test?

[(myl) Since the experiments were done in the UK, I assume that the "soldiers" are supposed to look like soldiers to Brits. And since everybody got the active sentences pretty much 100% correct, apparently that worked. As for the indeterminate woolly animals in red bowl-like containers (apparently supposed to be baskets), I assume that cross-Atlantic cultural differences are neutral on this one. Could some LAA subjects have been confused about whether the key task was sentence-interpretation or picture-plausibility checking? Maybe so.]

I can imagine a context or two for the Q-is and Q-has, like a hotel's banquet operation, with the supervisor checking on setup.

"Did you set out the linen?"
"Yes, and every napkin is on a plate."

"What about the water goblets?"
"Every goblet has a lemon slice in it."

What I'm thinking of is a kind of detailed confirmation in response to a general question. Is X true? Yes, and what's more, that includes Y and Z as part of X. Not common, necessarily, but not completely arbitrary. Especially if the boss is a micromanager.

That's a soldier? From WWI, maybe. But while I doubt people were really clear on id'ing some of the drawings at that basic level, if I was given that picture with the labels "soldier" and "sailor" I could answer it. I doubt I'd have described it using them. The dogs in baskets(?) look more like hamsters in bowls, but again, I think I'd be focused on the "every X has a Y in it" and just sneer mentally at the drawing.

That doesn't mean that others might say "there are no baskets in this picture". Didn't they discover that many people "incorrectly" pair up spoon with bowl instead of fork? Biases like that might be at play here.

But I think the boring nature of the test is far more likely to be at play here.

I can imagine some people looking at one of the second pair of images and deciding that "every basket has a dog in it" is false, because the containers don't look like baskets and the contents do not look unambiguously like dogs. More generally, the two groups of participants probably bring different expectations about testing situations to the task. The graduate students would be very good at guessing what the experimenter intends them to attend to.

DW said,

"But I think the boring nature of the test is far more likely to be at play here."

This. This and the fact that test-taking norms are cultural. The successful test-taker must be willing to "mentally sneer," as another poster put it, at dogs in baskets that look more like hamsters in bowls yet obediently answer the question straightforwardly, and proceed through several pages or hours' worth of similar inanities while suppressing the urge to defenestrate. Not everyone is willing to conform to this test-taking norm, or has any incentive to do so. The subjects who are "high achievers" got that way by proving not only their ability to answer the questions but also their willingness to conform to the test-taking norm, despite the fact that taking such a test is like experiencing a brief period of psychosis.

Bill Walderman said,

Were the sentences spoken to the subjects or were the subjects required to respond after reading the sentences? If the sentences had to be read, could poor reading ability have played a role in the LAA group's difficulty with some of the sentences? Also, could dialect or register differences have played a role, i.e., maybe the syntax of some of the sentences was different from the syntax that members of the LAA group use in their daily speech.

Mr Fnortner said,

I recognize that my comment is not academically rigorous, yet we all have had experiences that reveal another's incapacity to understand, or follow instructions. Having been disappointed by trades people, mechanics, co-workers, bosses(!), and launderers, for instance, we are probably all convinced that more than a few people cannot thoroughly comprehend their native tongue.

Whether the fault generally lies with flaws in the speaker's use of language, or in the listener's incapacity to understand, could be a research topic. The concern, shared by many here, that the illustrations, sentences, and process are off-putting and an important source of error should be examined. Also, it would be nice to know in what proportion the same difficulty in understanding would be found in those with prestigious jobs, and to what extent those with menial jobs have sound language skills.

a said,

I've believed for years that difficulty with passives has been responsible for some students' poor performance on multiple-choice exams. There is that "paper airplane effect", of course, and I've especially noticed the problem with non-native speakers of English, but I've also definitely seen motivated, native-speaker students with reasonable command of the material stumble on exam questions purely because of the syntax of the answer choices. Students do also get these backwards in their own writing sometimes: "precedes" and "is preceded by" are pretty much interchangeable for many undergraduates writing answers to phonology questions (things may be a bit better for "follows" and "is followed by", but the problem still comes up).

Mr Punch said,

Dabrowska claims that low academic achievers can't understand some simple sentences, on the basis of their relatively poor performance on a test. Since low academic achievers tend to perform less well on tests (and more so in Britain, I believe, than in the US), however, the methodology seems to introduce a systematic bias in the direction of the claimed result. This is, I think, something more than the "paper airplane effect."

michael farris said,

Slightly off-topic but not really. IME people don't really respond to negative instructions very well. That is, if you give people instructions (verbally or in print) on what they're NOT supposed to do, a non-trivial percentage will go ahead and do just exactly that (because they honestly interpret the instructions to be what they're supposed to do).

So remember never use negatives in telling people what you want them to do.

mausi said,

So subjects that didn't go through academic conditioning were too smart to take those stupid tests seriously. Maybe there should have been a reward for "getting it right" (correctly anticipating the expected answers)?

chris said,

Stephen Jones said,

But mostly, it seems to me that the very-well-documented Wason selection task, where the grad students will mostly get it "wrong" (or "right", depending on the content of the material), shows that some very simple sentences can be very hard for people to interpret "correctly".

But isn't the point about the Wason task that doing it successfully requires abstraction?

Every basket has a dog in it.

This sentence will confuse because what stands out in the picture is the dog without the basket, not the three dogs in the basket.

In the cases of the actives and passives the point is that the first two pictures don't provide us with a theme; if you had a prior picture of the person hit walking down the street then I suspect the scores would have been higher.

Beth said,

Even if the results were due to lack of attention (the paper-airplane effect), what of it? This just demonstrates (unscientifically) that lower-educated people are less likely to pay attention, even during important moments such as scientific studies. So if the result is due to inattention, we might expect that result to manifest in various other situations (work, personal transactions) where the person is similarly inattentive.

What confused me was the fact that "Participants were tested individually in a similar setting at the place where they worked or studied" (p 10). I find this odd and not in keeping with standard psycholinguistics methodology. This may also have contributed to the paper airplane effect.

Daniel Johnson said,

Both the "Q-is" and "Q-has" are classroom-standard ways of expressing mathematical/logical statements about subset relationships, but are rather artificial outside of that kind of technical use. I would think something like "There are enough beds for all the little dogs" or "Each bed has a little dog in it" would be a more common way of expressing the same notion in normal discourse.

Diane said,

Seems to me like an effective way to rule out the "paper airplane effect" is to provide an incentive to perform well. Offering a twenty-dollar-bill for every correct answer on a 10-question test would surely motivate them to actually put in the required two minutes of effort.

[(myl) Yes, incentives are well known to improve performance in such experiments. The students, of course, have 17 years of incentives still actively influencing their behavior, while the "LAA" subjects are some years past this time of their lives, and presumably were never as deeply incentivized for test-taking in the first place.]

I wonder if the paper-airplane effect can be reduced by offering the participants a slight financial incentive (say, you tell them they'll get $1 per correct answer)?

Anyways, for this particular experiment, it seems hard for me to believe that the results are nothing more than the paper-airplane effect: if so, how do you account for the fact that both of the groups got virtually perfect scores on the "control questions" (the active sentences)? I guess it could be that the passive and quantifier sentences require more attention for most people to parse, and once a sentence reached this level of difficulty, a significant portion of the test-takers started to zone out. But then there would still be a couple of interesting (IMHO) things going on here:

1. Even the simplest (syntactically-speaking) passive or quantified sentences require a significant and measurably-greater amount of X for many adults to parse, where X is some combination of effort, motivation, understanding, preparation, or some unknown factor(s);

2. Being able to parse passive sentences (at least, in this kind of testing environment…) seems to correlate with educational attainment.

The Street-Dabrowska experiments are more surprising to me than the Wason selection test because I would have expected that the passive and "Q-is" sentences are "simple enough" that *everyone* would get near-perfect scores on them. It's unsurprising that if you gave people, say, passages from Edward Gibbon, then people who had devoted more of their lives to academic pursuits would have an easier time with them; but the fact that the passives and quantifiers in the sentences of the experiment were tripping people up surprised me.

[(myl) But the Wason sentences are simple "if A then B" constructions — and the commonest form of error is to generalize them to "A if and only if B", or perhaps to invert the implication entirely. Similarly, I expect that subjects' commonest mistake on sentences of the form "Every X is in a Y" is to generalize the mapping to a one-to-one correspondence between X's and Y's, or perhaps to check the mapping backwards from Y's to X's.]
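The readings contrasted in the reply above can be written out in first-order notation. This is a sketch only: the predicate names X, Y and In are placeholders of mine, not notation from the paper, and the one-to-one reading is given informally rather than fully formalized.

```latex
\begin{align*}
\text{Wason rule as stated:} \quad & A \rightarrow B \\
\text{Over-generalized reading:} \quad & A \leftrightarrow B \\
\text{Inverted reading:} \quad & B \rightarrow A \\[6pt]
\text{``Every X is in a Y'':} \quad
  & \forall x\,\bigl(X(x) \rightarrow \exists y\,(Y(y) \wedge \mathrm{In}(x,y))\bigr) \\
\text{Checked backwards:} \quad
  & \forall y\,\bigl(Y(y) \rightarrow \exists x\,(X(x) \wedge \mathrm{In}(x,y))\bigr) \\
\text{One-to-one reading:} \quad
  & \text{both of the above, with exactly one } x \text{ per } y \text{ and one } y \text{ per } x
\end{align*}
```

The stated form and the backwards form are logically independent, so a picture with a spare X or a spare Y makes one true and the other false.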

Terry Collmann said,

Le Mur said,

DW: Not everyone is willing to conform to this test-taking norm, or has any incentive to do so.

I had two first thoughts about it: that the LAA group is more likely to goof-off and/or 'prank' the testers*, and that, on the other hand, some people are stupid. But since "their error rate varied systematically with the type of sentence," I'll go with the latter.

[(myl) The effect of lower attention and motivation in such experiments will interact with test-item characteristics. Think about doing timed arithmetic: you're not likely to get 2×2 wrong even if you're not paying much attention, but 6×13 might be a different story.]

As for the sailor and soldier, the same two guys are on the left and right of both pictures, so I can't tell soldier from sailor since both dress as the other.

Richard said,

My first question was the same as @Bill Walderman's: were these sentences put up as text or read to the participant? I'd bet you'd get a stronger paper airplane effect from people who don't read much to start with.

[(myl) The sentences were read out loud to the subjects (or perhaps played from a recording, it's not clear).]

Ran Ari-Gur said,

@Beth: Even if a paper-airplane effect is interesting and meaningful, I would be very disappointed in a scientist who didn't care about the underlying causes of observed effects. You speculate that anything causing these observed effects would have identical effects in other circumstances; but you give no evidence for that speculation, and I doubt it greatly.

This experiment reminds me a bit of Luria's investigations into Russian peasants' understanding (-slash- willingness to accept) formal syllogisms and other artificial logical constructs. IIRC, he found that the peasants with even a little bit of formal education were "better" at applying the reasoning that he supplied (rather than reaching their own conclusions in different ways).

[(myl) The connection to Luria is worth following up — some discussion and links are here.]

I haven't yet read the article, so I don't know if they make this point, but: the responses on the Q-is and Q-has sound like they're similar to children's responses (thinking of the variety of experiments on children's comprehension of quantifiers; references available on request). And if I recall correctly, though it's been a little while since I've taught a class with these papers on the syllabus, there was a little debate as to whether children were failing to understand the quantifiers, or if they were reacting to some other pragmatic issue.

Which brings it to Dave Ferguson's comment above: if a hotel worker reports to their boss "Every napkin is on a plate", what they're really reporting, pragmatically, is a one-to-one correspondence between napkins and plates; they'd be guilty of some sort of Gricean violation if they only had eight napkins and forty plates, but nevertheless put each napkin on a plate and left the remaining 32 plates napkinless. So the HAA subjects might have gotten things Technically Correct, while the LAA subjects were getting things pragmatically correct.

[(myl) This is an excellent point, and surely is a key part of the explanation. Note that it's related to Wason's result, in which many people act as if "if" meant "iff". It doesn't explain the difference between Q-is and Q-has, though.]

If that's what's going on, is there a publishable result? Oh, certainly. But when a press release says that some English speakers don't "understand some basic sentences", we might want to consider whether it's semantic parsing or pragmatic parsing that counts as "understanding" a sentence, and which group it is that failed to understand. (If I say "Speaking as a linguist, these results are intriguing", and a LAA subject says "Really? Why do you think that?" while a HAA subject says "Really? These results are speaking as a linguist? I don't understand"–well, then I know who parsed the sentence "grammatically correctly", and I also know who understood it.)

Directly relevant: See the remarkable book Gleitman, Lila R., and Gleitman, Henry. 1970. Phrase and Paraphrase: Some Innovative Uses of Language. New York: W. W. Norton. There is a good review of it by Terry Langendoen here: dingo.sbs.arizona.edu/~langendoen/ReviewOfGleitman.pdf. I suspect the only reason the book didn't become a standard in linguistics curricula was that the conclusions are evidently "politically incorrect", just like the conclusions of the Street and Dabrowska experiments. I think the Gleitman and Gleitman results are more convincing than S&D's, from the little I've seen of the latter. But Langendoen argues that even they don't really refute the Chomskian dogma of universally equal competence. (One reasonable point Langendoen makes is about the need to distinguish 'language facility' from 'language competence'.) (He has high praise for the book even though he doesn't agree with the conclusions.)
The experiments in the book concern people's ability to understand and paraphrase novel 3-word compounds built from different word orders and different stress patterns of various 3-word sets. For instance, from 'black', 'house', 'bird', one can build both 'easy' and 'fully well-formed' compounds like 'black house-bird' (like Langendoen, I'm using hyphens to show compounded parts, which is enough to determine what the stress pattern will be), and 'difficult', 'semantically implausible', grammatically questionable compounds like 'house-bird black'. College students and up agreed about what were possible paraphrases — e.g. for 'house-bird black', it could be the shade of black that house-birds are, or stuff you use to blacken house-birds; there are open-endedly many possibilities, but it canNOT mean 'a black bird that lives in the/a house'. In the experiment, graduate students as a group scored much higher than secretaries who had not completed college. And to make sure it was not just failure of imagination, they repeated the experiment in multiple-choice form. And lo and behold, grad students picked grad-student paraphrases and secretaries picked secretary paraphrases.
Even I am uncomfortable reporting this published work, even in this forum, because I think our department's secretaries are fantastic, and I would not like them to think I have some anti-secretary beliefs as a result of the study. (I'm sure the same is true of the Gleitmans; they were brave.) I have no idea how they would respond to 'house-bird black', but I'd sooner rethink the role of pragmatics in processing than think any the less of our secretaries; but it's hard to discuss this work without it seeming to be about who's smarter.
The Gleitmans note that when the secretaries produced 'wrong' answers, there was a strong tendency to 'err' in the direction of producing meanings that were more plausible (as in the case of my example above with 'black bird who lives in a house'). This seems related to the notion of charitable interpretation. So maybe secretaries are more inclined to try to make sense of what is said to them and academics are more inclined to be pedantically literal. I suspect it's good to have plenty of each in the world.

p.s. Evidently Lance and I were composing at the same time. We both end up making similar points, namely about the need to pay more attention to the possibly different priorities given to pragmatic appropriateness and grammatical correctness among people whose internal grammatical competence may be identical.

Jerry Friedman said,

@Barbara Partee: That's very interesting! (And not just because I've commented on this blog that I think some people ignore a great deal of syntax in understanding language.)

In the experiment, graduate students as a group scored much higher than secretaries who had not completed college.

it's hard to discuss this work without it seeming to be about who's smarter.

If the Gleitmans had wanted to avoid discussions of who's smarter, they could have rated responses as "pragmatic" versus "syntactic" or something, instead of scoring "low" and "high".

It would have been interesting to present the questions in a context where the pragmatically likely answer was better; toddlers' utterances, for instance. "This little girl said 'house-bird black'. What do you think she meant?" I suspect many of the grad students would have given the same answers as the secretaries, which means they'd be as good or better at both comprehension tasks (sorry), and a few would have had trouble applying pragmatic knowledge.

Matt said,

"Does that mean you take pride in the extraordinary achievement of the advertisers and deodorant-company executives and investors who gain high incomes by fooling Joe Schmoe?"

My point is, the term "income inequality" is just as manipulative as Madison Avenue dopy. But at least if one points out to Joe Schmoe about the deodorant ads, he's glad to be undeceived. You, on the other hand, prefer to stay manipulated by PC jargon.

People who invest in, manufacture or sell anything (including stuff that keeps people from stinking up the joint) is what it's all about. If you hate them, you hate a hell of a lot of people, from Steve Jobs to the guy with a hotdog stand. These people pay taxes which go to academics so they can tell us all about "income inequality."

Doug said,

"This sentence ['Every basket has a dog in it.'] will confuse because what stands out in the picture is the dog without the basket, not the three dogs in the basket."

Is there a way to make it non-confusing without spoiling the point of the experiment? The idea is to check whether the respondents will affirm that sentence in cases where "Every basket has a dog in it" is true but the sentence "Every dog is in a basket" is false. If you just erase the extra dog and have a one-to-one correspondence, there's no confusion but no experiment either, since people with the wrong interpretation will give the right answer.
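The design point here can be checked mechanically. In the sketch below, the scene (four dogs, three baskets, one dog left out) is reconstructed from the commenters' descriptions of the picture; the data layout and function names are hypothetical.

```python
def every_dog_is_in_a_basket(dogs, location):
    """True if each dog has been assigned some basket."""
    return all(location.get(dog) is not None for dog in dogs)

def every_basket_has_a_dog(baskets, location):
    """True if each basket holds at least one dog."""
    occupied = set(location.values())
    return all(basket in occupied for basket in baskets)

baskets = ["basket1", "basket2", "basket3"]

# Test picture: three dogs in baskets plus one dog standing outside.
dogs = ["dog1", "dog2", "dog3", "dog4"]
location = {"dog1": "basket1", "dog2": "basket2", "dog3": "basket3"}
test_picture = (
    every_basket_has_a_dog(baskets, location),
    every_dog_is_in_a_basket(dogs, location),
)

# One-to-one picture: erase the extra dog.
paired_dogs = dogs[:3]
paired_picture = (
    every_basket_has_a_dog(baskets, location),
    every_dog_is_in_a_basket(paired_dogs, location),
)

print(test_picture)    # (True, False): the two sentences come apart
print(paired_picture)  # (True, True): both readings give the same answer
```

Only the first picture distinguishes a subject who checks the mapping from baskets to dogs from one who checks it from dogs to baskets, which is why the extra dog cannot simply be erased.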

Lila Gleitman said,

So, here is a first response re the ancient Gleitman finding of massive population differences in laboratory paraphrasing "tests," studies we conducted more than 40 years ago when *psycholinguistics* was Brand New (Barbara, thanks for remembering that old monograph). Our conclusion from this work actually was not (as most of our readers concluded, perhaps because of our lumpish prose but perhaps because the differences in subjects' response style are really striking at first glance) that there are huge grammar/morphology differences among the normal speaker-listener population, but rather that factors other than grammatical knowledge play a much larger role in linguistic performance than was acknowledged at that early date, e.g., plausibility effects of the kinds that we suggested, and that commentators in the present dialogue have been suggesting as well. Such influences are very well known today; no one would find them surprising at all. A particularly important effect (as at least one commentator above suggested) is what's called today "cognitive control": in the present context, the ability or inclination to inhibit a prepotent response (people like Trueswell and Thompson-Schill have had much to say on this topic, really interesting). No place here for a long discussion but suffice to say that the most educated group in our old experiments delayed longer in answering when the structure was difficult than when it was transparent, whereas the less educated group responded more quickly overall ("impulsivity"?) and answered hard questions as fast as they did easy ones. So it's pretty easy to conclude, as we did, that such tests don't test "competence" or "knowledge of grammar" equally across educational populations because they are differentially subject to other influences. BTW simply selecting the groups as we did in The Bad Old Days is something I would never never do today, of course, I blush in retrospect.
BTW also the study being cited here as the source of the present blogging is so amateurishly designed and conducted that nobody should fret about it, I think, or try to disentangle its facts and artifacts.

the other Mark P said,

Research into grammar by academics at Northumbria University suggests that a significant proportion of native English speakers are unable to understand some basic sentences.

There is an assumption here – that the questions causing the trouble are "basic".

I can assert quite happily that "a significant proportion of people are unable to understand some basic graphs" because to me a log graph is not complicated. That might be because I'm a Maths teacher, rather than because log graphs are basic.

So perhaps all our valiant researchers have shown is that they are appallingly bad at recognising what are basic sentences!

To me the Q-is and Q-has sentences read like logic puzzles, not like simple English. I would not say "Every shoe has a hamster in it." but would say instead "There is a hamster in every shoe."

Write the sentences in what is actually the "basic" form and they are much easier to understand.

Jerry Friedman said,

@Matt: I sympathize with your frustration, but this is not the place to debate the issues you've brought up. Let me just assure you that I don't by any means hate all rich people, still less owners of hot dog stands.

Stephen Jones said,

Is there a way to make it non-confusing without spoiling the point of the experiment?

Depends what the point of the experiment is.

If the purpose of the experiment is to show that when there is a difference between the formal interpretation of a sentence and what you would expect pragmatically then lower academic achievers tend to plump for the latter, then the experiment is just fine as it is.

Stephen Jones said,

For the dogs in the basket one, change it for a set of hotel rooms and people. Then the sentence 'Every room has a guest in it' will not be dissonant with the problem that there is no room to place the spare guest in.

elinar said,

Maybe Dabrowska overstates the consensus on uniform linguistic competence. But it is nevertheless a common (and often unquestioned) assumption that all native speakers converge on the same grammar and any putative differences are merely the effect of extraneous influences, and nothing to do with REAL competence.

And maybe the study is methodologically flawed, but I still find their results interesting and their line of research worth pursuing.

Incidentally, Ngoni Chipere has carried out similar experiments with similar results. (See e.g. his article “Real language users”.) He also found that on some tasks, non-native graduates performed better than both native graduates and non-graduates, presumably because they had been taught grammar explicitly.

Of course, most of the constructions used in Chipere’s experiments are highly complex and completely unnatural – i.e. definitely not basic sentence types. I’m talking about such sentences as “The doctor knows that the fact that taking care of himself is essential surprises Tom”, which no one probably ever utters outside a syntax class.

The point is, though, that the data used by some linguists as evidence for innate universal grammar (REAL competence) consist of these kinds of complex (written language) structures that are only comprehensible to highly educated, linguistically aware people.

I find that somewhat ironic.

Frank Newton said,

So much to talk about — bad illustrations, philosophy of education, failures of communication, quantifiers.
1. Kapitano's right on about the campiness of the soldier-sailor pictures. The campiness tells me that the illustrator was either a woman or a child — someone who didn't have a first-hand perspective on keeping this stuff above suspicion. But as a librarian, I recently had the task of removing some 20- and 30-year old elementary education books from the library collection, and I have to say I think our elementary school children are (or used to be) exposed to a lot of bad drawing in the course of their schooling. These pictures aren't much worse than those in a lot of textbooks.
2. Tolerance of boredom and tolerance of nonsense are two predictors of success in education. I think tolerance of boredom is also a useful trait in adult life. Life throws boring tasks at everybody. As for nonsense, I divide nonsense into two categories — nonsense aimed at getting me to part with some of my money, and nonsense with no ulterior motive. I despise nonsense of the first kind, but I like some nonsense with no ulterior motive. But the higher mathematics was a very difficult kind of nonsense for me.
3. People who are sour on educational testing speak of the sense of futility that overwhelms them when trying to do tests. I can empathize with that, because I had exactly the same feeling when doing pushups and pullups. Why are we doing this? It's a very vivid memory.
4. The proposition that some people have stronger language capacity than others seems like a natural to me. The Reader's Digest quotations from what people write or say when they have to describe their own automobile accidents provide strong evidence for huge differences in different people's ability to prevent ambiguity from creeping into a story. That is the point of the saying from the Civil Rights era, "Tell it like it is!" People praise a good description of what happened or the way things are, because they know a good description is harder than it looks. Given that putting into words what happened is difficult for many people, I'm not surprised if there are also differences in people's ability to grasp the logic of grammar. Alone among field linguists, Leonard Bloomfield described differences in individual Wisconsin Menominees' ability to use the communicative and artistic resources of the Menominee language. His lack of respect for people not very good at using their native language is not something to imitate, but his honesty on the subject is refreshing (I have lost the reference to the article).
5. I disagree with the comparison drawn between the soldier-sailor sentences and the Wason selection task sentences. Consider these two sentences:
(a) The soldier was hit by the sailor.
(b) Which card(s) should you turn over in order to test the truth of the proposition that if a card shows an even number on one face, then its opposite face is red? [from the Wikipedia article "Wason selection task" cited by Liberman].
If a journalist wants to say that people who don't understand (a) are "unable to understand some basic sentences", I'm not going to complain too much. But sentence (b) is NOT a "basic sentence"!!!
6. I agree with Tom D., Adouma, and Daniel Johnson that the quantifier sentences are unidiomatic. The study of quantifier meaning is one of the refuges of prescriptive linguistics. I believe English speakers (at least those speaking American English) have a powerful preference for "all" over "each" and "every." The fact that "each" and "every" are often more precise than "all" simply isn't relevant from the point of view of descriptive linguistics. It's true that "Each dog is in a basket" rules out the possibility that one dog is sprawled across two baskets, while "All the dogs are in baskets" does not rule out that possibility. But if we are describing spoken English, we'd do better to report that people usually don't rule out the possibility that one dog is sprawled across two baskets.

Barney said,

Seems to me like an effective way to rule out the "paper airplane effect" is to provide an incentive to perform well. Offering a twenty-dollar-bill for every correct answer on a 10-question test would surely motivate them to actually put in the required two minutes of effort.

I wonder if the paper-airplane effect can be reduced by offering the participants a slight financial incentive (say, you tell them they'll get $1 per correct answer)?

I think these ideas would also be problematic, especially the first one, and particularly for people who don't have much money, as the prospect of winning or failing to win money might actually make it harder to interpret the sentences, especially with a significant sum like $200. With a low incentive like $1 per correct answer I wonder if people might feel more insulted by the low pay than by the idea of doing the test for no pay at all.

Dan Pink spends a lot of time talking about experiments that look at the effect of these extrinsic motivators.

Doug said,

This discussion reminds me of the earlier Language Log thread about the fact that some people apparently responded differently when asked their opinion about allowing "gay men and lesbians" to serve openly in the military as opposed to allowing "homosexuals" to do so. There was much puzzlement at this, but apparently the pollsters who uncovered this effect did not then track down any of these people and ask them what their reasoning was.

Similarly, here with the "every" sentences, someone really ought to find some respondents who gave incorrect answers and ask them to have another look, and explain why they gave those answers. That way we could see if they really have different interpretations of "every" sentences than we do, or if they were just bored, inattentive, clowning around, etc.

The experimenters do not seem to have done this [unless I missed it in my skimming of the paper], but they almost did so. There is this interesting bit from the "more individual differences" paper, in the section discussing the effect of training:

"Indeed several participants reported a ‘eureka’ experience as soon as the particular construction was explained during the ‘grammar lesson’. These participants claimed that whereas in the pre-test they had simply guessed, they now knew what the correct answer was – and their performance corroborates this."

To me, this is the most convincing part of the whole thing. We have people who grew up speaking English who are telling us straight out that they could not reliably interpret these "every" sentences until they were explicitly explained in a "grammar lesson."

I find it astounding that English speakers could grow up without learning this, but there it is, so now I believe that such people really do exist, surprising (and disturbing) though that is.

Stephen Jones said,

To me, this is the most convincing part of the whole thing. We have people who grew up speaking English who are telling us straight out that they could not reliably interpret these "every" sentences until they were explicitly explained in a "grammar lesson."

Not until they were explicitly explained but until their attention was drawn to them. I suspect all participants were perfectly capable of producing and understanding the particular constructions when appropriate. But in this case the sentences were often not appropriate; that is to say they jarred with a natural explanation of what was happening.

DW said,

Aaron Davies said,

How do you avoid having someone dig up a technically feasible but seriously far-fetched parse? I would have no trouble allowing "house-bird black" as "a black bird which lives in a house" if the context were poetry.

Tyr said,

I have to note this is Northumbria University.
The odds are her uneducated lot were recruited from the area.
The postgrads though- the odds are they are from all over the country.
Geordies speak very differently to standard English and uneducated people working menial jobs tend not to be able to code-switch. In Geordie one would virtually never say "the girl was hugged by the boy", it would always just be "d'lad hugged d'lass".
These sentences…really read as rather alien to me. I understand them of course, they're correct English. But rather odd and not typical of everyday usage at all.

I happen to know quite a bit about this study both through having read it (which some of the commenters haven't) and through knowing the researchers involved, and also because I've been fortunate enough to be involved in making a programme that one of the researchers appeared on.

1) If you read the study you'll see that the participants were fine on a variety of other tests – and some of the low education participants were also fine on the hard grammatical sentences. So it can't just be test taking ability.

2) If they were poor at taking tests, then perhaps more test-taking would help them improve, but not a really quick training session. In fact it's the opposite – training of less than 5 minutes helped them do better at the construction they were trained on, but not at the other construction – and more test-taking practice didn't help them do better on the kind of sentence they weren't trained on. So it's VERY unlikely to be due to test-taking experience or attention.

3) Again if you read the study, participants weren't from Newcastle, but this seems irrelevant at best, and rather prejudiced at worst. Lots of people with strong Geordie accents go to university and get postgrad degrees and learn to read complex scientific material that contains lots of passives. Most people who speak most accents of English DO rarely use reversible passives in regular speech. The point is that a lot of people who speak English do understand them, even so – they must have got them from somewhere else than "English as she is spoke" in the streets of their native city.
What it *seems* like is that because for most sentences it's really easy to work out the meaning from context, people are doing that – and it's only where you can't do that they are tripping up. So this applies to the quantifiers ("every") where there's a straightforward meaning that most people can use in most circumstances – and if this clashes with the generally accepted meaning then if you've never thought about it or had it explained or had any situation where the one-to-one meaning clashes with what you see and hear, you've never needed to rethink your interpretation.

4) If you are a linguist trying to get this kind of thing published – woe betide you if you underestimate the consensus among linguists on early/innate universal grammar.

5) I doubt James Street, who is fairly multi-talented and also drew the pictures, would be too pleased to be categorised as "either a woman or a child".

6) We don't yet know how common it is for this type of participant to produce this type of sentence. But if you assess production you can't tell if people know sentences for sure – because you can avoid them if you can't work out how to use them.