The trouble is, as you can see, it says Origin of the species. This is a reasonably common mistake: the content words are present and the 'little words' get mixed up or left out. Doesn't matter? Well, maybe not, but maybe it does.

The way Darwin wrote it, it's clear that it refers to the process by which there are lots of different species (natural selection). The way it's written on this cover, it might do, because the plural of species is species. But it easily might not, and I suspect that it doesn't in the minds of the people who make this mistake. After all, one of the major ideas in this book is the idea that humans are descended from apes, so it's natural for people (self-centred as we are) to think of it as being about the origins of the human species.

Does it matter? Well, I think so, because humans are not the most important thing in the world, no matter what we may think. To us, though, we are. It would just be nice if people designing (quite expensive) products to sell could take two minutes to make them accurate.

Friday, 22 November 2013

A lovely instance of involuntary code-switching due to L1 (first language) influence on this LanguageLog post:

Biagio intends to write in English, because LanguageLog is an English-Language-medium blog. They use the Italian conjunction e 'and', presumably because it occurs between two Italian words and the instinct was just too strong to overcome the (conscious?) act of writing in English.

I've been wearing them for some time, and refer to them as my 'fox gloves'. I shift the primary stress to the first syllable (FOXgloves), indicating that I consider this to be a compound. During the time I've owned these gloves I've said the word 'foxglove', referring to the flower. And yet I've never, not once, realised that it would be hilariously witty to refer to them as foxgloves until this morning, when I wasn't even wearing them (it's proper winter glove weather now).

This is testament to the power of our language faculty to keep homophones apart. Puns wouldn't work, for instance, if we were constantly aware of similar-sounding strings. There's a joke which goes like this:

Two cats, one called OneTwoThree and one called UnDeuxTrois, were having a swimming race. Why did OneTwoThree win?

Because UnDeuxTrois cat sank!

This joke works because UnDeuxTrois cat sank is exactly homophonous with un deux trois quatre cinq (the numbers from one to five in French) for many English speakers. It helps that the 8-year-olds telling this joke have learnt this set of numbers pretty much as a 'chunk' or formulaic utterance. So our brain is temporarily confused by the words, finds the humour and then has a good old chuckle.

Not all puns are exact homophones, and one of my favourite jokes is this one:

Why are there no aspirins in the jungle?

The parrots eat 'em all!

This pun relies on parrots eat 'em all sounding like paracetamol, but in fact I pronounce paracetamol the other way, with an e as in bed (something like /ˌpæɹəˈsɛtəmɒl/ for the linguists), and my all is not like the ol syllable. Nevertheless, there are many puns that work because a string of sounds is precisely identical with two different meanings, and yet our brains don't ever confuse them until we are made to by the complicated joke set-up. Similarly, we don't ever seem to get homophonous words mixed up (pen, bank etc., where the words have two or more totally separate meanings). We even manage to think of different lexical categories from the same root as different (analyses is one I use in teaching: it can be the plural of the noun analysis or the 3rd person singular present tense form of the verb analyse). How we store and retrieve these is a question I'm going to let the brain scientists work on.

Monday, 4 November 2013

In the book '48 hours' by J Jackson Bentley (a fairly crappy crime thriller), forensic corpus linguistics turned up. In the book, a woman is kidnapped and her kidnappers make her send a videoed message to her boyfriend. She's a smart cookie so she gives lots of coded information about where she's being held in the message, including the word 'print'. The police try to find out what she might mean by this. They do that by running the word through a database to see what words 'print' occurs with most often.

“Luke again,” the speaker chirped. “The computer is showing that the word ‘print’ can be associated with the word ‘press’ in the next sentence, as in ‘printing press’. This could be code for Dee telling us that the industrial unit houses a printing press.”

There's a good deal of suspension of disbelief required to get through this book, but this is a real thing. It's called 'collocation': when a word tends to co-occur with another one with greater than chance frequency.

If you look at a corpus (collection of texts) like the British National Corpus, you can very easily make it tell you this stuff (I'm no corpus linguist and even I can do it). Here's a screenshot of what happens if you look for the words that most frequently occur immediately after the word 'print' or one of its derivatives (such as 'printing'). The photo's a bit small but 'press' is there on the list, in eighth most common position (if I've worked the search terms right), after things like 'characters', 'material' and so on. I suppose, if you were looking for a clue to a place, 'press' would be the first one to give you anything to go on.
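For the curious, the basic idea of that search can be sketched in a few lines of Python. This is only a rough illustration, not how the British National Corpus interface actually works: the function name `following_words` and the four-sentence toy corpus are made up for the example, and it only counts raw frequency. A proper collocation measure would also compare those counts with how often the words would turn up together by chance.

```python
import re
from collections import Counter

def following_words(text, stem):
    """Count the words that come immediately after any word starting with `stem`."""
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    # Walk through each pair of neighbouring words.
    for current, nxt in zip(tokens, tokens[1:]):
        if current.startswith(stem):
            counts[nxt] += 1
    return counts

# Invented toy "corpus" for illustration only; this is not real BNC data.
toy_corpus = (
    "The printing press changed everything. "
    "She sent the printed material to the office. "
    "The old printing press still works. "
    "He could not read the printed characters."
)

collocates = following_words(toy_corpus, "print")
# Show the most frequent right-hand neighbours of 'print', 'printing', 'printed' ...
print(collocates.most_common(3))
```

Matching on the start of the word is a crude stand-in for 'one of its derivatives', and the sketch ignores sentence boundaries, so a real search tool would need to be a good deal more careful.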

You can click on the words and find out what the context is, just in case there are some false results or you want more detail or whatever, and you get this:

That shows you the type of writing it was found in, and gives you the bit of sentence either side so that you can understand the phrase in context. I'm not really au fait with police techniques, but I wouldn't be at all surprised if they do use this kind of method when it's appropriate. There are such people as forensic linguists who work closely with the police, whose job is often to determine if a particular person is the author of a document.

Spoiler alert:

So they located a nearby printing press and, after much showdown, rescued 'the girls', as the two adult hostages were patronisingly referred to throughout.

Friday, 1 November 2013

Linguists have to spend quite a lot of time explaining that they don't correct people's grammar, and trying to prevent people from judging each other based on the language they use, and trying to explain that one standard language is not inherently better than a non-standard version or a different standard version.

But you know, we're people too, and we have human opinions. This was an exchange between two linguists:

As you can see, both the second linguist and the one mentioned in the first tweet have negative opinions of their own regional dialects. I also know another linguist who said that she didn't think she'd be taken seriously as an academic if she used the accent she grew up with.

It seems that we need to distinguish between disliking an accent and thinking that it's wrong or worse than the standard. We know that all dialects (and indeed languages) are equally valid, equally correct and equally suitable for use. This is where we differ from many non-linguists, who often think that a person speaks in a non-standard way because they're lazy or stupid. We also know that objectively, accents don't sound stupid or unfriendly or untrustworthy: those are values projected onto speakers of that dialect by the listener. We're like non-linguists, though, in thinking that some accents are simply not as suited to our own preference. And we also know how much people judge you based solely on the way you speak - more reason than any other to moderate your accent if you think people might regard it unfavourably.