June 18, 2004

Predicting random eggcorns

Morrissey drops an eggcorn on his new CD (as good as "Vauxhall and I",
not quite as good as "Your Arsenal"), in the song "I Like You":
"Something in you caused me to / take a new tact with you" (that is,
"tact" in place of "tack").

Some other common examples recently sent in by readers include "slight
of hand" for "sleight of hand", and "for all intensive purposes"
in place of "for all intents and purposes".

As usual, the phonetic difference between the original and the eggcorn ranges
from nil ("slight of hand") to small ("for all intensive purposes").

I've pointed out in the past that one can use web search to compare rates of
eggcorn usage in different contexts, for instance
in news vs. the web at large. I'd like to reiterate here that web search makes
it possible to predict eggcorns and investigate their occurrence experimentally
-- and even to estimate their rates of occurrence.

Thus I can open a magazine on my desk to a random page, and pick a random phrase
-- here is "marginal cost" -- and predict a likely eggcorn -- say,
"margin of cost" -- and check the web to find it!

(link)
What IT potentially offers, he says, is economies of scale, the possibility
of enlarging the scope of educational activities at a relatively low margin
of cost, and mass student customization of education.
(link)
Leaving the law on one side and turning to economics, it is a well established
principle that if one wants to maximise one's returns, one carries out an
activity until the margin of cost is equal to the margin of revenue.
(link)
Realize, of course,
that if the report costs more to compile the margin of cost to the report
to one more copy would go down, making one more copy even cheaper in comparison.

The raw rates of occurrence (195,000 whG, i.e. web hits on Google, for
"marginal cost"; 148 for "margin of cost") are not necessarily to be
trusted. One needs to do
some sampling to estimate the rate of valid hits -- and to do that, we really
need one additional feature from Google or other web search systems, namely
the ability to get a random sample of N hits. But with that proviso, one could
really use such techniques to do Google psycholinguistics.
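
To make the arithmetic concrete, here is a minimal sketch in Python of how the raw counts might be corrected by sampling. The hit counts (195,000 and 148) are the ones quoted above; the sample figures (50 hits checked by hand, 30 judged to be genuine eggcorns) are invented purely for illustration, and the function name `estimate_eggcorn_rate` is hypothetical. The sketch also assumes that every hit for the original phrase is valid and that the sample is truly random, which is exactly the feature the search engines don't yet offer.

```python
import math


def estimate_eggcorn_rate(original_hits, eggcorn_hits, sample_size, sample_valid):
    """Adjust a raw eggcorn rate by the valid fraction seen in a hand-checked sample.

    Assumes all hits for the original phrase are valid and that the
    sample of eggcorn hits was drawn at random.
    """
    if original_hits <= 0 or eggcorn_hits < 0:
        raise ValueError("hit counts must be positive (original) and non-negative (eggcorn)")
    if sample_size <= 0 or not 0 <= sample_valid <= sample_size:
        raise ValueError("need 0 <= sample_valid <= sample_size and sample_size > 0")

    # Fraction of sampled eggcorn hits that turned out to be genuine.
    valid_fraction = sample_valid / sample_size
    # Binomial standard error of that fraction.
    std_error = math.sqrt(valid_fraction * (1 - valid_fraction) / sample_size)

    # Scale the raw eggcorn count by the valid fraction.
    estimated_valid = eggcorn_hits * valid_fraction

    raw_rate = eggcorn_hits / (original_hits + eggcorn_hits)
    adjusted_rate = estimated_valid / (original_hits + estimated_valid)

    return {
        "valid_fraction": valid_fraction,
        "std_error": std_error,
        "estimated_valid": estimated_valid,
        "raw_rate": raw_rate,
        "adjusted_rate": adjusted_rate,
    }


# Counts from the post; the sample numbers (50 checked, 30 valid) are made up.
result = estimate_eggcorn_rate(195_000, 148, sample_size=50, sample_valid=30)
print(f"raw rate:      {result['raw_rate']:.6f}")
print(f"adjusted rate: {result['adjusted_rate']:.6f}")
print(f"valid fraction: {result['valid_fraction']:.2f} +/- {result['std_error']:.2f}")
```

With a small sample the standard error on the valid fraction is large, so the adjusted rate should be read as a rough estimate rather than a measurement.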