Monthly Archives: August 2010

First mention in the New York Times of the expression “sanctity of marriage.”

We cannot be silent on those laws of your country which, in direct contravention of God’s own law, ‘instituted in the time of man’s innocency,’ deny, in effect, to the slave the sanctity of marriage, with all its joys, rights and obligations.

I’ve been playing with Microsoft Research’s ngram server in my spare time. I’ll write some more about this in time, I hope.

The API returns base 10 logs for its probabilities. Because the probabilities returned are so very low, it makes more sense to return a log probability. I could be wrong, but I think it’s normal practice to return base 2 or base e logs for log probabilities. They are easy to convert from one base to another, so it doesn’t really matter. But I was wondering why base 10 logs are returned.
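To make the "easy to convert" point concrete, here is a minimal sketch of the change-of-base rule. The function name `convert_log` and the sample value are made up for illustration; nothing here calls the ngram service itself.

```python
import math

def convert_log(value, from_base, to_base):
    """Convert a logarithm between bases: log_b(x) = log_a(x) * ln(a) / ln(b)."""
    return value * math.log(from_base) / math.log(to_base)

# Hypothetical base 10 log probability (i.e. a probability of 0.001).
log10_p = -3.0

ln_p = convert_log(log10_p, 10, math.e)   # natural log of the same probability
log2_p = convert_log(log10_p, 10, 2)      # base 2 log of the same probability

print(ln_p, log2_p)
```

Whatever the base, the number describes the same probability; only the scale changes by a constant factor.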

The penny dropped when I was just playing around printing out base 10 logs of some probabilities:

I felt silly (since log10(x) = y by definition means x = 10^y). But it’s really helpful to look at a number like -5.2721823 and know there are 5 zeros to the right of the decimal point. It’s even easier to parse than 0.0000053434.
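A small sketch of that reading trick, using the value quoted above. The helper names `to_probability` and `leading_zeros` are my own for illustration, not part of the ngram API, and the zero count assumes a probability strictly below 1.

```python
import math

def to_probability(log10_p):
    """Undo the base 10 log: if log10(x) = y, then x = 10 ** y."""
    return 10 ** log10_p

def leading_zeros(log10_p):
    """Zeros between the decimal point and the first nonzero digit.

    Assumes a probability strictly below 1 (a negative log10 value).
    """
    return math.ceil(-log10_p) - 1

log10_p = -5.2721823
print(f"{to_probability(log10_p):.10f}")  # the probability written out in full
print(leading_zeros(log10_p))             # 5
```

For a non-integer log, the count is just the digits before the decimal point of the log, which is why the value can be read at a glance.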

Can I get a tiny shout-out from other US Christians who are grateful to live in a country that has religious freedom built into the very core of its democracy and constitution? And who agree that, therefore, efforts to curtail the religious rights of Muslims near the 9/11 site should be resisted?

I just noticed that Richard Stoddard posted a recording of my leading Few Happy Matches at Camp Fasola in 2009 (the recording was done by Al McCready). This is a tune I had been leading a lot at Sacred Harp singings and conventions in 2009; I still like it. One silly reason is that the tune’s title indicates a goal of a good search engine. The title actually comes from an Isaac Watts poem about the difficulty of good marriage matches, according to Warren Steele.

In the 1991 Denson edition of the Sacred Harp, there is a fermata in the antepenultimate measure; I spent a long time thinking about how long to hold this. It is sometimes held just the shortest extra time. But Wade Kotter had brought an early copy of the Sacred Harp to the singing, and I noticed that there was no fermata, but two tied half notes. You can see this for yourself at the online version of BF White’s 1860 edition of the Sacred Harp at Michigan State; a similar thing is true of the version in the Southern Harmony (where the tune is called “Willoughby,” and from which BF White might easily have taken it).

This gave me justification for holding it for a longish time. In fact, Tom Malone (dear friend and singing master) has suggested that a fermata stops time, a very interesting philosophical puzzle. Listening to Few Happy Matches now is very suggestive. It does feel like time is being stopped as we sing who sometimes are afraid to die—.

It may be that you’ll experience time being stopped as you listen or sing this tune.

In one of the “Mad Men” discussions, someone asks if it was anachronistic to describe Freddy Rumsen as “clean and sober” in 1963 (from here, I think.)

The first clear use of this expression in the New York Times archives is from August 28, 1892, which has the embedded note in an article about a murder trial:

DEAR MISS CLOVER: Will you meet me outside the Canterbury at 7:30 to-night? Do you remember the night I bought your boots? You were too drunk to speak to me. If you come clean and sober, please bring this paper and envelope with you. (Neill held for murder; The death of Matilda Clover described by a witness; New York Times, August 28, 1892).

Someone else found hundreds of references in Google Books. So it’s ok to say Freddy was clean and sober; I hope he stays that way.

I enjoyed reading the Wikipedia page about its “lamest edit wars.” One of these edit wars was over whether the article on what Americans call “aluminum” and what Brits call “aluminium” should have, as its fundamental title, “Aluminum” or “Aluminium.” One of the arguments presented in favo(u)r of “Aluminum” was that more Google hits are available for the US spelling than the UK spelling. “Ghits” are notoriously unreliable (as are Bing hits and those of the other search engines), since the number of search results reported is subject to lots of factors, none of which is tied directly to the actual number of documents returned.

However, Bing (my employer) has recently provided programmatic access to its data on ngrams (frequency statistics based on the number of word tokens) found on web pages, query logs and anchor text (the data inside links). And I can safely say that the US spelling is much more frequently used. Here is the actual data, based on the June 2009 data release:

Source        P(Aluminum)   P(Aluminium)   Ratio US:UK
Body text     0.00852       0.00487        1.76
Anchor text   0.00727       0.00426        1.70
Query text    0.00974       0.00483        2.01

So, as a data point: “aluminum” is roughly 1.7 to 2 times as frequent as “aluminium” on the Web.