A blog about search, search skills, teaching search, learning how to search, learning how to use Google effectively, learning how to do research. It also covers a good deal of sensemaking and information foraging.

Thursday, December 30, 2010

I grew up in LA at a time when there were oil derricks everywhere. I had friends with rigs literally in their back yards, and when driving around LA or the Long Beach area, it was very common to see the grasshopper oil rigs pumping up and down, and the smell of oil as you went past. (FWIW, this is a "grasshopper" oil pumper. If you haven't seen one in action, the curved heads on the left bob up and down from the pivot in back.)

At the end of the year I got to thinking about life in LA and all the oil shocks and crises that have been in the news during 2010. And that made me think of a simple enough question:

When was the first productive oil well drilled in California? And... for extra credit, who drilled it and where?

Tuesday, December 28, 2010

Allow me to digress for a moment from our usual theme of search and sensemaking in order to make a bigger, probably more important point.

As we've seen over the past year, Google (and Bing) have made many changes to their interfaces, user-facing capabilities, underlying index and results ranking. The most visible changes were to Bing's image search interface (continuously scrolling results) and Google's Instant Search ("search results come back as quickly as you type"). But many, many changes get made every week--some obvious, some pretty subtle.

The biggest news story in many ways was the "DecorMyEyes" story which broke in the New York Times in late November. Good investigative journalism on the part of the Times revealed a fairly abusive reseller who was taking advantage of his obnoxious behavior (which generated a large number of web posts linking to his site) to boost his position in searches for his product.

But more interestingly (for us on SearchResearch), Google fixed this problem within just a few days by a clever ranking algorithm tweak. As described by Amit Singhal on the Official Google Blog, the solution is more than just sentiment analysis, but involves detecting overall terrible user experiences on the part of purchasers and then using that information to change the rank position of results of sellers.

The implication of all these changes--which are ongoing and continuous--is that web search is a dynamic beast. What you get from a search today might not be what you get tomorrow.

In other words, web search isn't anything like a normal "reference search" from the days of yore. Not so long ago, reference materials stayed pretty constant, or at least changed slowly enough that the book / journal publishing cycle was rapid enough to keep up with the changes.

Now, however, things change rapidly. Not only does information accelerate (a point made masterfully by James Gleick in his book Faster: The Acceleration of Just About Everything), but aggregations of information are constantly bubbling, as are the tools by which you access the information stew.

Point: Stay in touch with the changes going on. For the most part things-will-just-work. But when they stop working, you'll want to know how and why, especially if you're trying to make sense of a complex world.

Saturday, December 25, 2010

In the spirit of the season I put together a list of 5 golden rings--5 great tips that everyone should know (but are especially useful for teachers and librarians to know about).

1. Google search cheat sheet–there are many Google cheat sheets out there, and this is mine. This one has the benefit of actually being correct. It's also available as a mousepad, if you'd like to have one of your own. It shows about 20 of the top tricks and search operators that are most useful. Print it out and distribute widely. You have my permission.

3. Creative Commons license search–Google also recently launched another advanced feature in Image search. When searching for images, you can also go into the Advanced Search mode and filter by CC license level.

4. Custom Search Engines–A CSE lets you create your own "mini-Google" that searches just over the sites you like. That means it's really easy to create a special-purpose search-engine for just the needs of your class... or even a specific lesson. I'll do a posting about this in the future, but if you want to get started exploring, click on the link above. It's actually very easy to do and solves all kinds of problems when letting younger searchers look for specific topics on the web.

5. Alerts–A Google Alert is a standing query that's automatically run for you on a daily or weekly basis. Any changes in the web search results (or News) are automatically sent to you as an email. Think of the Alerts tool as your personal assistant who is always scanning the net for you. (I'll write a longer post about this as well.) I have Alerts set up for my name (so I can see who's talking about me!) and for four different topics I'm interested in. Naturally, one of those topics is "how to teach search skills," which I have set up to send me weekly updates. It's a very handy way to track the latest in your special topic of interest.

Thursday, December 23, 2010

Just so you don't think less of me, I actually knew it was an incorrect quotation, but I've heard it so often that I wanted to bring this up as a topic.

As you can see from the comments, several people were able to solve this question fairly quickly. But it's interesting to look at WHY they were able to do so.

Hans found that this was a misquote and ended up at http://www.quotelady.com/subjects/plan.html -- and he's absolutely correct. The correct quote is The best laid schemes o' mice an' men gang aft a-gley.--Robert Burns ("To a Mouse"). Fred and JPP also found this out, and give the entire stanza as:

But Mousie, thou art no thy-lane,
In proving foresight may be vain:
The best laid schemes o' Mice an' Men,
Gang aft agley,
An' lea'e us nought but grief an' pain,
For promis'd joy!

Which gives a clear clue about why it's so misremembered--it's written in a Scots dialect which is, to say the least, not part of standard American English. (See the Wikipedia article on "To a mouse" for a nice side-by-side translation.)

We've talked about mondegreens and misquotes before (see blog post from March 20, 2010: "Quotes/ misquotes / mondegreens") but I wanted to bring up another aspect of misquotation: what do you do when you've got it all wrong?

In this case, we're saved by the sheer number of people who misquote Burns. Fred's path to the correct poem was by finding the phrase "of mice and men" being reused by Steinbeck, which then led him to Burns. Luckily, SO many people make this error that there's an industry of Questions and Answers (QA) sites that point searchers in the right direction.

A big part of every reference librarian's life is figuring out what the patron really means when they ask for something odd. I've heard from librarians that people ask for titles like "Funny Farm," which they explain is a book about animals. It's a leap to realize that this is actually a request for George Orwell's "Animal Farm," but that's what librarians do.

It's a skill worth developing for search as it often turns out that the most difficult searches are ones when the searcher is just SURE they know something to be true... that later turns out to be incorrect. Recently I saw a searcher looking for a "Photoshop plug-in" that would do a particular transformation to their image. It was a difficult search since they were trying to find a plug-in that would convert their line drawing from 2D to 3D. In Photoshop this is hard, but in Adobe's Illustrator product it's pretty straightforward... and it makes the search MUCH easier. Ultimately, we worked out that the much better search was [ Illustrator 3d plug-in ] and that having the word "Photoshop" was just throwing everything off (even though he was SURE it had to be in there).

What's the moral here? I see two general heuristics to keep in mind:

1. Be sure of your terms, and try to work around limiting terms. In particular, a term like "Photoshop" is really, really limiting. If you're not sure of the particular category, then consider backing off and trying another description of the concept--in this case, "Adobe" or "editor" is a better search term.

2. Describe more of your search intent (but keep it short!). When looking for things like "Funny Farm" it might be good to include more terms that describe your intent (words like "book" or "totalitarianism" would be great).

These are great rules of thumb to keep in mind when trying to find those elusive search results!

Wednesday, December 22, 2010

One of the biggest problems I see with people searching is when they're convinced that they already know the answer, or at least enough of it to do a search, certain that everything is correct. They get surprised when things don't turn out the way they thought.

A while back I was searching for the rest of the poem that contains the line "...the best laid plans of mice and men oft-times go awry..."

I knew that was the line I wanted to find, and I even had a suspicion that the poet was Scottish.

So here's the challenge for today:

What WAS the original poem? And, just as important, what is the rest of the stanza in which this line appears?

Friday, December 17, 2010

As I mentioned, this isn't an especially difficult search challenge, but that's okay because it teaches us something about what's out there on the web... to wit, that there are large repositories of recorded sound that are sometimes a bit difficult to unearth, but often delightful to explore.

One of my favorite sites for recorded language is Forvo.com, which has a large collection of less-common languages (such as Micmac, Zulu and Venetian) as well as a fairly large collection of words and phrases in more widely spoken languages (Chinese, Arabic, Pashto, Khmer, etc.).

Another fun site for language listening is BabyNamesOfIreland.com where you can hear author Frank McCourt pronounce (and explain) names like Siobhan ("shi-van"), a name I've stumbled over many times in my reading.

A search on the name itself gives a number of hits from which you quickly learn that Llewellyn is Welsh, and originally of Celtic origin. Old forms of the name include Lugobelinos (Celtic) and Lugubelenus (Old English). The name was borne by several kings and princes, such as Welsh ruler Llywelyn the Great (1173-1240), and by a number of Welsh poets.

Other variations include the Anglicized name Fluellen, the Welsh Leolin, Llanberis, Llanelli and Lloyd. Short forms include Llew (Welsh), Lyn (Welsh), and Lynn (English and Welsh), and the familiar form Llelo (Welsh).

You can also quickly figure out that the key problem of the name is the initial double-L. It's an "aspirated L," a phoneme exclusive to Welsh, which in practice is formed by saying the sound "L" while also making a hard "th" and a hissing sound.

Which is why you really want to listen to someone saying the name, rather than trying to figure it out from a pronunciation guide. I don't know about you, but I can't imagine what "L" + "th" + "hiss" should really sound like.

My next query was intended to look for the category of names (rather than just the single name Llewellyn) and also look for pages that would have recordings on them.

What's the most common way of describing pages with sound bites? I'm willing to bet they ALL say something like "hear these names spoken aloud" (or equivalent language). So I chose to use the term 'hear' and the category 'Welsh names' as in this query:

This query has a great first result: The BBC site Living In Wales with a marvelous section on how to pronounce Welsh names.

One small final step. When I first went to the LivingInWales site, I saw the name pronunciation page in this form:

What surprised me was that when I clicked on the "L" section, I didn't see Llewellyn. That was a bit odd, so I went back and looked carefully and found that there's actually an "LL" button in the second row. Ah ha! In Welsh, the double-L is a distinct character (just as "LL" is a separate letter in Spanish words like "llamar").

(1) sometimes it's good to look for the category rather than the instance (that is, [ Welsh names ] instead of [ Llewellyn ])

...and...

(2) add a context word that's very likely to be on the kind of page you're looking for. In this case, it's "hear" (and not MP3 because the audio clips might come in any of a variety of formats).

In yesterday's comments, Hans points to the http://www.pronouncenames.com/pronounce/llewellyn page, which has a nice "standard English" vs. "Welsh" pronunciation side-by-side. As Hans' solution points out, there are multiple ways to find a good answer to the original question.

Thursday, December 16, 2010

My daughter is a voracious reader, and often comes across proper names from exotic places. Sometimes the words seem familiar, and yet can have an unexpected twist. One of these books has a character with an unusual name: Llewellyn. I know from other reading that Llewellyn is a name from the UK.

Question for us today is this:

How do you pronounce Llewellyn in its original language? And, given the difficulty of this particular name, can you find a native speaker saying the name aloud?

Saturday, December 11, 2010

This really WAS a hard search problem! Congrats to Hans and Fred for figuring it out.

Let's look at what Hans wrote in his comment... (edits are mine).

For me, the key takeaway lesson is that one should explore other terms than the ones that first come to mind. This particular problem took me a long time because I was stuck on the idea of the marshland lines as channels or canals. Turns out that I really needed to use a different word: ditch.

Let me first tell you that I live in the Netherlands, so I'm familiar with dikes, ditches, canals and tidal marshes. Further, I'm a professional information researcher (I work in a corporate information center) so I know a lot about how to conduct a search on the internet.

Here are my search steps:
* I found in Google Maps the name of a bay in the neighborhood "Brosewere Bay"

A good move to start with on a geographic search like this. You want to figure out what the local geographic features are named: those terms will be useful when you try to set up your search.

* A search in Google with ["long island" "brosewere bay" +canals] brought me to a website of the Long Island South Shore Estuary Reserve:

This is exactly how I started. I searched on place names ("Long Island," "Brosewere," and "Crooked Creek" were the ones I used), and they weren't great.

I eventually took a very close look at the lines in the satellite photo. I'd been thinking that they were 15 - 30 feet wide. I looked at a bunch, and by using the measuring grid on Maps (or the one on Google Earth, shown here), I finally figured out that they were only from 3 - 6 feet wide. In this screen shot, you can see that it's about 2 meters wide... 6 feet... That's obviously NOT a waterway (except for canoes, perhaps).

And that was the insight that led me to start searching for the term "ditch," as in my query:
[ long island wetland ditches ]

This book chapter told me that there had been extensive mosquito control ditching in the 1930's.

But back to Hans' story.

* A search on that site using Google [site:www.estuary.cog.ny.us ditches] brought me to a document mentioning grid ditching for mosquito control http://www.estuary.cog.ny.us/ISR2005/ISR%20Outcome%204.pdf

Here, Hans did a smart thing by using the search term "ditches." I had been searching with terms like "canal" "channel" "waterway" and "dredge"... but those didn't work very well. It wasn't until I realized that those long lines were pretty narrow that I switched terms.
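If you run site-restricted queries like these often, it can help to assemble them in a few lines of code. Here's a minimal Python sketch, assuming the familiar https://www.google.com/search?q=... URL form. The function name build_search_url and its parameters are hypothetical, made up for illustration, and are not part of any Google API.

```python
from urllib.parse import urlencode


def build_search_url(terms, site=None, exact_phrases=()):
    """Assemble a query string and wrap it in a search URL.

    `terms`, `site`, and `exact_phrases` are hypothetical parameter
    names chosen for this sketch.
    """
    parts = []
    if site:
        # The site: operator restricts results to a single host.
        parts.append(f"site:{site}")
    # Double quotes ask for the words as an exact phrase.
    parts.extend(f'"{phrase}"' for phrase in exact_phrases)
    parts.extend(terms)
    query = " ".join(parts)
    # urlencode percent-escapes the colon and quotes, and turns spaces into "+".
    return "https://www.google.com/search?" + urlencode({"q": query})


# Hans's site-restricted search for "ditches":
url = build_search_url(["ditches"], site="www.estuary.cog.ny.us")
print(url)
# prints: https://www.google.com/search?q=site%3Awww.estuary.cog.ny.us+ditches
```

The useful part is that swapping "canals" for "ditches" becomes a one-word change, which makes it cheap to try several candidate terms in a row.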

* A search in Google on ["long island" grid mosquito control] came up with the final document: http://www.geo.sunysb.edu/lig/Conferences/abstracts07/abstracts/potente.pdf

with the explanation of the grid pattern:

"Much of the tidal wetlands of Suffolk County, as well as neighboring coastal wetlands of eastern North America, has been previously affected by hydrological manipulations by mosquito control agencies which began early in the twentieth century. In an effort to depress mosquito populations emanating from salt marshes, parallel linear ditches were dug from the high marsh zones through the low marsh and out to estuarine bays, creeks and rivers. In many cases, supplementary ditches were also dug at right angles cross-connecting the linear ditches thus creating a grid pattern of ditches on the marsh surface. The intent was to remove the standing water of the marsh surface in which the marsh mosquitoes develop through their larval stages."

Absolutely right.

After searching around a bit more, I discovered that the mosquito control effort wasn't quite as crazy as it sounds. From the same book (p 45):

"Mosquito control practices began after the Civil War as homeward bound soldiers brought malaria to Connecticut. The disease soon reached epidemic proportions, and wetlands of all types were filled or drained to prevent malaria transmission by Anopheles mosquitoes. With the elimination of malaria as a health threat, control efforts targeted the large broods of nuisance mosquitoes that originated on tidal wetlands, especially salt marshes.

Hundreds of kilometers of mosquito ditches were hand dug to drain marsh surface waters, especially the intermittent pools or pannes which are the preferred breeding habitat for salt marsh mosquitoes..."

And finally, after searching around (because I got really interested in Long Island mosquito ditches) I found an interesting technical article, "The development of a tidal marsh: Upland and oceanic influences," written by James Clark and William Patterson of U. Mass. Amherst. In their article they have the following diagram, which pretty much convinced me that we had the right name for these right-angled ditches that appear on the jet landing path into JFK.

And that tells you pretty quickly that it's a traditional Sami (aka Lapplander) style of singing. It's pretty interesting, actually... Wikipedia tells us (with a spell-correction to yoik) that a joik is a song that tries to "transfer the essence" of a person or place to the listener, rather than being "about" a person or place.

In other words, + is the same as the Bing command noalter: (example, on Bing: [ noalter:joiker ] )

Double quotes (on both Google and Bing) serve to turn off synonymization for strings of words. Example:

Thursday, December 2, 2010

In his comment, Ahniwa describes searching for a bit, then getting to California Department of Education and browsing to the "Student Expense" page: (http://www.cde.ca.gov/ds/fd/ec/currentexpense.asp ). It's an interesting page as it tells you that cost/student is measured in terms of "Average Daily Attendance" (ADA).

Unfortunately, that page only goes back as far as 1998--but it's a pretty good solution!

If we want to go back further in time, the only way I know to get this data is through the Rand Corporation's database. I found this factoid out by doing the query:

[ california annual per-pupil spending data set ] - since I was looking for the complete data, it was pretty easy to see that only Rand had all the data compiled together.

All I did then was click through to the Rand database site for California Education Statistics, log in via the interstitial page (requires only my Palo Alto library card), and voila! -- I'm into the Rand dataset.

Once there, it's easy to navigate (Databases>Annual spending per pupil) to the list of counties of California, and then a quick download of the data from 2000-2009 as a TSV file (easy to then import into your favorite spreadsheet program).

Here's an example chart from the data. You can easily see the bug in the data (no, the Montebello school district did NOT spend $47K / student).
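A quick way to catch this kind of bug before you chart the data is to compare each value against the median. The sketch below is a rough illustration only. The column names (district, year, spend_per_pupil) and the sample rows are made up; they are not the actual layout or figures of the Rand file. The 3x-median cutoff is also just an assumption.

```python
import csv
import io
import statistics

# Made-up sample rows for illustration only; NOT figures from the Rand dataset.
SAMPLE_TSV = (
    "district\tyear\tspend_per_pupil\n"
    "District A\t2005\t7200\n"
    "District B\t2005\t6900\n"
    "District C\t2005\t7500\n"
    "District D\t2005\t47000\n"
    "District E\t2005\t7100\n"
)


def find_suspect_rows(tsv_text, factor=3.0):
    """Return rows whose spending is more than `factor` times the median."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    values = [float(row["spend_per_pupil"]) for row in rows]
    median = statistics.median(values)
    return [row for row in rows if float(row["spend_per_pupil"]) > factor * median]


suspects = find_suspect_rows(SAMPLE_TSV)
for row in suspects:
    print(row["district"], row["year"], row["spend_per_pupil"])
# prints: District D 2005 47000
```

A flagged row isn't necessarily wrong, but it's the first place to look when a chart has one bar towering over all the others.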

The moral of this particular story?

Sometimes you still need to find out where the data is kept... and the access path might be through your public library!

Wednesday, December 1, 2010

Here's the situation: Suppose you're a parent who is trying to understand the way money is allocated on a per-student basis in California. What you'd like to understand is this: How much money, on average, is spent per-student in each school district in California?

And, you'd like to see the historic trends since 1990.

Challenge: You're looking for an assembly of data for all the California school districts, per-student spend, per year. Ideally, you'd like a graph that lets you compare your school district with all the others in the state.

Yes, I know you can buy such data. But your challenge is to do this little research task on a typical school budget... which is to say, for free.

Tuesday, November 30, 2010

I had a mosquito bite that began oozing a clear-yellow fluid that became crusted. I began searching for: [ yellow discharge insect bite ] and got a lot of low-quality results - plenty of generic info from, for example, first aid kit distributors, all of which appeared to be paraphrased from similar original sources. None of it was really helpful. Some sources mentioned that the discharge might be indicative of infection, but the discharge was only mentioned in passing in any case. I decided to rephrase, and upon searching for: [ crusty yellow mosquito bite ] one of the top results mentioned something called "impetigo". I immediately did a search for: [ impetigo ] and came up with many high-quality results that appeared to be written by experts. I found articles written by people who called themselves doctors that contained detailed descriptions.

It turns out that impetigo is a bacterial infection of the skin which comes in three types, one of which results in a red, itchy bump with yellowish discharge that becomes crusted. Apparently, to the untrained eye, the symptoms of impetigo are quite similar to an insect bite, whether it started with one or not. I thought you might find this interesting, because my search pattern started with my assumption that I had a mosquito bite, and was searching in the category "insect bites", but the information I really wanted was "classified" on the web under "bacterial infections of the skin".

It took reading through quite a lot of search results, including one Google Books search result before I was able to reformulate my search terms. I had a better "scent" of information in the results which also used the word "crusted" or "crusty", and that scent led me to the word "impetigo", which had the best scent of all, and then I found my answer.

This is a pretty common kind of story--J started to search for a topic, then discovered that a shift in terms would yield MUCH better results. In this case, the shift from insect bite to a search on impetigo did a great deal to improve the quality of the search results.

I take two big lessons from J's story:

1. The right query term selection REALLY matters. You can spend a lot of time wandering in the search wilderness without a good term. But once you have it, the search is frequently very short. (And contrariwise, if you have the WRONG search term, you can be in big trouble. Be sure you know what it is you're searching for!)

2. Learning-while-you-search is key to power searching. J's big insight was when he noticed an unfamiliar term (impetigo) in the middle of his results. By recognizing that as a potentially useful word, and turning his search to include that, he learned something about the area, and can now search much more effectively.
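The "notice the unfamiliar word" step can be sketched in code. This is only a toy illustration of the idea, not how any search engine works: it counts words that show up across result snippets but were not in the original query. The snippets below are invented for the example, and the function name candidate_terms and the small stopword list are assumptions as well.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list; a real one would be much longer.
STOPWORDS = {"that", "will", "this", "with", "from", "have", "into", "your"}

QUERY = "crusty yellow mosquito bite"

# Invented snippets for illustration; not real search results.
SNIPPETS = [
    "Impetigo can begin as a red sore that leaks fluid and forms a yellow crust.",
    "A crusty yellow bite that will not heal may be impetigo, a skin infection.",
    "Scratching a mosquito bite can let bacteria in and lead to impetigo.",
]


def candidate_terms(query, snippets, top_n=5):
    """Rank words by how many snippets contain them, skipping query words."""
    query_words = set(re.findall(r"[a-z]+", query.lower()))
    counts = Counter()
    for snippet in snippets:
        # Use a set so each word is counted once per snippet.
        words = set(re.findall(r"[a-z]+", snippet.lower()))
        counts.update(
            word
            for word in words
            if len(word) > 3 and word not in query_words and word not in STOPWORDS
        )
    return [word for word, _ in counts.most_common(top_n)]


print(candidate_terms(QUERY, SNIPPETS)[0])
# prints: impetigo
```

A word that keeps recurring in the results but never appeared in your query is a good candidate for the next search, which is exactly the move J made.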

Thursday, November 25, 2010

Answering this question is a bit tricky because there's so much written about the First Thanksgiving (1621, Plimouth, MA) during the past 100 years, but so little source material to work from!

The big search lesson to take from the answer to this challenge is this: Don't assume too much about what you think the answer will be. Your initial assumptions might be very wrong, and you'd waste a lot of time trying to prove something that just isn't true.

Here's the story...

Background: Cranberries (Vaccinium macrocarpon) are native to New England and a few other places in North America, growing in acidic bogs. Many members of the heath family, such as blueberries (Vaccinium spp.) and azaleas (Rhododendron spp.), also grow well in acidic, peaty soils.

The cranberry plant is a very long-lived perennial less than eight inches high with trailing, thin stems with small, opposite, evergreen leaves. Cranberry flowers appear around the Fourth of July; these are white to light pink, downward-pointing, bell-shaped, axillary flowers. The name cranberry is a modification of the colonial name "crane berry," because the drooping flower looked like the neck and head of the sand crane, which was often seen eating the fruits.

Cranberry sauce, as we typically think of it, is a cooked, heavily sweetened concoction that's frequently augmented with citrus. Of course, any citrus would have been impossible to have in 17th century Massachusetts. An even larger problem is that there was effectively little sugar in 1621 America. (Certainly not in sufficient quantities to make cranberry sauce!) The settlers hadn't had time to make maple syrup or sugar, and honeybees were still years in the future.

But how do we search for this kind of information?

My first search was [ cranberries 1621 ] which gives a number of search results, the most interesting of which is "The Truth About Thanksgiving" from the Planet Blacksburg (VA) news site, which quotes Daniel Thorp (colonial history prof at Virginia Tech).

After looking through lots of dead-end links (research is a slow process!) I finally decided to look for the original letters describing the Thanksgiving holiday in 1621 to see what I could find. From the Daniel Thorp article I found that the letter was written by Edward Winslow, so the obvious search [ Edward Winslow 1621 ] led me directly to a transcription (and handy translation into modern speech): http://www.pilgrimhall.org/1stthnks.htm

I quote from their site (in modern spelling):

"...our harvest being gotten in, our governor sent four men on fowling, that so we might after a special manner rejoice together, after we had gathered the fruits of our labors; they four in one day killed as much fowl, as with a little help beside, served the Company almost a week, at which time amongst other Recreations, we exercised our Arms, many of the Indians coming amongst us, and amongst the rest their greatest king Massasoit, with some ninety men, whom for three days we entertained and feasted, and they went out and killed five Deer, which they brought to the Plantation and bestowed on our Governor, and upon the Captain and others. And although it be not always so plentiful, as it was at this time with us, yet by the goodness of God, we are so far from want, that we often wish you partakers of our plenty."

From there, I found that William Bradford had also written about the First Thanksgiving. The obvious query:

"...They began now to gather in the small harvest they had, and to fit up their houses and dwellings against winter, being all well recovered in health and strength and had all things in good plenty. For as some were thus employed in affairs abroad, others were exercising in fishing, about cod and bass and other fish, of which they took good store, of which every family had their portion. All the summer there was no want; and now began to come in store of fowl, as winter approached, of which this place did abound when they came first (but afterward decreased by degrees). And besides waterfowl there was great store of wild turkeys, of which they took many, besides venison, etc. Besides they had about a peck of meal a week to a person, or now since harvest, Indian corn to that proportion. Which made many afterwards write so largely of their plenty here to their friends in England, which were not feigned but true reports."

And that's about it for the historical record. So, were cranberries on the menu, and if so, in what form?

There certainly was a tradition of stewed fruits in 17th century English cooking, so it's probable that some kind of cooked fruit compote was served, but it's not actually in the record. So the truth is... it's all speculation!

Most probably, the Indians (the local Wampanoag) brought along some of their own supplies. Ninety people is, after all, a pretty big crowd to bring to the Thanksgiving table. If so, then they most likely brought along cranberries in the form of pemmican, a very calorically-dense survival and travel food. It's made by mixing dried berries with dried deer meat and melted fat to form easy-to-carry slabs. Think of ground up beef jerky mixed with a bit of bacon grease and dried cranberries, and you've got the idea.

Fred's approach (in the comments) was pretty clever. He ran the query [cranberry history], then used the Timeline tool to narrow down the results to the 1620's. Once there, you see lots of references to pemmican: http://en.wikipedia.org/wiki/Pemmican