Changes in url queries

One truism you learn quickly at Google is “you are not a typical user.” If you’re reading this blog, the truism probably applies to you too: you’re much more likely to be a power-user, an SEO, a librarian, or someone else who is familiar with the site: operator or the info: operator. But it’s important to remember that many Google users aren’t like that.

Bearing that fact in mind helps explain a recent change in how we handle url queries. Some people call these navigational queries, but at Google a navigational query is typing in [HP] and expecting to see www.hp.com high in the search results. A url query would be something like [www.example.com].

Previously we treated the query [www.example.com] like the query [info:www.example.com], and now we treat it like [“www.example.com”]. The query [info:www.example.com] returns the single url www.example.com if we have it in our index, along with other choices like “see backlinks for www.example.com” (I’m oversimplifying a little, but nothing too bad). The query [“www.example.com”] searches for that as a phrase, and thus returns the ten best matching urls, which will usually show www.example.com at #1 or high in the search results.
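To make the before and after concrete, here is a minimal Python sketch of the rewrite described above. The function names and the regular expression that decides what counts as a url query are assumptions made purely for illustration; this is not how Google's query handling is actually implemented.

```python
import re

# Hypothetical heuristic for "the whole query looks like a url":
# an optional scheme, a dotted host name, and an optional path.
URL_QUERY = re.compile(r"^(https?://)?([a-z0-9-]+\.)+[a-z]{2,}(/\S*)?$", re.IGNORECASE)


def is_url_query(query: str) -> bool:
    """Return True if the query looks like a bare url (assumed heuristic)."""
    return URL_QUERY.match(query.strip()) is not None


def rewrite_old(query: str) -> str:
    """Old behavior: a url query is treated like an info: query."""
    q = query.strip()
    return f"info:{q}" if is_url_query(q) else q


def rewrite_new(query: str) -> str:
    """New behavior: a url query is treated like a quoted phrase search."""
    q = query.strip()
    return f'"{q}"' if is_url_query(q) else q


if __name__ == "__main__":
    for q in ["www.example.com", "HP"]:
        print(q, "->", rewrite_old(q), "|", rewrite_new(q))
```

In this sketch a navigational query like [HP] passes through unchanged under both behaviors; only queries that look like urls are rewritten.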

Why did we make this change? Bear in mind that you, gentle reader, are not a typical user. If super-duper power-users who know how to refine site: or info: are a set of people we’ll call N, you are probably in that set N. There is a whole different range of people M who just type in www.example.com to get to www.example.com, and who sometimes misspell the url. In math, M >> N means that M is much greater than N. That is, there are many more people who casually type in urls to get to those urls (and who sometimes misspell those urls) than there are super power users. So this change helps M. The N power users can just prepend “info:” to get to the old behavior.

Let’s take an example. Suppose the query is [www.mysace.com]. If you go visit that domain, you get a 404 error. Most likely, what happened was the user was typing [www.myspace.com] and accidentally left out the “p” from myspace. Which behavior is more helpful to that user? Here’s [info:www.mysace.com]:

Not that helpful to a user who misspelled the url or who was in a hurry and left out a letter. Now here is the new behavior for [www.mysace.com]:

Notice that the new behavior gives a suggested spelling correction, results from myspace.com, and a suggested query correction with another chance to find myspace.com. The second set of results is more likely to be helpful.

We can always tweak/refine/revert if we see problems, but this change is beneficial for many of our users, and expert users can still get the old behavior by just prepending an “info:” to their url queries.

I like the new result type much better, power user or not… it just takes half a second less because I already know where the “cache” link is etc. (which is mostly what I’ll be looking for if I type a URL into the Google box).

If I write the URL in the search box of google.com, I get info on that URL, but when I use google toolbar search, it takes me to that URL page! So if I’m using google toolbar, I still need to use info:url.

This sounds like a great update to me. I’m curious, however, why you didn’t decide to just combine the two methods?

– If a user searches for http://www.googEL.com, it would display the suggested spelling corrections. (The second screenshot you have above.)

– If a user searches for http://www.googLE.com (which is in the Google index), it would list the google homepage, then the options that are listed when you search for info:www.google.com, then other search results for googLE.com (combining the effects of screenshot 1 and 2)

Is this a possibility? Or was this step also designed to take some of the power-user functionality out of the sight of the standard-user?

Nice. I’ve always thought this was a more sensible behavior, and I’m not just saying that because it’s more like the way Yahoo does it. (I had to check how we handled it, since I don’t often run queries like this myself.)

Now if only there was a simple way of causing 404 errors from manually-entered URLs to automatically forward to Google’s suggested spelling correction. I couldn’t tell you how many times I’ve been typing in a hurry and put in http://www.google.co or http://www.google.ocm. Actually, that’s weird. It doesn’t look like it suggests http://www.google.com for either of those. Any thoughts, Matt?

My only thought is that the new method displays ads where the old method did not. Obviously ads from competing sites can reduce relevancy (especially when they’re trying to search for a specific site by URL); however I believe the new method does provide a result the user is more likely to expect. I think it’s a trade up, but I can see where some people might find ads for other sites irrelevant to their search for a specific URL.

I also see that my own ad shows up for a query on our domain so I’ll probably end up paying for more clicks than I normally would.

Does this mean everyone is going to rush out and create adgroups for their competitor’s URLs? Do most people already do that?

I was actually looking for “GUID” as in “Globally Unique Identifier”, but for some reason I got a whole bunch of stuff related to “Guide”. Now I realize that this is not something your average bear would ever put into a search engine, and I know that all I had to do was filter out the word “Guide”, but still…there may be other examples of words out there that a normal human being would type that would exhibit similar behaviour.

Yes, it’s definitely better. However I do see you still need to propagate this change to all the datacenters, yes? I tried a few queries on whatever datacenter handles peru (.pe) and sometimes when I typed http://www.somesite.com, the site itself wasn’t the top result. It looked more like a link: command response…

BTW, in response to Tricia, actually it says “T Gusta el Bukkake”, which is a bad translation of “Do you like bukkake”. Actually, you can watch the reggaeton video of that (BTW, it’s not porn) by typing the url:

At first it annoyed me a little but when you explain it like that, it makes sense. After all, sometimes we (N) tend to forget that a search engine’s job is to help users find what they are looking for, not give us the default info: query.

Josh, hope I didn’t come across as defensive. I was listening to the SearchCast and Danny was saying “Matt will blog why he did this and then we’ll talk about it” and I felt like I needed to get a post down, even if it was a quick post..

lots0, very well-said. I try pretty hard to maintain my ability to put on my “regular user” hat, along with the well-worn other hats I put on sometimes: white hat, black hat, beret — that’s for when I need to think like someone from France ;).

Kilroy, we could certainly look at doing something like that down the line.

Jeremy, I didn’t even realize that Yahoo did that. Clearly I need to put on my purple Yahoo hat, my red Ask hat, and my four-colored live.com hat more often.

Multi-worded Adam, that’s our synonyms kicking in, e.g. [~guid] shows that most people mean guide when they type guid. Use [asp +guid] to do an exact search for only guid, not guide. Sharp eyes, btw.

Well done Matt. A year ago a spammer copied our primary domain but changed one letter in order to fool our industry. A search on his domain in Google now presents the user with “Did you mean: http://www.MyDomain.com” so I couldn’t be more pleased with the results.

Nice change. Now, if we could only figure out why on earth Google severely penalized us we might have some traffic from Google again. It would be nice if the sitemaps team had a DePenalize form or at least some way to open communications like the Reinclusion form which produces zero results unless you know how you offended Google.

Of course there are pros & cons to every edit: It used to be that I could tell people (“M” people, that is) to search for their full URL in order to get a quick read on whether their site was in the Google index… now I can’t do that… but overall I never liked the old results so thanks for the change!

What I wanted to say and forgot in my post is that we must put our customers/users first, and that is what I think Google did here. You can’t leave out the 80% of your users who are not so internet savvy. In search, as in most software, it’s still like someone (I don’t remember his name) said: “Don’t make me think”. It has to be as intuitive as possible.

If Google does ‘troll’ the sites then how does PageRank come into play when they ‘troll’ a site. If the URL never existed in the first place and Google happens to stumble on it via its incrementing process (if one exists) then how does the Page B Linking to Page A process all come in to play?

If you can track back to prove that the pages did at one point exist, can you tell me which site had that URL on it?

I have dozens of examples of what appears to be an incrementing process occurring from Google.

You know what i think is cool here? It’s how fast and agile Google is to make changes like these and make them available to the users. Just think how long it would take other companies (e.g. MSN) to go thru the hoops and bureaucracy to make a somewhat simple change like this. Design – Develop – Test, re-design, 1000 emails back and forth, etc.. i’ll just take a guess… maybe 4 weeks? It’s these types of innovative quick changes that make G the leader in search.

My first reaction was, do we really want people to find myspace.com? Second : maybe it’s best if that’s what they’re looking for to quickly funnel these people over there as a kind of quarantine action. Get them off the highway.

Really appreciate the explanation, Matt! But I’m with Kilroy, who said, “Why you didn’t decide to just combine the two methods?”

I thought the old behavior was ideal for the typical user. Someone who enters some domain name into the Google search box is probably someone who is confused about the difference between the browser address bar and the search box. They know they want to reach a particular site — they’ve just put the info in the wrong place. Over the years, there have been various debates about why exactly this happens. Have some users discovered it’s faster? Are some users confused but it doesn’t matter, because search engines do the right thing? No one has convinced me. But what we do know is that many, many “typical” searchers do this.

So when Google made the change, I didn’t think that was a bad thing for SEOs or power searchers. They already know how to get all the pages from a site or cached pages without the links. I thought it was a loss for the typical searcher, who was being well guided by the “try visiting that web page link.” Now, they aren’t. Sure, the official site might be in the results, but I liked the different behavior in this case.

The misspelling example is well taken. But that could be combined. A rough ideal situation for a search on http://www.mysace.com...

* If the URL is valid, try visiting that web page by clicking on the following link: http://www.mysace.com
* Find web pages from the site http://www.mysace.com
* Find web pages that contain the term “www.mysace.com”

Then show web results.
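The layout proposed above could be sketched roughly like this in Python. Everything here is hypothetical: the function names, the wording of the lines, and the placeholder web results are illustrative assumptions, not anything a search engine actually exposes.

```python
def combined_header(url_query: str) -> list[str]:
    """Build the three suggestion lines proposed above for a url query."""
    # Strip a leading scheme so the quoted term matches the example in the comment.
    host = url_query.removeprefix("http://").removeprefix("https://")
    return [
        "If the URL is valid, try visiting that web page by clicking "
        f"on the following link: {url_query}",
        f"Find web pages from the site {url_query}",
        f'Find web pages that contain the term "{host}"',
    ]


def render(url_query: str, web_results: list[str]) -> str:
    """Show the suggestion lines first, then the regular web results."""
    lines = [f"* {line}" for line in combined_header(url_query)]
    return "\n".join(lines + [""] + web_results)


if __name__ == "__main__":
    # The result strings below are placeholders, not real search results.
    print(render("http://www.mysace.com", ["(web result 1)", "(web result 2)"]))
```

The point of the sketch is only the ordering: the old info-style options stay on top, and the new phrase-search results follow underneath.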

I’m sure you could get more creative, too. For a common misspelling of a popular domain, automatically turn it into the correct domain. Or better, start building up that list of official sites. When someone searches for http://www.myspace.com, you could say:

then the rest of the regular results. IE, the top listing gets treatment somewhat similar to sitelinks, but domain specific.

If I get time, I’ll drag out some screenshots on how search engines have dealt with navigational queries over the years. Some of the creative things I’ve liked have sadly gone away, so perhaps this is a good time to revisit that particular usage.

Danny… My Dad does this too…. I asked him once why he keeps typing URLs into the search box and this is what he said:

“because when I click the home icon.. I can just start typing and it puts the text there… and most of the time it works.”

The reason they keep doing it and refuse to learn is simple: It works. As long as it works, people will keep doing it.

As for auto-correcting domains, you don’t wanna do that at all… It’s one thing to guide the user with a “did you mean”. It’s another to assume they meant something else. Oftentimes those misspelled websites DO exist… and somebody somewhere might be looking for them.

Here’s a prime example: search google for one of my sites: NoSlang.com – a PR6 site that gets a few thousand unique visitors / day and has an alexa of about 108,000 (not bragging..just proving it’s not a small obscure site like mysapce.com)

anyway… Google for “noslang.com” and you get this:
Did you mean: notlong.com

I’m going to go out on a limb here and say that i’m 99% sure anybody searching for “noslang.com” didn’t mean “notlong.com”.

I always found it odd that people would type in “cnn.com” to get to CNN as opposed to typing it in the address bar. But everyone uses the web differently…but it does make it interesting how valuable a default homepage is for a browser.

Matt,
Just noticed in the last few days that the site:www.mysite.com is now returning OTHER website pages (russian, etc. with no relationship or reference to mysite.com). Is this new or a resurfacing of an old problem with 301 redirects?

ah hoy there sparky, when i used to search by ip address, because spelling the website was too difficult, some people heard “s’s” others heard “f’s”.. i had them type in the ip address and we used to get “sorry no information is available, if you believe the url is valid, try clicking on the following link.” now we just get “no documents found” can you put that back to the way it was matey?

A site I read regularly is no longer available on Google; http://www.uncommondescent.com/ even if you search uncommondescent there is no search result (it returns http://www.uncommondescent.org which is a dead url) this is a new phenomenon. I’ve seen this for some other sites as well, they have simply disappeared. They all work on the other major search engines but they have just recently stopped returning results on Google. http://www.uncommondescent.com/ is a highly trafficked site and has returned lots of results on Google search until just now. Can you explain what the problem is?

>> I use the Google search box so much I frequently forget the address bar is there. And same with my (80+ year old) mom, who gets confused by any long text box. I agree with Danny S that probably both behaviors (recognition + search) would be better.

>> In the olden days, I was in charge of engineering for a company that syndicated our search engine results (ala Inktomi, but different). We were very puzzled for a while as to why a huge number of the queries we got were URLs. Then, we realized one of our partners (one whose name rhymes with phlegm-yes-men) was using javascript to set focus in the search box when the page loaded, stealing focus from the address bar, and thus getting more ad impressions 🙂

It’s good for super-mega-known sites as myspace or youtube. But what about the other 99.999% sites? Well, less than 5% of sites appear in first position when we type their domain in Google. In fact, now most of my websites don’t appear even in the first 10 results when I type their domain in Google.

When somebody typed mysapce.com, they got no results, and they had to retype it properly. Now, when someone looks for my sites, they DO get results: the sites competing with mine.

Dave Alan wrote “A site I read regularly is no longer available on Google; http://www.uncommondescent.com/ even if you search uncommondescent there is no search result (it returns http://www.uncommondescent.org which is a dead url) this is a new phenomenon. I’ve seen this for some other sites as well, they have simply disappeared. They all work on the other major search engines but they have just recently stopped returning results on Google.”

From my own experience, I’ve been told that (in my case at least) it seems to be an algorithmic issue (according to Google). My site, which is not penalized (again, according to Google), has been dropped out of the index completely, similarly to what’s been described above.

I don’t really like the new behaviour. Often on websites (especially in comments), there are URLs that aren’t links. When I want to visit these, I select them, right click (in FF), and choose “Search for” from the context menu. Often these exact URLs are unknown to google. Now in the old situation it said something like “There is no match for http://www.example.com, but it seems to be a valid url, click here to try it”. So 2 clicks, and I was at the page. Now, if it isn’t found, I still have to copy it from the search bar to the address bar, which means more clicks. Yeah, not THAT many more, but it is more work, and I think I do this several times a day, so it’s annoying enough to write this post about :).

This is a good move on the part of Google. Those of us who study average users’ online actions (and I don’t think there are as many of us as there should be) have known all this for years and it’s nice to see the changes made. I have a few other thoughts on where Google (and others) could tweak the interface to help out the non-experts…

Have a question for you (I don’t know if this is the place to ask, or if I should ask you this at all), but here I go, feel free to ignore it 😛

I work for a pretty big company that has a pretty big website. Let’s call it Example, and the website is http://www.example.com.
Periodically they run a query in Google for all of their domains to check keyword density, the query is:

example site:example.com

and google returns

Results 1 – 10 of about 49,200 from example.com for example. (0.48 seconds)

The problem is, we have not made any changes that may affect those results in the last months, but take a look at those numbers:

3 August

Results 1 – 10 of about 49,200 from example.com for example. (0.48 seconds)
Results 1 – 10 of about 12,600 from example.fr for example. (0.48 seconds)
Results 1 – 10 of about 920 from example.co.uk for example. (0.50 seconds)

And now, 6 October

Results 1 – 10 of about 6,830 from example.com for example. (0.08 seconds)
Results 1 – 10 of about 1,580 from example.fr for example. (0.23 seconds)
Results 1 – 10 of about 1,270 from example.co.uk for example. (0.20 seconds)

Do you know if some of the google updates/changes could have affected those results? Also, the changes in the numbers coincide with the adding of google analytics to our website. Do you think google analytics may have something to do with this?
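To put the size of that shift in perspective, the result counts quoted above can be compared with a few lines of Python. This is only a rough sketch: the figures are the “about” estimates copied from the comment, and the variable and function names are illustrative.

```python
# Approximate result counts quoted in the comment above: (3 August, 6 October).
counts = {
    "example.com": (49_200, 6_830),
    "example.fr": (12_600, 1_580),
    "example.co.uk": (920, 1_270),
}


def percent_change(before: int, after: int) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


changes = {site: round(percent_change(b, a), 1) for site, (b, a) in counts.items()}

for site, change in changes.items():
    print(f"{site}: {change:+.1f}%")
```

With these numbers, example.com and example.fr each fall by roughly 86 to 88 percent, while example.co.uk rises by about 38 percent. Because the counts are estimates, only the coarse direction and magnitude are meaningful.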

I regularly paste full URLs into the google search bar (or should I say, used to).

The reason is that if you paste directly into the address bar, it leaves a history. Sometimes at work, or on someone else’s computer or even my own computer, I don’t want sites left in the history bar.

On my own computer, it’s because I use the drop down address bar like my bookmarks. And every time I paste something in the address bar directly, then it’s stuck in my “bookmarks”.

I’ve found a solution though, I just use yahoo now instead of google, since it still works with yahoo. This and a few other things have made me start using yahoo more frequently.

Bottom line is that I don’t understand the logic in removing features other than to appease advertisers. Which is a sad statement about such a profound and powerful tool. I almost see google as a tool for humanity. It kind of reminds me of how the media is so biased, because in the end, it’s not about news – it’s about money.

Google could have done all of the changes, but kept the same feature, with the description “if this REALLY is what you intended, here’s the link…” What good does displaying the non-clickable URL and saying “did not match any documents” do? If you’re going to display it, just make it a link??

And for the record I don’t buy the coddling “You guys are sooo much smarter than everyone else” bit. It’s just a lot more PC than saying “Well, we did it to make more money, even though it is less functional”.

Oh, and when I say I use the address bar as my bookmarks, I just mean for my very frequently viewed sites – email, investments, blogs, etc. I also use the traditional bookmarks for many other sites. It’s just that the drop down bar is much quicker to access than the bookmarks tab.

Nice Article. As you had mentioned, for the “N” crowd who are the “super-duper power-users who know how to refine site: or info:”, you can harness the power of tweaking the URL with http://URLParser.com
For example, you can tweak the Google search URL to have Extreme Pagination (Fast Forward or Fast Rewind) as described here: http://urlparser.com/example-pagination/#startExample