Search is a fundamental component of intelligence or even thought. Maybe that’s why Google is now calling it knowledge. Our brains are already good at search. Look around a room and every object your eye passes is identified in your brain. Something out of place? It catches your eye. That is where we are headed with Internet search, though not exactly the way one might expect.

This is where every new generation of computer scientists brings up the idea of artificial intelligence. If only we made the network smart enough to know not only what we really mean but what we really need. Maybe someday, but for now don’t hold your breath waiting for that one. Starting in the 1980s, fortunes were spent and lost developing artificial intelligence that wasn’t, well, very intelligent. Absent some breakthrough that I have yet to see, this is not a good path to follow even today. Fortunately it isn’t really needed, at least not yet.

Google’s approach is leveraging its existing strength, which is hardware optimization. A couple of years ago the company did research to figure out at what processor performance level — at what percentage of CPU capacity — data center power consumption was minimized. No other company but Google would consider a strategy of deliberately throttling back its data centers.

Whether Google even realized it, this approach to transport efficiency has been around for a long time… in aviation. What Google separately sought were two very special data center power levels known in aeronautical engineering as the Breguet Number and Carson’s Speed.

If you’ve never heard of a Breguet Number don’t feel bad. It is the power level (typically represented by a cruising speed) at which a particular aircraft will have the longest range on its internal fuel. Fly faster or slower than the Breguet Number and you won’t go as far before running out of gas. Every airplane has a different Breguet Number, though the rule of thumb says that 32 percent power is pretty close. I don’t know what power level Google came up with from its data center research, but it is likely in that 32 percent range.
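To make the idea concrete, here is a toy sketch. The drag model is the standard parabolic one, but the coefficients are invented for illustration and come from no real aircraft: parasite drag grows with the square of speed, induced drag shrinks with it, and for a propeller plane the range on fixed fuel peaks right where total drag bottoms out.

```python
# Toy parabolic drag model: D(V) = a*V**2 + b/V**2.
# Coefficients are made up purely for illustration.
a, b = 0.02, 2.0e7

def drag(v):
    return a * v**2 + b / v**2

# For a propeller aircraft, fuel flow is roughly proportional to power
# (drag times speed), so range on fixed fuel goes as 1/D: the best-range
# ("Breguet") speed is simply the minimum-drag speed.
best_v = min(range(50, 400), key=drag)

# Calculus agrees: dD/dV = 0 at V_md = (b/a) ** 0.25
v_md = (b / a) ** 0.25
print(best_v, round(v_md, 1))  # 178 177.8
```

Fly faster and parasite drag eats you; fly slower and induced drag does. Either way the tank runs dry sooner, which is the whole point of the Breguet Number.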

Google found that by operating its CPUs at very low power levels it could broadly optimize search in terms of total power consumption. Running at higher power levels (faster CPU clocks) could get you more search results but with a total power bill that was higher in simple terms of watts-per-search.
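A back-of-the-envelope model shows why such a sweet spot exists at all. The constants and scaling laws below are my assumptions, not Google’s numbers: a fixed idle power, dynamic power growing roughly as the cube of clock speed, and queries served scaling linearly with the clock.

```python
# Assumed model: total power = P0 (idle) + k*f**3 (dynamic),
# throughput ~ f, so energy per search is (P0 + k*f**3) / f.
P0, k = 100.0, 2.0   # illustrative constants only

def watts_per_search(f):
    return (P0 + k * f**3) / f

# Sweep clock multipliers from 0.1x to 3.0x and pick the cheapest.
fs = [x / 100 for x in range(10, 301)]
f_best = min(fs, key=watts_per_search)

# Calculus puts the minimum at f* = (P0 / (2*k)) ** (1/3).
f_star = (P0 / (2 * k)) ** (1 / 3)
print(round(f_best, 2), round(f_star, 2))  # 2.92 2.92
```

Run the clocks slower than the sweet spot and idle power dominates each search; run them faster and the cubic dynamic power does. Either way the watts-per-search bill goes up.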

Operating data centers at their Breguet power level means building three times the facility that Google would need if they ran the place the old-fashioned way — balls to the wall.

Deliberately having three times the computing power available brought unintended consequences — the same consequences that any 17-year-old experiences when they replace the stock engine in their old clunker with a powerplant three times as big. Google acquired a lead foot. The first result of that lead foot was Instant Search — using extra CPU cycles to prefetch search results in real time. It’s not something Google set out to do but rather an unintended consequence of overbuilt data centers.

This brings us to Carson’s Speed. Breguet was a French engineer best known for his family’s fine watches, while Carson was a professor at the U.S. Naval Academy.

The problem with Breguet Numbers for pilots is that airplanes are intended to go fast and Breguet-friendly power levels are slow and boring. Going faster is a constant temptation with airplanes because they are of necessity built with a lot of excess power — power that is needed for climbing to altitude. An airplane built with an engine small enough to only reach Breguet Number speeds wouldn’t have enough power to even get off the ground. If you have excess power (and finite patience) what is the best speed to fly?

That would be Carson’s Speed — the speed that gets the most extra speed for the least extra cost. Or, as Carson put it, “the least wasteful way of wasting.” For aircraft the speed in question turned out to be 1.32 times the speed for the most miles per gallon (the Breguet Number). Carson’s Speed uses excess power most efficiently.
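That 1.32 falls straight out of the same parabolic drag model (coefficients again invented for illustration): minimizing drag gives the Breguet speed, minimizing drag per unit of speed gives Carson’s Speed, and their ratio is the fourth root of three, about 1.316.

```python
# Same toy drag model: D(V) = a*V**2 + b/V**2 (coefficients invented).
a, b = 0.02, 2.0e7

v_md = (b / a) ** 0.25            # minimum-drag (Breguet) speed
v_carson = (3 * b / a) ** 0.25    # speed minimizing D/V

print(round(v_carson / v_md, 3))  # 1.316, i.e. 3 ** 0.25

# Numerical cross-check: sweep speeds and minimize D/V directly.
def drag_per_speed(v):
    return (a * v**2 + b / v**2) / v

best = min((x / 10 for x in range(500, 5001)), key=drag_per_speed)
print(best)  # 234.0, matching v_carson
```

Minimizing D/V rather than D is exactly the “least wasteful way of wasting”: every extra knot costs fuel, but at this speed each knot costs the least.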

Other than three G-Vs and one Boeing 767 built for a harem, Google flies data centers, not airplanes. But Google’s situation going into its power experiment was actually very similar to aviation because it was an exercise in reducing power. Google data centers weren’t built to Breguet specs; they were faster. Given this excess computing power that had already been paid for in capital terms, what was the most efficient way of using it? Carson’s Speed — about 43 percent power — leaving plenty of excess cycles for new services like Instant Search.

But once you enable Instant Search for everyone, the data center is again running consistently above its Carson’s Speed which means you need even more hardware to bring the building back to 43 percent. It’s an arms race that until this moment only Google may have known they were conducting.

Google competitors have been constantly building new data centers, too, but they never knew there was a specific target beyond just keeping up with Google.

Google sees the excess power above Carson’s Speed as a safety margin in case traffic spikes or a data center goes down, which makes sense. But it is also a strategic advantage over Google competitors.

Having discovered its lead foot, Google will employ it more and more. We’ll see whole new types of brute force services aimed at using excess CPU cycles against expanded data sets to reduce the distance between searching and finding. If a question is answered as quickly as it is asked and all the answers are cached and analyzed the results may actually start to appear before they are needed.

56 Comments

I’m confused. Say Google applies its lead foot and starts using all that excess power beyond Carson’s Speed by introducing the next big search thing after Instant Search. Wouldn’t that move them even further away from the place where they’ve optimized the power usage? Doesn’t that defeat the purpose of all that analysis to find the Breguet number in the first place? Not even Google can have its cake and eat it too.
Or are you saying there are two factions at Google ? One that wants to optimize and keep costs low, and another that says, “hey, look at all these unused CPU cycles, let’s do something with them”.

And one last question. Why am I the only one commenting ? Is everyone else stuck in a fire drill ?

Euro2cent
May 20, 2011 at 12:58 pm

Bob’s blog has been acting up – the three latest entries (with this one in the middle of the pack) vanished from the front page for about 24h, as seen from my end.

David White
May 20, 2011 at 4:13 pm

I saw something similar. http://www.cringely.com yielded no code with Firefox (3.6.14), but displayed normally on Internet Explorer (version 8). This persisted for two days, then the page displayed normally.

We had a cache plugin implode (explode?) but we’ve now killed that and are running without a cache for the moment. It took 24 hours to break into the system to disable the plugin since it affected the admin login, too. Even rebooting the server just restarted the corrupted plugin. Sorry.

Ronc
May 21, 2011 at 2:50 pm

As of Saturday afternoon, at the time of this post, the “www” site is up to date but cringely.com without the “www” is not being updated.

Francis
May 24, 2011 at 5:35 pm

If it happens again, you can turn off all wordpress plugins by accessing the database and changing a column. WordPress is great, but the plugin world is a jungle . . .

At first the 17-year-old uses his foot sparingly or not at all. Then there is a girl to impress or a rival to crush and down comes the foot more often. Eventually the performance threshold (and the 17-year-old’s need for thrills) rises. The point here for Google is that it sets parameters and a calculable target for data center growth. Nobody else has that. “We need to go faster or handle more traffic!” is the mantra everywhere else. Google now can say “We need 40 percent more transactions by November.” There is power in that.

Reagan brought down the Iron Curtain by outspending the Soviets but an important component of that was psychological — only Reagan knew how much he was willing to spend. Now replace the USSR with Yahoo or Bing and you’ll see what I mean.

It was failed food production, not Reagan, that brought down the USSR.

Dan W
May 23, 2011 at 9:19 pm

Reagan pushed things over, but the situation was made possible by Nixon (much as it pains me to admit it) and détente: once the satellite nations were exposed to Westerners with all their consumer goods, maintaining those satellites became too expensive for the Soviets.

Francis
May 24, 2011 at 5:27 pm

Or their Afghanistan war and the effects of hidden inflation. This is how a Russian friend explained it to me. Prices were not allowed to rise by law. Everyone got good wage increases every year (or at least they did in my friend’s circle – he was a professor – but this was communism remember, so they probably went up for everyone else substantially as well). So what began to happen was hidden inflation. There was way too much money chasing too few goods. Prices were fixed, so things sold out fast. Meat arrives at the supermarket and it’s all gone in half an hour. People start to line up to be in the lucky first few that actually get the goods. Long queues show up everywhere.

The Soviet Union collapsed because wages were too high and the government printed too much money. Of course this was the USSR, stuff like that wouldn’t ever happen here. (And even if it did the new date for the end of the world is October 21, so who cares?)

J Peters
May 25, 2011 at 10:49 am

Guns or butter holds true.

Bob is on a roll with this month’s posts. Cringe has got his mojo working 😉

BAZZ
May 20, 2011 at 6:23 pm

Penny’s top status in search results some time back brings us back to reality.
1 Penny was on top by manipulation
2 Google changed that manipulation by manipulation to reduce Penny’s prominence.

SO WHAT IS THE ACTUALITY?
A fraud perpetrated by Penny
Or a fraud perpetrated by Google!

In science, for both the Breguet Number and Carson’s Speed, the methodology is open, free and EXAMINED. The only profit Breguet and Carson got was having their names tagged to an idea.
Why does Google not want to tell the world its search algorithm — because of the money!
Coca-Cola had the USA pass laws allowing small quantities of chemicals to be added to Coke that do not appear on the ingredient list. WHY? These unnamed and unannounced chemicals are what Coca-Cola uses to make sure that Coke exported to a particular area is not sold elsewhere!
That is what Google does with its algorithm.

You don’t like Goldman Sachs — well, Robert X, get educated about Google!

Vijay
May 20, 2011 at 9:35 pm

It’s funny how natural language allows you to connect concepts and observations with each other that have little or nothing to do with each other. It also allows you to express concepts that sound overwhelming while being rather illogical or contradictory at the same time, like square circles or cold fusion.

I wouldn’t advocate substituting natural language with lambda calculus so we can better understand the bindings involved with “This is not a fire drill”, but making connections between Coca Cola’s recipe, GS and Google does deserve more linkage.

BAZZ
May 22, 2011 at 6:39 am

In the world there are altruism and capitalism, and like oil and water they are hard to mix.
Coke’s problem is contractors who do not follow the contract and want to make extra money on the side with deals that cut Coke out, anywhere in the world (selling Mexican Coke in NY, say). Coke solves that problem by analyzing its marker chemicals and not recontracting with contract breakers — one or two bottlers lose millions to gain thousands, and Coke gets stability of contracts with no cheats.
Google cheats for Penny. Does Google cheat for Apple, Microsoft, Sun Computers? Do you know? Google knows but is not telling. That is the problem, whether Pravda tells the truth or not. Coke does not need to tell the world anything; it just tells a bottling contractor it failed the contract when it cheats.
With knowledge, Robert X’s theme here, we do not know the truth – the whole truth – we have only Google’s version and we have to trust it.
That is the problem – trust.
How do you trust data?
A: scientifically, by replication of the experiment
B: on faith, as with religion
C: by asking a judge for verification under oath
D: by the GOOD name of the teller, as with Goldman Sachs
[Goldman Sachs caused the GFC by selling crap and betting on it to fail — better than a casino because there were only losers, and Goldman Sachs kept 150% of the takings (the extra 50% was to bankrupt the insurance company AIG by insurance fraud).
Its name is mud.]
Google’s trustworthiness is lacking:
1 In France, where it paid a $500,000 fine for stealing private data via WiFi.
2 It denied all theft in many nations, including France.
3 There are the statements on privacy by its former CEO.
ALL of this leads it to make products that have faults in its favor so it can make money!
So the Knowledge that Robert X so likes could be propaganda without verification. That is the problem.

But Google’s fraud has not been exposed yet!
And I’m trying to point to Google’s scam.
I hope I’ve joined all the dots between Coke, Goldman Sachs, and Google.

I’ll tell you what bugs me. I write a column that says “I’m mad as hell about X!” and get 187 comments, but if I write a column like this one that has real information that appears nowhere else on the net, it gets 9-10 comments, with a third of those from me. Go figure.

BAZZ
May 22, 2011 at 7:08 am

Robert X you still have not replied to my inquiry as to what X means.
But I agree with the science of your blog, not the altruism that you attribute to Google or the veracity of the “knowledge” that Google gives us.
And whatever the ethics of Google, its science is impeccable!
When one gives an opinion on Y, it runs about 50-50 for or against Y.
And my favorite business author, C. N. Parkinson, has a law on debates: the time spent and the opinions offered are inversely proportional to the gravity of the subject.
With facts there are no opinions: it’s either Sunday or not!

Ronc
May 22, 2011 at 1:56 pm

The real problem is the web site not being updated since the May 13th “hero” column which has 88 comments. That problem was still affecting me until a few minutes ago. People just haven’t seen the last 3 columns.

bjust
May 23, 2011 at 10:38 pm

Yep… Today 4 of your columns showed up at once on my RSS feed listing. Next time there’s a suspicious break I’ll check the page directly.

Sam Stickland
May 24, 2011 at 1:31 am

Likewise, I had four articles all show up on my RSS feed at 10:59pm yesterday (BST); I’m still getting through them.

Also, I wonder: if you have a large number of impressions but not too many comments, perhaps that actually means people agree?

JT
May 24, 2011 at 5:24 pm

Would not have helped. I’ve been checking the website daily since last week. All the articles showed up at once on Monday.

Jon Du Quesne
May 23, 2011 at 2:42 pm

OK Bob, I’ll respond to that.

I’ve read your columns for years. The reason that I read them is because you do an excellent job of concisely explaining things. Though you ostensibly deal with “technology” your commentary spans more than “mere technology”. I read a lot of varied, not-necessarily-connected types of books. I read your column because I never know what new things I might learn. Now I know about the Breguet Number and Carson’s Speed. Something more to throw out at the next party that I’m not going to!

Thank you for that!

Dave
May 23, 2011 at 2:59 pm

Bob-

The cache problem you had also affected your RSS feed – I just found the last 4 articles in my feed within the last half hour. Guess what I’m reading first.

As to efficiency and overcapacity, Google is wielding it like a double-edged sword. Their extra capacity gives them options. Their efficiency lets them return search results more inexpensively than anyone else. That matters to the bottom line, and makes a “free” service cheaper to supply and harder to compete with.

Dave Cole
May 23, 2011 at 5:10 pm

bob:

fwiw, i didn’t get my usual 24-hours-later email from your wp software on any of last week’s postings … i suspect i’m not alone … so there may be many of your loyal readers who didn’t even realize you’d posted these ideas.

Vijay
May 23, 2011 at 10:09 pm

While I find the correspondence between airplanes and data centres interesting, and gladly believe that it’s important to optimize data centres accordingly, I don’t subscribe to the implicit suggestions involved with the following conjecture (from a previous column):

“Microsoft and Google and their competitors, if any, may know a crapload of computer science, but for the most part that is becoming irrelevant. The fastest way to sort numbers was discovered years ago and mathematically proved. That’s not going to change. It is completely uninteresting trying to invent a faster method of sorting.”

Search technology, or data mining for that matter, is not something that’s reached its ceiling in the manner sorting has. Finding relationships, indexing and refining search queries still is and will continue to be a super dynamic discipline. For that matter, people who speak of Google’s algorithm or complain about adaptations of this algorithm are missing the point. Both the algorithm and the development of “the algorithm” are highly dynamic, and we clearly see that “the algorithm” is situated in an ecosystem which is itself adapting, which calls for corresponding adaptations of “the algorithm”, although this is a minor aspect of the overall development of search technology.

If it really were like sorting, then it would be more about hardware optimizations, but it’s clearly more important to a) harvest data, b) be more intelligent than the competitor at indexing or relating the data, c) get access to the engine as close to the consumer as possible, and d) make the engine interact with the consumer most effectively.

For a) it’s about getting as many (multilingual) sources as possible, together with keeping up to date on those sources if they’re prone to change. For b), the sky’s the limit unless you believe that inverted indices are still being used. For c), it’s about synchronizing the data across the globe as efficiently as possible, choosing the best locations and brokering the best deals. For d), it’s about having been most intelligent in step b), allowing you to provide the best iterations for the user to find whatever they seek or were unaware they were seeking, which fares well on a global scale if you’ve been good at step c).
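For readers who have never seen one, here is a toy version of the inverted index, the classic baseline structure mentioned above; the example documents and the simple AND-query helper are my own invention, and real engines go far beyond this, which is the point:

```python
from collections import defaultdict

# A toy inverted index: map each term to the set of documents
# containing it. Documents here are invented for illustration.
docs = {
    0: "google optimizes data centers",
    1: "airplanes optimize cruise speed",
    2: "data centers and airplanes share an optimization problem",
}

index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.lower().split():
        index[term].add(doc_id)

def search(*terms):
    """Return the documents containing all query terms (AND semantics)."""
    results = None
    for t in terms:
        postings = index.get(t.lower(), set())
        results = postings if results is None else results & postings
    return sorted(results or [])

print(search("data", "centers"))  # [0, 2]
```

Everything interesting in modern search, from ranking to query refinement, lives in the layers this sketch omits.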

Vector distances, which were featured in a previous column some years ago, are interesting and provide a nice vehicle to carry a search engine, simply because there’s so much available research on vectors. But having the best education combined with working with the most knowledgeable individuals on vectors and having all the data you want doesn’t buy you a good search engine.

You can graduate from a university which can train you to be the very best at sorting but there is no university, not even Stanford, which can train you to produce the best search engine. If they started a program today, while following the program would be greatly enriching, it would be obsolete before you graduated, at least when it comes to competing in the market.

If search engine technology reaches its ceiling, then hardware optimizations may become the dominant factor, but if search engine technology truly reaches its ceiling then it’s not necessarily computer science which will become irrelevant; rather it’s humanity itself which will be irrelevant at that point in time. Having said that, combined with observations of current human behaviour, we may be reaching that point sooner rather than later, and not necessarily because search engine technology is becoming so much better.

pdwalker
May 24, 2011 at 7:27 am

probably because most people are not smart enough intellectually to understand what it means.

Nortcele
May 24, 2011 at 8:08 am

Everyone has an opinion and they are more than willing to share it and respond to others with a differing opinion. It doesn’t require nearly as much brainpower. However, responding to the presentation of real information requires an understanding. I don’t know about others, but half the information presented on Google went over my head. The “Carson’s speed” principle for aircraft is based on tangible things. I would need to see some Google “power curve” charts with numbers to effectively correlate. And even then, I don’t know if I would have a comment. I would just absorb the info into a neuron flat file somewhere.

You sound like my wife when the mail server is down. “I’m not getting any email. Nobody likes me.”

John Dreese
May 23, 2011 at 4:04 am

I never thought I’d see a presentation of Breguet’s equation and Google’s infrastructure model. Very clever.

Jon
May 23, 2011 at 3:22 pm

My thoughts about the Breguet number relate to the ability of Google to retain its popularity through “Michael Jackson is dead” moments.

If one of these moments occurs, Google continues to operate, if they didn’t people would look at Bing or Yahoo.

Too big to fail?

Happy Heyoka
May 23, 2011 at 9:08 pm

Firstly, someone else mentioned that the RSS _just_ issued four updates in one go. The timestamps on those for me are correct (thursday, thursday, friday, today)… and I have certainly hit the fetch button more than once since thursday.

I have long had the opinion that, more than anything else, Google’s greatest asset is its expertise in marshalling all that server power and distributing the software across it – and doing so with fairly generic hardware.

They really are in a great position now to do amazing, never-before-contemplated things with all that data and CPU; and my guess is it will be one of those “20% time” projects that does it… “AI” is one of those asymptotic concepts, but perhaps it will be some kind of live, “intelligent and predictive” aggregation of geographic data into Google Earth or something?

john raines
May 24, 2011 at 2:40 am

There must be a Breguet number for cars. It strikes me that it is likely faster for cars designed to have low aerodynamic drag than for boxy cars (has anyone else noted the similarity between one of the boxymobiles and the shape of Fred Gwynne’s head?).

If I am correct, the Breguet number for a Prius is higher than the number for virtually everything else on the road. Of course none of the drivers know this so the Prius drivers poke along a little slower than average.

matt wilkie
May 24, 2011 at 8:44 am

A very intriguing thought, John. Thanks for bringing it up. I hope someone picks up on it (and shares the results of their study).

YetAnotherBob
July 4, 2011 at 7:17 pm

For a standard internal combustion engine with an automatic transmission, the most efficient speed is 35 MPH. That would correspond to the Breguet number. The other number is 55 MPH. That’s been known for many years, hence the 1970s national freeway speed limit of 55 MPH.

For standard transmission cars, and for vehicles with overdrive, the numbers are different. The highest I have seen is for diesel tractor trailers. Some of them have best speeds of up to 85 or 90 MPH.

Power is one input, air drag is another. But the biggest factor is transmission losses. For most vehicles, the speed is a fast idle in the highest gear.

This is from my Mechanical Engineering classes in College.

francis
May 24, 2011 at 10:36 am

Here’s an interesting idea (or not): imagine a 128 x 128 square of pixels and give it 8 bits for red, 8 for blue and 8 for green. Now, generate every possible picture. Every possible picture — pictures of the first caveman, pictures of the President of the USA in 2025, pictures of Christ, pictures of every movie star . . . oh, you get the idea. A lot of the pictures would be meaningless grey blotches if the search is serial or random. But put some AI in there to constrain the search and let it evolve in meaningful directions . . . have you built a time machine? A machine that can look into the past and the future, in fact all possible futures and all possible pasts?
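The arithmetic shows why the AI constraint would have to do all the work in this idea. A quick count of the image space, using the sizes straight from the comment above:

```python
import math

# How many distinct 128x128 images with 24 bits per pixel exist?
pixels = 128 * 128              # 16,384 pixels
bits = pixels * 24              # 393,216 bits per image
total = 2 ** bits               # every possible image

# The count has over a hundred thousand decimal digits,
# so exhaustive enumeration is hopeless.
digits = int(bits * math.log10(2)) + 1
print(digits)  # 118370
```

For comparison, the number of atoms in the observable universe is usually estimated at around 10^80, a rounding error next to a 118,370-digit number.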

GlynM
May 24, 2011 at 12:54 pm

Picking up on John Raines’ comment about the Breguet number for cars, besides the aerodynamic coefficient, I imagine it would need to be calculated on the capacity of the engine and whether it is fueled by petrol (gasoline) or diesel. The latter of course affects both the power and torque outputs of the engine at different revs. If one considers that for planes in level flight at cruising speed, the optimum power output is 32% of max, one would need to have a very powerful car engine to keep a vehicle moving at a decent speed using only 32% of capacity. Methinks some studies and comparative data are required.

Also – Given that John introduced the idea of the Breguet number for the Prius (being a hybrid), one wonders if there is a Breguet number for pure electric cars. Taking that further, for electric motors driven by AC vs electric motors driven by DC…. How about turbines and generators….? All of which leads one to think that maybe there are optimums associated with the generation of electricity on the national grid, and possible carbon offsets.

They never seemed to get behind it, however. Interested in how this fits with your thesis — perhaps the economics simply weren’t there for Yahoo, because they hadn’t overengineered (and I use this term positively) their server farms like Google has.

Awesome article, but I can’t picture what drag would be in computer terms. Enlightenment please?

Anyway, an interesting side note is that power supplies must also be considered. Power supplies are usually optimized for high loads (i.e., 70%–90% of maximum power output). So I wonder if they use power supplies that can even run the CPU at full power.

That’s certainly an aspect of this evolution, Tim. Metadata can be used to assign to sources a reliability score, for example (there are many other techniques, too). Librarians had the advantage of operating on a much smaller database with mainly professional sources, though that doesn’t always make them more correct. Then there’s the information backscatter that afflicts you. That, too, has to be controlled in some fashion, probably through better targeted search results.