Prof aims to rebuild Google with stuff in desk drawer

Dave Andersen looked into a desk drawer filled with tiny
computers. Each was no bigger than a hardback novel, and its
chip ran no faster than 600 MHz. Built by a little-known company
called Soekris Engineering,
they were meant to be wireless access points or network firewalls,
and that's how Andersen -- a computer science professor at
Carnegie Mellon -- used them in a previous research
project. But that project was over, and he thought: "They've got to
be good for something else."

At first, he decided these tiny machines could be
super-low-power DNS (domain name system) servers -- servers that
take site names and translate them into numeric internet addresses --
and he asked some Ph.D. students to make it happen. "I wondered,"
he remembers, "if we could do this on a wimpy platform that
consumed only about 5 watts of power rather than 500." Those
students proved they could. But they also told Andersen he was
thinking too small.

After tinkering with his tiny machines, they realised that if
you strung a bunch of them together, you could run a massive
application that no single machine could handle on its own. The trick
was to split the application's duties into tiny pieces and spread
them evenly across the network. "They were right," Andersen says of
his students. "We could use these boxes to run high-performance
large-scale key-value stores -- the kind of [databases] you would
run behind the scenes at Facebook or Twitter. And the rest
is publication
history."
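In rough terms, the approach is a hash table stretched across many machines: a front end hashes each key to decide which small node stores it, so both the data and the requests spread evenly over the array. The sketch below is a hypothetical illustration of that idea, not code from FAWN itself; names like WimpyNode and WimpyCluster are invented here.

```python
import hashlib

class WimpyNode:
    """Stands in for one low-power box holding a slice of the data."""
    def __init__(self, name):
        self.name = name
        self.store = {}

    def put(self, key, value):
        self.store[key] = value

    def get(self, key):
        return self.store.get(key)

class WimpyCluster:
    """Routes each request to whichever node is responsible for the key."""
    def __init__(self, nodes):
        self.nodes = nodes

    def _node_for(self, key):
        # Hash the key and map it to one node, so keys (and the requests
        # for them) spread roughly evenly across the array.
        digest = hashlib.sha1(key.encode()).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def put(self, key, value):
        self._node_for(key).put(key, value)

    def get(self, key):
        return self._node_for(key).get(key)

# 21 tiny machines acting as one key-value store.
cluster = WimpyCluster([WimpyNode(f"node{i}") for i in range(21)])
cluster.put("user:42", "hello")
print(cluster.get("user:42"))  # -> hello
```

Real systems layer replication, load balancing and failure handling on top, but the core idea is the same: many small pieces of work, each well within a wimpy node's reach.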

The year was 2008, and as it turns out, Andersen and his
students were at the forefront of a movement that could reinvent
the way the world uses its servers, making them significantly more
efficient -- and cramming them into much smaller spaces. Startups
such as SeaMicro and Calxeda are now building servers using hundreds of
low-power processor cores originally designed for cell phones and
other mobile devices. HP is set to resell Calxeda machines as
it explores similar systems with a research effort called
Project Moonshot. And the giants of the internet -- including Google, Amazon, and Facebook -- are seriously considering
the possibility of running their operations atop the sort of
"wimpy" processors Andersen found in his desk drawer.

"Wimpy" is the official term. Now into its fourth year,
Andersen's project is known as the Fast Array of Wimpy Nodes, or
FAWN. He regrets the name. "No manufacturer wants to advertise
their products as wimpy," he says. But the name certainly suits his
research, and despite the negative connotation, the project has
attracted the interest of the largest chip maker on earth. Intel
sponsors Andersen's research, and he works closely with researchers
at the Pittsburgh lab Intel runs on the Carnegie Mellon
campus.

The rub is that the Fast Array of Wimpy Nodes isn't always fast.
In some cases, software must be significantly rewritten to achieve
high speeds on a collection of low-power processors, and other
applications aren't suited to the setup at all.

Like so many others across the server world, Intel is
approaching the wimpy-node idea with skepticism -- and not just
because it makes an awful lot of money selling the far-from-wimpy
processors that power today's servers. "Intel is trying to walk a
difficult line," Andersen says. "Yes, a lot of their profit is from
big brawny processors -- and they don't want to undercut that. But
they also don't want their customers to get inappropriately excited
about wimpy processors and then be disappointed."

Dave Andersen says that skepticism is healthy. But only up to a
point. His research shows that many applications can be far more
efficient on wimpy nodes, including not only ordinary web serving
but, yes, large databases. "Intel realises this too," he says. "And
they don't want to get blindsided."

Google Slaps Wimps

Google is a search and advertising company. But it's also
the company the world looks to for the latest thinking on hardware
and software infrastructure. Google uses custom-built software
platforms to distribute enormous applications across a worldwide
network of custom-built servers, and this do-it-yourself approach
to parallel computing has inspired everything from Hadoop,
the increasingly popular open source platform for crunching data
with vast server clusters, to Facebook's Open Compute Project, a collective effort to improve the
efficiency of the world's servers.

So when Urs Hölzle, the man who oversees Google's
infrastructure, weighed in on the wimpy node idea, the server world
sat up and took notice. If anyone believes in wimpy nodes, the world
assumed, it's Hölzle. But with a paper published
in chip design magazine IEEE
Micro, Google's parallel computing guru actually took the hype down a notch. "Brawny cores still beat
wimpy cores, most of the time," read the paper's title.

The problem, Hölzle said, was something called Amdahl's
law: If you parallelise only part of a system, there's a limit
to the performance improvement. "Slower but energy efficient
'wimpy' cores only win for general workloads if their single-core
speed is reasonably close to that of mid-range 'brawny' cores," he
wrote. "In many corners of the real world, [wimpy core systems] are
prohibited by law -- Amdahl's law."
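Amdahl's law puts a number on that limit. The short sketch below is a hypothetical back-of-the-envelope calculation (the figures are illustrative, not taken from Hölzle's paper): if each wimpy core is three times slower than a brawny one, the serial portion of a job suffers that full slowdown no matter how many cores you add, so unless nearly all of the work parallelises, the wimpy cluster never catches up.

```python
# Hypothetical Amdahl's law arithmetic; the numbers are illustrative,
# not drawn from Hölzle's paper.
#
# If a fraction p of a job can be parallelised, the runtime on n_cores
# cores that are each `slowdown` times slower than a single brawny core
# (whose runtime we call 1.0) is:
#     slowdown * ((1 - p) + p / n_cores)

def wimpy_time(p, slowdown, n_cores):
    """Runtime on a wimpy cluster, relative to one brawny core."""
    return slowdown * ((1 - p) + p / n_cores)

# Assume each wimpy core is 3x slower, and we have 1,000 of them.
for p in (0.5, 0.9, 0.99):
    t = wimpy_time(p, slowdown=3.0, n_cores=1000)
    print(f"parallel fraction {p:.0%}: {t:.2f}x the brawny runtime")

# parallel fraction 50%: 1.50x the brawny runtime  (never catches up)
# parallel fraction 90%: 0.30x the brawny runtime
# parallel fraction 99%: 0.03x the brawny runtime
```

With only half the work parallelisable, no number of wimpy cores makes up for the slower serial half; the advantage appears only when the workload is almost entirely parallel, which is exactly Hölzle's caveat.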

In short, he argued that moving information between so many
cores can bog down the entire system. But he also complained that
if you install a wimpy node array, you may have to rewrite your
applications. "Cost numbers used by wimpy-core evangelists always
exclude software development costs," he said. "Unfortunately,
wimpy-core systems can require applications to be explicitly
parallelised or otherwise optimised for acceptable
performance."

Many "wimpy-core evangelists" took issue with Hölzle's paper.
But Dave Andersen calls it "reasonably balanced," and he urges
readers to consider the source. "I think you should also realise
that this is written from the perspective of a company that doesn't
want to change too much of its software," he says.

Andersen's research has shown that some applications do require
a significant rewrite, including virus scanning and other tasks
that look for patterns in large amounts of data. "We actually
locked up our entire cluster because the [pattern recognition]
algorithms we used allocated more memory than our individual cores
had," he remembers. "If you're using wimpy cores, they probably
don't have as much memory per processor as the brawny cores. This
can be a big limiter."

But not all applications use as much memory. And in some cases,
software can run on a wimpy core system with relatively few
changes. Mozilla is using SeaMicro servers -- based on Intel's Atom mobile
processor -- to facilitate downloads of its Firefox browser, saying
the cluster draws about one fifth the power and uses about a fourth
of the space of its previous cluster. Andersen points to this as an
example of a wimpy core system that can be rolled out with
relatively little effort.

Andersen's stance echoes that of Intel. This summer, when we
asked Jason Waxman -- the general manager of high-density computing
in Intel's data center group -- about the company's stance on wimpy
nodes, he said that many applications -- including those run by
Google -- are unsuited to the setup, but that others -- including
basic web serving -- work just fine.

In other words, Google's needs may not be your needs. Even if
your applications are similar to Google's, you may be more willing
to rewrite your code. "I'm a researcher," Andersen says. "I'm
completely happy -- and actually enjoy -- reinventing the software.
But there are others who would never ever want to rewrite their
software. The question should be: As a company, where do you fit on
that spectrum?"