IBM said today that it will develop two new supercomputers for the U.S. Department of Energy that are based on IBM’s new Power servers and will contain NVIDIA GPU accelerators and Mellanox networking technology. The new supercomputers, to be named Summit and Sierra, will be ready to roll in 2017; IBM will end up scoring a cool $325 million in government contracts for the project.

The Summit supercomputer will be installed at Oak Ridge National Laboratory and Sierra will be part of Lawrence Livermore National Laboratory. Both supercomputers are supposed to help the U.S. discover new ways to slow climate change, predict natural disasters, store nuclear waste and improve fuel efficiency.

IBM claimed that both supercomputers will be able to deliver in excess of 100 peak petaflops, which trumps the current reigning supercomputer champs. Oak Ridge’s Titan supercomputer delivers 27 peak petaflops, and China’s Tianhe-2, currently the fastest supercomputer in the world, delivers 55 peak petaflops.

The supercomputers are based on IBM’s OpenPower technology, which is managed by the OpenPower Foundation. The OpenPower tech is part of IBM’s efforts to cater to the webscale crowd that needs custom architecture to handle the kind of heavy duty workloads necessary for big-data tasks.

In early October, IBM unveiled a new server that contains both a Power8 processor and Nvidia’s GPU accelerator that IBM wants to sell to the “Linux-scale out market,” said Brad McCredie, an IBM fellow and vice president with IBM’s systems and technology group.

The internet today is broken (which is why Gigaom recently devoted a series of articles to the subject of fixing it). Control of key elements is becoming more centralized; it is being increasingly censored and manipulated; many aspects are open to intrusion; and the cloud’s reliance on massive, expensive server operations breeds business models that rely on turning everyone’s private data points into dollars.

A bunch of slightly mad Scots at a company called MaidSafe may have the solution. Over an impressively lengthy development period of 8 years, they have come up with something called the Secure Access For Everyone (Safe) network, which they are about to start testing. If it pans out, the Safe network will provide an un-censorable, secure, resilient successor to today’s internet, complete with a built-in economy and the foundations for a distributed, autonomous intelligence.

The idea is not wholly original – a dash of distributed computing here, a soupçon of altcoin there – but it is impressively glued together and actually relatively plausible as these things go. That’s not to say Safe doesn’t have problems, because it does, but they may not be insurmountable.

A post-server world

The Safe network was the brainchild of David Irvine, a network design engineer and serial entrepreneur (his startup Ayrsoft made small business server software called eBoxit) who wanted to improve systems and realized the problem lay in servers. As per the telling of Nick Lambert, MaidSafe’s chief operating officer, servers are an unnatural and unnecessary intermediary in online communications. (“MaidSafe”, by the way, is a play on RAID that stands for “Massive array of internet disks”. Sorry, it’s a horrible name.)

However, there are certain problems with moving away from the server model, as Lambert explained: “If you don’t have servers, what do you log into? You also need to have data that you could almost have with your worst enemy and they still can’t read it and decipher it. The third [problem is] you have to have a network that’s autonomous. The network must be able to heal and manage itself – for a lot of things people are unfortunately the weakest point. They’re very corruptible … this often leads to problems.”

The Safe model is a bit like SETI@home or perhaps even a botnet, federating the spare capacity of its users’ computers and internet connections to create a distributed supercomputer that obviates the need for centralized servers. Users’ computers, the nodes on the network, can all contribute resources to the pooled effort, including storage, bandwidth and processing power, by running a special app. Data going through the network are broken up into shards of encrypted information that are each stored in at least 4 places at any given time, with no one able to reassemble them but their owners or the intended recipient.
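To make that data model concrete, here is a minimal sketch of the general idea: split, encrypt, content-address, replicate. To be clear, this is not MaidSafe’s actual self-encryption scheme; the chunk size, the key derivation and the toy stream cipher are assumptions for illustration only.

```python
# Illustrative sketch only, not MaidSafe's real self-encryption scheme.
# Idea: split a file into chunks, encrypt each chunk with a key only the
# owner can re-derive, address each shard by its hash, and replicate it.
import hashlib

CHUNK_SIZE = 1024 * 1024  # assumed 1 MB chunks
REPLICAS = 4              # "stored in at least 4 places"

def keystream(key: bytes, length: int) -> bytes:
    """Toy stream cipher: stretch a key into a pseudo-random keystream."""
    out, counter = bytearray(), 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def shard(data: bytes, secret: bytes):
    """Split data into encrypted, content-addressed shards."""
    shards = {}    # shard id -> ciphertext, as the network would see it
    manifest = []  # ordered (id, key) pairs; only the owner keeps this
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        key = hashlib.sha256(secret + hashlib.sha256(chunk).digest()).digest()
        cipher = bytes(a ^ b for a, b in zip(chunk, keystream(key, len(chunk))))
        shard_id = hashlib.sha256(cipher).hexdigest()
        shards[shard_id] = cipher
        manifest.append((shard_id, key))
    # The network would push each shard to at least REPLICAS nodes; without
    # the manifest, no node can locate, decrypt or reassemble the file.
    return manifest, shards
```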

The shared resources are bound together through a routing layer based on the Kademlia distributed hash table, which also underpins filesharing networks like BitTorrent. “It’s fast to the point where, if a node goes offline, the network is aware of it in the time it takes a ping to go, so that’s 20ms,” Lambert said. All this runs on top of the standard TCP/IP internet protocol suite, so it can happily use the internet’s existing hardware infrastructure. “But it is basically rewritten from there up. We’re going from Level 3, the networking layer, right the way up to the application layer.”
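Kademlia’s trick is to measure the “distance” between two IDs as their bitwise XOR, which lets any node work out which peers should hold a given piece of data without consulting a central directory. A toy illustration (the node names are invented, and MaidSafe’s real routing code is far more involved):

```python
# Toy illustration of Kademlia's XOR metric; node names are invented.
import hashlib

def node_id(name: str) -> int:
    """Hash a name into a 160-bit ID, as Kademlia does."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

def distance(a: int, b: int) -> int:
    """Kademlia defines distance as the bitwise XOR of two IDs."""
    return a ^ b

# Shards live on the nodes whose IDs are closest to the shard's own ID,
# so a lookup needs no central directory, just successively closer peers.
nodes = [node_id(f"node-{i}") for i in range(100)]
shard = node_id("some-encrypted-shard")
holders = sorted(nodes, key=lambda n: distance(n, shard))[:4]
```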

A big advantage of this approach is resiliency. According to Lambert, the unintended loss of data on the network would require the loss of power in 4 continents at the same time. And in any case, if that happened, data loss would probably be the least of our worries.

From the user perspective, Safe will potentially provide a similar experience to what people are used to, with a completely different back end. Once they’ve logged in from their desktops, services would come from the network as they come from the cloud today – browsers, app stores, messaging and video chat; whatever developers come up with. “There would be nothing to stop people putting operating systems inside the network,” Lambert posited.

Incentives for users and developers

So how does the network avoid the so-called tragedy of the commons, whereby not enough people contribute resources and a few selfish users chew them all up? Well, people can use the network without contributing, but then they won’t earn safecoins, a Bitcoin-derived virtual currency that exists to reward users (“farmers”) and app developers (“builders”) for their efforts. The network pays out safecoins to developers according to how much their applications get used (direct micropayments from users are also possible) and to users according to their contribution, which is calculated in real time.

At the start, safecoins will be quite low-value: when MaidSafe raised $6 million through a safecoin sale in April, it was at a rate of 17,000 safecoins to one bitcoin (worth about $450 at the time). Nonetheless, the outfit hopes safecoin will become a big cryptocurrency, and that will require a way to exchange safecoins for bitcoins or litecoins or, you know, real money like dollars and euros. Hence, one of the first applications needed on the network will be a distributed currency exchange, perhaps a bit like OpenCoin’s Ripple.

One question that will hopefully get cleared up during the tests, for which 500 developers have signed up, is how quickly a user can farm safecoins. “Unfortunately we don’t know until we get the test networks up and running,” Lambert said. “It’s very much dependent on how much resources are on the network. If there’s lots of people farming, there are fewer chances to earn safecoin. There are so many variables that it’s difficult to predict with any certainty what users can expect.”

A more fundamental problem is the shift to mobile. It’s all very well for a PC user to leave their machine on 24/7 in order to earn as many safecoins as they can, but you simply can’t do that with today’s mobile technology and data pricing. The connections and local processing power can probably handle it, but the batteries can’t – there’s a reason phones are forever going to sleep – and data usage caps are too restrictive. Down the line these things may change, but for now they mean mobile users are only theoretical consumers, not contributors. MaidSafe could pin basic access to the possession of safecoins, but those will be scarce in the early days. It’s a fine line to walk.

So the model isn’t perfect, but it does promise access to effectively unlimited free storage and computing power with anonymity and heavy inherent security, all for donating resources that many PC users can easily spare. The farming/mining system makes more sense than that of Bitcoin, because mining bitcoins involves using a lot of computing power to answer essentially pointless mathematical questions. And the distributed, self-optimizing nature of the Safe network could play very well with both the internet of things and content delivery.

Also, the network is effectively a giant computer that may prove very useful as such – Lambert reckons it could be powerful enough to run an artificial intelligence that draws on all the knowledge in the network, and in fact MaidSafe has been talking with an EU-funded project called RoboEarth about this very goal.

What’s in it for MaidSafe?

There are effectively 2 MaidSafes: MaidSafe.net, a for-profit company, and the MaidSafe Foundation, a charity for education and innovation that owns half of the company and will theoretically be funded through dividends once everything takes off.

MaidSafe the company will earn safecoins by releasing its own apps, and also by improving the core code. A third revenue stream would come from what Lambert described as a “fairly significant intellectual property portfolio which is there for protection.” Yes, patents – anathema to many in the open source world, but potentially useful for protecting developers on the Safe network down the line.

The product itself is open source, dual-licensed under GPLv3. Anyone can freely use it or even fork it, as long as the resulting product is also open source – if it’s not, then the company requires payment. “It may well be that a centralized current incumbent wants to use some of our libraries; for example, our rUDP and routing libraries may be sufficient for content delivery networks like Akamai,” Lambert suggested.

In other words, if everything works out then MaidSafe can keep going, and even if the company drops off the face of the earth for some reason, the project code can live on in some future iteration. Either way, I strongly suspect this concept is where a lot of other ideas have been heading. If it’s going to happen, maybe now’s the time — we’ll find out from September, when MaidSafe hopes the initial tests will be successfully completed and the beta phase will begin.

For more information on MaidSafe and the Safe network, read their whitepaper — also check out my recent report on initiatives to reclaim online privacy through decentralization and other means.

In 1976, famed computer architect Seymour Cray released one of the most successful supercomputers ever made: the Cray-1, a stylish 5.5-ton C-shaped tower that was quickly embraced by laboratories all over the world. While it soon gave way to newer, faster Cray models that then faded away entirely in the ’90s due to huge cost and performance advances in supercomputing, its iconic shape and early success left a lasting legacy in the industry.

Seymour Cray and a Cray-1 supercomputer. Photo courtesy of Cray.

That legacy led hobbyists Chris Fenton and Andras Tantos to pose what they thought would be a simple question to answer: How can I build a Cray-1 for my desk?

A tiny replica takes shape

In 2010, Fenton, a New York City-based electrical engineer who actually works on modern supercomputers, decided to replicate the physical form of the computer. Its hardware was well-documented online, so the build came together quickly. Fenton used a CNC machine and glue to build the tower and bench out of wood. Then he painted the tower and covered the bench in pleather. The complete model is 1/10 the size of the original Cray.

Considering that an iPad packs far more computing power than a Cray-1, it wasn’t difficult to find a board option that could handle emulating the original Cray computational architecture. Fenton settled on the $225 Spartan 3E-1600, which is tiny enough to fit in a drawer built into the bench. Given that the first Crays cost between $5 million and $8 million, that’s a pretty impressive bargain.

A lead emerges from a Minnesota basement

But while Fenton was able to replicate the architecture, he hit a wall when he began searching for software to make the Cray model fully operational. He determined none of the code from the original OS was available via the internet, so he went analog. He asked the Computer History Museum and the government whether they had a copy lying around. Nope.

His first lead came via a friend who introduced him to Donald Lee, a former Cray software engineer who had “this giant 10-pound disk pack” — an early, removable medium for data storage — in the basement of his Minnesota home. Lee told me last week that he isn’t exactly sure how the disk ended up in his possession, but he might have picked it up at a garage sale or when someone was cleaning house before leaving Cray.

To run the disk, Fenton needed a disk drive. He borrowed a 1970s-era one from the Museum of Information Technology at Arlington in Texas. Poor archival conditions had left the drive in rough shape, so he cobbled together a system involving a robot made from the innards of a MakerBot Thing-O-Matic. The robot slowly passed over the disk, allowing a converter to change the data from analog to digital and feed it onto Fenton’s computer.

The disk drive. Photo courtesy of Chris Fenton

Disappointingly, the disk pack only held factory-testing software.

A second disk pack surfaces

The publicity Fenton garnered via his documentation of the project caught the attention of Andy Gelme, an Australian software developer who once worked for Cray. He too had a disk pack.

Because Gelme was concerned that shipping would damage or demagnetize the disk, a friend of his couriered it to Fenton during a planned trip to New York City a few weeks later. While the disk didn’t contain the original Cray-1 OS, it did carry the last-ever version of the Cray OS, which was made for a Cray-1 successor: the Cray X-MP.

Fenton used the recovery system he developed with the disk drive to pull information off the disk. He was not able to convert it into working software, but he had begun to correspond with Tantos, a Microsoft electrical engineer, who had independently been pursuing the Cray OS.

Tantos took over working with the disk. He rewrote the recovery tools and built a simulator for the software and supporting equipment like printers, monitors, keyboards and more. For the greater part of the last year, he arduously reverse-engineered the OS from the image. Despite a few remaining bugs, the Cray OS now works.

[youtube=http://www.youtube.com/watch?v=D6R8FOANclc&w=560&h=315]

Fenton is now in the process of upgrading his desktop machine to be compatible with the Cray X-MP OS. The two are also on the lookout for a compiler: a computer program that would allow them to write their own applications and feed them into the Cray.

“For these machines (Cray-1 or X-MP) you couldn’t really go into a store and buy an application, like you do for a PC these days. Now, you just ‘install’ Word and it runs. For these machines, everything came in source-code format and you needed to compile it before you could run it. You use the … compiler to turn it into machine code the machine could understand,” Tantos said. “That was the main way you interacted with these machines. Without the compiler, you can’t feed it that.”

If you have any information that could help Fenton and Tantos, you can contact them here.

Preserving what’s available

Both hobbyists say they tackled the project for fun, but found it to be an important exercise in the preservation of computing history. Despite the fame of these early computers, little effort was made to ensure full documentation survived.

[youtube=http://www.youtube.com/watch?v=vtOA1vuoDgQ&w=420&h=315]

“In some ways it’s sad, but in other ways it’s fascinating,” Tantos said. “Seeing how extremely hard it is to come by software for these early computers, it’s even more important that we preserve what is available.”

When they finally are able to pair the OS with Fenton’s desk model, it will be both novel and historical, as it will be the first working “Cray” in decades.

“The Cray-1 is one of those iconic machines that just makes you say ‘Now that’s a supercomputer!’” Fenton wrote on his blog in 2010. “Sure, your iPhone is 10X faster, and it’s completely useless to own one, but admit it … you really want one, don’t you?”

A team of Japanese and German researchers has carried out the largest-ever simulation of neural activity in the human brain, and the numbers are both amazing and humbling.

The hardware necessary to simulate the activity of 1.73 billion nerve cells connected by 10.4 trillion synapses (just 1 percent of a brain’s total neural network) for 1 biological second: 82,944 processors on the K supercomputer and 1 petabyte of memory (24 bytes per synapse). That 1 second of biological time took 40 minutes, on one of the world’s most-powerful systems, to compute.

If computing time scales linearly with the size of the network (a big if; I have no idea if this would be the case), it would take nearly three days to simulate 1 second of activity for an entire brain.
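Here is that back-of-the-envelope calculation, with the linear-scaling caveat baked in:

```python
# Scaling the published numbers, assuming compute time grows linearly
# with network size. That is a big assumption; real scaling may be worse.
fraction_simulated = 0.01      # 1 percent of the brain's network
minutes_per_bio_second = 40    # on 82,944 K-computer processors

whole_brain_minutes = minutes_per_bio_second / fraction_simulated
print(whole_brain_minutes / 60 / 24)  # ~2.8 days per biological second
```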

The K computer: it’s big. Source: RIKEN

Still, the researchers are excited by what they’ve accomplished. According to a quote from project leader Markus Diesmann in the press release announcing the simulation: “If peta-scale computers like the K computer are capable of representing 1% of the network of a human brain today, then we know that simulating the whole brain at the level of the individual nerve cell and its synapses will be possible with exa-scale computers hopefully available within the next decade.”

Although they’re measured in FLOPS — floating point operations per second — rather than bytes, the prefixes measuring supercomputer performance are the same as those measuring data storage. A system operating at 1 exaflop would be 1,000 times more powerful than a system operating at 1 petaflop. K — now the world’s fourth fastest supercomputer — is capable of 10.51 petaflops.
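For a sense of what that jump would mean, here is the rough ratio, counting peak flops only and ignoring the memory and communication bottlenecks that dominate in practice:

```python
# Peak-flops-only comparison; memory and communication are ignored here,
# and in practice they dominate.
k_petaflops = 10.51            # K's measured performance
exa_petaflops = 1000.0         # a hypothetical exascale machine

speedup = exa_petaflops / k_petaflops  # roughly 95x
print(40 * 60 / speedup)               # the 40-minute run shrinks to ~25 s
```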

Importantly, though, the recent simulation was merely a test of the open source NEST simulation software the researchers have been developing. Research into specific diseases, and projects such as Europe’s Human Brain Project and the United States’ BRAIN initiative, will require more job-specific tuning.

Figuring out the mysteries of the human brain isn’t just a matter of sheer scale and advanced software, though. As GigaOM’s Stacey Higginbotham has reported, numerous companies and institutions, including IBM, are working on projects to better our understanding of the brain. Some are even developing chips designed to mimic it — albeit on a much smaller scale.

This article was updated at 11:34 p.m. on Aug. 3 to correct the amount of memory allocated per synapse to 24 bytes.

As anyone even casually familiar with parallel processing knows, running applications across more nodes means jobs execute faster because they’re able to share the computing workload. The more cores, the faster it runs. This is what makes Hadoop, for example, so great at processing large chunks of data. The MapReduce framework on which it’s based divvies up the work across nodes and everything they find is stitched back together as the result of a job.
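A toy version of the pattern, in plain Python rather than Hadoop’s actual API, shows why it parallelizes so naturally: the map calls are independent, so they can be fanned out across as many cores or nodes as are available.

```python
# Toy map/reduce word count; illustrates the pattern, not Hadoop's API.
from collections import Counter
from multiprocessing import Pool

def map_chunk(chunk: str) -> Counter:
    """Each mapper independently counts words in its slice of the input."""
    return Counter(chunk.split())

def word_count(chunks):
    with Pool() as pool:
        partials = pool.map(map_chunk, chunks)  # fan out across cores
    return sum(partials, Counter())             # reduce: stitch together

if __name__ == "__main__":
    print(word_count(["big data is big", "data about data"]))
```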

But even Hadoop can only scale to tens of thousands of nodes and, because of its focus on “nodes,” actually isn’t really good at utilizing multi-core processors to their fullest (expect to hear more about the limitations of Hadoop at our Structure: Data conference March 20-21 in New York). The IBM-built Sequoia supercomputer (housed at Lawrence Livermore National Laboratory) that the Stanford team used consists of 98,304 processors (or nodes), each containing 16 computing cores. That’s a grand total of 1,572,864 cores, and the researchers were able to use the majority of them, which they claim is a record of some sort.

Sequoia, decomposed

But record or not, that’s an incredibly complex undertaking. Programming the jet-engine simulation meant figuring out how to divvy the code into more than a million different tasks that could run across tens of thousands of nodes and 16 cores within each of those nodes. If even one of those processes is buggy, it could slow down or ruin the whole simulation.
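The outline of such a decomposition is simple enough; the sketch below carves a made-up 3-D grid into one task per core (only the machine counts are Sequoia’s). Everything the sketch leaves out, such as making 1.5 million tasks exchange boundary data correctly and at speed, is where the difficulty lives.

```python
# Sketch of domain decomposition: one task per core. The grid dimensions
# are invented for illustration; only the machine's counts are Sequoia's.
NODES = 98_304
CORES_PER_NODE = 16
TASKS = NODES * CORES_PER_NODE        # 1,572,864 tasks

GRID = (2048, 1024, 768)              # made-up simulation grid
cells = GRID[0] * GRID[1] * GRID[2]
print(TASKS, cells // TASKS)          # each core's task gets ~1,024 cells
```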

Even in the world of supercomputing, where systems now regularly contain hundreds of thousands of cores — some of them special-purpose GPU co-processors — there’s a shortage of programming talent to actually use them all to their fullest potential. As my colleague Stacey Higginbotham explained some time ago, the world of high-performance computing is hurtling toward exascale computing, but a bigger problem than energy consumption might be finding applications that need that much computing power and the algorithms capable of operating at that scale.

Still, the implications of advances in parallel programming are huge — like potentially life-altering huge. This is true not only because of the scientific questions we’ll soon be able to answer at speeds inconceivable even a decade ago, but also because of the computing power we’ll all soon be carrying around in our pockets and purses. If you think those multi-core smartphones and tablets are great now because they can run multiple applications at the same time, just wait until their processors are even bigger and badder and we have more applications — photo- and video-editing, computer-aided design, games and who knows what else — that can actually get the most out of them.

A few years ago, the gaming world was thrilled by the premise that the cloud (the Cloud!) could be harnessed to power games, too — any game you wanted, anytime, on any device, served from data centers to you. Services like OnLive and Gaikai promised freedom from your hardware, the end of the lockout of exclusive games only available on one platform or another.

Reality disappointed: What we actually got was a limited library of not-new games (Homefront, anyone?), many of which you already owned, but even laggier than on your own hardware. Turns out traditional retail, game publisher, and hardware platform companies made it difficult for cloud gaming services to get the best games on the day of release, and even then the gameplay quality was slightly inferior.

But the concept of gaming in the cloud is still an idea worth pursuing for a far greater promise: the ability to deliver an entirely new kind of game experience.

Historically, in games as in any other media, new distribution technologies enable new creative experiences. Pong wouldn’t have been possible without a new device plugged in to your TV. Internet-connected computers meant you could play Duke Nukem and Quake with other people online. The evolution of server technology brought massively multiplayer games. The iPhone brought Angry Birds, a game designed for a touch interface, and so forth.

So why should a cloud gaming service be used to deliver the same old games as before that were built for a $250 machine?

What we should be wondering, then, is what new kinds of games and gaming experiences cloud delivery could inspire. Compared to the gaming hardware you own, a cloud gaming service could access much more computing power—with a limitless capacity to add processing. Consider, after all, that the most powerful supercomputer in the world, the Titan, is about 70,000 times more powerful than an Xbox 360. Granted, the Titan costs a cool $100 million, which cuts out most households, but scaling back to basic and accessible data center prices would still offer many orders of magnitude more computing power than any current or near-future home console. (And this isn’t to say great gaming experiences are limited to powerful hardware—to the contrary mobile phones play compelling games, too. They’re just of a different sort.)

As for content itself, games purpose-built for the cloud do not yet exist — ones that aren’t encumbered by the limits of processing power, that would take full advantage of many more, and more powerful, CPUs and GPUs. These “supercomputer games” would open up creative possibilities far beyond what games of today are capable of.

Imagine supercomputer games with vividly lifelike worlds and characters (and not the almost-real, uncanny valley of current-generation graphics), or a single battlefield with 50,000 other players playing at the same time — or opponent AI on the level of IBM’s Jeopardy!-winning Watson. Supercomputer games could be dramatically different from anything you can play tonight at home. I’m no game designer, but what if we could use real-time traffic data to fill the streets of the next Grand Theft Auto, or step into a computer-generated world that looks as compelling as the Lord of the Rings movies?

Now, there are many reasons, beyond the technological, that these games don’t yet exist: It would be prohibitively expensive to pay artists to create all those detailed graphics, and simple AI is good enough to defeat most any player at most any game. But the record of creative innovators is that eventually they find a way to stretch the available technology to its limit. And some gamemakers are already beginning to probe at the games you can create if you host some of the game in the cloud.

There is a nagging constraint to the cloud, of course — bandwidth, which simply isn’t growing at the pace of Moore’s law. Network latency makes fast-twitch games, in which defeat is determined in milliseconds (like with the top console genre, first-person shooters), hard to play over today’s internet. So, at least until the next engineering breakthrough, these supercomputer games might be designed around genres requiring slower player reflexes than, say, Call of Duty or StarCraft.
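The arithmetic behind that constraint is unforgiving. Here is a rough latency budget; the distance, codec delay and frame rate are illustrative assumptions:

```python
# Rough cloud-gaming latency budget; every figure here is an assumption.
frame_ms = 1000 / 60                 # one frame at 60 fps: ~16.7 ms
fiber_km_per_ms = 200                # light in fiber covers ~200 km per ms
rtt_ms = 2 * 1500 / fiber_km_per_ms  # 1,500 km each way to a data center
encode_decode_ms = 20                # video encode plus decode (assumed)

total_ms = rtt_ms + encode_decode_ms # ~35 ms before the game computes a thing
print(total_ms, "ms, against a", round(frame_ms, 1), "ms frame")
```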

Best of all, the only hardware you would need at home is a basic input device like a controller and a box to render the graphics, and it could be cross-platform so that you could play from a PC or Mac or any smartphone. As one for-instance, OUYA, the new open, Android-based console I back, could be great for a cloud-delivered game (hear me, developers?), and it’s notable that Sony bought up Gaikai and certainly has plans. (Full disclosure: OUYA also has an announced partnership with the relaunched OnLive.)

Supercomputer games could be extraordinary. Now some intrepid game developers just have to make one.

Roy Bahat is Chairman of the open, Android-based game console company OUYA, and is former president of IGN. He is also on the faculty at UC Berkeley. Follow him on Twitter @roybahat

When you mix a researcher, a massive online encyclopedia and a supercomputer, the result is a collection of insights and visualizations into what Wikipedia looks like mapped across time and space. In a partnership with high-end computing vendor SGI, University of Illinois researcher Kalev Leetaru was able to mine the entire corpus of Wikipedia posts and make some interesting discoveries along the way.

If there’s a report detailing Leetaru’s findings, I haven’t been able to find it, but even this snippet from the project’s Facebook page is pretty insightful:

From this analysis, Wikipedia is seen to have four periods of growth in its historical coverage: 1001-1500 (Middle Ages), 1501-1729 (Early Modern Period), 1730-2003 (Age of Enlightenment), 2004-2011 (Wikipedia Era) and its continued growth appears to be focused on enhancing its coverage of historical events, rather than increased documenting of the present. The average tone of Wikipedia’s coverage of each year closely matches major global events, with the most negative period in the last 1,000 years being the American Civil War, followed by World War II. The analysis also shows that the “copyright gap” that blanks out most of the twentieth century in digitized print collections is not a problem with Wikipedia where there is steady exponential growth in it’s coverage from 1924 to today.

Leetaru also visualized the findings and created a couple of 30-second videos showing Wikipedia coverage and sentiment over time and geography. His visualizations (like the one above mapping “[e]very year from 1000 AD to 2012 referenced in Wikipedia plotted and cross referenced when mentioned in the same article”) are beautiful as works of art, although one can’t readily decipher which people, organizations and years are the influential ones.

Still, the project is a valuable reminder of just how far we’ve come in terms of data-analysis techniques and the computing power necessary to run them. This is why the idea of big data is so popular, even if the possibilities haven’t been fully realized yet. Analyses that would have taken weeks or days now take hours, minutes or seconds, which means anyone with the right data and the right gear can learn a heck of a lot if they can keep coming up with good questions.

An effort to build a radio telescope that can see back 13 billion years to the creation of the universe is prompting a five-year €32 million ($42.7 million) effort to create a low-power supercomputer and networks to handle the data the new radio telescope will generate. The DOME project, named for a mountain in Switzerland and the covering of a telescope, is the joint effort between IBM and the Dutch radio astronomy institute ASTRON to build such a network and computer.

There are three problems with building a telescope capable of reading radio waves from that far out in deep space (actually there’s a real estate problem too, because the array will require millions of antennas spread over an area the width of the continental U.S., but we’ll stick to computing problems). The first problem is the data that this Square Kilometre Array (SKA) will generate. IBM estimates it will produce:

… a few Exabytes of data per day for a single beam per one square kilometer. After processing this data the expectation is that per year between 300 and 1,500 Petabytes of data need to be stored. In comparison, the approximately 15 Petabytes produced by the Large Hadron Collider at CERN per year of operation is approximately 10 to 100 times less than the envisioned capacity of SKA.
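For what it’s worth, the arithmetic on IBM’s own stored-data figures works out to a factor of 20 to 100:

```python
# The ratio implied by IBM's own figures.
ska_pb_per_year = (300, 1500)   # SKA's projected stored data, petabytes
lhc_pb_per_year = 15            # LHC's output per year of operation

print([ska // lhc_pb_per_year for ska in ska_pb_per_year])  # [20, 100]
```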

And guys, the LHC is in the midst of getting its own cloud computing infrastructure in order to handle its data. So this IBM/ASTRON project may be just the beginning for SKA. As I say in the headline, in many ways, projects like the LHC and the SKA are ambitious investigations into the origins and composition of the universe. Our investigations into dark matter will require a compute effort that could rival the engineering effort that it took to get men on the moon. Which makes big data our Sputnik and our Apollo 11.

Now, back to the problems associated with the telescope. It will generate data like a corpse breeds maggots, so the project needs a computer big enough to process it without requiring a power plant or two. Additionally, that data might have to travel from the antenna arrays to the computer, which means the third problem is the network. I’ve covered the need for compute and networks to handle our scientific data before in a story on Johns Hopkins’ new 100 gigabit on-campus network, but the scale of the DOME project dwarfs anything Johns Hopkins is currently working on. From that story:

[Dr. Alex Szalay of Johns Hopkins] ascribes this massive amount of data to the emergence of cheap compute, better imaging and more information, and calls it a new way of doing science. “In every area of science we are generating a petabyte of data, and unless we have the equivalent of the 21st-century microscope, with faster networks and the corresponding computing, we are stuck,” Szalay said.

In his mind, the new way of using massive processing power to filter through petabytes of data is an entirely new type of computing which will lead to new advances in astronomy and physics, much like the microscope’s creation in the 17th century led to advances in biology and chemistry.

So we need the computing and networking equivalent of a microscope to enable us to deal with a telescope planned for 2024, and the time to start building it is now. That gives us a lot longer than the time frame we had to land on the moon. IBM views the problem as one worthy of the following infographic:

As the infographic shows, we’re going to need massively multicore, low-power computers, better interconnection using photonics and new ways of building our networks. Hopefully, the search for dark matter is worth it.

Los Alamos National Laboratory is trying to build an exascale computer, which would be 1,000 times faster than Cray’s Jaguar supercomputer and could process one billion billion calculations per second. The man in charge of executing that vision, however, sees a big obstacle toward building a computer with 1 million nodes, running between 1 million and 1 billion cores. That problem is resilience.

Photo (c) 2012 Pinar Ozger.

Speaking at GigaOM’s Structure: Data conference, Los Alamos HPC deputy division leader Gary Grider said that the exascale computer has so many parts that some element will constantly be failing.

“It wouldn’t be worth building if it didn’t stay working for more than a minute,” Grider said. “Resilience is absolutely a must. The way you get answers to science is you run problems on these things for six months or more. If the machine is going to die every few minutes, that’s going to be tough sledding. We’ve got to figure out how to deal with resilience in a pretty fundamental way between now and then.”

Grider and Los Alamos’s technology partners have between 6 and 10 years to work on the problem, and the national lab won’t be alone. According to inside-Data president Rich Brueckner, who moderated the “Faster Memory, Faster Compute” panel Grider spoke on, countries from all over the world are in an exascale race. Brueckner said it’s just as likely that Russia, Japan, China, India or the European Union will develop the exascale machine as the U.S.

It looks like Oracle has some competition when it comes to selling big iron for big data. On Wednesday, Cray, the Seattle-based company best known for building some of the world’s fastest supercomputers, said it’s getting into the big data game. A new division within Cray, called YarcData, will leverage Cray’s experience working within data-intensive environments for customers such as Boeing in order to woo large enterprises with big data needs.

Cray was short on details in a press release announcing the new division, but new YarcData SVP and GM Arvind Parthasarathi, formerly of Informatica, is quoted as saying, “YarcData is the nexus of the world’s most advanced technologies from Cray being applied to solve the world’s most challenging Big Data problems.” The natural leap is that Cray will design parallel-processing systems capable of incredible data throughput — something already required in the supercomputing space, where incredible processing capacity would be wasted without a steady data stream — but that will support today’s popular big data tools (e.g., Hadoop, analytic databases and predictive analytics software).

This type of system could be very valuable for organizations such as banks and intelligence agencies that want to run big data workloads as fast as possible — even process streaming data in real time — and that have the deep pockets to pay for Cray’s presumably pricey systems. Despite the fact that big-data framework Hadoop gained popularity in part because it’s designed to run on commodity hardware, there’s always a place for high-end hardware when milliseconds really do matter, and there’s something to be said for pre-configured systems that take the guesswork out of building a big data environment, as I explained recently in a piece for GigaOM Pro (sub req’d).

Cray isn’t alone in pushing this high-performance, enterprise-focused big data vision, though. Oracle made a splash in October when it announced a Big Data Appliance that marries Hadoop, R, NoSQL and other technologies to the high-end hardware Oracle obtained when it bought Sun Microsystems. IBM also has an extensive big data software portfolio complemented by a systems business that includes supercomputers. And although it doesn’t have an HPC pedigree like the others, Teradata has years of experience building systems optimized for analytics.

Cray won’t likely become a household name in the big data world, and its notoriously secretive customers might never divulge what they’re using its analytics products for, but there certainly is a market — however small — for super-big, super-fast and super-expensive data.