Start-up could kick Opteron into overdrive

Exclusive The 2002 film Death to Smoochy reminds us that "friends come in all sizes." AMD executives must embrace this observation on a daily basis, especially when a company such as DRC Computer appears.

The tiny DRC works out of a no-frills Santa Clara office, producing technology that has the potential to give servers based on AMD's Opteron chip a real edge over competing Xeon-based boxes. DRC has developed a type of reprogrammable co-processor that can slot straight into Opteron sockets. Customers can then offload a wide variety of software jobs to the co-processor running in a standard server, instead of buying unique, more expensive types of accelerators from third parties as they have in the past.

"Current accelerators cost about $15,000 each and deliver little performance improvement beyond what you could achieve by buying more blade servers for that same price," Larry Laurich, the CEO of DRC, told us in an interview. "We have taken the approach that we must deliver three times the price-performance of a standard blade."

Neither standalone server accelerators nor FPGAs (field-programmable gate arrays), which is what the DRC modules are, stand as novel concepts in the hardware industry. Server customers, however, have largely shied away from buying pricey, specialized co-processors even when such devices demonstrated dramatic performance improvements on certain workloads. The high cost of accelerators, a lack of supporting software and the large amount of custom design work needed to make the devices work well have made them not worth the trouble for most customers.

It's this tradition of disdain for accelerators that DRC will have to fight.

"People have tried a lot of special purpose processing devices over the years and, with the exceptions of graphics units and arguably floating point units, general purpose processors have always won out in the end," said Gordon Haff, an analyst at Illuminata.

DRC thinks it has solved the price and performance problems by playing off AMD's open Hypertransport specification.

"DRC's flagship product is the DRC Coprocessor Module that plugs directly into an open processor socket in a multi-way Opteron system," the company notes on its web site. "This provides direct access to DDR memory and any adjacent Opteron processor at full Hypertransport bandwidth [12.8 GBps] and ~75 nanosecond latency."

AMD's decision to open Hypertransport could end up being a key factor in Opteron's future success. Intel looks set to compete better with AMD later this year when it releases a revamped line of Xeon processors. AMD, however, can now turn to third parties such as DRC for performance boosts unavailable with Intel's chip line.

DRC appears to be making the most of its AMD ties by sliding right into Opteron sockets. That means that customers can outfit an Opteron motherboard with any combination of Opteron chips and DRC modules. Illuminata's Haff sees the DRC implementation as one way of overcoming past aversions to accelerators.

"It is true that one of the issues around PCI-based FPGA products and really anything specialized is that by the time you transfer the calculation over the special purpose board, you have often lost much of the benefit you had," Haff said. "So, putting the product within the CPU fabric certainly does help address this particular problem."

The notion of offloading certain routines to an FPGA should prove attractive to a wide variety of industries, stretching from the oil and gas sector to high performance computing buffs and possibly even mainstream server customers.

Today, for example, companies like Boeing that need specialized, embedded devices will buy a PCI board with an FPGA and do custom work designing software and a hardware unit for their system. "Those products could end up in something the size of a telephone or a bread box," said Laurich. "It may take them about six months to lay out that type of custom design."

With the DRC module, customers can pick from standard hardware ranging from blade servers on up to Opteron-based SMPs instead of building their own breadboxes.

Each DRC module will cost around $4,500 this year and likely drop to around $3,000 next year, Laurich said. That compares to products from companies such as SGI that cost well over $10,000.

So far, DRC has seen the most interest from oil and gas companies looking to put specific algorithms on the FPGAs. Manufacturing firms and financial services companies have also looked at the DRC products for help with their own routines. It's not hard to imagine companies such as Linux Networx, Cray or SGI (when it does the inevitable and backs Opteron) wanting to move away from more expensive FPGA products as well in order to service the high-performance computing market.

Eventually, standard server makers could turn to the FPGAs to help with security or networking workloads.

"There does seem to be this kind of general feeling in places like IBM and Sun that the time may be here to use some special purpose processors or parts of processors for various things," Haff said. "The FPGA approach is certainly one way of doing that. It does have the advantage that you're not locked into a particular function at any time because you can dynamically reprogram it."

The DRC products also come with potential energy cost savings that could be a plus for end users and server vendors that have started hawking "green computing." Power has become the most expensive item for many large data centers.

The first set of DRC modules will consume about 10 to 20 watts versus close to 80 watts for an Opteron chip. An upcoming larger DRC module will consume twice the power and be able to handle larger algorithms.

"We believe we will get 10 to 20x application acceleration at 40 per cent of the power," Laurich said. "At the same time, we're looking at a 2 to 3x price performance advantage."
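For readers who want to see what those numbers imply, here is a rough back-of-the-envelope sketch in Python. It assumes, and this is our assumption rather than anything Laurich stated, that the "40 per cent of the power" is measured against a single Opteron drawing close to 80 watts. The function and variable names are ours and purely illustrative.

```python
# Figures quoted in the article, treated as rough inputs rather than measurements.
OPTERON_WATTS = 80.0          # "close to 80 watts for an Opteron chip"
POWER_FRACTION = 0.40         # "40 per cent of the power" (baseline assumed to be one Opteron)
SPEEDUP_RANGE = (10.0, 20.0)  # "10 to 20x application acceleration"


def perf_per_watt_gain(speedup: float, power_fraction: float) -> float:
    """Relative performance per watt versus the baseline chip (baseline = 1.0)."""
    return speedup / power_fraction


# Power draw implied by the 40 per cent claim under the assumed baseline.
module_watts = OPTERON_WATTS * POWER_FRACTION

# Performance-per-watt gain at the low and high ends of the claimed speedup.
low, high = (perf_per_watt_gain(s, POWER_FRACTION) for s in SPEEDUP_RANGE)

print(f"Implied module power: {module_watts:.0f} W")
print(f"Implied performance-per-watt gain: {low:.0f}x to {high:.0f}x")
```

Under that assumption the claim works out to a module drawing roughly 32 watts, which sits inside the 20 to 40 watt range implied for the larger module, and a performance-per-watt advantage somewhere between 25x and 50x on workloads that actually see the claimed acceleration.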

It will, of course, take some time to build out the software for the DRC modules. The company has started shipping its first machines to channel partners that specialize in developing applications for FPGAs. An oil and gas company wanting to move its code to the product could expect the process to take about six months.

If DRC takes off, the company plans to bulk up from its current 13-person operation and to tap partners in different verticals to help out with the software work.

DRC also thinks it can maintain a competitive advantage over potential rivals via its patent portfolio. The modules result from work done by FPGA pioneer Steve Casselman, who is a co-founder and CTO of the company. Casselman told us that he had been waiting for something like Hypertransport to come along for years and that AMD's opening up of the specification almost brought tears to his eyes.

It's always difficult to judge how well a start-up will pan out, especially one that needs to build out systems and software to make it a success. DRC, however, does have - at the moment - that rare feeling of something special.

It's playing off standard server components and riding the Opteron wave. In addition, it is reducing the cost of acceleration modules in a dramatic fashion. That combination of serious horsepower with much lower costs is typically the right recipe for a decent start-up, and we'll be curious to see how things progress in the coming months.