>> Sunday, June 29, 2014

I designed FPGA circuits early in my career, and I was surprised by how difficult it was. The logic elements in an FPGA operate independently, so designers have to keep track of their input/output signals to make sure they're all in step. If Signal A reaches a gate before B and C are valid, the element may produce errors. Timing errors are hard to detect and very difficult to debug. Tools like gdb can't help, so designers use virtual logic analyzers like those provided by ModelSim.
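To make the timing problem concrete, here's a toy Python sketch of a three-input AND gate whose inputs change at different times. Everything in it is made up for illustration: the arrival times, the signal values, and the function names are my assumptions, and it's nowhere near a real timing simulator. The gate's output should be 0 both before and after the inputs change, but because A arrives before B is valid, the output briefly reads 1.

```python
# Toy model of a timing hazard on a 3-input AND gate.
# All values below are hypothetical and chosen only to show the effect.

# Time (in arbitrary ticks) at which each input takes its new value.
ARRIVAL = {"A": 1, "B": 3, "C": 0}

# Value of each input before and after its arrival time.
OLD = {"A": 0, "B": 1, "C": 1}
NEW = {"A": 1, "B": 0, "C": 1}


def signal_at(name, t):
    """Return the value of input `name` at time t."""
    return NEW[name] if t >= ARRIVAL[name] else OLD[name]


def and_gate(t):
    """Return the AND of A, B, and C as seen by the gate at time t."""
    return signal_at("A", t) & signal_at("B", t) & signal_at("C", t)


def trace(t_end):
    """Return the gate output for t = 0 .. t_end - 1."""
    return [and_gate(t) for t in range(t_end)]


if __name__ == "__main__":
    # The output is 0 at the start and 0 once every input has settled,
    # but it glitches to 1 while A is new and B is still old.
    print(trace(5))  # [0, 1, 1, 0, 0]
```

If downstream logic samples the gate at tick 2, it captures the bogus 1. If it waits until tick 3 or later, it gets the intended 0. Keeping every signal in step so that this never happens is the designer's job.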
OpenCL can reduce the risk and difficulty of FPGA design, but given the small developer base, Intel might not allow developers to access the Xeon's integrated FPGAs. Instead, Intel could assemble a catalog of prebuilt, fully tested FPGA designs for special tasks. If Intel's C/C++ compiler (icc) notices that an application could be accelerated with one of these designs, it could alert the developer with a friendly dialog box:

Howdy, developer! I see you're sorting database records and performing statistical analysis. If you install Intel's RapidCore on your Xeon, this application will execute 7,364 times faster.
Buy RapidCore (Y/N)?

After the purchase is completed, the compiler downloads the core from the Internet and automatically installs it on the Xeon's embedded FPGA. This way, the developer doesn't need to understand OpenCL, logic design, or timing analysis.
The principle is similar to the downloadable content (DLC) provided by game publishers. After customers buy a game, they can pay extra to make the game easier or more interesting. With Xeon DLC, developers buy the compiler, and then they can improve performance with special-purpose FPGA designs. Similar improvements could be made available to end-users.

>> Monday, June 23, 2014

Intel has announced that upcoming releases of the Xeon processor will have integrated Field Programmable Gate Arrays (FPGAs). At first, this amazed me. The primary languages for FPGA design are Verilog and VHDL, and both are beyond the experience of most Intel programmers. In fact, the process of designing an FPGA circuit with Verilog/VHDL is completely different from that of building a C/C++ application.

Then a thought struck me. The two main FPGA vendors, Xilinx and Altera, are developing toolsets for creating FPGA designs with OpenCL. I wouldn't be surprised if the Xeon's FPGA is intended to be accessed through OpenCL, not Verilog or VHDL.

The announcement doesn't say whose FPGAs will be integrated in the Xeon, but it's noteworthy that Intel is manufacturing Altera's latest generation of FPGAs, which includes the Arria 10 and the Stratix 10. These are the first FPGAs to provide dedicated logic for floating-point DSP. Further, Altera is working hard on its OpenCL support, and I can state from experience that their SDK is functional and polished.

So here's my prediction: Intel's new Xeons will have integrated FPGAs from Altera. Developers will be able to access the FPGAs' dedicated DSP blocks using OpenCL.

This sounds fine, but I foresee three problems:

No matter what language you use, compiling an FPGA design takes hours. Are developers willing to wait that long?

Despite my best efforts, the OpenCL developer community is pretty small. Integrating OpenCL-accessible FPGAs into high-end CPUs seems like a big risk.

Wait a minute. What if these Xeons are intended for Apple? Apple is a fervent believer in OpenCL and they probably know which floating-point routines need FPGA acceleration. Hmm.

ETA: I received a link to a post that accuses Intel of copying Microsoft's effort to use FPGAs to accelerate web searching. This may be the case, but I suspect Intel is trying to compete with Nvidia's high-speed number-crunching servers. We'll see...