Configurable Processors as an Alternative to FPGAs

An exploration of using configurable processors as an alternative to the traditional FPGA approach to creating a custom system.

The ISEF connects to the execution unit alongside the Floating Point Unit (FPU) and the Arithmetic Logic Unit (ALU). Thirty-two 128-bit-wide registers
feed the ISEF, and the DMA-accessible 64KB IRAM located inside the ISEF holds local data to support very high-bandwidth parallel computations.
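
To put those figures in perspective, here is a minimal C sketch of the arithmetic. It assumes "64KB" means 64 x 1024 bytes, and every name in it is made up for illustration; none of it comes from Stretch's tools.

```c
#include <assert.h>
#include <stddef.h>

/* Figures quoted in the text; the identifiers are hypothetical. */
#define WIDE_REG_COUNT 32u            /* registers feeding the ISEF */
#define WIDE_REG_BITS  128u           /* width of each register in bits */
#define IRAM_BYTES     (64u * 1024u)  /* assumes 1 KB = 1024 bytes */

/* Bytes held by one wide register: 128 / 8 = 16. */
static inline size_t wide_reg_bytes(void)
{
    return WIDE_REG_BITS / 8u;
}

/* Bytes held by the whole register file: 32 * 16 = 512. */
static inline size_t reg_file_bytes(void)
{
    return WIDE_REG_COUNT * wide_reg_bytes();
}

/* Register-width words that fit in the IRAM: 65536 / 16 = 4096. */
static inline size_t iram_wide_words(void)
{
    return IRAM_BYTES / wide_reg_bytes();
}

/* Operands of elem_bits each that one register can carry side by side. */
static inline size_t lanes_per_reg(size_t elem_bits)
{
    assert(elem_bits > 0 && WIDE_REG_BITS % elem_bits == 0);
    return WIDE_REG_BITS / elem_bits;
}
```

For example, lanes_per_reg(8) gives 16, so a single register could carry sixteen 8-bit pixels, and the IRAM could hold 4096 such register-width words.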

The ISEF is not made up of the familiar Look-Up Tables (LUTs) seen in most FPGAs, but is a vast collection of ALUs, shift registers, and similar compute
fabric elements. Specific portions of the C/C++ code can be targeted for optimization within the ISEF, and thus thousands of lines of code can be
accelerated into a single processor instruction. In fact, the developer can start with the C/C++ code, profile it to identify the bottlenecks,
create a subroutine to implement the identified section of code, look at the resulting performance increase and resource use, and determine whether the
application goals have been met. (In many cases, Stretch developers have already done this, since that is how the dedicated Programmable Accelerator
functions are identified and implemented.) The process is similar if a custom instruction is implemented in the ISEF for something not already implemented
by Stretch. This gives customers an opportunity to add their "secret sauce," something that is very difficult to do in a traditional ASSP-based
implementation!
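
The first steps of that flow (profile, find the bottleneck, pull it into a subroutine) can be sketched in plain C. This is only an illustration under stated assumptions: the sum-of-absolute-differences kernel is a hypothetical hot spot chosen because it is common in video code, all function names are invented, and nothing here uses Stretch's actual tools or syntax for defining an ISEF instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical bottleneck, isolated as its own subroutine: the sum of
   absolute differences over 16 bytes, i.e. one 128-bit register's worth
   of 8-bit pixels. A routine like this is the kind of candidate one
   might later move into the fabric. */
static inline uint32_t sad16(const uint8_t *a, const uint8_t *b)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < 16; i++) {
        sum += (a[i] > b[i]) ? (uint32_t)(a[i] - b[i])
                             : (uint32_t)(b[i] - a[i]);
    }
    return sum;
}

/* Caller that walks two buffers in 16-byte blocks.
   len must be a multiple of 16. */
static inline uint32_t sad_buffer(const uint8_t *a, const uint8_t *b,
                                  size_t len)
{
    assert(len % 16 == 0);
    uint32_t total = 0;
    for (size_t off = 0; off < len; off += 16) {
        total += sad16(a + off, b + off);
    }
    return total;
}

/* Crude profiling helper: CPU seconds spent in reps calls to sad_buffer.
   The last result is written to *out so the work is not optimized away. */
static inline double time_sad(const uint8_t *a, const uint8_t *b,
                              size_t len, int reps, uint32_t *out)
{
    uint32_t last = 0;
    clock_t start = clock();
    for (int r = 0; r < reps; r++) {
        last = sad_buffer(a, b, len);
    }
    clock_t end = clock();
    *out = last;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Timing the same call before and after the kernel is replaced gives the "resulting performance increase" the text mentions. Because the kernel sits behind a single function, swapping in an accelerated version would not disturb its callers.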

Bob explained that having a device solution targeting a specific market is only the beginning of the development process. Target customers in
markets like video or audio want to see or hear the results of an implementation. Many companies have very specific requirements for audio and video
results that make these market areas more of an art than a science. For example, video product developers targeting consumers in Asia prefer hot color
tones, while in Europe blue color tones are preferred. (These are just very simple examples, but you can well imagine the level of detail video experts can
go to when trying to differentiate their look from their competitors'.) In addition, the price/performance points are fairly well established, and customers
expect more channels and more features every two years at the same price points.

To prove to the customer that a solution will work, the vendor (Stretch in this case) needs to demonstrate that it can meet the customer's
functional requirements at the desired price point. Typical deliverables in this model are reference designs, detailed benchmark comparisons,
actual board-level products, and a significant amount of C/C++ code. This requires a significant amount of development from the vendor before any sales are
generated. If this sounds a lot like an Application Specific Standard Product (ASSP) model, you are correct. This is also the model we have seen FPGA
and SoC FPGA companies embrace over the last several years. As integration levels have grown, the amount of proof they need to include with their devices
has also grown, in many areas dramatically. The Stretch model just shows how far this trend is probably going to take us.

It seems to me that one of the many advantages this technology can deliver, in addition to giving "pure" software designers access to configurable
technology, is that algorithms commonly implemented in the fabric can be moved into the dedicated programmable accelerator on the next-generation
device. This allows the configurable processor to deliver additional features and functions at a lower price and with higher performance, without relying on
process shrinks alone. FPGA implementations can be limited in this respect, since they target much wider application sets and must by their nature stay more
generic.

Next time we will look at some of the challenges this approach faces and the additional challenges any new start-up in the programmable space faces
when going up against the massive installed base of the existing FPGA companies.

Configurable processors: What do you think?
Do you think configurable processors are different enough from traditional FPGAs to offer an advantage in application areas you are familiar with? Could
you use C or C++ in compute-oriented designs for your applications? Leave your questions or thoughts in the comment section below!

I really want to see hybrid chips like Zynq get somewhere in the market, but the flexibility needs to be more accessible. I would like to see something along the lines of an FPGA JIT, where the OS on the fixed CPU recognises it is doing something difficult and automatically programmes a segment of the FPGA to offload that task. Think back to the transputer concepts of the 80s. We are now doing this in software with JIT optimisation of code, but we should push some of that back down to the hardware. I can foresee a JIT which reprogrammes an FPGA; I just can't make it happen.