About a year ago we covered the DEC RIGEL VAX Processor. After The RIGEL DEC moved to make a single chip VAX processor that would include the CPU, FPU, and cache controller on one single die. Work on the design began in 1987, and first silicon shipping in 1991. Performance ended up being as good or better then the very high end VAX 9000 systems (implemented in ECL logic).

The original NVAX processor was made on a 0.75u 3-Layer CMOS process (DEC CMOS-4) and contained 1.3 million transistors in a 339 pin CPGA package. Initial clock speed, in 1991 was 71MHz. NVAX was then the fastest CISC processor made. Speeds ramped up to 90.9MHz at the high end and a lower end of 62.5MHz. The first NVAX models were identified as 246B and 246C. Later versions, made well into 1996, were made on the CMOS-4S process, a 10% shrink to 0.675u and were labeled 1001C.

Internally NVAX was very familiar, the FPU was largely reused directly from RIGEL. The NVAX also maintains the 4-phase clocking scheme from RIGEL, but moves the clock generator on chip. It also maintained the 2K of on die instruction cache from RIGEL, but added a 8K data/instruction mixed cache as well. An L2 cache was supported in sizes of 256K 512K 1M or 2M, and located off chip. The NVAX continued the 6-stage pipeline of RIGEL with some enhancements. One of the greatest performance enhancements over RIGEL is the handling of pipeline stalls. In the RIGEL pipeline, a stall in one stage would stall the entire pipe line, whereas on NVAX, in most cases, a stall in one stage does not prevent the other stages from continuing.

At nearly the same time as the development of the NVAX DEC was also developing a competitor to MIPS, a RISC architecture. This new RISC architecture was codenamed EVAX, for Enhanced VAX, and was a purely RISC architecture that could run translated VAX CISC code with very little performance penalty. It did however borrow from VAX, like the NVAX, EVAX used the FPU from the RIGEL. DEC went on to brand the EVAX as Alpha AXP, to separate it from the VAX line, though its internal naming of EV4, EV5 etc was left intact, as the last remnant of VAX.

DEC 21-40568-02 299D NVAX++ 170.9MHz – 1996 – from a VAX7800

Having two high performance processor types at the same time left DEC in a bit of a dilemma so they created a third, known as the NVAX+ (DEC 262D). The NVAX+ was originally made on the same CMOS-4 process as the NVAX and ran at 90.9MHz. The NVAX+ was meant to be a bridge between the VAX line and the Alpha AXP. It was a NVAX core, wrapped in an EVAX (Alpha AXP) external interface, it was made in the same 431PGA as the Alpha 21064 and was pin for pin compatible, the same board could be used for either. It supported more L2 cache then the NVAX, supporting six cache sizes (4MB, 2MB, 1MB, 512KB, 256KB, 128KB),

In 1994 the NVAX+ was shrunk to the DEC CMOS-5 4-Layer 0.5 micron process resulting in the NVAX++ (DEC 299D) which ran from 133-170.9MHz. These speeds continued to be the fastest CISC processors until Intel released the Pentium Pro at 180 and 200MHz in 1996. Ultimately Intel’s dominance, and the coming dominance of RISC performance were the writing on the wall, and the VAX, and not long after it DEC itself were doomed to reside in the history books. By 1997 The NVAX++ was off the market. In 1997 the DEC Alpha team was operating out of offices owned by Intel (who also took over DEC’s fab’s), and in 1998 the remains of DEC, and the Alpha team, were bought by Compaq. And by 2004 Alpha was phased out in favor of Itanium (a now rather ironic decision by HP/Compaq).

DEC’s 32-bit VAX architecture saw many implementations since its introduction in 1977. Early implementations were all multi-chip, but as technology improved the VAX architecture could be implemented (at least partially) on a single VLSI chip. The first implementation on a single chip was the MicroVAX II released in 1985. It contained 125,000 transistors, made on a 3 micron NMOS (DEC proprietary ‘ZMOS’) process and ran at 5MHz (200ns cycle time).

In 1987 DEC released the CVAX, the second generation VAX on VLSI. The CVAX was made on DEC’s first CMOS process, a 2 micron design using 175,000 transistors and clocked from 10-12.5 MHz (80-10ns cycle time). The input clock was a four-phase overlapping clock (so input frequency was 4x the cycle time, or 40-50MHz). Performance was 2.5-3 times better then the MicroVAX II. About half the gain was from process improvement (increased clock speed), while the rest was from architectural changes (mainly pipelining).

DEC DC580C 78034 CVAX+ 16.67MHz

As the CVAX (and its successor the CVAX+) were released the next generation was already being designed by DEC. This was to be Rigel. Rigel has a 6-stage pipeline, and was made on a 2 micron CMOS process and the CPU contained 320,000 transistors, 140k of which were for logic, while the remaining 180k were for memory (cache). The separate FPU chip contained an additional 135,000 transistors. After some early teething pains on the new CMOS process, where yields were almost non-existent, the process finally was refined enough to make commercial samples by late 1988. The target speed for Rigel was a 40ns cycle (25 MHz clock). This would give the Rigel a 6-8x performance gain over CVAX. 2X of this was from the process shrink (and doubling of clock speed) while 3X was from the improved pipelining. The remainder was due to increased memory performance, not the least of which was due to Rigels 2KB of on chip cache.

In the mid-1970’s DEC saw the need for a 32-bit successor to the very popular PDP-11. They developed the VAX (Virtual Address eXtension) as its replacement. Its important to realize that VAX was an architecture first, and not designed from the beginning with a particular technological implementation in mind. This varies considerably from the x86 architecture which initially was designed for the 8086 processor, with its specific technology (NMOS, 40 DIP, etc) in mind. VAX was and is implemented (or emulated as DEC often called it) in many ways, on many technologies. The architecture was largely designed to be programmer centric, writing software for VAX was mean to be rather independent of what it ran on (very much like what x86 has become today).

The first implementation was the VAX 11/780 Star, released in 1977, which was implemented in TTL, and clocked at 5MHz. TTL allowed for higher performance, but at the expense of greater board real estate as well as somewhat less reliability (more IC’s means more failure points). It also cost more, to purchase, to run, and to cool.

DEC followed the Star with the 11/750 Comet in 1980. This was a value version of the Star. It ran at only 3.12MHz (320ns cycle time) but introduced some new technology. Part of the ‘value’ was a much smaller footprint. The TTL had been replaced by bi-polar gate arrays. Over 90% of the VAX architecture was implemented in the gate arrays, and there was a lot of them, 95 in a complete system with the floating point accelerator (28 arrays). The CPU and Memory controller used 55 while the Massbus (I/O) used an additional 12 gate arrays. The 95 gate arrays though replaced hundreds of discrete TTL chips. And as a further simplification they were all the same gate array.

The Largest CPU Museum!

In my daily hunt for new processors, and other chips for the museum, as well as information about new chips, I constantly come across interesting chips, in strange locations. Here you will get a chance to learn WHERE many of the chips in the museum come from and what they are.