Introduction
With the invention of the MOS silicon-gate technology (SGT), and with its first commercial application in 1968, the full potential inherent in the MOS transistor was unleashed. This novel technology used self-aligned-gate transistors with gates made of highly doped poly-crystalline silicon. In a single stroke, SGT removed all the limitations in operating speed, circuit density, and reliability that had until then plagued the previous metal-gate technology, transforming the industry and eventually leading to the demise of the then-dominant bipolar technology.

The speed of integrated circuits (ICs) was increased by more than a factor of 5, the reliability of MOS ICs was brought to the level of bipolar ICs, the leakage current was reduced by two orders of magnitude, and the random-logic circuit density was increased by a factor of two, all while using the same lithography and the same power dissipation as the incumbent technology.

The SGT was not only indispensable to the creation of reliable and fast dynamic random-access memories (DRAM) and microprocessors, but it also enabled the fabrication of novel device types used for non-volatile memories, and for CCD image sensors. Introduced in 1968 and later adopted world-wide, the SGT became the workhorse technology for the following forty years, eventually displacing bipolar technology for most applications. Today nearly all integrated circuits are built with MOS transistors.

The State-of-the-Art in the Mid-Sixties
This article will describe in detail the development of the SGT, illustrating the various contributions that made this technology a market reality in 1968 with the introduction of the Fairchild 3708, the world’s first commercial IC to use self-aligned gates.

To set the stage, let’s first describe the MOS process technology used in the mid-sixties. This technology used enhancement-mode, P-channel MOS transistors with gates made of aluminum, the metal that was also used to interconnect the transistors within an IC. The threshold voltage of such transistors was in the range of -5 to -8 volt, dictated primarily by the crystal orientation of the silicon wafer used, and the work-function difference between aluminum and silicon.
To maintain the isolation of the various transistors within an IC, it was necessary to have the threshold voltage of the parasitic MOS transistors higher than the highest voltage present in the IC. (A parasitic MOS transistor is an unintended transistor obtained when a metal line over the field oxide crosses two junctions. In this case the junctions act as the source and the drain of a parasitic MOS device, with the metal line acting as the gate of that device. Now, if the voltage on the metal line is high enough to cause an inversion layer in the silicon at the silicon dioxide interface, that is, if it exceeds what will henceforth be called the field-oxide threshold voltage, a stray conduction path between the two junctions will be created. Therefore, if there is a voltage difference between the two junctions, stray current will flow between them, even though the two junctions were supposed to be isolated. This possibility is particularly damaging in the case of dynamic circuits because electrical charges stored in MOS transistor gates may leak away much faster than normal, causing malfunctions.)

To avoid stray conduction paths, it was therefore necessary to have a sufficiently thick field oxide so that its threshold voltage would be higher than the supply voltage. The higher the supply voltage, the thicker the field oxide had to be to avoid turning on parasitic MOS devices. However, the thicker the field oxide, the more difficult it was to maintain the integrity of the aluminum interconnections going over the oxide steps. This created major yield problems if the metal broke, and potentially severe reliability problems due to electro-migration if the metal thinned out too much at those steps. (Electro-migration causes the aluminum interconnects to open if the current density exceeds a certain limit; the thinning of metal lines over the oxide steps, caused by self-shadowing during the aluminum vacuum deposition process, could cause such a failure in the field, creating a major reliability hazard.)

The high-threshold-voltage MOS transistor technology used in the mid-sixties was then the only practical way to achieve the delicate balance between the various conflicting requirements. Where such requirements couldn’t be met, N+ channel-stopper diffusions underneath the field oxide were used to raise the field-oxide threshold voltage, thus allowing a thinner field oxide. However, this solution was undesirable because it drastically reduced the circuit density, increasing the cost of the IC.

Furthermore, since power dissipation grows with the supply voltage, by reducing the MOS threshold voltage, one could reduce the supply voltage, and hence the power dissipation, for a given target speed. Conversely, given a certain power budget, one could achieve a higher operating speed if one could use lower threshold voltage transistors.

Reducing the threshold voltage of MOS transistors to the range of -3 to -5 volt was possible by using silicon wafers with [100] crystal orientation, instead of the [111] orientation used in the high-threshold-voltage process. With such a threshold voltage, the supply voltage could be reduced from -24 volt to -15 volt, with approximately a factor-of-2 reduction in the power dissipation for the same speed. However, with [100] starting material, the threshold voltage of the parasitic MOS transistors was lower than the supply voltage, demanding the use of channel stoppers, with approximately a factor-of-two penalty in circuit density and cost. Such a tradeoff was hardly worthwhile. One had to find a way to reduce the MOS transistor threshold voltage without sacrificing circuit density.
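As a rough check on the factor-of-2 figure above, the following sketch compares the two supply voltages under two simple scaling assumptions that are not stated in this article: power proportional to the supply voltage (exponent 1) and power proportional to its square (exponent 2). The function name and the choice of exponents are illustrative only.

```python
def power_ratio(v_old, v_new, exponent):
    """Ratio of old to new power if power scales as |V| ** exponent (an assumed model)."""
    return (abs(v_old) / abs(v_new)) ** exponent


# Supply voltages quoted in the text: -24 volt and -15 volt.
linear = power_ratio(-24.0, -15.0, 1)     # power assumed proportional to |V|
quadratic = power_ratio(-24.0, -15.0, 2)  # power assumed proportional to V squared

print(f"linear model:    {linear:.2f}x less power")
print(f"quadratic model: {quadratic:.2f}x less power")
```

Under these assumptions the reduction falls between roughly 1.6 and 2.6, which brackets the approximate factor of 2 quoted above.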

Another major limitation of MOS technology was the very high parasitic gate-to-source and gate-to-drain overlap capacitances. This was a recognized problem from the beginning, but nobody had yet figured out how to make MOS transistors with consistently low overlap capacitances. Before addressing the solution to this problem, some additional background is needed.

The standard MOS fabrication process started with thermally growing the field oxide on a [111]-orientation silicon wafer, followed by masking and etching the initial oxide in the regions where the source and drain of the MOS transistors had to be located. This step was followed by doping the source and drain regions with boron; by growing some additional oxide over the exposed source and drain regions; and then by removing the field oxide in the gate region of the transistor, where a thin thermal oxide, the gate oxide, later had to be grown. After the gate oxide growth, contact areas to the junctions were made, followed by aluminum deposition and aluminum etching to shape the gate electrodes and the interconnections. Due to the inevitable misalignment of the gate mask with respect to the source and drain mask, it was necessary to have a fairly large overlap area between the gate region and the source and drain regions, to ensure that the inversion layer would positively bridge the source and drain, even under worst-case misalignment of the gate mask.

This requirement resulted in a significant increase in the gate-to-source and gate-to-drain parasitic capacitances, over and above the amount that would be strictly necessary if perfect alignment were possible. Even worse, such parasitic capacitances varied from wafer to wafer, depending on the direction of the misalignment of the gate mask with respect to the source-drain mask. The overlap capacitance with the most adverse consequences on circuit performance was the gate-to-drain parasitic capacitance, Cgd. Due to the well-known Miller effect, the gate capacitance of any given transistor was increased by its Cgd multiplied by the gain of the circuit of which the transistor was a part. Since the gain is generally greater than one, the impact of Cgd on the switching speed of transistors was considerable. Furthermore, because of the variability of Cgd due to the unpredictable direction of the misalignment, some wafers would be affected very little and others very much, producing a large and undesirable spread in the speed of the ICs produced.
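The Miller effect described above can be illustrated numerically. The sketch below uses the common textbook approximation in which the gate-to-drain capacitance appears at the input multiplied by (1 + gain); the capacitance values, the gain, and the function name are hypothetical and chosen only to show how misalignment-driven changes in Cgd spread the effective input capacitance.

```python
def effective_input_capacitance(c_gate, c_gd, gain):
    """Input capacitance seen at the gate, with Cgd multiplied by (1 + gain).

    c_gate: gate capacitance excluding the gate-to-drain overlap (arbitrary units)
    c_gd:   gate-to-drain overlap capacitance (same units)
    gain:   magnitude of the voltage gain of the stage
    """
    return c_gate + c_gd * (1.0 + gain)


GAIN = 5.0     # hypothetical stage gain
C_GATE = 1.0   # hypothetical gate capacitance, arbitrary units

# Two hypothetical wafers: the gate mask shifted away from, or toward, the drain.
small_overlap = effective_input_capacitance(C_GATE, 0.1, GAIN)
large_overlap = effective_input_capacitance(C_GATE, 0.5, GAIN)

print(f"small Cgd: {small_overlap:.2f}")
print(f"large Cgd: {large_overlap:.2f}")
print(f"spread:    {large_overlap / small_overlap:.2f}x")
```

With these made-up numbers, a fivefold change in Cgd turns into a 2.5-fold change in the capacitance that the driving stage has to charge, which is the kind of wafer-to-wafer speed spread the text describes.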

By 1968 much of the MOS industry was engaged in developing a low-threshold-voltage MOS technology that could replace the incumbent technology. This objective was eventually achieved with the use of a new technology, ion implantation, that allowed increasing the threshold voltage of parasitic MOS transistors built with [100] silicon without using channel stoppers. This result was possible because ion implantation gave much better control of the doping level, particularly for low doping concentrations, than was possible with the previous thermal methods. This development was achieved at about the same time as the SGT was created. However, as we will describe later, the SGT went much further than just achieving a low-threshold-voltage MOS technology.

The Self-Aligned Gate
In 1966 Dr. Robert Bower realized that if the gate electrode were defined first and then used in turn to define the source and the drain at the channel boundaries, it would be possible not only to minimize the parasitic capacitances between gate and source, and gate and drain, but also to make them insensitive to misalignment. He proposed a method in which the aluminum gate electrode itself was used as a mask to define the source and drain regions of the transistor at the gate-region boundaries. However, since aluminum could not withstand the high temperature required for the conventional doping of the source and drain junctions, Dr. Bower proposed to use ion implantation, a new doping technique still in development at Hughes Aircraft, his employer, and not yet available at other labs.

While Bower’s conception of using the gate as a mask to define the source and drain regions was sound, in practice it did not work with aluminum because aluminum could not survive the high temperatures required by the subsequent processing steps. It was therefore impossible to adequately passivate the transistors with silicon dioxide, or to repair the radiation damage done to the silicon crystal structure by the ion implantation. Thus, Bower’s idea was good in principle, but a more refractory gate material than aluminum was needed. In fact, Bower’s process, described in US Patent No. 3,472,712 (filed October 27, 1966 and issued on October 14, 1969), was never used to produce commercial integrated circuits.

In 1967 John C. Sarace and collaborators at Bell Labs fabricated discrete transistors with gate electrodes made of vacuum-evaporated amorphous silicon and succeeded in building working self-aligned gate MOS transistors. Their experiment started with a wafer in which they grew a thin oxide, followed by vacuum deposition of amorphous silicon. They then masked the silicon to define the gates, which were shaped like annular regions, and followed that step by removing the thin gate oxide all over, except where it was protected by the silicon gates. Finally they doped the wafer with boron to create the source and drain junctions. After a thin layer of oxide was grown over the structure, the following sequence of steps was performed: contact mask, aluminum evaporation, and metal mask; thus completing the process in the usual way.

The ring structure of the transistors had the drain electrodes inside the rings, while the source electrodes were the common diffusion outside the rings, so the source electrodes of the transistors were all connected together. Therefore the process they described was useless for the fabrication of integrated circuits; it was just a proof of principle, suitable only for the fabrication of discrete transistors, and was not pursued further by its investigators.

In late 1967, Tom Klein of Fairchild Semiconductor, experimenting with MOS capacitors where the aluminum was replaced by amorphous silicon, observed that the work function difference between heavily doped P-type silicon and lightly doped N-type silicon was 1.1 volt lower than the work function difference between aluminum and the same N-type silicon. This meant that the threshold voltage of MOS transistors built with silicon gate could be 1.1 volt lower than the threshold voltage of MOS transistors with aluminum gate fabricated on the same starting material. Therefore one could use starting material with [111] orientation and simultaneously achieve both low-threshold-voltage MOS transistors and a high parasitic MOS threshold voltage. With a P-type-doped silicon gate it would therefore be potentially possible not only to create self-aligned-gate transistors but also to obtain a low-threshold-voltage process using the same silicon orientation as the high-threshold-voltage MOS process. However, Klein did not figure out how to architect the process to make the isolated transistors necessary for the fabrication of self-aligned-gate integrated circuits.

The Development of the SGT and the First Commercial IC with SGT
In February 1968, Federico Faggin joined the MOS process development group, then directed by Les Vadasz, of the Fairchild Semiconductor R&D Laboratory in Palo Alto, CA. Faggin was the MOS group leader of SGS-Fairchild (now STMicro) in Italy, and was a guest engineer for six months. At Fairchild, he was put in charge of the development of a low-threshold-voltage, self-aligned-gate MOS process technology using silicon gates, reporting to Vadasz.

At that time no one at Fairchild had yet devised the necessary process architecture to make integrated circuits with silicon gate, and there wasn’t even a known etching solution for the silicon gate. Therefore Faggin’s first tasks were to invent the process architecture, to design the detailed processing steps to fabricate MOS ICs with silicon gate, and then to develop a method to precision-etch the amorphous silicon. After completing that work, he would have to design an appropriate integrated circuit to prove the performance of the new technology. Incidentally, Faggin was not aware, and was not told, of the previous work of R. Bower and J. C. Sarace’s team. He learned about their work a few years later, after having successfully completed his project.

Faggin soon invented a new process architecture that also included the use of buried contacts, i.e. a way to make direct contact between amorphous silicon and silicon junctions without the use of metal. He noted that buried contacts would allow a much higher circuit density, since it would be possible to effectively have two layers of interconnection, one with silicon and one with aluminum, at the cost of only one additional masking step. Vadasz approved the use of the proposed process architecture, but without the buried contact, saying that it would probably not work. Faggin, believing in his idea, decided nonetheless to put two test circuits in his test pattern, called XTPG, to verify that the buried contact would work and to characterize its properties, since it did not cost anything more to do so.

The essence of Faggin’s architecture was to first grow the initial oxide, followed by opening areas, or tubs, in the oxide where the source, drain and gate of the transistors were to be located. This step was followed by the growth of the gate oxide, followed by the deposition and etching of the poly-silicon layer, thus defining the gates. Then the thin oxide was etched away inside the tub where it was not protected by the poly-silicon, thus defining the source and drain regions of the transistors. Notice that a misalignment between the tub mask and the gate mask would simply change the geometry of the source and drain, but would not change the gate overlap capacitances of the transistor. After the removal of the thin oxide, the doping of source, drain and poly-silicon would be performed, and here the silicon gate would act as a mask against doping occurring in the gate region. After doping, a thin layer of thermal oxide would be grown to protect the exposed source, drain and gate areas, followed by a thicker layer of vapor-deposited silicon dioxide. Contact mask, contact etching, aluminum deposition and metal mask would then complete the process.
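The insensitivity to misalignment noted above can be shown with a one-dimensional toy model that compares the older metal-gate flow with the tub-and-gate flow just described. This is only an illustrative sketch: the dimensions are hypothetical, lateral diffusion under the gate is ignored, and the function names are not taken from any real tool.

```python
def metal_gate_overlaps(design_overlap, shift):
    """Metal-gate flow: source and drain are diffused first, the gate is aligned later.

    A gate mask shifted by `shift` toward the drain removes overlap on the
    source side and adds the same amount on the drain side.
    """
    return design_overlap - shift, design_overlap + shift


def self_aligned_regions(tub, gate, shift):
    """Silicon-gate flow: source and drain are whatever part of the tub the gate leaves exposed.

    `tub` and `gate` are (left, right) intervals; the gate mask is shifted by `shift`.
    Both regions end exactly at the gate edges, so the overlap stays zero in this model.
    """
    gate_left, gate_right = gate[0] + shift, gate[1] + shift
    source = (tub[0], gate_left)
    drain = (gate_right, tub[1])
    return source, drain


# Hypothetical dimensions in micrometers, with a 1 um misalignment toward the drain.
source_ov, drain_ov = metal_gate_overlaps(design_overlap=2.0, shift=1.0)
source, drain = self_aligned_regions(tub=(0.0, 30.0), gate=(10.0, 20.0), shift=1.0)

print("metal gate overlaps (source, drain):", source_ov, drain_ov)
print("self-aligned source and drain:", source, drain)
```

In the metal-gate case the drain overlap grows from 2 to 3 in this example, while in the self-aligned case only the lengths of the source and drain regions change (11 and 9 instead of 10 and 10) and the gate edges still coincide with the junction edges.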

The variant of the above process used to make buried contacts was as follows: after the gate oxide was grown in the tubs, the gate oxide was removed in the areas where the buried contacts between silicon and junctions were to occur. Then the processing steps following the buried contact mask continued as described earlier. The idea here was that when the boron doping would be performed, the boron would diffuse through the deposited silicon in contact with the single-crystal silicon of the wafer, and would also form a junction in the single-crystal silicon itself, thus creating an isolated contact that would later be protected by oxide. Therefore aluminum could sit right above the buried contact, allowing a greater circuit density to be achieved.

After developing a suitable silicon etching solution, Faggin defined and calculated the detailed processing steps for his newly devised process architecture, and he also designed a test pattern to enable the initial testing and characterization of the process. By April 1968 Faggin was able to fabricate the first working MOS transistors suitable for making integrated circuits by using the full silicon gate process described earlier. He then designed the first integrated circuit using self-aligned silicon gate – the Fairchild 3708, an 8-bit analog multiplexer with decoding logic – and by July 1968, the first fully functional 3708 was fabricated.

The 3708 was functionally identical to the 3705, a production IC that Fairchild Semiconductor had difficulty making on account of its rather stringent specifications. This product choice allowed making an effective comparison between the SGT and the metal-gate technology, and also provided a platform to further improve the process during the following months. The first 3708 commercial shipment to customers occurred before the end of 1968, after further process refinements were made, of which the two most important innovations are described below:

1. Replacing the vacuum-evaporated amorphous silicon with poly-crystalline silicon obtained by vapor-phase deposition. This step became indispensable since evaporated amorphous silicon often broke at oxide steps.

2. Using phosphorus gettering to remove a high percentage of the impurities that were always present in MOS transistors and that caused high leakage currents and reliability problems.

Phosphorus gettering was a high-temperature process that had been developed to improve the performance and reliability of bipolar ICs. To be effective on MOS ICs, however, it had to be performed after the gate was in place, and since aluminum could not withstand high temperatures, it could not be used with aluminum-gate MOS ICs. The silicon gate could withstand the gettering-process temperature, so it became possible for the first time to subject the completed and oxide-sealed MOS transistor structure to gettering, thus considerably reducing the junction leakage current and nearly eliminating the threshold voltage drift that still plagued MOS devices made with aluminum gates. With silicon gate it soon became possible to achieve the same level of long-term reliability that had been reached by bipolar ICs, thus removing another major obstacle to broad adoption of MOS technology.

By the end of 1968 the silicon gate technology had achieved impressive results. The 3708 had been designed to have approximately the same area as the 3705 to facilitate using the production tooling that already existed for the 3705, though it could have been made considerably smaller. Nonetheless, compared with the 3705, the 3708 was 5 times faster, it had about 100 times less leakage current, and the on resistance of the large transistors making up the 8 analog switches was 3 times lower.

The SGT technology and the Fairchild 3708 were featured in the cover story of Electronics magazine in 1969: Faggin, F., Klein T. (1969). “A Faster Generation Of MOS Devices With Low Threshold Is Riding The Crest Of The New Wave, Silicon-Gate IC’s.” Electronics, September 29, 1969.

Designing Circuits with the SGT
In early 1969 the SGT was in all respects superior to the metal gate technology, except for one area. With metal gate it was easy to fabricate isolated capacitors by simply making a diffusion covered with gate oxide and aluminum. With silicon gate, however, only silicon could be over the thin gate oxide, and no diffusion was possible under the gate oxide, exactly because the silicon was used as a mask against it. Therefore to make isolated capacitors it would have been necessary to have an extra mask and an extra diffusion, adding to the cost of the process.

One of the crucial circuits that needed a capacitor was the so-called bootstrap load. A bootstrap load allows a logic gate to achieve an output swing equal to the supply voltage, Vdd, rather than (Vdd – Vt), which is the output voltage produced by a conventional MOS transistor load with threshold voltage Vt. When this output voltage is applied to the gate of a pass transistor, i.e. a transistor that is in series with the gate of another transistor, the signal out of the pass transistor is one additional threshold voltage drop below (Vdd – Vt), and this new signal is generally not sufficient, in the worst-case condition, to turn on the transistor driven by the pass transistor. (In this analysis one needs to remember the existence of the so-called body-effect, whereby the threshold voltage of a transistor whose source is at a different potential with respect to the substrate, is augmented by an amount proportional to the square root of the source-to-substrate voltage. The threshold voltage increase due to the body-effect can be significant, and this is the case for both an MOS transistor load and a pass transistor).
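The threshold-drop argument above can be made concrete with a small calculation. The sketch below works with voltage magnitudes (the P-channel devices discussed here use negative voltages), ignores the body effect, and uses hypothetical values for Vdd and Vt; it is meant only to show where each threshold drop comes from.

```python
def pass_transistor_output(v_gate, v_t):
    """Highest level (magnitude) a pass transistor can deliver: one threshold below its gate."""
    return v_gate - v_t


VDD = 15.0  # supply magnitude in volts (illustrative)
VT = 4.0    # threshold magnitude in volts (illustrative, body effect ignored)

# Conventional load: the gate of the pass transistor only reaches Vdd - Vt.
conventional_gate = VDD - VT
conventional_out = pass_transistor_output(conventional_gate, VT)

# Bootstrap load: the gate of the pass transistor reaches the full Vdd.
bootstrap_gate = VDD
bootstrap_out = pass_transistor_output(bootstrap_gate, VT)

print(f"conventional load: gate {conventional_gate} V, pass output {conventional_out} V")
print(f"bootstrap load:    gate {bootstrap_gate} V, pass output {bootstrap_out} V")
```

In this idealized case the conventional load leaves Vdd – 2Vt at the output of the pass transistor, while the bootstrap load leaves Vdd – Vt. With the body effect and worst-case tolerances included, the real drops are larger than these figures.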

The pass transistor is fundamental to making dynamic circuits because it allows the temporary storage of information in the form of an electric charge stored in the gate capacitance of the transistor it drives. However, the gate voltage of the pass transistor needs to be sufficiently high to produce an adequate output signal to drive the gate of the transistor to which it is connected. This objective can be accomplished with a bootstrap load driving the gate of a pass transistor. Without bootstrap loads one could only design either simple dynamic circuits in which external clocks with high enough voltage drove the pass transistors, such as shift registers, or static random logic circuits. However, static random logic circuits were much slower and required many more transistors than dynamic circuits. Complex two-phase random logic design, and push-pull drivers, absolutely needed bootstrap loads. For example, the Intel 4004 would not have been feasible in 1970 without bootstrap loads. Without them, the chip size would have been too large and the speed too low for the available power budget, making the circuit too expensive and too slow for most applications.

The MOS engineers at Fairchild believed that bootstrap loads could not be manufactured with silicon gate technology, unless an additional masking step, followed by a diffusion layer, was used. In fact, the design engineers of the Fairchild MOS Division had been resisting the use of the silicon gate technology because they couldn’t make bootstrap load devices.

It took about one year for Faggin to figure out how to design bootstrap loads without adding another mask. The solution, when found, was actually very simple: Faggin noted that the metal side of the capacitor in a bootstrap load is always biased in a way that, if there was no diffusion underneath, there would nonetheless be an inversion layer at the interface between silicon and oxide. In other words, there would be a “virtual” diffusion there. Thus a capacitor made simply with poly-silicon over thin oxide would act as a perfectly good capacitor under such conditions, allowing the fabrication of bootstrap loads. Faggin then designed a test circuit with a variety of bootstrap load designs and verified their correct operation before the end of 1969. The first commercial use of the bootstrap load was made in each of the chips of the Intel MCS-4 chip set designed by Faggin in 1970. With the 4 chips of the MCS-4 it was possible to build a wide variety of microcomputers. Faggin did not want to patent the bootstrap load because the invention was made at Fairchild Semiconductor.

Those same design engineers who complained about the impossibility of making bootstrap loads with SGT had another big complaint. They said that there was no area advantage to silicon gate when compared with metal gate, despite Faggin’s claims to the contrary. In addressing their complaint, Faggin found that the design engineers were automatically translating the old aluminum-gate circuit topologies into silicon gate, without the rethinking that was necessary given the substantial differences between the two technologies. Silicon gate was different enough that many old practices didn’t work well anymore, and new approaches were required. Faggin showed them that one could indeed generally design smaller circuits with SGT than with metal gate, especially if buried contacts were used.

After Vadasz left Fairchild to join Intel, Faggin fabricated and tested the buried contact, the invention that Vadasz had not approved. He verified that the idea he had had in March 1968 worked perfectly, and he made a number of test layouts proving that he could make more compact circuits with it, particularly for random logic circuits where random interconnections were needed. The best illustration of the density advantage of SGT with buried contact and bootstrap load over metal-gate technology is the Intel 8008 compared with the functionally equivalent microprocessor designed (but never produced) by Texas Instruments. The direct comparison showed that the 8008 was slightly less than half the chip size of the TI chip. Incidentally, the 8008 was the world’s first 8-bit microprocessor, designed in 1971 by F. Faggin and Hal Feeney at Intel, using the same technology and methodology used for the 4004. The 8008 and the TI chip were originally intended to be the same custom CPU for Computer Terminal Corporation, to be used in CTC’s Datapoint 2200 intelligent terminal.

Impact of the SGT
The first company to adopt the SGT, other than Fairchild, was Intel Corporation. Intel was dedicated to the emerging semiconductor memory market and in late 1969 introduced the second commercial IC made with SGT, the Intel 1101, a 256-bit static memory made with a technology similar to that used for the Fairchild 3708. Leslie Vadasz directed the design of the 1101, and when he left Fairchild to join Intel, he had already seen the 3708 working, and he knew all the details of the SGT.

In April 1970 Faggin joined Intel, where he designed the Intel 4004, the world’s first microprocessor, using his most recent inventions, the bootstrap load and the buried contact, which were indispensable for designing a random logic circuit of that complexity with the speed, power dissipation and chip size necessary to be commercially viable.

One important property of the SGT was that the silicon gate was entirely surrounded by top-quality thermal oxide, making it possible to create new device types not feasible with metal-gate technology. For example, in 1969-1970, Dov Frohman at Fairchild was experimenting with floating silicon-gate devices (MOS transistors whose gates are not connected to anything) to make non-volatile memory devices. He joined Intel in 1970, where he developed the first electrically programmable and UV-erasable read-only memory (EPROM), opening the way to EEPROM devices, and eventually flash memories. Another major invention made practical by SGT was the charge-coupled device (CCD). Originally invented at Bell Labs, CCDs could be successfully manufactured in 1970 with SGT at Fairchild to produce the first solid-state image sensors that revolutionized the entire field of photography. These new classes of devices dramatically enlarged the range of functions that could be made with solid-state electronics.

The success of Intel was acknowledged by Gordon Moore, one of its founders, to be largely due to the adoption of the SGT. He said that the SGT was difficult enough to make that it took a while for the industry to copy it, and yet it was not so difficult that a startup company couldn’t do it. What G. Moore did not acknowledge was that Intel had privileged knowledge of the work done at Fairchild which made it possible for them to develop SGT in a relatively short time.

By the mid-seventies, the SGT had been adopted by the entire industry, allowing MOS technology to eventually replace the incumbent bipolar technology for nearly all ICs produced in the world. Only recently has the industry been forced to use materials other than silicon dioxide and poly-silicon for the gate stack, in order to continue scaling down MOS transistor sizes without undue loss of performance at or below 45 nm lithography. Nonetheless, SGT remains one of the most influential ideas that have fueled the stunning progress of microelectronics during the last 40 years.