Patent application title: SRAM CIRCUITRY

Abstract:

A static ram cell is described. The cell includes a pair of cross-coupled
transistors and a pair of diode-connected transistors operated from a
wordline that provides power to the cell. The cell has three main
operating modes: reading, writing, and data retention. Reading is
performed by sensing current flowing from a powered-up wordline through a
conductive one of the cross-coupled transistors. Writing is performed by
pulsing the source of the conductive one of the cross-coupled transistors
with a positive voltage to flip the conductive states of the
cross-coupled transistors. Data retention is performed by using leakage
currents to retain the conductive states of the cross-coupled
transistors. A decoder for an array of static ram cells may be operated
synchronously and in a pipelined fashion using a rotary traveling wave
oscillator that provides the clocks for the pipeline. The cell is capable
of detecting an alpha particle strike with suitable circuitry.

Claims:

1. A method for writing data into a static ram cell, the method
comprising: setting a wordline of the static ram cell to a first reference
voltage, the wordline providing the current for writing data into the
cell, the cell having true and complement bit lines, the bit line having
a conductive path to the wordline and the complement bit line having a
non-conductive path to the wordline; connecting the bit line via a first
switch and the complement bit line via a second switch to a second
reference voltage; and pulsing the bit line of the cell to the first
reference voltage during a short time interval to write new data into the
cell, such that the complement bit line has a conductive path to the
wordline.

2. A method for retaining data in a static ram cell, the method comprising
setting a wordline of the static ram cell to a voltage that provides
leakage currents to the cell, the cell having true and complement bit
lines, the bit line having a conductive path for carrying the leakage
currents from the wordline and the complement bit line having a
non-conductive path to the wordline.

3. A method for retaining data in a static ram cell as recited in claim
2, wherein the true and complement bit lines have been set to a particular
voltage; and wherein the voltage of the wordline is more than one
threshold voltage above the particular voltage on the true and complement
bit lines.

4. A method for retaining data in a static ram cell as recited in claim 2,
further comprising detecting current flow in the non-conductive path to
the wordline; and recording the detected current flow in a latching
detector, the latching detector indicating a strike event at the cell.

Description:

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001]This application is a divisional of U.S. patent application Ser. No.
11/621,108, filed Jan. 8, 2007, titled SRAM CIRCUITRY, which is a
divisional of U.S. patent application Ser. No. 11/123,880, filed May 4,
2005, titled "SRAM CIRCUITRY," which claims priority to and incorporates
by reference into the instant application Great Britain Application
titled "SRAM04," having patent application number GB0409728.3, filed May
4, 2004.

FIELD OF THE INVENTION

[0002]This invention relates generally to SRAM cells and more particularly
to a high speed, low power SRAM cell.

DESCRIPTION OF THE RELATED ART

[0003]Leakage current in sub-100 nanometer (nm) VLSI devices is becoming
an increasing problem with each generation. At such a small scale, the
transistors have problems `turning off` (because the threshold voltage
Vth is reduced, which increases sub-threshold current) and most CMOS
circuits including SRAM and static logic will leak current from VDD to
GND through P- and N-channel transistors. These currents are on the order
of 25 nA per transistor pair. With possibly 1 billion transistors on a
chip, the amount of power wasted can be large. Sub-threshold leakage
increases dramatically with temperature and supply voltage. Note that the
other often-quoted leakage mechanism, gate leakage, can be solved with
the correct `CMOS recipe` (see the Intel press releases on its process due
for introduction in 2007, which uses a high-K dielectric and metal gate).
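
By way of illustration only, the figures quoted above can be multiplied
out to show the scale of the problem. The 25 nA per transistor pair and
the 1 billion transistors are taken from the text; the 1.0 V supply is an
assumption made purely for this sketch.

```python
# Back-of-envelope sub-threshold leakage estimate.
LEAK_PER_PAIR_A = 25e-9        # 25 nA per transistor pair (from the text)
TRANSISTORS = 1_000_000_000    # 1 billion transistors (from the text)
VDD_V = 1.0                    # ASSUMPTION: supply voltage, not given in the text

pairs = TRANSISTORS // 2                 # each leakage path involves a P/N pair
total_leak_a = pairs * LEAK_PER_PAIR_A   # total leakage current in amperes
wasted_power_w = total_leak_a * VDD_V    # static power in watts

print(f"leakage current: {total_leak_a:.1f} A, wasted power: {wasted_power_w:.1f} W")
```

Under these assumptions the leakage alone amounts to several amperes,
which is why the paragraph above describes the wasted power as large.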

[0004]Many VLSI chips today are largely composed of SRAM arrays (60% of
chip area) with the remainder being logic circuits. An SRAM is prone to
leak energy because of its CMOS (PMOS, NMOS) configuration with obvious
leakage paths.

[0005]Reducing VDD to reduce leakage current in SRAM and its
optimization/tradeoffs have been an active research topic. Please refer
to references [1] and [2].

[0009]An embodiment of the present invention is a static ram cell that
includes a pair of cross-coupled transistors and a pair of
diode-connected transistors. Each of the pair of cross-coupled
transistors has a gate node and a channel between a source node and drain
node, where the first transistor of the pair has the drain node connected
to the gate node of the second transistor of the pair, and the second
transistor of the pair has the drain node connected to the gate node of
the first transistor of the pair, the source node of the first transistor
of the pair being the true bit line and the source node of the second
transistor of the pair being the complement bit line. Each of the pair of
diode-connected transistors has a gate node and a channel between a
source node and drain node, wherein the channel of the first of the pair
of diode-connected transistors is connected between the drain of the
first of the cross-coupled transistors and a word line, and wherein the
channel of the second of the pair of diode-connected transistors is
connected between the drain of the second of the cross-coupled
transistors and the word line. The state of the cell is based on which of
the cross-coupled transistors is conductive.

[0010]A method in one embodiment of the present invention of reading the
cell includes (i) setting a wordline of the cell to a first reference
voltage, where the wordline provides the current for reading the state of
the cell, the cell having true and complement bit lines, (ii) setting a
mode line of true and complement sense amplifiers connected to the true
and complement bit lines, respectively, of the static ram cell to a
second reference voltage, and (iii) detecting the current flowing in the
true or complement bit lines of the cell to sense the state of the cell.

[0011]A method in one embodiment of the present invention of writing the
cell includes (i) setting a wordline of the static ram cell to a first
reference voltage, where the wordline provides the current for writing
data into the cell, the cell having true and complement bit lines, with
the bit line having a conductive path to the wordline and the complement
bit line having a non-conductive path to the wordline, (ii) connecting
the bit line via a first switch and the complement bit line via a second
switch to a second reference voltage, (iii) pulsing the bit line of the
cell to the first reference voltage during a short time interval to write
new data into the cell, such that the complement bit line has a
conductive path to the wordline.

[0012]A method of retaining data in the cell in one embodiment of the
present invention includes setting a wordline of the static ram cell to a
voltage that provides leakage currents to the cell, the cell having true
and complement bit lines, with the bit line having a conductive path for
carrying the leakage currents from the wordline and the complement bit
line having a non-conductive path to the wordline.

[0013]The SRAM cells of the present invention have the advantages of (i)
being very fast, on the order of 100 ps for a read on a 0.18 μm CMOS
process; (ii) needing no VDD or VSS wires, as retention power is taken
from the bit and wordlines; (iii) having low leakage current through
supply modulation; (iv) having small cells, only 4 transistors and 5
active contacts; (v) being implementable as a fast "all NMOS" cell;
(vi) having a low leakage current retention mode; and (vii) being coupled
with a very fast current-mode sense amplifier.

BRIEF DESCRIPTION OF THE DRAWINGS

[0014]These and other features, aspects and advantages of the present
invention will become better understood with regard to the following
description, appended claims, and accompanying drawings where:

[0017]FIG. 3 shows the current-mode sense amplifiers connected to each of
the bitlines;

[0018]FIG. 4a shows the voltage waveforms in the various modes of
operation;

[0019]FIG. 4b shows the current waveforms in the various modes of
operation;

[0020]FIG. 5 shows the relevant current paths and detection circuitry used
when a strike event occurs; and

[0021]FIG. 6 shows an embodiment of a pipelined decoder in accordance with
the present invention.

DETAILED DESCRIPTION OF THE INVENTION

Array Blocks

[0022]Arrays 20 are built up in the X direction 22 to make the memory
wider, and in the Y direction 24 by adding more wordlines. The bitlines
22 are commoned. An array 20 of SRAM cells shown in FIG. 1, together with
the wordline drivers and the bitline amplifiers 26a-d, completes an
X*Y array 20.

Multiple Blocks

[0023]Multiple array blocks would ordinarily be deployed. This helps in
several ways. It makes redundancy easier to implement on the array level.
It makes access time faster on smaller, less capacitive arrays. The
circuits in the individual blocks need only be matched to the local CMOS
PVT process conditions on that part of the die. It eases matching
requirements etc.

New SRAM Cell And Drive Circuits

[0024]The circuit here aims to give all the benefits associated with the
above approaches but in a much simpler way with less overhead in terms of
area and delay. It should be noted that the circuit is drawn using NFET
transistors, although PFETs could be used in place of any/all of the
transistors with the obvious changes of signal polarities required. Also,
depletion-mode NFETs could be used, as could PN diodes and breakdown
diodes, as will become obvious from the explanation of how the circuit
works.

[0025]FIG. 2 is the basic SRAM 1-bit memory cell 28. It comprises two
back-to-back transistors 30, 32 as the state element plus two other
diode-connected NFETs 34, 36 acting as dual-purpose pullup/access
transistors. Superficially, this is similar to the cell shown in FIG. 8
"4T SRAM cell, without VDD" of [2], but the cell of the present invention
is configured differently and operates in an entirely different way. It
has no ground connection at all (suits SOI especially) and the cell has
no pass transistors activated by the wordline. The cross-coupled
transistors 30, 32 of the present invention actually connect directly to
the bitlines 38, 40. Finally, the wordline drivers are different and the
bitline detection mode and circuits are different. In the default
quiescent operation, lines bit 38 and nbit 40 are held to ground, and the
word line 42 is held to a zero or positive voltage (which can vary from
about 0 volt to full VDD as will be explained later).

Sense Amplifiers/Bias

[0026]FIG. 3 shows the current-mode sense amplifiers 44 which terminate
each of the bitlines. They serve three purposes. First, they amplify and
integrate the current-pulse signals received during reading of the array
to make logic-level voltages from the small-signal bitline currents.
Second, they provide a current-return path for the leakage/standby
current of the array in the Data-retention mode. Third, they help detect
alpha-particle or cosmic ray strikes.

Circuit Description

[0027]Basically, the sense amplifier circuit operates as a charge
integrator. During amplification (read mode), the bitline is kept at a
low impedance by the coupling capacitor ccupl 46 and the feedback
capacitor cint 48 around the amplifier formed from inverters (ni1 50, pi1
52), (ni2 54, pi2 56), and (ni3 58, pi3 60). Therefore, the voltage on
the bitline (nominally) doesn't change during detection and instead any
integrated current from the bitline swings the Q output towards ground.

[0028]In the data-retention mode, the MODE signal 62 can be lowered, which
keeps the FETs nsink 64 and nbias 66 slightly on, providing enough sink
current to power the SRAM cells while allowing detection of an
alpha strike (described later) on either the Q or nQ output.

[0029]The following comments should be kept in mind.

[0030]In the diagram, the two sense amplifiers are both single-ended, but
could be cross-coupled to be a true differential amplifier.

[0031]Stage 2 (or other stage) of the inverting amplifier may need
gain-reduction (resistive load) to prevent the amplifier from oscillating
due to phase-lag in the amplifier.

[0032]VDD of the sense amplifiers is dropped for the data-retention mode
(circuitry not shown).

[0033]CMOS circuits are shown, but NMOS-only amplifiers are well-known and
can yield an entirely NMOS SRAM chip at lower cost than CMOS due to the
reduced number of mask layers required.

[0034]Signal integrity is expected to be good because the current-mode
detection method involves almost no voltage signal switching on the
bitlines.

[0036]When the word line 42 is raised more than one threshold voltage
above the bit 38 and nbit 40 voltage (usually 0 V), transistors n1pull 34
and n2pull 36 provide a weak pull-up current to the drains of ncross1 30
and ncross2 32 and, as in an ordinary SRAM cell, whichever is the current
state of the bit is reinforced by the feedback paths of the cross-coupled
devices 30, 32. The wordline drivers are able to provide this low voltage
(between 0.2 volts and 0.8 volts) to retain the state of the memory at
low leakage current between write cycles (and read cycles, which fully
recharge the state). In fact, it is just the imbalance of
sub-threshold (leakage) currents in the cross-coupled transistors (caused
by the gate voltage imbalance when the cell was written or read) which
retains the state, and all transistors are able to operate in the
sub-threshold range. Voltage on the word drive line 42 effectively sets
the pull-up or even leakage current for the block.

Read Mode

[0037]In the read mode, the word line 42 is raised to approximately VDD,
and transistors n1pull 34 and n2pull 36 provide a strong pull-up current
to the drains of ncross1 30 and ncross2 32. During read-mode, the state
of the bit is determined by monitoring the current flowing into lines bit
38 and nbit 40. For a logic 1 stored in the SRAM cell (ncross1 30 is
turned on) current flows from word 42 out to bit 38. The other bitline
nbit 40 receives current only if logic 0 had been stored in the cell, so
the two bit line currents are mutually exclusive. When operating in an
SRAM array, non-selected wordlines are held at the data-retention voltage
level. No current therefore flows into the unselected wordlines.
Additionally, the transistor common bitline loading from the inactive
wordlines does not significantly affect the detected current-signal
(current mode detection is at low voltage swing).
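
The read behavior described above can be sketched, purely at the
behavioral level, as follows. The function names are hypothetical, the
currents are in arbitrary units, and the retention-level leakage of
unselected rows is assumed to be negligible rather than modeled.

```python
def cell_currents(stored_bit, wordline_selected):
    """Return (i_bit, i_nbit) contributed by one cell, in arbitrary units."""
    if not wordline_selected:
        # ASSUMPTION: a wordline at the data-retention level contributes
        # no significant read current; leakage is ignored in this sketch.
        return (0, 0)
    # Logic 1 (ncross1 on): current flows from word out to bit.
    # Logic 0 (ncross2 on): current flows from word out to nbit.
    return (1, 0) if stored_bit == 1 else (0, 1)


def read_column(column_bits, selected_row):
    """Sum the currents on one shared bit/nbit pair and sense the result."""
    i_bit = 0
    i_nbit = 0
    for row, stored in enumerate(column_bits):
        cell_bit, cell_nbit = cell_currents(stored, row == selected_row)
        i_bit += cell_bit
        i_nbit += cell_nbit
    # The two bitline currents are mutually exclusive for the selected cell.
    return 1 if i_bit > i_nbit else 0


column = [1, 0, 1, 1]            # four cells sharing one pair of bitlines
print(read_column(column, 1))    # senses the cell on wordline 1
```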

Write Mode

[0038]In the write mode the bitlines are used to set the state of the
memory element instead of reading the state. Write mode begins exactly
like a read mode event (and this can be exploited to do read-modify-write
access) with word line 42 being raised to approximately VDD and with the
bit 38 and nbit 40 lines held low. This `powers up` the bit cell 28 and
allows it to be written. Assume for this example that the data currently
in the cell is a `1` (i.e., ncross1 30 is on, and ncross2 32 is off), and
it is desired to change the data to a zero. To do this, the line bit 38
is raised high very quickly to VDD whereupon, by virtue of ncross1 30
being switched on, it conducts to allow passbit signal to go high. Very
quickly this causes ncross2 32 to turn on, which discharges the gate of
ncross1 30 (node Npassbit drops low) turning it off. Once flipped, the
feedback reinforces the state which persists as bit 38 is brought back to
0 V. Note that the bitline used to write the bit of a particular state is
the opposite bitline to that which detects the current corresponding to
the state. Also note that there is a charge pumping mechanism whereby the
gates of the Nfet rise with the source signals during the write operation
and this effect could also be exploited for generating higher voltage
gate signals for hot-electron tunneling type devices such as Flash memory
cells.
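
The write sequence above may be summarized by the following behavioral
sketch. It models only which cross-coupled transistor is conductive and
ignores all analog effects, including the charge pumping mentioned above;
the class and method names are hypothetical.

```python
class CellModel:
    """Behavioral model of the 4-transistor cell: tracks the conductive side."""

    def __init__(self, bit=1):
        # Logic 1 is defined in the text as ncross1 on and ncross2 off.
        self.ncross1_on = (bit == 1)

    @property
    def value(self):
        return 1 if self.ncross1_on else 0

    def pulse(self, line):
        """Pulse 'bit' or 'nbit' to VDD with the wordline already raised."""
        if line == "bit" and self.ncross1_on:
            # The pulse passes through conducting ncross1, turns ncross2 on,
            # and ncross2 then discharges the gate of ncross1.
            self.ncross1_on = False
        elif line == "nbit" and not self.ncross1_on:
            # Mirror-image case for a cell currently holding 0.
            self.ncross1_on = True
        # Pulsing the source of the non-conducting transistor changes nothing.

    def write(self, new_bit):
        # The line pulsed is the opposite of the line that carries the
        # read current for the state being written.
        self.pulse("bit" if new_bit == 0 else "nbit")


cell = CellModel(bit=1)
cell.write(0)
print(cell.value)
```

In this model a write of the value already stored is harmless, because
the pulsed line connects to the transistor that is off.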

Exemplary Waveforms

[0039]FIGS. 4a and 4b show the operation of a single-bit cell in a
combination of read, write and data-retention modes.

[0040]Most SRAM circuits are very sensitive to alpha particle or cosmic
ray corruption of stored bit values, especially when operating at low
voltage and low stored capacitance as is typical of sub-100 nm VLSI
circuits. The circuit of the present invention is no different in this
regard and can have the logic state altered by the impact of ionizing
radiation. Alpha strikes in the silicon lattice free current carriers
(electrons/holes), effectively turning on transistors which may have been
off. For a conventional SRAM cell this can be catastrophic and can flip
the bit, i.e., currents induced by the free carriers could flow around
the VDD and VSS lines in the cell and could swap the charge polarities on
the back-to-back inverters. The event would go unnoticed outside of the
bit cell until bad data is read back on the next access. The usual methods
of prevention and/or recovery from these Single Event Upset (SEU) errors
are well known in the literature, such as Error Detection and Correction
(EDAC) [4], which rely on things like redundant bits coupled to Hamming codes to
find and repair one or more errors. If needed, EDAC circuits can be built
into the synchronous pipeline of the new SRAM system (described later) to
avoid decreasing throughput, but there is another possible method of
dealing with the problem.

[0042]Scrubbing refers to a background task by which an EDAC-equipped
multiport (i.e., two possible address/data access ports) SRAM array is
able to transparently fix memory errors. A separate scrubbing state
machine steps through each memory location and looks at the syndrome [4]
bits to find and fix any one-bit error by writing back to the memory. The
main processor is largely unaware of the scrubbing process and in all
likelihood never sees a bad data bit because of the background
`scrubbing` process.
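
A minimal sketch of such a scrubbing pass is given below. The text does
not specify the code used; a Hamming(7,4) code is assumed here only
because it is the simplest single-error-correcting example, and all
function names are hypothetical.

```python
def syndrome(word):
    """XOR of the 1-based positions of all set bits; 0 means no error."""
    s = 0
    for pos in range(1, 8):
        if word[pos]:
            s ^= pos
    return s


def encode(nibble):
    """Encode 4 data bits as a Hamming(7,4) word (list index 0 is unused)."""
    word = [0] * 8
    for k, pos in enumerate((3, 5, 6, 7)):   # data bit positions
        word[pos] = (nibble >> k) & 1
    s = syndrome(word)
    for p in (1, 2, 4):                      # parity bit positions
        word[p] = 1 if s & p else 0          # force the syndrome to zero
    return word


def scrub(memory):
    """Step through every location and repair any one-bit error in place."""
    fixed = 0
    for word in memory:
        s = syndrome(word)
        if s:                # a nonzero syndrome is the position of the bad bit
            word[s] ^= 1     # write the corrected bit back to the memory
            fixed += 1
    return fixed


memory = [encode(n) for n in range(16)]
memory[5][6] ^= 1            # simulate a single-event upset in one word
print(scrub(memory))         # number of words repaired on this pass
```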

Error Detection And Correction Ability

[0043]An interesting aspect of the SRAM cell of the present invention is
the lack of VDD or VSS supplies and the provision of current sensing
mechanisms in the bitlines and wordline (FIGS. 3 and 5). This offers an
intriguing possibility of directly detecting alpha (or other)
particle strikes and repairing these directly without the need for
redundancy or ECC. Observing the new SRAM cell in the data retention
mode, there should only be a static DC current corresponding to the bias
current of the cell and with SOI, because of the lack of a substrate
connection, no current can flow in the cell without it involving current
flow between two or more of the three terminals shown in the diagram. The
system works as follows. For most of the time, most of the SRAM arrays on
a chip are in the data-retention mode. In this mode the cells are
vulnerable to corruption from an alpha-strike because of low cell voltage
and weak pull-up currents. Now, consider an alpha strike on transistor
ncross2 32 while the state of the cell is "1" (ncross1 30 is on, ncross2
32 is off). Current is induced to flow in ncross2 32 and the only paths
it can flow are those shown in FIG. 5. All current flow paths are
detectable externally. Only an additional transient current-sensor 70 in
the wordline is needed. The alpha strike is then recorded at its X and
Y position by the latching detectors 72. The Y position is given by the
corresponding wordline transient detector bit and the X position by the
location of a differential transient current between the bitlines.
Effectively, the alpha strike can be thought of as an "unexpected read"
of the SRAM data. By detecting the polarities of the bitline current
detectors, the state of the bit before it was corrupted can be found.
Alpha strikes that cause no change in signal polarity, such as strikes to
an NFET which is already on, are of no interest. Strikes on ncross1 30
when it happens to be off (cell state is 0), are recorded with the
opposite polarity on the bitline current detectors. The principle can be
extended to strikes on the diode/pullup transistors 34, 36. The FLAG
output of the latching circuit 72 can directly invoke a "scrubbing"
process to reinstate the correct state on the bit whose position and
proper state are known.
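
The localization step described above may be sketched as follows. The
data layout and the function name are hypothetical, and the sign
convention (a negative differential meaning excess transient current on
nbit, hence a strike on ncross2 while the cell held "1") is an assumption
made for illustration.

```python
def locate_strike(wordline_flags, bitline_diffs, threshold=0.0):
    """Return (x, y, prior_state) for a single strike, or None.

    wordline_flags: one latched transient-detector bit per wordline (Y)
    bitline_diffs:  latched differential transient (i_bit - i_nbit) per column (X)
    """
    rows = [y for y, flag in enumerate(wordline_flags) if flag]
    cols = [x for x, d in enumerate(bitline_diffs) if abs(d) > threshold]
    if len(rows) != 1 or len(cols) != 1:
        return None          # nothing latched, or ambiguous multiple events
    y, x = rows[0], cols[0]
    # ASSUMPTION: excess current on nbit (negative differential) means that
    # ncross2 was struck while off, i.e. the cell held "1" before the strike.
    prior_state = 1 if bitline_diffs[x] < 0 else 0
    return (x, y, prior_state)


event = locate_strike([False, True, False], [0.0, 0.0, -1.0, 0.0])
print(event)                 # position and state to hand to the scrubbing process
```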

Cache Application Of Soft Error Detection

[0044]Cache SRAM memory on modern microprocessors contributes to a lack of
reliability [6] and these problems are expected to get worse as dimensions
and voltage shrink. For the application of an SRAM memory cache, a
simpler but still useful version of the system described previously can
exploit the detection of soft errors, but use an existing mechanism to do
the correction. In this application, whenever the CPU requests an item
from the cache and where there is a soft-error (FLAG set from the word or
bitline decodes (see above)), we can force the return of a "Miss" on the
cache, independently of the normal hit/miss flag logic. This forces the
memory subsystem to execute a fetch from external DRAM. As is the nature
of cache, this data is automatically propagated to the cache SRAM and so
the external fetch itself fixes the fault. The error flags could apply to
a single bit, single word or an entire block of memory. For example, it
is simpler to implement a scheme on a cache where a single soft-error
anywhere in the block flags that the whole block is invalid (not in the
cache). This can be done by resetting all the entries of the cache lookup
mapping to the block in which there is a soft error. These bits go back
to normal one by one when a new external fetch occurs for that address.
Both the TAG and the RAM banks of the cache can be protected. An error
detected in the TAG cache line also invalidates a possible "HIT" caused
by that lookup in the TAG.
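
A behavioral sketch of the forced-miss mechanism is given below. The
class is hypothetical, flags are kept per address for simplicity (the
text also allows per-word or per-block flags), and the cached data is
assumed to be clean so that the external copy is always valid.

```python
class SoftErrorCache:
    """Cache model in which a latched soft-error FLAG overrides a hit."""

    def __init__(self, dram):
        self.dram = dram            # stands in for the external DRAM
        self.lines = {}             # address -> cached data
        self.error_flags = set()    # addresses with a latched soft-error FLAG
        self.external_fetches = 0

    def flag_soft_error(self, addr):
        """Called when the word/bitline detectors latch a strike at addr."""
        if addr in self.lines:
            self.error_flags.add(addr)

    def read(self, addr):
        hit = addr in self.lines and addr not in self.error_flags
        if not hit:
            # A forced miss is handled exactly like an ordinary miss: the
            # external fetch refills the line, which also repairs the fault.
            self.lines[addr] = self.dram[addr]
            self.error_flags.discard(addr)
            self.external_fetches += 1
        return self.lines[addr]


cache = SoftErrorCache({0x10: 42})
cache.read(0x10)                # ordinary miss, fills the line
cache.lines[0x10] = 99          # simulate corruption by a particle strike
cache.flag_soft_error(0x10)     # the detectors set the FLAG
print(cache.read(0x10))         # forced miss returns the correct data
```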

[0046]With the redesign of the SRAM cell and its array according to the
present invention, we can look at addressing another problem of SOC
design, which is the fact that SRAM macros cause blockages in the routing
layers. SRAMs usually only occupy metal one M1 and metal two M2. Concerns
over signal integrity of the sense amplifiers often disallow routing
over the SRAM on other metal layers, perhaps all the way to M4. It is an
aspect of the present invention that we build-in porosity to the SRAM as
far as routing is concerned, because of the reduced use of wiring (No
VDD, VSS needed), and we can rotate SRAM blocks as needed to re-orient
with the prevailing routing direction on the layers.

Pipelined Memory Decode

Multithreading

[0047]Many microprocessors today are multi-threaded, which is a mechanism
for avoiding the penalty of large memory or other off-chip latencies. A
multi-threaded (or multi-tasking) machine switches threads immediately
after issuing an address request for off-chip memory data which might
take many clock cycles to return. By
switching threads, another thread can be continued whose latency will
typically have been absorbed during that thread's idle time. That is, if
a memory request has stalled the thread, the data should be available
when the thread is continued. Conceptually, by switching among the
threads, the microprocessor is kept fully utilized and the multi-threaded
application program executes faster than a single-threaded application
which is periodically idle while waiting for off-chip data.

[0048]In practice, the internal memory controller and the external memory
itself can be a bottleneck. For example, if it takes 20 clock cycles to
fetch a given data word from external memory, then the maximum number of
memory requests is 1 per 20 clock cycles of the microprocessor. The code
for each thread can comprise at most 5% external memory references if the
latency of the memory interface is not to be exposed (this figure is
relaxed when a cache is used, and various levels of internal caches have
lower levels of latency).
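
The 5% figure follows directly from the latency: with one outstanding
request at a time, at most one external reference can be serviced per
latency period. A minimal sketch of that arithmetic, with a hypothetical
function name, is:

```python
def max_external_ref_fraction(latency_cycles):
    """Largest fraction of cycles that may issue an external memory reference
    when the (non-pipelined) memory interface completes one request per
    latency period."""
    return 1.0 / latency_cycles


fraction = max_external_ref_fraction(20)   # 20-cycle fetch, as in the text
print(f"{fraction:.0%} of instructions may reference external memory")
```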

Pipelined Memory Decoder

[0049]A solution to the above problem, without requiring large amounts of
cache memory, is to increase the throughput of the memory system for the
multi-threaded microprocessor. Despite increased memory latency, the
throughput can be improved to one random read/write per clock cycle if
the memory decoder is pipelined [5]. Multiple requests to memory
locations flow down the pipeline in the memory decoder and perhaps take
20 clock cycles to ultimately activate the correct Row/Column. But, the
point to note is that each clock cycle presents a new Row/Column decode
and is able to perform a new random access data read or write. Using a
smaller amount of the expensive and area-consuming on-chip cache RAM,
similar performance can be achieved. The SRAM circuits described
previously are applicable to this technique.
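
The distinction between latency and throughput drawn above can be
illustrated with a simple cycle count. The sketch assumes an idealized
pipeline with no stalls that accepts one request per clock; the function
names are hypothetical.

```python
def cycles_unpipelined(n_requests, latency):
    """Each request must finish before the next one starts."""
    return n_requests * latency


def cycles_pipelined(n_requests, latency):
    """One new request enters per clock; the first result appears after
    'latency' cycles and one more result completes on every later cycle."""
    if n_requests == 0:
        return 0
    return latency + (n_requests - 1)


n, latency = 1000, 20    # 20-cycle decode latency, as in the text
print(cycles_unpipelined(n, latency), cycles_pipelined(n, latency))
```

For long request streams the pipelined count approaches one access per
clock cycle even though each individual access still takes the full
latency.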

[0050]FIG. 6 shows an embodiment of a pipelined decoder in accordance with
the present invention. Included in the embodiment are a rotary traveling
wave oscillator 100, as described in U.S. Pat. No. 6,556,089, which is
incorporated by reference into the present application, a number of
pre-decoder sections 106, 108, 110 that include shift register elements
102, which receive serial address data, combinational logic, and a number
of pipeline registers 104 that receive the outputs of the pre-decoding
sections, and a number of decoders 114 attached to the outputs of the
pipeline registers.

[0051]In the example shown in FIG. 6, address lines A0-A5 are decoded.
There are three pre-decoding stages 106, 108, 110. The first pre-decoding
stage 106 receives address bits A0 and A1 from a first shift register
that is clocked by a certain tap of the rotary clock and decodes all
possible states of A0 and A1. The second pre-decoding stage 108 receives
address bits A2 and A3 from a second shift register that is clocked by a
different tap of the rotary clock and decodes all possible states of A2
and A3. The third pre-decoding stage 110 receives address bits A4 and A5
from a third shift register that is clocked by yet another different tap of
the rotary clock and decodes all possible states of A4 and A5.

[0052]The four outputs of the first, second, and third pre-decoding stages
are each captured in a set of pipeline registers 104 whose clocks are
connected to taps of the rotary clock. The twelve outputs of the pipeline
registers form a pre-decode bus 116 to which connections can be made to
complete the decoding. Note that as the traveling wave CLK and CLK* move
about the rotary clock loop, say in the clockwise direction, the first,
second and third shift registers are clocked, and then the pipeline
registers of the third, second and first pre-decoding stages are clocked.
Thus, each pre-decoding stage is pipelined as between the shift register
and the pipeline registers and each pre-decoding stage is pipelined with
respect to the other pre-decoding stages.
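
The logical function of the pre-decoding stages, leaving aside clocking
and pipelining entirely, can be sketched as follows. The text states only
that connections to the pre-decode bus complete the decoding; the 3-input
AND per row used here, and the ordering of lines on the bus, are
assumptions for illustration.

```python
def predecode_pair(bit_lo, bit_hi):
    """Decode all four states of two address bits into a one-hot group."""
    index = bit_lo | (bit_hi << 1)
    return [1 if i == index else 0 for i in range(4)]


def predecode_bus(address):
    """Build the twelve-line pre-decode bus for address lines A0-A5."""
    bits = [(address >> k) & 1 for k in range(6)]
    bus = []
    for stage in range(3):          # stages for (A0,A1), (A2,A3), (A4,A5)
        bus += predecode_pair(bits[2 * stage], bits[2 * stage + 1])
    return bus


def row_select(bus, row):
    """ASSUMPTION: each final decoder ANDs one line from each of the
    three one-hot groups on the bus."""
    g0, g1, g2 = row & 3, (row >> 2) & 3, (row >> 4) & 3
    return bus[g0] & bus[4 + g1] & bus[8 + g2]


bus = predecode_bus(0b100110)
selected = [row for row in range(64) if row_select(bus, row)]
print(selected)                     # exactly one of the 64 rows is active
```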

[0054]SRAMs are often designed to be multi-port, meaning there are two
separate address accesses to the same bit-storage nodes. Another
technique to achieve the same effect is to multiplex the address and data
lines of the SRAM between two or more ports and time-share access to the
RAM. This eliminates contention between the ports but requires higher
speed for the SRAM.

[0055]An interesting possibility where the address decoder is pipelined is
to use SERDES ports to multiplex the address and data from multiple
remote sources at the 1-bit level and feed the bits alternately from each
SERDES address receiver port into the pipeline stages. Having address and
data as single-bit streams allows easy arithmetic on the address and
data operands and could be used for memory management purposes.
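
As one hedged example of why single-bit streams make arithmetic easy, a
bit-serial adder needs only one full adder and one bit of carry state.
The sketch below assumes the streams arrive least significant bit first,
which the text does not specify, and applies the idea to adding an offset
to an address; the helper names are hypothetical.

```python
def serial_add(a_bits, b_bits):
    """Add two equal-length LSB-first bit streams, one bit per clock."""
    carry = 0
    out = []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)                  # sum bit of a full adder
        carry = (a & b) | (carry & (a ^ b))        # carry into the next clock
    return out                                     # final carry-out is dropped


def to_bits(value, width):
    """Convert an integer to an LSB-first list of bits."""
    return [(value >> k) & 1 for k in range(width)]


def from_bits(bits):
    """Convert an LSB-first list of bits back to an integer."""
    return sum(bit << k for k, bit in enumerate(bits))


base, offset = 0x2A, 0x15
relocated = from_bits(serial_add(to_bits(base, 8), to_bits(offset, 8)))
print(hex(relocated))
```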

[0056]Although the present invention has been described in considerable
detail with reference to certain preferred versions thereof, other
versions are possible. Therefore, the spirit and scope of the appended
claims should not be limited to the description of the preferred versions
contained herein.