Thursday, June 04, 2009

In response to our discussion, Artto Aurola from Pixpolar commented on the company's MIG operation and its benefits over conventional sensors. To give his comments more visibility, I'm moving them to the front page:

--------------

Hello everybody,

I was asked to comment on this conversation about Pixpolar's Modified Internal Gate (MIG) technology. The MIG technology has excellent low light performance due to high fill factor, high sensitivity, low noise and long integration time (allowed by multiple CDS read-out, which is enabled by the MIG sensor's non-destructive read-out ability). Therefore the low end of the dynamic range can be extended considerably in MIG sensors. A reasonable value can also be obtained at the high end of the dynamic range scale – typically it is easier to improve the high end than the low end of the dynamic range in any sensor. In case you want to dive deeper into the technological aspects, you'll find below a summary of the low light benefits of MIG sensors compared to Complementary Metal Oxide Semiconductor (CMOS) Image Sensors (CIS) and to the DEPleted Field Effect Transistor [DEPFET, aka Bulk Charge Modulated Device (BCMD) etc.] sensors, and a few words about pixel scaling.

MIG COMPARISON TO CIS IN LOW LIGHT

A critical problem in most digital cameras is the image quality in low light. In order to maximize the low light performance the Signal to Noise Ratio (SNR) needs to be as high as possible, which can be achieved:

1) BY MAXIMIZING CONVERSION AND COLLECTION EFFICIENCY, i.e. by converting as many photons to charges as possible, by collecting these photo-induced charges (signal charges) as efficiently as possible and by transporting these signal charges to the sense node as efficiently as possible. In CIS this is done by incorporating Back Side Illumination (BSI) and micro lenses into the sensor and by the design of the pinned photo diode and the transfer gate. The essentially fully depleted MIG pixel provides a decent potential gradient for signal charge transport throughout the pixel and up to 100 % fill factor in case of BSI. These aspects are missing in CIS, but the micro lenses compensate for this fairly well.

2) BY MINIMIZING NOISE.

2A) Dark noise. The most prominent dark noise component is the interface-generated dark current, which has been suppressed considerably in CIS by using the pinned photo diode structure, so that the only possible source of interface dark noise during integration is the transfer gate. In MIG sensors the interface-generated dark current and the signal charges can be completely separated both during integration and read-out. The advantage is, however, not decisive.

2B) Read noise:

2B1) Reset noise. This noise component can be avoided in CIS using Correlated Double Sampling (CDS), which necessitates the four-transistor CIS pixel architecture. In the MIG pixel CDS is always enabled due to the fact that the sense node can be completely depleted.

2B2) Sense node capacitance. In CIS [and in Charge Coupled Devices (CCDs)] an external gate configuration is used for read-out, wherein the sense node comprises the floating diffusion, the source follower gate and the wire connecting the two. The signal charge is shared among all three parts instead of being located solely at the source follower gate, meaning that CIS suffers from a large sense node stray capacitance that degrades the sensitivity. In MIG sensors the signal charge is collected at a single spot (i.e. at the MIG) which is located inside silicon below the channel of a FET (or below the base of a bipolar transistor), meaning that the sense node capacitance is as small as possible, enabling high sensitivity.

2B3) 1/f noise. It is mainly caused by charge trapping and detrapping at the interface of silicon and the gate isolator material of the source follower transistor. It can be reduced in CIS by using CDS and buried channel source follower transistors. However, the problem in the external gate read-out is the fact that the signal charges always modulate the channel threshold potential through the interface. Therefore, and due to the large sense node stray capacitance, a charge trapped at the interface below the gate of the source follower transistor always causes a larger (false) signal than a real signal charge at the sense node, which complicates the read noise reduction in CIS considerably.

In MIG sensors the 1/f noise can likewise be reduced using CDS, i.e. by subtracting the results measured when the signal charges are present in the sense node (MIG) and after the sense node has been emptied of signal charges (the sense node can be completely depleted!). A complementary way to reduce the 1/f noise in MIG sensors is to use a buried channel Metal Oxide Semiconductor (MOS) structure which has a relatively thin, high-k gate isolator layer (the thin high-k isolator is common in normal MOS transistors) and a relatively deep buried channel. This combination reduces the impact of the interface-trapped charge considerably, resulting in very low 1/f noise characteristics. In other words, the signal charge at the MIG results in a much higher signal than a charge trapped at the interface since a) the gate-to-silicon distance is much smaller than the gate-to-channel distance, b) the external gate is at a fixed potential during the read-out, c) the sense node stray capacitance is very small and d) the signal charge is not modulating the channel threshold potential through the interface. Instead of using a MIG transistor having a buried channel MOS structure, very low 1/f noise characteristics can naturally also be achieved by using a JFET or Schottky gate in the MIG transistor.

2C) Multiple CDS read-out / double MIG. The read noise can be further reduced in a double MIG sensor by reading the integrated charge multiple times CDS-wise. The multiple CDS read-out of the same signal charges is enabled because the signal charges can be brought back and forth to the sense node (MIG) and because the sense node can be completely depleted, i.e. signal charges do not mix with charges present in the sense node. In CIS and CCDs the external gate sense node can never be depleted, and once the signal charges are brought to the sense node they cannot be separated anymore. Therefore multiple CDS read-out is not possible in CIS and CCDs. Multiple CDS read-out is a very efficient way of reducing the read noise, since the read noise is reduced by a factor of one over the square root of the number of reads.
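To put a number on that last claim, here is a quick Python sketch (my own toy calculation, not Pixpolar's code) of how averaging N independent CDS reads of the same, non-destructively stored signal charge scales down the read noise:

```python
import math

def effective_read_noise(single_read_noise_e, n_reads):
    # Averaging n independent CDS reads of the same signal charge
    # reduces the read noise by a factor of 1/sqrt(n).
    return single_read_noise_e / math.sqrt(n_reads)

# Starting from a hypothetical 2.0 e- single-read noise:
for n in (1, 4, 16):
    print(n, effective_read_noise(2.0, n))  # 2.0, 1.0, 0.5 e- respectively
```

So sixteen reads buy a 4x noise reduction – the square-root law is also why the improvement flattens out fairly quickly.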

3) BY MAXIMIZING THE INTEGRATION TIME. The integration time is typically limited by subject and/or camera movements. The maximum possible integration time can be achieved if the signal charge is read as often as possible (e.g. the frame rate is set at the maximum reasonable value by reading signal charge in rolling shutter mode without using a separate integration period) and if all the measured results are stored in memory. In this manner the start and end points of the integration period can be set afterwards according to the actual camera/subject movements, which can considerably increase the integration time compared to the case wherein the integration time is set beforehand to a value which enables a decent image success rate. In addition, the stored multiple read-out measurement results facilitate the correction of image degradation caused by camera movements (software anti-shake), which also enhances the maximum integration time and image quality. Finally, the stored multiple read-out results enable different integration times to be used for different image areas, considerably increasing the dynamic range and the quality of the image.
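The "set the integration time afterwards" idea is easy to sketch in code. Assuming (my illustration only, not an actual Pixpolar API) that each non-destructive read-out stores the cumulative pixel signal in memory, any integration window is just a difference of two stored values:

```python
def window_signal(cumulative_reads, start, end):
    # cumulative_reads[k] = cumulative signal (e-) of one pixel at
    # non-destructive read-out k; the signal integrated over the
    # afterwards-chosen window [start, end] is simply the difference.
    return cumulative_reads[end] - cumulative_reads[start]

# Hypothetical stream of 5 read-outs; suppose the camera shook after
# read 3, so we pick the window [0, 3] after the fact and keep the
# sharp part of the exposure.
reads = [0, 12, 25, 40, 41]
print(window_signal(reads, 0, 3))  # -> 40 e-
```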

If the signal charge is read as often as possible in CIS, if the difference between read-out results at selected start and end points is used as the final measurement result, and if the sense node is not reset in between, the final result will comprise a lot of interface-generated dark current integrated by the sense node and a lot of 1/f noise. If, on the other hand, the interface-generated dark noise is removed CDS-wise on all the measurement occasions taking place during the afterwards-selected integration period, the read noise adds up (this applies also to CCDs). Therefore it may be necessary in CIS to limit the number of reads to one or two at maximum. In this manner, however, the integration time must be set beforehand and the benefits of multiple read-outs cannot be obtained. In particular, adjusting the integration time to subject movements becomes more or less impossible.

The big benefit of the double MIG concept is that for a given pixel one can freely choose an integration time between any two measurement occasions taking place between the afterwards-selected start and end points, whereby the result is free of interface-generated dark noise, read noise does not add up, and 1/f noise is considerably reduced due to CDS read-out. (To be precise, the read noise is at its lowest if the pixel-specific start point of integration corresponds to pixel reset.) The first and second points are due to the fact that non-destructive read-out and elimination of interface-generated dark noise take place simultaneously. These two aspects are also true for the single MIG pixel. If in the single MIG pixel the 1/f noise is at a low enough level (e.g. the buried channel MOS approach), the pixel integration time can also be selected independently from other pixels without degrading the image quality (higher 1/f noise versus longer integration time), even though CDS would not be used.

The above means a profound change especially for low light digital photography, since instead of setting the integration time beforehand, the image can be constructed from a stream of afterwards-selected sub-images taken at the maximum reasonable frame rate. The latter also enhances the ability to correct the image quality software-wise.

MIG COMPARISON TO DEPFET/BCMD

In the DEPFET (aka BCMD) sensor signal charges are collected in silicon in an Internal Gate (IG) below the channel of a FET, where they modulate the threshold voltage of the FET. In DEPFET the sense node is the IG, which can be completely depleted. In addition, the read-out in DEPFET is non-destructive. The difference between the MIGFET and the IGFET (DEPFET) is the fact that the current running in the channel of the MIGFET is of the same type as the signal charges in the MIG, whereas the current running in the IGFET channel and the signal charges in the IG are of opposite type.

Process fluctuations (e.g. small mask misalignments etc.) and/or poor layout design cause small potential wells for signal charges to form in the IG of the DEPFET/IGFET sensor. The threshold potential in the FET channel above the locations of the small potential wells is, however, at its maximum (signal charges and the current running in the channel are of opposite type). This results in the problem that signal charges are first collected at locations where the sensor is least sensitive, i.e. the read noise is higher the fewer signal charges there are to be read, which degrades image quality in low light. In other words, the IGFET/DEPFET sensor is vulnerable to process fluctuations, which may degrade the manufacturing yield to an unsatisfactory level.

Another problem in the IGFET/DEPFET sensor is the fact that normal square IGFETs suffer from serious problems originating from the transistor edges, necessitating the use of circular IGFETs. The doughnut-shaped IG of the circular transistor increases the capacitance of the IG, i.e. of the sense node, leading to lower sensitivity. This is also likely to impede low light performance. The circular IGFET structure also significantly complicates the signal charge transfer back and forth to the IG, meaning that multiple CDS read-out is difficult to perform. Additionally, the large size of the IG is likely to make the device even more prone to process fluctuations.

The benefit of the MIGFET is that the signal charges are always collected under the location of the channel where the threshold voltage is at its highest (signal charges and the current running in the channel are of the same type), which means that the MIGFET has tolerance against process fluctuations and that the read noise does not increase when the amount of signal charge decreases. Another benefit is that square MIGFETs do not suffer from edge problems. This enables the use of a minimum-size MIGFET having a small MIG, which enables very high sensitivity. In addition, the square MIGFET considerably facilitates the signal charge transfer back and forth to the MIG, enabling multiple CDS read-out.

The fewer signal charges there are in the IG, the higher the threshold voltage of the IGFET. This is a problem when the read-out is based on current integration, since the fewer signal charges there are in the IG, the smaller the current through the IGFET and the smaller the amount of integrated charge. The smaller the amount of integrated charge, the more the SNR is degraded due to the high level of shot noise in the amount of integrated charge. In MIG sensors the situation is the opposite way round, i.e. the fewer signal charges there are in the MIG, the smaller the threshold voltage of the MIGFET and the bigger the current through the MIGFET, and the less the SNR is affected by shot noise in the amount of integrated charge.

In buried channel circular IGFETs the charges generated at the interface of silicon and the gate isolator material are blocked at their origin, leading to accumulation of interface-generated charges at the interface between source and drain. Although the interface-accumulated charge can be removed during reset, it may increase the noise especially when the amount of signal charge is small and when long integration times are used, even if the gate isolator material layer is fairly thin. The ability to use square MIGFETs enables the use of the buried channel without interface-generated charges accumulating at the silicon/gate isolator interface (the edges of the square transistor form an exit for the interface-generated charges).

Instead of using a circular buried channel IGFET one can use circular JFET or Schottky gate IGFETs, which do not suffer from interface charge accumulation. The JFET-based IGFET is, however, more complex than the other structures (i.e. there are more doping layers present), and since the IGFET is vulnerable to process fluctuations, the JFET-type IGFET is likely to have the lowest yield. The Schottky gate IGFET is, on the other hand, process-wise quite exotic and may be difficult to incorporate into a CMOS image sensor process.

Probably for the reasons described above, the only IGFET structure that has been studied extensively is the circular surface channel IGFET. In the surface channel IGFET the external gate cannot be used for row selection, since it is necessary to keep the channel open continuously (excluding reset, of course); otherwise the IG would be flooded with interface-generated dark current. This can be prevented if at least one additional selection transistor is added to the pixel (e.g. the source of the selection transistor is connected to the drain of the IGFET). The downside of the fact that the channel needs to be kept open continuously is that the transfer of signal charges back and forth to the IG is even more complicated, hindering multiple CDS read-out. In the surface channel MIGFET the channel can be closed without the interface-generated charges being able to mix with the signal charges in the MIG (there is a potential barrier in between). Therefore the external gate of the surface channel MIGFET can be used for row selection. Additionally, multiple CDS is enabled in the surface channel MIGFET since the external gate can also be used for signal charge transfer.

PIXEL SCALING

Important aspects concerning the pixel size in today's BSI image sensors are certainly the manufacturing line width, the SNR and the dynamic range. The manufacturing line width defines the lower limit for the pixel size: the pixel cannot be smaller than the area of the transistors belonging to one pixel. In practice the pixel size is defined by the low light level performance, i.e. by the SNR of the pixel. In both aspects MIG sensors have an advantage, since the minimum number of transistors required per pixel is only one, since many features of the MIG transistors enable noise reduction compared to CIS/CCDs, and since the integration time can be maximized by setting the integration time afterwards (multiple CDS provided by the double MIG, low 1/f noise provided by the buried channel MIGFET). A very good analogy for the MIGFET sensor is the DEPFET sensor, which holds the world record in noise performance at 0.3 electrons even though it suffers from the problems listed above.

The smaller pixel size should not, however, compromise the dynamic range of the pixel. The lower end of the dynamic range scale is already optimized in MIG sensors. The small sense node (i.e. MIG) capacitance naturally somewhat limits the full well capacity of the MIG. Again, a good analogy here is the IGFET/DEPFET/BCMD technology – STMicroelectronics has recently reported that they achieved better performance in a 1.4 um pixel with the IGFET/DEPFET/BCMD architecture than with the normal CIS architecture.

29 comments:

If the MIGFET becomes adopted then probably there will be prior art issues in any patent dispute and it is best not to get into that here.

My concerns were about the use of the MIGFET to solve problems with small pixels. For commercial pixels, read noise is probably small enough and SNR is limited by shot noise. With BSI there is plenty of room for 4T-type shared-RO pixels. Also, light blockage by wiring or transistors is not an issue in BSI. The current problems with small pixels were well described by Ahn et al. in their 2008 IEDM paper, which boils down to "we need more photons" and hence the current trend for RGBnW CFA kernels. So, based on this it is a little hard to imagine the driving force in adoption and commercial development of this technology, except if the cost is lower by avoiding BSI. Still, by the time this technology is fully commercialized I suspect BSI will not be a major cost factor.

1T pixels are still of interest to me for deep submicron, sub-diffraction limit (SDL) pixels if they become photoelectron counting-capable.

I would also like to say that I fully support advanced R&D into new or revitalized pixel concepts for future devices. Sometimes it is hard to predict what will happen in terms of technology drivers and application pull. I encourage the MIGFET team (even if a team of one) to consider the comments made in this blog as they further advance and market the technology.

CDS is also possible for the 1T pixel. The difference compared to CIS or CCDs is that CDS is done the opposite way round: the first measurement is done when the signal charges are in the sense node and the second measurement is done after the sense node has been emptied of signal charges (reset is done with the source or drain contact). So as a matter of fact the 1T MIG pixel incorporates all the functionalities [source follower, reset, row select, transfer (transfer meaning, in the MIG case, CDS ability)] of a 4T CIS pixel in just one transistor.

ST's results, which are by the way pretty impressive, are published in the article:

Summary: Ring-gate implementation of a single-transistor charge modulation pixel structure obviates the need to employ shallow trench isolation for pixel isolation. This enables achievement of smaller pixel size and/or higher fill factor. It also reduces dark current by minimizing the surface component contribution and the band-to-band tunneling effect. Characteristics of a ring-gate 1.4-um-pitch 50% fill-factor pixel are compared with those of a rectangular-gate 2.2-um-pitch 46% fill-factor pixel, both being in a 0.13-um complementary metal-oxide-semiconductor process. The 1.4-um-pitch ring-gate pixel has an improved conversion gain and a degraded full well capacity (FWC), as can be roughly predicted according to the scaling law. It also shows substantial reductions in dark current, temporal noise, and fixed pattern noise. The resulting signal-to-noise ratio improvement outweighs the degradation of FWC, which allows a larger dynamic range.

Like you pointed out shot noise is certainly a very important factor in providing decent SNR. An advantage of a back illuminated MIG pixel is 100% fill factor which enables more photons to be collected and therefore a reduction in shot noise. Another advantage of the MIG pixel is the ability to maximize the integration time (which may, however, necessitate the use of a double MIG pixel) which also enables more photons to be collected resulting in low shot noise. The third advantage is the low read noise. Since the read noise defines the noise floor a reduction in read noise provides in a sense more photons for the read-out.

Like you also pointed out the shared read-out in 4T pixel is a very efficient way to increase the photon collection efficiency in CIS. The shared read-out increases, however, the sense node capacitance which in turn increases read noise. The increased read noise partially reduces the benefit gained by the shared read-out.

Thanks Artto.

1. 100% fill factor – this is pretty much true for all BSI devices. It is the point of BSI.

2. Maximize integration time. Not sure what you mean, but I don't see a difference between MIG and 4T.

3. Low read noise. As I said, in low light a read noise of less than 5e- rms is sufficient for consumer applications, and most commercial sensors are below this level already.

4. Not sure why you say sense node capacitance increases read noise unless you are talking about some 2nd order effect.

I don't want to be negative about your device. I am just trying to clear up these issues and understand your points more clearly.

Artto, Eric, thank you for the references. I recall looking through them some time ago. However, I'm unable to see any claim there of performance superiority over the traditional 4T sensors. About the only good number in the articles is the 50% fill factor, but somehow it is not translated into superior light sensitivity, not for a B&W sensor, anyway. But it is probably not fair to compare a mature 4T technology with a work in progress on small pixel CMD.

By integration time optimization, I believe Artto means multiple-frame non-destructive readout, storing the frames in memory, and later choosing the start and end points of the integration period "afterwards according to the actual camera/subject movements", like Artto described in #3 of the MIGFET description. Neat idea, if you ask me.

Thanks again for your interest in our technology. Here are my comments on the four points you made.

1) You can naturally have 100% fill factor in BSI 4T CIS. This requires, however, that the pinned photodiode structure is also situated below the p-wells of the transistors and below the floating diffusion and the transfer gate, meaning that there must be an extra p-well also in between the pinned photo diode and the floating diffusion as well as in between the transfer gate and the pinned photo diode. Since this would increase the sense node capacitance (an additional p-well below the floating diffusion) and complicate the structure and the transfer of signal charges (an np or npn structure below the transfer gate), and since micro lenses are also used in BSI 4T CIS, I assume that this is not the case in existing BSI CIS. The fill factor is, however, not decisive since the micro lenses in BSI CIS enable a decent optical fill factor.

2) By this I mean that you can set the integration time afterwards without increasing noise. In 4T CIS the integration time can also be set afterwards, but the price is that noise will increase (either you'll have a lot of surface-generated dark noise and 1/f noise in the image, or you'll have a lot of added-up read noise in the image).

3) I see the situation as follows: in order to have the best possible low light performance in situations where flash doesn't help (e.g. the background of objects to be photographed), you need to maximize the integration time and the pixel collection efficiency (optical fill factor, quantum efficiency etc.), and to minimize the noise. The smaller the noise, the better the image quality.

4) If you use shared read-out the sense node capacitance necessarily increases since the source follower gate is connected to several floating diffusions. The higher the sense node capacitance the poorer the sensitivity and the higher the read noise.

1. Deeper high-energy implanted p-wells do not necessarily increase floating diffusion capacitance. Depending on the energy, p-well might not affect the transfer gate and floating diffusion at all. In BSI pixels the deeper portion of the photodiode is mainly a depletion space which can extend toward the back side.

3. Once the read noise is reduced below a certain level, there is no benefit in its further reduction, at least in consumer sensor architectures. For example, if the average picture signal is 100e, one can hardly discern between read noise of 1e and 0.01e. In practice, 5-6e noise is almost as good as 0.01e. About the only real benefit of noise reduction below 5e is a possible reduction in the number of dark rows and columns to suppress FPN.

4. Yes, this is true. There are ways to alleviate the problem, such as square 4-way sharing to minimize the diffusion area. Also, buried channel source followers can reduce RTS noise. Again, with all these limitations the noise in the recent generations of sensors is already good enough for consumer imaging.

Scaling pixels requires read noise scaling. When we have pixels with just a few electrons of well capacity, they need not be inferior devices: the pixels can be processed (combined) as necessary to yield any larger effective pixel pitch desired. But to make use of small pixels, the read noise must scale – at least to the point that, when recombined, they are not inferior to the effective larger size. If the noise does not scale at least linearly with pixel pitch, the pixel is inferior. Example: a pixel of 2um pitch has 4 electrons read noise. You need your 1um pixel to have lower than 2 electrons read noise with 1/4 the well capacity. If you want to process them back to the old sensor, you'll get the same thing as long as QE remains. There are a lot of other things to consider.
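Anonymous's scaling rule follows from noise adding in quadrature when binning: combining n = (pitch_ref/pitch_new)^2 small pixels multiplies their read noise by sqrt(n), so matching the reference pixel requires the noise to scale linearly with pitch. A minimal check (my sketch, numbers from the example above):

```python
import math

def required_read_noise(ref_pitch_um, ref_noise_e, new_pitch_um):
    # Binning n = (ref/new)^2 pixels adds read noise in quadrature:
    # sqrt(n) * noise_new <= noise_ref, i.e. noise must scale with pitch.
    n = (ref_pitch_um / new_pitch_um) ** 2
    return ref_noise_e / math.sqrt(n)

# 2 um pixel at 4 e- read noise -> a 1 um pixel must stay below:
print(required_read_noise(2.0, 4.0, 1.0))  # -> 2.0 e-
```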

True, if one implements the pixel binning in the digital domain to create a lower resolution picture with the same noise as a bigger pixel would have. However, smarter ways of pixel binning exist, such as binning in the charge domain. This way the read noise is the same for binned and separate pixels, so whoever sacrifices resolution still gets the same noise.
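The difference between the two binning styles is easy to see numerically (again a toy model of mine): digital binning sums n already-read pixels, so their read noises add in quadrature, while charge-domain binning sums the charge first and pays the read noise only once:

```python
import math

def binned_read_noise(per_pixel_noise_e, n_pixels, domain):
    if domain == "digital":
        # n independent reads summed: noise adds in quadrature
        return per_pixel_noise_e * math.sqrt(n_pixels)
    if domain == "charge":
        # charge summed before the single read: one read noise only
        return per_pixel_noise_e
    raise ValueError("unknown binning domain: " + domain)

# 4-way binning of 2 e- pixels:
print(binned_read_noise(2.0, 4, "digital"))  # -> 4.0 e-
print(binned_read_noise(2.0, 4, "charge"))   # -> 2.0 e-
```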

As for the full well degradation for smaller pixels, if we accept this, there is no sense in pixel scaling, as the pixel-level magnified picture becomes unacceptable.

> Saying that something is good enough is a business person's view and those people get burned all the time with this kind of thinking.

OK, so what is the alternative way of thinking? Should we spend time, money and effort to make a consumer sensor with 0.5e of noise? 0.2e? 0.1e? Is there a good-enough noise number somewhere, or should we always fight for even better noise? A very low noise might enable a paradigm switch, such as the digital jot idea, but it should be way below 0.1e for that.

As for the "good enough" list, we can make lists that hold and that do not. And I even know which one would be longer.

"Business person's view" is important because Pixpolar wants to commercialize/license this technology, so of course one must look at it from a business point of view.

"Prior art issues": if someone well known disagrees on prior art and publishes that disagreement in detail, it might cause trouble later for a small company trying to make some business. So, to be nice, better not to post details or arguments, but at least inform them that there is disagreement and danger with respect to prior art.

For me the logic goes in this manner: the pixel size defines the size, the cost and the resolution of the camera (the majority of consumers still buy the camera according to megapixels and not sensor size). In order to have decent image quality in low light with the smaller pixel size, the read noise simply must be scaled down, just like Anonymous pointed out.

Artto, let me explain the point about the "low enough read noise" in consumer imaging:

Most consumers only keep "acceptable quality" pictures. Among other things, "acceptable quality" means good enough SNR. Many consider SNR=10 as an acceptable threshold. The SNR is measured after CCM, compensating for color crosstalk. For small pixels CCM causes SNR degradation by a factor of 1.5, at least. So, for "acceptable quality" the bare sensor should deliver SNR=15 for average pixel at 18% gray.

One of the major noise sources is shot noise. To provide a shot noise SNR of 15, the 18% gray pixel should accumulate ~200e of signal. It means the full well of the pixel should be at the level of 1000e, otherwise whites would be clipped – here comes the minimum full well limitation for the "consumer minimum acceptable quality".

If we have a read noise of 5e, it adds almost nothing to the noise of the average 18% gray pixel, which already has ~15e of shot noise. It starts to dominate at about the 5% gray level, but for the average consumer this difference is hardly visible.
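These ballpark figures are easy to reproduce with the usual model of Poisson shot noise plus read noise added in quadrature (my sketch of the arithmetic above, nothing more):

```python
import math

def snr(signal_e, read_noise_e):
    # shot noise = sqrt(signal); noise sources add in quadrature
    shot = math.sqrt(signal_e)
    return signal_e / math.sqrt(shot ** 2 + read_noise_e ** 2)

# 18% gray at ~200 e-: 5 e- read noise barely matters
print(round(snr(200, 0), 1))  # -> 14.1
print(round(snr(200, 5), 1))  # -> 13.3
# deep shadow at ~10 e-: now 5 e- vs 1 e- read noise is the whole game
print(round(snr(10, 1), 2))   # -> 3.02
print(round(snr(10, 5), 2))   # -> 1.69
```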

All these calculations are ballpark. If we remain within current image processing algorithms, they are about correct. Switching to a different paradigm, like digital jots, changes the logic, but that is a totally different story.

OK, it seems that we have a different philosophy here. My thinking goes in the following way. A consumer can be very happy about a part of an image (high enough SNR), e.g. a face taken in low light with flash. The image background may, however, seriously lack detail due to high read noise. The consumer keeps the image but is annoyed due to the lack of detail in the background. Such a situation is of importance since many times the most emotional moments happen in poorly lit conditions (parties, bars, etc).

Let's assume that the SNR must be one in order to see at least some detail (the human eye can see some detail even below SNR=1) and that the consumer's camera has 5 electrons read noise. Let us assume further that there is another consumer with a similar camera but with 1 electron read noise, and that this other consumer takes a similar picture. When they analyze the pictures together, they find out that the image quality is similar in the foreground but the other image has far more detail in the background. Then a straightforward conclusion drawn by the two consumers is that the camera having lower read noise is of better quality.
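One way to quantify the background difference in this thought experiment: with SNR = s / sqrt(s + r^2), the smallest signal reaching a target SNR solves a simple quadratic. A rough calculation (my own numbers, same noise model as above):

```python
import math

def min_detectable_signal(read_noise_e, target_snr=1.0):
    # Solve s / sqrt(s + r^2) = t  =>  s^2 - t^2*s - t^2*r^2 = 0
    t2 = target_snr ** 2
    return (t2 + math.sqrt(t2 ** 2 + 4 * t2 * read_noise_e ** 2)) / 2

print(round(min_detectable_signal(5.0), 1))  # -> 5.5 e- needed at 5 e- read noise
print(round(min_detectable_signal(1.0), 1))  # -> 1.6 e- needed at 1 e- read noise
```

So the 1-electron camera resolves background features roughly 3-4x dimmer than the 5-electron one, which is exactly the extra background detail described above.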

There are of course also other factors in image quality. However, in camera reviews such comparisons are made in low light, or equivalently between poorly lit image areas taken with high shutter speed, and many times the results of the comparisons are rated as being of high value. Therefore I judge the low read noise indeed as a strong selling point, although the term read noise is not used in marketing but instead good low light performance or high ISO.

Besides the low read noise and the high fill factor, a very important benefit of the MIG sensor is the ability to decrease shot noise by providing more photons in the time domain, i.e. the ability to maximize the integration time. This is especially true for cameras having optical or sensor image stabilizers. Altogether these three aspects can be used for providing more megapixels at a similar price point (i.e. a smaller pixel size). Although one can justly question the sense of ever smaller pixel sizes, the power of marketing should not be underestimated.

Taking your example, let's look at the whole picture. If the dark spot has 1e of signal, what is the average signal at 18% gray? If it is 20e, it means the average signal's SNR is just 4.5 due to shot noise. So the consumer would discard such a picture as unacceptable.

If the 18% gray signal is 100 e-, then a 1 e- signal represents a part of the picture at the level of 0.2% of full scale. This is a very deep shadow, where details are hardly visible anyway, so the average consumer does not care what happens there.

We need to add CCM noise degradation to these calculations, but for simplicity one can assume that the above numbers are after the CCM correction.
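The arithmetic above can be checked directly (a quick sketch; shot noise only, with read noise and CCM ignored as stated):

```python
import math

# Shot-noise-limited SNR of a 20 e- (18% gray) signal.
gray_e = 20.0
print(f"SNR at {gray_e:.0f} e-: {gray_e / math.sqrt(gray_e):.1f}")  # ~4.5

# With 100 e- at 18% gray, full scale is about 100 / 0.18 = 556 e-,
# so a 1 e- signal sits near 0.2% of full scale: a very deep shadow.
full_scale = 100.0 / 0.18
print(f"1 e- is {1.0 / full_scale:.2%} of full scale")
```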

OK, let me highlight the technology a bit further. If the integration time is set afterwards, using non-destructive read-out and a high enough frame rate (and preferably optical or sensor stabilization and/or a motion sensor), the integration time can be chosen for each pixel individually without accumulating read noise, increasing 1/f noise, or integrating interface-generated dark current. In optimal conditions you can then calculate the image from pixel values measured just before each pixel saturates, meaning that (given enough time) you can have the full SNR corresponding to the full well capacity at every gray level in the image; in fact you can have many more real gray levels than the full well capacity of the pixel. This is because an image is a 2D measurement of intensity, i.e. not necessarily a measurement of the number of photons arriving in a fixed period of time. You can also increase the pixel SNR beyond that set by the full well capacity by resetting the pixel during integration.

Due to the above facts the full well capacity is not decisive for image quality when a high enough frame rate is available. The full well capacity of the MIG pixel is in any case not unreasonably small; based on simulations it is around 1000 signal charges per square micrometer when the whole MIG layer is used for charge storage (so around 2000 for a 1.4 um MIG pixel).
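A minimal sketch of how such per-pixel integration could work, assuming the cumulative non-destructive read-outs are available as an array (the function name and interface below are illustrative, not Pixpolar's actual read-out scheme):

```python
import numpy as np

def estimate_intensity(frames, t_frame, full_well):
    """Per-pixel intensity estimate (e-/s) from cumulative, non-destructively
    read frames: each pixel uses its last read-out still below saturation,
    maximizing the integrated signal and hence the shot-noise-limited SNR.

    frames: array of shape (n, h, w), cumulative charge after 1..n frames.
    """
    n = frames.shape[0]
    below = frames < full_well                  # read-outs still usable
    last = np.clip(below.sum(axis=0), 1, n)     # frames used per pixel
    charge = np.take_along_axis(frames, (last - 1)[None], axis=0)[0]
    return charge / (last * t_frame)
```

A bright pixel that would saturate in a few frames is read out just before saturation, while a dim background pixel keeps integrating over all available frames; the shot-noise demo is deterministic here, since noise itself is not modeled.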

In a real situation the integration time (and the number of frames) is naturally limited, for example by subject movement. A typical low light situation is taking a picture of a face using flash. You might get a very good image of the face, but the annoying thing is that the face is surrounded by a pitch-black area. In this case a very long integration time can be used for the background, especially if there are no moving objects in it, provided of course that you have stabilization and/or a sensitive enough motion sensor.

As a matter of fact, a full well of 2000 is quite restrictive when it comes to the full system design. Indeed, the peak SNR, assuming a favorable CCM, is about 30. At the 18% gray level the SNR is about 13. In the corners of the sensor the signal is normally 50% lower due to lens shading, so the SNR drops below 10 even for a perfectly exposed picture. If a camera designer wants to apply a high-pass filter to compensate for the lens MTF, there is already no room for that, even in good light. Add EDoF processing with its noise deterioration, small AE errors and chip-to-chip shading variations, and things become even worse.
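These numbers can be reproduced under the stated assumptions (full well of 2000 e-, a 1.5x CCM noise multiplication as discussed earlier, 18% gray, 50% corner shading; shot noise only):

```python
import math

def shot_snr(signal_e, ccm_gain=1.5):
    """Shot-noise-limited SNR degraded by the CCM noise multiplication."""
    return math.sqrt(signal_e) / ccm_gain

full_well = 2000.0
peak = shot_snr(full_well)                   # ~30 at full well
mid = shot_snr(full_well * 0.18)             # ~13 at 18% gray
corner = shot_snr(full_well * 0.18 * 0.5)    # <10 with 50% lens shading
print(f"peak {peak:.1f}, 18% gray {mid:.1f}, corner {corner:.1f}")
```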

So combining the picture out of a few different exposures sounds like a good idea. Normally this requires good pixel linearity. Do you think the MIGFET pixel is linear enough? I mean, the signal charge seems to be not very well localized; does this matter?

The optical cross-talk in back side illuminated sensors is actually much less severe than in front illuminated sensors since the optical stack (color filter + micro lens) sits directly on top of the silicon; therefore color cross-talk should also be greatly reduced.

The weaker intensity at the edges of the sensor is not a problem in the MIGFET since you can use a longer exposure time there (overexposure is not an issue in the MIG sensor due to non-destructive read-out).

In the demonstrator we have made we have seen excellent linearity (the measurements are still ongoing and we will publish the results later). In the demonstrator, however, the signal charge is localized under the external gate of the MIGFET. If the full well capacity is increased sideways towards the edges of the pixel, you can expect linear behaviour at first, after which the sensitivity starts to decrease slightly due to higher sense node capacitance (the surface area of the sense node increases). This is, however, not a major concern for the following reasons.

First of all, there will be no kink in the output curve since the surface area of the sense node increases gradually. Secondly, the slope of the output curve remains at a reasonable level even close to the full well capacity (the sensitivity / sense node capacitance remains reasonable). Thirdly, drift of the output curve up or down is greatly suppressed due to low 1/f noise. Fourthly, every pixel can be calibrated using a couple of points.

Pixel calibration in a MIG sensor is much more precise than in CIS or CCDs since the output curve can be defined exactly as a function of the amount of charge in the sense node (by cooling the 0.3 electron read noise repetitive read-out DEPFET sensor, dark counts corresponding to individual charges could be measured at the HLL of the Max Planck Institute; in this manner the amount of charge in the sense node was truly counted). The calibration of charge in the sense node versus intensity is also facilitated in a BSI MIGFET since the micro lenses can be omitted due to the 100% fill factor.
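The two-point calibration mentioned above might look like the following sketch (the response numbers are made up for illustration; it simply fits a per-pixel gain and offset from two known charge/output pairs, which is what exact charge counting would enable):

```python
def two_point_calibration(out_lo, out_hi, q_lo, q_hi):
    """Fit a per-pixel linear response from two known (charge, output)
    pairs, so that charge = (output - offset) / gain."""
    gain = (out_hi - out_lo) / (q_hi - q_lo)
    offset = out_lo - gain * q_lo
    return gain, offset

# Hypothetical pixel whose true response is output = 0.4 * charge + 12
gain, offset = two_point_calibration(out_lo=12.4, out_hi=412.0,
                                     q_lo=1.0, q_hi=1000.0)
charge = (212.0 - offset) / gain    # raw output of 212.0 -> ~500 charges
```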

> The optical cross-talk in back side illuminated sensors is actually much less severe than in front illuminated sensors

Actually, the 1.5x CCM noise multiplication already factors in the low crosstalk. A more severe crosstalk in a 1.4um pixel could bring this number to 2 or even 2.5.

> overexposure is not an issue in the MIG sensor due to non-destructive read-out

Does the MIGFET sensor suffer from blooming if overexposed? What happens with the overflow charge if a pixel's potential well is full?

> the slope of the output graph will remain at a reasonable level even close to the full well capacity

When one combines the final image out of 2 or 3 differently exposed frames, two neighboring pixels might come from different frames because one of the neighbors is just below the threshold while the other is above. Looking at all the mathematics used to calculate the final image, the nonlinearity might introduce a big error. Indeed, assume one pixel gets 2x the exposure of the other. The first pixel's value is divided by 2, while its neighbor is taken as-is. This works great if the pixel is linear: double exposure brings double signal, and divided by 2 we get the same signal as the neighbor. But if the pixel is non-linear, double exposure does not necessarily mean double signal, and we start to get strange artifacts in the picture.
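The seam artifact just described can be reproduced with a toy model (all numbers and the compressive response below are hypothetical, not MIGFET measurements): two neighbouring pixels with nearly equal signal land on opposite sides of the merge threshold, and a nonlinearity near full well makes their merged values disagree badly.

```python
def merge(long_out, short_out, threshold, ratio=2.0):
    """Below the threshold trust the long exposure (divided by the
    exposure ratio); above it, take the short exposure as-is."""
    return long_out / ratio if long_out < threshold else short_out

def linear(x):
    return x

def compressive(x):
    """Toy response that compresses by 2x above 1000 counts."""
    return x if x < 1000.0 else 1000.0 + 0.5 * (x - 1000.0)

# Neighbouring pixels with true signals 910 and 915 (short-frame units);
# the long frame gets 2x the exposure.
for response in (linear, compressive):
    a = merge(response(2 * 910.0), response(910.0), threshold=1412.0)
    b = merge(response(2 * 915.0), response(915.0), threshold=1412.0)
    print(f"{response.__name__}: {a:.0f} vs {b:.0f}")
```

With the linear response the merged neighbours agree (910 vs 915); with the compressive one, the pixel taken from the long frame reads 705 while its neighbour taken from the short frame reads 915, a gross seam between pixels that truly differ by only 5 counts.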

You are right, pixel linearity calibration should solve this issue. Is this calibration done once at the sensor design stage and then carried over all the process fluctuations?

The optical cross talk can be reduced to very low levels in back side illuminated sensors if the following measures are taken:

First of all, as much as possible of the front side of the pixels is covered with an opaque layer. In addition, the pixels are separated by narrow trenches which extend to the back side of the sensor. The trenches are lined with a deposited or grown oxide layer and filled with opaque material. In order to keep the dark noise down the substrate must be n-type, meaning that holes must be collected instead of electrons: the positive charge in the oxide forms an accumulation layer of electrons in the n-type silicon at the oxide-silicon interface, which prevents interface dark noise and can withstand high voltages. The n-type substrate is not desirable for CMOS, but the area outside the pixel matrix can be doped p-type. The trenches naturally reduce the quantum efficiency a little, but on the other hand the optical cross talk is reduced. The quantum efficiency reduction can be minimized by using a highly reflective opaque material in the trenches.

Secondly, opaque material is provided between the color filters and on top of the opaque material of the trench. The opaque material between the color filters needs to be at least the same height as the color filter layer. The area of the opaque material should be narrow and the color filters should cover the rest of the back side surface area. In addition, the silicon in the sensor should be essentially fully depleted, i.e. the fill factor of the silicon area separated by the trenches should be maximized, which enables the micro-lenses to be omitted.

The arrangements described above ensure that only a very small part of the reflected or scattered light causes optical cross talk, and that the optical cross talk between neighboring pixels is reduced to a very low level. Besides this, the trenches inhibit overall cross talk also because signal charges are unable to diffuse or drift to neighboring pixels. If micro lenses are used, the optical cross talk remains at a considerable level since the micro lenses collect part of the reflected or scattered light originating from neighboring pixels.

The benefit of the MIG sensor is that the arrangements described above are not only possible but also easy to adopt (due to the fully depleted nature of the MIG pixel).

Blooming is not a problem in the MIG pixel since the blooming current can be collected by either the source or the drain; no additional anti-blooming structure is required, as the anti-blooming mechanism is an inherent property of the MIG pixel.

Another benefit of the MIG pixel is that the calibration can be done after the camera stack/module is finished, if the read noise is low enough and the device is cooled. This is because the amount of dark charge can be determined more or less exactly (non-destructive read-out), meaning that the charge-to-output calibration can be done with high precision. To my knowledge the only way to do the charge-to-output calibration in CIS is by means of light during manufacturing, before the addition of the color filters and the micro lenses. This is problematic since the amount of charge is not known exactly, i.e. the light-to-charge conversion efficiency may vary from pixel to pixel and there is always shot noise present in the estimated amount of light used for the calibration.

Interesting discussion. Yes, I agree that a lot can be done to improve crosstalk in BSI technology. Deep trenches can really help to cut off many sources of crosstalk, and color filter separators should help as well. I have difficulty estimating the extent of improvement, but there certainly should be some. My other concern is the added price and complexity, especially going for high aspect ratio trenches of over 1:20 and filling them with a special material. So far deep trenches have proved difficult to productize. Many companies have tried them but retracted due to various problems, such as yield and excessive dark current.

Even with deep trenches, there are other sources of crosstalk, such as the spectral one, coming from the non-ideality of color filter materials. That is, the blue filter does not completely reject green light, the green filter lets red and blue light in, and so on. Usually thicker filters have lower spectral crosstalk but also reduce QE, so there is a trade-off.

As you have pointed out, microlenses still contribute to the crosstalk if they are used.

As a matter of fact, all these crosstalk reduction techniques are applicable to the regular 4T pixel as well, including the p- and n-type doping inversion. I do not see any advantage of the MIGFET here; am I missing something?

Regarding the blooming, I agree that in a deep-trench-isolated pixel blooming should not be an issue. But in the "simple" STI or channel-stop version, how do you make sure that the whole blooming charge is collected by the drain? Normally this kind of path requires exceptional process control and is very sensitive to process variations and misalignments.

As for the calibration accuracy, I'm unable to understand the MIG advantage. During the calibration procedure a 4T sensor can average out noise, including photon noise, and give the same calibration accuracy as MIG in about the same time. The averaging can be both spatial and temporal; I really see no difference between MIG and 4T. Why does one need to know the absolute charge to calibrate linearity?

That said, 4T pixels do not need calibration in most cases. Partially this is helped by the fact that most 4T designs have a higher than 2Ke full well at the 1.4um node. Some companies are working on different ideas for 4T full well improvements, so that the 1.1um generation would have a sufficiently high full well as well.

You are totally right; the spectral cross-talk is difficult to reduce. The only really working solution would be better color filter materials.

The width of a trench made using 1:20 aspect ratio etching in a 3 um thick back side illuminated SOI pixel would be 150 nm, meaning 75 nm per pixel. The downside is naturally a reduced quantum efficiency, but the upside is reduced cross-talk. Filling the trench is not too difficult if Atomic Layer Epitaxy (ALE, also referred to as Atomic Layer Deposition, ALD) is used. There is naturally a cost factor related to ALE; however, if the trenches are made in the manner described above, the yield should not be compromised. The real problem I see is interface dark current originating from the p-type substrate and oxide interface. In order to avoid this the doping should be reversed, i.e. holes should be collected instead of electrons. Reversing the doping can be done equally well for CCDs, CMOS and MIG sensors.

The micro-lenses can be omitted if all the signal charges are collected by the pinned photo-diode, which requires that the bulk silicon is more or less fully depleted and that a reasonable horizontal electric field exists throughout the pixel, transporting signal charges towards the transfer gate. As I mentioned earlier, one can place the transfer gate and the floating diffusion, like the rest of the transistors, in a p-type well (collection of electrons is assumed here). In addition, the n-type pinned photo diode can be placed under the p-type wells. The p-well surrounding the floating diffusion will, however, unavoidably increase the sense node capacitance and therefore the read noise. Since the p-wells are grounded, an adequate horizontal field may be difficult to provide throughout the pixel. Altogether, due to these difficulties, omitting the micro-lenses is not an easy task. I assume this is also the reason why micro-lenses are still used in present BSI CIS.

Blooming protection in MIG can be provided if the source and/or drain doping is held at the correct potential during integration and during read-out, regardless of what kind of isolation, if any, is used. During read-out the anti-blooming mechanism cannot be avoided, since otherwise the charge could not be read. During integration one can hold the source or drain at the correct potential and anti-blooming is provided. It is important to note that the vertical anti-blooming mechanism is an inherent property of the MIG pixel, i.e. no additional process/mask steps are required, no exceptional process control is necessary, and the vertical anti-blooming mechanism is not sensitive to process variations or misalignments.

First of all, the extra calibration step provided by the MIG sensor facilitates manufacturing since calibration is no longer necessary during manufacturing; instead it can be done afterwards with the finished camera module. If accurate calibration is desired at the very low end of the dynamic range scale, any additional information will aid calibration unless the sensor is perfectly linear. In low light, when there are only a few signal charges to be measured, and if the sensor is not perfectly linear, e.g. due to signal charge trapping, the calibration must be quite extensive in order to average out the photon shot noise and the local variations in the homogeneity of the light source.

Thank you for the answers! I think I have a pretty good understanding of your ideas now. Like any other technology, MIG has many strong and some weak points. In the big market there is certainly a place for many competing approaches.

One minor remark: if the trench width is 150 nm, it's 150 nm/pixel rather than the 75 nm you state.