Avoiding Twisted Pixels: Ethical Guidelines for the Appropriate Use and Manipulation of Scientific Digital Images

Abstract

Digital imaging has provided scientists with new opportunities to acquire and manipulate data using techniques that were difficult or impossible to employ in the past. Because digital images are easier to manipulate than film images, new problems have emerged. One growing concern in the scientific community is that digital images are not being handled with sufficient care. The problem is twofold: (1) the very small, yet troubling, number of intentional falsifications that have been identified, and (2) the more common unintentional, inappropriate manipulation of images for publication. Journals and professional societies have begun to address the issue with specific digital imaging guidelines. Unfortunately, the guidelines provided often do not come with instructions to explain their importance. Thus they deal with what should or should not be done, but not the associated ‘why’ that is required for understanding the rules. This article proposes 12 guidelines for scientific digital image manipulation and discusses the technical reasons behind these guidelines. These guidelines can be incorporated into lab meetings and graduate student training in order to provoke discussion and begin to bring an end to the culture of “data beautification”.

Keywords

Digital image; Ethics; Manipulation; Image processing; Microscopy

Originally presented at the University of Alabama at Birmingham/Office of Research Integrity conference entitled “Statistics, Images, and Misconduct”, held September 2006.

Notes

Acknowledgements

This essay began as a brief two-page newsletter article in February of 2001 that was intended primarily for graduate students and staff. As the guidelines have been refined and revised over the last several years, I have benefited greatly from the insight and feedback of colleagues at the University of Arizona, with specific thanks to: Carl Boswell, David Elliott, Patty Jansma, R. Clark Lantz, Claire Payne, Dana Wise, and Jeb Zirato. Additional feedback from John Krueger of the Office of Research Integrity, and Sara Vollmer of the University of Alabama—Birmingham, is appreciated. The author would like to specifically thank Michael W. Davidson and his colleagues at the Molecular Expressions website (Florida State University) for developing the online resources that carefully explain some of the technical concepts referred to in this article. Adobe and Photoshop are registered trademarks of Adobe Systems Incorporated, San Jose, CA. Microsoft, Powerpoint, and Windows are registered trademarks of the Microsoft Corporation, Redmond, WA. Apple and Macintosh are registered trademarks of Apple Computer, Inc., Cupertino, CA. Corel and Photo-Paint are registered trademarks of the Corel Corporation, Ottawa, Ontario, Canada. This work was supported in part by the Southwest Environmental Health Sciences Center (SWEHSC), a National Institute of Environmental Health Sciences (NIEHS) funded center (ES006694). The views, opinions, and conclusions of this essay are not necessarily those of the SWEHSC, the NIEHS, or the University of Arizona.

Glossary

Term

Definition

Aliasing

Because pixels are square and biological structures rarely have straight edges, there are many approximations performed when a digital image is acquired. If an edge falls in the middle of a pixel, the average of the light and dark parts of the edge is reported as the intensity value of the pixel. This creates a pixel with a value that is intermediate between the light and dark intensities in the original (see Fig. 3). Aliasing is the stair-step artifact seen when these intermediate values are not created. Anti-aliasing, sometimes referred to as dithering, occurs when these intermediate pixels smooth out an edge to create an image that better represents the appearance of curved edges and is generally more pleasing to the eye. See: http://micro.magnet.fsu.edu/primer/java/digitalimaging/processing/undersampling/index.html (Retrieved 12/06/2009).

Background subtraction

Microscope optics can be dirty and/or misaligned and CCD image sensors can have unequal sensitivities across the chip (e.g., “hot” or “dead” pixels). By collecting a background image under the same conditions as the specimen image, the background can be subtracted from the specimen image to correct for many of these problems. The use of background subtraction should be acknowledged in the figure legend or the methods section. See: http://micro.magnet.fsu.edu/primer/java/digitalimaging/processing/backgroundsubtraction/index.html (Retrieved 12/06/2009).
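
As a minimal sketch of the subtraction described above, the following Python example works on small lists of pixel values. The function name, the sample images, and the choice to set negative results to 0 are illustrative assumptions, not the behavior of any particular imaging package.

```python
def subtract_background(specimen, background):
    """Subtract a background image from a specimen image, pixel by pixel.

    Both images are lists of rows of non-negative integer intensities and
    must have the same dimensions. Negative results are set to 0 (an
    assumption of this sketch), since a display cannot go below black.
    """
    if len(specimen) != len(background):
        raise ValueError("images must have the same number of rows")
    result = []
    for spec_row, back_row in zip(specimen, background):
        if len(spec_row) != len(back_row):
            raise ValueError("rows must have the same length")
        result.append([max(s - b, 0) for s, b in zip(spec_row, back_row)])
    return result


# Hypothetical 2 x 3 images; the background contains one "hot" pixel (40).
specimen = [[12, 60, 110], [10, 15, 200]]
background = [[10, 40, 10], [10, 10, 10]]
corrected = subtract_background(specimen, background)
print(corrected)
```

Because the same background is removed from every pixel position, the correction only makes sense when the background image was collected under the same conditions as the specimen image.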

Bit depth

Describes the number of grey shades or colors in an image. Most greyscale images are 8 bit (2^8 = 256 shades). Using a higher bit depth, like 16 bit, yields a much higher number of greyscales (2^16 = 65,536). Color is often 24 bit: 8 bits each of red, green and blue (2^24 = 16.7 million).
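
The relationship between bit depth and the number of available shades is a power of two. The short Python sketch below reproduces the arithmetic; the function name is a hypothetical one chosen for illustration.

```python
def shades_for_bit_depth(bits):
    """Return the number of distinct values that `bits` bits can encode."""
    return 2 ** bits


for bits in (8, 16, 24):
    print(f"{bits} bit: {shades_for_bit_depth(bits):,} values")
```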

Black level

The threshold at which a signal will be detected. If the signal for a given pixel is below the threshold, that particular pixel will be displayed as black (a value of 0 in an 8 bit greyscale image). By adjusting the black level, the amount of background electronic noise (and low level signal) in a detection system can be reduced.

CCD

Charge-coupled device—a light-sensitive semi-conductor chip that is used in most scientific digital cameras, as well as in many consumer digital cameras and digital video recorders. See: http://learn.hamamatsu.com/articles/ccdanatomy.html (Retrieved 12/06/2009).

Contrast stretch (Also known as a histogram stretch)

A technique used to improve the contrast in an image without adding any additional data. Involves remapping the brightness of all pixels (so that the brightest intensity in the image is defined as white and the darkest intensity is defined as black) to maximize the use of the available dynamic range in the image. After using this technique, the intensity histogram typically shows gaps where there was once (usually) a continuous range of intensities. The general consensus seems to be that performing this procedure on an image does not need to be reported in the figure legend or the methods section. See: http://micro.magnet.fsu.edu/primer/java/digitalimaging/processing/histogramstretching/index.html (Retrieved 12/06/2009).
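
A minimal sketch of a linear contrast stretch, assuming an 8 bit greyscale image stored as a flat list of integers. The function name and the sample pixel values are hypothetical, and real software may round or clip differently.

```python
def contrast_stretch(pixels, max_value=255):
    """Linearly remap intensities so the darkest pixel becomes 0 and the
    brightest becomes max_value. No new data are created; the existing
    values are only spread across the available range."""
    lo, hi = min(pixels), max(pixels)
    if lo == hi:
        return list(pixels)  # a flat image cannot be stretched
    scale = max_value / (hi - lo)
    return [round((p - lo) * scale) for p in pixels]


# Hypothetical low-contrast image using only six adjacent grey levels.
pixels = [50, 51, 52, 53, 54, 55]
stretched = contrast_stretch(pixels)
print(stretched)

# The six original levels are now spread over the full range, so most of
# the 256 possible levels are unused: these are the histogram gaps.
unused_levels = 256 - len(set(stretched))
print(unused_levels, "grey levels are unused after the stretch")
```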

Dodging and burning

Darkroom techniques in which a small portion of a photographic print is exposed to less or more light (respectively) than the rest of the print. Dodging would be used to reduce the intensity of a selected area. Burning would be used to increase the intensity of a selected area. These techniques were rarely acknowledged in the past; however, performing similar techniques today must be acknowledged in either the figure legend or the methods section.

Intensity histogram

A graph provided in most image processing programs. In an 8 bit greyscale image the X axis displays the greyscale intensity and the Y axis displays the number of pixels at the particular intensity value. For 24 bit color images there are typically three separate intensity histograms, each representing the 8 bit values in the red, green and blue channels.
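
The counting behind such a graph can be sketched in a few lines of Python. This assumes a greyscale image stored as a flat list of integer intensities; the function name and sample values are hypothetical.

```python
def intensity_histogram(pixels, bit_depth=8):
    """Count how many pixels have each possible intensity value.

    The index of the returned list is the intensity (the X axis) and the
    value at that index is the number of pixels (the Y axis).
    """
    counts = [0] * (2 ** bit_depth)
    for p in pixels:
        counts[p] += 1
    return counts


# Hypothetical four-pixel image.
histogram = intensity_histogram([0, 0, 3, 255])
print(histogram[0], histogram[3], histogram[255])
```

For a 24 bit color image, the same function could be applied to the red, green and blue values separately to obtain the three histograms.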

JPEG

An acronym for the Joint Photographic Experts Group. An International Standards Organization (ISO), International Telecommunication Union (ITU) standard for storing bitmapped images in a compressed form using a discrete cosine transform. The JPEG file format uses lossy compression. Users can adjust the degree of compression when the file is saved (Microsoft Corporation 1997).

Interpolation

The estimation of intermediate values between two known values in a sequence (Microsoft Corporation 1997).
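
As an illustration, the sketch below uses linear interpolation, one of several possible methods, to estimate values between neighboring pixels in a single row, as happens when an image is enlarged. The function names and sample row are hypothetical.

```python
def linear_interpolate(v0, v1, t):
    """Estimate a value between v0 and v1; t runs from 0 (v0) to 1 (v1)."""
    return v0 + (v1 - v0) * t


def upsample_row(row, factor):
    """Enlarge a row of pixels by placing factor - 1 estimated values
    between each pair of measured neighbors."""
    if not row:
        return []
    out = []
    for a, b in zip(row, row[1:]):
        for step in range(factor):
            out.append(linear_interpolate(a, b, step / factor))
    out.append(row[-1])
    return out


# Hypothetical row of three measured intensities, enlarged by a factor of 2.
print(upsample_row([0, 100, 50], 2))
```

The inserted values are estimates calculated from their neighbors, not intensities measured from the specimen.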

Loss-less file compression

“The process of compressing a file such that, after being compressed and decompressed, it matches its original format bit for bit. Text, code, and numeric data files must be compressed using a loss-less method; such methods can typically reduce a file to 40 percent of its original size.” (Microsoft Corporation 1997)

Lossy compression

“The process of compressing a file such that some data is lost after the file is compressed and decompressed. Video and sound files often contain more information than is apparent to the viewer or listener; a lossy compression method, which does not preserve that excess information, can reduce such data to as little as 5 percent of its original size.” (Microsoft Corporation 1997)

LZW

Lempel–Ziv–Welch—A loss-less file compression algorithm that makes use of repeating strings of data in its compression of character streams into code streams (Microsoft Corporation 1997).
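
The following is a simplified teaching sketch of the LZW idea, not the exact variant used inside TIFF or any other file format (it emits a list of integer codes rather than a packed bit stream). It shows how repeated strings are replaced by dictionary codes and how the original data are recovered bit for bit.

```python
def lzw_compress(data: bytes) -> list:
    """Replace repeated byte strings with integer dictionary codes."""
    dictionary = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in dictionary:
            current = candidate          # keep extending the match
        else:
            codes.append(dictionary[current])
            dictionary[candidate] = next_code  # learn the new string
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decompress(codes: list) -> bytes:
    """Rebuild the original bytes by regenerating the same dictionary."""
    if not codes:
        return b""
    dictionary = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # The code refers to the string that is being defined right now.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        dictionary[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)


data = b"ABABABABABAB"  # hypothetical data with a repeating pattern
codes = lzw_compress(data)
print(len(data), "bytes in,", len(codes), "codes out")
print(lzw_decompress(codes) == data)
```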

Metadata

Data about data (Microsoft Corporation 1997) that can include information regarding the conditions under which the image data were acquired.

Moiré

Derived from the French, “to water”. A visible wavy distortion or flickering in an image that is displayed or printed with an inappropriate resolution. Several parameters can cause moiré patterns, including the size and resolution of the image, resolution of the output device, and halftone screen angle (Microsoft Corporation 1997). Moiré artifacts can regularly be seen in broadcast television due to the incorrect sampling of clothing with tight repeating patterns.

Raster image

A rectangular array of picture elements (pixels), with each pixel representing a discrete color or greyscale.

Resolution

The ability to discern two adjacent objects as distinct and separate objects, as defined by the Rayleigh criterion. In microscopy, resolution can be calculated based on a number of optical factors, but it is most strongly influenced by the wavelength of light used and the numerical aperture of the objective lens. Not to be confused with printer or monitor resolution, which is typically given in dots per inch (dpi). See: http://micro.magnet.fsu.edu/primer/java/imageformation/rayleighdisks/index.html (Retrieved 12/06/2009).
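
The dependence on wavelength and numerical aperture can be made concrete with the commonly quoted lateral form of the Rayleigh criterion, d = 0.61 × wavelength / NA. The glossary does not state this formula, so the sketch below is offered as an assumption, and the wavelength and aperture values are hypothetical examples.

```python
def rayleigh_resolution_nm(wavelength_nm, numerical_aperture):
    """Smallest resolvable separation, in nanometres, using the commonly
    quoted lateral form of the Rayleigh criterion: 0.61 * wavelength / NA."""
    if numerical_aperture <= 0:
        raise ValueError("numerical aperture must be positive")
    return 0.61 * wavelength_nm / numerical_aperture


# Hypothetical example: green light through a high-aperture objective.
d = rayleigh_resolution_nm(520, 1.4)
print(f"{d:.1f} nm")
```

Shorter wavelengths and higher numerical apertures both give a smaller value, which means finer detail can be distinguished.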

Sampling

The process of turning an analog signal into its digital representation. Sampling refers to the frequency of data points used to represent a continuous analog signal. The Nyquist/Shannon criterion states that analog signals should be sampled at a rate of at least twice the highest frequency in the signal (Pawley 2006). As an example, music CDs are created by sampling an analog signal at 44,100 Hz, which is slightly more than twice the highest frequency that humans can hear (about 20,000 Hz), thus satisfying the Nyquist/Shannon criterion. Oversampling refers to acquiring samples in excess of the criterion; undersampling does not meet the criterion.
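
The criterion, and what goes wrong when it is not met, can be sketched as follows. The function names are hypothetical, and the folding formula used for the apparent frequency is the standard one for an ideal sampler, applied here as an assumption since the glossary does not give it.

```python
def minimum_sampling_rate(highest_frequency_hz):
    """Lowest sampling rate that meets the criterion for a given signal."""
    return 2 * highest_frequency_hz


def apparent_frequency(signal_hz, sampling_rate_hz):
    """Frequency that a pure tone appears to have after ideal sampling.

    When the criterion is met, the tone keeps its true frequency. When the
    signal is undersampled, it is folded to a lower, false frequency.
    """
    nearest_multiple = round(signal_hz / sampling_rate_hz) * sampling_rate_hz
    return abs(signal_hz - nearest_multiple)


print(minimum_sampling_rate(20_000))       # rate needed for a 20 kHz limit
print(apparent_frequency(15_000, 44_100))  # criterion met
print(apparent_frequency(15_000, 20_000))  # undersampled
```

In the undersampled case the recorded data contain a frequency that was never present in the original signal, which is why undersampling artifacts cannot be corrected after acquisition.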

Sub-resolution point object

An object that is smaller than the diffraction-limited resolution of a microscope. In fluorescence microscopy, this is often a fluorescent bead of a size <0.2 μm.

TIFF (also Tiff or Tif)

Tagged image file format. A raster or bitmap image file format that incorporates embedded tags to include selected metadata. This format was originally developed by the Aldus Corporation, which was subsequently acquired by the Adobe Corporation. This is the only image file format that is recommended by the Microscopy Society of America (MacKenzie et al. 2006). See: http://en.wikipedia.org/wiki/TIFF and http://partners.adobe.com/public/developer/tiff/index.html (Retrieved 12/06/2009).

Truncate

To cut off the beginning or end of a series of characters or numbers (Microsoft Corporation 1997). In this context, the term is used to refer to pixel data that are beyond the dynamic range displayed in the image and as such the intensity value of these specific data have been truncated to either the brightest or darkest values possible in the image.
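
A minimal sketch of how truncation arises, assuming an 8 bit greyscale image whose brightness is raised by a fixed amount. The function name, the pixel values, and the size of the adjustment are hypothetical.

```python
def truncate_to_range(pixels, black=0, white=255):
    """Force every value into the displayable range. Values beyond the
    range are set to the nearest limit."""
    return [min(max(p, black), white) for p in pixels]


original = [10, 180, 200, 250]            # four distinct intensities
brightened = [p + 100 for p in original]  # some values now exceed 255
displayed = truncate_to_range(brightened)
print(displayed)
```

Three pixels that started with different intensities end up with the same value, so the differences between them are no longer present in the adjusted image.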

White-level balancing

Digital cameras are not equally sensitive to the three main colors of light (red, green, blue). To compensate for the differences in sensitivity and the different colors of illumination sources used, software can be used to adjust the balance of the colors so that the whites in the image are correctly displayed and, by extension, all the other colors as well. The use of white-level balancing should be acknowledged in the figure legend or the methods section, particularly if the balance was set automatically or differently for different images. See: http://micro.magnet.fsu.edu/primer/java/digitalimaging/processing/whitebalance/index.html (Retrieved 12/06/2009).
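
One simple way to perform such an adjustment is to scale each color channel so that a region known to be white comes out neutral. The sketch below assumes 8 bit RGB pixels stored as tuples; the function name, the per-channel gain approach, and the sample values are illustrative assumptions rather than the method used by any particular camera or software package.

```python
def white_balance(pixels, white_reference, max_value=255):
    """Scale the red, green and blue channels so that the reference pixel,
    which is known to be white in the specimen, has equal channel values."""
    if any(c <= 0 for c in white_reference):
        raise ValueError("reference channels must be positive")
    target = max(white_reference)
    gains = [target / c for c in white_reference]
    return [
        tuple(min(round(v * g), max_value) for v, g in zip(px, gains))
        for px in pixels
    ]


# Hypothetical reference: a white area recorded with a green cast.
reference = (200, 250, 225)
image = [(200, 250, 225), (100, 100, 90)]
print(white_balance(image, reference))
```

Because the same gains are applied to every pixel, all colors in the image shift, which is why a balance that was set differently for different images makes those images harder to compare.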

Committee on Ensuring the Utility and Integrity of Research Data in a Digital Age, National Academy of Sciences. (2009). Ensuring the integrity, accessibility, and stewardship of research data in the digital age. Washington, DC: National Academies Press.