Tutorial: Hidden Pixels

Graphical applications do not always display every pixel in a picture. This is common for images that use transparency (an alpha channel), and with JPEG cropping. Although the hidden pixels are not displayed when the picture is rendered, they still contain color values.

Different applications use different approaches for managing colors in undisplayed areas. This impacts the colors stored in the hidden pixels. These different approaches can be identified and mapped to different types of applications.

Alpha Channels

When pictures are rendered on the computer screen, colors are created by combining red, green, and blue components. These three color planes are called channels; the red color channel stores all of the red intensities, the green channel stores green intensities, and blue stores blue intensities. This is typically written as "RGB".

Pictures are not limited to three color channels. A fourth channel, called alpha (denoted by the letter "A"), can be added to define a transparency level. The resulting RGBA image defines the visible colors and the degree of transparency.

With most image formats, the alpha channel is either boolean or a value from 0% to 100% (or 0 to 255 in byte values). With the scaled value, 0% (black) denotes a location in the image where there should be complete transparency -- the RGB value is ignored and the background is completely visible. An alpha channel value of 100% (white) means that the RGB value is completely opaque (non-transparent). Intermediate values determine how much of the background should be blended with the pixel's RGB value.
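The blending described above can be sketched as a per-channel weighted average. This is only a minimal illustration, assuming 8-bit channels and straight (non-premultiplied) alpha; the function name `blend_pixel` and the sample colors are hypothetical, and real renderers may also apply gamma correction.

```python
def blend_pixel(rgb, alpha, background):
    """Blend one RGB pixel over a background color.

    rgb, background: (r, g, b) tuples with values 0-255.
    alpha: 0 (fully transparent) to 255 (fully opaque).
    """
    a = alpha / 255.0
    return tuple(round(a * fg + (1.0 - a) * bg) for fg, bg in zip(rgb, background))


red = (255, 0, 0)
light_blue = (173, 216, 230)  # hypothetical page background

print(blend_pixel(red, 0, light_blue))    # alpha 0%: only the background shows
print(blend_pixel(red, 255, light_blue))  # alpha 100%: only the pixel shows
print(blend_pixel(red, 128, light_blue))  # intermediate: a mix of both
```

Note that at alpha 0 the stored RGB value has no effect on the output, which is why any color can sit unseen in a fully transparent pixel.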

Gradient alpha values are typically used to blend an overlay picture onto another image or into a web page. For example, the FotoForensics banner at the top of this page uses a transparent background to blend the logo seamlessly with the web page. If the web page's background changes color, then the image does not need to be recreated in order to blend onto the page.
Gradient alpha channels are supported by PNG files, as well as WebP, ICO, ICN, and other bitmap file formats.

Boolean alpha channels only use 0% and 100% opacity values. (A pixel is either transparent or it isn't.) While this does not permit smooth blending along borders, it does permit transparency.
GIF only supports a boolean alpha channel.

The following sample images demonstrate the use of a gradient alpha channel.

The RGB image. This picture has no alpha channel and is completely opaque.

This is an alpha channel. By itself, it identifies the relative intensities of the pixels.

This picture combines the RGB image with the alpha channel, forming an RGBA picture. The portions of the alpha channel are blended with the background. In this case, the background is a light blue color.

This is the same RGBA image on top of a checkerboard background. (The background pattern changes as you mouse over it.) The color stripes blend into the checkerboard background based on the intensity of the alpha channel.

This RGBA image uses a boolean alpha channel. Any alpha value not "100%" is rendered as 0%. The stripes seen in the original RGB image are still present in this picture. However, the alpha channel has hidden parts of the stripes.

In each of these sample RGBA pictures, the full set of stripes from the RGB image is present. Removing the alpha channel reveals the original RGB image.
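As a rough sketch of the two operations shown in the samples, the following treats a picture as a plain list of (r, g, b, a) tuples. The representation and the function names are assumptions made for illustration; an image library would normally supply the decoded pixels.

```python
def to_boolean_alpha(pixels):
    """Render any alpha below 255 as fully transparent (alpha 0)."""
    return [(r, g, b, 255 if a == 255 else 0) for (r, g, b, a) in pixels]


def strip_alpha(pixels):
    """Drop the alpha channel, exposing the stored RGB values."""
    return [(r, g, b) for (r, g, b, _a) in pixels]


# One row of an RGBA image: opaque, half-transparent, fully transparent.
row = [(200, 0, 0, 255), (0, 200, 0, 128), (0, 0, 200, 0)]

print(to_boolean_alpha(row))
print(strip_alpha(row))
```

In both cases the RGB values are untouched; only the alpha channel changes or is discarded.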

Alpha Signatures

When the alpha channel identifies a pixel that is fully or partially opaque, the RGB value for the pixel remains important for rendering the image. However, if the pixel is completely transparent, then the stored RGB value becomes unimportant. Removing the alpha channel permits viewing the hidden pixel colors.
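One way to look at the hidden colors is to tally the RGB values stored under fully transparent pixels. The sketch below assumes the picture has already been decoded into (r, g, b, a) tuples; `hidden_pixel_colors` is a hypothetical helper, not part of any particular tool.

```python
from collections import Counter


def hidden_pixel_colors(pixels):
    """Count the RGB values stored under fully transparent pixels (alpha == 0)."""
    return Counter((r, g, b) for (r, g, b, a) in pixels if a == 0)


pixels = [
    (10, 20, 30, 255),   # visible
    (0, 0, 0, 0),        # hidden, black
    (0, 0, 0, 0),        # hidden, black
    (255, 255, 255, 0),  # hidden, white
    (90, 90, 90, 128),   # partially transparent, still rendered
]

counts = hidden_pixel_colors(pixels)
print(sum(counts.values()))   # number of hidden pixels
print(counts.most_common(1))  # most frequent hidden color
```

Partially transparent pixels are deliberately excluded because their RGB values still contribute to the rendered picture.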

The colors assigned to hidden pixels depend on the application and the operation that was performed. For example:

Each operation below is described first for Gimp and then for Adobe Photoshop.

Background: The default RGB color for a transparent background.

Gimp: By default, Gimp uses a black background. However, the default background contents depend on the Gimp version.

Gimp 2.6.9 and earlier: If the picture was opened as RGB and a transparent layer was then added, the default background pattern will match the RGB image at the time the transparency was added.

Gimp 2.6.10 and later: Saving a PNG displays a menu that includes an option to "Save background color". This option sets the hidden pixels to whatever background color is selected at the time of the save. This can result in backgrounds that are neither black nor white.

Adobe Photoshop: The default background color is white.

Unchanged: Loading an RGBA image and saving it without any additional alterations.

Gimp: The RGBA values remain unchanged.

Adobe Photoshop: Adobe may apply a color profile, causing all RGB values to change a little. However, the alpha channel and the hidden RGB values will effectively remain unaltered.

Erasing: Using an eraser tool to remove color will set the alpha channel values to 0%. If 'feathering' is used, then the area between the opaque and transparent portions of the image will be a gradient (values between 1% and 99%).

Gimp: With Gimp 2.6.9 and earlier, the RGB values remain unchanged. Only the alpha channel is altered.

With Gimp 2.6.10 and later, the RGB values are set to black.

Adobe Photoshop: The RGB colors masked by the alpha channel will be set to white.

Drawing Lines: Drawing curves and lines onto an RGBA image.

Gimp: Only the visible RGB values are set. The masked RGB values will remain unchanged.

Without the alpha channel, smooth curves will still appear as smooth curves. If the edges blend into the background, then the alpha channel will follow the smooth curves.

Adobe Photoshop: The visible RGB values are set. The masked RGB values will either remain unchanged or the selected area will be filled with black.

Without the alpha channel, lines will appear rough and jaggy. Adobe uses the alpha channel to smooth curves and blend edges into the background.

Fills and Selected Areas: Fills into selected regions in an RGBA image.

Gimp: Only the visible RGB values are set. The masked RGB values will remain unchanged.

Adobe Photoshop: The visible RGB values are set. However, transparent areas outside of the selected region will have altered RGB values that match the fill pattern. This usually appears as horizontal or vertical colors stretched from the last visible pixel color. If a filled pattern is used, then the pattern is repeated outside of the visible area.

Selection Masks: Graphic editors that support layers often permit using one layer as a mask for another layer. Portions outside of the mask are hidden by the alpha channel.

Gimp: Only the visible RGB values are set. The masked RGB values will remain unchanged.

Adobe Photoshop: The visible RGB values are set. The transparent areas outside of the selected mask contain the unmasked picture.

Text: Text characters can be typed onto a picture.

Gimp: Only the visible RGB values are set. The masked RGB values will remain unchanged.

Adobe Photoshop: The visible RGB values are set. The selected area containing the text is filled with the text color. The transparent bounding box containing the text will have the same RGB color as the text.

Specifically: Adobe fills a region containing the text with a color, and then uses the alpha channel to mask out non-text pixels. (Similar to how a stencil is used to paint a sign.)

If the entire text is recolored, then the bounding box is filled with the new color.

If a portion of the text is selected and recolored, then the bounding box is filled with black and each part of the text is filled with the appropriate text color. The text color is filled slightly larger than the actual text and the alpha channel is used to deselect the fine details in the visual portion.

Rendering Layers: Advanced graphic editors support drawing layers that are combined to form the final output. How they are combined may impact the hidden pixel content.

Gimp: Because the hidden RGB values are usually black, the layer order does not impact the hidden RGB values. The only way to determine the order of layers or drawing actions is to look for overlapped content.

Adobe Photoshop: Adobe appears to combine RGB values from transparent areas based on the layer ordering. Lower layers are added first and upper layers overwrite the transparency from lower layers. In computer graphics, this overwriting approach is called a "painter's algorithm" and permits identifying the layer ordering. Specifically: if the hidden pixel coloring matches a component in the picture, then that component is on an upper layer. And if hidden pixel coloring from one component overwrites the hidden pixel coloring from another component, then the overwrite comes from the upper layer.

Adobe Photoshop permits using layers as masks for other layers. This can result in a partial merger of different hidden region content.

Other Operations: This is not a complete list of operations that common graphic editors perform. For example, most graphics programs can cut-and-paste selected regions, convert between bitmap and vector graphics, and perform skewing or scaling operations.

Gimp: In general, Gimp only alters the visible pixels. Hidden areas remain unaltered. Smooth curves that blend into the background will still appear as smooth curves.

Adobe Photoshop: In general, Adobe either bleeds colors into the transparent regions, or fills hidden regions with black or white. Curves that blend into the background rely on the alpha channel for shaping and smoothing; removing the alpha channel will reveal rough, jaggy edges.
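The painter's-algorithm reasoning used for layer ordering can be illustrated with a toy model. This is not how Adobe implements rendering; it only assumes that each layer leaves a hidden fill over some region and that later layers overwrite earlier ones. The function names, layer names, and regions are all hypothetical.

```python
def paint_layers(layers):
    """Paint layers bottom-to-top; later layers overwrite earlier ones.

    layers: list of (name, set of (x, y) positions), lowest layer first.
    Returns a dict mapping each position to the layer that painted it last.
    """
    canvas = {}
    for name, region in layers:
        for pos in region:
            canvas[pos] = name
    return canvas


def is_above(canvas, layers, upper, lower):
    """True if `upper` owns at least one position that `lower` also covers."""
    regions = dict(layers)
    overlap = regions[upper] & regions[lower]
    return any(canvas[pos] == upper for pos in overlap)


layers = [
    ("lens", {(0, 0), (1, 0), (2, 0), (3, 0)}),
    ("text", {(2, 0), (3, 0), (4, 0), (5, 0)}),
    ("Lab", {(8, 0)}),
    ("Demo", {(9, 0)}),
]
canvas = paint_layers(layers)

print(is_above(canvas, layers, "text", "lens"))  # overlap is owned by "text"
print(is_above(canvas, layers, "Lab", "Demo"))   # no overlap, so no evidence
```

When two regions never overlap, the hidden pixels give no evidence about their relative order. In a real analysis the regions are judged by eye from the revealed hidden pixels rather than known in advance.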

There are many other types of drawing applications that support PNGs and transparencies. Each application has different handling methods. For example:

Software

Hidden Pixel Handling

Facebook and Instagram

Sets new content to black, but erasing on photos only changes the alpha channel and not the RGB values. For example, erasing corners from a photo to round the picture will retain the corner RGB values.

This program is commonly used to resize pictures, strip metadata, and convert image formats.

If the picture is altered, then all hidden pixels are converted to black. If the picture is unaltered, then there are no changes to the hidden pixel colors.

Microsoft Office

All hidden pixels are colored gray (written #818181 or [129,129,129]). As with Adobe, text and edges without the alpha channel appear jaggy; the alpha channel provides smooth anti-aliasing.

Paint.NET

Sets hidden pixels to white. The visible areas do not appear jaggy without the alpha channel.

It is very common for people to use Paint.NET when editing a PNG that was generated by another application, or to resave a PNG without edits. When Paint.NET saves a PNG from another application, the unaltered hidden areas retain their RGB values. Paint.NET sets all new hidden pixels from edits to white, but unaltered hidden pixels are unchanged.

Inkscape

Sets hidden pixels to white. In general, this appears similar to Paint.NET, except that edges appear jaggy without the alpha channel.

As with Paint.NET, Inkscape is commonly used to edit PNGs that were generated by other applications. Inkscape sets all new hidden pixels from edits to white, but unaltered hidden pixels are unchanged.

While it may not be possible to distinguish a PNG created with Gimp from Facebook or PicMonkey, this method can distinguish PNGs generated by Adobe or Microsoft products from other products. And in a few situations, it can distinguish "possibly Gimp" from non-Gimp applications.
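As a loose illustration of how these observations might be used, the lookup below maps a dominant hidden color to the applications this tutorial associates with it. It is a deliberately crude sketch: the table and the name `candidate_sources` are assumptions for illustration, a single color never identifies one program on its own, and other signs (jaggy edges, stretched fills) still have to be checked by eye.

```python
# Dominant hidden color -> applications associated with it in this tutorial.
# Other applications may produce the same colors.
CANDIDATES = {
    (0, 0, 0): ["Gimp", "Facebook and Instagram"],
    (255, 255, 255): ["Adobe Photoshop", "Paint.NET", "Inkscape"],
    (129, 129, 129): ["Microsoft Office"],
}


def candidate_sources(dominant_hidden_rgb):
    """Return the applications consistent with the dominant hidden color."""
    return CANDIDATES.get(tuple(dominant_hidden_rgb), [])


print(candidate_sources((129, 129, 129)))
print(candidate_sources((0, 0, 0)))
print(candidate_sources((12, 34, 56)))  # unlisted color: no candidates
```

An empty result does not mean the picture is unusual; it only means the color is not one of the defaults covered here, for example because hidden pixels were retained from an earlier editor.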

When using Hidden Pixels with a PNG image, the analyzer will display the number of hidden pixels on the left side, under the list of analyzers.

Caveats

Different applications handle transparencies differently.

This tutorial has only covered a few applications (Gimp, Adobe Photoshop, Facebook, etc.). Other applications may use similar approaches to these, or may use other approaches for managing hidden RGB values.

Combining drawing tools, such as using Gimp to create a PNG and Adobe to edit the PNG, can result in a mix of hidden RGB signatures.

Many PNG editors retain metadata. Even if the metadata says that the picture was edited by PicMonkey or Gimp or Adobe Photoshop, it may still have been edited or last-saved by a different program (e.g., Paint.NET or ImageMagick). If the metadata is inconsistent with the hidden pixel attributes, then the picture was likely edited and/or resaved by a second application.

Viewing pixels hidden by an alpha channel only permits identifying one aspect of the picture. The interpretation of results may be inconclusive. It is important to validate findings with other analysis techniques and algorithms.

JPEG Padding

The JPEG algorithm segments the picture into independent 8x8 grids.
But what happens when the rendered JPEG dimensions are not grid aligned? For example, what if the picture's dimensions are 203x250? (Neither 203 nor 250 is divisible by 8.) With JPEG encoding, the actual encoded picture is padded out to the next 8x8 boundary and the pixels outside the visible area are omitted during rendering. If the picture is 203x250, then the actual encoded image will be 208x256 pixels and the remaining 5 pixels along the right edge and 6 pixels along the bottom are omitted.
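The padding arithmetic can be checked with a few lines of code. This is a minimal sketch that assumes padding to an 8x8 boundary, as described above; the function names are hypothetical.

```python
def padded_size(width, height, block=8):
    """Round each dimension up to the next multiple of `block`."""
    padded_w = -(-width // block) * block   # ceiling division
    padded_h = -(-height // block) * block
    return padded_w, padded_h


def padding(width, height, block=8):
    """Hidden columns on the right and hidden rows along the bottom."""
    padded_w, padded_h = padded_size(width, height, block)
    return padded_w - width, padded_h - height


def hidden_pixel_count(width, height, block=8):
    """Total number of encoded pixels that are never displayed."""
    padded_w, padded_h = padded_size(width, height, block)
    return padded_w * padded_h - width * height


print(padded_size(203, 250))  # (208, 256)
print(padding(203, 250))      # (5, 6)
print(padding(640, 480))      # (0, 0): already grid aligned
```

A picture whose dimensions are already multiples of 8 has no padding at all, so there is nothing hidden to examine.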

What about 16x16?

With JPEG encoding, the luminance (grayscale intensity) is always encoded using an 8x8 grid. However, the chrominance (color attributes) may be encoded from 8x8, 16x16, 16x8, or 8x16 pixel grids. With a 16x16 grid, there are four 8x8 luminance components and one set of chrominance components. The chrominance components are stretched over a 16x16 region. (This stretching does result in a color distortion. However, the human eye is not that sensitive to color and is unlikely to notice the lower quality distortion.)

Along the bottom and right edges of the picture, the JPEG data stream only encodes luminance components for visible elements. If only 3 of the 16 columns on the right edge are used (green), then the stream encodes the chrominance, along with only the two left-most luminance tables. JPEG pads out to the 8x8 boundary (white). The right-most luminance tables (gray) are not stored since they have no visible components.

Hidden Edge Pixels

Any type of JPEG encoding that requires recomputing with the JPEG quantization tables requires a method for padding incomplete JPEG grids. Moreover, because all values in a JPEG grid impact all other values in the same grid, simply padding with "black" will likely distort the outer edge by introducing a high-contrast edge between the visible and hidden pixels.

Different JPEG encoding systems use different approaches for padding the hidden JPEG grid. These include:

Method

Description

Lossless Cropping

If a large picture is simply being cropped to a smaller size, then there is no reason to decode and re-encode the quantized data. Any grids outside of the cropped region can be dropped, and any partial grids are stored without any alteration. This operation is called a lossless crop because it does not require recomputing the lossy JPEG encoding.

With lossless cropping, expanding the image to the 8x8 boundary will reveal original pixels from the source image that were padded out to the 8x8 boundary.

Digital Cameras

Nearly all digital cameras completely fill out the JPEG grid; there are usually no hidden pixels along the edges. This is done for two reasons. First, it maximizes the size of the picture being generated. (Camera manufacturers can then boast about the size of the image.) And second, there are no unnecessary computations and no wasted space.

A few digital cameras, including models by Huawei, Kodak, Nokia, and Sony HDR, can generate pictures that are not grid-aligned. This is typically done to match an exact aspect ratio. These cameras typically fill the remaining hidden pixels with content from the camera sensor that is not normally visible. Expanding the image to show the hidden padding will reveal additional original pixels from the camera's sensor.

However, some cameras are atypical. For example, Android 5.x devices (e.g., the Samsung SM-N910P smartphone) fill the padding with a grayscale stretch. (It looks like "stretched colors", but it is grayscaled.) Android 4.2.x fills the padding with a random slice from the image (it appears to be an uninitialized, reused buffer), and the GoPro Hero4 pads with random data (an uninitialized buffer).

Stretched Colors (Common Libraries)

The JPEG Standard (section A.2.4) specifies extending the last visible pixel color through the padded area. This minimizes distortions along the visible edge during decoding.

For example, if the JPEG uses 8x8 grids and the image only fills out 3 columns in the right-most grid, then the remaining 5 columns will be padded with stripes that replicate the previous visible values.

This stretched pattern indicates a common JPEG encoding library, such as libjpeg, gd-jpeg, or the Intel JPEG Library (IJL). Gimp, Facebook, and most non-Adobe products use this encoding approach. Adobe products older than CS, such as Photoshop 7.0, also use this approach.

Reversed Pattern (Adobe CS or later)

Beginning with the CS product line, Adobe reverses the visible pattern, appearing as a "butterfly" or rapid zig-zag pattern.
This padding approach attempts to soften ringing distortions from replicated patterns.

If the initial colors are "123xxxxx" (where 'x' needs padding), then it will be filled with "12332112" or "12332123". The pattern reverses itself and repeats.

This reversing pattern indicates an Adobe product (CS or later). It is present even when Adobe saves using the JPEG Standard quantization tables.

Steganography

While extremely rare (even among steganography users), some steganographic systems attempt to store data in these hidden pixel regions. This can result in random-looking patterns in the hidden padding.

Encryption

Encryption is different from steganography. Steganography tries to avoid being detected. Encryption tries to prevent comprehension. These are independent concepts; you can have encryption without steganography, and steganography without encryption.

Some encryption systems store data in the JPEG image. This can result in a picture that looks like random noise. While most of these applications use a standard JPEG library for encoding the picture (resulting in stretched padding), a few tools appear to use custom encoding libraries. We have seen a few libraries that pad using solid black, solid white, or faint gray. We have not yet identified these encryption tools.

When using Hidden Pixels with a JPEG image, the analyzer will display the padding dimensions on the left side, under the list of analyzers.
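The stretched and reversed padding patterns described earlier can be sketched for a single row of an 8-pixel-wide grid. These functions are illustrations only, not the code used by libjpeg or Adobe. The reversed version reproduces the "12332112" variant by mirroring the visible values back and forth; the "12332123" variant is not modeled here.

```python
def pad_stretched(visible, block=8):
    """Repeat the last visible value through the padded area."""
    return list(visible) + [visible[-1]] * (block - len(visible))


def pad_reversed(visible, block=8):
    """Mirror the visible values back and forth through the padded area."""
    n = len(visible)
    out = []
    for i in range(block):
        j = i % (2 * n)
        out.append(visible[j] if j < n else visible[2 * n - 1 - j])
    return out


print(pad_stretched([1, 2, 3]))  # [1, 2, 3, 3, 3, 3, 3, 3]
print(pad_reversed([1, 2, 3]))   # [1, 2, 3, 3, 2, 1, 1, 2]
```

Both functions assume between 1 and `block` visible values. With a single visible value, or a uniform edge, the two methods produce identical output.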

Combining Analysis Tools

The hidden JPEG padding pattern can be combined with the JPEG % to identify the type of JPEG library used to last-encode the image. For example:

A reversed pattern with standard quantization tables indicates an Adobe CS product that saved using the standard quantization tables.

Displayed Regions

JPEG padding forces the image to fit into the JPEG 8x8 grid. When present, padding only occurs along the bottom and right edges. FotoForensics highlights the padded region when displaying the hidden JPEG padding.

FotoForensics mutes all elements in the picture that are not part of the padded edge region.

A white line is drawn just outside of the padded 8x8 region, so that it is easy to distinguish from the rest of the picture.

Within the padded 8x8 edge are 1-7 visible pixels that are partially muted, and the remaining padded pixels are presented in full brightness.

This highlighting permits an analyst to readily identify the padded region and determine whether the padding is stretched (from a standard JPEG library), reversed (Adobe CS or later), lossless (cropped), or randomized (steganographic).
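The visual judgment between stretched and reversed padding can be approximated in code for one row (or column) of an edge grid. This is a rough sketch rather than the method FotoForensics uses: `classify_padding` and its `tolerance` parameter are hypothetical, and because JPEG is lossy, real decoded values rarely match exactly, so some tolerance is usually needed.

```python
def classify_padding(visible, padded, tolerance=0):
    """Guess the padding style for one row (or column) of an edge grid.

    visible: intensities of the visible pixels, in order toward the edge.
    padded:  intensities of the hidden pixels that follow them.
    """
    def close(a, b):
        return abs(a - b) <= tolerance

    last = visible[-1]
    stretched = all(close(p, last) for p in padded)
    # Compare only as far as the visible values can be mirrored.
    mirror = list(reversed(visible))[:len(padded)]
    mirrored = all(close(p, m) for p, m in zip(padded, mirror))

    if stretched and mirrored:
        return "ambiguous"
    if stretched:
        return "stretched"
    if mirrored:
        return "reversed"
    return "other"


print(classify_padding([1, 2, 3], [3, 3, 3, 3, 3]))  # stretched
print(classify_padding([1, 2, 3], [3, 2, 1, 1, 2]))  # reversed
print(classify_padding([1, 2, 3], [3]))              # ambiguous
```

An "other" result covers everything else, such as original pixels retained by a lossless crop or random-looking data. A single padded pixel or a uniform edge returns "ambiguous" because stretching and reversing produce the same values.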

Caveats

Because the visible JPEG grid is limited to a maximum of 8x8 pixels, the most pixels that any JPEG will have hidden is 7 pixels off the right edge, 7 pixels off the bottom, and all but the top-left pixel in the bottom-right corner's 8x8 grid. (In the most-padded scenario, only one pixel in the bottom-right corner is used.)

While the best-case for analysis can have 7 pixels padded along the bottom and right edges, the average case has fewer hidden pixels. Fewer than three pixels may not be enough to visualize additional information.

When there are only a few pixels hidden, such as 1 or 2 pixels of padding beyond the visible edge, it can become virtually impossible to distinguish stretched padding from reversed padding.

A single line (1 pixel wide) of hidden pixels may not contain enough information to identify any specific kind of encoding.

An edge with uniform coloring (e.g., a solid white edge), may not contain enough information to identify any specific kind of encoding. This is because the results from stretching or reversing a solid color will appear identical.

Adobe CS, CS2, CS3, etc. normally use the reversed pattern padding method. However, if a JPEG has only 1 pixel of padding, then there are no values to alternate. 1 pixel of padding will look stretched, even with the reversed pattern padding system.

Most pictures from digital cameras are aligned to the JPEG grid; there are no padded pixels. This is because camera manufacturers want to reduce waste and claim larger image sizes. However, this isn't always the case. For example, the Samsung SCH-I545 will force a 16:9 aspect ratio, even if that means encoding some visible pixels as if they were padding. And some cameras, like the Sony C6603, are off-by-one.

Viewing hidden pixels along the cropped JPEG edge only permits identifying one aspect of the picture. The interpretation of results may be inconclusive. It is important to validate findings with other analysis techniques and algorithms.

Sample Images

The following examples demonstrate how to evaluate pictures that include hidden pixels.

Sample PNG with Transparency

As an example, the FotoForensics banner was created using Adobe Photoshop CS5. It contains an alpha channel, a drawing (the camera lens), text, and a picture of a fingerprint.

A sample banner image. The transparency permits the logo to blend into the dark blue background.

The alpha channel.

The RGB image with the alpha channel inverted: what was hidden is now visible, and what was visible is now muted (with a checkerboard background). This shows the pixels that were hidden by the alpha channel.

Without the alpha channel, we can see:

The circular logo (drawing) stretches out to fill the square area that was originally selected for creating the lens. This indicates a drawing that was created with an Adobe product since only Adobe products stretch out colors to fill the hidden regions.

Different hidden pixel areas appear to overwrite other hidden pixel areas. This means that the picture was created in layers. Each region denotes a different layer.

The main text is gray and slightly larger than the visible text size. The portions of the text colors hidden by the transparency layer are rough and jaggy. This is consistent with an Adobe product because Adobe uses the alpha channel to help shape the visible text.

The bounding box containing the gray text is black. This indicates that a portion of the text was selectively recolored. However, it doesn't identify which part of the gray text was recolored. (Although not determined from this image, the "sic" was recolored prior to being blended into the fingerprint.)

The fingerprint picture is visible outside of the masked area. It is at a higher layer than the text because it overwrites the black text background. However, the fingerprint appears merged with the text because the extreme left and right of the fingerprint does not overwrite the black band. (Adobe Photoshop permits complex layer effects and layer masking combinations.)

The lens is at a lower layer than the text. This is determined because the small, black, triangular gap between "Fo" and "toF" overwrote the stretched portion from the lens.

The red "Lab" and blue "Demo" backgrounds overwrite the lens and gray text regions. This indicates that Lab and Demo are at higher layers than the gray text and lens.

The identified layers, from bottom to top, are: the lens, gray text, fingerprint effect over "sic", "Lab", and "Demo". However, the fingerprint, Lab, and Demo do not overlap, so we cannot determine which of these three elements is at the top layer.

The remaining transparent region, outside of the text box, contains white RGB values. This is consistent with an Adobe application.

This particular picture does not contain any metadata that identifies how it was created. However, based on these observations, the hidden artifacts seen in this PNG are consistent with an Adobe application. Moreover, they are inconsistent with Gimp, PicMonkey, Google Drawings, and other applications.
For example, had this picture been created with Gimp, the hidden pixels should be completely black with the exception of the fingerprint image -- which could be hidden by the alpha layer.

Even without metadata or other analysis tools, the content of the transparent regions clearly identifies that this picture was created with an Adobe product.

JPEG Padding

The following digitally altered pictures contain hidden padding.

Over on Reddit, a user posted a picture titled "Introducing, hurdlers without hurdles." The picture had been uploaded to Imgur, which strips out metadata. This means that the metadata does not identify the last program to edit the image.

The source picture is 500x333. This means that the encoded image in the JPEG is 504x336 (since both 504 and 336 are multiples of 8). The image contains 4 pixels of hidden padding on the right, and 3 pixels of hidden padding along the bottom.

This shows the hidden padding.

The main picture has been muted and given a checkered background. This way, you can still see the content but it won't be confused with the padded area.

There is a solid white line showing the last edge of the full (unpadded) JPEG grid. The remaining right and bottom edges show the final row and column of JPEG grids that include some padded pixels.

Along the bottom are 8 rows of pixels: 5 come from the picture (unpadded) and 3 come from the padding. Similarly, the right edge has 4 columns of unpadded and 4 columns of padded pixels. The unpadded pixels have been recolored to appear a little grayish; this permits visually distinguishing the unpadded from the padded pixels.

The padded pixels appear to extend the last unpadded colors. This stretched padding method is consistent with the JPEG Standard and common JPEG libraries. Although we don't know what application last encoded this picture, we can rule out an Adobe CS product.

This ad image, originally from Walmart, has been significantly altered. Sites that point out "photoshop disasters" criticized this picture for having items digitally placed in the room. Some items, such as the plant, rug, and crib lack reflections on the floor -- they are more obvious than some of the other edits. (Variations of this picture are still used to sell items on Amazon.)

The picture is 500x500. This means the encoded image is 504x504; it contains 4 pixels of padding along the right and bottom edges.

Although the edges only have 4 pixels of padding, there is a very visible reversed padding pattern. This is most apparent in the upper-right corner, next to the white lights. The unpadded portion of the 8x8 grid tapers downward, along the curve of the white light. However, the padded portion reverses the pattern and tapers upward. This reversed pattern creates a distinct "butterfly" or "zigzag" appearance.

Similarly, the bottom center appears to show a circular shape in the bluish background. This is because the visible pixels began to spread out and the reversed padding closes it back -- forming a circle.

The other areas of the padding also follow the reversed pattern. However, the muted colors do not have a distinct edge, so the reversed pattern is more difficult to see.

The padding with a reversed pattern is very distinct to an Adobe CS product (CS, CS2, CS3, etc.). Although the metadata indicates that the picture was last saved with an Adobe product, it does not identify any additional information. The hidden padding corroborates the metadata findings. Moreover, the hidden padding identifies it as an Adobe CS product and not a pre-CS product.