JPEG COM Marker Processing Vulnerability

This article explains a vulnerability in Netscape browsers that was
present from at least version 3.0 up to Netscape 4.73 and Mozilla M15.
The vulnerability was fixed in Netscape 4.74 and Mozilla M16 (in 2000).
The same vulnerability was introduced into Microsoft products
a couple of years after Netscape's instance was fixed
(and it took another two years, for a total of four years,
for Microsoft's instance to be fixed too).

At this time, information on these vulnerabilities is mostly of educational
value. Also valuable is the
generic heap-based buffer overflow exploitation technique
that was first shown in Openwall's security advisory on the Netscape instance of
the vulnerability.
Since Microsoft's instance of the vulnerability was not yet present (let alone
discovered) when the original advisory was written and published, the advisory,
and hence this article, which is derived from it,
focuses on Netscape only.

Impact

It may be possible, although hard to do reliably in a real-world
attack, for a malicious web site to execute arbitrary machine code
in the context of the web browser. In the case of Netscape Mail or
News, the attack may be performed via a mail message or a news
article as well.

Vulnerability details

JPEG interchange format streams consist of an ordered collection of
markers, parameters, and entropy-coded data segments. Many markers
start marker segments, which consist of the marker followed by a
sequence of related parameters. Marker segments are of a variable
length, with the first parameter being the two-byte length. The
encoded length includes the size of the length parameter itself.
Thus, lengths smaller than 2 are always invalid.
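
The length rule above can be sketched in C as follows. This is only an
illustration, not code from the IJG library or from Netscape: the helper
name segment_payload_length and its return convention are assumptions
made for this example. JPEG stores two-byte parameters most significant
byte first.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Decode the two-byte big-endian length parameter that follows a marker
 * and return the number of parameter bytes after it, or -1 if the
 * encoded length is invalid. The encoded length counts itself, so any
 * value below 2 cannot be valid.
 * Hypothetical helper: name and return convention are illustrative.
 */
static long segment_payload_length(const unsigned char *p)
{
    unsigned int encoded;

    assert(p != NULL);
    encoded = ((unsigned int)p[0] << 8) | p[1];
    if (encoded < 2)
        return -1;              /* 0 and 1 are always invalid */
    return (long)encoded - 2;   /* discount the length field itself */
}
```

A caller would be expected to treat -1 as a corrupt stream and stop
processing the segment.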

Netscape browsers use the Independent JPEG Group's decoder library
for JPEG File Interchange Format (JFIF) files. However, they install
a custom handler for processing the COM (comment) marker that stores
the comment in memory rather than just skipping it as the library would.
Unfortunately, the new handler doesn't check whether the length
field is valid, and subtracts 2 from the encoded length to calculate
the length of the comment itself. It then allocates memory for the
comment (with one additional byte for its NUL termination) and goes
into a loop to read the comment into that memory.

By setting the length field to 1, it is possible to ensure the memory
allocation call will succeed: the requested size (1 - 2 + 1) wraps around
to 0 bytes. As the calculated comment length is declared unsigned, it will
be a huge positive value rather than a small negative one, so the loop
won't terminate until the end of the JPEG stream. It will read the JPEG
stream onto the heap, possibly overwriting other dynamically allocated
buffers of Netscape, as well as structures internal to the malloc(3)
implementation. Exploiting this vulnerability to execute arbitrary code
is non-trivial, but possible on some platforms.

The real problem

Is the problem the lack of error checking in the code? Indeed.
A programmer's fault. Is this a problem because of the choice of a
programming language that doesn't offer overflow checking and with
compilers that traditionally don't offer bounds checking? Partially.

However, let's consider how many different file formats, languages, and
protocols a modern web browser has to support. Have all of the file
parsers been initially implemented with the intent of being robust against
untrusted and possibly malicious data? If not, have all possible
cases been covered with additional checks now? Do we have a reason
to believe there are no bugs in the implementation of any of those, or
do we have reasons to suspect the opposite?

Solutions? There's little an end user can do, but you can take this
as yet another reminder to run the web browser you use to
access untrusted content in a restricted environment, without access
to your most critical data. Unfortunately, this may be difficult
and/or inconvenient to do, although there's
an effort to make it easy and convenient.

Source code patch (proper fix)

The specific JPEG COM marker vulnerability is no longer present in non-ancient
versions of the products (that is, unless it has been re-introduced).

At the time it was relevant, a source code patch could be (against Mozilla M15):

Binary patch (workaround)

Included in the archive accompanying the original Openwall advisory (see below)
is an unofficial binary patch for older versions of Netscape browsers on
some platforms. Source code and a Win32 binary of the patch program
were provided - with absolutely no warranty - for those who could not upgrade
to a fixed version.

This patch prevented the browser from installing its own COM marker
handler. It did not fix the handler itself. The latter would require
extra code and it would result in much larger search patterns that
wouldn't apply to as many versions of Netscape.

Although the binary patch appeared to work reliably in our testing, it could
refuse to apply or it could even apply incorrectly to some versions of Netscape
browsers that we did not test. For this reason, users were asked to check that
the patch had worked as intended by trying to display the demonstration JFIF
file (crash.jpg). The browser, if correctly patched, was supposed to display
an image identical to that in valid.jpg.

Exploiting the vulnerability

The vulnerability lets us overwrite heap locations beyond the end of
the allocated area. We're limited to printable characters, NUL, and LF.
Thus, the ability to exploit this for more than a crash will
depend on locale settings on some platforms.
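
To see what the character restriction means for addresses placed in the
overflow data, here is a small C sketch. It is an approximation built on
assumptions: isprint() in the current locale stands in for the browser's
notion of "printable", and the function names are made up for this
example. Without a call to setlocale(), isprint() uses the "C" locale,
where only bytes 0x20 through 0x7e are printable.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if every byte is NUL, LF, or printable in the current locale. */
static int bytes_encodable(const unsigned char *buf, size_t n)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < n; i++) {
        unsigned char b = buf[i];
        if (b != 0x00 && b != 0x0a && !isprint(b))
            return 0;
    }
    return 1;
}

/* Check a 32-bit address as it would be laid out on a little-endian host. */
static int address_encodable(uint32_t addr)
{
    unsigned char le[4];

    le[0] = (unsigned char)(addr & 0xff);
    le[1] = (unsigned char)((addr >> 8) & 0xff);
    le[2] = (unsigned char)((addr >> 16) & 0xff);
    le[3] = (unsigned char)((addr >> 24) & 0xff);
    return bytes_encodable(le, sizeof le);
}
```

In the "C" locale, a sample address such as 0x41424344 passes, while
0x08048000 does not, because it contains the bytes 0x08 and 0x04 (and
0x80, which is printable only in some 8-bit locales).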

First, we need to decide on what we overwrite. Structures internal
to the dynamic memory implementation are the most promising target:
they're always there and they typically contain pointers.

For the example below, we'll assume Doug Lea's malloc (which is used
by most Linux systems, both libc 5 and glibc) and a locale for an 8-bit
character set (such as most locales that come with glibc, including
en_US, or ru_RU.KOI8-R).

The following fields are kept for every free chunk on the list: size
of the previous chunk (if free), this chunk's size, and pointers to
next and previous chunks. Additionally, bit 0 of the chunk size is
used to indicate whether the previous chunk is in use (LSB of actual
chunk size is always zero due to the structure size and alignment).
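
The fields just listed can be pictured as a C structure. This is a
simplified model for illustration rather than a copy of the allocator's
source: the field names follow the convention used in Doug Lea's malloc,
only the bit 0 flag discussed here is modeled, and the two accessor
functions are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

#define PREV_INUSE ((size_t)0x1)    /* bit 0 of the size field */

/* Simplified header of a free chunk. */
struct chunk {
    size_t prev_size;   /* size of the previous chunk, if it is free */
    size_t size;        /* size of this chunk, with PREV_INUSE in bit 0 */
    struct chunk *fd;   /* next chunk on the free list */
    struct chunk *bk;   /* previous chunk on the free list */
};

/* Actual chunk size with the flag bit masked off. */
static size_t chunk_size(const struct chunk *c)
{
    assert(c != NULL);
    return c->size & ~PREV_INUSE;
}

/* Nonzero if the chunk just before this one is in use. */
static int prev_in_use(const struct chunk *c)
{
    assert(c != NULL);
    return (c->size & PREV_INUSE) != 0;
}
```

For example, a size field of 0x49 describes a 0x48-byte chunk whose
predecessor is in use; clearing bit 0 leaves the size unchanged but
marks the predecessor as free.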

By playing with these fields carefully, it is possible to trick calls
to free(3) into overwriting arbitrary memory locations with our data.
free(3) checks if a chunk adjacent to the one being freed is in use
and, if not, consolidates the two chunks by unlinking the adjacent
chunk from the list. Unlinking a chunk involves setting the previous
chunk's "next" pointer and the next chunk's "previous" pointer, where
both of these chunks are addressed via pointers from the chunk being
unlinked. Thus, in order to get control over these memory writes, we
need to overwrite the two pointers within a chunk (which may be allocated
memory at the time) and preferably reset the PREV_INUSE flag of the
next chunk. This takes 13 bytes on a 32-bit little-endian platform, such
as Linux/x86 (8 bytes for the two pointers, 4 bytes of placeholder for
the previous size field, and 1 byte to reset the flag). In practice,
we would want to repeat the desired 16-byte pattern (of which only 9
bytes matter) at least several times to increase our chances in case
of larger allocated chunks.
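
The unlink step and the two writes it performs can be modeled safely in
C. This is a simplified sketch with no consistency checks, not the
allocator's actual code: the structure is a reduced model, and
unlink_chunk, simulate_unlink_write, and min_overwrite_len are names
invented for this example. The simulation only touches its own local
objects.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced model of a free chunk header. */
struct chunk {
    size_t prev_size;
    size_t size;
    struct chunk *fd;   /* "next" pointer */
    struct chunk *bk;   /* "previous" pointer */
};

/* Remove p from its doubly linked list, trusting its pointers blindly. */
static void unlink_chunk(struct chunk *p)
{
    struct chunk *fd, *bk;

    assert(p != NULL);
    fd = p->fd;
    bk = p->bk;
    fd->bk = bk;    /* write 1: stores bk inside the chunk fd points to */
    bk->fd = fd;    /* write 2: stores fd inside the chunk bk points to */
}

/*
 * 'where' and 'what' stand in for the two locations named by the
 * overwritten pointers. Returns 1 if both writes landed as described.
 */
static int simulate_unlink_write(void)
{
    struct chunk where = {0, 0, NULL, NULL};
    struct chunk what = {0, 0, NULL, NULL};
    struct chunk victim = {0, 0, NULL, NULL};

    victim.fd = &where;     /* chosen so that where.bk is the target */
    victim.bk = &what;      /* value to store; must itself be writable */
    unlink_chunk(&victim);
    return where.bk == &what && what.fd == &where;
}

/* Two pointers, one previous-size word, and one byte for the flag. */
static size_t min_overwrite_len(size_t word_size)
{
    return 2 * word_size + word_size + 1;
}
```

With 4-byte words, min_overwrite_len gives the 13 bytes mentioned above.
The second write in unlink_chunk is why the stored value has to be the
address of writable memory.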

The overwritten pointers each serve as both the address and the data
being stored, which limits our choice of data: it has to be a valid
address as well, and memory at that address should be writable.

Now we need to decide what pointer we want to overwrite (there's not
that much use in overwriting a non-pointer with an address). A good
candidate would be any return address on the stack. That would work,
but would not be very reliable, as the location of a return address depends
on how much other data is on the stack, including program arguments,
and that is generally not known for a remote attack. A better target
would be a function pointer. We don't want to guess exact locations
on the stack and we can't get to the ELF sections on x86 (their addresses
contain the byte 0x08, BS, which isn't a printable character), so we're
effectively limited to pointers within
shared libraries. A nice one we can use is __free_hook, so that a
second call to free(3) will give us control. The debugging hooks
are always compiled in when Doug Lea's code is included in GNU libc.

Our next decision is about where we want the control transferred. We
would definitely prefer to have our "shellcode" within the JFIF file
itself. However, the character set restriction might prevent us from
passing heap addresses. We have to settle on the stack and place our
code in there via other parts of the browser prior to the overflow.
We can use a run of NOPs or equivalent to avoid having to provide
an exact stack location.

A compiler to produce JFIF files implementing the above approach is
included in the accompanying archive.

Please note that this is by no means limited to Linux/x86. It's just
that one platform had to be chosen for the example. So far, this is
known to be exploitable on at least one Win32 installation in a very
similar way (via ntdll!RtlFreeHeap).

References

Additional programs and sample JFIF files mentioned in the advisory were
provided in the accompanying archive in tar.gz and ZIP formats.

Credits and contact information

This vulnerability was found, and the advisory written, by Solar Designer
<solar at openwall.com>. I would like to thank Kevin Murray of Netscape
Communications and many others from both the Mozilla community and
Netscape for their support in handling this vulnerability.