Sometimes you come across a page with a bunch of images you’d like to
download. Usually it’s just img tags with a src, sometimes it’s Base64-encoded
JPEGs, occasionally it’s JSON or even CSS[1] that’s fetched after the page is
done loading, and other times the URLs are protected to make sure you can only
view the images on this one specific website.

The website in question publishes comics. Individual pages (images) are encoded
in Base64 with garbage characters added at seemingly random positions, so that
you can’t just download and decode them. Instead, you first have to remove
specific chunks of the encoded data. This is done in the browser.

To remove a chunk from a string, you need two numbers: the offset at which to
start and the number of characters to remove. We have neither of those.
Instead, we get a suspicious-looking global variable called nonce[2]. This
variable is a string consisting of digits and lowercase letters. It always
starts with a digit. Let’s look at the following example and how it can be
transformed to the point where it can be used to repair our encoded images:

"3d9931cfbae4fe582cd4e4171762ef9e"

Looks like your typical hash. This string now gets divided into groups, each
consisting of a run of digits followed by a run of letters:

["3d", "9931cfbae", "4fe", "582cd", "4e", "4171762ef", "9e"]

Each group can be split further into its numeric and its alphabetic part.
After counting each group’s letters, an array of pairs of numbers is left:

[[3, 1], [9931, 5], [4, 2], [582, 2], [4, 1], [4171762, 2], [9, 1]]
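
The two steps above can be sketched in a few lines of Python. This is only an
illustration, not the site’s actual code (which runs as JavaScript in the
browser). The name parse_nonce is made up, and the regular expression assumes
the string consists solely of digit runs, each followed by a run of lowercase
letters.

```python
import re


def parse_nonce(nonce):
    """Split the nonce into (number, letter_count) pairs.

    Hypothetical helper: each group is a run of digits followed by a run
    of lowercase letters; only the length of the letter run is kept.
    """
    groups = re.findall(r"([0-9]+)([a-z]+)", nonce)
    return [(int(digits), len(letters)) for digits, letters in groups]


print(parse_nonce("3d9931cfbae4fe582cd4e4171762ef9e"))
# [(3, 1), (9931, 5), (4, 2), (582, 2), (4, 1), (4171762, 2), (9, 1)]
```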

In addition to that, the first number of each pair is interpreted as a uint8,
meaning only its lowest 8 bits are kept (the value modulo 256). That leaves us
with the final result:

[[3, 1], [203, 5], [4, 2], [70, 2], [4, 1], [242, 2], [9, 1]]
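
Reading a number as a uint8 amounts to taking it modulo 256. A minimal sketch,
with wrap_uint8 as a made-up name and the pairs from the example hard-coded:

```python
def wrap_uint8(pairs):
    """Keep only the lowest 8 bits of each offset (hypothetical helper)."""
    return [(offset % 256, count) for offset, count in pairs]


pairs = [(3, 1), (9931, 5), (4, 2), (582, 2), (4, 1), (4171762, 2), (9, 1)]
print(wrap_uint8(pairs))
# [(3, 1), (203, 5), (4, 2), (70, 2), (4, 1), (242, 2), (9, 1)]
```

For instance, 9931 = 38 * 256 + 203, so 9931 becomes 203.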

These pairs are now used to determine which parts of the encoded image should be
removed. The left number corresponds to the absolute position in the encoded
image and the right one to the number of characters that will be removed. The
described method suits this specific task well: the image might be pretty large
and thus require large numbers to define offsets[3], but you don’t want to
drastically increase its size, so only a small number of extra characters is
added.
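
Here is a rough sketch of how the removal could look in Python. It rests on an
assumption the description above doesn’t settle: that the pairs are applied
one after another, with each offset referring to the string as it stands after
the previous removals. The names remove_chunks and decode_image are made up
for illustration.

```python
import base64


def remove_chunks(encoded, pairs):
    """Cut `count` characters out of `encoded` at `offset`, pair by pair.

    Assumption: pairs are applied in order, and each offset is relative
    to the string after the earlier removals.
    """
    for offset, count in pairs:
        encoded = encoded[:offset] + encoded[offset + count:]
    return encoded


def decode_image(encoded, pairs):
    """Strip the garbage characters, then decode the remaining Base64."""
    return base64.b64decode(remove_chunks(encoded, pairs))


print(remove_chunks("abXXcdYef", [(2, 2), (4, 1)]))
# abcdef
```

If the offsets turn out to refer to the original, unmodified string instead,
the pairs would have to be applied from the highest offset down so that earlier
cuts don’t shift later ones.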

I really like the idea of problem-specific encoding of 2D data in alphanumeric
strings, and this is a very creative but limited way to do so.

[1] Instead of requesting JS, this method requests dynamically generated CSS
that replaces an element’s background with the image URI.

[2] The first occurrence is a simple window.nonce = "foobar", later followed
by a bunch of JavaScript operations on window['no' + 'nce'] to make the
process a bit more involved.

[3] The offset could be large, but it seems to be sufficient to limit it to
255 characters.