Abusing JSONP with Rosetta Flash

In this blog post I present Rosetta Flash, a tool that converts any SWF file into one composed only of alphanumeric characters. Such a file can be used to abuse JSONP endpoints: the victim is made to perform arbitrary requests to the domain hosting the vulnerable endpoint, and potentially sensitive data, not limited to JSONP responses, is exfiltrated to an attacker-controlled site. This is a CSRF that bypasses the Same Origin Policy.

High-profile Google domains (accounts.google.com, www., books., maps., etc.) and YouTube were vulnerable and have recently been fixed. Twitter, LinkedIn, Yahoo!, eBay, Mail.ru, Flickr, Baidu, Instagram, Tumblr and Olark still have vulnerable JSONP endpoints at the time of writing this blog post (but Adobe pushed a fix in the latest Flash Player, see the section Mitigations and fix).

Update: Kudos to Twitter Security for being so responsive, engaged and interested over the weekend. They have fixed this on their end too. They did admit, though, that I ruined their weekend :P

This is a well-known issue in the infosec community, but so far no public tool for generating arbitrary ASCII-only or, even better, alphanumeric-only valid SWF files has been presented. This led website owners and even big players in the industry to postpone any mitigation until a credible proof of concept was provided.

Paper and slides

The attack scenario

To better understand the attack scenario it is important to take into account the combination of three factors:

With Flash, a SWF file can perform cookie-carrying GET and POST requests to the domain that hosts it, with no crossdomain.xml check. This is why allowing users to upload a SWF file on a sensitive domain is dangerous: by uploading a carefully crafted SWF, an attacker can make the victim perform requests that have side effects and exfiltrate sensitive data to an external, attacker-controlled, domain.

JSONP, by design, allows an attacker to control the first bytes of the output of an endpoint by specifying the callback parameter in the request URL. Since most JSONP callbacks restrict the allowed charset to [a-zA-Z], _ and ., my tool focuses on this very restrictive charset, but it is general enough to work with different user-specified allowed charsets.

SWF files can be embedded on an attacker-controlled domain using a Content-Type forcing <object> tag, and will be executed as Flash as long as the content looks like a valid Flash file.

Rosetta Flash leverages zlib, Huffman encoding and ADLER32 checksum bruteforcing to convert any SWF file to another one composed of only alphanumeric characters, so that it can be passed as a JSONP callback and then reflected by the endpoint, effectively hosting the Flash file on the vulnerable domain.
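To make the second factor concrete, here is a minimal sketch of a naive JSONP endpoint. The function name `naive_jsonp`, its charset filter and the callback string are assumptions made for illustration, and the callback is only an alphanumeric prefix starting with the `CWS` signature of a zlib-compressed SWF, not a working file. It shows that a callback passing a typical charset filter still decides the first bytes of the response.

```python
import re

# Assumed filter: many endpoints only allow identifier-like callbacks.
CALLBACK_RE = re.compile(r"[a-zA-Z0-9_.]+")


def naive_jsonp(callback: str, payload_json: str) -> bytes:
    """Hypothetical endpoint: reflects the callback at the very start of the body."""
    if not CALLBACK_RE.fullmatch(callback):
        raise ValueError("invalid JSONP callback")
    return f"{callback}({payload_json})".encode("utf-8")


# Illustrative prefix only: "CWS" is the signature of a zlib-compressed SWF.
crafted_callback = "CWSMIKI0hCD0"
body = naive_jsonp(crafted_callback, "{}")
print(body[:3])  # b'CWS'
```

The filter does its job against script injection, yet the response still begins with whatever alphanumeric bytes the caller chose, which is all the attack needs.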

A bit more on Rosetta Flash

Rosetta Flash takes an ordinary binary SWF as input and returns an equivalent one, compressed with zlib, that is composed of alphanumeric characters only.

Rosetta Flash uses ad-hoc Huffman encoders in order to map non-allowed bytes to allowed ones. Naturally, since we are mapping a wider charset to a more restrictive one, this is not a real compression, but an inflation: we are effectively using Huffman as a Rosetta stone.

ADLER32 checksum bruteforcing

As the SWF header format shows, the checksum is the trailing part of the zlib stream embedded in the compressed output SWF, so it also needs to be alphanumeric. Rosetta Flash appends bytes in a clever way so that the ADLER32 checksum of the original uncompressed SWF is made of just [a-zA-Z0-9_\.] characters.

An ADLER32 checksum is composed of two 2-byte (16-bit) rolling sums, S1 and S2, concatenated into a 4-byte value:

[Figure: ADLER32 checksum.]
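As a reference for the two sums, here is a minimal sketch of ADLER32 in plain Python, checked against the standard library's `zlib.adler32`. The helper names `adler32_parts` and `adler32` are illustrative, not part of Rosetta Flash.

```python
import zlib

MOD_ADLER = 65521  # largest prime below 2**16


def adler32_parts(data: bytes) -> tuple:
    """Return (S1, S2): S1 is 1 plus the sum of all bytes, S2 the sum of every S1."""
    s1, s2 = 1, 0
    for byte in data:
        s1 = (s1 + byte) % MOD_ADLER
        s2 = (s2 + s1) % MOD_ADLER
    return s1, s2


def adler32(data: bytes) -> int:
    """Concatenate the sums: S2 in the high 16 bits, S1 in the low 16 bits."""
    s1, s2 = adler32_parts(data)
    return (s2 << 16) | s1


print(hex(adler32(b"Wikipedia")))                          # 0x11e60398
print(adler32(b"Wikipedia") == zlib.adler32(b"Wikipedia"))  # True
```

In a zlib stream the checksum is stored big-endian, so the four trailing bytes are the two bytes of S2 followed by the two bytes of S1.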

For our purposes, both S1 and S2 must have a byte representation that is allowed (i.e., all alphanumeric). The question is: how do we find an allowed checksum by manipulating the original uncompressed SWF? Luckily, the SWF file format allows arbitrary bytes to be appended at the end of the original SWF file: they are ignored. This is gold for us.

But what is a clever way to append bytes? I call my approach the Sleds + Deltas technique:

[Figure: ADLER32 checksum manipulation.]

Basically, we keep adding a sled of high bytes (fe, because ff doesn't play so nicely with the Huffman part we'll roll out later) until a single additional byte can make S1 overflow the modulus and land on the minimum allowed byte representation; we then add that byte, the delta.

Now we have a valid S1, and we want to keep it fixed. So we add a sled of NULL bytes, each of which leaves S1 unchanged while adding S1 to S2, until S2 overflows the modulus and also becomes valid.
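The two phases can be sketched as follows. This is a simplified illustration, not Rosetta Flash's actual code: the function name `pad_for_allowed_checksum` is hypothetical, and the sketch accepts the first allowed value of S1 reachable with one delta byte rather than specifically targeting the minimum allowed representation.

```python
import string
import zlib

MOD_ADLER = 65521
# Charset from the post: [a-zA-Z0-9_.]
ALLOWED = frozenset((string.ascii_letters + string.digits + "_.").encode())


def half_is_allowed(value: int) -> bool:
    """True if both bytes of a 16-bit sum fall in the allowed charset."""
    return (value >> 8) in ALLOWED and (value & 0xFF) in ALLOWED


def pad_for_allowed_checksum(swf: bytes) -> bytes:
    """Append a 0xfe sled, one delta byte and a NULL sled (simplified sketch)."""
    checksum = zlib.adler32(swf)
    s1, s2 = checksum & 0xFFFF, checksum >> 16
    padded = bytearray(swf)

    def append(byte: int) -> None:
        nonlocal s1, s2
        s1 = (s1 + byte) % MOD_ADLER
        s2 = (s2 + s1) % MOD_ADLER
        padded.append(byte)

    # Phase 1: grow the 0xfe sled until one delta byte makes S1 allowed.
    # Deltas stay below 0xff, mirroring the post's remark about ff.
    while True:
        delta = next(
            (d for d in range(0xFF) if half_is_allowed((s1 + d) % MOD_ADLER)),
            None,
        )
        if delta is not None:
            append(delta)
            break
        append(0xFE)

    # Phase 2: NULL bytes keep S1 fixed and add S1 to S2 on every step.
    while not half_is_allowed(s2):
        append(0x00)
    return bytes(padded)


sample = b"FWS" + bytes(range(256))  # stand-in bytes, not a real SWF
padded = pad_for_allowed_checksum(sample)
print(zlib.adler32(padded).to_bytes(4, "big"))
```

The second loop always terminates: 65521 is prime and an allowed S1 is never zero, so repeatedly adding S1 walks S2 through every residue before repeating.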

Huffman magic

Once we have an uncompressed SWF with an alphanumeric checksum and a valid alphanumeric zlib header, it's time to create dynamic Huffman codes that translate everything to [a-zA-Z0-9_\.] characters. This is currently done with a pretty raw but effective approach that has to be optimized in order to work effectively for larger files. The twist: the representation of the Huffman tables, which is embedded in the file, also has to satisfy the same charset constraints.

[Figure: DEFLATE block format.]
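The "valid alphanumeric zlib header" requirement can be explored with a short sketch. It relies only on the zlib header rules from RFC 1950 (compression method 8, window size field at most 7, no preset dictionary, and the two bytes read as a big-endian number divisible by 31). The function name `allowed_zlib_headers` is illustrative, and this is not necessarily how Rosetta Flash selects its header.

```python
import string
import zlib

# Charset from the post: [a-zA-Z0-9_.]
ALLOWED = frozenset((string.ascii_letters + string.digits + "_.").encode())


def allowed_zlib_headers(allowed=ALLOWED) -> list:
    """List every two-byte zlib header (CMF, FLG) made only of allowed bytes."""
    headers = []
    for cmf in sorted(allowed):
        # CMF: low nibble is the method (8 = deflate), high nibble is CINFO (<= 7).
        if cmf & 0x0F != 8 or cmf >> 4 > 7:
            continue
        for flg in sorted(allowed):
            if flg & 0x20:  # FDICT set would require a preset dictionary ID
                continue
            if (cmf * 256 + flg) % 31 == 0:  # FCHECK constraint
                headers.append(bytes([cmf, flg]))
    return headers


print(allowed_zlib_headers())  # [b'8O', b'HK', b'XG', b'hC']

# Sanity check: zlib accepts a stream whose header is swapped for one of them.
stream = b"hC" + zlib.compress(b"hello")[2:]
print(zlib.decompress(stream))  # b'hello'
```

Only a handful of headers survive the charset restriction, so fixing the header is cheap compared with the Huffman and checksum work.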

We use two different hand-crafted Huffman encoders that make little effort to be efficient and instead focus on byte alignment and offsets, so that bytes fall into the allowed charset. To reduce the inevitable inflation in size, repeat codes (code 16, mapped to 00) are used to produce shorter output that is still alphanumeric.

This universal proof of concept accepts two parameters passed as FlashVars:

url: the URL, in the same domain as the vulnerable endpoint, to which a GET request carrying the victim's cookie is performed.

exfiltrate: the attacker-controlled URL to which an x variable containing the exfiltrated data is sent via POST.

Mitigations and fix

Mitigations by Adobe

Because of the sensitivity of this vulnerability, I first disclosed it internally at Google, and then privately to Adobe PSIRT. A few days before releasing the code and publishing this blog post, I also notified Twitter, eBay, Tumblr and Instagram.

Adobe confirmed they pushed a tentative fix in Flash Player 14 beta codename Lombard (version 14.0.0.125, release notes) and finalized the fix in today's release (version 14.0.0.145, released on July 8, 2014).

Mitigations by website owners

First of all, it is important to avoid using JSONP on sensitive domains, and if possible use a dedicated sandbox domain.

A mitigation is to make endpoints return the HTTP header Content-Disposition: attachment; filename=f.txt, forcing a file download. This is enough to instruct Flash Player not to run the SWF, starting from Adobe Flash 10.2.

To be also protected from content sniffing attacks, prepend the reflected callback with /**/. This is exactly what Google, Facebook and GitHub are currently doing.

Furthermore, to hinder this attack vector in most modern browsers you can also return the HTTP header X-Content-Type-Options: nosniff. With this header set, if the JSONP endpoint returns a Content-Type which is not application/x-shockwave-flash (usually application/javascript or application/json), Flash Player will refuse to execute the SWF.
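Put together, the three website-side mitigations might look like the following framework-agnostic sketch. The function name `jsonp_response` and the callback filter are assumptions for illustration; adapt the header and body handling to whatever web framework you use.

```python
import re

# Assumed callback filter; many endpoints already restrict the charset like this.
CALLBACK_RE = re.compile(r"[a-zA-Z0-9_.]+")


def jsonp_response(callback: str, payload_json: str) -> tuple:
    """Build (headers, body) for a JSONP reply with the mitigations applied."""
    if not CALLBACK_RE.fullmatch(callback):
        raise ValueError("invalid JSONP callback")
    headers = {
        "Content-Type": "application/javascript; charset=utf-8",
        # Forces a download, so Flash Player (10.2+) will not run the response.
        "Content-Disposition": "attachment; filename=f.txt",
        # Tells the browser not to sniff the content type.
        "X-Content-Type-Options": "nosniff",
    }
    # The leading comment means the body can never start with attacker bytes.
    body = f"/**/{callback}({payload_json});".encode("utf-8")
    return headers, body


headers, body = jsonp_response("handleData", '{"ok": true}')
print(body)  # b'/**/handleData({"ok": true});'
```

Because the response now starts with `/**/`, a reflected callback can no longer supply the `CWS` signature at offset zero, whatever charset the callback filter allows.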

Update - Acknowledgment and reception

Thanks to Gábor Molnár, who worked on ascii-zip, the source of inspiration for the Huffman part of Rosetta. I learned, talking with him in private, that we had worked independently on the same problem. He privately came up with a single instance of an ASCII SWF approximately one month before I finished the whole of Rosetta Flash internally at Google in May, and reported it to HackerOne only. Rosetta Flash is a full-featured tool with universal, weaponised PoCs that converts arbitrary SWF files to ASCII thanks to automatic ADLER32 checksum bruteforcing.

The famous web development framework Ruby on Rails addressed this vulnerability in this pull request by prepending a comment to JSONP responses. More than 600,000 websites will soon be automatically protected, and this proves once again that going public with advisories is good for protecting the end user.