Making Drupal User File Uploads Safe(r)

Do you let users upload files to your Drupal site? You know that "user" is a synonym for attacker, right?

To keep your Drupal site secure you need to remember and account for the fact that any string or file upload that comes from a user can be an attack. Even if you don't let users register without administrator approval, a site user may have re-used their password or otherwise have their account compromised and then used to attack the site.

I can't exhaustively cover the topic of file upload security in one post, but I will try to give you a couple of pointers on things you might not have thought about, and how you can configure your Drupal site to be safer.

Two things I’m going to cover are cross-site scripting (XSS) risks from uploaded files, and a more novel attack termed a "cross-domain data hijacking attack." This latter kind of attack is basically a mix of cross-site scripting and cross-site request forgery.

I am not going to cover other risks like denial of service (DoS) attacks. There are lots of things that attackers, a.k.a. users, could do as a DoS attack, such as filling up the disk of your server with large files; or they could use image files and cause a heavy load by making your server resize or manipulate images over and over again. So, DoS can be a problem. But the attacks I’m going to talk about are more serious because they might allow someone to take over your site, as opposed to merely taking your site offline.

Let's take a step back, then, and mention the browser security model. The browser security model basically can be boiled down to the same origin policy. What that means is that if there’s some JavaScript being served by a website – so it’s coming from the website's domain – then that JavaScript can do everything on that website that you can do. It can, basically, make a request to the site, use your session cookies, and have full access as if it’s you taking the action. This can be intentional (like an ajax request) or very dangerous if the script was put there by an attacker - which is XSS.

This is why XSS is so dangerous. If there’s JavaScript loaded on your site and that JavaScript wants to do something like promote the attacker’s user account to administrator, add some new content, and delete all of your content, all of those things are possible. People sometimes think that an alert or pop-up is all that XSS means. That’s really just the test or probe we use to show the vulnerability. A real XSS attack can take over your site, and the JavaScript can do anything that you can do while you’re logged in.

This browser security policy can get broken or bypassed, which leads to the attacks I want to talk about. The first one is MIME-type sniffing as a possible XSS attack vector. I will then touch on cookies and how having cookies shared to different subdomains can be a problem that can bypass other protections you’re trying to implement. Then I’ll cover the cross-domain attack.

A safe-looking file extension parsed as HTML + JavaScript.

The screen shot above shows what a MIME-type sniffing cross-site scripting attack looks like. You can see the file name at the top: the file is named foo2.test. It’s a .test extension. So what does that mean? Drupal uses .test for simple test files. So maybe on your website you let people upload a new simple test case to share. Why not?

What we see is that I, as the attacker, have uploaded this .test file and when the user went to visit it the browser actually popped up a window. As I said, this is a test for XSS and it's printing the accessible cookies. It’s not a dangerous attack itself, but demonstrates the vulnerability.

<html><body><script> alert(document.cookie);</script></body></html>

That’s what the content of the file looks like. It’s a very brief HTML file with a script, and the script executed.

No content-type, browser "guessed."

I could have made that script go in, elevate someone’s privileges, delete all of the content in your site, anything I wanted. What happened? If we look in the response headers in the next screen shot, you can see that there’s no content type in the response headers. So the browser guessed. The browser said, "You didn’t tell me what kind of content this is so I’m going to sniff it. I’m going to start reading the content and I saw that nice HTML tag you had and some other tags that looked like HTML. And so I decided that I’m just going to render that as HTML. I’m going to run the JavaScript and I’m going to let that script take over your site, okay?" Sounds good, right?

As with most bad things on the web, we can probably blame this on Internet Explorer. Internet Explorer had a long-time behavior where it would basically try to sniff the content that was coming to it because most people back in the old days didn’t configure their servers to send the content type correctly. Internet Explorer took the attitude, "Well, of course you want me to guess and give you something that looks right rather than giving you a broken page." Chrome, following Internet Explorer’s lead and trying to be backwards compatible, implemented the same behavior. The nice thing is that you can actually just tell the browsers to turn this off. There’s a header you can send, "X-Content-Type-Options: nosniff", where nosniff is the only valid value.

Defused by setting nosniff.

You see in the third screen shot that if the site is changed to send that header, and the user requests that same file, it actually comes through as plain text. We told the browser, "You’re not allowed to sniff this content. If you can’t tell what it is, just show it as plain text." This defused the attack. What’s even better is that this is actually handled for you now in Drupal 7 and 8 as long as you’re using the core .htaccess file. This got into Drupal 7 only last October, in Drupal 7.40. Hopefully, you all have updated well past 7.40 now.

If you needed to manually merge your .htaccess files (some people have a lot of modifications), go back and check that you got the new rule that sets this header so that your site is protected. I mentioned that IE and other browsers guess at content to make up for content not being served correctly. If you see a problem with images not rendering after you enable this header, you may need to enable something like mod_mime so it correctly maps the content types into the response headers. If you’re not using Apache or you’re not allowing .htaccess files, you can still configure the web server to send this header with all of your pages and you’re done. You’re protected from this content-sniffing attack. So that’s the first danger described and an easy way to protect your site.
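For reference, the rule to look for is roughly the following. Treat it as a sketch of the mod_headers directive as it appears in recent Drupal 7 core .htaccess files, and compare it against your own copy. It assumes mod_headers is enabled; without that module the block is silently skipped.

```apache
<IfModule mod_headers.c>
  # Tell browsers not to guess (sniff) the content type of any response.
  Header always set X-Content-Type-Options nosniff
</IfModule>
```

After adding it, request an uploaded file and confirm in the response headers that X-Content-Type-Options is actually being sent.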

The second attack, the cross-domain data hijacking attack, starts with an attacker uploading a Flash file to your site. Now, on your site it’s not going to be executed because you’re not embedding the Flash in your site. But the nice attacker who uploaded this file is going to go to their own site and use an object tag. They’re going to embed this upload on your site into their site. Then maybe they’ll invite you to come and look at their site, and then what happens? This Flash now runs. Because the Flash comes from your site the browser says, “Well, it comes from your site, it must be okay. I can let this Flash do anything it wants to your site.”

That Flash can basically run everything the JavaScript can run, or it can talk to JavaScript. Now it can copy out private files. It can copy out all the form tokens and let the attacker submit all the forms on your site. It basically boils down to the same complete control of your site, so that’s pretty bad. Again, we don’t want this to happen.

Let’s think about how we can prevent this. The most bulletproof way to prevent this is really user-unfriendly: it’s called “content disposition.” We set a content disposition header and basically that’s going to force the browser to download each file. You basically won’t render images or PDFs or anything else in the browser.

The snippet of Apache server configuration, below, shows how you might do this for PDF files. The browser is forced to download the file every time. Users really hate that. Even though that’s the most bulletproof solution, we’re not going to be able to do that for most sites.
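A minimal sketch of such a configuration, assuming mod_headers is enabled and that only files ending in .pdf (in any letter case) should be affected:

```apache
<IfModule mod_headers.c>
  <FilesMatch "\.(?i:pdf)$">
    # Serve PDFs as a generic binary type so the browser will not render them.
    ForceType application/octet-stream
    # Ask the browser to save the file rather than display it inline.
    Header set Content-Disposition attachment
  </FilesMatch>
</IfModule>
```

To cover more file types you would widen the pattern, for example to "\.(?i:pdf|doc|zip)$".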

Here’s a less offensive fix that’s just about as good: serve your files from a different domain. You can actually do this really cheaply and easily by using a subdomain on the same domain that you serve your Drupal site from. You don’t have to pay for a new domain. You don’t actually have to do any extra setup. Just have your web server use the same document root for the subdomain as for your main domain. If you can prevent the session cookies from being sent to the subdomain, you block all of these attacks. You can see that most of the really sharp engineering organizations of the world already do this.

If you’re looking for where your Gmail attachments come from, they’re actually being served from mail-attachment.googleusercontent.com. They’re not being served from Google.com for exactly this reason: Google doesn’t want to facilitate attackers who send you an e-mail attachment that might have access to basically log into your e-mail account, right? That would be bad. Google is already protecting you from this and you should do the same for your users.

There is a gotcha with Drupal, of course, and this is still true for Drupal 7, unfortunately: if your domain name starts with "www.", Drupal will helpfully strip that off of the cookie domain. Cookie domains basically tell the browser where that cookie should be sent. In the screen shot, below, you can see I had www.drupal-7.local as the domain but the cookie is set instead for .drupal-7.local. If you have a root domain as the cookie domain (.example.com, for example), then that session cookie gets sent to every subdomain in the example.com domain as well as to the main www site.

www. stripping for cookie domain.

That’s a real problem in our strategy to try to fix the cross-domain problem by serving files from a subdomain. That’s not going to work if Drupal basically thwarts us by sending the session cookie everywhere. You have to avoid a bare domain, right? You can’t use just drupal.org. You have to use www.drupal.org. In Drupal 7, you need to go into settings.php and set the very well-named $cookie_domain, or put some logic around it, so that you actually get the full host name, www.example.com, and not the truncated one.

/**
 * Drupal automatically generates a unique session cookie name
 * for each site based on its full domain name...
 */
# $cookie_domain = 'example.com';
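As a minimal sketch, assuming your site is served from www.example.com (a placeholder host name, not something from a real configuration), the uncommented line in settings.php would look like this:

```php
// settings.php: pin the session cookie to the exact host that serves Drupal,
// so the browser does not send it to sibling subdomains such as a files
// subdomain. 'www.example.com' is a placeholder; use your own host name.
$cookie_domain = 'www.example.com';
```

Existing session cookies were issued for the old cookie domain, so expect logged-in users to have to log in again after the change.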

To wrap up, let's outline a safer files recipe. These are some things you could do for your site, basically, to avoid those two big attacks that I discussed earlier. You’re going to want to serve your uploaded files from a subdomain or from a completely different domain. In either case, you want to make sure that the session cookies are not being sent along with the request for the file. You can do that either using the CDN module or a little custom module.

Finally, you have to prevent the Drupal site from bootstrapping on that files domain. In your sites/ folder you have to use the specific domain name that you’re serving Drupal from or use sites.php or use a redirect, something to prevent Drupal from being served on that files domain. Conversely, you can’t let the files be served on your Drupal domain. If an attacker can rewrite all the links back to www. and have it work, then you haven’t prevented the attack.

Here’s a code snippet to show a hook_file_url_alter() implementation in Drupal 7. All I’m doing, in this case, is taking the incoming file URL, cutting it in half, and getting the file name. Then I’m sticking on the front of it a domain name that has a subdomain that’s downloads.something. Right, so now I’m moving all the file URLs off of my www. domain onto this downloads subdomain. Because I set the cookie domain, my session cookies are not going to get sent to this subdomain and now I’m safe.
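A minimal sketch of such a hook follows. The module name "mymodule", the host name downloads.example.com, and the hard-coded http:// scheme are all placeholders chosen for illustration; it also assumes the uploads live in Drupal's public files directory.

```php
/**
 * Implements hook_file_url_alter().
 *
 * "mymodule" and downloads.example.com are placeholders.
 */
function mymodule_file_url_alter(&$uri) {
  // Only rewrite files stored in the public files directory.
  if (file_uri_scheme($uri) == 'public') {
    $wrapper = file_stream_wrapper_get_instance_by_scheme('public');
    // Build the web path, e.g. sites/default/files/foo.pdf.
    $path = $wrapper->getDirectoryPath() . '/' . file_uri_target($uri);
    // Point the URL at the cookie-free downloads subdomain.
    $uri = 'http://downloads.example.com/' . drupal_encode_path($path);
  }
}
```

Because the altered value is an absolute URL, file_create_url() returns it unchanged, so every link Drupal generates for a public file should point at the downloads host.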

One caution: this doesn't handle dynamically generated image derivatives. The easiest way to handle those may be to check the path and not rewrite it, and not block it from being served from that one subdirectory. Image derivatives won't usually be dangerous since they are not actually the user-uploaded files.

An alternative is to use a processing queue, or a hook that runs when users upload files or save content, to pre-generate the desired derivatives so they can be served from the safe domain or subdomain.

RewriteCond %{HTTP_HOST} ^www\. [NC]
RewriteRule . - [F]

Similarly, as I wrote earlier, you need to block file downloads directly from the www domain. Here are a couple of lines you could add to the .htaccess file that covers your files directory to say that if the host name starts with www, access is denied. (Put them in the .htaccess file at the site root instead and you would block the site itself.) Don’t let them access anything and that’s it. That’s the recipe. A bunch of these things fit into this category of preventing HOST header attacks, which means basically preventing Drupal from being served on a domain that you didn’t intend it to be served from.
