Q and A on cross-site request forgery and breaking into sessions. It's one of the attacks that XSS enables and the attack of the future. For session fixation, hijacking, lockout, replay, session riding, etc.

Suppose you are developing in a domain that already exists and has a large number of scripts on it. Also suppose that there is a standard, persistent (30-minute duration) cookie-based authentication scheme in place for many but not all pages in the domain. The final condition is that there exists a page in the domain (that for the sake of this argument doesn't require authentication) with an XSS hole.

Is it possible to protect against CSRF in this scenario?

My preliminary answer is "Yes - but it's hard."

I see the primary difficulty here as stopping the use of XMLHttpRequest (or any other method that allows parsing tokens out of a response). Many of the protections against CSRF rely on the browser enforcing the same-origin policy. Because of the XSS hole in the same domain (used to craft the CSRF attack from within the domain), the browser won't stop the responses from being read.

The textbook answer fails here. Generally, to stop CSRF you include a nonce or token in any request that changes anything permanent. For CSRF to work, the constructed request must then contain this hard-to-guess token. The next logical step for the attacker is to get the token and then submit the constructed CSRF request. Getting the token from the response to a request made cross-domain is what the browser's same-origin policy stops, but getting the token from a response to a request made from the same domain is allowed. Thus the textbook answer fails to stop CSRF here.
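For reference, the textbook token check might look roughly like the sketch below. The names `SERVER_KEY`, `issue_csrf_token` and `verify_csrf_token` are made up for illustration, and binding the token to the session with an HMAC is just one common way to do it, not something specific to this scenario.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; a real deployment would load this from configuration.
SERVER_KEY = secrets.token_bytes(32)


def _mac(session_id: str, nonce: str) -> str:
    """HMAC over the session id and nonce, so a token only works for its own session."""
    message = f"{session_id}:{nonce}".encode()
    return hmac.new(SERVER_KEY, message, hashlib.sha256).hexdigest()


def issue_csrf_token(session_id: str) -> str:
    """Create a hard-to-guess token to embed in a form as a hidden field."""
    nonce = secrets.token_hex(16)
    return f"{nonce}.{_mac(session_id, nonce)}"


def verify_csrf_token(session_id: str, token: str) -> bool:
    """Accept a state-changing request only if its token matches the session."""
    nonce, sep, mac = token.partition(".")
    if not sep:
        return False
    expected = _mac(session_id, nonce)
    # Compare as bytes in constant time to avoid leaking how much of the MAC matched.
    return hmac.compare_digest(mac.encode(), expected.encode())
```

This stops a cross-domain attacker who cannot read the form. It does nothing against a script running in the same domain, which can simply fetch the form and read the token.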

When thinking about this, I naively thought in the following way. "If they can get the token from the form being submitted, why don't we use a token to get to the form?" My answer being, "Well of course they can parse out that token as well."
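To make "parse out that token" concrete, here is a small simulation in Python of what the injected script would do with a same-origin response. The page body, the field name `csrf_token` and the helper `extract_token` are all hypothetical. A real attack would do the equivalent in the browser, and nothing here depends on how the token was generated.

```python
from html.parser import HTMLParser


class TokenFinder(HTMLParser):
    """Records the value of the first hidden input with the given name."""

    def __init__(self, field_name):
        super().__init__()
        self.field_name = field_name
        self.token = None

    def handle_starttag(self, tag, attrs):
        if tag != "input" or self.token is not None:
            return
        attributes = dict(attrs)
        if attributes.get("type") == "hidden" and attributes.get("name") == self.field_name:
            self.token = attributes.get("value")


# Hypothetical response body for a protected form, as readable from the same domain.
FORM_PAGE = """
<form action="/account/email" method="post">
  <input type="hidden" name="csrf_token" value="3f9c1a7e">
  <input type="text" name="email">
</form>
"""


def extract_token(body, field_name="csrf_token"):
    """Return the token found in the page body, or None if there is none."""
    finder = TokenFinder(field_name)
    finder.feed(body)
    return finder.token
```

Once the token is in hand, the forged request is indistinguishable from a legitimate one, and the same trick works on any page that hands out a token in its response.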

Following this logic back to "the beginning", however, gave me the idea: "What if we have this session token generated at the login event and passed with every subsequent request?" This is my "Yes - but it's hard" solution. Really, any event that requires user action prevents the CSRF attack from being automated unless further holes are present.

I should clarify my proposed solution. I am attempting to create a "virtual domain" for a specific web application. The XSS hole referenced in the larger domain is not on a page within this specific web application. To get to any page in this web app, you must either include the token with the request or log in.
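A rough sketch of that gate, assuming the token is minted at login and kept server-side per session. The class `AppGate` and its method names are hypothetical, and how the token travels with each request (URL, hidden field, header) is left open, as it is in the description above.

```python
import hmac
import secrets


class AppGate:
    """Hypothetical entry check for the "virtual domain": no valid token, no page."""

    def __init__(self):
        self._tokens = {}  # session id -> token minted at login

    def login(self, session_id):
        """Called after a successful login; mints the token every later request must carry."""
        token = secrets.token_urlsafe(32)
        self._tokens[session_id] = token
        return token

    def check(self, session_id, presented):
        """Return "allow" if the request carries the session's token, otherwise "login"."""
        expected = self._tokens.get(session_id)
        if expected is None or presented is None:
            return "login"
        if not hmac.compare_digest(expected.encode(), presented.encode()):
            return "login"
        return "allow"
```

Because no page inside the app can be fetched without the token, a script on the vulnerable page outside the app has nothing to request that would reveal it. The only way to obtain a token is the login event, which needs the user.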

There are a few problems with this solution:
1. If any stored XSS holes exist within the web app, then they can be used to parse out the token and start making CSRF requests (and a host of other attacks).
2. It can break standard browser navigation if the token is not the same over the course of the session. The back, forward, and refresh buttons no longer work (they require logging in again).
3. Forcing users to log in every time they blink lowers their guard against fake login pages and certain phishing attacks.

I would like to hear what others think about these problems and solutions.

There is another attack in this scenario to consider. Without any token to authenticate the origin of a request, the XSS hole in an unrelated script in the same domain can make arbitrary requests and parse the responses.

Originally the attack surface was requests which cause a change somewhere. Now the attack surface is the effects of any requests. What I am specifically concerned with is bypassing data retrieval access controls. CSRF from inside the domain can allow an attacker to use an authorized individual's browser to get to data that the attacker would not normally have access to.

A CAPTCHA on forms would only be useful for preventing CSRF on the form submission; it would not stop the attacker from reading data that I don't want them to have access to.

I realize that this might be straying outside the realm of CSRF and into the larger realm of access control, but the methods are essentially the same.

If an attacker can get JavaScript executed within your domain, you're pretty well hosed with regard to traditional HTTP. Even if you disable XHR (I would be curious to know how you planned to do this), they can still create a form and submit it.

The following is a half-baked idea:

Flash or Silverlight would give you a channel that the JavaScript wouldn't be able to screw with, but the same doesn't apply to any communication from it to the JS. A fleeting thought would be to use Flash to create a signature of the JS immediately prior to any page-initiated HTTP request, transmit the signature to the server, and ensure it's the same as it was when delivered. This would definitely require more thought, as you could potentially run the XSS code and then revert to normal before the Flash created its signature. You would also want to ensure that it's actually your Flash communicating with the server, since an attacker could issue a cross-domain GET whose query params pass your Flash URL to a remote server, which could then programmatically decompile it (for nonce, credentials, etc.) and impersonate it.

The idea here is to utilize the fact that the Flash can't be readily tampered with using JavaScript alone. If you can ensure that communication from the Flash is authentic, you can take its word for it when it says the prior request was authentic. Determining whether the prior request was authentic is a whole different problem.

Further:

I haven't tested this in the least; it's more a train of thought than a suggestion. If you can ensure that you have first chance at the body onload, perhaps you could traverse the DOM looking for script elements and send the count back to the server via the Flash (e.g. a one-shot routine in the Flash), which would essentially notify the server that the given nonce is validated. This relies on the assumption that any XSS requires creating an additional script tag and that said script tag could not override the body's onload before it is executed. Obviously this requires that you keep a count of the number of script tags for a given page.

Even further:

You don't really even need Flash for the above if the assumptions are correct. E.g. the page loads with a given nonce constant in JavaScript, the body onload issues an XHR with that nonce along with the script tag count, and the server validates or invalidates the nonce based on the script tag count and the URL associated with the nonce. The server immediately retires the nonce for validation purposes (a second validation attempt on an already-validated nonce should invalidate it) and then accepts the nonce once for a form submission or XHR.
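The server-side bookkeeping for this could be sketched as follows. It is untested against real browsers and rests on the same assumptions as the idea itself. The class `NonceStore`, its method names, and the choice to discard a nonce on any failed or repeated validation are illustrative guesses rather than a worked-out design.

```python
import secrets


class NonceStore:
    """Hypothetical per-page nonce tracker: issue, validate once, then accept once."""

    def __init__(self):
        self._nonces = {}  # nonce -> {"url", "scripts", "state"}

    def issue(self, url, expected_scripts):
        """Called when rendering a page; records the URL and expected script tag count."""
        nonce = secrets.token_hex(16)
        self._nonces[nonce] = {"url": url, "scripts": expected_scripts, "state": "issued"}
        return nonce

    def validate(self, nonce, url, script_count):
        """Handle the onload report; any mismatch or repeat discards the nonce."""
        entry = self._nonces.get(nonce)
        if entry is None:
            return False
        if entry["state"] != "issued":
            # A second validation attempt invalidates a nonce that was already validated.
            del self._nonces[nonce]
            return False
        if entry["url"] != url or entry["scripts"] != script_count:
            del self._nonces[nonce]
            return False
        entry["state"] = "validated"
        return True

    def consume(self, nonce):
        """Accept the nonce exactly once for a form submission or XHR."""
        entry = self._nonces.get(nonce)
        if entry is None or entry["state"] != "validated":
            return False
        del self._nonces[nonce]
        return True
```

A page rendered with three script tags would call `issue("/transfer", 3)`, the onload handler would report back through `validate`, and the eventual form post would go through `consume`. A report of four script tags, or a second report for the same nonce, leaves the nonce unusable.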