Q and A on cross-site request forgeries and breaking into sessions. It's one of the attacks that XSS enables, and the attack of the future. For session fixation, hijacking, lockout, replay, session riding, etc.

I'm doing my master's thesis at the moment in the field of static analysis. Currently I'm trying to come up with ways to detect CSRF, or potential CSRF. However, it seems to me that CSRF is inherently impossible to detect statically.

- Taint propagation does not work, since there is nothing to taint
- Model checking is not possible
- Pattern matching is implausible, since there are infinitely many ways to implement countermeasures against CSRF

Currently I am developing a static analysis tool for PHP (RIPS) and have thought about that problem too, but in my opinion you can't reliably detect CSRF by static analysis. The tool would need to know the logic of the web app in order to detect which parameters should be protected by tokens and which not.
Imagine this link to a news article: www.news.com/news.php?id=456. It is basically CSRF, but it is also how the web app was designed to behave, so no CSRF protection is needed because external linking is appreciated. On the other hand, the link www.bank.com/transfer.php?money=456 will probably look similar code-wise, but it performs a critical operation and should be CSRF protected. So the tool needs to know what the code is trying to do and whether that is critical with regard to CSRF attacks.

The only difference is that the first link is "reading" and the second is probably "writing". So, thinking about it, you could actually distinguish between GET and POST requests in the sense of reading and writing requests. Everything that is read from a file or database does not necessarily need CSRF protection, but everything that is written or that changes the state of the application (session_destroy()) should be CSRF protected. Since you can't rely on the request type alone (a POST request may also "only" initiate a search), you could define a list of sensitive sinks that should be CSRF protected, like functions that handle sessions, write to files, and so on (command/code execution), while ignoring "reading" functions. For database requests you would need to analyse the SQL (which should be easy in this case). Then you would define what an implemented CSRF protection looks like in the project you are scanning, and your tool could detect unprotected sensitive sinks.
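As a crude sketch of that sink-list idea: a toy scanner in Python over PHP source. The sink list, the token-check pattern, and the sample snippet are all made up for illustration; a real tool would make these configurable per project.

```python
import re

# Hypothetical sensitive sinks: state-changing calls that should sit
# behind a CSRF check. These patterns are illustrative, not complete.
SENSITIVE_SINKS = [
    r"session_destroy\s*\(",
    r"fwrite\s*\(",
    r"unlink\s*\(",
    r"mysql_query\s*\(\s*['\"]\s*(INSERT|UPDATE|DELETE)",
]

# Naive signature of an implemented protection: a request parameter
# whose name contains "token" being read somewhere in the file.
TOKEN_CHECK = re.compile(r"\$_(POST|GET|REQUEST)\[['\"]\w*token\w*['\"]\]", re.I)

def find_unprotected_sinks(php_source: str):
    """Return (line_number, line) pairs for sinks in files with no token check."""
    if TOKEN_CHECK.search(php_source):
        return []  # some token handling present; assume protected (very coarse!)
    hits = []
    for lineno, line in enumerate(php_source.splitlines(), 1):
        if any(re.search(p, line, re.I) for p in SENSITIVE_SINKS):
            hits.append((lineno, line.strip()))
    return hits

sample = """<?php
$q = "SELECT * FROM news WHERE id = " . (int)$_GET['id'];
mysql_query("DELETE FROM users WHERE id = " . (int)$_POST['id']);
session_destroy();
"""
for lineno, line in find_unprotected_sinks(sample):
    print(lineno, line)  # flags the DELETE query and session_destroy()
```

Note how the SELECT on line 2 is not flagged: it "reads", while the DELETE and session_destroy() change state. This is exactly the GET/POST, read/write distinction above, with all the false positives/negatives that implies.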

But a lot of false positives and negatives quickly come to mind. The main problem is still there: you can only detect what you configure. And you don't know, in every case, which parts of the scanned code are sensitive with regard to CSRF and which are not.

Reiners, thanks for your reply. I indeed faced the same problem you describe. I was thinking about annotating code/functions where CSRF is dangerous; however, this still leaves the problem with the programmer, who has to annotate it.

I'm afraid there is no real solution; however, if someone has any idea, I'm listening!

You can't really tell with static analysis if the user is logged in (or what session properties he holds), nor if the function csrfToken() builds a random token or a captcha or whatever.
For dynamic analysis, however, it would be a good way to go.

All the effective countermeasures I've seen in real systems involve some kind of unguessable string/token, whether in the URL or as a value in a POST request. I've never seen a captcha used to stop CSRF, so it must be at least somewhat rare. Aside from that, I agree with Reiners.

What I meant by that was that it is very hard to detect, by analysing code statically, what exactly the code is doing. So you can't really differentiate between a function that prints a captcha, a random token, or something else.

You are right, Gareth, it was a bad example ;) Looking for random string generators has the same problem: you don't know if it is really a CSRF protection. It could also be a random string appended to a URL to avoid browser caching when fetching an image, or something like that.

Hmmm, well, I'd think the most common way of avoiding browser caching would be to use milliseconds rather than a random string. Maybe scan for all random string generation, then track the variables in which the string is stored and compare each variable against the context in which it is used: for example, an echo $token; would be a good indicator of a token, while a SQL statement with a "random" variable would indicate a salt rather than a token, etc.
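A rough sketch of that classify-by-context heuristic, again in Python over PHP source. The randomness-source patterns, the variable names, and the sample code are all invented for illustration; real data-flow tracking would be far less naive than a couple of regexes.

```python
import re

# Hypothetical heuristic: variables assigned from a randomness source are
# classified by the context in which they are later used. The source list
# is an assumption, not a complete inventory of PHP randomness functions.
RANDOM_SOURCES = re.compile(
    r"(\$\w+)\s*=\s*(md5\s*\(\s*uniqid|mt_rand|rand|openssl_random_pseudo_bytes)",
    re.I,
)

def classify_random_vars(php_source: str):
    """Map each random variable to a rough guess: 'token', 'salt', or 'unknown'."""
    guesses = {}
    for var, _src in RANDOM_SOURCES.findall(php_source):
        if re.search(r"echo\s+" + re.escape(var), php_source):
            guesses[var] = "token"   # printed into a page -> plausibly a CSRF token
        elif re.search(r"(INSERT|UPDATE)[^;]*" + re.escape(var), php_source, re.I):
            guesses[var] = "salt"    # flows into SQL -> plausibly a salt, not a token
        else:
            guesses[var] = "unknown"
    return guesses

sample = """<?php
$token = md5(uniqid(mt_rand(), true));
echo $token;
$salt = mt_rand();
mysql_query("UPDATE users SET salt = '$salt'");
"""
print(classify_random_vars(sample))
```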

Reiners Wrote:
-------------------------------------------------------
> hm both are good ideas indeed, but impractical for
> static analysis.
>
> <?php echo csrfToken(); ?>
>
> you can't really tell with static analysis if the
> user is logged in (or what session properties he
> holds) nor if the function csrfToken() builds a
> random token or a captcha or whatever.
> however for dynamic analysis it would be a good
> way to go.

While I agree that it is probably impractical, it isn't necessarily impossible.

When I opened this topic, my first thought was that there is no way to detect CSRF because it is business logic; however, that's not completely true.

If we assume that any page that writes to the database or another persistent store needs CSRF protection, then you could potentially use model checking or something similar to see whether it is possible to reach the code path where the persistent store is altered without being able to read the page first. Even if you cannot construct such a request automatically, you would be left with a set of variables that you need to know, which would be something to analyse further.

Of course, there will be writes that are unimportant, but you could potentially mark certain databases or columns as safe to write to, e.g. a user visit statistics table.
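That reachability question can be sketched on a toy control-flow graph. To be clear, the graph, the node names, and the notion of "check nodes" are a made-up abstraction for this post, not how a real model checker represents a program:

```python
from collections import deque

def unguarded_writes(cfg, entry, checks, writes):
    """Return the set of write nodes reachable from entry without
    passing through a CSRF-check node (breadth-first search)."""
    seen, queue, bad = {entry}, deque([entry]), set()
    while queue:
        node = queue.popleft()
        if node in checks:
            continue  # paths through a token check are considered guarded
        if node in writes:
            bad.add(node)
        for succ in cfg.get(node, []):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return bad

# transfer.php sketched as a graph: entry -> validate token -> do transfer,
# plus an unguarded branch that logs the visit (a harmless write).
cfg = {
    "entry": ["check_token", "log_visit"],
    "check_token": ["do_transfer"],
    "log_visit": [],
}
print(unguarded_writes(cfg, "entry", {"check_token"}, {"do_transfer", "log_visit"}))
```

Here only the visit-statistics write is reachable without crossing the token check, so it alone gets flagged, which is exactly the kind of unimportant write you would whitelist.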

Maybe take a look at Microsoft's SAGE fuzzer; it seems that kind of technology could potentially be applied here.

One of the things you will probably miss here, though, is relationships between pages: e.g. if you have a page somewhere that reseeds a PRNG with time(), the PRNG is broken everywhere, but you're not going to know that unless you scan for it first.

P.S. If anyone works on this, send me an email, I'd be keen to see how this goes :)

Thanks for the great input. I'm not looking for a sound and complete solution, since this is undecidable.
But approximations as described in this topic are great. Looking for randomly generated strings is a start, although I'm going to get a lot of false positives.

The idea of model checking is even better: perhaps it is possible to model check whether some sensitive actions are reachable without knowing a random value. In that case, you're vulnerable to CSRF.

I'm aware that this approach too will probably produce a lot of false positives, but it's better than nothing.

Nice discussion. I'll send you an email with the details when I have worked something out, kuza

Well, I use functions to fetch POST variables and escape them at the same time, e.g. posti('wut') will return (int)$_POST['wut']. The initial reason I started doing this was to conveniently avoid notices, but since the functions also denote the type, this could possibly be split up further into something like post_sensitive(), to enable tracking for CSRF.

Of course this won't work as a plug-in, it'll actually have to be in the codebase.
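For what it's worth, here is the same idea sketched in Python rather than PHP: typed fetch helpers that also record sensitivity, so a later analysis pass knows which parameters must be covered by a token. post_int and post_sensitive are hypothetical analogues of the posti()/post_sensitive() names above, not code from any real project.

```python
# Parameters fetched through the "sensitive" helper get recorded here,
# so a tool (or a runtime check) can later verify they are CSRF-protected.
SENSITIVE_PARAMS = set()

def post_int(form, name):
    """Like posti('wut'): coerce the value to int, defaulting to 0
    so missing parameters don't raise (the 'avoid notices' motivation)."""
    try:
        return int(form.get(name, 0))
    except (TypeError, ValueError):
        return 0

def post_sensitive(form, name):
    """Same fetch, but flag the parameter as needing CSRF protection."""
    SENSITIVE_PARAMS.add(name)
    return post_int(form, name)

form = {"wut": "42", "money": "456"}
amount = post_sensitive(form, "money")  # state-changing -> tracked
page = post_int(form, "wut")            # read-only -> not tracked
print(amount, page, sorted(SENSITIVE_PARAMS))
```

As noted, this only works because it lives in the codebase itself: the annotation effort is shifted to the programmer at the point where each parameter is fetched.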