Double Encoding – One Of The Biggest Enemies While Fixing Cross-Site Scripting (XSS)

“You have X amount of Cross-Site Scripting vulnerabilities.” That is a phrase most web developers have heard at least once. But what is a Cross-Site Scripting vulnerability?

OWASP defines Cross-Site Scripting as:

“Cross-Site Scripting attacks are a type of injection problem, in which malicious scripts are injected into the otherwise benign and trusted web sites. Cross-site scripting (XSS) attacks occur when an attacker uses a web application to send malicious code, generally in the form of a browser side script, to a different end user. Flaws that allow these attacks to succeed are quite widespread and occur anywhere a web application uses input from a user in the output it generates without validating or encoding it.”

But then the problem arises: how do you fix it efficiently and painlessly? This might look like a simple question, but different scenarios arise, and they influence where you should place your vulnerability fixes. Most security personnel will tell you that you should always HTML encode the input and then you don’t have to worry about the output, but is that true?

Two Particular Scenarios:

1. A self-contained web application whose modules connect to a database, either on the same server or on another one. The data in the database is only created within the SAME application, so we only have one point of “data injection”.

2. A web application whose database is also written to by other applications, or which gathers data from another source (such as SAP for enterprise applications).

In the first scenario, one can HTML encode the input and not really encode the output: we control every single input point, so we can be sure that the data that is going to be rendered has been sanitized properly. This is pretty straightforward. In Java one could use the HtmlUtils.htmlEscape() function from the Spring Framework, and in PHP the htmlentities() function can be used to encode without any problem.

I particularly favor this testing string because it closes the open tag. If you have a DOM XSS, it also closes the <script> tag and the remaining "> so the page will not break horribly with JavaScript or give you display errors. Any string will do, though:

As you can see the input is rendered perfectly without any problems and all is fine in wonderland.
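The single-entry-point scenario can be sketched in a few lines. This is only an illustration, assuming Python's standard html.escape() as a stand-in for HtmlUtils.htmlEscape() or htmlentities(); the names store_comment and render_comment, the in-memory list used as a "database", and the payload are all hypothetical and are not the test string referred to above.

```python
import html

def store_comment(database, raw_comment):
    """Encode at the single input point, then store the encoded value."""
    database.append(html.escape(raw_comment, quote=True))

def render_comment(stored_comment):
    """Emit the stored value as-is; no second encoding pass on output."""
    return "<p>" + stored_comment + "</p>"

database = []  # hypothetical stand-in for the real database
store_comment(database, '"><script>alert(1)</script>')  # illustrative payload

page = render_comment(database[0])
print(page)
# <p>&quot;&gt;&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

Because every value in the database went through store_comment, the browser shows the payload as text instead of executing it.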

What if you have an application that takes input from other sources that you cannot control (such as another application that also inputs data, from a SAP server for instance)? Here we run into a harder problem, because you might control some inputs but not all of them.

The most used solution is “encode input and encode output”, which might sound good, right? But let’s follow data that you don’t control and that lives on SAP. (Remember: we are HTML encoding both the input and the output of what we control, and only the output of what we don’t control.)
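To make the two paths concrete, here is a rough sketch of what “encode input and encode output” does to the same value depending on where it entered. It assumes Python's html.escape() in place of the Java or PHP functions; encode_input, encode_output, and the sample value are hypothetical, and html.unescape() is used only to approximate what a browser would display.

```python
import html

def encode_input(value):
    """Encoding applied at the input points we control."""
    return html.escape(value, quote=True)

def encode_output(value):
    """Encoding applied to everything at render time."""
    return html.escape(value, quote=True)

sample = "Tom & Jerry <Ltd>"  # hypothetical value, same on both paths

# Path A: data entered through our own application (encoded in AND out).
rendered_own = encode_output(encode_input(sample))

# Path B: data written by an external system (stored raw, encoded out only).
rendered_external = encode_output(sample)

print(rendered_own)       # Tom &amp;amp; Jerry &amp;lt;Ltd&amp;gt;
print(rendered_external)  # Tom &amp; Jerry &lt;Ltd&gt;

# Approximation of what the user actually sees in the browser:
print(html.unescape(rendered_own))       # Tom &amp; Jerry &lt;Ltd&gt;
print(html.unescape(rendered_external))  # Tom & Jerry <Ltd>
```

Neither path lets markup through, but path A is double encoded, so the user sees literal entities on screen.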

Well, we are OK from a security standpoint, but that really looks horrible! This is obviously not going to pass QA, and it will be marked as a “User interface” problem. So before you, as a developer or as a security engineer, start pushing the fix, analyze where the rendering should be done: do a data flow analysis and verify that you control either all user input or all user output.
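As one possible outcome of that analysis, suppose it shows that every value passes through a single rendering function that you control. In that case the encoding could live there and nowhere else. The sketch below assumes exactly that; render_cell and the sample values are hypothetical, and html.escape() again stands in for whatever encoder your platform provides.

```python
import html

def render_cell(value):
    """Single output point: every value is encoded exactly once, here."""
    return "<td>" + html.escape(value, quote=True) + "</td>"

# Both sources are stored raw in this sketch; nothing is encoded on input.
own_value = "R&D <team>"                    # entered through our application
external_value = "<script>alert(1)</script>"  # written by an external system

own_cell = render_cell(own_value)
external_cell = render_cell(external_value)

print(own_cell)       # <td>R&amp;D &lt;team&gt;</td>
print(external_cell)  # <td>&lt;script&gt;alert(1)&lt;/script&gt;</td>
```

With one encoding pass at one known point, values from both sources display as typed and no script reaches the page.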