How programming languages can hurt your application's security

If you are like many developers, you use one of several dynamic languages for writing applications. But language features can behave in undefined or, worse, defined yet astonishing ways that have serious implications for app security. Understanding these behaviors is key to securing your apps against malicious users.

Three commonly used languages—Python, Ruby, and Java—serve to illustrate the security-related challenges that exist in all programming languages, and what you can do to address them.

Python: Don't get in a pickle

Due to its balance of simplicity and power, Python is one of the top languages. Furthermore, its versatility has resulted in a thriving ecosystem across many different applications, from simple tools to complex web apps to data science.

Python provides a serialization mechanism known as "pickles." Pickles are convenient, and developers have used them for all sorts of things, including cookie values and auth tokens. But it can be tempting to use such a powerful mechanism in ways that ultimately compromise security.

In the vulnerable pattern, the token is base64-decoded, unpickled, and then checked. But what is to prevent a malicious user from crafting his or her own pickle and base64-encoding it? Nothing. That is the danger of pickles: unpickling can execute arbitrary code, so it is never safe on data that can come from untrusted sources.
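A minimal, hypothetical sketch of this vulnerable pattern follows; the names parse_token and Exploit are illustrative, not from any real codebase:

```python
import base64
import pickle

# Vulnerable pattern (hypothetical): the server trusts whatever
# pickle the client hands back in a cookie or auth token.
def parse_token(token):
    return pickle.loads(base64.b64decode(token))

# The attacker does not need the server's code to build a payload.
# Any class defining __reduce__ can smuggle in a callable, which
# pickle invokes during deserialization:
class Exploit:
    def __reduce__(self):
        import os
        return (os.system, ("id",))  # runs `id` when unpickled

malicious_token = base64.b64encode(pickle.dumps(Exploit()))
# Calling parse_token(malicious_token) would execute os.system("id")
# before the application ever gets a chance to "check" the data.
```

Note that the exploit never touches the application's own classes; the pickle format itself carries the instruction to call a function.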

This was covered in a 2011 blog post by Nelson Elhage of Stripe (the source of the pattern described above), which detailed just such a pickle vulnerability in Twisted. And Django, arguably the most popular web framework for Python, had a similar pickle bug in its session handling until version 1.6—released two years after that post on Twisted!

A mantra often heard after such discoveries is, "Don't use pickles!" Some people react by switching to JSON or YAML but, again, hit unexpected problems. YAML, as it turns out, has a nifty feature, unknown to many developers: it can instantiate native object types.

This is completely unexpected for a developer looking for a simple way to pass around data, only to find out that executable code could be included as well. Nowadays, people will use yaml.safe_load(), but there's still a lot of yaml.load() going on. Stay safe. Always use safe_load(). And avoid pickles!
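To make that concrete, here is a minimal sketch using PyYAML (assuming the yaml package is installed). The tag embedded in the document names a Python callable for the parser to invoke:

```python
import os
import yaml

# A full (unsafe) load honors !!python tags and calls the named
# function during parsing -- here the harmless os.getcwd, but it
# could just as easily be os.system.
doc = "!!python/object/apply:os.getcwd []"
result = yaml.load(doc, Loader=yaml.Loader)
print(result == os.getcwd())  # True: parsing ran os.getcwd()

# safe_load restricts documents to plain data types and rejects
# the tag outright.
try:
    yaml.safe_load(doc)
except yaml.YAMLError:
    print("safe_load rejected the payload")
```

The document never looks like code to a casual reviewer, which is exactly what makes the feature so dangerous on untrusted input.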

Ruby: Speed can kill

What about Ruby? It's also extremely popular as a language for web applications, in particular due to Ruby on Rails. It's also useful in DevOps as the underlying language for the configuration management tool Chef.

The aforementioned YAML problem also exists in Ruby—and in any YAML parser that supports the spec's language-native tags, regardless of language. But there’s more.

You might think that if you are using XML with a Rails app, you’d be safe from this YAML silliness (and maybe be more worried about XXE). But in 2013 (CVE-2013-0156), people realized that Rails allowed the embedding of YAML within XML by way of any element with the attribute type="yaml".

You can guess what happened: All the problems with YAML end up as problems with XML.

The issue goes beyond data formats. Ruby on Rails opens up other possibilities through the way it handles strings. One elegant exploit from last year turned what appeared to be directory traversal into something more. CVE-2016-0752, which affected Rails releases before 4.2.5.1, allowed path traversal via render. Being able to grab an arbitrary file on the host was dangerous enough, but render actually evaluated the file as a template.

Sample code is as follows:

def show
  render params[:template]
end

So one can leverage this behavior into more than just reading files off the machine: an attacker can plant executable code in a file on the server and then have render evaluate it. The simplest way to accomplish that is to make the server log the malicious payload.

The attacker "taints" the log file by sending a GET request with a parameter value of <%= `ls` %>; that expression is written verbatim to the log. Using the vulnerability to render the log file then executes the embedded expression—in this case running the command ls on the system.

A similar risk exists in Ruby itself, as discovered last year by Brett Buerhaus. On AirBnB’s site, he found a vulnerability due to Ruby’s ability to do string interpolation. In Ruby, string interpolation works like this:

name = "Ada"
puts "Hello, #{name}!"

Thus, anything within #{} is evaluated, including Ruby’s syntax for an OS-level exec, %x. Hence #{%x[ls]} will execute ls on the machine, and tricking the server into interpolating it results in a successful compromise.

In this case, AirBnB’s code interpolated values handed to it via JSON. Buerhaus’ full write-up illustrates how a native convenience feature of a language can have dramatic consequences when it meets untrusted data.

Such flaws are extremely pernicious in highly dynamic languages such as Ruby, which is used for rapid application development. But what about languages that are compiled?

Java: Don't let it surprise you

Java is the venerable language of the server-side web. Despite being the most studied of the three from a security perspective, being compiled to bytecode, and having had a security focus from the early days of its design, it too suffers from exploits that compromise apps in surprising ways.

Equifax is in the news for experiencing one of the worst data breaches in history. The breach, it has been announced, was due to a vulnerability in Apache Struts, CVE-2017-5638. (Arguably, Equifax's failure was not just an unpatched vulnerability but a systemic failure of security, but that's beyond the scope of this discussion.)

The point is that despite the fact that both Java and Apache Struts technologies have been used since the early days of the web by the majority of Fortune 500 companies, simple and powerful features can lead to surprising vulnerabilities.

This particular CVE was reported in March, and the best explanation I’ve found is from Gotham Digital Science (GDS). The attack hinges on the observation that when a request carries an invalid multipart Content-Type header, Struts builds an error message that incorporates the header's value. And Struts doesn't just return that error message; it evaluates it along the way!

Just as with Ruby and Python, Java is subject to attacks that feed user-supplied data into evaluation mechanisms. In this case, the problem was not in Java itself, but in OGNL (Object-Graph Navigation Language), a package used in the implementation of Apache Struts.

From the Apache Project website: OGNL "is an expression language for getting and setting properties of Java objects, plus other extras such as list projection and selection and lambda expressions. ... OGNL started out as a way to set up associations between UI components and controllers using property names."

Such an innocuous mechanism leads to a key step on the road to catastrophe.

Apache Struts had three other RCE (remote code execution) vulnerabilities reported this year, the most recent being announced around the same time as the Equifax breach.

The Java community was rocked by a major serialization bug in 2015 that affected nearly every Fortune 500 company, and many other bugs followed that year. Among the flurry of serialization vulnerabilities, I was amused to see an informative talk on exploiting serialization by Matthias Kaiser. His presentation also covered several WebLogic exploits in the T3 protocol, a proprietary protocol unique to WebLogic that I hadn't heard of since my days at BEA Systems.

Since XML is known to have risks associated with XXE, and YAML has the known risks listed above, many organizations have standardized on JSON as a safe way to do serialization.
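The appeal is easy to see: a bare JSON parse maps documents onto a small, fixed set of data types, with no tag mechanism for naming classes. A quick Python sketch of that property:

```python
import json

# json.loads yields only dicts, lists, strings, numbers, booleans,
# and None -- no constructors run, unlike pickle or a full YAML load.
parsed = json.loads('{"user": "ada", "admin": false}')
print(parsed["admin"] is False)  # True: a plain bool, nothing instantiated
```

The trouble starts when a framework layers object reconstruction on top of JSON, or when the application later feeds the parsed values into an evaluator.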

But JSON is subject to these vulnerabilities, as I described above with the AirBnB situation. And, at DEF CON and Black Hat USA this year, there was an excellent talk by Alvaro Muñoz and Oleksandr Mirosh titled "Friday the 13th: JSON Attacks," covering vulnerabilities in the use of JSON that should puncture any false sense of security in either Java or .NET.

What you can do

Knowing is half the battle. All languages have quirks that can result in unexpected evaluations of untrusted data. This comes not only from the language itself, but from frameworks incorporated into your application to enable faster or more elegant implementation.

A typical application could have hundreds of third-party frameworks. These provide great power, but there is a risk that the power can be turned on the application itself.

In light of this situation, keep these best practices in mind:

Learn your language. "Wat" talks are not only entertaining (and disheartening), but educational.

Know your frameworks. In that elegantly expressed logic using the framework, what is the framework actually doing?

Use a static code analyzer such as Brakeman for Ruby to avoid common mistakes.

Keep up to date with packages. Use a tool to check which of your packages have vulnerabilities, and upgrade ASAP. Once a vulnerability has been publicized, script kiddies are locked and loaded—scanning as many websites as possible for these vulnerabilities.

Test and inspect. Ensure you have expert pen testers to evaluate your software for vulnerabilities.

Design with security in mind. Security cannot be an afterthought. The implementation of your system should be done with a security focus.

Make security everyone’s job, not just people with "security" in their titles. Security is not just a technology aspect, but a cultural aspect of how things get done.

Most people do much, if not all, of the above, yet compromises keep happening. Why? Because there is an underinvestment in a layered defense.

Layer your defenses

Vulnerabilities will happen, despite everything we do to prevent them. We must accept this fact and use a defense-in-depth approach that incorporates runtime monitoring for threats and automated response to stop attacks before damage is done or information is lost.

Defense-in-depth, or layered defense, is simply having more than one layer, so that if something fails, you have other mechanisms in place to mitigate the breach.

Many companies rely on testing and scanning tools to ensure that they have no vulnerabilities. No such assurance is possible. Those tools provide only a single line of defense, applied pre-production. Defense in production is just as important.

Without runtime defenses, attackers have free rein to exfiltrate valuable data, causing lasting and catastrophic damage. Zero-day exploits are shopped on the dark web and by state actors constantly. The best approach is to assume that vulnerabilities will be exploited sooner or later. When they are, make sure you have fail-safe mechanisms in place to minimize what a malicious actor can accomplish.

Other than some notable examples—ahem, JavaScript, cough—languages are designed with the principle of least astonishment in mind. They succeed to varying degrees, but not enough to remove the need for following the best practices described above.

The power of code to help you build valuable systems also brings with it humbling dangers that require more and more vigilance as the incentives to exploit those systems continue to increase.