This chapter is from the book

Information Disclosure

The Information Disclosure section covers attacks designed to acquire
system-specific information about a web site. This includes the software
distribution, version numbers, and patch levels, as well as the location of
backup files and temporary files. In most cases, divulging this information is
not required to fulfill the needs of the user. Most web sites will reveal some
data, but it's best to limit the amount of data whenever possible: the more
information about the web site an attacker learns, the easier the system
becomes to compromise.

Directory Indexing

Automatic directory listing/indexing is a web server function that lists all
of the files within a requested directory if the normal base file
(index.html/home.html/default.htm) is not present. When a user requests the main
page of a web site, he normally types in a URL such as
http://www.example.com,
using the domain name and excluding a specific file. The web server processes
this request and searches the document root directory for the default filename
and sends this page to the client. If this page is not present, the web server
will issue a directory listing and send the output to the client. Essentially,
this is equivalent to issuing an "ls" (Unix) or "dir" (Windows)
command within this directory and showing the results in HTML form. From an
attack and countermeasure perspective, it is important to realize that
unintended directory listings may be possible due to software vulnerabilities
(discussed next in the example section) combined with a specific web
request.

When a web server reveals a directory’s contents, the listing could
contain information not intended for public viewing. Often web administrators
rely on "Security Through Obscurity," assuming that if there are no
hyperlinks to these documents, they will not be found, or no one will look for
them. The assumption is incorrect. Today’s vulnerability scanners, such as
Nikto, can dynamically add additional directories/files to include in their scan
based upon data obtained in initial probes. By reviewing the /robots.txt file
and/or viewing directory indexing contents, the vulnerability scanner can now
interrogate the web server further with this new data. Although potentially
harmless, directory indexing could allow an information leak that supplies an
attacker with the information necessary to launch further attacks against the
system.

Directory Indexing Example

The following information could be obtained based on directory indexing
data:

Backup files—with extensions such as .bak, .old, or .orig.

Temporary files—these are files that are normally purged from the
server but for some reason are still available.

Hidden files—with filenames that start with a "."
(period).

Naming conventions—an attacker may be able to identify the
composition scheme used by the web site to name directories or files. Example:
Admin versus admin, backup versus back-up, and so on.

Enumerate user accounts—personal user accounts on a web server
often have home directories named after their user account.

Configuration file contents—these files may contain access control
data and have extensions such as .conf, .cfg, or .config.

Script contents—Most web servers allow scripts to be executed by
either specifying a script location (e.g., /cgi-bin) or by configuring the
server to try to execute files based on file permissions (e.g., the execute bit
on *nix systems and the use of the Apache XBitHack directive). Due to these
options, if directory indexing of cgi-bin contents is allowed, it is possible
to download/review the script code if the permissions are incorrect.

There are three different scenarios where an attacker may be able to retrieve
an unintended directory listing/index:

The web server is mistakenly configured to allow/provide a directory
index. Confusion may arise about the net effect when a web administrator is
configuring the indexing directives in the configuration file. It is possible to
get an undesired result when implementing complex settings, such as wanting to
allow directory indexing for a specific sub-directory while disallowing it on
the rest of the server. From the attacker's perspective, the HTTP request
is normal: they request a directory and see if they receive the desired
content. They do not care "why" the web server was configured in this
manner.

Some components of the web server allow a directory index even if it is
disabled within the configuration file or if an index page is present. This is
the only valid "exploit" example scenario for directory indexing. There
have been numerous vulnerabilities identified on many web servers that will
result in directory indexing if specific HTTP requests are sent.

Search engines’ cache databases may contain historical data that
would include directory indexes from past scans of a specific web site.

Apache Countermeasures for Directory Indexing

First of all, if directory indexing is not required for some specific
purpose, then it should be disabled with the Options directive, as outlined in
Chapter 4. If directory indexing is accidentally enabled, you can implement the
following Mod_Security directives to catch this information in the
output data stream. Figure
7.1 shows what a standard directory index web page
looks like.
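For the first step, a minimal sketch of disabling indexing with the
Options directive might look like the following (the directory path is an
assumption; substitute your actual document root):

<Directory "/usr/local/apache/htdocs">
    Options -Indexes
</Directory>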

Web pages that are dynamically created by the directory indexing function
will have a title that starts with "Index of /". We can use this data as
a signature and add the following Mod_Security directives to catch and
deny access to this data:
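A sketch of such a filter, assuming Mod_Security 1.x with output scanning
enabled (the 403 status code is a choice, not a requirement; note that a page
that legitimately contains this title string would also be blocked):

SecFilterEngine On
SecFilterScanOutput On
SecFilterSelective OUTPUT "<title>Index of /" "deny,status:403"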

Information Leakage

Information Leakage occurs when a web site reveals sensitive data, such as
developer comments or error messages, which may aid an attacker in exploiting
the system. Sensitive information may be present within HTML comments, error
messages, source code, or simply left in plain sight. There are many ways a web
site can be coaxed into revealing this type of information. While leakage does
not necessarily represent a breach in security, it does give an attacker useful
guidance for future exploitation. Leakage of sensitive information may carry
various levels of risk and should be limited whenever possible.

In the first case of Information Leakage (comments left in the code, verbose
error messages, etc.), the leak may give intelligence to the attacker with
contextual information of directory structure, SQL query structure, and the
names of key processes used by the web site.

Often a developer will leave comments in the HTML and script code to help
facilitate debugging or integration. This information can range from simple
comments detailing how the script works, to, in the worst cases, usernames and
passwords used during the testing phase of development.

Information Leakage also applies to data deemed confidential, which
aren’t properly protected by the web site. These data may include account
numbers, user identifiers (driver’s license number, passport number,
social security numbers, etc.) and user-specific data (account balances,
address, and transaction history). Insufficient Authentication, Insufficient
Authorization, and transport encryption also deal with protecting and enforcing
proper controls over access to data. Some attacks fall outside the scope of web
site protection, such as attacks against the client or "casual
observer" concerns. Information Leakage in this context deals with the
exposure of key user data, deemed confidential or secret, that should not be
exposed in plain view even to the user. Credit card numbers are a prime example of user
data that needs to be further protected from exposure or leakage even with the
proper encryption and access controls in place.

Information Leakage Example

There are three main categories of Information Leakage: comments left in
code, verbose error messages, and confidential data in plain sight. Comments
left in code:
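The original listing is not reproduced here; a comment of the kind described,
where only the host name "VADER" is taken from the text and the wording and
path are hypothetical, might look like this:

<!-- NOTE: if the image files do not show up, check the mount to VADER -->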

Here we see a comment left by the development/QA personnel indicating what
one should do if the image files do not show up. The security breach is the host
name of the server that is mentioned explicitly in the code,
"VADER."

An example of a verbose error message can be the response to an invalid
query. A prominent example is the error message associated with SQL queries. SQL
Injection attacks typically require the attacker to have prior knowledge of the
structure or format used to create SQL queries on the site. The information
leaked by a verbose error message can provide the attacker with crucial
information on how to construct valid SQL queries for the backend database. The
following was returned when placing an apostrophe into the username field of a
login page:
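The original response is not reproduced here; an error of this general shape
is typical (the driver, file name, and line number are illustrative; only the
username and password query parameters are taken from the discussion):

Microsoft OLE DB Provider for ODBC Drivers error '80040e14'
[Microsoft][ODBC SQL Server Driver][SQL Server]Unclosed quotation mark
before the character string '' AND password=''.
/login.asp, line 42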

In the first error statement, a syntax error is reported. The error message
reveals the query parameters that are used in the SQL query: username and
password. This leaked information is the missing link for an attacker to begin
to construct SQL Injection attacks against the site.

Confidential data left in plain sight could be files that are placed on a web
server with no direct HTML links pointing to them. Attackers may enumerate these
files either by guessing filenames based on other identified names or
through the use of a local search engine.

Apache Countermeasures for Information Leakage

Preventing Verbose Error Messages

Containing information leaks such as these requires Apache to inspect the
outbound data sent from the web applications to the client. One way to do this,
as we have discussed previously, is to use the OUTPUT filtering capabilities of
Mod_Security. We can easily set up a filter to watch for common
database error messages being sent to the client and then generate a generic 500
status code instead of the verbose message:
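A sketch of such a filter, assuming Mod_Security 1.x; the signature list is
illustrative and should be tuned to the databases actually in use:

SecFilterEngine On
SecFilterScanOutput On
SecFilterSelective OUTPUT "(ODBC|SQL Server Driver|Syntax error|supplied argument is not a valid MySQL)" "deny,status:500"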

Preventing Comments in HTML

While Mod_Security is efficient at identifying signature patterns,
it does have one current shortcoming. Mod_Security cannot
manipulate the data in the transaction. When dealing with information
disclosures in HTML comment tags, it would not be appropriate to deny the entire
request for a web page due to comment tags. So how can we handle this? There is
a really cool feature in the Apache 2.0 version called filters:
http://httpd.apache.org/docs-2.0/mod/mod_ext_filter.html.
The basic premise of filters is that they read from standard input and print to
standard output. This feature becomes intriguing from a security perspective
when dealing with this type of information disclosure prevention. First, we use
the ExtFilterDefine directive to set up our output filter. In this
directive, we tell Apache that this is an output filter, that the input data
will be text, and that we want to use an OS command to act on the data. In this
case, we can use the Unix Stream Editor program (sed) to strip out any comment
tags. The last step is to use the SetOutputFilter directive to activate
the filter in a LocationMatch directive. We can add the following data
to the httpd.conf file to effectively remove all HTML comment tags,
on-the-fly, as they are being sent to the client:
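A sketch of this configuration follows. The filter name is arbitrary, and the
simple sed expression only strips comments that open and close on the same
line (and, being greedy, removes everything between the first "<!--" and the
last "-->" on a line):

ExtFilterDefine strip-comments mode=output intype=text/html \
    cmd="/bin/sed s/<!--.*-->//g"

<LocationMatch "/">
    SetOutputFilter strip-comments
</LocationMatch>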

Path Traversal

The Path Traversal attack technique forces access to files, directories, and
commands that potentially reside outside the web document root directory. An
attacker may manipulate a URL in such a way that the web site will execute or
reveal the contents of arbitrary files anywhere on the web server. Any device
that exposes an HTTP-based interface is potentially vulnerable to Path
Traversal.

Most web sites restrict user access to a specific portion of the file system,
typically called the "web document root" or "CGI root"
directory. These directories contain the files intended for user access and the
executables necessary to drive web application functionality. To access files or
execute commands anywhere else on the file system, Path Traversal attacks
utilize special-character sequences.

The most basic Path Traversal attack uses the "../" special-character
sequence to alter the resource location requested in the URL. Although most
popular web servers will prevent this technique from escaping the web document
root, alternate encodings of the "../" sequence may help bypass the
security filters. These method variations include valid and invalid
Unicode-encoding ("..%u2216" or "..%c0%af") of the forward slash
character, backslash characters ("..\") on Windows-based servers,
URL-encoded characters ("%2e%2e%2f"), and double URL encoding
("..%255c") of the backslash character.

Even if the web server properly restricts Path Traversal attempts in the URL
path, a web application itself may still be vulnerable due to improper handling
of user-supplied input. This is a common problem of web applications that use
template mechanisms or load static text from files. In variations of the attack,
the original URL parameter value is substituted with the filename of one of the
web application’s dynamic scripts. Consequently, the results can reveal
source code because the file is interpreted as text instead of an executable
script. These techniques often employ additional special characters such as the
dot (".") to reveal the listing of the current working directory, or
"%00" NUL characters in order to bypass rudimentary file extension
checks.

Path Traversal Examples

Path Traversal Attacks Against a Web Server
GET /../../../../../some/file HTTP/1.0
GET /..%255c..%255c..%255csome/file HTTP/1.0
GET /..%u2216..%u2216some/file HTTP/1.0

Path Traversal Attacks Against a Web Application
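The original request is not reproduced here; based on the description that
follows, it would take roughly this form, where the script name foo.cgi and
the home parameter come from the surrounding text and the exact path is an
assumption:

GET /foo.cgi?home=foo.cgi HTTP/1.0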

In the previous example, the web application reveals the source code of the
foo.cgi file because the value of the home variable was used as
content. Notice that in this case, the attacker does not need to submit any
invalid characters or any path traversal characters for the attack to succeed.
The attacker has targeted another file in the same directory as index.htm.

Path Traversal Attacks Against a Web Application Using Special-Character Sequences
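The original request is not reproduced here; based on the description that
follows, it would be of roughly this shape. The calling script view.cgi is
hypothetical; the home parameter, the "../" traversal into /scripts, the
foo.cgi target, and the "%00" sequence come from the surrounding text:

GET /cgi-bin/view.cgi?home=../scripts/foo.cgi%00 HTTP/1.0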

In this example, the web application reveals the source code of the
foo.cgi file by using special-characters sequences. The "../"
sequence was used to traverse one directory above the current and enter the
/scripts directory. The "%00" sequence was used both to bypass file
extension check and snip off the extension when the file was read in.

Apache Countermeasures for Path Traversal Attacks

Ensure the user level of the web server or web application is given the least
amount of read permissions possible for files outside of the web document root.
This also applies to scripting engines or modules necessary to interpret dynamic
pages for the web application. We addressed this step at the end of the CIS
Apache Benchmark document when we updated the permissions on the different
directories to remove READ permissions.

Normalize all path references before applying security checks. When the web
server decodes path and filenames, it should parse each encoding scheme it
encounters before applying security checks on the supplied data and submitting
the value to the file access function. Mod_Security has numerous
normalizing checks: URL decoding and removing evasion attempts such as directory
self-referencing.
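Assuming Mod_Security 1.x, which performs URL decoding and path normalization
before applying filters, the following directives additionally reject requests
that use invalid URL or Unicode encodings:

SecFilterCheckURLEncoding On
SecFilterCheckUnicodeEncoding On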

If filenames will be passed in URL parameters, then use a hard-coded file
extension constant to limit access to specific file types. Append this constant
to all filenames. Also, make sure to remove all NULL-character (%00) sequences
in order to prevent attacks that bypass this type of check. (Some interpreted
scripting languages permit NULL characters within a string, even though the
underlying operating system truncates strings at the first NULL character.) This
prevents directory traversal attacks within the web document root that attempt
to view dynamic script files.

Validate all input so that only the expected character set is accepted (such
as alphanumeric). The validation routine should be especially aware of shell
meta-characters such as path-related characters (/ and \) and command
concatenation characters (&& for Windows shells and semi-colon for Unix
shells). Set a hard limit for the length of a user-supplied value. Note that
this step should be applied to every parameter passed between the client and
server, not just the parameters expected to be modified by the user through text
boxes or similar input fields. We can create a Mod_Security filter for
the foo.cgi script to help restrict the type of file that may be referenced in
the "home" parameter.

Predictable Resource Location

Predictable Resource Location is an attack technique used to uncover hidden
web site content and functionality. By making educated guesses, an attacker
performs a brute force search for content that is not intended for public viewing.
Temporary files, backup files, configuration files, and sample files are all
examples of potentially leftover files. These brute force searches are easy
because hidden files will often have common naming conventions and reside in
standard locations. These files may disclose sensitive information about web
application internals, database information, passwords, machine names, file
paths to other sensitive areas, or possibly contain vulnerabilities. Disclosure
of this information is valuable to an attacker. Predictable Resource Location is
also known as Forced Browsing, File Enumeration, Directory Enumeration, and so
forth.

Predictable Resource Location Examples

Any attacker can make arbitrary file or directory requests to any publicly
available web server. The existence of a resource can be determined by analyzing
the web server HTTP response codes. There are several Predictable Resource
Location attack variations.

Blind Searches for Common Files and Directories

/admin/
/backup/
/logs/
/vulnerable_file.cgi

Adding Extensions to an Existing Filename (/test.asp)

/test.asp.bak
/test.bak
/test

Apache Countermeasures for Predictable Resource Location Attacks

To prevent a successful Predictable Resource Location attack and protect
against sensitive file misuse, there are two recommended solutions. First,
remove files that are not intended for public viewing from all accessible web
server directories. Once these files have been removed, you can create security
filters to identify if someone probes for these files. Here are some example
Mod_Security filters that would catch this action:
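Assuming Mod_Security 1.x, filters such as the following would flag probes for
common hidden directories and backup extensions; the directory and extension
lists are illustrative and should be extended for your environment:

SecFilterSelective REQUEST_URI "^/(admin|backup|logs)/" "deny,log,status:404"
SecFilterSelective REQUEST_URI "\.(bak|old|orig)$" "deny,log,status:404"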