OWASP Hatkit Proxy Project

Main

The Hatkit Proxy is an intercepting HTTP/TCP proxy based on the OWASP Proxy.

Background

The primary purpose of the Hatkit Proxy is to provide a minimal, lightweight proxy that stores traffic in offline storage where further analysis can be performed, e.g. the kinds of analysis that are currently implemented by the proxies themselves (WebScarab/Burp/Paros etc.).

Also, since the HTTP traffic is stored in MongoDB, it is stored at an object level, retaining the structure of the parsed traffic.

Hatkit proxy features

The additions which have been implemented on top of the OWASP Proxy are:

Swing-based UI,

Interception capabilities with manual edit, both for TCP and HTTP traffic,

Syntax highlighting (html/form-data/http) based on JFlex,

Storage of http traffic into MongoDB database,

Possibilities to intercept in Fully Qualified mode (like all other http-proxies) OR Non-fully qualified mode. The latter means that interception is performed *after* the host has been parsed, thereby enabling the user to submit non-valid http content.

A set of filters to either ignore or process traffic which is routed to the proxy. The 'ignored' traffic will be streamed to the endpoint with minimal impact on performance.

Getting started

To use the proxy, download the zip-file which can be found on the source code repository (you can also use the direct-link on the release-page).

The proxy window should now pop up. Before the proxy actually starts, you need to configure some settings. The window has one tab for HTTP-proxy mode and another for TCP-proxy mode.

These settings are available:

Session

This is the name of the MongoDB database that will be used (and created if it does not exist). Defaults to the current date. Optional - if not supplied, no traffic will be stored.

WL Domains

This sets a list of domains that are whitelisted by the proxy. Any request that matches one of these domains is passed on to the next test, the blacklist test. Requests that do not match any of these domains are streamed to the target host with minimal processing, and the proxy does not store any information about them. If you leave this field blank, all domains are included. You do not have to specify subdomains; they are included automatically. Example: "google.com, ru" would include "a.b.c.google.com" and "evil.ru".
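The subdomain rule above can be sketched in a few lines of Python. This is only an illustration of the matching behaviour described here, not the proxy's actual code; the function names `parse_domain_list` and `domain_whitelisted` are hypothetical.

```python
def parse_domain_list(raw: str) -> list[str]:
    """Split a comma-separated 'WL Domains' value into lowercase entries."""
    return [part.strip().lower().strip(".") for part in raw.split(",") if part.strip()]


def domain_whitelisted(host: str, whitelist: list[str]) -> bool:
    """Return True if host equals a whitelisted domain or is a subdomain of one."""
    if not whitelist:
        # A blank field means that all domains are included.
        return True
    host = host.lower().rstrip(".")
    # Matching on ".entry" keeps "notgoogle.com" from matching "google.com".
    return any(host == entry or host.endswith("." + entry) for entry in whitelist)


whitelist = parse_domain_list("google.com, ru")
print(domain_whitelisted("a.b.c.google.com", whitelist))  # True
print(domain_whitelisted("evil.ru", whitelist))           # True
print(domain_whitelisted("notgoogle.com", whitelist))     # False
```

Comparing whole labels rather than raw string suffixes is the design choice that matters here: a plain suffix test would also let through unrelated hosts that merely end in the same characters.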

WL Networks

This sets a list of networks that are whitelisted by the proxy. If you leave this field blank, IP addresses are not checked (auto-pass). You can specify networks in two ways, in CIDR notation or with a wildcard, and single addresses are also accepted. Example:

"10.0.2.2/24, 10.1.0.1/32, 193.*, 192.168.*, 8.8.8.8"

Blacklist

This sets a blacklist for what is treated by the proxy. Blacklists are good for specifying images or static content you wish to avoid.

Tip: If you use white/black-lists, you should not need to use foxyproxy or similar tools. The blacklist is applied after the whitelist.
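To make the WL Networks formats concrete, here is a minimal Python sketch that accepts the entry styles from the example given for that setting (CIDR, trailing wildcard, single address). It is an assumption about how such matching could work, not the proxy's implementation; `network_whitelisted` is a hypothetical name, and only IPv4 is considered.

```python
import ipaddress


def network_whitelisted(address: str, raw_whitelist: str) -> bool:
    """Check an IPv4 address against a comma-separated 'WL Networks' value."""
    entries = [entry.strip() for entry in raw_whitelist.split(",") if entry.strip()]
    if not entries:
        # A blank field means that addresses are not checked (auto-pass).
        return True
    ip = ipaddress.ip_address(address)
    for entry in entries:
        if entry.endswith("*"):
            # Wildcard form such as "192.168.*": compare the dotted prefix.
            if address.startswith(entry[:-1]):
                return True
        else:
            # CIDR form or single address. strict=False tolerates entries
            # with host bits set, such as "10.0.2.2/24".
            if ip in ipaddress.ip_network(entry, strict=False):
                return True
    return False


networks = "10.0.2.2/24, 10.1.0.1/32, 193.*, 192.168.*, 8.8.8.8"
print(network_whitelisted("10.0.2.77", networks))  # True
print(network_whitelisted("8.8.4.4", networks))    # False
```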

Fwd proxy

Forwarding proxy. The format to use is e.g.

PROXY 127.0.0.1:8008

or

SOCKS 10.0.2.2:8080
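A small Python sketch of how a value in this format could be split into its parts. The parser and the `ForwardProxy` type are hypothetical and only cover the two keywords shown above; whether the proxy accepts other keywords is not stated here.

```python
from typing import NamedTuple


class ForwardProxy(NamedTuple):
    kind: str  # "PROXY" or "SOCKS", as in the examples above
    host: str
    port: int


def parse_forward_proxy(value: str) -> ForwardProxy:
    """Parse a 'Fwd proxy' value such as 'PROXY 127.0.0.1:8008'."""
    kind, _, endpoint = value.strip().partition(" ")
    kind = kind.upper()
    if kind not in ("PROXY", "SOCKS"):
        raise ValueError(f"unknown proxy type: {kind!r}")
    host, sep, port = endpoint.strip().rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {endpoint!r}")
    return ForwardProxy(kind, host, int(port))


print(parse_forward_proxy("PROXY 127.0.0.1:8008"))
print(parse_forward_proxy("SOCKS 10.0.2.2:8080"))
```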

Listen interface - where the proxy will listen

MongoDB

If you leave this field blank, traffic will not be stored, but you can still use the proxy to intercept and modify traffic.

Loglevel - well, how much do you want to see in the console?

Log ignored

If checked, the proxy will report in the console each time a request is ignored. Useful for trimming those WL/BL filters.

Log treated

If checked, the proxy will report in the console each time a request is processed by the proxy. Useful for trimming those WL/BL filters.

Fwd address

Specify the remote endpoint where you want all the traffic to be sent, e.g foobar.com:80

Listen interface

Specify where you want the tcp proxy to listen

SSL

If enabled, the proxy will listen to ssl connections and use ssl for the remote connection

These are all documented within the application: if you click the ?-button, you will see more information about the setting in question. Tip: Most of these settings can be modified later, so you don't have to restart the proxy to e.g. redefine the filters determining what is captured and what is ignored.

In order to actually store traffic, you also need to install MongoDB. Please see MongoDB for a suitable version for your platform. Note: MongoDB is usually also available through Linux package managers, if you want to do it the simple way:

sudo apt-get install mongodb

Running the proxy (HTTP mode)

The stats-pane contains some basic counters to show what is happening. One design principle of the proxy is that it should not increase its resource consumption by e.g. generating sitemaps. The only statistics measured are counters for the different request verbs and the response status codes.

The intercept-pane is where you start intercepting data and control whether you want syntax highlighting and whether to intercept in FQ or NFQ mode.

When the browser sends a request to a proxy, the request is fully qualified (FQ), i.e. the first line carries the complete URL, including scheme and host, rather than just the path.

The proxy then normally parses the request line into (host, port, isSsl, URI) and connects to the specified host:port, possibly using SSL. It then sends a non-fully qualified (NFQ) request, which contains only the path and query, for example:

"GET /gazonk?a=b HTTP/1.1"

In most other proxies, the interception is made *before* the proxy parses the browser request, so the user is always in FQ mode. With HatKit proxy, you can edit the request in NFQ mode if you want. These are the basic differences:

FQ mode:

Copy-paste compatibility with other proxies, e.g WebScarab and Burp.

The proxy will, by necessity, perform a bit of validation of the request, at least checking that host, port, isSSL and the URI can be parsed from the request line.

NFQ mode:

The proxy will do no validation of the request. You can type basically anything in the request, since the host, port and isSSL are already parsed. This means that you are not bound to HTTP, and if you are testing HTTP servers, you can malform the requests any way you want.

You can still change host/port/isSSL using the individual input fields available.
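To make the FQ/NFQ distinction concrete, the following Python sketch splits a fully qualified request line into (host, port, isSsl) plus the NFQ request line that would be sent to the server. The host `example.com` and the function name are assumptions for illustration. How the proxy itself derives isSsl is not described here, so the scheme test is a stand-in.

```python
from urllib.parse import urlsplit


def split_fq_request_line(line: str):
    """Return (host, port, is_ssl, nfq_line) for a fully qualified request line."""
    method, target, version = line.split(" ", 2)
    parts = urlsplit(target)
    if not parts.scheme or not parts.hostname:
        raise ValueError("request line is not fully qualified")
    # Assumption: SSL is inferred from the URL scheme.
    is_ssl = parts.scheme.lower() == "https"
    port = parts.port or (443 if is_ssl else 80)
    # The NFQ form keeps only path and query.
    uri = parts.path or "/"
    if parts.query:
        uri += "?" + parts.query
    return parts.hostname, port, is_ssl, f"{method} {uri} {version}"


print(split_fq_request_line("GET http://example.com/gazonk?a=b HTTP/1.1"))
# ('example.com', 80, False, 'GET /gazonk?a=b HTTP/1.1')
```

In FQ mode the user edits the input of a step like this one, so the line must still parse. In NFQ mode the user edits its output, which is why arbitrary content can be sent.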

All settings where the 'Apply'-button is enabled can be modified on-the-fly

Running the proxy (TCP mode)

Todo.

Issues

There will be a Trac for issue tracking, but in the meantime, please report any issues to the mailing list: owasp-hatkit-proxy-project@lists.owasp.org.

Known issues :

HTTP-intercept: Some buttons/checkboxes in the interception window do not work.

TCP-intercept: The statistics counters are incorrect.

Roadmap

Todo

Storage

The Hatkit Proxy is a 'recorder' which (optionally) records HTTP traffic into a MongoDB database. MongoDB is a document-oriented database, part of a group of databases commonly called "NoSQL" since they do not implement SQL.

NoSQL-type data storage is usually associated with massively parallel distributed systems with high requirements on scalability. However, for the Hatkit project, MongoDB was chosen for different reasons, since there are advantages to using it when storing data which fits the dynamic (schemaless) model. Having no schema enforced by the database does not imply that the database is just a disk-based hash table with unstructured data content. Instead, it can be argued that many NoSQL solutions are a lot like the (currently out-of-fashion) object databases, with the difference that they have more generic APIs (json/bson/http) which do not bind the data to any particular framework, application-specific classes or programming language. Certain kinds of data fit very well into these models.

HTTP traffic is very dynamic. Some requests are basically "GET / HTTP/1.1", while others contain forms or JSON and a multitude of headers. Using MongoDB, it is possible to represent the data at an object level.
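As a rough illustration of what "object level" can mean, the sketch below shows two dialogues of very different shape as Python dictionaries, which map directly onto MongoDB documents. The field layout is hypothetical: it only borrows the names request.parameters, request.body.parameters and response.body that are used in the query examples on this page, and is not the schema the proxy actually writes.

```python
import json

# A bare request: almost nothing to store.
simple_dialogue = {
    "request": {"method": "GET", "path": "/", "headers": {"Host": "example.com"}},
    "response": {"status": 200, "body": "<html>hello</html>"},
}

# A form post: the same top-level shape, but with nested parameters.
form_dialogue = {
    "request": {
        "method": "POST",
        "path": "/upload",
        "headers": {"Host": "example.com", "Content-Type": "application/x-www-form-urlencoded"},
        "parameters": {"filename": "report.pdf"},
        "body": {"parameters": {"__viewstate": "abc123", "comment": "hi"}},
    },
    "response": {"status": 302, "headers": {"Location": "/done"}, "body": ""},
}

# Both are plain nested structures, so they serialise without any schema.
print(json.dumps(form_dialogue["request"]["body"], indent=2))
```

Because no schema is enforced, the first document simply omits the fields it does not need instead of carrying empty columns.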

Another reason why a non-relational database was chosen, besides the traffic being very dynamic, is that HTTP traffic was perceived as being pretty much non-relational. Each HTTP dialogue is stored as an object with no foreign keys or relations to any other database objects.

This object representation of an HTTP dialogue allows different requests/responses to contain different amounts of information. For example, it would be possible (but perhaps not desirable) to store the entire HTML response as a DOM model, which would allow database queries on HTML tags and attributes. MongoDB has very powerful querying facilities. Since each object is stored with this structure in the database, it is possible to reach into objects during queries and perform e.g. these kinds of queries:

"give me response.body where request.parameters.filename exists", or

"give me request.body.parameters where request.body.parameters.__viewstate does not exist"
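In MongoDB's query language, the two queries above would typically be written as filter documents using the `$exists` operator with dotted field paths. The Python sketch below writes them out and evaluates them against a few in-memory documents, so that it runs without a database. The tiny evaluator (`get_path`, `matches`, `find`) is a stand-in written for this example, not MongoDB's implementation, and the sample documents are made up.

```python
def get_path(doc, dotted):
    """Follow a dotted path such as 'request.body.parameters'; return (found, value)."""
    current = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return False, None
        current = current[key]
    return True, current


def matches(doc, query):
    """Evaluate a filter made only of {field: {"$exists": bool}} clauses."""
    return all(get_path(doc, field)[0] == cond["$exists"] for field, cond in query.items())


def find(docs, query, field):
    """Return the given field from every matching document that has it."""
    results = []
    for doc in docs:
        if matches(doc, query):
            found, value = get_path(doc, field)
            if found:
                results.append(value)
    return results


docs = [
    {"request": {"parameters": {"filename": "a.txt"}}, "response": {"body": "file contents"}},
    {"request": {"body": {"parameters": {"user": "bob"}}}, "response": {"body": "ok"}},
    {"request": {"body": {"parameters": {"__viewstate": "abc", "user": "eve"}}}, "response": {"body": "ok"}},
]

has_filename = {"request.parameters.filename": {"$exists": True}}
no_viewstate = {"request.body.parameters.__viewstate": {"$exists": False}}

print(find(docs, has_filename, "response.body"))            # ['file contents']
print(find(docs, no_viewstate, "request.body.parameters"))  # [{'user': 'bob'}]
```

Against a real database, the same filter dictionaries could be passed to a driver call such as pymongo's `collection.find(filter, projection)`, with the requested field named in the projection.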

It is also possible to create JavaScript selection filters which are evaluated within the database. Such functionality can e.g. be used to investigate characteristics of the response HTML source code.

Project About

The Hatkit Proxy is an intercepting HTTP/TCP proxy based on the OWASP Proxy, but with several additions. These additions are:

Swing-based UI,

Interception capabilities with manual edit,

Syntax highlighting (html/form-data/http) based on JFlex,

Storage of http traffic into MongoDB database,

Interception capabilities of tcp-traffic,

Possibilities to intercept in Fully Qualified mode (like all other http-proxies) OR Non-fully qualified mode. The latter means that interception is performed *after* the host has been parsed, thereby enabling the user to submit non-valid http content.

The primary purpose of the Hatkit Proxy is to provide a minimal, lightweight proxy that stores traffic in offline storage where further analysis can be performed, e.g. the kinds of analysis that are currently implemented by the proxies themselves (WebScarab/Burp/Paros etc.).

Also, since the HTTP traffic is stored in MongoDB, it is stored at an object level, retaining the structure of the parsed traffic, which enables a user to perform advanced queries later.

The proxy should also be a good choice for 'defenders' who want to (temporarily?) monitor traffic. The proxy itself is, as stated, very lightweight, and the backend MongoDB storage scales very well and should be able to handle extreme amounts of data. This would allow defenders to perform advanced post-mortem or real-time analysis of incoming traffic.