ModSecurity Blog: ModSecurity Rules

27 October 2010

Detecting Malice with ModSecurity: GeoLocation Data

I would like to introduce a new blog series entitled - Detecting Malice with ModSecurity and will be used as an alternative to the Advanced Topic of the Week blog posts. The the idea is to take the outstanding web application intrusion/fraud detection concepts introduced by my friend Robert "Rsnake" Hansen in his book Detecting Malice and provide real-life implementation examples using ModSecurity. Rsnake has given me the green light to provide actual snippets of text/quotes directly from his book! The format will be that Rsnake will setup the underlying issues and methods to identify bad behavior and then I will present some practical implementations using ModSecurity.

In this installment of Detecting Malice with ModSecurity, we will discuss how to use Geolocation data.

IP addresses aren’t quite like street addresses, but they’re sometimes close. There are a number of projects that correlate, with varying success, IP address information to geographic locations. Some very large retailers help these databases by tying in IP address information collected along with address information given by their users during registration. Others tie in ISP information and geographic information with the IP address – a far less precise method in practice. Yet others still use ping time and hops to get extremely accurate measurement estimates of physical location.

The following information is typically available in GeoIP databases:

 Country

 Region (for some countries)

 City

 Organization (typically from the WHOIS database)

 ISP

 Connection speed

 Domain name

There are lots of reasons you may be interested in the physical location of your users. Advertisers are always on the lookout for ways to make their ads more interesting and meaningful (because that increases their click-through rate), and serving local ads is a good way to achieve that. Relevance is key, and that can be optimized by targeting the results to a user’s location.

Geographic information can also give you clues to the user’s disposition. If you are running a website in the United States and someone from another country is accessing your site, it may tell you quite a bit about the possible intentions of that user. That becomes more interesting if the site is disliked in principle by the population of other countries, or doesn’t do business with that country in particular. For example, many sites are totally blocked by the Chinese’s system of firewalls because the Chinese government believes certain words or thoughts are disruptive to the government’s goals. If someone from China IP space visits your site despite you being blocked by the Chinese government’s firewalls, there is a high likelihood the traffic is state sponsored.

Similarly, if you know that a large percentage of your fraud comes from one location in a particular high crime area of a certain city, it might be useful to know that a user is accessing your site from a cyber café in that same high crime area. Perhaps you can put a hold on that shipment until someone can call the user and verify the recipient. Also if you know that fraud tends to happen in certain types of socio- economic geographies, it could help you heuristically to determine the weighted score of that user. This sort of information may not be available to you in your logs, but could easily be mined through external sources of relevant crime information. Companies like ESRI heavily focus on mapping demographics, for instance.

In terms of history, it doesn’t hurt to look back into time and see if the IP address of the user has ever connected to you before. That information combined with their status with the website can be useful information. Another thing to consider is the type of connection used. If you know the IP space belongs to a DSL provider, it may point to a normal user. If it points to a small hosting provider, and in particular a host that is also a web server, it is highly likely that it has been compromised. Knowing if the IP belongs to a datacenter or a traditional broadband or modem IP range is useful information.

Knowing the physical location can help in other ways as well. If your website does no business in certain states or with certain countries due to legislative issues, it may be useful to know that someone is attempting to access your site from those states. This is often true with online casinos, which cannot do business with people in the United States where the practice is prohibited. Alerting the user of that fact can help reduce the potential legal exposure of doing business.

One concept utilizing physical location is called, “defense condition.” Normally, you allow traffic from everywhere, but if you believe that you're being attacked you can switch to yellow (the meaning of which is specific to the websites in question but, for example, can mean that your website blocks suspicious traffic coming from certain countries) or red (your websites block any traffic from certain countries). While this may seem like a good idea, in some cases this can actually be used against a website to get a company to block other legitimate users, so the use of this concept is cautioned against.

Utilities like MaxMind’s geoipaddress lookup can give you high level information about the location of most IP addresses for free. There are other products that can give you a lot more information about the specific location of the IP address in question, which can be extremely helpful if you are concerned about certain traffic origins.

$ geoiplookup.exe 157.86.173.251 GeoIP Country Edition: BR, Brazil

There has been quite a bit of research into this concept of bad geographic origins. For instance McAfee published a study citing Hong Kong (.hk), China (.cn) and the Philippines (.ph) as the three most likely top level domains to pose a security risk to their visitors. While these kinds of studies are interesting, it’s important not to simply cast blame on an entire country, but rather think about why this is the case – which can give you far more granular and useful information. It should be noted that this study talks about the danger that websites pose to internet surfers, not to websites in general, but there may be a number of ways in which your website can contact or include portions of other sites, so it still may be applicable.

Using GeoLocation Data in ModSecurity

ModSecurity supports using geolocation data through the integration with the free MaxMind GeoLite Country or GeoLite City databases. The first step is to download the database of your choice and put it somewhere on the local filesystem where ModSecurity can use it (for example in the same directory as the Core Rule Set).

Before you can use the GeoIP data, you first have to use the SecGeoLookupDb directive:

SecGeoLookupDb

Description: Defines the path to the geographical database file.

Syntax:SecGeoLookupDb /path/to/db

Example Usage:SecGeoLookupDb /usr/local/geo/data/GeoLiteCity.dat

Processing Phase: N/A

Scope: Any

Version: 2.5.0

Dependencies/Notes: Check out maxmind.com for free database files.

Once the SecGeoLookupDb directive is active, you then need to use a rule similar to the following which will pass the client's IP address (REMOTE_ADDR) to the @geoLookup operator:

The vast majority of organizations are global and would not want to implement this type of restriction for day-to-day operations. This type of rule is useful, however, when the organization wants to initiate a more restrictive "Defense Condition Level" when under a direct attack. You could make the previous rule conditional and only used when under an increased DefCon Level:

While this example works, one of the big challenges in effectively using the concept of DefCon Levels with ModSecurity rules is the fact that you would have to initiate an Apache restart/reload in order for these DefCon Level variable data to be used. This becomes a real scaling challenge if you are protecting a large server farm. Under these types of scenarios, you can actually utilize the concept of dynamic rules which are activated based on web requests from authorized sources.

Imagine you are running a SOC and you identify that your web sites are under attack and you want to initiate the new DefCon Level restriction rules to only allow requests from specific geographic regions. Rather then needing to push out new Apache/ModSecurity configs to all web servers and initiate restarts, you could alternatively initiate a web request from an authorized web client system from within the SOC. This client would have a specific IP address, User-Agent String and make a request to a specific URL with a specific parameter payload. When this request is received by ModSecurity, the rules would then set a new DefCon level variable in a persistent Global collection. When this happens, the GeoIP restriction rules would become active. Once the threat has passed and the DefCon Level has been restored, a 2nd request can be sent out to all web servers to remove the DefCon Level variable and return the rules to their normal processing. Again, the advantage of this approach is that it is a much faster method to toggle on/off a subset of rules vs having to update Apache/ModSecurity configs and initiate restarts.

This rule does not block the request but it does increase the "TX:FRAUD_SCORE" variable data. The TX data can be increased by other rules as well, such as those doing IP Reputation checks with @rbl, and then it can be evaluated at the end of the request phase when a decisions is made after correlating all detection data.

Per-User GeoIP Data

Another Fraud related concept that can use GeoIP data is to track the Country/City data used by each user. You can save this data in a per-user persistent collection and simply raise the fraud_score if the data changes. This concept can be implemented either across multiple user sessions or for possible session hijacking detection during the course of one session.

Here are some example rules that will save the GEO Country Code data in a persistent User collection. It will count the number of times that user has connected from that same country and once it is gone over the defined threshold (10), it will then issue alerts when the use logs in from a different country.

The final example shows how you can identify if the GEO Country Code data changes during the course of a Session (using JSESSIONID as the example). The key is to save the GEO:COUNTRY_CODE data when the application issues a Set-Cookie SessionID and then validate it on subsequent requests.

As a result of the acquisition of Breach Security (and thus ModSecurity) by Trustwave, we thought that it was a good time to run another User Survey to get a better understanding of how the community is using ModSecurity and, most importantly, how we can make it better. We ran a survey during the month of September 2010. Thanks to everyone who took the time to respond as it will help to steer the development of ModSecurity.

We plan to run these official user surveys once a year, however don't forget that you can always voice your concerns and bring up issues on the ModSecurity mail-list and also in the Jira ticketing system. Speaking of Jira, I wanted to remind everyone that you can always log into Jira to see the latest ticket/roadmap items. And taking this one step further, you can actually VOTE on an item which will help us to prioritize which issues get fixed first and also which feature requests will make it into different versions.

ModSecurity User Survey 2010 Results

You can review the complete stats here. For this blog post, however, I wanted to highlight the most interesting questions/results and provide a bit of commentary.

﻿﻿﻿ Analysis:

As you can see, ModSecurity is being used widely across vertical markets. It is interesting to see that the top current users are now Consultants. I attribute this mostly to the emerging importance of using ModSecurity as a Virtual Patching tool to mitigate identified vulnerabilities.

Analysis:

As mentioned in the previous section, virtual patching is probably the most popular use-case scenario for initially deploying ModSecurity (or any WAF for that matter). Some other interesting benefits are blocking technical/sensitive data leakages and also increased HTTP transactional logging. Speaking of the latter, I can attest to the effectiveness of promoting ModSecurity's Audit Engine capabilities to help with operational trouble-shooting. This is how I initially pitched deploying ModSecurity to the web server admins back in the day. By showing them complete HTTP transactions, they were able to more quickly fix issues.

Analysis:

I am hoping that the @modsecurity Twitter account will pick up some more followers. :)

Analysis:

The responses to this question were pretty much as expected. There any different ways to try and address web application security issues and organizations should use them all. The combination of conducting static/dynamic web application testing and then implementing custom virtual patches is gaining momentum as there seem to be more and more scenarios where identified vulns can't be fixed in the code at all or it will take a significant period of time. In the former case, a virtual patch is about the only option, and in the latter case, a virtual patch can at least provide some level of mitigation until the source code level fix is implemented.

Analysis:

There are two main rationales for wanting custom rule sets built for a specific piece of software -

To reduce false positives

To reduce false negatives

These are constant battles with "generic" signatures as they do not know of exact attack vector locations. Application rule set packages will need to either have a listing of known vuln signatures or a positive security profile created. In either case, it will require on-going monitoring of updates to public software to identify new vulns and also create new positive security profiles when new functionality is added. Trustwave SpiderLabs is researching the feasibility of creating a real-time rules feed for newly released webappsec vulns in public software. We are are also looking at adding in learning capabilities to ModSecurity by using new Lua scripts for profiling web apps.

Analysis:

The good news is that we are working on plans to implement just about all of these features/capabilities into ModSecurity :) Keep an eye out for future updates.

Analysis:

As you can see, there is a pretty wide range of the number of sites being protected. Many users use ModSecurity just to protect one local web site while there are others who use ModSecurity to protect entire server farms consisting of hundreds of systems. That is the beauty of the embedded deployment model - scale.

Analysis:

I believe that there is a misconception that ModSecurity can only be used in an open-source/LAMP type of deployment. As evidenced by the responses to this question, ModSecurity is platform/application language agnostic. It will protect any type of web application including commercial/custom coded applications that are used for ecommerce, banking, etc...

Analysis:

This question also proves the point raised in the previous section and that is that if you have an Apache reverse proxy with ModSecurity, you can front-end any back-end web applications with WAF technology. I know of many users who use this setup to protect different back-end apps including IIS/ASP.Net types of technology.

Analysis:

Woohoo! I am glad to see that the vast majority of users are using the newest ModSecurity versions. The main hurdle to upgrading is if the currently deployed Apache version is the 1.3 branch then they can't use the newer code.

Analysis:

Pretty evenly split between embedded mode vs. proxy mode.

Analysis:

Ideally, all users would be using the OWASP CRS (with local exceptions) along with their own custom rules for identified virtual patches. I would venture to guess that the main reason why the people who are *only* using custom rules is that they ran into false positive issues with the CRS. Making exceptions easier is a major goal of future ModSecurity/CRS releases.

Analysis:

The total number of rules running may become important if you start to experience performance issues such as higher latency of transactions or spikes in CPU/RAM. The total number of rules isn't usually the biggest concern but rather if you have any specific rules that are not performing well.

Analysis:

This is an interesting question when you consider how much time the average user has to review ModSecurity alerts/audit log data. Keep in mind that this number is usually impacted by both untuned rules and by the amount of attack traffic you receive. Even if you tune your rules, however, you may still get a rather large number of events if your site is actively being probed/attacked. If you want high fidelity alerts (meaning ones in which you may need to take some form of incident response action upon) then you will have to spend a bit of time up front to tune your rules.

Analysis:

Managing ModSecurity audit events is a critical issue from a situational awareness perspective. The two best options currently are:

Christian Bockermann's AuditConsole - it has really taken the concept created by the original ModSecurity community console and taken it to new heights. It is highly recommended as it was built from the ground up to handle ModSecurity audit log data.

SIEM Integration - it is rather easy to reconfigure your Apache host to send its error_log data (which will include the short ModSecurity message) onto a remote SIEM host via syslog. Your mileage will vary though as to how well the SIEM is able to parse, search and report on ModSecurity messages.

Analysis:

This task is certainly tricky... It is often tough for the average user to confirm if a ModSecurity event is accurate or may need some level of tuning to handle authorized functionality. When I work with commercial customers who are implementing WAFs, I often refer them to the excellent OWASP Best Practices: Use of Web Application Firewall document. Specifically, organizations should make sure that they have the proper staff to administer the WAF. In this case, we are referring to the WAF Application Manager role:

8.3.2 WAF application manager (per application)

Tasks:

Implementation and maintainance of the WAF configuration specific to the application

Monitoring and analysis of the log files (at least on the second level)

Contact for error messages, in particular false positives analysis in collaboration with the application manager

Close cooperation with the WAF application managers and platform managers

Test of WAF functionalities for the application, especially when deploying new versions of the application

Knowledge:

In-depth knowledge of the WAF configuration in relation to application-specific security mechanism

Very good knowledge of the behaviour of the application, in particular input, output, uploads, downloads, character sets, etc.

Analysis:

In general, a false positive rate of >10% isn't bad. The main issue is when you fact in any blocking methods being used. If a rule is incorrectly disrupting non-malicious user activities then it can become a big problem. Interestingly, anyone who actually answered this question is more then likely in a smaller subset of ModSecurity users who actually do spend time reviewing their logs! People who don't review their logs can't answer this question.

Analysis:

The people who answered the they are in DetectionOnly mode more than likely don't have enough time to properly tune their rules to their application environment which means they have a higher degree of false positives. The users who answered that they are in a Blocking mode have been able to tune and therefore have a higher comfort level with taking action.

Analysis:

Let me ask a question - an attacker figures out some sort of evasion for your current signatures, how are you ever going to know what happened? Or look at it from this perspective - say that you identify someone attacking your site and then you initiate incident response and the CISO asks you to provide logs of *everything* that user did. Will you be able to do this? Unfortunately, most users run ModSecurity in an "Alert-centric" mode in which audit logs are only created when rules match. This is a shame as it is missing one of its greatest strengths - HTTP Auditing. If you want to do session reconstruction, you MUST audit log all transactions. Now, this being said, you can still choose to not log requests for static content (such as images, etc...) and this will drastically cut down on the number of transactions being logged.

Analysis:

While there are use-cases for utilizing different disruptive actions, I am becoming more and more of a fan simply emulating how the application responds currently to similar threats. Does it do a redirect back to the homepage or throw a 403 alerts? Whatever it is, simply mimic that response so that it is not as easy for an advanced attacker to enumerate that fact that you are running a WAF.

Analysis:

Hopefully the new "Advanced Topic of the Week" blogpost series is helping to shed some light on these features and capabilities.

Analysis:

As I mentioned in the previous section, we are trying to provide more data about how to actually use these advanced ModSecurity features.

Wouldn't it be cool if your WAF could share its data with the application it is protecting? This concept is similar to anti-SPAM SMTP apps that will add additional mime headers to emails providing the SPAM detection analysis information. The CRS is attempting to mimic this concept at the HTTP layer by adding additional request headers that provide insight into any ModSecurity events that may have triggered during processing. The advantage of this approach is that it allows a WAF to be in a detection-only mode while still providing attack data to the destination application server. The recieving app server may then inspect the WAF request headers and make a determination whether or not to process the transaction. This concept is valuable in distributed web environments and hosting architectures where a determination to block may only be appropriate at the destination app server.

This concept has actually been discussed as part of the OWASP AppSensor Project and we have added a new Detection Point for it entitled - RP2: External User Behavior

This information can be used by the application to contribute to its knowleage about a potential attacker. In some cases, the information could be detected by the application itself (e.g. XSS pattern black listing), but may be more effectively identified by the external device, or is not known to the application normally (e.g. requests for missing resources that the web server sees, but does not pass onto the application).

CONSIDERATION

The greater the knowledge a device or system has about the application, the greater confidence can be given to evidence of suspicious behaviour. Therefore, for example, attempted SQL injection detexcted by a web application firewall (WAF) might be given greater weight than information from a network firewall about the IP address.

The power of AppSensor is its accuracy and low false positive rate, and the usage of external data should be carefully assessed to ensure it does not contribute to a higher false positive rate.

EXAMPLES

Example 1: An IDS has detected suspicious activity by a particular IP address, and this is used to temporarily tighten the attack detection thresholds for requests from all users in the same IP address range.

Example 2: An application is using the ModSecurity web application firewall with the Core Rule Set, and utilises the anomaly score data passed forward in the X-WAF-Events and X-WAF-Score HTTP headers (optional rules in modsecurity_crs_49_header_tagging.conf) to adjust the level of application logging for each user.

Example 3: Information from an instance of PHPIDS suggests request data may be malicious.

Introduction

In last week's post on Identifying Improper Output Handling, we showed a method to use ModSecurity to identify if client request data is echoed back in html responses thus identifying a potential XSS vector. While this can prove useful to a large chunk of XSS flaws, it is not foolproof as there are many scenarios where the inbound data is altered slightly by the application and thus turns a benign payload into something executable (see the Giorgio Maone's Lost in Translation post for a perfect example with ASP classic). In this situation, the example rules to identify improper output handling wouldn't have matched...

There is a lot a WAF can do with outbound traffic to help protect web applications from information leakages. There has not been as much progress made, however, in analyzing, manipulating or adding data to outbound dynamic code being sent from the web application to the clients. This is the concept that I want to discuss today.

Concept

Previous versions of ModSecurity did not alter any of the actual transactional data (either inbound or outbound). ModSecurity would make copies of the data, place it into memory and then apply all data transformations, etc... and it would then decide what disruptive action to take if there was a rule match on the data. While this process works well in defense of the vast majority of web application security issues, there are still certain situations where it is limited.

Client-side security issues are difficult to address in this architecture since the WAF has no visibility on the client (inside the DOM of the browser). With the new Content Injection capabilities in ModSecurity, we have added two actions which will allow ModSecurity rule writers to either "prepend" or "append" any text data to text-based (html) outbound data content.

The really useful idea is to inject a JavaScript fragment at the top of all outgoing HTML pages. The advantage of using ModSecurity's Content Injection approach is that you can be assured that your code runs prior to any other user-supplied code that is leveraging an XSS flaw. For example you could detect JavaScript code in places where it is not expected, look for weird HTML/JavaScript code indicative of attacks, remove external links, and so on. While full support for DOM manipulation on the server is not available yet, ModSecurity does support content injection, where you can inject stuff at the beginning or at the end of the page. The original idea behind this this feature was to make DOM XSS detection possible within the client browser. The idea is to inject a chunk of JavaScript to analyse the request URI from inside the browser to detect attacks.

Content Injection Directives and Variable Usage

Description: Appends text given as parameter to the end of response body. For this action to work content injection must be enabled by setting SecContentInjection to On. Also make sure you check the content type of the response before you make changes to it (e.g. you don't want to inject stuff into images).

Description: Prepends text given as parameter to the response body. For this action to work content injection must be enabled by setting SecContentInjection to On. Also make sure you check the content type of the response before you make changes to it (e.g. you don't want to inject stuff into images).

This rule set enables the Content Injection capabilities an then sets a default action. The important point to see on the SecDefaultAction line is the "setvar:tx.alert=1" action. What this will do is set a transactional variable if any of the rules trigger a match. The last two lines of the configuration are a chained rule set that runs in phase:3. The first part of the chain will simply look for the "alert" tx variable. If it is set, that means that the client's request has triggered one of the ModSecurity rules and is thus some form of attack. The second part of the rule then makes sure that the response data is of the correct content type (text/html). If so, this rule will then insert some javascript that will issue an alert pop-up box asking them why they are trying to hack the web site :)

Let's take a look at the modsec_debug.log file to see exactly how this new operator is functioning:

One of the main challenges in secure web application development is how to mix user supplied content and the server content. This, in its most basic form has created one ofthe most common vulnerabilities now a days that affect most if not all websites in the web, the so called Cross Site Scripting (XSS).

A good solution would be a technology capable of limiting the scope of XSS attacks, by a way to tell the browser what trusted code is, and what is not. This idea has been out for ages, but it has always required some sort of browser native support. ACS (Active Content Signature) comes to solve this problem, by creating a JavaScript code that will work in all browsers, without the need to install anything, and that will completely remove from the DOM any type of dangerous code.

I will start by demonstrating how ACS solves XSS issues, without the need ofthe server to do any type of parsing or filtering, and without endangering the user. ACS will be appended at the beginning of the webpage's HTML code. As a simple external JavaScriptcode, something like:

When this script is loaded, it will automatically stop the parsing of the rest ofthe HTML page. It will, then recreate the DOM itself, taking out dangerous snippets of code (like event handlers,or scripts), making the browser display it's content's without any danger. Now, if the user wants to make exceptions so their own scripts are loaded,they can do so, ina very simple way... adding a signature.

After discussing my goals with Eduardo and working through some tests, we came up with a working proof of concept integration using ModSecurity's Content Injection capabilities to prepend the ACS JS code to the top of selected web pages. The advantage of this approach is that you don't have to alter the html source of any of your pages, as ModSecurity will prepend this data for you on the fly.

Here is what you need to test out this concept:

Download the ACS JS file from Eduardo's site and place it in your site's DocumentRoot as clients will need to access this file when our injected JS executes.

Download the localized ACS file from the ModSecurity site and place it within your DocumentRoot as well as the clients will also need to access this file. This is the file that creates the temporary DOM, validates/allows the authorized JS to run on your site and then recreates the DOM. You will need to update the "default-src" regular expression to allow JS to run from authorized domains (such as Google Analytics, 3rd party sites, etc...)

Add the following ModSecurity directives/rules to your ModSecurity configuration:

In order to help facilitate community testing of this proof of concept, we have created an online demo. We encourage the community to help test this implementation to help identify any evasions or bugs. You can toggle on/off the XSS Defense to ensure that your payload executes and then continue testing with the defense in place.

Current Limitations

The current implementation of ACS does not allow for inline scripts, so web pages would need to be re-written to reference external src files.

There are still browser quirks that cause parsing errors in the new DOM rendered by ACS (surprise, surprise in IE...)

The blacklist tags in the ACS code still need to be expanded

Conclusion

We hope that this concept helps to provide a layered defense for XSS issues and to show yet another advanced feature of ModSecurity!

As I sat down this morning thinking about an appropriate topic for this week's blog post, I quickly found my subject - XSS flaws. The Twitter-verse exploded this morning with an outbreak of an XSS worm that is using the onmouseover event to propagate itself (technically I believe that this is a CSRF attack as it is forcing the user to send a Twitter update that they didn't intend to send... but I digress.). I quickly added this incident into the WASC Web Hacking Incident Database.

How Prevalent is XSS?

Just how big of a problem is XSS? Most web application security statistics projects have it listed right at the top -

First of all, make sure that all of your developers read, then reread, and then really sit down and take notes and seriously get a game plan together to address the issues raised in the OWASP XSS Prevention Cheatsheet document. Make sure you understand this section on untrusted data:

Untrusted data is most often data that comes from the HTTP request, in the form of URL parameters, form fields, headers, or cookies. But data that comes from databases, web services, and other sources is frequently untrusted from a security perspective. That is, it might not have been perfectly validated. The OWASP Code Review Guide has a decent list of methods that return untrusted data in various languages, but you should be careful about your own methods as well.

Untrusted data should always be treated as though it contains an attack. That means you should not send it anywhere without taking steps to make sure that any attacks are detected and neutralized. As applications get more and more interconnected, the likelihood of a buried attack being decoded or executed by a downstream interpreter increases rapidly.

Traditionally, input validation has been the preferred approach for handling untrusted data. However, input validation is not a great solution for injection attacks. First, input validation is typically done when the data is received, before the destination is known. That means that we don't know which characters might be significant in the target interpreter. Second, and possibly even more importantly, applications must allow potentially harmful characters in. For example, should poor Mr. O'Malley be prevented from registering in the database simply because SQL considers ' a special character?

While input validation and looking for generic attack payloads is useful for identifying attack attempts, it is not the same as identifying the actual underlying application weakness that is exploited by XSS attacks and that is Improper Output Handling. Essentially, what you need is Dynamic Taint Propagation detection where you can track where unstrusted user data is insecurely used within your application. For XSS you need to identify where user supplied data is echoed back to clients in an unescaped form.

Dynamic Taint Propagation with ModSecurity

In CRS v2.0.7, we added an optional ruleset called modsecurity_crs_55_application_defects.conf which, among other things, implements an experimental dynamic taint propagation ruleset to identity if/when/where an application does not properly output encode/escape user supplied data.

Keep in mind - this ruleset's goal is not to try and identify malicious payloads but rather to indirectly identify possible stored/reflected XSS attack surface points by flagging when an application resources does not properly output escape user supplied data. An advantage to this approach is that we don't have to be as concerned with evasion issues as they rules are not looking for malicious payloads. Instead, they will monitor as normal users interact with the application and will alert if the application does not correctly handle the data.

Using this type of application weakness detection, it is possible to identify the underlying issues in the application and get a plan together to address them rather than perpetually playing "Whack-a-Mole" with inbound attacks.

On first invocation of this action the collection will be empty (not taking the predefined variables into account - see initcol for more information). On subsequent invocations the contents of the collection (session, in this case) will be retrieved from storage. After initialisation takes place the variable SESSIONID will be available for use in the subsequent rules.This action understands each application maintains its own set of sessions. It will utilise the current web application ID to create a session namespace.

SESSION

This variable is a collection, available only after setsid is executed. Example: the following example shows how to initialize a SESSION collection with setsid, how to use setvar to increase the session.score values, how to set the session.blocked variable and finally how to deny the connection based on the session:blocked value.

#
# This rule file will identify outbound Set-Cookie/Set-Cookie2 response headers and
# then initiate the proper ModSecurity session persistent collection (setsid).
# The rules in this file are required if you plan to run other checks such as
# Session Hijacking, Missing HTTPOnly flag, etc...
#
#
# This rule set will identify subsequent SessionIDs being submitted by clients in
# Request Headers. First we check that the SessionID submitted is a valid one
#
SecMarker BEGIN_SESSION_STARTUP
SecRule REQUEST_COOKIES:'/(j?sessionid|(php)?sessid|(asp|jserv|jw)?session[-_]?(id)?|cf(id|token)|sid)/' ".*" "chain,phase:1,t:none,pass,nolog,auditlog,msg:'Invalid SessionID Submitted.',setsid:%{matched_var},setvar:tx.sessionid=%{matched_var},skipAfter:END_SESSION_STARTUP"
SecRule SESSION:VALID "!@eq 1" "t:none"
SecAction "phase:1,t:none,nolog,pass,setuid:%{session.username},setvar:session.sessionid=%{tx.sessionid}"
SecMarker END_SESSION_STARTUP

Further description of this ruleset is discussed below.

So What?

Why should you try and validate SessionIDs sent by clients? I am reminded of President Ronald Reagan's famous saying "Trust, But Verify." Put it this way, just because a client sends a SessionID within the Request Header Cookie data does not mean that it is a valid one. There are numerous attacks where malicious clients will send in bogus SessionIDs with the most prevalent one being Credential/Session Prediction where an attacker manipulates the SessionID data in hopes that they can achieve a Session Hijacking attack. Another attack scenario that was just recently identified is the 'Padding Oracle' Crypto Attack against ASP.Net applications where the attacker sends in bogus encrypted SessionID. Based on the errors sent back out my the application they may be able to identify the crypto keys which would allow them to possibly decrypt sniffed cookie data or forge tickets.

The key concept with mitigating these types of SessionID attacks is to simply ask yourself - Do you trust the client or the application? The critical security moment in identifying many application session attacks is to initiate session collection data only when the application hands out a SessionID in a Set-Cookie response header. If you use the ModSecurity setsid action at this point, you can then also set a new variable within the collection to mark the SessionID as "valid" meaning that it is one that application actually handed out to a client. With this Session collection data saved, you can then use validation rules that will extract out any SessionID data submitted by client within Request Header Cookie data and verify if the "valid" variable has been sent. If this variable is not present, then you know that the client is sending a bogus SessionID.

I am excited to announce that the OWASP ModSecurity Core Rule Set (CRS) has completed its official review and has been promoted to a Release Quality Project! I want to thank both Ivan Ristic and Leonardo Cavallari Militelli who served as project reviewers.

While this is certainly exciting news and will help to further promote ModSecurity and the Core Rule Set, there is still much work left to be done on the project. We will be expanding the CRS documentation, re-organizing the rules to more accurately categorize the quality of the rules (with relation to false positive potential), as well as, cross-referencing rules with the OWASP AppSensor project.

We are starting a new blog post series here on the ModSecurity site called "Advanced Feature of the Week" where we will be highlighting many of ModSecurity's really cool capabilities. These are the features that seldom used or fully understood by the average ModSecurity user however can provide detection of sophisticated attacks if used properly. It is our goal with these blog posts to help shed light on these unique features and to provide some real-world, in-the-trenches gotchas for successful usage of these features.

This blog post series will have the following major topic sections -

1) ModSecurity Reference Manual Information

Provide reference manual data.

2) Use Within the OWASP Core Rule Set (CRS)

Outline if/when/how the CRS is utilizing this feature.

3) So What?

Will provide some context as to why you as a user should even care about this capability. What advanced attack/vulnerability is this attempting to catch.

Reference Manual

validateByteRange

Description: Validates the byte range used in the variable falls into the specified range.

Example:

SecRule ARGS:text "@validateByteRange 10, 13, 32-126"

Note

You can force requests to consist only of bytes from a certain byte range. This can be useful to avoid stack overflow attacks (since they usually contain "random" binary content). Default range values are 0 and 255, i.e. all byte values are allowed. This directive does not check byte range in a POST payload when multipart/form-data encoding (file upload) is used. Doing so would prevent binary files from being uploaded. However, after the parameters are extracted from such request they are checked for a valid range.

validateByteRange is similar to the ModSecurity 1.X SecFilterForceByteRange Directive however since it works in a rule context, it has the following differences:

You can specify a different range for different variables.

It has an "event" context (id, msg....)

It is executed in the flow of rules rather than being a built in pre-check.

OWASP ModSecurity CRS

Use of @validateByteRange in the OWASP ModSecurity CRS (from the end of the modsecurity_crs_20_protocol_violations.conf file) -

#
# Restrict type of characters sent
# NOTE In order to be broad and support localized applications this rule
# only validates that NULL Is not used.
#
# The strict policy version also validates that protocol and application
# generated fields are limited to printable ASCII.
#
# -=[ Rule Logic ]=-
# This rule uses the @validateByteRange operator to look for Nul Bytes.
# If you set Paranoid Mode - it will check if your application use the range 32-126 for parameters.
#
# -=[ References ]=-
# http://i-technica.com/whitestuff/asciichart.html
#
SecRule ARGS|ARGS_NAMES|REQUEST_HEADERS|!REQUEST_HEADERS:Referer "@validateByteRange 1-255" \
"phase:2,rev:'2.0.8',pass,nolog,auditlog,msg:'Invalid character in request',id:'960901',tag:'PROTOCOL_VIOLATION/EVASION',tag:'WASCTC/WASC-28',tag:'OWASP_TOP_10/A1',tag:'OWASP_AppSensor/CIE3',tag:'PCI/6.5.2',severity:'4',t:none,t:urlDecodeUni,setvar:'tx.msg=%{rule.msg}',tag:'http://i-technica.com/whitestuff/asciichart.html',setvar:tx.anomaly_score=+%{tx.notice_anomaly_score},setvar:tx.protocol_violation_score=+%{tx.notice_anomaly_score},setvar:tx.%{rule.id}-PROTOCOL_VIOLATION/EVASION-%{matched_var_name}=%{matched_var}"
SecRule TX:PARANOID_MODE "@eq 1" "chain,phase:2,rev:'2.0.8',pass,nolog,auditlog,msg:'Invalid character in request',id:'960018',tag:'PROTOCOL_VIOLATION/EVASION',tag:'WASCTC/WASC-28',tag:'OWASP_TOP_10/A1',tag:'OWASP_AppSensor/CIE3',tag:'PCI/6.5.2',severity:'4',t:none,t:urlDecodeUni,tag:'http://i-technica.com/whitestuff/asciichart.html'"
SecRule REQUEST_URI|REQUEST_BODY|REQUEST_HEADERS_NAMES|REQUEST_HEADERS|!REQUEST_HEADERS:Referer|TX:HPP_DATA \
"@validateByteRange 32-126" \
"t:none,t:urlDecodeUni,setvar:'tx.msg=%{rule.msg}',setvar:tx.anomaly_score=+%{tx.notice_anomaly_score},setvar:tx.protocol_violation_score=+%{tx.notice_anomaly_score},setvar:tx.%{rule.id}-PROTOCOL_VIOLATION/EVASION-%{matched_var_name}=%{matched_var}"

As you can see, the CRS is, by default, only restricting the existence of Nul Bytes. If the user initiates the PARANOID_MODE variable, however, the CRS will restrict down the allowed byte range to only allow 32-126 which are the normal printable characters for US ASCII (space character through the tilde character).

So What?

Why would you need to use @validateByteRange to restrict anything more than Nul Bytes? The short answer is because of the potential of Impedance Mismatches between a security inspection system (IDS, IPS or WAF) and the target web application. The process of data normalization or canonicalization and how the destination web application handles best-fit mappings can cause issues with bypasses. Here is a great recent references for this issue.

Lost In Translation - Giorgia Maone (of the NoScript FF extension fame) outlines how ASP classic web apps attempt to do best-fit mappings of non-ASCII Unicode characters. One example issue is the following XSS payload

ASP classic, however, will try and do best-fit mapping and actually will normalize the data into working JS code -

<script>eval('alert("XSS")')</script>

The issue that this raises, for security inspection, is that the inbound payload will most likely not match most XSS regular expression payloads however the application itself will modify it into executable code!

So, this brings us to today's advanced ModSecurity feature - @validateByteRange. By restricting the allowed character byte ranges, you can help to identify when unexpected character code points are used. If the payload above is sent, you should receive an alert message similar to the following -