July 7, 2016

Blind SQL injection vectors are considered either partial-blind or full-blind in terms of feedback provided to the attacker. Often SQL injection vulnerabilities will be blind when the web application is configured to show only generic error messages, but has not mitigated the SQL injection vulnerability, or when injecting into an INSERT statement where the attacker cannot see the added record. The feedback is determined by the configuration and defenses of the target web-server in conjunction with the type of statement being injected into.

In blind SQL injection vectors, the attackers traditionally have been limited to asking true or false queries about the back-end database records (successfully retrieving 1 bit per request). However, advancements presented in this article illustrate how to go beyond simple true / false queries, which have been the basis for blind-enumeration-based techniques and tools. Instead attackers are now able to extract multiple bits of information at a time (not just true / false), as well as perform targeted searches of the back-end records.

Due to the complexity of this attack, its best to automate exploitation; it is rather inefficient to sit and measure requests using a stopwatch. The steps required to initiate this attack vector consist of investigation, mathematical calculation, and finally the construction of the query itself from the mathematically derived values.

This article covers this technique exclusively for MySQL database systems, and our upcoming SQL injection workshop (currently available for pre-order at a 50% off discount for academic users) covers this technique for Microsoft SQL Server and PostgreSQL server as well. Furthermore, the SQL injection workshop features a comprehensive SQLi sandbox for testing, learning, and experimenting with time extraction and other techniques. For news about the workshop, follow us on twitter or subscribe to our subreddit.

Calculations & Equations

Each chunk of data naturally will occupy multiple bits. A form of bit shifting is used to address and extract specific bits (and not just the most significant or least significant). Unfortunately since bit-shift operators are filtered out in most real-world SQL injection vulnerabilities (e.g. << and >>), one should instead rely on equivalent mathematical functions, such as integer division & modulus (aka div mod).

This shifting process allows one to construct queries that address specific bits of target data, and shift up and down each byte (or chunk of bytes) to iterate over all bits. Integer division combined with modulus facilitates this process, as dividing by two is equivalent to bit-shifting to the right by one position. One can then multiply the specific bits by a minimum sleep time, and force the server to sleep for the resultant amount; however, we must modulo it by the factor value in order to operate within in the latency and timeout constraints of the server. Finally the data must be normalized into a numerical format, so that division and modulo can occur. This is accomplished with conversion and casting.

So given the variables defined in Table 1-A, values for the variables presented in Table 1-B are derived using equations:

The shift variable is used as a counter. Data is extracted by the following process:

Until s is zero the response time of the remote server sleeping for (d divided by (fs) modulo f) * m is locally divided by m, multiplied by (fs) and added to the accumulator while s is decremented.

When s has reached zero the accumulator contains a full block of normalized data in integer form. It is converted back into hexadecimal and then back into a string.

Infographic:Timing-based SQLi Extraction Fundamentals

Timeout Discovery

Given a full-blind SQL injection vulnerability, one must assess the timing constraints of the target, which is dictated by a combination of network quality, web server config, and database management system config. To begin, one must discover the SQL query timeout OR the maximum server script execution time (e.g. php.ini's max_exec_time). Under some circumstances equations are not actually based on the query timeout but rather the HTTPD timeout, or script timeout, which can prematurely return the http request before the query has finished executing.

A simple sleep injection is used to cause the database to hold the request until either the HTTPD timeout, query timeout, or script execution timeout has been reached. In some cases it is obvious which timeout the attack has encountered based on:

For this example, the amount of time this request took to respond will be called timeout or t.

In order to make sure the request didn't lock any threads, and that subsequent requests will work, one should wait to make any additional requests until 255 - t (in seconds) has elapsed.

This is important because if the HTTPD or script interpreter prematurely returns, the DBMS may still be running the sleep query, resulting in any further requests being denied, held up, or abandoned. This part entirely depends on the remote server configuration, and for a successful attack it is simply better to be safe than be sorry. Once this secondary wait time has elapsed, the next calculations can begin.

Minimum Sleep Time

The 'min_sleep' value (also called m for the equation) is a sleep increment that a single lag spike will not exceed. Any lag spike or network disruption exceeding 'min_sleep' (in seconds) can result in an inaccuracy or corruption of the acquired data. There are also times during heavy traffic that the remote server load may cause the page load time to increase, so running a ping for an hour may not be as reliable a metric as making a lot of HTTP requests (but this could generate more noise). For this reason, using a value of 1 for min_sleep is ill-advised unless running this attack over a LAN.

For these calculations, the hypothetical vulnerable query will be select "Testing123". Thus, the length of the result will be 10, and the timeout discovered previously will be 60. Example 1 illustrates this scenario.

Note: Even if the timeout value is lower than 60 (e.g. 30) and the min_sleep setting is as high as 10, that is still 4 possible values (two bits) that can be extracted from the server per request.

Maximum Bits Per Request

The maximum bits variable or "max_bits" represents the number of bits that can be retrieved in a single HTTP request. Given timeout t and min_sleep m, one can determine max_bits b using the equation presented in Formula 1:

This number is important for determining block/round size as well as for determining viable metrics using a division and modulus algorithm implemented to reduce the required number of syntax characters for successful exploitation utilizing timing extraction.

Factor

Factor f is important for selecting specific bits remotely as well as for reconstructing the data locally. The Factor is calculated as 2 to the power of max_bits b, as shown by Formula 2:

Formula 2:Defining the factorf = 2b

Block Size

The round size (or block size) is calculated based on the number of bits that can be extracted from the vulnerability in a single request. For example, when a vulnerability allows for the extraction of 3 bits per request, retrieving 1 byte would take 3 requests. Therefore, if one were to extract 3 bytes, but only examine 1 byte at a time, this would take 9 requests. However, if one were to examine 3 bytes at a time, rather than a single byte, 3 bytes would take only 8 requests to retrieve.

The 'number_of_bytes' is effectively the round size, or number of bytes to be examined per iteration. In order to calculate it one must find the least common multiple between 8 (the number of bits in a byte) and max_bits. The result of this is then divided by 8, per Formula 3 (shown in ruby in Example 2).

Formula 3:Block size discoveryblock_size = lcm(8,b)/8

Example 2:Block Size Determination

irb(main):007:0> number_of_bytes = 8.lcm(max_bits)/8
=> 3

Note: Due to the limitations of SQL servers, a single round may not exceed the maximum register width of the target system. This means that a 32-bit target's maximum block size is 4 bytes, and that 64-bit systems have a maximum round of 8 bytes.

Number of Shifts

The "number_of_shifts" variable (or s) is the number of HTTP requests required to fully extract a 'number_of_bytes' block from the server. The block size in bits is used to determine this value along with max_bits b. Formula 4 shows the math behind the logic in Example 3.

Iterations

Iterations is the number of blocks within the target data to be processed before extraction is successful, defined as the length of the result to be extracted divided by the 'number_of_bytes'. One could also view this as the number of rounds that exist in the target data to extract. If the 'number_of_bytes' does not evenly divide into the length of the query then an additional iteration must be added to the total. The purpose is to calculate how many times a round or 'block' will be extracted to fetch the complete query result.

irb(main):008:0> iterations = length / number_of_bytes
=> 3

Query Construction

After finishing the calculations, one can begin construction of the queries. The queries and the calculations work together in a series of steps which allows one to extract multiple bits of data per request to the server. This section will illustrate how to automate the construction and execution of queries for time extraction.

The following uses the sample data provided in Example 1. One should have finished all above calculation steps by this point.

The Substring: Extracting a single block

Utilizing 'number_of_bytes' variable (which in this case, the block size is 3 bytes), select a sub-string from the query using 'from ... for ...' syntax and MySQL's substring() function. Since this is simulating the first iteration of requests, the injection will start at position 1 in the result of the query, substring((select "Testing123") from 1 for 3). This would return "Tes", shown in Example 4.

Afterwards, change the hex result into decimal by a call to MySQL's conv() function, to change the base of the number from base 16, hexadecimal, to base 10, decimal. This effectively casts the substring's hexadecimal value into an integer value, shown in Example 5-B.

This allows one to apply any mathematical equation to the block to recreate the decimal number locally. This article presents an approach that uses integer division and modulus operators to avoid the syntax characters involved with bit shifting; however bit shifting is also a valid method of extracting the data.

The Query

Next, iterate from the previously calculated variable number_of_shifts (or s in the formulae) down to zero for the retrieval of this integer. After each iteration, apply a sequence of divisions and modulus, which returns a multiplication value for the decimal place to be extracted from the server.

During each iteration, re-calculate the divisor, which is the 'factor' found earlier raised to the power of the current shift. Since this is the first iteration, the shift is 7, resulting in a divisor equaling 2097152.

irb(main):018:0> divisor = factor ** 7
=> 2097152

Given the normalized data integer value is 5530995, the factor in this example is 8, and the divisor is 2097152 (8^7), the math looks something like:

5530995 div 8^7 mod 8

The result of this is multiplied by min_sleep to ensure extraction reliability, causing the query to look something like Example 6 when being injected.

In this example, the blind sleep is applied to the result of the div + mod query and multiplied by 'min_sleep'. The result from the div + mod math is 2, and the query sleeps for 10 seconds. This is multiplied by 'min_sleep' because it makes interpreting results easier, and less prone to errors due to latency on the wire. For instance, one may be trying to interpret results that last 1, 2, or 3 seconds, while after the multiplication one would be interpreting results that last 5, 10, 15 seconds, which are easier to correct for.

Evaluating Return Data

Next, multiply the remainder into a place holder, which will slowly accumulate the results of the value to be extracted. To do this, first divide the response time the request took by 'min_sleep', to get the result back from the query. Then multiply the result by the divisor and accumulate it into a variable.

irb(main):019:0> extraction_value += (2 * 2097152)
=> 4194304

Afterwards, decrement shifts s.

Completing the Round

This process is repeated 6 more times until 'number_of_shifts' reaches 0. For brevity, Table 2 shows all the iterations and the values they produce.

After all iterations are done and number_of_shifts s has been decremented to zero, one can take the accumulated total and convert this decimal number back into hex, and then back into ASCII. The following example uses MySQL's unhex() function, although there are thousands of ways to achieve this conversion:

Looking at the data that has been extracted, one can see the process has used 8 requests to extract 3 bytes (24 bits) of data. This also means that it has extracted 3 bits per request, as calculated earlier.

Finally, to finish the extraction one would re-iterate these steps, but instead applying all steps to a new sub-string which starts from where the previous extraction left off.

To visualize this process one can look at this psuedo-code to see how the iterations, shifts, factors, and divisors all fall into place in extracting our data:

Further Tests

To show the usefulness of this technique on a LAN as well as how all the variables mesh together, Table 3 shows how this would work if min_sleep m were equal to 1, the timeout t was 60, and factor f were 32. The target converted substring is '362479318121' , which equates to "Testi".

One can see if 'min_sleep' is set to 1 instead, it allows for 60 'sleep_segments', which results in allowing one to extract 5 bytes (40 bits) of the query result in 8 requests (5 bits per request).

Conclusion

This post has covered the math, algorithm, and fundamentals of timing extraction attacks against fully blind SQLi vulnerabilities. Many believe this technique is only exploitable on a local network, which is incorrect. We have shown that you can algorithmically account for network latency via the min_sleep factor to accomplish this attack over the Internet.

Conclusions For Defenders

Parameterized queries are currently the best solution for preventing the problem of SQL injection, but incident handlers and forensic analysts should note that the number of requests generated by this technique will be much lower than that of attacks using boolean enumeration. Because of this, the log footprint to extracted data ratio will be much smaller. It is relatively trivial for a forensic examiner to determine from a server log which data was taken (see the payload query in italics):

Note that this technique also requires the syntax characters of comma, both parenthesis, and the asterisk, so certain character filters and encodings will prevent the attack from succeeding to begin with.

See it in action!

This 45-minute video shows the HTTP GET requests, the response time, and the query results of the attack described in this article. It's extremely long because of the fact that its demonstrating a timing attack. Verbosity has been fully enabled so that the application logic can be easily understood by the user.

Note: Javascript required to see the above video. Alternatively, see it here.