13.3. Worm-Blocking Techniques

This section discusses techniques that have been researched and built at Symantec as alternative solutions for preventing the first- and second-generation exploits that worms use. We speculate that most worms would rather target vulnerable systems (those completely unprotected against overflows) because there are definitely more such installations than protected systems.

From the attacker's perspective, it is currently pointless to make the exploit itself especially tricky because the attack could be successful without that effort. This basic conclusion comes from reviewing recent worms, such as Linux/Slapper and W32/Slammer, which were responsible for the most recent widespread outbreaks.

The techniques described in this section can effectively stop such attacks, but the set of ideas is arbitrary, and its purpose is to show how effective the solutions can be. It is by no means a complete set; rather, it is a demonstration of certain behavioral rules that can be effective enough against fast-spreading computer worms. Such behavioral rule enforcement might be a subsystem of a large access control system or could be combined with similar systems.

13.3.1. Injected Code Detection

One of the most common ways to execute code on a remote system is to run injected code in the address space of a victimized process. In most cases, the injected attack code will run from the stack or the heap, and it will eventually execute operating system or subsystem calls. Our goal is to detect, based on exploit profiles, the injected code execution at an early enough stage to stop the attack, or at least its spread, effectively. As such, we will be somewhat exploit-specific, but still sufficiently generic.

The benefits gained by stopping attacks as early as possible make all the efforts worthwhile. Accidental programming bugs, however, could falsely trigger attack detection. Such false positives can be avoided by using better attack profiling because good attack profiling can capture attack variations.

Systems that can detect code injection can be used to develop both manual and automated attack signatures. These signatures (behavioral, binary, or both) can then be distributed to systems that do not run injected code detection, but instead use the signatures to stop attacks.

For instance, a behavioral signature could include the telltale sequences of common API calls that the worm exploit code uses. On Windows, such signatures might include the sequences of API calls or single calls with certain characteristics, including GetProcAddress(), GetModuleHandle(), LoadLibrary(), CreateThread(), CreateProcess(), listen(), send(), sendto(), connect(), CreateFile(), and so on, as well as variations thereof. Functions responsible for creating user accounts also need to be protected.

The observation that many attacks use these APIs makes them prime targets for hooks, which can be used for early detection, for example, by detecting that the caller of such APIs is on the heap or the stack. (Some similar techniques have been adopted in intrusion prevention systems such as Okena and Entercept.)
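As a rough illustration of a behavioral signature, the following Python sketch checks whether a recorded trace of API names contains the calls of a signature in order. The function name, the trace format, and the sample signature are assumptions made for this example; they are not taken from any particular product.

```python
from typing import Iterable, Sequence


def matches_signature(trace: Iterable[str], signature: Sequence[str]) -> bool:
    """Return True if the calls in `signature` occur in `trace` in order.

    Unrelated calls may be interleaved between the matched ones.
    """
    position = 0  # index of the next signature entry we are waiting for
    for api_name in trace:
        if position < len(signature) and api_name == signature[position]:
            position += 1
    return position == len(signature)


# Hypothetical signature: resolve APIs dynamically, then spawn a process.
SIGNATURE = ["LoadLibrary", "GetProcAddress", "CreateProcess"]

suspicious_trace = ["LoadLibrary", "GetProcAddress", "connect", "CreateProcess"]
benign_trace = ["CreateFile", "send", "GetProcAddress"]

print(matches_signature(suspicious_trace, SIGNATURE))
print(matches_signature(benign_trace, SIGNATURE))
```

A sequence match alone is a weak signal, because legitimate programs call the same APIs. In practice it would be combined with the caller-location checks described in this section.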

13.3.1.1 Shellcode Blocking via Code Injection Detection

UNIX-based worms and many Windows-based exploits execute a shell or a command prompt on a remote system. On a UNIX-based system, we typically see the execution of the execve() or a similar system call.

Examples of worms that use a shellcode-based attack include the Morris worm (which runs the attack on VAX systems), the Linux/ADM worm, FreeBSD/Scalper, the Linux/Slapper worm, and a large number of hacker exploits. These worms and exploits can be detected and prevented using the same attributes. Using the API attack profiles, the injected code can be detected and stopped early.

We can invoke our own safeguards within the process address spaces of key services by hooking selected APIs in user or kernel mode. When a selected API, such as execve() on UNIX or CreateProcess() on Windows, is executed, we trace the return address and check the kinds of attributes that the page has. Instead of checking only for the writeable attribute, we also can see whether the API was called from a location mapped in from a file. (Most legitimate code will have been loaded from executable files and will therefore have been mapped in from a file.)

Alternatively, we could watch only for stack execution, which would have lower performance impact due to fewer context switches. However, for appropriate server protection, we would also need to detect heap execution.

On Windows, the easiest way to determine whether a memory page is mapped from a file is to check for the SEC_IMAGE flag, because that is what this flag indicates. This technique is not susceptible to false positives from self-modifying packed code, but injected code from the stack or heap will still trigger detection. Optionally, we could prevent certain processes from executing selected APIs from writeable locations. These methods can potentially limit first and second-generation worms effectively.
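The caller check can be sketched as follows. This is a simplified Python model, not the prototype itself: the Region type, the memory layout, and the function names are assumptions for illustration. The constants carry the values of the corresponding Windows definitions, and MEM_IMAGE is used here as the region type that indicates an image (SEC_IMAGE) mapping.

```python
from dataclasses import dataclass
from typing import List, Optional

# Values of the corresponding Windows constants.
PAGE_READWRITE = 0x04
PAGE_EXECUTE_WRITECOPY = 0x80
MEM_PRIVATE = 0x20000
MEM_IMAGE = 0x1000000  # region is backed by an image section


@dataclass
class Region:
    base: int
    size: int
    allocation_protect: int
    mem_type: int


def find_region(regions: List[Region], address: int) -> Optional[Region]:
    """Return the region that contains `address`, or None if it is unmapped."""
    for region in regions:
        if region.base <= address < region.base + region.size:
            return region
    return None


def should_block(regions: List[Region], return_address: int) -> bool:
    """Block the hooked API unless its caller lives in a page mapped from an image."""
    region = find_region(regions, return_address)
    if region is None:
        return True
    return region.mem_type != MEM_IMAGE


# Hypothetical layout: one executable image and one private read/write region.
REGIONS = [
    Region(0x00400000, 0x00100000, PAGE_EXECUTE_WRITECOPY, MEM_IMAGE),
    Region(0x22040000, 0x00010000, PAGE_READWRITE, MEM_PRIVATE),
]

print(should_block(REGIONS, 0x00401000))  # caller inside the image
print(should_block(REGIONS, 0x2204DCF2))  # caller inside private memory
```

A real hook would obtain these attributes from the operating system for the page that holds the return address instead of consulting a static list.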

The most promising feature of this idea is its capability to provide protection even for kernel-level (ring 0) attacks, as these techniques also can be used in this case. This is a great advantage over other solutions, which ignore the kernel mode and only apply in user mode.

Consider the examples in the following sections, which demonstrate the effectiveness of shellcode blocking techniques.

Example 1: Blocking a Microsoft SQL Server Exploit

David Litchfield's example exploit [19] demonstrated a vulnerability in Microsoft SQL Server 2000. Microsoft patched the vulnerability at the time of the exploit's publication at the BlackHat Conference. Unfortunately, this attack was still effective against many systems even six months later. Obviously, many systems went unpatched, enabling the widespread outbreak of the Slammer worm, which took advantage of this vulnerability via a minor variant of this exploit code without using shellcode.

Let's see how the exploit code works:

First, the attacker executes a utility such as nc (NetCat) [20] to listen on a specified port. For example, when the attacker launches nc -l -p 53, the attacker's system begins listening on port 53.

The exploit tool (sqlexplo.exe) has four parameters:

IP address of the target (attacked) system

IP address of the attacking system

Opened port on the attacking system (53 in our example)

SQL Server service pack ID

The exploit uses a stack-based buffer overflow attack that reconnects to the attacking system and uses the CreateProcess() API to run "cmd.exe" (a Windows command prompt). In this attack, the shellcode is encrypted, which is an increasingly common trick that still presents the attack code as a string and avoids detection by signature-based IDS.

Let's examine the log file of a system that uses our shellcode-blocking prototype. When we execute the attack against a protected system, our NetCat window will not see a command prompt because the attack is thwarted. The prototype blocks the attack by hooking the CreateProcess() API and blocking if the call comes from a stack or heap address.

The detection of the caller's location is based on the return address of the CreateProcess() API. In our example, the intercepted CreateProcess() API has a return address of 0x2204dcf2, which has page attributes indicating that the page is a writeable, private page in the process address space of sqlservr.exe, as shown in Table 13.2:

Table 13.2. The Log of Blocking the Shellcode of the SQL Server Exploit

Time          PID    Log entry
14.19224477   [460]  Shellcode based Intrusion Detected!
14.19591311   [460]  Return Address: 2204dcf2 (stack!)
14.19953704   [460]  AllocationProtect=PAGE_READWRITE, Type=MEM_PRIVATE
19.02997363   [460]  Shellcode based Intrusion Prevented!

Example 2: Blocking CodeRed's Exploit Code-Based Attack

Long after the peak period of the original CodeRed worm, some hackers created a new attack tool out of a modified version of the original worm's code by using the exploit portion and then by extending the payload to launch the shellcode.

A Web-based tool was used to generate the shellcode. Thus the attacker did not need to understand the exploit or the shellcode portion to create the attack buffer. Because the original CodeRed worm did not exist as a file, this attack was merely a dump that the attacker injected, using a tool such as NetCat.

As an example, the following command will inject the attack buffer on port 80 (HTTP) on a target system with the IP address 192.168.50.131:

[c:\test]nc 192.168.50.131 80 <CRSHELL2.BIN

This particular exploit is a typical shellcode-based attack. It executes cmd.exe, which is associated with a port on which the exploit code listens. When successfully executed, the exploit listens on the attacked system on port 8008. Therefore, the attacker can reuse NetCat and connect to this port, leading to a command prompt that provides complete access to the remote system.

When shellcode blocking is active, the attack will not succeed, based on exactly the same criterion seen in the previous example. We successfully detected the attack based on the return address, which points to the stack, as shown in Table 13.3.

Table 13.3. The Log of Blocking the Shellcode of CodeRed Worm

Time         PID    Log entry
7.12189255   [636]  Shellcode based Intrusion Detected!
7.12214063   [636]  Return Address: 00aff6bb (stack!)
7.12234848   [636]  AllocationProtect=PAGE_READWRITE, Type=MEM_PRIVATE
9.19175122   [636]  Shellcode based Intrusion Prevented!

Note

Other means can prevent these exploits, but in these examples, we focused strictly on the idea of shellcode blocking itself.

Example 3: Blocking W32/Blaster's Shellcode-Based Attack

The Blaster worm [21] appeared on August 11, 2003, and exploited a DCOM RPC vulnerability via a shellcode-based attack. Blaster is the first Win32 worm to have used the shellcode technique, previously seen only in UNIX worms. This tendency was therefore properly predicted, and shellcode blocking indeed managed to stop Blaster from successfully infecting a vulnerable system.

The Blaster worm was responsible for the largest outbreak on 32-bit Windows systems so far. Based on various estimates, it infected well over a million systems worldwide!

The attack is blocked when the vulnerable DLL (rpcss.dll) is exploited in the context of the svchost.exe container process. The criterion to stop the attack is very similar to that of the previously demonstrated examples. We can detect and block the attack based on a return address that points to the stack when the CreateProcess() API is called.

Table 13.4. The Log of Blocking the Shellcode of Blaster Worm

Time           PID    Log entry
171.67155490   [440]  Shell code based Intrusion Detected!
171.67394096   [440]  ReturnAddress: 0052f976 (stack!)
171.67632730   [440]  AllocationProtect=PAGE_READWRITE, Type=MEM_PRIVATE
239.61852470   [440]  Shell code based Intrusion Prevented!

Example 4: Blocking W32/Welchia's Shellcode-Based Attack

The Welchia worm was developed as a counterattack against Blaster. Welchia attempts to fight Blaster.A infections by deleting the worm from the system and installing patches against the RPC exploit. Welchia uses two buffer overflow exploits instead of one because a Blaster-infected system could not be exploited again. One of Welchia's attack codes exploits the same vulnerability as Blaster.

The shellcodes of the two worms have nothing in common as a sequence of bytes because Welchia's shellcode was rewritten by the attacker. The second exploit was known as the "WebDav" NTDLL.DLL exploit. (We predicted that this vulnerability would be exploited by a Windows worm in a matter of a few months.) Both attacks ultimately used the same shellcode as in the first exploit to execute cmd.exe on the remote machine for the attacker's system.

Welchia could be successfully stopped with a shellcode-blocking system for both exploits.

13.3.2. Send Blocking: An Example of Blocking Self-Sending Code

Worms like W32/CodeRed and W32/Slammer do not exist as files on the host computer. Rather, such worms dynamically locate the addresses of a few APIs that they need to call within the address space of a vulnerable host process, and they keep running as part of such a process.

One particular API is important for such worms: a send function to propagate the worm's code on the network to new locations. Worms like CodeRed and Slammer use the WINSOCK library APIs, such as WS2_32!send() or WS2_32!sendto(), to send themselves to new targets on TCP or UDP.

Send blocking takes advantage of these worm characteristics. A set of API hooks is put in place to filter the send APIs on the system. When a send() or sendto() API is called, the call is monitored, and the parameters are examined.

First, a stack-tracing function takes place to identify the caller's location. The return address of the API will point into the caller's code. We call this point the caller's address (CA). We suspect that the code near the CA may be that of a computer worm. To determine whether the code near the CA is a worm, we need to see whether the CA is within the address range of a buffer being sent.

Consider the example of a send() function on a Windows system (see Listing 13.6).

Worms that use the send() API will use it to transfer themselves from an active process on the system by sending their code in the buf parameter of the API. In our hook procedure, we can check where buf points to and see whether CA is located in the actual range of the buf[] area. This can be easily checked using the following conditional (true when the worm is suspected):

buf <= CA < buf + len

where len is typically the size of the worm.

Using this technique, we can detect blocks of code that attempt to use the send() API to send themselves, and we can prevent this code from propagating to new addresses, thereby stopping fast-spreading worms.

Consider the examples in the following sections, which demonstrate the effectiveness of send-blocking techniques.

13.3.2.1 Blocking the W32/Slammer Worm

Slammer uses the WS2_32!sendto() API to send itself to new targets. In the example log entry that follows, from an infection attempt on a protected system, the sendto() API receives a pointer to a buffer located at 0x1050db73. The worm attempts to send 376 bytes. The stack trace function determines the CA of sendto() as 0x1050dce9.

The conditions of this call satisfy our blocking criteria, as CA is in the range of buf: 0x1050db73 <=0x1050dce9 < 0x1050dceb. In this example, we block the Slammer worm when it attempts to send itself to a randomly generated IP address of 186.63.210.15 on UDP port 1434 (SQL Server).
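The range check applied to this call can be sketched in a few lines of Python. The function name is an assumption for illustration; the buffer address, length, and caller's address are the values from the Slammer example above.

```python
def is_self_sending(buf: int, length: int, caller_address: int) -> bool:
    """True when the caller's address (CA) lies inside the buffer being sent."""
    return buf <= caller_address < buf + length


buf = 0x1050DB73   # buffer passed to sendto()
length = 376       # number of bytes the worm tries to send
ca = 0x1050DCE9    # caller's address found by the stack trace

print(hex(buf + length))                 # 0x1050dceb
print(is_self_sending(buf, length, ca))  # True
```

The upper bound is exclusive, so an address equal to buf + length falls just outside the buffer and does not trigger the rule.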

Here we see that we have experienced an API call from an address 0x0041dcea, which is located on the heap of the inetinfo.exe (IIS Service process). The actual body of the worm in this example is 3,569 bytes. The start of the buffer is at 0x0041d246; the end of the buffer is at 0x0041d246+3569=0x41e037. Thus the criterion for blocking is met because 0x41dcae is in the range of the buf: 0x41d246 <=0x41dcae < 0x41e037.

We can block such unwanted events by terminating the host process in which the attack is detected. Such blocking can at least prevent the propagation of detected worms until security updates are applied. In this way, we reduce the attack of a full-blown worm outbreak to a short-term DoS. Hopefully, the fact that the attack is detected and blocked at the same time will result in a quicker and more appropriate security response in general.

An attacker could thwart this kind of send blocking by allocating a buffer, copying the code into the buffer, and then sending that buffer, thereby masking the self-sending behavior from this detection method. To prevent such an attack specifically, we can compare the buffer being sent with code around the CA. However, most worms can be prevented by the shellcode-blocking approach. Thus even W32/Witty [22], which does not send its running code but its copy from the heap, is covered by the shellcode-blocking technique (Witty's attack is explained in detail in Chapter 15). Send blocking is an additional safeguard because it will detect self-sending code originating from a page marked executable.
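One possible reading of the comparison between the outgoing buffer and the code around the CA is sketched below. It assumes a byte snapshot of the memory region that contains the caller; the function name, the window size, and the sample data are all hypothetical.

```python
def sends_copy_of_caller(memory: bytes, memory_base: int, caller_address: int,
                         sent: bytes, window: int = 16) -> bool:
    """Return True if the bytes surrounding the caller also occur in `sent`.

    `memory` is a snapshot of the region containing the caller, and
    `memory_base` is the address of its first byte.
    """
    offset = caller_address - memory_base
    if not 0 <= offset < len(memory):
        return False
    start = max(0, offset - window)
    end = min(len(memory), offset + window)
    sample = memory[start:end]
    return len(sample) > 0 and sample in sent


# Hypothetical data: a 64-byte "worm" running at offset 32 of a region.
worm = bytes(range(64))
region = bytes(32) + worm + bytes(32)
region_base = 0x00410000
caller = region_base + 32 + 40   # somewhere inside the running worm

heap_copy = bytes(worm)          # the copy that the worm sends instead of itself
print(sends_copy_of_caller(region, region_base, caller, heap_copy))
print(sends_copy_of_caller(region, region_base, caller, b"GET / HTTP/1.0\r\n\r\n"))
```

The window size is a trade-off in this sketch: a window that reaches beyond the code that was copied misses the match, while a very small window raises the chance of coincidental matches.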

Another important feature of this blocking technique is that it can capture the worm body. A scanner system, such as an antivirus or IDS system, can then use the captured code to identify the worm exactly. If the attack turns out to be new, the captured code can be sent to another system for automatic or manual IDS and/or AV signature generation.

Once the signature is distributed, pass-through IDS systems, firewalls, and other gateway scanning systems can block network traffic that matches the signature. Such a system has the potential for large-scale automatic detection and blocking of exploit use and worm outbreaks with a short security response time.

13.3.3. Exception Handler Validation

On operating systems such as Windows 9x and Windows NT/2000/XP, programmers can use structured exception handling (SEH) to catch programming errors or naturally problematic situations.

Windows systems implement SEH using stack-based structures. A chain of exception handlers for the current thread is available in the thread information block (TIB) located at FS:[0].

Whenever there is an exception, the OS kernel eventually executes a user mode exception handler dispatcher. On Windows NT-based systems, this function is called KiUserExceptionDispatcher() and is part of NTDLL.DLL (the native API). The dispatcher routine walks the chain of exception handler frames each time an exception occurs. If an exception handler is available, the dispatcher will run the handler when a problem such as a GP fault, division by zero, and so on, occurs.

The idea of exception handler validation is to hook the KiUserExceptionDispatcher() so that before the original exception handling can take place, the hook routine performs the exception handler validation, consisting of the following critical checks that prevent the execution of possible attacks:

If the exception frame addresses are not in the proper order, the execution of the handler can be blocked. Each successive exception frame should be on a higher address.

If an exception handler's address is on the stack or heap, executing such handlers can be blocked.

If an exception frame pointer is invalid, exception handling can be blocked, or the thread or process can be terminated.
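The three checks can be sketched over a simplified model of the handler chain. In this Python sketch the memory map, the region sizes, and the handler addresses of the intact frames are assumptions for illustration; the frame pointers and the corrupted values come from the examples discussed in this section.

```python
from typing import Dict, List, Tuple

END_OF_CHAIN = 0xFFFFFFFF  # marks the last frame in a Win32 SEH chain

# Simplified memory map: (base, size, kind); kind is "image", "stack", or "heap".
Region = Tuple[int, int, str]
# Frame address -> (pointer to the next frame, handler address).
Frames = Dict[int, Tuple[int, int]]


def kind_of(regions: List[Region], address: int) -> str:
    """Classify an address; unmapped addresses are reported as "invalid"."""
    for base, size, kind in regions:
        if base <= address < base + size:
            return kind
    return "invalid"


def validate_chain(first: int, frames: Frames, regions: List[Region]) -> str:
    """Walk the chain; return "ok" or the reason the dispatch is blocked."""
    previous = None
    frame = first
    while frame != END_OF_CHAIN:
        if kind_of(regions, frame) == "invalid" or frame not in frames:
            return "invalid frame pointer"
        if previous is not None and frame <= previous:
            return "frames out of order"
        next_frame, handler = frames[frame]
        if kind_of(regions, handler) != "image":
            return "handler not in image"
        previous = frame
        frame = next_frame
    return "ok"


# CodeRed-style corruption: valid-looking handler, invalid next frame pointer.
codered_regions = [(0x016A0000, 0x10000, "stack"), (0x78000000, 0x40000, "image")]
codered_frames = {0x016AF094: (0x68589090, 0x7801CBD3)}

# WebDAV-style corruption: the third frame lies below the second one.
webdav_regions = [(0x00F50000, 0x10000, "stack"), (0x00C10000, 0x10000, "heap"),
                  (0x77F80000, 0x80000, "image")]
webdav_frames = {0x00F5ECDC: (0x00F5EF84, 0x77F81000),   # handler is hypothetical
                 0x00F5EF84: (0x00C100C1, 0x77F81000),   # handler is hypothetical
                 0x00C100C1: (0x4E4E4E4E, 0x4E4E4E4E)}

print(validate_chain(0x016AF094, codered_frames, codered_regions))
print(validate_chain(0x00F5ECDC, webdav_frames, webdav_regions))
```

Because each frame must lie at a strictly higher address than the previous one, the walk also terminates on a chain that an attacker has made circular.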

Consider the following exploit examples that can be prevented based on these three criteria.

13.3.3.1 Wrong Exception Handler Order

For example, an exploit targeting the Microsoft IIS Servers via the "WebDav" NTDLL.DLL vulnerability is blocked, based on the wrong exception handler order criterion. See Table 13.6, which shows the exception frame addresses of 0x00f5ecdc, 0x00f5ef84, and 0x00c100c1 (!). The attacker hopes to execute the passed-in shellcode on the heap at location 0x00c100c1. This address is only a guess. Depending on the actual heap layout of the attacked process, the attacker might need to adjust this value manually for different systems or even for the same system at different times.

Phase  Time          PID    Action logged
67     61.75953962   [736]  Checking exception frame ptr: 00c100c1
68     61.76222655   [736]  AllocationProtect:00000004 (PAGE_READWRITE)
69     61.76243524   [736]  Type: 00020000 (MEM_PRIVATE)
70     61.76277886   [736]  Exception frame ptr seems fine!
71     61.76297497   [736]  Found exception frame at: 00c100c1
72     61.76317695   [736]  Found exception handler at: 4e4e4e4e
73     61.76358091   [736]  Bad exception handler detected!*
74     64.54264228   [736]  Checking exception frame ptr: 4e4e4e4e
75     64.54291634   [736]  AllocationProtect:00000000 (INVALID!)
76     64.54310491   [736]  Type: 00000000 (INVALID!)
77     64.98332191   [736]  Bad exception frame pointer detected!

*I allowed the attack to continue to have a complete log at this point.

When the attacker's stack-based buffer overflow is successful, the value 0x00c100c1 will overwrite the address of an exception handler. The overflow will also overwrite other exception frame pointers. These corruptions create conditions in which the exception frames are out of order, and thus can be detected.

Note

In this example, the attack could have been stopped at phase 66, but I let the attack continue to log all the exception handling problems.

This particular attack can be detected and prevented even earlier, based on the exception handler's location.

13.3.3.2 Exception Handler on Heap or on Stack

This is the same idea described for injected code blocking, and it can be easily performed by checking for the SEC_IMAGE attribute on the page containing the actual exception handler to see whether it was mapped from a file. The previously described exploit example also can be stopped based on this criterion.

13.3.3.3 Exception Frame Pointer Is Invalid

Computer worms such as W32/CodeRed overwrite a particular exception handler frame stored on the stack of a particular thread. When the buggy DLL, in which the overflow occurred, realizes that some of the stack parameters to a function are incorrect, an exception is raised. As a result, KiUserExceptionDispatcher() will be triggered. However, W32/CodeRed sets up a new handler that runs the startup code of the worm. W32/CodeRed uses a trampoline technique to run the worm body. As part of its trampoline, the worm corrupts an exception handler pointer, so that it points to the code inside the Visual C run-time library, MSVCRT.DLL, at 0x7801cbd3.

This location appears to be a valid handler because it is not located on the heap or the stack. As a result, its incorrectness cannot be easily detected, as noted in Phase 59 of Table 13.7. However, the next exception frame pointer is overflowed with the value 0x68589090, which points to a completely invalid location; this is how this criterion can be used to stop this attack. In the absence of our blocking techniques, KiUserExceptionDispatcher() would run the "exception handler" at 0x7801cbd3. This triggers the worm or an exploit because the instructions at that address are expected to return control to the stack, to the worm start code that will eventually find the worm body on the heap inside the (illegal) body of a GET request and then execute it. Consider Table 13.7 for an illustration of the blocking feature in action.

Table 13.7. Detecting and Preventing CodeRed and Related Exploits

Phase  Time          PID    Action logged
52     13.02454613   [676]  Entering to SEH Dispatcher
53     13.02489813   [676]  Checking exception frame ptr: 016af094
54     13.02512777   [676]  AllocationProtect=00000004 (PAGE_READWRITE)
55     13.02533142   [676]  Type=00020000 (MEM_PRIVATE)
56     13.02553005   [676]  Exception frame ptr seems fine!
57     13.02573455   [676]  Found exception frame at: 016af094
58     13.02593904   [676]  Found exception handler at: 7801cbd3
59     13.02616114   [676]  Exception handler seems fine!
60     13.02636647   [676]  Checking exception frame ptr: 68589090
61     13.02664640   [676]  AllocationProtect=00000000 (INVALID!)
62     13.02685173   [676]  Type=00000000 (INVALID!)
63     13.02704952   [676]  Bad exception frame pointer detected!

One of the most common attacks on Windows systems is the smashing of stack-based exception handler frames. Simple modifications to the previously mentioned exception-handling dispatch routine can easily prevent such attacks. Surprisingly, older Windows systems did not implement similar safeguards, but Microsoft introduced some changes in Windows XP SP2.

13.3.4. Other Return-to-LIBC Attack Mitigation Techniques

In a return-to-LIBC attack, the attacker typically overflows the stack in such a way that a return address points to a library function in a loaded library inside the process address space.

Therefore when the overflowed process uses the return address, a library function (or a chained set of library functions) is executed. The attacker has a chance to run at least one API, such as CreateProcess() on Windows or execve() on UNIX, to remotely run a command shell, thereby compromising the system. The attacker must also place the parameters properly for the desired function call on the stack via the overflow.

This trick poses a serious problem for prevention solutions that rely solely on stopping stack or heap execution.

13.3.4.1 Process Address Space Randomization

The predictability of process address space layouts is one of the major problems that must be addressed. By default, each executable, as well as each dynamic library, has a base address that specifies where the module is supposed to be loaded in the process address space. Modules have a relocation section that contains required information if the module cannot be placed at its preferred location because something else has already been loaded there. In this case, the system uses the relocation information to "relocate" the image by patching the executable image in memory.

When compared to not performing this action at all, this relocation work is expensive. It also creates an extra load on system memory and the paging file. Due to the performance and resource benefits, many DLLs and processes are rebased and "bound" to avoid relocation and memory image patching. This is especially true of common, shared code, such as CRT and system code. Unfortunately, this benefit has a drawback: Attackers can predict where code will be in a target application's address space.

The idea of process address space randomization is inspired by the fact that many attacks depend on hard-coded locations. If attackers can predict the location of the global offset table (GOT) entry in ELF files, they can patch the table. An attacker who can predict the location of a particular code pattern in an address space of the targeted process can take advantage of this knowledge.

For instance, the W32/CodeRed worm clearly depends on the hard-coded address 0x7801cbd3. If this location does not have the particular instruction sequence required to pass control to the proper place, the attack will fail.

If we can always manage to trick the operating system's loader into loading process modules at different addresses, the attacker will have a more difficult time predicting hard-coded addresses. This can be achieved by various means: one of the easiest is to rebase the images on disk at least once in a while (although this method might cause problems with digitally signed code).

Dynamic rebasing is feasible, but it could have a significant impact on performance (in addition to the increase in load time) because more copy-on-write pages take up more physical memory and page file space. Furthermore, some modules might not like to be moved around.

When modules are not placed in predictable locations, the attacker has an extra obstacle to overcome. An attacker must use brute-force methods and more difficult information leakage techniques to craft an attack. Overcoming these extra hurdles will slow the attack and make it noisier and therefore more obvious. For example, incorrect overflows usually result in a large number of crashes, which can be considered early evidence of an attack.
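The effect of randomization on a hard-coded address can be illustrated with a small simulation. This is only a sketch under stated assumptions: the preferred base of 0x78000000, the address range used for rebasing, and the function names are chosen for the example, and the 64 KB alignment reflects the Windows allocation granularity. The hard-coded address is the one that W32/CodeRed depends on.

```python
import random

ALIGNMENT = 0x10000          # 64 KB, the Windows allocation granularity
PREFERRED_BASE = 0x78000000  # assumed preferred load address of the module
HARDCODED = 0x7801CBD3       # address the exploit depends on
GADGET_OFFSET = HARDCODED - PREFERRED_BASE


def random_base(rng: random.Random, low: int = 0x60000000,
                high: int = 0x7C000000) -> int:
    """Pick an aligned load address in [low, high)."""
    slots = (high - low) // ALIGNMENT
    return low + rng.randrange(slots) * ALIGNMENT


def attack_hits(actual_base: int) -> bool:
    """The attack works only if the needed code is still at the hard-coded address."""
    return actual_base + GADGET_OFFSET == HARDCODED


rng = random.Random(1)  # fixed seed so that repeated runs behave the same way
trials = 10000
hits = sum(attack_hits(random_base(rng)) for _ in range(trials))
print(f"{hits} of {trials} attempts found the code at the expected address")
```

With 7,168 possible load addresses in this model, a single blind attempt succeeds with a probability of 1 in 7,168, and every failed attempt is likely to crash the target, which is the noise that the paragraph above describes.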

Note

Some worms do not always land on library calls. For example, the Blaster worm lands on the Unicode.nls memory-mapped file on Windows 2000 systems.

13.3.4.2 Detecting Direct Library Function Invocations

A typical legitimate API call involves pushing parameters onto the stack, followed by a call instruction. Executing the call instruction results in pushing the return address onto the stack. At exactly the point after the call instruction has been executed (before the called function sets up its own stack frame), the top of the stack [ESP] contains the return address, which is the address of the instruction immediately following the issued call instruction. See Figure 13.6 for an illustration of the stack.

Figure 13.6. Stack under normal call conditions.

In a typical stack overflow situation, control is diverted from its originally intended path by overwriting the stack location containing the originally intended return address. In a return-to-LIBC attack, the overwritten value is the address of the attacker's intended API (for example, CreateProcess() on Windows or execve() on UNIX for a shellcode attack). Besides overwriting the return address with that of an intended API, the attacker also must place on the stack what appears to be a return address (the simulated "return address" in Figure 13.6) and the parameters to that API call.

The simulated return address must be on the stack because the called API expects it to be there and will not otherwise get the parameters correctly. The value of this simulated return address is not relevant unless the attacker needs to run something else after the call (if the call runs shellcode, the attacker does not need to execute anything after the call) or unless the attacker needs to deceive some overflow detection technique. See Figure 13.6.

When the function that fell victim to the stack overflow executes a RET instruction, instead of returning to the caller, control is diverted to the API that the attacker intended to target. Executing the RET instruction results in popping the "return address," which is the API's address, off the stack and into the EIP register. In such an overflow situation, at exactly the point after the RET instruction has been executed, [ESP-4] will contain the address of the intended API call because the previous top of the stack will be at this location and will be untouched. See Figure 13.7 for an illustration of the stack.

Figure 13.7. Stack in crafted, return-to-LIBC condition.

This is the key to the anti-return-to-LIBC technique. The address of the "called" API appears at [ESP-4] when control is transferred to the API via a RET instruction. This condition is unlikely to occur otherwise. Thus the suggested technique is to have certain APIs hooked and to have our hook procedures check for their own addresses at [ESP-4], at the point of invocation.

If this condition is met, the call is suspected to be a return-to-LIBC attack and can be blocked. This technique would return a false positive for legitimate code that pushes an API address and would transfer control there via a RET instruction; however, most compiled code does not perform this.
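The difference between the two entry conditions can be reproduced with a tiny model of a 32-bit stack. The Machine class, the addresses, and the parameter value are hypothetical; only the behavior of CALL (push the return address, then jump) and RET (pop into EIP) is modeled.

```python
class Machine:
    """Tiny model of a 32-bit x86 stack; just enough for CALL and RET."""

    def __init__(self, esp: int) -> None:
        self.esp = esp
        self.eip = 0
        self.memory = {}  # address -> 32-bit value

    def push(self, value: int) -> None:
        self.esp -= 4
        self.memory[self.esp] = value

    def call(self, target: int, return_address: int) -> None:
        self.push(return_address)
        self.eip = target

    def ret(self) -> None:
        self.eip = self.memory[self.esp]
        self.esp += 4


def hook_sees_ret_to_libc(machine: Machine, api_address: int) -> bool:
    """At API entry, check whether the API's own address sits at [ESP-4]."""
    return machine.memory.get(machine.esp - 4) == api_address


API = 0x77E61000         # hypothetical address of a hooked API
CALLER_RET = 0x00401234  # hypothetical address in a legitimate code page

# Legitimate call: push a parameter, then CALL the API.
legit = Machine(esp=0x0012FF00)
legit.push(0x1111)
legit.call(API, CALLER_RET)

# Attack: the overflow replaced a saved return address with the API address,
# followed by a simulated return address and a parameter; the victim then RETs.
attack = Machine(esp=0x0012FF00)
attack.memory[attack.esp] = API
attack.memory[attack.esp + 4] = CALLER_RET
attack.memory[attack.esp + 8] = 0x1111
attack.ret()

print(hook_sees_ret_to_libc(legit, API))   # False
print(hook_sees_ret_to_libc(attack, API))  # True
```

In both cases the value at [ESP] on entry is a plausible return address in a legitimate code page, which is why the hook has to look at [ESP-4].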

Section 13.3.1.1 described a technique whereby certain APIs are hooked, and the hooking routine examines the page attributes of the return address to see whether the call originated from somewhere it should not have, such as on the stack or the heap. This process is useful for detecting code injection attacks, which transfer control to the code on the stack or in the heap, which then calls into such hooked APIs.

For a return-to-LIBC attack, this technique is insufficient because there is no real "return address" to examine. The "call" is really a RET. Even if our hook procedure could see where control was transferred from, it would be to a RET instruction within some legitimate code page and thus would not be from the stack or the heap.

Further, if the attacker were able to manipulate the stack so that a RET instruction would transfer control to an API and make proper parameters available, the attacker could, to a point, make the stack appear consistent with a legitimate call instruction invocation of the API.

The attacker would need to place a legitimate code page address on the stack where our hook function expects to see a return address if the transfer of control happened via a call instruction. (See the simulated "return address" in Figure 13.7 for an illustrated example.)

When our hook procedure is invoked, it will look at the top of the stack (ESP) to find such a return address. In this scenario, our hook procedure would find a simulated return address that is not from the heap or the stack. However, our new technique would still detect the attack because the API address would match the contents at [ESP-4], the previous top of the stack.

Our first thought was to verify that a transfer of control came from a call instruction by examining the code at the assumed return address (at the location on the top of the stack [ESP]), disassembling the instruction immediately before the return address, and verifying that it is a call instruction. This technique would be susceptible to the type of manipulation just described because the attacker could easily point the simulated return address to the instruction following a legitimate call instruction, either a pre-existing one somewhere in legitimate code or one crafted through the overflow.

If this antioverflow technique did not also check for heap and stack pages, the simulated return address could point to code that the attacker placed on the stack or in the heap through the overflow. Such code looks like a legitimate call to this type of verification technique.
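The call-instruction verification discussed above could look roughly like the following byte-pattern heuristic for 32-bit x86 code. This is a sketch under stated assumptions: the function names are hypothetical, only near calls (E8 rel32 and FF /2) without prefixes are recognized, and backward matching of this kind is inherently ambiguous because unrelated bytes can resemble a call.

```python
def _ff_call_length(code, start):
    """Length of an FF /2 (near indirect CALL) at `start`, or None."""
    if start < 0 or start + 1 >= len(code) or code[start] != 0xFF:
        return None
    modrm = code[start + 1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if reg != 2:                    # /2 selects the near indirect CALL
        return None
    length = 2                      # opcode + ModRM
    if mod == 3:                    # register operand, e.g. call eax
        return length
    if rm == 4:                     # a SIB byte follows ModRM
        if start + 2 >= len(code):
            return None
        length += 1
        if mod == 0 and (code[start + 2] & 7) == 5:
            length += 4             # SIB with disp32 and no base
    elif mod == 0 and rm == 5:
        length += 4                 # absolute disp32, e.g. call [addr]
    if mod == 1:
        length += 1                 # disp8
    elif mod == 2:
        length += 4                 # disp32
    return length


def preceded_by_call(code, ret_off):
    """Heuristic: does a near CALL end exactly at offset `ret_off`?"""
    if ret_off > len(code):
        return False
    # E8 rel32: direct near call, always 5 bytes long.
    if ret_off >= 5 and code[ret_off - 5] == 0xE8:
        return True
    # FF /2: indirect near call, 2 to 7 bytes long.
    return any(
        _ff_call_length(code, ret_off - n) == n for n in (2, 3, 4, 6, 7)
    )
```

As the text notes, a check of this kind passes for any address that follows a call instruction, including one the attacker selected or crafted, so it is only meaningful in combination with the page-attribute and [ESP-4] checks.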

To summarize, we can detect return-to-LIBC attacks by hooking key APIs and having our hook routines, at the exact point of entry, check for their own addresses at [ESP-4]. Combining this technique with the other described call verification techniques (checking for a return address on the stack or the heap and checking for an actual call instruction at the expected location), with the load address randomization technique, and with the exception dispatch verification techniques should significantly raise the bar for attackers.

13.3.5. "GOT" and "IAT" Page Attributes

Attackers often abuse obvious function address locations, such as the GOT, by redirecting the function addresses. For instance, the Linux/Slapper worm [2] uses this technique to run its shellcode on the heap of an Apache server process by exploiting an OpenSSL vulnerability and redirecting the address of the free() library function in the GOT.

This raises the following questions: Why should such function address locations always be writeable in ELF (UNIX) or PE (Windows) executable files? (The IAT is optionally writeable in the case of some linker versions.) Shouldn't they be read-only most of the time?

For most applications, these tables only need to be writeable by the loader while it performs fixups. They could safely be marked read-only after the fixups have been completed, which happens at the earliest stages of the loading process. Not surprisingly, some OS vendors have recognized the validity of this idea and have incorporated it into the operating system itself. Some new releases of OpenBSD implement this idea for the GOT.
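The life cycle of such a table can be modeled in a few lines. This is only an illustration of the policy, not of real page protection: the FunctionTable class and its method names are hypothetical, and an actual loader would change the page attributes through the operating system (for example, mprotect on UNIX or VirtualProtect on Windows) rather than set a flag.

```python
class FunctionTable:
    """Toy model of a GOT/IAT: writeable while loading, read-only afterward."""

    def __init__(self):
        self._slots = {}
        self._sealed = False

    def fixup(self, name, address):
        # Only the loader should get here, and only before seal().
        if self._sealed:
            raise PermissionError(f"table is read-only; cannot redirect {name}")
        self._slots[name] = address

    def seal(self):
        # Models marking the table's pages read-only after loading.
        self._sealed = True

    def resolve(self, name):
        return self._slots[name]


def load_then_attack():
    table = FunctionTable()
    table.fixup("free", 0x40001000)   # hypothetical address of free()
    table.seal()
    try:
        # A Slapper-style redirection attempt after loading has finished.
        table.fixup("free", 0x0804F000)
        blocked = False
    except PermissionError:
        blocked = True
    return blocked, table.resolve("free")
```

In the model, the late redirection attempt fails and the slot keeps the address the loader wrote.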

Another good example of this is the Windows XP kernel mode service table, which is no longer writeable by default, at least on systems with 128MB or less of physical memory. Even kernel-mode drivers (in ring 0) must take extra steps to hook the service table, rather than simply patch it as they do in Windows NT/2000.

Note

The kernel-mode service table is nonwriteable on systems with 128MB of memory or less when the read-only kernel memory is on, as discussed in Chapter 12, "Memory Scanning and Disinfection."

13.3.6. High Number of Connections and Connection Errors

The preceding ideas focused on techniques for blocking malicious buffer overflow attacks. Although these ideas are particularly useful in stopping worm replication, they are only a subset of the possible methods that can be used against fast-spreading worms.

An even more generalized worm behavior-blocking rule is to detect abnormally high connection rates to novel systems and then delay such connections to slow possible worm replication. HP researchers found virus throttling [23] useful against a variety of worms, including script-based, binary-based, and even injected threats, such as the W32/CodeRed or W32/Slammer worms.
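The general idea of delaying connections to novel systems can be sketched as follows. This is a simplified illustration, not the HP implementation: the Throttle class, the working-set size, and the alarm threshold are all assumptions chosen for the example.

```python
from collections import deque


class Throttle:
    """Delay connections to hosts that are not in a small set of recent hosts."""

    def __init__(self, working_set_size=4, alarm_queue_len=100):
        # deque(maxlen=...) drops the oldest host when a new one is appended.
        self.recent = deque(maxlen=working_set_size)
        self.delayed = deque()
        self.alarm_queue_len = alarm_queue_len

    def request(self, host):
        """Return True if the connection may proceed now, False if delayed."""
        if host in self.recent:
            return True
        if host not in self.delayed:
            self.delayed.append(host)
        return False

    def tick(self):
        """Call once per time period: release at most one delayed host."""
        if not self.delayed:
            return None
        host = self.delayed.popleft()
        self.recent.append(host)
        return host

    def suspicious(self):
        # A long backlog means new hosts are requested faster than released.
        return len(self.delayed) >= self.alarm_queue_len
```

A program that contacts a few hosts repeatedly is barely affected, because a new host waits at most a few periods and then stays in the working set. A scanning worm fills the queue far faster than tick() drains it, which both slows the scan and makes the backlog itself a detection signal.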

The basic idea of fast-spreading worms is to locate new targets rapidly on the Internet. Unless the worm has preselected known targets, scanning will result in a large number of connection failures; a successful worm will typically also produce a large number of connection successes.

An abnormally high frequency and/or quantity of connection attempts, successes, and/or failures can be used to detect and stop worm-like behavior. In addition, the targeting algorithms of current worms are random when compared to nonworm connection patterns; that is, both successful and failed connection patterns of a worm are likely to display a high degree of entropy. This too can be used to detect and stop worm-like behavior.
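One way to quantify how random a set of destinations looks is the Shannon entropy of their address prefixes. The sketch below is an illustration only: the function name and the choice of grouping by the first two octets are assumptions, and any alarm threshold would have to be tuned against real traffic.

```python
import math
from collections import Counter


def destination_entropy(addresses, prefix_octets=2):
    """Shannon entropy, in bits, of the dotted-quad destination prefixes."""
    prefixes = [".".join(a.split(".")[:prefix_octets]) for a in addresses]
    counts = Counter(prefixes)
    total = len(prefixes)
    # H = -sum(p * log2(p)) over the observed prefix frequencies.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Connections that stay within one or two networks yield a value near zero, whereas randomly generated targets spread over many prefixes and push the value toward log2 of the number of connections observed.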

Unlike most legitimate network applications, worms do not usually perform a name resolution before attempting to connect to a target; most worms generate their list of IP addresses and do not use names. Thus connecting to an address without prior name-lookup activity also can be used to detect and stop worm-like behavior.
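A monitor for this rule only needs to remember which addresses were returned by recent name lookups. The following sketch assumes hypothetical event callbacks (on_dns_answer and on_connect) that some network filter would invoke; it does not age out old answers, which a practical version would need to do.

```python
class LookupTracker:
    """Count outbound connections to addresses no name lookup has returned."""

    def __init__(self):
        self.resolved = set()
        self.total_connects = 0
        self.unresolved_connects = 0

    def on_dns_answer(self, name, addresses):
        # Remember every address handed back by a name lookup.
        self.resolved.update(addresses)

    def on_connect(self, address):
        """Return True if the address was seen in an earlier lookup."""
        self.total_connects += 1
        if address in self.resolved:
            return True
        self.unresolved_connects += 1
        return False

    def unresolved_ratio(self):
        if self.total_connects == 0:
            return 0.0
        return self.unresolved_connects / self.total_connects
```

Legitimate programs also connect to raw addresses at times (configured servers, peer-to-peer software), so a ratio like this is better treated as one input to a decision than as a verdict on its own.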

These ideas provide additional means to detect and slow fast-spreading worms. The challenges for such systems are the same as for those of other blocking techniques because the attacker's code is already running on the system when the connections occur. This can lead to retroviral-type conditions, where the system is susceptible to attacks that target the defenses themselves. Moreover, techniques that are overly generic are often not deployable in real-world environments because of the high number of false positives.

In addition, these ideas may have an interesting impact on future worm developments, as described in the next section.

Windows XP SP2 implemented a feature similar to virus throttling by not allowing programs to scan aggressively for other systems on the network.