Executable space protection

In computer security, executable-space protection marks memory regions as non-executable, such that an attempt to execute machine code in these regions will cause an exception. It makes use of hardware features such as the NX bit (no-execute bit), or in some cases software emulation of those features. However technologies that somehow emulate or supply an NX bit will usually impose a measurable overhead; while using a hardware-supplied NX bit imposes no measurable overhead.

The Burroughs 5000 offered hardware support for executable-space protection on its introduction in 1961; that capability remained in its successors until at least 2006. In its implementation of tagged architecture, each word of memory had an associated, hidden tag bit designating it code or data. Thus user programs cannot write or even read a program word, and data words cannot be executed.

If an operating system can mark some or all writable regions of memory as non-executable, it may be able to prevent the stack and heap memory areas from being executable. This helps to prevent certain buffer-overflowexploits from succeeding, particularly those that inject and execute code, such as the Sasser and Blaster worms. These attacks rely on some part of memory, usually the stack, being both writeable and executable; if it is not, the attack fails.

OS implementations

Many operating systems implement or have an available executable space protection policy. Here is a list of such systems in alphabetical order, each with technologies ordered from newest to oldest.

For some technologies, there is a summary which gives the major features each technology supports. The summary is structured as below.

Other Supported: (None) or (Comma separated list of CPU architectures)

Standard Distribution: (No) or (Yes) or (Comma separated list of distributions or versions which support the technology)

Release Date: (Date of first release)

A technology supplying Architecture Independent emulation will be functional on all processors which aren't hardware supported. The "Other Supported" line is for processors which allow some grey-area method, where an explicit NX bit doesn't exist yet hardware allows one to be emulated in some way.

Android

As of Android 2.3 and later, architectures which support it have non-executable pages by default, including non-executable stack and heap.[1][2][3]

FreeBSD

Initial support for the NX bit, on x86-64 and IA-32 processors that support it, first appeared in FreeBSD -CURRENT on June 8, 2004. It has been in FreeBSD releases since the 5.3 release.

Linux

The Linux kernel supports the NX bit on x86-64 and IA-32 processors that support it, such as modern 64-bit processors made by AMD, Intel, Transmeta and VIA. The support for this feature in the 64-bit mode on x86-64 CPUs was added in 2004 by Andi Kleen, and later the same year, Ingo Molnar added support for it in 32-bit mode on 64-bit CPUs. These features have been part of the Linux kernel mainline since the release of kernel version 2.6.8 in August 2004.[4]

The availability of the NX bit on 32-bit x86 kernels, which may run on both 32-bit x86 CPUs and 64-bit IA-32-compatible CPUs, is significant because a 32-bit x86 kernel would not normally expect the NX bit that an AMD64 or IA-64 supplies; the NX enabler patch assures that these kernels will attempt to use the NX bit if present.

NX memory protection has always been available in Ubuntu for any systems that had the hardware to support it and ran the 64-bit kernel or the 32-bit server kernel. The 32-bit PAE desktop kernel (linux-image-generic-pae) in Ubuntu 9.10 and later, also provides the PAE mode needed for hardware with the NX CPU feature. For systems that lack NX hardware, the 32-bit kernels now provide an approximation of the NX CPU feature via software emulation that can help block many exploits an attacker might run from stack or heap memory.

Non-execute functionality has also been present for other non-x86 processors supporting this functionality for many releases.

Exec Shield

Red Hat kernel developer Ingo Molnar released a Linux kernel patch named Exec Shield to approximate and utilize NX functionality on 32-bit x86 CPUs.The Exec Shield patch was released to the Linux kernel mailing list on May 2, 2003, but was rejected for merging with the base kernel because it involved some intrusive changes to core code in order to handle the complex parts of the emulation. Exec Shield's legacy CPU support approximates NX emulation by tracking the upper code segment limit. This imposes only a few cycles of overhead during context switches, which is for all intents and purposes immeasurable. For legacy CPUs without an NX bit, Exec Shield fails to protect pages below the code segment limit; an mprotect() call to mark higher memory, such as the stack, executable will mark all memory below that limit executable as well. Thus, in these situations, Exec Shield's schemes fails. This is the cost of Exec Shield's low overhead. Exec Shield checks for two ELF header markings, which dictate whether the stack or heap needs to be executable. These are called PT_GNU_STACK and PT_GNU_HEAP respectively. Exec Shield allows these controls to be set for both binary executables and for libraries; if an executable loads a library requiring a given restriction relaxed, the executable will inherit that marking and have that restriction relaxed.

PaX

The PaX NX technology can emulate NX functionality, or use a hardware NX bit. PaX works on x86 CPUs that do not have the NX bit, such as 32-bit x86. The Linux kernel still does not ship with PaX (as of May, 2007); the patch must be merged manually.

PaX provides two methods of NX bit emulation, called SEGMEXEC and PAGEEXEC. The SEGMEXEC method imposes a measurable but low overhead, typically less than 1%, which is a constant scalar incurred due to the virtual memory mirroring used for the separation between execution and data accesses.[5] SEGMEXEC also has the effect of halving the task's virtual address space, allowing the task to access less memory than it normally could. This is not a problem until the task requires access to more than half the normal address space, which is rare. SEGMEXEC does not cause programs to use more system memory (i.e. RAM), it only restricts how much they can access. On 32-bit CPUs, this becomes 1.5 GB rather than 3 GB.

PaX supplies a method similar to Exec Shield's approximation in the PAGEEXEC as a speedup; however, when higher memory is marked executable, this method loses its protections. In these cases, PaX falls back to the older, variable-overhead method used by PAGEEXEC to protect pages below the CS limit, which may become quite a high-overhead operation in certain memory access patterns. When the PAGEEXEC method is used on a CPU supplying a hardware NX bit, the hardware NX bit is used, thus no significant overhead is incurred.

PaX supplies mprotect() restrictions to prevent programs from marking memory in ways that produce memory useful for a potential exploit. This policy causes certain applications to cease to function, but it can be disabled for affected programs.

PaX allows individual control over the following functions of the technology for each binary executable:

PaX ignores both PT_GNU_STACK and PT_GNU_HEAP. In the past, PaX had a configuration option to honor these settings but that option has been removed for security reasons, as it was deemed not useful. The same results of PT_GNU_STACK can normally be attained by disabling mprotect() restrictions, as the program will normally mprotect() the stack on load. This may not always be true; for situations where this fails, simply disabling both PAGEEXEC and SEGMEXEC will effectively remove all executable space restrictions, giving the task the same protections on its executable space as a non-PaX system.

Architectures that can only support these with region granularity are: i386 (without PAE), other powerpc (such as macppc).

Other architectures do not benefit from non-executable stack or heap; NetBSD does not by default use any software emulation to offer these features on those architectures.

OpenBSD

A technology in the OpenBSDoperating system, known as W^X, marks writable pages by default as non-executable on processors that support that. On 32-bit x86 processors, the code segment is set to include only part of the address space, to provide some level of executable space protection.

Solaris

Solaris has supported globally disabling stack execution on SPARC processors since Solaris 2.6 (1997); in Solaris 9 (2002), support for disabling stack execution on a per-executable basis was added.

Windows

Starting with Windows XPService Pack 2 (2004) and Windows Server 2003 Service Pack 1 (2005), the NX features were implemented for the first time on the x86 architecture. Executable space protection on Windows is called "Data Execution Prevention" (DEP).

Under Windows XP or Server 2003 NX protection was used on critical Windows services exclusively by default. If the x86 processor supported this feature in hardware, then the NX features were turned on automatically in Windows XP/Server 2003 by default. If the feature was not supported by the x86 processor, then no protection was given.

Early implementations of DEP provided no address space layout randomization (ASLR), which allowed potential return-to-libc attacks that could have been feasibly used to disable DEP during an attack.[7] The PaX documentation elaborates on why ASLR is necessary;[8] a proof-of-concept was produced detailing a method by which DEP could be circumvented in the absence of ASLR.[9] It may be possible to develop a successful attack if the address of prepared data such as corrupted images or MP3s can be known by the attacker.

Microsoft added ASLR functionality in Windows Vista and Windows Server 2008. On this platform, DEP is implemented through the automatic use of PAEkernel in 32-bit Windows and the native support on 64-bit kernels. Windows Vista DEP works by marking certain parts of memory as being intended to hold only data, which the NX or XD bit enabled processor then understands as non-executable.[10] In Windows, from version Vista, whether DEP is enabled or disabled for a particular process can be viewed on the Processes/Details tab in the Windows Task Manager.

Windows implements software DEP (without the use of the NX bit) through Microsoft's "Safe Structured Exception Handling" (SafeSEH). For properly compiled applications, SafeSEH checks that, when an exception is raised during program execution, the exception's handler is one defined by the application as it was originally compiled. The effect of this protection is that an attacker is not able to add his own exception handler which he has stored in a data page through unchecked program input.[10][11]

When NX is supported, it is enabled by default. Windows allows programs to control which pages disallow execution through its API as well as through the section headers in a PE file. In the API, runtime access to the NX bit is exposed through the Win32 API calls VirtualAlloc[Ex] and VirtualProtect[Ex]. Each page may be individually flagged as executable or non-executable. Despite the lack of previous x86 hardware support, both executable and non-executable page settings have been provided since the beginning. On pre-NX CPUs, the presence of the 'executable' attribute has no effect. It was documented as if it did function, and, as a result, most programmers used it properly. In the PE file format, each section can specify its executability. The execution flag has existed since the beginning of the format and standard linkers have always used this flag correctly, even long before the NX bit. Because of this, Windows is able to enforce the NX bit on old programs. Assuming the programmer complied with "best practices", applications should work correctly now that NX is actually enforced. Only in a few cases have there been problems; Microsoft's own .NET Runtime had problems with the NX bit and was updated.

Xbox

In Microsoft's Xbox, although the CPU does not have the NX bit, newer versions of the XDK set the code segment limit to the beginning of the kernel's .data section (no code should be after this point in normal circumstances). Starting with version 51xx, this change was also implemented into the kernel of new Xboxes. This broke the techniques old exploits used to become a TSR. However, new versions were quickly released supporting this new version because the fundamental exploit was unaffected.

Limitations

Where code is written and executed at runtime—a JIT compiler is a prominent example—the compiler can potentially be used to produce exploit code (e.g. using JIT Spray) that has been flagged for execution and therefore would not be trapped.[12][13]

In information security and programming, a buffer overflow, or buffer overrun, is an anomaly where a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory locations.

Buffers are areas of memory set aside to hold data, often while moving it from one section of a program to another, or between programs. Buffer overflows can often be triggered by malformed inputs; if one assumes all inputs will be smaller than a certain size and the buffer is created to be that size, then an anomalous transaction that produces more data could cause it to write past the end of the buffer. If this overwrites adjacent data or executable code, this may result in erratic program behavior, including memory access errors, incorrect results, and crashes.

Exploiting the behavior of a buffer overflow is a well-known security exploit. On many systems, the memory layout of a program, or the system as a whole, is well defined. By sending in data designed to cause a buffer overflow, it is possible to write into areas known to hold executable code and replace it with malicious code, or to selectively overwrite data pertaining to the program's state, therefore causing behavior that was not intended by the original programmer. Buffers are widespread in operating system (OS) code, so it is possible to make attacks that perform privilege escalation and gain unlimited access to the computer's resources. The famed Morris worm in 1988 used this as one of its attack techniques.

Programming languages commonly associated with buffer overflows include C and C++, which provide no built-in protection against accessing or overwriting data in any part of memory and do not automatically check that data written to an array (the built-in buffer type) is within the boundaries of that array. Bounds checking can prevent buffer overflows, but requires additional code and processing time. Modern operating systems use a variety of techniques to combat malicious buffer overflows, notably by randomizing the layout of memory, or deliberately leaving space between buffers and looking for actions that write into those areas ("canaries").

Buffer overflow protection is any of various techniques used during software development to enhance the security of executable programs by detecting buffer overflows on stack-allocated variables, and preventing them from causing program misbehavior or from becoming serious security vulnerabilities. A stack buffer overflow occurs when a program writes to a memory address on the program's call stack outside of the intended data structure, which is usually a fixed-length buffer. Stack buffer overflow bugs are caused when a program writes more data to a buffer located on the stack than what is actually allocated for that buffer. This almost always results in corruption of adjacent data on the stack, which could lead to program crashes, incorrect operation, or security issues.

Typically, buffer overflow protection modifies the organization of stack-allocated data so it includes a canary value that, when destroyed by a stack buffer overflow, shows that a buffer preceding it in memory has been overflowed. By verifying the canary value, execution of the affected program can be terminated, preventing it from misbehaving or from allowing an attacker to take control over it. Other buffer overflow protection techniques include bounds checking, which checks accesses to each allocated block of memory so they cannot go beyond the actually allocated space, and tagging, which ensures that memory allocated for storing data cannot contain executable code.

Overfilling a buffer allocated on the stack is more likely to influence program execution than overfilling a buffer on the heap because the stack contains the return addresses for all active function calls. However, similar implementation-specific protections also exist against heap-based overflows.

There are several implementations of buffer overflow protection, including those for the GNU Compiler Collection, LLVM, Microsoft Visual Studio, and other compilers.

Technical variations of Linux distributions include support for different hardware devices and systems or software package configurations. Organizational differences may be motivated by historical reasons. Other criteria include security, including how quickly security upgrades are available; ease of package management; and number of packages available.

These tables compare each active and noteworthy distribution's latest stable release on wide-ranging objective criteria. It does not cover each operating system's subjective merits, branches marked as unstable or beta, nor compare Linux distributions with other operating systems.

In computing, just-in-time (JIT) compilation (also dynamic translation or run-time compilations) is a way of executing computer code that involves compilation during execution of a program – at run time – rather than prior to execution. Most often, this consists of source code or more commonly bytecode translation to machine code, which is then executed directly. A system implementing a JIT compiler typically continuously analyses the code being executed and identifies parts of the code where the speedup gained from compilation or recompilation would outweigh the overhead of compiling that code.

JIT compilation is a combination of the two traditional approaches to translation to machine code – ahead-of-time compilation (AOT), and interpretation – and combines some advantages and drawbacks of both. Roughly, JIT compilation combines the speed of compiled code with the flexibility of interpretation, with the overhead of an interpreter and the additional overhead of compiling (not just interpreting). JIT compilation is a form of dynamic compilation, and allows adaptive optimization such as dynamic recompilation and microarchitecture-specific speedups – thus, in theory, JIT compilation can yield faster execution than static compilation. Interpretation and JIT compilation are particularly suited for dynamic programming languages, as the runtime system can handle late-bound data types and enforce security guarantees.

MINIX 3 is a project to create a small, high availability, high functioning Unix-like operating system. It is published under a BSD license and is a successor project to the earlier versions, MINIX 1 and 2.

The main goal of the project is for the system to be fault-tolerant by detecting and repairing its own faults on the fly, with no user intervention. The main uses of the system are envisaged to be embedded systems and education.As of 2017, MINIX 3 supports IA-32 and ARM architecture processors. It can also run on emulators or virtual machines, such as Bochs, VMware Workstation, Microsoft Virtual PC, Oracle VirtualBox, and QEMU. A port to PowerPC architecture is in development.The distribution comes on a live CD and can be downloaded as a live USB stick image. The latest release is "minix_R3.4.0rc6-d5e4fc0.iso.bz2" (09 May 2017).MINIX 3 is believed to be used in the Intel Management Engine (ME) found in Intel's Platform Controller Hub starting with the introduction of ME 11 which is used with Skylake and Kaby Lake processors.Its use in the Intel ME makes it the most widely used OS on Intel processors starting as of 2015, with more installations than Microsoft Windows, GNU/Linux, or macOS.

Memory protection is a way to control memory access rights on a computer, and is a part of most modern instruction set architectures and operating systems. The main purpose of memory protection is to prevent a process from accessing memory that has not been allocated to it. This prevents a bug or malware within a process from affecting other processes, or the operating system itself. Protection may encompass all accesses to a specified area of memory, write accesses, or attempts to execute the contents of the area. An attempt to access unowned memory results in a hardware fault, called a segmentation fault or storage violation exception, generally causing abnormal termination of the offending process. Memory protection for computer security includes additional techniques such as address space layout randomization and executable space protection.

The modified Harvard architecture is a variation of the Harvard computer architecture that allows the contents of the instruction memory to be accessed as if it were data. Most modern computers that are documented as Harvard architecture are, in fact, modified Harvard architecture.

The NX bit (no-execute) is a technology used in CPUs to segregate areas of memory for use by either storage of processor instructions (code) or for storage of data, a feature normally only found in Harvard architecture processors. However, the NX bit is being increasingly used in conventional von Neumann architecture processors, for security reasons.

An operating system with support for the NX bit may mark certain areas of memory as non-executable. The processor will then refuse to execute any code residing in these areas of memory. The general technique, known as executable space protection, is used to prevent certain types of malicious software from taking over computers by inserting their code into another program's data storage area and running their own code from within this section; one class of such attacks is known as the buffer overflow attack.

Intel markets the feature as the XD bit (execute disable). Advanced Micro Devices (AMD) uses the marketing term Enhanced Virus Protection (EVP). The ARM architecture refers to the feature, which was introduced in ARMv6, as XN (execute never). The term NX bit itself is sometimes used to describe similar technologies in other processors.

The Openwall Project is a source for various software, including Openwall GNU/*/Linux (Owl), a security-enhanced Linux distribution designed for servers. Openwall patches and security extensions have been included into many major Linux distributions.

As the name implies, Openwall GNU/*/Linux draws source code and design concepts from numerous sources, most importantly to the project is its usage of the Linux kernel and parts of the GNU userland, others include the BSDs, such as OpenBSD for its OpenSSH suite and the inspiration behind its own Blowfish-based crypt for password hashing, compatible with the OpenBSD implementation.

PaX is a patch for the Linux kernel that implements least privilege protections for memory pages. The least-privilege approach allows computer programs to do only what they have to do in order to be able to execute properly, and nothing more. PaX was first released in 2000.

PaX flags data memory as non-executable, program memory as non-writable and randomly arranges the program memory. This effectively prevents many security exploits, such as some kinds of buffer overflows. The former prevents direct code execution absolutely, while the latter makes so-called return-to-libc (ret2libc) attacks difficult to exploit, relying on luck to succeed, but doesn't prevent overwriting variables and pointers.

PaX is maintained by The PaX Team, whose principal coder is anonymous.

The grsecurity project includes PaX, along with other Linux kernel patches unique to grsecurity.

Return-oriented programming (ROP) is a computer security exploit technique that allows an attacker to execute code in the presence of security defenses such as executable space protection and code signing.In this technique, an attacker gains control of the call stack to hijack program control flow and then executes carefully chosen machine instruction sequences that are already present in the machine's memory, called "gadgets". Each gadget typically ends in a return instruction and is located in a subroutine within the existing program and/or shared library code. Chained together, these gadgets allow an attacker to perform arbitrary operations on a machine employing defenses that thwart simpler attacks.

Sigreturn-oriented programming (SROP) is a computer security exploit technique that allows an attacker to execute code in presence of security measures such as non-executable memory and code signing.It was presented for the first time at the 35th Security and Privacy IEEE conference in 2014 where it won the best student paper award.This technique employs the same basic assumptions behind the return-oriented programming (ROP) technique: an attacker controlling the call stack, for example through a stack buffer overflow, is able to influence the control flow of the program through simple instruction sequences called gadgets.

The attack works by pushing a forged sigcontext structure on the call stack, overwriting the original return address with the location of a gadget that allows the attacker to call the sigreturn system call.Often just a single gadget is needed to successfully put this attack into effect. This gadget may reside at a fixed location, making this attack simple and effective, with a setup generally simpler and more portable than the one needed by the plain return-oriented programming technique.Sigreturn-oriented programming can be considered a weird machine since it allows code execution outside the original specification of the program.

In software, a stack buffer overflow or stack buffer overrun occurs when a program writes to a memory address on the program's call stack outside of the intended data structure, which is usually a fixed-length buffer.

Stack buffer overflow bugs are caused when a program writes more data to a buffer located on the stack than what is actually allocated for that buffer. This almost always results in corruption of adjacent data on the stack, and in cases where the overflow was triggered by mistake, will often cause the program to crash or operate incorrectly. Stack buffer overflow is a type of the more general programming malfunction known as buffer overflow (or buffer overrun). Overfilling a buffer on the stack is more likely to derail program execution than overfilling a buffer on the heap because the stack contains the return addresses for all active function calls.

Stack buffer overflow can be caused deliberately as part of an attack known as stack smashing. If the affected program is running with special privileges, or accepts data from untrusted network hosts (e.g. a webserver) then the bug is a potential security vulnerability. If the stack buffer is filled with data supplied from an untrusted user then that user can corrupt the stack in such a way as to inject executable code into the running program and take control of the process. This is one of the oldest and more reliable methods for attackers to gain unauthorized access to a computer.

In computer science, a tagged architecture is a particular type of computer architecture where every word of memory constitutes a tagged union, being divided into a number of bits of data, and a tag section that describes the type of the data: how it is to be interpreted, and, if it is a reference, the type of the object that it points to.

Notable examples of American tagged architectures were the Lisp machines, which had tagged pointer support at the hardware and opcode level, the Burroughs large systems, which had a data-driven tagged and descriptor-based architecture, and the non-commercial Rice Computer. Both the Burroughs and Lisp machine were examples of high-level language computer architectures, where the tagging was used to support types from a high-level language at the hardware level.

In addition to this, the original Xerox Smalltalk implementation used the least-significant bit of each 16-bit word as a tag bit: if it was clear then the hardware would accept it as an aligned memory address while if it was set it was treated as a (shifted) 15-bit integer. Current Intel documentation mentions that the lower bits of a memory address might be similarly used by some interpreter-based systems.

In the Soviet Union, the Elbrus series of supercomputers pioneered the use of tagged architectures in 1973.

W^X ("Write XOR Execute"; spoken as W xor X) is a security feature in operating systems and virtual machines. It is a memory protection policy whereby every page in a process's or kernel's address space may be either writable or executable, but not both. Without such protection, a program can write (as data) CPU instructions in an area of memory intended for data and then arrange to run (as executable) those instructions. This can be dangerous if the writer of the memory is malicious.

Some early Intel 64 processors lacked the NX bit required for W^X, but this appeared in later chips. On processors with more limited features, such as the Intel i386, W^X requires using the CS code segment limit as a "line in the sand", a point in the address space above which execution is not permitted and data is located, and below which it is allowed and executable pages are placed.Linker changes are generally required to separate code (such as trampolines and other code needed for linker and library runtime functions) and data.

This page is based on a Wikipedia article written by authors
(here).
Text is available under the CC BY-SA 3.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.