At the beginning there’s a DOS header (the 64-byte IMAGE_DOS_HEADER structure, which starts with the “MZ” signature), followed by a small 16-bit DOS stub program. If we try to run the program on a DOS system, the stub prints an error message such as “This program cannot be run in DOS mode.” and exits. The header and stub together don’t have a fixed size; the e_lfanew field at offset 0x3C of the DOS header holds the file offset at which the PE header begins.
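As a minimal sketch of how a parser gets past the DOS part, the Python snippet below reads the e_lfanew field at offset 0x3C of the DOS header and checks that the “PE\0\0” signature sits at the offset it names. The function name and the synthetic sample buffer are assumptions made for illustration; a real tool would read the bytes from an actual executable.

```python
import struct


def find_pe_header_offset(data: bytes) -> int:
    """Return the file offset of the PE signature, as stored in e_lfanew."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew is the last field of the 64-byte IMAGE_DOS_HEADER (offset 0x3C).
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    return e_lfanew


# Synthetic buffer (illustration only): DOS header, 64 bytes of stub, signature.
sample = bytearray(0x84)
sample[0:2] = b"MZ"
struct.pack_into("<I", sample, 0x3C, 0x80)
sample[0x80:0x84] = b"PE\0\0"

print(hex(find_pe_header_offset(bytes(sample))))  # 0x80
```

The IMAGE_FILE_HEADER described next starts four bytes after the returned offset, right behind the signature.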

Then comes the 4-byte PE signature (“PE\0\0”), followed by the PE File Header, which is the structure IMAGE_FILE_HEADER and has the following members:

Machine [16 bits]: indicates the system (CPU architecture) the binary is intended to run on

NumberOfSections [16 bits]: number of sections that follow the headers

TimeDateStamp [32 bits]: the time the file was created by the linker, in seconds since January 1, 1970 (UTC)

PointerToSymbolTable [32 bits]: used for debugging (usually 0)

NumberOfSymbols [32 bits]: used for debugging (usually 0)

SizeOfOptionalHeader [16 bits]: the size in bytes of the IMAGE_OPTIONAL_HEADER that follows

Characteristics [16 bits]: a collection of flags:

IMAGE_FILE_RELOCS_STRIPPED: set if there is no relocation information in the file (in sections themselves)

IMAGE_FILE_EXECUTABLE_IMAGE: set if the file is an executable (it is not an object file or a library)

IMAGE_FILE_LINE_NUMS_STRIPPED: set if the line number information is stripped – not used for executable files

IMAGE_FILE_LOCAL_SYMS_STRIPPED: set if there is no information about local symbols in the file – not used for executable files

IMAGE_FILE_AGGRESIVE_WS_TRIM: set if the OS is supposed to trim the working set of the running process (the amount of memory the process uses) aggressively by paging it out

IMAGE_FILE_BYTES_REVERSED_LO and IMAGE_FILE_BYTES_REVERSED_HI: set if the endianness of the file is not what the machine would expect, so the bytes must be swapped before reading

IMAGE_FILE_32BIT_MACHINE: set if the machine is expected to be a 32 bit machine

IMAGE_FILE_DEBUG_STRIPPED: set if there is no debugging information

IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP: set if the application may not run from a removable medium such as a floppy or a CD-ROM (in this case, the OS is advised to copy the file to the swapfile and execute it from there)

IMAGE_FILE_NET_RUN_FROM_SWAP: set if the application may not run from the network (in this case, the OS is advised to copy the file to the swapfile and execute it from there)

IMAGE_FILE_SYSTEM: set if the file is a system file such as a driver

IMAGE_FILE_DLL: set if the file is a DLL, otherwise it’s an EXE

IMAGE_FILE_UP_SYSTEM_ONLY: set if the file is not designed to run on multiprocessor systems
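To make the layout above concrete, here is a rough Python sketch that unpacks the 20-byte IMAGE_FILE_HEADER and turns the Characteristics value into the flag names listed above. The bit values are the ones defined in winnt.h; the function name, the dictionary and the sample header bytes are assumptions made for illustration.

```python
import struct

# Characteristics bit values as defined in winnt.h.
CHARACTERISTICS = {
    0x0001: "IMAGE_FILE_RELOCS_STRIPPED",
    0x0002: "IMAGE_FILE_EXECUTABLE_IMAGE",
    0x0004: "IMAGE_FILE_LINE_NUMS_STRIPPED",
    0x0008: "IMAGE_FILE_LOCAL_SYMS_STRIPPED",
    0x0010: "IMAGE_FILE_AGGRESIVE_WS_TRIM",
    0x0080: "IMAGE_FILE_BYTES_REVERSED_LO",
    0x0100: "IMAGE_FILE_32BIT_MACHINE",
    0x0200: "IMAGE_FILE_DEBUG_STRIPPED",
    0x0400: "IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP",
    0x0800: "IMAGE_FILE_NET_RUN_FROM_SWAP",
    0x1000: "IMAGE_FILE_SYSTEM",
    0x2000: "IMAGE_FILE_DLL",
    0x4000: "IMAGE_FILE_UP_SYSTEM_ONLY",
    0x8000: "IMAGE_FILE_BYTES_REVERSED_HI",
}

# Little-endian: two 16-bit, three 32-bit, two 16-bit members (20 bytes).
FILE_HEADER = struct.Struct("<HHIIIHH")
FIELD_NAMES = (
    "Machine", "NumberOfSections", "TimeDateStamp", "PointerToSymbolTable",
    "NumberOfSymbols", "SizeOfOptionalHeader", "Characteristics",
)


def parse_file_header(raw: bytes) -> dict:
    """Unpack IMAGE_FILE_HEADER and add the decoded flag names."""
    header = dict(zip(FIELD_NAMES, FILE_HEADER.unpack_from(raw)))
    header["Flags"] = [
        name for bit, name in CHARACTERISTICS.items()
        if header["Characteristics"] & bit
    ]
    return header


# Made-up header: i386 machine (0x014c), 3 sections, 0xE0-byte optional header.
sample = FILE_HEADER.pack(0x014C, 3, 0, 0, 0, 0xE0, 0x0102)
header = parse_file_header(sample)
print(header["NumberOfSections"])  # 3
print(header["Flags"])  # ['IMAGE_FILE_EXECUTABLE_IMAGE', 'IMAGE_FILE_32BIT_MACHINE']
```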

Addresses stored in the PE headers are usually expressed as RVAs. An RVA (relative virtual address) is an offset from the address at which the image is loaded in memory. This is useful, because we don’t actually have to know the exact address of an item in memory, but only its offset within the current executable/library.
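The arithmetic behind an RVA is small enough to show directly. In the sketch below, the function name and the two base addresses are illustrative assumptions; the point is only that the same RVA stays valid wherever the image ends up in memory.

```python
def rva_to_va(image_base: int, rva: int) -> int:
    """Turn a relative virtual address into an absolute virtual address."""
    return image_base + rva


# Hypothetical RVA of some item inside the image.
ENTRY_RVA = 0x1000

# Image loaded at one address...
print(hex(rva_to_va(0x00400000, ENTRY_RVA)))  # 0x401000
# ...and the same image loaded at a different address.
print(hex(rva_to_va(0x10000000, ENTRY_RVA)))  # 0x10001000
```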

Let’s now take a look at the optional header, which contains the following elements:

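Magic [16 bits]: contains 0x010b for a 32-bit (PE32) image and 0x020b for a 64-bit (PE32+) image; the field sizes listed here are those of the PE32 format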

MajorLinkerVersion [8 bits]: set by linker

MinorLinkerVersion [8 bits]: set by linker

SizeOfCode [32 bits]: size of executable code

SizeOfInitializedData [32 bits]: size of initialized data

SizeOfUninitializedData [32 bits]: size of the uninitialized data

AddressOfEntryPoint [32 bits]: an RVA: offset of the entry point – execution starts here (the address of a DLL’s entry function or a program’s startup code)

BaseOfCode [32 bits]: offset to the executable code

BaseOfData [32 bits]: offset to the initialized data

ImageBase [32 bits]: preferred linear load address of the entire binary, including all the headers. This is the address (always a multiple of 64 KB) the file has been relocated to by the linker – if the binary can in fact be loaded at this address, the loader doesn’t need to relocate the file again. The preferred load address cannot be used if another image has already been loaded at that address (which can happen quite often if a linker’s default address is used). In this case, the image must be loaded at some other address and it needs to be relocated.

SectionAlignment [32 bits]: alignment of the PE file’s sections in RAM

FileAlignment [32 bits]: alignment of the PE file’s sections in the file

MajorOperatingSystemVersion [16 bits]: major version

MinorOperatingSystemVersion [16 bits]: minor version

MajorImageVersion [16 bits]: binary major version

MinorImageVersion [16 bits]: binary minor version. Many linkers don’t set this information correctly and many programmers don’t bother to supply it, so it’s better to rely on the version resource if one exists.

MajorSubsystemVersion [16 bits]: major subsystem version

MinorSubsystemVersion [16 bits]: minor subsystem version. This should be supplied correctly, because it is checked and used.

Win32VersionValue [32 bits]: reserved (usually 0)

SizeOfImage [32 bits]: the amount of memory the image will need in bytes. It is the sum of all headers and section lengths if aligned to SectionAlignment. It is a hint to the loader how many pages it will need in order to load the image.

SizeOfHeaders [32 bits]: the length of all headers including the data directories and the section headers. It is at the same time the offset from the beginning of the file to the first section’s raw data.

Checksum [32 bits]: checksum, which is only checked if the image is an NT driver, which will fail to load if the checksum isn’t correct. For other binary types, the checksum isn’t used and may be 0.

Subsystem [16 bits]: tells you in which of the NT-subsystems the image runs:

IMAGE_SUBSYSTEM_NATIVE: the image doesn’t need a subsystem (drivers)

IMAGE_SUBSYSTEM_WINDOWS_GUI: the image is a Win32 graphical binary

IMAGE_SUBSYSTEM_WINDOWS_CUI: the image is a Win32 console binary

IMAGE_SUBSYSTEM_OS2_CUI: the image is an OS/2 console binary (OS/2 binaries will be in OS/2 format, so this is rarely used)

IMAGE_SUBSYSTEM_POSIX_CUI: the image is a POSIX console binary

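DllCharacteristics [16 bits]: a collection of flags. Originally, if the image was a DLL, they told the loader when to call the DLL’s entry point; those bits are now reserved, and current linkers use the field for flags such as IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE (the image can be relocated at load time) and IMAGE_DLLCHARACTERISTICS_NX_COMPAT (the image is compatible with data execution prevention).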

SizeOfStackReserve [32 bits]: size of reserved stack

SizeOfStackCommit [32 bits]: size of initially committed stack

SizeOfHeapReserve [32 bits]: size of reserved heap

SizeOfHeapCommit [32 bits]: size of initially committed heap. The reserved amounts are address space (not real RAM) that is reserved for a specific purpose. At program startup, the committed amount is actually allocated in RAM.

LoaderFlags [32 bits]: reserved (usually 0)

NumberOfRvaAndSizes [32 bits]: number of valid entries in the directories that follow immediately (unreliable – rather use the constant IMAGE_NUMBEROF_DIRECTORY_ENTRIES)

DataDirectory: an array of IMAGE_DATA_DIRECTORY structures stored at the end of the optional header. Each entry holds the RVA and the size of one table; the most important directories are the following [20]:

Export Table: Lists the names and RVAs of all exported functions in the current module.

Import Table: Lists the names of modules and functions that are imported by the current module. For each function, the list contains a name string (or an ordinal) and the RVA that points to the current function’s import address table entry. This is the entry that receives the actual pointer to the imported function at runtime, when the module is loaded.

Resource Table: Points to the executable’s resource directory, which is a static definition of various user-interface elements such as strings, dialog box layouts and menus.

Base Relocation Table: Contains a list of addresses within the module that must be recalculated in case the module gets loaded at any address other than the one it was built for.

Debugging Information: Contains debugging information for the executable. This is usually presented in the form of a link to an external symbol file that contains the actual debugging information.

Thread Local Storage Table: Points to a special thread-local section in the executable that can contain thread-local variables. This is managed by the loader when the executable is loaded.

Load Configuration Table: Contains a variety of image configuration elements, such as a special lock prefix table, which can modify an image at load time to accommodate uniprocessor or multiprocessor systems. This table also contains information for a special security feature that lists the legitimate exception handlers in the module (to prevent malicious code from installing an illegal exception handler).

Bound Import Table: Contains an additional import-related table that contains information on bound import entries. A bound import means that the importing executable contains actual addresses into the exporting module. This directory is used for confirming that such addresses are still valid.

Import Address Table (IAT): Contains a list of entries for each function imported from modules. These entries are initialized at load time and hold the names of the functions as well as actual addresses to them.

Delay Import Table: Contains special information that can be used for implementing a delayed-load importing mechanism whereby an imported function is only resolved when it is first called. This mechanism is not supported by the OS and is implemented in the C runtime library.

We didn’t actually describe all of the data directories here. All of them are defined in the winnt.h header file and are shown in the picture below.
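The directory indices that winnt.h defines as IMAGE_DIRECTORY_ENTRY_* constants can also be written out as a small Python enum and used to walk the DataDirectory array. This is only a sketch under stated assumptions: it handles the 32-bit (PE32) optional header, whose fixed fields occupy 96 bytes before the array, and the shortened enum names, the function name and the sample buffer are made up for illustration.

```python
import struct
from enum import IntEnum


class DirectoryEntry(IntEnum):
    """Indices into DataDirectory (IMAGE_DIRECTORY_ENTRY_* in winnt.h)."""
    EXPORT = 0
    IMPORT = 1
    RESOURCE = 2
    EXCEPTION = 3
    SECURITY = 4
    BASERELOC = 5
    DEBUG = 6
    ARCHITECTURE = 7
    GLOBALPTR = 8
    TLS = 9
    LOAD_CONFIG = 10
    BOUND_IMPORT = 11
    IAT = 12
    DELAY_IMPORT = 13
    COM_DESCRIPTOR = 14


# PE32 only: Magic through NumberOfRvaAndSizes, 96 bytes in total.
FIXED_FIELDS = struct.Struct("<HBB9I6H4I2H6I")
# One IMAGE_DATA_DIRECTORY entry: VirtualAddress (an RVA) and Size.
DATA_DIRECTORY = struct.Struct("<II")


def read_data_directories(optional_header: bytes) -> dict:
    """Return {name: (rva, size)} for every non-empty directory entry."""
    fields = FIXED_FIELDS.unpack_from(optional_header)
    magic, count = fields[0], fields[-1]  # Magic, NumberOfRvaAndSizes
    if magic != 0x010B:
        raise ValueError("this sketch only handles PE32 (Magic 0x010b)")
    directories = {}
    for entry in DirectoryEntry:
        if entry >= count:
            break
        offset = FIXED_FIELDS.size + entry * DATA_DIRECTORY.size
        rva, size = DATA_DIRECTORY.unpack_from(optional_header, offset)
        if rva or size:
            directories[entry.name] = (rva, size)
    return directories


# Synthetic optional header (illustration only) with just an import table.
sample = bytearray(FIXED_FIELDS.size + 16 * DATA_DIRECTORY.size)
struct.pack_into("<H", sample, 0, 0x010B)  # Magic
struct.pack_into("<I", sample, 92, 16)     # NumberOfRvaAndSizes
DATA_DIRECTORY.pack_into(
    sample, FIXED_FIELDS.size + DirectoryEntry.IMPORT * DATA_DIRECTORY.size,
    0x2000, 0x50,
)
print(read_data_directories(bytes(sample)))  # {'IMPORT': (8192, 80)}
```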

After that, there are also various sections like .data and .text that are an important part of the executable, because they hold the data of the program and the instructions that will be executed once the executable is loaded into memory. There are also a lot of other structures, but we will not look at them in this article.
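The SectionAlignment and FileAlignment members of the optional header decide how much room each of these sections takes up in memory and in the file. The sketch below shows the usual round-up calculation, assuming both alignments are powers of two; the helper name and the concrete numbers are hypothetical and chosen only to illustrate the arithmetic.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


# Hypothetical values for illustration only.
SECTION_ALIGNMENT = 0x1000  # alignment of sections in memory
FILE_ALIGNMENT = 0x200      # alignment of sections in the file
text_size = 0x1234          # unpadded size of a made-up .text section

print(hex(align_up(text_size, FILE_ALIGNMENT)))     # 0x1400
print(hex(align_up(text_size, SECTION_ALIGNMENT)))  # 0x2000
```

Summing the headers and every section rounded up to SectionAlignment in this way gives the kind of total that SizeOfImage is described as holding.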

Conclusion

We’ve looked at the various fields of the PE file header and the optional header. We saw that the data directories are an important part of the executable/library, because they contain useful information such as the RVAs of imported/exported functions, resources and debugging information. After the headers come the sections that make up the executable: .idata, .data, .text and others. The .data section holds the program’s data, while the .text section holds the instructions that are executed when the executable is loaded into memory and started.

Dejan Lukan is a security researcher for InfoSec Institute and penetration tester from Slovenia. He is very interested in finding new bugs in real world software products with source code analysis, fuzzing and reverse engineering. He also has a great passion for developing his own simple scripts for security related problems and learning about new hacking techniques. He knows a great deal about programming languages, as he can write in a couple of dozen of them. His other passions are antivirus bypassing techniques, malware research and operating systems, mainly Linux, Windows and BSD. He also has his own blog available here: http://www.proteansec.com/.
