For a new software product under development, we needed to achieve Single-Sign-On (SSO) to other systems from 802.1X sessions authenticated by Network Policy Server (NPS), Microsoft’s RADIUS server implementation. (This was to be done without any integration with DHCP servers or SNMP, to avoid taking dependencies and to minimise operational and deployment complexity.)

(One of these other systems, for example, was Palo Alto firewalls via the supplied User-Id API. To call this XML-based API, a client’s SAM compatible user name is provided along with the client’s IP address and an assertion that either a log on or a log off event has occurred for that client. Optionally, a time out can be provided too.)
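As a rough illustration of the kind of call described above, the following Python sketch builds a User-Id XML payload for a single log on or log off event. The element names follow the publicly documented uid-message format as we understand it; the helper name build_uid_message, the example account and the example address are hypothetical, and the unit of the time out value should be checked against the firewall documentation before relying on it.

```python
import xml.etree.ElementTree as ET


def build_uid_message(user, ip, logged_on, timeout=None):
    """Build a User-Id XML payload for one log on or log off event.

    user is the SAM compatible name (DOMAIN\\user) and ip is the client address.
    """
    root = ET.Element("uid-message")
    ET.SubElement(root, "version").text = "1.0"
    ET.SubElement(root, "type").text = "update"
    payload = ET.SubElement(root, "payload")
    event = ET.SubElement(payload, "login" if logged_on else "logout")
    entry = ET.SubElement(event, "entry", {"name": user, "ip": ip})
    # A time out is only meaningful for a log on event, so it is ignored otherwise.
    if logged_on and timeout is not None:
        entry.set("timeout", str(timeout))
    return ET.tostring(root, encoding="unicode")


print(build_uid_message("EXAMPLE\\fred", "192.0.2.10", True, timeout=60))
```

Sending the payload to the firewall (transport, API key handling, error handling) is deliberately left out of the sketch.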

To achieve this successfully, a client’s identity, its IP address and knowledge of when its session starts and when it ends must be ascertained so that appropriate events can be generated and appropriate calls in to other systems can be made. The best way to achieve this with NPS is by writing a Network Policy Server Extension.

Unfortunately, a major roadblock was hit along the way in trying to achieve this, uncovering a multifaceted identity spoofing security vulnerability in recent versions of NPS. Due to multiple product issues, this is a vulnerability that NPS now forces both upon consumers of its extensions API and also on to the wider RADIUS ecosystem in other products that are already in the market and deployed widely. These other products typically do not integrate with NPS directly via an extension but use the identity available from RADIUS accounting, SNMP or Syslog.

The core of the vulnerability can be traced back to a change in product behaviour that was made in Windows Server 2008 R2 with the introduction of EAP identity privacy support. As of this release, the value of the EAP-Response/Identity packet, which a NAS uses in the User-Name attribute of RADIUS Access-Request and Accounting-Request packets and in the information that it makes available via Syslog and SNMP, can no longer be used as a source of identity information. This is because, where TLS-based EAP authentication is in use, it now only contains the EAP outer-identity. This can be anonymised, spoofed or differ from the EAP inner-identity, the real identity of the client. Identity privacy is supported by EAP types such as PEAP.

For example, the EAP outer-identity could be anonymous@example.com or administrator@example.com, the EAP inner-identity actually being fred@example.com. This gives rise to a trivial identity spoofing security vulnerability if this value is used for access control, auditing or logging purposes, the scope of which is limited only by what the identity is subsequently used for by dependent systems.
(By impersonating another account, which can be higher privileged, this often allows a user to circumvent Web filtering restrictions, to falsely implicate another user for actions that they themselves have carried out or to get access to resources that their user account has no rights to.)

It is possible to configure identity privacy in many platforms such as Windows, OS X and iOS (via MobileConfig profile), and Android.
In Windows, this is from Windows 7 onwards. (The field is easily accessible so the barrier to entry is low.)

To get access to a client’s real identity when Accounting occurs, either the Authentication/Authorization that has previously occurred must conceptually be linked/bound to the Accounting, or the client’s real identity must be returned to the NAS and the NAS must use this when Accounting so that the value of the User-Name attribute is correct.
(Where a revised identity is returned, this should also be used by a NAS in any session information that it makes available via Syslog or SNMP.)

The first approach, binding, is the most desirable as it maintains EAP identity privacy guarantees, so it is the gold standard when designing a solution and is therefore explored first. (The second approach is explored later on as a potential workaround.)

To perform such binding, the following process is conceptually required on the EAP terminating NPS instance:

From the Authorization extension point where an Access-Accept will be generated, a client’s real identity is available in SAM compatible format to an NPS extension in the ratStrippedUserName extended RADIUS attribute. (This cannot have been spoofed or anonymised at this point.)

From the Authorization extension point where an Accounting-Request packet has been received (Start, Stop, Interim-Update), a client’s IP address is available to an NPS extension as well as knowledge of when its session starts and when it ends. (For IPv4, the client’s IP address is available via the Framed-IP-Address attribute in Interim-Update Accounting-Request packets. For IPv6, client IP addresses are instead available via Framed-IPv6-Address attributes. They are made available by NASes that implement DHCP snooping functionality.)

The binding itself relies on the RADIUS Class attribute, which RFC 2865 describes as follows:

“This Attribute is available to be sent by the server to the client in an Access-Accept and SHOULD be sent unmodified by the client to the accounting server as part of the Accounting-Request packet if accounting is supported.
The client MUST NOT interpret the attribute locally.”

A Class attribute is generated by NPS by default and is included in the Access-Accepts that it sends. (It is, in all but name, a session cookie.)

All subsequent accounting must then always include this Class attribute, which allows binding of Authentication/Authorization to Accounting to be performed against it.
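Conceptually, the binding amounts to a lookup table keyed on the Class value. The Python sketch below is only an illustration of that idea, not the code of an actual NPS extension: the BindingTable and Session names are hypothetical, and it assumes that the Class value is known at the point where the Access-Accept is generated.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Session:
    identity: str                      # real identity, e.g. from ratStrippedUserName
    ip_address: Optional[str] = None   # learned later from accounting


class BindingTable:
    """Binds Authentication/Authorization to Accounting via the Class value."""

    def __init__(self) -> None:
        self._sessions: Dict[bytes, Session] = {}

    def on_access_accept(self, class_value: bytes, identity: str) -> None:
        # Remember which real identity the session cookie was issued for.
        self._sessions[class_value] = Session(identity)

    def on_accounting(self, class_value: bytes, status_type: str,
                      framed_ip: Optional[str] = None) -> Optional[Tuple[str, str, str]]:
        session = self._sessions.get(class_value)
        if session is None:
            # Unknown Class: never fall back to the spoofable User-Name.
            return None
        if framed_ip is not None:
            session.ip_address = framed_ip
        if status_type == "Stop":
            del self._sessions[class_value]
            event = "logoff"
        else:
            event = "logon"
        if session.ip_address is None:
            return None   # nothing to report until an address is known
        return (event, session.identity, session.ip_address)


table = BindingTable()
table.on_access_accept(b"cookie-1", "EXAMPLE\\fred")
print(table.on_accounting(b"cookie-1", "Start"))
print(table.on_accounting(b"cookie-1", "Interim-Update", "192.0.2.10"))
print(table.on_accounting(b"cookie-1", "Stop"))
```

The User-Name attribute of the Accounting-Request is never consulted, which is what keeps the result immune to a spoofed outer-identity.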

The format of the Class attribute that is generated by NPS is documented by Microsoft as follows:

“Attribute: Class
ID: 25
Data type: Text

Represents the attribute sent to the client in an Access-Accept packet, which is useful for correlating Accounting-Request packets with authentication sessions. The format is:

Type contains the value 25 (1 octet).

Length contains a value of 20 or greater (1 octet).

Checksum contains an Adler-32 checksum that is computed over the remainder of the Class attribute (4 octets).

Vendor-ID contains the ID of the NAS vendor (4 octets). The high-order octet is 0 and the low-order 3 octets are the SMI Network Management Private Enterprise Code of the vendor in network byte order, as defined in “Private Enterprise Numbers” at http://www.iana.org/assignments/enterprise-numbers.

Version contains the value of 1 (2 octets).

Server-Address contains the IP address of the RADIUS server that issued the Access-Challenge message. For multihomed servers, this is the address of the network interface that received the original Access-Request message (2 octets).

Service-Reboot-Time specifies the time at which the first serial number was returned (8 octets).

Unique-Serial-Number contains a unique number to distinguish an individual connection attempt (8 octets).

String contains information that is used to classify accounting records for additional analysis (0 or more octets). In NPS, the Class attribute is copied into the String field.

The Class attribute is used to match the accounting and authentication records if it is sent by the NAS in the Accounting-Request message. The combination of Serial-Number, Service-Reboot-Time, and Server-Address must be a unique identification for each authentication that the RADIUS server performs.”
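As a small illustration of the checksum described in the quoted format, the Python sketch below checks an Adler-32 value stored in the first four octets of a Class value against the remaining octets. It assumes that the checksum is stored in network byte order and covers everything after it; neither detail is confirmed by the documentation quoted above, and the sample value is synthetic rather than captured from NPS.

```python
import struct
import zlib


def checksum_matches(class_value: bytes) -> bool:
    """Return True if the leading 4 octets equal Adler-32 of the remainder.

    Assumption: the checksum is big-endian (network byte order).
    """
    if len(class_value) < 4:
        raise ValueError("Class value is too short to hold a checksum")
    (stored,) = struct.unpack("!I", class_value[:4])
    remainder = class_value[4:]
    return stored == (zlib.adler32(remainder) & 0xFFFFFFFF)


# Synthetic example: build a value whose checksum is correct by construction.
body = b"\x00\x00\x01\x37" + b"\x00\x01" + b"example-remainder"
sample = struct.pack("!I", zlib.adler32(body) & 0xFFFFFFFF) + body
print(checksum_matches(sample))
```

A check of this kind only detects accidental corruption. Adler-32 is not keyed, so it offers no protection against a deliberately constructed value.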

(Poor aspects of this format are that it is a predictable construction and that it leaks information. It is also not apparent how the Server-Address field is handled when IPv6 is in use.)

The Class attribute is generally defined for NPS under the RADIUS_ATTRIBUTE_TYPE enumeration as ratClass and is documented by Microsoft as follows:

“Specifies a value that is provided to the NAS by the authentication provider. The NAS should use this value when communicating with the accounting provider. The value field in RADIUS_ATTRIBUTE for this type is a pointer. See RFC 2865 for more information.”

The first fault in NPS that we encountered during the course of implementation was that the NPS generated RADIUS Class attribute (ratClass) is not made available to extensions at these points during the Authentication/Authorization process via the RadiusExtensionProcess2 API or the older generation RadiusExtensionProcessEx and RadiusExtensionProcess APIs. Both as documented and conceptually, it should be inserted before Authorization Extension DLLs are called in the packet flow process, not after.

“In an Authorization DLL, RadiusExtensionProcess2 receives both the attributes generated by the NPS authorization service and the attributes generated from previously called Authorization DLLs.”

Had the generated Class attribute been inserted at the correct place, as documented, it would have allowed binding to be performed as the attribute would have been available to Authorization Extension DLLs at the point when an Access-Accept is pending to be sent in response. But, by inserting it after all extensions have been called, it does not allow for any binding to occur within an extension based on its value.

This is, therefore, a clear and unambiguous bug in Network Policy Server that causes extensions to be security vulnerable when they interact with RADIUS accounting information. It, for example, prohibits anybody from reliably and robustly implementing and integrating a Single Sign On / Federated Authentication system via the extensions API in a way where the solution can then be both widely distributed and immune to identity spoofing, while still meeting the privacy guarantees expected of EAP identity privacy. The same is true if the use case was auditing or logging the accounting information. (Potential workarounds, which turn out to be nuanced and complex, are explored later on.)

The NPS RADIUS packet flow diagram, originally provided by Microsoft, has been annotated in red to show the fault graphically and, hopefully, more clearly:

Fault Isolation

In Wireshark, we can observe and validate that the RADIUS Class attribute is being sent in Access-Accepts to a NAS (switch or access point):

However, when a debugging mechanism is attached and is used to look at the contents of the response RADIUS_ATTRIBUTE_ARRAY at the Authorization extension point, we can observe and validate that the Class attribute (ratClass) is missing:

It is interesting to note that EAP-Message has also been appended past the Authorization Extension point. The only RADIUS attribute that should conceptually be added to or updated in the response RADIUS attributes after Authorization Extension DLLs have been called is the Message-Authenticator RADIUS attribute, which is used to sign the packet and is therefore sensitive to its contents.

Test Case

A concise and simple test case written in C is given below that confirms reproducibility of the issue.

Compile the code to a DLL and ensure that an ACE in the ACL for the file permits the Network Service account to be able to Read and Execute.

Ensure that a client can gain access to a network due to Access-Accept packets being generated by NPS and delivered to a NAS.

Stop the Network Policy Server service (IAS), enable the extension by modifying the registry as documented in Setting Up the Extension DLLs and start the service again.

Access attempts are now rejected where the generated Class attribute is missing, demonstrating that it is impossible to perform binding based on its value in subsequent accounting.
(The extension explicitly rejecting the authentication attempt can be verified in the appropriate Event Log.)

The following table provides a guide to which attributes may be found in which kinds of packets, and in what quantity.

Request   Accept   Reject   Challenge   #    Attribute
0         0+       0        0           25   Class

It is possible via the extensions API, and valid according to the specification, to add an additional Class attribute whose value is known, using a GUID or something with similar properties. In practice, however, multiple Class attributes cannot be sent in an Access-Accept as many NASes do not support their presence. They will either drop the packet or respond with only the last Class attribute that they receive. This stops an extension from being one that can be widely distributed. Where there is a need to support many different devices in heterogeneous environments, some of which will be end-of-life and out of all software support, getting NAS behaviour changed is not a remotely realistic proposition.
(For example, for Comware based devices, used on many HP, H3C and 3Com switches and wireless access points, only the last Class attribute received will be used in Accounting-Request packets.)

We have validated that an additional, known Class attribute can be added via the extensions API, resulting in two being sent in an Access-Accept.

2) Use the User-Name Attribute in Access-Accept Packets

It is possible to add a User-Name attribute to an Access-Accept packet via the extensions API with the value of the ratStrippedUserName extended RADIUS attribute. This contains the client’s real identity in SAM compatible format.

“It MAY be sent in an Access-Accept packet, in which case the client SHOULD use the name returned in the Access-Accept packet in all Accounting-Request packets for this session.”
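To show what the NAS would see on the wire, the Python sketch below encodes a User-Name attribute in the type, length, value layout that RFC 2865 defines for RADIUS attributes. It is purely illustrative: within an NPS extension the attribute is added through the extensions API rather than built by hand, and the encode_attribute helper and the example identity are hypothetical.

```python
USER_NAME = 1   # RFC 2865 attribute type for User-Name
CLASS = 25      # RFC 2865 attribute type for Class


def encode_attribute(attr_type: int, value: bytes) -> bytes:
    """Encode one RADIUS attribute as type (1 octet), length (1 octet), value.

    The length octet counts the type and length octets as well as the value,
    so a value can be at most 253 octets long.
    """
    if not 1 <= attr_type <= 255:
        raise ValueError("attribute type must fit in one octet")
    if len(value) > 253:
        raise ValueError("attribute value is too long")
    return bytes([attr_type, len(value) + 2]) + value


# The real identity, as it would be returned to the NAS in an Access-Accept.
encoded = encode_attribute(USER_NAME, "EXAMPLE\\fred".encode("utf-8"))
print(encoded.hex())
```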

This is not, however, a general workaround to the problem in all deployment scenarios for three specific reasons:

Firstly, as the client’s real identity is returned back to the NAS from the EAP terminating RADIUS server and all subsequent accounting will contain this identity, the privacy guarantees that EAP identity privacy is meant to bring are broken.
(It has to be said, however, that this is acceptable in many use cases. Identity privacy comes in differing degrees and is not an all or nothing concern, and there is the likely consideration that security should trump absolute identity privacy. The main practical concern that identity privacy solves today is to avoid leaking the identity of wireless clients into the air in an unencrypted way. The integrity of the path from the NAS to the RADIUS server is usually of less concern as it is typically over a more physically secured hardwired, back end connection where there is far less of a tangible risk of interception.)

Secondly, the User-Name attribute is sometimes rewritten in RADIUS proxying scenarios. This adds fragility where the authentication path is not fully controlled.

Thirdly, not all NASes support processing the User-Name attribute where one is present in an Access-Accept packet so will continue to account with the EAP outer-identity. At present, devices based on Extreme’s XOS and HP/H3C’s Comware are examples of this.

Many firewalls and Web filtering platforms have become identity aware and have started using either RADIUS accounting information via a method that does not involve an NPS extension or Syslog snooping for Single-Sign-On (SSO) purposes. In these systems, where the Authentication/Authorization process is inherently isolated and opaque and because there is no integration with it, it is necessary either that:

The EAP terminating RADIUS server returns the User-Name attribute with the client’s real identity AND that the NASes support processing this attribute.

The EAP terminating RADIUS server else mandates that the EAP outer-identity and EAP inner-identity resolve to the same discrete user, prohibiting the use of anonymous EAP outer-identities. (This is normally a bad idea, but doing this acts as a compromise workaround where the concern of preventing identity spoofing trumps privacy.)

It is unfortunate that Microsoft has not added either of these two bits of functionality within NPS itself. These issues, like this missing Class attribute, facilitate security vulnerable deployments. It is an oversight in the design and is a variation of the vulnerability.

From a defense in depth perspective, it would also be beneficial to have the ability on the EAP terminating RADIUS server to constrain the EAP outer-identity so that the user portion of the User-Name must have the value null or “anonymous” where it does not resolve to the same discrete user represented by the EAP inner-identity. This would mitigate the identity spoofing attack, assuming that a genuine user with the username anonymous does not exist. It forces EAP identity privacy use to be explicit and it would, therefore, be inherently obvious where it has not been handled properly.
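The constraint described above can be sketched as a simple policy check. The Python below is a minimal illustration under a loud assumption: it compares identities as case-insensitive strings, whereas a real implementation would resolve both identities to directory accounts before comparing them. The function names are hypothetical.

```python
def user_portion(identity: str) -> str:
    """Return the user part of 'user@realm' or 'DOMAIN\\user', lower-cased."""
    if "\\" in identity:
        identity = identity.split("\\", 1)[1]
    return identity.split("@", 1)[0].strip().lower()


def outer_identity_allowed(outer: str, inner: str) -> bool:
    """Accept an outer-identity only if it is explicitly anonymous or matches.

    Assumption: string equality stands in for resolving both identities to the
    same discrete user.
    """
    if user_portion(outer) in ("", "anonymous"):
        return True
    return outer.strip().lower() == inner.strip().lower()


print(outer_identity_allowed("anonymous@example.com", "fred@example.com"))
print(outer_identity_allowed("administrator@example.com", "fred@example.com"))
```

With this policy, the spoofing example given earlier (an outer-identity of administrator@example.com with an inner-identity of fred@example.com) is rejected, while an explicitly anonymous outer-identity is still permitted.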

Such functionality, if implemented, would be best configurable via boolean profile attributes.
The profile attributes would presumably be named something like Generate-User-Name, Bind-User-Identity and Limit-Outer-Identity, maintaining similarity to the existing Generate-Class-Attribute and Generate-Session-Timeout options, and the MMC-based GUI would need to be enhanced to be able to configure these. These options should all be disabled by default.

Some of the vendors, for example, who are affected by this today who have products that use RADIUS accounting or Syslog integrations for SSO purposes are:

Microsoft’s support for EAP identity privacy was added from both a client and server perspective starting with Windows 7 and Windows Server 2008 R2. While it is a highly desirable property to have for authentication, it appears that little to no consideration was given to the wider security impact of these changes to the ecosystem. The nature of the change was to silently introduce an identity spoofing security vulnerability both for existing IAS/NPS extensions and for external systems that integrate with NASes directly for identity information, typically via RADIUS accounting or Syslog.

Many of the people in the profession who we have engaged in discussion about the vulnerability have usually been aware of the feature but not of the implications that flow from it. This needs to change.

To resolve this properly, Microsoft, for its part, would likely need to do five things:

Ensure that the Class attribute that NPS generates is available to extensions so that they can be implemented to perform binding from auth to accounting so that identity spoofing cannot occur in extensions on the EAP-terminating RADIUS server.

Add functionality to allow NPS to be configured to mandate that the user portion of the EAP outer-identity must be null or “anonymous” where the EAP outer-identity and inner-identity do not resolve to the same discrete user, making identity privacy use explicit and therefore preventing identity spoofing. (Enabled by default.)

Add functionality to allow NPS to be configured to return the client’s real identity back in the User-Name attribute of an Access-Accept to allow a NAS to account with it on the EAP-terminating RADIUS server, preventing identity spoofing. (Disabled by default with a warning about it partially compromising identity privacy.)

Add functionality to allow NPS to be configured to mandate that the EAP outer-identity and inner-identity must resolve to the same discrete user on the EAP-terminating RADIUS server, preventing identity spoofing. (Disabled by default with a warning about it fully compromising identity privacy.)

Document the issues so that implementers at all levels become aware of the vulnerability and can take appropriate remedial action in their products and deployments.

You will be asked for the password to the keystore that has been created, which is “iMCV300R002”, and the password to the imported key, which is the password to the original source keystore.
When asked for the password for the cloned key that is to be created, specify “iMCV300R002” and confirm it.

Enter keystore password:
Enter key password for <le-5aa48bcf-bb89-4447-bdc5-1a1aa868355c>
Enter key password for <imc>
(RETURN if same as for <le-5aa48bcf-bb89-4447-bdc5-1a1aa868355c>)
Re-enter new password:

4) Delete the now superfluous key from the keystore by specifying its alias and giving the keystore password, “iMCV300R002”:

It is a little-known fact that, as of the NT branch, Windows supports case sensitivity. Using it, however, is non-obvious, requiring knowledge of the design and inner workings of the operating system.

Beneath the Win32 environment subsystem that most programmers are familiar with lies the NT Executive. It is accessed via the NT Native API and, at this level, programmers must be familiar with the OBJECT_ATTRIBUTES structure that is used when creating or opening executive objects. It is a knowledge of this structure, and how the Win32 subsystem uses it, that is required to truly understand case handling in Windows.

When it comes to case sensitivity, however, most users will be concerned with only one object type, Files.

The OBJECT_ATTRIBUTES structure, as documented, “specifies attributes that can be applied to objects or object handles by routines that create objects and/or return handles to objects”.

One of its members is the Attributes bitmask that specifies the attributes for the handle to an object.

The flag within this that is of interest is the OBJ_CASE_INSENSITIVE flag.
It is documented as follows:

If this flag is specified, a case-insensitive comparison is used when matching the name pointed to by the ObjectName member against the names of existing objects. Otherwise, object names are compared using the default system settings.

This means that when the flag is specified, case insensitive operation is guaranteed. When it is not, however, it is left to the system to decide.

This documentation is arguably incomplete as it does not state what the ‘default system settings’ are, or how to control them.

It turns out that there are two parts to this: firstly, a single setting, obcaseinsensitive, that controls case sensitivity on a system-wide basis and then, assuming that case sensitive operation is supported, file system support on a per-volume basis.

The obcaseinsensitive value is located in the registry at “\REGISTRY\MACHINE\System\CurrentControlSet\Control\Session Manager\Kernel” and is a DWORD. It is read only as a system boots.
(In Win32 nomenclature, the path is “HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\Session Manager\Kernel”.)

When set to 0, the object manager runs in case sensitive mode.

When set to 1, the object manager runs in case insensitive mode.

When unspecified, NT 5.1 (Windows XP) and later editions default to running in case insensitive mode.

obcaseinsensitive has no meaning in NT 5.0 (Windows 2000) and prior versions of NT, which always run in case sensitive mode.

At the Win32 environment subsystem, the only exposure to case sensitivity is for File objects and via the following APIs:

CreateFile / CreateFileTransacted and the FILE_FLAG_POSIX_SEMANTICS flag, which ensures that the OBJ_CASE_INSENSITIVE flag is not set in the OBJECT_ATTRIBUTES structure used in internal calls to NtCreateFile.

FindFirstFileEx / FindFirstFileTransacted and the FIND_FIRST_EX_CASE_SENSITIVE flag, which ensures that the OBJ_CASE_INSENSITIVE flag is not set in the OBJECT_ATTRIBUTES structure used in internal calls to NtCreateFile to open a directory file before subsequent calls to NtQueryDirectoryFile.

(It is worth noting that most existing Win32 applications will not set the FILE_FLAG_POSIX_SEMANTICS flag or the FIND_FIRST_EX_CASE_SENSITIVE flag when calling these APIs, or will rely on other APIs that internally do not. Because of this, the OBJ_CASE_INSENSITIVE flag will always end up being set by the Win32 subsystem in the OBJECT_ATTRIBUTES structures that it creates to call NT Native APIs. This always enforces insensitivity for such applications.)

Official documentation from Microsoft for obcaseinsensitive is sparse and patchy:

The value is also managed by group policy, where the best explanation given by Microsoft can also be found:

This security setting determines whether case insensitivity is enforced for all subsystems. The Win32 subsystem is case insensitive. However, the kernel supports case sensitivity for other subsystems, such as POSIX.

If this setting is enabled, case insensitivity is enforced for all directory objects, symbolic links, and IO objects, including file objects. Disabling this setting does not allow the Win32 subsystem to become case sensitive.

There are times that applications need to determine if a Windows system is running with case insensitivity, perhaps for path comparison. A naive implementation will simply query the obcaseinsensitive value from the registry, but this has drawbacks:

The value only represents how the NT executive’s object manager will operate at the next boot, not necessarily how the system is right now.

NT 5.0 (Windows 2000) and prior, if supported, must be special cased as case insensitivity is not supported. Further, the obcaseinsensitive value may be present in the registry even though it is meaningless.

Future versions of Windows may decide to implement things differently, such as in a more granular way. (Perhaps on a per-thread or per-process basis via a privilege.)

The solution is to use NtOpenSymbolicLinkObject to attempt to open, with incorrect casing, the known symbolic link “SystemRoot”, which resides in the root directory of the object namespace.

By attempting to open “\SYSTEMROOT” with a DesiredAccess of 0x0, which means no access rights to the object, and with no flags specified in the attributes member of the OBJECT_ATTRIBUTES structure passed, the current operational mode can be easily and efficiently determined.

Where the object manager is running in case insensitive mode, an NTSTATUS value of STATUS_ACCESS_DENIED will be returned.

Where the object manager is running in case sensitive mode, an NTSTATUS value of STATUS_OBJECT_NAME_NOT_FOUND will be returned.

(If the SYMBOLIC_LINK_QUERY specific access right (0x1) is requested, the open operation will succeed if the user is permitted to open the symbolic link via its DACL. This is the case typically only if the user is an Administrator. The operation will fail and STATUS_ACCESS_DENIED will be returned otherwise.)
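The probe described above can be sketched in Python using ctypes. This is a minimal, untested-on-all-versions illustration rather than a definitive implementation: the classify and probe names are hypothetical, the structure layouts are transcribed from the documented UNICODE_STRING and OBJECT_ATTRIBUTES definitions, and probe only runs on Windows. The classification step is kept separate so that it can be reused with any means of making the native call.

```python
import ctypes
import sys

STATUS_ACCESS_DENIED = 0xC0000022
STATUS_OBJECT_NAME_NOT_FOUND = 0xC0000034


class UNICODE_STRING(ctypes.Structure):
    _fields_ = [("Length", ctypes.c_ushort),
                ("MaximumLength", ctypes.c_ushort),
                ("Buffer", ctypes.c_wchar_p)]


class OBJECT_ATTRIBUTES(ctypes.Structure):
    _fields_ = [("Length", ctypes.c_ulong),
                ("RootDirectory", ctypes.c_void_p),
                ("ObjectName", ctypes.POINTER(UNICODE_STRING)),
                ("Attributes", ctypes.c_ulong),
                ("SecurityDescriptor", ctypes.c_void_p),
                ("SecurityQualityOfService", ctypes.c_void_p)]


def classify(status: int) -> str:
    """Map the NTSTATUS returned by the probe to an operational mode."""
    status &= 0xFFFFFFFF   # ctypes may hand the value back as a signed integer
    if status == STATUS_ACCESS_DENIED:
        return "insensitive"
    if status == STATUS_OBJECT_NAME_NOT_FOUND:
        return "sensitive"
    return "unknown"


def probe() -> str:
    """Try to open \\SYSTEMROOT with no access rights and no attribute flags."""
    if sys.platform != "win32":
        raise OSError("this probe requires Windows")
    ntdll = ctypes.WinDLL("ntdll")
    name = "\\SYSTEMROOT"
    # Lengths are in bytes; the buffer is UTF-16 on Windows.
    ustr = UNICODE_STRING(len(name) * 2, len(name) * 2 + 2, name)
    attrs = OBJECT_ATTRIBUTES(ctypes.sizeof(OBJECT_ATTRIBUTES), None,
                              ctypes.pointer(ustr), 0, None, None)
    handle = ctypes.c_void_p()
    status = ntdll.NtOpenSymbolicLinkObject(ctypes.byref(handle), 0,
                                            ctypes.byref(attrs))
    if status == 0:
        ntdll.NtClose(handle)   # not expected, but avoid leaking a handle
    return classify(status)


if sys.platform == "win32":
    print(probe())
```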

Red Hat’s Corinna Vinschen of Cygwin fame pointed out that the same check can also be achieved via NtOpenDirectoryObject in much the same way.

Where the object manager is running in case insensitive mode, an NTSTATUS value of STATUS_OBJECT_TYPE_MISMATCH will be returned.

Where the object manager is running in case sensitive mode, an NTSTATUS value of STATUS_OBJECT_NAME_NOT_FOUND will be returned.

(Cygwin has had a patch committed to its source code to ensure that it now tests for case sensitivity rather than reading the obcaseinsensitive value from the registry.)

As mentioned previously, once support for case sensitivity system-wide has been established, the ability to use files in a case sensitive way is then determined by the file system in use on a per-volume basis.

This is established by the presence of the FILE_CASE_SENSITIVE_SEARCH file system attribute flag for a volume, documented as meaning that “the specified volume supports case-sensitive file names”.
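Putting the two parts together, the decision can be sketched as below in Python. The flag value is the documented one for FILE_CASE_SENSITIVE_SEARCH; the function names are hypothetical, and the file system flags are assumed to have been obtained elsewhere, for example from GetVolumeInformation.

```python
FILE_CASE_SENSITIVE_SEARCH = 0x00000001


def volume_supports_case_sensitive_names(file_system_flags: int) -> bool:
    """True if the volume reports support for case-sensitive file names."""
    return bool(file_system_flags & FILE_CASE_SENSITIVE_SEARCH)


def can_use_case_sensitive_files(object_manager_case_sensitive: bool,
                                 file_system_flags: int) -> bool:
    """Both the system-wide mode and the per-volume support are required."""
    return (object_manager_case_sensitive
            and volume_supports_case_sensitive_names(file_system_flags))


print(can_use_case_sensitive_files(True, 0x00000003))
```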

Programmers who work with any of the interesting features of NTFS and have not yet read Microsoft’s announcement of the ReFS file system should find the time to do so. It is, potentially, just about to disrupt their world.

One of the goals for the file system is apparently to:

Maintain a high degree of compatibility with a subset of NTFS features that are widely adopted while deprecating others that provide limited value at the cost of system complexity and footprint.

And snuck away in the FAQ section is the following:

Q) What semantics or features of NTFS are no longer supported on ReFS?

The NTFS features we have chosen to not support in ReFS are: named streams, object IDs, short names, compression, file level encryption (EFS), user data transactions, sparse, hard-links, extended attributes, and quotas.

This is a large list of features that are not going to work in ReFS, at least in its initial implementation. It has the potential to have significant compatibility implications and to break many of the existing applications that are in use today.

To ensure compatibility of applications going forward, developers should ensure that they properly check that a feature is available before attempting to use it. This comes with a caveat:

The FILE_SUPPORTS_HARD_LINKS, FILE_SUPPORTS_EXTENDED_ATTRIBUTES, FILE_SUPPORTS_OPEN_BY_FILE_ID and FILE_SUPPORTS_USN_JOURNAL flags were only added in NT 6.1 (Windows 7 and Windows Server 2008 R2) and yet support for all of them has been implicit since NTFS 3.0 that was introduced with NT 5.0 (Windows 2000). A literal check for NTFS by name is therefore also needed to determine the implicit presence of these features.

(It is important to also ensure that such literal checks for NTFS by name are only performed to determine the implicit support of these features and for no other purpose. This is to ensure that applications are as forward compatible with other file systems as is reasonably possible by not taking a dependency on NTFS.)
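The caveat can be expressed as a small decision function. The Python sketch below uses the documented values of the four flags; the feature_supported name and its parameters are hypothetical, and the file system flags, file system name and NT version are assumed to have been obtained elsewhere (for example from GetVolumeInformation and a version query).

```python
FILE_SUPPORTS_HARD_LINKS = 0x00400000
FILE_SUPPORTS_EXTENDED_ATTRIBUTES = 0x00800000
FILE_SUPPORTS_OPEN_BY_FILE_ID = 0x01000000
FILE_SUPPORTS_USN_JOURNAL = 0x02000000

# Flags that only exist from NT 6.1 but whose features NTFS has had since NT 5.0.
IMPLICIT_ON_NTFS = {
    FILE_SUPPORTS_HARD_LINKS,
    FILE_SUPPORTS_EXTENDED_ATTRIBUTES,
    FILE_SUPPORTS_OPEN_BY_FILE_ID,
    FILE_SUPPORTS_USN_JOURNAL,
}


def feature_supported(flag, file_system_flags, file_system_name, nt_version):
    """Return True if the feature behind flag can be used on the volume.

    nt_version is a (major, minor) tuple such as (6, 1).
    """
    if file_system_flags & flag:
        return True
    # Before NT 6.1 the flag is never reported, so fall back to a literal check
    # for NTFS by name, and use that check for this purpose only.
    return (flag in IMPLICIT_ON_NTFS
            and (5, 0) <= nt_version < (6, 1)
            and file_system_name.upper() == "NTFS")


print(feature_supported(FILE_SUPPORTS_HARD_LINKS, 0, "NTFS", (5, 1)))
print(feature_supported(FILE_SUPPORTS_HARD_LINKS, 0, "ReFS", (6, 2)))
```

On NT 6.1 and later the name is never consulted, so a file system that gains or lacks a feature is judged purely on the flags it reports.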

For applications that depend on any of the features that are not available, appropriate workarounds will have to be found or a declaration made that they are not compatible with the underlying file system due to a lack of feature support.

The good news? Many of the features that are missing are relied upon heavily in Windows internally. This explains why it will not be possible to use ReFS as the file system for the boot volume. ReFS appears to be merely in its infancy and has simply not been implemented fully yet, with many of the missing features likely to appear by the time the file system becomes available in client versions of Windows.

There are a few edge cases when a programmer has no choice but to terminate a hung thread.

An example of this is where an unknown handle originating from another process is being queried via the partially documented NtQueryObject() to establish its type. Prior to the NT 6.1 kernel (Windows 7 and Windows Server 2008 R2), the call would hang indefinitely for pipes and the only way to recover from this was to time out the operation and to forcefully terminate the thread.

Microsoft provide a documented Win32 API to do this, TerminateThread(). This ultimately calls the undocumented NT API, NtTerminateThread() to request termination of a thread.

(The SYNCHRONIZE standard access right is not required to be granted on the thread’s handle for it to be terminated by TerminateThread but, beware, thread termination completes asynchronously. This means that the caller must wait on the thread for it to become signalled to know that termination has completed and, for this, the right is required. This behaviour is not documented, but is implicit.)
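The point about access rights can be sketched as follows in Python using ctypes. This is an illustration only: the required_access and terminate_and_wait names are hypothetical, the exit code of 1 and the default time out are arbitrary choices, and terminate_and_wait runs only on Windows.

```python
import ctypes
import sys

THREAD_TERMINATE = 0x0001
SYNCHRONIZE = 0x00100000
WAIT_OBJECT_0 = 0x00000000


def required_access(wait_for_exit: bool) -> int:
    """Terminating needs THREAD_TERMINATE; waiting also needs SYNCHRONIZE."""
    access = THREAD_TERMINATE
    if wait_for_exit:
        access |= SYNCHRONIZE
    return access


def terminate_and_wait(thread_id: int, timeout_ms: int = 5000) -> bool:
    """Terminate a thread and wait until termination has actually completed."""
    if sys.platform != "win32":
        raise OSError("this sketch requires Windows")
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.OpenThread.restype = ctypes.c_void_p
    kernel32.OpenThread.argtypes = [ctypes.c_uint32, ctypes.c_int, ctypes.c_uint32]
    kernel32.TerminateThread.restype = ctypes.c_int
    kernel32.TerminateThread.argtypes = [ctypes.c_void_p, ctypes.c_uint32]
    kernel32.WaitForSingleObject.restype = ctypes.c_uint32
    kernel32.WaitForSingleObject.argtypes = [ctypes.c_void_p, ctypes.c_uint32]
    kernel32.CloseHandle.argtypes = [ctypes.c_void_p]

    handle = kernel32.OpenThread(required_access(True), False, thread_id)
    if not handle:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        if not kernel32.TerminateThread(handle, 1):
            raise ctypes.WinError(ctypes.get_last_error())
        # Termination is asynchronous: the thread handle becomes signalled
        # only once the thread has really gone.
        return kernel32.WaitForSingleObject(handle, timeout_ms) == WAIT_OBJECT_0
    finally:
        kernel32.CloseHandle(handle)


print(hex(required_access(True)))
```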

When terminating a thread that has been created via the Win32 API, it is important to terminate it via the same means, at the same level of abstraction. This is to allow bookkeeping and internal cleanup to take place appropriately and, fundamentally, to not unnecessarily break the abstraction via a layering violation. This is true even where such internal mechanisms are not implemented in the abstraction today, as they may be in the future.

Among the many caveats listed of forcefully terminating a thread, one that easily bites is that 1 MiB is, by default, leaked per-thread-terminated under versions of Windows prior to and including Windows Server 2003 (NT 5.2 kernel):

As of the NT 6.0 kernel (Windows Vista and Windows Server 2008), the thread’s user mode stack is freed when a thread is terminated.

Thankfully, there is an undocumented API that can be used to easily free the thread’s user mode stack in older versions of Windows, RtlFreeUserThreadStack(). It is exported via ntdll.dll.

This function was first exported in the NT 4.0 kernel (Windows NT 4.0) and was discontinued in the NT 6.0 kernel (Windows Vista and Windows Server 2008). This means that, for all the versions of Windows that will realistically be in use today, the ability to easily free a thread’s user mode stack if it must be forcefully terminated is available. Its availability also tallies perfectly with the versions of Windows for which this leak occurs.

(Note: The offset to the DeallocationStack member of the 32-bit TEB structure is 0xE0C. The offset to the same member of the 64-bit TEB is 0x1478.)

Before calling the RtlFreeUserThreadStack() API, ensure that the target thread has been suspended to guarantee that it will not be active when its user mode stack is freed. This avoids the potential implications of the race condition that exists when you decide to terminate a thread that is considered hung: it may well spring back to life after the decision to terminate it has been made and before it is terminated.

The API is documented unofficially below:

The RtlFreeUserThreadStack routine frees a thread’s initial stack given a handle to the containing process and handle to the thread where the stack should be freed.