Archive for January, 2008

Not all of a reverse engineer’s life has to be about undoing — sometimes it is equally as fun to build something from scratch, whether that means a new tool… or the Star Wars 30 Year Anniversary Lego Ultimate Collector’s Millennium Falcon! Over the course of the last three weeks, my best friend and myself have spent countless hours building this magnificent model, which has over 5000 pieces, 91 “major” construction steps (with each step taking up to 30 sub-steps, sometimes 2x’d or 4x’d) and a landscape, 8×14″, 310 page instruction manual.

Late last night, we completed the final pieces of the hull, the radar dish, and the commemorative plaque (itself made of Lego). We had previously built the Imperial Star Destroyer (ISD) last year, but nothing in our Lego-building lives had ever quite come close to the work we put into this set. Complete full-size pictures after the entry.

Along the way, we both learnt some important facts about the Lego manufacturing process — for example, we had already noticed that in our ISD set had some extra pieces, and that other people’s sets had different extra pieces, however, we weren’t too sure what to make of it. This year however, due to the fact we were missing exactly half the number of lever pieces and 1×2 “zit” pieces, we did some extra digging.

The first that boggles many Lego builders of such large sets, is the arrangement of pieces within the bags. These pieces are not arranged in construction order. If you break all the bags and sort the pieces, there is nothing wrong with doing so! Whether or not that will save you time however, is up for discussion. We did do a small sorting, mostly to separate hull pieces from thick pieces from greebling pieces, and that did seem to help a lot. However, I wouldn’t recommend spending 5 hours sorting the pieces as was done on Gizmodo.

By step 21 unfortunately, we noticed we were missing a piece — something that Lego says should never happen (more on this shortly). Step 22 also required that piece, and the next few steps required even more. I came up with an interesting idea — a missing bag. We counted the number of pieces we’d still need, and it came up to be exactly the number of pieces we had used. In other words, half of the pieces were missing. Those pieces were in the same bag as the level-looking pieces, and those too, after calculations, were missing half their number. By this point, we were sure that we were missing a bag, and went to check Lego’s site for help.

It turns out that Lego’s site does have a facility available for ordering missing set pieces, and for free! After putting in the appropriate information, including the set number, and piece number, I filled out my address… and received a “Thank you”. Unfortunately, we never got any further confirmation, and the missing pieces have yet to arrive. Most interesting however, was a notice on Lego’s website, claiming that missing pieces are quite rare, because each set is precision-weighted, so missing sets get flagged for extra, human QA. At first sight, this makes it very unlikely for a set to be missing a piece — so how did we end up missing nearly 60 pieces? (Note to readers: the missing pieces are all hull greeblin (decorative, not structural), and we dully marked down the steps at which they were required, so we can add them once we receive them).

The answer to that question only came to us once we completed construction of the Falcon. There it was, on our workshop table (my kitchen table): an opened bag, full of Lego pieces… which we had already used up, and which weren’t required! It then became pretty obvious to us that Lego has a major flaw in their weight-based reasoning: replacement! We couldn’t scientifically verify it, but the extra bag we had was of similar size and weight (being small pieces) to the second bag of levels and 1×2 pieces we needed. Evidently, a machine error (most probably) or human error caused an incorrect bag of pieces to be added to the set. At the QA phase, the set passed the weight tests, because this bag was of the same (or nearly the same) weight as the missing bag!

Furthermore, due to the fact both the ISD and the Falcon had pieces that were not included in the Appendix of piece counts (again, probably due to machine error while composing the set), their combined weight may have even pushed the weight of our set past the expected weight. It is unlikely that Lego would flag heavier sets for QA — at worst, the customer would get some free pieces of Lego. However, when that weight helps offset the weight of missing pieces, it can certainly become a problem. And when a bag of pieces is accidentally replaced by a similar bag, then weight measuring doesn’t do much at all.

Granted, a lot of our analysis is based on assumptions, but they certainly do check out. Lego says their primary method of checking sets is to weigh them, and we have an extra bag, and a missing bag, both of similar size and composure. The hypothesis seems valid, and perhaps a phone call to Lego will confirm it (if I don’t get the pieces soon, I certainly plan on doing that).

Ironically, this breakage of a QA test through replacement was similar to an interesting security question I received by mail recently: why is it a bad idea to use the Owner SID of an object as a way to authenticate the object, or its creator? It turns out this can leave you vulnerable to rename operations, such as on a file with write and delete access, allowing someone to impersonate the Owner SID but have their own data in the object. Breaking a security test by renaming an object, or breaking a quality assurance test by replacing a bag — the two stem from the same problem: bad design and simplistic assumptions.

Today I want to introduce another utility for Vista and Windows Server 2008 called ScTagQuery (short for Service Controller Tag Query), a tool which will allow you identify to which running service a certain thread inside a service hosting process (e.g Svchost.exe) belongs to, in order to help with identifying which services may be using up your CPU, or to better understand the stack trace of a service thread.

If you’ve ever had to deal with a service process on your system taking up too much CPU usage, memory, or other kinds of resources, you’ll know that Task Manager isn’t particularly helpful at finding the offending service because it only lists which services are running inside which service hosting process, but not which ones are consuming CPU time (in fact, some Svchosts host more than a dozen services!)

Process Explorer can also display this information, as well as the names of the DLLs containing the services themselves. Combined with the ability to display threads inside processes, including their starting address (which for some service threads, identifies the service they are associated with by corresponding to their service DLL) and call stack (which helps identify additional threads that started in a generic function but entered a service-specific DLL), Process Explorer makes it much easier to map each thread to its respective service name. With the cycles delta column, the actual thread causing high CPU usage can reliably be mapped to the broken or busy service (in the case of other high resource usage or memory leaks, more work with WinDBG and other tools may be required in order to look at thread stacks and contexts).

However, with each newer Windows release, the usage of worker pool threads, worker COM or RPC communication threads, and other generic worker threads has steadily increased (this is a good thing â€“ it makes services more scalable and increases performance in multiprocessor environments), making this technique a lot less useful. These threads belong to their parent wrapper routines (such as those in ole32.dll or rpcrt4.dll) and cannot be mapped to a service by looking at the start address. The two screenshots below represent a service hosting process on my system which contains 6 services — yet only one of the DLLs identified by Process Explorer is visible in the list of threads inside the process (netprofm.dll).

Again, there’s nothing wrong with using worker threads as part of your service, but it simply makes identifying the owner of the actual code doing the work a lot harder. To solve this need, Windows Vista adds a new piece of information called the Service Tag. This tag is contained in the TEB of every thread (internally called the Sub-Process Tag), and is currently used in threads owned by service processes as a way to link them with their owning service name.

When each service is registered on a machine running Windows Vista or later, the Service Control Manager (SCM) assigns a unique numeric tag to the service (in ascending order). Then, at service creation time, the tag is assigned to the TEB of the main service thread. This tag will then be propagated to every thread created by the main service thread. For example, if the Foo service thread creates an RPC worker thread (note: RPC worker threads donâ€™t use the thread pool mechanism â€“ more on that later), that thread will have the Service Tag of the Foo service.

ScTagQuery uses this Sub-Process Tag as well as the SCM’s tag database and undocumented Query Mapping API (I_ScQueryTagInformation) to display useful information in tracking down rogue services, or simply to glean a better understanding of the system’s behavior. Letâ€™s look at the option it supports and some scenarios where they would be useful.

During a live session, the -p, -s and -d options are most useful. The former will display a list of all service threads which have a service tag inside the given process, then map that tag to the service name. It can also function without a process ID, causing it to display system-wide data (enumerating each active process). The -s option, on the other hand, dumps which service tags are active for the process, but not the actual threads linked to them — it then links these tags to their service name. Finally, the -d option takes a DLL name and PID and displays which services are referencing it. This is useful when a thread running code inside a DLL doesn’t have an associated service tag, but the SCM does know which service is using it.

The -a and -n options are particularly useful when youâ€™ve obtained a service tag from looking at a crash dump yourself, and run the tool on the same system after a reboot. The -t option will let you map the service name if you know the PID of the service. If the PID changed or is otherwise unrecoverable, the -a option will dump the entire SCM tag database, which includes services which are stopped at the moment. Because these mappings are persistent across reboots, you’ll be able to map the thread with the service this way.

On the other hand, if all you’re dealing with is one thread, and you want to associate it to its service, the -t option lets you do just that.

Back to that same svchost.exe we were looking at with Process Explorer, here are some screenshots of ScTagQuery identifying the service running code inside rpcrt4.dll, as well as the service referencing the fundisc.dll module.

Note that Thread Pool worker threads do not have a Service Tag associated to them, so the current version of ScTagQuery on Vista cannot yet identify a service running code inside a worker pool thread inside ntdll.dll.

Finally, I should mention that ScTagQuery isnâ€™t the only tool which uses Service Tags to help with troubleshooting and system monitoring: Netstat, the tool which displays the state of TCP/IP connections on a local machine, has also been improved to use servicre tags to better identify who owns an open port (Netstat uses new information returned by various Iphlpapi.dll APIs which now store the service tag of every new connection). With the -b option, Netstat is now able to display the actual service name which owns an active connection, and not just the process name (which would likely just be svchost.exe).

The tool page for ScTagQuery is located here. You can download ScTagQuery in both 32-bit and 64-bit versions from this link. Windows Vista or higher is required.

After my departure from the ReactOSÂ project and subsequent new work for David Solomon, it wasn’t clear how much research and development on Windows internals I would still be able to do on a daily basis. Thankfully, I haven’t given up my number one passion — innovating, pushing the boundaries of internals knowledge, and educating users through utilities and applications.

In this vein, I have been working during my spare time on various new utilities that use new undocumented APIs and expose the internals behind Windows Vista to discover more about how the operating system works, as well as to be able to provide useful information to administrators, developers, students, and anyone else in between. In this post, I want to introduce my latest tool, MemInfo. I’ll show you how MemInfo can help you find bad memory modules (RAM sticks) on your system, track down memory leaks and even assist in detecting rootkits!

One of the major new features present in Windows Vista is Superfetch. Mark Russinovich did an excellent writeup on this as part of his series on Windows Vista Kernel Changes in TechNet Magazine. Because Superfetch’s profiling and management does not occur at the kernel layer (but rather as a service, by design choice), there had to be a new system call to communicate with the parts of Superfetch that do live in the kernel, just like Windows XP’s prefetcher code, so that the user-mode service could request information as well as send commands on operations to be performed.

Here’s an image of Process Explorer identifying the Superfetch service inside one of the service hosting processes.

Because Superfetch goes much deeper than the simple file-based prefetching Windows XP and later offer, it requires knowledge of information such as the internal memory manager lists, page counts and usage of pages on the system, memory range information, and more. The new SuperfetchInformationClass added to NtQuery/SetInformationSystem provides this data, and much more.

MemInfo uses this API to query three kinds of information:

a list of physical address ranges on the system, which describe the system memory available to Windows

information about each page on the system (including its location on the memory manager lists, its usage, and owning process, if any)

a list system/session-wide process information to correlate a process’ image name with its kernel-mode objectÂ

MemInfo ultimately provides this information to the user through a variety of command line options described when you run the utility. Some of its various uses include:

Seeing how exactly Windows is manipulating your memory, by looking at the page list summaries.

The Windows memory manager puts every page on the system on one of the many page lists that it manages (i.e. the standby list, the zeroed page list, etc). Windows Internals covers these lists and usage in detail, and MemInfo is capable of showing their sizes to you (including pages which are marked Active, meaning currently in-use and by the operating system and occupying physical memory (such as working sets) and not on any of the lists). This information can help you answer questions such as “Am I making good use of my memory?” or “Do I have damaged RAM modules installed?”.

For example, because Windows includes a bad page list where it stores pages that have failed internal consistency checks, MemInfo is an easy way (but not 100% foolproof, since Windowsâ€™ internal checks might not have detected the RAM is bad) to verify if any memory hardware on the system is misbehaving. Look for signs such as a highly elevated count of pages in the zeroed page list (after a day’s worth of computer use) to spot if Windows hasn’t been fully using your RAM to its potential (you may have too much!) or to detect a large memory deallocation by a process (which implies large allocations previously done).

Windows Vista also includes a new memory manager optimization called prioritized standby lists — the standby state is the state in which pages find themselves when they have been cached by Windows (various mechanisms are responsible for this of caching, including the cache manager and Superfetch) and are not currently active in memory. Mark covered these new lists in his excellent article as well.

To expose this information to system administrators, three new performance counters were added to Windows, displaying the size of the prioritized standby lists in groupings: priorities 0 through 3 are called Standby Cache Reserve, 4 and 5 are called Standby Cache Normal Priority, and finally, 6 and 7 are called Standby Cache Core. MemInfo on the other hand, which can also display these new lists, is an even better tool to identify memory in the standby state, since it is able to display the size of these lists individually.

While memory allocations on Windows XP (which could be part of application startup, the kernel-mode heap, or simple memory allocations coming from various processes) would consume pages from a single standby list and thus possibly steal away pages that more critical processes would’ve liked to have on standby, Windows Vista adds 8 prioritized lists, so that critical pages can be separated from less important pages and nearly useless pages. This way, when pages are needed for an allocation, the lower priority standby lists are used first (a process called repurposing). By making snapshots of MemInfo’s output over a period of time, you can easily see this behavior.

Here’s MemInfo output before, during, and after a large allocation of process private memory. Notice how initially, the bulk of my memory was cached on the standby lists. Most of the memory then became Active due to the ongoing large allocation, emptying the standby lists, starting by the lowest priority. Finally, after the memory was freed, most of the memory now went on the zero page list (meaning the system just had to zero 1GB+ of data).

Seeing to what use are your pages being put to by Windows

Apart from their location on one of the page lists, Windows also tracks the usage of each page on the system. The full list includes about a dozen usages, ranging from non-paged pool to private process allocations to kernel stacks. MemInfo shows you the partitioning of all your pages according to their usage, which can help pinpoint memory leaks. High page counts in the driver locked pages, non-paged pool pages and/or kernel stack pages could be indicative of abnormal system behavior.

The first two are critical resources on the system (much information is available on the Internet for tracking down pool leaks), while the latter is typically tightly maintained for each thread, so a large number may indicate leaked threads. Other usages should also expect to see a lower number of pages than ones like process private pages, which is usually the largest of the group.

At the time of this writing, here’s how Windows is using my 4GB of memory:

Looking at per-process memory usage, and detecting hidden processes

Internally, Windows associates private process pages with the kernel executive object that represents processes as managed by the process manager — the EPROCESS structure. When querying information about pages, the API mentioned earlier returns EPROCESS pointers — not something very usable from user-mode! However, another usage of this API is to query the internal list of processes that Superfetch’s kernel-mode component manages. This list not only allows to take a look at how much memory, exactly, belongs to each process on the system, but also to detect some forms of hidden processes!

Hidden processes are usually the cause of two things. The first is processes which have been terminated, but not yet fully cleaned up by the kernel, because of handles which are still open to them. Task Manager and Process Explorer will not show these processes, but MemInfo is the only tool apart from the kernel debugger which can (so you donâ€™t have to reboot in debugging mode). See below on how MemInfo is showing a SndVol32.exe process, created by Windows Explorer when clicking on the speaker icon in the system tray — Explorer has a bug which leaks the handles, so the process is never fully deleted until Explorer is closed.

The second cause of why a process may be hidden is a rootkit that’s hooking various system calls and modifying the information returned to user-mode to hide a certain process name. More advanced rootkits will edit the actual system lists that the process manager maintains (as well as try to intercept any alternate methods that rootkit detection applications may use), but MemInfo adds a new twist, by using a totally new and undocumented Superfetch interface in Windows Vista. It’s likely that no rootkit in the wild currently knows about Superfetch’s own process database, so MemInfo may reveal previously hidden processes on your system. Unfortunately, as with all information, it’s only a matter of time until new rootkits adapt to this new API, so expect this to be obsolete in the next two years.

There’s many more uses for MemInfo that I’m sure you can find — including as a much faster replacement for !memusage 8 if you’ve used that command in WinDBG before. MemInfo is fully compatible with both 32-bit and 64-bit versions of Windows Vista (including SP1 and Windows Server 2008 RC1 â€“ even though Server 2008 does not include Superfetch, because both client and server versions of the OS will use the same kernel as of Vista SP1, the API set is identical), but not any earlier version of Windows. Apart from these simple summary views, MemInfo is powerful enough to dump the entire database of pages on your system, with detailed information on each — valuable information for anyone that needs to deal with this kind of data. Furthermore, unlike using WinDBG to attach to the local kernel, it doesn’t require booting the system into debug mode.

You can download a .zip file containing both versions from this link. Make sure to run MemInfo in an elevated command prompt — since it does require administrative privileges. The documentation for MemInfo is located on the following page (this page is part of an upcoming website on which I plan to organize and offer help/links to my tools and articles).