The Case of the Printing Failure

The most interesting cases I receive are those that demonstrate a unique troubleshooting technique or uncover an interesting root cause. I recently received one that has both characteristics. The case opened when a systems administrator got a report from a user who was unable to print from his computer. Clicking a print button or menu item produced no visible reaction, where normally a dialog would state that the document had been sent to the printer and a tray icon would appear representing the active print queue.

The first thing the administrator did was to scan the event logs of the user’s machine, looking for any printing-related events. He quickly came upon two that correlated with the user’s most recent printing attempt:

It appeared that the Spooler Service started when the user tried to print, but terminated unexpectedly, apparently because of an exception, about a minute later. The question was: why?

The administrator turned to Sysinternals Procdump. Procdump is a utility that generates crash dump files of a process when the triggers you specify fire. Implemented triggers include CPU usage, virtual memory usage, unhandled exceptions, and process termination. You can use the CPU usage trigger, for example, to capture the state of a process when it hits a short-lived CPU burst, allowing you to look into the process to see the reason for the spike. The administrator guessed that the stack trace of the terminating thread might provide a clue.

He knew that he had some time to get Procdump running after the Spooler Service started, so he launched Notepad, tried to print, and then executed Procdump with the -e option and the name of the Spooler Service process (Spoolsv) to have Procdump wait until the service exited before writing the dump file. A few seconds later Procdump reported that it had completed the job and saved a dump file:

He opened the dump in Windbg and executed the ‘k’ command, which has Windbg dump the stack of the thread that caused the crash. The stack trace, which essentially lists a record of the function calls executed before the crash, showed that the process died in a sequence of calls that included several Ldr functions, including LdrpLoadDll:

A web search revealed that LdrpLoadDll is a function related to the system’s DLL loader. Suspecting that the process was dying either because it couldn’t find a DLL it was looking for or was loading an incorrect DLL, he turned to Process Monitor, which would enable him to see the process’s DLL-related file system activity. He started Process Monitor, attempted to print again, and then stopped the capture. Working his way from the end of the trace back to the beginning, he scanned for hints of the root cause. Shortly before Spoolsv exited, he saw it searching unsuccessfully for Localspl.dll in various directories on the system:

He assumed that the DLL was supposed to be there. When he looked at another identically configured Windows XP system on his network, he found Localspl.dll in the \Windows\System32 directory, but not on the system experiencing the problem:

The file’s description reported it to be the Local Spooler DLL, which explained the Spooler’s inability to support printing operations. After he copied the file from the working system, he was able to print successfully.
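The manual check described above, looking for a DLL in a set of directories and comparing against a known-good machine, can be sketched in Python. This is only an illustration under stated assumptions: it is not the Windows loader's actual search algorithm, the function names `find_dll` and `missing_from_target` are hypothetical, and the directories to examine are supplied by the caller.

```python
from pathlib import Path
from typing import Iterable, Optional, Set


def find_dll(name: str, search_dirs: Iterable[str]) -> Optional[Path]:
    """Return the first path where `name` exists, or None if it is not found.

    Names are compared case-insensitively, as Windows file systems do.
    (Hypothetical helper; the caller decides which directories to search.)
    """
    wanted = name.lower()
    for directory in search_dirs:
        root = Path(directory)
        if not root.is_dir():
            continue  # skip directories that do not exist
        for entry in root.iterdir():
            if entry.is_file() and entry.name.lower() == wanted:
                return entry
    return None


def missing_from_target(reference_dir: str, target_dir: str) -> Set[str]:
    """Lower-cased names of .dll files in reference_dir that target_dir lacks."""

    def dll_names(directory: str) -> Set[str]:
        return {
            p.name.lower()
            for p in Path(directory).iterdir()
            if p.is_file() and p.suffix.lower() == ".dll"
        }

    return dll_names(reference_dir) - dll_names(target_dir)
```

Pointed at a copy of the working system's System32 directory as `reference_dir` and the failing system's as `target_dir`, `missing_from_target` would list candidates such as Localspl.dll for closer inspection.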

As far as the user was concerned, the case was closed and he was able to get back to work, but the administrator was left with the question of what had happened to the original DLL. Another web search turned up forum posts from others who had experienced the same problem. One post in particular described the exact symptoms he’d seen, including the event log entries, suggested the same fix of copying Localspl.dll from another system, and blamed uninstallers of third-party print and fax software for incorrectly deleting Localspl.dll:

He couldn’t say for certain that was the case for this particular system because the end user didn’t remember uninstalling printing or fax software, but the post had at least given him a plausible theory to replace the unease he would have been left with that files were mysteriously being deleted from his systems. He could now close the case thanks to Procdump and Process Monitor.

If you solve an interesting case with Sysinternals tools, please send me screenshots (.PNG preferred) and log files so that I can share them with others in this blog and my presentations.

Reminds me that I need to debug my wife’s computer: on about every 3rd boot the touchpad does not work (a reboot fixes it), and on around every 10th boot neither touchpad nor keyboard works (suspend and wakeup fixes the keyboard, but not the touchpad).

No messages in the event log. Probably just a messy touchpad driver from Fujitsu-Siemens, but installing the most recent did not change much.

SFC is designed for compatibility with the offending programs, i.e. it lets the programs go through with their system-altering deletions/moves/editing… and then silently restores the file a short time after.

this is great for dealing with j.random.1990’s app that wants to edit svchost for its own (no longer required) reasons… not so great when considering an antivirus, as when SFC restores the file:

– the antivirus will just move it again

– SFC will have alerted the antivirus to the existence of the backup file, which it’ll also move

basically, SFC is a defence against accidental altering of system files, not deliberate… assuming svchost.exe is SFC-protected (I’d assume so)

Johan: I agree, error messages usually only notify you of the problem; rarely do they help you understand or solve it. However, it’s not always because the developers are idiots or don’t care; it’s a hard problem to solve. When an executable has static links to DLLs that are missing, the EXE doesn’t get a chance to write a custom error message because it never gets loaded by the OS or given a chance to run. The developer needs to write extra code to dynamically link to every DLL that might be used, load them separately, and handle logging for each one. I’m not saying they shouldn’t write the extra code, I think they should; I’m just explaining that it’s not as simple as taking an extra 15 seconds to type a more informative error message.
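To make the point above concrete, here is a minimal sketch of loading a library dynamically and logging a useful message when it is missing. It uses Python's `ctypes` rather than native code, so it only illustrates the pattern; the helper name `load_optional_library` and the library name in the usage note are hypothetical.

```python
import ctypes
import logging
from typing import Optional

log = logging.getLogger("loader")


def load_optional_library(name: str) -> Optional[ctypes.CDLL]:
    """Try to load a shared library at run time.

    Returns the loaded library, or None after logging the reason when the
    library cannot be found or loaded. (Hypothetical helper for illustration.)
    """
    try:
        return ctypes.CDLL(name)
    except OSError as exc:
        # With a load-time (static) dependency the program would never
        # reach a point where it could report this itself.
        log.error("Could not load %r: %s", name, exc)
        return None
```

A caller could write `lib = load_optional_library("example_missing.dll")` and, on `None`, disable the dependent feature or show a specific message instead of failing to start. The cost is exactly the extra per-library code the comment describes.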

There seems to be some confusion over SFC and system file protection. The former you must run manually ("sfc /scannow") and it has a list of files (you can see the list; from memory, it's in an ASCII file somewhere in the system32 directory) that it looks for. If it finds that any file on the list has been altered from the original, it grabs a copy from the dllcache or Windows CD and puts it in place. The list of files it has is not exhaustive, so I'm not surprised the DLL in this case wasn't in the list. I find a repair install fixes a lot of these sorts of issues when SFC doesn't, though.

System file protection is a different feature. It's the one that sits in the background and automatically replaces files that are deleted/modified a few seconds after the change.

Amazing blog. My HP LaserJet 2100tn printer wasn't working properly on Windows 7. The driver available on the HP website also didn't work. However, I found the answer here: install the LaserJet 2200 driver that ships with Windows instead of the 2100 driver, and it worked fine. Thanks guys!!!

If a program is crashing early in startup, before Explorer or anything else is running, is there a way to schedule ProcDump to run during the boot phase so that it can generate the .dmp file? Or does running it the way he did, with the system already up, automatically keep it monitoring that process and writing a dump file even after a reboot?