Friday, December 7, 2007

WinSxS Breaks Old Libraries

After my previous experiences with handling Windows Side-by-Side Assemblies (WinSxS) in remote debugging and Isolated COM, I thought I was actually starting to get a handle on how it worked. Today I got stuck on another WinSxS problem, this time while porting our application to Visual Studio 2008.

The problem was that the application would fail to start, with an error about the manifest being wrong. I used sxstrace, a handy tool under Vista, to try to determine what was happening. Sxstrace generated 180 lines of information about Vista's attempt to find and load the correct assemblies, which turned out to be too much to sift through. The only obvious problem I saw was that one of the DLLs being loaded was from Visual Studio 2005, not 2008.

I used Depends to look at the file and received the same errors. I looked in the Event Viewer and saw this error:

This is even stranger, because 20706 was the version ID of Visual Studio 2008 Beta 2. I'm running the final release.

I used Visual Studio to open XSales.exe in Resource mode so I could look at the manifest itself. Here I found references to four versions of DebugCRT. The question was, where were they coming from? My code makes no explicit reference to assembly versions.
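Once the manifest has been copied out of the resource editor as text, spotting duplicate references by eye gets tedious. Here is a minimal sketch of how the check could be scripted, assuming the manifest uses the standard `urn:schemas-microsoft-com:asm.v1` namespace. The function names and the sample manifest are made up for illustration, and the version numbers are placeholders, not the real ones from my build.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace used by Windows side-by-side manifests.
ASM_NS = "{urn:schemas-microsoft-com:asm.v1}"

# Illustrative manifest only; the version numbers are placeholders.
SAMPLE_MANIFEST = """<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
  <dependency>
    <dependentAssembly>
      <assemblyIdentity type="win32" name="Microsoft.VC80.DebugCRT" version="8.0.0.0"/>
    </dependentAssembly>
  </dependency>
  <dependency>
    <dependentAssembly>
      <assemblyIdentity type="win32" name="Microsoft.VC90.DebugCRT" version="9.0.0.1"/>
    </dependentAssembly>
  </dependency>
  <dependency>
    <dependentAssembly>
      <assemblyIdentity type="win32" name="Microsoft.VC90.DebugCRT" version="9.0.0.2"/>
    </dependentAssembly>
  </dependency>
</assembly>"""


def versions_by_assembly(manifest_xml):
    """Map each dependent assembly name to the sorted list of versions it is referenced with."""
    root = ET.fromstring(manifest_xml)
    found = defaultdict(set)
    for dep in root.iter(ASM_NS + "dependentAssembly"):
        for ident in dep.iter(ASM_NS + "assemblyIdentity"):
            found[ident.get("name")].add(ident.get("version"))
    return {name: sorted(versions) for name, versions in found.items()}


def conflicting_assemblies(manifest_xml):
    """Return only the assemblies that are referenced with more than one version."""
    return {
        name: versions
        for name, versions in versions_by_assembly(manifest_xml).items()
        if len(versions) > 1
    }
```

Calling `versions_by_assembly(SAMPLE_MANIFEST)` lists every referenced assembly, and `conflicting_assemblies(SAMPLE_MANIFEST)` narrows that down to the ones pulled in at more than one version. An assembly from an older compiler, like the VC80 entry in the sample, shows up in the first list even if it only appears once.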

What I discovered was that the intermediate manifests generated by the compiler and linker include assembly information from all objects used by the linker, including objects from libraries, of which I had two. One of my .LIB files was generated by VS2005 and another was generated by VS2008 Beta 2, which is what caused the references to old assembly versions. Once I rebuilt those .LIB files with VS2008, the problem went away.

The lesson learned from all of this is that .LIB files are no longer easily portable between Visual Studio versions if they rely on any of the CRT or MFC DLLs. The painful part is that the problem doesn't show up until you try to run the software, because none of the development tools warn about the inconsistency.

All of the LIBs that triggered the problem were specific to my own project and were not shipped by Microsoft (or any other third party). However, all of the referenced DLLs were just standard C Run-time and MFC DLLs.

Trying to figure out which LIBs are causing the trouble is a challenge. I ended up recompiling all of our in-house libraries. Historically you could use the NODEFAULTLIB switch to solve this kind of problem, but that doesn't affect the manifest, so it no longer works.
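One way to narrow down the offending libraries without rebuilding everything is to look for the manifest dependency directives inside the .LIB files themselves. This is a rough sketch, not something I used at the time. It assumes the CRT and MFC headers embed their assembly references as `/manifestdependency:` linker directives stored as plain text in each object, with `name='...'` appearing before `version='...'`. The function names and the sample bytes in the usage note are hypothetical.

```python
import re

# Assumption: directives are stored as readable ASCII inside the object files,
# in the form /manifestdependency:"type='win32' name='...' version='...' ...".
# The character classes stop at NUL and '/' so a match cannot run into the
# next linker directive.
DIRECTIVE = re.compile(
    rb"/manifestdependency:[^\x00/]*?name='([^']+)'[^\x00/]*?version='([^']+)'",
    re.IGNORECASE,
)


def manifest_dependencies(data):
    """Return the sorted, de-duplicated (assembly name, version) pairs found in raw bytes."""
    pairs = {
        (m.group(1).decode("ascii", "replace"), m.group(2).decode("ascii", "replace"))
        for m in DIRECTIVE.finditer(data)
    }
    return sorted(pairs)


def scan_libraries(paths):
    """Map each file path to the assembly references embedded in it, skipping files with none."""
    report = {}
    for path in paths:
        with open(path, "rb") as f:
            deps = manifest_dependencies(f.read())
        if deps:
            report[path] = deps
    return report
```

For example, `manifest_dependencies(b"\x00/manifestdependency:\"type='win32' name='Microsoft.VC80.DebugCRT' version='8.0.0.0'\"\x00")` returns `[('Microsoft.VC80.DebugCRT', '8.0.0.0')]` (the version is a placeholder). Any library that reports an assembly from a different compiler release than the rest is a candidate for rebuilding. Running `dumpbin /directives` on a library should show the same information if you prefer the stock tools.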

As I've said in other parts of my blog, reinstalling Windows is almost never the right solution. The problem is that your computer may be in a configuration that one of your customers may have too.

Reinstalling Visual Studio is typically only useful if you have problems with the GUI. I've been developing with Visual Studio since version 1.0. In that time, I don't think I've ever had to reinstall VS to fix a build problem.

You should copy your colleague's EXE to your computer and see if it runs. Then copy your EXE to your colleague's computer and see if it runs. If your EXE consistently fails, then open the EXE in resource mode in VS and compare the RT_MANIFEST resource in your EXE with the RT_MANIFEST resource in your colleague's EXE.
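If opening both files in the resource editor is inconvenient, the comparison can be roughed out in a script. The sketch below is a heuristic, not a proper PE resource parser: it assumes the manifest is stored in the EXE as readable UTF-8 XML and simply takes the first `<assembly>...</assembly>` block it finds. The function names are made up for illustration.

```python
import difflib
import re

# Heuristic: find the first <assembly>...</assembly> block in the raw file bytes.
# \b keeps the pattern from matching <assemblyIdentity>.
MANIFEST = re.compile(rb"<assembly\b.*?</assembly>", re.DOTALL)


def embedded_manifest(exe_bytes):
    """Return the first embedded manifest as text, or None if none is found."""
    match = MANIFEST.search(exe_bytes)
    if match is None:
        return None
    text = match.group(0).decode("utf-8", "replace")
    # Put adjacent tags on separate lines so a single-line manifest still diffs usefully.
    return text.replace("><", ">\n<")


def manifest_diff(my_exe_bytes, their_exe_bytes):
    """Return unified-diff lines between the manifests of two executables."""
    mine = (embedded_manifest(my_exe_bytes) or "").splitlines()
    theirs = (embedded_manifest(their_exe_bytes) or "").splitlines()
    return list(difflib.unified_diff(mine, theirs, "mine", "theirs", lineterm=""))
```

Read each EXE with `open(path, "rb").read()` and pass the two byte strings to `manifest_diff`. An empty result means the manifests match, so the difference is more likely in what is installed on the two machines. Lines starting with `-` or `+` point at assembly references that differ between the builds.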

About Me

My technical passion is for building consumer software applications. I'm best known for my work in Windows and C++, but lately I've been working on cross-platform Android/iPhone mobile solutions in Flutter and Dart.

My book Multithreading Applications in Win32 was one of the top 3 best-selling books on Amazon on Windows development for over five years. I've been the architect of software projects for Google, Intel, Brother, Northrop Grumman, and numerous smaller companies.

I have been writing commercial software for Microsoft operating systems since MS-DOS 1.0. I am published in magazines such as Dr. Dobb's Journal, C/C++ Users Journal, and Visual C++ Developers Journal. I am in the Giant List of Classic Game Programmers.