Sourcepack (indexing PDB files with source archive file)

Introduction

When you use third-party libraries in your projects, you often want to know what is happening inside them. For closed-source libraries, there is no other way than using Reflector or the ILSpy debugger (though you may end up with unreadable obfuscated code), but for Open Source projects, loading the actual source code into the debugger should not be a problem. There is a great initiative (http://www.symbolsource.org/) which aims at providing source-indexed symbol files for Open Source projects. Unfortunately, not many projects are available there, and even for those that are, the available version does not always match the one you use. You are then left with manually setting up and compiling the source code (referencing the binaries using, for example, reference paths).

But we could do it differently. It is becoming more and more popular for Open Source project authors to ship the binaries together with their PDB files. In this form the PDB files (especially for managed applications/libraries) are not very useful, but that is not difficult to change. You need the Sourcepack script and a zip file with the source code (usually provided by the author alongside the binaries). Sourcepack modifies the PDB files so that they reference the source archive file, and any debugger that supports source server can then extract the required source file on demand.

The Sourcepack script is a PowerShell script which examines and then modifies the PDB files in the given directory to make them reference the source code archive file. First, let’s have a quick look at the PDB file structure.

PDB file structure

When you build your application/library either in the debug or PDB-only release mode, the compiler emits PDB files alongside the binaries. In general, PDB files tell the debugger how to map processor instruction addresses to lines of the source code files (PDB files are even more important for native builds, where they also store metadata about the types and functions declared in the binaries). Normally, PDB files contain only absolute paths to the source code files and thus are usable only on machines which store the source code files in the same place as recorded in the PDB file. To list those absolute paths, you may use the srctool application (with the -r switch), which is a part of Debugging Tools for Windows:

srctool.exe -r ConsoleApplication1.pdb

However, this is not the only way a PDB file may reference source code. There is a special stream in the PDB file which can tell the debugger where to look for a source code file. The stream format is very extensible and you can put any command you want there under one condition: it must extract the desired source code file into the target directory. Debugging Tools for Windows ships with a set of scripts for different source code repositories (SourceSafe, CVS, Subversion). Source indexing usually consists of the following steps:

Index all source code files.

Index PDB files and match them with the already found source files.

Create a temporary stream file, made of named sections and variables (some of them mandatory), which tells the debugger how to retrieve each indexed source file.

The stream contains a special SRCSRVCMD variable holding the command which the debugger runs if it does not find the source code file at the absolute path.
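To make the role of SRCSRVCMD more concrete, here is a minimal Python sketch of how a debugger might turn one entry of the stream into an extract command. It is a simplification, not the real source server implementation: it only handles plain %name% substitution (%targ% for the target directory and %var1%, %var2%, ... for the fields of a source file entry) and ignores the built-in functions that the real format also offers. The ARCHIVE variable, the command template and all paths are hypothetical and used only for illustration.

```python
import re

_TOKEN = re.compile(r"%(\w+)%")


def expand(template, variables, max_passes=10):
    """Substitute %name% tokens until nothing changes (names are case-insensitive)."""
    lookup = {name.lower(): value for name, value in variables.items()}
    for _ in range(max_passes):
        # A function replacement leaves backslashes in Windows paths untouched.
        expanded = _TOKEN.sub(
            lambda m: lookup.get(m.group(1).lower(), m.group(0)), template
        )
        if expanded == template:
            break
        template = expanded
    return template


def resolve_command(source_entry, stream_variables, target_dir):
    """Build the extract command for one line of the source files section."""
    variables = dict(stream_variables)
    variables["targ"] = target_dir
    # Fields of an entry are separated by '*' and exposed as %var1%, %var2%, ...
    for index, field in enumerate(source_entry.split("*"), start=1):
        variables[f"var{index}"] = field
    return expand(variables["SRCSRVCMD"], variables)


# Hypothetical stream variables and entry, for illustration only.
stream_variables = {
    "ARCHIVE": r"C:\Sources\NLog.zip",
    "SRCSRVCMD": '7z.exe x -y "%archive%" -o"%targ%" "%var2%"',
}
entry = r"c:\NLogBuild\src\NLog\Logger.cs*src\NLog\Logger.cs"
command = resolve_command(entry, stream_variables, r"C:\SrcCache")
print(command)
```

Unknown tokens are left as they are, so a typo in a variable name shows up in the resulting command instead of silently disappearing.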

Sourcepack script

The Sourcepack script may be considered just another tool for indexing PDB files, one which uses an archive file (zip, 7z, or any other) as the source code repository. The extract operation then simply consists of calling one of the packer applications (7z, WinZip, Rar etc.) with the correct arguments. For 7z, the command stored in the temporary stream file extracts the requested file from the archive into the debugger's target directory.
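As a rough sketch of what such a stream could contain, the Python function below assembles the text for a 7z-based extract command. The section markers and the VERSION, SRCSRVTRG and SRCSRVCMD variables follow the general source server stream layout, but the exact set of fields that Sourcepack writes is an assumption here, as are the function name and every path in the example. The sketch also assumes that paths inside the archive are relative to the sources root.

```python
import ntpath


def build_stream(archiver_path, archive_path, sources_root, source_files):
    """Return source server stream text that extracts files from one archive."""
    lines = [
        "SRCSRV: ini ------------------------------------------------",
        "VERSION=1",
        "SRCSRV: variables ------------------------------------------",
        # Where the debugger expects the file after the command has run.
        r"SRCSRVTRG=%targ%\%var2%",
        # 7z: x = extract with full paths, -y = answer yes, -o = output folder.
        f'SRCSRVCMD="{archiver_path}" x -y "{archive_path}" -o"%targ%" "%var2%"',
        "SRCSRV: source files ---------------------------------------",
    ]
    for path in source_files:
        try:
            relative = ntpath.relpath(path, sources_root)
        except ValueError:
            continue  # file lives on a different drive than the sources root
        if relative.startswith(".."):
            continue  # file lives outside the sources root
        # One entry per file: absolute path from the PDB * path inside the archive.
        lines.append(f"{path}*{relative}")
    lines.append("SRCSRV: end ------------------------------------------------")
    return "\r\n".join(lines) + "\r\n"


# Hypothetical paths, for illustration only.
stream = build_stream(
    r"C:\Tools\7za\7za.exe",
    r"C:\Sources\NLog.zip",
    r"c:\NLogBuild",
    [r"c:\NLogBuild\src\NLog\Logger.cs", r"d:\Other\Unrelated.cs"],
)
print(stream)
```

Files that do not lie under the sources root are skipped, so the debugger falls back to the absolute path for them.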

As an example, take a small application that uses NLog. To compile it, we need to download the NLog binaries (from, for example, nlog.codeplex.com) and reference them while compiling. Fortunately, the binaries come with PDB files, so let’s use srctool to check which source files they reference.

The srctool output shows that the author kept the sources under the root location c:\NLogBuild\; we will need this information later. We could stop here, download the source code, extract it to the c:\NLogBuild\ directory, and start stepping through the NLog source. However, doing this for every project you would like to debug might not always work, and it would leave you with a messy directory tree and a lot of wasted disk space (the source files are kept uncompressed). Sourcepack was designed to solve these problems. It lets you keep all the compressed source packages in one place and modify only the downloaded PDB files to reference them. In our NLog example, let’s create a C:\Sources folder and copy the NLog source package there. Then run the sourcepack.ps1 script (you may get it from sourcepack.codeplex.com), passing it the folder with the NLog PDB files and the path to the source archive.
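Finding the root location by eye works for a handful of files. One possible way to automate it is to take the longest directory prefix shared by all absolute paths recorded in the PDB file. The Python sketch below does just that; it is not necessarily how Sourcepack guesses the root, and the file paths are made up for illustration.

```python
import ntpath


def guess_sources_root(pdb_source_paths):
    """Return the deepest directory shared by all absolute paths, or None."""
    absolute = [p for p in pdb_source_paths if ntpath.isabs(p)]
    if not absolute:
        return None
    try:
        # Compare directories, so a single file does not become its own root.
        return ntpath.commonpath([ntpath.dirname(p) for p in absolute])
    except ValueError:
        return None  # e.g. the paths are spread over different drives


# Hypothetical paths of the kind srctool -r would list.
paths = [
    r"c:\NLogBuild\src\NLog\Logger.cs",
    r"c:\NLogBuild\src\NLog\LogManager.cs",
    r"c:\NLogBuild\src\NLog.Extended\Helper.cs",
]
root = guess_sources_root(paths)
print(root)
```

Note that the guess can be deeper than the folder the archive was created from (here it ends at the src folder), which is one reason to pass the root explicitly when you know it.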

Now, start your favourite debugger (it must support source server streams in PDB files) and try to step into, for example, the Log method of the Logger object. The debugger should ask whether you want to execute the source server command found in the PDB file; Visual Studio 2010, for instance, displays a dialog for this.

After you agree to run the command, you should start source stepping the NLog code.

Sourcepack argument reference

The table below describes all possible parameters that can be passed to the Sourcepack script:

| Parameter | Status | Default value | Description |
|---|---|---|---|
| -symbolsFolder | MANDATORY | N/A | The path of the root symbols directory. The directory is then recursively searched for any PDB files to be indexed. |
| -sourceArchivePath | MANDATORY | N/A | The path of the archive file in which all sources lie. |
| -sourcesRoot | OPTIONAL | guessed from PDB | The root of the source folder - usually it’s just a path to the folder from which the archive file was created. |
| -dbgToolsPath | OPTIONAL | none | Path of Debugging Tools for Windows (the srcsrv subfolder) - if not specified, the script tries to find it. If you don’t have Debugging Tools for Windows in the PATH variable, you need to provide this argument. |
| -archiverCommandPath | OPTIONAL | script_path\7za\7za.exe | With the script you probably also downloaded a 7za.exe application. If you unpack all the files into the same directory, you don’t need to provide this argument. If not, please provide a path to 7za.exe or 7z.exe including the exe file in it, e.g., c:\program files\7-zip\7z.exe. |
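If you index many packages, it may be convenient to drive the script from another program. The Python sketch below only assembles the command line from the parameters described above and checks that the mandatory ones are present. The helper name and all paths are hypothetical, and launching the script through powershell.exe -File is an assumption about your setup.

```python
def sourcepack_command(script_path, symbols_folder, source_archive_path,
                       sources_root=None, dbg_tools_path=None,
                       archiver_command_path=None):
    """Return the argument list that runs sourcepack.ps1 with the given options."""
    if not symbols_folder or not source_archive_path:
        raise ValueError("-symbolsFolder and -sourceArchivePath are mandatory")
    command = [
        "powershell.exe", "-File", script_path,
        "-symbolsFolder", symbols_folder,
        "-sourceArchivePath", source_archive_path,
    ]
    optional = {
        "-sourcesRoot": sources_root,
        "-dbgToolsPath": dbg_tools_path,
        "-archiverCommandPath": archiver_command_path,
    }
    # Optional parameters are passed only when a value was supplied.
    for name, value in optional.items():
        if value is not None:
            command += [name, value]
    return command


# Hypothetical locations, for illustration only.
command = sourcepack_command(
    r"C:\Tools\sourcepack.ps1",
    r"C:\Symbols\NLog",
    r"C:\Sources\NLog.zip",
    sources_root=r"c:\NLogBuild",
)
print(command)
# To actually run it: subprocess.run(command, check=True)
```

Passing the arguments as a list avoids quoting problems with paths that contain spaces, such as c:\program files\7-zip\7z.exe.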

Installation requirements

The following applications must be installed for the script to work:

PowerShell - to run the script

Debugging Tools for Windows

Any file archiver (the free 7-Zip, for instance)

License

This article, along with any associated source code and files, is licensed under The MIT License