For my last project I needed to provision uncustomized (aka ghosted) files to WSS document libraries through code. The scenario was that the client managed several "services" in a Sharepoint site. Any service consisted of various SharePoint artifacts (document libraries, custom landing pages for document libraries, custom views, permissions, workflows, form pages …) tied together in a certain fashion. The client also needed to be able to provision new such services through a set of custom application pages at run-time. These are aspx-pages which provision said SharePoint artifacts and configure them appropriately. For one of the document libraries, a shared landing page from a feature needed to be configured.

Uncustomized (ghosted) vs. customized (unghosted)

Just a short reminder: Content Pages (actually: pages, documents, document templates …) in SharePoint can either be uncustomized (the artist formerly known as ghosted) or customized (unghosted). The basic difference between these two is that the content stream of uncustomized artifacts is read from the file system of the WFE, whereas for customized files the content stream is read from the content database. SharePoint manages some metadata for both cases in the content database. Since a uncustomized file has an entry in the content database but no content stream, it also needs to store a reference to the file in the file system.

Single point of maintenance for any landing page. We have a high number of document libraries, each of which references the same set of aspx-pages. We just need to service the aspx-files in the file system through a feature, and any document library will instantly pick up the modified version. There is no need to apply the changes to each document library individually.

Storage. Uncustomized files are only stored once in the file system and not in the content database, even if it is referenced from hundreds of document libraries. For customized files, the content stream is stored in the db for each instance of the file.

Security context. Uncustomized pages run in a less restricted security context since they can only be serviced by high privileged accounts (who supposedly know what they are doing).

The ability to still customize a file for a single document library through SharePoint Designer (or other tools). Once you do this, you have a regular customized file (which you could also revert later) with the content stream in the SharePoint content database.

Now, as anybody knows, adding ghosted files is very easy via features, but not easily possible through the UI or the object model. When adding files through the object model, you end up with customized (unghosted) files. There is no obvious way to programmatically link an item in a SharePoint list to a file to the file system (below the features folder) at all.

SharePoint API

Now, besides the .NET/COM based object model, Microsoft also has published documentation for other APIs, most notably the database API in which many of the stored procedures (and the required sequence to call them) are documented. So this solution is actually documented (and not a hack). You can find the SharePoint protocol documentation here in the Microsoft Download Center in PDF format (the contents are also available through the MSDN online version).

Files in SharePoint document libraries

Second reminder: You can add files to a document library in two ways. One is to add the file, without creating a list item. This is done for the standard views in the /Forms folder of standard document libraries. These files are used by the SharePoint infrastructure to display the list, but are not items in the list. They do not have a content type or metadata. They correspond to entries in the dbo.AllDocs/dbo.AllDocStreams tables in the content database. The second option is the first one plus creating a list item. The latter has a file plus a list item linked to it. The list item has a content type assigned to it and also stores the metadata (the values of the fields which the content type defines). This metadata ends up in the dbo.AllUserData table.

From an API point of view, in our case the stored procedures proc_AddGhostDocument, proc_AddListItem, and proc_GenerateNextId are of interest (all in the SharePoint content database). proc_AddGhostDocument does the first step, adding the file to SharePoint and storing the reference to the file system. proc_AddListItem adds the second step, the creation of the list item. proc_GenerateNextId is needed to generate the Id for the list item.

The sequence in which we call these stored procedures is as follows:

Case 1: Just add the document, without creating a list item

Call proc_AddGhostDocument, which links the document in the SharePoint content database with the item in the file system (below the features hive)

Case 2: Add a document and create a list item

Allocate a new id for the document in the corresponding list using proc_GenerateNextId

call proc_AddGhostDocument, which links the document in the SharePoint content database with the item in the file system (below the features hive)

Write the list item information to the database using proc_AddListItem. In this step, the content type, metadata, and optionally ACL information is stored. In our case, we write two rows.

Now, how do we call the stored procedures from our code in an application page?

First of all, you should make sure that you really want to allow the user making the request to your page to make the changes. Usually, this will be restricted using some kind of application role or privilege. You could leverage SPWeb.DoesUserHavePermissions combined with SPUtility.HandleAccessDenied for this.

Elevate the security context using SPSecurity.RunWithElevatedPrivileges. I try to avoid using privilege elevation if I can, but we need it here. Just makes it more important to validate your caller and arguments in step 1! And make sure you handle all the implications that come with RunWithElevatedPrivileges correctly.

Good practice is to put a try/finally block around your ADO.NET code, and then call sqlCommand.Cancel() and sqlConnection.Close(). The Cancel does not cancel or roll back the command. Just in case you are retrieving a TDS stream from the database which has not been processed completely, ADO.NET will cancel the stream and do the cleanup a bit quicker.

Sample code (boiler plate, not production ready) is shown below, both SQL and a C#method.

Sample walkthrough

I am testing this functionality on a sample document library labeled “42-GhostInTheShell”. We are going to provision the file LandingPage.aspx to the /Forms folder (without adding a list item), as well as the GhostedDocument.xml file to the root folder for which we also add a list item.

After provisioning the items, the GhostedDocument.xml is visible in the UI with the content type SOP (that is a custom content type).

Because we are curious we also take a look at the SharePoint content database using two very simple SELECT-statements (old German saying: nur kucken, nicht anfassen). The following screenshot from SQL Server Management Studio shows the records returned from the dbo.AllDocs and the dbo.AddUserData tables (unimportant columns omitted). You can see how the document in SharePoint links to the file under the features-folder in the file system using the SetupPath column. The AllUserData-table contains the metadata for the list item, including its content type. (TimeCreated column vs. Modified date in the UI: Content database store UTC timestamps)

For the LandingPage.aspx provisioned by us, we take a look using SharePoint Designer, which lists it as a uncustomized file. (I took a bad picture of this one. The right side actually shows the contents of library “42 – GhostInTheShell”)

Again, we take a look at the SharePoint content database (without touching anything), querying similar tables as before. Again, the entry in the dbo.AllDocs table shows how the document links to the file system using the SetupPath column. The SetupVersionPath just contains the information that the path root is the WSS v3 (aka 2007) path root, […]\Common Files\Microsoft Shared\Web Server Extensions\12.

As we did not create a list item for this file, no rows from the dbo.AllUserData-tables are returned, as expected.

For debugging SQL, the obvious tools like SQL Server Management Studio, Query Profiler, or sqlcmd come in handy. With Query Profiler you can always trace what SharePoint itself is doing. If you trust these guys, you can also happily use Quadrant.