Table Valued Function magic in SQL Server 2005 using the new SQL/CLR integration

In my last post, I said we would look at how we could return a table that contains a directory listing of all the .xml files on my machine.Using the SQL/CLR LoadFile method we created in the last post, our goal is to come up with a single T/SQL statement that can dump all the .xml documents sitting as files in the file-system into our new SQL Server 2005 database.

Of course, to get a directory listing, I could use good old, rather unsafe xp_cmdshell:

xp_cmdshell'dir c:\*.xml /s /b'

(NOTE: In SQL Server 2005 we have turned xp_cmdshelloff by default, to among other things, encourage people not to use it!)

The xp_cmdshellroute works, sort of.But what we also need is a ‘Last Write Time’ so we can do subsequent incremental loads.This makes the xp_cmdshellroute trickier; I have to come up with some ‘dir command’ and then start parsing the xp_cmdshelloutput.Yuck!

Another xp_cmdshell problemThe account that is used to get the directory listing is the service account of the sqlservr.exe process.What I want is the credentials of the user who is asking for the directory listing to be used to access the file system.

We are going to look at a new feature in SQL Server 2005 which will give us a safe, secure, fast and rich directory file listing, these are SQL CLR Table Valued Functions (also known as TVF) and a glimpse of the new security EXECUTE AS framework that will ensure the right, correctly privileged, security credentials are used to access the filesystem. [UPDATE 7/21: I am wrong here. EXECUTE AS does NOT affect the security credentials we use when leaving the sqlserver.exe process. We always use the service account credentials to access resources external to the process.]

Show me the TVF magicTVFs are powerful, but there is a little bit of ‘magic source’ you just have to know, and that is the magic behind the ‘FillRow’ method.The ‘FillRow’ method is specified by adding the SqlFunction ‘function attribute’ to the header of your C# function, like this:

[SqlFunction(FillRowMethodName = "FillRow")]

publicstaticIEnumerable DirectoryList( ...

The method name specified in the FillRowMethodName is implicitly called by the SQL/CLR framework each time the MoveNext() method on the returned IEnumerableobject(or type that implements the IEnumerable interface)is called.The FillRow method must have a signature that looks like this:

Where the 1st parameter (Object obj), once cracked as an object array, contains the values of one output row. The subsequent function parameters (out<col1_type> <col1> etc.)are ‘out’ parameters that contain the values that will appear in the columns for the row that is being constructed.If you don’t get this now, don’t worry, the code is usually just a simple pivot of a list of array elements from the cracked Object to the outparameter variables.

ArrayList rowsArray = newArrayList(); // Already implements IEnumerable, so we don't have to

DirectorySearch(rootDir, wildCard, subDirectories, rowsArray);

return rowsArray;

}

The entry function must be static and return a class that implements the IEnumerableinterface.Notice the ‘magic source’, the SqlFunction attribute that specifies the FillRow method name, this FillRow method get called implicitly each time the SQL/CLR framework implicit calls the MoveNext() method on the rowsArray collection, and it is required for SQL/CLR TVFs.

Next, we go down to the filesystem and search the Directories, and then Files for the files that match the wildCard, adding each matching file as an element to the rowsArray collection.

The FillRow method passes in an object as the 1st parameter, this object is then cracked as an array whose elements need to be assigned to the outparameters which match the columns for the row being constructed.

Save the source above as a single file ‘directorylist.cs’, and build it as an assembly library:

csc /target:library /out:"c:\DirectoryList.dll" "directorylist.cs"

(TIP: There maybe several csc.exe’s hanging out on your machine. You can find out which one you need by using the new sys.dm_clr_propertiesdynamic management view)

SELECTvalueFROMsys.dm_clr_propertiesWHEREname='directory'

value

-----------------------------------------------

C:\WINDOWS\Microsoft.NET\Framework\v2.0.50215\

To add the assembly and expose the C# function as a T/SQL function, run this:

IF EXISTS(SELECT * FROM sys.objects WHERE name ='DirectoryList')

DROPFUNCTION DirectoryList;

IF EXISTS(SELECT * FROM sys.assemblies WHERE name ='SqlClrAssembly')

DROPASSEMBLY SqlClrAssembly;

CREATEASSEMBLY SqlClrAssembly

FROM 'C:\DOCUME~1\stuartpa\LOCALS~1\Temp\SqlClrObjects.dll'

WITH PERMISSION_SET = EXTERNAL_ACCESS;

GO

CREATEFUNCTION DirectoryList (@root_directory nvarchar(max),

@wildcard nvarchar(max),

@subdirectories bit)

RETURNS TABLE (filename nvarchar(max), last_write_time datetime)

ASEXTERNALNAME SqlClrAssembly.UserDefinedFunctions.DirectoryList;

GO

By default SQL/CLR functionailty is what we call ‘Off By Default’ (or 'OBD'), this means that while we can do the above CREATEASSEMBLYand CREATEFUNCTION, we cannot actually invoke the function. To turn the SQL/CLR functionality on, run the following command:

Putting it all togetherAs I promised in my last post, we can now write a single T/SQL statement to insert the contents of all my (well-formed) .xml files as rows in the xml_documents table.Using the LoadFile SQL/CLR function and the xml_documents table created in my last post, with the DirectoryList SQL/CLR TVF we created in this post, the single T/SQL statement below dumps the contents (using dbo.LoadFile) of all the .xml files on my filesystem (as found with DirectoryList) into the XML data type column in the xml_documents table.

/* Insert the contents of all .xml files on c: into the xml_documents table */