.NET 1.0 revolutionized Visual Basic data access with a whole new object model for interacting with files, connecting to databases, and manipulating XML. In .NET 2.0, the revolution continues with a slew of minor improvements, some new features, and a tool for generating data access code automatically, all designed to make life better for the VB programmer.

Tip

What you won't learn about in this chapter are the new .NET Framework features designed for SQL Server 2005. These include using .NET code to program user-defined data types and stored procedures, SQL Server 2005 notifications, and multiple active recordsets (MARS). For more information about these SQL Server 2005 features, please pay a visit to the MSDN SQL Server 2005 Developer Center at http://msdn.microsoft.com/SQL/2005.

Get Drive Information

.NET includes handy DirectoryInfo and FileInfo classes for gathering information about files and directories. However, in .NET 1.x there wasn't any way to get a list of all the drives on your computer without using unmanaged calls to the Windows API. Thankfully, the DriveInfo class finally debuts in .NET 2.0.

Note

The DriveInfo class lets you easily retrieve information about the drives on your computer.

Tip

Before going any further, start by importing the System.IO namespace. The file access classes in .NET 2.0 (FileInfo, DirectoryInfo, and DriveInfo) all exist there. Whether you access them directly or through the My object, you'll need to import this namespace.

How do I do that?

The My.Computer.FileSystem object provides a quick way to get a list of all the drives on the current computer. All you need to do is loop through its Drives collection, which exposes a collection of System.IO.DriveInfo objects.

For example, to see all the drive letters on the current computer, enter the following code:

' Display a list of drives.
For Each Drive As DriveInfo In My.Computer.FileSystem.Drives
    Console.WriteLine(Drive.Name)
Next

This writes a list of drive letter names ("A:\", "C:\", "D:\", and so on). You can also create a DriveInfo object for a specific directory by using the My.Computer.FileSystem.GetDriveInfo( ) method, and specifying the letter of the drive you want to examine as a string. For example, try the following code to take a closer look at drive C:
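For instance, a minimal version of such code might look like this (the volume-label line is a sketch of what the following paragraph describes, not the book's exact listing):

' Take a closer look at drive C:.
Dim Drive As DriveInfo = My.Computer.FileSystem.GetDriveInfo("C:\")
Console.WriteLine("Volume label: " & Drive.VolumeLabel)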

This example displays the volume label that's set for the drive. You can also examine other properties of the DriveInfo object to get more information (such as the DriveType, TotalFreeSpace, and TotalSize). But keep in mind that you can't retrieve this information for a removable drive if there's no media present (for example, if the CD or diskette isn't in the drive). To guard against this possibility, check that the drive's IsReady property returns True before trying to read these details.

Example 5-1 puts these concepts together with a complete, simple console application that displays information about all the drives on your computer.

Example 5-1. Displaying information about all the drives on a computer
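The full listing ships with the book's downloadable code, but its heart might look something like this sketch, which checks IsReady before touching properties that require media:

' Display information about every drive on the computer.
For Each Drive As DriveInfo In My.Computer.FileSystem.Drives
    Console.WriteLine("Drive " & Drive.Name)
    Console.WriteLine("  Type: " & Drive.DriveType.ToString())
    ' Some details aren't available if there's no media in the drive.
    If Drive.IsReady Then
        Console.WriteLine("  Volume label: " & Drive.VolumeLabel)
        Console.WriteLine("  Total size: " & Drive.TotalSize)
        Console.WriteLine("  Free space: " & Drive.TotalFreeSpace)
    End If
Next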

What about...

...getting information about the rest of the filesystem? The .NET Framework has always made it easy to get directory and file information using DirectoryInfo and FileInfo objects. Once you have instantiated a DriveInfo object, you can use its RootDirectory property to get a DirectoryInfo object that wraps the root directory (e.g., C:\). You can then use methods like DirectoryInfo.GetFiles( ) and DirectoryInfo.GetDirectories( ) to retrieve the files and subdirectories contained in the root directory.
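For example, the following sketch drills from a drive down to the subdirectories of its root directory:

' Display the subdirectories of drive C's root directory.
Dim Root As DirectoryInfo = My.Computer.FileSystem.GetDriveInfo("C:\").RootDirectory
For Each SubDir As DirectoryInfo In Root.GetDirectories()
    Console.WriteLine(SubDir.Name)
Next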

Where can I learn more?

For more information about all the properties of the filesystem information classes, look for the "DriveInfo," "DirectoryInfo," and "FileInfo" index entries in the MSDN class library reference. You can also refer to other labs in this chapter, which show new shortcuts available with the My.Computer.FileSystem object. These include:

"Get File and Directory Information," which shows how you can quickly get information about a specific file or directory without directly creating a FileInfo or DirectoryInfo object.

"Copy, Move, and Delete Files," which shows how you can easily shuffle files and directories from one place to another.

"Read and Write Files," which shows the quickest way to extract text content from a file or write to a file.

Get File and Directory Information

In VB 2005, you can access all the file and directory information you need from a single starting point: the new My.Computer.FileSystem object.

Note

The new My.Computer.FileSystem object lets you get file and directory information with a bare minimum of code.

How do I do that?

Here are four key methods of My.Computer.FileSystem that you can use to get file and directory information. Every method has the same signature—it takes a single string parameter whose value is the complete path of the file or directory that's the subject of your query. The methods are:

FileExists( )

Returns True if the file exists.

DirectoryExists( )

Returns True if the directory exists.

GetFileInfo( )

Returns a FileInfo object. You can examine its various properties to get information such as file size, attributes, and so on.

GetDirectoryInfo( )

Returns a DirectoryInfo object. You can examine its various properties to get information such as directory size, attributes, and so on.

The code snippet shown in Example 5-2 first determines whether a file exists and then displays some information when it does.
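Example 5-2 ships with the downloadable code; in outline, it does something like this sketch (the file path is just a placeholder):

Dim FilePath As String = "c:\windows\win.ini"
If My.Computer.FileSystem.FileExists(FilePath) Then
    ' Retrieve a FileInfo object and display a few of its properties.
    Dim Info As FileInfo = My.Computer.FileSystem.GetFileInfo(FilePath)
    Console.WriteLine("Size: " & Info.Length)
    Console.WriteLine("Created: " & Info.CreationTime)
    Console.WriteLine("Attributes: " & Info.Attributes.ToString())
End If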

What about...

...searching for directories and files? The My.Computer.FileSystem object also provides a GetDirectories( ) method to retrieve the names of all the subdirectories in a directory and a GetFiles() method to retrieve the names of all files in a given directory.

Note

In early beta versions, Visual Basic included new FolderProperties and FileProperties classes that duplicated the DirectoryInfo and FileInfo classes. Fortunately, Microsoft decided not to reinvent the wheel, and went back to the .NET 1.x standards.

Both methods offer additional flexibility via overloaded versions that accept additional parameters. You can specify an array with one or more filter strings (for example, use *.doc to find all the files with the extension .doc). You can also supply a searchType parameter; set it to FileIO.SearchOption.SearchAllSubDirectories to search for matching files or directories in every contained subdirectory.

Here's an example of an advanced search that finds all the .exe files in the c:\windows directory:

' Get all the EXE files in the Windows directory.
For Each File As String In My.Computer.FileSystem.GetFiles( _
  "c:\windows\", FileIO.SearchOption.SearchAllSubDirectories, "*.exe")
    Dim Info As FileInfo = My.Computer.FileSystem.GetFileInfo(File)
    Console.WriteLine(Info.Name & " in " & Info.Directory.Name)
Next

Note that the GetFiles( ) and GetDirectories( ) methods just return strings. If you want more information, you need to create a FileInfo or DirectoryInfo object for the file or directory, as shown above.

There is one caveat: when you perform a search with the GetFiles( ) method, the matching file list is first created and then returned to your code. In other words, if you're performing a time-consuming search, you won't receive a single result until the entire search is finished.

Copy, Move, and Delete Files

In addition to helping you gather information about the directories and files on your system, the My.Computer.FileSystem object also gives you quick access to a number of methods for performing common file-management tasks, such as copying, moving, and deleting files.

Note

In VB 2005, you can perform common file-management tasks with a single line of code.

How do I do that?

The My.Computer.FileSystem object provides several self-contained methods for performing common file-management operations. They are:

CopyFile( ) and CopyDirectory( )

MoveFile( ) and MoveDirectory( )

RenameFile( ) and RenameDirectory()

DeleteFile( ) and DeleteDirectory()

The way you use each of these methods is fairly straightforward. You supply two parameters: a source path and, if required, a target filename or path. For example, you can rename a file with this line of code:

My.Computer.FileSystem.RenameFile("c:\myfile.txt", "newname.txt")

These methods are also available in overloaded versions that give you additional features. We'll take a look at those next.

The move and copy methods of FileSystem are available in a variety of overloaded versions. If you need to overwrite an existing file or directory, be sure to use a version that includes the Boolean parameter overwrite and set it to True. Otherwise, you'll receive an exception and the operation won't be completed. Here's an example of one such option:
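For example, this statement (with placeholder file names) copies a file and overwrites the target if it already exists:

' Copy a file, replacing any existing file at the destination.
My.Computer.FileSystem.CopyFile("c:\myfile.txt", "c:\backup\myfile.txt", True)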

Note

In some beta versions, the user interface for moving or deleting a file doesn't appear, even when you choose to see it. However, the underlying task (moving or deleting a file) is always performed correctly.

Interestingly, among the copying and deleting methods are versions that accept a showUI parameter. If you set it to UIOption.AllDialogs, the operation works exactly as if a user had initiated the delete or copy operation in Windows Explorer: dialog boxes appear asking the user to confirm the request to overwrite or delete files, and a progress indicator appears with a Cancel button when a file copy or delete operation is in progress (unless the operation completes very quickly). You can even specify what should happen when the user clicks Cancel (either an exception is thrown or nothing happens at all) by supplying a UICancelOption value for the onUserCancel parameter.

Example 5-3 provides a complete console application that lets you test this behavior.

As shown in this example, the DeleteFile( ) and DeleteDirectory( ) methods have one additional frill available. By default, when you delete a file, it bypasses the Windows recycle bin. However, you can use an overloaded version of DeleteFile( ) or DeleteDirectory( ) that accepts a recycle parameter. Set this to RecycleOption.SendToRecycleBin to keep your file around as a safeguard.

What about...

...file operations that use special folders? The new My.Computer.FileSystem object allows you to retrieve references to many system-defined folders through the SpecialDirectories class. For example, you can quickly retrieve the path for temporary files, user documents, and the desktop. Here's an example:
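The following lines display a few of these paths (which folders you inspect is up to you; Temp, MyDocuments, and Desktop are just three of the available properties):

' Display the paths of some system-defined folders.
Console.WriteLine(My.Computer.FileSystem.SpecialDirectories.Temp)
Console.WriteLine(My.Computer.FileSystem.SpecialDirectories.MyDocuments)
Console.WriteLine(My.Computer.FileSystem.SpecialDirectories.Desktop)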

Read and Write Files

If you need to work with text files or raw binary data, VB 2005 provides a new solution that, for small files, bypasses the lower-level classes of System.IO. Now you can read or write an entire file in a single atomic operation using the My.Computer.FileSystem object. Best of all, you no longer need to create streams, track your position, or clean up afterward.

Note

At last, a way to read and write files without the complexities of streams and stream readers.

How do I do that?

The My.Computer.FileSystem object provides the absolute quickest way to read or write the contents of a file. Its secret lies in a few self-contained methods. These include:

ReadAllText( )

Reads the content of a text file and returns it as a single string.

ReadAllBytes( )

Reads the content of any file and returns it as an array of bytes.

WriteAllText( )

Writes text as a string to a file in one atomic operation. You can either add to an existing file or create a new file, depending on whether you supply True or False for the Boolean append parameter.

WriteAllBytes( )

Writes a byte array to a file in a single operation. You can either add to an existing file or create a new file, depending on whether you supply True or False for the Boolean append parameter.

Example 5-4 creates a simple text file and then reads it back into memory.
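The complete listing is in the downloadable code, but its essence fits in a few lines (the file name and text are placeholders):

Dim FilePath As String = "c:\test.txt"
' Create the file (append is False).
My.Computer.FileSystem.WriteAllText(FilePath, "This is line 1", False)
' Add a second line (append is True).
My.Computer.FileSystem.WriteAllText(FilePath, vbNewLine & "This is line 2", True)
' Read the whole file back into memory as a single string.
Dim Content As String = My.Computer.FileSystem.ReadAllText(FilePath)
Console.WriteLine(Content)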

What about...

...the limitations of this approach? The methods that you'll find in the My.Computer.FileSystem object are unmatched for sheer convenience, but they aren't always appropriate. Here are some reasons you might be better off using the lower-level classes of the System.IO namespace:

You have an extremely large file, and you want to read and process its contents one piece at a time, rather than load the entire file into memory at once. This is a reasonable approach if you're dealing with a long document, for example.

You want to use other data types, like numbers or dates. In order to use the My.Computer.FileSystem methods to handle numeric data, you'll need to first convert the numbers into strings or byte arrays manually using other .NET classes. On the other hand, if you use a FileStream instead, you simply need to wrap it with a BinaryReader or BinaryWriter.

You want to use other stream-based .NET features, such as compression (explained in the next lab, "Compress and Decompress Data"), object serialization, or encryption.

The core .NET classes for reading and writing files are found in the System.IO namespace and haven't changed in .NET 2.0. The most useful of these are FileStream (allows you to open a file directly for reading or writing), StreamReader and StreamWriter (used for reading and writing text, one line at a time), and BinaryReader and BinaryWriter (used for converting basic .NET data types to binary data and back). Look these classes up in the MSDN Help for the traditional file-access techniques. Also, in the next lab, you'll see a more advanced example that uses FileStream to encrypt data in a file.

Compress and Decompress Data

Even with the ever-increasing capacity of hard drives and the falling price of computer memory, it still pays to save space. In .NET 2.0, a new System.IO.Compression namespace makes it easy for a VB 2005 programmer to compress data as she writes it to a stream, and decompress data as she reads it from a stream.

Note

Need to save space before you store data in a file or database? .NET 2.0 makes compression and decompression easy.

How do I do that?

The new System.IO.Compression namespace introduces two new stream classes: GZipStream and DeflateStream, which, as you'd guess, are used to compress and decompress streams of data.

The algorithms used by these classes are lossless, which means that when you compress and decompress your data, you won't lose any information.

To use compression, you need to understand that a compression stream wraps another stream. For example, if you want to write some compressed data to a file, you first create a FileStream for the file. Then, you wrap the FileStream with the GZipStream or DeflateStream. Here's how it works:

Dim fsWrite As New FileStream(FileName, FileMode.Create)
Dim CompressStream As New GZipStream(fsWrite, CompressionMode.Compress)

Now, if you want to write data to the file, you use the GZipStream. The GZipStream compresses that data, and then writes the compressed data to the wrapped FileStream, which then writes it to the underlying file. If you skip this process and write directly to the FileStream, you'll end up writing uncompressed data instead.

Like all streams, the GZipStream only allows you to write raw bytes. If you want to write strings or other data types, you need to create a StreamWriter. The StreamWriter accepts basic .NET data types (like strings and integers) and converts them to bytes. Here's an example:

Dim Writer As New StreamWriter(CompressStream)
' Put a compressed line of text into the file.
Writer.Write("This is some text")

Finally, once you're finished, make sure you close the GZipStream. Closing it (not just flushing it) writes the final compressed blocks so that all the data ends up in the file:

Writer.Flush( )
CompressStream.Close( )
fsWrite.Close( )

The process of decompression works in a similar way. In this case, you create a FileStream for the file you want to read, and then create a GZipStream that decompresses the data. You then read the data using the GZipStream, as shown here:
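Assuming the same FileName variable as before, the decompression code might look like this sketch:

Dim fsRead As New FileStream(FileName, FileMode.Open)
Dim DecompressStream As New GZipStream(fsRead, CompressionMode.Decompress)
Dim Reader As New StreamReader(DecompressStream)
' Read the decompressed text back out of the file.
Console.WriteLine(Reader.ReadToEnd())
Reader.Close()
fsRead.Close()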

What about...

...unzipping .zip files? Unfortunately, the .NET 2.0 compression streams can't deal with ZIP files, file archives that are commonly used to shrink batches of files (often before storing them for the long term or attaching them to an email message). If you need this specific ability, you'll probably be interested in the freely downloadable #ziplib (available at http://www.icsharpcode.net/OpenSource/SharpZipLib).

Where can I learn more?

For more information about the GZipStream and DeflateStream classes, look them up in the MSDN Help. You can also look up the "compression" index entry for a Windows application example that uses these classes.

Collect Statistics on Your Data Connections

Most programmers like to look at statistics. Considered carefully, they can suggest the underlying cause of a long-standing problem, explain the performance problems of an application, or suggest possible optimization techniques. If you're using the SQL Server provider, you can make use of a new SqlConnection.RetrieveStatistics() method to get a hashtable with a slew of diagnostic details about your database connection.

Note

Want to find out what's really going on while you're connected to a database? In .NET 2.0, you can get ahold of much more information, but only if you're using SQL Server.

How do I do that?

Before you can call RetrieveStatistics( ), you need to instruct the connection to collect statistics by setting the SqlConnection.StatisticsEnabled property to True. Once you take this step, the SqlConnection class will gather statistics for every database command you execute over the connection. If you perform multiple operations with the same connection, the statistics will be cumulative, even if you close the connection between operations.

To take a look at the statistics at any time, you call the RetrieveStatistics( ) method to retrieve a hashtable containing the accumulated data. The hashtable indexes its members with a descriptive name. For example, to retrieve the number of transactions you've performed, you'd write this code:
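For example (assuming con is an open SqlConnection with StatisticsEnabled set to True; "Transactions" is one of the key names in the statistics collection):

' Get the accumulated statistics and look up one value by name.
Dim Stats As IDictionary = con.RetrieveStatistics()
Console.WriteLine("Transactions: " & Stats("Transactions").ToString())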

To get a good idea of the different statistics available, try running Example 5-6, a console application that iterates over the statistics collection and displays the key name and value of each statistic it contains.
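In outline, the example does something like this (again assuming con is an open connection with statistics enabled):

' Display the name and value of every statistic in the collection.
Dim Stats As IDictionary = con.RetrieveStatistics()
For Each Entry As DictionaryEntry In Stats
    Console.WriteLine(Entry.Key.ToString() & ": " & Entry.Value.ToString())
Next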

To reset the values of the statistics collection to zero at any time, simply call the ResetStatistics() method:

con.ResetStatistics( )

What about...

...making sense of the various statistics gathered and putting them to use? Unfortunately, the MSDN Help doesn't yet provide the full lowdown on the SQL Server statistics. However, several statistics are particularly useful and not too difficult to interpret:

BytesReceived

Gives a snapshot of the total number of bytes retrieved from the database server.

ServerRoundtrips

Indicates the number of distinct commands you've executed.

ConnectionTime

Indicates the cumulative amount of time the connection has been open.

SumResultSets

Indicates the number of queries you've performed.

SelectRows

Records the total number of rows retrieved in every query you've executed. (In the previous example this is 182, because each query retrieved 91 rows.)

And for an example where statistics are used to profile different approaches to database code, refer to the next lab, "Batch DataAdapter Commands for Better Performance."

Batch DataAdapter Commands for Better Performance

Many databases are able to execute commands in batches, reducing the total number of calls you need to make. For example, if you submit 10 update commands in a single batch, your code only needs to make 1 trip to the server (instead of 10). Cutting down the number of round-trips can increase performance, particularly on networks that have a high degree of latency. In .NET 2.0, the SqlDataAdapter is enhanced to use batching for updating, inserting, and deleting records.

Note

If you need an easy way to optimize DataSet updates, ADO.NET's new batching can help you out.

How do I do that?

In previous versions of .NET, you could batch direct commands by concatenating them in a single string, and separating each with a semicolon. This syntax requires support from the database provider, but it works perfectly well with SQL Server. Here's an example that inserts two rows into a table:
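A sketch of that technique, using the Northwind Shippers table (assume con is an open SqlConnection; the values are placeholders):

' Two INSERT commands, separated by a semicolon, sent in one batch.
Dim cmd As New SqlCommand( _
  "INSERT INTO Shippers (CompanyName, Phone) VALUES ('A Shipper', '555-0001');" & _
  "INSERT INTO Shippers (CompanyName, Phone) VALUES ('B Shipper', '555-0002')", con)
cmd.ExecuteNonQuery()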

As useful as this feature is, previous versions of .NET didn't provide any way to batch commands to one of the most important ADO.NET provider objects—the data adapter. The data-adapter object scans a DataSet, and executes insert, delete, and update commands whenever it finds a new, removed, or changed row. Each of these commands is executed separately, which means that if your DataSet contains three new rows, the data adapter will make three round-trips to the server.

Note

It makes good sense to have batching support in the data adapter, because the data adapter is often used to commit more than one modification at a time.

.NET 2.0 improves the picture with a new SqlDataAdapter.UpdateBatchSize property. By default, the value of this property is set to 1, which causes each insert, update, or delete command to be executed separately. If you set the UpdateBatchSize to a larger number, the data adapter will group its commands into batches.
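In code, enabling batching is essentially a one-line change, though there's one subtlety the documentation points out: when batching is enabled, each command's UpdatedRowSource property must be set to None or OutputParameters. A sketch (assuming Adapter is a SqlDataAdapter with its update commands already configured, and ds is the DataSet holding the changes):

' Group commands into batches of up to 15.
Adapter.UpdateCommand.UpdatedRowSource = UpdateRowSource.None
Adapter.UpdateBatchSize = 15
Adapter.Update(ds, "Orders")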

Example 5-7 is a console application, BatchedDataAdapterTest, that puts this technique to the test. BatchedDataAdapterTest retrieves data from the Orders table in the Northwind database and then makes changes to each row. To make life interesting, the module applies this update not once, but twice—once without batching, and once with the batch size set to 15. BatchedDataAdapterTest displays connection statistics for each approach, allowing you to compare their performance.

When you run this application, the rows will be updated, and a list of statistics will appear. Take a close look at these statistics, paying special attention to the number of round-trips made to the database, the total connection time, and the amount of data required to complete the updates. Here's a portion of the output generated by one run of the application that highlights some of the more important numbers:

This listing reports that, in the batched update, 831 rows were updated in 56 batches of up to 15 commands each. As you can see, batching reduced the amount of data that needed to be sent (by packing it more effectively into batches), which is one of the most important metrics of database scalability. On the other hand, the overall performance of the application hardly changed at all, and the connection time even increased slightly. Clearly, to make a meaningful decision about whether to use batching, you need to profile your application in a real-world scenario.

What about...

...the quirks and limitations of batched updates? Currently, only the SqlDataAdapter supports batching, although other providers may implement this functionality in the future. The actual implementation details will differ for each provider—in the case of the SqlDataAdapter, the provider uses the sp_executesql system stored procedure to execute the batch. As for quirks, you'll notice a change to how the RowUpdated and RowUpdating events of the SqlDataAdapter work. When batching is enabled, these events fire once for every batch, not once for every row. That means that when the RowUpdated event fires, you can determine the number of rows affected, but not the row-by-row details of the changes made. This loss of information can make it more difficult to handle errors that occur somewhere inside a batch.

The ideal batch size depends on a variety of low-level factors, including the network architecture and the size of the rows. The best advice is to test your application with different batch settings. If you want all updates to be done in a single batch of unlimited size, set the UpdateBatchSize property to 0.

Bulk-Copy Rows from One Table to Another

Most SQL Server gurus are familiar with the BCP command-line utility, which allows you to move vast amounts of information from one SQL Server database to another. BCP comes in handy any time you need to load a large number of records at once, but it's particularly useful when you need to transfer data between servers. In .NET 2.0, the SqlClient namespace includes a new SqlBulkCopy class that allows you to perform a bulk-copy operation programmatically.

Note

The new SqlBulkCopy class gives you the most efficient way to copy large amounts of data between tables or databases.

How do I do that?

The key ingredient in a bulk-copy operation is the new SqlBulkCopy class. It performs all of its work when you call the WriteToServer() method, which can be used in two ways:

You can submit your data as a DataTable or an array of DataRow objects. This makes sense if you want to insert a batch of records from a file you created earlier. It also works well if you're creating a server-side component (like a web service) that receives a disconnected DataSet with the records that need to be loaded into a table.

You can submit your data as an open DataReader that draws records from another SqlConnection. This approach is ideal if you want to transfer records from one database server to another.

Before you call WriteToServer( ), you need to create the connections and commands you need and set up mapping between the destination and source table. If your source and destination tables match exactly, no mapping is required. However, if the table names differ, you need to set the SqlBulkCopy.DestinationTableName property to the name of the target table. Additionally, if the column names don't match or if there are fewer columns in the target table than there are in the source data, you also need to configure column mapping. To set column mapping, you add one mapping object for each column to the SqlBulkCopy.ColumnMappings collection. Each mapping object specifies the name of the source column and the name of the corresponding target column.

Figure 5-1. Creating a CustomersShort table

To try this out, create a new SQL Server database named NorthwindCopy and a table named CustomersShort. The CustomersShort table is designed to offer a subset of the information in the Customers table. You can create it by using a tool like SQL Server Enterprise Manager (see the column settings in Figure 5-1), or you can use the script included with the downloadable content for this chapter to create it automatically (look for the file GenerateNorthwindCopy.sql).

Once you've created CustomersShort, you have a perfect table for testing a SQL Server bulk-copy operation. All you need to do is create two connections, define the mapping, and start the process. Example 5-8 has the code you need.
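Example 5-8 ships with the downloadable code; in outline, it looks something like this sketch (the connection strings and the column names of CustomersShort are placeholders):

' Get the source records with a DataReader.
Dim SourceCon As New SqlConnection(SourceConnectString)
Dim cmd As New SqlCommand("SELECT * FROM Customers", SourceCon)
SourceCon.Open()
Dim Reader As SqlDataReader = cmd.ExecuteReader()

' Copy them to the destination table, mapping only the columns it has.
Dim Copier As New SqlBulkCopy(DestConnectString)
Copier.DestinationTableName = "CustomersShort"
Copier.ColumnMappings.Add("CustomerID", "CustomerID")
Copier.ColumnMappings.Add("CompanyName", "CompanyName")
Copier.ColumnMappings.Add("ContactName", "ContactName")
Copier.WriteToServer(Reader)

Copier.Close()
Reader.Close()
SourceCon.Close()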

What about...

...other SqlBulkCopy properties? SqlBulkCopy provides two useful properties: BulkCopyTimeout (which allows you to set how long you'll wait for an unresponsive server) and BatchSize (which allows you to set how many operations are batched together, as described in the lab "Batch DataAdapter Commands for Better Performance"). Errors are handled in the same way as when you directly execute a SqlCommand. In other words, if an error happens on the server side (like a unique value conflict), the process will be interrupted immediately, and you'll receive a SqlClient exception with the full details.

Where can I learn more?

For a complete list of class members, look up the SqlBulkCopy class in the MSDN help library reference. Or, for information about the original BCP utility, look for the index entry "bcp utility" in the SQL Server Books Online help.

Write Database-Agnostic Code

In developing ADO.NET, Microsoft set out to create a new data access architecture that would be more flexible, better performing, and more easily extensible than its previous COM-based OLE DB and ADO architectures. They did this by creating a model where every data source must supply its own data provider: a set of managed classes that allow you to connect to a particular data source (e.g., SQL Server, Oracle), execute commands, and retrieve data. In order to ensure that these providers are consistent, each implements a standard set of interfaces. However, this approach creates major challenges for developers who want to write provider-agnostic code—for example, a basic database routine that can be used equally well with the SQL Server provider or the Oracle provider. Usually, you use provider-agnostic code because you aren't sure what type of database the final version of an application will use, or because you anticipate the need to migrate to a different database in the future.

Note

Want a way to write database code that isn't bound to a specific data source? This challenge becomes a whole lot easier in .NET 2.0.

.NET 2.0 takes major steps to facilitate generic database coding by introducing a new factory model. (A factory model is a pattern where one class has the exclusive responsibility for creating instances of other classes.) In this model, you can use a database provider factory to build the ADO.NET connections, commands, and many other types of objects required for a particular database. The factory automatically returns the type of object that you need for your data source (e.g., a SqlCommand or an OracleCommand), but when you write your code, you don't worry about these details. Instead, you write generic commands without regard to the particular details of the data source.

How do I do that?

In provider-agnostic code, you still use all the same strongly typed objects. However, your code manipulates these objects using common interfaces. For example, every command object, whether it's used for SQL Server or Oracle, implements the common IDbCommand interface, which guarantees a basic set of methods and properties.

Note

Because provider-agnostic code attempts to be as generic as possible, it's more difficult to properly optimize a database. As a result, this technique isn't suitable for most large-scale enterprise applications.

Provider-agnostic code is structured so that you specify the type of database you're using early on, usually by reading some information from a configuration file. You use this information to retrieve a DbProviderFactory for your database. Here's an example where the factory string is hardcoded:
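A minimal version of that code looks like this:

' You'll need to import System.Data.Common for these types.
Dim Factory As DbProviderFactory = _
  DbProviderFactories.GetFactory("System.Data.SqlClient")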

In this example, the code uses the shared GetFactory( ) method of the System.Data.Common.DbProviderFactories class. It specifies a string that identifies the provider name. For example, if you use the string System.Data.SqlClient, the GetFactory() method returns a System.Data.SqlClient.SqlClientFactory object. The DbProviderFactories class can create factories for all the data providers included with .NET, because they are explicitly configured in the machine.config configuration file on the current computer. Essentially, the configuration record tells the DbProviderFactories class to create a SqlClientFactory when the programmer passes the exact string "System.Data.SqlClient." If you develop your own provider, you can also register it to work in this way (although that task is beyond the scope of this lab).

The SqlClientFactory object has the built-in smarts to create all the objects used by the SQL Server provider. However, your code can be completely generic. Instead of interacting with the specific SqlClientFactory class type, it should use the generic base class DbProviderFactory. That way, your code can work with any type of DbProviderFactory, and therefore support any database provider.

Once you have the DbProviderFactory, you can create other strongly typed ADO.NET objects through its CreateXxx( ) methods. These include CreateConnection( ), CreateCommand( ), CreateParameter( ), CreateDataAdapter( ), and CreateCommandBuilder( ).

All these methods create a provider-specific version of the object they name.

To get a better understanding of how generic database code works, it helps to try out a complete example that can switch from one data provider to another on the fly. First of all, you need to create an application configuration file that stores all the provider-specific details. To do this, create a console application and open the app.config file. Add the following three settings, which specify the factory name, the connection string for the database, and the query to perform:
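The configuration file might look like this (the key names factory, connection, and query are this lab's own choices, not anything built into .NET, and the connection string is a placeholder for your server):

<configuration>
  <appSettings>
    <add key="factory" value="System.Data.SqlClient" />
    <add key="connection" value="Data Source=localhost;Initial Catalog=Northwind;Integrated Security=SSPI" />
    <add key="query" value="SELECT * FROM Orders" />
  </appSettings>
</configuration>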

This example uses the SQL Server provider to connect to the Northwind database and retrieve a list of all the records in the Orders table.

Now you can retrieve the configuration file information and use it with the DbProviderFactories class to create every ADO.NET provider object you need. In Example 5-9, the query is executed, a DataSet is filled, and a list of OrderID values is displayed in the console window.

Mostly, this is a fairly pedestrian piece of data access logic. The only exciting part is that you can switch from one provider to another without modifying any of the code or recompiling. You just need to modify the provider information and connection string in the configuration file. For example, make these changes to the configuration file to access the same table through the slower OLE DB provider interface:
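Assuming illustrative appSettings keys named factory and connection, the switch might look like this (SQLOLEDB is the OLE DB provider for SQL Server):

```xml
<add key="factory" value="System.Data.OleDb" />
<add key="connection" value="Provider=SQLOLEDB;Data Source=localhost;Initial Catalog=Northwind;Integrated Security=SSPI" />
```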

After saving the configuration file, you can run the application again. It will work just as well, displaying the same list of order records.

What about...

...the challenges you'll encounter in writing database-agnostic programs? The new factory approach is a giant leap forward for those who want to write provider-agnostic code. However, a slew of problems (some minor and some more significant) still remain. These include:

Handling errors

Every database provider has its own exception object (like SqlException and OracleException), and these objects don't derive from a common base class. That means there's no way to write an exception handler that catches database exceptions generically. All you can do is write exception handlers that catch the base Exception object.

Provider-specific functionality

Some features aren't exposed through the common interfaces. For example, SQL Server has the ability to execute FOR XML queries that return XML documents. To execute this type of query, you use the SqlCommand.ExecuteXmlReader( ) method. Unfortunately, this isn't a standard command method, so there's no way to access it through the IDbCommand interface.

Handling parameters

Some providers (like SQL Server) recognize command parameters by their name. Others (like OLE DB) recognize command parameters by the order of their appearance. Minor differences like this can thwart provider-agnostic programming.
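To make the difference concrete, here's a sketch of the same parameterized query written for each provider. Table, column, and parameter names are illustrative, and Command is assumed to be a DbCommand created earlier through a factory.

```vb
' SQL Server matches parameters by name...
Dim SqlText As String = _
  "SELECT * FROM Orders WHERE CustomerID = @CustomerID"

' ...while OLE DB matches them by position, using question marks.
Dim OleDbText As String = _
  "SELECT * FROM Orders WHERE CustomerID = ?"

' With the generic classes, you still set a name, but OLE DB ignores it
' and binds parameters strictly in the order they are added.
Dim Parameter As DbParameter = Command.CreateParameter( )
Parameter.ParameterName = "@CustomerID"
Parameter.Value = "ALFKI"
Command.Parameters.Add(Parameter)
```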

Use the New XPathDocument and XPathNavigator

.NET provides a range of options for dealing with XML in the System.Xml namespaces. One common choice is XmlDocument, which lets you navigate in-memory XML as a collection of node objects. For more efficient performance, the XmlWriter and XmlReader classes offer a streamlined way to read and write a stream of XML. Unfortunately, neither solution is perfect. The XmlDocument consumes too much memory, and navigating its structure requires too much code. Furthermore, because the XmlDocument is based on an independent standard (the XML DOM, or document object model), it's difficult to improve it without breaking compatibility. On the other hand, the XmlWriter and XmlReader are too restrictive, forcing you to access information linearly from start to finish. They also make it prohibitively difficult to provide an XML interface to non-XML data.

Note

Talk about an improvement! The revamped XPathDocument sets a new standard for XML parsing in .NET.

.NET 2.0 proposes a solution with the System.Xml.XPath.XPathDocument. The XPathDocument is a cursor-based XML reader that aims to become the only XML interface you need to use. It gives you the freedom to move to any position in a document, and it provides blistering speed when used with other XML standards such as XQuery, XPath, XSLT, and XML Schema validation.

How do I do that?

To use an XPathDocument, you begin by loading the document from a stream, XmlReader, or URI (which can include a file path or an Internet address). To load the content, you can use the Load( ) method or a constructor argument—they both work in the same way. In this example, the XPathDocument is filled with the content from a local file:

Dim Doc As New XPathDocument("c:\MyDocument.xml")

To actually move around an XPathDocument, you need to create an XPathNavigator by calling the CreateNavigator( ) method.

Dim Navigator As XPathNavigator = Doc.CreateNavigator( )

The XPathNavigator includes a generous group of methods for navigating the structure of the XML document. Some of the methods include:

MoveToRoot( )

Jumps to the root node, which contains the document element (the top-level element that contains all the others).

MoveToID( )

Moves to an element that has a specific ID, as identified with the ID attribute.

MoveToNext( )

Moves to the next node at the same level (technically called a sibling).

MoveToPrevious( )

Moves to the previous node at the same level (technically called a sibling).

MoveToFirstChild( )

Moves down a level to the first node contained by the current node.

MoveToParent( )

Moves up a level to the parent that contains the current node.

Once you're positioned on an element, you can read the element name from the Name property. You can retrieve the contained text content from the Value property.

Now that you've learned this much, it's worth trying a basic example. In it, we'll use an XML document that contains a product catalog based on Microsoft's ASP.NET Commerce Starter Kit. This XML file (which is available with the downloadable content for this chapter) has the structure shown in Example 5-10.
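Example 5-11 itself is part of the downloadable content, but a minimal sketch of that kind of traversal might look like the following. It assumes the catalog's <Products>/<Product>/<ModelName> structure and a local file path.

```vb
' Load the catalog and create a navigator for it.
Dim Doc As New XPathDocument("c:\ProductList.xml")
Dim Navigator As XPathNavigator = Doc.CreateNavigator( )

' Step down to the <Products> document element, then to the first <Product>.
Navigator.MoveToFirstChild( )
Navigator.MoveToFirstChild( )

Do
    ' Look through each <Product> for its <ModelName> child.
    Dim Child As XPathNavigator = Navigator.Clone( )
    If Child.MoveToFirstChild( ) Then
        Do
            If Child.Name = "ModelName" Then
                Console.WriteLine(Child.Value)
            End If
        Loop While Child.MoveToNext( )
    End If
Loop While Navigator.MoveToNext( )
```

Notice that Clone( ) produces an independent navigator, so moving through a product's children doesn't disturb the outer loop's position.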

When you run this code, you'll see a display with a list of model names for all the products.

Interestingly, the XPathNavigator also provides strong typing for data values. Instead of retrieving the current value as a string using the Value property, you can use one of the properties that automatically converts the value to another data type. Supported properties include:

ValueAsBoolean, ValueAsDateTime, ValueAsDouble, ValueAsInt, and ValueAsLong.

To try this out, you can rewrite the loop in Example 5-11 so that it converts the price to a double value and then displays a total with added sales tax:
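Here's one hedged way that revised loop might look, using the catalog's <UnitCost> elements and a hypothetical 15% tax rate. It assumes Navigator is an XPathNavigator created from the product catalog document.

```vb
' Select every <UnitCost> element in the document.
Dim XPathIterator As XPathNodeIterator = Navigator.Select("//UnitCost")

Dim Total As Double = 0
Do While XPathIterator.MoveNext( )
    ' ValueAsDouble converts the element text without manual parsing.
    Total += XPathIterator.Current.ValueAsDouble
Loop

' Add a (hypothetical) 15% sales tax and display the result.
Console.WriteLine("Total with tax: " & (Total * 1.15).ToString("C"))
```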

What about...

...other ways to search an XML document with the XPathNavigator? To simplify life, you can select a portion of the XML document to work with in an XPathNavigator. To select this portion, you use the Select( ) or SelectSingleNode( ) methods of the XPathNavigator class. Both of these methods require an XPath expression that identifies the nodes you want to retrieve. (For more information about the XPath standard, see the "Introducing XPath" sidebar.)

For example, the following code selects the <ModelName> element for every product that's in the Tools category:

' Use an XPath expression to get just the nodes that interest you
' (in this case, all product names in the Tools category).
Dim XPathIterator As XPathNodeIterator
XPathIterator = Navigator.Select( _
  "/Products/Product/ModelName[../CategoryName='Tools']")
Do While XPathIterator.MoveNext( )
    ' XPathIterator.Current is an XPathNavigator object pointed at the
    ' current node.
    Console.WriteLine(XPathIterator.Current.Value)
Loop

Tip

The examples in this lab use an XML document with no namespace. However, namespaces are often used in programming scenarios to allow your program to uniquely identify the type of document it references. If your document uses namespaces, you need to use the XmlNamespaceManager class and rewrite your XPath expressions to use a namespace prefix. If you'd like an example of this technique, refer to the downloadable samples for this lab, which demonstrate an example with a product catalog that uses XML namespaces.

Introducing XPath

Basic XPath syntax uses a path-like notation to describe locations in a document. For example, the path /Products/Product/ModelName indicates a ModelName element that is nested inside a Product element, which, in turn, is nested in a root Products element. This is an absolute path (indicated by the fact that it starts with a single slash, representing the root of the document).

You can also search for nodes with a given name regardless of where they occur by starting a path with two slashes. For example, //ModelName will find all ModelName elements no matter where they are in the document hierarchy. Other path characters that you can use include the period (.), which refers to the current node; the double period (..), which moves up one level; and the asterisk (*), which selects any node.

XPath gets really interesting when you start to add filter conditions. Filter conditions are added to a path in square brackets. For example, the XPath expression //Product[CategoryName='Tools'] finds all Product elements that contain a CategoryName element with the text "Tools." You can use the full range of logical operators, such as less than and greater than (< and >) or not equal to (!=). For much more information about the wonderful world of XPath, refer to XML in a Nutshell (O'Reilly).

Where can I learn more?

The XPathNavigator class is too detailed to cover completely in this lab. For more information, refer to the XPathDocument and XPathNavigator classes in the MSDN Help. Additionally, you can learn about XML standards like XPath, XQuery, and XML Schema from the excellent online tutorials at http://www.w3schools.com.

In addition, you'll find one more lab that can help you extend your XPathDocument skills: "Edit an XML Document with XPathDocument," which explains the editing features of the XPathDocument.

Warning

The editable XPathNavigator has undergone extensive changes, and the features demonstrated in the next lab (Section 5.11) weren't working in the last build we tested. Although it's expected to return, features are sometimes cut even at this late stage. If the coding model changes, you'll find updated code in the downloadable examples for the book.

Edit an XML Document with XPathNavigator

The XPathNavigator is the XML interface of choice for Visual Basic 2005 applications. And in .NET 2.0, it doesn't just work as a view onto read-only XML data—it also allows you to change XML documents, such as by modifying text content, inserting new elements, or removing a branch of nodes.

Note

The best feature of the XPathNavigator is its new support for editing and inserting content.

How do I do that?

In the previous lab, Section 5.10, you learned how to load XML data into an XPathDocument, and then browse and search through it using an XPathNavigator. If you want to make changes, you still start with the same XPathDocument. The secret is that you also use a couple of additional methods in the XPathNavigator:

SetValue( )

This method inserts a new value in the current element, replacing the existing value.

DeleteCurrent( )

This method removes the current node from the document.

Remember, you have two basic choices for creating an XPathNavigator:

Use the XPathDocument.CreateNavigator( ) method

This method returns an XPathNavigator for the whole document. You can then move to the portion of the document you want to change.

Use the XPathDocument.Select( ) method with an XPath expression

This returns an XPathNodeIterator that allows you to move through your results, retrieving an XPathNavigator for each selected node.

Example 5-12 modifies the XML document shown in Example 5-10. It increases all the prices by 10% and then deletes nodes that don't fall into the Tools category. Finally, it displays the altered XML document.

Example 5-12. Modifying an XML document with XPathNavigator

Imports System.Xml.XPath
Imports System.Xml

Module XPathNavigatorTest
    Sub Main( )
        ' Load the document.
        Dim Doc As New XPathDocument(My.Computer.FileSystem.CurrentDirectory & _
          "\ProductList.xml")

        ' Use an XPathNavigator to make updates.
        ' Increase each price by 10%.
        Dim XPathIterator As XPathNodeIterator = Doc.Select("//UnitCost")
        For Each Editor As XPathNavigator In XPathIterator
            Editor.SetValue((1.1 * Editor.ValueAsDouble).ToString( ))
        Next

        ' Delete nodes that aren't in the Tools category.
        XPathIterator = Doc.Select("/Products/Product[CategoryName!='Tools']")
        For Each Editor As XPathNavigator In XPathIterator
            Editor.DeleteCurrent( )
        Next

        ' Show the changed document.
        Dim Navigator As XPathNavigator = Doc.CreateNavigator( )
        Navigator.MoveToRoot( )
        Console.WriteLine(Navigator.OuterXml)
    End Sub
End Module

When you run this application, the XML for the changed document is displayed in the console window. You can also open the ProductList_new.xml file where the changes are saved.

In many cases, you won't just want to change a value; you'll need a way to insert new elements or entire sections. The XPathNavigator includes a handful of methods for inserting new elements and attributes in one shot. However, the easiest way to add a block of XML is to use an XmlWriter. If you've worked with XML and .NET before, you probably recognize the XmlWriter, which was commonly used to write XML content directly to a file in .NET 1.x applications. The difference in .NET 2.0 is that the XPathNavigator lets you use an XmlWriter to write directly to your in-memory XPathDocument.

All you need to do is start by calling one of the XPathNavigator methods that return an XmlWriter. These include the following, which differ in where they place the inserted XML:

AppendChild( )

Adds a new element inside the current element, after all existing child elements.

PrependChild( )

Adds a new element inside the current element, before any existing child elements.

InsertAfter( )

Adds a new element after the current element (and at the same level).

InsertBefore( )

Adds a new element just before the current element (and at the same level).

Example 5-13 uses the AppendChild() method to add a new product to the product list XML document.

Example 5-13. Using the AppendChild( ) method to add a new element to an XML document
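The full listing ships with the downloadable content; a minimal sketch of the approach might look like the following. It assumes Doc is the XPathDocument loaded earlier, and the new product's values are made up for illustration.

```vb
' Move from the root node to the <Products> document element.
Dim Navigator As XPathNavigator = Doc.CreateNavigator( )
Navigator.MoveToFirstChild( )

' Get an XmlWriter positioned inside <Products>, after the last child.
Dim Writer As XmlWriter = Navigator.AppendChild( )

' Write the new <Product> element and its children.
Writer.WriteStartElement("Product")
Writer.WriteElementString("ModelName", "Laser Level")
Writer.WriteElementString("CategoryName", "Tools")
Writer.WriteElementString("UnitCost", "49.99")
Writer.WriteEndElement( )

' Closing the writer commits the new content to the document.
Writer.Close( )
```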

You can create multiple navigator and editor objects to work with the same XPathDocument. However, the editors don't perform any locking, so you can't edit an XPathDocument on multiple threads at the same time unless you take your own safeguards.

What about...

...validating your XML? The XPathNavigator and the XmlWriter both force you to write valid XML. However, it's also important to check XML documents to make sure they match specific rules. The best tool for this task is an XML schema document that defines the elements, structure, data types, and constraints for a document.

The actual schema standard is beyond the scope of this chapter. (For a good introduction, refer to the tutorial at http://www.w3schools.com/schema.) However, assuming you have a schema for your XML, you can validate your document at any time by calling XPathNavigator.CheckValidity(). This method returns True if the document conforms to the schema. Here's how to do it:

' Load the document.
Dim Doc As New XPathDocument("c:\ProductList.xml")
' (Make updates).
' Load the schema. (XmlSchemaSet requires Imports System.Xml.Schema.)
' Technically, you can load a collection of schemas,
' one for each namespace in the document that you want to validate.
Dim Schemas As New XmlSchemaSet( )
Schemas.Add("http://www.ibuyspy.com/ProductCatalog", "c:\ProductListSchema.xsd")
Schemas.Compile( )
' Validate with the schema.
' Instead of submitting a null reference (Nothing), you can supply
' a delegate that points to a callback method that will be triggered
' every time an error is found when the validation check is performed.
Dim Valid As Boolean
Valid = Doc.CreateNavigator( ).CheckValidity(Schemas, Nothing)

Where can I learn more?

For more information about editing the XPathDocument, look up the "XPathNavigator class" index entry in the MSDN Help. If you've used earlier betas of .NET 2.0, which included the same features in a different class ( XPathEditableNavigator), you may want to refer to http://blogs.msdn.com/dareobasanjo/archive/2004/09/03/225070.aspx for some explanation straight from Microsoft bloggers.