This chapter is from the book

Far and away the single most important technique for corporate VB developers
is data access. Most of the applications written in today's corporate
environments involve displaying and manipulating operational data and line of
business data. Historically, Microsoft has provided support for accessing data
in a variety of relational and nonrelational data stores by shipping APIs that
existing VB developers have become quite familiar with over the years including
VBSQL, ODBC, DAO, RDO, OLE DB, and ADO. Although this continued evolution meant
that VB developers had to learn new techniques along the way, the benefits of
learning the models meant a unified relational data access model (ODBC),
increased simplicity (DAO), increased performance (RDO), and increased reach to
nonrelational sources (ADO).

To that list you can now add the data access classes of the Services
Framework. These classes, collectively termed ADO.NET, serve to implement a
managed interface to OLE DB and SQL Server, increase performance when using SQL
Server, and allow data to be manipulated in a fashion commensurate with
distributed application development utilizing the Internet and XML. Even though
learning a new data access model may at first be daunting, it will also help you
build modern, distributed applications.

In this chapter, we'll discuss the architecture of ADO.NET and how it
can be used to build distributed applications. This and the three chapters that
follow form a progression that illustrates the techniques useful in VB.NET to
build distributed applications. The example code discussed in this and
subsequent chapters illustrates a somewhat simplified implementation of
Quilogy's online education system where students enroll in classes over the
Web and Quilogy employees view and manipulate data through an Intranet site. The
enrollment data is stored in a SQL Server 2000 database so where appropriate,
specific features of SQL Server 2000 will be utilized.

Because data is the foundation for applications such as this one we'll
begin with a discussion of accessing data in the .NET world.

ADO.NET Defined

As mentioned previously, ADO.NET is comprised of classes found in the
System.Data namespace that encapsulate data access for distributed
applications. However, rather than simply mapping the existing ADO object model
to .NET to provide a managed interface to OLE DB and SQL Server, ADO.NET changes
the way data is stored and marshaled within and between applications. The
primary reason ADO.NET redefines this architecture is that most applications
developed today can benefit from the scalability and flexibility of being able
to distribute data across the Internet in a disconnected fashion.

Because the classic ADO model was developed primarily with continuously
connected access in mind, creating distributed applications with it is somewhat
limiting. A typical example is the need to move data through a
Recordset object between tiers in a distributed application. To
accomplish this in classic ADO you have to specifically create a disconnected
Recordset using a combination of properties including cursor location,
cursor type, and lock type. In addition, because the Recordset is
represented in a proprietary binary format, you have to rely on COM marshalling
code built into OLE DB to allow the Recordset to be passed by value
(ByVal) to another component or client code. This architecture also
runs into problems when attempting to pass recordsets through firewalls
because these system level requests are often denied.

On the other hand, if you elected not to use disconnected
recordsets, you had to devise your own scheme to represent the data
using Variant arrays, delimited within in a string, or saved as tabular
XML using the Save method (although the latter option is really only
viable when using ADO 2.5 and higher). Obviously these approaches have their
downside because they run into problems with performance and maintainability not
to mention interoperability between platforms.

In addition, the classic ADO model doesn't handle hierarchical data
particularly well. Although it is possible to create hierarchical recordsets
using the Microsoft data shape provider, it is not simple and is therefore not
often used. Typically JOIN clauses are used inside stored procedures or
inline SQL to retrieve data from multiple tables. However, this does not allow
you to assemble data from multiple data sources and easily determine from where
the data comes. As a result, classic ADO provides a flat view of data that is
not strongly typed.

To alleviate these problems, ADO.NET is built from the ground up for
distributed applications used in today's disconnected scenarios. For
example, the central class in ADO.NET is the DataSet, which can be
thought of as an in-memory XML database that stores related tables,
relationships, and constraints. As you'll see, the DataSet is the
primary mechanism used in VB.NET applications to cache data and pass it between
tiers in a distributed application thereby alleviating the need to rely on
proprietary schemes or COM marshalling.

Using XML alleviates several of the burdens of classic ADO. For example, by
storing the data as XML it can easily pass through firewalls without special
configuration. In addition, by storing related tables and representing the
relationships between those tables the DataSet can store data
hierarchically allowing for the easy manipulation of parent/child relationships.
The self-describing nature of XML combined with the object-oriented nature of
VB.NET also allows for direct programmatic access to the data in a
DataSet in a strongly typed fashion. In other words, the data need not
be accessed using a tables, rows, and columns metaphor but can be accessed in
terms of the definition of the data that can be type checked by the
compiler.

Furthermore, this disconnected model combined with connection pooling schemes
frees resources on the database server more quickly, allowing applications to
scale by not holding on to expensive database connections and locks.

In the following section we'll dig deeper to take a look at how ADO.NET
is designed with the goals of disconnected access, scalability, and
interoperability in mind.

Relationship to OLE DB and ADO

Some readers may get the impression that ADO.NET replaces the existing ADO
and OLE DB architecture. On the contrary, you should think of ADO.NET as an
enhancement to this technology for distributed application development that, for
the most part, relies on the underlying OLE DB and ADO architecture. For
example, ADO.NET does not contain managed code to natively access any data
source other than SQL Server. For all other data sources OLE DB providers are
required.

In fact, because ADO.NET was designed for disconnected and distributed
applications it might not be suitable for all types of applications you need to
create with VB.NET, especially those that rely on server side cursors and are
continuously connected. As a result, you may want to use classic ADO with VB.NET
through the COM Interop layer as discussed in Chapter 8.