Abstract

The goal of this library
is to make ODBC recordsets look just like STL containers. As a
user, you can move through our containers using standard STL
iterators, and if you insert(), erase() or replace() records in
our containers, the changes can be automatically committed to the
database for you. The library's compliance with the STL iterator
and container standards means you can plug our abstractions into
a wide variety of STL algorithms for data storage, searching and
manipulation. In addition, the C++ reflection mechanism used by
our library to bind to database tables allows us to add generic
indexing and lookup properties to our containers with no special
code required from the end-user. Because our code takes full
advantage of the template mechanism, it adds minimal overhead
compared with using raw ODBC calls to access a database.

Background

Introduced in the early 1990s, the STL
and templates represent one of the most significant advances in
the C++ language in the last decade. The guiding force behind the
power of the standard template library is the notion of Generic
Programming. At the heart of Generic Programming is the idea of
abstracting operations across as broad a set of data types as
possible to create algorithms that are as generic as possible.
This kind of design leads to abstractions that are centered
around a set of requirements on the data types themselves.
Examples in STL include notions such as iterators, containers and
set operations. We have taken these abstractions and applied them
to the problem of representing tables in a database. In what
follows, we will show how this simplifies the task of
manipulating data and provides instant access to a broad range of
algorithms that come with the standard template library.

A First Example, Reading and Writing Records in a Table

As our first example, we
show what the code would look like to open a table and read a set
of rows from a database.

Accessing a Table in Four Easy Steps:

1. Define an object to hold the rows from your query.

2. Define an association
between fields in your query and fields in your object. This is
what we call a 'BCA', which is short for Bind Column Addresses.
In the example below, this is done via the functor "BCAExample".
The job of the BCA is to equate SQL fields with object fields via
the '==' operator which will then establish ODBC bindings to move
data to or from a user query.

3. Create a view to select
records from. This view is built from the template DBView and
establishes which table(s) you want to access, what fields you
want to look at (via the BCA), and an optional where clause to
further limit the set of records that you are working with. The
DBView template forms a semi-Container in the STL sense [1].

4. Use the DBView container
to obtain an iterator to SELECT, INSERT, UPDATE or DELETE records
from your view. These iterators may be used to either populate
STL containers or apply algorithms from the Standard Template
Library.
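The '==' binding idea from step 2 can be illustrated without ODBC. The following is a minimal sketch in plain C++, not the library's implementation: the names ColumnBindings, Column, store, Example and bind_example are all hypothetical, and only long fields are handled. It shows how an overloaded operator== can record an association between a column name and the address of an object field rather than compare anything.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical registry that remembers which column name feeds which field.
// Only long fields are handled to keep the sketch short.
class ColumnBindings {
public:
    // Proxy returned by operator[]. Its operator== records a binding
    // instead of comparing values.
    class Column {
    public:
        Column(ColumnBindings& owner, std::string name)
            : owner_(&owner), name_(std::move(name)) {}

        // "Equate" the column with a field by storing the field's address.
        void operator==(long& field) { owner_->targets_[name_] = &field; }

    private:
        ColumnBindings* owner_;
        std::string name_;
    };

    Column operator[](const std::string& name) { return Column(*this, name); }

    // Simulates fetching one column value: writes it into the bound field.
    // Returns false if no field was bound to that column.
    bool store(const std::string& name, long value) {
        auto it = targets_.find(name);
        if (it == targets_.end()) return false;
        *it->second = value;
        return true;
    }

private:
    std::map<std::string, long*> targets_;
};

// Hypothetical row object and the function that binds its fields.
struct Example {
    long exampleLong = 0;
};

inline void bind_example(ColumnBindings& cols, Example& row) {
    cols["EXAMPLE_LONG"] == row.exampleLong;
}
```

After bind_example(cols, row) has run, a call such as cols.store("EXAMPLE_LONG", 7) writes 7 directly into row.exampleLong, which mirrors the way bound column addresses let a fetch fill an object in place.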

In all the examples that follow we will assume that our database contains a table called DB_EXAMPLE of the form

[1] See
http://www.sgi.com/tech/stl/Container.html for the definition of an STL container. We
call DBView a semi-container because it supports all
standard container methods except size(), max_size() and
empty(). We explain why these were left out by design in the
documentation for the DBView template.

At this point, it is worth discussing
the types of iterators exposed by DBView. The iterators that
DBView provides are either Input iterators or Output iterators.
In simple terms, an Input iterator can read elements, but not
write them. An Output iterator can write elements, but not read
them. These notions were first envisaged for working with C++
input and output streams but they apply equally well to reading
and writing table data. Input and Output iterators are also
minimal types of iterators in that they don't guarantee that
table records will be read in any kind of specific or consistent
order and they don't provide for random access in the sense that
users cannot ask them to 'skip' ahead a given number of records
or go to a particular record number in the table. An exact
description of the functionality provided by Input and Output
iterators may be found at http://www.sgi.com/tech/stl/InputIterator.html
and http://www.sgi.com/tech/stl/OutputIterator.html.

By restricting the iterators from DBView to be either input or
output iterators, we are able to provide database access with a
minimum amount of code overhead; thereby ensuring that read and
write operations remain efficient as compared with raw ODBC calls.
The iterators provided by DBView are as follows:

Input Iterators: select_iterator

Output Iterators: insert_iterator, update_iterator, delete_iterator
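The single-pass behavior of Input and Output iterators can be seen with the standard stream iterators, independently of this library. The sketch below uses only the C++ standard library; the helper names read_all and write_all are made up for illustration. Values are consumed once, in the order the source delivers them, and written values cannot be read back through the output iterator.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Read whitespace-separated integers through an Input iterator.
// Each value is read exactly once, in the order the stream delivers it.
std::vector<int> read_all(const std::string& text) {
    std::istringstream in(text);
    std::vector<int> values;
    std::copy(std::istream_iterator<int>(in), std::istream_iterator<int>(),
              std::back_inserter(values));  // back_inserter is an Output iterator
    return values;
}

// Write values through an Output iterator. Elements can be written,
// but never read back through the same iterator.
std::string write_all(const std::vector<int>& values) {
    std::ostringstream out;
    std::copy(values.begin(), values.end(),
              std::ostream_iterator<int>(out, " "));
    return out.str();
}
```

Any algorithm that needs only these two iterator categories, such as std::copy, can therefore work with a data source that supports nothing more than forward, one-time traversal.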

To illustrate the use of an output iterator we show how a vector of rows would be inserted into a table.

In WriteData() we have used an output iterator to insert records
into our table in much the same way that we used an input iterator
to read records from a table. In addition, this example
introduces the notion of client-side validation. Often, when reading
or writing records from a table, we want to do client-side
validation to make sure that the fields in a record are not null
or that they lie within an acceptable range of values. DBView supports this
through SelValidate and InsValidate functions. The SelValidate
function validates records as they are selected from the database.
The InsValidate function validates records as they are inserted
into the database. In the example above, we define a
DefaultInsValidate function which validates records before
insertion to make sure the exampleStr, exampleDouble and
exampleLong fields contain acceptable values before allowing them
to be inserted into the database.
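The general pattern of validating records as they pass through an output iterator can be sketched in plain C++. This is an illustration under stated assumptions, not the library's InsValidate mechanism: Row, validate_row and validating_inserter are hypothetical names, the target is an in-memory vector instead of a database table, and rows that fail validation are silently skipped here, which may differ from how the library reports a failed validation.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical row type; the field names are illustrative only.
struct Row {
    std::string exampleStr;
    long exampleLong;
    double exampleDouble;
};

// Hypothetical validation rule: reject empty strings and negative numbers.
inline bool validate_row(const Row& r) {
    return !r.exampleStr.empty() && r.exampleLong >= 0 && r.exampleDouble >= 0.0;
}

// Output iterator that forwards a row to the target only if it validates.
class validating_inserter {
public:
    using iterator_category = std::output_iterator_tag;
    using value_type = void;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = void;

    validating_inserter(std::vector<Row>& target,
                        std::function<bool(const Row&)> check)
        : target_(&target), check_(std::move(check)) {}

    // Assignment through the iterator performs the (validated) insert.
    validating_inserter& operator=(const Row& r) {
        if (check_(r)) target_->push_back(r);
        return *this;
    }

    // Dereference and increment are no-ops, as with std::back_insert_iterator.
    validating_inserter& operator*() { return *this; }
    validating_inserter& operator++() { return *this; }
    validating_inserter operator++(int) { return *this; }

private:
    std::vector<Row>* target_;
    std::function<bool(const Row&)> check_;
};
```

With this in place, std::copy(input.begin(), input.end(), validating_inserter(table, validate_row)) copies only the rows that pass the check.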

In general, the constructor for DBView<class DataObj, class
ParamObj = DefaultParamObj<DataObj> > takes the form

which allows the user to define table names, field names, a where
clause, query parameters, a selection validation function, an
insert validation function and a database connection to use when
processing queries. If the user does not supply a validation
function then the default functions named DefaultSelValidate and
DefaultInsValidate will be called. To see how the postfix clause
and parameters work we will next examine a more complex case.

A Second Example, Parameterized Queries

We now turn to a more
general class of queries: the case where we may be joining across
multiple tables and/or have join conditions that restrict the set
of records to be retrieved.

This works in exactly the same way as
the select iterator shown previously. The only new elements here
are that instead of a single table name we provide a list of
tables, we set a where clause, and we bind parameters to fill in
values for the clause. To bind parameters we first create what we
call a BPA, or Bind Parameter Addresses, functor. A BPA functor
establishes a correspondence between parameters that are
identified in a postfix clause by "(?)" and fields in a
parameter object. If you examine the function BPAJoinParamObj you
will notice that unlike our BCA functor the parameter fields are
bound by number. This is partly because parameter fields do not
have distinct names the way that table fields do, and it is
partly due to the fact that using a number here allows the
binding operator to distinguish between binding output columns
and input parameters. Observant readers will also note that our
postfix clause contains instructions to sort the retrieved
objects in a particular manner ("ORDER BY SAMPLE_LONG"). In fact,
the postfix clause need not contain a WHERE clause at all. In
practical applications it might simply be a sorting statement
or a GROUP BY clause, and the 'field' names in the BCA functor
may be SQL functions like "SUM(INT_VALUE)"
instead of simple column names.
The BCA and BPA are specified as function objects, i.e. functors.

Tables R Us, The IndexedDBView

In practice, the most
common operations performed on a set of table records are: read
the records into a container, search the records by different key
fields (i.e. indexes), and delete, insert or update records in
the container. For this reason, we have developed a more advanced
container for holding database tables. This IndexedDBView
container is a specialization of a Unique Associative Container
as defined by the standard template
library http://www.sgi.com/tech/stl/UniqueAssociativeContainer.html.

In addition to the base methods defined by the STL standard, we
have coded features that make the container work more naturally with the
underlying rows that it contains. The main new features are the
easy creation of indexes into rows and synchronization
capabilities that can automatically propagate any changes back to
the database. This container comes at a price. It incurs more
overhead than the simple DBView, and because it works at a higher
level, you lose a bit of the fine-grained control that you get
with simple iterators. To explain, we begin with an example:

The first parameter here is a view object; this defines the SQL
Query that will be used to read and write records as described in
the previous two examples. The second parameter is
IndexNamesAndFields; this defines indexes on the rows in the
container and we will examine it in more detail shortly. The
BoundMode and KeyMode control whether or not changes to the
container data are synchronized with the database, and if so what
key fields are used for the synchronization. If BoundMode =
BOUND, then any changes to the container are sent to the database.
If BoundMode = UNBOUND then any changes to the container will
only apply locally. Finally, the SetParams function allows the
user to pass in an explicit function for setting parameters in
the where clause for the view if they so desire.

The IndexNamesAndFields parameter is interesting.
IndexNamesAndFields is used to automatically create named indexes
into our rows. In the above example we have

What this does is create two indexes on the data that is read
into the container. The first index is designated as UNIQUE,
has the name "PrimaryIndex", and is based on the field
called STRING_VALUE. Because this key is designated as unique,
it places a constraint on the container: a row can be added
only if its STRING_VALUE differs from that of every row already
in the container. The second index is created with the name
"AlternateIndex" and is based on the fields
EXAMPLE_LONG and EXAMPLE_DATE. AlternateIndex is not designated
to be unique here and is created only to provide a way to quickly
look up rows based on the values in the EXAMPLE_LONG and
EXAMPLE_DATE fields.
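To make the idea of a unique index plus a non-unique alternate index concrete, here is a hand-written sketch in plain C++ of the bookkeeping involved. It is not how IndexedDBView is implemented: ExampleRow, TwoIndexTable and find_alternate are hypothetical names, and the date is simplified to an integer. Code of this kind is what the IndexNamesAndFields syntax is meant to spare the user from writing.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical row type mirroring a few DB_EXAMPLE columns.
struct ExampleRow {
    std::string stringValue;  // STRING_VALUE
    long exampleLong;         // EXAMPLE_LONG
    int exampleDate;          // EXAMPLE_DATE, simplified to yyyymmdd
};

// Rows are stored once; two indexes map key values to row positions.
class TwoIndexTable {
public:
    // Returns false (and stores nothing) if the unique key already exists.
    bool insert(const ExampleRow& row) {
        if (primary_.count(row.stringValue) != 0) return false;
        const std::size_t pos = rows_.size();
        rows_.push_back(row);
        primary_.emplace(row.stringValue, pos);
        alternate_.emplace(std::make_tuple(row.exampleLong, row.exampleDate), pos);
        return true;
    }

    // Lookup through the unique index; returns nullptr when absent.
    const ExampleRow* find(const std::string& key) const {
        auto it = primary_.find(key);
        return it == primary_.end() ? nullptr : &rows_[it->second];
    }

    // Lookup through the non-unique index; returns the first row inserted
    // with this key, or nullptr when there is no match.
    const ExampleRow* find_alternate(long l, int date) const {
        auto range = alternate_.equal_range(std::make_tuple(l, date));
        return range.first == range.second ? nullptr : &rows_[range.first->second];
    }

    std::size_t size() const { return rows_.size(); }

private:
    std::vector<ExampleRow> rows_;
    std::map<std::string, std::size_t> primary_;
    std::multimap<std::tuple<long, int>, std::size_t> alternate_;
};
```

Every additional index requires another map, another key type and another line in insert(), which is the maintenance burden that grows with the number of tables and indexes.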

Why do we care about this? Doesn't the normal STL associative
container already provide lookup and retrieval using keys? Well,
the normal associative containers in STL have two limitations
that we found quite tedious to work with in practice. The first
limitation is that if you want an STL container to provide lookup
capabilities then you need to manually write comparison functions
for each class and index that you want to use. As the number of
tables and indexes grows, manually maintaining these comparison
functions becomes a burden. The IndexNamesAndFields
syntax can automatically create indexes given a list of field
names. The internal comparison functions that are created are
slightly slower than hand-made comparison operators, but
the performance difference is not that great and we feel that the
loss is more than made up for by the increased ease of use and
maintainability. The second limitation is that the STL containers
only support a single index on the data. We found this rather
confining since we often want to be able to search the same set
of rows quickly using various subsets of the row fields. For this
reason, IndexNamesAndFields allows you to create multiple indexes
on the rows in your container. To see how these features are used
to search based on the PrimaryIndex and AlternateIndex we examine
the following lines from the above example:

As per the standard, we provide a find(DataObj) method to locate
elements in the container. Our default find method uses the first
index passed into the IndexedDBView constructor to locate objects,
and will return a match based only on the fields in that index.
In addition to the default find method, we have added overloaded
versions of the find method to perform a find using only the
fields needed by the index. For example, in the case of
indexed_view.find(string("Foozle")), the find() function resolves to
find<DataField>(const DataField &df1). This is useful,
because it allows us to execute a find by directly supplying the
criteria fields that we care about rather than having to manually
initialize an entire data object just to perform a find operation.

In addition to find() operations using the primary index, we can
also find an object based upon any of the indexes named in the
constructor for IndexedDBView. This is done via the find_AK
function. For example, we could say
indexed_view.find_AK("AlternateIndex", long_criteria, date_criteria),
which would find the first element that matches the criteria
provided by long_criteria and date_criteria using the fields
named in the "AlternateIndex" to determine if we have a
match.

Finally, you will notice that the above code has calls to insert(),
replace() and erase() methods for IndexedDBView. One major
difference between the IndexedDBView container and a standard
container is that any changes made to the items in our container
can be automatically propagated back to the database. If we
construct the container to initialize in what we call "Bound"
mode then any changes made to the container are also sent to the
database. In our example, when we call the erase() method, this
removes the item in the container and also deletes the underlying
record in the database. Similarly, insert() and replace() will
modify both the container and the database.
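The difference between bound and unbound behavior can be sketched with a small container that mirrors its changes to a backing store. This is an assumption-laden illustration, not the library's code: FakeDatabase, SyncedSet and the BoundMode enumerators are hypothetical, and the "database" is just a log of the statements that would be issued.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a database: records the operations it receives.
struct FakeDatabase {
    std::vector<std::string> log;
};

enum class BoundMode { Bound, Unbound };

// A set of keys that optionally propagates each change to the database.
class SyncedSet {
public:
    SyncedSet(FakeDatabase& db, BoundMode mode) : db_(&db), mode_(mode) {}

    // Inserts locally; in Bound mode the insert is also sent to the database.
    bool insert(const std::string& key) {
        if (!items_.insert(key).second) return false;  // duplicate key
        if (mode_ == BoundMode::Bound) db_->log.push_back("INSERT " + key);
        return true;
    }

    // Erases locally; in Bound mode the delete is also sent to the database.
    bool erase(const std::string& key) {
        if (items_.erase(key) == 0) return false;  // nothing to erase
        if (mode_ == BoundMode::Bound) db_->log.push_back("DELETE " + key);
        return true;
    }

    std::size_t size() const { return items_.size(); }

private:
    FakeDatabase* db_;
    BoundMode mode_;
    std::set<std::string> items_;
};
```

A container constructed with BoundMode::Unbound changes only its local contents and leaves the log empty, which corresponds to the local-only behavior described for UNBOUND.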

When You Don't Know What You Need... Dynamic Queries Are the Answer

The queries shown above
assume that you know exactly what your target table looks like
and are able to define static objects to go against known fields
in these tables. In practice, you often end up in the situation
where you have a query with an unknown number of columns with
unknown types and you want to bind a dynamic object to this query.
To solve this problem, our library has two additional containers
called DynamicDBView and DynamicIndexedDBView which perform
binding to a variant row class. This variant row class allows for
an arbitrary number of fields, with each field being of an
arbitrary type [2]. The type and number of fields in variant row
are determined at run-time by querying the underlying database to
find the number of fields in the query and the type of each field
that is to be returned. To illustrate, we present an example:

// Using a DynamicDBView to read rows from the database.
// Read the contents of a table and print the resulting rows
void SimpleDynamicRead() {
    // Our query will be "SELECT * FROM DB_EXAMPLE"
    DynamicDBView<> view("DB_EXAMPLE", "*");

    // NOTE: We need to construct s from the view itself since we
    // don't know what fields the table will contain.
    // We therefore make a call to the DataObj() function to have the
    // table return us a template row with the correct number of fields
    // and field types.
    // We use this construction since we can't be guaranteed that the table
    // is non-empty & we want to still display column names in this case.
    variant_row s(view.DataObj());

    // Print out the column names
    vector<string> colNames = s.GetNames();
    for (vector<string>::iterator name_it = colNames.begin();
         name_it != colNames.end();
         name_it++)
    {
        cout << (*name_it) << " ";
    }
    cout << endl;

    // Print out all rows and columns from our query
    DynamicDBView<>::select_iterator print_it = view.begin();
    for (print_it = view.begin(); print_it != view.end(); print_it++)
    {
        variant_row r = *print_it;
        for (size_t i = 0; i < r.size(); i++)
        {
            cout << r[i] << " ";
        }
        cout << endl;
    }
}

Unlike the DBView code presented above, in DynamicDBView there is
no notion of a BCA to bind records to a particular class since
the assumption is that DynamicDBView will always bind to a
variant_row object. Therefore, the DynamicDBView is constructed
by specifying a table name and a list of fields to select from
the table (in this case we use "*" to specify all
fields in the table). When we go to retrieve rows from our table,
the row iterator returns variant_row
objects. Essentially, variant_row is an
array of varying types designed to hold the fields from our query.
variant_row is constructed when the query is first
executed, at which time the view interrogates the database in
order to find out the number and types of fields that will be
returned. Here we use three methods from variant_row in
order to display our results.

First, we call GetNames() in order to obtain a vector of the field names
in our query. To retrieve the field names, we must first
initialize a variant_row object from the view:

variant_row s(view.DataObj());

It is crucial that we initialize
every variant_row object that we want to use from our view
class. This is because a single variant_row
class is shared by all dynamic
views, so each view has to initialize its particular
rows at run time to tell variant_row
what fields it will need to hold
from the query. The second method that we use from variant_row is
the size() method. This returns the number of fields in
our row. Finally, we access individual fields within a row via
the [] operator. The [] operator returns a variant_field object
that we can use to read, write or print individual fields.
Individual fields may be specified by either field name or field
number. To illustrate, we continue with a second example that
uses DynamicIndexedDBView. What this example does is to repeat
the IndexedViewExample code shown above; but it uses a variant_row object
to do all its work rather than a specialized Example class.
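Separately from that example, the idea of a row whose fields have differing types, addressable by number or by name, can be sketched with the C++17 std::variant. This is a swapped-in technique for illustration and not the library's variant_row: Field, SketchRow and to_text are hypothetical names, the column name DOUBLE_VALUE is assumed, and only three field types are supported.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// A field may hold any one of a small set of types.
using Field = std::variant<long, double, std::string>;

// Hypothetical dynamic row: parallel lists of column names and values.
class SketchRow {
public:
    void add(const std::string& name, Field value) {
        names_.push_back(name);
        fields_.push_back(std::move(value));
    }

    std::size_t size() const { return fields_.size(); }
    const std::vector<std::string>& names() const { return names_; }

    // Access by field number; throws std::out_of_range on a bad index.
    const Field& operator[](std::size_t i) const { return fields_.at(i); }

    // Access by field name; throws std::out_of_range if the name is unknown.
    const Field& operator[](const std::string& name) const {
        for (std::size_t i = 0; i < names_.size(); ++i) {
            if (names_[i] == name) return fields_[i];
        }
        throw std::out_of_range("no such field: " + name);
    }

private:
    std::vector<std::string> names_;
    std::vector<Field> fields_;
};

// Formats whichever alternative the field currently holds.
std::string to_text(const Field& f) {
    std::ostringstream out;
    std::visit([&out](const auto& v) { out << v; }, f);
    return out.str();
}
```

In this sketch the column layout is supplied by calls to add(); in the dynamic views described above, the corresponding information comes from interrogating the database at run time.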

Using STL Algorithms, the Table Difference Function

As a final example, we
show how our library's compliance with the STL standards allows
us to take easy advantage of native STL algorithms. If we pass
two table containers to the function below, it can use the
standard STL algorithms to easily perform a 'difference'
operation showing any changed records in the tables.
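As a stand-in for that kind of function, the following sketch applies std::set_difference to two plain vectors of records instead of library containers. The names Record and changed_records are hypothetical, and the records are compared on all fields, so a row whose value changed counts as different.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical record type: a key and a single value column.
struct Record {
    std::string key;
    long value;
};

// Order records by key, then by value, so that a changed value
// makes two records with the same key compare as different.
inline bool operator<(const Record& a, const Record& b) {
    return std::tie(a.key, a.value) < std::tie(b.key, b.value);
}

// Returns the records present in `current` that do not appear,
// field for field, in `previous`.
std::vector<Record> changed_records(std::vector<Record> previous,
                                    std::vector<Record> current) {
    // set_difference requires both ranges to be sorted.
    std::sort(previous.begin(), previous.end());
    std::sort(current.begin(), current.end());

    std::vector<Record> diff;
    std::set_difference(current.begin(), current.end(),
                        previous.begin(), previous.end(),
                        std::back_inserter(diff));
    return diff;
}
```

Because std::set_difference needs only input iterators on the source ranges and an output iterator for the result, the same algorithm applies to any container that exposes sorted data through those iterator categories.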

Conclusion

In the foregoing article
we presented an STL-centric paradigm for reading, writing and
updating table data from an ODBC data source. The library we
presented is centered around the notion of representing database
table operations via standard STL iterators and containers. Our
presentation was at an overview level for these iterators and
containers; full technical details have been left to the
reference documentation that we provide with the library. The
advantage of following the STL iterator and container paradigm is
that we are able to plug our database abstractions into a wide
variety of STL algorithms for data storage, indexing and
manipulation. In addition, the C++ reflection mechanism that we
introduced to bind iterators to database tables allows us to add
powerful automatic indexing and lookup features to our container
representations.

[2] Our variant row type uses a
template mechanism to be able to hold values of common database
types. It is loosely based on the variant_t class proposed by
Fernando Cacciola. See F. Cacciola (2000). "An Improved
Variant Type Based on Member Templates," C++ Users
Journal Oct 2000, p. 10.
