Deep Dive into the HTML5 IndexedDB


Editorial Note

This article is in the Product Showcase section for our sponsors at CodeProject. These reviews are intended to provide you with information on products and services that we consider useful and of value to developers.

Over the years, the web has increasingly transformed from a repository of
content into a marketplace of full-fledged functional apps. The suite of
technologies that fall under the "HTML5" banner has, as a fundamental goal,
providing the capabilities needed to build this new breed of software. In this
article, I’ll review a technology called "IndexedDB" that solves an important
piece of the application puzzle: managing storage and retrieval of
user-specific data on the client side.

What is IndexedDB?

An IndexedDB database is basically a persistent data store in the browser: a
database on the client side. Like regular relational databases, it maintains
indexes over the records it stores, and developers use the IndexedDB JavaScript
API to locate records by key or by looking up an index. Each database is scoped
by "origin," i.e. the scheme, host, and port of the site that creates the
database.

IndexedDB is also a great example
of how web standards evolve. Through standards working groups and HTML5
Labs (a site that publishes prototype implementations of various HTML5
specifications so you can try them out and provide feedback), IndexedDB will
soon be ready for prime time site use.

Building an offline note-taking app

We are going to build a client-side data layer for a note-taking web app.

From a data model point of view,
it’s about as simple as it can get. The app allows users to write text notes
and tag them with specific key words. Each note will have a unique identifier that
will serve as its key, and apart from the note text, it will be associated with
a collection of tag strings.

The data layer is exposed through a NotesStore object with methods such as
addNote and listNotes. All method calls execute asynchronously (that is,
results are reported via callbacks), and where a result is to be returned to
the caller, the method accepts a reference to a callback that is to be invoked
with the result. Let’s see what it takes to efficiently implement this object
using an indexed database.

Testing for IndexedDB

The root object that you deal with when talking to the IndexedDB API is called
indexedDB. You can check for the presence of this object to see whether the
current browser supports IndexedDB.
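
A minimal sketch of that check follows. The helper name supportsIndexedDB and
its scope parameter are my own assumptions, not part of the API; in a page you
would pass window.

```javascript
// Returns true when the given global scope exposes an IndexedDB factory.
// `scope` is normally `window` (or `self` in a worker); it is a parameter
// here only so the check can be exercised outside a browser.
function supportsIndexedDB(scope) {
  return scope != null && scope.indexedDB != null;
}
```

A page could call supportsIndexedDB(window) once at startup and fall back to
another storage option when it returns false.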

Asynchronous requests

The asynchronous API calls work through what are known as "request" objects.
When an asynchronous API call is made, it returns a reference to a "request"
object, which exposes two event handlers: onsuccess and onerror.

As you work with the IndexedDB API, it will eventually become hard to keep
track of all the callbacks. To make it somewhat simpler, I’ll define and use a
small utility routine that abstracts the "request" pattern away.
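
One possible shape for such a utility is sketched below. The name handleRequest
and the callback(error, result) convention are assumptions made for
illustration, not the routine from the original sample. The only API surface it
relies on is the request object's onsuccess and onerror handlers and its result
and error properties.

```javascript
// Hypothetical helper: wires a request's two events to a single
// callback of the form callback(error, result).
function handleRequest(request, callback) {
  request.onsuccess = function () {
    callback(null, request.result);
  };
  request.onerror = function () {
    callback(request.error || new Error("IndexedDB request failed"), undefined);
  };
  return request;
}
```

With this in place, every asynchronous call needs only one callback instead of
two separately assigned handlers.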

The open method opens the database if it already exists. If it doesn’t, it will
create a new one. You can think of the database object it hands back as the
object that represents the connection to the database. When this object is
destroyed the connection to the database is terminated.
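
As a rough sketch, opening a database in the API as it was eventually
standardized could look like the following. The helper name openDatabase and
its parameters are hypothetical, and factory stands for window.indexedDB. Note
that the standardized open call takes a numeric version and fires an
upgradeneeded event when the structure needs to be created or changed, which
may differ from the prototype builds discussed in this article.

```javascript
// Opens (or creates) a database and reports the connection through
// callback(error, db). `factory` is normally window.indexedDB.
// `upgrade(db, oldVersion)` runs inside the "version change" transaction
// whenever the stored version is lower than `version`.
function openDatabase(factory, name, version, upgrade, callback) {
  var request = factory.open(name, version);
  request.onupgradeneeded = function (event) {
    upgrade(request.result, event.oldVersion);
  };
  request.onsuccess = function () {
    callback(null, request.result);
  };
  request.onerror = function () {
    callback(request.error, null);
  };
}
```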

Now that the database exists, let’s
create the rest of the database objects. But first, you’ll have to get
acquainted with some important IndexedDB constructs.

Object stores

Object stores are the IndexedDB equivalent of "tables" from the relational
database world. All data is stored inside object stores, which serve as the
primary unit of storage.

A database can contain multiple
object stores and each store is a collection of records. Each record is a
simple key/value pair. Keys must uniquely identify a particular record and can
be auto-generated. The records in an object store are automatically sorted in
ascending order by keys. And finally, object stores can be created and deleted
only under the context of "version change" transactions. (More on that later.)

Keys and Values

Each record in the object store
is uniquely identified by a "key." Keys can be arrays, strings, dates, or
numbers. For comparison’s sake, arrays are greater than strings,
which are greater than dates, which are greater than numbers.
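
The ordering can be illustrated with a small comparison function. This is a
simplified sketch written for this article, not the browser's implementation:
the names keyTypeRank and compareKeys are invented, and it does not reject
values such as NaN or invalid dates the way a real implementation would. In a
browser, indexedDB.cmp(first, second) performs the real comparison.

```javascript
// Rank of each key type: higher ranks sort after lower ones.
function keyTypeRank(key) {
  if (typeof key === "number") return 0;
  if (key instanceof Date) return 1;
  if (typeof key === "string") return 2;
  if (Array.isArray(key)) return 3;
  throw new TypeError("Not a valid key: " + String(key));
}

// Returns -1, 0, or 1, mirroring the ordering described above.
function compareKeys(a, b) {
  var rankA = keyTypeRank(a);
  var rankB = keyTypeRank(b);
  if (rankA !== rankB) return rankA < rankB ? -1 : 1;

  if (rankA === 3) {
    // Arrays: compare element by element, then by length.
    var shared = Math.min(a.length, b.length);
    for (var i = 0; i < shared; i++) {
      var result = compareKeys(a[i], b[i]);
      if (result !== 0) return result;
    }
    if (a.length === b.length) return 0;
    return a.length < b.length ? -1 : 1;
  }

  // Numbers and strings compare directly; dates compare by timestamp.
  var x = rankA === 1 ? a.getTime() : a;
  var y = rankA === 1 ? b.getTime() : b;
  if (x === y) return 0;
  return x < y ? -1 : 1;
}
```

Passing compareKeys to Array.prototype.sort orders a mixed list of keys with
numbers first, then dates, then strings, then arrays.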

Keys can be "in-line" keys or
not. By "in-line," we indicate to IndexedDB that the key for a particular
record is actually a part of the value object itself. In our notes store
sample, for instance, each note object has an id property that contains
the unique identifier for a particular note. This is an example of an
"in-line" key—the key is a part of the value object.

Whenever keys are "in-line," we
must also specify a "key path"—a string that signifies how the key value can be
extracted from the value object.

The key path for "notes" objects, for instance, is the string "id", since the
key can be extracted from note instances by accessing the "id" property. But
this scheme allows for the key value to be stored at an arbitrary depth in the
value object’s member hierarchy: when the key lives in a nested property, the
key path is the dot-separated list of property names that leads to it.
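
To make the idea concrete, here is a hypothetical value object with a nested
key, together with a small function that resolves a dot-separated key path.
The object shape, the property names, and extractKey are all invented for
illustration. The browser performs this lookup internally, and the real
algorithm handles more cases than this sketch does.

```javascript
// Hypothetical value object: the key lives two levels down.
var sample = {
  text: "Buy milk",
  meta: { ids: { noteId: 42 } }
};

// Follows a dot-separated key path through a value object and returns
// the key, or undefined when any step along the path is missing.
function extractKey(value, keyPath) {
  var current = value;
  var steps = keyPath.split(".");
  for (var i = 0; i < steps.length; i++) {
    if (current === null || current === undefined) return undefined;
    current = current[steps[i]];
  }
  return current;
}

extractKey(sample, "meta.ids.noteId"); // 42
```

For the notes in this article the path is simply "id", so the loop runs once.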

Database versioning

IndexedDB databases have a version number associated with them. This can be
used by web applications to determine whether the database on a particular
client has the latest structure or not.

This is useful when you make
changes to your database’s data model and want to propagate those changes to
existing clients who are on the previous version of your data model. You
can simply change the version number for the new structure and check for it the
next time the user runs your app. If needed, upgrade the structure, migrate
the data, and change the version number.
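
The upgrade step can be organized as an ordered list of migrations. The sketch
below is one way to do that, not a prescribed pattern: upgradeStructure and the
migrations array are hypothetical names, and the wiring shown in the comment
assumes the standardized API, in which the open request's upgradeneeded event
reports the previous and requested versions as event.oldVersion and
event.newVersion.

```javascript
// migrations[n] upgrades the structure from version n to version n + 1.
// Applies, in order, every step between oldVersion and newVersion.
function upgradeStructure(db, oldVersion, newVersion, migrations) {
  for (var version = oldVersion; version < newVersion; version++) {
    migrations[version](db);
  }
}

// Hypothetical wiring inside an open request:
// request.onupgradeneeded = function (event) {
//   upgradeStructure(request.result, event.oldVersion,
//                    event.newVersion, migrations);
// };
```

A client that has never opened the database starts at version 0 and runs every
step, while a client on the previous version runs only the newest one.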

Version number changes must be
performed under the context of a "version change" transaction. Before we get
to that, let’s quickly review what "transactions" are.

Transactions

Like relational databases, IndexedDB
also performs all of its I/O operations under the context of transactions.
Transactions are created through connection objects and enable atomic, durable
data access and mutation. There are two key attributes for transaction
objects:

1. Scope

The scope determines which parts of the database can be affected through the
transaction. This basically helps the IndexedDB implementation determine what
kind of isolation level to apply during the lifetime of the transaction. Think
of the scope as simply a list of tables (known as "object stores") that will
form a part of the transaction.

2. Mode

The transaction mode determines what kind of I/O operation is permitted in the
transaction. The mode can be:

a. Read only

Allows only "read" operations on the objects that are a part of the
transaction’s scope.

b. Read/write

Allows "read" and "write" operations on the objects that are a part of the
transaction’s scope.

c. Version change

The "version change" mode allows "read" and "write" operations and also allows
the creation and deletion of object stores and indexes.

Object stores are created by
calling the createObjectStore method on the database object. The first
parameter is the name of the object store. This is followed by the string
identifying the key path, and finally a Boolean flag indicating whether the key
value should be auto-generated by the database when new records are added.
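
In the API as standardized, the key path and the auto-generate flag are passed
together in an options object rather than as separate arguments. Below is a
minimal sketch under that assumption, using a store named "notes" and a helper
name, createNotesStore, of my own choosing.

```javascript
// Creates the "notes" store with an in-line, auto-generated key.
// Must run inside a "version change" transaction, e.g. from the open
// request's upgradeneeded handler.
function createNotesStore(db) {
  return db.createObjectStore("notes", {
    keyPath: "id",       // in-line key: read from each note's id property
    autoIncrement: true  // let the database generate ids for new notes
  });
}
```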

Adding data to object stores

New records can be added to an
object store by calling the put method on the object store. A reference to
the object store instance can be retrieved through the transaction object.
Let’s implement the addNote method of our NotesStore object and
see how we can go about adding a new record:

Invoke the transaction method on the database object to start off a new
transaction. The first parameter lists the names of the object stores that are
going to be a part of the transaction. Passing null causes all the object
stores in the database to be a part of the scope. The second parameter
indicates the transaction mode. In early implementations this is a numeric
constant; the standardized API uses the strings "readonly" and "readwrite"
instead.

Once the transaction has been created we acquire a reference to the object
store in question through the transaction object’s objectStore method.

Once we have the object store handy, adding a new record is just a matter of
issuing an asynchronous API call to the object store’s put method passing in
the new object to be added to the store. Note that we do not pass a value for
the id field of the new note object. Since we passed true for the auto-generate
parameter while creating the object store, the IndexedDB implementation should
take care of automatically assigning a unique identifier for the new record.

Once the asynchronous put call completes successfully, we commit the
transaction.
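
Putting these steps together, a sketch of addNote might look like this. It
assumes the standardized API, where the mode is the string "readwrite" and a
transaction commits automatically once its requests finish, so there is no
explicit commit call. The db parameter, the "notes" store name, and the
callback(error, id) convention are assumptions for illustration.

```javascript
// Adds a note and reports the generated id through callback(error, id).
function addNote(db, text, tags, callback) {
  var tx = db.transaction(["notes"], "readwrite");
  var store = tx.objectStore("notes");
  // No id is supplied: the store generates one for the new record.
  var request = store.put({ text: text, tags: tags });
  var newId;

  request.onsuccess = function () {
    newId = request.result; // key assigned to the new record
  };
  // The transaction commits on its own once its requests have finished.
  tx.oncomplete = function () {
    callback(null, newId);
  };
  tx.onerror = function () {
    callback(tx.error || request.error, undefined);
  };
}
```

Reporting the result from the transaction's complete event, rather than from
the put request, means the caller hears back only after the record has actually
been committed.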

Running queries with cursors

The IndexedDB way of enumerating
records from an object store is to use a "cursor" object. A cursor can iterate
over records from an underlying object store or an index. A cursor has the
following key properties:

A range of records in either an index or an object store.

A source that references the index or object store that the cursor is
iterating over.

A position indicating the current position of the cursor in the given range of
records.

While the concept of a cursor is
fairly straightforward, writing the code to actually iterate over an object
store is somewhat tricky given the asynchronous nature of all the API calls.
Let’s implement the listNotes method of our NotesStore object and
see what the code looks like.

First, we acquire a transaction object by calling the database object’s
transaction method. Note that this time we’re indicating that we require a
"read-only" transaction.

Next we retrieve a reference to the object store via the objectStore method of
the transaction object.

Then we issue an async call to the openCursor API on the object store. The
tricky part here is that every single iteration over a record in the cursor is
itself an async operation! To prevent the code from drowning in a sea of
callbacks, we define a local function called iterate to encapsulate the logic
of iterating over every record in the cursor.

This iterate function makes an async call to the cursor object’s move method
and recursively invokes itself again in the callback if it detects that there
are more rows to be retrieved. Once all the rows in the cursor have been
retrieved we finally invoke the callback method passed by the caller handing in
the retrieved data as a parameter.
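
Here is a sketch of listNotes along those lines. It assumes the standardized
API, in which the cursor is advanced with continue() rather than move, and the
same success handler fires again for each record, so no separate iterate
function is needed. The db parameter, the "notes" store name, and the
callback(error, notes) convention are assumptions.

```javascript
// Collects every note in key order and reports them through
// callback(error, notes).
function listNotes(db, callback) {
  var notes = [];
  var tx = db.transaction(["notes"], "readonly");
  var request = tx.objectStore("notes").openCursor();

  // Fires once per record, and one final time with a null cursor.
  request.onsuccess = function () {
    var cursor = request.result;
    if (cursor) {
      notes.push(cursor.value);
      cursor.continue(); // ask for the next record; onsuccess fires again
    } else {
      callback(null, notes); // no more rows
    }
  };
  request.onerror = function () {
    callback(request.error, undefined);
  };
}
```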

Dive even deeper!

This is, by no means,
comprehensive coverage of the API, despite what you may think! I only covered:

Available options for implementing client-side storage today

The various key aspects of the IndexedDB API, including:

Testing whether the browser supports it

Managing asynchronous API calls

Creating/opening databases

Key parts of the API including object stores, keys/values, versioning, and
transactions

Creating object stores

Adding records to object stores

Enumerating object stores using cursors

Hopefully it was up close and
personal enough!

Now, if you're ready for more,
the W3C specification document
is a good reference and is short enough to be readable! I’d encourage you
to experiment—having access to a functional database on the client side opens
up a range of new scenarios for web applications.

Another good resource is the IndexedDB/AppCache sample on the IE Test Drive site. This sample
covers a scenario where the two specifications complement each other in
providing the user with a rich experience…even when she’s not connected to the
internet. The sample also demonstrates using new features in IE10 like CSS3 3D
transforms and CSS3 transitions.