Databases and MIDP, Part 2: Data Mapping

As you discovered in Part 1 of this series, the Mobile Information Device Profile (MIDP) provides for data persistence through the Record Management System (RMS). MIDP support for persistence is limited to simple byte arrays, and records are read and written in their entirety, not field by field. Thus the RMS application programming interface (API) is very simple, but it requires applications to use a very unsophisticated binary format for data storage.

This article describes data mapping strategies you can use to encapsulate low-level storage operations so your applications can store and retrieve persistent data efficiently and effectively.

Core Classes for Manipulating Data

Writing data to a record store is really no different from sending a packet of data across a network to a server. The Connected Limited Device Configuration (CLDC), on which MIDP is based, includes standard data-manipulation classes drawn from the core library of the Java 2 Platform, Standard Edition (J2SE) that are particularly useful for RMS operations. A big benefit of using common interfaces is that MIDlets can interoperate more easily with applications running on standard and enterprise Java platforms.

Byte-Array Streams

A ByteArrayInputStream object transforms a byte array into an input stream, as this trivial example demonstrates:
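
A minimal sketch of such an example might look like this. The ByteArrayInputDemo class and its sum() method are hypothetical names chosen only for illustration; the stream calls themselves are the standard ones.

```java
import java.io.ByteArrayInputStream;

// Hypothetical demo class, not part of any MIDP or CLDC API.
class ByteArrayInputDemo {
    // Reads every byte of the array through a stream and returns the total.
    static int sum(byte[] data) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        int total = 0;
        int b;
        // read() returns the next byte as a value from 0 to 255,
        // or -1 once the end of the array has been reached.
        while ((b = in.read()) != -1) {
            total += b;
        }
        return total;
    }
}
```

Note that read() reports each byte as an unsigned value, so a byte stored as (byte) 0xFF comes back as 255 rather than -1.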

A ByteArrayOutputStream object does the reverse: it captures the data written to it in an internal buffer, which grows automatically as data is written to the stream. The toByteArray() method copies the captured data to a new byte array. You can reuse the internal buffer for further capturing by calling reset().
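
As a rough illustration, the following sketch captures two separate byte sequences with a single stream, calling reset() in between. The ByteArrayOutputDemo class and its capture() method are hypothetical names used only for this example.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical demo class, not part of any MIDP or CLDC API.
class ByteArrayOutputDemo {
    // Captures two byte sequences with one stream, reusing its buffer.
    static byte[][] capture() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        out.write(10);
        out.write(20);
        byte[] first = out.toByteArray();   // a copy holding {10, 20}

        out.reset();                        // discard the captured data
        out.write(30);
        byte[] second = out.toByteArray();  // a copy holding {30}

        return new byte[][] { first, second };
    }
}
```

Because toByteArray() returns a copy, the first array is unaffected by the later reset() and write() calls.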

Data can be read through a DataInputStream only if it has been written to the stream in the machine-independent format that the class expects. The class has methods to read most simple Java data types: readBoolean(), readByte(), readChar(), readShort(), readInt(), and readLong(). CLDC 1.1 implementations additionally support readFloat() and readDouble(). There are also methods for reading byte arrays and unsigned values: readFully(), readUnsignedByte(), and readUnsignedShort().

The readUTF() method reads strings that were written in the modified UTF-8 format used by writeUTF(); the encoded form of each string can be at most 65,535 bytes long. To read a string written as a sequence of two-byte char values, an application must make multiple calls to readChar(), which requires either that a delimiter mark the end of the string or that the length be known in advance. The length may be a fixed value, or it may have been written to the stream immediately before the string.

The application reading the data must know the order in which the primitives were written in order to invoke the correct methods.
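
To make the point about ordering concrete, here is a minimal sketch that decodes a short followed by an int from a hand-built byte array. The DataInputDemo class and the two-field layout are assumptions made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical demo class; the short-then-int layout is illustrative only.
class DataInputDemo {
    // Decodes a short followed by an int, in that order, and returns their sum.
    static int readShortThenInt(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        try {
            short s = in.readShort(); // first 2 bytes, high byte first
            int i = in.readInt();     // next 4 bytes, high byte first
            return s + i;
        } finally {
            in.close();
        }
    }
}
```

Calling readInt() before readShort() on the same six bytes would succeed without any error but yield different values, which is why the reader and the writer must agree on the layout.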

A DataOutputStream writes data in the same machine-independent format that DataInputStream expects. The class has methods to write most simple Java data types: writeBoolean(), writeByte(), writeChar(), writeShort(), writeInt(), and writeLong(). CLDC 1.1 implementations additionally support writeFloat() and writeDouble().
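
The following sketch shows how a few primitives might be written to a byte array with a DataOutputStream. The DataOutputDemo class and the choice of fields are hypothetical; the sizes in the comments follow from the fixed-width format the class uses.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demo class; the three fields are illustrative only.
class DataOutputDemo {
    // Writes a boolean, an int, and a long, and returns the encoded bytes.
    static byte[] write(boolean flag, int count, long stamp) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeBoolean(flag); // 1 byte: 1 for true, 0 for false
        out.writeInt(count);    // 4 bytes, high byte first
        out.writeLong(stamp);   // 8 bytes, high byte first
        out.flush();
        return bytes.toByteArray();
    }
}
```

Every call to this method produces exactly 13 bytes, whatever the values are, because each primitive has a fixed encoded size.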

Strings are written using one of two methods. You can call writeUTF() to write a string in modified UTF-8 format, provided its encoded form is no longer than 65,535 bytes, or call writeChars() to write it as a sequence of two-byte characters.
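
The two approaches can be compared with a short sketch. The StringFormatsDemo class and its method names are hypothetical, and writing the length as an int before the characters is just one possible convention, not something writeChars() does for you.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demo class comparing the two string formats.
class StringFormatsDemo {
    // writeUTF() stores a two-byte length followed by the encoded bytes.
    static byte[] asUTF(String s) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeUTF(s);
        out.flush();
        return bytes.toByteArray();
    }

    // writeChars() stores no length, so this sketch writes one first
    // (an assumed convention; a delimiter would work as well).
    static byte[] asChars(String s) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(s.length());
        out.writeChars(s);
        out.flush();
        return bytes.toByteArray();
    }

    // Reads back a string stored by asChars(), one char at a time.
    static String fromChars(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int length = in.readInt();
        StringBuffer buf = new StringBuffer(length);
        for (int i = 0; i < length; i++) {
            buf.append(in.readChar());
        }
        return buf.toString();
    }
}
```

For text that is mostly ASCII, the writeUTF() form is the more compact of the two, since each such character takes one byte instead of two.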

Basic Data Mappings

The standard data manipulation classes make basic data mapping easy. Writing data to a record store is simply a matter of combining a DataOutputStream with a ByteArrayOutputStream and storing the resulting byte array:
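
A minimal sketch of this combination follows. The ScoreRecord class and its three-field layout are hypothetical, and the RecordStore call appears only in a comment so that the example also compiles outside a MIDP environment.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical record layout: player name, score, finished flag.
class ScoreRecord {
    // Packs the fields into a single byte array suitable for one record.
    static byte[] toRecord(String player, int score, boolean finished)
            throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        DataOutputStream dout = new DataOutputStream(bout);
        dout.writeUTF(player);
        dout.writeInt(score);
        dout.writeBoolean(finished);
        dout.flush();
        byte[] data = bout.toByteArray();
        // In a MIDlet, the array would then be stored with a call such as
        // recordStore.addRecord(data, 0, data.length);
        return data;
    }

    // Reads the score back, skipping the name that precedes it.
    static int readScore(byte[] data) throws IOException {
        DataInputStream din = new DataInputStream(new ByteArrayInputStream(data));
        din.readUTF();        // the player name comes first
        return din.readInt(); // then the score
    }
}
```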

If the class whose data you want to store is not modifiable, as is the case with the standard Vector and Hashtable classes, you'll need to write a helper class. For example, here's a helper class that maps a list of non-null strings to a byte array:
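
One way such a helper might be written is sketched below. The StringListHelper name and the layout (an int count followed by each string in writeUTF() form) are assumptions made for this example; the list is held in a Vector whose elements are all non-null String objects.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Vector;

// Hypothetical helper; assumes every element is a non-null String.
class StringListHelper {
    // Writes the element count, then each string in writeUTF() form.
    static byte[] toByteArray(Vector list) throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        DataOutputStream dout = new DataOutputStream(bout);
        dout.writeInt(list.size());
        for (int i = 0; i < list.size(); i++) {
            dout.writeUTF((String) list.elementAt(i));
        }
        dout.flush();
        return bout.toByteArray();
    }

    // Rebuilds the list by reading the count and then that many strings.
    static Vector fromByteArray(byte[] data) throws IOException {
        DataInputStream din = new DataInputStream(new ByteArrayInputStream(data));
        int count = din.readInt();
        Vector list = new Vector(count);
        for (int i = 0; i < count; i++) {
            list.addElement(din.readUTF());
        }
        return list;
    }
}
```

Storing the count first means the reader never has to guess where the list ends, at a cost of four extra bytes per record.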

Of course, what you're really doing with all this coding is developing an object serialization framework. A complete framework is beyond the scope of this discussion, but the following issues deserve careful consideration when you're making objects persistent:

Object creation. CLDC doesn't include reflection APIs, so you'll need some way to re-create persistent objects. It could be a constructor that accepts a stream or a byte array, or a more complicated factory class.

Versioning. If an object's data layout may change, you'll want to store a number at the start of the byte array that identifies the version of the object that was stored.

Object references. If one object refers to another, the relationship must be maintained. In some cases, you can replace the object reference with an index or a key that can be used to locate the object referred to, after deserialization. Otherwise, you must store a complete object graph, not just a single object.
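
The first two of these issues can be illustrated with a small sketch. The VersionedNote class, its fields, and its two layout versions are entirely hypothetical; the point is only to show a version byte at the start of the array and a constructor that re-creates the object without reflection. Object references are not addressed here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical class with two layout versions:
//   version 1: text only
//   version 2: text followed by a creation time
class VersionedNote {
    static final int CURRENT_VERSION = 2;

    String text;
    long created; // added in version 2; 0 when restored from version 1 data

    VersionedNote(String text, long created) {
        this.text = text;
        this.created = created;
    }

    // Re-creates a note from stored bytes, honoring the version marker.
    VersionedNote(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int version = in.readUnsignedByte();
        if (version < 1 || version > CURRENT_VERSION) {
            throw new IOException("Unknown version " + version);
        }
        text = in.readUTF();
        created = (version >= 2) ? in.readLong() : 0L;
    }

    // Always writes the current layout, starting with its version number.
    byte[] toByteArray() throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bout);
        out.writeByte(CURRENT_VERSION);
        out.writeUTF(text);
        out.writeLong(created);
        out.flush();
        return bout.toByteArray();
    }
}
```

With this arrangement, records saved by an older release of an application remain readable after an upgrade, and they are rewritten in the newer layout the next time they are saved.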

You can avoid these issues or minimize their impact by saving and restoring only primitive Java data types and simple, self-contained objects. The fromByteArray() and toByteArray() methods shown so far are simple but effective means of making object persistence easy. You can also use these techniques to copy objects across a network: Once you have the object in byte-array form, it's a simple matter to send the array to another device or to an external server using a network connection, and to recreate the object at the other end. For example, the data classes you develop for your J2ME application can just as easily be used in a servlet developed for the Java 2 Platform, Enterprise Edition (J2EE).

Using Data Streams

The examples shown so far have dealt directly with byte arrays. Convenient as they are for saving and restoring a single set of data, dealing with raw byte arrays becomes cumbersome as soon as you want to store a sequence of data sets -- a list of objects, for example. A better approach is to separate the management of the byte array from the reading and writing of the data. For example, we can add these methods to our Contact class:
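
A sketch of what such methods might look like follows. The field names, the method names toDataStream() and fromDataStream(), and the writeAll() and readAll() helpers are all assumptions made for this example, and every field is assumed to be non-null.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical Contact class; all fields are assumed to be non-null.
class Contact {
    String firstName;
    String lastName;
    String phone;

    Contact() {
    }

    Contact(String firstName, String lastName, String phone) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.phone = phone;
    }

    // Writes this contact's fields to a stream the caller manages.
    void toDataStream(DataOutputStream out) throws IOException {
        out.writeUTF(firstName);
        out.writeUTF(lastName);
        out.writeUTF(phone);
    }

    // Reads the fields back in the same order they were written.
    void fromDataStream(DataInputStream in) throws IOException {
        firstName = in.readUTF();
        lastName = in.readUTF();
        phone = in.readUTF();
    }

    // Stores a whole list in one byte array: a count, then each contact.
    static byte[] writeAll(Contact[] contacts) throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bout);
        out.writeInt(contacts.length);
        for (int i = 0; i < contacts.length; i++) {
            contacts[i].toDataStream(out);
        }
        out.flush();
        return bout.toByteArray();
    }

    // Restores the list written by writeAll().
    static Contact[] readAll(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        Contact[] contacts = new Contact[in.readInt()];
        for (int i = 0; i < contacts.length; i++) {
            contacts[i] = new Contact();
            contacts[i].fromDataStream(in);
        }
        return contacts;
    }
}
```

Because the stream-based methods never touch a byte array themselves, the same code can write to a record, to a network connection, or to any other stream the caller supplies.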

Centralize the reading and writing of the data to ensure that it is always consistent.

What's Next

In this part of the series, you've learned some basic data-mapping techniques. In Part 3 I'll show you a more sophisticated approach to managing persistent data, one that enables your applications to store and retrieve objects composed of multiple fields of varying types.

About the Author: Eric Giguere is a software developer for iAnywhere Solutions, a subsidiary of Sybase, where he works on Java technologies for handheld and wireless computing. He holds BMath and MMath degrees in Computer Science from the University of Waterloo and has written extensively on computing topics.
