Summary

Persistence is the capability of an application to store the state of objects and recover it when necessary. This article compares the two common types of serialization in aspects of data access, readability, and runtime cost. A ready-to-use code snippet using BinaryFormatter with simple encryption is provided.

Introduction

I was amazed the first time I read the .NET documentation about serialization. Prior to the .NET era, it was a big headache to deal with configuration data. You would have to write large code pieces to stream the data out to a file and then parse the long strings again to find out the proper data to read back. When playing with serialization, I was hoping to create a complete cache of the application and restore it just like nowadays Windows system “Hibernation” feature. Although the reality is always far from the imagination, .NET serialization is still very useful in caching “part” of an application – the data objects.

.NET Framework provides two types of serialization: shallow serialization, and deep serialization, represented by XmlSerializer in System.Xml.Serializationnamespace andBinaryFormatter in System.Runtime.Serialization.Formatters.Binarynamespace,respectively. The differences between the two types are obvious: the former is designed to save and load objects in human-readable XML format, and the latter provides compact binary encoding either for storage or for network streaming. The .NET Framework also includes the abstract FORMATTERS class that can be used as a base class for custom formatters. We will focus on XmlSerializer and BinaryFormatter in this article.

XmlSerializer Basics

There are three projects in the attached package. The first one XMLSerializerSample shows some typical scenario that XmlSerializer could be applied. In file SampleClasses.cs, three sample classes are defined:

The Main program routine simply serializes out the instance of each class to a file and reads it back sequentially, plus an array object to test the performance on bulk data. I tag the test case numbers in both the source code and the article. You can perform the tests yourself if you'd like. Simple guidelines are in the source code, which illustrate the basic elements of a Software Test Document (STD).

There must be a default constructor. Normally the compiler will generate one if none explicit constructor is present. But sometimes we could create a parameterized constructor but forget to add a default constructor. Then the serialization would be “accidentally” disabled. (Test Case 8)

String manipulation is very expensive, and storage in text format is huge (Test Case 9):

The workaround for making a build-in type serializable (Test Case 10):

The Dictionary type is also supported, with a little more cost (Test Case 17).

Basically, you don’t need to worry too much about your data types, just put SerializableAttribute on your class. Then you can achieve persistency by saving the object wherever it needs to. For the types that cannot be persisted properly, you can either put NonSerializedAttribute on the data member for the serializer to ignore it, or implement ISerializable interface to make it serializable.

Example of Use

From the above experiments, we can see that it’s natural to favor BinaryFormatter over XmlSerializer. Even for configuration settings, it is recommend to modify the data through user interface, rather than directly touching the data in the output files. The third project in the attached package provides two more helper function to save and load data without encryption.

Another attached application, TCPaint, uses exactly the same code to persist the size and location of the form as well as other configuration setting data such as MRU (most recent used files). The unlimited steps of undo and redo actions are also saved using this technique. A user can always rewind and modify their drawings as a set of individual objects rather than as a bitmap image.

In the end, using serialization properly can save you a lot of time and headaches.