Sunday, August 8, 2010

Serialization Performance on Windows Phone 7

On a recent project of mine I needed to download a large file, do some massaging of the data, and then write it to the device so it could quickly be accessed again. The data was in tree form and was loaded into memory to populate a POCO object graph.

I put this off a little, since I knew it shouldn’t be that difficult: use the DataContractSerializer to serialize the object graph to a stream created from Isolated Storage. Something like:

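A minimal sketch of that idea (TreeNode here is a hypothetical stand-in for my actual POCO):

```csharp
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;

[DataContract]
public class TreeNode
{
    [DataMember] public string Name { get; set; }
    [DataMember] public List<TreeNode> Children { get; set; }
}

public static class TreeStorage
{
    // Serialize the object graph to any writable stream.
    public static void Save(TreeNode root, Stream stream)
    {
        var serializer = new DataContractSerializer(typeof(TreeNode));
        serializer.WriteObject(stream, root);
    }

    // Read the graph back from a stream.
    public static TreeNode Load(Stream stream)
    {
        var serializer = new DataContractSerializer(typeof(TreeNode));
        return (TreeNode)serializer.ReadObject(stream);
    }
}
```

On the phone, the stream would come from Isolated Storage, e.g. `IsolatedStorageFile.GetUserStoreForApplication().CreateFile("tree.dat")`.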
I put in the code and started to test. Performance was terrible: my app would start, I clicked “Break All” in Visual Studio, and found it was stuck in my new serialization code. And this was on my i7 laptop, so it certainly wasn’t going to work once I put it on the device. My instance had a large number of arrays, each holding a considerable number of points; I used List&lt;T&gt; for the arrays. Next I took those out and tried again. At least it finished, but it took about 8 seconds — again, on the multi-core laptop — so this wasn’t going to work on the device. Well, maybe: once I downloaded and processed the arrays, I figured I could start a background task to write the data. I could see some problems with that, but it might be the best I could do. OK, let’s try to reconstitute that instance: 24 seconds?! Time for a different strategy.

My solution came in two parts. The first, which is specific to my application, was basically to do the serialization myself: turn each instance in the object graph, and the arrays of points, into an array of bytes, then just write those bytes to Isolated Storage.
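The point data lends itself to this: two floats pack into 8 bytes, so a whole array becomes one contiguous byte[]. A sketch, with Point as a hypothetical stand-in for my actual type:

```csharp
using System;
using System.Collections.Generic;

public struct Point
{
    public float X, Y;
    public Point(float x, float y) { X = x; Y = y; }
}

public static class PointCodec
{
    // Pack a list of points into a flat byte array: 4 bytes per
    // float, 8 bytes per point.
    public static byte[] PointsToBytes(IList<Point> points)
    {
        var bytes = new byte[points.Count * 8];
        for (int i = 0; i < points.Count; i++)
        {
            Buffer.BlockCopy(BitConverter.GetBytes(points[i].X), 0, bytes, i * 8, 4);
            Buffer.BlockCopy(BitConverter.GetBytes(points[i].Y), 0, bytes, i * 8 + 4, 4);
        }
        return bytes;
    }

    // Rebuild the point list from the flat byte array.
    public static List<Point> BytesToPoints(byte[] bytes)
    {
        var points = new List<Point>(bytes.Length / 8);
        for (int i = 0; i < bytes.Length; i += 8)
        {
            points.Add(new Point(
                BitConverter.ToSingle(bytes, i),
                BitConverter.ToSingle(bytes, i + 4)));
        }
        return points;
    }
}
```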

First I wrote the code to turn those objects into byte arrays and then recreate the instances. This performed extremely well. Next I needed to write those byte arrays to Isolated Storage, and this was slow, relatively speaking. My process was to scan the object graph, grab each byte array, and write it to the Isolated Storage stream. Not knowing how the storage worked internally, I suspected lots of small writes might be the problem. So I tried writing the byte arrays to a memory stream first, then reading the memory stream in big chunks (100 KB) and writing those to Isolated Storage. So basically we are doing fewer larger writes instead of lots of smaller writes. This worked awesome: I got my serialization process down from taking longer than I cared to wait to a little over a second.
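The batched write can be sketched like this (the copy works on any Stream; on the phone the target would be the Isolated Storage stream):

```csharp
using System.IO;

public static class ChunkedCopy
{
    const int ChunkSize = 100 * 1024;   // 100 KB per write

    // Drain the in-memory buffer into the target stream using a
    // small number of large writes instead of many small ones.
    public static void CopyInChunks(MemoryStream source, Stream target)
    {
        source.Position = 0;
        var buffer = new byte[ChunkSize];
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            target.Write(buffer, 0, read);
    }
}
```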

For deserialization I basically reversed the process: read all the bytes in big chunks (again 100 KB) from Isolated Storage into a memory stream, then read the memory stream in the right-sized chunks to re-populate my instances. I’m still not 100% happy with the read time, but most of it is spent creating the instances and populating the data.
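The bulk read is the mirror image of the write; ReadAll here is a hypothetical helper name:

```csharp
using System.IO;

public static class ChunkedReader
{
    const int ChunkSize = 100 * 1024;   // 100 KB per read

    // Pull the whole file into memory with a few large reads; the
    // caller then picks records out of the MemoryStream at the
    // sizes it knows from the object graph's structure.
    public static MemoryStream ReadAll(Stream source)
    {
        var memory = new MemoryStream();
        var buffer = new byte[ChunkSize];
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            memory.Write(buffer, 0, read);
        memory.Position = 0;
        return memory;
    }
}
```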