Several times I have needed a small utility for this, for when a customer has some XML and wants to know whether it is even well-formed. The check for well-formedness can be done by simply opening the file in Internet Explorer, but that is just soooo slow for large files. I once tried a 70MB XML file: it took IE a couple of hours to open it and tell me what was wrong with the XML. My utility did it in a few seconds.
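To give an idea of how such a check can be done without loading the whole file, here is a minimal sketch in Python. It is not the utility described above; the function name `check_well_formed` is just an illustrative choice, and it relies on the streaming SAX parser from the Python standard library.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


def check_well_formed(source):
    """Return None if the XML is well-formed, otherwise a short error message.

    `source` may be a file name or a file-like object. The SAX parser streams
    the input, so a large document is never held in memory as a whole.
    """
    try:
        # The base ContentHandler ignores all content; we only care about errors.
        xml.sax.parse(source, ContentHandler())
    except xml.sax.SAXParseException as exc:
        return (f"line {exc.getLineNumber()}, "
                f"column {exc.getColumnNumber()}: {exc.getMessage()}")
    return None


if __name__ == "__main__":
    good = io.BytesIO(b"<orders><order id='1'/></orders>")
    bad = io.BytesIO(b"<orders><order id='1'></orders>")
    print(check_well_formed(good))  # None means no problem was found
    print(check_well_formed(bad))   # position and description of the first error
```

Note that this only checks well-formedness. Validating against a schema is a separate step that this sketch does not attempt.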

At Logica we often participate in different events where employees compete to see who rides their bike to work most often, who walks the most steps in a month, and so on. After a month of competition, we end up with a spreadsheet, where I may have walked 180.000 steps, but my colleague Henrik only walked 78.000 steps (he is kind of a wimp).

So let's say that we want to give a prize to one of us, and Henrik should have a chance of 78000/(78000+180000) (30,23%) of winning and I should have a chance of 180000/(78000+180000) (69,77%) of winning. As the number of points and the number of contestants get bigger, this becomes increasingly difficult to manage.
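The arithmetic above generalises to any number of contestants: each contestant's chance is their points divided by the total. A small sketch in Python, where the function name `chances` and the contestant labels are only illustrative:

```python
def chances(points):
    """Map each contestant to points / total points."""
    total = sum(points.values())
    return {name: p / total for name, p in points.items()}


if __name__ == "__main__":
    steps = {"me": 180000, "Henrik": 78000}
    for name, chance in chances(steps).items():
        print(f"{name}: {chance:.2%}")
    # me: 69.77%
    # Henrik: 30.23%
```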

Therefore, I have written a small WinForms program that helps you manage this. You can add as many contestants as you like, and give them points. If you are only interested in a "normal" draw, you can just give all contestants one point.

Screen shot:

The program not only does the draw, it will also:

Give you an overview of the contestants, their points and their chance of winning, which is dynamically updated each time a contestant is added

Give you the opportunity to simulate any number of draws, to ensure that the program is random. When doing the simulation, the percentage of wins by each contestant is shown next to the chance of winning, so they can be compared.
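The draw and the simulation can be sketched in a few lines of Python. This is not the code of the WinForms program, only an illustration of the idea under the assumption that a weighted random choice is used; the names `draw` and `simulate` are made up for the example.

```python
import random
from collections import Counter


def draw(points, rng=random):
    """Pick one winner, with probability proportional to each contestant's points."""
    names = list(points)
    weights = [points[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def simulate(points, draws, seed=None):
    """Run `draws` independent draws and return each contestant's share of wins."""
    rng = random.Random(seed)
    wins = Counter(draw(points, rng) for _ in range(draws))
    return {name: wins[name] / draws for name in points}


if __name__ == "__main__":
    steps = {"me": 180000, "Henrik": 78000}
    print("Winner:", draw(steps))
    for name, share in simulate(steps, draws=100000).items():
        print(f"{name}: won {share:.2%} of the simulated draws")
```

With a large number of simulated draws, the share of wins should end up close to the calculated chance of winning, which is exactly the comparison the program shows.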

I am using http://www.last.fm to keep track of what I listen to, and to get inspired to listen to some new music that I didn't know I liked.

You can find my profile at http://www.last.fm/user/eliasen and you can find information about the excellent Danish band Baal at http://www.last.fm/music/Baal. Please note, though, that I listen to the Danish band Baal, and not the Japanese band that unfortunately shares the name.

Anyway, the point of this blog post is that it seems I am currently the top listener of Baal:

This means that I listen to more Baal than anyone else who listens to either the Danish or the Japanese band...

The setup

The setup uses a simple spreadsheet and a schema for this spreadsheet (both are described in my previous post). It basically just has a FILE Receive Location and a send port with a filter that takes everything from the Receive Port the Receive Location belongs to. My aim is to see how fast the Spread Disassembler is.

The host machine is a Hewlett Packard 8710w laptop with an Intel Core Duo T7700 2,4GHz CPU, 2GB of RAM and Windows XP Professional Service Pack 3, completely updated as of 7 December 2008.

The guest system is a virtual machine which has one 2,4GHz CPU, 1GB of RAM and Microsoft Windows Server 2003 R2 Enterprise Edition SP2, also completely updated as of 7 December 2008.

The test

I created 999 copies of the same spreadsheet and moved them into a folder watched by the receive location. They were read, transformed into XML, and written to the output folder in 3 minutes and 19 seconds. This is an average of about 5 spreadsheets per second.

This took me by surprise - I had expected it to be faster. So I decided to do things more academically than just looking at the time stamps of the output files. After all, there is PLENTY of functionality that could be the time consumer. So I created a BAM Activity and View, tracking when my Receive Port starts and when it ends.

A table showing the average processing time can be seen here:

Number of messages in test    Average processing time per message    Messages per minute
5                             0,0227 seconds                         2643
63                            0,0337 seconds                         1780
127                           0,0584 seconds                         1027
1966                          0,2714 seconds                         221
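The "messages per minute" figures follow directly from the average processing time: 60 seconds divided by the seconds spent per message. A small Python sketch of that conversion, using the measured averages from the table (the function name is only illustrative, and the calculation assumes messages are processed one after another):

```python
def messages_per_minute(seconds_per_message):
    """Convert an average processing time per message into a per-minute rate."""
    return 60 / seconds_per_message


if __name__ == "__main__":
    # (number of messages in test, average seconds per message)
    measurements = [(5, 0.0227), (63, 0.0337), (127, 0.0584), (1966, 0.2714)]
    for count, average in measurements:
        rate = messages_per_minute(average)
        print(f"{count:>5} messages: {rate:.0f} per minute, {rate / 60:.1f} per second")
```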

So it is pretty clear that performance drops drastically when the load increases. I do not blame this on the Spread Disassembler, though. Since this is a virtual PC, with SQL Server on the same box as BizTalk, the I/O operations needed to write all the output files to the hard drive conflict with the I/O operations of BizTalk using the MessageBox. I find this a much more likely cause of the drop in performance than the disassembler getting slower just because more messages come in.

So, to sum up, it seems that the Spread Disassembler can take a pretty heavy load: up to 2643 messages per minute (44 messages per second). This is under less than ideal operating and hardware conditions, but optimal conditions with regard to the BizTalk Server not doing anything else at the time.

Maybe in a later post I will take a look at more complex spreadsheets/schemas and also test the performance of the assembler.