Moving ScrumWall to the Windows Azure Platform (Part 1: the data)

New Signature have developed a Silverlight version of Srumwall. Learn about moving ScrumWall to Azure. Part 1 centres on data

Here at New Signature, we have been developing a Silverlight version of a ScrumWall for a while now. We are advocates of both Scrum and XP for project delivery.

We are also keenly using the Azure Services Platform, most notably with our Wikipedia explorer, which was also ported to Azure. There are some interesting experiences from this project that we will be sharing soon.

As we have an enthusiastic interest in Silverlight, Azure and Scrum, we decided to put these together and port our Silverlight ScrumWall to Azure. ScrumWall is backed by a SQL Server data store; this is accessed by the Silverlight UI via WCF services. The enforced separation of the data access and the user interface by the use of a WCF service layer provides an excellent design. Additionally, instead of exposing the data “objects” (I.e. LINQ or Entity Framework generated classes), a set of DTO (data transfer object) are used to provide a better separation between the Silverlight UI and the data store behind the WCF service.

This allows the data store to be replaced independently and ensures that the WCF service interface remains unchanged. With this design in place, all attention can be focused on creating an Azure data store to replace the SQL Server database.

We chose to use Windows Azure tables to store the information for ScrumWall as conceptually, at least, it is fairly well suited to a “big table” type of storage. There is no simple migration or porting approach that can be used to move a relational data store to Windows Azure table storage. A completely fresh approach is required. Here we describe how the Windows Azure table storage approach relates to a SQL Server approach as a way of highlighting the differences.

The information required for a Scrum wall breaks down into 3 key parts:

The users

The wall itself, e.g. information about the project

The notes on a wall

This data is fairly loosely coupled. The UI is also designed in such a way as to allow the updating of each of the parts of this information to be done separately. Therefore, these can be happily separated into different Azure tables.

Taking the example of “the note on a wall”, the screenshot below shows notes on a fictitious ScrumWall:

A story note (blue) has a number of “child” task notes (green). A story note also has an owner (which is one of the users) and additional properties omitted for clarity. Modelling this in a relational database is fairly simple; a possible SQL Server implementation using 3 tables is shown.

With Windows Azure a different approach is required. First, we only use a single table. This contains all the information about a note, including the parent-child type relationship and owner. Second, the correct selection of the partition key is vital. The partition key provides the mechanism by which Windows Azure can scale the data store by potentially separating the physical store of data across different devices. Therefore, retrieving rows in multiple partitions can be slow. We have selected the partition to be a unique identifier for a project. Third, the row key is also fundamental. The row key is a unique identifier for a note (either of a story or note). The below shows of how this could be represented in an Azure table.

ProjectId

(PartitionKey)

NoteId

(RowKey)

NoteType

NoteText

ParentStory

Owner

OwnerUserId

….

123

1

Story

Story 1

(null)

Charles Bond

1

…

123

2

Task

Task 1

1

Charles Bond

1

…

123

3

Task

Blimey!

1

Charles Bond

1

…

123

4

Story

Story 2

(null)

John Thomas

2

…

123

5

Task

task for story 2

4

(null)

(null)

…

…

…

…

…

…

…

…

…

124

11

Story

Another project

(null)

(null)

(null)

…

Simplifying the data store by flattening the relationship between story and note does require additional effect by the other layers of the application to rebuild this relationship. There is also significant duplication of data by storing the name of the owner with each note. But this is a reasonable trade off, since comparatively a user does not change their name often.

There were other challenges porting our SQL Server data model to Windows Azure table storage. Windows Azure does not support transactions, joins nor does it currently provide a secondary index (in addition to the partition key/row key). Therefore, performing a lookup from a many-to-many relationship would only be using the index in one direction. A solution is to duplicate the data and store in denormalised by both the left and right hand sides of the many-to-many relationship.

In conclusion, moving the data store for ScrumWall from SQL Server to Windows Azure table store was relatively straight forward. While, the approach used to denormalise the data results in duplication. This is manageable by changes to the way in which updates are performed.