Sync Framework synchronizes data on a table-by-table basis, but my entities are normalized across sets of related parent-child tables. This creates problems for my application where a parent row might appear on the server to be processed, but the child rows might not appear for a few seconds. If there is a connection problem between my client app and the server, the child rows might not appear for some time.

How can I design my application to handle child tables being synced separately from the parent tables?

The specific scenario I'm looking at is receiving work orders on a server from a backend system to be distributed to engineers in the field using tablet PCs or PDAs. These work orders are large, complex entities that might cover half a dozen tables. The engineers complete their work, sync the results, and the server returns the completed work order to the backend system.

Sounds like that could work, but how would it scale? Would the Sync Framework still sync all the Orders, then all the Order Details? What if I have 100 new Orders each with 100 items? If there's an error on the 99th, is it going to roll back everything?
– Rory MacLeod Feb 22 '10 at 18:38

This answer doesn't address the inherent problems in syncing table-by-table: What if I have to sync 20 related tables? Do I sync the whole database in one transaction? If I put multiple tables in one transaction, I am making the transaction run for a lot longer. If it times out, I won't be able to sync anything. -1.
– Rory MacLeod Mar 3 '10 at 18:35

When building applications that you intend to have synchronization in, you have to take account of that when designing the application's data, by the very nature of the reason you want to do sync: i.e., you are disconnected at some time, or a link (say a 3G link) fails at some time. If you design the application to need to have 20 tables all in step, then yes, you must by design have a long-running transaction, and, as you say, if it fails then you need to roll it all back. A way out is to build a log-file type system where you write changes, say by a trigger, to a single table, then sync that..
– JohnnyJP Mar 3 '10 at 23:41

..and apply the changes locally. But these are really workarounds for a design that isn't built in a manner that lends itself to being disconnected. While synchronization gets you eventual consistency, it does mean that not everything is always up to date. The Sync Framework allows you to group things as I described, but if that group is so large as to make it impractical, then a look at the design of the data model is really what's needed..
– JohnnyJP Mar 3 '10 at 23:46

Despite once downvoting this answer, I now think it's better than any of my other ideas for pushing water uphill. The application should be designed around what Sync Framework will and will not do. Use batching to support larger numbers of rows, but try to limit the dependencies between tables.
– Rory MacLeod Feb 16 '11 at 12:58

Design the application so that it doesn't matter if data appears at different times. The app will display or operate on whatever data is available at the time. If more data shows up later, it will display that too.

Pros:

This could be a flexible and robust way of dealing with data and doesn't depend on a lot of complicated synchronization code.

Cons:

It could be confusing to the user if they think they're looking at a complete task, or work order, or whatever, but extra pieces appear later on.

If data is synced from a user to the server to be sent on to some other backend system, that system might not support partial submissions.
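As a rough illustration of this option, the client can treat every entity as provisionally incomplete and simply render whatever rows have arrived so far; when more child rows sync in, the UI re-renders. This is a hypothetical sketch (the `WorkOrder` model and its fields are made up, not part of Sync Framework):

```python
from dataclasses import dataclass, field

@dataclass
class WorkOrder:
    """Hypothetical client-side model: the parent row syncs first and
    child line items may trickle in during later sync sessions."""
    order_id: int
    items: list = field(default_factory=list)

    def display(self) -> str:
        # Render whatever has synced so far; more rows may appear later,
        # at which point the UI simply shows the extra data too.
        return f"Order {self.order_id}: {len(self.items)} item(s)"

order = WorkOrder(order_id=42)
print(order.display())              # usable immediately with partial data
order.items.append("replace pump")  # a child row arrives in a later sync
print(order.display())
```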

Denormalize everything. Create a database view that flattens the related tables into a single joined-up result set, or just store your data in one big table in the first place. Configure Sync Framework to sync that one table, batching it by the "left-most" key in the view, which should be the primary key of the root table in the hierarchy. Each transaction on the client now consists of all the changes made to a single entity.

Pros:

Can be implemented entirely in the database.

No need to replace any parts of the Sync Framework.

Might scale fairly well to large numbers of rows so long as you were careful about how the view was constructed and filtered, and how the underlying tables were indexed.

Cons:

Throwing out database normalization could be considered bad.

Might not scale well to large numbers of tables and columns, requiring lots of joins.
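A minimal sketch of the flattened view, using an in-memory SQLite database as a stand-in for the real server schema (the `Orders`/`OrderDetails` names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY, Engineer TEXT);
    CREATE TABLE OrderDetails (
        DetailId INTEGER PRIMARY KEY,
        OrderId  INTEGER REFERENCES Orders(OrderId),
        Task     TEXT);

    -- The "left-most" key (Orders.OrderId) leads the view, so batching
    -- on it keeps all of one entity's rows in the same sync batch.
    CREATE VIEW OrderSync AS
        SELECT o.OrderId, o.Engineer, d.DetailId, d.Task
        FROM Orders o
        LEFT JOIN OrderDetails d ON d.OrderId = o.OrderId;

    INSERT INTO Orders VALUES (1, 'Smith');
    INSERT INTO OrderDetails VALUES (10, 1, 'inspect'), (11, 1, 'repair');
""")
rows = conn.execute(
    "SELECT OrderId, DetailId FROM OrderSync ORDER BY DetailId").fetchall()
print(rows)  # parent columns repeat per child row: the denormalized shape
```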

I could customize Sync Framework to make it respect database relationships. When a custom SyncAdapter comes across a row with changes, it could traverse the child relationships in the database schema to pick up any changes in related rows. These changes would all be added to the same dataset and synced as a single transaction.

Pros:

This seems like the best solution from the perspective of data integrity. I can be sure that a particular entity either contains all of the available changes from a client, or none of them.

I don't need to change my entities or describe them in any special way to the custom adapter - all the information it needs can be derived from the database relationships I already have.

I don't need to do anything special with my database schema - I can pretty much point my code at any database and it will just work.

Cons:

Customizing the Sync Framework this way could be a lot of work and would require detailed knowledge of the framework's internals.

The custom adapter would need to detect and handle circular database relationships.
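The traversal itself is simple to sketch in isolation. Here the schema is modelled as a plain dict of parent table to child tables (a real adapter would read this from the database's foreign-key metadata), with a visited set guarding against circular relationships; the table names are invented for the example:

```python
def collect_related(schema, root):
    """Return all tables reachable from `root` via child relationships,
    visiting each table once even if the schema contains cycles."""
    seen, order = set(), []

    def walk(table):
        if table in seen:   # cycle or diamond: already handled
            return
        seen.add(table)
        order.append(table)
        for child in schema.get(table, []):
            walk(child)

    walk(root)
    return order

# Hypothetical schema with a circular reference back to WorkOrders.
schema = {
    "WorkOrders": ["OrderDetails", "Attachments"],
    "OrderDetails": ["Parts"],
    "Parts": ["WorkOrders"],   # cycle: must not recurse forever
}
print(collect_related(schema, "WorkOrders"))
```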

What about something with checksums? Every time the application makes a change to the entity, it calculates a hash based on the latest contents of the entity and saves it in the parent row. When the application reads the entity it can recalculate the hash. If the data that's available at the time doesn't match the hash stored with the entity, the application knows that there's more synchronization to be done.

Pros:

Could be a fairly straightforward change to the application's domain model that doesn't involve changing Sync Framework's internals.

Cons:

The application would need to read all the rows that relate to the entity into memory every time it makes a change.

This would get a lot more complicated if the application had to support updates to the same entity coming from multiple clients.

Need to carefully plan what changes get synced in each direction and when the corresponding hash is calculated. Depending on your data, you might need to sync the same tables several times.

Bespoke to one application; you couldn't take the same code and apply it to something else.
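A sketch of the hash check, assuming the entity's rows can be serialized deterministically (the row contents and field names here are made up for illustration):

```python
import hashlib
import json

def entity_hash(parent_row, child_rows):
    # Deterministic serialization: sort the child rows and use stable
    # key ordering so the same data always yields the same digest.
    payload = json.dumps(
        {"parent": parent_row, "children": sorted(child_rows)},
        sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

# Writer side: store the digest on the parent row after every change.
parent = {"order_id": 7, "status": "open"}
children = [["fit seal"], ["check pressure"]]
stored = entity_hash(parent, children)

# Reader side: recompute from whatever rows have synced so far.
partial = [["fit seal"]]                        # one child row missing
assert entity_hash(parent, partial) != stored   # more sync to come
assert entity_hash(parent, children) == stored  # entity is complete
```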