Tuesday, 13 April 2010

T-SQL Tuesday #005 – Offloading Reporting (#TSQL2sDay)

Aaron Nelson (Blog|@SQLvariant) is hosting this month's #TSQL2sDay with the theme of reporting. My post looks at some of the things I have done in the past to offload reporting to another server, to reduce the overhead of running reports against production OLTP databases.
A few years ago I was implementing an HA solution in SQL Server 2000, using clustering to protect against node failure and log shipping to maintain a second, off-site copy of the database. The organisation also wanted a reporting and BI solution so that management information could be made available on demand for selected reports, and so that a data warehouse could be loaded. Although we copied the logs over to the standby server, they were only restored once a day, so the log shipped DB could be up to a day behind. In the event of a failover, at most one day's worth of logs would need to be applied before the database could be used, and a day's latency was acceptable to the report writers and the load process.
The OLTP solution was a 24 hour service with roughly a 50-50 split between reads and writes. Because of the nature of the business, it was not easy to tell peak from off-peak time, so we could not simply schedule the data mart load or the resource intensive reports for quiet periods. Instead we made the log shipped database read-only and pointed the report writers' read intensive queries and the load process at that copy, so they did not interfere with the clustered production database.
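For anyone who has not set this up, the read-only state comes from how each log backup is restored on the standby. The Python sketch below only builds the T-SQL text for such a restore; it is not the exact command we ran at the time. The function name, database name and file paths are all hypothetical.

```python
def build_standby_restore(database: str, log_backup: str, undo_file: str) -> str:
    """Return T-SQL that restores one log backup and leaves the database read-only.

    WITH STANDBY rolls back uncommitted transactions into an undo file, so the
    database is readable but later log backups can still be applied.
    """
    def quote_literal(value: str) -> str:
        # Double any single quotes so the value is safe inside N'...'
        return value.replace("'", "''")

    def quote_name(value: str) -> str:
        # Double any closing brackets so the value is safe inside [...]
        return value.replace("]", "]]")

    return (
        f"RESTORE LOG [{quote_name(database)}]\n"
        f"FROM DISK = N'{quote_literal(log_backup)}'\n"
        f"WITH STANDBY = N'{quote_literal(undo_file)}';"
    )


if __name__ == "__main__":
    # Hypothetical database name and paths, for illustration only.
    print(build_standby_restore(
        "Sales",
        r"D:\LogShip\Sales_20100413.trn",
        r"D:\LogShip\Sales_undo.dat",
    ))
```

A restore needs exclusive access to the database, so report connections have to be closed while the logs are applied. Restoring once a day keeps that interruption to a single window.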
This worked well and definitely helped spread the load. In later versions of SQL Server I may well have made use of Database Mirroring to kill two birds with one stone. I would still have needed to cluster the instance, but I could have used database mirroring and database snapshots to maintain a second, up-to-date off-site copy of the database for reporting purposes, and also to allow automatic failover should the entire cluster become unavailable. Setting that up is probably another post entirely though.
I guess the point I'm trying to make here is that, in my experience, if you have a database which has to satisfy the demands of multiple workloads and functions, there can be some benefit in moving the read intensive queries of reports and data loads to a different server.

5 comments:

Hi Gethyn - I am interested in using Mirroring and snapshots to offload reporting. The problem I have is that I would need to recreate the snapshot every hour to keep the data relevant. Replication is a no go because our ERP system does not allow it. I am concerned that there is a split second every hour when all reporting is unavailable - especially if the snapshot recreation occurs DURING a report being executed. Do you know of any way around this? I have considered using the SQL Native Client and naming a failover server but I am not sure if this will work mid-transaction. Hopefully there is an easier way. Thanks

Hi Rob, thanks for your comment. I'm not sure that there is a way around this. Obviously you are pointing the reports at the mirrored database snapshot, but when the snapshot is created, assuming you are dropping and recreating the same snapshot on the mirror, there will be a period, albeit short, of unavailability. You could also look at creating more than one snapshot on the mirror. That way reports will not be disrupted because they will be pointing at an existing, unaffected snapshot, and when the new snapshot is created you can then change the connection string in the report to point at the new snapshot.
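As a rough illustration of the multiple-snapshot idea, the Python sketch below builds the T-SQL for a timestamped snapshot and works out which older snapshots could be dropped. The naming scheme, the helper names, the `.ss` file extension and the single data file are all assumptions made for the example; a source database with several data files needs one `NAME`/`FILENAME` pair per file.

```python
from datetime import datetime


def snapshot_name(source_db: str, taken_at: datetime) -> str:
    # The timestamp suffix makes names sort in chronological order.
    return f"{source_db}_snap_{taken_at:%Y%m%d_%H%M}"


def create_snapshot_sql(source_db: str, logical_file: str,
                        snapshot_dir: str, taken_at: datetime) -> str:
    """Build CREATE DATABASE ... AS SNAPSHOT OF for a single-data-file database."""
    name = snapshot_name(source_db, taken_at)
    return (
        f"CREATE DATABASE [{name}]\n"
        f"ON (NAME = [{logical_file}], FILENAME = N'{snapshot_dir}\\{name}.ss')\n"
        f"AS SNAPSHOT OF [{source_db}];"
    )


def snapshots_to_drop(existing: list, keep: int) -> list:
    """Return every snapshot name except the newest `keep`, oldest first."""
    ordered = sorted(existing)
    return ordered[:-keep] if keep > 0 else ordered


if __name__ == "__main__":
    # Hypothetical names and paths, for illustration only.
    now = datetime(2010, 4, 13, 11, 0)
    print(create_snapshot_sql("Sales", "Sales_Data", r"E:\Snapshots", now))
    current = ["Sales_snap_20100413_0900", "Sales_snap_20100413_1000",
               "Sales_snap_20100413_1100"]
    for old in snapshots_to_drop(current, keep=2):
        print(f"DROP DATABASE [{old}];")
```

Keeping the previous snapshot alongside the new one gives reports that are already running a chance to finish before it is dropped. New report executions still have to be pointed at the newest name somehow.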

Hi Gethyn, thanks for the reply. I have considered this but unfortunately our main reporting tool is Crystal Reports and there is no way to dynamically change the connection string. I could programmatically update a system DSN to point at the new database but of course this will only work for new report executions. Reports currently in process would be cut off as soon as the snapshot object was destroyed. Sadly we have many very large reports which can take over 5 minutes to run and export to Excel, so they could realistically hit this hourly "refresh" period. Until Microsoft improve this snapshot function and enable an online recreation of the snapshot, I think I must carry on with the old method of data warehousing my report data first. Thanks again.