Unit-Testing T-SQL Projects

Recently, I've been tasked with maintaining software that is driven by T-SQL. Unfortunately, these applications did not have sufficient documentation describing what the software does and what its expected behaviors are. Thus, I was thinking of a low-cost and simple way of introducing unit-testing so that I can have a better understanding of what's going on in the software and to provide a means of stating my assumptions on what the expected behaviors are.

In this post, I describe in further detail my motivation for introducing unit-testing to T-SQL projects. Then I briefly describe my approaches in preparing the application for testing and the process of testing itself. Finally, I introduce some next steps that supports continuous delivery with verified software states.

There are several test and mock frameworks out there but for the scope of this post, I don't mention any in particular. I assume the reader is also familiar with the C# language although most of this post is more about the process rather than specifics.

Motivation

Unit tests have been seen as an essential aspect of software verification since the 1970s. So it's clear that the software engineering community agrees that unit-testing is very important. Unfortunately, in my experience, testing is often neglected due to perceived inefficiencies. The consequence of neglecting tests is painfully obvious when it comes to maintaining software projects.

Documentation

First of all, I believe that unit tests are a great way of documenting your expectations of the software. Who needs to write up thorough documentation when you can simply open up a test fixture and see what the expectations are? If something like a thorough functional document is required as a deliverable, then a technical writer can go through the test fixtures and easily understand what the expected behaviors of each functions are.

Cost

The cost of neglected tests is more obvious when it comes to maintenance. Defects occur and hunting them down is a lot more resource-heavy if there are no stated expectations on what the software does because developers tasked to fix the defect are now forced to learn what the expectations were.

Tracing through code is very time-consuming especially if you have no idea where the error is localized. A test suite doesn't need to have full coverage to be helpful in finding errors although one should attempt to cover all cases. If a perceived defect is encountered, then one can look at the stated assumptions and see what was missed, then create a test case for it, and locate the error by tracing that unit of code.

A T-SQL-Driven Application

To me, a T-SQL-driven application is one that relies heavily on T-SQL stored procedures and functions. The application itself is pretty much a user interface between the T-SQL functions and the user. Thus, the application can be in any form from class libraries to APIs to websites and there are two obvious components of the software: the back-end T-SQL system and the front-end system.

Changes in the T-SQL system can potentially break the front-end since assumptions may have changed. So here is where unit-testing comes into play. If we had unit tests for the front-end components that ultimately invokes the T-SQL functions and the functionality changes, then the tests should pick up the discrepancies and most likely fail, thus alerting developers that the system change isn't complete.

Prepare for testing

To be clear, I am not advocating test-driven development in this post. However, there are some paradigms to keep in mind when developing software with unit-testing in mind. I believe that these paradigms will result in a more compartmentalized software solution that is easier to test and debug.

Divide and Conquer

No, not the algorithm archetype but the architecture of the software. In order for unit-testing to work, you need to be able to test on a single unit of the software. Be especially wary of external resources since during testing, you will likely not have access to these resources.

For example, consider connection strings to databases in a .NET application. I commonly see these accessed via the application's config file in the following manner:

A T-SQL Unit

Each programmability entity in a database can be seen as a unit so they can be tested. The front-end tests ensures that the front-end's communication with the database is intact and that assumptions are still valid whereas the T-SQL tests can be viewed an ensuring that the entities are all in agreement of the data model.

An all-too-often pain point I've experienced with database projects are instances where there is a slight mismatch between what the function believes is the table definition and the actual table definition. Continuous integration in conjunction with good unit test coverage can help with catching these situations before they go unnoticed for too long.

Writing the Tests

When it comes to writing software, I like to write tests for all changes I introduce as it is a way to document what my expectations are. It is better to build your test suite iteratively one unit at a time than to attempt to play catch-up later and sweep through your project to cover as much as you can in one go.

Linq to SQL

I simply use Linq to connect my tests with the target database containing the T-SQL functionalities to test. I believe that the front-end application should be using some form of object layer to interface with the database. That way, if the database changes, it will be known, especially when there are tests involved.

I like to create a database mapping file (DBML) per database connection to keep things simple. By default, the DBML uses dynamic types but you can specify what concrete type is returned from a T-SQL function/procedure call, which makes your entities type-safe. To do so from the designer, simply access the properties of the function in question and explicitly select a Return Type:

Solution Structure

This may simply be personal preference but I like to keep everything in a single solution including the tests like so:

The DBML simply lives in the test project. Then the classes that accesses the data context are within the same namepace.

Beyond the Front-End

With Linq, you can now test your T-SQL functions and procedures that are otherwise not called by your front-end application. You can even test scheduled procedures!

Simply prep your test by creating the necessary entities and submit them to the database via Linq.

Next Steps

With a decent version control system in place and a continuous integration solution that at least builds and run tests in an autonomous manner, you will catch breaking changes made to database definitions. Any time a breaking change is introduced, your software is in a state where it must not be deployed because clearly expectations changed or there is an error present in the software.

This cycle of submitting changes and blocking promotion when a breaking change is introduced is the essence of continuous delivery. You make iterative changes and each change is verified via a pipeline of tests. The frontline is the automated unit tests as they are quick to detect breaking changes. The idea is detecting errors early and keeping maintenance costs low with continuous feedback on the state of your software.

I believe all T-SQL projects need to be treated in a similar fashion. While enterprise databases may be tightly-coupled with internal applications, it would be ideal to know what state your entities are in and whether they are compatible with each other in terms of expectations. Find errors early before they linger in the organization unnoticed for too long.

I believe all software projects need to go through the process of stating expectations and keeping a record of changes in some manner. It is often the case in my experience that internal applications are seen as one-off projects but that is almost never the case because stakeholders will likely want to go back and make this one-off project a permanent one.

In short, you need to understand what state your software is in. Your software expectations can never be a mystery because maintaining a large mystery box in the future is very expensive. Build your test suite iteratively so that providing good coverage doesn't become a chore. All changes should be accompanied by a test stating your expectations.

I believe the above will streamline the software development process and keep maintenance costs low.

@Brian: Sorry, I think the first segment you quoted was poorly written so allow me to clarify:

If we had unit tests for the T-SQL functionality that the front-end ultimately invokes, then the discrepancies would be picked up...

The point I was trying to make with this post was to make sure that all expectations are stated. If your application is dependent on T-SQL functionality, then you need to state what its expectations are. In this post, I provided one method to do so.

I agree with your comment but I think it is beyond the scope of this post.

Changes in the T-SQL system can potentially break the front-end since assumptions may have changed. So here is where unit-testing comes into play. If we had unit tests for the front-end components that ultimately invokes the T-SQL functions and the functionality changes, then the tests should pick up the discrepancies and most likely fail, thus alerting developers that the system change isn't complete.

then later..

I believe that the front-end application should be using some form of object layer to interface with the database.

These statements seem contradictory - I agree with the second bit - use an ORM here and you can split out the back-end munging tests from the front-end business logic tests. Then you can simply implement stub implementations to make writing test cases pretty trivial. So by extra work I mean why not use an ORM for simpler code and test cases rather than writing a bunch of complicated ones.