Wednesday, 9 January 2013

Using a SQL script to reset data in database integration tests

Automated database tests need to be isolated and repeatable. It’s a lot simpler if each test begins with the database in a known state. The simplest way to do this is to delete unwanted data from the database before each test. As one of my colleagues likes to say: “We need to flatten the database”.

On a recent project, I went through a number of options for resetting the database, similar to those outlined by Jimmy Bogard in Isolating database data in integration tests. I also spent some time attempting to query and delete test objects via features of our ORM before realising that it wasn’t really set up to help me delete a complex graph of inter-related objects within a single unit of work.

Based on this experience, I am pleased to share a top tip: Direct SQL is the fastest and simplest mechanism for resetting your database and is probably the one that you should try first.

So how do we generate and run our SQL script? Jimmy describes a way of calculating the dependencies between tables at run time in some C# test infrastructure code and executing a bunch of “delete from xxx” statements. Nice. An automated solution like this will even accommodate changes to the database schema, er, automatically. However, there are some drawbacks. Diagnostics / analysis options are limited. If a schema / mapping change broke the assumptions in the code and made it impossible to generate the delete scripts in the correct order, we’d probably need to fire up the debugger and step through the code to work out which constraints were preventing the dependency order from being established.

Another option is to develop a SQL script yourself and execute it when preparing to run each test. It’s very simple and works as follows:

This approach has a couple of advantages:
  • It is clear at-a-glance how the database is being reset
  • You may need to go beyond several “delete from [table]” statements and get down and write some bespoke logic. Maybe you need to preserve a couple of records in a certain table. Perhaps there is one table in your 100 table system where you need to drop constraints, delete the data and re-create the constraints. A raw SQL script allows for manual intervention.
The main drawback, of course, is that you will need to update the script as the structure of your application's database changes, but you probably won't find that too annoying in practice.

Note that we don't have to hand-code our SQL from the ground up. See the next post for details of a TSQL script that you can run directly in SQL Server to help generate your reset script.

1 comment:

  1. I am too a database controller and I really liked your post related to my interests. Also, share some guidelines for controlling database through the web. Thanks.