Over the last couple of months I have on several occasions found myself in need of test data for demos, blogposts etc. You all know AdventureWorks, which will work for a lot of demoing stuff. But the amount of data is limited, and if you are demoing stuff about performance, AdventureWorks just isn’t that good.

I started looking around for tools, and thought I would have a look at TPC. My goal was not to be able to compare performance and transaction loads, but simply to generate data sets a bit bigger than AdventureWorks. So I found the TPC-H DBGen. In this blog post I will show you step by step how to use that tool to generate a database with a set of tables with data volumes large enough to be able to demonstrate performance tuning stuff.

2. Build the solution

Now open C:\tpch_2_14_3\dbgen\tpch.sln in Visual Studio. Depending on your Visual Studio version, you might be faced with a conversion wizard. Just click Finish to execute the conversion. All you need to do, is to build the entire solution. I had some errors because of some locked files, so I had to manually delete all files from the C:\tpch_2_14_3\dbgen\Debug folder before I could compile the solution. The result is the file C:\tpch_2_14_3\dbgen\Debug\dbgen.exe.

3. Generate data using dbgen.exe

Now we need to execution dbgen.exe. If we execute the command with –h we get some help:

If we simply run dbgen.exe, it default to generating 1 GB of data, divided into 8 different tables (customers, nation, lineitem, orders, parts, partsupp, region, supplier). The –s parameter specifies a scale factor, so –s 10 gives us 10GB, and –s 100 generates 100GB of data. Let’s just try the default:

Whoops, we got an error! Why this is, I’m not sure, but I found the solution to be simple: Copy the file dbgen.exe one level up, so it is located in the C:\tpch_2_14_3\dbgen folder, and try again:

The –v gives verbose output. Now it generates files for each table. Depending on the speed of your system, this may take a few minutes.

The resulting files will be located in the same directory as dbgen.exe. The list of generated files is:

4. Create database and tables

The next we need to do, is create an empty database, and create the tables.

CREATE DATABASE DemoData
GO

The schema we need, is available in the file C:\tpch_2_14_3\dbgen\dss.ddl and it looks like this:

Using this guide we can easily create 1, 10 or hundreds of GB of data, giving us the base we need to truly demonstrate the effect of performance tuning. This demo database will be used in later blog posts about query analyzing and performance optimizing. At some point I will also look into using the TPC tools for testing the transactional capacity of different systems, to be able to see the effect of configuration changes, hardware upgrades and other things we might wanna change to improve a system.