Ted has been fascinated with gaming ever since he first played Hangman on the Commodore PET. He's written software in dozens of languages and can explain "skip" microcode for the 78HC10, but strangely enough has never written a line of 6502 assembler. He most recently helped to build EEDAR, the business intelligence and analytics company for the gaming industry.

SQL Server: High performance inserts

In order to write high performance SQL code, we need to know more about what's going on under the hood. We won't go into exhaustive detail on internals, but we will dig into a simple challenge and show how you can build robust scalability one feature at a time.

Revision One: A Bad Design

I’ll present, first of all, the kind of database design we may all have seen at one time or another. Here’s a table that’s not really optimized for massive data storage; it’s ugly but workable:

CREATE TABLE item_drops_rev1 (
    -- This is the reference to the Items system ID number; it's not a foreign key
    -- because the "Items" table is on a different database server
    item_id BIGINT,

    -- This is the internal monster class name, e.g. "Creature.Wolf" or "Player.NPC"
    monster_class VARCHAR(50),

    -- The Zone ID refers to a database on the same server, so we built it as a
    -- foreign key
    zone_id INT FOREIGN KEY REFERENCES world_zones(zone_id),

    -- The position and time where and when the kill happened
    xpos REAL,
    ypos REAL,
    kill_time datetime DEFAULT GETDATE()
)

We can probably live with this table structure. It’s not normalized, but it gets the job done. I’ll explain in a moment what we can do to improve on it.

On the other hand, for the client code, I'm going to show you the wrong way to get things done: I'll create a SQL injection vulnerability. Any web-facing application or service layer is at risk for SQL injection, and frankly it's easy to get lazy, overlook something, and only discover a month later that your server has been hacked. This is by far the worst kind of code you can ever write:
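Here is a sketch of that kind of code; the variable names (my_connection_string, item_id, monster_class, and so on) are illustrative, not taken from any real codebase:

```csharp
// DON'T DO THIS - the query text is assembled by concatenating raw user input
using (SqlConnection conn = new SqlConnection(my_connection_string))
{
    conn.Open();
    string sql = "INSERT INTO item_drops_rev1 (item_id, monster_class, zone_id, xpos, ypos) VALUES (" +
        item_id + ", '" + monster_class + "', " + zone_id + ", " + xpos + ", " + ypos + ")";
    using (SqlCommand cmd = new SqlCommand(sql, conn))
    {
        cmd.ExecuteNonQuery();
    }
}
```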

What's bad about it? Do you see how the SQL statement is being assembled by string concatenation? SQL statements are basically dynamic code. You would never let a random user type code into a web browser and execute it on your server, so don't do the same thing with your database. Because SQL as a language has lots of features for combining multiple statements, someone could craft a malicious monster_class string with terrible side effects. A simple attack might be "'; UPDATE players SET gold=gold+10000; --", but there are lots of subtler attacks too. SQL Server even has a feature called xp_cmdshell() that can allow a SQL injection vulnerability to turn into a sitewide security breach.

So don't ever assemble SQL statements from strings yourself; use parameterized commands instead. Note also the C# "using" statement: when the .NET execution environment exits a using() { } block, it automatically disposes of the resources obtained in its initialization, so the connection and command are cleaned up even if an exception is thrown. With those two points made, here's what good code looks like:
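A sketch of the parameterized version, reusing the table and column names from the CREATE TABLE above (the variables are again illustrative):

```csharp
using (SqlConnection conn = new SqlConnection(my_connection_string))
{
    conn.Open();
    using (SqlCommand cmd = new SqlCommand(
        "INSERT INTO item_drops_rev1 (item_id, monster_class, zone_id, xpos, ypos) " +
        "VALUES (@item_id, @monster_class, @zone_id, @xpos, @ypos)", conn))
    {
        // Parameter values are sent to the server separately from the SQL text,
        // so they are never parsed as SQL - injection is impossible by construction
        cmd.Parameters.AddWithValue("@item_id", item_id);
        cmd.Parameters.AddWithValue("@monster_class", monster_class);
        cmd.Parameters.AddWithValue("@zone_id", zone_id);
        cmd.Parameters.AddWithValue("@xpos", xpos);
        cmd.Parameters.AddWithValue("@ypos", ypos);
        cmd.ExecuteNonQuery();
    }
}
```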

Besides being easier to read, this approach guarantees that SQL injection will not be a problem. So this is where I’ll start.

This code, using SqlCommand with parameters, gets the job done and it's fast. Running a test on an old laptop I have lying around, I was able to insert 10,000 records in 7.768 seconds. This kind of performance is good enough for most companies; running on a proper server it would be 3-4 times faster, and everyone would be happy. For a business application this code is fine, easy to maintain, and reliable.

That said, we’re a gaming company, and what works for us is a bit different than the code used by an accounts payable system that processes a few tens of thousands of records per day. You’re probably going to put this code up in front of a web service that gets called every time a zone server in your game records a kill, so maybe 10,000 kills in 7-8 seconds isn’t good enough. Let’s see what we can find to make improvements.

Performance Limitations Of Revision One

In order to make this database insert process fast, let’s start looking at what happens when you execute this command. I’ll overlook all the internal stuff that we don’t have control over, and show you instead a list of a few problems this code suffers from:

Connection overhead. Opening a connection to the database is expensive, which is why the .NET client library uses connection pooling to retrieve a previously opened connection with an identical connection string. If you're not using C# for your client library, make certain that your environment provides pooling!

Network latency. Every command travels over the wire from your application server to the database and back, so it pays to put the two machines on a fast, dedicated network segment. You can configure a VLAN to do this, and most servers nowadays have lots of spare Ethernet ports.

Parsing. The SQL Server parser is so fast its timing is often negligible, but we can still cut down on its work by using stored procedures instead of straight SQL.

Locks. To maintain ACID compliance, SQL Server has to ensure that all database changes are serialized. It does this by using locks to ensure that database inserts don't conflict with each other. Most SQL Servers are multi-user machines: in addition to your "INSERT" statement, the SQL Server is probably handling dozens or hundreds of other clients at the same time. Maybe client A is inserting item drop records, while client B is running reports on those records, and client C is sending stats to the community manager in realtime.

Well, when SQL Server is doing all of these things, it generates locks to ensure that A, B, and C each receive sensible results. SQL Server guarantees that each statement is executed while the database is in a fully predictable state, and managing those locks takes time. Additionally, if SQL Server is just overloaded with too many clients, it doesn’t matter how well written your query is – it will be slow! You need to take a holistic look at the server’s load levels, and continue to monitor each and every client that connects to the server to see where optimizations can take place.

Constraints. Each constraint (a foreign key, a default, or a "check" clause) takes a non-zero amount of time to test, because SQL Server guarantees that every record inserted, updated, or deleted will continue to satisfy all constraints on the table. Do you really need foreign keys on large, high volume tables?

Storage costs. Disk space on a SQL Server you already own is cheap, certainly cheaper than a Hadoop cluster on Amazon EC2 that you keep paying for each month, but every byte you write still has to be stored somewhere.

Varchars. Varchars are essential tools, but they can also lead to lots of unexpected overhead. Each time you store a variable length column, SQL Server has to do more memory management to pack records together. Strings can easily consume hundreds of bytes of memory each. If you put a VARCHAR column in an index, SQL Server has to execute tons of O(string length) comparisons as it searches through the B-tree, whereas integer compare instructions are limited only by memory latency and processor frequency.

Normalization. Database architects wage a constant war between normalizing and denormalizing; don't ever let yourself get stuck on one side of the fence or the other.

After all this overhead, your query will return. In most cases, the query is so fast you won’t even realize all this overhead happened. However, there’s one more thing to keep in mind:

Disk IO. SQL Server will eventually write the data you've just inserted to disk. In SQL Server, data is first written to a transaction log, and the logged changes are later merged into the permanent database file in the background at a checkpoint. This happens asynchronously, so it won't delay your query directly; but each transaction that occurs must have its own lump of disk IO. You can reduce this overhead by giving SQL Server physically separate disks for transaction logs and main DB files. Of course, the best solution is to reduce the number of transactions you execute.

As you can see, there’s a lot to consider when optimizing SQL. The good news is that these limitations are manageable, provided you pay attention to them.

Revision Two: Free Improvements

Given what we've learned, let's rewrite the SQL insert command and see what benefit we can get. First, we can normalize our database by moving "monster_class" into its own table. We can then load the list of monster classes into our client application and use a hashtable to look up the ID number to insert directly into the database. That reduces the amount of data stored in each record, and it makes each record identical in size. It offloads some of the work that SQL Server has to perform and distributes it to the application tier.

With these improvements, we’ll see a reduction in data storage size per record:

The old vs new record layouts

These changes cut down the size of the record noticeably. The original record took between 41 and 91 bytes, probably averaging 60 or so bytes per record; the new record takes exactly 35 bytes. Since SQL Server stores records in 8K pages, you can now fit over 200 records per page, whereas the old system could only hold about 130. This increases the number of rows you can insert before SQL Server succumbs to memory pressure, it speeds up your inserts, and since each record is a fixed length, SQL Server spends less time on offset calculations when it scans through records in memory.

Next, let's eliminate our constraints. This should only be done once you've tested your code and know that removing the constraint won't hurt your data integrity. In the initial table we had two constraints: a foreign key and a default value. If we're not really worried about ensuring that each zone ID is rigidly correct, we can forgo the foreign key, which means each insert no longer has to check the world_zones table to ensure the ID exists. For a business application that kind of foreign key testing is valuable; but for my game I'll just write a test script that checks once per day that no bad zone IDs are being written.

Finally, I'll replace the custom SQL text with a stored procedure. This can potentially reduce parser overhead, although our program is simple enough that the parser overhead is low and the plan is highly likely to be cached anyway.

The revised table, stored procedure, and client program are now capable of inserting 10,000 records into my ancient laptop in about 6.552 seconds, a performance boost of roughly 15%. The code now looks like these snippets:
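A sketch of the revised schema and stored procedure, following the changes described above (names other than those in the rev1 listing are assumptions):

```sql
-- Monster classes normalized into their own lookup table; the client caches
-- this table in a hashtable and sends only the integer ID with each insert
CREATE TABLE monster_classes (
    monster_class_id INT IDENTITY(1,1) PRIMARY KEY,
    monster_class VARCHAR(50)
)

-- Fixed-width records: no VARCHAR column and no foreign key constraint
CREATE TABLE item_drops_rev2 (
    item_id BIGINT,
    monster_class_id INT,
    zone_id INT,
    xpos REAL,
    ypos REAL,
    kill_time DATETIME DEFAULT GETDATE()
)
GO

CREATE PROCEDURE insert_item_drops_rev2
    @item_id BIGINT,
    @monster_class_id INT,
    @zone_id INT,
    @xpos REAL,
    @ypos REAL
AS
BEGIN
    INSERT INTO item_drops_rev2 (item_id, monster_class_id, zone_id, xpos, ypos)
    VALUES (@item_id, @monster_class_id, @zone_id, @xpos, @ypos)
END
```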

This is a great set of simple improvements. However, there’s one more change you can make, and that’s the big one.

Revision Three: Lazy Writes

Think about your data requirements. Is it essential that every record be written to the database immediately? Can it wait a minute? Five minutes? What would the penalty be if a record was lost? How many records could you lose before your business failed?

For a bank, losing any monetary transactions can be life or death; but for a game you can define your own risk level. Let’s imagine that, for our monster drop table, we only want to risk losing at most five minutes’ worth of data. This gives us an incredible opportunity for performance improvement: we’ll keep a list of records to insert and batch-insert them once every five minutes! SQL Server offers us a great way to reduce overhead when we batch up work like this: the Transaction.

A transaction is basically an atomic lump of work. SQL Server guarantees that each transaction will either succeed and be written to the database permanently, or fail completely with no data written. This sounds complicated, but keep in mind that transactions also allow SQL Server to forgo lots of extra overhead: if you execute ten statements by themselves, you pay the lock-acquisition overhead ten times, but if you wrap them in a transaction you pay it only once.

The only downside to batching is that we can't use GETDATE() anymore, since records may be inserted a few minutes after they were generated. Instead we must hold each record's data in memory until the batch is written. The new code looks like this:

using (SqlConnection conn = new SqlConnection(my_connection_string))
{
    conn.Open();

    // This transaction tells SQL Server to obtain locks once and keep them
    // on hand until the transaction is committed
    SqlTransaction trans = conn.BeginTransaction();
    for (int i = 0; i < INSERT_SIZE; i++)
    {
        using (SqlCommand cmd = new SqlCommand("insert_item_drops_rev3", conn))
        {
            // Setting the command type ensures that SQL Server doesn't need to
            // do any complex parsing
            cmd.CommandType = CommandType.StoredProcedure;

            // Joining the transaction means that SQL Server doesn't have to
            // close and re-open locks
            cmd.Transaction = trans;
            cmd.Parameters.AddWithValue("@item_id", item_id[i]);
            cmd.Parameters.AddWithValue("@monster_class_id", monster_class_id[i]);
            cmd.Parameters.AddWithValue("@zone_id", zone_id[i]);
            cmd.Parameters.AddWithValue("@xpos", xpos[i]);
            cmd.Parameters.AddWithValue("@ypos", ypos[i]);
            cmd.Parameters.AddWithValue("@kill_time", kill_time[i]);
            cmd.ExecuteNonQuery();
        }
    }

    // This statement tells SQL Server to release all its locks and write the data
    // to disk. If your code had thrown an exception above this statement, SQL
    // Server would instead do a "rollback", which would undo all the work since
    // you began this transaction.
    trans.Commit();
}
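For reference, a sketch of what the insert_item_drops_rev3 stored procedure could look like, assuming the revision two table layout with kill_time now supplied by the client instead of defaulting to GETDATE():

```sql
CREATE PROCEDURE insert_item_drops_rev3
    @item_id BIGINT,
    @monster_class_id INT,
    @zone_id INT,
    @xpos REAL,
    @ypos REAL,
    @kill_time DATETIME
AS
BEGIN
    INSERT INTO item_drops_rev2 (item_id, monster_class_id, zone_id, xpos, ypos, kill_time)
    VALUES (@item_id, @monster_class_id, @zone_id, @xpos, @ypos, @kill_time)
END
```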

With this change, we're now able to insert 10,000 records in 2.605 seconds. That's about a 66% performance improvement over the original code, which is pretty major! Even more importantly, wrapping a sensible batch of work into a transaction significantly reduces database contention and can really alleviate concurrency bottlenecks. Of course, if we weren't batching these inserts together, adding a transaction would be a slight performance penalty; but whenever you group lots of commands into one transaction you cut down overhead.

Revision Four: Table Parameters

You can see that the above approach saved us a ton of time. However, to insert 10,000 records we're still contacting the database 10,000 times: each call to "cmd.ExecuteNonQuery" generates a round trip from your client application to the database. What if there were a way to insert all ten thousand records while contacting the database server only once?

The good news is that SQL Server 2008 introduced an incredible capability called "Table Parameters" (table-valued parameters). Table Parameters work by grouping tons of records together into a single parameter passed to a stored procedure or SQL statement. This essentially converts the overhead penalty from O(number of records) to O(number of batches). Additionally, by reducing the number of SQL commands being executed, you dramatically reduce database contention and improve performance for other programs.

Here’s the final insert code including table parameters. You may notice that I’ve removed the BeginTransaction() and Commit() calls – those only boost our performance when we’re doing more than one ExecuteNonQuery() command at a time. So here goes:
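A sketch of one way to write the table parameter version; the table type and procedure names here are assumptions, and the C# side passes a DataTable as a single structured parameter:

```sql
-- A user-defined table type matching the row layout, plus a procedure that
-- inserts the whole batch in one statement (table parameters arrive read-only)
CREATE TYPE item_drop_bulk_rev4 AS TABLE (
    item_id BIGINT,
    monster_class_id INT,
    zone_id INT,
    xpos REAL,
    ypos REAL,
    kill_time DATETIME
)
GO

CREATE PROCEDURE insert_item_drops_rev4
    @drops item_drop_bulk_rev4 READONLY
AS
BEGIN
    INSERT INTO item_drops_rev2 (item_id, monster_class_id, zone_id, xpos, ypos, kill_time)
    SELECT item_id, monster_class_id, zone_id, xpos, ypos, kill_time FROM @drops
END
```

```csharp
// Build an in-memory DataTable whose columns match the table type above
DataTable drops = new DataTable();
drops.Columns.Add("item_id", typeof(long));
drops.Columns.Add("monster_class_id", typeof(int));
drops.Columns.Add("zone_id", typeof(int));
drops.Columns.Add("xpos", typeof(float));
drops.Columns.Add("ypos", typeof(float));
drops.Columns.Add("kill_time", typeof(DateTime));
for (int i = 0; i < INSERT_SIZE; i++)
{
    drops.Rows.Add(item_id[i], monster_class_id[i], zone_id[i], xpos[i], ypos[i], kill_time[i]);
}

using (SqlConnection conn = new SqlConnection(my_connection_string))
{
    conn.Open();
    using (SqlCommand cmd = new SqlCommand("insert_item_drops_rev4", conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;

        // All 10,000 rows travel to the server inside this one parameter,
        // so the client contacts the database exactly once
        SqlParameter param = cmd.Parameters.AddWithValue("@drops", drops);
        param.SqlDbType = SqlDbType.Structured;
        cmd.ExecuteNonQuery();
    }
}
```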

And what does the performance of this logic look like for inserting 10,000 records? We’re now down to 0.218 seconds, an improvement of over 97% from our initial attempt.

Conclusions

Although this calculation isn’t perfect, I’m going to suggest the following estimates for the overhead we encountered in this test. Out of a total 97% speedup we obtained, I’m guessing the breakdown goes roughly as follows:
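Summarizing the timings measured above:

Revision 1 (parameterized single inserts): 7.768 seconds for 10,000 records
Revision 2 (normalization, no constraints, stored procedure): 6.552 seconds, roughly 15% faster
Revision 3 (batching inserts in one transaction): 2.605 seconds, roughly 66% faster than revision one
Revision 4 (table parameters, one round trip): 0.218 seconds, over 97% faster than revision one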

As you can see, you don't need to abandon SQL Server to get massive performance improvements. However, it's easy to forget about SQL Server performance because the technology is so effective. We use database wrappers and abstraction layers that speed our development time, but separate us from the fine details. If you want to get lots of high performance work out of SQL Server, use the same process you'd use when tuning C++ code: profile it, track the most frequently called functions, optimize the heck out of them, and finally rethink your work so you call them less frequently.

Finally let me pass some credit to Tom Gaulton and Rich Skorski who provided valuable feedback in the editing of this article.