SAN 101 for the DBA

As will become apparent from this post, I am no Storage Admin. My apologies for offending the sensibilities of those who know this topic better than I do!

I get asked occasionally about placing SQL Server data on SAN storage, and I've done it with a few systems, and a lot of smart people helping me, so here's a SAN 101 crib sheet for DBAs who may be new to this.

SAN Basics

SAN stands for Storage Area Network, like LAN or WAN. The first thing of note is that it's a network, and not a device. Often it's a fiber-optic network, for speed and bandwidth, but sometimes it can be a copper network. It's easy to lose sight of the fact that its a network, because it's not what we all think of as "the network," and because the storage admins are usually the only ones that play there. Hence the colloquial term "a SAN" referring to one storage device. One device on the network is really a Storage Array (SA?), not a Storage Array Network (SAN). Tomato, tom-AH-to, I suppose, but it's important in larger environments to talk about one array separately from the storage network itself.

Because it's a network, there's a network protocol for devices to communicate. Usually that is FCP for fiber or iSCSI for copper (with exceptions.) The protocols really send SCSI commands, the same kind many hard disk devices use within a single server. That means, for practical purposes, that devices a server "sees" on the SAN look and act like hard disks rather than network devices. There can be switches in a SAN, and so on, just like other networks.

Servers typically connect to a fiber SAN with one or two (maybe more) Host Bus Adapters (HBAs). An HBA just acts as an interpreter, making the devices visible on the storage network appear to the host bus, transparently, as if they were disks. So, if you are building such a setup, you'll have a server with a couple of HBAs, those are connected to a switch or two on the fiber network, which is then connected to one or more storage arrays. If a chunk of storage on one of the arrays is allocated for your server, then you see that storage space as if you'd stuck a hard disk in. Partition, format, write. Simple!

Rule 1: There is no Magic

Storage vendors do have a reputation (they are selling things, after all) to pitch their stuff in a way that makes it seem like all your performance worries will vanish in some kind of magic fiber-optic {poof}. Not so, I'm afraid. Enterprise arrays are good, but they are still subject to the laws of physics. The disks in an array are still disks, the cache is just a cache. This is good for the DBA, in a way, because if you know how to size direct attached storage by spindle count and IOPS, then - whooee! - it's not really that different. There are things that help performance some, like a huge write cache with write re-ordering and fault tolerance, but you don't have to throw out what you know and start over. Lots of random IO still means you need lots of random IO capacity, which still means disks.

Rule 2: Performance Costs More than Space

The fundamental mistake I have seen is to hand a SAN administrator (or your SAN vendor) a spec about how much space you need for your data. That is a recipe for disaster. What is important is what the performance profile is, and, just as with plain ol' disks, IOPS will determine what the hardware has to look like in an array. And you want to do that before you get the price quote, because the performance, not the space, will drive the cost. Ask for a terabyte? You might get one 1 TB SATA disk. That won't be happy at all.

Rule 3: Yes, Direct Attached Storage is Cheaper ... But

OK, so there's a whole economics problem to be aware of. SAN storage is expensive, sometimes very expensive. In some circumstances, it's still worth it, but the price tag is undeniable. Examples: if you have the same amount of storage (or in the SQL world, amount of storage performance) in direct attached disks, and in SAN storage, the SAN flavor is going to cost a lot more up front. It's pretty easy to spend $1,000 per disk for 15k fiber channel disk drives in an array.

However, with real IT in real organizations of any size, things are more complicated - there isn't one, unchanging and monolithic system. In fact, things, servers, storage requirements, circumstances, applications all change at a breakneck pace. What we do, most of the time, is try to accomodate change and keep things "up." That is life for many IT pros. So it is not cheaper to have piles and piles of unutilized or overspec'd direct attached storage sitting around, married to single servers. If you had to buy twice what you needed, where's the savings? When you buy into a SAN what you are buying is, or should be, flexibility. The flexibility to use it fully, all the time, and to change the storage allocation for all your servers without physically scrapping storage hardware and buying new. If you have a SAN, and it's not providing that value, then you probably wasted your money.

So, if you have one relatively static database server and you're looking at storage options, direct-attached might save you a pile of money. Don't build a SAN for performance around a couple of servers -- there's no compelling performance-per-dollar argument. On the other hand, if you find that you look around your enterprise and see tens or hundreds of servers where there's a huge waste factor for storage in some places, and full disks in others, there's your SAN opportunity. This server over here is low on disk space? OK, let's just add some. That one had too much, and we're wasting space? Take some away. Storage over here is too slow? Migrate it to a set if disks with more performance.

In other words:

Best case: one SQL Server or cluster with correct direct attached storage, fully utilized. Nice, but this is the real world!

Next best: a heterogeneous, changing environment that uses SAN storage to best advantage

Worse: a collection of over-spec'd DAS hardware that is under 50% utilized, but can't realistically be changed. It seemed cheap at the time, but there's so darn much of it!

Worst: a big, expensive, under-utilized SAN

Rule 4: You Need a Good Relationship with the SAN Admin

I've blogged about this before, but suffice it to say that bad communication with the SAN admin = FAIL. SQL Server often has unique and demanding IO requirements that don't go away just because you have a fancy array. You have to be able to work that out with the storage admins, if you have them, or the vendor, if you are in a smaller shop. Together you will have to talk through the need to separate logs, data and backups, and what the performance profile of each "virtual" disk system needs to be, backed by perf counter data, to prevent the SAN nightmare: "We spent our $5,000,000 and the VP wants to know why it's SLOW."