Oracle Blog

Databases & Applications with Sun Unified Storage Systems

Wednesday Sep 24, 2008

Recently I have been testing a few pieces of the Storage projects in OpenSolaris with PostgreSQL. One of the tests involves using an iSCSI disk with PostgreSQL. Unfortunately the storage available to me is in Colorado, while my PostgreSQL server is located in Massachusetts. Latency will definitely be one of my top problems, since the storage is halfway across the country. Plus the fact that I will be running a database server on my end doesn't really sound like a good idea. Come to think of it, this could be a common problem nowadays, since Cloud Storage (for example Amazon S3) could optimistically be halfway across the country and pessimistically be on the other side of the world.

So what are my options to solve such problems? ZFS in OpenSolaris 2008.05 has many new features, a couple of which can potentially help with my problem.

ZFS Separate Intent Log: Ability to place the ZFS Intent Log (ZIL) on a separate block device (or log writes on a fast device, in simple terms).

ZFS L2ARC: Ability to use a Level 2 Adaptive Replacement Cache which can be a block device (or cache reads on a device, in simple terms).

That's an interesting set of new features that I thought would be useful in my case: one to log writes separately, which can be on a fast disk, and another to use a fast disk for caching reads. Of course I am not the first to write on this topic, since these new features have been discussed at length, especially in connection with SSDs. But I plan to solve the latency problem of my Cloud Storage with these new ZFS features and some local disk partitions that I have in my system.

Many people make the analogy that, compared to regular 7,200 rpm SATA, 10,000 rpm SATA/SAS, or 15,000 rpm SAS drives, SSDs act like 40,000 rpm drives. Well, extending this to Cloud Storage, I think Cloud Storage is more like a 500 rpm to 1,000 rpm drive, depending on the phase of the moon and/or the stock market.

Anyway, to continue with my setup, I used an iSCSI disk exported from Colorado. I created a regular zpool on top of it on the server in Massachusetts and called it "colorado" as shown below:

# zpool create colorado c9t600144F048DAAA5E0000144FA6E7AC00d0
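To sanity-check the pool and watch its (very high) I/O latency, the standard status and iostat commands can be used; the pool name matches the one just created:

```shell
# Verify the pool came up with the single iSCSI device
zpool status colorado

# Watch read/write operations and bandwidth on the pool every 5 seconds
zpool iostat -v colorado 5
```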

Then I created a PostgreSQL database in /colorado/pgdata and started loading data into it using pgbench. It was painful to do this late in the day and then wait for it to finish. At this point I also wished that pgbench had a scale factor smaller than 1 (maybe it does, I don't know). Anyway, I did not have the patience to let it finish; I terminated the process after about 8 minutes, as that scenario was unacceptable.
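For reference, the load step looks roughly like this; the database name and scale factor are my guesses, since the exact invocation isn't recorded here:

```shell
# Create the cluster on the iSCSI-backed pool and start the server
initdb -D /colorado/pgdata
pg_ctl -D /colorado/pgdata start

# Initialize (-i) the pgbench tables at scale factor 1, the smallest allowed
createdb pgbench
pgbench -i -s 1 pgbench
```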

I destroyed that "colorado" pool. I referred to the Solaris ZFS Administration Guide for help with the updated syntax of these new features, and recreated the pool using a local disk partition for cache and another for log as follows:
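The commands look roughly like this; the local partition names (c1t0d0s0 and c1t0d0s1) are placeholders, not the actual devices used:

```shell
# Recreate the pool on the remote iSCSI LUN, with one local partition
# as the separate intent log and another as the L2ARC cache device
zpool create colorado c9t600144F048DAAA5E0000144FA6E7AC00d0 \
    log c1t0d0s0 \
    cache c1t0d0s1
```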

Not bad. Writes that earlier would have taken in excess of 8-10 minutes are now recorded within 4 seconds on the nonvolatile log/cache combination, with ZFS left to sync them up to the actual storage in the background.

So using these new features of ZFS in OpenSolaris 2008.05 can help hide the latency of low-cost Cloud Storage and actually make it usable, even for a database server.

Plus, I hear ZFS is also coming out with recovery options which will allow recovery not only with the log, but also without the separate log and cache devices being available. If your server dies and takes your ZIL disk with it, you can build another server and attach it to your cloud device to regain your data, even without the device to replay the ZIL. If you do have your ZIL log, you can use it to get the most current version. This is important for disaster recovery, where you are willing to take whatever you have and start the business again from that point.
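Those recovery options eventually surfaced in later ZFS releases as the ability to import a pool whose separate log device is missing; a sketch of that scenario (the pool name is from this post, the flag is from later `zpool` versions):

```shell
# On the rebuilt server: import the pool even though the separate log
# device is gone; -m tolerates the missing log device, at the cost of
# any ZIL records that were never synced to the main storage
zpool import -m colorado
```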