I'm interested in storage benchmarks for various configurations in order to figure out what's best for a virtualization environment. The virtualization environment will be Proxmox, which in my view is currently the most manageable virtualization platform with plenty of features.

I want to look at the following configuration options, which may have an impact on performance:

filesystem

lvm

thin provisioning

transparent compression

multi disk technology (technology, RAID level)

ssd caching

thin provisioning

Thin provisioning means presenting virtually unlimited space while backing it with physical space only for what is actually used. So you can define multiple TB of disk capacity and only have a 250 GB SSD behind it. If that backend device fills up, you can add more storage when you need it. It's especially helpful in the age of SSDs because they are still considerably more expensive, so you do not want to spend thousands of dollars on capacity you do not in fact need. Furthermore, there are big differences between SSD products. SSDs for desktop use may be quite cheap, but server SSDs that sustain heavy write loads are much more expensive.
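To make the overcommitment idea concrete, here is a small arithmetic sketch. The volume sizes and usage figures are made-up examples, and the function names are hypothetical; on a real system the fill level would come from the storage layer itself (for LVM thin pools, the Data% column of lvs).

```python
def overcommit_ratio(virtual_sizes_gib, physical_gib):
    """How many times the promised capacity exceeds the real capacity."""
    return sum(virtual_sizes_gib) / physical_gib


def pool_fill_percent(used_gib, physical_gib):
    """How full the physical backing device actually is."""
    return 100.0 * used_gib / physical_gib


# Hypothetical example: four 1 TiB thin volumes on a 250 GiB backing device.
virtual_volumes = [1024, 1024, 1024, 1024]
physical = 250
used_per_volume = [40, 25, 60, 10]  # GiB actually written, invented numbers

print(f"overcommit ratio: {overcommit_ratio(virtual_volumes, physical):.1f}x")
print(f"pool fill level:  {pool_fill_percent(sum(used_per_volume), physical):.1f}%")
```

The point to watch is the fill level, not the ratio: once the physical device is full, writes to the thin volumes fail, so more storage has to be added before that happens.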

Many filesystems have interesting features that are helpful beyond pure performance, as well as problems one would rather avoid:

PRO: zfs and btrfs have checksums and self-healing against data corruption.

PRO: zfs and lvm provide methods for thin provisioning.

PRO: ext4 is easy to use: a simple fire-and-forget filesystem.

PRO: btrfs has an enormous flexibility

PRO: lvm has the flexibility to change configurations without downtime

CON: ext3 has quite long filesystem check times.

...

transparent compression

Transparent compression is a layer which reduces the amount of data written to or read from the raw disk and thus may increase speed at the cost of CPU time.
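The trade-off can be estimated roughly as follows. This is only a sketch: zlib from the Python standard library stands in for whatever algorithm the filesystem actually uses, the sample data is invented, and the throughput estimate assumes the disk is the bottleneck and the CPU keeps up.

```python
import zlib


def compression_ratio(data: bytes, level: int = 1) -> float:
    """Uncompressed size divided by compressed size (higher is better)."""
    return len(data) / len(zlib.compress(data, level))


def effective_throughput(raw_mb_s: float, ratio: float) -> float:
    """Logical MB/s seen by the application if only the disk limits speed."""
    return raw_mb_s * ratio


# Invented sample: repetitive text such as log files compresses well.
sample = b"GET /index.html HTTP/1.1 200 OK\n" * 4096
ratio = compression_ratio(sample)
print(f"compression ratio: {ratio:.1f}x")
print(f"150 MB/s raw disk -> about {effective_throughput(150, ratio):.0f} MB/s logical")
```

Data that is already compressed (images, archives, video) gives a ratio close to 1, so the CPU time is spent for little or no gain.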

multi disk technology (technology, RAID level)

There are different multi disk technologies available: Linux software RAID, LVM, btrfs RAID and ZFS RAID. They combine the speed of multiple devices and, depending on the level, add redundancy to be able to cope with device failures without data loss.
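How redundancy survives a device failure can be illustrated with XOR parity, the basic idea behind single-parity layouts such as RAID 5. This is a toy sketch with invented four-byte "disks", not how any of the implementations named above are actually coded.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, one per disk (invented content).
data = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is the XOR of all data blocks.
parity = xor_blocks(data)

# Simulate losing the second disk: XOR of the survivors and the parity
# yields the missing block again.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

Because every write to a stripe also has to update the parity, parity layouts tend to pay for their redundancy on small writes.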

ssd caching

SSD caching can accelerate slower HDDs in two ways: by keeping frequently used data on the fast SSD as a read cache, or by writing data to the SSD first and syncing it to the slower hard disks in the background. The latter does not sacrifice data safety, because data written to the SSD is already persistent.

ceph - no option here

Ceph is a very interesting technology. I'm not considering it here, because the money needed to get it running with good performance is a lot higher than with plain disks and SSDs. You need at least 10 G networking, or even better, which is a lot more costly than 1 G. You need fully SSD-equipped storage, which is more expensive too. A big plus with Ceph is that you get redundant network storage, so you can immediately start virtual machines on other nodes if a compute node crashes. If money is no problem and maximum performance is not required, Ceph would be an excellent choice. I have a 3-node cluster with Ceph up and running here. It works like a charm. Administration is easy and performance is fine.

In the following posts, I'll introduce more of my environment and the benchmarking scripts.

The hard disks are SAS disks attached to the Adaptec RAID controller as single disks. One Intel SSD holds the OS filesystem. The other one is attached via a PCIe M.2 SSD adapter. An additional M.2 SSD will be attached later for tests with SSD caching.

For the tests I will make use of fio - flexible I/O tester - one of the currently most popular storage benchmarking tools.

My production scenario will be webhosting, so it will be 25% write and 75% read. I will probably test that later, after the basic read/write tests.
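A 75% read / 25% write mix could be expressed as a fio invocation roughly like the one built below. This is a sketch under assumptions: random I/O, the 4k block size, the job count, queue depth, runtime, job name and the target path are all placeholders and not taken from the actual test setup. The fio options themselves (--rw, --rwmixread, --bs, --ioengine, --direct, --numjobs, --iodepth, --runtime, --time_based, --group_reporting) are standard ones.

```python
import shlex


def fio_mixed_command(target, read_percent=75, block_size="4k",
                      jobs=4, iodepth=16, runtime_s=60):
    """Build a fio command line for a mixed random read/write test."""
    return [
        "fio",
        "--name=webhosting-mix",          # hypothetical job name
        f"--filename={target}",
        "--rw=randrw",                    # mixed random reads and writes
        f"--rwmixread={read_percent}",    # remainder is writes
        f"--bs={block_size}",
        "--ioengine=libaio",
        "--direct=1",                     # bypass the page cache
        f"--numjobs={jobs}",              # concurrent workers
        f"--iodepth={iodepth}",
        f"--runtime={runtime_s}",
        "--time_based",
        "--group_reporting",
    ]


# Placeholder device path; writing to a raw device destroys its contents.
cmd = fio_mixed_command("/dev/disk/by-id/EXAMPLE-DISK")
print(shlex.join(cmd))
```

The script only prints the command so it can be reviewed before anything is written to a disk.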

First I'm making sure the device names I use are fixed, so my tests will not overwrite the wrong disks. This may happen under Linux because storage devices have no fixed device names. The ordering may be different at every reboot. And it actually is, as I have noticed.

So I check the serial numbers and copy the device file names to unique names, which I will use from then on.
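One way to get names that survive reboots is to use the symlinks udev maintains under /dev/disk/by-id, which usually embed the model and serial number. The sketch below lists them with the kernel device each one currently points to. It is an alternative illustration and not the script used for these tests; the function name is hypothetical.

```python
import os
import re


def stable_names(by_id_dir="/dev/disk/by-id"):
    """Map persistent whole-disk names to their current kernel devices."""
    mapping = {}
    if not os.path.isdir(by_id_dir):
        return mapping  # directory missing, e.g. on a non-Linux system
    for name in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, name)
        if not os.path.islink(link):
            continue
        if re.search(r"-part\d+$", name):
            continue  # skip partition entries, keep whole disks only
        mapping[name] = os.path.realpath(link)
    return mapping


if __name__ == "__main__":
    for name, device in stable_names().items():
        print(f"{name} -> {device}")
```

Pointing the benchmark at the by-id path rather than at /dev/sdX means a changed ordering after reboot cannot redirect a write test to the wrong disk.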

Regarding partitions: I try to avoid them and use whole disks instead, as it makes the procedure simpler.

Would love to see a zfs test on that ratio.
It should shine with separate L2ARC devices on SSD once the cache gets warm.

If you intend to benchmark zfs as well, be sure to limit the ARC size in production scenarios, leaving <insert size> for large application allocations if required.
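On ZFS on Linux the ARC ceiling is set through the zfs_arc_max module parameter, given in bytes, typically via an options line in a file under /etc/modprobe.d. The sketch below only computes that line; the RAM and reservation figures are invented examples, and the right reservation depends on the workload.

```python
def arc_max_option(total_ram_gib: int, reserved_gib: int) -> str:
    """Return a modprobe options line capping the ZFS ARC.

    reserved_gib is the memory to keep free for VMs and applications.
    """
    arc_gib = total_ram_gib - reserved_gib
    if arc_gib <= 0:
        raise ValueError("reservation leaves no memory for the ARC")
    return f"options zfs zfs_arc_max={arc_gib * 1024 ** 3}"


# Hypothetical host: 64 GiB RAM, 48 GiB kept for guests and applications.
print(arc_max_option(64, 48))
```

The printed line would go into a file such as /etc/modprobe.d/zfs.conf and takes effect when the module is next loaded.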

For KVM and ZFS inside VM(s), more tuning will be required... I would not recommend it inside virtual machines, with additional layers on top of raw(s), qcow(s) or zvol(s).
Containers, on the other hand, work directly, so it should be interesting to see performance on LXC with a zpool configured with L2ARC and log devices.

AFAIK transparent compression with snapshots/clones etc. outside btrfs and zfs will be hard to find on Linux filesystems.

So it's XFS or EXT4 all the way, I'm afraid, with LVM inside hypervisors for flexibility.
Stripe it over those rust disks and explore LVM caching a bit (have not used it, but it's there).
You will have everything but transparent compression at your disposal.

I got some advice from a person who wrote his thesis on the subject of benchmarking:

benchmark with applications that resemble the ones used later.

test with concurrent I/O requests (if that's your scenario, and it almost always is).

test with small block sizes, as this will be the realistic workload for the storage in my case.

test with virtual machines and over the network, so the I/O resembles what the system sees in production.

I'm already testing with small block sizes. Concurrent jobs testing is running at the moment. Real-world testing will be done at some later point.

--- Post updated at 04:53 PM ---

1. Performance Base Line of the system

This benchmark is to demonstrate the actual speed of the system used. It's in no way relevant for the later workload and just serves to make sure the storage system is generally performing without major trouble.

Interesting: RAID-5 has slower write speeds than expected. zfs RAIDZ, which is similar in its data distribution, is considerably faster.

So I reckon that RAID3 will be slightly better than RAID5 (unless you're going to use RAID10 with a large number of members).

Why do you think that? I do not understand why RAID3 would perform better. As I understand it, it should be nearly the same. Only the parity goes to one dedicated disk, and that disk is used heavily.

EDIT: Ahh. I possibly understand. RAID3 uses byte-level striping instead of the block-level striping of RAID4/5, which may be faster to calculate? Unfortunately Linux software RAID does not support RAID3.

--- Post updated at 05:25 PM ---

2. First Insight: LVM does not seem to impact read/write throughput or IOPS performance

Check the numbers by comparing neighboring rows with and without LVM that otherwise have the same specs. The numbers of those pairs differ only very little. I'll still test and watch the LVM performance readings, but I'll not report them any more unless there is something worth mentioning.
