Over the last few weeks I’ve been getting acquainted with Splunk, a powerful tool for searching, analysing and visualising logs and events from your infrastructure, live application performance, and any other type of machine-generated data. I read the performance blog post that Splunk had previously published on physical bare-metal hardware and Amazon ECS instances, and wanted to see what I could get in a virtual environment on top of EMC’s scale-out block storage, ScaleIO (which I’ve written several posts on here).

Generally speaking, virtualising Splunk has been frowned upon because Splunk consumes a lot of resources, and more so as you add data ingestion and searches. Physical bare-metal servers have been the de facto standard for Splunk for years, but I still wanted to see what we could do with virtual instances. Here’s the setup:

4 Splunk 6.0 servers, configured in a VMware environment with 12 vCPUs and 12 GB RAM as is recommended in the Splunk Enterprise installation guide.
Each Splunk server has a ScaleIO volume attached to it for the entire /opt/splunk directory, containing the Splunk installation and all log and index files.
These ScaleIO volumes are running on top of EMC’s XtremSF PCIe Flash cards.

For the tests I used a standard tool for performance testing of Splunk, namely Splunkit. This tool can be used for generating a large log file, which can then be tested and indexed by Splunk itself.

To configure Splunkit like I did, edit the file called “pyro.properties” like this:

Then, create the log file by running the following command in the splunkit-server directory:

python bin/gendata.py

When the data has been generated, start the index test by running this command in the same directory:

python bin/indextest.py
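
The two steps above can also be chained in a small wrapper, so that the index test only starts if data generation succeeded. This is a minimal sketch, assuming it is run from the splunkit-server directory; the helper names `build_commands` and `run_all` are my own and not part of Splunkit, and the interpreter it defaults to may not be the one your Splunkit version expects.

```python
import subprocess
import sys

# The two Splunkit scripts named above, in the order they must run.
STEPS = ["bin/gendata.py", "bin/indextest.py"]


def build_commands(python_exe=sys.executable, steps=STEPS):
    """Return one command line per step (hypothetical helper)."""
    return [[python_exe, step] for step in steps]


def run_all(commands, cwd=".", runner=subprocess.run):
    """Run each command in order; check=True stops at the first failure."""
    for cmd in commands:
        runner(cmd, cwd=cwd, check=True)


if __name__ == "__main__":
    # Assumes the current directory is splunkit-server.
    run_all(build_commands())
```

Passing the runner in as a parameter keeps the wrapper easy to dry-run: swap in a function that only prints the commands before letting it loose on a real indexer.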

Now log in to your Splunk instance and go to the Splunk-on-Splunk tab, where you should see something like this:

That graph shows the current estimated indexing rate, which is always interesting (this one shows close to 30,000 KB/sec). But if you want to compare your indexing performance to other benchmarks, you can click the “View results” link to get to another search, and enter the following search term:

So what EPS (events per second) values did I get out of my virtualised Splunk Enterprise environment? Pretty good ones, I must say. Note that this is on shared ScaleIO scale-out block storage, not independent local drives in each server. It’s also one volume per server, not a volume striped across multiple virtual drives. So no LVM or anything like that, just regular ext4 filesystems without any tuning. Your basic server setup, so to say :)

| System | Splunk Version | Virtual Hardware | Average EPS |
|---|---|---|---|
| Splunk-Index1 | 6.0 | 12 vCPUs, 12 GB RAM | 86931 eps |
| Splunk-Index2 | 6.0 | 12 vCPUs, 12 GB RAM | 90242 eps |
| Splunk-Index3 | 6.0 | 12 vCPUs, 12 GB RAM | 87199 eps |
| Splunk-Index4 | 6.0 | 12 vCPUs, 12 GB RAM | 92792 eps |
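
To put the four results side by side, here is a small sketch that summarises them. The numbers are copied from the results above; the `summarize` helper is my own naming, and the spread is simply the gap between the fastest and slowest indexer.

```python
from statistics import mean

# Average EPS per indexer, copied from the results above.
EPS = {
    "Splunk-Index1": 86931,
    "Splunk-Index2": 90242,
    "Splunk-Index3": 87199,
    "Splunk-Index4": 92792,
}


def summarize(results):
    """Return mean, min, max and spread (max - min) of the EPS values."""
    values = list(results.values())
    return {
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "spread": max(values) - min(values),
    }


if __name__ == "__main__":
    for name, value in summarize(EPS).items():
        print(f"{name}: {value} eps")
```

The mean works out to 89,291 eps per indexer, with 5,861 eps between the slowest and the fastest server.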

So as you can see, we’re surpassing the performance numbers of the tests mentioned before, which is great! However, it will be even more interesting when we continue with massive log input and then add searches on top, to see whether we can maintain performance. According to the performance numbers we get from the ScaleIO environment (see below), we’re nowhere near saturating the disks right now, which hopefully means we can add the searches without a heavy impact on indexing performance.

Great test my friend, and going by the results in the other blog post you are beating dedicated hardware with RAID 0 and RAID 1+0 by a clear 10,000 events per second. Were there other results around average search, as per the Splunk performance post? I’m also interested to know whether it was 4 VMs across 4 physical hosts or whether there was some consolidation in there (4 VMs on 2 hosts, 4 VMs on 1 host).

Reblogged this on Ian Thompson's Technology Blog and commented:
Great way to do testing on Splunk. Glad to see EPS numbers on ScaleIO in the 80-90K region. Splunk doesn’t like to talk EPS, mostly because horizontal scaling will let you achieve EPS^n factor. Unfortunately, so many users are educated by appliance vendors that push hardware based on EPS alone. Ends up being a bigger issue than it really is. Numbers like this provide great ammunition when talking to people who don’t yet fully understand how Splunk works.