Here's another quick update to demonstrate what's possible with a single Graphite node running master (these Carbon and Graphite-Web commits, specifically). As you'll see in the results below, this configuration was able to achieve 300k datapoints per second.

This test was performed on a Packet type 3 server with its pair of NVMe flash drives striped into a single LVM volume. Installation of the Graphite stack was still performed using Synthesize v.2.4.1. To take advantage of the increased I/O capacity, I added more cache processes for a grand total of eight (8) relays and sixteen (16) caches. Five instances of Haggar ran concurrently on a separate Packet type 1 server in the same Parsippany, NJ datacenter.
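To give a rough idea of what fanning out to sixteen caches looks like, here is a minimal Python sketch that builds a `DESTINATIONS` value in the `host:port:instance` form that a carbon relay reads from `carbon.conf`. The host, the port numbering, and the instance letters are assumptions for illustration only; they are not taken from the configuration used in this test.

```python
import string


def cache_destinations(n_caches, host="127.0.0.1", base_port=2004, step=100):
    """Build a relay DESTINATIONS string for n_caches local cache instances.

    The host, base_port, and step defaults are hypothetical; use whatever
    pickle receiver ports your own cache instances actually listen on.
    """
    letters = string.ascii_lowercase
    if not 1 <= n_caches <= len(letters):
        raise ValueError("n_caches must be between 1 and 26")
    # One host:port:instance entry per cache, with instances named a, b, c, ...
    return ", ".join(
        f"{host}:{base_port + i * step}:{letters[i]}" for i in range(n_caches)
    )


# Sixteen caches, matching the count used in this test (ports are made up).
destinations = cache_destinations(16)
print("DESTINATIONS =", destinations)
```

Each relay would carry the same list, so any of the eight relays can route a given metric to the right cache.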

Hello, friends. Just wanted to follow up on the previous post with a quick update. As I've mentioned publicly, developing and cutting a new release from Graphite's master branch has become a personal and professional priority for me. And while I've become very familiar with much of the code base over the course of writing The Graphite Book and have thrown a lot of traffic at it over the last few months, I hadn't run any significant tests for performance regressions at scale (compared to the 0.9.15 release).

This round of tests used the same configuration and benchmarking processes as before. I neglected to mention this before, but every series of benchmarks started with installing Graphite using the Synthesize setup script. For the previous test I used Synthesize v.2.4.1 to install Graphite 0.9.15 on a 64-bit Ubuntu 14.04 LTS instance in Amazon's EC2 cloud. For this round I went with Synthesize v3.0.0RC2, which targets Graphite's master branch.

This is just a quick post to share some recent benchmarking results for a single Graphite 0.9.15 server. The host is a single EBS-optimized EC2 i2.4xlarge instance with a 400 GiB EBS Provisioned IOPS SSD (io1) volume and a requested maximum of 20k IOPS.

I'm not going to dive too deep into the results, but I'll point out that with the following configuration we were able to increase batch writes effectively, resulting in a peak of 38 points per update (pointsPerUpdate, averaged across all cache processes). This means that on average, caches were able to flush 38 datapoints from memory to disk with every write request.
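For anyone who wants to see the arithmetic behind that number, here is a small Python sketch. It assumes that points per update is simply the number of committed datapoints divided by the number of update operations in a reporting interval, and that the headline figure is the plain mean across cache processes. The counter values below are made up for illustration; they are not measurements from this benchmark.

```python
def points_per_update(committed_points, update_operations):
    """Datapoints written per write request over one reporting interval."""
    if update_operations == 0:
        return 0.0  # nothing was flushed in this interval
    return committed_points / update_operations


# Hypothetical (committed_points, update_operations) per cache instance.
samples = {
    "a": (3800, 100),
    "b": (4000, 100),
    "c": (3600, 100),
}

per_cache = {
    name: points_per_update(points, updates)
    for name, (points, updates) in samples.items()
}
average = sum(per_cache.values()) / len(per_cache)

print(per_cache)
print(f"average pointsPerUpdate: {average:.1f}")
```

The higher this ratio, the fewer write requests it takes to persist the same number of datapoints, which is why it is a handy proxy for how well the caches are batching.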