Posts Tagged 'Riak'

In December, I posted a MongoDB performance analysis that showed the quantitative benefits of using bare metal servers for MongoDB workloads. It should come as no surprise that in the wake of SoftLayer's Riak launch, we've got some similar data to share about running Riak on bare metal.

To run this test, we started by creating five-node clusters with Riak 1.3.1 on SoftLayer bare metal servers and on a popular competitor's public cloud instances. For the SoftLayer environment, we created these clusters using the Riak Solution Designer, so the nodes were all provisioned, configured and clustered for us automatically when we ordered them. For the public cloud virtual instance Riak cluster, each node was provisioned individually using a Riak image template and manually configured into a cluster after all had come online. To optimize for Riak performance, I made a few tweaks at the OS level of our servers (running CentOS 64-bit):

noatime
nodiratime
barrier=0
data=writeback
ulimit -n 65536

The common noatime and nodiratime mount options eliminate the need for writes during reads, which helps performance and reduces disk wear. The barrier and writeback settings are a little less common and may not be what you'd normally set. Although those settings present a very slight risk of data loss on disk failure, remember that the Riak solution is deployed in five-node rings with data redundantly available across multiple nodes in the ring. With that in mind, and considering that each node is also deployed with a RAID10 storage array, the minor risk of data loss on the failure of a single disk would have no impact on the data set as a whole (there are plenty of redundant copies of that data available). Given the minor risk involved, the performance increases from those two settings justify their use.
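If you want to confirm that a node actually picked up these mount options, a minimal Python sketch along the following lines could help. It parses text in the /proc/mounts layout and reports which of the desired options are not active. The device name, the /var/lib/riak mount point and the ext4 filesystem type in the sample are assumptions made for illustration, not details from our test setup, and some kernels report an option under a different spelling than the one you set, so treat a "missing" result as a prompt to look closer.

```python
def missing_mount_options(mounts_text, mount_point, wanted):
    """Return the wanted options that are not active on mount_point.

    mounts_text uses the /proc/mounts layout:
    device mount_point fstype options dump pass
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            active = set(fields[3].split(","))
            return [opt for opt in wanted if opt not in active]
    raise ValueError(f"{mount_point} not found in mounts data")


# Options from the list above. The device, mount point and filesystem
# type in SAMPLE are hypothetical.
WANTED = ["noatime", "nodiratime", "barrier=0", "data=writeback"]
SAMPLE = "/dev/sda3 /var/lib/riak ext4 rw,noatime,nodiratime,data=writeback 0 0\n"

print(missing_mount_options(SAMPLE, "/var/lib/riak", WANTED))
```

On a real Linux node you would pass in the contents of /proc/mounts instead of SAMPLE. In the sample line, barrier=0 is the only option that was left out, so it is the only one reported.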

With all of the nodes tweaked and configured into clusters, we set up Basho's test harness — Basho Bench — to remotely simulate load on the deployments. Basho Bench allows you to create a configurable test plan for a Riak cluster by configuring a number of workers to utilize a driver type to generate load. It comes packaged as an Erlang application with a config file example that you can alter to create the specifics for the concurrency, data set size, and duration of your tests. The results can be viewed as CSV data, and there is an optional graphics package that allows you to generate the graphs that I am posting in this blog. A simplified graphic of our test environment would look like this:

You may notice that in the test cases that use SoftLayer "Medium" Servers, the virtual provider nodes are running 26 virtual compute units against our dual proc hex-core servers (12 cores total). In testing with Riak, memory is more important to operations than CPU resources, so we provisioned the virtual instances to align with the 36GB of memory in each of the "Medium" SoftLayer servers. In the public cloud environment, the higher level of RAM was restricted to packages with higher CPU, so while the CPU counts differ, the RAM amounts are as close to even as we could make them.

One final "housekeeping" note before we dive into the results: The graphs below are pulled directly from the optional graphics package that displays Basho Bench results. You'll notice that the scale on the left-hand side of the graphs differs dramatically between the two environments, so a cursory look at the results might not tell the whole story. At the end of each test case, we'll share a few observations about the operations per second and latency results from each test. When we talk about latency in the "key observation" sections, we'll talk about the 99th percentile line: 99% of the results had latency below this line. More loosely, you could say, "This is about the highest latency we saw on this platform in this test, ignoring the slowest 1% of requests." We're focusing on this line primarily because it's much easier to read on the graphs than the mean/median lines in the bottom graphs.
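To make the difference between the 99th percentile, the mean and the median concrete, here is a small Python sketch using the nearest-rank definition of a percentile. The latency values are made up for illustration and are not taken from our test results, and Basho Bench may compute its percentiles differently.

```python
import math
import statistics


def nearest_rank_percentile(samples, pct):
    """Smallest sample with at least pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up latencies in ms: 985 fast requests, 13 slower ones, 2 outliers.
latencies = [2] * 985 + [40] * 13 + [900] * 2

print("mean  :", statistics.mean(latencies))
print("median:", statistics.median(latencies))
print("p99   :", nearest_rank_percentile(latencies, 99))
print("max   :", max(latencies))
```

In this made-up sample the median is 2ms and the 99th percentile is 40ms, while the single worst request took 900ms. The 99th percentile line tracks the slow tail of the distribution without being dominated by one or two extreme outliers.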

The SoftLayer environment showed much more consistency in operations per second with an average throughput around 450 Op/sec. The virtual environment throughput varied significantly, from about 50 operations per second to more than 600 operations per second, with the trend line fluctuating slightly between about 220 Op/sec and 350 Op/sec.

Comparing the latency of get and put requests, the 99th percentile of results in the SoftLayer environment stayed around 50ms for gets and under 200ms for puts while the same metric for the virtual environment hovered around 800ms in gets and 4000ms in puts. The scale of the graphs is drastically different, so if you aren't looking closely, you don't see how significantly the performance varies between the two.

Similar to the results of Test 1, the throughput numbers from the bare metal environment are more consistent (and are consistently higher) than the throughput results from the virtual instance environment. The SoftLayer environment performed between 1500 and 1750 operations per second on average while the virtual provider environment averaged around 1200 operations per second throughout the test.

The latency of get and put requests in Test 2 also paints a similar picture to Test 1. The 99th percentile of results in the SoftLayer environment stayed below 50ms for gets and under 400ms for puts while the same metric for the virtual environment averaged about 250ms in gets and over 1000ms in puts. Latency in a big data application can be a killer, so the results from the virtual provider might be setting off alarm bells in your head.

In Test 3, we're using the same specs in our virtual provider nodes, so the results for the virtual node environment are the same in Test 3 as they are in Test 2. In this test, the SoftLayer environment substitutes SSDs for the 15K SAS drives used in Test 2, and the throughput numbers show the impact of that improved I/O. The average throughput of the bare metal environment with SSDs is between 1750 and 2000 operations per second. Those numbers are slightly higher than the SoftLayer environment in Test 2, further distancing the bare metal results from the virtual provider results.

The latency of gets for the SoftLayer environment is very difficult to see in this graph because the latency was so low throughout the test. The 99th percentile of puts in the SoftLayer environment settled between 500ms and 625ms, which was a little higher than the bare metal results from Test 2 but still well below the latency from the virtual environment.

Summary

The results show that — similar to the majority of data-centric applications that we have tested — Riak has more consistent, better performing, and lower latency results when deployed onto bare metal instead of a cluster of public cloud instances. The stark differences in consistency of the results and the latency are noteworthy for developers looking to host their big data applications. We compared the 99th percentile of latency, but the mean/median results are worth checking out as well. Look at the mean and median results from the SoftLayer SSD Node environment: For gets, the mean latency was 2.5ms and the median was somewhere around 1ms. For puts, the mean was between 7.5ms and 11ms and the median was around 5ms. Those kinds of results are almost unbelievable (and that's why I've shared everything involved in completing this test so that you can try it yourself and see that there's no funny business going on).

It's commonly understood that the local, single-tenant resources of bare metal will always perform better than network storage resources, but with some concrete numbers on paper, the difference in performance is pretty amazing. Virtualizing on multi-tenant solutions with network attached storage often introduces latency issues, and performance will vary significantly depending on host load. These results may seem obvious, but sometimes the promise of quick and easy deployments on public cloud environments can lure even the sanest and most rational developer. Some applications are suited for public cloud, but big data isn't one of them: when you have data-centric apps that require extreme I/O traffic to your storage medium, nothing can beat local high performance resources.

In my Breaking Down 'Big Data' – Database Models, I briefly covered the most common database models, their strengths, and how they handle the CAP theorem — how a distributed storage system balances demands of consistency and availability while maintaining partition tolerance. Here's what I said about Dynamo-inspired databases:

What They Do: Distributed key/value stores inspired by Amazon's Dynamo paper. A key written to a dynamo ring is persisted in several nodes at once before a successful write is reported. Riak also provides a native MapReduce implementation.
Horizontal Scaling: Dynamo-inspired databases usually provide for the best scale and extremely strong data durability.
CAP Balance: Prefer availability over consistency.
When to Use: When the system must always be available for writes and effectively cannot lose data.
Example Products: Cassandra, Riak, BigCouch

This type of key/value store architecture is very different from the document-oriented MongoDB solutions we launched at the end of last year, so we worked with Basho to prioritize development of high-performance Riak solutions on our global platform. Since you already know about MongoDB, let's take a few minutes to meet the new kid on the block.

Riak is a distributed database architected for availability, fault tolerance, operational simplicity and scalability. Riak is masterless, so each node in a Riak cluster is the same and contains a complete, independent copy of the Riak package. This design makes the Riak environment highly fault tolerant and scalable, and it also aids in replication — if a node goes down, you can still read, write and update data.

As you approach the daunting prospect of choosing a big data architecture, there are a few simple questions you need to answer:

How much data do/will I have?

In what format am I storing my data?

How important is my data?

Riak may be the choice for you if [1] you're working with more than three terabytes of data, [2] your data is stored in multiple data formats, and [3] your data must always be available. What does that kind of need look like in real life, though? Luckily, we've had a number of customers kick Riak's tires on SoftLayer bare metal servers, so I can share a few of the use cases we've seen that have benefited significantly from Riak's unique architecture.

Use Case 1 – Digital Media
An advertising company that serves over 10 billion ads per month must be able to quickly deliver its content to millions of end users around the world. Meeting that demand with relational databases would require a complex configuration of expensive, vertically scaled hardware, but the same workload can be scaled out horizontally much more easily with Riak. In a matter of only a few hours, the company is up and running with an ad-serving infrastructure that includes a back-end Riak cluster in Dallas with a replication cluster in Singapore along with an application tier on the front end with Web servers, load balancers and CDN.

Use Case 2 – E-commerce
An e-commerce company needs 100-percent availability. If any part of a customer's experience fails, whether it be on the website or in the shopping cart, sales are lost. Riak's fault tolerance is a big draw for this kind of use case: Even if one node or component fails, the company's data is still accessible, and the customer's user experience is uninterrupted. The shopping cart structure is critical, and Riak is built to be available ... It's a perfect match.

As an additional safeguard, the company can take advantage of simple multi-datacenter replication in their Riak Enterprise environment to geographically disperse content closer to its customers (while also serving as an important tool for disaster recovery and backup).

Use Case 3 – Gaming
With customers like Broken Bulb and Peak Games, SoftLayer is no stranger to the gaming industry, so it should come as no surprise that we've seen interesting use cases for Riak from some of our gaming customers. When a game developer incorporated Riak into a new game to store player data like user profiles, statistics and rankings, the performance of the bare metal infrastructure blew him away. As a result, the game's infrastructure was redesigned to also pull gaming content like images, videos and sounds from the Riak database cluster. Since the environment is so easy to scale horizontally, the process on the infrastructure side took no time at all, and the multimedia content in the game is getting served as quickly as the player data.

Databases are common bottlenecks for many applications, but they don't have to be. Making the transition from scaling vertically (upgrading hardware, adding RAM, etc.) to scaling horizontally (spreading the work intelligently across multiple nodes) alleviates many of the pain points for a quickly growing database environment. Have you made that transition? If not, what's holding you back? Have you considered implementing Riak?

Big data is only getting bigger. Late last year, SoftLayer teamed up with 10Gen to launch a high-performance MongoDB solution, and since then, many of our customers have been clamoring for us to support other big data platforms in the same way. By automating the provisioning process of a complex big data environment on bare metal infrastructure, we made life a lot easier for developers who demanded performance and on-demand scalability for their big data applications, and it's clear that our simple formula produced amazing results. As Marc mentioned when he started breaking down big data database models, document-oriented databases like MongoDB are phenomenal for certain use-cases, and in other situations, a key-value store might be a better fit. With that in mind, we called up our friends at Basho and started building a high-performance architecture specifically for Riak ... And I'm excited to announce that we're launching it today!

Riak is an open source, distributed database platform based on the principles enumerated in Amazon's Dynamo paper. It uses a simple key/value model for object storage, and it was architected for high availability, fault tolerance, operational simplicity and scalability. A Riak cluster is composed of multiple nodes that are all connected, all communicating and sharing data automatically. If one node were to fail, the other nodes would automatically share the data that the failed node was storing and processing until the node is back up and running or a new node is added. See the diagram below for a simple illustration of how adding a node to a cluster works within Riak.
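To make the ring idea a little more concrete, here is a simplified Python sketch of a Dynamo-style ring. It hashes a bucket/key pair with SHA-1 onto a fixed number of partitions, stores each key on the owners of several consecutive partitions, and lets a new node take over a share of partitions without changing which partition a key hashes to. The partition count, the replica count, the node names and both claim functions are assumptions chosen for illustration; Riak's real claim algorithm is more sophisticated than this.

```python
import hashlib

RING_SIZE = 2 ** 160   # size of the SHA-1 output space
NUM_PARTITIONS = 64    # assumed partition count for this sketch
N_VAL = 3              # assumed number of replicas per key


def claim_partitions(nodes, num_partitions=NUM_PARTITIONS):
    """Hand out partitions round-robin so ring neighbors have different owners."""
    return [nodes[i % len(nodes)] for i in range(num_partitions)]


def add_node(owners, new_node):
    """Give the new node every (k+1)-th partition, where k is the old node count.

    All other partitions keep their current owner, so only a fraction of
    the data has to move. This is a stand-in for a real claim algorithm.
    """
    stride = len(set(owners)) + 1
    return [new_node if i % stride == stride - 1 else owner
            for i, owner in enumerate(owners)]


def partition_for(bucket, key, num_partitions=NUM_PARTITIONS):
    """Map a bucket/key pair to a partition index via its SHA-1 hash."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big") * num_partitions // RING_SIZE


def preference_list(owners, bucket, key, n_val=N_VAL):
    """Owners of the key's partition and the next n_val - 1 partitions."""
    first = partition_for(bucket, key, len(owners))
    return [owners[(first + i) % len(owners)] for i in range(n_val)]


nodes = ["node1", "node2", "node3", "node4", "node5"]  # hypothetical names
before = claim_partitions(nodes)
after = add_node(before, "node6")

moved = sum(1 for old, new in zip(before, after) if old != new)
print(f"{moved} of {len(before)} partitions change owner")
print("replicas for users/alice:", preference_list(after, "users", "alice"))
```

With five starting nodes and 64 partitions, this sketch hands 10 partitions to the sixth node and leaves the other 54 where they were. The key-to-partition mapping never changes when a node joins, which is why only the reassigned partitions need to be transferred.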

We will support both the open source and the Enterprise versions of Riak. The open source version is a great place to start. It has all of the database functionality of Riak Enterprise, but it is limited to a single cluster. The Enterprise version supports replication between clusters across data centers, giving you lots of architectural options. You can use replication to build highly available, live-live failover applications. You can also use it to distribute your application's data across regions, giving you a global platform that you can update anywhere in the world and know that those modifications will be available anywhere else. Riak Enterprise customers also receive 24×7 coverage, both from SoftLayer and Basho. This includes SoftLayer's one-hour guaranteed response for Severity 1 hardware issues and unlimited support available via our secure web portal, email and phone.

The business use-case for this flexibility is that if you need to scale up or down, nodes can be easily added or taken down as your requirements change. You can opt for a single-data center environment with a few nodes or you can broaden your architecture to a multi-data center deployment with a 40-node cluster. While these capabilities are inherent in Riak, they can be complicated to build and configure, so we spent countless hours working with Basho to streamline Riak deployment on the SoftLayer platform. The fruit of that labor can be found in our Riak Solution Designer:

The server configurations and packages in the Riak Solution Designer have been selected to deliver the performance, availability and stability that our customers expect from their bare metal and virtual cloud infrastructure at SoftLayer. With a few quick clicks, you can order a fully configured Riak environment, and it'll be provisioned and online for you in two to four hours. And everything you order is on a month-to-month contract.

Thanks to the hard work done by the SoftLayer development group and Basho's team, we're proud to be the first in the marketplace to offer a turn-key Riak solution on bare metal infrastructure. You don't need to sacrifice performance and agility for simplicity.

Forrester defines big data as "techniques and technologies that make capturing value from data at an extreme scale economical." Gartner says, "Big data is the term adopted by the market to describe extreme information management and processing issues which exceed the capability of traditional information technology along one or multiple dimensions to support the use of the information assets." Big data demands extreme horizontal scale that traditional IT management can't handle, and it's not a challenge exclusive to the Facebooks, Twitters and Tumblrs of the world ... Just look at the Google search volume for "big data" over the past eight years:

Developers are collectively facing information overload. As storage has become more and more affordable, it's easier to justify collecting and saving more data. Users are more comfortable with creating and sharing content, and we're able to track, log and index metrics and activity that previously would have been deleted in consideration of space restraints or cost. As the information age progresses, we are collecting more and more data at an ever-accelerating pace, and we're sharing that data at an incredible rate.

To understand the different facets of this increased usage and demand, Gartner came up with the three V's of big data that vary significantly from traditional data requirements: Volume, Velocity and Variety. Larger, more abundant pieces of data ("Volume") are coming at a much faster speed ("Velocity") in formats like media and walls of text that don't easily fit into a column-and-row database structure ("Variety"). Given those equally important factors, many of the biggest players in the IT world have been hard at work to create solutions that provide the scale and speed developers need when they build social, analytics, gaming, financial or medical apps with large data sets.

When we talk about scaling databases here, we're talking about scaling horizontally across multiple servers rather than scaling vertically by upgrading a single server (adding more RAM, increasing HDD capacity, etc.). It's important to make that distinction because it leads to a unique challenge shared by all distributed computer systems: The CAP Theorem. According to the CAP theorem, a distributed storage system must choose to sacrifice either consistency (that everyone sees the same data) or availability (that you can always read/write) while having partition tolerance (where the system continues to operate despite arbitrary message loss or failure of part of the system).

Let's take a look at a few of the most common database models, what their strengths are, and how they handle the CAP theorem compromise of consistency v. availability:

Relational Databases

What They Do: Stores data in rows/columns. Parent-child records can be joined remotely on the server. Provides speed over scale. Some capacity for vertical scaling, poor capacity for horizontal scaling. This type of database is where most people start.
Horizontal Scaling: In a relational database system, horizontal scaling is possible via replication (sharing data between redundant nodes to ensure consistency), and some people have success sharding (horizontal partitioning of data), but those techniques add a lot of complexity.
CAP Balance: Prefer consistency over availability.
When to Use: When you have highly structured data, and you know what you'll be storing. Great when production queries will be predictable.
Example Products: Oracle, SQLite, PostgreSQL, MySQL

Document-Oriented Databases

What They Do: Stores data in documents. Parent-child records can be stored in the same document and returned in a single fetch operation with no join. The server is aware of the fields stored within a document, can query on them, and return their properties selectively.
Horizontal Scaling: Horizontal scaling is provided via replication, or replication + sharding. Document-oriented databases also usually support relatively low-performance MapReduce for ad-hoc querying.
CAP Balance: Generally prefer consistency over availability.
When to Use: When your concept of a "record" has relatively bounded growth, and can store all of its related properties in a single doc.
Example Products: MongoDB, CouchDB, BigCouch, Cloudant

Key-Value Stores

What They Do: Stores an arbitrary value at a key. Most can perform simple operations on a single value. Typically, each property of a record must be fetched in multiple trips, with Redis being an exception. Very simple, and very fast.
Horizontal Scaling: Horizontal scale is provided via sharding.
CAP Balance: Generally prefer consistency over availability.
When to Use: Very simple schemas, caching of upstream query results, or extreme speed scenarios (like real-time counters).
Example Products: CouchBase, Redis, PostgreSQL HStore, LevelDB

BigTable-Inspired Databases

What They Do: Data put into column-oriented stores inspired by Google's BigTable paper. It has tunable CAP parameters, and can be adjusted to prefer either consistency or availability. Both are sort of operationally intensive.
Horizontal Scaling: Good speed and very wide horizontal scale capabilities.
CAP Balance: Prefer consistency over availability.
When to Use: When you need consistency and write performance that scales past the capabilities of a single machine. Hbase in particular has been used with around 1,000 nodes in production.
Example Products: Hbase, Cassandra (inspired by both BigTable and Dynamo)

Dynamo-Inspired Databases

What They Do: Distributed key/value stores inspired by Amazon's Dynamo paper. A key written to a dynamo ring is persisted in several nodes at once before a successful write is reported. Riak also provides a native MapReduce implementation.
Horizontal Scaling: Dynamo-inspired databases usually provide for the best scale and extremely strong data durability.
CAP Balance: Prefer availability over consistency.
When to Use: When the system must always be available for writes and effectively cannot lose data.
Example Products: Cassandra, Riak, BigCouch
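The "persisted in several nodes before a successful write is reported" behavior is usually described in terms of three numbers: N replicas per key, W acknowledgements required for a write and R responses required for a read. The Python sketch below is a toy model of that rule under assumed values of N, R and W, not Riak's client API. It shows why a write can still succeed with a replica down, and when read and write quorums are guaranteed to overlap.

```python
def write_succeeds(replica_up, w):
    """A write is reported successful once at least w replicas acknowledge it.

    replica_up is a list of booleans, one per replica, where True means
    the replica is reachable and acknowledges the write.
    """
    acks = sum(1 for up in replica_up if up)
    return acks >= w


def quorums_overlap(n, r, w):
    """True when every read quorum shares at least one replica with every write quorum."""
    return r + w > n


# Assumed settings for illustration: 3 replicas, 2 acks per write.
print(write_succeeds([True, True, False], w=2))   # one replica down
print(write_succeeds([True, False, False], w=2))  # two replicas down
print(quorums_overlap(n=3, r=2, w=2))
print(quorums_overlap(n=3, r=1, w=1))
```

Lower R and W values keep the system available when more replicas are unreachable, at the cost of reads that may not see the latest write. That trade-off is the "availability over consistency" preference described above.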

Each of the database models has strengths and weaknesses, and there are huge communities that support each of the open source examples I gave in each model. If your database is a bottleneck or you're not getting the flexibility and scalability you need to handle your application's volume, velocity and variety of data, start looking at some of these "big data" solutions.

Tried any of the above models and have feedback that differs from ours? Leave a comment below and tell us about it!