voltdb

The VoltDB engineering team is pleased to announce that VoltDB 4.0 is now available!

Bruce Reading, the VoltDB CEO, is fond of saying that VoltDB contains lots of “yummy goodness”, and, while that is not a term we engineers use often, VoltDB v4.0 does indeed include a lot of new features. The highlights of VoltDB v4.0 include:

Enhanced in-memory analytics capabilities with a host of new SQL support.

Greatly improved analytic read throughput performance.

Clusters can grow elastically, increasing both throughput and capacity, by adding nodes to running clusters without blocking ongoing operations.

The leap second is one (1) extra second added to the clock every once in a while to keep the official Coordinated Universal Time (UTC) in sync with the actual rotation of the earth. Twenty-five leap seconds have been inserted since 1972, and the next one is coming at midnight on June 30, 2015 UTC. For more, see the Leap Second article in Wikipedia.

What will your computer do? For the most part, there are two strategies:

Most computers will move time backwards by one (1) second just before midnight, repeating 23:59:59 UTC before rolling over. (Groundhog Day, anyone?)
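To make the step-back strategy concrete, here is a minimal Java sketch of a clock that repeats 23:59:59. The class and method names (`LeapSecondSketch`, `steppedWallClock`, `apparentElapsed`) are hypothetical and the clock is a simplified model, not a description of how any particular operating system behaves. It shows why timestamps taken around the repeated second can appear to run backwards.

```java
final class LeapSecondSketch {
    /** Milliseconds from 00:00:00.000 to 23:59:58.000 UTC on the same day. */
    static final long BASE_MILLIS = (23L * 3600 + 59 * 60 + 58) * 1000;

    /**
     * Hypothetical model of a clock that handles the leap second by stepping
     * back: given real elapsed milliseconds since 23:59:58.000, return the
     * time of day (in milliseconds) that the clock would report.
     */
    static long steppedWallClock(long elapsedMillis) {
        // The first 2000 ms cover 23:59:58 and the first pass through 23:59:59.
        if (elapsedMillis < 2000) {
            return BASE_MILLIS + elapsedMillis;
        }
        // After that the clock has been set back one second, so 23:59:59 repeats.
        return BASE_MILLIS + elapsedMillis - 1000;
    }

    /** Difference between two wall-clock readings taken at the given real offsets. */
    static long apparentElapsed(long startMillis, long endMillis) {
        return steppedWallClock(endMillis) - steppedWallClock(startMillis);
    }

    public static void main(String[] args) {
        // Event A at real offset 1800 ms reads 23:59:59.800 (first pass).
        // Event B at real offset 2300 ms reads 23:59:59.300 (second pass).
        System.out.println(apparentElapsed(1800, 2300)); // prints -500
    }
}
```

In this model, two events that are really 500 ms apart appear to be -500 ms apart. In Java, intervals are usually measured with `System.nanoTime()` rather than wall-clock time for this kind of reason.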

I’d like to take a quick moment to address some myths and misconceptions about VoltDB. Many people selling products who view VoltDB as competition seem to be repeating them. As you’ll read, much of what’s said is just plain FUD.

VoltDB is an in-memory database that has benchmarked at over 3 million transactions a second on bare metal, and recently crushed previous performance records in the cloud, posting eye-popping YCSB (Yahoo! Cloud Serving Benchmark) numbers on AWS, Amazon’s cloud platform.

Data loss is the enemy of businesses everywhere. Snapshots, replication, active-active architectures, even physical measures such as backup generators are used to ensure data is preserved as it enters and is used or warehoused by the enterprise. Most companies use a database to manage this data and extract intelligence from it; therefore, the database has an important role in preservation of data – durability in database-speak.

As data wends its way through ingestion, transactions and analysis, many databases rely on frequent writes to disk to ensure data is durable, even in the event of

This blog reiterates what Ning Shi has already so eloquently described in his blog, and showcases how Hadoop can be used to address the blog’s fast data app use case that follows.

In the age of Fast Data, being able to make decisions on a per-event basis is just as important as being able to handle the high ingestion rate. An example that demonstrates this concept well is clickstream processing.

Clickstreams record users’ activities on the web as they navigate through web pages.

Working on distributed systems is fun, but not easy! As a software engineer at VoltDB, a big chunk of my time is spent testing software on a cluster of machines as part of new feature development and also for customer issue reproduction. Any software engineer who does this on a daily basis knows this is not an easy process.

VoltDB is a clustered database. A big part of the challenge of building VoltDB features comes from the fact that database processing is distributed, involving multiple processes on different machines connected by a network.

For traditional databases, you buy a decent server machine, likely one with many CPU cores and reasonable memory, and then focus on application IOPS (I/O Operations per Second). If you are really going to stress the database, you must choose disks that can support the I/O needs of your application, today and in the future. Because these systems often use many disks to achieve high I/O performance, capacity is usually an afterthought.

With in-memory databases, throw out everything you know about sizing databases.

We’re happy to share a great blog from Mik Quinlan, a Java Technical Architect and Agile Mentor with more than 16 years of experience in software development, who has deployed VoltDB extensively in fast data ingestion applications for the retail market. He is currently Director of Mobile Advertising at Thinknear. The blog originally appeared here.

Over the last two years, I have been involved in transforming a complex legacy processing system, work that gave rise to a unique solution that may be used in other contexts.

I attended M2M World Congress in London two weeks ago and have been pondering my “take-aways” since. While many VoltDB customers use our NewSQL, in-memory database to provide scalable transactions, decisions and analytics in M2M deployments (smart grids, mobile telco platforms, cloud PaaS products), putting the M2M/IoT space into a broader context is hard. It was great to take a couple of days to listen and learn.

I left for London with the question, “What might the winner look like – from which industry might the winners emerge?”

We can all agree that data is an organization’s greatest asset, yet in many industries data is treated as a ‘fixed’ asset, collected and stored in data warehouses for later analysis. This ignores data’s most valuable moment: when it’s analyzed in real time to inform business decisions.

Not all companies hoard data for later analysis and action, of course.

Yesterday, Facebook announced open-source access to Presto, “a distributed SQL query engine optimized for ad-hoc analysis at interactive speed.” You can read more in this blog post by Martin Traverso, a member of Facebook’s Presto team.

At VoltDB, we’ve always believed that data is only valuable when you can interact with it. Data you can’t interact with — well, that’s just overhead. We put access and interactivity first. And that means we put SQL first. Five years ago this wasn’t a popular choice but we stuck to our guns. Because we have history on our side.

One of the original design goals for VoltDB was to support large numbers of connections per server. This more or less came for free with Volt’s event-based design, which was driven by a desire to keep the number of active software threads 1:1 with available hardware threads. That basically proscribed any kind of thread-per-client network I/O implementation.

In theory the only limit on the number of connections in Volt is kernel memory and heap space.
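As a rough illustration of the general idea (not VoltDB’s actual networking code), the following Java sketch serves several connections from a single thread using a `java.nio` `Selector`, so the number of threads does not grow with the number of clients. The class name `SelectorSketch`, the `serve` method, and the one-byte exchange are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

final class SelectorSketch {
    /**
     * Opens `clients` loopback connections (keep this small, below the listen
     * backlog) and handles all of them on the calling thread. Each client
     * sends one byte; the method returns how many connections were served.
     */
    static int serve(int clients) {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);

            // Simulated clients: connect and send a single byte each.
            SocketChannel[] conns = new SocketChannel[clients];
            for (int i = 0; i < clients; i++) {
                conns[i] = SocketChannel.open(server.getLocalAddress());
                conns[i].write(ByteBuffer.wrap(new byte[] {(byte) i}));
            }

            int served = 0;
            ByteBuffer buf = ByteBuffer.allocate(1);
            while (served < clients) {
                selector.select(); // block until at least one channel is ready
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isAcceptable()) {
                        SocketChannel ch = server.accept();
                        if (ch != null) {
                            ch.configureBlocking(false);
                            ch.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel ch = (SocketChannel) key.channel();
                        buf.clear();
                        int n = ch.read(buf);
                        if (n < 0) {
                            throw new IOException("unexpected end of stream");
                        }
                        if (n > 0) {
                            served++;
                            ch.close(); // closing the channel cancels its key
                        }
                    }
                }
            }
            for (SocketChannel c : conns) {
                c.close();
            }
            return served;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serve(8));
    }
}
```

With this structure, each extra connection costs a socket and some buffer space rather than a thread, which is consistent with the limit on connections being memory rather than thread count.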

Back in February, Bruce asked the Activities Committee to think about a program that would foster exercise and healthier living. After some careful deliberation, the VoltDB C25K program was born. We launched over Memorial Day weekend.

We followed a structured schedule, and had weekly group runs on the Lexington bike trails.

Consider the scenario where you are building an online multi-player game platform, where users can use the platform to create their own games. Such a platform would have the following high-level requirements:

It needs to be able to process tens of thousands of write transactions (game state changes) per second. Each time a player makes a “move” or “action”, it needs to be recorded.

It needs to be able to query player status and ranking. Each player’s ranking in relation to other players, for example, needs to be computed regularly, across all players and all games.

Target audience

This post targets other VoltDB developers who are going to be dealing with the various unconventional ways Volt now uses native memory in the Java portions of the database. It will also be of interest to other Java developers looking to step outside what is typically considered Java’s comfort zone for interacting with native memory.

What was

Out of the box, Java doesn’t give you much in the way of tools for accessing native memory allocation and deallocation mechanisms like mmap or malloc.
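For context, one of the few standard tools is `ByteBuffer.allocateDirect`, which allocates memory outside the Java heap but offers no explicit call to free it. The sketch below uses only that standard API. The class name `DirectBufferSketch` and the `roundTrip` helper are hypothetical, and this is not necessarily the approach VoltDB takes.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class DirectBufferSketch {
    /** Writes a long and an int into off-heap memory and returns their sum. */
    static long roundTrip(long a, int b) {
        // allocateDirect asks the JVM for memory outside the Java heap.
        // The memory is released only when the buffer object is garbage
        // collected; there is no public free() counterpart.
        ByteBuffer buf = ByteBuffer.allocateDirect(16).order(ByteOrder.nativeOrder());
        buf.putLong(0, a); // absolute put at byte offset 0
        buf.putInt(8, b);  // absolute put at byte offset 8
        return buf.getLong(0) + buf.getInt(8);
    }

    public static void main(String[] args) {
        System.out.println(ByteBuffer.allocateDirect(16).isDirect()); // prints true
        System.out.println(roundTrip(40L, 2));                        // prints 42
    }
}
```

The standard counterpart for mmap is `FileChannel.map`, which returns a `MappedByteBuffer` and likewise has no explicit unmap call in the public API.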