Real-time database vendor Aerospike hit most of the major developmental trends in big data this week by announcing a new round of funding, the release of a new open source edition of its version of the NoSQL database, and the acquisition of a smaller vendor with more big data-specific functions than Aerospike's own code.

It also changed its name from Citrusleaf, which it had used since its founding in 2009, to Aerospike as a reference to the rapid growth of big data and the company's own ambition for fast growth, according to a statement.

Aerospike's main product is a distributed hash-table database designed as a NoSQL data store that processes transactions in real time, manages unstructured data as efficiently as traditional data, and scales horizontally across clusters of commodity server and storage hardware.
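
For readers unfamiliar with the term, the general idea behind a distributed hash table can be sketched in a few lines of Python. This is a minimal illustration only, not Aerospike's actual partitioning scheme, which is not described here; the names `node_for_key` and `TinyCluster` and the node labels are hypothetical, and the simple modulo placement is an assumption chosen for brevity.

```python
import hashlib


def node_for_key(key: str, nodes: list[str]) -> str:
    """Pick the node responsible for a key by hashing the key."""
    # Hash the key so that keys spread evenly regardless of their text.
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer and map it onto the node list.
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


class TinyCluster:
    """In-memory stand-in for a cluster: one dict per (hypothetical) node."""

    def __init__(self, nodes: list[str]) -> None:
        self.nodes = list(nodes)
        self.stores: dict[str, dict] = {name: {} for name in self.nodes}

    def put(self, key: str, value) -> None:
        # Every write goes to the single node that owns the key.
        self.stores[node_for_key(key, self.nodes)][key] = value

    def get(self, key: str):
        # Reads hash the key the same way, so they find the same node.
        return self.stores[node_for_key(key, self.nodes)].get(key)


cluster = TinyCluster(["node-a", "node-b", "node-c"])
cluster.put("user:42", {"last_ad": "sneakers"})
print(cluster.get("user:42"))
```

Because any client can compute the owning node from the key alone, lookups need no central index, which is one reason this design suits horizontal scaling. Plain modulo placement reshuffles most keys when a node is added or removed, so production systems typically use a more stable mapping.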

Its primary purpose is to serve Web-based apps with strict latency requirements and huge volumes of data to access--gaming sites, advertising-driven sites that must display relevant ads within milliseconds, and other applications with high performance requirements and unpredictable load levels.

The company touts three primary features for its NoSQL database--speed, scalability, and reliability--that are traditional virtues in data management, but are particularly acute needs in big data analytics, according to Shalini Das, research director of the CIO Executive Board consultancy in Washington, D.C.

The ability to store unstructured data, including text and images, and to run analytics that automatically add metadata, so that even images can be used in external apps or found using standard queries, is a basic prerequisite for big data; doing so with fast response times for both data intake and data reporting is a major advantage, Das said.

Sixty percent of companies with big data projects in process use relational databases as at least part of their data store, however, so it's not enough that a big data project use a single multifaceted database, according to Mike Boyarski, director of product marketing for Jaspersoft, another NoSQL vendor that surveyed open source big data users for an August report.

While the report showed a higher-than-expected percentage of companies launching big data projects as production systems rather than pilots, it also found that the tools available to gather, process, structure, and search unstructured data alongside relational data (or in combination with relational databases) are far too weak to satisfy existing requirements.

"There's a lot of uncertainty of the value proposition of the tools at your disposal right now to take advantage of big data," Boyarski said. "It's a little surprising so many companies are moving forward into production despite the tools available."

Aerospike addressed the need for multi-format data support by acquiring startup database specialist Alchemy Database, whose AlchemyDB is designed to combine a relational database management system (RDBMS) with a document store, graphing capabilities, and a Redis open source key-value data store.

The combination will give Aerospike a good NoSQL key-value store and extensive data management capabilities, according to a statement from Aerospike that focused on the performance aspects of the combination.

Performance is important because many of the data sets are so large, according to Das.

SAP, Oracle, Microsoft, IBM, and a host of other major companies are building big data features into their existing databases and applications already, however, so there are plenty of big data platforms available, Das said.

And there is no shortage of NoSQL data stores, most of them open source and optimized for big data.

What is really in short supply are tools that integrate neatly with those data-crunching platforms to gather, clean, tag, store, and index unstructured data that few companies have ever tried to incorporate into their master databases, she said.

"Most of the activity in that area is from startups right now, so you won't see many of them in major products until there are some more acquisitions or until the market matures to the point that these functions are more widely available," Das said.

Aerospike itself is a startup, which announced a new round of funding at the same time it announced its new product features and acquisition. Aerospike's Series B funding round raised an undisclosed amount from New Enterprise Associates, Draper Associates, and Alsop Louie Partners; the latter two were key backers during the company's first round.

Most of the best-regarded products are also either open source software or are based on open source with layers of proprietary enhancements to add new functions, Boyarski said.

Jaspersoft's main NoSQL product set is open source, as is Aerospike's, though Aerospike's proprietary enhancements shift the whole suite into the world of commercial software.

To keep from alienating open source developers and keep the latest enhancements moving into its products as well, Aerospike reversed course by releasing an open sourced version of its NoSQL database, the Aerospike Community Edition.

It shares most functions with the Enterprise version, but comes with a free unlimited license, supports a single cluster of two nodes in one data center, and has an upper data-storage limit of 200 GB.

Despite its real-time performance and claims of high throughput, Aerospike will face considerable competition from other commercial software companies, open source software, and traditional applications and databases tweaked to provide big data-like benefits, according to Das' evaluation of the big data software market.

Jaspersoft's survey confirms the competition from unusual directions. Relational databases are the most common data store cited by respondents to the survey, followed by MongoDB (cited by 19%), Hadoop (18%), analytic databases from Teradata, Vertica and similar vendors (11%), Google's BigQuery (8%), Hbase (8%), and Cassandra (7%).
