
On July 21, Redis Labs announced the finalization of $14M in Series C funding led by Bain Capital Ventures and Carmel Ventures. The Series C raise builds upon 350% year-over-year revenue growth for Redis Labs, the creators of Redis and distributors of an enterprise-grade version of the Redis NoSQL database platform. In the first two quarters of 2016, Redis Labs added over 600 new enterprise customers, including TD Bank, Groupon, Verizon and Twitch. Redis Labs now claims over 6,200 enterprise customers and more than 55,000 Redis Cloud accounts. The announcement of over 350% YOY revenue growth, together with details of a roster of new enterprise customers in verticals that include finance, media and retail, emphatically illustrates the increasing penetration of Redis in the enterprise NoSQL database space. Fueled by the proliferation of applications marked by high transactional volume and data throughput, the growth of Redis amongst enterprise customers in the first half of 2016 testifies to the importance of its in-memory data structure store, whose optimized data structures and commands deliver fast response times and strong application performance. With an extra $14M in funding, the industry should expect even more innovation from Redis Labs that builds upon the announcement of Redis Modules in May.

On June 28 at MongoDB World, MongoDB announced details of MongoDB Atlas, a database-as-a-service platform for MongoDB. MongoDB Atlas makes it easier for MongoDB users to deploy and manage MongoDB on multiple cloud platforms. Whereas MongoDB users previously needed to manage discrete cloud-based MongoDB deployments to ensure scalability, high availability and security, they can now take advantage of MongoDB Atlas to automate cloud-related service operations across multiple cloud platforms. Dev Ittycheria, president and CEO of MongoDB, remarked on the significance of MongoDB Atlas as follows:

MongoDB Atlas takes everything we know about operating MongoDB and packages it into a convenient, secure, elastic, on-demand service. This new offering is yet another major milestone for the most feature rich and popular database for modern applications, and expands the options for how customers can consume the technology – in their own data centers, in the cloud, and now as a service.

Here, Ittycheria comments on the ability of MongoDB Atlas to turn MongoDB into a turnkey platform that developers can consume as an on-demand service marked by elastic scalability. In addition to elastic scalability, MongoDB Atlas delivers provisioning, upgrades, and backup and recovery services to cloud-based MongoDB deployments. Its elastic scalability features automatic sharding functionality that allows for scaling with no application downtime. The MongoDB Atlas screenshot below gives customers a snapshot of metrics related to MongoDB deployments within the AWS North Virginia region:

As the graphic above illustrates, customers can use MongoDB Atlas to understand and monitor pricing across a multitude of instances. The larger vision of MongoDB Atlas, however, consists in its ability to deliver automation and oversight of MongoDB deployments across a multitude of cloud platforms, thereby giving customers a centralized platform from which to manage all of their cloud-based MongoDB infrastructures. MongoDB Atlas is currently available on Amazon Web Services although integrations with Microsoft Azure and the Google Cloud Platform are expected soon. The release of the platform marks a breakthrough moment not only with respect to enhanced capabilities for deployment and ongoing management of MongoDB but also with respect to data sovereignty and data governance, particularly in the context of multi-cloud, regionally dispersed hybrid cloud deployments. Expect MongoDB Atlas to facilitate increased adoption of MongoDB and subsequently expand its market share within the space of NoSQL, document-oriented databases.

The following guest blog post was authored by Ron Bennatan, co-founder of jSonar Inc.

SonarW: An Architecture for Speed, Low Cost and Simplicity

SonarW is a purpose-built NoSQL Big Data warehouse and analytics platform for today’s flexible modern data. It is ultra-efficient, utilizing parallel processing and demanding less hardware than other approaches. Moreover, SonarW brings NoSQL simplicity to the Big Data world.

Key architectural features include:

JSON-native columnar persistence: This works well for both structured and unstructured data. Data is always compressed and can be processed in parallel for every operation.

Indexing and Partitioning: All data is indexed using patent-pending Big Data indexes.

Parallel and Distributed Processing: Everything is done in parallel, both across nodes and within a node, to ensure small, cost-effective clusters.

JSON Optimized Code: Designed from the ground up for efficient columnar JSON processing.

Lock-less Data Structures: Built for multi-threaded, multicore, and SIMD processing.

Ease of Use: SonarW inherits its ease of use and simplicity from the NoSQL world and is 100 percent MongoDB compatible. Big Data teams are more productive and can spend less time on platform and code.
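To make the idea of JSON-native columnar persistence more concrete, here is a minimal Python sketch of how documents with differing fields might be split into per-field columns. It is an illustration only and does not describe SonarW's actual storage format; the function names `shred` and `column_sum` and the sample documents are hypothetical.

```python
from collections import defaultdict


def shred(docs):
    """Split flat JSON-like dicts into one column per field.

    Each column holds (row_id, value) pairs, so a document that lacks a
    field simply contributes nothing to that field's column.
    """
    columns = defaultdict(list)
    for row_id, doc in enumerate(docs):
        for field, value in doc.items():
            columns[field].append((row_id, value))
    return dict(columns)


def column_sum(columns, field):
    """Aggregate one field by scanning only that field's column."""
    return sum(value for _, value in columns.get(field, []))


# Hypothetical documents: no shared schema is declared up front.
docs = [
    {"user": "ann", "bytes": 120},
    {"user": "bob", "bytes": 300, "region": "eu"},
    {"user": "cy"},
]

columns = shred(docs)
total_bytes = column_sum(columns, "bytes")
print(total_bytes)
```

Because each field lives in its own list, an aggregate over one field never touches the others, and independent columns could in principle be compressed and scanned in parallel.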

Due to its key architectural advantages over today’s Big Data warehousing approaches, SonarW defers the need for large clusters: it scales to any size but does not require an unreasonable number of nodes to perform the workloads handled by other Big Data solutions. As a result, the platform reduces both hardware costs and the costs of managing these clusters.

Why is there a Need for a NoSQL Data Warehouse for Big Data Analytics?

Big Data implementations can be complex

Big Data is no longer a stranger to the IT world. All organizations have embarked on the Big Data path and are building data lakes, new forms of the Enterprise Data Warehouse, and more. But many of them still struggle to reap the benefits and some are stuck in the “collection phase”. Landing the data is always the first phase, and that tends to be successful; it’s the next phase, the usage phase (such as producing useful Big Data analytics), that is hard. Some call this the “Hadoop Hangover”. Some never go past the ETL phase, using the Data Lake as no more than an ETL area and loading the data back into conventional data stores. Some give up.

When these initiatives stall, the reason is complexity. But while all this is happening, on the other “side” of the data management arena, the NoSQL world has perfected precisely the opposite: simplicity. Perhaps the main reason that NoSQL databases such as MongoDB have been so successful is their appeal to developers, who find them easy to use and who feel they are an order of magnitude more productive than in other environments.

Bringing NoSQL Simplicity to Big Data

So why not merge the two? Why not take NoSQL’s simplicity and bring it to the Big Data world? That was precisely the question we put to ourselves when we went out to build SonarW – a Big Data warehouse that has the look-and-feel of MongoDB, the speed and functionality of MPP RDBMS warehouses and the scale of Hadoop.

As in other NoSQL-based systems, many of the advantages stem from the nature of JSON documents. JavaScript Object Notation (JSON) is a perfect middle ground between structure and flexibility. JSON has become ubiquitous and is considered to be the “lingua franca” of Web, mobile applications, social media and IoT. JSON is:

Simple, but not simplistic.

Flexible, yet has enough self-describing structure to make it effective.

Structured, but in a way that is easy to work with, can express anything, and can bring the simplicity and flexibility that people love.

JSON is the fastest growing data format on earth – by a lot. It is also the perfect foundation for Big Data where disparate sources need to quickly flow in and be used for deriving insight.

For SonarW, we started with JSON and asked ourselves how we can make it scale – and the answer was in compressed columnar storage of JSON coupled with rich analytic pipelines that can be executed directly on the JSON data. Everything looks like a NoSQL data pipeline similar to MongoDB or Google Dremel or other modern data flows, but they execute on an efficient columnar fabric and all without the need to define schema, to work hard to normalize data or to completely lose control without any structure.
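As a rough illustration of what a NoSQL-style data pipeline looks like, the Python sketch below chains a filter stage and a grouping stage over schemaless documents, loosely in the spirit of a MongoDB aggregation pipeline. It runs on plain in-memory dictionaries rather than a columnar store, and the stage helpers `match`, `group_sum` and `run_pipeline` are hypothetical names, not part of SonarW or MongoDB.

```python
from collections import defaultdict


def match(predicate):
    """Stage that keeps only the documents satisfying the predicate."""
    def stage(docs):
        return (d for d in docs if predicate(d))
    return stage


def group_sum(key, field):
    """Stage that sums `field` for each distinct value of `key`."""
    def stage(docs):
        totals = defaultdict(int)
        for d in docs:
            # A missing field counts as 0; no schema has to be declared.
            totals[d.get(key)] += d.get(field, 0)
        return ({"_id": k, "total": v} for k, v in totals.items())
    return stage


def run_pipeline(docs, stages):
    """Feed the documents through each stage in order."""
    for stage in stages:
        docs = stage(docs)
    return list(docs)


# Hypothetical event documents with differing fields.
events = [
    {"device": "a", "kind": "read", "bytes": 10},
    {"device": "a", "kind": "write", "bytes": 5},
    {"device": "b", "kind": "read", "bytes": 7},
    {"device": "b", "kind": "read"},
]

result = run_pipeline(events, [
    match(lambda d: d.get("kind") == "read"),
    group_sum("device", "bytes"),
])
print(result)
```

Each stage consumes and produces a stream of documents, which is what lets stages be composed freely without defining a schema first.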

Efficient scalability also reduces complexity

The other goal we set for SonarW is efficiency. Everything scales horizontally these days – and SonarW is no exception. But scaling horizontally allows one to hide inefficiencies. Throw enough hardware at anything and things go fast. But it also becomes expensive – especially in the enterprise where costs and charge-backs are high. We fondly refer to SonarW as “Big-but-Lean Data”. That is, it’s good to scale, but it’s better to do it efficiently. As an example, the figure below shows the number of nodes and costs to run the Big Data benchmark on a set of platforms. All these systems achieved the same minimal performance scores (with Redshift and SonarW being faster than the others), but the size and cost of the clusters were different (in both charts, smaller is better).

NoSQL can optimize Big Data analytics success

A NoSQL approach has been shown to be a highly successful approach for Big Data OLTP databases as provided by companies such as MongoDB. However, no such capability has been available for Big Data analytics. SonarW was built, from the ground up – with a JSON columnar architecture – to provide a simple NoSQL interface along with MPP speeds and efficient scalability that optimizes the developer’s ability to deliver on Big Data analytics projects.

For more information about jSonar and SonarW please visit www.jsonar.com

Big Data Benchmark: Breakthrough Cost and Performance Results

One of the benchmarks used for Big Data workloads is the “Big Data Benchmark,” which is run by the AMPLab at Berkeley. This benchmark runs workloads on representatives from the Hadoop ecosystem (e.g. Hive, Spark, Tez, etc.), as well as from MPP environments. Note SonarW’s performance and cost in comparison to Tez, Shark, Redshift, Impala and Hive.

Ron Bennatan Vita

Ron Bennatan is a co-founder at jSonar Inc. He has been a “database guy” for 25 years and has worked at companies such as J.P. Morgan, Merrill Lynch, Intel, IBM and AT&T Bell Labs. He was co-founder and CTO at Guardium which was acquired by IBM where he later served as a Distinguished Engineer and the CTO for Big Data Governance. He is now focused on NoSQL Big Data analytics. He has a Ph.D. in Computer Science and has authored 11 technical books.

Basho Technologies today announced the release of the Basho Data Platform, an integrated Big Data platform that enhances the ability of customers to build applications that leverage Basho’s Riak KV (formerly Riak) and Riak S2 (formerly Riak CS). By integrating Riak KV, Riak S2, Apache Spark, Redis and Apache Solr, the Basho Data Platform enhances the ability of customers to create high performing applications that deliver real-time analytics. The platform’s integration with Redis as a cache allows users to leverage Redis to improve the read performance of applications. The platform also boasts an integration with Apache Solr that builds upon the ability of Riak to support searches powered by Apache Solr. Moreover, the Basho Data Platform supports the replication and synchronization of data across its different components in ways that ensure continued access to applications and relevant data. The graphic below illustrates the different components of the Basho Data Platform:

The Basho Data Platform responds to a need in the marketplace to complement high performance NoSQL databases such as Riak with analytics and caching technologies such as Apache Spark and Redis, respectively. The platform’s cluster management and orchestration functionality relieves customers of the need to use Apache ZooKeeper for cluster synchronization and cluster management. By automating provisioning and orchestration and delivering Redis-based caching functionality in conjunction with Apache Spark, the platform empowers customers to create high performance applications capable of scaling to manage the operational needs of massive datasets. Today’s announcement marks the release of an integrated platform that stands poised to significantly augment the ease with which customers can build Riak-based Big Data applications. Notably, the platform’s ability to orchestrate and automate the interplay between its different components means that developers can focus on taking advantage of the functionality of Apache Spark and Redis alongside Riak KV and Riak S2 without becoming mired in the complexities of provisioning, cluster synchronization and cluster management. As such, the platform’s out-of-the-box integration of its constituent components represents a watershed moment in the evolution of Riak KV and Riak S2 and in the NoSQL space more generally.
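The way a cache tier can improve read performance in front of a key-value database is commonly described as the cache-aside pattern, sketched below in Python. This is a generic illustration and not Basho's implementation: plain dictionaries stand in for both the database and the Redis cache, and the class name `CacheAsideReader` is hypothetical.

```python
class CacheAsideReader:
    """Cache-aside reads: try the cache first, fall back to the store."""

    def __init__(self, store, cache):
        self.store = store      # dict standing in for the key-value database
        self.cache = cache      # dict standing in for the cache tier
        self.store_reads = 0    # how many reads had to go to the store

    def get(self, key):
        if key in self.cache:
            return self.cache[key]          # cache hit: store is not touched
        self.store_reads += 1               # cache miss: read from the store
        value = self.store.get(key)
        if value is not None:
            self.cache[key] = value         # populate the cache for next time
        return value

    def put(self, key, value):
        # Write to the store and drop the cached copy so that the next
        # read fetches the fresh value instead of a stale one.
        self.store[key] = value
        self.cache.pop(key, None)


db = CacheAsideReader(store={"user:1": "ann"}, cache={})
first = db.get("user:1")    # miss, served by the store
second = db.get("user:1")   # hit, served by the cache
db.put("user:1", "anne")    # update invalidates the cached entry
third = db.get("user:1")    # miss again, returns the new value
print(first, second, third, db.store_reads)
```

Invalidating on write is only one possible policy; a real deployment would also need to decide on expiry times and how to keep the cache consistent across nodes.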

On February 3, MongoDB announced the release of MongoDB 3.0, the most significant release of MongoDB in the company’s history. The release features a fundamental rearchitecting of the database marked by the addition of a pluggable storage engine API that allows for additional storage engines. WiredTiger, which MongoDB acquired last year, is one of the storage engines that highlight this release, delivering write performance improvements of 7-10x and 60 to 80% improvements in compression. MongoDB 3.0 includes a storage engine designed for read-intensive applications, one for write-intensive applications and an in-memory storage engine. As such, the newly enhanced MongoDB platform allows for the optimization of the database platform for different workloads and use cases while using a unified data model and operations interface.
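The general idea of a pluggable storage engine behind a unified operations interface can be sketched in a few lines of Python. The interface and class names below (`StorageEngine`, `InMemoryEngine`, `CompressedEngine`, `Database`) are hypothetical and are not MongoDB's actual storage API; zlib compression is used purely as a stand-in for an engine that trades CPU for space.

```python
import zlib
from abc import ABC, abstractmethod


class StorageEngine(ABC):
    """Hypothetical engine interface; not MongoDB's real storage API."""

    @abstractmethod
    def write(self, key, value):
        """Persist a bytes value under the given key."""

    @abstractmethod
    def read(self, key):
        """Return the bytes value stored under the given key."""


class InMemoryEngine(StorageEngine):
    """Keeps raw values in a dictionary."""

    def __init__(self):
        self._data = {}

    def write(self, key, value):
        self._data[key] = value

    def read(self, key):
        return self._data[key]


class CompressedEngine(StorageEngine):
    """Compresses values with zlib before storing them."""

    def __init__(self):
        self._data = {}

    def write(self, key, value):
        self._data[key] = zlib.compress(value)

    def read(self, key):
        return zlib.decompress(self._data[key])

    def stored_size(self, key):
        return len(self._data[key])


class Database:
    """Exposes the same operations whichever engine is plugged in."""

    def __init__(self, engine):
        self._engine = engine

    def insert(self, key, document):
        self._engine.write(key, document)

    def find(self, key):
        return self._engine.read(key)


doc = b'{"status": "ok"}' * 100
for engine in (InMemoryEngine(), CompressedEngine()):
    db = Database(engine)
    db.insert("k", doc)
    assert db.find("k") == doc   # callers see identical behavior
```

The calling code never changes when the engine does, which is the property that lets a database be tuned for different workloads behind one data model.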

Charity Majors, Production Engineering Manager at Parse (Facebook), remarked on the significance of the MongoDB 3.0 release as follows:

We at Parse and Facebook are incredibly excited for the 3.0 release of MongoDB. The storage API opens the door for MongoDB to leverage write-optimized storage engines, improved compression and memory usage, and other aspects of cutting edge modern database engines.

As Majors notes, the re-architecting of MongoDB expands the range of use cases that MongoDB can handle by rendering it more suitable for write-intensive applications. MongoDB 3.0 also boasts marked improvements in performance and scalability because of its redesigned storage architecture. This release additionally features the introduction of Ops Manager, an application that enables customers to deploy, monitor and update MongoDB deployments. Ops Manager integrates with well-known operations tools such as AppDynamics, New Relic and Docker and stands to reduce the operational overhead of MongoDB deployments by automating routine tasks into one-click, push-button functionality. Overall, MongoDB 3.0 represents a watershed moment in the development of MongoDB as evinced by its ability to embrace a variety of application workloads and use cases alongside a massively improved level of performance and scalability.

Basho Technologies, creator of the Riak NoSQL key-value database platform, today announced the finalization of $25M in Series G funding led by existing investor Georgetown Partners. In addition to the funding news, Basho revealed details of record growth including sequential growth of 62 percent and 116 percent in Q3 and Q4 of 2014 respectively. 2014 represented a landmark year for Basho given that it shipped Riak 2.0 and Riak CS 1.5 and appointed Adam Wray, former CEO of Tier 3, as CEO. In the same year, Basho replaced Oracle as the database platform for the National Health Service of the UK and deepened its relationship with The Weather Company as noted below by Bryson Koehler, executive vice president and CITO for The Weather Company:

The amount of data we collect from satellites, radars, forecast models, users and weather stations worldwide is over 20TB each day and growing quickly. This data helps us deliver the world’s most accurate weather forecast as well as deliver more severe weather alerts than anyone else, so it is absolutely mission critical and has to be available all of the time. Riak Enterprise gives us the flexibility and reliability that we depend on to enable over 100,000 transactions a second with sub 20ms latency on a global basis.

Here, Koehler remarks on the ability of Riak Enterprise to handle “over 100,000 transactions a second” with latencies of less than 20 ms. Importantly, The Weather Company’s data collection rate of over 20 TB a day illustrates the massive volumes of data that Riak Enterprise can aggregate for archival and analytic use cases. As Basho CEO Adam Wray told Cloud Computing Today in an interview, Riak also gained traction in verticals such as gaming, healthcare and financial services in 2014, with much of its uptake propelled by increased adoption of Big Data, distributed systems and cloud-based applications, and by the growth of the internet of things. Wray further remarked that Riak stands strongly positioned to reap the benefits of increased stakeholder awareness about the value of key-value stores and concepts such as eventual consistency. Today’s capital raise brings the total funding raised by Basho to $65M. With an extra $25M in the bank and an enviable roster of enterprise customers out of the gate, the NoSQL space should expect Basho to build steadily upon its success in 2014 by gaining even more market traction amongst Fortune 50 customers and staking out its positioning amongst the likes of MongoDB, MarkLogic, Couchbase and DataStax, with a particular focus on sharpening its differentiation in comparison to other key-value store databases such as Couchbase and DataStax.

On October 14, MongoDB announced major enhancements to its cloud-based MongoDB Management Service (MMS) for managing MongoDB deployments. The most recent version of MMS introduces significant operational efficiencies that streamline and simplify the deployment and subsequent operational management of MongoDB. For example, MMS now enables users to provision MongoDB deployments with one click and configure the resulting infrastructure with minimal manual intervention and decision-making. Moreover, the recent enhancements consolidate the ability to upgrade and downgrade deployments expeditiously as well as to seamlessly scale out deployments to accommodate customer growth. Notably, this release boasts a deeper integration with Amazon Web Services that gives customers greater control over MongoDB deployments on AWS as illustrated by the screenshot below:

As told to Cloud Computing Today by Kelly Stirman, MongoDB’s Director of Products, MongoDB Management Service users can now deploy Amazon Web Services instances from within the MMS infrastructure itself by using the automation agent functionality depicted above. Previously, MMS customers needed to independently provision AWS instances from within the AWS platform, but they can now leverage the deep integration between MMS and AWS to enjoy greater operational efficiencies specific to the deployment of AWS infrastructures containing MongoDB deployments. That said, MMS remains infrastructure agnostic and can work with any public cloud, on-premises environment or hybrid cloud infrastructure although, in the case of non-AWS hosting environments, customers will need to independently configure and deploy the underlying infrastructure outside of MMS. The other notable feature of MMS is that it now operates on a freemium model that allows customers to take advantage of its functionality free of charge for up to 8 servers. The freemium model positions MongoDB to significantly expand the range of customers that opt to try out the functionality of MMS and to continue propelling the company toward a lucrative IPO.