Unplugged: Cassandra could wreak havoc on Oracle

Cassandra isn't the girl next door. In fact, she isn't a person at all. Cassandra is an open-source database technology that is likely to wreak havoc on the technology industry and threaten the future of legacy purveyors of relational databases, such as Oracle.

Just as Linux — an open source operating system — spurred lower prices for proprietary computer servers sold by IBM, Sun Microsystems, Dell, Hewlett-Packard and others, Cassandra stands to do the same thing with databases. It offers a greater ability to scale — critical in this data-crazed era — at lower costs than traditional databases.

"Open source makes all of the difference," says Billy Bosworth, CEO of DataStax, a San Mateo, Calif.-based software start-up that offers an enterprise-scale version of Cassandra databases.

First, a quick primer. The Cassandra project is developed under the Apache Software Foundation, a non-profit corporation that supports a decentralized community of software developers around the world who are devoted to creating free and open source software. Just as with the Linux movement in servers, the gist is that collaboration by many developers toward the common goal of building a better database has resulted in just that: a better mousetrap.

The primary advantage of Cassandra's architecture over relational database technology is the ability to expand scale and improve performance at one-tenth the cost of traditional vendors' products, Bosworth argues. With relational technology, companies must constantly expand their core enterprise databases to handle larger amounts of data, which in today's digital world is a daunting task facing every company and government agency. Essentially, this means building a bigger database machine, usually in the same physical location.

Cassandra, on the other hand, is a decentralized database spread over myriad geographical locations using clusters of lower-cost commoditized computer servers. Geeks call this a "fully distributed" architecture, which also boasts a critical feature: It has no single point of failure. If servers go down in one location, servers in other parts of the world simply pick up the slack.
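For readers who want to see the idea in miniature, here is a rough Python sketch of what "no single point of failure" can look like. It is a toy model under stated assumptions, not Cassandra's actual code: the ToyRing class, the location names and the replica count of three are all hypothetical, chosen only for illustration. Each record is copied to several servers on a hash ring, so a read can still succeed when some of those servers are down.

```python
import hashlib
from bisect import bisect_right


class ToyRing:
    """Toy hash ring (hypothetical, for illustration only).

    Each key is copied to `replicas` distinct nodes, so a read succeeds
    as long as at least one node holding a copy is still up.
    """

    def __init__(self, nodes, replicas=3):
        self.replicas = replicas
        self.down = set()                    # nodes currently offline
        self.data = {n: {} for n in nodes}   # per-node key/value storage
        # Place each node on the ring at a position derived from its name.
        self.ring = sorted((self._hash(n), n) for n in nodes)

    @staticmethod
    def _hash(text):
        return int(hashlib.md5(text.encode()).hexdigest(), 16)

    def owners(self, key):
        # Walk clockwise from the key's position and take the next nodes.
        start = bisect_right(self.ring, (self._hash(key), ""))
        count = min(self.replicas, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]

    def put(self, key, value):
        # Write a copy to every owner that is currently reachable.
        for node in self.owners(key):
            if node not in self.down:
                self.data[node][key] = value

    def get(self, key):
        # Return the first copy found on a reachable owner.
        for node in self.owners(key):
            if node not in self.down and key in self.data[node]:
                return self.data[node][key]
        raise KeyError(key)


# Hypothetical server locations; the names are made up for this example.
ring = ToyRing(["us-east", "us-west", "europe", "asia"], replicas=3)
ring.put("user:42", "paused at 00:31:07")

# Take two of the three servers holding the record offline.
holders = ring.owners("user:42")
ring.down.update(holders[:2])

print(ring.get("user:42"))  # the surviving copy still answers
```

In this sketch, adding capacity means adding another inexpensive node to the ring rather than buying a bigger machine. A real system also has to handle details the toy ignores, such as catching up servers that were offline during a write.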

That feature was of particular interest to Netflix, which needed a database solution that didn't risk any down time because of failure, says Christos Kalantzis, Netflix's cloud and database engineering expert.

As the world's largest Internet television network, serving 36 million viewers who watch more than 1 billion hours of TV shows and movies a month, Netflix simply can't afford to go dark. So about two years ago, Netflix switched over to a Cassandra database supplied by DataStax, Kalantzis says. Now, Netflix stores about 95% of its non-program data in Cassandra, including customer account information, movie ratings, bookmarks and logs: all the stuff viewers use before and after they select a show to watch.

"We need that data available across the globe," Kalantzis told USA TODAY.

What's more, Netflix stores more than 200 terabytes of data in its computer clusters, making it one of the largest cloud computing operators in the world. Consequently, the company constantly needs to expand its capacity, which is where Cassandra's affordable scalability provides another selling point.

In hindsight, the once-risky move to switch over to the untested Cassandra was a no-brainer, Kalantzis says. "We've never looked back," he says.

Netflix is DataStax's largest customer, but the fledgling database outfit already counts 20 companies in the Fortune 100 as customers, including eBay and Adobe Systems. In just a year, DataStax's customer base grew tenfold to more than 270 by the end of 2012. The company, which could be to Cassandra what Red Hat is to Linux, has more than 100 employees and has raised $39 million from venture capital firms, including Crosslink Capital here.

Bosworth predicts that this is just the beginning as the word about Cassandra begins to travel and reach information technology executives. "If companies don't get on with this change, I don't know how they compete," Bosworth says.

Helping spread that word, no less, is the CEO himself, who will be making a keynote address this week at the Cassandra Summit, a developer conference devoted to all things related to the open source database project. More than 1,000 developers and technology buyers are expected to descend on San Francisco's Fort Mason conference center to attend the two-day confab. The conference should give the Cassandra movement a shot in the arm.

"There is a new breed of software developer, and he could not care less about relational databases," Bosworth says. "This technology frees a developer to think differently."

And to think about Cassandra.

Mark Veverka is a technology columnist with more than 25 years of financial journalism experience. He was previously a columnist at Barron's, The Wall Street Journal and the San Francisco Chronicle. Follow him on Twitter: @markveverka.