Feature:

Affordable performance and scalability with AWS Big Data solutions

Over the past decade, the enterprise database has transformed completely to meet the
requirements of Big Data. Scalability, reliability, and speed are three of the biggest challenges
facing the traditional relational database, and as relational systems have floundered under
gargantuan data processing requirements, a burgeoning need for non-relational database
solutions has emerged. As costly relational databases fail at scale, it is no
surprise that simple, scalable NoSQL solutions that run on cheap commodity
hardware have taken the industry by storm.

“I feel like NoSQL has freed people to look at data structures in a different way,” said Simon St.
Laurent, Fluent co-chair and Senior Editor at
O’Reilly Publishing. “They can say ‘These are the pieces that I need, and they may or may not have
connections. I don’t have to define that all at the beginning.’ I always loved drawing out the
tables on a relational database system. It was fun; it was happy! But then I would change
them later, and I would suffer. With NoSQL, you can make yourself suffer for different reasons; for
example, you can structure your data badly. But I feel like there’s less ache in iterating and
making changes later.” And along with this need to structure data well, there’s also enormous
pressure to manage the database itself properly in order to achieve optimal performance.

AWS rises to the occasion

Of course, when there’s a scalability and performance problem on the computing horizon, it’s not
long before a Bay Area behemoth steps up. Not surprisingly, Amazon launched its Relational
Database Service (RDS) in 2009 to help businesses overcome the obstacles involved in handling
traditionally structured Big Data, but a fully functional NoSQL solution lagged behind. The
NoSQL-esque Dynamo concept was originally described in 2007, and AWS soon began offering
non-relational data storage systems based upon it. Yet like so many inventions
that are ahead of their time, the initial launch of Dynamo left corporations cold. Organizations
still had to manage the operational side, with all of the associated complexities. Adding to the
frustration was the fact that the SimpleDB service for smaller businesses didn’t scale up to meet
the needs of enterprise. AWS needed to create a dynamic non-relational database that could be
consumed as a service at the enterprise level. That solution, DynamoDB, only came together in 2012,
but since then, it has experienced huge success.

In his keynote address at the San Francisco AWS Summit in 2014, Sr. VP Andy Jassy revealed that
DynamoDB is one of the fastest-growing services AWS has ever released. It came out of the gate with
low latency and high throughput, and AWS
has been innovating and improving ever since based on customer feedback. “We added global and
local secondary indexes to improve the query flexibility. That was a really big deal for our
customers. We added item level access control that allows you to put an access control policy for
an item in a table.” Additional features now include parallel scan, batch writes, a geo-spatial
indexing library, and various testing tools.
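The item-level access control Jassy mentions is expressed through IAM policies. As a rough sketch, not taken from the article, the policy below uses the `dynamodb:LeadingKeys` condition key to restrict a caller to items whose partition key matches their own identity; the table ARN and the Cognito identity variable are illustrative assumptions.

```python
import json

def item_level_policy(table_arn: str) -> dict:
    """Build an IAM policy allowing access only to the caller's own items."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:PutItem"],
            "Resource": table_arn,
            "Condition": {
                # dynamodb:LeadingKeys limits access to items whose partition
                # key equals the requester's Cognito identity ID.
                "ForAllValues:StringEquals": {
                    "dynamodb:LeadingKeys": ["${cognito-identity.amazonaws.com:sub}"]
                }
            }
        }]
    }

# Hypothetical table ARN for illustration only.
policy = item_level_policy("arn:aws:dynamodb:us-east-1:123456789012:table/GameScores")
print(json.dumps(policy, indent=2))
```

Attached to a role, a policy shaped like this scopes every request down to the rows a given user owns, which is what makes per-item access control practical at scale.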

Turning the dials on DynamoDB


Having access to the full breadth of AWS infrastructure and network power as a managed service
for non-relational databases has changed how Operations engineers think about their true DB
requirements. Dave Albrecht at Crittercism described it this way: “When you think about DBs, a lot
of the time the number in your head has to do with the capacity. How big is this—ten gigabytes, ten
terabytes, twenty terabytes? Amazon is doing something pretty special and different with its whole
provisioned throughput model. They’re letting you pick along both of these two axes. You get to
ask, ‘How much capacity do I want on this database, and also how many IOPS do I want on this
database?’ It’s kind of crazy that you can just turn those two knobs more or less independently of
each other.”
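The two knobs Albrecht describes appear directly in the API. As a minimal sketch (the table and attribute names are assumptions, not from the article), these are the parameters a boto3-style `create_table` call would take: read and write capacity units are dialed explicitly, while storage is not provisioned at all and simply grows with the data.

```python
def table_request(name: str, read_units: int, write_units: int) -> dict:
    """Build create_table parameters with an explicit throughput setting."""
    return {
        "TableName": name,
        "AttributeDefinitions": [{"AttributeName": "id", "AttributeType": "S"}],
        "KeySchema": [{"AttributeName": "id", "KeyType": "HASH"}],
        # The throughput knob: read and write capacity, chosen independently
        # of how many gigabytes the table ends up holding.
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }

params = table_request("Events", read_units=100, write_units=25)
# With boto3 this would be passed as:
#   boto3.client("dynamodb").create_table(**params)
```

Because throughput is a table setting rather than a property of the hardware underneath, it can also be raised or lowered later with `update_table`, which is what makes the knobs independently turnable.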

Having the ability to refine the performance of databases allows Operations to support the needs
of developers and consumers of enterprise applications at a new level. At the same time, this type
of fine-tuned control can create other issues. Developer Tim Gross noted in his blog post series
“Falling in and out of love with DynamoDB” that spikes in demand can cause problems with throttling. He
recommends using cron jobs, estimation, and careful monitoring to manage this AWS service.
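When a spike does exceed the provisioned throughput, DynamoDB throttles requests, and the standard mitigation is retrying with exponential backoff. The sketch below is illustrative only: the `ThrottlingError` and `flaky_write` stand-ins are hypothetical, whereas with boto3 you would catch `ProvisionedThroughputExceededException` instead.

```python
import random
import time

class ThrottlingError(Exception):
    """Stand-in for a provisioned-throughput-exceeded response."""

def write_with_backoff(do_write, max_retries=5, base_delay=0.05):
    """Call do_write(), backing off exponentially when throttled."""
    for attempt in range(max_retries):
        try:
            return do_write()
        except ThrottlingError:
            # Sleep up to base_delay * 2^attempt, with jitter so that many
            # throttled clients don't all retry at the same instant.
            time.sleep(base_delay * (2 ** attempt) * random.random())
    raise ThrottlingError("still throttled after %d retries" % max_retries)

# Simulated workload: the first two calls are throttled, the third succeeds.
calls = {"n": 0}
def flaky_write():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ThrottlingError()
    return "ok"

result = write_with_backoff(flaky_write)
```

Backoff absorbs brief spikes; for sustained load changes, the cron-driven capacity adjustments Gross recommends raise the provisioned throughput itself.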

Enterprises still have many choices to make

Right now, the number of NoSQL databases on offer is far broader than the market targeted by
DynamoDB. AWS is currently partnered with MongoDB and Couchbase to help customers run
non-relational DBs in EC2 and EBS. However, enterprises that need to implement a variety of
non-relational DBs to handle specific types of data will need to explore less standardized
solutions in the cloud. Developers can install any standard NoSQL database they wish on Amazon’s
EC2. This may be the preferred option for organizations that already have the skills and expertise
to manage scaling and other aspects of administration in-house. No doubt the number of AWS
partnerships with major non-relational DB solution providers will continue to grow, allowing
enterprises to manage more and more of their Big Data as a service in the AWS cloud.

How have you leveraged NoSQL in your enterprise solutions? Let us know.

TechTarget provides technology professionals with the information they need to perform their jobs - from developing strategy, to making cost-effective purchase decisions and managing their organizations' technology projects - with its network of technology-specific websites, events and online magazines.