Considerations for using NoSQL technology on your next IT project - Akmal Chaudhri

Presentation given by Akmal Chaudhri (Hortonworks) to the BCS Data Management Specialist Group on 24th October 2013.
The presentation provides a balanced view of the state of NoSQL technology and tools and options for selection on projects.
A video of the presentation is available on YouTube at https://www.youtube.com/watch?v=FYfJ8C_YcvI

1.
Considerations for using
{"no":"SQL"} technology on
your next IT project
Akmal B. Chaudhri
(艾克摩 曹理)

2.
Abstract
Over the past few years, we have seen the emergence
and growth of NoSQL technology. This has attracted
interest from organizations looking to solve new business
problems. There are also examples of how this
technology has been used to bring practical and
commercial benefits to some organizations. However,
since it is still an emerging technology, careful
consideration is required in finding the relevant
developer skills and choosing the right product. This
presentation will discuss these issues in greater detail. In
particular, it will focus on some of the leading NoSQL
products and discuss their architectures and suitability
for different problems.

11.
History
Have you run into limitations with
traditional relational databases? Don’t
mind trading a query language for
scalability? Or perhaps you just like shiny
new things to try out? Either way this
meetup is for you.
Join us in figuring out why these new
fangled Dynamo clones and BigTables
have become so popular lately.
Source: http://nosql.eventbrite.com/

27.
But ...
Riak ... We’re talking about nearly a year
of learning.[1]
Things I wish I knew about MongoDB a
year ago[2]
I am learning Cassandra. It is not easy.[3]
[1] http://productionscale.com/blog/2011/11/20/building-an-application-upon-riak-part-1.html
[2] http://snmaynard.com/2012/10/17/things-i-wish-i-knew-about-mongodb-a-year-ago/
[3] http://planetcassandra.org/blog/post/datastax-java-driver-for-apache-cassandra

57.
“The Stars, Like Dust”
... a squadron of small, flitting ships that
had struck and vanished, then struck
again, and made scrap of the lumbering
titanic ships that had opposed them ...
abandoning power alone, stressed speed
and co-operation ...
-- Isaac Asimov
Source: “The Stars, Like Dust” Isaac Asimov (1951)

69.
But ...
... we find developers spend a significant
fraction of their time building extremely
complex and error-prone mechanisms to
cope with eventual consistency and
handle data that may be out of date. We
think this is an unacceptable burden to
place on developers and that consistency
problems should be solved at the
database level.
Source: http://research.google.com/pubs/pub41344.html

72.
How many systems? ...
There are a lot of Key/Value stores and
distributed schema-free Document
Oriented Databases out there. They’re
springing up like weeds in a spring garden.
And folks love to blog about them and/or
talk about how their favorite is better than
the others (or MySQL).
-- Jeremy Zawodny
Source: http://blog.zawodny.com/2010/03/28/nosql-is-software-darwinism/

79.
Document store
• Represent rich, hierarchical data structures,
reducing the need for multi-table joins
• Structure of the documents need not be known a
priori, can be variable, and evolve instantly, but
a query can understand the contents of a
document
• Use cases: rapid ingest and delivery for evolving
schemas and web-based objects
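
The bullets above can be illustrated with a minimal sketch in plain Python. It does not use any particular product's API: the documents, the field names and the helpers `get_path` and `find` are all hypothetical. It shows nested data kept in one document (no join needed), documents with different shapes in the same collection, and a query that looks inside a document.

```python
# Hypothetical collection: each document has its own shape.
documents = [
    {"_id": 1, "title": "Intro",
     "author": {"name": "Ann", "city": "London"},
     "tags": ["nosql", "intro"]},
    {"_id": 2, "title": "Scaling",
     "author": {"name": "Bob"},
     "comments": [{"by": "Ann", "text": "Nice"}]},
    {"_id": 3, "title": "Untitled"},
]


def get_path(doc, path):
    """Follow a dotted path such as 'author.city'.

    Returns None if any step of the path is missing.
    """
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def find(docs, path, value):
    """Return the documents whose field at `path` equals `value`."""
    return [d for d in docs if get_path(d, path) == value]


# Query on a nested field; documents lacking the field are skipped.
in_london = find(documents, "author.city", "London")
print([d["_id"] for d in in_london])
```

The author details live inside each document, so reading a document needs no second lookup, and the third document is valid even though it has no author at all.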

91.
Column store ...
• Manage structured data, with multiple-attribute
access
• Columns are grouped together in “column
families/groups”; each storage block contains
data from only one column/column set to provide
data locality for “hot” columns
• Column groups defined a priori, but support
variable schemas within a column group
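
As a rough sketch of the bullets above, the following plain-Python model keeps one in-memory "block" per column family. It is an illustration only, not how any specific column store is implemented; the class `ColumnFamilyStore`, its methods and the family names are hypothetical.

```python
from collections import defaultdict


class ColumnFamilyStore:
    """Toy model: column families are fixed up front, columns are not."""

    def __init__(self, families):
        self.families = tuple(families)
        # One "storage block" per family: family -> row key -> {column: value}
        self.blocks = {f: defaultdict(dict) for f in self.families}

    def put(self, row_key, family, column, value):
        # Families must be declared a priori; columns inside them may vary.
        if family not in self.blocks:
            raise KeyError(f"unknown column family: {family}")
        self.blocks[family][row_key][column] = value

    def read_family(self, row_key, family):
        # Touches only the block that holds this family's columns.
        return dict(self.blocks[family].get(row_key, {}))


store = ColumnFamilyStore(["profile", "activity"])
store.put("u1", "profile", "name", "Ann")
store.put("u1", "activity", "last_login", "2013-10-24")
store.put("u2", "profile", "name", "Bob")
store.put("u2", "profile", "twitter", "@bob")  # extra column, same family

print(store.read_family("u2", "profile"))
```

Reading the "profile" family never touches the "activity" block, which is the data-locality point of the slide, while rows within a family are free to carry different columns.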

119.
MongoDB security
The most effective way to reduce risk for
MongoDB deployments is to run your
entire MongoDB deployment, including all
MongoDB components (i.e. mongod,
mongos and application instances) in a
trusted environment.
Source: http://docs.mongodb.org/manual/administration/security/ (October 2012)

120.
CouchDB security
When you start out fresh, CouchDB allows
any request to be made by anyone ...
While it is incredibly easy to get started
with CouchDB that way, it should be
obvious that putting a default installation
into the wild is adventurous. Any rogue
client could come along and delete a
database. relax
Source: http://guide.couchdb.org/draft/security.html (October 2012)

121.
Redis security
Redis is designed to be accessed by
trusted clients inside trusted environments.
This means that usually it is not a good
idea to expose the Redis instance directly
to the internet or, in general, to an
environment where untrusted clients can
directly access the Redis TCP port or
UNIX socket.
Source: http://redis.io/topics/security/ (October 2012)

128.
Public API for NoSQL store
In some cases, the team decided to hide
the platform’s complexity from users; not
to facilitate its use, but to keep loose-cannon developers from doing something
crazy that could take down the whole
cluster. It could show them all the controls
and knobs in a NoSQL database, but “they
tend to shoot each other,” Jacob said.
“First they shoot themselves, then they
shoot each other.”
Source: “How Disney built a big data platform on a startup budget” Derrick Harris (2012)

154.
Relational does NoSQL
Often the overhead of managing data in
multiple databases is more than the
advantages of the other store being faster.
You can do “NoSQL” inside and around a
hackable database like PostgreSQL, not
just as a separate one.
-- Hannu Krosing
Source: http://2013.nosql-matters.org/cgn/abstracts/#abstract_hannu_krosing/

163.
Understand vendor-speak
What vendor says | What vendor means
The biggest in the world | The biggest one we’ve got
The biggest in the universe | The biggest one we’ve got
There is no limit to ... | It’s untested, but we don’t mind if you try it
A new and unique feature | Something the competition has had for ages
Currently available feature | We are about to start Beta testing
Planned feature | Something the competition has, that we wish we had too, that we might have one day
Highly distributed | International offices
Engineered for robustness | Comes in a tough box
Source: “Object Databases: An Evaluation and Comparison” Bloor Research (1994)

165.
The great debate ...
About every ten years or so, there is a
“great debate” between, on the one hand,
those who see the problem of data
modelling through a more or less relational
lens, and on the other, a noisier set of
“refuseniks” who have a hot new thing to
promote. The debate usually goes like
this:

166.
The great debate ...
Refuseniks: Hah! You relational people
with your flat tables and silly query
languages! You are so unhip! You simply
cannot deal with the problem of [INSERT
NEW THING HERE]. With an [INSERT
NEW THING HERE]-DBMS we will finish
you, and grind your bones into dust!

167.
The great debate
R-people: You make some good points.
But unfortunately a) there is an enormous
amount of money invested in building
scalable, efficient and reliable database
management products and no one is going
to drop all of that on the floor and b) you
are confusing DBMS engineering
decisions with theoretical questions. We
plan to incorporate the best of these ideas
into our products.
Source: Paul Brown

168.
It’s the people ...
... MongoDB Day London ... the problem is
the people! They all talk like this:
1. Some problem that just doesn’t really
exist (or hasn’t existed for a very long
time) with relational databases
2. MongoDB
3. Profit!
-- Gaius Hammond
Source: http://gaiustech.wordpress.com/2013/04/13/mongodb-days/

170.
Final thoughts
We are clearly in the phase of a new
technology adoption in which the category
is hyped, its benefits over-promised, its
limitations poorly understood, and its value
oversold.
-- Tim Berglund
Source: “Saying Yes to NoSQL” Tim Berglund (2011)

195.
Negative NoSQL comments ...
• MongoDB is to NoSQL like MySQL to SQL -- in
the most harmful way
– http://use-the-index-luke.com/blog/2013-10/mysql-is-to-sql-like-mongodb-to-nosql
• The Genius and Folly of MongoDB
– http://nyeggen.com/blog/2013/10/18/the-genius-and-folly-of-mongodb/