Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

I've been looking at the Wikipedia page for NoSQL, and it lists several variations on the Key/Value store database, but I can't find any details on what it means by Key/Value store in this context. Could someone explain or link an explanation to me? Also, when would I use such a database?

Hi @indyK1ng ... I notice that you seem to have asked a few questions on the site, but that you've not given a lot of commentary on the questions. The site is focused on community INTERACTION and one of the ways we do that is by accepting good quality answers and giving feedback when answers don't help us. I would like to encourage you to either accept answers or add commentary where they don't help. Thanks!
– jcolebrand♦ Jan 28 '11 at 1:03

Unfortunately I'm in a bit of an awkward situation. I committed back when the proposal was the broader termed databases, didn't pay attention then saw this go into private beta before I knew it was changed to Database Administrators. I am more interested in the innards of databases, but want to fulfill my commitment. Sorry.
– indyK1ng Jan 28 '11 at 1:15

So what's stopping you from asking those sorts of questions? Go over to Meta and take a look. We want to ask those questions too. Or did you mean that you wanted more in-depth information on how NoSQL works in its internals? I can go into that too, but I didn't feel it was within the scope of this question.
– jcolebrand♦ Jan 28 '11 at 1:24

Also, accepting isn't a sin even if you don't want to be here, and it helps those arriving from Google or the like. I'm not saying "accept all my answers, I need the rep"; as you can see if you visit my profile, I don't. I am more interested in seeing that future users can benefit from the direction provided by "this is what the asker found useful".
– jcolebrand♦ Jan 28 '11 at 1:26

@jcolebrand I thought that those kinds of questions were considered off topic, just judging from the name change. That's why this question and a few of my other questions were worded the way they were, so they would be on the side of on topic. Thanks for letting me know, I'll start being more active once I have the chance (college is doing its best to take up my time, I'm procrastinating right now ;) ).
– indyK1ng Jan 28 '11 at 1:37

5 Answers

Are you familiar with the concept of a Key/Value Pair? Presuming you're familiar with Java or C#, this exists in the language as a map/hash/datatable/KeyValuePair (the last in the case of C#).

The way it works is demonstrated in this little sample chart:

Key     Value
Color   Red
Age     18
Size    Large
Name    Smith
Title   The Brown Dog

Where you have a key (left) and a value (right) ... notice it can be a string, int, or the like. Most KVP objects allow you to store any object on the right, because it's just a value.
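As a rough illustration, the sample chart maps directly onto an in-memory map. The sketch below uses a Python dict; Python is an assumed choice of language here (the answer mentions Java and C#), and the variable name `record` is hypothetical.

```python
# The sample chart as a Python dict: keys on the left, values on the right.
record = {
    "Color": "Red",
    "Age": 18,              # values need not be strings
    "Size": "Large",
    "Name": "Smith",
    "Title": "The Brown Dog",
}

# Lookup is always by key.
age = record["Age"]

# Asking for a key that was never stored yields the supplied default.
missing = record.get("Height", None)
```

A key/value database offers essentially the same get/put-by-key interface, only persisted and possibly spread over several machines.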

Since you'll always have a unique key for a particular object that you want to return, you can just query the database for that key and get the result back from whichever node has the object. This is why it's good for distributed systems, although other things are involved there, such as polling until the first n nodes return values that match one another.
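A minimal sketch of that idea, under stated assumptions: the key's hash decides which nodes hold it, and a read is accepted once enough replicas return the same value. The node names, the `replicas_for` and `quorum_read` helpers, and the quorum size of 2 are all hypothetical; real systems use more elaborate placement and conflict resolution.

```python
import hashlib
from collections import Counter

NODES = ["node-a", "node-b", "node-c"]  # hypothetical node names

def replicas_for(key, nodes=NODES, copies=3):
    """Pick `copies` nodes for a key, starting at a position derived from its hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(min(copies, len(nodes)))]

def quorum_read(key, storage, quorum=2):
    """Return a value as soon as `quorum` replicas agree on it, else None."""
    votes = Counter()
    for node in replicas_for(key):
        value = storage[node].get(key)
        if value is not None:
            votes[value] += 1
            if votes[value] >= quorum:
                return value
    return None

# In-memory stand-in for three servers, each holding its own dict.
storage = {node: {} for node in NODES}
for node in replicas_for("Name"):
    storage[node]["Name"] = "Smith"

result = quorum_read("Name", storage)
```

Because the client can compute `replicas_for(key)` on its own, no central lookup table is needed to find the data.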

Now my example above is very simple, so here's a slightly better version of the KVP

So as you can see, the simple key generation is to put "user", the user's unique number, an underscore, and the object name. Again, this is a simple variation, but I think we begin to understand that so long as we can define the part on the left and keep it consistently formatted, we can pull out the value.
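The key scheme described here can be sketched as follows. The user numbers, attribute names, and the `make_key` helper are made up for illustration, and a plain dict stands in for the database.

```python
def make_key(user_id, attribute):
    """Build a key as "user" + the user's unique number + "_" + the attribute name."""
    return f"user{user_id}_{attribute}"

store = {}  # stand-in for the key/value database

store[make_key(12345, "color")] = "Red"
store[make_key(12345, "age")] = 18
store[make_key(67890, "color")] = "Blue"

# Because the format is consistent, knowing the user number and the
# attribute name is enough to rebuild the key and fetch the value.
color = store[make_key(12345, "color")]
```

The store itself knows nothing about users or attributes; the naming convention lives entirely in the application.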

Notice that there's no restriction on the key (OK, there can be some limitations, such as text-only) or on the value (there may be a size restriction), but so far I've not had really complex systems. Let's try to go a little further:

Rather than edit, I'm just going to include this link, en.wikipedia.org/wiki/Distributed_hash_table, and point out that this is where the magic of NoSQL scalability comes in, and that you have two options: either understand the math behind why this works, or trust that the guys who implement the systems understand the math on this. I also recommend the FLOSS podcast episodes on MongoDB and several other NoSQL groups, because they talk about these things in more detail: twit.tv/floss
– jcolebrand♦ Jan 14 '11 at 16:22

Then what's the difference between Key/Value databases and traditional row oriented databases?
– skan 2 days ago

The fact that there are often only two (or three, or a few more, depending on the metadata involved) columns instead of a massive number of columns, and the types are often fixed. There's no reason NOT to create a KVP store in a traditional RDBMS, except that it's basically schemaless.
– jcolebrand♦ yesterday

In SQL terms, a key/value NoSQL database is a single table with two columns: one being the (Primary) Key, and the other being the Value. And that's it, that's all the NoSQL magic.
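To make the "single table with two columns" picture concrete, here is a small sketch using Python's built-in sqlite3 module. The table name `kv` and the `put`/`get` helpers are assumptions made for this example, not any product's API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value BLOB)")

def put(key, value):
    # INSERT OR REPLACE overwrites any existing row with the same primary key.
    conn.execute("INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value))

def get(key):
    row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None

put("Color", "Red")
put("Color", "Blue")    # same key: the value is replaced, not duplicated
current = get("Color")
```

Every operation goes through the primary key; there are no joins and no other columns to filter on.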

You would use NoSQL for one main reason: scalability.

If your application needs to handle millions of queries per second, the only way to achieve it is to add more servers. That is very cheap and easy with NoSQL. In contrast, scaling a traditional SQL database is much more complicated.

Only the biggest websites out there are actually taking advantage of the full NoSQL potential, e.g., Facebook, having thousands of servers running Cassandra.

I strongly recommend reading this blog post, comparing SQL, NoSQL and ORM:

In general, SQL managed to deal with specially structured data and allowed highly dynamic queries according to the needs of the department in question.

While there are still no real competitors for SQL in this specific field, the use-case in everyday web applications is a different one. You will not find a highly dynamic range of queries full of outer and inner joins, unions and complex calculations over large tables. You will usually find a very object oriented way of thinking. Especially with adoption of such patterns as MVC, the data in the back-end is usually not being modelled for a database, but for logical integrity which also helps people to be able to cope with understanding huge software-infrastructures. What is being done to put these object-oriented models into relational databases is a large amount of normalization that leads to complex hierarchies of tables and completely steers against the main idea behind object oriented programming. Servers that adhere to the SQL standard also have to implement a large portion of code that is of no use to simple data storage whatsoever and only inflates the memory footprint, security risks and has performance hits as a result.

The fact that SQL allows for arbitrary dynamic queries for complex sets of data is being rendered useless by using an SQL Database only for persistent storage of object oriented data, which is what basically most applications do these days.

This is where Key Value stores come into play. Key value stores allow the application developer to store schema-less data. This data usually consists of a string which represents the key and the actual data which is considered to be the value in the "key - value" relationship. The data itself is usually some kind of primitive of the programming language (a string, an integer, an array) or an object that is being marshalled by the programming language's bindings to the key value store. This replaces the need for a fixed data model and makes the requirement for properly formatted data less strict.

They all allow storage of arbitrary data which is being indexed using a single key to allow retrieval. The biggest difference for the "simpler" stores is the way you can (or cannot) authenticate or access different stores (if possible). While the speed advantages in storing and retrieving data might be a reason to consider it over common SQL Databases, another big advantage that emerges when using key-value stores is that the resulting code tends to look clean and simple when compared to embedded SQL strings in your programming language. This is something that people tend to fight with object-relational mapping frameworks such as Hibernate or Active Record. Having an object-relational mapper basically seems to emulate a key value store by adding a lot of really complex code between an SQL database and an object-oriented programming language.

A whole community of people come together under the "NoSQL" tag and discuss these advantages and also disadvantages of using alternatives to relational database management systems.
This is a somewhat old article, but I found it very useful.
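As a rough sketch of what the quoted article means by marshalling an object into a value: the application serializes the object on write and rebuilds it on read. The `User` class, the `save`/`load` helpers, the "user:1" key format, and the choice of JSON are all assumptions made for this example; a dict stands in for the store.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class User:
    name: str
    age: int
    tags: list

store = {}  # stand-in for the key/value store

def save(key, obj):
    # Marshal the object to a JSON string; the store only ever sees text.
    store[key] = json.dumps(asdict(obj))

def load(key):
    # Unmarshal: the application, not the store, knows this is a User.
    return User(**json.loads(store[key]))

save("user:1", User("Smith", 18, ["admin"]))
restored = load("user:1")
```

No table definition or column mapping is involved, which is the simplicity the article contrasts with ORM frameworks.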

"when would I use such a database? Could someone explain or link an explanation to me?"
It's more of an architectural decision, and a debatable one. You have to consider lots of factors, such as scalability and performance.

View the slides/articles below and you'll get an idea of when, why, and why not to use a key/value store :)

A key/value database stores data by a primary key. This lets us uniquely identify a record in a bucket. Since all keys are unique, lookups are very fast: in the simplest case, a lookup is a single disk seek.

The value is just any kind of value. The way the data is stored is opaque to the database itself. When you store data in a key/value store, the database doesn't know or care if it's XML, JSON, text, or an image. In effect, what we're doing in a key/value store is moving the responsibility for understanding how data is stored out of the database and into the applications that retrieve our data. Since you only have a single range of keys to worry about per bucket, it's very easy to spread the keys across many servers and use distributed programming techniques to make it possible for this data to be accessed quickly (every server stores a range of data).
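One way to picture "every server stores a range of data" is a sorted list of split points. This is only a sketch: the split points, server names, and the `server_for` helper are hypothetical, and real systems rebalance ranges as data grows.

```python
import bisect

# Keys below "h" go to server-1, keys from "h" up to (not including) "q"
# go to server-2, and everything from "q" onward goes to server-3.
SPLIT_POINTS = ["h", "q"]
SERVERS = ["server-1", "server-2", "server-3"]

def server_for(key):
    """Return the server whose key range contains `key`."""
    index = bisect.bisect_right(SPLIT_POINTS, key)
    return SERVERS[index]

placement = {key: server_for(key) for key in ["apple", "name", "zebra"]}
```

Since the mapping depends only on the key, any client holding the split points can go straight to the right server.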

A drawback of this approach to data is that searching is a very difficult task. You need to either read every record in your bucket o' data or else you need to build secondary indexes yourself.
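The two options can be sketched side by side. Everything here is illustrative: the records, the `color` field, and the helper names are made up, and plain dicts stand in for the bucket and for a hand-maintained secondary index.

```python
from collections import defaultdict

bucket = {}                    # primary store: key -> record
by_color = defaultdict(set)    # hand-built secondary index: color -> keys

def put(key, record):
    old = bucket.get(key)
    if old is not None:
        # Keep the index in sync when a record is overwritten.
        by_color[old["color"]].discard(key)
    bucket[key] = record
    by_color[record["color"]].add(key)

def find_by_color_scan(color):
    # Option 1: read every record in the bucket.
    return sorted(k for k, r in bucket.items() if r["color"] == color)

def find_by_color_indexed(color):
    # Option 2: consult the secondary index the application maintains.
    return sorted(by_color.get(color, set()))

put("user1", {"name": "Smith", "color": "Red"})
put("user2", {"name": "Jones", "color": "Blue"})
put("user3", {"name": "Brown", "color": "Red"})
put("user2", {"name": "Jones", "color": "Red"})   # overwrite moves user2 in the index
```

The index makes the search cheap, but the application now owns the work an RDBMS would do automatically on every write.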

When would you use one? When you are working with a rich, complex data model that can't easily be modeled in an RDBMS.

There are about as many reasons to use a key/value database as there are to use an RDBMS, and there are just as many arguments to justify one over the other. It's important to take a look at how you're querying your data and understand how that data access pattern guides how you're going to be inserting and storing data.

Just remember that a key/value database is just one type of NoSQL database.

This is how all databases used to be, with dbm (1979), the ancestor of Berkeley DB, being a good example. Since then, things have advanced (you can have many values per key in any RDBMS). For many applications a key-value store is sufficient (e.g. this is how sendmail stores its aliases). But if you find yourself pre-processing the value in your own code (or concatenating strings to make your "key"), perhaps splitting the value on a delimiter or parsing it before you can use it, you will probably be better off with an RDBMS and actually storing it that way.
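The warning sign described here looks roughly like the first half of the sketch below, with the relational alternative in the second half. The pipe-delimited format, the `users` table, and the sample data are assumptions made up for this example.

```python
import sqlite3

# Key/value style: several fields packed into one delimited string.
store = {"user12345": "Smith|18|Large"}

def get_age(key):
    # The application has to know the delimiter and the field order.
    name, age, size = store[key].split("|")
    return int(age)

age_from_kv = get_age("user12345")

# Relational style: each field gets its own typed column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, size TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?, ?, ?)", (12345, "Smith", 18, "Large"))
age_from_table = conn.execute(
    "SELECT age FROM users WHERE id = ?", (12345,)
).fetchone()[0]
```

Both give the same answer, but in the second form the database understands the structure and can filter, sort, or index on `age` directly.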

It's still not clear from Gaius's answer what the new 'NoSQL' key-value DBs can do that the table he described above can't, apart from splitting the table into separate tables on different server nodes.
– GyRo Dec 17 '14 at 13:37