Summary
In a recent set of articles, Don Haderle and Michael Stonebraker review a bit of database history and point the way to column-oriented databases.

Current databases are mostly oriented around database rows, a design that originated with the constraints present at the time the first relational databases were implemented.

Among the architects of one of those early databases, IBM's DB2, was Don Haderle, now an IBM vice president and CTO of that company's data management division. Haderle is also an advisor to Vertica, a startup focused on creating a new type of database centered around columns, not rows. Vertica was co-founded by Michael Stonebraker, a computer science professor (at UC Berkeley for many years, now at MIT) and the designer of the Postgres database system.

In a recent set of blog posts, Haderle and Stonebraker discuss the constraints of the original relational database implementations, and how changes in the cost of processing can usher in column-oriented databases more suitable to analyze rich data types.

Current relational database management systems are largely built on designs from the 1980s. Back then, computers were expensive and slow relative to today's systems. The minimization of expensive CPU cycles -- not I/O considerations -- was the driving force in early relational DBMS design. The market sweet spot was transaction processing coupled with simple decision support, which was generally satisfied by access on a limited set of attributes (dimensions)...

In the 1980s, rows were small (actually the model was an 80 column punched card averaging 20 fields per record) and the number of entities (tables) was small (100 was large). Two- and three-way joins were the norm. Today, the number of attributes per table is in the hundreds with the most perverse having thousands of attributes. The number of tables in the database has climbed into the multiple thousands. Six- and seven-way joins are common; ten- and twelve-way joins are not extraordinary. As a result, searching is significantly more complex given the number of search arguments (attributes) and the number of relationships involved...

To address these challenges, it makes sense to design an inverted database where the emphasis is on the attribute lists and the relationships between entities. This is precisely what a columnar database does. The rest is details which will determine success: compressing for efficiency; linking lists for joins; time stamping data elements to provide historical detail, as well as alleviate pressure on loading and updating; adding all of the relational functionality; etc.

Only reading the columns you need. We see fact tables with anywhere from 40 to 200 attributes. Warehouse queries typically read five or fewer columns. A column database reads exactly the columns needed; a row store reads all of them. In round numbers, this is an order-of-magnitude performance penalty.
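A toy sketch (not Vertica's implementation, and with made-up table sizes) makes the column-pruning arithmetic concrete: a row store must touch every field of every row, while a column store touches only the columns the query names.

```python
import random

NUM_ROWS, NUM_COLS = 10_000, 40
QUERY_COLS = [0, 7, 19]                      # the query needs 3 of 40 columns

# Row store: a list of 40-field tuples.
row_store = [tuple(random.random() for _ in range(NUM_COLS))
             for _ in range(NUM_ROWS)]

# Column store: 40 separate lists, one per attribute.
col_store = [[row[c] for row in row_store] for c in range(NUM_COLS)]

# A full scan in the row store reads every value; the column store
# reads only the three projected columns.
row_values_read = NUM_ROWS * NUM_COLS
col_values_read = NUM_ROWS * len(QUERY_COLS)
print(row_values_read // col_values_read)    # → 13, i.e. ~13x fewer values touched
```

With 200 attributes and 5 projected columns, the same ratio lands at the "order of magnitude" figure cited above.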

Superior compression. Columnar compression is more effective than the schemes used by row stores because blocks read by column databases have only one data type in them (a portion of a column) while row stores have several data types (a collection of tuples). Compressing one data type is fundamentally easier than compressing several. Moreover, in Vertica's case, it does not store an explicit tuple identifier or space-wasting bit maps of non-null fields. Hence, we typically see columnar compression beating row-store compression by a factor of 2.
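To see why single-type blocks compress so well, here is a minimal sketch of run-length encoding, one scheme that column stores commonly favor (this is an illustration of the general idea, not Vertica's actual codec). The `state_column` example is invented.

```python
from itertools import groupby

def rle_encode(column):
    """Collapse runs of equal values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(column)]

# A sorted, low-cardinality column compresses dramatically: 1000 values
# become 3 pairs...
state_column = ["CA"] * 500 + ["NY"] * 300 + ["TX"] * 200
print(rle_encode(state_column))   # → [('CA', 500), ('NY', 300), ('TX', 200)]

# ...while a row-store block interleaves names, dates, and amounts, so
# no such runs exist and generic byte-level compression must do the work.
```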

Executor runs on compressed data. Row stores decompress a block when it is brought off the disk because they have legacy executors that deal with uncompressed data. In contrast, the Vertica executor runs on compressed data. This results in better L2 cache performance, copying fewer bytes of data, etc.

Inner loop is column-oriented, not row-oriented. A row-oriented query plan has an inner loop that picks up a tuple and does something with it. A column-oriented query plan has an inner loop that picks up a column and does something with it. In a fact table query, there might be 10^9 rows but only 5 relevant columns. Hence, the inner loop, with its inherent overhead, is executed far fewer times in a column store.
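The two executor styles can be sketched in a few lines, assuming a toy fact table; the point is where the per-iteration overhead is paid, once per tuple versus once per column.

```python
rows = [(i, i * 2.0, "x") for i in range(1000)]   # toy fact table
price_col = [r[1] for r in rows]                  # same data, columnar layout

# Row-oriented plan: the inner loop runs once per tuple (1000 iterations),
# paying tuple-handling overhead on every pass.
total_row = 0.0
for tup in rows:
    total_row += tup[1]

# Column-oriented plan: one bulk operation over a single column; the
# interpretation overhead is paid once for the whole column.
total_col = sum(price_col)

print(total_row == total_col)   # → True
```

Real column executors push this further with vectorized, cache-friendly primitives, but the loop-count asymmetry is the core of the argument.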

It's interesting. It seems like there could be some trade-offs involved. There are some operations where you really do need to operate on the whole row (insert/delete), and now that you have loosely associated columns, it may involve more processing power.

On the other hand, column-level locking might be more natural to implement.

i'll ask a question which will reveal how clueless i am about databases: is there a way one could take this to some extreme by not having either rows or columns at all? every datum would be (tableName,row,col,value) and would be in random access storage with no further explicit structure. then you'd layer all sorts of indices on top of the heap to get whatever kinds of performance you need out of it.

this is obviously a bad idea when it comes to the fundamental idea of caches because you have no locality guarantees. and this is obviously a bad idea from the storage requirements perspective since you have all these extra (tableName,row,col) bits around.

but it seems like choosing either row or column perspectives is an implementation detail that causes a possibly more abstract theory to degrade in some way (reads for rows, writes for columns). so could we start with an architecture that more closely approximates the theory (which i guess means that everything ends up sucking performance wise, rather than only half of everything) and then refine from there? ja, well, no, i guess. ;-)
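One way to make the (tableName,row,col,value) idea concrete, and to see where both the flexibility and the overhead live, is a sketch like the following. The heap, the single layered index, and the `emp` table are all illustrative, not any real system's design.

```python
heap = {}          # the unstructured datum heap: (table, row, col) -> value
col_index = {}     # one layered "index": (table, col) -> [(row, value), ...]

def put(table, row, col, value):
    """Store one datum and maintain the column index as a side effect."""
    heap[(table, row, col)] = value
    col_index.setdefault((table, col), []).append((row, value))

put("emp", 1, "name", "Ada")
put("emp", 2, "name", "Edgar")
put("emp", 1, "dept", "R&D")

# Reconstructing a row means probing the whole heap (no locality)...
row_1 = {c: v for (t, r, c), v in heap.items() if t == "emp" and r == 1}

# ...while a column read is cheap only because we paid to maintain the index.
names = [v for _, v in col_index[("emp", "name")]]

print(row_1)    # → {'name': 'Ada', 'dept': 'R&D'}
print(names)    # → ['Ada', 'Edgar']
```

The per-datum key tuple is exactly the storage overhead the comment above worries about, and each access pattern you want fast costs another layered index.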

> i'll ask a question which will reveal how clueless i am about
> databases: is there a way one could take this to some extreme by not
> having either rows or columns at all? every datum would be
> (tableName,row,col,value) and would be in random access storage with
> no further explicit structure. then you'd layer all sorts of indices
> on top of the heap to get whatever kinds of performance you need out
> of it.

I'm showing some ignorance as well, but it seems like your idea is close to what OODBMS's are.

> > i'll ask a question which will reveal how clueless i am about
> > databases: is there a way one could take this to some extreme by
> > not having either rows or columns at all? every datum would be
> > (tableName,row,col,value) and would be in random access storage
> > with no further explicit structure. then you'd layer all sorts of
> > indices on top of the heap to get whatever kinds of performance
> > you need out of it.
>
> I'm showing some ignorance as well, but it seems like your idea is
> close to what OODBMS's are.

Dr. Codd's point in defining the RDBMS (Relational Model, actually) was to put, what came to be known as, ACID functions *with* the data *not* myriad Applications. OO databases revert to the COBOL paradigm of putting the ACID functions back into myriad Applications. This is why database centric folk get hives listening to OO zealots; and why OO databases and OO transactional systems always [:)] fail.

Whether column oriented databases turn out to have performance advantages is an open question. Hardware, specifically multi-core multi-processor machines with solid state disks, will make this moot.

> Whether column oriented databases turn out to have performance
> advantages is an open question. Hardware, specifically multi-core
> multi-processor machines with solid state disks, will make this moot.

Can you back that prediction up with references or an explanation? A significant number of enterprise databases (if not the vast majority) are already running on multi-processor machines, and an increasing number of those are multi-core. With 64-bit architectures, the amount of accessible RAM has also increased dramatically.

It seems to me that what you are pointing to are just linear performance improvements. If the problem has quadratic complexity (as I believe these articles imply), then linear performance improvements are not a long-term solution.

> Dr. Codd's point in defining the RDBMS (Relational Model, actually)
> was to put, what came to be known as, ACID functions *with* the data
> *not* myriad Applications. OO databases revert to the COBOL paradigm
> of putting the ACID functions back into myriad Applications.

Please elaborate. Based upon my extensive knowledge of the ACID functions (I just looked them up on Wikipedia one minute ago) these functions are (I think, I haven't used db4o all that much) handled by the db4o library, not by my applications. There is only one place where they are handled.

I can't find the article, of course, but here is the quote I pulled out:

> ... but Flash-based storage has such a different performance profile
> from rotating media, that I suspect that it will end up having a
> large impact on filesystem design. Right now, most filesystems tend
> to be designed with the latencies of rotating media in mind.
>
> -- Linus Torvalds, 2007

It doesn't take much imagination to see that this also implies that the "joins are too slow" argument for OO/flat file/hierarchic approaches, and against contemporary SQL databases, goes away. The same goes for column stores. There's a thread on comp.databases.theory which touched on it. Mr. Celko allowed as how his next book talks about this.

As to ACID: if the attributes applied are calculated in Application clients, then all bets are off. Only if these are the product of RI in the database is ACID independent of the Applications. Having a surrogate key (object ID) and the like disposes of the integrity inherent in the RM. It *looks* like RI is enforced, but if the PKs are surrogate, nothing prevents duplication of real world attributes. Only the application, if at all, can determine whether two OIDs mean two distinct entities. This is the built in flaw of OO databases.

The one place where column stores can be helpful is concurrency: if I change row A, column 22, and you change row A, column 23 (and these are logically independent attributes), then a column store will allow non-conflicting update. This is a Good Thing if data is less than 5NF, e.g. there are lots of 'data' columns per PK. Even row-level locking in contemporary SQL databases fails here. Whether it is correct to have less-than-5NF datastores is another question.
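A minimal sketch of that concurrency point, assuming one lock per (row, column) rather than per row: two writers touching different columns of the same row never block each other. The names here are purely illustrative, not any product's locking scheme.

```python
import threading

table = {("A", 22): 0, ("A", 23): 0}   # one row, two independent columns
locks = {}                             # (row_id, column) -> lock, created lazily
_registry_lock = threading.Lock()      # protects the lock registry itself

def update(row_id, column, value):
    with _registry_lock:
        lock = locks.setdefault((row_id, column), threading.Lock())
    with lock:                         # blocks only writers of the SAME column
        table[(row_id, column)] = value

# Both writers target row A but different columns, so neither waits on
# the other's column lock; a per-row lock would serialize them.
t1 = threading.Thread(target=update, args=("A", 22, 7))
t2 = threading.Thread(target=update, args=("A", 23, 9))
t1.start(); t2.start(); t1.join(); t2.join()
print(table)   # → {('A', 22): 7, ('A', 23): 9}
```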

With (my term) hyperparallel CPUs and SS disk, the mantra of "one fact, one place, one time" can be implemented without (to within x%) performance penalty. The notion of iterative processing (i.e. COBOL and OO and XML) goes by the wayside. The intent of the RM is that data change is "all at once" (I think Codd actually said that); such machines make this nearly practical.

Prolog has/had a similar view. It emerged about the same time, and uses similar terminology.

> Please elaborate. Based upon my extensive knowledge of the ACID
> functions (I just looked them up on Wikipedia one minute ago) these
> functions are (I think, I haven't used db4o all that much) handled
> by the db4o library, not by my applications. There is only one place
> where they are handled.

The library can't handle this. To get it right, you have to specify on the code level which updates go together, assuming you have updates that have to be handled as a unit.
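A minimal sketch of that point: the grouping of updates into a unit has to be declared in application code; the library can only enforce a boundary the code draws. `accounts` and `transfer` are invented names, and the snapshot rollback is a naive stand-in for a real transaction log.

```python
accounts = {"alice": 100, "bob": 50}

def transfer(src, dst, amount):
    """The application declares that these two updates form one unit."""
    snapshot = dict(accounts)          # naive all-or-nothing mechanism
    try:
        accounts[src] -= amount
        if accounts[src] < 0:
            raise ValueError("insufficient funds")
        accounts[dst] += amount
    except Exception:
        accounts.clear()
        accounts.update(snapshot)      # roll back both updates together
        raise

transfer("alice", "bob", 30)
print(accounts)   # → {'alice': 70, 'bob': 80}
```

If the application instead issued the debit and the credit as two unrelated calls, no library underneath could know they belong together, which is the objection being made.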

Keep in mind that I work for a relational database vendor (Oracle), but I have seen a number of interesting specialized solutions. I'm not sure why "kx" didn't come up in this discussion, for example; they seem much more appropriate to the context than db4o ....

Maybe the "column oriented" thing in this article is just some marketing B.S., but assuming it's serious, what makes you certain that storing column/row is somehow going to relax the rules of SQL and relational data management compared to storing it row/column?

> The library can't handle this. To get it right, you have to specify
> on the code level which updates go together,

I'll admit I'm talking a little bit out of my job title, but if your Object is independent of others (it contains only primitives, immutables, etc.) then the Object is the group of updates that goes together, which is simple and obvious. The Object is the transaction.

If the object contains references to other objects then it's messy: you do some chaining, all wrapped in a transaction, right??? I imagine there are a lot of technical details and gotchas, but Hibernate has been doing this for years, so it must be fairly well defined.

Cameron - I mentioned db4o (and now Hibernate) solely because I have limited experience with them.

Kx makes kdb, which is a column oriented database. In my job, I get to see a large variety of systems in almost every major financial services firm, exchange, etc. While I don't see it often, the kx stuff is apparently particularly good at time series work, which can be useful for regression testing (in the financial sense, not coding ;-), pre-trade compliance, etc. Unfortunately, their web site appears to have been "dumbed down" by marketdroids .. it used to have a whole lot of really useful technical information.