For all practical purposes, there are no DBMS vendors left advocating single-server strategies.

Vendor agreement has become even stronger in the interim, as evidenced by Oracle/MySQL, IBM/Netezza, Oracle’s NoSQL dabblings, and various companies’ Hadoop offerings.

Multiple data stores for a single application

We commonly think of one data manager managing one or more databases, each in support of one or more applications. But the other way around works too; it’s normal for a single application to invoke multiple data stores. Indeed, all but the strictest relational bigots would likely agree:

It’s common and sensible to manage authentication and authorization data in its own data store. Commonly, the data format is LDAP (Lightweight Directory Access Protocol).

It’s common and sensible to manage the “content” and “e-commerce transaction records” aspects of websites separately.

Even beyond that case, there are often performance reasons to manage BLOBs (Binary Large OBjects) outside your relational database.

Internet “interaction” data is also often best managed outside an RDBMS, in part because of its very non-tabular data structures.

There also are cases in which applications dutifully get all their data via SQL queries, but send those queries to two or more DBMS. Teradata is proud that its systems can support rather transactional queries (for example in call-center use cases), but the same application may read from and write to a true OTLP database as well.

Further, many OLTP (OnLine Transaction Processing) applications do some fraction of their work via inbound or outbound messaging. Many buzzwords can come into play here, including but not limited to:

SOA (Service-Oriented Architecture). This is the most current and flexible one.

EAI (Enterprise Application Integration). This was a hot concept in the late 1990s, but was generally implemented with difficulties that SOA was later designed to alleviate.

Message-oriented middleware (MOM) and Publish/Subscribe. These are even older, and overlap greatly.

Finally, every dashboard that combines information from different data stores could be assigned to this category as well.

Multiple storage approaches in a single DBMS

In theory, a single DBMS could operate like two or more different ones glued together. A few functions should or must be centralized, such as administration, and communication with the outside world (connection handling, parsing, etc.). But data storage, query execution and so on could for the most part be performed by rather loosely coupled subsystems. And so you might have the best of both worlds — something that’s multiple data stores in the ways you want that diversity, but a single system in how it fits into your environment.

So will these trends succeed? The forgoing caveats notwithstanding, my answers are more Yes than No.

… multi-purpose DBMS will likely always have performance penalties, but over time the penalties should become small enough to be affordable in most cases.

…

Machine-generated data and “content” both call for multi-datatype DBMS. And taken together, those are a large fraction of the future of computing. Consequently …

… strong support for multiple datatypes and DMLs is a must for “general-purpose” RDBMS. Oracle and IBM [have] been working on that for 20 years already, with mixed success. I doubt they’ll get much further without a thorough rewrite, but rewrites happen; one of these decades they’re apt to get it right.

In 2005 I had been more ambivalent, in part because my model was a full 1990s-dream “universal” DBMS:

IBM, Oracle, and Microsoft have all worked out ways to have integrated query parsing and query optimization, while letting storage be more or less separate. More precisely, Oracle actually still sticks everything into one data store (hence the lack of native XML support), but allows near-infinite flexibility in how it is accessed. Microsoft has already had separate servers for tabular data, text, and MOLAP, although like Sybase, it doesn’t have general datatype extensibility that it can expose to customers, or exploit itself to provide a great variety of datatypes. IBM has had Oracle-like extensibility all along, although it hasn’t been quite as aggressive at exploiting it; now it’s introduced a separate-server option for XML.

That covers most of the waterfront, but I’d like to more explicitly acknowledge three trends:

Among other things, Hadoop is a collection of DBMS (HBase, Impala, et al.) that in some cases are very loosely coupled to each other. The question is less how well the various data stores work together, and more how mature any one of them is on its own.

The multiple-data-models idea has been extended into schema-on-need, which is sometimes but not always housed in Hadoop.

Even on the relational side, multiple storage capabilities exist in one product.

Vertica was designed that way from the get-go. (Like the old joke about police duos, one is to read and one is to write.)

IBM, Microsoft and Oracle have all recently added some kind of in-memory columnar capability.

Comments

I agree, that especially in the world of big data there is no silver bullet. We have number of “full table scan” engines like Impala, Hive, Presto, etc as well as “index access” engines also called NoSQLs (too much to list). Each one in both families has its own strength and weaknesses.
In the same time duplicating (or triplicating) data among engines is expensive.
I think we will get to the multiple engine – single data paradigm, where different engines are looking into data stored in some open format. It is problematic for mutating data, but IMHO it is feasible for read-only data…