Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community.

Is blockchain a potentially viable database solution for modern, high transaction volume applications?

It is pretty easy to see its value for low-volume transactions like personal medical records, but what about high-volume databases?

What is blockchain?

Blockchains rely on cryptography to allow a set of computers to make changes to a global record without needing a central actor.

Removing the middleman cuts costs in almost every sector.

The blockchain is a ledger that records everything that happens to a collection of data known as a "block" in a chronological order or "chain".

As a currency this is an important feature because it allows users to be sure their digital money is one of a kind, the same way each note in your wallet is unique.

"Blockchain tech will be the way we create assets because it allows you to transfer digital information without copying," says Adam Ludwin, chief executive of Chain.com, which builds blockchain networks.

Blockchain can be used to track the history of all sorts of information and maintain its value, so, for example, doctors could use it to update medical records.

Since each change to a blockchain is made simultaneously across the whole network, no information is lost and because changes cannot be undone the system maintains its transparency. A special key is needed to make changes to each block, so individuals can keep their records safe by protecting that key.
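To make the "chain" idea concrete, here is a minimal sketch of a hash-linked ledger. It is only an illustration: the function names (`block_hash`, `append_block`, `is_valid`) are made up for this example, and it leaves out networking, consensus, and the signing keys mentioned above. Each block stores the hash of the previous one, so editing an old record breaks every later link.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, prev_hash, data):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "data": data},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Append a new block that commits to the hash of the block before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append(
        {
            "index": index,
            "prev_hash": prev_hash,
            "data": data,
            "hash": block_hash(index, prev_hash, data),
        }
    )


def is_valid(chain):
    """Recompute every hash; an edit to any earlier block is detected."""
    prev_hash = GENESIS_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["prev_hash"], block["data"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


ledger = []
append_block(ledger, "A pays B 42")
append_block(ledger, "B pays C 10")
print(is_valid(ledger))  # True

ledger[0]["data"] = "A pays B 4200"  # tamper with history
print(is_valid(ledger))  # False
```

In a real network the other participants would reject the tampered copy because its hashes no longer match theirs.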

Now, how many computers around the world are responsible for keeping the bitcoin database? I'm no expert on bitcoin, but I think the complete history of transactions is stored in the blockchain, so all computers that participate in the bitcoin network essentially keep a copy of the entire database (the transactions part, of course, not the account info and secret keys, which are kept in personal wallets).

We can only estimate how many there are, but I'd guess more than a million. 300K transactions a day across a million computers does not sound like high volume. And 8 minutes for confirmation?

A modern RDBMS on decent hardware can easily reach 1K transactions per second. That's about 86M transactions per day. The confirmation time? That depends on the size of the transaction (how many tables and rows it affects), but for a small transaction of the bitcoin type (remove 42 coins from account A and add 42 coins to account B), it will be milliseconds.
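For reference, the "remove 42 coins from A, add 42 to B" transaction looks roughly like this in an RDBMS. This is a sketch using Python's built-in `sqlite3` module with an in-memory database; the `accounts` table and the `transfer` helper are assumptions made for the example, and SQLite stands in for whatever engine you actually run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one atomic transaction."""
    # `with conn` commits if the block succeeds and rolls back on an exception,
    # e.g. when the CHECK constraint rejects an overdraft.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )


transfer(conn, "A", "B", 42)
print(dict(conn.execute("SELECT name, balance FROM accounts")))
```

Both updates either commit together or not at all, and the whole thing is decided locally by one server, which is why it completes so quickly.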

In conclusion, the difference today ranges from a few hundred-fold in daily volume to somewhere between 1,000 and 100,000-fold in confirmation time.
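The figures quoted above can be checked with a few lines of arithmetic. The RDBMS commit times of 5 ms and 500 ms are my own assumed bounds for "milliseconds", not measurements; everything else comes from the numbers in this answer.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted in the answer
rdbms_tps = 1_000                  # transactions per second
bitcoin_tx_per_day = 300_000       # transactions per day
bitcoin_confirm_ms = 8 * 60 * 1000 # 8 minutes, in milliseconds

# Assumed bounds for an RDBMS commit (not measured)
rdbms_commit_fast_ms = 5
rdbms_commit_slow_ms = 500

rdbms_tx_per_day = rdbms_tps * SECONDS_PER_DAY
volume_ratio = rdbms_tx_per_day / bitcoin_tx_per_day
latency_ratio_high = bitcoin_confirm_ms / rdbms_commit_fast_ms
latency_ratio_low = bitcoin_confirm_ms / rdbms_commit_slow_ms

print(rdbms_tx_per_day)    # 86400000
print(volume_ratio)        # 288.0
print(latency_ratio_low)   # 960.0
print(latency_ratio_high)  # 96000.0
```

So with these inputs the volume gap is a few hundred-fold, and the latency gap is where the 1,000 to 100,000-fold range shows up.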

If blockchain technology solves this issue in the future, it might become usable for medium- or high-volume applications. We can read discussions and suggestions for how the problem should be solved (many of the companies mentioned in the links are actually working on these issues), but we have not yet seen an actual working solution or product that offers high volume and speed.

Another issue I have with the blockchain is it's bloody inconsistent. It all depends on load, and the clients that process the transactions aren't 'dedicated', so you could see a bunch drop out or get added. 8 minutes sounds about right; maybe the final 10 minutes was an extra minute or two for the approval to reach all the clients? Not sure. Who knows, with more nodes maybe it's gone down! Either way, great links. Thanks.
– Ali Razeghi May 6 '16 at 22:33

I'm very familiar with cryptocurrency and databases, and I can tell you it's not a great DB engine at all.

Using the blockchain as a live database:

As far as the blockchain goes, think of it as first normal form without any really good built-in search capability or indexing. Basically, it's an Excel sheet without any computation capabilities that just gives you 'read/write' access with lots of verification and validation. A blockchain is a great way to validate that your data is sanitized and correct before you put it in a database, which lets you query it differently, index it, etc.

Benefits of the blockchain:

The blockchain in this case is purely a ledger and an API for PUT and GET requests. That's about it. The blockchain is interesting because a majority of nodes must accept a transaction as valid, and there are no rollbacks: once it's committed, it's committed. Thus, if someone tries to put in a fake transaction, it will be caught unless the person doing it controls a pool with a strong majority share, in which case they can validate it in their pool before anyone can reject it. That is the strong point of the blockchain: verification that the data is accurate. It is also typically pretty slow. You're looking at about 10 minutes under normal load for a transaction to get validated. Under heavy load the time goes up quite a bit.
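A toy sketch of the majority rule described above, heavily simplified: every validator here gets one equal vote, whereas real networks weight influence by mining power or similar, and all the names (`honest_validator`, `colluding_validator`, `accepted_by_majority`) are made up for the illustration.

```python
from typing import Callable, Dict, List

Transaction = Dict[str, object]
Validator = Callable[[Transaction], bool]


def honest_validator(balances: Dict[str, int]) -> Validator:
    """Build a validator that checks the sender can actually afford the amount."""

    def validate(tx: Transaction) -> bool:
        return balances.get(tx["sender"], 0) >= tx["amount"] > 0

    return validate


def colluding_validator(tx: Transaction) -> bool:
    """A dishonest validator that approves anything."""
    return True


def accepted_by_majority(tx: Transaction, validators: List[Validator]) -> bool:
    """Accept the transaction only if strictly more than half approve it."""
    approvals = sum(1 for validate in validators if validate(tx))
    return approvals > len(validators) / 2


balances = {"A": 50}
fake_tx = {"sender": "A", "receiver": "B", "amount": 500}  # A only has 50

honest_network = [honest_validator(balances) for _ in range(5)]
print(accepted_by_majority(fake_tx, honest_network))  # False

captured_network = [honest_validator(balances)] * 2 + [colluding_validator] * 3
print(accepted_by_majority(fake_tx, captured_network))  # True
```

The second case is the "strong majority share" scenario: once the colluding side outvotes the honest side, the fake transaction passes.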

After you have used the blockchain to validate that the transactions are not fraudulent, you can import that data into a database and work with it however you like. I have some experience with this; note that every single transaction on the current bitcoin architecture is recorded, so there is some interesting information to analyze.

As far as what DBMS you should put it in, that's up to your use case. If you want to analyze the transactions/wallet IDs to see some patterns or do B.I. work, I would recommend a relational DB. If you want to set up a live ingest with multiple cryptocoins, I would recommend something that doesn't need the transaction log, so a MongoDB solution would be good. I don't think you need to worry about Elasticsearch unless you want to start doing live recording of all cryptocoins at the same time and will use it to do auto trading or something equally crazy. :)
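As a rough sketch of the relational route for wallet analysis: load the already-validated transactions into a table, index the column you filter or group on, and query away. The schema, the sample rows, and the `totals_by_sender` helper are all invented for illustration, and `sqlite3` is used only because it ships with Python.

```python
import sqlite3

# Hypothetical, already-validated transactions: (txid, sender, receiver, amount)
validated_transactions = [
    ("tx1", "wallet_a", "wallet_b", 42),
    ("tx2", "wallet_a", "wallet_c", 8),
    ("tx3", "wallet_b", "wallet_c", 10),
    ("tx4", "wallet_c", "wallet_a", 5),
]

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE transactions ("
    " txid TEXT PRIMARY KEY,"
    " sender TEXT NOT NULL,"
    " receiver TEXT NOT NULL,"
    " amount INTEGER NOT NULL)"
)
# The index is what the raw chain lacks: fast lookups by wallet.
db.execute("CREATE INDEX idx_transactions_sender ON transactions (sender)")
db.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", validated_transactions)
db.commit()


def totals_by_sender(db):
    """Return (sender, transaction count, total sent), largest total first."""
    return db.execute(
        "SELECT sender, COUNT(*), SUM(amount)"
        " FROM transactions"
        " GROUP BY sender"
        " ORDER BY SUM(amount) DESC"
    ).fetchall()


for row in totals_by_sender(db):
    print(row)
```

The same table can be joined, filtered by date, or fed to a reporting tool, none of which the chain itself gives you.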

In 2014 we built ascribe.io with the premise of using Bitcoin as a database for Intellectual Property claims. Upon release, we plugged the network because it couldn't handle the throughput; latency was at least 10 minutes, and we were limited by what we could put into the OP_RETURN, which forced us to store the actual digital file relating to the claim in Amazon S3. We realized that Bitcoin in its current form could never be a high-transaction database.
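A common way to live with a tiny on-chain payload, sketched here as a general pattern and not as a description of what ascribe actually did, is to put only a fixed-size fingerprint of the file on the chain and keep the file itself in external storage. The 80-byte limit, the `CLAIM` prefix, and both function names are assumptions for this example; the real OP_RETURN limit has differed between Bitcoin software versions.

```python
import hashlib

OP_RETURN_MAX_BYTES = 80  # assumed limit; check the rules of your target network
CLAIM_PREFIX = b"CLAIM"   # hypothetical application marker


def claim_payload(file_bytes: bytes) -> bytes:
    """Build a small on-chain payload: a marker plus the file's SHA-256 digest."""
    digest = hashlib.sha256(file_bytes).digest()  # always 32 bytes
    payload = CLAIM_PREFIX + digest
    if len(payload) > OP_RETURN_MAX_BYTES:
        raise ValueError("payload does not fit in OP_RETURN")
    return payload


def matches_claim(file_bytes: bytes, payload: bytes) -> bool:
    """Check that a file fetched from external storage matches the on-chain payload."""
    return payload == CLAIM_PREFIX + hashlib.sha256(file_bytes).digest()


artwork = b"the actual digital file, stored off-chain"
payload = claim_payload(artwork)
print(len(payload))                                # 37
print(matches_claim(artwork, payload))             # True
print(matches_claim(b"a different file", payload)) # False
```

The chain then proves which file was claimed and when, while the bulk storage problem is handled elsewhere.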

But the idea of a blockchain-style database, with decentralized control, immutability (tamper-resistance) and live assets on the network, stuck with us. So in mid-2014, we started working on BigchainDB.

Long story short: we can process 100k tps with 100 ms latency and have petabytes of capacity. The code is on our BigchainDB GitHub, the technical documentation is here, and the foundational thinking is in our whitepaper.

If you have a use case for a high-transaction, decentralized database - we built BigchainDB exactly for this.

Blockchain derived from Bitcoin is slow and expensive; the amount of data that can be stored in a block is very modest. The mechanisms behind blockchains (distributed ledgers) are intended to provide an incorruptible, highly replicated data store; peer-to-peer is less an essential feature than a "political requirement" to avoid the appearance of central control. I have been working for some 18 months to produce a high-performance distributed ledger (see metrognomo.com for one instantiation) that takes as little from Bitcoin as possible. In the end, though, a distributed ledger looks pretty much like a sequential file that can be added to but not edited after addition. This is a valuable thing for some applications, but not what most people think of as a database.