As human beings, we get used to the limitations of the technologies we use and over time forget how fundamental some of these limitations are.

As a database administrator in the early 1990s, I remember the shock I felt when I realized that the contents of the database files were plain text; I’d just assumed they were encrypted and could only be modified by the database engine acting on behalf of a validated user. But I got used to it.

I also got used to the idea that the contents of a database were pretty much whatever I, the DBA, said they were. Rudimentary audit logs could be put in place to track activity, but as DBA I could easily disable the audit logs and tamper with any database if I so desired.

I think it’s obvious to all of us that this is not the way it should be – contents of production databases should be trustworthy. We should know that a DBA, hacker, or privileged user has not tampered with the contents of the database. However, until recently we lacked the technology to ensure this.

The emergence of a tamper-proof distributed ledger in the form of the Blockchain now promises to give us a mechanism to at least “seal” database records. We can’t necessarily stop a hacker or malicious insider from breaking the seal, but we can at least know if the seal has been broken.

In this post, I’ll show how to implement a simple Blockchain seal for MongoDB. We’ll record a hash value corresponding to a set of documents in a database. As long as a freshly computed hash of those documents matches the recorded value, we can be confident that the database records have not been tampered with. The hash value is stored on the Blockchain so that we can know with certainty that a particular hash value was in effect at a specific point in time.

Setup

I’m using the Tierion service to handle the Blockchain proof of existence processing – you can apply for a free Tierion account at tierion.com.

I’m going to step through just the basic steps in this post – you can find the full source code on GitHub.

We set up a connection to Tierion using our username and password. This returns a token which we can examine, but most importantly for our purposes, it initializes the hashClient that we’ll use in subsequent calls.
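A minimal sketch of the connection step might look like the following. It assumes that the Tierion client object exposes an `authenticate(username, password, callback)` method using the usual Node `(err, result)` callback convention; the client is passed in rather than constructed here, and the `connect` wrapper is a hypothetical name used only for illustration.

```javascript
const { promisify } = require('util');

// `hashClient` is an instance of the Tierion client library, created elsewhere.
// ASSUMPTION: it exposes authenticate(username, password, callback) with a
// Node-style (err, token) callback. Check the client documentation before relying on this.
async function connect(hashClient, username, password) {
  const authenticate = promisify(hashClient.authenticate).bind(hashClient);
  const token = await authenticate(username, password);
  // The token can be inspected, but the important side effect is that the
  // client is now authenticated for subsequent calls.
  return token;
}
```

Passing the client in as an argument keeps the credentials handling in one place and makes the function easy to exercise with a stub.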

Hashing Documents

We generate a hash for a set of MongoDB documents using the crypto package. The hashing function takes db, collection, filter, and projection arguments that determine which documents are returned, and generates a hash digest for those documents. Should anyone later alter the documents, a freshly computed hash will no longer match.
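The hashing step can be sketched roughly as below. This is not the original listing: the choice of SHA-256, the use of `JSON.stringify` for serialization, and the function names `hashDocuments` and `hashCollection` are all assumptions made for illustration. The `db` argument is expected to be a connected MongoDB Node.js driver database object.

```javascript
const crypto = require('crypto');

// Hash an array of documents and return the digest as a hex string.
// ASSUMPTION: SHA-256 over JSON.stringify output. The array order and each
// document's key order must therefore be stable from one run to the next.
function hashDocuments(docs) {
  const hash = crypto.createHash('sha256');
  for (const doc of docs) {
    hash.update(JSON.stringify(doc));
  }
  return hash.digest('hex');
}

// Fetch the documents selected by filter and projection, then hash them.
// Sorting by _id keeps the document order deterministic.
async function hashCollection(db, collection, filter, projection) {
  const docs = await db
    .collection(collection)
    .find(filter)
    .project(projection)
    .sort({ _id: 1 })
    .toArray();
  return hashDocuments(docs);
}
```

Because the digest depends on the exact serialized bytes, the same filter, projection, and sort order must be used every time the hash is recomputed.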

We submit the hash to Tierion using the submitHashItem call. This gives us a receipt id.
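As a rough illustration of the submission step, the sketch below wraps `submitHashItem` in a promise. It assumes the method takes `(hash, callback)` and calls back with an object that carries a `receiptId` property; the shape of that result is not shown in this post, so treat it as an assumption. The `submitHash` wrapper is a hypothetical name.

```javascript
const { promisify } = require('util');

// Submit a hex-encoded hash and resolve with the receipt id.
// ASSUMPTION: submitHashItem(hash, callback) calls back with (err, result),
// where result.receiptId identifies the pending receipt.
async function submitHash(hashClient, hash) {
  const submit = promisify(hashClient.submitHashItem).bind(hashClient);
  const result = await submit(hash);
  return result.receiptId;
}
```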

We check the status of the receiptId periodically using the getReceipt call. It may take as long as 10 minutes for the receipt to be confirmed on the blockchain, so we poll every 30 seconds.
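The polling loop can be sketched independently of the Tierion client. In the version below, `attempt` is any function that returns a promise for the receipt; the helper name `pollForReceipt`, the 15-minute timeout, and the decision to treat a rejected attempt as "not ready yet" are assumptions for illustration only.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Call `attempt` repeatedly until it resolves with a non-null receipt,
// waiting `intervalMs` between tries and giving up after `timeoutMs`.
// ASSUMPTION: a rejection from `attempt` means the receipt is not ready yet.
async function pollForReceipt(attempt, intervalMs = 30 * 1000, timeoutMs = 15 * 60 * 1000) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    try {
      const receipt = await attempt();
      if (receipt != null) return receipt;
    } catch (err) {
      // Not ready yet: fall through and retry after the interval.
    }
    if (Date.now() + intervalMs > deadline) {
      throw new Error('Timed out waiting for the blockchain receipt');
    }
    await sleep(intervalMs);
  }
}
```

In practice `attempt` would be a small function that wraps the getReceipt call for a specific receipt id. The explicit timeout prevents the program from polling forever if the receipt never arrives.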

Once we get the updated receipt, we store it in the database record.
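One way to store the receipt is to upsert a control record keyed by the query that produced the hash. The sketch below is an assumption about how such a record might be laid out: the collection name `blockchainSeals`, the field names, and both function names are hypothetical. It uses the MongoDB Node.js driver's `updateOne` with the `upsert` option.

```javascript
// Build the control record that ties a query to its hash and receipt.
// The filter and projection are stored as JSON strings so that query
// operators such as $gt do not end up as field names in the stored record.
function buildSealRecord(collection, filter, projection, hash, receipt) {
  return {
    collection,
    filter: JSON.stringify(filter),
    projection: JSON.stringify(projection),
    hash,
    receipt,
    sealedAt: new Date()
  };
}

// Insert or replace the seal for this collection/filter/projection combination.
// `db` is a connected MongoDB driver database object.
async function storeSeal(db, record, controlCollection = 'blockchainSeals') {
  await db.collection(controlCollection).updateOne(
    { collection: record.collection, filter: record.filter, projection: record.projection },
    { $set: record },
    { upsert: true }
  );
}
```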

Validating an Existing Hash

If we want to confirm that the database documents have not been tampered with, we call the checkHash function with the same query filters that we originally used to create the hash. This recomputes the hash, checks that it still matches the value stored in the database control table, and retrieves the stored receipt that links that hash to the blockchain. For demonstration purposes, the sample program validates the hash immediately.
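The core of the comparison can be sketched as a small pure function. This is a simplified stand-in rather than the actual checkHash from the sample program: `verifySeal` and the `seal.hash` and `seal.receipt` field names are hypothetical, and `hashFn` must be the same hashing function that was used when the seal was created.

```javascript
// Compare the hash recorded at sealing time with a hash of the documents
// as they are now. `seal` is the stored control record.
function verifySeal(seal, currentDocs, hashFn) {
  const actual = hashFn(currentDocs);
  return {
    valid: actual === seal.hash, // false means the documents have changed
    expected: seal.hash,
    actual,
    receipt: seal.receipt        // used to look up the blockchain transaction
  };
}
```

A `valid` result only shows that the documents match the stored hash. The blockchain receipt is what shows that the stored hash itself has not been replaced since it was recorded.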

If we go to the blockchain.info page we can check that the blockchain transaction id is what we expect. The data at blockchain.info is proof that the database document hashes were valid at the time at which the block was added to the blockchain. If the hash value is the same now as it was then we can be confident that the documents have not been changed.

Conclusion

I believe that, eventually, all critical database systems will need to be equipped with built-in mechanisms to “seal” database records on the blockchain. But for now, we can get some of the way there using existing APIs together with a bit of duct tape and JavaScript code.