Adding Metadata to the Blockchain, part 1

The blockchain’s rise in popularity, especially over the past year, has mainly been driven by companies realizing that blockchains can be used for many more purposes than digital currency transactions. This way of using the blockchain is referred to as Bitcoin 2.0, Crypto 2.0, or Blockchain 2.0. The possibilities exploded when the OP_RETURN field was added in Bitcoin Core version 0.9.0 in March 2014. The OP_RETURN field might not appear very interesting at first glance, as it comes down to a comment field on a Bitcoin transaction that can contain just 40 bytes of data. But a closer look reveals that this field can be made to reference a virtually limitless amount of data, and even to represent digital assets.

The process of representing real-world assets on top of the Bitcoin blockchain became known as “coloring coins”. Colored Coins may represent anything from financial assets such as shares and bonds to physical assets such as a house or a car. NASDAQ uses this concept on the Bitcoin blockchain for its blockchain-powered private market Linq, where it issues private shares via colored coins. Online retailer Overstock.com has even received permission from the Securities and Exchange Commission (SEC) to issue publicly traded shares via blockchain technology, potentially revolutionizing Wall Street.

The Basic Concept

It is important to realize that in the previously discussed use cases no data files are physically stored on the blockchain. The basic method to put (large amounts of) metadata on the blockchain, while retaining privacy, involves creating a hash to reference the file that stores the actual data.

To understand how this works, it helps to first consider how a typical website handles login data. A user creates an account name and a password, and the password is subsequently stored in a database. Because it is unsafe to store a plain-text password, a hash function (MD5, SHA1, SHA256, etc.) is used to convert the password to a hash. For example, using MD5 on the password “12345” results in the following hash: 827ccb0eea8a706c4c34a16891f84e7b. Only this generated hash is stored in the database alongside the account name; the original password is discarded. Whenever a user tries to log in, the submitted password is hashed in the same way and compared to the stored hash for the given username. The login is successful when the two hashes match.

If an independent party had access to the same database, it could not compute the original password (12345) from the stored hash (827ccb0eea8a706c4c34a16891f84e7b), because a hash function only works in one direction. Should the user, however, send the password to that independent party, then the party can hash it as well and confirm that the result is already stored in the database. In short, the password “12345” can be verified by any third party if required.
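The login and third-party check described above can be sketched in a few lines of Python. This is only an illustration: the `database` dictionary, the account name `alice`, and the function names are hypothetical, and MD5 is used purely to reproduce the hash quoted in the text. Real systems would normally use a salted, deliberately slow password hash instead.

```python
import hashlib

def hash_password(password: str) -> str:
    """Return the MD5 hex digest of a password (MD5 only to mirror the example)."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()

# Hypothetical stand-in for the website's database: account name -> stored hash.
# The plain-text password itself is never stored.
database = {"alice": hash_password("12345")}

def login(username: str, password: str) -> bool:
    """Hash the submitted password and compare it with the stored hash."""
    stored = database.get(username)
    return stored is not None and stored == hash_password(password)

def third_party_check(claimed_password: str, stored_hash: str) -> bool:
    """Anyone who is handed the password can confirm it matches a stored hash."""
    return hash_password(claimed_password) == stored_hash

print(database["alice"])                              # the stored hash
print(login("alice", "12345"))                        # correct password
print(login("alice", "54321"))                        # wrong password
print(third_party_check("12345", database["alice"]))  # independent verification
```

Note that the third party never needs write access to the database; reading the stored hash and receiving the password is enough to verify the match.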

Coloring Coins

Now let’s take a leap to the creation of a digital asset on the blockchain. Bank ABC decides to issue 1 company share via the blockchain. In order to “color” a coin, ABC first creates an MD5 hash of the text “1 company share Bank ABC ID#001”, resulting in “51e0cb7ee2403c8fa4a5c069486e0ff2”. If investor Alice buys this share, ABC can simply post a transaction to Alice’s wallet on the Bitcoin blockchain that contains the comment “51e0cb7ee2403c8fa4a5c069486e0ff2”, and send the corresponding original text to Alice to complete the transfer. What ABC has done is roughly the equivalent of creating a password, hashing it, storing the hash in a database, and sharing the password with Alice, in line with the previous example.

Since Alice receives the original information “1 company share Bank ABC ID#001”, she can verify that it produces a hash that matches the posted hash. Changing even a single character would result in a different hash, so Alice would easily find out if ABC failed to post the correct transaction. If Alice now goes to an independent third party, Bob, he can verify several things. The first is that there has been a transaction between ABC and Alice. The second is that this transaction contains a reference to a specific asset, provided that either Alice or ABC gives him the original information. Bob would subsequently conclude that Alice is now the owner of this asset, as Alice is the final recipient. Any external party can of course also see the hash on the blockchain, but will not be able to derive its meaning.
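Alice’s and Bob’s check comes down to recomputing the hash and comparing it with the one in the transaction comment. The sketch below assumes the asset description is hashed as UTF-8 text; the function names are made up for illustration. The digest depends on the exact bytes that are hashed (encoding, quote characters, trailing whitespace), so the posted value is computed here rather than hard-coded.

```python
import hashlib

def asset_hash(description: str) -> str:
    """MD5 hex digest of an asset description (assumed to be UTF-8 text)."""
    return hashlib.md5(description.encode("utf-8")).hexdigest()

def matches_posted_hash(description: str, posted_hash: str) -> bool:
    """Recompute the hash and compare it with the one seen on the blockchain."""
    return asset_hash(description) == posted_hash.lower()

# ABC's side: hash the description and use the result as the transaction comment.
description = "1 company share Bank ABC ID#001"
posted = asset_hash(description)

# Alice's (or Bob's) side: verify the description received from ABC.
print(matches_posted_hash(description, posted))

# A single changed character no longer matches the posted hash.
print(matches_posted_hash("1 company share Bank ABC ID#002", posted))
```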

Algorithm and Alternative Use Cases

The nice thing about hash functions is that their output always has the same length, regardless of the size of the input. An MD5 hash, as used in the example, is always 32 hexadecimal characters long (16 bytes). Bitcoin itself uses SHA256, and a SHA256 hash can conveniently be placed in the OP_RETURN field as well. A SHA256 hash is always 64 hexadecimal characters long (32 bytes), and will easily fit in the limited (40 bytes) comment field.
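The sizes are easy to check, and they show one detail worth noting: it is the raw 32-byte digest that fits in 40 bytes, not its 64-character hexadecimal text. The sketch below also builds the output script bytes for such a payload, using the OP_RETURN opcode value (0x6a) followed by a single-byte length push, which is how Bitcoin script encodes pushes of 1 to 75 bytes. The helper name and the 40-byte constant are illustrative assumptions taken from the limit mentioned in this article, and nothing here creates or broadcasts a transaction.

```python
import hashlib

OP_RETURN = 0x6A         # Bitcoin script opcode for OP_RETURN
MAX_PAYLOAD_BYTES = 40   # payload limit as described in this article

def op_return_script(payload: bytes) -> bytes:
    """Build the script bytes: OP_RETURN, a one-byte push length, then the payload."""
    if not 1 <= len(payload) <= MAX_PAYLOAD_BYTES:
        raise ValueError("payload must be between 1 and 40 bytes")
    # For pushes of 1..75 bytes the push opcode is simply the length itself.
    return bytes([OP_RETURN, len(payload)]) + payload

data = b"1 company share Bank ABC ID#001"
md5_digest = hashlib.md5(data).digest()
sha256_digest = hashlib.sha256(data).digest()

print(len(md5_digest), len(md5_digest.hex()))        # 16 bytes, 32 hex characters
print(len(sha256_digest), len(sha256_digest.hex()))  # 32 bytes, 64 hex characters

script = op_return_script(sha256_digest)
print(script.hex())
```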

Another nice thing is that a transaction that uses the OP_RETURN field does not have to indicate a transfer of ownership. The meaning depends on the actual content, which could simply be a certification of a certain document (a claim of ownership). Proof of Existence, for example, allows users to hash their documents and post the hash on the blockchain to prove that the content existed at a certain point in time. The hash can be posted publicly, because it does not reveal the actual content. As users hold the data that produces the hash, that data can be revealed and verified in case a conflict arises.
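The document-certification idea can be sketched as follows. This is not the code of the Proof of Existence service, only a minimal illustration of the principle under the assumption that SHA256 is used; the function names and the sample document contents are hypothetical. The document is read in chunks so that large files do not need to fit in memory.

```python
import hashlib
import io
from typing import BinaryIO

def document_digest(stream: BinaryIO, chunk_size: int = 65536) -> str:
    """Return the SHA256 hex digest of a document, read in chunks."""
    hasher = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
    return hasher.hexdigest()

def matches_published(stream: BinaryIO, published_digest: str) -> bool:
    """Check a revealed document against a digest that was posted earlier."""
    return document_digest(stream) == published_digest.lower()

# Step 1: hash the document and publish only the digest (e.g. in OP_RETURN).
original = io.BytesIO(b"Draft contract, version 1")
published = document_digest(original)

# Step 2: in a dispute, reveal the document so that anyone can re-hash it.
print(matches_published(io.BytesIO(b"Draft contract, version 1"), published))

# A modified document does not match the published digest.
print(matches_published(io.BytesIO(b"Draft contract, version 2"), published))
```

With a real file, the same functions would be called on a file opened in binary mode, for example `open(path, "rb")`.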

With many financial and non-financial use cases still being explored or in active development, many more examples will undoubtedly appear over time. How to use the OP_RETURN field yourself, whether to develop a new use case or simply to attach a random message to a transaction, will be discussed in the next part.

The next part will be referenced here when available, and will include more technical details on how to actually use the OP_RETURN field.