Every transaction has a hash associated with it. In a block, all of
the transaction hashes in the block are themselves hashed (sometimes
several times -- the exact process is complex), and the result is the
Merkle root. In other words, the Merkle root is the hash of all the
hashes of all the transactions in the block. The Merkle root is
included in the block header. With this scheme, it is possible to
securely verify that a transaction has been accepted by the network
(and get the number of confirmations) by downloading just the tiny
block headers and Merkle tree -- downloading the entire block chain is
unnecessary. This feature is currently not used in Bitcoin, but it
will be in the future.

How can you check if a transaction has been verified only using Merkle roots? How does that mechanism work?

While I could grasp the definition of Merkle Tree and Root immediately, I struggled to figure out the larger context and their use, like many posts on this thread, until I did a bit more research. I try to explain a scenario here.
– RT Denver, Mar 27 '18 at 22:46

6 Answers

The idea (as I understand it) is that the Merkle tree lets you verify transactions as needed without including the body of every transaction in the block header, while still providing a way to verify the entire blockchain (and therefore the proof of work) for every transaction.

To understand this, first understand the concept of a tree. Consider an 8 transaction block. Imagine each of those 8 transactions at the base of a pyramid: these are called leaves. Put four "branches" on the second tier of the pyramid and draw two lines from each of them to the leaves so that each branch has two leaves attached to it. Now join those four branches to two branches on pyramid level 3 and up to one branch (what is called the root of the tree) on the top of the pyramid. (Our tree is growing upside down in this example.)

Now we can start to understand the hashing process. Concatenate the hashes of each pair of "leaves", hash the result, and store it in the second-level branch that those two leaves are attached to (these are called child nodes and parent nodes). Then hash each pair of second-level hashes in the same way to get the third-level branches. And so on. (And if you had more than 8 transactions, all you need are more levels to the pyramid.)

So now you have a root node that effectively has a hash that verifies the integrity of all of the transactions. If one transaction is added/removed or changed it will change the hash of its parent. Which will change the hash of its parent, and so on, resulting in the root node's hash (which is the Merkle root) changing as well.
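To make the pyramid concrete, here is a minimal Python sketch of the construction described above. It assumes double SHA-256 for every node and duplicates the last hash when a level has an odd number of entries, which is how Bitcoin handles it; the transaction bytes are made-up placeholders, and Bitcoin's byte-order conventions for displaying hashes are left out.

```python
import hashlib

def dsha256(data: bytes) -> bytes:
    """Double SHA-256, the hash Bitcoin uses for transactions and tree nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(leaf_hashes):
    """Fold a list of 32-byte leaf hashes up to a single root hash."""
    if not leaf_hashes:
        raise ValueError("need at least one leaf hash")
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        # Each parent is the hash of its two children concatenated.
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# Eight placeholder "transactions" (hypothetical data, not real Bitcoin transactions).
txs = [f"tx {i}".encode() for i in range(8)]
root = merkle_root([dsha256(tx) for tx in txs])

# Changing a single transaction changes the root.
tampered = list(txs)
tampered[3] = b"tx 3 (altered)"
tampered_root = merkle_root([dsha256(tx) for tx in tampered])

print(root.hex())
print(root != tampered_root)
```

The second `print` shows the rippling effect: one altered leaf changes every hash on its way up to the root.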

So how does this help us with potentially not having to have the entire blockchain? Because we could verify the transactions as needed. If we have a transaction that claims to have been from block #234133 we can get the transactions for that block, verify the Merkle tree, and know that the transaction is valid. We can do that without necessarily knowing all of the transactions from #234132 or #234134 because we know that the blocks are tamper proof.

Even better, if we know where it is in the Merkle tree and we know the hashes of the branches we don't even need all of the transactions from #234132. (There were 868 in that block.) We start with just our transaction and its sibling (if it has one) and calculate the hash of those two and verify that it matches the expected value. From that we can ask for the sibling branch of that and calculate the hash of that and verify it. And continue with this process, up the tree. Which only takes ten verifications for 868 transactions. (That's one of the great things about trees, they can hold a lot of values with only a relatively small number of layers.)
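The climb up the tree can be sketched in Python as follows. This is an illustration under assumptions, not the wire protocol: the helper names (`build_levels`, `make_proof`, `verify_proof`) are made up for this example, the 868 leaves are placeholder data, and the proof is simply a list of sibling hashes plus the leaf's position.

```python
import hashlib

def dsha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def build_levels(leaves):
    """Return every level of the tree, leaves first, root level last."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2 == 1:
            cur.append(cur[-1])  # odd count: duplicate the last hash
        levels.append([dsha256(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def make_proof(leaves, index):
    """Collect the sibling hash at each level on the way to the root."""
    proof = []
    for level in build_levels(leaves)[:-1]:
        proof.append(level[index ^ 1])  # the sibling sits next to us in the pair
        index //= 2
    return proof

def verify_proof(leaf, index, proof, root):
    """Recompute the root from one leaf and its sibling hashes only."""
    h = leaf
    for sibling in proof:
        # Odd position means we are the right child, so the sibling goes first.
        h = dsha256(sibling + h) if index & 1 else dsha256(h + sibling)
        index >>= 1
    return h == root

leaves = [dsha256(f"tx {i}".encode()) for i in range(868)]  # placeholder data
root = build_levels(leaves)[-1][0]
proof = make_proof(leaves, 500)

print(len(proof))                                  # number of sibling hashes needed
print(verify_proof(leaves[500], 500, proof, root))
print(verify_proof(dsha256(b"forged"), 500, proof, root))
```

Only the party building the proof needs all 868 hashes; the verifier needs the leaf, its position, the ten siblings, and a root it already trusts.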

How do we know that the source of this data isn't lying to us about the hash values? Because a hash function is one-way, there is no way that a deceptive party could guess a value that would hash with our second-to-last value to create the Merkle root. (Which we know from our verified blockchain.) This reasoning holds further down the tree: there's no way to create a fake value that would hash to our expected value. Another way to think about it is that even a single alteration of a transaction at the base of the tree would result in a rippling change to all the hash values of nodes in its branch, all the way up to the root's hash value.

In short, the Merkle tree creates a single value that proves the integrity of all of the transactions under it. Satoshi could have just included the hash of a big list of all of the transactions in the Bitcoin header. But if he had done that, it would have required you to hash the entire list of transactions in order to verify its integrity. With this way, even if there are an extremely large number of transactions, the work you need to do (and the number of hashes you need to request/download) in order to verify the integrity is only O(log n), where n is the number of transactions in the block.
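To put numbers on that logarithmic growth, here is a small Python calculation. It assumes one 32-byte sibling hash per tree level and a tree padded the way Bitcoin pads it (duplicating the last hash on odd levels); `proof_length` is just a name chosen for this example.

```python
def proof_length(n_transactions: int) -> int:
    """Sibling hashes needed to link one leaf to the root of an n-leaf tree."""
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without float rounding.
    return (n_transactions - 1).bit_length()

for n in (8, 16, 868, 1_000_000):
    hashes = proof_length(n)
    print(f"{n:>9} transactions -> {hashes:>2} hashes ({hashes * 32} bytes)")
```

A flat hash of the whole list would instead require downloading and hashing all n transactions.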

[As always, feel free to edit this. This is primarily just inference on my part from looking at the spec.]

A block header does not include the transaction ids from the transactions in the block, does it? So basically the idea of the last part of the quote will only work if txid's were included in the block headers.
– Steven Roose, Dec 10 '13 at 19:26

It reads "block header and merkle tree". That makes more sense. Does the original protocol allow for requesting merkle trees and/or headers including them?
– Steven Roose, Dec 10 '13 at 19:27

What if we do not know the block number of the transaction? In that case, are we required to iterate through all the blocks on the block chain? @David Ogren
– alper, Dec 5 '17 at 14:07

Maybe this is a bad question, but what if I use a birthday attack to find two transactions with equal hashes, make one of those transactions, and later claim that I made the other one? How can I be proved wrong?
– tgwtdt, Jul 8 '18 at 17:43

It's too long to answer your question here @tgwtdt. In short, you can't execute a birthday attack because you don't have arbitrary control over inputs. Second, even a birthday attack on SHA-256 isn't realistically possible. But, in general, yes, if you can find a way to exploit SHA-256 then you can do all kinds of nasty things within bitcoin: the difficulty of reversing the hash algorithm is a founding principle. On the other hand, hash algorithm security is a very well researched field.
– David Ogren, Sep 9 '18 at 1:49

"Figure 7-2. Calculating the nodes in a merkle tree" from Mastering Bitcoin shows the Merkle Root (HABCD) of a list of four transactions: Tx A, Tx B, Tx C, and Tx D:

To verify that a transaction, for example the one with hash HK, is a valid transaction (i.e., part of a list of, in this example, 16 transactions with hashes HA, HB, …, HP), one need only perform at most 2*log2(N) < N hashes, shown in the Merkle path here:

If HK leads to the correct Merkle root, then TK was in the transaction list.

And the Merkle path needed to verify that HK corresponds to the Merkle root contains only 4 hashes in the above example. The Merkle path takes up much less space than storing all the transactions in a block. (In the example above: 4 hashes take much less space than 16.) This is why SPV is lighter-weight.
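Which four hashes make up that path can be worked out symbolically. The Python sketch below uses labels instead of real hashes (an assumption made purely for readability): a node's label is the concatenation of the leaf letters beneath it, and the leaf's position is found by searching the transaction list.

```python
def merkle_path_labels(names, target):
    """Return the labels of the sibling nodes on the path from `target` to the root."""
    index = names.index(target)      # locate the leaf by searching the list
    level = list(names)
    path = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: duplicate the last node
        path.append("H" + level[index ^ 1])  # sibling of the current node
        # A parent's label is its two children's labels joined together.
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        index //= 2
    return path

names = list("ABCDEFGHIJKLMNOP")     # the 16 transactions Tx A ... Tx P
print(merkle_path_labels(names, "K"))
# prints ['HL', 'HIJ', 'HMNOP', 'HABCDEFGH']
```

With real data, each label would be a 32-byte hash, and the verifier would combine HK with these four values in order, from the leaf upward, to arrive at the root.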

Regarding "To verify that a transaction": how do we know the exact location of HK in the Merkle tree? @Geremia
– alper, Dec 5 '17 at 14:19

@Avatar Constructing Merkle paths from scratch requires knowing all the transactions. Also, forging a fake Merkle path that corresponds to a given Merkle root would be even more difficult than cracking SHA256.
– Geremia, Dec 5 '17 at 15:53

For example, starting from the root, if we follow right, left, right, left, we reach HK, which we want to verify. But how could we know that we should follow that path? @Geremia
– alper, Dec 5 '17 at 17:02

@Avatar Verification of a Merkle path proceeds from the leaf node to the Merkle root.
– Geremia, Dec 5 '17 at 17:41

@Avatar Sorry, I meant to say that one simply needs to search for the transaction in the "list of transactions."
– Geremia, Dec 5 '17 at 18:56

BE AWARE! The Merkle root is important for mining. Since the Merkle root is the hashed value of ALL transaction hashes from the block, the value of the Merkle root is taken into account when miners do their work. See: https://en.bitcoin.it/wiki/Block_hashing_algorithm. Previous hash:
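As a rough sketch of how the Merkle root enters the miner's work, the Python below packs an 80-byte block header and double-hashes it. The field layout follows the block hashing algorithm page linked above; every field value here is a hypothetical placeholder rather than data from a real block, and `header_hash` is a name chosen for this example.

```python
import hashlib
import struct

def dsha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def header_hash(version, prev_hash_hex, merkle_root_hex, timestamp, bits, nonce):
    """Serialize the six header fields (little-endian) and double-SHA-256 them."""
    header = (
        struct.pack("<L", version)
        + bytes.fromhex(prev_hash_hex)[::-1]    # hashes are displayed byte-reversed
        + bytes.fromhex(merkle_root_hex)[::-1]
        + struct.pack("<LLL", timestamp, bits, nonce)
    )
    assert len(header) == 80                    # 4 + 32 + 32 + 4 + 4 + 4 bytes
    return dsha256(header)[::-1].hex()

# Hypothetical placeholder values, NOT a real block.
common = dict(version=1, prev_hash_hex="00" * 32,
              timestamp=1_300_000_000, bits=0x1D00FFFF, nonce=0)

hash_a = header_hash(merkle_root_hex="11" * 32, **common)
hash_b = header_hash(merkle_root_hex="22" * 32, **common)
print(hash_a)
print(hash_a != hash_b)
```

Because the Merkle root is one of the hashed fields, changing any transaction in the block changes the root and therefore the header hash the miner has to search on.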

The value on the Wiki is correct, as anyone can check by looking at the hash of the previous block's header or by running your code, which will not produce the values you claim.
– David A. Harding, Apr 2 at 15:44

The Merkle Root, as I understand it, is basically a hash of many hashes (Good example here) - to create a Merkle Root you must start by taking a double SHA-256 hash of the byte streams of the transactions in the block. However, what this data is (the byte streams), what it looks like, and where it comes from remains a mystery to me.
