An attacker has been modifying Bitcoin transactions, causing them to have a different hash.
Recently an attacker has been taking transactions on the Bitcoin peer-to-peer network, modifying them slightly, and rapidly sending them to a miner. The modified transaction often gets mined first, pre-empting the original transaction.
The attacker can only make "trivial" changes to a transaction, so exactly the same Bitcoin transfer happens as was intended - the same amount is moved between the same addresses, so this attack seems entirely pointless. However, each transaction is identified by a cryptographic hash, and even a trivial change to the transaction causes the transaction hash to change. Changing the hash of a transaction can have unexpected effects on the Bitcoin system.

A very quick explanation of transactions

A Bitcoin transaction moves bitcoins from one address to another. A transaction must be signed with the private key corresponding to the address, so only the owner of the bitcoins can move them. (This signing process is surprisingly complex.) The signature is then put in the middle of the transaction. Finally, the entire transaction (including the signature) is cryptographically hashed, and this hash is used to identify the transaction in the Bitcoin system. The important data is protected by the signature and can't be modified by an attacker. But there are few ways the signature itself can be changed, but still remain valid.

Looking at a modified transaction

To find a transaction suffering from malleability, I looked at the unconfirmed transactions page. If a transaction gets modified, only one version will get mined successfully (and actually transfer bitcoins), and the other will remain unconfirmed (and have no effect). Among the many conditions enforced in mined blocks, the same bitcoins can't be spent twice, so both transactions will never be mined. This is why having two versions of a transaction doesn't result in two payments.

I picked a random unconfirmed transaction from Feb 11 to examine. (Unfortunately this transaction has been discarded since I wrote this article, breaking my links. But you can look up a different one if you want.) Blockchain.info helpfully includes a banner warning that something is wrong:

Warning! this transaction is a double spend of 112593804. You should be extremely careful when trusting any transactions to/from this sender.

Both transactions look identical: the bitcoins are moving between the same accounts in both cases, the amounts are equal, and the scripts look identical. So why do they have different hashes? A clue is the unconfirmed transaction is 224 bytes and the confirmed transaction is 228 bytes.

There are a couple differences (highlighted in red). But what do they mean?

This script is the scriptSig, the signature of the transaction using the sender's private key. This signature proves the sender owns the bitcoins. However, the scriptSig isn't just a simple signature, but is actually a program written in Bitcoin's Script language. This program pushes the signature data onto the execution stack. The program from the unconfirmed script is interpreted as follows:

PUSHDATA 48

48

signature (DER)

sequence

30

length

45

integer

02

length

20

X

539901ea7d6840eea8826c1f3d0d1fca7827e491deabcf17889e7a2e5a39f5a1

integer

02

length

21

Y

00fe745667e444978c51fdba6981505f0a68619f0289e5ff2352acbd31b3d23d87

SIGHASH_ALL

01

PUSHDATA 41

41

public key

type

04

X

6c4ea0005563c20336d170e35ae2f168e890da34e63da7fff1cc8f2a54f60dc4

Y

02b47574d6ce5c6c5d66db0845c7dabcb5d90d0d6ca9b703dc4d02f4501b6e44

The program from the confirmed script is interpreted as follows:

OP_PUSHDATA2 0048

4d 48 00

signature (DER)

sequence

30

length

45

integer

02

length

20

X

539901ea7d6840eea8826c1f3d0d1fca7827e491deabcf17889e7a2e5a39f5a1

integer

02

length

21

Y

00fe745667e444978c51fdba6981505f0a68619f0289e5ff2352acbd31b3d23d87

SIGHASH_ALL

01

OP_PUSHDATA2 0041

4d 41 00

public key

type

04

X

6c4ea0005563c20336d170e35ae2f168e890da34e63da7fff1cc8f2a54f60dc4

Y

02b47574d6ce5c6c5d66db0845c7dabcb5d90d0d6ca9b703dc4d02f4501b6e44

Note the highlighted differences. The original transaction has a byte 0x48, which says to push (hex) 48 bytes of data. The modified transaction has a OP_PUSHDATA2 (0x4d), which says the next two bytes (48 00) are the number of bytes to push. In other words, both transactions do exactly the same thing (push the signature), but the original indicates this with 48, while the modified transaction indicates this with 4d 48 00. (Pushing the public key has a similar modification.) Since both scripts do exactly the same thing, both transactions are equally valid. However, since the data has changed, the transactions have two different hashes.

Why does malleability matter?

Transaction Malleability has been discussed for years and treated as a minor inconvenience. Both transactions have exactly the same effect, moving bitcoins between the same addresses. Only one transaction will be confirmed by miners, and the other will be discarded, so nobody gets paid twice even though there are two transactions.

There are, however, three problems that have turned up recently due to malleability.

First, the major Mt.Gox exchange stated they would stop processing bitcoin withdrawals until the Bitcoin network approves and standardizes on a new non-malleable hash. Apparently they were using the hash to track transactions, and would re-send bitcoins if the transaction didn't appear to go through. This is obviously a problem if the transaction did go through, but with a different hash.

Second, some wallet software would use both transactions to compute the balance, which caused it to show the wrong value.

Finally, due to the way Bitcoin handles change, malleability could cause a second transaction to fail. This requires a bit more explanation.

Failures due to change and malleability

The Bitcoin protocol doesn't really move bitcoins from address to address. Instead, it takes bitcoins from a set of inputs, and sends them to a set of outputs. Each output is an address (actually a script, but let's ignore that for now). Each input is an output from a previous transaction, and each input must be entirely spent.

As a result, if you have 3 bitcoins, and you want to spend one of them, the other two bitcoins get returned to you as change, sent to an address you control. If you then want to spend some of the change, your second transaction references the previous transaction that generates the change, referencing it by the hash of the first transaction. This is where malleability becomes a problem - if the first transaction's hash changed, the second transaction is not valid and the transaction will fail. Note that the change will still go to your proper address, so you can spend it as long as you use the correct (modified) transaction hash, so you don't lose any bitcoins. You just have the inconvenience of having a transaction rejected, and you'll need to redo it with the right hash.

The change problem only happens because some wallet software takes a shortcut, letting you (attempt to) spend the change before the transaction has been confirmed. The reasoning is that since it's your change from your transaction, you should be able to trust yourself. But that breaks down with malleability.

Malleability has been known for a long time

Transaction malleability has been known since 2011. The exact OP_PUSHDATA2 malleability used above was described four months ago here. There are many other types of malleability, which are explained here. The script code can be modified in several ways while leaving its operation unchanged. The signature itself can be encoded slightly differently. And interestingly, due to the mathematics of elliptic curves the numeric value of the signature can be negated, yielding a second valid signature.

Conclusion

Hopefully this has helped to make malleability more understandable. If you want to know more details of the Bitcoin protocol, including signing and hashing, see my previous article Bitcoins the hard way.

Note that the malleability is also with PUSHDATA and PUSHDATA4, not specifically PUSHDATA2. Also, a major application that malleability breaks is anything that relies on a precomputed nlocktime'd refund transaction, spending a transaction back to the sender before the original transaction is announced.

Really nice post!Now I'm trying to conduct such a malleable TX on my own example, just to try how it works.Can you tell what exactly and how I can change in my tx 350946f9c61598ff4d8c77cb99625f6ac106765dcbf2d2d855a122363b3f3c24? Did you use createrawtransaction/signrawtransaction?