I’m new to Bitcoin, find it fascinating, and have been trying to understand how all the pieces work. Here are a few questions I’ve been wrestling with...

When I started the Bitcoin client for the first time it spent several hours downloading (I think) all the blocks. What exactly was it downloading? Was it the full data for every block with every committed transaction ever? Or was it something smaller, like maybe just a few important fields from the block? And related to this, how much of what my client downloaded did it retain? (E.g., does it download a lot of detail, verify it, and then keep just some hash of this, like say the Merkle tree?)

On a related note, how does my client verify that a transaction is good (if it’s even possible for it to do this)? I know there are a lot of pieces to the verification, but the particular point I’m wondering about is how it confirms that a particular transaction input is in fact the output of an earlier transaction. It seems that to do this, my client would need to have either (1) the full contents of the input’s block; or (2) the full Merkle tree of the input’s block; or (3) the Merkle root of the input’s block + the Merkle branch of the input transaction. (1) seems unlikely since that’s a lot of data (I believe this is what a “full node” has). (3) doesn’t seem possible because AFAICT the protocol for a transaction doesn’t allow for including a Merkle branch. So I guess then it’s either (2) or something else entirely.

When I started the Bitcoin client for the first time it spent several hours downloading (I think) all the blocks. What exactly was it downloading? Was it the full data for every block with every committed transaction ever?

Yes. Unless you are running some other software, in which case I have no idea how that software works.

On a related note, how does my client verify that a transaction is good (if it’s even possible for it to do this)? I know there are a lot of pieces to the verification, but the particular point I’m wondering about is how it confirms that a particular transaction input is in fact the output of an earlier transaction. It seems that to do this, my client would need to have either (1) the full contents of the input’s block; or (2) the full Merkle tree of the input’s block; or (3) the Merkle root of the input’s block + the Merkle branch of the input transaction. (1) seems unlikely since that’s a lot of data (I believe this is what a “full node” has). (3) doesn’t seem possible because AFAICT the protocol for a transaction doesn’t allow for including a Merkle branch. So I guess then it’s either (2) or something else entirely.

It sounds like you do know how transactions are verified. The inputs are looked up in an index kept for just that purpose.

Every node using the reference client is a "full node" with a full blockchain history. But it turns out that you only need a subset of the blockchain for most things, including transaction verification. You can discard transaction outputs once they are spent and just keep the unspent ones*. The set of unspent outputs is relatively small, like a couple hundred MB, rather than several GB for the full block history.

Right now, the protocol doesn't allow requesting or sending partial blocks, but the developers are working on it. That will allow even lighter clients.

* Well, you need rollback history too, so that recent blocks can be undone if the chain reorganizes.
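
To make the "keep only the unspent outputs, plus rollback history" idea concrete, here is a minimal sketch in Python. It is not how the reference client is written: the names `OutPoint`, `Tx`, and `UtxoSet` are made up for illustration, outputs are reduced to bare amounts, and scripts and signatures are ignored entirely. It only shows the bookkeeping: an input is valid if it is found in the unspent set, spending removes it, and an undo record per block lets you roll back.

```python
from dataclasses import dataclass

# Hypothetical, heavily simplified types (assumptions for illustration only).
@dataclass(frozen=True)
class OutPoint:
    txid: str   # id of the transaction that created the output
    index: int  # position of the output within that transaction

@dataclass
class Tx:
    txid: str
    inputs: list   # OutPoints being spent (empty for a coinbase)
    outputs: list  # output amounts

class UtxoSet:
    def __init__(self):
        self.unspent = {}   # OutPoint -> amount
        self.undo_log = []  # one (spent, created) record per connected block

    def connect_block(self, txs):
        """Apply a block: every input must be in the unspent set."""
        spent, created = [], []
        try:
            for tx in txs:
                for op in tx.inputs:
                    if op not in self.unspent:
                        raise ValueError(f"missing or already spent: {op}")
                    spent.append((op, self.unspent.pop(op)))
                for i, amount in enumerate(tx.outputs):
                    op = OutPoint(tx.txid, i)
                    self.unspent[op] = amount
                    created.append(op)
        except ValueError:
            self._undo(spent, created)  # leave the set untouched on failure
            raise
        self.undo_log.append((spent, created))

    def disconnect_block(self):
        """Roll back the most recently connected block."""
        spent, created = self.undo_log.pop()
        self._undo(spent, created)

    def _undo(self, spent, created):
        # Restore spent outputs first, then drop created ones, so an output
        # created and spent inside the same block ends up absent.
        for op, amount in spent:
            self.unspent[op] = amount
        for op in created:
            self.unspent.pop(op, None)

utxo = UtxoSet()
utxo.connect_block([Tx("cb1", [], [50])])
utxo.connect_block([Tx("a", [OutPoint("cb1", 0)], [30, 20])])
utxo.disconnect_block()  # e.g. during a reorganization
```

After the last call the set again contains only the output of `cb1`, which is the point of keeping the undo records around.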

17Np17BSrpnHCZ2pgtiMNnhjnsWJ2TMqq8 I routinely ignore posters with paid advertising in their sigs. You should too.

If you had such a lighter client that could request partial blocks, it seems that it would be able to verify the existence of a transaction input (e.g., by checking the Merkle branch and making sure the hashes work) but is there any way that it could verify that the input hadn't already been spent (i.e., in some block that the client doesn't have a copy of)?

Nope. As you point out, it doesn't have access to that information, so by definition, it can't say anything about it. For that, you need something else.
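
For what it's worth, the half that a light client can do (checking a Merkle branch against a block's Merkle root) looks roughly like the sketch below. It uses Bitcoin's double SHA-256 and the rule of duplicating the last hash on odd-sized levels, but the function names are my own, the leaves are stand-in hashes rather than real transactions, and byte-order details of real txids are ignored. Note what it proves: the transaction is in that block. It says nothing about whether its outputs were spent later.

```python
import hashlib

def dsha(data: bytes) -> bytes:
    """Double SHA-256, the hash Bitcoin uses for its Merkle trees."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root_and_branch(leaves, index):
    """Return (root, branch) where branch holds the sibling hashes
    needed to recompute the root from leaves[index]."""
    branch = []
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])      # duplicate last hash on odd levels
        branch.append(level[index ^ 1])  # sibling of the current node
        level = [dsha(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        index //= 2
    return level[0], branch

def verify_branch(leaf, index, branch, root):
    """Recompute the root from one leaf and its branch; inclusion check only."""
    h = leaf
    for sibling in branch:
        if index % 2 == 0:
            h = dsha(h + sibling)   # we are the left child
        else:
            h = dsha(sibling + h)   # we are the right child
        index //= 2
    return h == root

# Stand-in "transaction hashes" for a block with five transactions.
txids = [dsha(f"tx{i}".encode()) for i in range(5)]
root, branch = merkle_root_and_branch(txids, 3)
print(verify_branch(txids[3], 3, branch, root))
print(verify_branch(txids[2], 3, branch, root))
```

The first check should pass and the second should fail, since `txids[2]` is not the leaf at position 3. The client only needs the root (from the block header) and a handful of sibling hashes, not the whole block.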

I personally prefer having full nodes that provide that service (probably for a fee) to users. I had a lengthy chat on IRC the other day about his proposal to have a fraud alert system, where allegedly honest nodes would shout if a transaction was a double spend, but I wasn't a big fan. You could also extend the protocol to allow nodes to request data from a peer's UTXO (unspent transaction output) set, and have your client ask several peers just to be sure.

I'm sure there are other ways too.
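
Just to illustrate the "ask several peers" idea: the sketch below is purely hypothetical, since no such UTXO query exists in the protocol today. Each peer is modelled as a plain Python callable that answers "is this output unspent?", and `is_unspent_by_consensus` and `min_peers` are names I made up. The client treats an output as unspent only if every peer it asked agrees.

```python
def is_unspent_by_consensus(outpoint, peers, min_peers=3):
    """Ask several peers whether `outpoint` is unspent.

    `peers` is a list of callables (outpoint -> bool), standing in for a
    hypothetical network query. Returns True only if all peers say unspent.
    """
    if len(peers) < min_peers:
        raise ValueError("not enough peers to ask")
    answers = [peer(outpoint) for peer in peers]
    return all(answers)

# Simulated peers: two honest ones that know the output was spent, one liar.
spent_outputs = {("cb1", 0)}
honest = lambda op: op not in spent_outputs
liar = lambda op: True

print(is_unspent_by_consensus(("cb1", 0), [honest, honest, liar]))
print(is_unspent_by_consensus(("a", 1), [honest, honest, liar]))
```

A single honest peer reporting "spent" is enough to reject, so this only helps if at least one of the peers you ask is honest. If all of them collude, the client is still fooled.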
