We propose a strategy to scale Bitcoin to a far greater throughput and performance than available today while keeping the risk of centralization and costs to a minimum. To achieve this we decrease block validation latency with diff blocks, parallelize transaction validation, enable UTXO sharding with transaction input block height annotations, and deploy a series of extension blocks for sustainable capacity increases.

The iguana bitcoin core implements a parallel download in which the vast majority of data goes into read-only files. This avoids the need for a DB and also allows the files to be put into a compressed file system via mksquashfs. By processing the data in several stages, it is possible to stream data in at bandwidth saturation levels. I am not seeing any bottlenecks until it exceeds 500 Mbps. The parallel download is able to get 70 to 120 megabytes/sec, which works out to about 12 minutes for the entire 60GB blockchain.
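As a quick sanity check on those numbers, here is a small sketch of the arithmetic. It assumes decimal units (1 GB = 10^9 bytes, 1 MB = 10^6 bytes); the function names are made up for illustration and are not part of iguana.

```python
# Assumption: decimal units, 60 GB chain size as quoted above.
BLOCKCHAIN_BYTES = 60 * 10**9

def sync_minutes(rate_mb_per_s, total_bytes=BLOCKCHAIN_BYTES):
    """Minutes needed to download total_bytes at a sustained rate in MB/s."""
    return total_bytes / (rate_mb_per_s * 10**6) / 60

def rate_for_minutes(minutes, total_bytes=BLOCKCHAIN_BYTES):
    """Sustained rate in MB/s needed to download total_bytes in the given minutes."""
    return total_bytes / (minutes * 60) / 10**6

if __name__ == "__main__":
    for rate in (70, 120):
        print(f"{rate} MB/s -> {sync_minutes(rate):.1f} minutes")
    print(f"12 minutes needs about {rate_for_minutes(12):.1f} MB/s")
```

So the 12 minute figure corresponds to a sustained rate a little above 80 MB/s, which sits inside the 70 to 120 MB/s range.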

Using 8 cores, all of the data structures are created in parallel, with hash tables and bloom filters built into the read-only files. It takes about half an hour to get to the point where things are ready for the last pass, which does the final processing that is needed.

So parallel processing somewhat similar to what you describe already exists in a functioning project, and it does remove the bottlenecks that DB-oriented approaches incur. The only thing that changes in past data is the state of the unspents, but this is encoded into 6 bytes per unspent by assigning a deterministic 32-bit integer to each of the high-entropy hashes. The net result is a relatively compact UTXO set. Even the spends data can be processed in parallel once all the blocks are loaded, with each pass creating a vector of updates to the unspents. OR'ing these vectors together produces the current set of unspents relatively quickly.
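To make the idea concrete, here is a minimal sketch in Python, not iguana's actual code. It rests on assumptions: the 32-bit integer is taken to be the order of first appearance of each txid, the remaining 2 of the 6 bytes are assumed to hold a 16-bit output index, and each "vector of updates" is modeled as a bitmask with one bit per output position. All names are hypothetical.

```python
import struct
from functools import reduce

def assign_ids(txids):
    """Map each txid to a deterministic integer: its order of first appearance."""
    ids = {}
    for txid in txids:
        if txid not in ids:
            ids[txid] = len(ids)
    return ids

def pack_unspent(tx_index, vout):
    # Assumption: 4-byte tx index + 2-byte output index = 6 bytes, no padding.
    return struct.pack("<IH", tx_index, vout)

def spend_vector(spent_positions):
    """Bit vector (a Python int) with one bit set per spent output position."""
    vec = 0
    for pos in spent_positions:
        vec |= 1 << pos
    return vec

def unspent_positions(num_outputs, vectors):
    """OR all spend vectors together and return the positions still unspent."""
    spent = reduce(lambda a, b: a | b, vectors, 0)
    return [i for i in range(num_outputs) if not (spent >> i) & 1]

if __name__ == "__main__":
    # Two bundles processed independently, each reporting the outputs it spends.
    vectors = [spend_vector([0, 3]), spend_vector([3, 4])]
    print(unspent_positions(6, vectors))
    print(len(pack_unspent(7, 1)))
```

Because OR is commutative and associative, the per-bundle vectors can be computed on separate cores and combined in any order.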

Searches using the read-only bundles can also be done in parallel, but even just serially processing the files, I am seeing times of about 2 milliseconds for the equivalent of an importprivkey operation on a 1.4GHz i5 laptop.
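A rough sketch of what a serial search over read-only bundles might look like, assuming each bundle carries a bloom filter in front of a hash table. This is only an illustration: the `Bloom` and `Bundle` classes, the filter size, and the use of SHA-256 for bit positions are my assumptions, not iguana's on-disk format.

```python
import hashlib

class Bloom:
    """Tiny bloom filter; size and hash count are illustrative assumptions."""
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "little") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key):
        return all((self.bits >> p) & 1 for p in self._positions(key))

class Bundle:
    """Stand-in for one read-only file: a hash table fronted by a bloom filter."""
    def __init__(self, entries):
        self.table = dict(entries)
        self.bloom = Bloom()
        for key in self.table:
            self.bloom.add(key)

    def lookup(self, key):
        if not self.bloom.might_contain(key):
            return None  # definite miss, the hash table is never touched
        return self.table.get(key)

def search(bundles, key):
    """Scan bundles one after another; return (bundle index, value) or None."""
    for index, bundle in enumerate(bundles):
        value = bundle.lookup(key)
        if value is not None:
            return index, value
    return None

if __name__ == "__main__":
    bundles = [Bundle({b"addr-a": 1}), Bundle({b"addr-b": 2})]
    print(search(bundles, b"addr-b"))
```

Since the bundles never change after they are written, the same loop could be split across threads without any locking.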
