Bitcoin Network Capacity Analysis – Part 5: Stress Test Analysis


Posted on Jun 16, 2015

This is a continuation of TradeBlock’s block chain and network analysis to address the ongoing block size discussions. It is intended for an audience with at least a fundamental comprehension of block chain technology. If you have not yet done so, we recommend first reading the earlier parts of this series.

Part 5 – Stress Test Analysis

Approximately two weeks ago (May 29-May 30) a number of individuals within the Bitcoin community collaborated in an attempt to test the capacity of the network by sending tens of thousands of transactions over a short span of time. The so-called ‘stress test’ began to show up in the mempool in the late evening (UTC) on May 29 and continued until the early morning (UTC) on May 30 (see chart below). For the purposes of this analysis, we define the start of the stress test at block 358,596 (11:39 PM UTC) and the end at block 358,625 (6:54 AM UTC). TradeBlock maintains an extensive, custom data infrastructure to track the bitcoin network, from which unique analytics about the effects of this stress test can be derived.

To create a high-volume scenario, several participants programmatically generated thousands of transactions that continuously re-spent unconfirmed inputs with a small fee attached. At the peak there were as many as 26k transactions in the mempool (vs. 1-4k transactions on average during more normal periods). In this piece we explore the implications for transaction wait times under the stress test, while also shedding additional light on miners’ block size limitations.

Transaction Prioritization

With close to 135k transactions confirmed over the 24 hour period surrounding the stress test and block sizes at or close to their limits, miners had to prioritize which transactions to include in a given block. To determine how the stress test transactions compared with organic transactions, we analyzed the distribution of transactions relative to a defined ‘control period.’ The control period includes data from two days before the stress test (May 23 & May 27) and two days after the test (June 3 & June 6); we included two weekend days in order to adjust for any day-of-week bias.

The average fee per byte of data for all the transactions entered during the 7.5hr stress test window equated to roughly 32 satoshis. However, a large proportion of transactions (26%) were in the lowest fee category (0-5 satoshis per byte of data). Similarly, the average transaction value was 5.8 XBT during the stress test, with a median of 0.1 XBT. This compares to the control period average and median of 10.9 XBT and 0.4 XBT, respectively. In short, the transaction and data volume was significantly increased during the stress test, but did not necessarily represent standard transactional characteristics.
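As a rough illustration of the fee categories used above, the sketch below computes a fee rate in satoshis per byte and assigns it to a 5-satoshi-wide bucket. The function names and the sample transactions are hypothetical and are not TradeBlock data, and the treatment of bucket boundaries (lower bound inclusive, upper bound exclusive) is an assumption, since the analysis does not specify it.

```python
from collections import Counter


def fee_rate(fee_satoshis: int, size_bytes: int) -> float:
    """Fee per byte of transaction data, in satoshis."""
    return fee_satoshis / size_bytes


def fee_bucket(rate: float, width: int = 5) -> str:
    """Label the bucket for a fee rate, e.g. 3.2 -> '0-5'.

    Assumption: buckets are half-open, [lower, lower + width).
    """
    lower = int(rate // width) * width
    return f"{lower}-{lower + width}"


# Hypothetical (fee in satoshis, size in bytes) pairs, for illustration only.
sample = [(1000, 250), (10000, 226), (500, 374)]

buckets = Counter(fee_bucket(fee_rate(fee, size)) for fee, size in sample)
low_fee_share = buckets["0-5"] / len(sample)
```

Applied to a full set of transactions, the same share calculation would give the proportion falling in the lowest fee category.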

Taking it a step further, we analyzed how long it took for a transaction to be included in a block during the stress test and compared it to the control period. The ‘wait time’ in our analysis is measured in number of blocks. Per the charts below, low-fee transactions (0-5 satoshis per byte of data) took roughly 28 blocks on average to confirm during the stress test, implying almost five hours at the 10-minute target block interval. During the control period, similar low-fee transactions took just under 10 blocks, implying almost two hours. In general, higher fees led to faster confirmation times during the stress test, just as they would under normal circumstances. Comparing the light and dark columns in the chart below, it is also clear that confirmation times increased for all fee categories during periods of network congestion.
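The conversion from a wait measured in blocks to a wait measured in hours can be sketched as follows. This is a minimal illustration assuming the 10-minute target block interval; the function name is hypothetical. The second call uses the pace implied by the figures in this piece (30 blocks over the 7.5 hour window, or 15 minutes per block) to show how sensitive the estimate is to that assumption.

```python
TARGET_BLOCK_INTERVAL_MIN = 10  # protocol target, assumed constant here


def blocks_to_hours(blocks: float, interval_min: float = TARGET_BLOCK_INTERVAL_MIN) -> float:
    """Convert a wait time in blocks to hours for a given block interval."""
    return blocks * interval_min / 60


# Low-fee wait during the stress test at the target interval (about 4.7 hours).
stress_wait_hours = blocks_to_hours(28)

# Low-fee wait during the control period (about 1.7 hours).
control_wait_hours = blocks_to_hours(10)

# Same 28-block wait at the observed stress-test pace: 7.5 h / 30 blocks = 15 min.
observed_interval_min = 7.5 * 60 / 30
stress_wait_hours_observed = blocks_to_hours(28, interval_min=observed_interval_min)
```

At the slower observed pace, the same 28-block wait would correspond to roughly seven hours rather than five.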

The one anomaly to that trend was in the 45-50 satoshi/byte fee bucket. This is largely because much of the data in this category consisted of long chains of unconfirmed transactions broadcast to the network in rapid succession. These transactions not only overloaded the mempool, but also had an increased likelihood of being viewed as ‘spam,’ given their extremely low coin age, regardless of the attached fee. The visualization below illustrates those strings of unconfirmed transactions when they were in the mempool.

Interactive Chart of Mempool During Stress Test

A similar trend is evident when comparing transaction sizes and time to confirmation. Given that ~80% of the stress test transactions were in the 0-1 XBT category, the expected confirmation time is noticeably higher. Overall, confirmation time increased for transactions of all sizes as the network dealt with abnormally high volumes.

Block Sizes, Miner Limits and Hash Rate

During the 7.5 hour window of the stress test, miners generated a total of 30 blocks. Interestingly, for a typical 7.5 hour window, one would expect roughly 45 blocks (10 minutes per block). While we are hesitant to assume causality, statistical analysis indicates the lower block count may well have been a direct result of the large number of transactions in each block.

Per the chart below, 67% of the blocks during the stress test window were in the 750kb block size category, the default in the standard bitcoin client software. This compares to 15% observed from Jan-May 2015 (per the analysis in Part 1 of this series). Notably, DiscusFish and Eligius appear to have higher block size caps at 1MB, the maximum/hard limit under the current protocol. Subsequent to the stress test, it appears BTCChina and AntMiner both reconfigured their limits to closer to 1MB. In summary, it appears five of the top ten miners have limits at 750kb, while the remaining five are now in the 900-1,000kb range.

Lastly, it is interesting to note that 35% of blocks were not full during the stress test, and there was even a zero-transaction block, despite a mempool with a large number of transactions awaiting confirmation. The potential effect of this behavior by miners on overall capacity of the network is significant. The simulated capacity analysis from Part 4 of this series is then ultimately a best case scenario, with the evidence from the stress test pointing to even greater congestion caused by unfilled blocks.

A comparison of each miner’s hash rate during the stress test relative to the past hash rate trend (2015 YTD) reveals a few additional data points (see chart below). Notably, over the 7.5 hour period, BTCChina Pool found 20% of blocks, relative to its longer-term hash rate calculated at approximately 8% of the network. DiscusFish found only two blocks, implying a 7% hash rate (relative to 19% in normal conditions). It is important to note that the randomness inherent in hashing can, by chance alone, produce skewed results over a short window. However, with 30 data points (a commonly cited rule-of-thumb minimum sample size), the binomial probabilities of these outcomes are worth highlighting as an indicator that they are likely due to a variable beyond the calculable randomness associated with hashing.
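The binomial probabilities mentioned above can be sketched as follows. This is an illustrative calculation, not necessarily the one TradeBlock performed: it treats each of the 30 blocks as an independent trial won by a pool with probability equal to its longer-term hash rate share. The count of 6 blocks for BTCChina Pool is derived from the stated 20% of 30 blocks, and the function names are hypothetical.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


n_blocks = 30

# BTCChina Pool: 6 or more blocks (20% of 30) given a long-term share of 8%.
p_btcchina_at_least_6 = 1 - binom_cdf(5, n_blocks, 0.08)

# DiscusFish: 2 or fewer blocks given a long-term share of 19%.
p_discusfish_at_most_2 = binom_cdf(2, n_blocks, 0.19)

print(f"P(BTCChina >= 6 of 30)  = {p_btcchina_at_least_6:.3f}")
print(f"P(DiscusFish <= 2 of 30) = {p_discusfish_at_most_2:.3f}")
```

These are one-sided tail probabilities for each pool taken in isolation. They do not adjust for the fact that the most extreme pools were singled out after the fact, so they are best read as indicative rather than conclusive.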

Concluding Thoughts

While the stress test had its limitations, the 7.5 hours of data revealed a number of interesting takeaways:

When the network is at or near capacity, transaction confirmation times increase for all categories of fees and transaction values.

In a scenario where transactions are competing to be included in an upcoming block, it appears a higher fee results in comparably faster confirmation times (as per normal situations).

Five of the top ten miners currently have limits at 750kb, while the remaining five are now in the 900-1,000kb range.

Lastly, given the lower-than-expected number of blocks generated (30 over the 7.5 hour period) and the discrepancies in miners’ hash rates, it appears plausible that one or more miners face an adverse impact on their hashing abilities during periods of network congestion.

While we applaud the community effort to test the boundaries of the network, we note that future stress tests would need to feature greater diversity and more standard transaction characteristics with regard to fee amounts, transaction sizes, transaction sources (varying coin age), and transaction scripts (multisig, P2SH, etc.).

This analysis has been prepared in good faith on the basis of information available at the date of publication without any independent verification. Schvey, Inc. does not guarantee or warrant the accuracy, reliability, completeness or currency of the information in this presentation nor its usefulness in achieving any purpose. Readers are responsible for assessing the relevance and accuracy of the content of this publication. Schvey, Inc. will not be liable for any loss, damage, cost or expense incurred or arising by reason of any person using or relying on information in this publication. This analysis may not be duplicated, shared, or reproduced in its entirety or in part for any reason without the expressed written consent of Schvey, Inc.