Block Publication Incentives For Miners

Jul 29, 2016

Thanks go to Jonathan HB for reviewing the math in an early draft; all mistakes still mine!

If I’m a miner and I find a block, why should I publish that block to the
world?

It’s a simple question, yet it turns out to be at the heart of Bitcoin’s
scalability problem and security model. Here we’ll take a high-level,
theoretical look at what incentives exist in the Bitcoin protocol to publish
blocks, and what those publication incentives mean for decentralization.

Decentralization Goals

We’ll discuss block publication incentives in terms of fairness between
different classes of miners; by fairness we mean the profit per unit hashing
power. In an ideal world miners/pools[1] with a small
percentage of total hashing power will have the same per-unit revenue as
miners/pools with a large percentage of total hashing power. If that is not
true, larger miners are likely to get even bigger as they reinvest their
profits, and hashers are incentivised to move their hashing power to larger
pools over smaller ones, both centralization pressures.

In short, we want to design a protocol that incentivises what we want: a
diverse mining community without single points of failure. Of course, due to
overheads like running a full node, mining will never be entirely fair, but we’d
like to limit that unfairness as much as possible.

Why Publish At All?

The most basic incentive for miners to publish blocks is to get paid: assuming
buyers only accept coins from blocks that their full nodes have validated,
miners are forced to publish their blocks to bitcoin buyers to get paid for the
blocks they mine. Equally, with current blocksizes the cost to publish those
blocks widely is low enough that we can rely on altruistically run P2P nodes.

However, this isn’t by itself a direct incentive to publish blocks to other
miners; publishing blocks to bitcoin buyers only needs to be done occasionally,
when a miner sells their reward. Why should miners publish to other miners?

Majority Hashing Power

For a short-term rational miner with a majority of hashing power there isn’t a
clear incentive: they have the majority, so if they choose not to publish the chain
they’re working on to other miners, it will always eventually have more work than
the chain their competition is working on. When they publish their chain as
part of selling their coins their competition’s chain is simply wiped out,
eventually pushing the competition out of business and allowing the majority miner to save
money by turning off some of their hashing power; if users complain about
reorgs the majority miner can simply publish more frequently, every time they
are in the lead.

Of course, such blatantly unfair behavior has a second order effect: the fact that
no-one else can mine is strong evidence that Bitcoin is now highly centralized,
and a rather dubious investment. This should significantly reduce the
exchange rate, but as we’ll see later such attacks aren’t necessarily easily
detected, or even clearly attacks.

Minority Hashing Power

For a minority miner, if other miners don’t build on top of their blocks they
won’t be a part of the most-work chain. So if they don’t publish, their
competition will create a longer chain in the long run, and when they wish to
sell their coins buyers will consider those coins invalid.

However, note how that incentive presumes that other miners need a block to
build on top of it - that’s not clearly true. At minimum, a valid empty block
can be created by a miner knowing only the previous block’s hash, height, and
median time, so long as the previous block is valid. Equally if the miner can
learn what modifications to make to the UTXO set - perhaps with a segwit block
sans witnesses - they can also collect transaction fees.

Delaying Block Publication

While a minority miner needs to publish their blocks to other miners at some
point, that doesn’t necessarily mean a minority miner has an incentive to
publish in a timely fashion. If we publish our blocks immediately, we
maximize our chances of our block ending up in the final consensus chain
(minimizing our stale rate[2]).

However it turns out that in many circumstances deliberately delaying
publication is to our advantage: while delaying increases our stale rate, it increases
our competitors’ stale rates even more, resulting in us finding more blocks
than them. In the short run this potentially increases our transaction fee
revenue; in the medium term (one to two weeks[3]) difficulty
adjusts downwards as the stale blocks are taken into account by the difficulty
adjustment, leaving us ahead on fees, and no worse off for subsidy revenue; in
the long run we benefit if any of our competition is forced out of business.

This strategy was first discovered and initially analyzed by the Bitcoin
development community in 2010[4]. It was later
rediscovered[5] and more rigorously analysed by Eyal
and Sirer[6], who gave their particular take on it the
name “Selfish Mining”.

For the sake of simplicity we’ll look at a somewhat simplified analysis
originally published[7] by myself, which gets very
similar results to Sirer’s more rigorous work modulo the (as we’ll see later,
often unnecessary) active network attack they propose.

Temporary Block Withholding

Suppose we’re a miner with $\alpha$ hashing power, and the rest of the network has
$1-\alpha$ hashing power. We’ll assume propagation time is negligible; between the
time a miner publishes a block, and the time all other miners receive the
block, the chance of another miner finding a block is essentially zero. If we
find a block and do not propagate it immediately there are three possible
outcomes, with the following probabilities:

$\alpha$ — We find the next block. We’re now two blocks ahead, so if our
competitors mine a block, we can publish our two blocks in response.
Since our chain is longer, their block (and work) will be wiped out by ours.

$1-\alpha$ — Someone else finds the next block:

$(1-\alpha)\alpha$ — We find the next block after that. Again, we’re two blocks
ahead, and if we broadcast our blocks the competition’s block will be wiped out.

$(1-\alpha)^2$ — They find the next block. They’re now two blocks ahead,
so the simplest thing to do is switch to their chain, throwing away our block.

In the first two outcomes, we’ve made our competitors waste work on blocks that
didn’t get into the chain; in the last outcome our work was wasted. Thus, if our
goal is to get more blocks than our competitors, we need the probability of the
first two outcomes to be higher than the last outcome:

$$\alpha + (1-\alpha)\alpha > (1-\alpha)^2$$

This is a quadratic, with roots $\alpha = 1 \pm \frac{1}{\sqrt{2}}$. One root is
out-of-range ($\alpha > 1$), giving us:

$$\alpha > 1 - \frac{1}{\sqrt{2}} \approx 29.3\%$$

In short, if we have more than $29.3\%$ hashing power the optimal strategy is
to withhold, basically because when we find a block, our competitors have to
find two blocks in a row to beat us.
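The withholding outcomes can be checked numerically. What follows is a minimal Python sketch, assuming a miner holding a fraction `alpha` of total hashing power and negligible propagation time; the function and constant names are illustrative only and do not come from any Bitcoin codebase.

```python
import math

def withholding_outcomes(alpha):
    """Return (win, lose) probabilities for a miner with hashing power
    fraction `alpha` who withholds a freshly found block."""
    we_find_next = alpha                  # we end up two blocks ahead
    they_then_we = (1 - alpha) * alpha    # they tie us, then we break the tie
    they_find_two = (1 - alpha) ** 2      # competitors find two in a row
    return we_find_next + they_then_we, they_find_two

# Smaller root of the quadratic 2*alpha^2 - 4*alpha + 1 = 0.
THRESHOLD = 1 - 1 / math.sqrt(2)

if __name__ == "__main__":
    for alpha in (0.10, 0.25, THRESHOLD, 0.35, 0.50):
        win, lose = withholding_outcomes(alpha)
        print(f"alpha={alpha:.3f}  win={win:.3f}  lose={lose:.3f}")
```

Below the threshold the "lose" probability exceeds the "win" probability; above it the relationship flips.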

Non-Negligible Propagation Delays

What happens when block propagation time isn’t negligible? The simplest case
is the uniform, fully-connected, topology with all miners experiencing a
uniform propagation delay $t$ with respect to each other.

In this simplistic topology, when a miner finds a block they succeed in
collecting their reward only if another miner didn’t already find a block that
they did not already know about; we win so long as no block was found in the
time interval $t$. Mining is a Poisson process, so for a block interval $T$
that probability is:

$$e^{-\frac{t}{T}} \approx 1 - \frac{t}{T}$$

For a propagation delay $t$ less than $10\%$ of the block interval $T$, the
approximation $e^{-t/T} \approx 1 - t/T$ introduces an error less than $0.5\%$.
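To get a feel for how good the linear approximation is, here is a small Python sketch comparing the exact Poisson probability of no block being found during a delay `t` with the first-order approximation `1 - t/T`. The default `T = 600` seconds is Bitcoin's ten minute target interval; the function names are illustrative.

```python
import math

def no_block_probability(t, T=600.0):
    """Exact probability that no block is found during a delay of t seconds,
    for a Poisson process with mean block interval T seconds."""
    return math.exp(-t / T)

def linear_approximation(t, T=600.0):
    """First-order approximation 1 - t/T of the probability above."""
    return 1 - t / T

if __name__ == "__main__":
    for t in (1, 6, 30, 60):
        exact = no_block_probability(t)
        approx = linear_approximation(t)
        print(f"t={t:>2}s  exact={exact:.5f}  approx={approx:.5f}  "
              f"error={exact - approx:.5f}")
```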

Let’s suppose we’re a miner with non-negligible hashing power $\alpha$. We will
not attempt any block withholding, and will immediately publish blocks when
we find them. With propagation delay $t$ and block interval $T$, when we find a
block the following outcomes are possible:

$1 - \frac{t}{T}$ — No other block is found.

$\alpha\frac{t}{T}$ — We find another block while the first block is propagating.

$(1-\alpha)\frac{t}{T}$ — Another miner found a block first.[8] We don’t however
switch to that block, as if we find another block we can still get rewarded for
both; switching guarantees that our block reward will be orphaned. The next block breaks the tie:

$\alpha(1-\alpha)\frac{t}{T}$ — We find another block before anyone else does, breaking the tie in our favour[9].

$(1-\alpha)^2\frac{t}{T}$ — Someone else finds another block, breaking the tie against us.

We win in the first and second cases and when the tie breaks in our favour, and
lose when it breaks against us, which lets us compute the (normalized) expected
return $E_l$ of finding a block:

$$E_l = \left(1 - \frac{t}{T}\right) + \alpha\frac{t}{T} + \alpha(1-\alpha)\frac{t}{T} = 1 - (1-\alpha)^2\frac{t}{T}$$
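As a sanity check on this case analysis, the following Python sketch enumerates the outcomes for a large miner with hashing power fraction `alpha`, propagation delay `t` and block interval `T`, using the first-order approximation that a block is found during the delay with probability `t/T`. The names are illustrative, and the outcome probabilities are the assumption being tested: they should sum to one.

```python
def large_miner_outcomes(alpha, t, T=600.0):
    """First-order outcome probabilities when a miner with hashing power
    fraction `alpha` finds a block and publishes it immediately."""
    x = t / T  # chance that some block is found during the propagation delay
    return {
        "no_other_block": 1 - x,
        "we_find_another": alpha * x,
        "tie_won": alpha * (1 - alpha) * x,
        "tie_lost": (1 - alpha) ** 2 * x,
    }

def expected_return_large(alpha, t, T=600.0):
    """Probability that the block we found ends up in the main chain."""
    o = large_miner_outcomes(alpha, t, T)
    return o["no_other_block"] + o["we_find_another"] + o["tie_won"]

if __name__ == "__main__":
    for alpha in (0.01, 0.10, 0.30, 0.50):
        print(f"alpha={alpha:.2f}  return={expected_return_large(alpha, 6.0):.5f}")
```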

Let’s take the derivative of the expected return with respect to hashing power:

$$\frac{dE_l}{d\alpha} = \frac{d}{d\alpha}\left[1 - (1-\alpha)^2\frac{t}{T}\right] = 2(1-\alpha)\frac{t}{T}$$

Note how this derivative is positive for $\alpha < 1$. This means that miners
can always earn more money by centralizing: the fact that miners don’t
orphan their own block rewards gives them a head start. Put another way, all
else being equal, larger miners earn more money per unit hashing power than
smaller miners.

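Similarly, we can show that larger miners’ stale rates are affected less by
propagation delays than smaller miners by looking at the derivative of the
expected return with respect to propagation time:

$$\frac{dE_l}{dt} = \frac{d}{dt}\left[1 - (1-\alpha)^2\frac{t}{T}\right] = -\frac{(1-\alpha)^2}{T}$$

The loss scales with $(1-\alpha)^2$: for a given propagation time increase
$\Delta t$, a $30\%$ miner is about $51\%$ less affected by propagation delays
than a miner with negligible hashing power, and a $50\%$ miner $75\%$ less affected.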
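The relative effect of propagation delay on miners of different sizes can be tabulated directly. This is a minimal sketch assuming the first-order model in which a miner with hashing power fraction `alpha` loses `(1 - alpha)**2 / T` of expected return per additional second of delay; the function names are illustrative.

```python
def delay_sensitivity(alpha, T=600.0):
    """Expected return lost per extra second of propagation delay
    (the magnitude of the derivative with respect to t)."""
    return (1 - alpha) ** 2 / T

def relative_sensitivity(alpha):
    """Sensitivity relative to a miner with negligible hashing power."""
    return delay_sensitivity(alpha) / delay_sensitivity(0.0)

if __name__ == "__main__":
    for alpha in (0.01, 0.10, 0.30, 0.50):
        less = 1 - relative_sensitivity(alpha)
        print(f"alpha={alpha:.2f}  {less:.0%} less affected than a negligible miner")
```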

Small Miners

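What about the expected return of negligible hashing power miners? For them the
existence of a large miner has a major effect on their expected return: even if
no-one else found a block just prior, they’ll still lose if the large miner
finds a block before the small miner’s block gets to them, and then
subsequently finds a second block.

This means if we’re one of the small miners who together make up the remaining
$1-\alpha$ of hashing power, alongside a large miner with hashing power
$\alpha$, when we find a block the following outcomes are possible:

$\frac{t}{T}$ — Someone else found a block first. Since we have
negligible hashing power, the chance of us finding a subsequent block in time
is negligible, so we lose our block reward.

$1 - \frac{t}{T}$ — No-one else found a block first. We’re now racing to
get our block to the large miner:

$\left(1 - \frac{t}{T}\right)\alpha\frac{t}{T}$ — The large miner finds a block before they get ours. Small and large miners are tied:

$\left(1 - \frac{t}{T}\right)\alpha^2\frac{t}{T}$ — Large miner finds a subsequent block, breaking the tie against us. We lose our reward.

$\left(1 - \frac{t}{T}\right)\alpha(1-\alpha)\frac{t}{T}$ — Large miner doesn’t find another block. A small miner may also find a block, but that’s irrelevant to us because we have the head-start[10] so we win.

We also win outright when the large miner receives our block before finding one
of their own, which happens with probability
$\left(1 - \frac{t}{T}\right)\left(1 - \alpha\frac{t}{T}\right)$. Thus our expected return is:

$$E_s = \left(1 - \frac{t}{T}\right)\left[\left(1 - \alpha\frac{t}{T}\right) + \alpha(1-\alpha)\frac{t}{T}\right] = \left(1 - \frac{t}{T}\right)\left(1 - \alpha^2\frac{t}{T}\right)$$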
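A short Python sketch of the small-miner case follows, assuming one large miner with hashing power fraction `alpha`, propagation delay `t`, block interval `T`, and the same first-order `t/T` approximation used for the large miner. The function names are illustrative; the win and lose probabilities together should account for every outcome.

```python
def small_miner_outcomes(alpha, t, T=600.0):
    """Win/lose probabilities for a negligible miner who finds a block while
    a large miner controls a fraction `alpha` of hashing power."""
    x = t / T
    beaten_to_it = x                               # someone found a block first
    reached_in_time = (1 - x) * (1 - alpha * x)    # large miner gets ours first
    tie_won = (1 - x) * alpha * x * (1 - alpha)    # tie, large miner loses it
    tie_lost = (1 - x) * alpha * x * alpha         # tie, large miner wins it
    return {"win": reached_in_time + tie_won, "lose": beaten_to_it + tie_lost}

def expected_return_small(alpha, t, T=600.0):
    """Probability that the small miner's block ends up in the main chain."""
    return small_miner_outcomes(alpha, t, T)["win"]

if __name__ == "__main__":
    for alpha in (0.0, 0.10, 0.30, 0.50):
        print(f"alpha={alpha:.2f}  return={expected_return_small(alpha, 6.0):.6f}")
```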

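Let’s take the derivative of the small miners’ expected return
$E_s = \left(1 - \frac{t}{T}\right)\left(1 - \alpha^2\frac{t}{T}\right)$ with
respect to the hashing power of the large miner:

$$\frac{dE_s}{d\alpha} = -2\alpha\frac{t}{T}\left(1 - \frac{t}{T}\right)$$

Since $0 < t < T$, the derivative is always negative: as the larger miner’s
hashing power increases, the smaller miners’ revenue per block goes down. Of
course, this isn’t surprising: we already showed that larger miners make excess
super-linear returns; those returns have to come from somewhere.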

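Similarly, we can show that longer propagation times always negatively
impact small miners’ expected returns
$E_s = \left(1 - \frac{t}{T}\right)\left(1 - \alpha^2\frac{t}{T}\right)$:

$$\frac{dE_s}{dt} = -\frac{1}{T}\left(1 + \alpha^2 - 2\alpha^2\frac{t}{T}\right)$$

Again, since $t < T$, and $\alpha < 1$, the derivative is always negative
and longer propagation times decrease expected returns for small miners.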

Expected Return Ratio

Mining is a zero sum game: since difficulty adjusts
downward if hashing power drops, pushing your competitors out of
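business makes you more profitable in the long run. So let’s look at the
ratio between large and small miners of their respective expected returns,
$\frac{E_l}{E_s}$, with $E_l = 1 - (1-\alpha)^2\frac{t}{T}$ and
$E_s = \left(1 - \frac{t}{T}\right)\left(1 - \alpha^2\frac{t}{T}\right)$.

We already know that increasing $\alpha$ gives the advantage to the larger miner
at the expense of the smaller miners; what about increasing the propagation
delay $t$? By applying the product rule we see that:

$$\frac{d}{dt}\left(\frac{E_l}{E_s}\right) = \frac{1}{E_s}\frac{dE_l}{dt} - \frac{E_l}{E_s^2}\frac{dE_s}{dt} = \frac{2\alpha\left(1 - \alpha\frac{t}{T}\right) + \alpha^2(1-\alpha)^2\frac{t^2}{T^2}}{T\,E_s^2}$$

Since $t < T$ (so $1 - \alpha\frac{t}{T} > 0$), and all other terms are positive, increasing the
propagation delay always increases the large miner’s revenue relative to
the small miners.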
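The effect of propagation delay on the ratio can also be checked numerically. The sketch below assumes the first-order expected returns `1 - (1 - alpha)**2 * t/T` for the large miner and `(1 - t/T) * (1 - alpha**2 * t/T)` for a negligible miner; the function names are illustrative.

```python
def expected_return_large(alpha, t, T=600.0):
    """Expected return per block found for a miner with hashing power alpha."""
    return 1 - (1 - alpha) ** 2 * t / T

def expected_return_small(alpha, t, T=600.0):
    """Expected return per block found for a negligible miner competing
    against a large miner with hashing power alpha."""
    return (1 - t / T) * (1 - alpha ** 2 * t / T)

def return_ratio(alpha, t, T=600.0):
    """Large miner's expected return relative to a small miner's."""
    return expected_return_large(alpha, t, T) / expected_return_small(alpha, t, T)

if __name__ == "__main__":
    for t in (0, 2, 6, 20, 60):
        print(f"t={t:>2}s  ratio={return_ratio(0.30, t):.5f}")
```

With no delay the ratio is exactly one; it grows as the delay grows, for any positive `alpha`.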

Non-Uniform Propagation Delays

The real Bitcoin network isn’t uniform: some miners receive blocks faster than
others, in part for the simple reason that not all miners - or nodes - have the
same bandwidth. In addition, latency between miners isn’t uniform due to
speed-of-light delays between miners in different geographic locations (a very
good thing for decentralization!).

How do these non-uniform delays affect expected returns? Do those non-uniform
delays give some miners an advantage over others?

Suppose we’re a miner and we find a block. We’ll say that the set of miners
mining our block at time $t$ is $A(t)$, and the set of miners
who are not mining our block is $B(t)$, allowing us to define $\alpha(t)$, the
fraction of hashing power in $A(t)$ at a given moment:

$$\alpha(t) = \frac{\text{hashing power of } A(t)}{\text{total hashing power}}$$

We want to know if there exist $\alpha(t)$ such that the stale rate for miners
not mining our block is greater than for miners mining it.

As with our block withholding case, $A$ miners (those mining our block) need to find one more block before
$B$ miners (those not mining it) find two blocks to avoid their block reward getting orphaned; the converse is true
for $B$ miners. To simplify the analysis we’ll assume that the next block
$A$ miners find propagates approximately instantaneously - an optimization
that miners do in fact use, by mining empty blocks for the first few seconds
after a block is found. Similarly, we’ll assume the best-possible-case where
$B$ blocks propagate instantaneously. Since $B$ blocks propagate
instantaneously, once a $B$ block is found $\alpha(t)$ becomes a constant
function, as miners mine the most-work chain they see first.

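Let $t_A^n$, $t_B^n$ be the arrival times of the $n$-th $A$ and $B$
blocks, and $N(t)$ the total number of blocks found at time $t$. We start at $t = 0$, the moment our block (the first $A$ block) is found. The expected return for $B$ miners is the probability they will
find two blocks before $A$ finds their second block:

$$E_B = P\left(t_B^2 < t_A^2\right)$$

The expected return for $A$ miners is the probability they’ll either find one more
block before $B$, or failing that, find a second block before $B$ finds their second block:

$$E_A = P\left(t_A^2 < t_B^1\right) + P\left(t_B^1 < t_A^2 < t_B^2\right)$$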
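One way to evaluate these two probabilities is sketched below. It is a reconstruction under the simplifying assumptions already stated, not necessarily the form used in the original derivation: $\alpha(\tau)$ is the fraction of hashing power mining on our block at time $\tau$ after we find it, blocks arrive as a Poisson process with mean interval $T$, each new block is found by a miner building on our block with probability $\alpha(\tau)$, and $\alpha$ is frozen from the moment a competing block appears. $E_A$ and $E_B$ denote the expected returns of the miners building on our block and of those not building on it.

```latex
\begin{align}
E_B &= \int_0^\infty \frac{1}{T} e^{-\tau/T}
       \bigl(1-\alpha(\tau)\bigr)^2 \, d\tau \\
E_A &= \int_0^\infty \frac{1}{T} e^{-\tau/T}
       \Bigl[\alpha(\tau) + \bigl(1-\alpha(\tau)\bigr)\alpha(\tau)\Bigr] \, d\tau
\end{align}
```

The factor $\frac{1}{T} e^{-\tau/T}$ is the density of the arrival time of the first block after ours. The integrands add up to that density, so $E_A + E_B = 1$.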

Let’s sanity check our work. We defined $E_A$ and $E_B$ (for the miners mining
our block and the miners not mining it, respectively) as the expected
returns for the next block - the chance a block isn’t stale. Since
miners’ blocks propagate approximately instantly, any stales for $A$ should
decrease the stale rate for $B$, resulting in an overall expected return
per block of one:

$$E_A + E_B = 1$$

Secondly, if $\alpha(t)$ is constant, we’re effectively saying that block
propagation is instantaneous, and someone is doing a block withholding attack:

$$E_A = \alpha + (1-\alpha)\alpha$$

$$E_B = (1-\alpha)^2$$

Sure enough, we get the exact same equations that we derived earlier for the
block withholding attack. Good!
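Both sanity checks can be reproduced numerically. The Python sketch below rests on an assumption of this example rather than a formula quoted from the text: the first block after ours arrives at time `tau` with density `exp(-tau/T)/T`, it is found on top of our block with probability `alpha(tau)`, and if it is a competing block the tie is broken with `alpha` frozen at `alpha(tau)`. The `ramp` helper is a purely hypothetical shape for `alpha(t)`.

```python
import math

def expected_returns(alpha_of_t, T=600.0, steps=200_000, horizon=30.0):
    """Midpoint-rule estimate of (E_A, E_B), where A miners build on our block
    and B miners do not. `alpha_of_t(tau)` is the fraction of hashing power
    mining on our block tau seconds after it was found."""
    dt = horizon * T / steps
    e_a = e_b = 0.0
    for i in range(steps):
        tau = (i + 0.5) * dt
        density = math.exp(-tau / T) / T   # arrival density of the next block
        a = alpha_of_t(tau)
        e_a += density * (a + (1 - a) * a) * dt
        e_b += density * (1 - a) ** 2 * dt
    return e_a, e_b

def ramp(start, seconds):
    """Hypothetical alpha(t): rises linearly from `start` to 1 over `seconds`."""
    def alpha(t):
        return min(1.0, start + (1 - start) * t / seconds)
    return alpha

if __name__ == "__main__":
    print("constant 0.3:", expected_returns(lambda t: 0.3))
    print("10s ramp:    ", expected_returns(ramp(0.3, 10.0)))
    print("60s ramp:    ", expected_returns(ramp(0.3, 60.0)))
```

With a constant `alpha` the result reduces to the block withholding probabilities, and the slower `alpha(t)` rises, the larger the share that goes to the miners who are not yet building on the block.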

Slow Blocks

Can miners influence the $\alpha(t)$ of the blocks they create, that is, how
quickly the rest of the hashing power starts mining on top of them? Absolutely! In
addition to simply withholding blocks, and DoS attacking relay nodes, miners
can create blocks that propagate slower than normal (“slow blocks”) in two main
ways:

Validation Time — Among other things it’s currently possible to exploit
flaws in the existing signature hashing algorithm by creating especially large
transactions with many inputs; this flaw has been triggered both by deliberate
attacks, as well as by accident (F2Pool produced a number of UTXO “cleanup”
blocks that took dozens of seconds to validate[11]). However
miners exploiting these flaws are very obvious, and the flaws will likely be fixed sooner
or later with soft-forks (segwit’s signature hashing scheme fixes this[12],
at least for segwit transactions).

(Non-Pre-Forwarded) Size — Moving bytes takes time. Schemes like compact
blocks and FIBRE try to mitigate this by moving those bytes in
advance - mainly by taking advantage of pre-broadcast transactions - but those
schemes are trivially defeated by simply putting non-pre-forwarded data in
blocks; our defence is the upper limit imposed by the block size limit.

It’s important to remember that both techniques can easily be used by miners
with less hashing power than the threshold required for a block
withholding attack, so long as block propagation is non-uniform; slow blocks
are not isomorphic to block withholding attacks as they work with any amount of
hashing power.

It’s very easy for pre-forwarding defeats to happen by accident. For example,
every time the standard transaction rules are expanded by some nodes/miners,
miners who have not adopted the newly allowed transaction forms are potentially
put at a disadvantage. Similarly, transaction malleability can defeat
pre-forwarding, and malleability can be triggered both by attackers (who don’t
necessarily need any hashing power at all) as well as by mistake.

Even deliberate actions that defeat pre-forwarding aren’t always clearly
malicious “attacks”; pre-forwarding defeats often have plausible deniability.
For example, BTCC[13] offers BlockPriority,
a service that expedites confirmations for BTCC customers: BTCC’s
pool will mine even zero-fee transactions if they either pay a BTCC customer,
or were generated by a BTCC customer. Zero-fee transactions aren’t reliably -
if at all - relayed on the P2P network, so blocks containing such transactions
contain non-pre-forwarded data, slowing down propagation of BTCC’s
blocks. So long as propagation is non-uniform - and it certainly is due to validationless
mining, as well as the Great Firewall of China - $\alpha(t)$, the fraction of
hashing power mining on top of the block over time, for BTCC’s blocks
will rise rapidly enough that the slower propagation times for the remaining
miners give BTCC a long-term advantage relative to their competitors.

Is BTCC attacking Bitcoin? Personally, rather than answer that question with a
“yes” or “no”, I’d rather answer with a Bitcoin protocol design where it
doesn’t matter what they do to maximize their revenue; Bitcoin mining shouldn’t
have negative externalities.

Footnotes

[1] By “miner” we mean someone with hashing power who also has a full node that creates blocks; by “hasher” we mean someone with hashing power who does no block validation themselves, instead pointing their hashing power at a pool.

[2] The stale rate is the percentage of blocks found that do not end up in the main chain due to propagation delays - other miners finding blocks first. This is commonly referred to by the misnomer “orphan rate”, a term derived from the fact that in a stale block, the coinbase transaction has no valid “parent” block: it’s now a worthless orphan whose outputs can’t be spent.

[3] Note how in non-Bitcoin protocols with faster difficulty adjustment periods - such as Ethereum - the transition from subsidy revenue loss to revenue neutral happens much faster.

[8] It’s very unlikely that another miner will find more than one block first - particularly if we assume other miners have negligible individual hashing power - so it’s safe to ignore that case in this approximation.

[9] This analysis is slightly simplified, as we’re not taking propagation into account for the tie-breaking scenario. However as the probability of multiple ties in a row very quickly diminishes the simplification is insignificant.

[10] Remember that a small miner has a negligible chance of finding the two blocks in a row required to beat us.