Bloom filters are one of those
simple and handy engineering tools that any engineer should have in their toolbox.

A Bloom filter is a space-efficient probabilistic data structure that represents
a set and lets you test whether an element is in it.

They are really simple to construct and look like this:

The cool thing is that to achieve such space-efficiency Bloom filters allow
for errors with an arbitrarily small probability!

Remarkably, a Bloom filter that represents a set of $1$ million items with an
error rate of $0.01$ requires only $9585059$ bits ($\approx 1.14 \text{ MiB}$),
irrespective of the length of each element in $S$!

There are many kinds of Bloom filters.
We’ll cover the Standard Bloom Filter, as
it is the basis for the others and also serves as a nice intro to more
advanced probabilistic set-membership data structures such as the
Cuckoo Filter.

False-Negatives and False-Positives

If Contains($y$) returns False, $y$ is definitely not a member of $S$.
This property is why we say Bloom filters have zero false-negatives. A
situation where Contains($e$) is False and $e \in S$ simply can’t
happen, because if $e$ had been added, all of its relevant bits would be $1$.

If Contains($y$) returns True, $y$ may or may not be a member of $S$.
This is why we say that Bloom filters may have false-positives, because
a situation where Contains($e$) is True and $e \notin S$ can occur.

Arbitrarily Small Probability of False-Positive

The cool thing about Bloom filters is that based on $n$ (number of elements in
the set $S$) and a chosen probability of false positives $P_{FP}$ we can derive
optimal $k$ (number of hash functions) and $m$ (length of bit vector $B$).

Remarkably, a Bloom filter that represents a set of $1$ million items with a false-positive
probability of $0.01$ requires only $9585059$ bits ($\approx 1.14 \text{ MiB}$) and $7$
hash functions. That is only about $9.6$ bits per element ($\frac{9585059}{1000000} \approx 9.6$)!
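
To make these numbers concrete, here is a minimal sketch that reproduces them. It assumes the commonly used closed-form expressions $m = \lceil -n \ln P_{FP} / (\ln 2)^2 \rceil$ and $k = \text{round}(\frac{m}{n} \ln 2)$, which are not derived in this section; the function name `optimal_parameters` is made up for illustration.

```python
import math

def optimal_parameters(n, p_fp):
    """Return (m, k) for n elements and a target false-positive probability p_fp.

    Assumes the standard closed-form expressions for a Standard Bloom Filter.
    """
    # Length of the bit vector, rounded up to a whole number of bits.
    m = math.ceil(-n * math.log(p_fp) / (math.log(2) ** 2))
    # Number of hash functions, rounded to the nearest integer.
    k = round((m / n) * math.log(2))
    return m, k

m, k = optimal_parameters(1_000_000, 0.01)
print(m, k)                        # 9585059 7
print(round(m / 8 / 2**20, 2))     # size in MiB: 1.14
```

Note that $m$ depends only on $n$ and $P_{FP}$, never on how long the individual elements are.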

Toy Implementation

The first question that comes to mind is: how exactly do we get a family of hash
functions $h_1, \ldots, h_k$ that are uniformly distributed?

Since our objective is learning and we are not aiming for efficiency, we can
simply build our family of hash functions on top of an existing
hash function as a primitive, such as
SHA256.
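
One way to sketch this idea in Python is shown below. The construction $h_i(x) = \text{SHA256}(i \,\|\, x) \bmod m$ is an assumption chosen for illustration (prefixing the input with the index $i$ is just one of several ways to derive a family from a single primitive), and the class and method names are hypothetical.

```python
import hashlib

class BloomFilter:
    """Toy Standard Bloom Filter; built for clarity, not for speed."""

    def __init__(self, m, k):
        self.m = m                            # length of the bit vector B
        self.k = k                            # number of hash functions
        self.bits = bytearray((m + 7) // 8)   # m bits, all initially 0

    def _positions(self, item):
        # Assumed construction: h_i(x) = SHA256(i || x) mod m, for i = 0..k-1.
        for i in range(self.k):
            digest = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(digest, "big") % self.m

    def add(self, item):
        # Set the k bits selected by the hash functions.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def contains(self, item):
        # True only if every one of the k selected bits is set.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )

bf = BloomFilter(m=1024, k=7)
bf.add(b"alice")
print(bf.contains(b"alice"))   # True: added elements are always found
```

Reducing a 256-bit digest modulo $m$ introduces a tiny bias when $m$ is not a power of two, which is negligible for a toy like this.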

Bitcoin’s blockchain is made out of a sequence of blocks. Each block contains
transactions and each transaction cryptographically instructs to transfer $X$
Bitcoins from a previous transaction to a Bitcoin address.

An ECDSA public key $Q$ is the outcome of multiplying a private key $d$ with
some known constant base point $G$. That is, $Q = d \times G$.

How is an ECDSA private key generated?

An ECDSA private key $d$ is simply an integer that is preferably generated using a
cryptographically secure random number generator. Anyone that knows $d$ can
redeem Bitcoins that were sent to $A$.

What’s a Brainwallet address?

A Brainwallet is simply a Bitcoin address whose corresponding private key $d$
was generated using a mnemonic (!) rather than a secure random number
generator. One possible Brainwallet construction looks like:
$$
d = \text{SHA256(MNEMONIC)}
$$
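
As a rough illustration of that construction, the sketch below derives $d$ from a mnemonic with Python's `hashlib`. It assumes the mnemonic is UTF-8 encoded and the digest is read as a big-endian integer; the function name and the example phrase are hypothetical.

```python
import hashlib

def brainwallet_private_key(mnemonic):
    """Derive d = SHA256(MNEMONIC) as an integer.

    Assumes UTF-8 encoding and big-endian interpretation of the digest.
    """
    digest = hashlib.sha256(mnemonic.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")

d = brainwallet_private_key("correct horse battery staple")
print(hex(d))
```

Because the derivation is deterministic, anyone who guesses the mnemonic obtains the same $d$, which is exactly what makes such keys weak.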

Recall that we defined $P_1$ to be the probability that a specific bit is set to
$1$. This specific bit might be set to $1$ by either $h_1$ or $h_2$ or both.
That is, we search for the probability of the union of independent events
that are not mutually exclusive.
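
Written out for this two-hash case, and assuming a single element is inserted and each hash function picks any given bit with probability $\frac{1}{m}$ independently of the other, the union works out to:

$$
P_1 = P(h_1 \text{ sets the bit}) + P(h_2 \text{ sets the bit}) - P(\text{both set the bit}) = \frac{1}{m} + \frac{1}{m} - \frac{1}{m^2}
$$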

We subtract $\frac{1}{m^2}$, since otherwise we “count” the event that both
$h_1$ and $h_2$ set the bit to $1$ twice. This is due to the Inclusion-Exclusion Principle.

As evident from above, if we continue this way we end up with a rather
intricate expression for $P_1$. For this reason, most derivations of the false-positive
probability use the complementary event to sidestep it.

Let’s define $P_0$ to be the probability that a certain bit is not set to $1$.

If we knew $P_0$ we could easily compute $P_1$ and $P_{FP}$:

$$
P_{FP} = P_{1}^{k} = (1 - P_0)^{k}
$$

So, how do we calculate $P_{0}$ ?

Let’s start with an empty Bloom filter $B$ again and add elements from $S$.

When calculating $P_1$ we wanted the probability of the event that at least one
of the hash functions sets a specific bit. For $P_0$, however, we want the probability that none of the hash functions sets that bit!
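
As a numeric sanity check, the sketch below plugs the earlier example into this idea. It assumes the standard result that, with independent and uniform hash functions, a given bit is still $0$ after inserting $n$ elements with probability $P_0 = (1 - \frac{1}{m})^{kn}$; that expression is not derived in this section, and the function name is hypothetical.

```python
def false_positive_probability(n, m, k):
    """Estimate P_FP = (1 - P_0)^k, assuming P_0 = (1 - 1/m)^(k*n)."""
    # Probability that a specific bit was missed by all k*n hash evaluations.
    p0 = (1 - 1 / m) ** (k * n)
    # A false positive needs all k probed bits to be 1.
    return (1 - p0) ** k

p_fp = false_positive_probability(n=1_000_000, m=9_585_059, k=7)
print(p_fp)   # close to the target of 0.01
```

The result lands slightly above $0.01$ because $k$ has to be rounded to a whole number of hash functions.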