Notes to self.

This is an explanation of the key function on Proof-of-Work in the
Bitcoin blockchain. It focuses on the one feature of Proof-of-Work
that is essential and shows that other features often talked about
such as security are secondary side-effects, useful, but not
essential.

This explanation rests on illustrating a few interesting properties of
how Proof-of-Work is used in the blockchain that are not immediately
obvious and sometimes are rather counter-intuitive, for example how
participants collectively solve a problem without ever
communicating.

Having understood each of these properties, one should conclude that
Proof-of-Work is primarily a mechanism which accomplishes a
distributed and decentralized system of timing, i.e. a clock.

Note that this write up isn’t about Proof-of-Work per se, it explains how the
blockchain takes advantage of it. If you do not know anything about Proof-of-Work,
then this link might be a
good start.

The Decentralized Ledger Time Ordering Problem

Before describing the solution, let us focus on the problem. Much of
the literature around Proof-of-Work is so confusing because it
attempts to explain the solution without first identifying the
problem.

Any ledger absolutely needs order. One cannot spend money that has not
been received, nor can one spend money that is already
spent. Blockchain transactions (or blocks containing them) must be
ordered, unambiguously, and without the need for a trusted third
party.

Even if the blockchain was not a ledger but just data like a log of
some sort, for every node to have an identical copy of the blockchain,
order is required. A blockchain in a different order is a different
blockchain.

But if transactions are generated by anonymous participants all over
the world, and no central party is responsible for organizing the
list, how can it be done? For example transactions (or blocks) could
include timestamps, but how could these timestamps be trusted?

Time is but a human concept,
and any source of it, such as an atomic clock,
is a “trusted third party”. Which, on top of everything, is
slightly wrong most of time due to network delays as well as
the effects of Relativity.
Paradoxically, relying on a timestamp to
determine event order is not possible in a decentralized system.

The “time” we are interested in is not the year, month, day, etc. that
we are used to. What we need is a mechanism by which we can verify that
one event took place before another or perhaps concurrently.

First though, for the notions of before and after to be applicable, a
point in time needs to be established. Establishing a point in time
may seem theoretically impossible at first because there is no
technology accurate enough to measure a
Planck. But as you’ll
see, Bitcoin works around this by creating its own notion of time
where precise points in time are in fact possible.

Timing is the Root Problem

It must be stressed that the impossibility of associating events with points in time
in distributed systems was the unsolved problem that
precluded a decentralized ledger from ever being possible until
Satoshi Nakamoto invented a solution. There are many other technical details that
play into the blockchain, but timing is fundamental and paramount.
Without timing there is no blockchain.

Proof-of-Work Recap

Very briefly, the Bitcoin Proof-of-Work is a value whose
SHA-2 hash
conforms to a certain requirement which makes such a value difficult
to find. The difficulty is established by requiring that the hash is
less than a specific number, the smaller the number, the more rare the
input value and the higher the difficulty of finding it.

It is called “Proof Of Work” because it is known that a value with
such a hash is extremely rare, which means that finding such a value
requires a lot of trial and error, i.e. “work”. Work in turn implies
time.

By varying the requirement, we can vary the difficulty and thus the
probability of such a hash being found. The Bitcoin Difficulty adjusts
dynamically so that a proper hash is found on average once every ten
minutes.

Nothing Happens Between Blocks

The state of the chain is reflected by its blocks, and each new block
produces a new state. The blockchain state moves forward one block at
a time, and the average 10 minutes of a block is the smallest measure
of blockchain time.

SHA is Memoryless and Progress-Free

The Secure Hash Algorithm is what is known in statistics and
probability as memoryless.
This is a property that is particularly counter-intuitive for us humans.

The best example of memoryless-ness is a coin toss. If a coin comes up
heads 10 times in a row, does it mean that the next toss is more
likely to be tails? Our intuition says yes, but in reality each toss
has a 50/50 chance of either outcome regardless of what happened immediately
prior.

Memorylessness is required for the problem to be progress-free.
Progress-free means that as miners try to solve blocks iterating over
nonces, each attempt is a
stand-alone event and the probability of finding a solution is
constant at each attempt, regardless of how much work has been done in
the past. In other words at each attempt the participant is not
getting any “closer” to a solution or is making no progress. And a miner
who’s been looking for a solution for a year isn’t more likely to
solve a block at the next attempt than a miner who started a second ago.

The probability of finding the solution given a specific difficulty in
a given period of time is therefore determined solely by the speed at
which all participants can iterate through the hashes. Not the prior
history, not the data, just the hashrate.

The hashrate in turn is a function of the number of participants and
the speed of the equipment used to calculate the hash.

The SHA Input is Irrelevant

In the Bitcoin blockchain the input is a block header.
But if we just fed it random values, the probability of finding a
conforming hash would still be the same. Regardless of whether the
input is a valid block header or bytes from /dev/random, it is going
to take 10 minutes on average to find a solution.

Of course if you find a conforming hash but your input wasn’t a valid
block, such a solution cannot be added to the blockchain, but it is
still Proof-of-Work (albeit useless).

The Difficulty is Intergalactic

Curiously, the difficulty is universal, meaning it spans the entire
universe. We could have miners on Mars helping out, they do not need
to know, or communicate with the Earth miners, the problem would still be
solved every 10 minutes. (Ok, they’ll need to somehow tell the Earth
people that they solved it if they do, or else we’ll never know about
it.)

Remarkably, the distant participants are communicating without
actually communicating, because they are collectively solving the same
statistical problem and yet they’re not even aware of each other’s
existence.

This “universal property” while at first seemingly magical is actually
easy to explain. I used the term “universal” because it describes it
well in one word, but really it means “known by every participant”.

The input to SHA-256 can be thought of as an integer between 0 and
2256 (because the output is 32 bytes, i.e. also between 0
and 2256, anything larger guarantees a collision,
i.e. becomes redundant). Even though it is extremely large
(exponentially larger
than the number of atoms in the perceivable universe), it is a set of numbers that is known by every participant
and the participants can only pick from this set.

If the input set is universally known, the function (SHA-256) is
universally known, as well as the difficulty requirement is universally known,
then the probability of finding a solution is also indeed “universal”.

Trying a SHA Makes You a Participant

If the stated problem is to find a conforming hash, all you have to do
is to try it once, and bingo, you’ve affected the global hash rate,
and for that one attempt you were a participant helping others solve
the problem. You did not need to tell others that you did it (unless
you actually found a solution), others didn’t need to know about it,
but your attempt did affect the outcome. For the whole universe, no
less.

If the above still seems suspicious, a good analogy might be the
problem of finding large prime numbers. Finding the largest prime
number is hard and once one is found, it becomes “discovered” or
“known”. There is an infinite number of prime numbers, but only one
instance of each number in the universe. Therefore whoever attempts to
find the largest prime is working on the same problem, not a separate
instance of it. You do not need to tell anyone you decided to look for
the largest prime, you only need to announce when you find one. If no
one ever looks for the largest prime, then it is never going to be
found. Thus, participation (i.e. an attempt to find one), even if it’s
in total secrecy, still affects the outcome, as long as the final
discovery (if found at all) is publicized.

Taking advantage of this mind-boggling statistical phenomenon whereby
any participation affects the outcome even if in complete secrecy and
without success, is what makes Satoshi’s invention so remarkably
brilliant.

It is noteworthy that since SHA is progress-free, each attempt could be
thought of as a participant joining the effort and immediately
leaving. Thus miners join and leave, quintillions of times per second.

The Participation is Revealed in Statistics

The magical secret participation property also works in reverse. The
global hashrate listed on many sites is known not because every miner
registered at some “miners registration office” where they report
their hash rate periodically. No such thing exists.

The hash rate is known because for a solution of a specific difficulty
to be found in 10 minutes, on average this many attempts
(~1021 as of this writing) had to have been made by someone
somewhere.

We do not know who these participants are, they never announced that
they are working, those who did not find a solution (which is
practically all of them) never told anyone they were working, their
location could have been anywhere in the universe, and yet we know
with absolute certainty that they exist. Simply because the problem
continues to be solved.

Work is a Clock

And there is the crux of it: The difficulty in finding a conforming
hash acts as a clock. A universal clock, if you will, because there
is only one such clock in the universe, and thus there is nothing to
sync and anyone can “look” at it.

It doesn’t matter that this clock is imprecise. What matters is that
the this is the same clock for everyone and that the state of the
chain can be tied unambiguously to the ticks of this clock.

This clock is operated by the multi-exahash rate of an unknown
number of collective participants spread across the planet,
completely independent of one another.

Last Piece of the Puzzle

The solution must be the hash of a block (the block header, to be precise). As
we mentioned, the input doesn’t matter, but if it is an actual block,
then whenever a solution is found, it happened at the tick of our
Proof-of-Work clock. Not before, not after, but exactly at. We know
this unambiguosly because the block was part of that mechanism.

To put it another way, if blocks weren’t the input to the SHA256
function, we’d still have a distributed clock, but we couldn’t tie blocks to
the ticks of this clock. Using blocks as input addresses this issue.

Noteworthy, our Proof-of-Work clock only provides us with ticks. There
is no way tell order from the ticks, this is what the Merkle tree is
for.

What About the Distributed Consensus?

Consensus means agreement. What all participants have no choice but to
agree on is that the clock has ticked. Also that everyone knows the
tick and the data attached to it. And this, in fact, does solve the
Byzantine Generals Problem, as Satoshi explained in an email
referenced earlier.

There is a separate consensus in a rare but common case of two
consecutive ticks being associated with conflicting blocks. The
conflict is resolved by what block will be associated with the next
tick, rendering one of the disputed blocks “orphan”. How the chain will
continue is a matter of chance, and so this too could probably be
indirectly attributed to the Proof-of-Work clock.

And that is it

This is what Proof-of-Work does for the blockchain. It is not a
“lottery” where miners win the right to solve a block, nor is it some
peculiar conversion of real energy into a valuable concept, those
are all red herrings.

For example the lottery and the miner’s reward aspect is what
encourages miners to participate, but it isn’t what makes the
blockchain possible. Blocks are a Merkle tree, but again, that has
nothing to do with Proof-of-Work, it cryptographically reinforces
recording of the block ordering. The Merkle tree also makes the
previous ticks “more certain”, “less deniable” or simply more secure.

Proof-of-Work is also the mechanism by which blocks become effectively
immutable, and that’s a nice side-effect which makes Segregated
Witness possible, but it could just as well be done by preserving the
signatures (witness), so this too is secondary.

Conclusion

The Bitcoin blockchain Proof-of-Work is simply a distributed, decentralized clock.

If you understand this explanation, then you should have a much better
grasp of how Proof-of-Work compares to Proof-of-Stake,
and it should
be apparent that the two are not comparable: Proof-Of-Stake is about
(randomly distributed) authority, while Proof-of-Work is a clock.

In the context of the blockchain, Proof-of-Work is probably a
misnomer. The term is a legacy from the
Hashcash project, where it
indeed served to prove work. In the blockchain it is primarily about
verifiably taking time. When one sees a hash that satisfies the
difficulty, one knows it must have taken time. The method by which the
delay is accomplished is “work”, but the hash is primarily interesting
because it is a proof of time.

The fact that Proof-of-Work is all about time rather than work also
suggests that there may be other similar statistical challenges that
are time-consuming but require less energy. It may also mean that the
Bitcoin hashrate is excessive and that the Bitcoin clock we described
above could operate as reliably on a fraction of the hashrate, but it
is the incentive structure that drives up the energy consumption.

Figuring out a way to pace ticks with less work is a trillion dollar
problem, if you find one, please do let me know!

In a previous post I wrote
some initial thoughts on storing the blockchain in Postgres. It’s been
a couple of months and I’ve made some progress on the
import project. This post documents
the latest incarnation of the SQL schema used to store the
blockchain as well as thoughts on why it was decided to be this way.

Blockchain Data Structure Overview

A transaction consist of inputs and outputs. The inputs reference
outputs from prior transactions, which may include transactions in the
same block. When an output is referenced by an input, the output is
considered spent in its entirety, i.e. there is no way to spend a part
of an output.

When two different transactions’ inputs reference the same output, it is
considered a double spend,
and only one of the spending transactions is valid (the details
of how validity is determined are outside the scope of this write up).
While double-spends do imply that one of the transactions is invalid,
it is not uncommon for double-spends to exist at least for a period of time,
thus the database schema needs to allow them.

A transaction’s input value is the sum of its inputs and the output
value is the sum of its outputs. Naturally, the output value cannot
exceed the input value, but it is normal for the output value to be
less than the input value. The discrepancy between the input and the
output is the transaction fee
and is taken by the miner solving the block in which the transaction is included.

The first transaction in a block is referred to as the
coinbase. Coinbase is a special transaction where the inputs refer
to a (non-existent) transaction hash of all zeros. Coinbase outputs
are the sum of all the fees and the miner reward.

Curiously it is possible for the same coinbase transaction to be
included in more than one block, and there is at least
one case
of this in the blockchain. The implication of this is that the second
instance of such a transaction is unspendable. This oddity was
addressed by a change in the consensus which requires the block
height to be referenced in the coinbase and is since then no longer
possible (see BIP30).

The same transaction can be included in more than one block. This is
common during chain splits, i.e. when more than one miner solves a
block. Eventually one of such blocks will become orphaned, but there
is a period of time during which it is not known which chain is
considered “best”, and the database structure needs to accommodate
this. Chain splits also cause multiple blocks to have the same height
which implies that height alone cannot identify a particular block or
that it is unique.

With introduction of SegWit transactions also include witness
data. Witness is stored at the end of a transaction as a list where
each entry corresponds to an input. A witness entry is in turn a list,
because an input can have multiple signatures (aka witness). Presently
per-input witness list is stored in the input record as a BYTEA.

Row Ids and Hashes

In the blockchain blocks and transactions are always referred to
through their hash. A hash is an array of 32 bytes. While in theory
we could build a schema which relies on the hash as the record
identifier, in practice it is cumbersome compared to the traditional
integer ids. Firstly, 32 bytes is four times larger than a BIGINT and
eight times larger than an INT, which impacts greatly the amount of
space required to store inputs and outputs as well as degrades index
performance. For this reason we use INT for block ids and BIGINT for
transaction ids (INT is not big enough and would overflow in a few
years).

There is also an ambiguity in how the hash is printed versus how it is
stored. While the SHA standard does not specify the endian-ness of the
hash and refers to it as an array of bytes, Satoshi Nakomoto decided
to treat hashes as little-endian 256-bit integers. The implication
being that when the hash is printed (e.g. as a transaction id) the order of
bytes is the reverse of how it is stored in the blockchain.

Using integer ids creates a complication in how inputs reference
outputs. Whereas in the blockchain it is done entirely via a
transaction hash, here we need to also store the integer id of the
referenced transaction (prevout_tx_id). This is an easily
justifiable optimization, without it to lookup the input transaction
would require first finding the transaction integer id. The downside
is that during the initial import maintaining the hash to integer id
correspondence in an efficient manner is bit of a challenge.

Integers

Most integers in Core are defined as uint32_t, which is an unsigned
4-byte integer. Postgres 4-byte INT is signed, which presents us
with two options: (1) use BIGINT instead, or (2) use INT with the
understanding that larger values may appear as negative. We are opting
for the latter as preserving space is more important and for as long as
all the bits are correct, whether the integer is interpreted as signed
or unsigned is of no consequence.

Blocks

Blocks are collections of transactions. It is a many-to-many
relationship as multiple blocks can include to the same transaction. The
CBlockHeader
is defined in Core as follows:

Columns orphan, status, filen and filepos are from the
CBlockIndex
class which is serialized in LevelDb and not formally part of the blockchain.
It contains information about the file in which the block was stored
on-disk as far as Core is concerned. This information is only
necessary for debugging purposes, also note that it is unique to the
particular instance of the Core database, i.e. if you were to wipe it
and download the chain from scratch, location and even status of
blocks is likely to be different.

Note that the C++ CBlockHeader class does not actually include the
hash, it is computed on-the-fly as needed. Same is true with respect
to transaction id.

We also need a many-to-many link to transactions, which is the
block_txs table. Not only do we need to record that a transaction is
included in a block, but also its exact position relative to other
transactions, denoted by the n column:

The txid column is the transaction hash and should not be confused
with tx_id in other tables referencing the transaction. (“txid” is
what the transaction hash is typically called in code and
documentation).

The tx_id column is the transaction to which this output belongs, n
is the position within the output list.

The spent column is an optimization, it is not part of the
blockchain. An output is spent if later in the blockchain there exists
an input referencing it. Core maintains a separate LevelDb dataset
called the UTXO Set (Unspent Transaction Output Set) which contains
all unspent outputs. The reason Core does it this way is because by
default it does not index transactions, i.e. Core actually does not
have a way of quickly retrieving a transaction from the store as there
generally is no need for such retrieval as part of a node operation,
while the UTXO Set is both sufficient and smaller than a full
transaction index. Since in Postgres we have no choice but to index
transactions, there is no benefit in having UTXOs as a separate table,
the spent flag serves this purpose instead.

The UTXO Set does not include any outputs with the value of 0, since
there is nothing to spend there even though no input refers to them
and they are not technically spent.

Inputs

An input in Core is represented by the
CTxIn
class, which looks like this:

The COutPoint class is a combination of a hash and an integer
representing an output. CScriptWitness is an array of “witnesses” or
(roughly speaking) signatures, which are byte arrays, just like the
scriptSig.

As we already mentioned above witness is stored as opaque bytes. The
prevout_tx_id is the database row id of the transaction this input
is spending.

Indexes and Foreign Key Constraints

Blocks and transactions are indexed by id as their primary index.
Blocks also need an index on hash (unique), as well as on height
and on prevhash (not unique). Transactions need a unique index on
the txid.

Inputs and outputs need (tx_id, n) as primary indexes. Inputs are
also indexed on (prevout_tx_id, prevout_n) so that we can quickly
identify the spending input given an output.

Finally, we need a basic set of foreign key constraints that ensure
the integrity between all the related tables.

Triggers

The spent column in the output and the prevout_tx_id of an input
are maintained by a trigger on the txins table. Every time an input
is inserted, it locates the database id of the transaction it spends
as well as updates the spent flag of the corresponding output.

Technically it is done using two triggers for performance
reasons. This is because a trigger that modifies the row being
inserted must be a BEFORE trigger, but BEFORE triggers are not allowed
to be to be CONSTRAINT triggers. CONSTRAINT triggers have the
advantage of being deferrable, i.e. they can be postponed until
(database) transaction commit time. Deferring constraints can speed up
inserts considerably, for this reason the code that updates spent is
in a separate AFTER trigger.

The trigger code is still rough around the edges, but here it is for
posterity anyway:

Identifying Orphaned Blocks

While this is not part of the schema, I thought it would be
of interest to the readers. An orphaned block is a block to
which no other prevhash refers. At the time of a chain split we
start out with two blocks referring to the same block as previous, but
the next block to arrive will identify one of the two as its previous
thereby orphaning the other of the pair.

To identify orphans we need to walk the chain backwards starting from
the highest height. Any block that this walk does not visit is an
orphan.

In SQL this can be done using the WITH RECURSIVE query like so:

12345678910111213141516171819202122

UPDATEblocksSETorphan=a.orphanFROM(SELECTblocks.id,x.idISNULLASorphanFROMblocksLEFTJOIN(WITHRECURSIVErecur(id,prevhash)AS(SELECTid,prevhash,0ASnFROMblocks-- this should be faster than MAX(height)WHEREheightIN(SELECTheightFROMblocksORDERBYheightDESCLIMIT1)UNIONALLSELECTblocks.id,blocks.prevhash,n+1ASnFROMrecurJOINblocksONblocks.hash=recur.prevhash%s)SELECTrecur.id,recur.prevhash,nFROMrecur)xONblocks.id=x.id)aWHEREblocks.id=a.id;

The WITH RECURSIVE part connects rows by joining prevhash to hash,
thereby building a list which starts at the highest hight and going
towards the beginning until no parent can be found.

Then we LEFT JOIN the above to the blocks table, and where there is
no match (x.id IS NULL) we mark it as orphan.

Conclusion

Devising this schema was surprisingly tedious and took many trial and
error attempts to reimport the entire blockchain which collectively
took weeks. Many different variations on how to optimize operations
were attempted, for example using an expression index to only index
a subset of a hash (first 10 bytes are still statistically unique),
etc.

I would love to hear comments from the database experts out there. I’m
not considering this version “final”, there is probably still room for
improvement and new issues might be discovered as I progress to
writing up how to insert new blocks and actually verify blocks and
transactions.

The Missing Functionality

Support for Variable Length Integer
used in the blockchain and more generally the binary encoding of a transaction or its components.

Elliptic Curve Signature. Even
though postgres integrates with OpenSSL, which has that functionality, there is no way to call
the EC functions.

Ability to parse and evaluate Bitcoin script. This is a biggie,
as transaction verification requires it, and it is one of the more complex and bug-prone
aspects of Bitcoin.

It is also important that all of the above be performant. Even though
varints, script and even elliptic curve could be implemented in plain
PL/pgSQL,
it probably wouldn’t be fast enough for practical use. Which leaves us with the only possible option:
a C extension.

Avoid Reinventing the Wheel

Anything is possible in C, but can we avoid having to reimplement it
from scratch? Are there libraries that could be leveraged?

As it is now, the Bitcoin protocol is primarily specified by its
source code, and the source of all truth is the Bitcoin Core.
It is possible to use C++ in PG
extensions, which means at least in theory the Bitcoin Core code could be leveraged somehow.

My initial conclusion is that this would be a daunting task. Bitcoin
Core code requires at least C++11, as well as Boost. It also seems
that the core code assumes its own specific storage and caching mechanism and
isn’t easily abstracted away from it. Not to mention that using C++
libs from Postgres has complexities of its own.

I looked around for a plain C implementation of Bitcoin and found a few
rather incomplete ones. The most functional one seems to be Jeff Garzik’s
picocoin. With the looming
Segwit2x fork
and all the controversy surrounding it this may seem like an odd
choice of a library, but for the purpose of what we are doing, I think
it’s fine. It also seems like Picocoin isn’t actively developed,
which is not great. I would very much appreciate opinions/advice on this, if
you know of a better C lib, do leave a comment.

The C extension

Thanks to this excellent series of posts
and Postgres’ superb documentation, I was able to put together a proof-of-concept extension,
available at https://github.com/blkchain/pg_blkchain.
While the C internals of it would be subject for a whole separate post (or
few), suffice it to say that it is fairly rudimentary and all the
heavy lifting is delegated to the picocoin lib.

get_vout(tx bytea) Similarly to get_vin(), an SRF that returns outputs.

parse_script(script bytea) An SRF which parses a Bitcoin script and returns (more or less) human-readable rows.

verify_sig(tx bytea, previous_tx bytea, n int) Verifies a specific input of a transaction (denoted by n),
given a the previous transaction to which the input refers. Returns a boolean.

This is hardly enough to support all of what would be required by a
full node, but this is sufficient to do some interesting stuff.

Note that the function names and signatures are not final, this is a
work in progress and I expect this all to evolve and change. For
example, initially I implemented get_vout() which returned an array,
but in the end an SRF seemed like a more flexible approach.

The Schema

In the last post I used separate tables for the transaction, inputs
and outputs. With the ability to serialize/deserialize transactions at
our disposal, there are more interesting options.

The most compact way to store transactions is to just use the
serialized binary form in a binary (bytea) column. We can get at any
particulars of it by using our functions.

The examples below are based on a single table created as

1234

CREATETABLErtxs(idBIGINTNOTNULL,txBYTEANOTNULL);

I imported the first 100K blocks or so into this table, how it was
done I might describe in a separate post.

I’ll introduce the extension with my favorite example: the decoding of the
signature of the genesis block input:

Now if we want to be able to look up transactions quickly by the
transaction hash, as is the convention, we can create an expression
index like so:

1

CREATEINDEXONrtxs(digest(digest(tx,'sha256'),'sha256'));

When we do this, PostgreSQL scans the entire table, computes the hash
and stores it in the index. An index, after all, is just another table
(of sorts), and there is nothing wrong with indexes containing values
that do not exist in the table to which the index refers.

Once we do this, any time the expression digest(digest(tx, 'sha256'), 'sha256')
is used in reference to the rtxs table, PostgreSQL will not execute
the digest() function, but would instead use the value stored in
the index.

--Note: this is obviously not the full blockchainSELECT(parse_script(scriptpubkey)).op_sym,count(1)FROM(SELECTscriptpubkeyFROMrtxouts)xGROUPBYop_symORDERBYcount(1);op_sym|count----------------+---------OP_NOP|5OP_DUP|1007586OP_EQUALVERIFY|1007586OP_HASH160|1007586OP_PUSHDATA|1139431OP_CHECKSIG|1151434(6rows)

Anyway, that’s it for now. Please comment your questions/comments
below, or via twitter, I am very curious on what people think on this
approach!

In theory, Postgres should be able to verify transactions and blocks,
as well as do a lot of other things that are currently only done by
full nodes. For this to be performant, it will most likely require an
extension written in C, but I’m curious how far we can get with bare
bones Postgres.

More importantly, would that actually be useful? A node is really
just a database, a very efficient one for a very specific purpose, but
would leveraging the full power of Postgres be somehow more beneficial
than just running Bitcoin-Qt or btcd, for example?

To get to the bottom of this would be a lot of work, and potentially a
lot of fun. It would also be a great blockchain learning exercise. (If
you’re working on a PG extension for Bitcoin or more generally
blockchain, please do let me know!)

Random Thoughts

The structure of the Bitcoin blockchain is relatively simple. We have
transactions, which in turn have inputs and outputs and belong
to blocks. Four tables, that’s it.

I’ve been able to import the whole blockchain with some fairly basic
Go code into my old Thinkpad running Linux overnight. The Go code
needs some more polishing and is probably worthy of a separate write
up, so I won’t get into it for now. Below is the schema I used. I
intentionally left out referential integrity and indexes to keep it
simple and avoid premature optimization.

There are a couple projects out there that keep the blockchain in a
database, most notably
Abe. I haven’t studied
the code very carefully, but my initial impression was that Abe tries
to use standard SQL that would work across most big databases, which
is philosophically different from my objective of going 100% Postgres
and leveraging all that it can do for us.

Bitcoin uses a lot of uint32’s. A Postgres INT is the correct size,
but it is signed, which means we have to use the next larger type,
BIGINT. It seems like it might be a waste to use 64 bits for a 32-bit
value, but I couldn’t think of anything better than a BIGINT. For the
binary stuff it seems like BYTEA is the best match.

So what can we do with this? There is no easy way to create or verify an
Elliptic Curve signature
in Postgres, but with the help of the pgcrypto
extension, we should be able to at least generate the correct SHA256
digest which is used in the signature. As a side note, EC signature math is actually
remarkably simple and could probably be implemented
as a PG function, but I’m too lazy. Here it is in a
few lines of Python.

For the purposes of this exercise, I’d just be happy with a value that
matches, even if the code does not fully comply with the Bitcoin rules.

One problem I ran into was that, because Bitcoin blockchain is
little-endian except for where it isn’t, you often need a way to
reverse bytes in a BYTEA. Strangely, Postgres does not provide a way
to do that, unless I’m missing something. But thanks to
stackoverflow,
here is one way to do this:

The particular transaction we are looking at is
this.
It happens to have id of 37898 in my database. In case you’re
wondering, for this example I used a subset of the blockchain which
only has the first 182,000 blocks. On the full blockchain and without
indexes, this statement would have taken an eternity to execute.

What makes this particular transaction interesting is that it has two
inputs, which is slightly trickier, because to spend them, there need to
be two different signatures of the same transaction. This is because
before signing, the input scriptSig needs to be replaced with the
output’s scriptPubKey (the oversimplified version). This is reflected in the SQL
in the use of LATERAL and CASE.

You do not have to take my word that the two hashes are correct, we
can verify them fairly easily with a bit of help from the Python ecdsa
library. Here is the code to verify the second hash. The key and the
signature are in the
transaction itself.

As of Sep 28, 2017, according to blockchain.info the hashrate is:
9,214,860,125 GH/s.

These days it seems that the best miner available for sale is the
AntMiner S9. It is actually over a year old, and there are faster and
more energy efficient ASICs now, e.g. BitFury, but it is very hard to
get any information on those, so we will just use the S9 information.

The S9 is capable of 12,930 GH/s. The collective Bitcoin hash rate is
equivalent to 712,672 S9 miners running in parallel.

An S9 uses 1375W, which means that in 1 hour it consumes 1.375 kW/h.

In USA, a kWh costs $0.12 on average. (It can be
as low as 0.04, according to this EIA chart.)

At 12c per kWh a running S9 costs $0.165 per hour.

712,672 running S9’s would cost $117,591.02 per hour.

Bitcoin blocks are solved at 6 per hour on average. Thus, each block
costs $19,598.50 to solve.

The current mining reward is 12.5 BTC, which gives us the answer:

At \$0.12 kW/h a Bitcoin costs \$1,567.88 to mine.

At \$0.04 kW/h a Bitcoin costs \$522.62 to mine.

This, of course, does not include hardware and other costs.

It’s quite likely that the largest mining operations pay even less
than $0.04 for electricity and the hardware they use is many times
more efficient.

While grosly inaccurate, this shows that mining is quite profitable,
and that Bitcoin price would have to fall a lot for mining to stop
being profitable.

Looking at the Trend

Current difficulty is 1103400932964, The difficulty before that was
922724699725.

Difficulty adjusts every 2,016 blocks, which is about two weeks.

The Difficulty number is a coefficient of the “difficulty 1 target”,
i.e. where the hash has to begin with 4 zero bytes (32 zero bits). It
means is that it is N times harder than “1 target”.

We can see that at the last adjustment it went up by 180,676,233,239,
or 16%, which is quite a bit in just two weeks. The last adjustment
before that was from 888171856257, or 4%.

Assuming that the only miner in the world was the S9, the difficulty
adjustment can only be explained by more S9’s coming online. The
number of S9’s online is directly proportional to the hashrate, which
is directly proportional to the difficulty. Thus there is a direct
relationship between energy cost and the difficulty.

When the difficulty was 922724699725 (Sep 6 through 16), the hash rate
was at about 8,000,000 TH/s, or equivalent of 618,716 S9’s. At that
difficulty and the 12c kW/h price, a BTC cost $1,361 to mine.

Now let’s look back at the world before the S9, which uses the Bitmain
BM1387 16nm ASIC. Before the S9, there was S7, based on BitMain BM1385
28nm ASIC. The S7 power consumption is roughly same as S9, or let’s
assume it is for simplicity, but it is only capable of 4,000 GH/s.

Back at the end of 2015 when S7 was announced, the hashrate was at
around 700,000,000 GH/s, or equivalent to 175,000 S7’s. That cost
\$28,875 per hour, or \$4,812.5 per block. The block reward was 25
Bitcoins then, so a Bitcoin would cost only \$192 to mine. (With a 12.5
reward it would have been \$385).

This is all very confusing, but we can see that faster hardware and
more of it drives the cost of mining up and the rlationship between
the difficulty and the cost of mining a Bitcoin is linear. Faster
hardware enables higher hash rate at improved energy efficiency, and
the difficulty adjusts to keep the rate of blocks and supply of new
BTC at 10 minutes.

The cost factor behind Bitcoin is energy, and spending more energy on
mining makes a Bitcoin more expensive and less profitable. However, a
more energy-expensive Bitcoin is a more sound/secure Bitcoin from the
cryptographic perspective, which means it is likely to go up in USD
price, and thus should still be profitable for the miners. This is a
very interesting factor here, because if the BTC/USD price wasn’t
going up, the miners would be bitter enemies and would do everything
possible to prevent more miners from coming online. The rise of the
BTC/USD price is what justifies as positive more miners coming
online. So far we have not seen any news reports of mining facilities
being sabotaged, which probably means miners are not enemies.

I will need to think on this some more as there are a lot of moving
parts. But if I can make a cursory conclusion here, it is that
(industrial) mining is and will remain very profitable for some time.

In part 1 I explained
how money has always been a global ledger and Bitcoin is just a
different implementation of one. The million dollar question remains,
what should Bitcoin be worth in a currency we’re more familiar with,
such as USD?

Asset Pricing

To illustrate the dilemma we’re faced with, lets look at three types
of assets and how we price them.

Stocks: price is determined by the present and future profits, which
is relatively straight forward.

Commodities (e.g oil): price is a function of supply (oil being
produced) and demand (oil burned in engines or whatever).

Store of value: this is anything that is bought because it keeps its
value. Most commonly it is precious metals like gold, but it can also
be fiat currency or valuable works of art. This category is most
fitting for Bitcoin. Pricing of a store of value is strangely
arbitrary, I attempt to explain it below.

Speculative Demand

There are two kinds of demand. Actual demand is based in our
everyday needs. For example we rely on combustible engines which
consume oil. Engines burn oil (or its byproducts) converting it to
exhaust gases, at which point it is no more. The more we drive, the
more oil we burn, the higher the demand.

The second kind of demand is speculative. It is based on the
expectation of a future price change. Speculators buy assets they
expect to go up in price and sell those they don’t. When everyone
wants to buy oil because they think the price is going up, its price
does indeed goes up. But that is not directly related to the actual
supply of oil from the ground (it often is, but not always).

Right Price

In part 1 we covered how
a sale is actually a loan, and how the seller ends up with tokens
representing the value owed to the seller. The number of tokes (aka
price) is a reflection of how we value things relative to each other.

When we buy stuff for everyday use we establish a price range that is
driven by, for lack of a better term, common sense. For example we
may think that a loaf of bread is worth a dozen eggs. If the price of
something exceeds common sense, we will forego buying it. This means
that the price has a direct effect on demand especially when it comes
to everyday consumption items such as food or fuel.

Speculators have no respect for common sense and the right price. They
are only concerned with the trend. It is possible for the speculative
demand to drive the price way above the common sense level, we saw
that when “peak oil” was a thing. But the price will eventually
gravitate towards the common sense price.

Store of Value Price

In contrast, price for a store of value is purely speculative, which
means the sensibility of the price does not apply.

Let’s take gold, for example. Intuitively we might reason that there
is a (non-speculative) supply and demand for gold, but it’s actually
illusory. The annual production of gold is minute compared to the
total gold above ground, which means there is essentially no
supply. There is also next to no demand, because gold cannot be
consumed. There is never less or more gold available in the world, its
quantity is fairly constant. Yet the price fluctuates. The only
explanation for this is speculation.

There is simply no such thing as the “right price” for store of
value. If I want to move a million dollars into gold, the price of
gold is of no consequence to me, be it a thousand or a million dollars
per ounce. So long as I know that it is stable, it is a good store of
value.

The good news here is that no price is too high (or too low) for
Bitcoin. 4K only seems high because a year ago it was 400. We tend to
judge the price based on history, and there is good sense in that,
indeed what goes up in value too much too fast often subsequently
corrects.

Importance of Market Cap

Market capitalization is the price of all of the asset available in
the world. It’s easy to compute the market cap for a stock because we
know the number of shares outstanding. There is no such thing as a
market cap for a commodity because it is continuously produced and
consumed. When it comes to something like gold, we can estimate the
market cap because we know approximately how much physical gold is
above ground. Bitcoin, like gold, has an approximate market cap
(approximate because it is not possible to know how much BTC has been
lost).

Market cap size is critical for adoption of a store of value. It needs
to be large enough to “fit” even very large amounts of fiat, ideally
without affecting the market. Gold market cap is estimated at 7
trillion USD, which means that even the richest people can move all
their assets into gold and not move the market. (At least one at a
time. All of them at once will move the market big time).

Bitcoin market cap of about 70B USD is not large enough for even one
of the richest people on the planet. This implies that if the market
cap does not grow, Bitcoin is likely to fail as store of value.

Hash Rate

What sets Bitcoin apart from all other crypto currencies is its
extremely high hash rate. This means that a Bitcoin is orders of
magnitude more “precious” than any other crypto coin presently in
existence.

There is a definite correlation between the Bitcoin hash rate and the
price. Some people argue that hash rate follows price, not the other
way around, and it’s probably true.

Bitcoin’s high hash rate is what makes it the best store of value
among crypto coins today.

Adoption

Ultimately, I believe adoption is the most important factor in Bitcoin
USD price. Greater adoption will increase the number of speculators
willing to own Bitcoin, it will drive the price up, and hopefully
bring it to a level comparable to that of gold.

The key to adoption is not ease of payment or volume of transactions
like we used to think until very recently. The key to adoption is
understanding of the mathematics behind Bitcoin. With all the hype
surrounding it, only remarkably few understand how sound Bitcoin
actually is. In many ways it is more sound than any other store of
value known, including gold.

Regardless of whether Bitcoin becomes de facto digital gold or not, we
are witnessing a historic transformation possibly bigger than the
Internet itself.

The fundamental question about Bitcoin is not whether it is sound from
the cryptography standpoint. The question is: what is it?

Money is Debt Ink

To define Bitcoin we need to look back at the history of money. The
earliest money was in the form of things that were scarce and
impossible to falsify, something like specific kinds of sea shells.
Everyone knew that stuff could be traded for these tokens.

Once such monetary tokens were invented, we no longer needed to decide
what to barter right there and then, we could postpone the
decision. One could sell milk for tokens, then use those tokens to
buy a spear later, this way the milk didn’t spoil while waiting for
the spear to be made.

What is not very obvious is that the tokens represented debt. A
sale is really a loan in disguise. Before the sale, the seller had
milk. After the sale, the seller had tokens, which are proof that
value of tokens is owed to the seller. In other words, the tokens
received for the milk sold were a record of debt. Tokens are the
ink in which this record is written.

It is noteworthy that there is no money if there is no debt, or that
money implies debt. It’s a simple principle that so few understand.

World-Wide Debt Ledger

The best way of thinking about money is that it is the medium in
which we maintain a world-wide record of debt. The entries in this
book or ledger are written as physical tokens. Only the people in
possession of the tokens actually know how much they have and there is
no history, only the final state. The history exists only in the
minds (or records) of the traders. It is very private.

Gold Ink

Later people started using rare metals such as gold or silver as
money. Metals were better than sea shells because they were
divisible. We could now make arbitrary size tokes we called (coined?)
coins.

Although we intuitively think that gold has a lot of value, in reality
it has very little. Gold does not feed us or keep us warm. It does
have some unique properties, but back when we started using gold as
money we couldn’t possibly appreciate those, other than perhaps gold
being pretty and extremely durable.

Gold is also rare. But rarity does not imply value. The sea shells
were worthless before they were used as money, and they are worthless
now, yet they too are rare.

Banks, Paper and Records of Records (of Records)

But it turned out that keeping valuable tokens was difficult, they
could be lost or stolen, and worse, people were willing to kill for
them. And so we decided to keep them all safe in one place. This
was the original bank.

The bank issued paper notes that corresponded to the gold in the
vault. Now these paper notes could be traded for anything. This was
because people knew that even though the paper is worthless, it
represents gold that is in the bank. At any time one could go to the
bank, give the bank the paper note and receive gold (at which point
the bank would destroy the paper note because the debt is settled).

A paper note is a record of the record of debt. The true record was
in gold, paper was a copy. It’s a bit of a mind-twister, but humans
have become really good at rewriting the original debt ledger in other
mediums.

Ironically the concept of the bank as a safe vault never really
worked: people were willing to steal and kill for paper money just the
same. These days bank vaults keep paper notes as if its gold. And the
bank’s computer keeps a record of the record of the record of debt.

Real Estate Ink

At some point bankers realized that they can manipulate the monetary
supply because only the bankers actually knew how much gold they
had. It was done “for the good of the public” who could get easier
loans, but it was also an easy way for the banks to make money out of
nothing.

Eventually it was decreed that not just gold, but anyhting could be
similarly held by the bank so that vastly more paper notes could be
issued. Most notably real estate, the arrangement of issuing paper
notes for a house being known as a mortgage. And since a house cannot be
placed into the vault, it too had to be recorded, creating yet another
layer of abstraction. It all ended with collateralized debt
obligations, credit default swaps and ultimately the 2008 subprime
mortgage crisis. Next year Bitcoin was born…

Monetary System is Just a Ledger

The bottom line remains: we kept a legder. The recording medium was
precious metals, then evolved to paper and metals, and finally when we
went off the gold standard it became just paper reflecting value of
arbitrary things held under lien as collateral.

Enter Blockchain

The Bitcoin blockchian is also a medium for this legder, only
instead of relying on scarcity of precious metals, the scarcity is in
the mathematical complexity of a problem.

And this is where our minds begin to play tricks on us, because this
is a concept previously unknown to humans. A Bitcoin, which takes an
enormous amount of computational power to generate, is actually,
really scarce. Yes, it is not physical, it is just “knowlegde” or
“information”, but by all laws of nature it is scarce, in fact more
scarce than gold, the total amount of which in the universe isn’t
fully known.

But Bitcoin is just an Agreement?

Interestingly, Bitcoin is merely an agreement and one might argue that
some day we can collectively decide to increase the 21 million limit
thereby diluting Bitcoin value. But can we actually do that, or will it
not be Bitcoin at that point? I think only time will
tell.

We do already have a lot of things that we agree on and we don’t
really question how it happened. The aforementioned shells were
collectively agreed upon. We agree on what the current date is, does
it matter how it happened? In fact, much of what the world is, just
is, including the fiat money (where “fiat” literally means “let it
be done”). And so now Bitcoin just is.

The sea shells ceased to exist as money in favor or precious metals,
and it is likely that same will at some point happen with Bitcoin.

History does show that when it comes to money, people show their worst
traits. This is why countries with solid currencies have big armies
and police, and very strict laws regarding manipulation of money. This
is how “fiat” actually works.

Amazingly, the mathematic principles on which Bitcoin is based do not
need to be defended. No army in the world could ever change a single
prime number.

Alternative Realities

The name Bitcoin refers to a specific blockchain. There can be many
like it. The name could have been different, the parameters of the
algorithm could have been different, just like the dollar bills could
have been blue. There are other cryptographic currencies, and they are
different, they too now exist. (Caveat: some of them are
mathematically bogus).

One could argue that gold exists in nature, while Bitcoin was created
by man, and thus gold is somehow more real. But Bitcoin rests on the
mathematical principles that too are just part of this universe, they
were not created by man, they were discovered and applied, and again
in this sense Bitcoin isn’t much different than gold.

The Mystery of Value

The mystery to me is how we collectively set a value of things like
gold or Bitcoins. Now that we’ve demonstrated that as money, they are
equivalent. Why is an ounce of gold worth $1300? Who decided that? The
market? Is the real value of it in how good of an ink it is in the
world-wide debt ledger?

It’s been a while since I’ve written on
Tgres,
here’s a little update, Independence Day edition.

Current Status

The current status is that Tgres is looking more and more like a
finished product. It still needs time and especially user testing (the
ball is in your court, dear reader), because only time reveals the
weirdest of bugs and validates stability. I would not ditch your
current stack just yet, but at this point you’d be remiss not
having given Tgres a spin.

Recently I had an opportunity to test Tgres as a mirror replica of a
sizable Graphite/Statsd/Grafana set up receiving approximately 10K
data points per second across more than 200K series, and the results
were inspiring. Tgres handled the incoming data without breaking a
sweat on “hardware” (ec2 instances, rather) that was a fraction of the
Graphite machines while still outperforming it in most respects.

I’d say the biggest problem (and not really a Tgres one) is that
mirroring Graphite functionality exactly is next to impossible. Or,
rather, it is possible, but it would imply purposely introducing
inaccuracies and complexities. Because of this Tgres can never be a
“drop in” replacement for Graphite. Tgres can provide results that are
close but not identical, and dashboards and how the data is interpreted
would require some rethinking.

What’s new?

Data Point Versioning

In a round-robin database slot values are overwritten as time moves
forward and the archive comes full-circle. Whenever a value is not
overwritten for whatever reason, a stale value from an obsolete
iteration erroneously becomes the value for the current iteration.

One solution is to be diligent and always make sure that values are
overwritten. This solution can be excessively I/O intensive for sparse
series. If a series is sparse, then more I/O resources are spent
blanking out non-data (by setting the value to NaN or whatever) than
storing actual data points.

A much more efficient approach is to store a version number along with
the datapoint. Every time the archive comes full-circle, version is
incremented. With versions there is no need to nullify slots, they
become obsoleted by virtue of the data point version not matching the
current version.

Under the hood Tgres does this by keeping a separate array in the ts
table which contains a smallint (2 bytes) for every data point. The tv
view is aware of it and considers versions without exposing any
details, in other words everything works as before, only Tgres is a
lot more efficient and executes a lot less SQL statements.

Zero Heartbeat Series

Tgres always strives to connect the data points. If two data points
arrive more than a step apart, the slots in between are
filled in
to provide continuity. A special parameter called
Heartbeat
controls the maximum time between data points. A gap greater than the Heartbeat is
considered unknown or NaN.

This was a deliberate design decision from the beginning, and it is not changing.
Some tools choose to store data points as is,
deferring any normalization to the query time. Graphite is kind of in
the middle: it doesn’t store the data points as is, yet it does not
attempt to do any normalization either, which ultimately leads to
inaccuracies which I describe in another
post.

The concept of Heartbeat should be familiar to those experienced with
RRDTool, but it is unknown to Graphite users which has no such
parameter. This “disconnected” behavior is often taken advantage of to
track things that aren’t continuous but are event occurrences which can
still benefit from being observed as a time series. Tracking
application deploys, where each deploy is a data value of 1 is one
such example.

Tgres now supports this behavior when the the Heartbeat is set to
0. Incoming data points are simply stored in the best matching slot
and no attempt is made to fill in the gap in between with data.

Tgres Listens to DELETE Events

This means that to delete a DS all you need to do is run DELETE FROM
ds WHERE ... and you’re done. All the corresponding table rows will
be deleted by Postgres because of the foreign key constraints, and the
DS will be cleared from the Tgres cache at the same time.

This is possible thanks to the Postgres excellent
LISTEN/NOTIFY
capability.

In-Memory Series for Faster Querying

A subset of series can be kept entirely in memory. The recent testing
has shown that people take query performance very seriously, and
dashboards with refresh rates of 5s or even 1s are not unheard
of. When you have to go to the database to answer every query, and if
the dashboard touches a hundred series, this does not work too well.

To address this, Tgres now keeps an in-memory cache of queried
series. The cache is an LRU
and its size is configurable. On restart Tgres saves cache keys and loads the series
back to keep the cache “warm”.

Requests for some cached queries can now be served in literally
microseconds, which
makes for some pretty amazing user experience.

DS and RRA State is an Array

One problem with the Tgres table layout was that DS and RRA tables
contained frequently updated columns such as lastupdate, value and
duration The initial strategy was that these could be updated
periodically in a lzay fashion, but it became apparent that it was not
practical for any substantial number of series.

To address this all frequently mutable attributes are now stored in
arrays, same way as data points and therefore can be updated 200 (or
whatever segment width is configured) at a time.

To simplify querying DSs and RRAs two new views (dsv and rrav)
were created which abstract away the array value lookup.

Whisper Data Migration

The whisper_import
tool has been pretty much rewritten and has better instructions. It’s been
tested extensively, though admittedly on one particular set up, your mileage may vary.

Graphite DSL

Lots and lots of fixes and additions to the Graphite DSL
implementation. Tgres still does not support all of the functions,
but that was never the plan to begin with.

Future

Here’s some ideas I might tackle in the near future. If you are
interested in contributing, do not be shy, pull requests, issues and
any questions or comments are welcome. (Probably best to keep
development discussion in Github).

Get rid of the config file

Tgres doesn’t really need a config file - the few options that are
required for running should be command line args, the rest, such as
new series specs should be in the database.

A user interface

Not terribly high on the priority list, since the main UI is psql
for low level stuff and Grafana for visualization, but something to
list series and tweak config options might come in handy.

Track Usage

It would be interesting to know how many bytes exactly a series
occupies, how often it is updated and queried, and what is the
resource cost for maintaining it.

Better code organization

For example vcache could be a separate package.

Rethink the DSL

There should be a DSL version 2, which is not based on the Graphite
unwieldiness. It should be very simple and versatile and not have
hundreds of functions.

Authentication and encryption

No concrete ideas here, but it would be nice to have a plan.

Clustering needs to be re-considered

The current clustering strategy is flawed. It might work with the
current plan, but some serious brainstorming needs to happen
here. Perhaps it should just be removed in favor of delegating
horizontal scaling to the database layer.

A few weeks ago I started building yet another web-based app, in
Go. Being mostly a back-end developer, I don’t
have to write web apps very often, and every time I do, it seems like a great challenge.
I often wish someone would write a guide to web development for people
who do not have all day to get into the intricacies of great design
and just need to build a functional site that works without too much
fuss.

I’ve decided to use this opportunity to start from scratch and build
it to the best of my understanding of how an app ought to be built in
2017. I’ve spent many hours getting to the bottom of all things I’ve
typically avoided in the past, just so that for once in many years I
can claim to have a personal take on the matter and have a reusable
recipe that at least works for me, and hopefully not just me.

This post is the beginning of what I expect to be a short series
highlighting what I’ve learned in the process. The first post is a
general introduction describing the present problematic state of
affairs and why I think Go is a good choice. The subsequent posts have
more details and code. I am curious whether my experience resonates
with others, and what I may have gotten wrong, so don’t hesitate to
comment!

Introduction

In the past my basic knowledge of HTML, CSS and JavaScript has been
sufficient for my modest site building needs. Most of the apps I’ve
ever built were done using mod_python
directly using the publisher handler. Ironically for an early Python adopter,
I’ve also done a fair bit of work with Rails. For the past several years
I focused on (big) data infrastructure, which isn’t web development at all,
though having to build web-based UI’s is not uncommon. In fact the app I’m referring to
here is a data app, but it’s not open source and what it does really
doesn’t matter for this discussion. Anyway, this should provide some
perspective of where I come from when approaching this problem.

Python and Ruby

As recently as a year ago, Python and Ruby would be what I would
recommend for a web app environment. There may be other similar
languages, but from where I stand, the world is dominated by Python and
Ruby.

For the longest time the main job of a web application was constructing
web pages by piecing HTML together server-side. Both Python and Ruby
are very well suited for the template-driven work of taking data from
a database and turning it into a bunch of HTML. There are lots of
frameworks/tools to choose from, e.g. Rails, Django, Sinatra, Flask,
etc, etc.

And even though these languages have certain significant limitations,
such as the GIL,
the ease with which they address the complexity of
generating HTML is far more valuable than any trade-offs that came
with them.

The GIL

The Global Interpreter Lock is worthy of a separate mention. It is the
elephant in the room, by far the biggest limitation of any Python or
Ruby solution. It is so crippling, people can get emotional talking
about it, there are endless GIL discussions in both Ruby and Python
communities.

For those not familiar with the problem - the GIL only lets one thing
happen at a time. When you create threads and it “looks” like parallel
execution, the interpreter is still executing instructions
sequentially. This means that a single process can only take advantage
of a single CPU.

There do exist alternative implementations, for example JVM-based, but
they are not the norm. I’m not exactly clear why, they may not be
fully interchangeable, they probably do not support C extensions
correctly, and they might still have a GIL, not sure, but as far as I
can tell, the C implementation is what everyone uses out
there. Re-implementing the interpreter without the GIL would amount to
a complete rewrite, and more importantly it may affect the behavior of
the language (at least that’s my naive understanding), and so for this
reason I think the GIL is here to stay.

Web apps of any significant scale absolutely require the ability to
serve requests in parallel, taking advantage of every CPU a machine
has. Thus far the only possible solution known is to run multiple
instances of the app as separate processes.

This is typically done with help of additional software such as
Unicorn/Gunicorn with every process listening on its own port and
running behind some kind of a connection balancer such as Nginx and/or
Haproxy. Alternatively it can be accomplished via Apache and its
modules (such as mod_python or mod_wsgi), either way it’s
complicated. Such apps typically rely on the database server as the
arbiter for any concurrency-sensitive tasks. To implement caching
without keeping many copies of the same thing on the same server a
separate memory-based store is required, e.g. Memcached or Redis,
usually both. These apps also cannot do any background processing, for
that there is a set of tools such as Resque. And then all these
components need to be monitored to make sure it’s working. Logs need
to be consolidated and there are additional tools for that. Given the
inevitable complexity of this set up there is also a requirement for a
configuration manager such as Chef or Puppet. And still, these set ups
are generally not capable of maintaining a large number of long term
connections, a problem known as C10K.

Bottom line is that a simple database-backed web app requires a whole
bunch of moving parts before it can serve a “Hello World!” page. And
nearly all of it because of the GIL.

Emergence of Single Page Applications

More and more, server-side HTML generation is becoming a thing of the
past. The latest (and correct) trend is for UI construction and
rendering to happen completely client-side, driven by JavaScript. Apps
whose user interface is fully JS-driven are sometimes called
Single Page Applications,
and are in my opinion the future whether we like it or not. In an SPA
scenario the server only serves data, typically as JSON, and no HTML
is constructed there. In this set up, the tremendous complexity
introduced primarily so that a popular scripting language could be
used isn’t worth the trouble. Especially considering that Python or
Ruby bring little to the table when all of the output is JSON.

Enter Golang

Go is gradually disrupting the the world of web applications. Go
natively supports parallel execution which eliminates the requirement
for nearly all the components typically used to work around the GIL
limitation.

Go programs are binaries which run natively, so there is no need for
anything language-specific to be installed on the server. Gone is the
problem of ensuring the correct runtime version the app requires,
there is no separate runtime, it’s part of the binary. Go programs
can easily and elegantly run tasks in the background, thus no need for
tools like Resque. Go programs run as a single process which makes
caching trivial and means Memcached or Redis is not necessary either.
Go can handle an unlimited number of parallel connections, eliminating
the need for a front-end guard like Nginx.

With Go the tall stack of Python, Ruby, Bundler, Virtualenv, Unicorn,
WSGI, Resque, Memcached, Redis, etc, etc is reduced to just one
binary. The only other component generally still needed is a database
(I recommend PostgreSQL). It’s important to note that all of these
tools are available as before for optional use, but with Go there is
the option of getting by entirely without them.

To boot this Go program will most likely outperform any Python/Ruby
app by an order of magnitude, require less memory, and with fewer
lines of code.

So Is there a Popular Framework Yet?

The short answer is: a framework is entirely optional and not
recommended. There are many projects claiming to be great frameworks,
but I think it’s best to try to get by without one. This isn’t just
my personal opinion, I find that it is generally shared in the Go
community.

It helps to think why frameworks existed in the first place. On the
Python/Ruby side this was because these languages were not initially
designed to serve web pages, and lots of external components were
necessary to bring them up to the task. Same can be said for Java,
which just like Python and Ruby, is about as old as the web as we know
it, or even pre-dates it slightly.

As I remember it, out of the box, early versions of Python did not
provide anything to communicate with a database, there was no
templating, HTTP support was confusing, networking was non-trivial,
bundling crypto would not even be legal then, and there was a whole
lot of other things missing. A framework provided all the necessary
pieces and set out rules for idiomatic development for all the common
web app use cases.

Go, on the other hand, was built by people who already experienced and
understood web development. It includes just about everything
necessary. An external package or two can be needed to deal with certain
specific aspects, e.g. OAuth, but by no means does a couple of
packages constitute a “framework”.

If the above take on frameworks not convincing enough, it’s helpful to
consider the framework learning curve and the risks. It took me about
two years to get comfortable with Rails. Frameworks can become
abandoned and obsolete, porting apps to a new framework is hard if
not impossible. Given how quickly the information technology sands
shift, frameworks are not to be chosen lightly.

I’d like to specifically single out tools and frameworks that attempt
to mimic idioms common to the Python, Ruby or the JavaScript
environments. Anything that looks or feels or claims to be “Rails for
Go”, features techniques like injection, dynamic method publishing
and the like which require relying heavily on reflection are not the
Go way of doing things, it’s best to stay away from those.

Frameworks definitely do make some things easier, especially in the
typical business CRUD world, where apps have many screens with lots of
fields, manipulating data in complex and ever-changing database
schemas. In such an environment, I’m not sure Go is a good choice in
the first place, especially if performance and scaling are not a
priority.

Another issue common to frameworks is that they abstract lower level
mechanisms from the developer often in way that over time grows to be
so arcane that it is literally impossible to figure out what is
actually happening. What begins with an idiomatic alias for a single
line of JavaScript becomes layers upon layers of transpilers,
minimizers on top of helpers hidden somewhere in a sub-dependency. One
day something breaks and it’s impossible to know where to even look
for the problem. It’s nice to know exactly what is going on sometimes,
Go is generally very good about that.

What about the database and ORM?

Similarly to frameworks, Go culture is not big on ORM’s. For starters,
Go specifically does not support objects, which is what the O in ORM
stands for.

I know that writing SQL by hand instead of relying on
User.find(:all).filter... convenience provided to by the likes of
ActiveRecord is unheard of in some communities, but I think this
attitude needs to change. SQL is an amazing language. Dealing with SQL
directly is not that hard, and quite liberating, as well as incredibly
powerful. Possibly the most tedious part of it all is copying the data
from a database cursor into structures, but here I found the sqlx
project very useful.

In Conclusion

I think this sufficiently describes the present situation of the
server side. The client side I think could be separate post, so I’ll
pause here for now. To sum up, thus far it looks like we’re building
an app with roughly the following requirements:

So our app is going to have two major parts to it: client and
server. (What year is this?). The server side is going to be in Go,
and the client side in JS. Let’s talk about the server side first.

The Go (Server) Side

The server side of our application is going to be responsible for
initially serving up all the necessary JavaScript and supporting files
if any, aka static assets and data in the form of JSON. That’s it,
just two functions: (1) static assets and (2) JSON.

It’s worth noting that serving assets is optional: assets could be
served from a CDN, for example. But what is different is that it is not
a problem for our Go app, unlike a Python/Ruby app it can perform on
par with Ngnix and Apache serving static assets. Delegating assets to
another piece of software to lighten its load is no longer necessary,
though certainly makes sense in some situations.

To make this simpler, let’s pretend we’re making an app that lists
people (just first and last name) from a database table, that’s
it. The code is here https://github.com/grisha/gowebapp.

Directory Layout

It has been my experience that dividing functionality across packages
early on is a good idea in Go. Even if it is not completely clear how
the final program will be structured, it is good to keep things
separate to the extent possible.

For a web app I think something along the lines of the following
layout makes sense:

Top level: package main

At the top level we have package main and its code in main.go. The
key advantage here is that with this layout go get github.com/user/foo
can be the only command required to install the whole application into
$GOPATH/bin.

Package main should do as little as possible. The only code that
belongs here is to parse the command argument flags. If the app had a
config file, I’d stick parsing and checking of that file into yet
another package, probably called config. After that main should pass
the control to the daemon package.

The above program accepts three parameters, -listen, -db-connect
and -assets-path, nothing earth shattering.

Using structs for clarity

In line cfg := &daemon.Config{} we are creating a daemon.Config
object. It’s main purpose is to pass around configuration in a
structured and clear format. Every one of our packages defines its own
Config type describing the parameters it needs, and packages can
include other package configs. We see an example of this in
processFlags() above: flag.StringVar(&cfg.Db.ConnectString,
.... Here db.Config is included in daemon.Config. I find doing
this very useful. These structures also keep open the possibility of
serializing configs as JSON, TOML or whatever.

Using http.FileSystem to serve assets

The http.Dir(assetsPath) in setupHttpAssets is in preparation to
how we will serve the assets in the ui package. The reason it’s done
this way is to leave the door open for cfg.UI.Assets (which is a
http.FileSystem interface) to be provided by other implementations,
e.g. serving this content from memory. I will describe it in more
detail in a later post.

Lastly, main calls daemon.Run(cfg) which is what actually starts our
app and where it blocks until it’s terminated.

package daemon

Package daemon contains everything related to running a
process. Stuff like which port to listen on, custom logging would
belong here, as well anything related to a graceful restart, etc.

Since the job of the daemon package is to initiate the database
connection, it will need to import the db package. It’s also
responsible for listening on the TCP port and starting the user
interface for that listener, therefore it needs to import the ui
package, and since the ui package needs to access data, which is
done via the model package, it will need to import model as well.

Note how Config includes db.Config and ui.Config as I described
earlier.

All the action happens in Run(*Config). We initialize a database
connection, create a model.Model instance, and start the ui
passing in the config, a pointer to the model and the listener.

package model

The purpose of model is to separate how data is stored in the
database from the ui, as well as to contain any business logic an
app might have. It’s the brains of the app if you will.

The model package should define a struct (Model seems like an
appropriate name) and a pointer to an instance of the struct should be
passed to all the ui functions and methods. There should only be one
such instance in our app - for extra credit you can enforce that
programmatically by making it a singleton, but I don’t think that’s
necessary.

Alternatively you could get by without a Model and just use the
package model itself. I don’t like this approach, but it’s an
option.

The model should also define structs for the data entities we are
dealing with. In our example it would be a Person struct. Its
members should be exported (capitalized) because other packages will
be accessing those. If you use
sqlx, this is where you would also
specify tags that map elements to db column names, e.g. `db:"first_name"`

Our Person type:

1234

typePersonstruct{Idint64First,Laststring}

In our case we do not need tags because our column names match the
element names, and sqlx conveniently takes care of the capitalization,
so Last matches the column named last.

package model should NOT import db

Somewhat counter-intuitive, model cannot import db. This is
because db needs to import model, and circular imports are not
allowed in Go. This is one case where interfaces come in
handy. model needs to define an interface which db should
satisfy. For now all we know is we need to list people, so we can
start with this definition:

123

typedbinterface{SelectPeople()([]*Person,error)}

Our app doesn’t really do much, but we know it lists people, so our
model should probably have a People() ([]*Person, error) method:

123

func(m*Model)People()([]*Person,error){returnm.SelectPeople()}

To keep things tidy, code should be in separate files, e.g. Person
definition should be in person.go, etc. For readability, here is a
single file version of our model:

package db

db is the actual implementation of the database interaction. This is
where the SQL statements are constructed and executed. This package
also imports model because it will need to construct those structs
from database data.

First, db needs to provide the InitDb function which will
establish the database connection, as well as create the necessary
tables and prepare the SQL statements.

Our simplistic example doesn’t support migrations, but in theory this
is also where they might potentially happen.

We are using PostgreSQL, which means we need to import the
pq driver. We are also going to rely on
sqlx, and we need our own model. Here is the beginning of our db
implementation:

Out InitDb() creates an instance of a pgDb, which is our Postgres
implementation of the model.db interface. It keeps all that we need
to communicate with the database, including the prepared statements,
and exports the necessary methods to satisfy the interface.

12345

typepgDbstruct{dbConn*sqlx.DBsqlSelectPeople*sqlx.Stmt}

Here is the code to create the tables and the statements. From the SQL
perspective this is rather simplistic, it could be a lot more
elaborate, of course:

123456789101112131415161718192021222324252627

func(p*pgDb)createTablesIfNotExist()error{create_sql:=` CREATE TABLE IF NOT EXISTS people ( id SERIAL NOT NULL PRIMARY KEY, first TEXT NOT NULL, last TEXT NOT NULL); `ifrows,err:=p.dbConn.Query(create_sql);err!=nil{returnerr}else{rows.Close()}returnnil}func(p*pgDb)prepareSqlStatements()(errerror){ifp.sqlSelectPeople,err=p.dbConn.Preparex("SELECT id, first, last FROM people",);err!=nil{returnerr}returnnil}

Here we’re taking advantage of sqlx to run the query and construct a
slice from results with a simple call to Select() (NB:
p.sqlSelectPeople is a *sqlx.Stmt). Without sqlx we would have to
iterate over the result rows, processing each with Scan, which would
be considerably more verbose.

Beware of a very subtle “gotcha” here. people could also be defined
as var people []*model.Person and the method would work just the
same. However, if the database returned no rows, the method would
return nil, not an empty slice. If the result of this method is
later encoded as JSON, the former would become null and the latter
[]. This could cause problems if the client side doesn’t know how to
treat null.

That’s it for db.

package ui

Finally, we need to serve all that stuff via HTTP and that’s what the
ui package does.

Note how indexHTML contains next to nothing. This is 100% of the
HTML that this app will ever serve. It will evolve a little as we get
into the client side of the app, but only by a few lines.

Also noteworthy is how the handler is defined. If this idiom is not
familiar, it’s worth spending a few minutes (or a day) to internalize
it completely as it is very common in Go. indexHandler() is not a
handler, it returns a handler function. It is done this way so
that we can pass in a *model.Model via closure, since an HTTP
handler function definition is fixed and a model pointer is not one of
the parameters.

In the case of indexHandler() we’re not actually doing anything with
the model pointer, but when we get to implementing an actual list of
people we will need it.

Conclusion

Above is essentially all the knowledge required to build a basic Go
web app, at least the Go side of it. Next week I’ll get into the
client side and we will complete the people listing code.

I started programming professionally back when I was a teenager.
I’ve spent most of my early career working at large ISP’s solving industrial-scale hosting challenges. Since around 2009 I’ve become more intersted in and now work exclusively on data infrastuctre, both big and small, but mostly big.

I was born and grew up in Moscow, Russia, though I’ve lived in the Washington, DC (USA) area for more than half of my life now. Our kids were born and go to school here, it’s gradually become home for us.