Reactionary Visions

Fri, 13 Mar 2015

I've been doing a lot of reading on Paxos lately. There are many
papers to read. Some of them are very good, some of them are less
so. Paxos is not very complex - there are only four message types in
total in the classic Paxos case (though most implementations will end
up with at least six), but it is quite subtle. Very few of the papers
speak at all about why things need to be written to disk, or when, for
example. The following are some notes of mine, which might be useful
to others.

I'm not going to attempt to describe Paxos (though I may do so by
accident). The most succinct description I've come across is the first
paragraph of the Paxos Made Practical paper by David Mazières:

Paxos is a simple protocol that a group of machines in a distributed
system can use to agree on a value proposed by a member of the
group. If it terminates, the protocol reaches consensus even if the
network was unreliable and multiple machines simultaneously tried to
propose different values. The basic idea is that each proposal has a
unique number. Higher numbered proposals override lower-numbered
ones. However, a "proposer" machine must notify the group of its
proposal number before proposing a particular value. If, after
hearing from a majority of the group, the proposer learns one or
more values from previous proposals, it must re-use the same value
as the highest-numbered previous proposal. Otherwise, the proposer
can select any value to propose.

Far too many papers choose different terminology for the different
phases of the protocol. Thus in the following I'm just going to call
them phase 1 and phase 2. Phase 1 is the one where: (1a) a proposer
invents a ballot number and sends that to a majority of the acceptors;
(1b) each acceptor replies iff the ballot number from (1a) is greater
than any ballot number it's previously seen in a (1a) message, and
that reply contains the ballot number and value of the most recent
(2b) message it has sent, if any. Phase 2 is the one where: (2a) a
proposer asks the majority of acceptors to accept a value with a
ballot number; (2b) each acceptor replies (and accepts the value) iff
the ballot number is greater than or equal to the maximum ballot
number it has seen in a (1a) message.
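
A minimal sketch of the acceptor side of those two phases, for a
single instance. The class name, the (round, proposer id) encoding of
ballots and the tuple-shaped replies are assumptions made for
illustration; they are not taken from any of the papers.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Ballot = Tuple[int, int]  # assumed encoding: (round, proposer id)


@dataclass
class Acceptor:
    # Greatest ballot this acceptor has responded to so far.
    promised: Optional[Ballot] = None
    # Ballot and value of the most recent 2b it has sent, if any.
    accepted: Optional[Tuple[Ballot, str]] = None

    def on_1a(self, ballot: Ballot):
        """Reply with a 1b iff the ballot beats every ballot seen so far."""
        if self.promised is not None and ballot <= self.promised:
            return None  # stay silent (or send a nack instead)
        self.promised = ballot
        return ("1b", ballot, self.accepted)

    def on_2a(self, ballot: Ballot, value: str):
        """Accept iff the ballot is >= the greatest ballot seen so far."""
        if self.promised is not None and ballot < self.promised:
            return None  # stay silent (or send a nack instead)
        # Raising the promise on accept is a common implementation
        # choice, not something the description above requires.
        self.promised = ballot
        self.accepted = (ballot, value)
        return ("2b", ballot, value)
```

Returning None stands in for silence here; a nack-based variant would
return an explicit rejection carrying the promised ballot.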

These papers normally talk about being able to tolerate failure of up
to F machines (actually, acceptors). For this to be possible, Paxos
still requires the availability of a majority of the original set of
acceptors. So that means a total of 2F + 1 acceptors (if you have 2F +
1 machines, then a majority of them is F+1 machines, hence F of them
can still fail and you have access to a majority of the original set
of machines). The importance of a majority is that if you randomly
choose two sets of F+1 machines, there will be at least one machine in
common (intersection is never empty). The whole point of ballot
numbers (and the way they're constructed such that they can never
collide) is so that at least one machine in any two majority sets of
machines will be able to correctly order any two different ballot
numbers. Thus the point of phase 1 is to figure out if your ballot is
currently the maximum known ballot number to each acceptor, and if it
is, what value you're allowed to present for phase 2.
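
The intersection property is easy to check by brute force for a small
F. The sketch below also shows one common way of building ballot
numbers that can never collide: pair a round counter with a unique
proposer id. That encoding and the name next_ballot are assumptions
for illustration.

```python
from itertools import combinations

F = 2
acceptors = range(2 * F + 1)  # 2F + 1 = 5 acceptors
majority = F + 1              # 3 acceptors

# Every possible majority, as a set of acceptor ids.
quorums = [set(q) for q in combinations(acceptors, majority)]

# Any two majorities share at least one acceptor.
assert all(q1 & q2 for q1 in quorums for q2 in quorums)


def next_ballot(highest_round_seen: int, proposer_id: int):
    """Build a ballot greater than anything in the rounds seen so far.

    Tuples compare lexicographically, so two proposers with different
    ids can never produce equal ballots, and any two ballots are
    totally ordered.
    """
    return (highest_round_seen + 1, proposer_id)
```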

In the original papers, acceptors only ever reply iff the ballot
number from the proposer meets certain requirements. If it does not,
the acceptors are silent and the proposer is meant to determine
through some sort of timeout that their message has failed. However,
there's no need to implement it like this - several systems have the
acceptors actively send back nack messages to the proposer. Paxos will
work perfectly well on unreliable communication channels, which means
it's fine over UDP. However, UDP frequently doesn't work in the cloud
due to cloud providers breaking PMTU discovery and not supporting
fragmentation. In such an environment, Paxos will not violate its
properties, but you might find nothing makes any progress. If none of
that applies to you and so you use UDP then you may well need to
implement the timeouts too, in case the nacks go missing (if you
choose to use nacks too) and you can't detect that loss. If
you're using TCP then you might decide you can just rely on the nacks
(application-layer nacks, not TCP), not bother with timeouts, and also
watch for the TCP connection dropping. The argument against timeouts
is that the machine with whom you're communicating might just be under
heavy load. But then again, is that acceptable for the service you're
trying to build?

Paxos ensures that once a majority of the acceptors have accepted a
value (by accepted, I mean an acceptor has received a phase 2a message
which meets the requirements such that it intends to reply with a 2b
message), it is impossible to get a majority of the acceptors to
accept a different value. This, in combination with the construction
of ballot numbers means that if any two acceptors have accepted a
value for the same ballot, it is the same value. An entire instance of
Paxos creates consensus in perpetuity on one value only. Normally you
want to create a stream of values, so you need to run lots of
instances. How you construct your stream is up to you, but simply
name-spacing each instance by the event log ID works just fine.

If in phase 1 you established your ballot number is the greatest
ballot number ever and then you were just allowed to pick any old
value, then it should be clear that anyone else could come along
later, pick an even greater ballot number, and change the accepted
value. So this is why phase 1b includes the value and ballot number
for the most recent 2b message the acceptor has sent. The proposer has
to receive 1b messages from a majority of acceptors before it can make
further progress. Now let's pretend that each acceptor actually sends
back all the ballot numbers and values for all 2b messages it's ever
sent as part of this instance. The proposer now has the complete
history of all values accepted from a majority of acceptors. These
histories can be different for each acceptor, though as said above,
where any two acceptors accepted a value for the same ballot, the
value will be the same. You can therefore create a list of all the
accepted values, with the ballot numbers as indices; there may well be
gaps in this list. What should the proposer now do? All the proposer
can do is to cause the acceptors to append to this imaginary list -
the acceptors will only act on a greater-than-ever-before ballot
number, which means appending to the end of our imaginary list. We
want future proposers to be forced to continue our work rather than
diverge. If, for example, we should force the proposer to send a 2a
message with the earliest accepted value then that does not cause more
acceptors to agree on what their earliest accepted value is. So the
only other sane choice is to force the proposer to send a 2a message
with the value of the greatest ballot number it knows of that's been
accepted. This can't reduce the spread of this particular value: even
if the proposer dies right here, the acceptors haven't lost
anything. It can increase the spread of this value though by passing
this value to acceptors who haven't previously accepted any value for
this ballot number (and because we got a 1b reply from that acceptor,
we know that our current ballot number is acceptable to that acceptor;
from this point, we can realise that we're most likely to make
progress if the majority we send our 2a message to is the same as the
majority we sent our 1a message to).
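
The rule this argument arrives at is small enough to sketch. Here each
1b reply is modelled as either None (the acceptor has accepted
nothing) or a (ballot, value) pair; that shape and the name
choose_value are assumptions for illustration.

```python
from typing import List, Optional, Tuple

Ballot = Tuple[int, int]  # assumed encoding: (round, proposer id)
Accepted = Optional[Tuple[Ballot, str]]


def choose_value(replies_1b: List[Accepted], own_value: str) -> str:
    """Pick the value for the 2a message from a majority of 1b replies."""
    accepted = [r for r in replies_1b if r is not None]
    if not accepted:
        # No acceptor in this majority has accepted anything, so the
        # proposer is free to propose its own value.
        return own_value
    # Otherwise it must re-propose the value with the greatest ballot.
    _, value = max(accepted, key=lambda r: r[0])
    return value
```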

Once a value has been accepted by a majority of acceptors, any further
valid (by valid, I mean it does not get ignored, or cause a nack) 1a
message from a proposer will guarantee the accepted value is returned
in the 1b message and must be chosen again by the proposer in its next
2a message (by the definition of a majority, it is not possible for a
different value to have been accepted at the same time by a different
majority (even with a higher ballot number)). Once a value is accepted
by a majority of acceptors, at least one member of that majority is
guaranteed to be in any other majority, and will thus present its
accepted value in any 1b messages it sends. Acceptors are considered
single-threaded in their dealings with Paxos. So when an acceptor is
dealing with a valid 2a message and sending a 2b response, it is not
simultaneously processing 1a messages. Thus if an acceptor receives a
valid 2a message and accepts that value, some other proposer may be
sending phase 1a messages or even different 2a messages to this same
acceptor, but they have to wait their turn. In both cases: if some 1a
or 2a message arrives afterwards, they are ignored (or a nack sent) if
their ballot number is too low, otherwise the 1a will elicit a
response (1b) containing the newly accepted value. However, the 2a
cannot be valid. This is because a proposer would only send a 2a if it
got a 1b back, which implies its ballot number is the greatest. But
this acceptor has just accepted a value, implying the accepted value's
ballot number must be greater still. Thus in any interleaving
involving multiple proposers and an acceptor, the only value
acceptable will be with the ballot number of the most recently issued
1b message (or put another way, a 2a will only be accepted from
proposer ρ if the previous valid message the acceptor received was a
1a from proposer ρ (other than the special case of the first ballot
number where phase 1 isn't necessary - covered below)).

All this talk of majorities is often confusing in combination with
failures. The simplest formulation is that a proposer should send the
1a and 2a messages to all acceptors, and can make progress when it
receives 1b and 2b message responses (respectively) from a majority of
them. This is the simplest way of dealing with the possibility of
failures. However, it's frequently a good idea to optimise for the
common case, which is when failures don't occur. Thus you can pick
your majority of acceptors at the start (perhaps randomly, perhaps
not) and communicate with just them, sending your 1a and 2a messages
just to them, and waiting for all of them to reply. But what then
happens if any of them fail? You're now not talking to a majority. The
simplest choice here is to abandon the current ballot, pick a fresh
majority (this is a majority of the original 2F+1 acceptors - don't
reduce the size of your set of acceptors just because of a failure;
you should only reduce it when you know a machine is not going to
come back - covered below), and start from phase 1a with an
increased ballot number.

When a proposer receives 2b messages back from a majority of
acceptors, it knows the value chosen is never going to change for this
particular instance of Paxos. The proposer can then disseminate this
information as it chooses (this is often called phase 3/learning
phase). If you plan it so, you can have the acceptors send their 2b
message to not-just-the-proposer: that way, several parties can learn
at the same time that a value has been chosen, without needing the
extra hop of going through the proposer. You'll need to deal with some
learners dying, whilst others don't, and the need to re-synchronise
what's been learnt. The re-synchronising will however be easy because
you won't have any conflicts - Paxos guarantees us that. So it should
just be adding everything that's been learnt together from all
available sources.
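
A learner that receives 2b messages directly from the acceptors might
look roughly like this. It counts distinct acceptors per ballot, since
a value is only known to be chosen once a majority have accepted it
for the same ballot. The class and method names are assumptions for
illustration.

```python
from collections import defaultdict


class Learner:
    def __init__(self, n_acceptors: int):
        self.majority = n_acceptors // 2 + 1
        # ballot -> ids of acceptors that sent a 2b for that ballot
        self.votes = defaultdict(set)
        self.chosen = None

    def on_2b(self, acceptor_id, ballot, value):
        """Record a 2b; return the chosen value once one is known."""
        # Using a set makes redelivered 2b messages harmless.
        self.votes[ballot].add(acceptor_id)
        if len(self.votes[ballot]) >= self.majority:
            self.chosen = value
        return self.chosen
```

Because Paxos never lets two different values be chosen in one
instance, re-synchronising two learners is just a matter of copying
whichever `chosen` values the other is missing.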

The proposer only gets to propose its own value in a 2a message when
there is no information returned in the 1b messages from a majority of
acceptors. Thus if a proposer is trying to get a particular value
added to an event log, it may have to take part in several instances
of Paxos before it finds one where it's able to get 2a messages to a
majority of acceptors for the first ballot number of that
instance. Note that in light of failures of acceptors it should not
immediately skip to the next instance: it could be that it was able to
get its value to some acceptors (albeit not a majority), and some
other proposer just happened to pick some of those acceptors in its
own majority, and succeeded in spreading that value to a majority. It
should only move on to a new Paxos instance if it has learnt it has
lost the current instance.

In this light, if the proposer fails after having sent some 2a
messages, you have no knowledge as to what value is going to be agreed
on by later ballots. If the proposer did manage to get 2a messages to
the majority then yes, you have consensus. But if the 2a messages
didn't reach a majority, then a different proposer can happen to
choose a majority not including any of the previous acceptors, and get
a totally different value accepted. Or it can choose a majority which
has some acceptors in common with the original 2a messages, and thus
complete the instance with the value as originally sent. So you can't
assume anything about what will happen in an instance when a proposer
dies.

If you happen to construct your system such that you know a particular
instance will only ever be started by a particular proposer, then that
proposer doesn't need to bother with phase 1 at all - it can start
with a phase 2a message (you're guaranteeing there's no history of
this instance of Paxos for the proposer to learn through phase 1). Of
course, again in light of failures of acceptors it may need to revert
to phase 1, but in the common case (no failures), this is a useful
optimisation that can halve the number of messages.

The original Paxos papers talk about having the acceptors write their
state to disk, though it's not really explained why. If you do have
the acceptors write state to disk then it means they can be restarted
and continue where they left off - they'll have to read their state
off disk and send some more messages, perhaps redundantly, and so your
proposers will have to make sure they can handle messages they receive
(i.e. 1b and 2b messages) idempotently, but you've probably done that
anyway. But for this to work also requires that the restart of the
acceptors isn't detected as a failure by the proposers, which may mean
you're using UDP rather than TCP, which means you've probably gone
down the timeout route. All of this means performance is unlikely to
be amazing: the extra fsyncs necessary are going to hurt, the timeouts
may have to be fairly generous (and how on earth do you decide what
amount of time is allowed for an acceptor to restart without
considering that a failure?), and detecting actual failures is going
to be more complex.

If you decide to use TCP instead, and you decided that a TCP
connection dropping indicates a failure at the other end, then it
means that an acceptor being restarted will be considered a failure of
an acceptor. In which case, it doesn't matter if that acceptor loses
its state. Consequently, the acceptors don't need to write anything to
disk. To see this, consider the following: due to design, or some
programming bug, you have always chosen the exact same F+1 as your
majority of acceptors. They have never crashed, and so they contain
all the information to date about every instance of Paxos you've
run. The remaining F acceptors contain nothing. Suddenly there's a
power failure, and F of those chosen F+1 acceptors die. But Paxos can
handle the death of up to F acceptors, so you're still fine. But now
you have just F+1 acceptors left, and those F+1 acceptors are your
only choice, ongoing, for the majority of acceptors. Crucially, that
includes the 1 surviving acceptor from the original majority that has
all the state. So nothing has been lost, even without writing anything
to disk.

Now yes, if you need to be able to power down the entire system and
then resume where you left off then you're going to have to write to
disk. But that may still be outside of Paxos rather than within. For
example, if you're using Paxos to achieve consensus on some stream of
commands then the client which is submitting the command doesn't
return until: 1) Paxos has successfully added the command to the
stream (i.e. the client, playing the role of proposer, has had 2b
messages back from a majority of acceptors for an instance of Paxos in
which the proposer was able to pick its own value (command) for the 2a
message); 2) the command has been processed by some later step
(command processor) and some result returned. Now if the acceptors are
co-located with the command processor and you need to turn everything
off then does it matter if the stream of commands is lost? The client
that submitted the command is just going to get an error, even if its
command is eventually processed some time later, so do we really care
if that command is lost?

It depends. Presumably the state of the command processors is written
to disk every time the state changes, and if you have several of these
processors then they could run at different speeds. So it's then a
matter of: how do you re-synchronise the state of these command
processors? Some of the papers show that you could use Paxos for this,
which is true, but then you may need Paxos to maintain quite a history
and add other messages to the acceptors so that you can efficiently
catch up. Or you could build this re-synchronisation through other
means, outside of Paxos, and just keep Paxos for speedy fault-tolerant
non-blocking consensus.

Update (13th March 2015): There's a case I missed here. Consider that
Paxos isn't writing to disk. Acceptors send out their 2b messages to
learners. Only 1 learner receives at least F+1 2b messages before all
the acceptors die. That 1 learner acts on the 2bs, proceeding in some
way (which does involve writing to disk) before it too dies. At this
point, the system is blocked because >F acceptors have died, and
indeed the entire system is now turned off. Everyone now comes back
up, except for the one node that received all the 2bs. Consequently,
knowledge of this Paxos instance is lost, but the system is
operational as no more than F acceptors are dead. If that one node
ever comes
back, you have divergence. Even worse, that one node that acted on the
2bs may have done enough work to mean the result of that instance
became visible to the outside world.

So how do you actually deal with intentional removal (or addition) of
acceptors? One simple idea is that the set of identities of acceptors
amounts to a topology, and this is versioned (ver n). So if you want
to add or remove an acceptor then you calculate what the new topology
is, give it an increased version (ver n+1), and now run a Paxos
instance to achieve consensus on what ver n+1 of the topology
really is (I think you want to do this Paxos instance with the new
topology rather than the old). All instances of Paxos reference the
topology version. In a Paxos instance, once a value has been accepted
by the majority, you cannot change the topology for that instance. If
you increased the number of acceptors then you could render the old
majority so small that none of them are chosen in a later ballot, thus
allowing the accepted value to be changed. If you decreased the number
of acceptors then you could remove enough of the old majority such
that a new majority from the new topology need not include any of the
original majority, and again the accepted value could change. Thus the
easiest is simply that any 1a message would have to carry the topology
version, and once an acceptor has received a valid 1a message, the
topology version for that Paxos instance is fixed. So then if an
acceptor receives a 1a or 2a message for that instance which has the
wrong topology number, it will issue a nack, indicating the correct
topology number, hopefully forcing the proposer to discover the
correct new topology. That does mean that if >F acceptors fail, a
Paxos instance can just get stuck. This, you'll just have to deal with
explicitly, having everyone detect this case and completely aborting
the instance.
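
The acceptor-side check could be sketched as below. It assumes the
caller has already decided the message's ballot is valid, and it fixes
the version on the first such message for an instance. The class name
and the ("ok" / "nack", version) return shape are assumptions for
illustration.

```python
class TopologyGuard:
    def __init__(self):
        # Paxos instance id -> topology version fixed for that instance
        self.fixed = {}

    def check(self, instance: int, msg_version: int):
        """Return ("ok", v) or ("nack", v); v is the instance's version."""
        # The first valid message seen for an instance fixes its version.
        version = self.fixed.setdefault(instance, msg_version)
        if msg_version != version:
            # Tell the proposer which topology version it should be using.
            return ("nack", version)
        return ("ok", version)
```

A proposer that gets a nack back would then need to fetch the named
topology version before retrying.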

There are further complications here too. Consider the earlier case
where the same majority of F+1 kept getting chosen, and then F of them
died, leaving just 1 acceptor with all the information. If a topology
change now happens, removing all the failed acceptors then all the
information held by this special acceptor is certainly not held by a
majority any more, and is in danger of being lost. So historical
instances of Paxos must be made read only, and before the topology
change is agreed, it may be essential to duplicate or otherwise make
safe such data.

Hopefully the above is useful to some people, not just myself. I've
struggled to find much information regarding the various approaches
and tradeoffs when implementing Paxos. There are various papers such
as Paxos Made Live which are certainly worth reading. But they don't
seem to cover much of what I've tried to cover above. Such papers tend
to record their concrete implementations rather than considering
various different uses of Paxos and what the consequences
are. Comments and corrections are very welcome - please let me know.

Sat, 17 Jan 2015

In the aftermath of the attacks on Charlie Hebdo in Paris, there has
been some high quality thinking and writing. There have also been some
really stupid things said, from the usual protagonists. It's an
interesting facilitation that the internet now provides: as I no
longer watch any news on TV (in fact I don't watch any news at all),
nor subscribe to any newspaper, I'm used to reading articles from a
wide range of sources. Equally, it's much easier for me to avoid
opinion I disagree with (right-wing press) or trivialised, dumbed-down
reporting (e.g. BBC news). Because of this ease of reading what you
want to (in both the good and bad sense), I thought a lot of the
reaction was measured and sensible. Turns out I was just unaware of
most of the reaction going on.

Anyway there seems to be virtually nothing left to say on this, so
this post is really little more than a bunch of links to pieces I
thought were thoughtful and well written.

I am an agnostic. I personally don't believe in any religion but I
accept I can't prove that every religion is false and so I may be
wrong. I tend to treat the beliefs of any religion as arbitrary and
abstract ideas. Thus one place to start is the acknowledgement that
the laws of any country or civilisation are as arbitrary and ad-hoc as
the rules or teachings of any religion. They are just things that
people choose to believe in, or follow, or not violate. In the UK,
some of our law is based in Christianity (e.g. thou shalt not murder -
though I've no idea whether ideas like that actually predate
Christianity; I wouldn't be surprised if they do) though other parts
of Christianity are not encoded in law (adultery is not illegal, for
example).

Many have carelessly labelled these attacks as an attack on free speech
and thus that the reaction is about defending free speech. As such,
much has been written about how it's possible to defend the right
Charlie Hebdo has to publish anything they want, even if it's
offensive, even whilst criticising their choice of content.

Freedom of speech is, I believe, essential for any sort of democracy
to work. This doesn't suggest that if you have freedom of speech then
you have a democracy (I don't believe that in the UK, or arguably
anywhere in Europe or North America, there is a functioning democracy;
merely
systemic corporate dictatorships disguised by corrupt elected
faux-representatives), but without freedom of speech, you certainly
don't have any ability to transparently hold people to account and
thus corruption and abuse will obviously take hold. But freedom of
speech is a choice, it's not something axiomatic to existence; it's
something many people have chosen to attach huge importance to and as
such will defend. But just because there are widely agreed reasons to
defend the concept of freedom of speech doesn't mean that it's
self-evidently a "better" idea than not having freedom of speech. To
judge something as "better" than something else requires all manner of
discussion of the state of human existence. Consequently, criticising
people for not holding freedom of speech in the same regard as we
claim to do is no different from criticising people for their choice
of clothing, or housing, or diet, or career, or religious views.

The USA appears to hold freedom of speech as a much more inviolable
concept. For example:

The ACLU vigorously defends the right of neo-Nazis to march
through a community filled with Holocaust survivors in Skokie,
Illinois, but does not join the march; they instead vocally
condemn the targeted ideas as grotesque while defending the right
to express them.

But whilst the outpouring in Paris and the crowds gathered as a
statement of unity were warmly defiant, it is somewhat likely that
rather more than physical violence was being defied, and more
than freedom of speech defended by the crowd. As Brian Klug wrote:

Here is a thought experiment: Suppose that while the demonstrators
stood solemnly at Place de la Republique the other night, holding
up their pens and wearing their “je suis charlie” badges, a man
stepped out in front brandishing a water pistol and wearing a
badge that said “je suis cherif” (the first name of one of the two
brothers who gunned down the Charlie Hebdo staff). Suppose he was
carrying a placard with a cartoon depicting the editor of the
magazine lying in a pool of blood, saying, “Well I’ll be a son of
a gun!” or “You’ve really blown me away!” or some such
witticism. How would the crowd have reacted? Would they have
laughed? Would they have applauded this gesture as
quintessentially French? Would they have seen this lone individual
as a hero, standing up for liberty and freedom of speech? Or would
they have been profoundly offended? And infuriated. And then what?
Perhaps many of them would have denounced the offender, screaming
imprecations at him. Some might have thrown their pens at him. One
or two individuals — two brothers perhaps — might have raced
towards him and (cheered on by the crowd) attacked him with their
fists, smashing his head against the ground. All in the name of
freedom of expression. He would have been lucky to get away with
his life.

Jyllands-Posten, the Danish newspaper that published caricatures
of the Prophet in 2005, reportedly rejected cartoons mocking
Christ because they would "provoke an outcry" and proudly declared
it would "in no circumstances ... publish Holocaust cartoons".

But of course it comes down to the content of the publication. In this
case the cartoons exist to ridicule, make fun of and offend members of
one of the world's largest religions, Islam, by mocking their prophet
Mohammed. As Amanda Taub writes:

Dalia Mogahed, the Director of Research at the Institute for
Social Policy and Understanding, explained that Mohammed is a
beloved figure to Muslims, and "it is a human impulse to want to
protect what's sacred to you".

Mogahed compared the cartoons to the issue of flag-burning in the
United States, noting that a majority of Americans favour a
constitutional amendment to ban flag-burning for similar reasons:
the flag is an important symbol of a national identity, and many
Americans see flag-burning as an attack on that identity, or even
on the country itself. That's not extremism or backwardness; it's
about protecting something you cherish.

In any large group of people, there will be the vast majority of sound
mind and thought, and a small minority who are not. This is just the
fact that all over the earth, humans are not all the same: there is
some variance in health, intelligence, and every other aspect of what
a human is. Any large sampling of humans will show the same set of
variations. So if you offend a huge group of people, you will offend
tall people and short people, rich people and poor people, fat people
and thin people, violent people and peaceful people. Unsurprisingly,
it would appear that the background of these killers suggests there is
little to do with Islam there, and more to do with their
upbringing, family, education and integration with society.

Thus even if you feel Charlie Hebdo's publications of the cartoons
served some purpose (given their biased choice of target, their
purpose does not seem to be an exercise in itself of freedom of
speech), it should be obvious that by offending so many people, they
were placing themselves in danger. The same is true of any sustained,
systemic, deliberate offence to any of this planet's major religions,
races, nationalities or any other grouping of humans which share
values. So it becomes a balancing act between how much do you believe
in the message you're publishing versus the risk you're putting
yourself in. You can view the actions of Edward Snowden in this same
context: he felt that the message he was delivering on the abuses of
surveillance power carried out by governments across the world
outweighed the significant danger he was putting himself in, and so
both delivered the message and accepted the need to flee from his
country, probably never to return, in fear of the consequences of his
actions.

Thankfully, throughout history, there have been people who have chosen
to put themselves in the path of great harm (often losing their lives
as a result) in order to report, expose, document and publicise
matters which the wider public needed to know. Governments, monarchies
and empires have crumbled when faced with popular revolt.

So freedom of speech requires consideration. It is perfectly
reasonable not to say something because you anticipate you won't enjoy
the consequences. Most of us do not conduct our lives by going around
saying anything and everything we want to our friends and family: if
we did, we'd rapidly lose a lot of friends. The expression "biting
your tongue" exists for a reason. Equally, it's perfectly reasonable
for a news outlet to decide not to re-publish the Charlie Hebdo
cartoons if they fear a violent response that they suspect the local
police forces cannot prevent; not to mention just not wanting to
offend so many people.

I view as daft the idea that people should never choose not to publish
something out of fear. People absolutely should choose not to publish,
if they feel the risk to the things they hold dear is not outweighed
by the message they're delivering. Everything in life is a trade-off
and every action has consequences. Whilst I agree with the right to
free speech, that does not imply saying anything you like is free of
consequences. If it were, it would require that words have no meaning,
and subsequently all communication is void: if anything you say has no
consequence then you can say nothing.

I am certainly not suggesting the murders were in any way justified,
or that Islam or any other religion or collection of humans should be
beyond criticism or even ridicule. At the end of the day, no human is
perfect, and as such we can all benefit from a thorough dose of
criticism once in a while. Every article I've linked to in this post
repeats that such violence, regardless of the provocation, is
abhorrent, and I agree with that: murder is never an acceptable
response to any drawing, written or spoken word. But that doesn't mean
that these events weren't predictable.

Finally then we get to the insanely idiotic response from the UK
government. That MI5 should have more powers that they don't need
(they probably just need more money), and that we must deny terrorists
"safe space" to communicate online. Which means banning encryption,
which means it's impossible to use the internet for anyone. The home
secretary, Theresa May said:

We are determined that as far as possible there should be no safe
spaces for terrorists to communicate. I would have thought that
that should be a principle ... that could have been held by
everybody, across all parties in this House of Commons.

So of course, if terrorists can't communicate in private then no one
can. We've immediately gone from lazy labelling of events as
an attack on free speech to a knee jerk response of "free speech yes,
but you certainly can't have free speech in private, because you might
be a terrorist". Again, it's a trade-off. I doubt that having such
restrictions on communication will make the earth or this country
safer for anyone and of course the impossibility of a controlled study
means it cannot be proven one way or another. No security service is
ever going to be flawless and from time to time very horrible things
will continue to happen. I think most people are aware of this and
accept this; we're all going to die after all. The loss of civil
liberties though is certainly far more worrying to me.

In theory, I would think these proposals so lunatic as to never see
the light of day (it would be completely impossible to enforce for one
thing - terrorists along with everyone else would learn to use
steganography to encode their messages in pictures of cats, thus
rendering their traffic no different to that of everyone else). Sadly,
Labour have stated they don't believe their position to be that far
away from the Conservatives, which is deeply worrying. Labour don't
exactly have a great record on this area either given their previous
ID card schemes and the introduction of detention-without-charge. What
is needed is some transparency. We need an informed debate, with MI5
and GCHQ providing some independently verifiable facts and figures
that demonstrate how they are being thwarted in what they're trying to
do. We need to understand properly what the risk to us is, and most
importantly we need to understand why these threats exist and what
else we can do to make them decrease.

I've never seen it said that any UK Government policy in the last 15
years has made the UK less of a target for such attacks. Maybe we
should look at that before we start subjecting ourselves to Orwellian
control.

Sun, 19 Oct 2014

In an article published in the Guardian yesterday, author Kathleen
Hale recounts how her first book got some negative reviews by
reviewers on a book review website. One reviewer in particular upset
her and Kathleen ends up figuring out the reviewer is using a false
identity, finds out who the reviewer really is and confronts her. The
piece doesn't read to me like some sort of valedictory "I outed a
fraud" type piece (though there are some passages in there which are
questionable in that direction) and equally there are several passages
where Kathleen expresses deep embarrassment and regret for the course
of action she took. This episode, and that article in particular, has
caused substantial reaction: currently 600 comments on the Guardian
article plus several other blog posts. There's no shortage of opinion
to be found on Twitter either, as you'd expect.

The course of action that Kathleen took seems to be fairly undisputed
as far as I can find. There is some dispute from some of the other
blog posts as to exactly what was tweeted and said by whom, and there
is dispute over Kathleen's claim that there are factual inaccuracies
made in a review of her book. It is not disputed that the reviewer was
using a false identity and that the reviewer had at least public
Twitter, Facebook, and Instagram accounts under the false
identity. The false identity was also a real name (Blythe Harris), by
which I mean a name which if you introduced yourself by that name, no
one would think you're using a false identity. This is distinct from
claiming to be Peter Rabbit, or Buzz Lightyear.

Many people have equated Kathleen's actions with stalking. My
dictionary defines the verb to stalk as:

to follow or approach (game, prey, etc.) stealthily and
quietly

to pursue persistently and, sometimes, attack (a person with
whom one is obsessed, often a celebrity)

3, 4, ... [not relevant]

The second item there certainly fits. The British legal approach,
whilst it gives no strict definition, gives examples and guidance:

....following a person, watching or spying on them or forcing
contact with the victim through any means, including social media.

The effect of such behaviour is to curtail a victim's freedom,
leaving them feeling that they constantly have to be careful. In
many cases, the conduct might appear innocent (if it were to be
taken in isolation), but when carried out repeatedly so as to
amount to a course of conduct, it may then cause significant
alarm, harassment or distress to the victim.

I'm glad it includes "social media" there. Some comments have
suggested that stalking "in real life" is worse than online. This
seems bizarre to me: as if through a computer you are not interacting
with other human beings but merely with shiny pixels who have no
emotional capacity. "In real life" is everything we know. Whilst we're
alive we have no personal experience of anything other than "in real
life".

So I'm fairly sold on the whole argument that Kathleen's behaviour
towards this reviewer can be considered stalking and as such is
reprehensible.

To me, the far more interesting issue is the use of anonymity, false
identities and any realistic expectation we have of privacy on the
internet. A number of people who claim to write book reviews on such
sites have suggested that the behaviour of Kathleen is exactly why
they write their reviews under false names. I think there's something
of a contradiction going on here.

But let's work backwards. Firstly, Kathleen, through some social
engineering (she requested from the book review site the address of
the reviewer so that she could post her a copy of the book) got the
address of the book reviewer. She then used a telephone directory and
census results to identify who really lived there (or likely owned the
land). Now the use of the telephone directory seems a bit odd to me:
telephone directories map names to numbers (and maybe addresses). Yes,
you could use it to map an address to a name but it's very
inefficient: you're essentially searching through the whole directory
looking for the address whilst the directory is sorted by name, not
address. So unless it was a very small telephone directory, I don't
really buy that. Using census results is far more credible: they're
public documents and when they're online, they do allow you to search
by address. In the UK you can only get access to the
raw census details 100 years after the census has been published
which, to a high probability, rules it out as a means to tie an
address to a person who's still alive. You can get statistics and
aggregates from more recent census results but you can't get the raw
data. I'm assuming that in the US there's no such restriction on
access to raw census data. If there is then I don't understand how
Kathleen really managed to get a name for the owner of the property.

Instead, in the UK, if you want to find out who owns some land, you
can pay the land registry £3 and they'll tell you. Presumably there
are means by which you can legally hide this; I'm sure the rich have
figured this out - probably some method by which some fake company in
a tax haven technically "owns" the land and as they're registered
abroad, they don't have to divulge any further details about that
company. So yes, you could argue the Land Registry is profiting from
facilitating stalkers, but equally there are a bunch of legitimate
reasons to need access to such data and I can't think of any sane way
to ensure the use of such a service isn't abused. So from that I
conclude that unless the owner is a millionaire, the owner of any land
is public knowledge.

The use of social engineering to get the address in the first place is
more interesting but very obvious. This sort of thing happens a lot
and sometimes to horrifying consequences (e.g. the Australian DJs who
phoned up a hospital, pretending to be the Queen and Prince of Wales,
enquiring as to the health of the Duchess of Cambridge. The nurse fell
for the hoax and put the call
through. Three days later, the nurse committed suicide). As a species
we are not good at taking the time to verify who we're talking to or
why. Whilst (hopefully) most of us would hang up if our bank
apparently rang us and then asked for our credit card details "for
security" this is largely only because it's in the bank's interest (in
terms of cost of insurance) to reduce fraud, so they've trained us as
such. But in all sorts of other scenarios we implicitly trust people
we've no real reason to. A simple example: ticket inspectors on public
transport. They may be wearing the uniform, but it could be
faked. With their travel-card readers they could be seeing who has the
expensive yearly travel cards, scanning the unique numbers from them
and then using them to program up fraudulent cards. The crypto on
those things is notoriously weak. Has anyone ever requested some means
to verify the identity of a ticket inspector? And even if you could,
how do you know they're not crooked regardless?

So phoning someone up, impersonating someone else, or pretending to
have valid reasons to request the information you're requesting is
always likely to work. It might be illegal in some cases, but it's
certainly human nature to try to be helpful and if you're given a
plausible justification, on what basis could you refuse the request
unless it's contrary to some sort of company policy? In this case, if
you're concerned about anonymity, wouldn't you be concerned about this
possibility, and make use of an anonymous mail box?

In conflating anonymity and privacy, we have failed to see an
important factual difference between them: under the condition of
privacy, we have knowledge of a person’s identity, but not of an
associated personal fact; whereas under the condition of
anonymity, we have knowledge of a personal fact, but not of the
associated person’s identity

The vast violations of our lives by state surveillance as revealed by
Snowden over the last year demonstrate the wholesale collation of
everything we do online and off by our governments. This is both being
able to observe an action and identify the individual who caused it
(thus we have no hope of performing any action anonymously), and being
able to observe an individual and know the actions they take (thus no
privacy). I can't work out whether the ECHR has anything to say on a
right to anonymity; I get the sense that it doesn't try to protect
that. So that's basically saying: "the state shouldn't record your
every move (as that's an invasion of privacy), but moves that we're
interested in, we can know who did them". Of course, we now know
they're recording everything anyway.

We also know that computer systems can always be hacked into - there
is no real security anywhere. Given a skilled and sufficiently funded
adversary, any computer system connected in any way to the internet
can be hacked into. Why? Because humans wrote the software that runs
on those computers and humans are incapable of writing bug-free
software. Look at all the large scale data breaches in recent
history. Nothing is secure.

So we have laws that seem to try and protect privacy, but they're
violated by our own governments, and in any case, we have countless
examples of our inability to store any information securely. So is
there really any hope to be able to exist with anonymity on the
internet?

As ever, it depends who your adversary is. If your adversary is a
government (either your own or some foreign government) then no, you
have no hope. If it's a previous partner of yours who has no
particular computer training, then yes, you're probably going to have
a reasonable chance of being anonymous for a while. But you need to
read up on this and think hard: it's not a trivial undertaking. There
are some good guides as to how to do this, but:

All writers - whether writing under their own names or not -
should be aware of the risks they may incur by hitting 'publish'.

What is the effect of hitting "publish"? It's to put more data points
out there which may lead people to be able to identify you. The fewer
data points out there, the better. So coming back to our book
reviewer, if you want to review books anonymously, and if your
justification for acting anonymously is to avoid being stalked by
authors who don't like your reviews, then why put so many data points
out there? Why have the Facebook page, the Instagram profile with the
faked photos, the Twitter account? Why give your real postal address
to the book review club knowing they're going to post books to it and
might conceivably give your address out to other people?

The social media accounts in particular I find most odd. If you want
to review books then review books. Build your following, your
reputation and cachet on the quality of your reviews. If I'm looking
at a book review I really don't care where you went on holiday, what
your tweets are, or how many pets you have. Putting that information
out there undermines your entire justification for being anonymous: if
you want to be anonymous (i.e. you don't want people to find out who
you are) then why are you putting so much unnecessary information out
there that may allow people to figure out who you are?

Equally, use a name that clearly communicates to me you're trying to
be anonymous: call yourself TheBookReviewer53, DostoyevskyLover or
OrwellWasRight. Doing so doesn't lessen the validity of your opinions
on your chosen subject and is more honest with people reading your
reviews: it's overtly saying "I have reasons to want to exist
anonymously on the internet". It reveals nothing more about your real
identity either: regardless of the obvious fictitious-ness of your
online persona, if you can be found, you can be found.

Researchers show that four data points about a person’s location can
identify that person with 95% accuracy. FOUR. You think you can tweet
anonymously from your phone? You think apps like Whisper allow you to
act anonymously? As with pretty much everything
related to the internet and computing, unless you've spent the last 20
years of your life working with computers, studying computers and
thinking very hard about threat models and what data you're putting
out there, and are utterly paranoid, you basically haven't got a
chance. Do you turn off wifi on your phone when you leave the house?
You should. You trust that USB pen drive you're transferring documents
on? You shouldn't.

Finally and most obviously, any attempt at anonymity clearly doesn't
insulate you from the law. As members of various hacking groups such
as LulzSec found out, you can always be found out by law enforcement
agencies. Yes, you might be able to make it difficult for a poorly
funded person to come after you for libel (which is really just an
indictment of the disgusting relationship between justice and money)
but it's quite a risk to take. If you wouldn't put it in print with
your real name attached, you're placing an awful lot of trust in your
ability to maintain your anonymity against an adversary you probably
don't know as well as you need to.

Sun, 05 Oct 2014

One of the things that annoys me about Object Oriented Programming is
how it's often suggested that it models the "real world". Frequently
tutorials will start with creating an object modelling a chair, and
through inheritance you'll be able to build up composable aspects of
chairs: different numbers of legs, different colours, different
designs. Sometimes they use tables rather than chairs. This is lovely,
but it actually has everything to do with data modelling through
inheritance, decomposition, abstraction and encapsulation, and almost
nothing to do with Object Orientation: the key is that these chairs
have no modifying methods on them. If they have any methods at all
then they'll be for things like getting the number of legs or the
colour, or volume or something - something that is basically fixed
once the object is instantiated. At this point in such tutorials I'd
probably claim this is not actually programming yet: all that's been
achieved so far is that we've assigned some semantics to some numbers
held in memory and we can write some numbers in memory. Programming is
when we manipulate numbers: that involves reading and writing numbers.
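
To make that concrete, here is a minimal sketch of the sort of tutorial
chair I mean. Python is just my choice for illustration, and the names
Chair and Stool are invented. The objects are frozen, so everything is
fixed at instantiation and the only method reads values back out.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Chair:
    """Pure data modelling: every field is fixed once the object exists."""
    legs: int
    colour: str

    def describe(self) -> str:
        # A read-only method: it inspects state but never changes it.
        return f"{self.colour} chair with {self.legs} legs"


@dataclass(frozen=True)
class Stool(Chair):
    """Inheritance adds detail to the model, but still no behaviour."""
    height_cm: int = 45


stool = Stool(legs=3, colour="red")
print(stool.describe())

try:
    stool.colour = "blue"  # any attempt to modify the object is rejected
    modified = True
except FrozenInstanceError:
    modified = False
    print("chairs cannot be modified")
```

Nothing here needs an object at all: a record or a tuple would model
the same data just as well.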

The problem then is that Object Orientation immediately stops being
about modelling the "real world" as soon as we can modify memory. If
we think about how we actually would go about getting a chair made for
us, it could go a bit like this:

Go see your local carpenter,

Have a discussion with them about the style and type of chair
you'd like,

They make the chair and give it to you in return for payment,

You take the chair home,

You decide you don't like the colour so you take it to your garage
and repaint it yourself.

It should be clear that the inanimate object (the chair) is the odd
one out here. Everything else is done by actors that have their own
state, mainly act asynchronously, and can communicate with other
actors through protocols - protocols that do not involve sharing
mutable state (e.g. if I say something to you, that speech is
immutable; you can't change what I've said (though you could choose to
mishear me!)). At no point is any state of any actor actually exposed
to another actor: I may share with you what I'm currently thinking,
and you can try to influence me, but we don't exactly need a mutex
around memory in my brain because YOU'RE NOT GETTING IN THERE!
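
A rough sketch of the same exchange as two actors, assuming Python
threads as stand-ins for actors and queues as their mailboxes. The
names Order, Delivery and carpenter are made up, and using None as a
shutdown signal is my own convention. The messages are immutable and
the carpenter's state never leaves the carpenter's thread.

```python
import queue
import threading
from typing import NamedTuple


class Order(NamedTuple):
    """An immutable message: once sent, nobody can change what was said."""
    style: str
    colour: str
    reply_to: "queue.Queue"  # the mailbox to answer to, not shared state


class Delivery(NamedTuple):
    """An immutable description of the finished chair."""
    style: str
    colour: str


def carpenter(inbox: "queue.Queue") -> None:
    # The carpenter's state lives only in this thread's local variables.
    chairs_made = 0
    while True:
        order = inbox.get()
        if order is None:  # hypothetical shutdown signal
            return
        chairs_made += 1
        order.reply_to.put(Delivery(order.style, order.colour))


carpenter_inbox: "queue.Queue" = queue.Queue()
my_inbox: "queue.Queue" = queue.Queue()
worker = threading.Thread(target=carpenter, args=(carpenter_inbox,))
worker.start()

# I talk to the carpenter only by sending a message and awaiting a reply.
carpenter_inbox.put(Order("windsor", "green", my_inbox))
chair = my_inbox.get()

# Repainting builds a new value; the chair the carpenter described is unchanged.
repainted = chair._replace(colour="blue")

carpenter_inbox.put(None)
worker.join()
print(chair, repainted)
```

There is no lock anywhere in this sketch, because there is nothing
shared to protect.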

If you tried modelling this sort of thing through Object Orientation
without actors then you'd end up with your own thread doing all the
work: it'd be you, it'd be the carpenter and it'd be the chair, maybe
all at once. If your carpenter is in fact a growing business with a
receptionist, a design team and a billing department your thread would
be playing those roles too and would probably have to use locks to
avoid unexpected interactions with other threads doing the same
commissioning-receptioning-designing-constructing-delivery-repainting
dance. And all the time, whilst you're doing the carpentry yourself,
you could easily have your own thoughts, feelings, aspirations and
regrets all on the same stack for your carpenter-alias to mess
with.

Thus Object Orientation causes multiple personality disorder.

So in my view, the way Object Orientation gets introduced tends to be
more like "useful tools for modelling data". But the OO approach to
manipulating that data goes wrong as soon as you try to model the
animated real world. Firstly it has nothing to say about separating
out threads to self-contained actors (but try this in a language or on
a platform without green-threads, or without the ability to preempt
threads and you can quickly hit pain), and secondly even if you do
have actors, OO encourages the sharing of mutable data rather than
passing around either immutable data or copies of data. Yes, good
programming discipline can result in sane designs and a successful
result, but it's not a core aspect of the OOP mantra.

So, OOP has nothing good to say on manipulating data at all - it
either says nothing or it encourages silly ideas like using locks. The
data modelling bits are fine, but I think they're a broader concept
beyond the confines of OOP. What else does OOP get you? An arbitrary
restriction on the receiver of any method. That's about it. It's
thanks to this restriction that writing combinators like cons on a
list library in an OO language is really painful.

The first three examples, from C#, are certainly surprising to me (I
barely know any C# at all though). The first two problems come from
trying to compose side-effecting stuff with laziness. In the first
case it's not clear that the problem is with the IO operation
(printing things out) or actually with the laziness, but more the odd
behaviour of the Where operator (presumably the implementation of
Where doesn't know that a Cartesian product isn't necessary, but
surely any normal monadic/list-comprehension implementation wouldn't
have this problem?). The second case is certainly the terrifying
composition of laziness with throwing exceptions and thus the
exception having the potential to pop out anywhere where the lazy
expression gets forced. However, if you know the Select operator is
lazy, it's not really that surprising. It's arguably piss-poor
language design that there's nothing there to help you, but C# doesn't
have checked exceptions; apparently programmers don't like having to
deal with errors so you reap what you sow. The third case is how C#
has a nice using feature which binds a resource to a lexical
scope. But if you construct a closure capturing the resource and then
send it out of that lexical scope then using goes wrong (it will still
discard the resource even though there's a reference to it within the
closure which remains in-scope). This is certainly piss-poor language
design: if the closure captures stuff from your lexical scope and
you're not reference counting (or equivalent) your lexical scope then
YOU'VE DONE IT WRONG. This is as bad as in C allocating stuff on your
stack and then returning pointers to it.
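
The second and third cases are not peculiar to C#. The following is a
Python analogue, not the code from the article: a generator expression
stands in for the lazy Select, and a with block stands in for using.
The names parse_all and make_reader are invented for the sketch.

```python
import io


def parse_all(texts):
    # A generator expression is lazy: nothing is parsed on this line.
    return (int(t) for t in texts)


numbers = parse_all(["1", "2", "oops"])  # no exception is raised here

try:
    total = sum(numbers)  # forcing the lazy sequence raises here instead
    lazy_failed = False
except ValueError as err:
    lazy_failed = True
    print("error surfaced at the point of use:", err)


def make_reader():
    # `with` ties the resource to this lexical scope...
    with io.StringIO("hello") as resource:
        # ...but the closure carries a reference to it out of the scope.
        return lambda: resource.read()


read_later = make_reader()
try:
    read_later()
    closure_failed = False
except ValueError as err:
    closure_failed = True
    print("resource was discarded before the closure ran:", err)
```

In both halves the failure shows up far from the line that caused it,
which is exactly what makes these compositions unpleasant.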

Next he moves on somewhat tangentially to the point that if object
creation is an observable action then you can't optimise it out. I'm
not sure anyone outside a pure functional language specialist would
ever want object creation to be optimised out, but the point is that
if your constructor has side effects or can in any other way be
observed then you can't have your language runtime do memoization of
object creation. Doing side effects in object constructors has long
been discouraged: I first read that back in the Effective Java book
about a decade ago and I'm sure it wasn't exactly a ground-breaking
piece of advice then.
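
A small sketch of why an observable constructor blocks that
optimisation, again in Python and with invented names (Point,
creation_log, make_point). Caching with functools.lru_cache plays the
part of a runtime that memoizes object creation.

```python
import functools

creation_log = []


class Point:
    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y
        # Side effect: creating a Point can be observed from outside.
        creation_log.append((x, y))


@functools.lru_cache(maxsize=None)
def make_point(x: int, y: int) -> Point:
    # Hypothetical "optimisation": reuse the object for equal arguments.
    return Point(x, y)


Point(1, 2)
Point(1, 2)
plain_count = len(creation_log)  # two constructions, two log entries

creation_log.clear()
first = make_point(1, 2)
second = make_point(1, 2)
memoised_count = len(creation_log)  # the second construction never ran

print(plain_count, memoised_count, first is second)
```

The log differs between the two runs, so the program's behaviour
changed: the memoization is not safe unless the constructor is pure.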

So far then we have that side effects which are untracked have the
potential to be bad: whether it's printing things out, or throwing
exceptions, or discarding resources early, or preventing compiler
optimisations. But next I feel the article goes a bit wrong. He first
moves on to how channels in Cω can store state so they're not pure
either, thus bad. And then goes on to how in Erlang you have the same
problem as you're just modelling mutable state in actors:

Note how this Erlang actor basically encodes an object with
dynamic method dispatch using the pattern-matching,
message-sending, and recursion primitives of the language, which
you may happily leverage to implement mutable references,
sabotaging the fact that the Erlang language does not natively
expose mutable state.

This is wrong: you cannot implement mutable references in Erlang. Data
is immutable in Erlang so if you send some value out of an actor, you
are sending that value. Not a reference to a value or variable. Even
if you create a closure and send that out of the actor, the closure is
capturing those values as they exist at that point in time. If you
have received a value sent to you from an actor, you may use it to
create other values, but doing so does not affect the "original", and
similarly, the actor itself can continue to modify its own state, but
it does not affect the values it sent to you. Yes, you can use Erlang
actors to model objects. But an actor's own modifications of its state
cannot be observed as side effects on values you've previously
retrieved from that actor, and vice versa.
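
To illustrate the point (in Python rather than Erlang, so the
immutability is only imitated by using tuples, and the message shapes
"add", "get" and "stop" are my own invention), here is an actor that
keeps changing its state while a value it sent out earlier stays
exactly as it was:

```python
import queue
import threading


def history_actor(inbox: "queue.Queue") -> None:
    # The actor's state is a local variable; no other thread can reach it.
    history = ()
    while True:
        message = inbox.get()
        if message[0] == "add":
            # "Modifying" the state means building a new tuple.
            history = history + (message[1],)
        elif message[0] == "get":
            message[1].put(history)  # send the value as it is right now
        elif message[0] == "stop":
            return


inbox: "queue.Queue" = queue.Queue()
replies: "queue.Queue" = queue.Queue()
actor = threading.Thread(target=history_actor, args=(inbox,))
actor.start()

inbox.put(("add", 1))
inbox.put(("get", replies))
snapshot = replies.get()

# The actor carries on changing its own state...
inbox.put(("add", 2))
inbox.put(("get", replies))
later = replies.get()

inbox.put(("stop",))
actor.join()

# ...but the value received earlier has not moved underneath us.
print(snapshot, later)
```

The actor behaves like an object with methods, yet what comes back from
it is a value, never a reference into its state.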

The reference you have to an actor is a process identifier (also
immutable) which does not present any information itself about the
state of the actor. Through that, you can send messages to an actor
and test whether or not the actor is still alive, but that is all. And
in any case, where has the sudden objection to mutable state come
from? State is just a catamorphism on prior inputs. State is not the
problem: unconstrained side effects are the problem. Certainly sharing
mutable state is a problem (and you could argue that mutating shared
state is a side effect and that it should be tracked statically), but
Erlang does not allow for that.
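
By "a catamorphism on prior inputs" I mean nothing more exotic than a
fold. A minimal sketch, with an invented step function and message
format:

```python
from functools import reduce


def step(balance: int, message: tuple) -> int:
    # One input moves the old state to the new state; nothing is mutated.
    kind, amount = message
    if kind == "deposit":
        return balance + amount
    if kind == "withdraw":
        return balance - amount
    return balance


inputs = [("deposit", 10), ("withdraw", 3), ("deposit", 5)]

# The current state is the fold of `step` over every input so far.
state = reduce(step, inputs, 0)

# Any earlier state can be recovered by folding over a prefix.
earlier = reduce(step, inputs[:2], 0)

print(state, earlier)
```

An actor's receive loop is this fold run one message at a time.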

He may have been better off going for an example of opening a file,
sending the file handle to another process and then closing the file
handle before it's been used (i.e. the same as the third C#
example). Except:

All file operations can return an error anyway so handling errors
in such code is completely normal;

In Erlang a file handle is normally an actor itself, so what
you're doing is passing around a process identifier. Sending
messages to a dead process (once the file is closed) is a normal
activity and you can detect if the process has died in normal
ways;

If you bypass such normal file handling for performance reasons
and open the file in "raw" mode then Erlang has a light form of
object capabilities in which only the process that opened the file
is allowed to use the file handle, so again the use of the file
handle would error predictably;

The language doesn't have the same C# feature for discarding
resources once you return out of a lexical scope. Consequently
closing a file is an explicit operation and given the asynchronous
concurrent mindset one develops when working in Erlang, it's very
likely you'll realise how odd it is to be closing a file handle
whilst there's some closure out there which may not have been run
yet.

Beyond this, he introduces the Haskell type system and explains that
it captures side effects statically. As a result, by bowing to the
demands of the type checker, it offers you a proof that if such
effects occur, your program will handle them: exceptions will not go
uncaught, IO operations are only permitted where the semantics lead to
expected outcomes, resources are not used after they're discarded and
the compiler can use all these proofs to do all manner of
optimisations to your program.

These proofs can certainly be very valuable (though they are no
substitute for disciplined, high quality design and careful
implementation). Obviously, they don't capture everything
though. Particularly relevant for concurrent and distributed programs,
they don't capture sufficient side effects to allow for a proof of the
absence of deadlocks. Haskell standard libraries contain channels and
semaphores which can easily be used to sporadically end up with a
deadlock between processes. A deadlock is definitely a side effect:
the effect is the program probably stops working. The cause is an
insufficient use of locks to control the scheduler (be it scheduling
of OS threads or language runtime scheduling of green threads).
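
For anyone who hasn't met one, this is roughly the shape of such a
deadlock. It is a sketch in Python with plain locks, not Haskell's
channels or semaphores, and the names are invented. Two workers take
the same pair of locks in opposite orders; the timeout exists only so
that the example terminates instead of hanging.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
both_ready = threading.Barrier(2)
results = {}


def worker(name, first, second):
    with first:
        both_ready.wait()  # both workers now hold their first lock
        # Without the timeout this acquire would block forever.
        got_second = second.acquire(timeout=0.2)
        both_ready.wait()  # keep holding `first` until both attempts end
        if got_second:
            second.release()
        results[name] = got_second


t1 = threading.Thread(target=worker, args=("one", lock_a, lock_b))
t2 = threading.Thread(target=worker, args=("two", lock_b, lock_a))
t1.start()
t2.start()
t1.join()
t2.join()

# Neither worker ever obtained its second lock.
print(results)
```

No type in this program records that the two workers can wait on each
other, and I'd expect the Haskell equivalent to type-check just as
happily.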

More broadly, the proof a type checker offers is that the
specification you've provided (type signatures) is not violated by its
inferences about your code. Until the type checker allows "and makes
progress" as part of a specification, Haskell itself is no safer than
any other language that claims to be "mostly functional".

Sat, 27 Sep 2014

I remember when I first went to University, coming across people who
were both clearly extremely expert in their fields, from whom I wanted
to learn, but were also religious, and how this utterly baffled me. At
that point I would cheerfully describe myself as an avid atheist. My
ignorance and naivety were somewhat extensive.

Over a decade later I like to think I have a more nuanced view. The
most recent war in Gaza led, obviously, to vast amounts of suffering
but some excellent articles on the subject (this one by Hadley Freeman
in particular) helped me see perspectives more clearly and articulated
how crucial it is to be precise with criticism: are you criticising a
religion, a people, a government, a policy or something else? Nothing
is ever black-and-white and it seems increasingly important to
anticipate the consequences of an ill-thought-through comment or
reaction. A good example of that is George Galloway's comments this
week in the debate about this country once again getting involved in
Iraq. On the face of it, and certainly without being remotely
well-enough informed to evaluate the accuracy of his claims, if his
claims on the size and makeup of ISIS/ISIL are true then there seems
little likelihood that the bombing campaigns being discussed will be
effective, and quite likely counter-productive. But all of that got
lost due to his description of Iraqis as quiescent. The way in which that
description was seized upon by other MPs and the resultant media storm
resulted in the over-shadowing not just of the rest of his
contribution to the debate, but also of other important aspects of the
debate, such as the resignation of Rushanara Ali (Labour's Shadow
Minister for Education), citing once again the lack of a credible
long-term plan for the region and our involvement.

Addressing the broader and somewhat more abstract issue is
this enlightening article by Karen Armstrong. Again, I'm not claiming
to be expert in the area, merely I found the article very
educative. It had barely occurred to me that the western world's
separation of the secular from the sacred was firstly such a recent
occurrence, and secondly that it arose from a specific set of
circumstances. There is no implicit reason why separation of state
from church is an inevitable or even likely happenstance (to me, this
reminds me of the question "if humans evolved from monkeys, then why
can't we find monkeys still evolving into humans today?", to which the
answer is "the circumstances are not right for that to occur"). The
fact that the English word "religion" can't really be translated
accurately into other languages (especially not languages that predate
English such as Greek or Latin; as historically faith is all
encompassing of life, not merely a private affair as we treat it today
in the west) starts to show quite how odd the separation of secular
from sacred in the modern west really is.

More interesting still is the observation that in the west, belonging
to a Nation has in some ways subsumed the role of belonging to a
Religion, only apparently with more positive overtones: we consider it
almost reprehensible to die for your religion, but honourable to die
for your nation. It would seem the concept of even belonging to a
nation and having any sense of greater community outside your
immediate surroundings only came about with the increased ability of
governments to engage with (or intrude upon) their citizens. Before
that point, presumably with church attendance widespread and frequent,
one's interaction with "the wider world" was through the
representative of the church. This would seem to explain a lot about
why governments of the past sought the blessing of their nation's
church for particular courses of action: maybe the church was seen as
the bridge between the government (or monarchy) and the people. The
whole article is worth a read.