7 August 2018

There have been many news stories of late about potential
attacks on the American
electoral system. Which attacks are actually serious? As always, the
answer
depends
on economics.

There are two assertions I'll make up front. First, the attacker—any
attacker—is resource-limited. They may have vast resources, and
in particular they may have more resources than the defenders—but
they're still limited. Why? They'll throw enough resources
at the problem to
solve it, i.e., to hack the election, and use anything left over for
the next problem, e.g., hacking the Brexit II referendum…
There's always another target.

Second, elections are a system. That is, there are multiple
interacting pieces. The attacker can go after any of them; the defender
has to protect them all.
And protecting just one piece very well won't help;
after all, "you don't go through strong security, you go around it."
But again, the attacker has limited resources. Their strategy, then,
is to find the greatest leverage, the point to attack that costs
the defenders the most to protect.

There are many pieces to a voting system;
I'll concentrate on the major ones: the voting machines, the registration
system, electronic poll books, and vote-tallying software.
Also note that many of these pieces can be attacked indirectly, via a
supply
chain attack on the vendors.

There's another point to consider: what are the attacker's goals?
Some will want to change vote totals; others will be content with
causing enough obvious errors that no one believes the results—and
that can result in chaos.

The actual voting machines get lots of attention. That's partly
a hangover from the 2000 Bush–Gore election, where myriad
technological problems in Florida's voting system (e.g.,
the butterfly ballot
in Palm Beach County and the
hanging
chads
on the punch card voting machines) arguably
cost
Gore the state
and hence the presidential election.

And purely computerized
(DRE—Direct Recording Electronic)
voting machines
are indeed problematic. They
make mistakes.
If there's ever a real problem, there's
nothing
to recount.
It's crystal-clear to virtually every computer scientist who has studied
the issue that DRE machines are a bad idea.
But: if you want to change the
results of a nation-wide election or set of elections in the
U.S., going after DRE machines is probably the wrong idea. Why not?
Because it's too expensive.

There are many different election administrations in the U.S.:
about
10,000
of them.
Yes, sometimes an entire state uses the same
type of machine—but each county administers its own machines.
Storing the voting machines?
Software updates? Done by the county.
Programming the ballot? Done by the county.
And if you want to attack them? Yup—you have to go to that county.
And voting machines are rarely, if ever, connected to the Internet,
which means that you pretty much need physical presence to do
anything nasty.

Now, to be sure, if you are at the polling place you may be able to do
really
nasty things
to some voting machines. But it's not an attack that scales well for
the attacker. It may be a good way to attack a local election, but nothing
larger. A single Congressional race? Maybe, but let's do a back-of-the-envelope
calculation. The population of the U.S. is about
325,000,000.
That means
that each election area
has about 32,500 people. (Yes, I know it's very
non-uniform. This is a back-of-the-envelope calculation.)
There are 435 representatives, so each one has about 747,000 constituents,
or about 23 election districts. (Again: back of the envelope.)
So: you'd need a physical presence in roughly 23 different counties, and
maybe many precincts in each county to
tamper
with the machines there. As I said, it's not an attack that scales very
well. We need to fix our voting machines—after all, think of Florida
in 2000—but for an attacker who wants to change the result of
a national election, it's not the best approach.
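
The back-of-the-envelope numbers are easy to reproduce. Here's a quick sketch in Python; the constant names are mine, and the inputs are just the rough figures used in this post (about 10,000 election administrations, a population of about 325,000,000, and 435 House seats).

```python
# Rough inputs taken from the text; none of these are precise figures.
US_POPULATION = 325_000_000
ELECTION_JURISDICTIONS = 10_000
HOUSE_SEATS = 435

# Average number of people served by one election administration.
people_per_jurisdiction = US_POPULATION / ELECTION_JURISDICTIONS

# Average number of constituents per House seat.
people_per_seat = US_POPULATION / HOUSE_SEATS

# How many election administrations one House seat spans, on average.
jurisdictions_per_seat = people_per_seat / people_per_jurisdiction

print(f"people per jurisdiction: {people_per_jurisdiction:,.0f}")
print(f"people per House seat:   {people_per_seat:,.0f}")
print(f"jurisdictions per seat:  {jurisdictions_per_seat:.1f}")
```

The last figure is the average number of separate jurisdictions an attacker would have to visit in person to tamper with a single House race, which is why the attack scales so poorly.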

There's one big exception: a supply chain attack might be very
feasible for a nation-state attacker. There are not many vendors
of voting equipment; inserting malware in just a few places could
work very well. But there's a silver lining in that cloud: because
there are many fewer places to defend than 50 states or 10,000 districts,
defense is much less expensive and hence more possible—if
we take the problem seriously.

And don't forget the chaos issue. If, say, every voting machine in a populous
county of a battleground state showed a preposterous result (perhaps
a 100% margin for some candidate, or 100 times as many votes cast
as there are registered voters in the area), no one will believe
that that result is valid. What then? Rerun the voting in just that
county? Here's what the Constitution says:

The Congress may determine the Time of chusing the Electors, and the
Day on which they shall give their Votes;
which Day shall be
the same throughout the United States.

The voter registration systems are a more promising target for
an attacker.
While these are, again, locally run, there is often a statewide portal
to them.
In fact,
38 states
have or are about to have online voter registration.

In 2016, Russia
allegedly
attacked
registration systems in a number of states. Partly, they wanted to
steal voter information, but an attacker could easily delete or modify
voter records, thus effectively disenfranchising people.
Provisional ballots? Sure, if your polling place has enough of them,
and if you and the poll workers know what to do. I've been a poll worker.
Let's just say that handling exceptional cases isn't the most efficient
process.
And consider the public reaction if many likely
supporters
(based on demographics)
of a given candidate
are the ones who are disproportionately deleted.
(Could the attackers register phony voters? Sure, but to what end?
In-person voter fraud is exceedingly rare; how many times can Boris
and Natasha show up to vote? Again, that doesn't scale. That's also why
requiring an ID to vote is solving a non-problem.)

There's another point. Voting software is specialized; its attack surface
should be low. It's possible to get that wrong, as in some now-decertified
Virginia
voting machines, and there's always the underlying operating system;
still, if the machines aren't networked, during voting the only exposure
should be via the voting interface.

A lot of registration software, though, is a more-or-less standard
web platform, and is therefore subject to all of the risks of any
other web service. SQL injection, in particular, is a very real risk.
So an attack on the registration system is not only more scalable, it's
easier.
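
To make the SQL injection risk concrete, here's a minimal sketch using Python's built-in sqlite3 module. The table layout, names, and lookup functions are invented for illustration; I'm not describing any real registration system. The first lookup builds the query by pasting user input into the SQL text, and the second passes the input as a bound parameter.

```python
import sqlite3


def make_demo_db():
    """Create a throwaway in-memory table of made-up voter records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE voters (name TEXT, precinct TEXT)")
    conn.executemany(
        "INSERT INTO voters VALUES (?, ?)",
        [("Alice", "12"), ("Bob", "12"), ("Carol", "7")],
    )
    return conn


def lookup_unsafe(conn, name):
    # Vulnerable: the input becomes part of the SQL statement itself.
    query = f"SELECT name, precinct FROM voters WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def lookup_safe(conn, name):
    # Parameterized: the input is only ever treated as a value.
    query = "SELECT name, precinct FROM voters WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


conn = make_demo_db()
payload = "x' OR '1'='1"  # classic injection string
print(len(lookup_unsafe(conn, payload)))  # every row in the table
print(len(lookup_safe(conn, payload)))    # no rows: nobody has that name
```

The same trick that leaks every row here can, against a writable table, delete or alter rows. Bound parameters close that particular hole, though they are only one of the defenses a real web service needs.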

Before the election, voter rolls are copied to what are known as poll books.
Sometimes, these are paper books; other places use electronic ones.
The electronic ones are networked to each other; however, they are
generally not connected to the Internet. If that networking is set up
incorrectly, there can be risks; generally, though, they're networked on
a LAN. That means that you have to be at the polling place to exploit
them. In other words, there's some risk, but it's not much greater than
the voting machines.

There's one more critical piece: the vote-tallying software. Tallies from
each precinct are transmitted to the county's election board; there
may be links to the state, to news media, etc. In other words, this
software is networked and hence very subject to attack.
However: this is used for the election night count; different procedures
can be and often are used for the official canvass. And even without
attacks,
many things
can go wrong:

In Iowa, a hard-to-read fax from Scott County caused election
officials initially to give Vice President Gore an extra 2,006
votes. In Outagamie County, Wis., a typo in a tally sheet threw
Mr. Bush hundreds of votes he hadn't won.

But: the ability to do a more accurate count the second time around
depends on there being something different to count: paper ballots.
That's what saved the day in 2000 in
Bernalillo
County, New Mexico. The problem:
"The paper tallies, resembling grocery-store receipts, seemed to show
that many more ballots had been cast overall than were cast in individual
races. For example, tallies later that night would show that, of about
38,000 early ballots cast, only 25,000 were cast for Mr. Gore or Mr.
Bush."
And the cause? Programming the vote-counting system:

As they worked, Mr. Lucero's computer screen repeatedly
displayed a command window offering a pull-down menu. From
the menu, the two men should have clicked on "straight
party." Either they didn't make the crucial click, or they
did and the software failed to work. As a result, the
Accu-Vote machines counted a straight-party vote as one
ballot cast, but didn't distribute any votes to each of the
individual party candidates.

To illustrate: If a voter filled in the oval for straight-party
Democrat, the scanner would record one ballot cast but wouldn't
allocate votes to Mr. Gore and other Democratic candidates.

Crucially, though, once they fixed the programming they could retally
those paper ballots.
(By the way, programming the tallying computer can itself be complex.
Bernalillo County, which had a population of
557,000
then, required 114 different ballots.)

The best way to check the ballot-counting software is
risk-limiting
audits.
A risk-limiting audit checks a random subset of the ballots cast.
The closer the apparent margin, the more ballots are checked by hand.
"Risk-limiting audits guarantee that if the vote tabulation
system found the wrong winner, there is a large chance of a
full hand count to correct the results."
And it doesn't matter whether the wrong count was due to buggy software
or an attack.
In other words, if there is a paper trail, and if
it's actually looked at, via either a full hand-count or a risk-limiting
audit, the tallying software isn't a good target for an attacker.
One caveat: how much chaos might there be if the official count or
the recount delivers results significantly different from the
election night fast count?
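
Here's a rough sketch of why a closer margin means more hand-checked ballots. It's a simplified two-candidate sequential test in the style of the BRAVO ballot-polling method, not necessarily the procedure any particular jurisdiction uses; the function name, the "W"/"L" encoding, and the 5% risk limit are my own assumptions. Real audits also have to deal with invalid ballots, multiple contests, and how the random sample is drawn.

```python
def ballot_polling_audit(sample, reported_winner_share, risk_limit=0.05):
    """Simplified two-candidate ballot-polling check.

    sample: hand-read ballots in random draw order, "W" for the
            reported winner and "L" for the reported loser.
    reported_winner_share: winner's reported share of the two-way vote.
    Returns (confirmed, ballots_examined). If confirmed is False after
    the whole sample, the next step would be a full hand count.
    """
    if not 0.5 < reported_winner_share < 1:
        raise ValueError("reported winner share must be between 0.5 and 1")

    threshold = 1 / risk_limit  # evidence needed to stop the audit
    evidence = 1.0
    examined = 0
    for ballot in sample:
        examined += 1
        if ballot == "W":
            evidence *= reported_winner_share / 0.5
        elif ballot == "L":
            evidence *= (1 - reported_winner_share) / 0.5
        if evidence >= threshold:
            return True, examined
    return False, examined


# Best case for each reported margin: every sampled ballot agrees.
print(ballot_polling_audit(["W"] * 200, 0.60))
print(ballot_polling_audit(["W"] * 200, 0.52))
```

Even when every sampled ballot favors the reported winner, the 52% contest needs several times as many ballots as the 60% one before the audit can stop. If the sample keeps disagreeing with the reported result, the evidence never reaches the threshold and the audit escalates to a full hand count.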

There's one more point: much of the election machinery, other than
the voting machines themselves, is an ordinary IT installation, and
hence is subject to all of the security ills that any other IT
organization can be subject to. This specifically includes things like
insider attacks and
ransomware—and some attackers
have been
targeting local governments:

Attempted ransomware attacks against local governments in
the United States have become unnervingly common. A 2016
survey of chief information officers for jurisdictions
across the country found that obtaining ransom was the most
common purpose of cyberattacks on a city or county government,
accounting for nearly one-third of all attacks.

The threat of attacks has induced
at
least one jurisdiction
to suspend online return of absentee ballots. They're wise to
be cautious—and probably should have been that cautious to start.

Again, elections are complex. I've only covered the major pieces here;
there are
many
more ways things can go wrong. But of this sample, it's pretty clear
that the attackers' best target is the registration system.
(Funny, the Russians
seemed
to know that, too.)
Actual
voting machines are not a great target, but the importance of
risk-limiting audits (even if the only problem is a close race)
means that replacing DRE voting machines with something that provides
a paper trail is quite important. The vote-counting software is even
less interesting if proper audits are done, though don't discount
the utility to some parties of chaos and mistrust.

Update: No sooner did I write about how impossible results could
lead to chaos than
this
story
appeared about
DRE machines in Georgia:
"[i]n Habersham County's Mud Creek precinct, … 276 registered voters
managed to cast 670 ballots". There were other problems, too.
I suspect bugs rather than malice—but we don't really know yet.

8 August 2018

I keep hearing stories of people using "foldering" for
covert communications. Foldering is the process of composing
a message for another party, but instead of sending it as an
email, you leave it in the Drafts folder. The other party
then logs in to the same email account and reads the message;
they can then reply via the same technique.
Foldering has been used for a long time, most famously by
then-CIA director
David
Petraeus
and his biographer/lover
Paula Broadwell.
Why is foldering
used? What is it good for, and what are its weaknesses?
There's a one-word answer to its strength—metadata—but its utility
(to the extent that it had any)
is largely that of a bygone era.

Before I start, I need to define a few technical terms.
In the email world, there are "MUAs"—Mail User Agents—and
"MTAs"—Mail Transfer Agents. They're different.

An MUA is what you use to compose and read email. It could be
a dedicated mail program—the Mail app on iPhones and MacOS,
Outlook on Windows, etc. An MUA needs to be configured with the
domain names of the user's outbound and inbound email servers.
MUAs live on user machines, like laptops and phones; MTAs are
servers, and are run by corporations, ISPs, and mail providers like
Google. And there's a third piece, an inbound mail server. A receiving
MTA hands off the mail to the inbound mail server; the MUA talks to
it and pulls down email from it.

Webmail systems are a bit funny. Technically, they're remote MUAs
that you talk to via a web browser. But they still talk to MTAs and
inbound mail servers, though
you don't see this. The MUA and MTA might be on the same computer
for a small operation
(perhaps running the open source
squirrelmail package);
for something the size of Gmail or Hotmail,
the webmail servers are on separate machines from the MTAs.
However, foldering doesn't involve an MTA. Rather,
it involves composing messages and leaving them in some folder.
The folders are all stored on disk—as it turns out, on disk managed
by the inbound mail server, even though you're composing mail.
(Why? Because only inbound mail servers and MUAs know about
folders; MTAs don't. The MUA could have a draft mail folder (it probably
does), but by sending it to the inbound mail server, you can start
composing email on one device and continue from another.)
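
For the curious, here's roughly what "leave it in a folder" looks like at the protocol level, sketched with Python's standard imaplib and email modules. It assumes the inbound mail server speaks IMAP and that the folder is called "Drafts"; real providers use different folder names, and webmail systems may use internal protocols instead. The helper names are mine. The point is simply that the message is stored with an IMAP APPEND and no outbound mail connection is ever opened.

```python
import imaplib
import time
from email.message import EmailMessage


def build_draft(subject, body):
    """Build a message with no recipient headers: it is never sent."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg.set_content(body)
    return msg.as_bytes()


def leave_in_folder(conn, subject, body, mailbox="Drafts"):
    """Store a message directly in a folder on the inbound mail server.

    conn is an already-authenticated connection with the interface of
    imaplib.IMAP4_SSL. The folder name is an assumption.
    """
    return conn.append(
        mailbox,
        r"(\Draft)",                             # mark it as a draft
        imaplib.Time2Internaldate(time.time()),  # timestamp for the server
        build_draft(subject, body),
    )
```

With a real account, conn would come from something like imaplib.IMAP4_SSL(host) followed by login(); the other party would then select the same folder and fetch the message.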

Webmail systems are, as I said, MUAs. For technical reasons, they
generally
don't have any permanent folder storage of their own; they just talk
to the inbound mail server.

So: foldering via a webmail system involves a web server and an
inbound mail server. It does not involve an MTA—and that's important.

If you're trying to engage in covert communications, you're not going
to use your own mail systems—it's too obvious what's going on.
Accordingly, you'll probably use a free commercial email service such
as Google's
Gmail or Microsoft's
Outlook. The party with
whom you're communicating
will do the same.
Let's follow the path of a typical email from a Gmail user (per the
usual
conventions in cryptography,
we'll call her Alice) to an
Outlook user named Bob.

The sender logs in to Gmail, probably via a web browser though
possibly via an MUA app. Even back in the mists of time, the login
connection was encrypted. However, until 2010, the actual
session
wasn't
encrypted by default,
though users had been able to turn on encryption since at least 2008. Let's
assume that our hypothetical conspirators or lovers were security-conscious,
and thus turned on encryption for this link. That meant that no eavesdropper
could see what was going on, and in particular could not see who logged in
to Gmail or to whom a particular email was being sent.
After Alice clicks "Send", though, the webmail MUA hands the message
off to the MTA—and that's where the security breaks down.
Back then, the MTA-to-MTA traffic was not encrypted; thus,
someone—an intelligence agency?—monitoring the Internet backbone
would see the emails. Bingo: our conspirators are burned.
And even if we're talking about simple legal processes, the sender and
recipient of such email messages are (probably)
legally
metadata
and hence are readily available to law enforcement.

Suppose, though, that Alice and Bob used foldering. There are no MTAs
involved, hence no sender/receiver metadata, and no unencrypted content
flowing anywhere. They're safe—or so they thought…

When Alice logs into Gmail, her IP address is recorded. It, too, is metadata.
An eavesdropper doesn't know that it's Alice, but her IP address is visible.
More importantly, it's logged by Gmail: user Alice logged in from 203.0.113.42.
Oddly enough, "Alice"—it's really Bob, of course—logged in from
198.51.100.17 as well, and those two IP addresses aren't physically
located anywhere
near each other. That discrepancy might even be logged.
Regardless, it's in Gmail's log files, and if Alice or Bob is under
suspicion, a simple subpoena for the log files (or a simple hack of
the mail server) will show what's going on: these two IP addresses
are showing a decidedly odd login pattern, and one of them belongs to a
party under suspicion.
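
Spotting that pattern in a log is not hard. Here's a minimal sketch, assuming a made-up log format of (account, timestamp, IP address) records and a made-up geolocation table; the coordinates attached to the two example addresses are purely hypothetical, and I have no idea what any provider's real tooling looks like. It flags consecutive logins to one account that are too far apart to be explained by travel.

```python
import math
from datetime import datetime

# Hypothetical geolocation table: IP address -> (latitude, longitude).
GEO = {
    "203.0.113.42": (40.71, -74.01),   # assumed: somewhere near New York
    "198.51.100.17": (51.51, -0.13),   # assumed: somewhere near London
}


def km_between(a, b):
    """Great-circle distance between two (lat, lon) points, in km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2)
         * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))


def odd_logins(records, max_kmh=900):
    """Flag consecutive logins that imply faster-than-airliner travel.

    records: (account, ISO timestamp, ip) tuples, sorted by time.
    Returns a list of (account, previous_ip, current_ip).
    """
    last_seen = {}
    flagged = []
    for account, stamp, ip in records:
        when = datetime.fromisoformat(stamp)
        if account in last_seen:
            prev_when, prev_ip = last_seen[account]
            hours = (when - prev_when).total_seconds() / 3600
            if km_between(GEO[prev_ip], GEO[ip]) > max_kmh * hours:
                flagged.append((account, prev_ip, ip))
        last_seen[account] = (when, ip)
    return flagged


log = [
    ("alice", "2010-03-01T09:00:00", "203.0.113.42"),
    ("alice", "2010-03-01T10:00:00", "198.51.100.17"),
    ("alice", "2010-03-01T10:30:00", "198.51.100.17"),
]
print(odd_logins(log))
```

Only the second login is flagged: the account appears to have crossed an ocean in an hour. Repeat logins from the same address are not flagged.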

So where are we, circa 2010? Suppose neither Alice nor Bob were suspected
of anything and they sent email.
An intelligence agency monitoring assorted Internet links
would see email between the two of them; if one was being targeted, it
would be able to pick off the contents of the messages.
If they used foldering, though, they would be much safer: there wouldn't
be any incriminating unencrypted traffic. The spooks would see traffic
from Alice's and Bob's IP addresses to Gmail or Outlook, but that's not
suspicious. The login names and the sessions themselves are protected.

Suppose, though, that Alice and/or Bob were under suspicion by law enforcement.
A subpoena would get the login IP addresses; the discrepancy would stick
out like a sore thumb, and the investigation would proceed apace.

In other words, in 2010 foldering would protect against Internet eavesdropping
but not against law enforcement.

The world is very different today. Following the Snowden revelations,
many email providers
turned
on encryption
for MTA-to-MTA traffic. As a consequence, our hypothetical intelligence
agency can't see that email is flowing between Alice and Bob; it's all
protected. If they're being investigated, of course, a subpoena
will show the email—but the same sort of subpoena would also show the
login IP addresses.

Where does that leave us? Today, an attacker with access to log files,
either via subpoena or by hacking a mail server, can see the communication
metadata whether Alice and Bob are using foldering or simply sending email.
An eavesdropper can't see the communications in either case. This is in
contrast to 2010, when an eavesdropper could learn a lot from email but
couldn't from a foldering channel.

Conclusion: if Alice and Bob and their mail services take normal
2018 precautions, foldering adds very little security.

24 August 2018

This morning, I saw a link to a
fascinating
document.
Briefly, it's a declassified
TICOM document
on some German cryptanalytic efforts during World War II.
There are a number of interesting things about it, starting with the
question of why it took until 2018 to declassify this sort of information.
But maybe there's an answer lurking here…

(Aside: I'm on the road and don't have my library handy; I may update
this post when I get home.)

TYPEX—originally
Type 10, or
Type
X—was the British high-level
cipher device. It was based on the commercial Enigma as modified by
the British. The famous German military Enigma was also derived from
the commercial model. Although the two parties strengthened it in
different ways, there were some fundamental properties—and fundamental
weaknesses—that
both inherited from the original design.
And the Germans had made significant progress against TYPEX—but
they couldn't take it to the next level.

The German Army Cryptanalytic Agency, OKH/In 7/VI, did a lot of statistical
work on TYPEX. They eventually figured out more or less everything
about how it worked, learning only later that the German army had
captured three TYPEX units at Dunkirk. All that they were missing
were the rotors, and in particular how they were wired and where the
"notch" was on each. (The notch controlled when the rotor would kick
over to the next position.) And if they'd had the rotor details and
a short "crib" (known plaintext)?

The approximate number of tests required would be about 6
× 14³ = 16,464. This was not by any means a large number
and could certainly be tackled by hand. No fully mechanised
method was suggested, but a semi-mechanised scheme using
a converted Enigma and a lampboard was suggested. There
can be no doubt that it would have worked if the conditions
(a) and (b) had ever been fulfilled. Moreover, the step
from a semi-mechanised approach to a fully automatic method
would not have been a difficult one.

In other words, the Germans never cracked TYPEX because they didn't
know anything about the rotors and never managed to "pinch" any.
But the British did have the wiring of the Enigma rotors. How?

It turns out that the British never did figure that one out. It was
the
work of a brilliant Polish mathematician,
Marian
Rejewski; the Poles
eventually gave their results to the French and the British, since
they realized that even perfect knowledge of German plans wouldn't
help if their army was too weak to exploit the knowledge.

Rejewski was, according to David Kahn, the first person to use mathematics
other than statistics and probability in cryptanalysis. In particular,
he used
group
theory and permutation theory
to figure out the rotor wiring.
This was coupled with a German mistake in how they encrypted the
indicators, the starting positions of the rotors.
(Space prohibits a full discussion of
what that means. I recommend Kahn's
Seizing the Enigma
and Budiansky's
Battle of Wits
for more details.)

But what if the Germans had solved TYPEX? What would that have meant?
Potentially, it would have been a very big deal.

The first point is that since TYPEX and the German military Enigma had
certain similarities, the ability to crack TYPEX (which is generally
considered stronger than Enigma) might have alerted the Germans that
the British could do the same to them—which was, of course, the case.
If that wasn't enough, the British often used TYPEX to
communicate
ULTRA (the intelligence derived from cryptanalysis of Enigma and
some other systems) to field units.
(Aside: the British used one-time pads to
send
ULTRA to army-level commands but used TYPEX for lower-level units.)
In other words, had the German army gained the ability to read TYPEX,
it might have been extremely serious.
And although their early work was on 1940 and earlier TYPEX,
"Had they succeeded in reading early traffic it seems reasonable to
conjecture that they might have maintained continuity beyond the change on
1/7/40 when the 'red' drums were introduced."
It's certainly the case that the British exploited continuity with
Enigma; most historians agree that if the Germans had used Enigma at the
start of the war as well as they used it at the end, it's doubtful that
the British could have cracked it.

There are a couple of other interesting points in the TICOM report.
For one thing, at least early in the war British cipher clerks were making
the same sorts of mistakes as the German clerks did:
"operators were careless about moving the wheels between the end of one
message and the start of the next". The British called their insight
about similar laziness by the Germans the
"Herivel
tip".
And the British didn't even encipher their indicators; they sent them
in the clear. (To be sure, the bad way the Germans encrypted their
indicators was what led to the rotor wiring being recovered, thus
showing that not even trying can be better than doing something badly!)

So where are we? The Germans knew how TYPEX worked and had devised an
attack that was feasible if they had the rotor wiring. But they
never captured any rotors and they lacked someone with the brilliance of
Marian Rejewski, so they couldn't make any progress. We're also left with
a puzzle: why was this so sensitive that it wasn't declassified until more
than 70 years after the war? Might the information have been useful to someone
else, someone who did know the rotor wiring?

It wouldn't have been the U.S. The U.S. and the British cooperated very
closely on ULTRA, though the two parties
didn't
share everything:
"None of our allies was permitted even to see the [SIGABA] machine, let
alone have it."
Besides, TICOM was a
joint project;
the U.S. had the same information on TYPEX's weaknesses.
However, might the Soviets have benefited? They had
plenty
of well-placed agents
in the U.K. Might they have had the rotor wirings? I don't know—but I
wonder if something other than sheer inertia kept that report secret for
so many years.