Barbara Simons, an accomplished computer scientist and e-voting expert, was recently appointed to the Election Assistance Commission (EAC) Board of Advisors. (The EAC is the U.S. Federal body responsible for voting technology standards, among other things.) This is good news.

The board has thirty-seven members, of which four positions are allocated for “members representing professionals in the field of science and technology”. These four positions are to be appointed by the Majority and Minority leaders in the House and the Senate. (See page 2 of the Board’s charter.) Given the importance of voting technology issues to the EAC, it does seem like a good idea to reserve 10% of the advisory board positions for technologists. If anything, the number of technologist seats should be larger.

Barbara was appointed to the board position by the Senate Majority Leader, Harry Reid. Kudos to Senator Reid for appointing a genuine voting technology expert.

What about the other three seats for “professionals in the field of science and technology?” Sadly, the board’s membership list shows that these seats are not actually held by technical people. Barbara Arnwine holds the House Speaker’s seat, Tom Fuentes holds the House Minority Leader’s seat, and Wesley R. Kliner, Jr. holds the Senate Minority Leader’s seat. All three appear to be accomplished people who have something to offer on the board. But as far as I can tell they are not “professionals in the field of science and technology”, so their appropriate positions on the board would be somewhere in the other thirty-three seats.

What can be done? I wouldn’t go so far as to kick any current members off the board, even if that were possible. But when vacancies do become available, they should be filled by scientists or technologists, as dictated by the charter’s requirement of having four such people on the board.

The EAC is struggling with voting technology issues. They could surely use the advice of three more expert technologists.

Back when we were putting together the grant proposal for ACCURATE, one of the questions that we asked ourselves, and which the NSF people asked us as well, was whether we would produce a “bright shiny object,” which is to say whether we would produce a functional voting machine that could ostensibly be put to use in a real election. Our decision at the time, and it was certainly the correct decision, was that we would focus on innovating in the technology under the covers of a voting system, and that we might produce, at most, “research prototypes”. The differences between a research prototype and a genuine commercial system are typically quite substantial, and it would be no different here with voting system prototypes.

At Rice we built a fairly substantial prototype that we call “VoteBox”; you can read more about it in a paper that will appear on Friday at Usenix Security. To grossly summarize, our prototype feels a lot like a normal DRE voting system, but uses some nice cryptographic machinery to ensure that you don’t have to trust that the code is correct. You can verify the correctness of a machine, on the fly, while the election is ongoing. Our prototype is missing a couple features that you’d want from a commercial system, like write-in voting, but it’s far enough along that it’s been used in several human-factors experiments (CHI’08, Everett’07).
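One simple building block behind this kind of on-the-fly verifiability is a tamper-evident hash chain over ballot events: each log entry commits to everything that came before it, so altering an earlier record breaks every later link. The sketch below is a toy illustration of that idea only, not VoteBox’s actual code or cryptographic design; the record format is invented.

```python
import hashlib

def chain_hash(prev_hash: str, record: str) -> str:
    """Hash the previous link together with the new record."""
    return hashlib.sha256((prev_hash + record).encode()).hexdigest()

def build_log(records):
    """Build a hash-chained log; each entry commits to all earlier ones."""
    h = "0" * 64  # well-known genesis value
    log = []
    for rec in records:
        h = chain_hash(h, rec)
        log.append((rec, h))
    return log

def verify_log(log):
    """Recompute the chain; tampering with any earlier entry breaks it."""
    h = "0" * 64
    for rec, stored in log:
        h = chain_hash(h, rec)
        if h != stored:
            return False
    return True

log = build_log(["ballot-cast:001", "ballot-cast:002"])
assert verify_log(log)
log[0] = ("ballot-cast:999", log[0][1])  # tamper with an earlier entry
assert not verify_log(log)
```

Because anyone can recompute the chain, an observer can check a machine’s published log during the election rather than trusting the software that produced it.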

This summer, our mission is to get this thing shipped as some sort of “open source” project. Now we have several goals in this:

Allow other researchers to benefit from our infrastructure as a platform to do their own research.

All well and good. Now the question is how we should actually license the thing. There are many, many different models under which we could proceed:

Closed source + patents + licenses. This may or may not yield revenues for our university, and may or may not be attractive to vendors. It’s clearly unattractive to other researchers, and would limit uptake of our system in places where we might not even think to look, such as outside the U.S.

Open source + a “not for commercial use” license. This makes it a little easier for other researchers to pick up and modify the software, although ownership issues could get tricky.

Open source with a “BSD”-style license. A BSD-style license effectively says “do whatever you want, just give us credit for our work and you’re on your own if it doesn’t work.” This sort of license tends to maximize the ease with which companies can adopt your work.

Open source with a “GPL”-style license. The GPL has an interesting property for the voting system world: it makes any derivatives of the source code as open as the original code (unless a vendor reimplements it from scratch). This sort of openness is something we want to encourage in voting system vendors, but it might reduce vendor willingness to use the codebase.

Open source with a “publication required” license. Joe Hall suggested this as another option. As with a BSD license, anybody could use the code in a product, but the company would be compelled to publish the source code, like a book. Unlike the GPL, they would not be required to give up copyright or allow any downstream use of their code.

I did a quick survey of several open source voting systems. Most are distributed under the GPL, though there are exceptions:

Civitas is distributed under a non-commercial-use only license. VoteHere, at one point, opened its code for independent evaluation (but not reuse), but I can’t find it any more. It was probably a variant on the non-commercial-use only theme. Punchscan is distributed under a BSD-style license.

My question to the peanut gallery: what sort of license would you select for a bright, shiny new voting system project and why?

[Extra food for thought: The GPLv3 would have some interesting interactions with voting systems. For starters, there's a question of who, exactly, a "user" might be. Is that the county clerk whose office bought it, or the person who ultimately votes on it? Also, you have section 3, which forbids any attempt to limit reverse-engineering or "circumvention" of the product. I suppose that means that garden-variety tampering with a voting machine would still violate various laws, but the vendor couldn't sue you for it. Perhaps more interesting is section 6, which talks about how source code must be made available when you ship compiled software. A vendor could perhaps give the source code only to its "customers" without giving it to everybody (again, depending on who a "user" is). Of course, any such customer is free under the GPL to redistribute the code to anybody else. Voting vendors may be most scared away by section 11, which talks about compulsory patent licensing (depending, of course, on the particulars of their patent portfolios).]

People have been heaping blame on Yahoo after it announced plans to shut down its Yahoo Music Store DRM servers on September 30. The practical effect of the shutdown is to make music purchased at the store unusable after a while.

Though savvy customers tended to avoid buying music in forms like this, where a company had to keep some distant servers running to keep the purchased music alive, those customers who did buy – taking reassurances from Yahoo and the music industry at face value – are rightly angry. In the face of similar anger, Microsoft backtracked on plans to shutter its DRM servers. It looks like Yahoo will stay the course.

Yahoo deserves blame here, but let’s not forget who else contributed to this mess. Start with the record companies for pushing this kind of DRM, and the DRM agenda generally, despite the ample evidence that it would inconvenience paying customers without stopping infringement.

Even leaving aside past mistakes, copyright owners could step in now to help users, either by enticing Yahoo to keep its servers running, or by helping Yahoo create and distribute software that translates the music into a usable form. If I were a Yahoo Music customer, I would be complaining to the copyright owners now, and asking them to step in and stand behind their product.

Finally, let’s not forget the role of Congress. The knowledge of how to jailbreak Yahoo Music tracks and transform them into a stable, usable form exists and could easily be packaged in software form. But Congress made it illegal to circumvent Yahoo’s DRM, even to enable noninfringing use of a legitimately purchased song. And they made it illegal to distribute certain software tools to enable those uses. If Congress had paid more attention to consumer interests in drafting the Digital Millennium Copyright Act, or if it had passed any of the remedial legislation offered since the DMCA took effect, then the market could solve this Yahoo problem all on its own. If I were a Yahoo Music customer, I would be complaining to Congress now, and asking them to stop blocking consumer-friendly technologies.

And needless to say, I wouldn’t be buying DRM-encumbered songs any more.

UPDATE (July 29, 2008): Yahoo has now done the right thing, offering to give refunds or unencumbered MP3s to the stranded customers. I wonder how much this is costing Yahoo.

Recently Barack Obama gave a speech on security, focusing on nuclear, biological, and infotech threats. It was a good, thoughtful speech, but I couldn’t help noticing how, in his discussion of the infotech threats, he promised to appoint a “National Cyber Advisor” to give the president advice about infotech threats. It’s now becoming standard Washington parlance to say “cyber” as a shorthand for what many of us would call “information security.” I won’t fault Obama for using the terminology spoken by the usual Washington experts. Still, it’s interesting to consider how Washington has developed its own terminology, and what that terminology reveals about the inside-the-beltway view of the information security problem.

The word “cyber” has interesting roots. It started with an old Greek word meaning (roughly) one who guides a boat, such as a pilot or rudder operator. Plato adapted this word to mean something like “governance”, on the basis that governing was like steering society. Already in ancient Greece, the term had taken on connotations of central government control.

Fast-forward to the twentieth century. Norbert Wiener foresaw the rise of sophisticated robots, and realized that a robot would need something like a brain to control its mechanisms, as your brain controls your body. Wiener predicted correctly that this kind of controller would be difficult to design and build, so he sought a word to describe the study of these “intelligent” controllers. Not finding a suitable word in English, he reached back to the old Greek word, which he transliterated into English as “cybernetics”. Notice the connection Wiener drew between governance and technological control.

Enter William Gibson. In his early novels about the electronic future, he wanted a term for the “space” where online interactions happen. Failing to find a suitable word, he coined one – cyberspace – by borrowing “cyber” from Wiener. Gibson’s 1984 novel Neuromancer popularized the term. Many of the Net’s early adopters were fans of Gibson’s work, so cyberspace became a standard name for the place you went when you were on the Net.

The odd thing about this usage is that the Internet lacks the kind of central control system that is the subject matter of cybernetics. Gibson knew this – his vision of the Net was decentralized and chaotic – but he liked the term anyway.

As Gibson later said: “All I knew about the word ‘cyberspace’ when I coined it, was that it seemed like an effective buzzword. It seemed evocative and essentially meaningless. It was suggestive of something, but had no real semantic meaning, even for me, as I saw it emerge on the page.”

Indeed, the term proved just as evocative for others as it was for Gibson, and it stuck.

As the Net grew, it was widely seen as ungovernable – which many people liked. John Perry Barlow’s “Declaration of Independence of Cyberspace” famously declared that governments have no place in cyberspace. Barlow notwithstanding, government did show up in cyberspace, but it has never come close to the kind of cybernetic control Wiener envisioned.

Meanwhile, the government’s security experts settled on a term, “information security”, or “infosec” for short, to describe the problem of securing information and digital systems. The term is widely used outside of government (along with similar terms “computer security” and “network security”) – the course I teach at Princeton on this topic is called “information security”, and many companies have Chief Information Security Officers to manage their security exposure.

So how did this term “cybersecurity” get mindshare, when we already had a useful term for the same thing? I’m not sure – give me your theories in the comments – but I wouldn’t be surprised if it reflects a military influence on government thinking. As both military and civilian organizations became wedded to digital technology, the military started preparing to defend certain national interests in an online setting. Military thinking on this topic naturally followed the modes of thought used for conventional warfare. Military units conduct reconnaissance; they maneuver over terrain; they use weapons where necessary. This mindset wants to think of security as defending some kind of terrain – and the terrain can only be cyberspace. If you’re defending cyberspace, you must be doing something called cybersecurity. Over time, “cybersecurity” somehow became “cyber security” and then just “cyber”.

Listening to Washington discussions about “cyber”, we often hear strategies designed to exert control or put government in a role of controlling, or at least steering, the evolution of technology. In this community, at least, the meaning of “cyber” has come full circle, back to Wiener’s vision of technocratic control, and Plato’s vision of government steering the ship.

Public policy, in the U.S. at least, has favored localism in broadcasting: programming on TV and radio stations is supposed to be aimed, at least in part, at the local community. Two recent events call this policy into question.

The first event is the debut of the Pandora application on the iPhone. Pandora is a personalized “music radio” service delivered over the Internet. You tell it which artists and songs you like, and it plays you the requested songs, plus other songs it thinks are similar. You can rate the songs it plays, thereby giving it more information about what you like. It’s not a jukebox – you can’t find out in advance what it’s going to play, and there are limits on how often it can play songs from the same artist or album – but it’s more personalized than broadcast radio. (Last.fm offers a similar service, also available now on the iPhone.)

Now you can get Pandora on your iPhone, so you can listen to Pandora on a battery-powered portable device that fits in your pocket – like a twenty-first century version of the old transistor radios, only this one plays a station designed especially for you. Why listen to music on broadcast radio when you can listen to this? Or to put it another way: why listen to music targeted at people who live near you, when you can listen to music targeted at people with tastes like yours?

The second event I’ll point to is a statement from a group of Christian broadcasters, opposing a proposed FCC rule that would require radio stations to have local advisory boards that tell them how to tailor programming to the local community. [hat tip: Ars Technica] The Christian stations say, essentially, that their community is defined by a common interest rather than by geography.

Many people are like the Pandora or Christian radio listeners, in wanting to hear content aimed at their interests rather than just their location. Public policy ought to recognize this and give broadcasters more latitude to find their own communities rather than defining communities only by geography.

Now I’m not saying that there shouldn’t be local programming, or that people shouldn’t care what is happening in their neighborhoods. Most people care a lot about local issues and want some local programming. The local community is one of their communities of interest, but it’s not the only one. Let some stations serve local communities while others serve non-local communities. As long as there is demand for local programming – as there surely will be – the market will provide it, and new technologies will help people get it.

Indeed, one of the benefits of new technologies is that they let people stay in touch with far-away localities. When we were living in Palo Alto during my sabbatical, we wanted to stay in touch with events in the town of Princeton because we were planning to move back after a year. Thanks to the Web, we could stay in touch with both Palo Alto and Princeton. The one exception was that we couldn’t get New Jersey TV stations. We had satellite TV, so the nearby New York and Philadelphia stations were literally being transmitted to our Palo Alto house; but the satellite TV company said the FCC wouldn’t let us have those stations, because localist policy wanted us to watch San Francisco stations instead. Localist policy, perversely, pushed us away from local programming and kept us out of touch.

New technologies undermine the rationale for localist policies. It’s easier to get far-away content now – indeed the whole notion that content is bound to a place is fading away. With access to more content sources, there are more possible venues for local programming, making it less likely that local programming will be unavailable because of the whims or blind spots of a few station owners. It’s getting easier and cheaper to gather and distribute information, so more people have the means to produce local programming. In short, we’re looking at a future with more non-local programming and more local programming.

NXP, which makes the Mifare transit cards used in several countries, has sued Radboud University Nijmegen (in the Netherlands), to block publication of a research paper, “A Practical Attack on the MIFARE Classic,” that is scheduled for publication at the ESORICS security conference in October. The new paper reportedly shows fatal security flaws in NXP’s Mifare Classic, which appears to be the world’s most commonly used contactless smartcard.

I wrote back in January about the flaws found by previous studies of Mifare. After the previous studies, there wasn’t much left to attack in Mifare Classic. The new paper, if its claims are correct, shows that it’s fairly easy to defeat Mifare Classic completely.

It’s not clear what legal argument NXP is giving for trying to suppress the paper. There was a court hearing last week in Arnhem, but I haven’t seen any reports in the English-language press. Perhaps a Dutch-speaking reader can fill in more details. An NXP spokesman has called the paper “irresponsible”, but that assertion is hardly a legal justification for censoring the paper.

Predictably, a document purporting to be the censored paper showed up on Wikileaks, and BoingBoing linked to it. Then, for some reason, it disappeared from Wikileaks, though BoingBoing commenters quickly pointed out that it was still available in Google’s cache of Wikileaks, and also at Cryptome. But why go to a leak-site? The same article has been available on the Web all along at arXiv, a popular repository of sci/tech research preprints run by the Cornell University library.

[UPDATE (July 15): It appears that Wikileaks had the wrong paper, though one that came from the same Radboud group. The censored paper is called "Dismantling Mifare Classic".]

As usual in these cases of censorship-by-lawsuit, it’s hard to see what NXP is trying to achieve with the suit. The research is already done and peer-reviewed. The suit will only broaden the paper’s readership. NXP’s approach will alienate the research community. The previous Radboud paper already criticizes NXP’s approach, in a paragraph written before the lawsuit:

We would like to stress that we notified NXP of our findings before publishing our results. Moreover, we gave them the opportunity to discuss with us how to publish our results without damaging their (and their customers) immediate interests. They did not take advantage of this offer.

What is really puzzling here is that the paper is not a huge advance over what has already been published. People following the literature on Mifare Classic – a larger group, thanks to the lawsuit – already know that the system is unsound. Had NXP reacted responsibly to this previous work, admitting the Mifare Classic problems and getting to work on migrating customers to newer, more secure products, none of this would have been necessary.

You’ve got to wonder what NXP was thinking. The lawsuit is almost certain to backfire: it will only boost the audience of the censored paper and of other papers criticizing Mifare Classic. Perhaps some executive got angry and wanted to sue the university out of spite. Things can’t be comfortable in the executive suite: NXP’s failure to get in front of the Mifare Classic problems will (rightly) erode customers’ trust in the company and its products.

UPDATE (July 18): The court ruled against NXP, so the researchers are free to publish. See Mrten’s comment below.

On Tuesday, the Houston Chronicle published a story about the salaries of local government employees. Headlined “Understaffing costs Houston taxpayers $150 million in overtime,” it was in many respects a typical piece of local “enterprise” journalism, where reporters go out and dig up information that the public might not already be aware is newsworthy. The story highlighted short staffing in the police department, which has too few workers for all the protection it is required to provide the citizens of Houston.

The print story used summaries and cited a few outliers, like a police sergeant who earned $95,000 in overtime. But the reporters had much more data: using Texas’s strong Public Information Act, they obtained electronic payroll data on 81,000 local government employees—essentially the entire workforce. Rather than keep this larger data set to themselves, as they might have done in a pre-Internet era, they posted the whole thing online. The notes to the database say that the Chronicle obtained even more information than it displays, and that before republishing the data, the newspaper “lumped together” what it obliquely describes as “wellness and termination pay” into each employee’s reported base salary.

The editors understand this might be controversial: “But this information already is available to anyone who wants to see it. We’re only compiling it in a central location, and following a trend at other news organizations publishing databases. We hope readers will find the information interesting, and, even better, perhaps spot some anomalies we’ve missed.”

The value proposition here seems plausible: Among the 81,000 payroll records that have just been published, there very probably are news stories of legitimate public interest, waiting to be uncovered. Moreover (given that the Chronicle, like everyone else in the news business, is losing staff) it’s likely that crowdsourcing the analysis of this data will uncover things the reporting staff would have missed.

But it also seems likely that this release of data, by making it overwhelmingly convenient to unearth the salary of any government worker in Houston, will have a raft of side effects—where by “side” I mean that they weren’t intended by the Chronicle. For example, it’s now easy as pie for any nonprofit that raises funds from public employees in Houston to get a sense of the income of their prospects. Comparing other known data, such as approximate home values or other visible spending patterns, with information about salary can allow inferences about other sources of income. In fact, you might argue that this method—researching the home value for every real estate transaction related to a city worker and combining it with salary information—would be an extraordinary screening mechanism for possible corruption: those who buy above what their salary suggests they can afford must have additional income, and corruption is presumably one major reason why (generally low-paid) government workers are sometimes able to live beyond their apparent means.
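To make the screening idea concrete, here is a toy sketch of joining the two public datasets. Every name, salary, home price, and the affordability threshold below are invented for illustration; real screening would need far more care about data quality and legitimate sources of wealth.

```python
# Hypothetical records; all names and figures are invented.
salaries = {"A. Smith": 42000, "B. Jones": 51000, "C. Lee": 47000}
home_purchases = {"A. Smith": 610000, "B. Jones": 180000, "C. Lee": 200000}

AFFORDABILITY_RATIO = 5  # assume a home above 5x salary merits a closer look

def flag_anomalies(salaries, purchases, ratio=AFFORDABILITY_RATIO):
    """Join the two datasets and flag purchases far above salary."""
    flags = []
    for name, price in purchases.items():
        salary = salaries.get(name)
        if salary is not None and price > ratio * salary:
            flags.append(name)
    return flags

print(flag_anomalies(salaries, home_purchases))  # ['A. Smith']
```

The point is not that such a join is sophisticated, but that it is now trivial: once both datasets are a download away, this kind of cross-referencing takes minutes rather than months of courthouse visits.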

More generally, it seems like there is a new world of possible synergies opened up by the wide release of this information. We almost certainly haven’t thought of all the consequences that will turn out, in retrospect, to be serious.

Houston isn’t the first place to try this—it turns out that the salaries of faculty at state schools are often quietly available for download as well, for example—but it seems to highlight a real problem. It may be good for the salaries of all public employees to be a click away, but the laws that make this possible generally weren’t passed in the last ten years, and therefore weren’t drafted with the web in mind. The legislative intent reflected in most of our current statutes, when a piece of information is statutorily required to be publicly available, is that citizens should be able to get the information by obtaining, filling out, and mailing a form, or by making a trip to a particular courthouse or library. Those small obstacles made a big difference, as their recent removal reveals: Information that you used to need a good reason to justify the cost of obtaining is now worth retrieving for the merest whim, on the off chance that it might be useful or interesting. And massive projects that require lots of retrieval, which used to be entirely impractical, can now make sense in light of any of a wide and growing range of possible motivations.

Put another way: As technology evolves, the same public information laws create novel and in some cases previously unimaginable levels of transparency. In many cases, particularly those related to the conduct of top public officials, this seems to be a clearly good thing. In others, particularly those related to people who are not public figures, it may be more of a mixed blessing or even an outright problem. I’m reminded of the “candidates” of ancient Rome—the Latin word candidatus literally means “clothed in white robes,” which would-be officeholders wore to symbolize the purity and fitness for office they claimed to possess. By putting themselves up for public office, they invited their fellow citizens to hold them to higher standards. This logic still runs strong today—for example, under the Supreme Court’s Sullivan precedent, public figures face a heightened burden if they try to sue the press for libel after critical coverage.

I worry that some kinds of progress in information technology are depleting a kind of civic ozone layer. The policy solutions here aren’t obvious—one shudders to think of a government office with the power to foreclose new, unforeseen transparencies—but it at least seems like something that legislators and their staffs ought to keep an eye on.

On July 2nd, Viacom’s lawsuit against Google’s YouTube unit saw a significant ruling, potentially troubling for user privacy. Viacom asked for, and judge Louis L. Stanton ordered Google to turn over, the logs of each viewing of all videos in the YouTube database, showing the username and IP address of the user who was viewing the video, a timestamp, and a code identifying the video. The judge found that Viacom “need[s] the data to compare the relative attractiveness of allegedly infringing videos with that of non-infringing videos.” The fraction of views that involve infringing video bears on Viacom’s claim that Google should have vicarious copyright liability–if the infringing videos appear to be an important draw for YouTube users, this implies a financial benefit to Google from the infringement, which would weigh in favor of a claim of vicarious liability.

As Doug Tygar has observed, the judge’s optimistic belief that disclosure of these logs won’t harm privacy seems to be based in part on the conflicting briefs of the parties. Viacom, desiring the logs, told the judge that “the login ID is an anonymous pseudonym that users create for themselves when they sign up with YouTube” which without more data “cannot identify specific individuals.” After quoting this claim and noting that Google did not refute it, the judge goes on to quote a Google employee’s blog post arguing that “in most cases, an IP address without additional information cannot” identify a particular user.

Each of these claims–first, that the login IDs of users are anonymous pseudonyms, and second, that IP addresses alone don’t suffice to identify individuals–is debatable. I haven’t reviewed the briefs that led Judge Stanton to believe each of the assertions. I suppose that his conclusions are reasonable in light of the material presented. It might be the case that the briefs should have led him to a different conclusion. Then again, as the blog post quoted above suggests, Google has at times found itself downplaying the privacy risks associated with certain data. A victory in this argument, causing the judge to take a more expansive view of the possible privacy harms, might have been a mixed blessing for Google in the longer run.

In any case, when he combined the two claims to compel the turnover of the logs, the judge made a significant mistake of his own. Even granting for the sake of argument that login IDs alone don’t compromise privacy, and that IP addresses alone also don’t compromise privacy, it doesn’t follow that the two combined are equally innocuous. Earlier cases like the AOL debacle have shown us that information that may seem privacy-safe in isolation can be privacy-compromising when it is combined. The fact of combination–the fact that some viewing by a particular login ID happened at a certain IP address, and conversely that a viewing from a particular IP address occurred under the login of a particular user–is itself a potentially important further piece of information. If the judge considered this further privacy risk in the combination of IPs and login IDs, I could find no evidence of such consideration in his ruling.
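A toy illustration of why the combination matters: each dataset below is arguably “anonymous” on its own, but joining them ties a pseudonym to a real-world identity. The log entries, IP-to-household mapping, and all identifiers are invented for illustration.

```python
# Hypothetical viewing log: (login_id, ip_address, video_id).
# On its own, the login is a pseudonym and the IP is just a number.
viewing_log = [
    ("moviefan42", "198.51.100.7", "vid123"),
    ("anon99", "203.0.113.5", "vid456"),
]

# Separately, suppose some IPs can be tied to households (say, via ISP
# records or another site's logs). This mapping is invented.
ip_to_household = {"198.51.100.7": "123 Main St."}

def link(log, ip_map):
    """Joining the two sources connects a pseudonym to an identity."""
    return [(login, ip_map[ip], video)
            for login, ip, video in log if ip in ip_map]

print(link(viewing_log, ip_to_household))
# [('moviefan42', '123 Main St.', 'vid123')]
```

Once one viewing links a pseudonym to a household, every other record under that pseudonym is effectively de-anonymized too, which is exactly the compounding risk the ruling does not address.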

Google wants to be permitted to modify the data to reduce the privacy risk before handing it over to Viacom, but it’s not yet clear what agreement, if any, the parties will reach that would do more to protect privacy than Judge Stanton’s ruling requires. It’s also not yet apparent exactly how the judge’s protective order will be constructed. But if the logs are turned over unaltered, as they may yet be, the result could be significant risk: YouTube’s users would then face extreme privacy harm in the event that the data were to leak from Viacom’s possession.

[As always, this post is the opinion of the author (David Robinson) only.]

Last week, I testified before the Texas House Committee on Elections (you can read my testimony). I’ve done this many times before, but I figured this time would be different. This time, I was armed with the research from the California “Top to Bottom” reports and the Ohio EVEREST reports. I was part of the Hart InterCivic source code team for California’s analysis. I knew the problems. I was prepared to discuss them at length.

Wow, was I disappointed. Here’s a quote from Peter Lichtenheld, speaking on behalf of Hart InterCivic:

Security reviews of the Hart system as tested in California, Colorado, and Ohio were conducted by people who were given unfettered access to code, equipment, tools and time and they had no threat model. While this may provide some information about system architecture in a way that casts light on questions of security, it should not be mistaken for a realistic approximation of what happens in an election environment. In a realistic election environment, the technology is enhanced by elections professionals and procedures, and those professionals safeguard equipment and passwords, and physical barriers are there to inhibit tampering. Additionally, jurisdiction ballot count, audit, and reconciliation processes safeguard against voter fraud.

You can find the whole hearing online (via RealAudio streaming), where you will hear the Diebold/Premier representative, as well as David Beirne, the director of their trade organization, saying essentially the same thing. Since this seems to be the voting system vendors’ party line, let’s spend some time analyzing it.

Did our work cast light on questions of security? Our work found a wide variety of flaws, most notably the possibility of “viral” attacks, where a single corrupted voting machine could spread that corruption, as part of regular processes and procedures, to every other voting system. In effect, one attacker, corrupting one machine, could arrange for every voting system in the county to be corrupt in the subsequent election. That’s a big deal.
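The viral pattern described above is worth making concrete. Here is a minimal sketch (machine names and counts are hypothetical, and this is a simplification of the reports' analysis, not vendor code): routine data flows do the spreading, with one corrupted terminal infecting the county back-end during results upload, and the back-end then infecting every terminal it programs for the next election.

```python
# Sketch of a "viral" attack through normal election workflows.
# Hypothetical names/counts; illustrates the pattern, not any real code.

def run_election_cycle(machines, backend_infected):
    # Step 1: every machine syncs its results to the county back-end.
    # One infected machine is enough to corrupt the back-end.
    if any(machines.values()):
        backend_infected = True
    # Step 2: before the next election, the back-end programs every
    # machine, passing the corruption along to all of them.
    if backend_infected:
        machines = {name: True for name in machines}
    return machines, backend_infected

machines = {f"machine_{i}": False for i in range(500)}
machines["machine_0"] = True  # attacker tampers with a single machine
machines, backend = run_election_cycle(machines, backend_infected=False)
print(sum(machines.values()))  # 500 -- every machine is now corrupt
```

The point of the sketch is that the attacker touches exactly one machine; the election's own processes and procedures deliver the corruption everywhere else.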

At this point, the scientific evidence is in, it’s overwhelming, and it’s indisputable. The current generation of DRE voting systems has a wide variety of dangerous security flaws. There’s simply no justification for the vendors to be making excuses or otherwise downplaying the clear scientific consensus on the quality of their products.

Were we given unfettered access? The big difference between what we had and what an attacker might have is that we had some (but not nearly all) source code to the system. An attacker who arranged for some equipment to “fall off the back of a truck” would be able to extract all of the software, in binary form, and then would need to go through a tedious process of reverse engineering before reaching parity with the access we had. The lack of source code has demonstrably failed to do much to slow down attackers who find holes in other commercial software products. Debugging and decompilation tools are really quite sophisticated these days. All this means is that an attacker would need additional time to do the same work that we did.

Did we have a threat model? Absolutely! See chapter three of our report, conveniently titled “Threat Model.” The different teams working on the top to bottom report collaborated to draft this chapter. It talks about attackers’ goals, levels of access, and different variations on how sophisticated an attacker might be. It is hard to accept that the vendors can get away with claiming that the reports did not have a threat model, when a simple check of the table of contents of the reports disproves their claim.

Was our work a “realistic approximation” of what happens in a real election? When the vendors call our work “unrealistic”, they usually mean one of two things:

Real attackers couldn’t discover these vulnerabilities.

These vulnerabilities couldn’t be exploited in the real world.

Both of these arguments are wrong. In real elections, individual voting machines are not terribly well safeguarded. In a studio where I take swing dance lessons, I found a rack of eSlates two weeks after the election in which they were used. They were in their normal cases. There were no security seals. (I didn’t touch them, but I did have a very good look around.) That’s more than sufficient access for an attacker wanting to tamper with a voting machine. Likewise, Ed Felten has a series of Tinker posts about unguarded voting machines in Princeton.

Can an attacker learn enough about these machines to construct the attacks we described in our report? This sort of thing would need to be done in private, where a team of smart attackers could carefully reverse engineer the machine and piece together the attack. I’ll estimate that it would take a group of four talented people, working full time, two to three months of effort to do it. Once. After that, you’ve got your evil attack software, ready to go, with only minutes of effort to boot a single eSlate, install the malicious software patch, and then it’s off to the races. The attack would only need to be installed on a single eSlate per county in order to spread to every other eSlate. The election professionals and procedures would be helpless to prevent it. (Hart has a “hash code testing” mechanism that’s meant to determine if an eSlate is running authentic software, but it’s trivial to defeat. See issues 9 through 12 in our report.)
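Why is a self-reported "hash code test" so easy to defeat? Here is a sketch (hypothetical code, not Hart's actual protocol) of the basic problem: the device itself computes and reports the hash, so malicious firmware can simply report the precomputed hash of the authentic image it replaced.

```python
import hashlib

# Sketch of why self-attestation by hash proves little.
# Hypothetical firmware images; not any vendor's real protocol.

AUTHENTIC_IMAGE = b"authentic voting firmware v1.0"
EXPECTED_HASH = hashlib.sha256(AUTHENTIC_IMAGE).hexdigest()

class HonestDevice:
    firmware = AUTHENTIC_IMAGE
    def report_hash(self):
        # Honestly hashes whatever is actually installed.
        return hashlib.sha256(self.firmware).hexdigest()

class CompromisedDevice:
    firmware = b"malicious firmware"
    def report_hash(self):
        # Lies: returns the stored hash of the genuine image instead
        # of hashing the firmware that is actually running.
        return EXPECTED_HASH

for device in (HonestDevice(), CompromisedDevice()):
    print(device.report_hash() == EXPECTED_HASH)  # True both times
```

Both devices pass the check, which is why the test tells you nothing once the software answering the question is itself under the attacker's control.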

What about auditing, reconciliation, “logic and accuracy” testing, and other related procedures? Again, all easily defeated by a sophisticated attacker. Generally speaking, there are several different kinds of tests that DRE systems support.

“Self-tests” are trivial for malicious software to detect, allowing it either to disable the test and fake its results, or simply to behave correctly. Most “logic and accuracy” tests boil down to casting a handful of votes for each candidate and then doing a tally; malicious software might simply behave correctly until more than a handful of votes have been received, or it might look at the clock and behave correctly unless it’s the actual election day.

Parallel testing means pulling machines out of service and casting what appear to be completely normal votes on them while the real election is ongoing. This may or may not detect malicious software, but nobody in Texas does parallel testing.

Auditing and reconciliation are all about comparing different records of the same event. If you’ve got a voter-verified paper audit trail (VVPAT) attachment to a DRE, you can compare it with the electronic records. Texas has not yet certified any VVPAT printers, so those won’t help here. (The VVPAT printers sold by current DRE vendors have other problems, but that’s a topic for another day.) That leaves only the “redundant” memories in the DREs to audit or reconcile, and our work shows why this redundancy is unhelpful against security threats: malicious code will simply modify all of the copies in synchrony.
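The test-evasion logic described above is simple enough to sketch directly (hypothetical thresholds and dates; this illustrates the evasion pattern, not any real machine's code): malicious tally code behaves honestly unless it is really election day and a realistic number of votes has already been cast, so pre-election tests see only correct behavior.

```python
from datetime import date

# Sketch of how malicious tally code evades "logic and accuracy" tests.
# Hypothetical date and threshold; pattern only, not real vendor code.

ELECTION_DAY = date(2008, 11, 4)

def record_vote(tally, candidate, votes_so_far, today):
    # A handful of votes, or the wrong date, smells like a test:
    # behave honestly so the test passes.
    looks_like_a_test = votes_so_far < 50 or today != ELECTION_DAY
    if looks_like_a_test:
        tally[candidate] = tally.get(candidate, 0) + 1
    else:
        # Real election conditions: silently steal the vote.
        favored = "attacker's choice"
        tally[favored] = tally.get(favored, 0) + 1
    return tally

# During a pre-election test (few votes, wrong date): honest behavior.
tally = record_vote({}, "candidate_a", votes_so_far=3,
                    today=date(2008, 10, 1))
print(tally)  # {'candidate_a': 1}
```

A dozen test ballots cast in October will tally perfectly; the same code misbehaves only under conditions the tests never reproduce, which is exactly why parallel testing on election day is the one procedure with a chance of catching it.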

Later, the Hart representative remarked:

The Hart system is the only system approved as-is for the November 2007 general election after the top to bottom review in California.

This line of argument depends on the fact that most of Hart’s customers will never bother to read our actual report. As it turns out, this was largely true in the initial rules from the CA Secretary of State, but you need to read the current rules, which were released several months later. The new rules, in light of the viral threat against Hart systems, require the back-end system (“SERVO”) to be rebooted after each and every eSlate is connected to it. That’s hardly “as-is”. If you have thousands of eSlates, properly managing an election with them will be exceptionally painful. If you only have one eSlate per precinct, as California required for the other vendors, with most votes cast on optical-scanned paper ballots, you would have a much more manageable election.

What’s it all mean? Unsurprisingly, the vendors and their trade organization are spinning the results of these studies, as best they can, in an attempt to downplay their significance. Hopefully, legislators and election administrators are smart enough to grasp the vendors’ behavior for what it actually is and take appropriate steps to bolster our election integrity.

Until then, the bottom line is that many jurisdictions in Texas and elsewhere in the country will be using e-voting equipment this November with known security vulnerabilities, and the procedures and controls they are using will not be sufficient to either prevent or detect sophisticated attacks on their e-voting equipment. While there are procedures with the capability to detect many of these attacks (e.g., post-election auditing of voter-verified paper records), Texas has not certified such equipment for use in the state. Texas’s DREs are simply vulnerable to and undefended against attacks.

CORRECTION: In the comments, Tom points out that Travis County (Austin) does perform parallel tests. Other Texas counties don’t. This means that some classes of malicious machine behavior could potentially be discovered in Travis County.

Freedom to Tinker is hosted by Princeton's Center for Information Technology Policy, a research center that studies digital technologies in public life. Here you'll find comment and analysis from the digital frontier, written by the Center's faculty, students, and friends.