Ben Z. Yuan - How to Make Almost Anything

Project Log: A Hardware Password Manager

Here’s a running log of project-related content.

Week 1: ideas

Project

For my final project I’m looking to build a hardware password manager.

Why

Existing software password managers, like LastPass, 1Password, KeePass, …,
are very good at managing all of our network service passwords, so we can have
different passwords for Facebook, Twitter, LinkedIn, Google, Steam, and other
such websites. However, they suffer from one common problem: you have to be
logged in to some computer to access them. This means that you can’t
(effectively) use them to manage your terminal login passwords, since you
aren’t logged in when you need to type them!

At MIT, you use your Athena credentials to log in to many internal websites, but
you also use your Athena password to log yourself in to any one of the various
physical terminals on campus. If you don’t have a computer of your own, it
becomes harder to use a strong unique password for your Athena account like
you’re supposed to, because you can’t rely on KeePass etc. to remember it for
you.

You could create some application on a smartphone to keep your passwords. But
then you run into some problems:

You have to trust that your smartphone and its operating system aren’t
surreptitiously sending your passwords somewhere else.

You still have to transcribe and type your password in by hand - meaning that
someone looking over your shoulder can learn it!

You now have to secure your phone against people who want to “borrow” it.

We solve these issues by creating a dedicated hardware device that anyone with
access to fabrication tools should be able to assemble.

Design specifications

Required features

Emulate a USB keyboard.

Encrypt credentials on-device; require PIN unlock.

User interface on-device; all features available without separate software.
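To make the "emulate a USB keyboard" requirement concrete, here is a minimal host-side sketch of what typing a password looks like at the HID level. It assumes the standard 8-byte boot-keyboard report (modifier byte, reserved byte, six key slots) and a US layout; the helper names usage_for and reports_for are made up for illustration and only a small subset of characters is covered.

```python
# Sketch: turn a string into USB HID boot-keyboard reports.
# Report layout: [modifiers, reserved, key1..key6]; US layout assumed.

LEFT_SHIFT = 0x02  # modifier bit for the left shift key


def usage_for(ch):
    """Return (modifier, usage_id) for a small subset of ASCII."""
    if "a" <= ch <= "z":
        return 0, 0x04 + ord(ch) - ord("a")
    if "A" <= ch <= "Z":
        return LEFT_SHIFT, 0x04 + ord(ch) - ord("A")
    if "1" <= ch <= "9":
        return 0, 0x1E + ord(ch) - ord("1")
    if ch == "0":
        return 0, 0x27
    if ch == "\n":
        return 0, 0x28  # Enter
    raise ValueError(f"character {ch!r} not covered by this sketch")


def reports_for(text):
    """Yield a key-down report, then an all-zero key-up report, per character."""
    for ch in text:
        mod, usage = usage_for(ch)
        yield bytes([mod, 0, usage, 0, 0, 0, 0, 0])
        yield bytes(8)
```

On the real device these reports would be handed to the USB stack rather than collected in a list; the release report after every key is what lets repeated characters register as separate keystrokes.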

Things we like

Things we don’t like

Requires separate companion software to operate

Week 3: random numbers and UI

The common thread between existing known hardware password managers is that
they don’t really work without a computer to manage them: typically pre-existing
credentials are preloaded, or passwords are generated and stored with the
assistance of a software tool. Given the limited user interface provided by
such solutions, this approach makes sense. But this is How to Make Almost
Anything, and since we aren’t bound by size requirements as tight as “has to
fit in a USB flash drive”, we can do whatever we like.

So in order to be able to generate passwords on-device, we need a couple of
things:

A competent hardware random number generator.

A user interface powerful enough to manage many passwords.

Hardware RNG

Probably the canonical way to get “true” random noise is to grab the least
significant bit of something hard to predict. Depending on what you have access
to, this could be one of:

Some on-chip timer.

Some noisy environmental sensor.

Some noisy circuit (most famously, a reverse-biased diode junction).

The output from some dedicated crypto-processor that does one or more of
these things.
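As a rough illustration of the "grab the least significant bit" idea, the sketch below packs the LSB of each raw sample into bytes. The samples are assumed to be integer readings (e.g. from a timer or ADC); lsb_bytes is a hypothetical helper, and nothing here removes bias or correlation from the source.

```python
def lsb_bytes(samples):
    """Pack the least significant bit of each sample into bytes, MSB first.

    Leftover bits that do not fill a whole byte are dropped.
    """
    out = bytearray()
    acc = 0
    count = 0
    for s in samples:
        acc = (acc << 1) | (s & 1)  # shift in the newest low bit
        count += 1
        if count == 8:
            out.append(acc)
            acc = 0
            count = 0
    return bytes(out)
```

Eight samples yield one byte, so the raw bit rate is at most one bit per reading before any post-processing.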

People have done this sort of thing before, of course, and based on some
discussion, doing this well is apparently deep analog. Since I’m not trained in
analog, I decided to “directly” borrow the OneRNG’s avalanche noise circuit for
now (not replicating their layout directly yet, unless it turns out during
testing that I really rather should).

In a premonition of Week 5 I used KiCAD to create a schematic and traces for a
test board (images below), but time constraints prevented me from milling the
board this week. Also, I used 0804 templates by accident instead of 1206, so
assembly could be inadvertently more difficult than I wanted! (If this ends up
being a problem I can always redo the schematic.)

As an alternative, the infnoise claims to be more robust, but I didn’t have
time to try to replicate and planarise their schematic. If the OneRNG method
doesn’t work out, I might look at their board again.

UI considerations

For frequent PIN entry there is really no better user interface element than a
keyboard or keypad. Given that someone might enter e.g. an Athena terminal
login multiple times a day (while moving around campus, or logging in to
various network services), PIN entry should be as effortless as possible. This
would seem to preclude solutions like the Mooltipass’s scroll wheel with button,
but unfortunately places certain minimum restrictions on form factor size.

Grayhill makes a decent-looking 16-key matrix keypad. The unit price is $19.92,
which is a little unfortunate, and the keypad itself is 2.5” x 2.7”, which,
combined with a display device, would make the whole thing a bit bulky. But I
won’t know for sure until I prototype. Reading the matrix directly will require
8 digital GPIO pins (4 driven, 4 read back), or as few as 6 if I do some
multiplexing in hardware.
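For reference, scanning a 16-key matrix with 4 driven lines and 4 sensed lines can be sketched as below. This is a host-side simulation, not firmware: the key legend in KEYS, the read_columns callback, and fake_keypad are all assumptions made for illustration, and real hardware would also need debouncing.

```python
KEYS = ["123A", "456B", "789C", "*0#D"]  # assumed legend, row by row


def scan(read_columns):
    """Drive each row in turn and read back the 4 column inputs.

    read_columns(row) returns a 4-bit int; bit c is set when the key at
    (row, c) is held down.
    """
    pressed = []
    for row in range(4):
        cols = read_columns(row)
        for col in range(4):
            if cols & (1 << col):
                pressed.append(KEYS[row][col])
    return pressed


def fake_keypad(down):
    """Simulated hardware: `down` is a set of (row, col) keys being held."""
    def read_columns(row):
        bits = 0
        for r, c in down:
            if r == row:
                bits |= 1 << c
        return bits
    return read_columns
```

Only one row is driven at a time, which is why a 2-to-4 decoder on the row lines could bring the pin count down to 6.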

In terms of display, I’m really tempted by the various 128xFOO pixel OLED
breakouts sold by Adafruit. But these are expensive: a 2.5” wide 128x32 display
costs $40 as sold. I might have to settle for an ordinary HD44780-based module
($10 as sold), which is the 16x2 character display you see everywhere.

I briefly entertained doing the whole thing as a TFT touchscreen instead, but
I’m skeptical about the durability: it would seem easier to protect a smallish
display than a larger one, in the absence of whatever type of glass smartphones
use, keeping in mind that this device has to fit in a pocket.

Displays with serial support (e.g. SPI) need a minimum of like 5 or 6 digital
pins depending on the display type.

Next steps

I think I’ll attempt to mill the RNG test board (and stuff it if milling
succeeds), and also get a keypad module, display module, and pre-assembled
microcontroller to play with (and put together a loose-wire prototype of the
final product).

Will need to think about case related matters as well. Should be able to CAD up
an appropriate enclosure with OpenSCAD once I finalize the UI hardware.

Week 4: towards random numbers

I decided after designing the last test board that 1206 is a much better size
for surface-mount parts, and since the fab lab inventory stocks 1206 everywhere
I might as well redesign the board for that size.

Shop time is possibly the most difficult commodity to get, and I had real
problems getting into the shop this week because of constraints. But eventually
I was able to find enough time to mill the test board and start populating the
parts in stock.

We have some of the parts I need in the fab lab inventory but I’m still going
to have to order a few (in particular the MMBT3904s, the AP3015, the inductors,
and the odd-sized resistors).

Week 5: noise!

gavindar was kind enough to order enough of all the missing parts that I can
make a few of these circuits if they work out well.

I finished populating the board on Tuesday afternoon:

I attached the 3V3 and GND pins on the “left” header (closer to the AP3015)
to a benchtop power supply set to produce 3.3 volts. Sure enough, an
oscilloscope connected to the SIGNAL pin on the “right” header produces a lot
of noise:

This circuit effectively converts a constant 3V3 input into noise between ground
voltage and about +2ish volts. This should be well within the range of the AVR
on-board ADC. The circuit draws much less than 30 mA of current while operating,
putting it well within range of something like an MCP1825 voltage regulator.

Notice how the voltage rises seem to level off near the peaks, and generally
feel much shallower than the voltage drops.

This is a plausible noise source, but will need debiasing to be useful.
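One classic way to debias a bit stream is the von Neumann extractor, sketched below. This is not necessarily the method the final firmware uses; it is shown only to illustrate what "debiasing" means here. It assumes successive bits are independent, which a real avalanche source only approximates.

```python
def von_neumann(bits):
    """Von Neumann extractor: look at non-overlapping pairs of bits.

    A (0, 1) pair emits 0, a (1, 0) pair emits 1, and equal pairs are
    discarded. A trailing unpaired bit is dropped.
    """
    out = []
    it = iter(bits)
    for a in it:
        b = next(it, None)
        if b is None:
            break  # odd-length input: ignore the last bit
        if a != b:
            out.append(a)
    return out
```

The cost is throughput: even a perfectly fair input loses three quarters of its bits on average, and a heavily biased one loses more.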

Hallucinated microcontroller requirements

If I want to do all of this on one microcontroller then it probably wants to
end up being rather large.

I need the following pins:

8 digital pins (4 output, 4 input) for a keypad

somewhere less than 8 digital pins for a display

an ADC pin for an entropy source

I also want a USB controller.

It’s hard to say a priori how much Flash memory I’ll need. The AES library
from avr-crypto-lib takes up on the order of 2.7KB of Flash and 240 bytes of
SRAM. LUFA apparently takes up like 8KB of Flash on its own.

It might be sensible to start with the largest microcontroller we’re comfortable
working with, getting our thing to work there, and then think about scaling
down. In other words, we’ll want to “science” a board using an AT90USB1286.

Hand-soldering a TQFP64 could be a challenge. But the milling process
shouldn’t have a problem producing the necessary traces: the pin spacing
is well within our design rules.

I think I can borrow the Teensy++ 2.0 schematic
as a base schematic from which to work. (In the event I run out of time or get
too frustrated, I could also buy such a board outright, and wire some things
together, for a fragile prototype.)

We’re going to have a lot to learn in the next few weeks, that’s for sure!

An alternative is to use two smaller microcontrollers – one with an on-board
USB controller for handling the keyboard functions, and one handling everything
else – connecting the two with a sensible serial protocol. This would increase
the software complexity somewhat, as now I would have to write two separate
firmware files.

Week 7: 2-sided PCBs and final board.

I milled a double-sided PCB – specifically a 2-sided version of my noise board
that eliminates a 0-ohm resistor.

I also created a potential final board design.

Separately, I experimented with a candidate display, a candidate keypad, and
using the noise board as a hardware RNG source.

A double-sided board

On entropy and related matters

Rhys Weatherley provides some useful crypto libraries, including a working
entropy pool.

First I tried to update the pool on every ADC interrupt (manually scheduling
the next conversion inside the interrupt routine), but the ADC is way too
fast (tens of cycles) and I ended up mostly locking up the processor.

So, instead, I update the pool on a timer. It’s a fine balance trying to choose
a timeout value that keeps the main system responsive while filling up the
entropy pool in a timely fashion. This wouldn’t be a bad task for a
coprocessor.

rweather’s RNG, by default, sources entropy from timer jitter. This
semi-reliably provides 48 bits of entropy every few seconds. It can also take
avalanche noise of the type my noise board produces. I modified rweather’s
TransistorNoiseSource to work directly with the AD0 pin instead of going
through an analogRead(), since the ADC is needed for nothing else. This
source takes a significant amount of time (sometimes over a minute) to
calibrate from a cold start, but once it’s calibrated it can provide
high-quality entropy in varying block sizes at an impressive rate (much more
quickly than the timer source).

Since rweather’s RNG uses a 384-bit internal state (using ChaCha20 as a
mixing function), it can provide a maximum of 384 bits of true entropy
in a single operation. While the RNG can provide pseudo-random bits even after
entropy pool “depletion”, doing so increases the impact of RNG state compromise.
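The credit accounting described above can be illustrated with a toy pool. This is emphatically not rweather’s implementation: SHA-384 and SHAKE-256 stand in for the ChaCha20 mixing function so the sketch runs with the Python standard library alone, and ToyPool, stir, available, and rand are names invented here. Only the 384-bit cap is taken from the text.

```python
import hashlib

MAX_CREDITS = 384  # bits; mirrors the state size quoted above


class ToyPool:
    """Toy entropy pool that tracks how many bits of entropy it is owed."""

    def __init__(self):
        self.state = bytes(48)  # 384 bits of internal state
        self.credits = 0

    def stir(self, data, credit_bits):
        """Mix new noise into the state and credit some entropy (capped)."""
        self.state = hashlib.sha384(self.state + data).digest()
        self.credits = min(MAX_CREDITS, self.credits + credit_bits)

    def available(self, nbytes):
        """True if the pool holds enough credited entropy for nbytes."""
        return self.credits >= nbytes * 8

    def rand(self, nbytes):
        """Return nbytes of output, then advance the state and debit credits."""
        out = hashlib.shake_256(b"out" + self.state).digest(nbytes)
        self.state = hashlib.sha384(b"next" + self.state).digest()
        self.credits = max(0, self.credits - nbytes * 8)
        return out
```

Note that rand still returns bytes when the credits hit zero. That is the "depleted" case: the output is only as unpredictable as the state that produced it, so a password generator would want to check available first.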

Final board design

This initially proved unroutable. After determining how to reliably make
double-sided PCBs, I was finally able to route the board:

This will be extremely unforgiving to build:

Clearances and trace widths basically at the allowable limits of our milling
process.

So many vias. Just because they’re possible doesn’t mean they’re easy to
assemble; each one needs a wire pole and those have been harder than expected
to solder.

Components on the back (namely, the noise board and its RF shield).

Main chip is a giant TQFP64. The TQFP32 chip in the AT90USB series has no ADC
and is therefore useless to us in a single-MCU design, but might be usable in
a later revision with multiple MCUs. While investigating keypad and display
functionality, I determined that having additional coprocessors could be a
useful innovation, as the AVR platform has limited support for true nonblocking
I/O.

Another possible innovation is to change architectures (e.g. to something
ARM-based). But doing this particular project in as small a MCU as practical
is appealing: it’s much harder to hide something nefarious in a less powerful
MCU, making lower-capability design more appealing from a verifiability
perspective.

Week 10: board completion and revisions.

I completed revision 0 of my desired project board.

It communicates just fine with an Adafruit Trellis and a SPI OLED module.

Programs are loaded using the LUFA CDC bootloader, which I had to burn onto the
usb1286 using a temporary board with an ISP header. (Fun fact: the usb1286 ships
with fuses set for an 8 MHz crystal, and with protection bits enabled, so to
change fuses one has to erase the stock bootloader.)

The trouble here is that the pads for the I2C connector are positioned such that
a 6-pin header will physically interfere with the display module. This, in
addition to some inadvertent physical damage, will lead me to redesign and
rebuild this board.

Week 12: a revised board.

I completed revision 1 of my desired project board.

This board moves the I2C header away from the display location. It also adds
breakout pads for port A and port C on the usb1286.

It still remains to complete the main system software, design an enclosure with
fixturing for all the hardware components, and find some way to replace the
Trellis with some self-made input device.