pkg_add vs https

So we've been working with the tls crowd to bring you decent https support.

The crux of the matter is that pkg_add does not handle https directly:
it uses ftp(1) to fetch every single file.

... which means a new connection for each single package it looks at.

With http, that's fairly okay: establishing a new connection is pretty cheap.

For https, there's tls involved, and if you look at the protocol, a normal
connection begins with an authentication exchange that uses public key
cryptography, which, for the most part, means RSA in the https world.

This slows things down in two ways:
- public key cryptography is still somewhat expensive, especially for slower
machines.
- there is a lot of back-and-forth involved, on top of the normal tcp handshake.
Namely client sends clienthello, server replies with serverhello and its
certificate, client responds with its key exchange and finished, server
responds with finished, and we can FINALLY send data. (from 3 messages to 7, whee)

There's a feature in tls called "session resumption", where all the
back-and-forth already done in the first connection is replaced in
subsequent connections by a simple token exchange (after all, the client
and server already authenticated each other, and can prove they
know the same shared secret). This speeds things up a little:
- no public key cryptography involved
- the handshake devolves into clienthello, serverhello+finished, finished
so it shaves one packet... well, not as good as could be expected but
still something.
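
To put numbers on the above, here is a small counting sketch. It is a
simplified model, not a protocol implementation: the message labels are
shorthand, it assumes the first connection does a full handshake and every
later one resumes, and it ignores the cost of the public key operations
entirely. The function names are made up for this illustration.

```python
# Simplified message counts for connection setup (labels are shorthand).
TCP_HANDSHAKE = ["SYN", "SYN-ACK", "ACK"]

FULL_TLS = [
    "client: clienthello",
    "server: serverhello (+ certificate)",
    "client: key exchange + finished",
    "server: finished",
]

RESUMED_TLS = [
    "client: clienthello (+ session token)",
    "server: serverhello + finished",
    "client: finished",
]


def setup_messages(tls, resumed=False):
    """Messages exchanged before the first byte of data can be sent."""
    if not tls:
        return len(TCP_HANDSHAKE)
    handshake = RESUMED_TLS if resumed else FULL_TLS
    return len(TCP_HANDSHAKE) + len(handshake)


def total_setup_messages(packages, tls, resumption=False):
    """Setup cost when every package gets its own connection."""
    if packages <= 0:
        return 0
    first = setup_messages(tls)  # the first connection is never resumed
    later = setup_messages(tls, resumed=resumption)
    return first + (packages - 1) * later


if __name__ == "__main__":
    for label, tls, resumption in [
        ("http", False, False),
        ("https", True, False),
        ("https + resumption", True, True),
    ]:
        print(label, total_setup_messages(10, tls, resumption))
```

Under this model, ten packages cost 30 setup messages over http, 70 over
https, and 61 over https with resumption, so most of the gain has to come
from skipping the public key cryptography rather than from the saved message.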

So, this functionality exists in libressl. What Joel and others did was
expose it in libtls, and add a hook in ftp(1) so that it could be used by
pkg_add.

This is still fairly secure: pkg_add creates a temporary file as the
_pkgfetch user, unlinks it from the filesystem, and passes /dev/fd/<whatever>
as the session file to ftp(1), which is happy to work with it
(no easy way to spy on that secret... and if you can look at other
processes' open files, I'd say you have bigger problems).

(I don't know who exactly came up with the idea of making it work with a
pure file descriptor, but this is brilliant)
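
The unlinked-file trick is easy to try outside of pkg_add. What follows is
a minimal sketch in Python, not pkg_add's actual code (which is perl): the
child process here is a stand-in that simply reads the /dev/fd path back,
where pkg_add would run ftp(1). The helper name is made up, and the sketch
assumes a system that provides /dev/fd.

```python
import os
import subprocess
import sys
import tempfile


def run_with_anonymous_file(payload):
    """Hand a child process a file that has no name on the filesystem."""
    fd, path = tempfile.mkstemp()
    try:
        # Once unlinked, the data is only reachable through the descriptor.
        os.unlink(path)
        assert not os.path.exists(path)

        os.write(fd, payload)
        os.lseek(fd, 0, os.SEEK_SET)  # rewind so the child reads from the start

        # Stand-in child: opens the path it is given and echoes its content.
        child = [
            sys.executable,
            "-c",
            "import sys; sys.stdout.buffer.write(open(sys.argv[1], 'rb').read())",
            "/dev/fd/%d" % fd,
        ]
        # pass_fds keeps the descriptor open, under the same number, in the child.
        result = subprocess.run(
            child, pass_fds=(fd,), stdout=subprocess.PIPE, check=True
        )
        return result.stdout
    finally:
        os.close(fd)


if __name__ == "__main__":
    print(run_with_anonymous_file(b"pretend this is session data"))
```

The point of the design is that the secret never has a name: once the parent
and its children exit, the file is gone without any cleanup step.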

Measuring this shows that https gets somewhat less slow. It's still slower
than pure http (the extra 3 messages are still something), but it becomes
a more reasonable choice for people who want anonymity.

Note that, if you use session resumption, ftp(1) will report on whether or
not it was successful. pkg_add(1) parses those messages and will tell you
whether you are using a "slow" https mirror or not.
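
The parsing side can be sketched roughly like this. Big caveat: the exact
wording of the ftp(1) report is not reproduced in this text, so the line
format matched below is an assumption, and so is the function name. The
sketch only tallies the reports; deciding what counts as a "slow" mirror is
left to the caller.

```python
import re

# ASSUMPTION: placeholder line format, not verified against real ftp(1) output.
RESUMED_LINE = re.compile(r"^TLS session resumed:\s*(yes|no)\s*$", re.IGNORECASE)


def tally_resumption(output):
    """Count how many connections reported a resumed session and how many did not."""
    counts = {"resumed": 0, "full": 0}
    for line in output.splitlines():
        match = RESUMED_LINE.match(line.strip())
        if match is None:
            continue  # unrelated output, e.g. progress messages
        if match.group(1).lower() == "yes":
            counts["resumed"] += 1
        else:
            counts["full"] += 1
    return counts


if __name__ == "__main__":
    sample = "\n".join([
        "TLS session resumed: no",
        "some unrelated line",
        "TLS session resumed: yes",
        "TLS session resumed: yes",
    ])
    print(tally_resumption(sample))
```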

I've also run into bizarre session resumption implementations (apache...)
which seem to think that 5 minutes is a good timeout for expiring tokens.
We're talking about session tokens that are actually ACTIVE AND USED, not
about keeping tokens around while the connection is closed. Nope, looks like
at least one mirror expires tokens after five minutes, regardless of their use.

So, this is as good as it gets from this end, until I figure out a better
way to interleave operations...