Hackage Security and Stack

February 14, 2017

Back in 2015, two proposals were made for securing package
distribution in Haskell. The Stackage team proposed and implemented a
solution using HTTPS and Git, which then became the default in
Stack. Meanwhile, the Hackage team moved ahead with
hackage-security. Over the past few weeks, I've been working on moving
Stack over to hackage-security (more on motivation below). The current
status of the overall hackage-security roll-out is:

Hackage is now providing the relevant data for hackage-security (the
01-index.tar file and signature files)

One upside of the move is a more reliable package index download
time. Some firewalled users have complained about slow Git clone
times, so this is a welcome improvement. We still plan to maintain the
Git-based package indices for people using them (to my knowledge they
are still used by Nix, and all-cabal-metadata still powers a lot of
the information on stackage.org).

However, there's one significant downside I've encountered in the
current implementation that I want to discuss.

Background

Here is a quick summary of how hackage-security works. There is a
01-index.tar file, the contents of which I'll discuss momentarily.
This is the file that Stack/cabal-install downloads when you "update
your index." It is signed using a cryptographic scheme specified by
the hackage-security project, and whenever a client does an update, it
must verify the signature. In theory, once that signature is verified,
we know that the contents of the 01-index.tar file are unmodified.

Within this file are two (relevant) kinds of files: the .cabal files
for every upload to Hackage (including revisions), and .json files
containing metadata about the package tarballs themselves.
Importantly, this metadata includes a SHA256 checksum and the size of
the tarball. Using these already-validated-to-be-correct JSON files,
we can download and verify a package tarball, even over an insecure
connection.
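
To make the tarball check concrete, here is a minimal Python sketch of
the idea. It is not the code used by Stack or cabal-install. The
function names are hypothetical, and the JSON layout (a "signed"
object holding "targets", each with a "length" and a "hashes.sha256"
field) is an assumption about what the metadata files look like, so
adjust the field names to match the real files.

```python
import hashlib
import json


def expected_tarball_info(package_json_text, target_suffix):
    """Return (length, sha256) for the target whose path ends with target_suffix.

    Assumes a layout like:
    {"signed": {"targets": {"<path>": {"length": N, "hashes": {"sha256": "..."}}}}}
    """
    signed = json.loads(package_json_text)["signed"]
    for path, info in signed["targets"].items():
        if path.endswith(target_suffix):
            return info["length"], info["hashes"]["sha256"]
    raise KeyError(target_suffix)


def verify_tarball(data, expected_length, expected_sha256):
    """Check downloaded bytes against the size and checksum from the index."""
    # Compare the size first: it is cheap and rejects truncated or padded files.
    if len(data) != expected_length:
        return False
    # Then compare the SHA256 digest of the bytes with the expected hex string.
    return hashlib.sha256(data).hexdigest() == expected_sha256.lower()
```

Because the expected size and checksum come from the already-verified
index, a tarball that passes verify_tarball can be trusted even if it
was fetched over plain HTTP. A tarball that fails should be discarded.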

The alternative Git-based approach that the Stackage team proposed has
an almost-identical JSON file concept in the all-cabal-hashes
repo. Originally, these files were generated by downloading tarballs
from https://hackage.haskell.org (note the HTTPS). However, several
months ago it became known that the connection between the CDN in
front of Hackage and Hackage itself was not TLS-secured, and therefore
HTTPS alone could not be relied on. We now use the JSON files provided
by hackage-security to generate the JSON files used in the Git repo.

How it manifests

This issue (JSON files missing from the index for some packages) has
several consequences to be aware of:

The FP Complete mirror, and any other mirror using Herbert's tool,
will sometimes stop updating if a new JSON file is missing. This is
an annoyance for end users, and a frustration for the mirror
maintainers. Fortunately, updating the mirror tool code with the
added index isn't too heavy a burden. Unfortunately, due to the lack
of HTTPS between Hackage and its CDN, there's no truly secure way to
do this update.

End users cannot currently use packages securely if they are
affected by this bug. You can see the full list of affected packages
as of the time of writing this post.

Stack has had code in place to reject indices that do not provide
complete signature coverage for a long while (I think since its
initial release). Unfortunately, this code cannot be turned on for
hackage-security (which is how I discovered this bug in the first
place). We can implement new functionality with weaker requirements
(refuse to download a package that is missing signature
information), but ideally we could use the stricter semantics.

The Nix team cannot rely on hashes being present in
all-cabal-hashes. I can't speak to the Nix team's internal
processes, and therefore cannot assess how big an impact that is.
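
The difference between the strict check and the weaker one mentioned
in the Stack item above can be sketched in a few lines of Python. This
is only an illustration of the two policies, not Stack's actual code.
The function names and the package identifiers such as "foo-1.0" are
hypothetical, and each package is modelled simply as a name-version
string.

```python
def missing_metadata(cabal_packages, json_packages):
    """Packages that have a .cabal file in the index but no JSON metadata."""
    return sorted(set(cabal_packages) - set(json_packages))


def strict_check(cabal_packages, json_packages):
    """Strict policy: reject the whole index if any package lacks metadata."""
    missing = missing_metadata(cabal_packages, json_packages)
    if missing:
        raise ValueError(
            "index rejected; missing metadata for: " + ", ".join(missing)
        )


def weak_check(package, json_packages):
    """Weaker policy: refuse only the one package that lacks metadata."""
    if package not in set(json_packages):
        raise ValueError("refusing to download " + package + ": no metadata")
```

Under the strict policy, a single package without metadata makes the
entire index unusable, which is why it cannot be enabled while the bug
exists. Under the weaker policy, only the affected packages are
refused, and everything else keeps working.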

Conclusion

Overall, I'm still very happy that we've moved Stack over to
hackage-security:

It fixed an immediate problem for users behind a firewall, which we
otherwise would have needed to work around with new code
(downloading a Git repo snapshot). Avoiding writing new code is
always a win :).

Layering the HTTPS/Git-based security system on top of
hackage-security doesn't make things more secure; it just adds two
layers for security holes to exist in instead of one. From a
security standpoint, if Hackage is providing a security mechanism,
it makes sense to leverage it directly. Said another way: if it
turns out that hackage-security is completely insecure, our
Git-based layer would have been vulnerable anyway since it relied on
hackage-security.

By moving both Stack and cabal-install over to hackage-security for
client access, we'll be able to test that code more thoroughly,
hopefully resulting in a more reliable security mechanism for both
projects to share (small example of such stress-testing).

Stack has always maintained compatibility with some form of non-Git
index, so we've always had two code paths for index updates. As
hinted at above, this change opens the door to removing the
Git-based code path. And removing code is almost as good as avoiding
writing new code.

I would still feel more comfortable with the security of Hackage if
HTTPS were used throughout, if only as a sanity check in case all
else fails. I hope that in the future the connection between Hackage
and its CDN switches from insecure to secure. I also hope that
cabal-install is still planning on moving over to using HTTPS for
its downloads.