PHP Package Signing: My Current Thoughts

We figured out how to write good code. We figured out how to write good code in a reusable way…for the most part. We figured out how to distribute and mix all that good reusable code in a sensible fashion. Can we now figure out how to do it all securely?

Package signing is a simple enough idea, and I’ve been spending time trying to fit it, Composer and Packagist together as a model in my head. The concept is to have parties create a cryptographic signature of the code to be distributed at a specific point in time using a private key. If anyone were to change that code or its metadata (think composer.json) with malevolent intent, the end user would then notice that the cryptographic signature cannot be verified using the package author’s known public key. It’s a familiar topic from all those cryptography books you’ve doubtlessly read ;).

Alright, it’s actually a horrendously complicated topic that boggles the minds of many a programmer. We’re a practical bunch, and we just want the damn code. NOW!

Practical considerations and security are locked in a continuous battle for primacy. Look at SSL/TLS – it is a security nightmare but we keep it around because, until someone comes up with a decent replacement, the alternative is no encrypted HTTPS with a verifiable host for anyone. We continue to support old versions of SSL/TLS out of practical concerns despite knowing their weaknesses. They are old versions for a reason!

Those same concerns have been at war in my own head since last week, when I made the mistake of contemplating package signing. Eventually, my practical side won out and my security persona has been sulking in a corner ever since refusing to talk to me.

The problem with package signing from my perspective is tied up in a phrase most of you would know: The needs of the many outweigh the needs of the few. Thank you, Spock.

PKI vs GPG (Some Context!)

I won’t go into too much detail here…

Right off the bat, we have two contenders for signing packages: Public-key infrastructure (PKI) backed by a Certificate Authority (CA) and Pretty Good Privacy (PGP) also commonly referred to by its GNU implementation, GNU Privacy Guard (GPG). You’d be most familiar with PKI in the form of the SSL certificates used online. Both have the notion of private keys and public keys. Data encrypted by one key can only be decrypted by the other key. If you keep one private, then holders of the public key can always verify that data sent by you was really sent by you. If you lose the private key, you’ll need to revoke it and get a new one.

Assuming they trust it is you to start with!

Trust is the core difference between PKI and GPG. How do you know, with certainty, that any given public key is firmly associated with the person you know it should be associated with? Maybe it’s a hacker posing as that person? Maybe it’s the local friendly NSA office masquerading as Google? Establishing trust takes diverging paths for PKI and GPG. PKI keys (in the form of certificates) are either self-signed or signed by a trusted Certificate Authority. Generally, we put zero faith in self-signed certificates because anyone can claim to be anyone else using them. We instead trust a select number of CAs to sign certs because they’ll hopefully do stuff like asking for passports, addresses, and other person or company specific information to verify any entity’s real identity before doing so. GPG avoids centralised authorities like the plague and instead uses a “web of trust” model where everyone can sign everyone else’s public key. The more of these endorsements a GPG public key gets, the more likely it is that the key represents the expected identity (based on the number of endorsers you already trust). Webs of trust require time, care, and effort, but they have been extremely successful and certainly do work.

Excellent, your brain is still not smeared over the monitor. That’s a good sign. Now, what the heck has this got to do with PHP and package signing?

People Are Lazy And Cheap

These are the two things you can count on with rational people, though perhaps using more charitable terms. People don’t want to do any more work than they need to, and they generally don’t want to spend any more money than they need to. Add those to one other – they generally don’t think about security when downloading code – and you have something of a problem.

PKI dies an immediate death in this worldview, because obtaining a CA issued code signing certificate costs money. That guy who wants to put 100 lines of code onto Github? That’ll be $100 please. Package signing using a CA will never work for PHP because it imposes a monetary cost on open source contributions. Many of us might not blink at spending $100 to indulge our willingness to write free code, but we’d probably all prefer not to.

GPG is utterly free. Surely it’s a winner then? That guy who wants to put 100 lines of code onto Github? He probably has to now generate a new GPG keypair with a resulting public key that nobody has signed and which nobody will trust. He’s way past caring now. And so are the potentially tens of thousands of people who can’t verify his code. They want the damn code. NOW! Not months down the line when he’s begged, pleaded and bled his willpower dry trying to get sufficient signatures from others in the community to get a widely trusted public key. I’m exaggerating a wee bit, but the barriers to entry for GPG were intended to prevent weakness. There are shortcuts, but shortcuts undermine the purpose of a web of trust. Meeting in person is one of the most often quoted means of getting your GPG public key signed properly. You can find me somewhere in the Wicklow mountains of Ireland. I’ll wait ;).

From a security perspective, both of these options can and do work. Practically? They sort of stink if you want global adoption. One costs money, and the other requires time and effort for both package authors and users. In the real world, these problems are not uncommon. Throw GPG signatures at everything, and a tiny minority will actually bother checking them. Require CA code signing certs, and you will be ignored by programmers. These options, particularly given the distributed nature of PHP packages (i.e. github repos), will only ever serve a minority of users. Would you oblige us again, Spock?

The needs of the many outweigh the needs of the few.

Thank you, gravelly voiced Spock. Sorry about Abrams, we didn’t think he’d actually blow up Vulcan, replace phasers with blasters, and substitute you with an emotional basket case because he thought he was making a Star Wars movie.

I just killed PKI and GPG as the basis for any proposal I’d make for package signing in PHP. Making <1% of the community extremely safe while leaving >99% of it extremely vulnerable leaves me with a bad taste in my mouth because it conflicts with my desire to spread security as much as possible to protect as many people as possible. I could blame the ignorance of users for the need to give up some notional perfect security, but they just want their code and that is the reality we live in. One might as well stand on a beach battling the tides if they are going to deny that Humans are, well, only Human. So, it’s time to come up with something else that can be easily implemented in PHP.

The Double Signature Alternative

The double signature approach to package signing relies on having a minimum of two parties verify the integrity of a package independently of each other. Luckily, we already have two parties: the package author and the Packagist repository. The basis of its effectiveness is probably obvious. Each independent signature requires a separate private key which is ultimately held offline. In order for an attacker to compromise a package, they would need to compromise both private keys. This rests on the notion that each party’s public keys are known by the user in advance and maintained in some permanent keystore, e.g. included in composer.phar or accepted upon first download of a public package and stored in a simple file that you can copy between systems (in effect quite like GPG’s keyring).

Let’s say that I tag a release for LibraryX 1.0.0 tomorrow. As the package author, I would sign it using a private key that I generated at home. I’d then advertise the public key widely for users to download and cache (our permanent keyring). Packagist, when building the metadata for the new tagged release, would also generate its own signature for the package. Our download client, which we’ll assume is Composer, will check both signatures and only accept a package for use when both signatures can be verified.

While I refer to this as a “double signature” approach, there remains scope for adding a third independent party. It also doesn’t necessarily impose digital signing on package authors. Package authors can simply not sign anything, but Packagist would still do its own signing. This significantly lessens security but it trumps the current situation where Packagist signs nothing at all. Composer should also differentiate by marking packages to be installed as verified or partially verified, while allowing users to impose a minimum acceptable level of verification.
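As a rough illustration of the verified versus partially verified distinction, here is a minimal Python sketch. The names `Level`, `verification_level` and `accept` are hypothetical and not part of Composer, and the two inputs are assumed to be the results of signature checks carried out elsewhere.

```python
from enum import IntEnum
from typing import Optional


class Level(IntEnum):
    """Hypothetical verification levels, ordered from weakest to strongest."""
    UNVERIFIED = 0
    PARTIAL = 1    # only the repository (Packagist) signature verified
    VERIFIED = 2   # both the author and repository signatures verified


def verification_level(repository_ok: bool, author_ok: Optional[bool]) -> Level:
    """Classify a package from the outcome of its signature checks.

    author_ok is None when the author chose not to sign at all, and
    False when a signature is present but fails to verify.
    """
    # Any signature that is present but invalid is treated as a failure.
    if not repository_ok or author_ok is False:
        return Level.UNVERIFIED
    if author_ok is None:
        return Level.PARTIAL
    return Level.VERIFIED


def accept(level: Level, minimum: Level) -> bool:
    """Apply the user's minimum acceptable level of verification."""
    return level >= minimum
```

Note the design choice that a present-but-invalid author signature is worse than no author signature: an unsigned package can still be partially verified, while a bad signature is rejected outright.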

Happily, this line of thinking is exactly what is at the core of a proposed solution for both RubyGems and Python’s PyPI called The Update Framework (TUF). A little external validation doesn’t hurt and my security persona might stop sulking soon. It can also be implemented entirely using openssl.

Signing A Package

At its core, signing a package for PHP is straightforward. For every single file in the package, you calculate its checksum, e.g. a SHA256 hash. You store a list of files and their hashes in a single file called the manifest. You then sign the manifest. Upon download, the user can verify the manifest’s signature and check that the list of files and hashes it contains actually matches the files in the package received.
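The hashing half of that process can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function names are made up, the manifest is assumed to live outside the package directory, and the signing step itself (e.g. with openssl) is deliberately left out.

```python
import hashlib
import json
from pathlib import Path


def build_manifest(package_dir: str) -> dict[str, str]:
    """Map each file's relative path to the SHA256 hash of its contents."""
    root = Path(package_dir)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            manifest[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def manifest_bytes(manifest: dict[str, str]) -> bytes:
    """Serialise the manifest deterministically; these bytes get signed."""
    return json.dumps(manifest, sort_keys=True).encode("utf-8")


def verify_manifest(package_dir: str, manifest: dict[str, str]) -> list[str]:
    """Compare received files against the manifest; empty list means a match."""
    actual = build_manifest(package_dir)
    problems = []
    for name, digest in manifest.items():
        if name not in actual:
            problems.append(f"missing: {name}")
        elif actual[name] != digest:
            problems.append(f"modified: {name}")
    for name in actual:
        if name not in manifest:
            problems.append(f"unexpected: {name}")
    return problems
```

A client would first verify the signature over `manifest_bytes(...)` and only then call `verify_manifest`, since an unsigned manifest proves nothing on its own.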

If only it were that simple…

Metadata Is Dangerous

This part is somewhat technical, but hopefully it makes sense!

While we might suspect that files are important, securing something like Packagist requires an obsession with package metadata. Whenever you use Composer, it doesn’t run off to Github. It downloads a file called packages.json from packagist.org. Composer quite literally relies on Packagist to tell it about available packages, their versions, and the URLs of remote downloads and VCS repositories. In the absence of a secure signature-based process, this creates a gigantic single point of failure for all Composer users.

If Packagist is ever hacked, an attacker can now respond to every single Composer request unchallenged. And Composer implicitly trusts everything that Packagist tells it. Basically, it would allow an attacker to poison the entire population of Composer users. That is simply intolerable.

Metadata is the core problem we need to solve. We need to prevent attackers from redirecting package downloads, informing us of incorrect available versions, replaying old copies of the metadata, and so on. There must also then be a way to recover from the attack. In other words, we need a metadata architecture that can survive Packagist’s downfall and allow for its restoration.

If we implement not just package signing, but more specifically metadata signing, then we immediately alleviate the problem. It’s still based on the primary goal of there being at least two private keys which are kept offline. The missing features that it also requires are fourfold:

Role delegation is a simple mechanism where Packagist maintains private key(s) to sign all of its automatically generated metadata files. These keys are delegated to by one or more master root keys which are kept offline. If the server is ever compromised, the root key is not. This would allow the Packagist maintainers to, upon restoring control, revoke the online delegated keys and replace them.

Timestamping prevents attackers from reusing older correctly signed metadata by imposing an expiry date and version number. Old data will expire, and new data will have a limited predetermined validity period. Versioning merely ensures the Composer client can check that any metadata it downloads is fresher than the copy it already holds, so it can safely keep running off cached copies in the meantime.

Signing thresholds are an opt-in measure where you can require certain metadata to be co-signed. For example, Packagist could require that automated key delegations are signed by at least two private keys (increasing the difficulty for attackers since they now need both!). You could do the same at the developer level for newly tagged releases.

Key revocation is another recovery mechanism. It allows the holder(s) of root keys to revoke delegated keys. After a compromise, you’ll want to replace the online private keys. This can all be done by simply root signing a new file which details the delegated keys (I’m trying to avoid too many implementation details so hit me up in the comments if you’re confused).
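To show how these pieces might fit together on the client side, here is a hedged Python sketch. Everything in it is an assumption for illustration: the `SignedMetadata` layout, the `check_metadata` name, and the `verify` callable, which stands in for real public key signature verification (e.g. via openssl). The set of delegated keys is assumed to come from the latest root-signed file, so a revoked key is simply absent from it.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable


@dataclass(frozen=True)
class SignedMetadata:
    # In a real format, version and expires would be parsed out of the
    # signed payload so that they are covered by the signatures too.
    version: int                  # increases with every new metadata file
    expires: datetime             # after this moment the file is refused
    payload: bytes                # the exact bytes that were signed
    signatures: dict[str, bytes]  # key id -> signature over payload


# Placeholder signature check: (public_key, payload, signature) -> bool
Verifier = Callable[[bytes, bytes, bytes], bool]


def check_metadata(
    meta: SignedMetadata,
    cached_version: int,
    delegated_keys: dict[str, bytes],
    threshold: int,
    verify: Verifier,
    now: datetime,
) -> list[str]:
    """Return a list of problems; an empty list means the metadata is usable."""
    problems = []
    # Timestamping: refuse metadata past its expiry date.
    if now >= meta.expires:
        problems.append("expired")
    # Versioning: refuse metadata older than what is already cached.
    if meta.version < cached_version:
        problems.append("older than cached copy")
    # Revocation and thresholds: only signatures from currently delegated
    # keys count, and enough of them must verify.
    valid = sum(
        1
        for key_id, signature in meta.signatures.items()
        if key_id in delegated_keys
        and verify(delegated_keys[key_id], meta.payload, signature)
    )
    if valid < threshold:
        problems.append(f"only {valid} of {threshold} required signatures verified")
    return problems
```

After a compromise, the maintainers would root sign a new key file, the client would rebuild `delegated_keys` from it, and anything signed only by the old online keys would fail the threshold check.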

All of this works in concert, but with one crucial additional element I need to look into. We’re basically saying that blind trust in Packagist is a bad thing, even with signing, so we can’t rely only on Packagist or we’d be vulnerable to replay/rollback attacks. We should have a means of independently cross-verifying versions back to their origin, the actual git repository for the package. For example, “git ls-remote --tags” could be used to compare release tags as a means of validating the correctness of Packagist metadata. I have no idea how that would work outside of git, but it’s an obvious validation method that is possible due to Packagist’s distributed nature.
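A sketch of that cross-check in Python might look like the following. The function names are hypothetical, and it assumes the repository metadata can be reduced to a mapping of tag name to commit hash, with tag names matching the git tags exactly (no version normalisation).

```python
import subprocess


def parse_ls_remote(output: str) -> dict[str, str]:
    """Turn `git ls-remote --tags` output into {tag name: commit hash}."""
    tags = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        sha, ref = line.split("\t", 1)
        if not ref.startswith("refs/tags/"):
            continue
        name = ref[len("refs/tags/"):]
        if name.endswith("^{}"):
            # Peeled entry: the commit an annotated tag points to. It
            # always overrides the hash of the tag object itself.
            tags[name[:-3]] = sha
        else:
            tags.setdefault(name, sha)
    return tags


def remote_tags(repo_url: str) -> dict[str, str]:
    """Ask the origin repository directly which tags it has."""
    result = subprocess.run(
        ["git", "ls-remote", "--tags", repo_url],
        capture_output=True, text=True, check=True,
    )
    return parse_ls_remote(result.stdout)


def cross_check(claimed: dict[str, str], tags: dict[str, str]) -> list[str]:
    """Compare the versions the repository claims against the real tags."""
    problems = []
    for version, commit in claimed.items():
        if version not in tags:
            problems.append(f"{version}: no such tag in repository")
        elif tags[version] != commit:
            problems.append(f"{version}: commit mismatch")
    for tag in tags:
        if tag not in claimed:
            problems.append(f"{tag}: tag missing from metadata")
    return problems
```

A tag missing from the metadata is reported too, since withholding a newer release is exactly what a rollback attack would look like from the client’s side.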

Conclusion

This article was me thinking aloud but, if you follow the logic, it demonstrates that package signing is an understatement. We don’t want to just secure packages, we also want to secure the metadata that describes what packages exist. It’s a far more complex problem than first glances suggest. Luckily, it’s not an overwhelming problem. It is entirely possible to implement basic defences using openssl, public/private keys, and some independent validation separate from Packagist itself. Though your brain will appreciate it more when it’s automated ;).

Thank you. First of all a disclaimer: I think I followed this “in broad terms” (i.e. it’s possible I didn’t!)

I think you’re looking at the problem “is the code I requested the code I’ve received?”.

But being lazy and cheap also means there’s a subsequent problem: “is the code I’ve received evil (by design)?”. That is, a lovely package that does just what I need, really well documented etc. and has an evil backdoor.

Now obviously I’m irrational and so spend the time and money to review that code carefully ;). But perhaps someone else who is rational just starts using it without spotting the evilness.

Now I suppose that the assumption is that someone will notice and call it out and the package will get removed and all will be well with the world again. Of course with everyone making that assumption we have a problem.

So I’m wondering if there is a role for GPG where people who do review a (particular version of a) package can sign it in the style of GPG. If you’re doing the review anyway, it could presumably be fairly straightforward then to say so in the keychain, so there’s not lots of overhead.

And you could also presumably look at who’s reviewed and signed the code. If it were someone like you, I’d weigh that more than if it were me.

All together, we’d then have both “this is the package and code I think it is” and “these people have had a look and reckon it’s not evil”.

Matt

http://blog.astrumfutura.com Pádraic Brady

Hi Matt,

It’s effectively beyond the scope of this system. Here, we are merely stating that at a given time, a package was signed by 1+ parties as having a set of files with given checksums (the other metadata issues are a repository level problem). We’re only really checking that the package (usually a tagged release) was not subsequently altered by anyone. It doesn’t, in any way, verify the security of that code.

I suppose you could, in theory, set up a third signing party who will review code before signing a package signature. You could then add them as a trusted signatory, and exclude installing packages that they have not signed. It’s another way of expressing that you trust that party.

Matt Parker

Hi,

Sorry, yes I’d realised it was beyond what you’re suggesting; it was your description of the GPG web of trust that triggered the thought. Let me try and be a bit more precise.

The problem: I do not want to use software that is evil by design via composer/packagist.
The solution: I review the code, satisfy myself that it’s all OK, and then use it. The new signed packages feature means that I am confident that the code I reviewed is what is now on my system.

The problem part 2 (first statement): I am too rational (and therefore cheap & lazy) to review the package I’m about to use.
The problem part 2 (second statement): I lack the knowledge to review the package effectively.
The problem part 2 (third statement): It’s such a waste of time for all of us to be reviewing the same packages over and over.

The solution part 2: package metadata includes a second GPG-like set of signatures of people that have reviewed the source of a particular package.

When I come to look at using a particular package, I can look at the people that have added their signature. If I think they are particularly diligent/knowledgeable/generally trustworthy then I may not review the code. I am confident in the code because:
1. their signatures can be verified using their public keys
2. their signatures are part of the metadata that is also signed

If I choose to review the code myself, I can add my signature to the package to say so. Others will make of that what they will.

I really don’t know enough about this to know whether this is feasible? The core idea is to use a GPG style, decentralised web of trust to verify the security of the code, and then we have a signing mechanism to ensure the code that arrives is what it ought to be.

Matt

You handle package signing in the same way it is handled for Chrome extensions, iPhone apps, Android apps and others.

It’s not up to the programmer to provide a signature – it’s up to the package repository.

For Composer, by default, all packages come from packagist.org.

But packagist.org is just a bunch of open source scripts for generating a central list of packages. If *I* am concerned about security, I disable packagist.org and maintain my own website where the only releases that go into it are the ones I verified. Composer will only install what it can find in its directory. Add a short plugin to verify the signature on the package.json file before adding it and I can make sure I only install packages I have vetted.

If someone else is concerned and doesn’t want to spend the time vetting those packages, they can pay me a few bucks a month and use my database. Heck, they could even use multiple databases.

It’s the same model as app databases – the database maintainer is the one who does the signing – so you only need one certificate.
