A collection of information

In the first part I discussed the balancing mechanisms the GPL provides for enabling corporate contributions, giving users a voice and the right one for mutually competing corporations to collaborate on equal terms. In this part I’ll look at how the legal elements of the GPL licences make it pretty much the perfect one for supporting a community of developers co-operating with corporations and users.

As far as a summary of my talk goes, this series is complete. However, I’ve been asked to add some elaboration on the legal structure of GPL+DCO contrasted to other CLAs and also include AGPL, so I’ll likely do some other one off posts in the Legal category about this.

Free Software vs Open Source

There have been many definitions of both of these. Rather than review them, in the spirit of Humpty Dumpty, I’ll give you mine: Free Software, to me, means espousing a set of underlying beliefs about the code (for instance the four freedoms of the FSF). While this isn’t problematic for many developers (code freedom, of course, is what enables developer driven communities) it is an anathema to most corporations and in particular their lawyers because, generally applied, it would require the release of all software based intellectual property. Open Source on the other hand, to me, means that you follow all the rules of the project (usually licensing and contribution requirements) but don’t necessarily sign up to the philosophy underlying the project (if there is one; most Open Source projects won’t have one).

Open Source projects are compatible with Corporations because, provided they have some commonality in goals, even a corporation seeking to exploit a market can march a long way with a developer driven community before the goals diverge. This period of marching together can be extremely beneficial for both the project and the corporation and if corporate priorities change, the corporation can simply stop contributing. As I have stated before, Community Managers serve an essential purpose in keeping this goal alignment by making the necessary internal business adjustments within a corporation and by explaining the alignment externally.

The irony of the above is that collaborating within the framework of the project, as Open Source encourages, could work just as well for a Free Software project, provided the philosophical differences could be overcome (or overlooked). In fact, one could go a stage further and theorize that the four freedoms as well as being input axioms to Free Software are, in fact, the generated end points of corporate pursuit of Open Source, so if the Open Source model wins in business, there won’t actually be a discernible difference between Open Source and Free Software.

Licences and Philosophy

It has often been said that the licence embodies the philosophy of the project (I’ve said it myself on more than one occasion, for which I’d now like to apologize). However, it is an extremely reckless statement because it’s manifestly untrue in the case of GPL. Neither v2 nor v3 does anything to require that adopters also espouse the four freedoms, although it could be said that the Tivoization Clause of v3, to which the kernel developers objected, goes slightly further down the road of trying to embed philosophy in the licence. The reason for avoiding this statement is that it’s very easy for an inexperienced corporation (or pretty much any corporate legal counsel with lack of Open Source familiarity) to take this statement at face value and assume adopting the code or the licence will force some sort of viral adoption of a philosophy which is incompatible with their current business model and thus reject the use of reciprocal licences altogether. Whenever any corporation debates using or contributing to Open Source, there’s inevitably an internal debate and this licence embeds philosophy argument is a powerful weapon for the Open Source opponents.

Equity in Contribution Models

Some licensing models, like those pioneered by Apache, are thought to require a foundation to pass the rights through under the licence: developers (or their corporations) sign a Contributor Licence Agreement (CLA) which basically grants the foundation redistributable licences to both copyrights and patents in the code and then the the Foundation licenses the contribution to the Project under Apache-2. The net result is the outbound rights (what consumers of the project gets) are Apache-2 but the inbound rights (what contributors are required to give away) are considerably more. The danger in this model is that control of the foundation gives control of the inbound rights, so who controls the foundation and how control may be transferred forms an important part of the analysis of what happens to contributor rights. Note that this model is also the basis of open core, with a corporation taking the place of the foundation.

Inequity in the inbound versus the outbound rights creates an imbalance of power within the project between those who possess the inbound rights and everyone else (who only possess the outbound rights) and can damage developer driven communities by creating an alternate power structure (the one which controls the IP rights). Further, the IP rights tend to be a focus for corporations, so simply joining the controlling entity (or taking a licence from it) instead of actually contributing to the project can become an end goal, thus weakening the technical contributions to the project and breaking the link with end users.

Creating equity in the licensing framework is thus a key to preserving the developer driven nature of a community. This equity can be preserved by using the Inbound = Outbound principle, first pioneered by Richard Fontana, the essential element being that contributors should only give away exactly the rights that downstream recipients require under the licence. This structure means there is no need for a formal CLA and instead a model like the Developer Certificate of Origin (DCO) can be used whereby the contributor simply places a statement in the source control of the project itself attesting to giving away exactly the rights required by the licence. In this model, there’s no requirement to store non-electronic copies of the the contribution attestation (which inevitably seem to get lost), because the source control system used by the project does this. Additionally, the source browsing functions of the source control system can trace a single line of code back exactly to all the contributor attestations thus allowing fully transparent inspection and independent verification of all the inbound contribution grants.

The Dangers of Foundations

Foundations which have no special inbound contribution rights can still present a threat to the project by becoming an alternate power structure. In the worst case, the alternate power structure is cemented by the Foundation having a direct control link with the project, usually via some Technical Oversight Committee (TOC). In this case, the natural Developer Driven nature of the project is sapped by the TOC creating uncertainty over whether a contribution should be accepted or not, so now the object isn’t to enthuse fellow developers, it’s to please the TOC. The control confusion created by this type of foundation directly atrophies the project.

Even if a Foundation specifically doesn’t create any form of control link with the project, there’s still the danger that a corporation’s marketing department sees joining the Foundation as a way of linking itself with the project without having to engage the engineering department, and thus still causing a weakening in both potential contributions and the link between the project and its end users.

There are specific reasons why projects need foundations (anything requiring financial resources like conferences or grants requires some entity to hold the cash) but they should be driven by the need of the community for a service and not by the need of corporations for an entity.

GPL+DCO as the Perfect Licence and Contribution Framework

Reciprocity is the key to this: the requirement to give back the modifications levels the playing field for corporations by ensuring that they each see what the others are doing. Since there’s little benefit (and often considerable down side) to hiding modifications and doing a dump at release time, it actively encourages collaboration between competitors on shared features. Reciprocity also contains patent leakage as we saw in Part 1. Coupled with the DCO using the Inbound = Outbound principle, means that the Licence and DCO process are everything you need to form an effective and equal community.

Equality enforced by licensing coupled with reciprocity also provides a level playing field for corporate contributors as we saw in part 1, so equality before the community ensures equity among all participants. Since this is analogous to the equity principles that underlie a lot of the world’s legal systems, it should be no real surprise that it generates the best contribution framework for the project. Best of all, the model works simply and effectively for a group of contributors without necessity for any more formal body.

Contributions and Commits

Although GPL+DCO can ensure equity in contribution, some human agency is still required to go from contribution to commit. The application of this agency is one of the most important aspects to the vibrancy of the project and the community. The agency can be exercised by an individual or a group; however, the composition of the agency doesn’t much matter, what does is that the commit decisions of the agency essentially (and impartially) judge the technical merit of the contribution in relation to the project.

A bad commit agency can be even more atrophying to a community than a Foundation because it directly saps the confidence the community has in the ability of good (or interesting) code to get into the tree. Conversely, a good agency is simply required to make sound technical decisions about the contribution, which directly preserves the confidence of the community that good code gets into the tree. As such, the role requires leadership, impartiality and sound judgment rather than any particular structure.

Governance and Enforcement

Governance seems to have many meanings depending on context, so lets narrow it to the rules by which the project is run (this necessarily includes gathering the IP contribution rights) and how they get followed. In a GPL+DCO framework, the only additional governance component required is the commit agency.

However, having rules isn’t sufficient unless you also follow them; in other words you need some sort of enforcement mechanism. In a non-GPL+DCO system, this usually involves having an elaborate set of sanctions and some sort of adjudication system, which, if not set up correctly, can also be a source of inequity and project atrophy. In a GPL+DCO system, most of the adjudication system and sanctions can be replaced by copyright law (this was the design of the licence, after all), which means licence enforcement (or at least the threat of it) becomes the enforcement mechanism. The only aspect of governance this doesn’t cover is the commit agency. However, with no other formal mechanisms to support its authority, the commit agency depends on the trust of the community to function and could easily be replaced by that community simply forking the tree and trusting a new commit agency.

The two essential corollaries of the above is that enforcement does serve an essential governance purpose in a GPL+DCO ecosystem and lack of a formal power structure keeps the commit agency honest because the community could replace it.

The final thing worth noting is that too many formal rules can also seriously weaken a project by encouraging infighting over rule interpretations, how exactly they should be followed and who did or did not dot the i’s and cross the t’s. This makes the very lack of formality and lack of a formalised power structure which the GPL+DCO encourages a key strength of the model.

Conclusions

In the first part I concluded that the GPL fostered the best ecosystem between developers, corporations and users by virtue of the essential ecosystem fairness it engenders. In this part I conclude that formal control structures are actually detrimental to a developer driven community and thus the best structural mechanism is pure GPL+DCO with no additional formality. Finally I conclude that this lack of ecosystem control is no bar to strong governance, since that can be enforced by any contributor through the copyright mechanism, and the very lack of control is what keeps the commit agency correctly serving the community.

A Community of Developers

The standard definition of any group of people building some form of open source software is usually that they’re developers (people with the necessary technical skills to create or contribute to the project). In pretty much every developer driven community, they’re doing it because they get something out of the project itself (this is the scratch your own itch idea in the Cathedral and the Bazaar): usually because they use the project in some form, but sometimes because they’re fascinated by the ideas it embodies (this latter is really how the Linux Kernel got started because ordinarily a kernel on its own isn’t particularly useful but, for a lot of the developers, the ideas that went into creating unix were enormously fascinating and implementations were completely inaccessible in Europe thanks to the USL vs BSDi lawsuit).

The reason for discussing developer driven communities is very simple: they’re the predominant type of community in open source (Think Linux Kernel, Gnome, KDE etc) which implies that they’re the natural type of community that forms around shared code collaboration. In this model of interaction, community and code are interlinked: Caring for the code means you also care for the community. The health of this type of developer community is very easily checked: ask how many contributors would still contribute to the project if they weren’t paid to do it (reduction in patch volume doesn’t matter, just the desire to continue sending patches). If fewer than 50% of the core contributors would cease contributing if they weren’t paid then the community is unhealthy.

Developer driven communities suffer from three specific drawbacks:

They’re fractious: people who care about stuff tend to spend a lot of time arguing about it. Usually some form of self organising leadership fixes a significant part of this, but it’s not guaranteed.

Since the code is built by developers for developers (which is why they care about it) there’s no room for users who aren’t also developers in this model.

The community is informal so there’s no organisation for corporations to have a peer relationship with, plus developers don’t tend to trust corporate motives anyway making it very difficult for corporations to join the community.

Trusting Corporations and Involving Users

Developer communities often distrust the motives of corporations because they think corporations don’t care about the code in the same way as developers do. This is actually completely true: developers care about code for its own sake but corporations care about code only as far as it furthers their business interests. However, this business interest motivation does provide the basis for trust within the community: as long as the developer community can see and understand the business motivation, they can trust the Corporation to do the right thing; within limits, of course, for instance code quality requirements of developers often conflict with time to release requirements for market opportunity. This shared interest in the code base becomes the framework for limited trust.

Enter the community manager: A community manager’s job, when executed properly, is twofold: one is to take corporate business plans and realign them so that some of the corporate goals align with those of useful open source communities and the second is to explain this goal alignment to the relevant communities. This means that a good community manager never touts corporate “community credentials” but instead explains in terms developers can understand the business reasons why the community and the corporation should work together. Once the goals are visible and aligned, the developer community will usually welcome the addition of paid corporate developers to work on the code. Paying for contributions is the most effective path for Corporations to exert significant influence on the community and assurance of goal alignment is how the community understands how this influence benefits the community.

Involving users is another benefit corporations can add to the developer ecosystem. Users who aren’t developers don’t have the technical skills necessary to make their voices and opinions heard within the developer driven community but corporations, which usually have paying users in some form consuming the packaged code, can respond to user input and could act as a proxy between the user base and the developer community. For some corporations responding to user feedback which enhances uptake of the product is a natural business goal. For others, it could be a goal the community manager pushes for within the corporation as a useful goal that would help business and which could be aligned with the developer community. In either case, as long as the motives and goals are clearly understood, the corporation can exert influence in the community directly on behalf of users.

Corporate Fear around Community Code

All corporations have a significant worry about investing in something which they don’t control. However, these worries become definite fears around community code because not only might it take a significant investment to exert the needed influence, there’s also the possibility that the investment might enable a competitor to secure market advantage.

Another big potential fear is loss of intellectual property in the form of patent grants. Specifically, permissive licences with patent grants allow any other entity to take the code on which the patent reads, incorporate it into a proprietary code base and then claim the benefit of the patent grant under the licence. This problem, essentially, means that, unless it doesn’t care about IP leakage (or the benefit gained outweighs the problem caused), no corporation should contribute code to which they own patents under a permissive licence with a patent grant.

Both of these fears are significant drivers of “privatisation”, the behaviour whereby a corporation takes community code but does all of its enhancements and modifications in secret and never contributes them back to the community, under the assumption that bearing the forking cost of doing this as less onerous than the problems above.

GPL is the Key to Allaying these Fears

The IP leak fear is easily allayed: whether the version of GPL that includes an explicit or implicit patent licence, the IP can only leak as far as the code can go and the code cannot be included in a proprietary product because of the reciprocal code release requirements, thus the Corporation always has visibility into how far the IP rights might leak by following licence mandated code releases.

GPL cannot entirely allay the fear of being out competed with your own code but it can, at least, ensure that if a competitor is using a modification of your code, you know about it (as do your competition), so everyone has a level playing field. Most customers tend to prefer active participants in open code bases, so to be competitive in the market places, corporations using the same code base tend to be trying to contribute actively. The reciprocal requirements of GPL provide assurance that no-one can go to market with a secret modification of the code base that they haven’t shared with others. Therefore, although corporations would prefer dominance and control, they’re prepared to settle for a fully level playing field, which the GPL provides.

Finally, from the community’s point of view, reciprocal licences prevent code privatisation (you can still work from a fork, but you must publish it) and thus encourage code sharing which tends to be a key community requirement.

Conclusions

In this first part, I conclude that the GPL, by ensuring fairness between mutually distrustful contributors and stemming IP leaks, can act as a guarantor of a workable code ecosystem for both developers and corporations and, by using the natural desire of corporations to appeal to customers, can use corporations to bridge the gap between user requirements and the developer community.

In the second part of this series, I’ll look at philosophy and governance and why GPL creates self regulating ecosystems which give corporations and users a useful function while not constraining the natural desire of developers to contribute and contrast this with other possible ecosystem models.

One of the most significant advances going from TPM1.2 to TPM2 was the addition of algorithm agility: The ability of TPM2 to work with arbitrary symmetric and asymmetric encryption schemes. In practice, in spite of this much vaunted agile encryption capability, most actual TPM2 chips I’ve seen only support a small number of asymmetric encryption schemes, usually RSA2048 and a couple of Elliptic Curves. However, the ability to support any Elliptic Curve at all is a step up from TPM1.2. This blog post will detail how elliptic curve schemes can be integrated into existing cryptographic systems using TPM2. However, before we start on the practice, we need at least a tiny swing through the theory of Elliptic Curves.

What is an Elliptic Curve?

An Elliptic Curve (EC) is simply the set of points that lie on the curve in the two dimensional plane (x,y) defined by the equation

y2 = x3 + ax + b

which means that every elliptic curve can be parametrised by two constants a and b. The set of all points lying on the curve plus a point at infinity is combined with an addition operation to produce an abelian (commutative) group. The addition property is defined by drawing straight lines between two points and seeing where they intersect the curve (or picking the infinity point if they don’t intersect). Wikipedia has a nice diagrammatic description of this here. The infinity point acts as the identity of the addition rule and the whole group is denoted E.

The utility for cryptography is that you can define an integer multiplier operation which is simply the element added to itself n times, so for P ∈ E, you can always find Q ∈ E such that

Q = P + P + P … = n × P

And, since it’s a simple multiplication like operation, it’s very easy to compute Q. However, given P and Q it is computationally very difficult to get back to n. In fact, it can be demonstrated mathematically that trying to compute n is equivalent to the discrete logarithm problem which is the mathematical basis for the cryptographic security of RSA. This also means that EC keys suffer the same (actually more so) problems as RSA keys: they’re not Quantum Computing secure (vulnerable to the Quantum Shor’s algorithm) and they would be instantly compromised if the discrete logarithm problem were ever solved.

Therefore, for any elliptic curve, E, you can choose a known point G ∈ E, select a large integer d and you can compute a point P = d × G. You can then publish (P, G, E) as your public key knowing it’s computationally infeasible for anyone to derive your private key d.

For instance, Diffie-Hellman key exchange can be done by agreeing (E, G) and getting Alice and Bob to select private keys dA, dB. Then knowing Bob’s public key PB, Alice can select a random integer r, which she publishes, and compute a key agreement as a secret point on the Elliptic Curve (r dA) × PB. Bob can derive the same Elliptic Curve point because

(r dA) × PB = (r dA)dB × G = (r dB) dA × G = (r dB) × PA

The agreement is a point on the curve, but you can use an agreed hashing or other mechanism to get from the point to a symmetric key.

Seems simple, but the problem for computing is that we really want to use integers and right at the moment the elliptic curve is defined over all the real numbers, meaning E is of infinite size and involves floating point computations (of rather large precision).

Elliptic Curves over Finite Fields

Fortunately there is a mathematical theory of finite fields, called Galois Theory, which allows us to take the Galois Field over prime number p, which is denoted GF(p), and compute Elliptic Curve points over this field. This derivation, which is mathematically rather complicated, is denoted E(GF(p)), where every point (x,y) is represented by a pair of integers between 0 and p-1. There is another theory that says the number of elements in E(GF(p))

n = |E(GF(p))|

is roughly the same size as p, meaning if you choose a 32 bit prime p, you likely have a field over roughly 2^32 elements. For every point P in E(GF(p)) it is also mathematically proveable that n × P = 0. where 0 is the zero point (which was the infinity point in the real elliptic curve).

This means that you can take any point, G, in E(GF(p)) and compute a subgroup based on it:

EG = { ∀m ∈ Zn : m × G }

If you’re lucky |EG| = |E(GF(p))| and G is the generator of the entire group. However, G may only generate a subgroup and you will find |EG| = h|E(GF(p))| where integer h is called the cofactor. In general you want the cofactor to be small (preferably less than four) for EG to be cryptographically useful.

For a computer’s purposes, EG is the elliptic curve group used for integer arithmetic in the cryptographic algorithms. The Curve and Generator is then defined by (p, a, b, Gx, Gy, n, h) which are the published parameters of the key (Gx, Gy represent the x and y elements of point G). You select a random number d as your private key and your public key P = d × G exactly as above, except now P is easy to compute with integer operations.

Problems with Elliptic Curves

Although I stated above that solving P = d × G is equivalent in difficulty to the discrete logarithm problem, that’s not generally true. If the discrete logarithm problem were solved, then we’d easily be able to compute d for every generator and curve, but it is possible to pick curves for which d can be easily computed without solving the discrete logarithm problem. This is the reason why you should never pick your own curve parameters (even if you think you know what you’re doing) because it’s very easy to choose a compromised curve. As a demonstration of the difficulty of the problem: each of the major nation state actors, Russia, China and the US, publishes their own curve parameters for use in their own cryptographic EC implementations and each of them thinks the parameters published by the others is compromised in a way that allows the respective national security agencies to derive private keys. So if nation state actors can’t tell if a curve is compromised or not, you surely won’t be able to either.

Therefore, to be secure in EC cryptography, you pick and existing curve which has been vetted and select some random Generator Point on it. Of course, if you’re paranoid, that means you won’t be using any of the nation state supplied curves …

Using the TPM2 with Elliptic Curves in Cryptosystems

The initial target for this work was the openssl cryptosystem whose libraries are widely used for deriving other uses (like https in apache or openssh). Originally, when I did the initial TPM2 enabling of openssl as described in this blog post, I added TPM2 as a patch to the existing TPM 1.2 openssl_tpm_engine. Unfortunately, openssl_tpm_engine seems to be pretty much defunct at this point, so I started my own openssl_tpm2_engine as a separate git tree to begin experimenting with Elliptic Curve keys (if you don’t use git, you can download the tar file here). One of the benefits to running my own source tree is that I can now add a testing infrastructure that makes use of the IBM TPM emulator to make sure that the basic cryptographic operations all work which means that make check functions even when a TPM2 isn’t available. The current key creation and import algorithms use secured connections to the TPM (to avoid eavesdropping) which means it’s only really possible to construct them using the IBM TSS. To make all of this easier, I’ve set up an openSUSE Build Service repository which is building for all major architectures and the openSUSE and Fedora distributions (ignore the failures, they’re currently induced because the TPM emulator only currently works on 64 bit little endian systems, so make check is failing, but the TPM people at IBM are working on this, so eventually the builds should be complete).

TPM2 itself also has some annoying restrictions. The biggest of which is that it doesn’t allow you to pass in arbitrary elliptic curve parameters; you may only use elliptic curves which the TPM itself knows. This will be annoying if you have an existing EC key you’re trying to import because the TPM may reject it as an unknown algorithm. For instance, openssl can actually compute with arbitrary EC parameters, but has 39 current elliptic curves parametrised by name. By contrast, my Nuvoton TPM2 inside my Dell XPS 13 knows precisely two curves:

jejb@jarvis:~> create_tpm2_key --list-curves
prime256v1
bnp256

However, assuming you’ve picked a compatible curve for your EC private key (and you’ve defined a parent key for the storage hierarchy) you can simply import it to a TPM bound key:

create_tpm2_key -p 81000001 -w key.priv key.tpm

The tool will report an error if it can’t convert the curve parameters to a named elliptic curve known to the TPM

Why you should use EC keys with the TPM

The initial attraction is the same as for RSA keys: making it impossible to extract your private key from the system. However, the mathematical calculations for EC keys are much simpler than for RSA keys and don’t involve finding strong primes, so it’s much simpler for the TPM (being a fairly weak calculation machine) to derive private and public EC keys. For instance the times taken to derive a RSA key from the primary seed and an EC key differ dramatically

so for a slow system like the TPM, using EC keys is a significant speed advantage. Additionally, there are other advantages. The standard EC Key signature algorithm is a modification of the NIST Digital Signature Algorithm called ECDSA. However DSA and ECDSA require a cryptographically strong (and secret) random number as Sony found out to their cost in the EC Key compromise of Playstation 3. The TPM is a good source of cryptographically strong random numbers and if it generates the signature internally, you can be absolutely sure of keeping the input random number secret.

Why you might want to avoid EC keys altogether

In spite of the many advantages described above, EC keys suffer one additional disadvantage over RSA keys in that Elliptic Curves in general are very hot fields of mathematical research so even if the curve you use today is genuinely not compromised, it’s not impossible that a mathematical advance tomorrow will make the curve you chose (and thus all the private keys you generated) vulnerable. Of course, the same goes for RSA if anyone ever cracks the discrete logarithm problem, but solving that problem would likely be fully published to world acclaim and recognition as a significant contribution to the advancement of number theory. Discovering an attack on a currently used elliptic curve on the other hand might be better remunerated by offering to sell it privately to one of the national security agencies …

If, like me, you run your own cloud server, at some point you need TLS certificates to export secure services. Like a lot of people I object to paying a so called X.509 authority a yearly fee just to get a certificate, so I’ve been using a free startcom one for a while. With web browsers delisting startcom, I’m unable to get a new usable certificate from them, so I’ve been investigating letsencrypt instead (if you’re in to fun ironies, by the way, you can observe that currently the letsencrypt site isn’t using a letsencrypt certificate, perhaps indicating the administrative difficulty of doing so).

The problem with letsencrypt certificates is that they currently have a 90 day expiry, which means you really need to use an automated tool to keep your TLS certificate current. Fortunately the EFF has developed such a tool: certbot (they use a letsencrypt certificate for their site, indicating you can have trust that they do know what they’re doing). However, one of the problems with certbot is that, by default, it generates a new key each time the certificate is renewed. This isn’t a problem for most people, but if you use DANE records, it causes significant issues.

Why use both DANE and letsencrypt?

The beauty of DANE, as I’ve written before, is that it gives you a much more secure way of identifying your TLS certificate (provided you run DNSSEC). People verifying your certificate may use DANE as the only verification mechanism (perhaps because they also distrust the X.509 authorities) which means the best practice is to publish a DANE TLSA record for each service and also use an X.509 authority rooted certificate. That way your site just works for everyone.

The problem here is that being DNS based, DANE records can be cached for a while, so it can take a few days for DANE certificate updates to propagate through the DNS infrastructure. DANE records have an answer for this: they have a mode where the record identifies only the hash of the public key used by the website, not the certificate itself, which means you can change your certificate as much as you want provided you keep the same public/private key pair. And here’s the rub: if certbot is going to try to give you a new key on each renewal, this isn’t going to work.

Making certbot work with DANE

Fortunately, there is a solution: the certbot manual mode (certonly) takes a –csr flag which allows you to construct your own certificate request to send to letsencrypt, meaning you can keep a fixed key … at the cost of not using most of the certbot automation. So, how do you construct a correct csr for letsencrypt? Like most free certificates, letsencrypt will only allow you to specify the certificate commonName, which must be a DNS pointer to the actual website. If you need a certificate that covers multiple sites, all the other sites must be enumerated in the x509 v3 extensions field subjectAltName. Let’s look at how openssl can generate such a certificate request. One of the slight problems is that openssl, being a cranky tool, does not allow you to specify a subjectAltName on the command line, so you have to construct a special configuration file for it. I called mine letsencrypt.conf

Where mykey.key is the path to your private key (you need a private key because even though the CSR only contains the public key, it is also signed). However, once you’ve produced this letsencrypt.csr, you no longer need the private key and, because it’s undated, it will now work forever, meaning the infrastructure you put into place with certbot doesn’t need to be privileged enough to access your private key. Once this is done, you make sure you have TLSA 3 1 1 records pointing to the hash of your public key (here’s a handy website to generate them for you) and you never need to alter your DANE record again. Note, by the way that letsencrypt certificates can be used for non-web purposes (I use mine for encrypted SMTP as well), so you’ll need one DANE record for every service you use them for.

Putting it all together

Now that you have your certificate request, depending on what version of certbot you have, you may need it in DER format

And finally you can run this as a cron script under whichever user you’ve chosen to have sufficient privilege to write the certificates. I run this every month, so I know I if anything goes wrong I have at least two months to fix it.

Oh, and just in case you need proof that I got this all working, here you are!

Recently Microsoft started mandating TPM2 as a hardware requirement for all platforms running recent versions of windows. This means that eventually all shipping systems (starting with laptops first) will have a TPM2 chip. The reason this impacts Linux is that TPM2 is radically different from its predecessor TPM1.2; so different, in fact, that none of the existing TPM1.2 software on Linux (trousers, the libtpm.so plug in for openssl, even my gnome keyring enhancements) will work with TPM2. The purpose of this blog is to explore the differences and how we can make ready for the transition.

What are the Main 1.2 vs 2.0 Differences?

The big one is termed Algorithm Agility. TPM1.2 had SHA1 and RSA2048 only. TPM2 is designed to have many possible algorithms, including support for elliptic curve and a host of government mandated (Russian and Chinese) crypto systems. There’s no requirement for any shipping TPM2 to support any particular algorithms, so you actually have to ask your TPM what it supports. The bedrock for TPM2 in the West seems to be RSA1024-2048, ECC and AES for crypto and SHA1 and SHA256 for hashes1.

What algorithm agility means is that you can no longer have root keys (EK and SRK see here for details) like TPM1.2 did, because a key requires a specific crypto algorithm. Instead TPM2 has primary “seeds” and a Key Derivation Function (KDF). The way this works is that a seed is simply a long string of random numbers, but it is used as input to the KDF along with the key parameters and the algorithm and out pops a real key based on the seed. The KDF is deterministic, so if you input the same algorithm and the same parameters you get the same key again. There are four primary seeds in the TPM2: Three permanent ones which only change when the TPM2 is cleared: endorsement (EPS), Platform (PPS) and Storage (SPS). There’s also a Null seed, which is used for ephemeral keys and changes every reboot. A key derived from the SPS can be regarded as the SRK and a key derived from the EPS can be regarded as the EK. Objects descending from these keys are called members of hierarchies2. One of the interesting aspects of the TPM is that the root of a hierarchy is a key not a seed (because you need to exchange secret information with the TPM), and that there can be multiple of these roots with different key algorithms and parameters.

Additionally, the mechanism for making use of keys has changed slightly. In TPM 1.2 to import a secret key you wrapped it asymmetrically to the SRK and then called LoadKeyByBlob to get a use handle. In TPM2 this is a two stage operation, firstly you import a wrapped (or otherwise protected) private key with TPM2_Import, but that returns a private key structure encrypted with the parent key’s internal symmetric key. This symmetrically encrypted key is then loaded (using TPM2_Load) to obtain a use handle whenever needed. The philosophical change is from online keys in TPM 1.2 (keys which were resident inside the TPM) to offline keys in TPM2 (keys which now can be loaded when needed). This philosophy has been reinforced by reducing the space available to keep keys loaded in TPM2 (see later).

Playing with TPM2

If you have a recent laptop, chances are you either have or can software upgrade to a TPM2. I have a dell XPS13 the skylake version which comes with a software upgradeable Nuvoton TPM. Dell kindly provides a 1.2->2 switching program here, which seems to work under Freedos (odin boot) so I have a physical TPM2 based system. For those of you who aren’t so lucky, you can still play along, but you need a TPM2 emulator. The best one is here; simply download and untar it then type make in the src directory and run it as ./tpm_server. It listens on two TCP ports, 2321 and 2322, for TPM commands so there’s no need to install it anywhere, it can be run directly from the source directory.

After that, you need the interface software called tss2. The source is here, but Fedora 25 and recent Ubuntu already package it. I’ve also built openSUSE packages here. The configuration of tss2 is controlled by environment variables. The most important one is TPM_INTERFACE_TYPE which tells it how to connect to the TPM2. If you’re using a simulator, you set this to “socsim” and if you have a real TPM2 device you set it to “dev”. One final thing about direct device connection: in tss2 there’s no daemon like trousers had to broker the connection, all your users connect directly to the TPM2 device /dev/tpm0. To do this, the device has to support read and write by arbitrary users, so its permissions need to be 0666. I’ve got a udev script to achieve this

Which goes in /etc/udev/rules.d/80-tpm-2.rules on openSUSE. The next thing you need to do, if you’re running the simulator, is power it on and start it up (for a real device, this is done by the bios):

tsspowerup
tssstartup

The simulator will now create a NVChip file wherever you started it to store NV ram based objects, which it will read on next start up. The first thing you need to do is create an SRK and store it in NV memory. Microsoft uses the well known key handle 81000001 for this, so we’ll do the same. The reason for doing this is that a real TPM takes ages to run the KDF for RSA keys because it has to look for prime numbers:

As you can see: the simulator created a primary storage key (the SRK) in a few milliseconds, but it took my real TPM2 20 seconds to do it3 … not something you want to wait for, hence the need to store this permanently under a well known key handle and get rid of the temporary copy

tssevictcontrol tells the TPM to copy the key at transient handle 800000004 to permanent NV handle 81000001 and tssflushcontext erases the transient key. Flushing transient objects is very important, because TPM2 has a lot less transient storage space than TPM1.2 did; usually only about three handles worth. You can tell how much you have by doing

tssgetcapability -cap 6|grep -i transient
TPM_PT 0000010e value 00000003 TPM_PT_HR_TRANSIENT_MIN - the minimum number of transient objects that can be held in TPM RAM

Where the value (00000003) tells me that the TPM can store at least 3 transient objects. After that you’ll start getting out of space errors from it.

The final step in taking ownership of a TPM2 is to set the authorization passwords. Each of the four hierarchies (Null, Owner, Endorsement, Platform) and the Lockout has a possible authority password. The Platform authority is cleared on startup, so there’s not much point setting it (it’s used by the BIOS or Firmware to perform TPM functions). Of the other four, you really only need to set Owner, Endorsement and Lockout (I use the same password for all of them).

After this is done, you’re all set. Note that as well as these authorizations, each object can have its own authorization (or even policy), so the SRK you created earlier still has no password, allowing it to be used by anyone. Note also that the owner authorization controls access to the NV memory, so you’ll need to supply it now to make other objects persistent.

An Aside about Real TPM2 devices and the Resource Manager

Although I’m using the code below to store my keys in the TPM2, there’s a couple of practical limitations which means it won’t work for you if you have multiple TPM2 using applications without a kernel update. The two problems are

The Linux Kernel TPM2 device /dev/tpm0 only allows one user at once. If a second application tries to open the device it will get an EBUSY which causes TSS_Create() to fail.

Because most applications make use of transient key slots and most TPM2s have only a couple of these, simultaneous users can end up running out of these and getting unexpected out of space errors.

The solution to both of these is something called a Resource Manager (RM). What the RM does is effectively swap transient objects in and out of the TPM as needed to prevent it from running out of space. Linux has an issue in that both the kernel and userspace are potential users of TPM keys so the resource manager has to live inside the kernel. Jarkko Sakkinen has preliminary resource manager patches here, and they will likely make it into kernel 4.11 or 4.12. I’m currently running my laptop with the RM patches applied, so multiple applications work for me, but since these are preliminary patches, I wouldn’t currently advise others to do this. The way these patches work is that once you declare to the kernel via an ioctl that you want to use the RM, every time you send a command to the TPM, your context gets swapped in, the command is executed, the context is swapped out and the response sent meaning that no other user of the TPM sees your transient objects. The moment you send the ioctl, the TPM device allows another user to open it as well.

Using TPM2 as a keystore

Once the implementation is sorted out, openssl and gnome-keyring patches can be produced for TPM2. The only slight wrinkle is that for create_tpm2_key you require a parent key to exist in the NV storage (at the 81000001 handle we talked about previously). So to convert from a password protected openssh RSA key to a TPM2 based one, you do

And then gnome keyring manager will work nicely (make sure you keep a copy of your original private key in case you want to move to a new laptop or reseed the TPM2). If you use the same TPM2 password as your original key password, you won’t even need to update the gnome loginkeyring for the new password.

Conclusions

Because of the lack of an in-kernel Resource Manager, TPM2 is ready for experimentation in Linux but definitely not ready for prime time yet (unless you’re willing to patch your kernel). Hopefully this will change in the 4.11 or 4.12 kernel when the Resource Manager finally goes upstream5.

Looking forwards to the new stack, the lack of a central daemon is really a nice feature: tcsd crashing used to kill all of my TPM key based applications, but with tss2 having no central daemon, everything has just worked(tm) so far. A kernel based RM also means that the kernel can happily use the TPM (for its trusted keys and disk encryption) without interfering with whatever userspace is doing.

One of the questions about the previous post on using your TPM as a secure key store was “could the TPM be used to protect ssh keys?” The answer is yes, because openssh uses openssl (so you can simply convert an openssh private key to a TPM private key) but the ssh-agent wouldn’t work because ssh-add passes in the private keys by their component primes, which is not possible when the private key is guarded by the TPM. However, that made me actually look at gnome-keyring to figure out how it worked and whether it could be used with the TPM. The answer is yes, but there are also some interesting side effects of TPM enabling gnome-keyring.

Gnome-keyring Architecture

Gnome-keyring consists of essentially three components: a pluggable store backend, a secure passphrase holder (which is implemented as a backend store) and an agent frontend. The frontend and backend talk to each other using the pkcs11 protocol. This also means that the backend can serve anything that also speaks pkcs11, which most encryption systems do. The stores consist of a variety of file or directory backed keys (for instance the ssh-store simply loads all the ssh keys in your $HOME/.ssh; the secret-store uses the gnome default keyring store, $HOME/.local/shared/keyring/, to store collections of passwords) The frontends act as bridges between a variety of external protocols and the keyring daemon. They take whatever external protocol they’re speaking in, convert the request to pkcs11 and query the backends for the information. The most important frontend is the login one which is called by gnome-keyring-daemon at start of day to unlock the secret-store which contains all your other key passwords using your login password as the key. The ssh-agent frontend speaks the ssh agent protocol and converts key and signing requests to pkcs11 which is mostly served by the ssh-store. The gpg-agent store speaks a very cut down version of the gpg agent protocol: basically all it does is allow you to store gpg key passwords in the secret-store; it doesn’t do any cryptographic operations.

Pkcs11 Essentials

Pkcs11 is a highly complex protocol designed for opening sessions which query or operate on tokens, which roughly speaking represent a bundle of objects (If you have a USB crypto key, that’s usually represented as a token and your stored keys as objects). There is some intermediate stuff about slots, but that mostly applies to tokens which may be offline and need insertion, which isn’t relevant to the gnome keyring. The objects may have several levels of visibility, but the most common is public (always visible) and private (must be logged in to the token to see them). Objects themselves have a variety of attributes, some of which depend on what type of object they are and some of which are universal (like id and label). For instance, an object representing an RSA public key would have the public exponent and the public modulus as queryable attributes.

The pkcs11 protocol also has a reasonably comprehensive object finding protocol. An arbitrary list of attributes and values can be passed to the query and it will return all objects that fully match. The token identifier is a query attribute which may be present but doesn’t have to be, so if you omit it, you end up searching over every token the pkcs11 subsystem knows about.

The other operation that pkcs11 understands is logging into a token with a pin (which is pkcs11 speak for a passphrase). The pin doesn’t have to be supplied by the entity performing the login, it may be supplied to the token via an out of band mechanism (for instance the little button on the yukikey, or even a real keypad). The important thing for gnome keyring here is that logging into a token may be as simple as sending the login command and letting the token sort out the authorization, which is the way gnome keyring operates.

You also need to understand is conventions about searching for keys. The pkcs11 standard recommends (but does not require) that public and private key pairs, which exist as two separate objects, should have the same id attribute. This means that if you want to find an rsa private key, the way you do this is by searching the public objects for the exponent and modulus. Once this is returned, you log into the token and retrieve the private key object by the public key id.

The final thing to understand is that once you have the private key object, it merely acts as an entitlement to have the token perform certain private key operations for you (like encryption, decryption or signing); it doesn’t mean you have access to the private key itself.

Gnome Keyring handling of ssh keys

Once the ssh-agent side of gnome keyring receives a challenge, it must respond by returning the private key signature of the challenge. To do this, it searches the pkcs11 for the key used for the challenge (for RSA keys, it searches by modulus and exponent, for DSA keys it searches by signature primes, etc). One interesting point to note is the search isn’t limited to the gnome keyring ssh token store, so if the key is found anywhere, in any pkcs11 token, it will be used. The expectation is that they key id attribute will be the ssh fingerprint and the key label attribute will be the ssh comment, but these aren’t used in the actual search. Once the public key is found, the agent logs into the token, retrieves the private key by label and id and proceeds to get the private key to sign the challenge.

Adding TPM key handling to the ssh store

Gnome keyring is based on GNU libgcrypt which handles all cryptographic objects via s-expressions (which are basically lisp like strings). Gcrypt itself doesn’t seem to have an s-expression for a token, so the actual signing will be done inside the keyring code. The s-expression I chose to represent a TPM based private key is

The rsa is necessary for it to be recognised for RSA signatures. Now the ssh-store is modified to recognise the PEM guards for TPM keys, load the key blob up into the s-expression and ask the secret-store for the authorization passphrase which is also loaded. In theory, it might be safer to ask the secret store for the authorization at key use time, but this method mirrors what happens to private keys, which are decrypted at load time and stored as s-expressions containing the component primes.

Now the RSA signing code is hooked to check for TPM s-expressions and divert the signature to the TPM if they’re found. Once this is done, gnome keyring is fully enabled for TPM based ssh keys. The initial email thread about this is here, and an openSUSE build repository of the modified code is here.

One important design constraint for this is that people (well, OK me) have a lot of ssh keys, sometimes more than could be effectively stored by the TPM in its internal shielded memory (plus if you’re like me, you’re using the TPM for other things like VPN), so the design of the keyring TPM additions is not to burden that memory further, thus TPM keys are only loaded into the TPM whenever they’re needed for an operation and are unloaded otherwise. This means the TPM can scale to hundreds of keys, but at the expense of taking longer: Instead of simply asking the TPM to sign something, you have to first ask the TPM to load and unwrap (which is an RSA operation) the key, then use it to sign. Effectively it’s two expensive TPM operations per real cryptographic function.

Using TPM based ssh keys

Once you’ve installed the modified gnome-keyring package, you’re ready actually to make use of it. Since we’re going to replace all your ssh private keys with TPM equivalents, which are keyed to a given storage root key (SRK)1 which changes every time the TPM is cleared, it is prudent to take a backup of all your ssh keys on some offline encrypted USB stick, just in case you ever need to restore them (or transfer them to a new laptop2).

You’ll be prompted first to create a TPM authorization password for the key, then to verify it, then to give the PEM password for the ssh key (for each key). Note, this only transfers RSA keys (the only algorithm the TPM can handle) and also note the script above is overwriting the PEM private key, so make sure you have a backup. The create_tpm_key command comes from the openssl_tpm_engine package, which I’ve patched here to support random migration authority and well known SRK authority.

All you have to do now is log out and back in (to restart the gnome-keyring daemon with the new ssh keystore) and you’re using TPM based keys for all your ssh operations. I’ve noticed that this adds a couple of hundred milliseconds per login, so if you batch stuff over ssh, this is why your scripts are slower.

One of the new features of Linux Plumbers Conference this year was the TPM Microconference, which facilitated great discussions both in the session itself and in the hallways. Quite a bit of discussion was generated by the Beginner’s Guide to the TPM talk I gave, mostly because I blamed the Trusted Computing Group for the abject failure to adopt TPMs for anything citing the incredible complexity of their stack.

The main thing that came out of this discussion was that a lot of this stack complexity can be hidden from users and we should concentrate on making the TPM “just work” for all cryptographic functions where we have parallels in the existing security layers (like the keystore). One of the great advantages of the TPM, instead of messing about with USB pkcs11 tokens, is that it has a file format for TPM keys (I’ll explain this later) which can be used directly in place of standard private key files. However, before we get there, lets discuss some of the basics of how your TPM works and how to make use of it.

TPM Basics

Note that all of what I’m saying below applies to a 1.2 TPM (the type most people have in their laptops) 2.0 TPMs are now appearing on the market, but chances are you have a 1.2.

A TPM is traditionally delivered in your laptop in an uninitialised state. In older laptops, the TPM is traditionally disabled and you usually have to find an entry in the BIOS menu to enable it. In more modern laptops (thanks to Windows 10) the TPM is enabled in the bios and ready for the OS install to make use of it. All TPMs are delivered with one manufacturer set key called the Endorsement Key (EK). This key is unique to your TPM (like an identifying label) and is used as part of the attestation protocol. Because the EK is a unique label, the attestation protocol is rather complex involving a so called privacy CA to protect your identity, but because it isn’t necessary to use the TPM as a secure keystore, I won’t cover it further.

The other important key, which you have to generate, is called the Storage Root Key. This key is generated internally within the TPM once somebody takes ownership of it. The package you need to begin using the tpm is tpm-tools, which is packaged by most distros. You must also have the Linux TSS stack trousers installed (just installing tpm-tools will often pull this in) and have the tcsd part of trousers running (usually systemctl start tcsd; systemctl enable tcsd). I tend to configure my TPM with an owner password (for things like resetting dictionary attacks) but a well known storage root key authority. To do this from a fully cleared and enabled TPM, execute

tpm_takeownership -z

And give your chosen owner password when prompted. If you get an error, chances are you need to go back to the BIOS menu and actively clear and reset the TPM (usually under the security options).

Aside about Authority and the Trusted Security Stack

To the TPM, an “authority” is a 20 byte number you use to prove you’re allowed to manipulate whatever object you’re trying to use. The TPM typically has a well known way of converting typed passwords into these 20 byte codes. The way you prove you know the authority is to add a Hashed Message Authentication Code (HMAC) to your TPM command. This means that the hash can only be generated by someone who knows the authority for the object, but anyone seeing the hash cannot derive the authority from it. The utility of this is that the trousers library (tspi) generates the HMAC before the TPM command is passed to the central daemon (tcsd) meaning that nothing except you and the TPM know the authority

The final thing about authority you need to know is that the TPM has a concept of “well known authority” which simply means supply 20 bytes of zeros. It’s kind of paradoxical to have a secret everyone knows, however, there are reasons for this: For most objects in the TPM whether you require authority to use them is optional, but for some it is mandatory. For objects (like the SRK) where authority is mandatory, using the well known authority is equivalent to saying actually I don’t need authorization for this object.

The Storage Root Key (SRK)

Once you’ve generated this above, the TPM keeps the secret part permanently hidden, but can be persuaded to give anyone the public part. In TPM 1.2, the SRK is a RSA 2048 key. On most modern TPMs, you have to tell the tpm you want anyone to be able to read the public part of the storage root key, which you do with this command

tpm_restrictsrk -a

You’ll get prompted for the owner password. Once you execute this command, anyone who knows the SRK authority (which you’ve set to be well known) is allowed to read the public part.

Why all this fuss about SRK authorization? Well, traditionally, the TPM is designed for use in a hostile multi-user environment. In the relaxed, no authorization, environment I’ve advised you to set up, anyone who knows the SRK can upload any storage object (like a key or protected blob) into the TPM. This means, since the TPM has very limited storage, that they could in theory do a DoS attack against the TPM simply by filling it with objects. On a laptop where there’s only one user (you) this is not usually a concern, hence the advice to use a well known authority, which makes the TPM much easier to use.

The way external objects (like keys or data blobs) are uploaded into the TPM is that they all have a parent (which must be a storage key) and they are encrypted to the public part of this key (in TPM parlance, this is called wrapping). The TPM can have deep key hierarchies (all eventually parented to the SRK), but for a laptop, it makes sense simply to use the SRK as the only storage key and wrap everything for it as the parent. Now here’s the reason for the well known authority: to upload an object into the TPM, it not only needs to be wrapped to the parent key, you also need to use the parent key authority to perform the upload. The object you’re using also has a separate authority. This means that when you upload and use a key, if you’ve set a SRK password, you’ll end up having to type both the SRK password and the key password pretty much every time you use it, which is a bit of a pain.

The tools used to create wrapped keys are found in the openssl_tpm_engine package. I’ve done a few patches to make it easier to use (mostly by trying well known authority first before asking for the SRK password), so you can see my patched version here. The first thing you can do is take any PEM key file you have and wrap it for your tpm

create_tpm_key -m -w test.key test.tpm.key

This creates a TPM key file test.tpm.key containing a wrapped key for your TPM with no authority (to add an authority password, use the -a option). If you cat the test.tpm.key file, you’ll see it looks like a standard PEM file, except the guards are now

-----BEGIN TSS KEY BLOB-----
-----END TSS KEY BLOB-----

This key is now wrapped for your TPM’s SRK and would only be usable on your laptop. If you’re fortunate enough to be using an application linked with gnutls, you can simply use this key with the URI tpmkey:file=<path to test.tpm.key>. If you’re using openssl, you need to patch it to get it to use TPM keys easily (see below).

The ideal, however, would be since these are PEM files with unique guards, any ssl provider should simply recognise the guards and load the key into the TPM This means that in order to use a TPM key, you take a standard PEM private key file, transform it into a TPM key file and then simply copy it back to where the original key file was being used from and voila! you’re using a TPM based key. This is what the openssl patches below do.

Getting TPM keys to “just work” with openssl

In openssl, external encryption processors, like the TPM or USB keys are used by things called engines. The engine you need for the TPM is also in the openssl_tpm_engine package, so once you’ve installed that package, the engine is available. Unfortunately, openssl doesn’t naturally use a particular engine unless told to do so (most of the openssl tools have a -engine option for this). However, having to specify the engine in every application somewhat spoils the “just works” aspect we’re looking for, so the openssl patches here allow an engine to specify that it knows how to parse a PEM file and can load a key from it. This allows you simply to replace the original key file with a TPM protected key file and have your application continue working with it.

As a demo of the usefulness, I’m using it on my current laptop with all my VPN keys. It is also possible to use it with openssh keys, since they’re standard PEM files. However, the way openssh works with agents means that the agent cannot handle the keys and you have to type the password (if you set one one the key) each time you use it.

It should be noted that the idea of having PEM based TPM keys just work in openssl is encountering resistance. However, it does just work in gnutls (provided you change the file name to be a tpmkey:file= URL).

Conclusions (or How Well is it Working?)

As I said above, I’m currently using this scheme for my openvpn and ssh keys. I have to confess, since I use openssh a lot, I got very tired of having to type the password on every ssh operation, so I’ve gone back to using non-TPM based keys which can be handled by the agent. Fixing this is on my list of things to look at. However, I still am using TPM based keys for my openvpn.

Even for openvpn, though there are hiccoughs: the trousers daemon, tcsd, crashes periodically on my platform. When it does, the vpn goes down (because the VPN needs a key based authentication transaction every hour to rotate the symmetric encryption keys). Unfortunately, just restarting tcsd isn’t enough because the design of trousers doesn’t seem to be robust to this failure (even though the tspi part linked with the application could recreate all the keys), so the VPN itself must be restarted when this happens, which makes it rather user unfriendly. Fixing trousers to cope with tcsd failure is also on my list of things to fix …

Reading Matthew Garret’s exposés of home automation IoT devices makes most engineers think “hell no!” or “over my dead body!”. However, there’s also the siren lure that the ability to program your home, or update its settings from anywhere in the world is phenomenally useful: for instance, the outside lights in my house used to depend on two timers (located about 50m from each other). They were old, loud (to the point the neighbours used to wonder what the buzzing was when they visited) and almost always wrongly set for turning the lights on at sunset. The final precipitating factor for me was the need to replace our thermostat, whose thermistor got so eccentric it started cooling in winter; so away went all the timers and their loud noises and in came a z-wave based home automation system, and the guilty pleasure of having an IoT based home automation system. Now the lights precisely and quietly turn on at sunset and off at 23:00 (adjusting themselves for daylight savings); the thermostat is accessible from my phone, meaning I can adjust it from wherever I happen to be (including Hong Kong airport when I realised I’d forgotten to set it to energy saving mode before we went on holiday). Finally, there’s waking up at 3am to realise your wife has fallen asleep over her book again and being able to turn off her reading light from your alarm clock without having to get out of bed … Automation bliss!

Selecting your network

For me, nothing IP/Wifi based was partly due to Matthew’s blog and partly because my home Wifi network looks different from everyone else’s: I actually run an internal, secure, home network that is wired and have my Wifi sit unsecured and outside the firewall. This goes back to the good old days of expecting to find wifi wherever you travelled and returning the courtesy by ensuring your wifi was accessible, but it does mean that any wifi connected device would be outside my firewall and open to all, which, given the general insecurity of the devices, makes this a non-starter.

The next level down is to use a private network, like zigbee or z-wave. I chose z-wave because it covers longer distances (which I need) and it doesn’t interfere with wifi (I have a hard time covering the entire house, even with two wifi access points). Z-wave also looks secure, but, if you dig deeply, you find that there are flaws in the protocol that lay you open to a local attacker. This, by the way, shows the futility of demanding security from IoT vendors who really don’t understand how to do it: a flawed security implementation is pretty much as bad as no security at all.

Once this decision is made, the next is to choose a gateway to the internet that does what you want, namely give you remote control without giving up your security.

Gateway Phone Home?

A surprising number of z-wave controllers are of the phone home type (this means phone their manufacturer’s home, not you), and almost all of these simply won’t work if they’re not allowed to phone home. Google comprehensively demonstrated the issues this raises with nest: lots of early adopters now have so much non-functional junk.

For me, there was also the burned hand experience with Google services: whenever I travel, I invariably get locked out because of some pseudo-security issue and it takes a fight to get back in again. This ultimately precipitated my move away from the Google cloud and on to Owncloud for calendar and contacts, but also means I really don’t want to have to trust another external service for my home automation.

Given the significantly limited choice of non-phone home z-wave controllers, I chose the HomeSeer Zee S2. It’s basically a raspberry pi with a z-wave dongle and Linux. If you’re into Linux on evereything, you should be aware that the home automation system is actually written in .net and it uses mono to bridge the gap; an odd choice given that there’s no known windows platform that could actually possibly run this system.

Secure Internet based Automation

The ZS2 does actually come with wifi, but given my already listed wifi problems, it’s actually plugged into my secure wired network with all phone home capabilities disabled. Great, but that means it’s only accessible over a VPN and I want to be able to control it from things like my phone, where running a VPN is cumbersome, so lets do some magic tricks to make it securely accessible by any member of the family from any device.

Obviously, since I already run Owncloud, I have a server of my own in a co-located site. It’s this server I propose to use as my secure gateway. The obvious way of doing this is simply proxying the ZS2 controller web page, but there are a couple of problems: firstly if I do it globally the ZS2 will be visible to port scans and secondly it only actually has an unencrypted web page with http authentication, meaning the login credentials would go over the internet in clear text … oops!

The solution to the first of these is to make the web page only accessible to authenticated devices. My current method is to use firewall whitelisting and a hook to an existing service authentication to open up the port. So in the firewall mangle table, all the ports which require whitelisting are marked. Then, in the input firewall, any packet so marked is checked against the whitelist for a matching source IP. If a match is found, then the packet is permitted, otherwise it is denied.

Meaning that any IP address that gets an authenticated imap connection (which is basically everybody’s internet device, since they all connect to email) is now allowed to access the authenticated ports. Since imap requires re-authentication after a configurable timeout, the whitelist entry only lasts for just over that timeout and hey presto, we have our secured port system.

Obviously, this isn’t foolproof: in particular whitelisting by external IP means that anyone sharing the same ip address via nat (like at a hotel) also has access to the secured ports, but it does cut down enormously on generic internet visibility.

The final thing is to add security to the insecure web page, so anyone in the path to my internet host can’t sniff the password. This is easily achieved by an stunnel redirect from the secure incoming port to the ZS2 over the VPN that connects to the internal network. The beauty of this is that stunnel can now use the existing web certificate for my internet host to afford protection from man in the middle attacks as well.

Last thoughts about Security

Obviously, the security above isn’t perfect. Anyone sharing my external IP would be able to run a port scan and (if they’re clever) detect the https port the ZS2 is on. However, it does require a lot of luck to do this and, obviously, even if they’re in the fortunate position of sharing an IP address, I’ve changed the default password, so the recent Mirai attack wouldn’t have been able to compromise the device.

Do I think this is good enough security: absolutely. In security, the bear principle applies: in that when escaping from a ravenous bear, you don’t have to be able to run faster than the bear itself, you merely need to be able to run faster than the slowest other potential food source … In internet terms, this means that while there are so many completely insecure devices out there, no-one can be bothered to hack a moderately secure system like mine because the customisation makes it quite a bit harder. It’s also instructive to think that the bear principle is why Linux has such a security reputation: it’s not that we have perfect security against virus and trojan systems, it’s just that Windows was always so much worse …

Eventually, something like Mirai will look to attack the ZS2 web server itself (it is .net based, after all) rather than simply try a list of default passwords and then I’ll need to be a bit more clever, but while everyone else is so much more insecure, that day will be long delayed.

A while ago, a goal I set myself was to be able to maintain my build and test environments for architecture emulation containers without having to do any of the tasks as root and without creating any suid binaries to do this. One of the big problems here is that distributions get annoyed (and don’t run correctly) if root doesn’t own most of the files … for instance the installers all check to see that the file got installed with the correct ownership and permissions and fail if they don’t. Debian has an interesting mechanism, called fakeroot, to get around this using a preload library intercepting the chmod and chown system calls, but it’s getting a bit hackish to try to extend this to work permanently for an emulation container.

The correct way to do this is with user namespaces so the rest of this post will show you how. Before we get into how to use them, lets begin with the theory of how user namespaces actually work.

Theory of User Namespaces

A user namespace is the single namespace that can be created by an unprivileged user. Their job is to map a set of interior (inside the user namespace) uids, gids and projids3 to a set of exterior (outside the user namespace).

The way this works is that the root user namespace simply has a 1:1 identity mapping of all 2^32 identifiers, meaning it fully covers the space. However, any new user namespace only need remap a subset of these. Any id that is not mapped into the user namespace becomes inaccessible to that namespace. This doesn’t mean completely inaccessible, it just means any resource owned or accessed by an unmapped id treats an attempted access (even from root in the namespace) as though it were completely unprivileged, so if the resource is readable by any id, it can still be read even in a user namespace where its owning id is unmapped.

User namespaces can also be nested but the nested namespace can only map ids that exist in its parent, so you can only reduce but not expand the id space by nesting. The way the nested mapping works is that it remaps through all the parent namespaces, so what appears on the resource is still the original exterior ids.

User Namespaces also come with an owner (the uid/gid of the process that created the container). The reason for this is that this owner is allowed to execute setuid/setgid to any id mapped into the namespace, so the owning uid/gid pair is the effective “root” of the container. Note that this setuid/setgid property works on entry to the namespace even if uid 0 is not mapped inside the namespace, but won’t survive once the first setuid/setgid is issued.

The final piece of the puzzle is that every other namespace also has an owning user namespace, so while I cannot create a mount namespace as unprivileged user jejb, I can as remapped root inside my user namespace

And once created, I can always enter this mount namespace provided I’m also in my user namespace.

Setting up Unprivileged Namespaces

Any system user can actually create a user namespace. However a non-root (meaning not uid zero in the parent namespace) user cannot remap any exterior id except their own. This means that, because a build container needs a range of ids, it’s not possible to set up the intial remapped namespace without the help of root. However, once that is done, the user can pretty much do every other operation4

The way remap ranges are set up is via the uid_map, gid_map and projid_map files sitting inside the /proc/<pid> directory. These files may only be written to once and never updated5

As an example, to set up a build container, I need a remapping for every id that would be created during installation. Traditionally for Linux, these are ids 0-999. I want to remap them down to something unprivileged, say 100,000 so my line entry for this is

0 100000 1000

However, I also want an identity mapping for my own id (currently I’m at uid 1000), so I can still use my home directory from within the build container. This also means I can create the roots for the containers within my home directory. Finally, the nobody user and nobody,nogroup groups also need to be mapped, so the final uid map entries look like

0 100000 1000
1000 1000 1
65534 101001 1

For the groups, it’s even more complex because on openSUSE, I’m a member of the users group (gid 100) which sits in the middle of the privileged 0-999 group range, so the gid_map entry I use is

0 100000 100
100 100 1
101 100100 899
65533 101000 2

Which is almost up to the kernel imposed limit of five separate lines.

Finally, here’s how to set this up and create a binding for the user namespace. As myself (I’m uid 1000 user name jejb) I do

Note that I become nobody inside the container because currently the map files are unwritten so there are no mapped ids at all. Now as root, I have to write the mapping files and bind the entry file to the namespace somewhere

Unprivileged Architecture Emulation Containers

Essentially, I can use the user namespace constructed above to bootstrap and enter the entire build container and its mount namespace with one proviso that I have to have a pre-created devices directory because I don’t possess the mknod capability as myself, so my container root also doesn’t possess it. The way I get around this is to create the initial dev directory as root and then change the ownership to 100000.100000 (my unprivileged ids)

This seems to be sufficient of a dev skeleton to function with. For completeness sake, I placed my bound user namespace into /run/build-container/userns and following the original architecture container emulation post, with modified susebootstrap and build-container scripts. The net result is that as myself I can now enter and administer and update the build and test architecture emulation container with no suid required

Canonical recently threw this issue into sharp relief by their decision to ship CDDL licensed ZFS as a module of the GPLv2 licensed Linux kernel. Reading their legal justification for this leaves me somewhat unconvinced because it’s essentially the same “not a derivative work” argument that a number of dubious actors have used to justify binary modules. So what I’d like to do is look at this issue from a completely different viewpoint. First by accepting the premise that CDDL and GPLv2 are incompatible (since there’s some debate on this) and secondly by accepting the even more controversial proposition that creating a kernel module is a derivative work. I don’t want to debate these premises because it’s a worst case assumption I’m using as inputs to make the following analysis possible.

What is compliance?

One of the curious thing about CDDL and GPLv2 is that they’re both copyleft (albeit in differing forms) and the compliance requirements: the release of complete corresponding source code for your binary containing the licensed work. In fact, the only significant difference is the requirement for build scripts, which is in GPLv2 but not in CDDL. Therefore you can say that if you follow the compliance regime for GPLv2 on CDDL code, you’ll be in full compliance with the CDDL. The licences do, in fact, have compatible compliance requirements. This fact becomes very relevant when you consider the requirements for bringing a copyright lawsuit in the first place.

Where’s the Harm?

Copyright law is something called a tort in law. That essentially means a branch of law for resolving disputes between individual parties. However, in order to stem what could be seen as frivolous lawsuits, bringing a claim under tort law requires not just a showing that someone broke the rules of whatever agreement they were under but also that quantifiable harm resulted6. The essential elements of a tort claim are a showing of a violation, a theory of the harm produced and a request for damages based on the harm7.

All of the bodies which do GPL enforcement recently published a charter in which they make clear that the sole requirement from an enforcement action should be compliance with the terms of the licence. However, as I showed above, it is perfectly possible to be compatibly in compliance with both the CDDL and the GPLv2. So the question becomes if the party is already in compliance, even though there’s a technical violation of the terms of the licence produced by the combination, what would our theory of harm be given that we don’t really seem to have anything extra we’d ask of the violating party?

Of course, one can wax lyrical about the “harm to the licence” of allowing incompatible combinations. However, here we’re on a very sticky wicket because there have been a lot of works published (including by the FSF itself) bemoaning the problems of licence incompatibility. Indeed, part of the design of the GPLv3 process was to make the licence more amenable to combination with other open source licences. So suddenly becoming ardent advocates for the benefits of licence incompatibility is probably to be unlikely to fly before a judge as a theory of harm.

Conclusion

What the above analysis shows is that even though we presumed combination of GPLv2 and CDDL works to be a technical violation, there’s no way actually to prosecute such a violation because we can’t develop a convincing theory of harm resulting. Because this makes it impossible to take the case to court, effectively it must be concluded that the combination of GPLv2 and CDDL, provided you’re following a GPLv2 compliance regime for all the code, is allowable. This conclusion is the same as the one Canonical reached, but the means by which I got there are very different.

Note that this conclusion would apply to mixing any open source licence with GPLv2: provided the compliance regimes are compatible and the stricter one is followed, it’s difficult to develop a theory of harm and thus the combination is allowable.

Final Thought

The above analysis is all from the point of view of the Linux kernel compliance activities. However, with ZFS, there is another copyright holder: Oracle. Nothing prevents Oracle suing for copyright violation with a theory of harm that says something like the CDDL was deliberately designed to be incompatible with GPLv2 to prevent ZFS being shipped in Linux and as the shipper of products base on ZFS, they’ve suffered commercial harm (which would be quantifiable) by this action.