Watzmann.Blog
Varying amounts of fiber

Handling SSL certificates is not a lot of fun, and while Puppet’s use of
client certificates protects the server and all its deep, dark secrets very
well from rogue clients, it also leads to a lot of frustration. In many
cases, users would configure their autosign.conf to allow any (or almost
any) client’s certificate to be signed automatically, which isn’t exactly
great for security. Since Puppet 3.4.0, it is possible to use
policy-based autosigning
to have much more control over autosigning, and to do that in a much more
secure manner than the old autosigning based solely on clients’ hostnames.

One of the uses for this is automatically providing certificates to
instances in EC2. Chris Barker
wrote a nice module,
based on a gist by
Jeremy Bouse that uses policy-based
autosigning to provide EC2 instances with certificates, based on their
instance_id.

I recently got curious, and wanted to use that same mechanism but with
preshared keys. Here’s a quick step-by-step guide of what I had to do:

The autosign script

When you set autosign in puppet.conf to point at a script, Puppet will
call that script every time a client requests a certificate, passing the
client’s certname as the sole command line argument of the script and the
CSR on stdin. If the script exits successfully, Puppet signs the
certificate; otherwise it refuses to.

On the master, we’ll maintain a directory /etc/puppet/autosign/psk; each
file in that directory must be named after a client’s certname and contain
that client’s preshared key.

Here is the autosign-psk script; the OID’s for Puppet-specific
certificate extensions can be found
here:
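The script itself did not survive in this copy of the post; the following is a sketch of what such a policy script can look like. The pp_preshared_key OID and the directory layout follow the description above, but everything else (method names, error handling) is illustrative, not the original code:

```ruby
#!/usr/bin/env ruby
# Sketch of a PSK-based policy autosign script. Puppet invokes it with
# the client's certname as the only argument and the PEM-encoded CSR on
# stdin; exit status 0 means "sign", anything else means "refuse".
require 'openssl'

PSK_DIR = '/etc/puppet/autosign/psk'
# Puppet's OID for the pp_preshared_key certificate extension
PSK_OID = '1.3.6.1.4.1.34380.1.1.4'

# Extract the preshared key from the CSR's extension-request attribute.
# Each extension is a Sequence of [ObjectId, (critical,) OctetString],
# where the OctetString wraps the DER encoding of the actual value.
def psk_from_csr(pem)
  csr = OpenSSL::X509::Request.new(pem)
  ext_req = csr.attributes.find { |a| a.oid == 'extReq' }
  return nil unless ext_req
  ext_req.value.value.first.value.each do |ext|
    next unless ext.value.first.oid == PSK_OID
    return OpenSSL::ASN1.decode(ext.value.last.value).value
  end
  nil
end

if ARGV.length == 1
  certname = ARGV[0]
  # Guard against certnames that would escape the PSK directory
  exit 1 if certname.include?('/') || certname.start_with?('.')
  path = File.join(PSK_DIR, certname)
  exit 1 unless File.file?(path)
  exit(psk_from_csr($stdin.read) == File.read(path).strip ? 0 : 1)
end
```

Pointing the autosign setting in the [master] section of puppet.conf at this script (it must be executable by the master) activates it.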

DHH has a
post
on some of the hoopla around hypermedia API’s over at
SvN, complete with a cool picture of the
WS-*. While I agree with most of his points, he’s missing the larger
point of API discoverability.

The reason discoverability is front and center in RESTful API’s isn’t some
naive belief that the semantics of the API will just magically be
discovered by the client — instead, it’s a strategy to keep logic
that belongs on the server out of clients. When a client is told that it
has to discover the URL for posting a comment to an article, it is also
told to be prepared for that operation not being available. There are lots
of reasons why that operation may not be possible for the client; none of
them need to interest the client — all it cares about is whether that
operation is advertised in the article or not.

DHH also puts up a nice strawman, and then ceremoniously burns it to the
ground:

The idea that you can write one client to access multiple different APIs
in any meaningful way disregards the idea that different apps do
different things.

Again, that misses the point, especially of discoverability. Not every API
has exactly one deployment. Many clients need to work with multiple
different deployments of the same API; the
Deltacloud API is a good example of how
discoverability lays down clear guidelines for clients on what they can
assume, and what they have to be prepared to find different at each
endpoint they want to talk to. You can look at that as making the
contract between server and client explicit in the API. Discoverability
makes conditional promises to the client: if you see X, you may safely do
Y.

We are all in agreement though that overall we want to tread very lightly
when it comes to standardizing API mechanisms: I think there are some
areas around RESTful API’s where some carefully crafted standards might
help, but staying out of range of the WS-* is much more important.

This morning, the DMTF officially
announced the availability of
CIMI v1.0. After two years of hard work,
heated discussions, and many a vote on proposed changes, CIMI is the best
shot the fragmented, confusing, and in places legally encumbered landscape
of IaaS API’s has at a universally supported API. Not just because of the
impressive number of industry players that are part of the working group
but also because it has been designed from the ground up as a modular
RESTful API,
taking the breadth of existing IaaS API’s into account.

While the name suggests that CIMI is 75% CIM, the two have actually no
relation to each other, except that they are both DMTF standards. CIMI
covers most of the familiar concepts from IaaS management: instances
(called machines), block storage (volumes), images, and networks. The
standard itself is big, though most of the features in it are optional, and
I don’t expect that any one provider will support everything mentioned in
the standard. To get started, I highly recommend reading the
primer
first, as a gentle introduction to how CIMI views the world and how it
models common IaaS concepts. The
standard
itself then serves as a convenient reference to fill in the details.

One of the goals of CIMI is that providers with widely varying feature sets
can implement it, and it therefore puts a lot of emphasis on making what
exactly a provider supports discoverable, using the
well-known mechanisms that a
RESTful style makes possible, and that we’ve also
used in the Deltacloud API to
expose as much of each backend’s features as possible. This emphasis on
discoverability is one of the things that sets CIMI apart from the popular
vendor-specific API’s, where the API has to be implemented in its entirety,
or not at all.

We’ve been involved in the working group for the last two years, bringing
our experience in designing Deltacloud to
the table. We’ve also been busy adding various pieces to Deltacloud, and
that implementation experience has been invaluable in the CIMI
discussion. We’ll continue to improve our CIMI support, and build out what
we have; in particular, we are working on

the CIMI frontend for Deltacloud; when you run deltacloudd -f cimi, you
get a server that speaks CIMI, with the entrypoint at
/cimi/cloudEntryPoint. You can try out the latest code at
https://dev.deltacloud.org/cimi/cloudEntryPoint

the CIMI client app (in clients/cimi/ in our
git repo) — the app both makes it
easier to experiment with the CIMI API and serves as an example of
CIMI client code.

a CIMI test suite; as part of our test suites, we are adding tests that
can be run against any CIMI implementation and will eventually be a
useful tool to informally qualify such implementations.

As with all open source projects, we always have way more on the todo list
than we actually have time to do. If you are interested in contributing to
Deltacloud’s CIMI effort, have a look at
our Contribute page,
stop by the mailing list, or
drop into our IRC channel #deltacloud on freenode.

Like everything,
REST API’s
change over time. An important question is how these changes should be
incorporated into your API, and how your clients should behave to survive
that evolution.

The first reflex of anybody who’s thought about API’s and their evolution
is to stick a version number on the API, and use that to signal to clients
what capabilities this incarnation of the API has, and maybe even let
clients use that to negotiate how they talk to the
server. Mark
has a very good post explaining why, for the Web, that is not just
undesirable, but often not feasible.

If versioning is out, what else can be done to safely evolve REST API’s?
Before we dive into specific examples, it’s useful to recall what our
overriding goal is. Since it is much easier to update a server than all the
clients that might talk to it, the fundamental aim of careful evolution of
REST API’s is:

Old clients must work against new servers

To make this maxim practical, clients need to follow the simple rule:

Ignore any unexpected data in the interaction with the server

In particular, clients can never assume that they have a complete picture
of what they will find in a response from the server.

Let’s look at a little toy API to make these ideas more tangible, and to
explore how this API can change while adhering to these rules. The API is
for a simplistic blogging application that allows posting articles, and
retrieving them. For the sake of simplicity, I will omit all HTTP request
and response headers.

A simple REST API

In keeping with good
REST practice,
the API has a single entrypoint at /api. Issuing a GET /api will result
in the response
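The response payload from the original post didn’t survive in this copy; a body along the following lines (all element names here are illustrative, not the post’s actual representation) conveys the idea:

```xml
<api>
  <link rel="articles" href="/api/articles"/>
  <link rel="create" href="/api/articles"/>
</api>
```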

It’s worth pointing out a subtlety in including a link for the create
action: one reason for including that link is to tell clients the URL to
which they can POST to create new articles, and keep them from making
assumptions about the URL space of the server. A more important reason
though is that we use the presence of this link to communicate to the
client that it may post new articles. This, following the
HATEOAS constraint for REST API’s,
is the more important reason to include an explicit link: clients should
not even assume that they are allowed to create new articles.

Adding information from the server

Readers might want to know when a particular article has been made
available. We therefore add a published attribute to the representation
of articles that a GET on the articles collection or on an individual
article’s URI returns:
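Again, the original payload is missing in this copy; with an assumed representation, an article would now look something like:

```xml
<article href="/api/articles/1">
  <title>A first article</title>
  <published>2012-07-10T09:00:00Z</published>
  <body>...</body>
</article>
```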

This does not break old clients, because we told them to ignore things they
do not know about. A client that only knows about the previous version of
our API will still work fine, it just won’t do anything with the
published element.

Allowing more data when creating an article

Some articles might be related to other resources on the web, and we’d want
to let authors call them out explicitly in their articles. We therefore change
the API to accept articles with some additional data on POST
/api/articles:
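The request body from the post is not preserved here; a hypothetical version (the related element is an assumption, not the post’s actual markup) could be:

```xml
<article>
  <title>A second article</title>
  <body>...</body>
  <related>
    <link href="http://example.com/some-resource"/>
  </related>
</article>
```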

As long as our new API allows posting of articles without any related
links, old clients will continue to work.

Blogging API’s everywhere

If our blogging software is so successful that clients must be prepared to
deal with both servers that support adding related resources, and ones that
do not, we need a way to indicate that to those clients that know about
related resources. While there are many ways to do that, one that we’ve
found works well for Deltacloud is annotating the
collections in the toplevel API entrypoint. When a client does a GET /api
from a server that supports related resources, we’d send them the following
XML back:
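The XML is not preserved in this copy; the Deltacloud-style annotation the text describes would look roughly like this, with an assumed feature name:

```xml
<api>
  <link rel="articles" href="/api/articles">
    <feature name="related-resources"/>
  </link>
  <link rel="create" href="/api/articles"/>
</api>
```

A client that knows about related resources checks for the feature element before sending them; every other client ignores it.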

Updating articles

Authors want to revise their articles from time to time; we’d make that
possible by allowing them to PUT the updated version of an article to its
URL. This won’t introduce any problems for old clients, but new clients
will need to know whether the particular instance of the API they are
talking to supports updating articles. We’d solve that by adding actions
to the article itself, so that a GET of an article or the articles
collection will return
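The payload is again missing from this copy; sketched with assumed markup, an updatable article would carry something like:

```xml
<article href="/api/articles/1">
  <title>A first article</title>
  <published>2012-07-10T09:00:00Z</published>
  <body>...</body>
  <actions>
    <link rel="update" href="/api/articles/1"/>
  </actions>
</article>
```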

Not only does the update link tell clients that they are talking to a
version of the blogging API that supports updates, it also lets us hide
complicated business logic that decides whether an article can be updated
or not by simply showing or suppressing the update link.

Merging blogs

Because of its spectacular content, our blog has been so successful that
we want to turn it from a personal blog into a group blog, supporting
multiple authors. That of course calls for adding the name of each author
(or their avatar or whatnot) to each post — in other words, we want
to make passing in an author mandatory when creating or updating an
article. Rather than break old clients by silently slipping in the author
requirement, we add a new action to the articles collection:
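The example is not preserved here; with a hypothetical action name (create-with-author is my invention, not the post’s), the collection might advertise:

```xml
<articles href="/api/articles">
  <actions>
    <link rel="create" href="/api/articles"/>
    <link rel="create-with-author" href="/api/articles"/>
  </actions>
  <!-- article entries elided -->
</articles>
```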

Old clients will ignore that new action; the remaining question is whether
we can still allow old clients to post new articles. If we can, for
example by defining a default author out-of-band with this API, we’d still
show the old create action in the articles collection. If not, we’d take
the ability to post away from old clients by not displaying the create
action anymore — but we haven’t broken them, since they can still continue
to retrieve posts; we’ve merely degraded them to read-only clients.

While this seems like an extreme change, consider that we’ve changed our
application so much that existing clients can simply not provide the data
we deem necessary for a successful post. It’s much more realistic that we’d
find a way to let old clients still post articles using the old create
link.

Some consequences for XML

There are two representations that are popular with REST API’s: JSON and
XML. The latter poses an additional challenge for the evolution of REST
API’s because the use of XML in REST API’s differs subtly from that in many
other places. Since clients can never be sure that they know about
everything that might be in a server’s response, it is not possible to
write down a schema (or
RelaxNG grammar) that the client could use to
validate server responses, since responses from an updated server would
violate that schema, as the simple example of adding a published date to
articles above shows.

It’s of course possible to write down RelaxNG grammars for a specific
version of the API, but they are tied to that specific version, and must
therefore be ignored by clients who want to happily evolve with the server.

Questions?

I’ve tried to cover all the different scenarios that one encounters when
evolving a RESTful API — I’ve left out HTTP specific issues like
status codes (must never change) and headers (adding new optional headers
is ok), much as the OpenStack folks have decided for their
API Change Guidelines.

I’d be very curious to hear about changes that can not be addressed by one
of the mechanisms described above.

The upcoming release of Deltacloud 1.0 is a huge milestone for the project:
even though no sausages were hurt in its making, it is still chock-full of
the broadest blend of the finest IaaS API ingredients. The changes and
improvements are too numerous to list in detail, but it is worth
highlighting some of them. TL;DR: the
release candidate
is available now.

EC2 frontend

With this release, Deltacloud moves another step towards being a universal
cloud IaaS API proxy: when we started adding support for
DMTF CIMI as an alternative to the ‘classic’
Deltacloud API, it became apparent that adding additional frontends could
be done with very little effort. The new
EC2 frontend proves that
this is even possible for API’s that are not RESTful. With that, Deltacloud
allows clients that only know the EC2 API to talk to various backends,
including OpenStack, vSphere, and
oVirt.

The EC2 frontend supports the most commonly needed operations, in
particular those necessary for finding an image, launching an instance off
of it and managing that instance’s lifecycle. In addition, managing SSH key
pairs is also supported. We hope to grow the coverage of the API in future
releases to the point where the EC2 frontend is good enough to support the
majority of uses.

The debate around the ‘right’ cloud IaaS API is heated and continues,
especially around standards, and we still see the right answer to this
debate in a properly standardized, broadly supported, and openly governed
API such as DMTF’s CIMI — yet it is undeniable that EC2 is the front
runner in this space, and that large investments into EC2’s API exist; it
is Deltacloud’s mission to alleviate the resulting lock-in, and the addition
of the EC2 frontend allows users to experiment with different backend
technologies while migrating off the EC2 API at their own pace.

One issue that the EC2 frontend brings to the forefront is just how
unsuitable that API is for fronting different backend implementations: IaaS
API’s that are designed for this purpose provide extensive capabilities for
client discovery of various features. EC2 on the other hand provides no way
for providers to advertise their deviation from EC2’s feature set, and no
possibilities for clients to discover them.

CIMI frontend

We continue our quest to support the fledgling CIMI standard as broadly and
as faithfully as possible. With this release, we introduce support for the
CIMI networking API; for now only for the Mock driver, but we are looking
to expand backend support for networking as clouds add the needed features
for them.

Besides the core CIMI API, which is purely a RESTful XML and JSON API, work
also continues on the simple human-consumable HTML interface for it; we’ve
learned from designing the Deltacloud API, and from helping others use that
API, that a web application that stays close to the API but is easy for
humans to use is an invaluable tool. With this release, that application can
now talk to OpenStack, RHEV-M/oVirt, and EC2 via Deltacloud’s CIMI proxy.

Operational and code enhancements

With three frontends, it’s become even more urgent that all of them
can be run from the same server instance, to reduce the number of daemons
that need to be babysat. Thanks to an extensive revamp of the guts of
Deltacloud to turn it into a
modular Sinatra app,
it is now possible to expose all three frontends (or only one or two) from
the same server.

We now also base our RESTful routes and controllers on
sinatra-rabbit — only
fitting since sinatra-rabbit started life as the DSL we used inside
Deltacloud for our RESTful routing and our controllers.

A lot of work has gone into rationalizing the HTTP status codes that
Deltacloud returns, especially when errors occur; in the process, we
learned quite a bit about just how fickle and moody vSphere can be.

Other drivers have seen major updates, not least the OpenStack
driver, which now works against the OpenStack v2.0 API; in particular, it
works against the HP cloud — with the EC2
frontend, Deltacloud provides a capable EC2 proxy for OpenStack. We’ve also
added a driver for the
Fujitsu Global Cloud Platform,
which was mostly written by Dies Köper of Fujitsu.

The
release candidate
for version 1.0.0 is out now, packages for rubygems.org, Fedora and other
distributions will appear as soon as the release has passed the vote on the
mailing list.

When we converted Deltacloud from
Rails to Sinatra,
we needed a way to conveniently write the controller logic for RESTful
routes with Sinatra. On a lark, I cooked up a DSL called ‘Rabbit’ that lets
you write things like
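The code sample is missing from this copy of the post. Rather than guess at the real Rabbit syntax, here is a toy version of such a collection/operation DSL, with made-up method names, to give the flavor:

```ruby
# Toy sketch of a Rabbit-style routing DSL (illustrative, not the actual
# sinatra-rabbit API): 'collection' and 'operation' record declarative
# route definitions that a framework could mount as Sinatra routes and
# also walk to auto-generate API documentation.
ROUTES = []

def collection(name)
  @collection = name
  yield
  @collection = nil
end

def operation(op, &handler)
  path = case op
         when :index   then "GET /#{@collection}"
         when :show    then "GET /#{@collection}/:id"
         when :create  then "POST /#{@collection}"
         when :destroy then "DELETE /#{@collection}/:id"
         else "POST /#{@collection}/:id/#{op}"   # custom actions, e.g. :stop
         end
  ROUTES << { :path => path, :handler => handler }
end

collection :articles do
  operation(:index)  { 'list all articles' }
  operation(:show)   { 'show a single article' }
  operation(:create) { 'post a new article' }
end
```

Because the definitions are data before they are routes, the same declarations can drive both dispatch and documentation generation.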

That makes supporting the common REST operations convenient, and allows us
to auto-generate documentation for the REST API. It has been very useful in
writing the
two frontends
for Deltacloud.

The DSL has lots of features, for example, validation of input parameters,
conditionally allowing additional parameters, describing subcollections,
autogenerating HEAD and OPTIONS routes and controllers, and many more.

Michal Fojtik has pulled that code out of Deltacloud and
extracted it into its own github project as
sinatra-rabbit. In the process,
there were quite a few dragons to slay: for example, in Deltacloud we
change what parameters some operations can accept based on the specific
backend driver. For example, in some clouds, it is possible to inject
user-defined data into instances upon launch. In Deltacloud, the logic of
what routes to turn on or off is based on introspecting the current driver,
which means that Deltacloud’s Rabbit knows about drivers. That, of course,
has to be changed for the standalone sinatra-rabbit. Michal just added
route conditions that look like
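This sample is also missing from this copy; the idea, sketched here with assumed option and method names rather than sinatra-rabbit’s actual syntax, is that an operation is only registered when a condition on the current driver holds:

```ruby
# Sketch of conditional route registration (names are assumptions, not
# the actual sinatra-rabbit syntax): an operation is only defined when
# its :if condition evaluates to true, so one set of route declarations
# can adapt to backend drivers with different capabilities.
CurrentDriver = Struct.new(:features)

ROUTES = []

def operation(name, opts = {})
  cond = opts[:if]
  return if cond && !cond.call($driver)   # skip unsupported operations
  ROUTES << name
end

# A driver that supports injecting user-defined data, but not rebooting
$driver = CurrentDriver.new([:user_data])

operation :index
operation :create, :if => lambda { |d| d.features.include?(:user_data) }
operation :reboot, :if => lambda { |d| d.features.include?(:reboot) }
```

With this driver, only the index and create routes end up registered; the reboot route is silently dropped.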

Hopefully, sinatra-rabbit will grow to the point where we can remove our
bundled implementation from Deltacloud, and use the standalone version;
there are still a couple of features missing, but with enough people
sending patches, it can’t be very long now ;)

Installing Deltacloud is work. Not a lot of work, in fact it is
very easy, but it
still involves installing a package/gem and starting a server. For simple
development and test uses, even that is not necessary any more.

Both use the same self-signed SSL certificate. Its SHA-1 fingerprint is
D3:3D:13:73:37:88:59:F1:FE:08:51:70:A0:BA:60:99:F1:E9:DD:45.

If you’ve been scratching your head, wondering what all this Deltacloud
business is about, just head over to one of these two servers and explore
the API. There’s a friendly HTML interface for just that, or, of course,
the obligatory XML and JSON variants. The public servers run the EC2
driver as their default; when prompted for a username and password, just
enter your Amazon AWS access key ID and secret.