This blog comments on a variety of technology news, trends, and products and how they connect. I'm in Red Hat's cloud product strategy group in my day job although I cover a broader set of topics here. This is a personal blog; the opinions are mine alone.

Tuesday, December 16, 2014

[Transcript] Tyler Cowen on Stories - Less Wrong - "If you hear a story and you think, "Wow, that would make a great movie!" That's when the "uh-oh" reaction should pop in a bit more, and you should start thinking more in terms of how the whole thing is maybe a bit of a mess."

The New Republic’s demise: The magazine’s heterodox liberalism is what made it unique. - "The impending transformation of the New Republic from a liberal magazine into a “vertically integrated digital media company” is regrettable for many reasons, not all of them sentimental. Conservatives need a liberal magazine that’s unpredictable enough to make them want to read it. Liberals and leftists need a magazine that will prod them to question their beliefs, and revise or strengthen them. All of us need robust intellectual debate of a high caliber that treats politics and ideas with the seriousness that they deserve."

Mark Lamourine and I continue our ongoing containerization series. In this entry, we break down the containerization stack from the Linux kernel to container packaging to orchestration across multiple hosts and talk about where containers are going.

This will probably wrap up 2014 for the Cloudy Chat podcast. Among other goals for 2015, I'd like to continue this series with Mark on a biweekly basis or so. I also expect to start folding in DevOps as a way of discussing the ways to take advantage of containerized infrastructures.

Gordon
Haff: I'm sitting here with Mark Lamourine, who's a co‑worker of
mine, again today. One of our plans for this coming year is I'm going to be
inviting Mark onto this show more frequently because Mark's doing a lot of work
around the integration of containers, the integration around microservices, or
open hybrid cloud platforms. A lot of interest in these topics, and some of the
other technologies and trends that intersect with them.

We're
going to spend a fair bit of time next year diving down into some of the details.
One of the things as we dive down in all these details is we're not going to
get into the ABC basics every week, but I'm going to make sure to put some
links in the show notes.

If
you like what you hear, but want to learn a little bit more about some of the
foundational aspects or some of the basics, just go to my blog and look at the
show notes. That should have lots of good pointers for you.

Finally,
last piece of housekeeping for today, we're going to be talking about the
future of containers. There's been some, shall we say, interesting news around
containers this week. But we're going to stay focused on this podcast from a
customer, a user, a consumer of containers perspective, looking at where
they're going to be going, where they might want to be paying attention over
the next, let's say, 6 to 12 months type of time frame.

We
don't want to get into a lot of inside baseball, inside-the-Beltway sort of
politics about what's going on with different companies and personalities, and
really we'll stay focused on things from a technology perspective. That's my
intro. Welcome, Mark.

Mark
Lamourine: Thank you.

Gordon:
Mark, I think most of the listeners here appreciate essentially what
containers are, at a high level. Operating system virtualization, the ability
to run multiple workloads, multiple applications, within a single instance of
an operating system, within a single kernel. But that's, if you would, the
first layer of the onion.

What
I'd like in this show, as we're talking about where we are today, and where
we're going in the future, is to talk a bit more about the different pieces of
containers, the different aspects of containers.

The
first aspect I'd like to talk about is the foundational elements. What's in the
kernel of the operating system. This is kernel space stuff. So could you talk
about that?

Mark:
We've discussed before that the initial technology, the enabling
technology, which in this case is kernel namespaces, that there have been
things like this before in the past. Essentially what they do is allow someone
to give a process a different view of the operating system.

They
operate when a process asks, "Where am I in the file
system?" The namespaces can say, "Oh, you're at slash," or,
"You're at something," and the answer they're getting is a little bit
different from what you'd see outside the container. That's really the core
technology: the idea of an altered view of the operating system from the point
of view of the contained process.
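
To make the "altered view" idea concrete, here is a minimal sketch in Python. It is a toy model only, not a kernel API: the `MountView` class, its method names, and the `/var/lib/containers/web1/rootfs` path are all hypothetical, chosen for illustration. It shows how the same location can have one name inside a container and another on the host.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class MountView:
    """Toy stand-in for a file system namespace (hypothetical model).

    The contained process sees `host_root` as "/".
    """

    host_root: str

    def to_host(self, container_path: str) -> str:
        """Translate a path seen inside the container to the host path."""
        relative = PurePosixPath(container_path).relative_to("/")
        return str(PurePosixPath(self.host_root) / relative)

    def to_container(self, host_path: str) -> str:
        """Answer "where am I?" from the contained process's point of view."""
        relative = PurePosixPath(host_path).relative_to(self.host_root)
        return str(PurePosixPath("/") / relative)


# Assumed host directory holding the container's files.
view = MountView(host_root="/var/lib/containers/web1/rootfs")

# From inside the container, the root of that directory is just "slash".
print(view.to_container("/var/lib/containers/web1/rootfs"))

# A path the process asks for maps to a different place on the host.
print(view.to_host("/etc/hosts"))
```

A real container runtime gets this effect from the kernel rather than from path arithmetic, so treat the sketch as a picture of the behavior, not of the mechanism.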

Gordon:
Now there are some different philosophies out there about exactly how you
go about doing this from a process perspective.

Mark:
Not so much the technology, but what do you do with it once you've got it?
How do you mount file systems? What views are useful? How do you build up a
special view for a process which is this thing inside a container? There are
different ways of doing that and people have different goals. That informs how
they want to build one.

Gordon:
I think although this part of containers, this aspect of containers is
often hidden, I think it's important to note it's a pretty important part of
the entire subsystem because everything else is resting on top of it.

We've
seen some news stories recently, for example, about how, if you don't have this
consistency among kernel levels, it's hard to have portability between
environments of the applications in a container.

Mark:
How you look at that view, how you compose that view is one element
that's interesting and can be different, but you want to make sure that they're
provided uniformly so everybody knows what they're getting. One important
aspect of that is that these views, they're different views. There's the view
that the PIDs can see, that the processes can.

What
other processes are available? That's one possible view. There's a view of the
file system: each process can see the file system in a different way, or
they can share one, which gives two processes the same view of the file system
but maybe a different view of the processes.

This
composition is something that people are still working out, how an application
would want to see things that have multiple processes with different
responsibilities and how do you build up the environment for each one?

Gordon:
That's the foundational container part, which is part of the operating
system, depends on a lot of operating system services. It depends on a lot of
things the operating system does for security, for resource isolation, that type
of stuff.

Now
let's talk about the image that is going to make use of that container. As we
were talking before this podcast, from your perspective, there are really two
particular aspects of images ‑‑ the spec of the image and the instantiation,
the actual deployed image.

Let's
first talk about the spec of the image and what are some of the things, the
characteristics that you see as being important there now and moving forward.

Mark:
Again, uniformity is probably the biggest one. The big container system
right now is Docker and Docker has a very concise way of describing what should
go into a container. The specification is very small and that's one of the
things that Docker has brought and made people realize that this is important.

Prior
to using something like Docker, describing a container was very difficult and
very time‑consuming and it required expert knowledge. With the realization that
you need some kind of concise specification and that you can make some
assumptions, containers have become easier to build, and that's really what's
instigated the rise of containers in the marketplace.

Gordon:
Let's talk about the other aspect of containers, the instantiation, the
payload, the actual instance, if you would. What are some of the trends you see
happening there?

Mark:
Again, Docker was kind of the inception. The assumption they made was
that you can take this specification, create a series of reusable layers to
build up the image. But they specified that the layers were tarballs.

Mostly
they established a standard, and once that standard is there, people can just
stop thinking about it and they can just go on and start working with it. That
uniformity of whatever the composed piece is is going to be really important going
forward.
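
The idea of building an image from reusable tarball layers can be sketched with Python's standard `tarfile` module. This is a simplified illustration, not Docker's actual image format: the helper names `make_layer` and `flatten` and the file contents are assumptions, and real images also handle metadata, deletions, and permissions, which are left out here.

```python
import io
import tarfile


def make_layer(files: dict[str, bytes]) -> bytes:
    """Pack a mapping of path -> content into an in-memory tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for path, content in files.items():
            info = tarfile.TarInfo(name=path)
            info.size = len(content)
            archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


def flatten(layers: list[bytes]) -> dict[str, bytes]:
    """Apply layers in order; a file in a later layer replaces an earlier one."""
    filesystem: dict[str, bytes] = {}
    for layer in layers:
        with tarfile.open(fileobj=io.BytesIO(layer), mode="r") as archive:
            for member in archive.getmembers():
                if member.isfile():
                    filesystem[member.name] = archive.extractfile(member).read()
    return filesystem


# A reusable base layer plus an application layer (hypothetical contents).
base = make_layer({"etc/os-release": b"NAME=base\n", "etc/motd": b"hello\n"})
app = make_layer({"srv/app.py": b"print('hi')\n", "etc/motd": b"welcome\n"})

image = flatten([base, app])
print(sorted(image))
print(image["etc/motd"])
```

Because `base` is just bytes, the same layer can sit under any number of application layers, which is the reuse Mark is pointing at.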

Gordon:
However, that's not necessarily tied into all the other aspects of a
container subsystem. That spec, that format can really exist independently of
other pieces of technology, and that's probably going to be kind of a theme
that we hit a few times in this podcast.

Mark:
At each place you want to have a uniformity, but like you said, that
doesn't preclude having a different way of specifying what goes in ‑‑ just that
once you've specified it it's got to have a form that other people can accept.
The same thing is true with the image format itself.

Once
that's there, how it gets instantiated on the far machine, as long as the
behavior is the same. That really gets the job done. That allows people to
focus on the job they need to do and not a lot of extra work putting everything
together.

Gordon:
This always was the conflict with standards at some level. Standards are
always great from the point of view of the customer and they really have
enormous value in terms of portability, in terms of just not having to think
about certain things.

On
the other hand, they need to embody the functionality that's needed to get the
job done. We don't use parallel printer cables any longer, thank God, because
there are standards, certainly, but they're also not very useful in today's
world.

Mark:
Yeah, I've said before that probably one of the biggest things that
Docker did was to make a certain set of assumptions, and to live with those
assumptions, those simplifying assumptions.

That
allowed them to get on with the work of building something that was functional.
I think that the assumptions are going to be challenged. There are going to be
places where their assumptions are too tight for some kinds of uses.

I
think the community is going to inform that and the community is going to say,
"This is something we need to expand on." Without a different
assumption or without the ability to control those assumptions, we can't really
move forward. There are a number of different responses in the market to that.

Gordon:
This is how successful open source projects work. You have a community.
You have members of that community with specific needs. If the project as it
exists doesn't meet those needs, they need to argue, they need to contribute,
they need to convince other people that their ideas, the things they need are
really important to the project as a whole.

Of
course, there need to be mechanisms in place in that project to have that wide
range of contributions.

Mark:
In any good open source project, you get that right from the beginning.
The assumption by the authors is, we've got a good idea here or I think I've
got a good idea here and I'm going to instantiate it. I'm going to create it
and make it the way I think it needs to be.

Then
I'm going to accept feedback, because people are going to want to do things
with it. Once they see something's neat, they're going to want to say,
"Yeah, that's exactly what I want. Only it would be better if I had this
too."

Gordon:
Let's talk about the repositories, the ecosystems. You talked about this
a little bit last time, but where are we now and what are the next steps? What
needs to be done here?

Mark:
Again, returning to Docker, another one of their simplifying assumptions
was the creation of this centralized repository of images. That allowed people
to get started really quickly. One of the things that people found when they
started looking at their enterprise, though, was that it was a public space.

What
we need to go forward is we need the ability to know where images come from.
Right now things are just thrown out into space, and when you pull something
down you don't know where it came from. I don't think there's anybody who
really thinks that that's the ideal in the end.

I
think to go forward with it, the community needs to build mechanisms where
someone who builds a new container image can sign it, can verify that it comes
from the person who claims that they built it, and that it has only the parts
that were specified and that it gets put out in a public place if it's intended
to be public, so that people can be assured that it meets all their
requirements and that it's not something malicious.
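
One way to picture the sign-and-verify step is the short Python sketch below. It is an assumption-heavy simplification: it uses an HMAC with a shared secret from the standard library as a stand-in, whereas a real registry would more likely use public-key signatures so that anyone can verify without holding the signing key. The function names and the key are hypothetical.

```python
import hashlib
import hmac


def sign_image(image_bytes: bytes, key: bytes) -> str:
    """Return a hex signature computed over the image's SHA-256 digest."""
    digest = hashlib.sha256(image_bytes).digest()
    return hmac.new(key, digest, hashlib.sha256).hexdigest()


def verify_image(image_bytes: bytes, key: bytes, signature: str) -> bool:
    """Check that the image still matches the signature it was published with."""
    expected = sign_image(image_bytes, key)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, signature)


publisher_key = b"example-secret"        # hypothetical key held by the builder
image = b"layer-1|layer-2|layer-3"       # stand-in for the image contents

signature = sign_image(image, publisher_key)

print(verify_image(image, publisher_key, signature))
print(verify_image(image + b"|extra", publisher_key, signature))
```

The second check fails because the contents no longer match what was signed, which is the "only the parts that were specified" guarantee in miniature.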

On
the flip side you get companies where they're going to say, "No, I don't
want to put this in a public space." There needs to be some private
repository mechanism where a company can develop their own images, place them
where they can reach them, and retrieve them and use them in ways that they
want without exposing it to the public.

Gordon:
Again, this is another example of, there's not just going to be just one
way of doing things, because there's a lot of legitimate different requirements
out there.

Mark:
There are different environments, although I think there's probably a
limited number that we'll find over time. I don't think it's completely open. I
think there are a limited number of environments and uses that will fall out
over time as people explore how they want to use it.

Gordon:
Finally, let's talk about, and again, you touched on some of this during
our last podcast, the orchestration and scheduling piece, which is another
piece that I think we sometimes tend to think of as just part of this container
subsystem.

In
fact we're pretty early in the container concept and we're really still
developing how these pieces fit with and complement the lower‑level container
function.

Mark:
The whole technology started off with, "Let's build something that
runs one." It's actually working out really nicely that as people start
using containers, they're kind of naturally backing into bigger and bigger
spaces.

They
start off going, "Oh, this is really cool. I can run a container on my box
that can either run a command I want or I can build a small application using a
database and a web server container and I can just push my content into it and
it goes."

And
people are going, "That's great. Now, how do I do 12?" Or companies
are looking at it and saying, "Here's an opportunity. If I can make it so
other people can do this, I can sell that service, but I have to enable it for
lots of people." I think we're backing into this growing environment that
orchestration is going to fill.

I
think there's still a lot of work to be done with the orchestration right now.
The various orchestration mechanisms, they're not really finished. There are pieces
that are still unclear ‑‑ how to manage storage between containers, and a big
one is, in a container farm, in an orchestrated container farm, how do you
provide network access from the outside?

A
lot of work has gone into making it so the containers can communicate with each
other, but they're not very useful for most cases until you can reach in from
the outside and get information out of them. That requires a software‑defined
network, which, if you follow the OpenStack model, they have these things.

That's
actually still one of the most difficult problems within OpenStack. I think if
you ask people about the three iterations of software‑defined networks within
OpenStack, you're going to find that they're still working out the problems
with that and OpenStack is four or five years older than any of the container
systems are.

Gordon:
One of the things that strikes me when I go to events like LinuxCon and
CloudOpen and other types of particularly open source‑oriented industry events
is that there's a lot of different work, in many cases addressing different use
cases, whether it's Twitter's use cases or Facebook's use cases or some
enterprise use case or Google.

There're
all these different projects that are being integrated together in different
ways, and the thing that strikes me is, first of all, wow, there's a lot of
smart people working on this stuff out there. But second, we're nowhere near ready to
say, "This is the golden path to container orchestration now and
forever."

Mark:
I would be really surprised if we found that there ever was one golden
way. I suspect in the same way that we've got different environments for
different uses, you'll find that there are small‑scale orchestration systems
that are great for a small shop, and then you're going to get large enterprise
systems.

I
can guarantee that whatever Google uses in the next five years is going to be
something that I probably wouldn't want to install in my house.

Gordon:
Or your phone.

Mark:
Or my phone, yeah. The different scales are going to have very different
patterns for use and very different characteristics. I think that there's room
in each of those environments to fill it.

Gordon:
Certainly
there're adherents and detractors for all of them and they're at various
different points in their maturity cycles, but the other thing that strikes me
is there's also a very clear affinity between certain groups of users, like
developers or sys admins towards one tool rather than another, because they're
really not just the same thing.

Mark:
They're not, and I thought it was interesting that you used the term
"provisioning tool" when talking about Puppet and Chef, because that
is the way in which people are starting to use it now, where five years ago
they would have called it a configuration management tool and the focus
wouldn't have been on the initial setup, although that's important. It would
have been on long‑term state management.

That's
one of the places where containers are going to change how people think about
this work, because I think the focus is going to be more on the initial setup
and short‑term life of software rather than the traditional ‑‑ actually someone
told me to use the word "conventional," although in this case
"traditional" might make sense.

The
traditional "Set it up and maintain it for a long period of time."
Your point about people having different tools for different perspectives is
true. I also want to point out that all of these things, even while they're
under development, they have use. You might claim that Puppet and Chef and
these various things, the configuration management or the provisioning or the
container market are evolving.

But
at the same time, they're in use. People are getting work out of them right
now. People are getting work out of containers now, as much as we're talking
about the long‑term aspects, people are using containers now for real work.

Gordon:
Gartner has this idea they call bimodal IT and they have this traditional
IT, conventional IT, whatever you want to call it, where you have these “pets”
type systems. The system runs for a long time. If the application gets sick you
try and nurse it back to health.

You
do remediation in the running system for security patches, and other types of
patches and the like. Then you have this fast IT and the idea there is I've got
these relatively short lived systems. If something's wrong with it, it takes
what, half a second to spin up a new container. Why on earth would I bother
nursing it back to health?

Mark:
I think this is another case where perspective is going to be really
important. If you're a PaaS or an IaaS shop, the individual pieces to you are
cattle. You don't really care. You've got hundreds, thousands, hundreds of
thousands of them out there, and one of them dropping off isn't all that big a
deal.

But
if you're in a PaaS situation, your cattle is somebody else's pet, and it's
going to be really important to either keep this cattle alive, the individual
ones, because, to someone, it's really their most important thing. Or, to help
them find ways so that they can treat it like a pet while you treat it like
cattle.

Where
they say, "I want my special application," and you spin up
automatically two or three redundant systems so that you see the pieces dying,
you kill them off, you restart them, but the user doesn't see that. They
shouldn't have to manage it.
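
The pattern Mark describes, where redundant copies are killed and restarted without the user noticing, can be sketched as a small reconcile loop in Python. This is a toy model under stated assumptions: the `Replica` and `Service` classes, the `app-N` naming, and the `healthy` flag are all hypothetical, and a real orchestrator would detect failures through health checks rather than a flag set by hand.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Replica:
    """One disposable copy of the application (the "cattle")."""

    name: str
    healthy: bool = True


@dataclass
class Service:
    """What the user asked for (the "pet"): N copies kept running."""

    desired: int
    replicas: list[Replica] = field(default_factory=list)
    _ids: count = field(default_factory=count, repr=False)

    def reconcile(self) -> list[str]:
        """Drop unhealthy replicas and start replacements; return new names."""
        self.replicas = [r for r in self.replicas if r.healthy]
        started = []
        while len(self.replicas) < self.desired:
            replica = Replica(name=f"app-{next(self._ids)}")
            self.replicas.append(replica)
            started.append(replica.name)
        return started


service = Service(desired=3)
service.reconcile()                    # initial start of three replicas

service.replicas[1].healthy = False    # simulate one replica dying
replaced = service.reconcile()         # the dead one is replaced, not repaired

print(replaced)
print([r.name for r in service.replicas])
```

The user only ever talks to `service`; which individual replicas are behind it at any moment is the operator's concern.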

Gordon:
To pick Netflix as a much overused example. Obviously, Netflix delivering
movies to you as a consumer, that's the pet type, not the cattle, at one level. You lose your
ability to watch Orange Is the New Black or whatever and you're going to be
unhappy.

From
Netflix's point of view, if you're unhappy, they're unhappy, but the individual
microservices are very explicitly designed so that they can individually fail.

Mark:
This is what I was saying that they need to be able to treat it both
ways. I don't know, but I suspect that when you're watching your movie, if the
server which is feeding it dies, what Netflix sees is, "Oh, something
died. Start a new one." What you see is maybe a few seconds' glitch in your
movie, but it comes back.

Mostly,
they're reliable. If that's true, then they've managed to do what I was saying.
They've managed to make it so that they preserved the important pet information
for you somehow. It might be on your client side, but the cattle part of it is
still, "Get rid of it and start again."

Gordon:
Well, Mark, this has been a great conversation. We've probably gone on
long enough today. But, as I said at the beginning, we're going to continue
this as a series going into the New Year because there is a lot happening here,
and nobody has all the answers today.

The history of the IT industry is a history of cyclical reimaginings. Not repeated cycles exactly. But repeated themes reflected in new and different technologies and environments. One such cycle that’s upon us today is the reinvention of centralized computing under the “cloud” rubric. It’s much different from the mainframe of the 1960s but it shares the motion of intelligence and state to the core and away from the network edge.

Indeed, this centralization cycle is arguably even more intense than that of the past. Author Nick Carr calls it “The Big Switch” by analogy to the centralization of electrical power generation. And, while the ecosystem of cloud service providers is both large and varied, there are but a handful of true global service providers. One data point. The Amazon Web Services re:Invent conference scored about 14,000 attendees this year. Sold out. Just year three for the conference. Just year eight for the service.

Some other day, I’ll be happy to argue why this handful of global service providers isn’t the future of all computing—certainly not within an interesting planning horizon. But there is significant centralization going on for important swaths of computing. And that makes it important to have detailed and precise discussions about governance and sovereignty as they relate to these large entities storing and processing our data.

Need some more convincing? Consider “security,” which leads just about every survey about cloudy concerns or roadblocks. Except security in this context often doesn’t mean classic security concerns like unpatched software or misconfigured firewalls. As 451 Research VP William Fellows noted in his HCTS keynote in October, it’s actually jurisdiction which is the number one question. Perhaps not surprising really given the headlines of the last year but it reinforces that when people voice concerns about security, they are often talking about matters quite different from the traditional Infosec headaches. Transparency, control over data, and data locality are the big “security” concerns in the context of public cloud providers.

When using public clouds, it’s important to understand where data is stored, how encryption is or can be used, what protections are available, the procedures for notifications in the event of a breach or a judicial request, and many other aspects of due diligence. And, given appropriate vetting, public clouds can be entirely appropriate for many classes of data. At the same time, it’s also important to recognize that there is an inherent sharing of responsibility when using public cloud providers. Reduced control and visibility are just part of the bargain in exchange for not having to run your own servers.

This tradeoff is one reason for the increasing recognition that much IT will be hybrid. Public clouds remain attractive for many uses whether for reasons of pricing or reasons of flexibility. But private clouds can give greater control over aspects of compute and data storage—as well as making it possible to tailor the environment to an organization’s specific requirements. (Of course, on-premise computing also makes it possible to create gratuitous customizations and complexity but that’s a topic for another day.) Furthermore, public clouds can be something of golden handcuffs—especially above the base infrastructure level. The more cloud provider-specific features you use, the harder it will be to move your workloads on-premise or even to another public cloud provider. You may deem such inflexibility a reasonable tradeoff but it is a tradeoff just as proprietary vertical hardware/software stacks once were in the systems space.

Open source was one alternative then and it's still an alternative to lock-in today. Control over technology. Control over formats. Control over use. Much of the impetus behind ongoing development of OpenStack, for example, is that organizations of many types have a strategy to become an in-house service provider. The central idea behind OpenStack is to let you build a software defined datacenter for your own use.

The storage of data is central to this concept. Open source storage projects like Gluster and Ceph work on-premise, in a public cloud, or across both using a hybrid model. Ultimately it's not about public cloud or private cloud being better or worse but which is best suited for a specific use and purpose. And that's leading to hybrid computing, which open source enables in important ways.

About Me

I'm technology evangelist for Red Hat, the leading provider of commercial open source software. I'm a frequent speaker at customer and industry events. I also write extensively on and develop strategy for Red Hat’s hybrid cloud portfolio.

Prior to Red Hat, as an IT industry analyst, I wrote hundreds of research notes, was frequently quoted in publications such as The New York Times on a wide range of IT topics, and advised clients on product and marketing strategies. Among other hobbies, I do a lot of photography and enjoy the outdoors.