Puppet Podcast: Meet Lumogon

Editor's note: The Lumogon open source project is now unmaintained. The technology is part of Puppet Discovery, which lets you discover what's running in your hybrid infrastructure — from servers to VMs, cloud instances and containers.

On May 11th, we announced Lumogon™, a new technology for inspecting, reporting on, and analyzing your container applications. In our podcast about Lumogon, we hear from Kenaz Kwa, Lumogon’s product manager, and Tyler Pace, Lumogon’s UX architect, on the challenges of operating containers at scale while providing visibility into what makes up your containers.

Containers are quickly shifting how applications are delivered. Their black-box nature is great when it comes to running them, but not everyone’s life is being made easier by containers. If, for example, you need to know whether software in a container needs a security update, the black-box nature of containers makes your life a lot harder.
In this podcast, we talk about the problems that operators encounter across various tasks in the product delivery process, and how these problems change in a world of container delivery. We also discuss the operational responsibilities application developers are increasingly taking on, and the challenges they face when using containers.

We hope you enjoy learning more about Lumogon straight from the mouths of the project leads. Get started with Lumogon at https://lumogon.com/. You can also chat with the Lumogon team and other Lumogon users in the #lumogon Puppet Slack channel.

Find the rest of the Puppet podcasts here. And if you would like to read the transcript of this one, it's below the Learn more section.

Carl Caum is a technical product marketing manager at Puppet.

Learn more

Transcript

Carl Caum: Hello. Welcome to the Puppet Podcast. My name is Carl Caum. Today we're going to be talking about Lumogon, a project for inspecting your container applications and understanding what makes up those container applications in production and at scale.

Today, talking with us, we're going to have Kenaz Kwa, the product manager of Lumogon. And we're also going to have Tyler Pace, a UX engineer. And I'll let them introduce themselves. Kenaz?

Kenaz Kwa: Hi, my name is Kenaz.

Tyler Pace: Hello. This is Tyler.

Carl Caum: All right. So let's just jump right in. Kenaz, what is Lumogon?

Kenaz Kwa: So, Lumogon is a tool that we're releasing for inspecting and reporting on, and really analyzing the contents of a container. This is everything from the actual packages and the actual bits that make up that container, as well as some of the metadata that you might find interesting.

We're observing that many people, as they adopt containers, are taking the existing applications that live in their enterprise infrastructures and packaging them as containers, and sort of throwing them into production. And that gets a lot of benefits, from taking that packaging format and re-using it for some of their existing applications.

But what they find is now they have a giant box of applications that are running in these containers, and they don't really know what's going on inside of them. So what Lumogon is, the tool, is supposed to be used for is to understand: "What are the contents of that container? And, how do I get more information about things actually running in production?"

The tool has two components. One is an open-source CLI tool that you can download from Docker Hub, and you just docker run a container in your environment. And then there's also a web application reporting tool that gives you a nice report of everything that's going on inside that container.

Carl Caum: Awesome. So essentially, it's about solving the problem of containers just being a black box and understanding what is exactly in those black boxes at any time.

Tyler Pace: Mm-hmm. Yep.

Carl Caum: Awesome. So why is that important?

Tyler Pace: Well, from the scheduler perspective, a lot of best practice for running large amounts of containers on container hosts is using some type of scheduler that basically treats all of your compute resources as a single contiguous blob of compute resources.

And so containers today are treated by those schedulers, like Kubernetes and Docker Swarm and DC/OS, as a black box. Because all the scheduler really cares about is the amount of CPU that's being used, the amount of memory that's being used, the amount of disk that's being used.

But what it doesn't really take into account today is everything that's actually going on inside that container. And we're seeing that that's becoming more and more important. Because there could be lots of things that you don't necessarily know about that are actually running in production; that could be security vulnerabilities; that could be compliance concerns; the same sorts of problems that we typically see in configuration management: around visibility, around audit control, status accounting.

Those are all problems that still remain present when you move all of your workloads to containers.

Carl Caum: Yeah. I know that when I work with containers, especially working with something like Kubernetes, if I ever have a question of, "What was there a week ago?" that's exceptionally difficult to answer. Is that something that Lumogon is going to be able to help with?

Tyler Pace: That's sort of the direction that we're going. At the beginning, it's obviously a very simplistic tool. It's really just to show you what -- on a single point in time and for a single report or for a collection of containers -- what's actually going on inside of them.

But where we see Lumogon going is to be able to answer those kinds of questions. You know: "I noticed that there was a vulnerability in this application at 3 p.m. last week. Can you tell me what was going on five minutes before that? What was everything in my production infrastructure up until that point? And what was in my infrastructure 10 minutes after that?" So that you can really drill down, and one: understand, "Were you safe or not?" and two: be able to debug and diagnose what the issues are.

Carl Caum: Awesome. So as we're talking through this, a number of questions come into my head that, I think, that a lot of different personas would ask. There's not just the operator of the containers being able to say like, "What's in this thing?" But also, InfoSec asking about compliance, auditors asking about what the past state was. Tyler, can you talk a little bit about everyone and the different personas that would benefit from this?

Tyler Pace: Yeah. Absolutely. We envision a lot of different people involved in container workflows benefiting from a tool like Lumogon. I mean, first in that list are actual application developers who are starting to use containers as part of their application development and deployment workflows.

And we know that increasingly, application developers are taking on more responsibilities, not just the development of an application but its ongoing maintenance, updates, rollouts, taking some ownership in the uptime of the application, and sort of just being more involved in the end-to-end work that's required to have a sort of fully powered-up application in production.

And so having a tool, like Lumogon, to be able to inspect not just the images you're building — and being able to get more insight into the packages that your application is depending on, the packages that are in those images — to be able to also look at containers that are running, so you [would] actually look at the live-running application; be able to get information on that similar metadata that you might be interested in.

So you can compare: "What does my development state look like versus the production state? This job that's in Jenkins right now, what image is it playing with? And what's the state of that image? And how does that differ from the other environments that I might be interested in?"

So it allows you to start asking questions about your application at different points in its timeline. You can see how that would be particularly helpful in CI/CD environments where you can start to gather reports on: What did your application look like? Did it look like what you thought it was going to look like? Was the right third-party library in there? Is that missing library a reason why you failed your Jenkins build? Now, you can start to get more insight into those practices.
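
Editor's sketch: the development-versus-production comparison Tyler describes could look something like the following. This is only a minimal illustration in Python. It assumes each scan can be reduced to a mapping of package name to version; the inventories and the `diff_packages` helper are hypothetical and do not reflect Lumogon's actual report schema.

```python
def diff_packages(dev, prod):
    """Compare two {package_name: version} mappings and report the drift."""
    dev_names, prod_names = set(dev), set(prod)
    return {
        # Packages present only in the production container.
        "only_in_prod": {name: prod[name] for name in prod_names - dev_names},
        # Packages present only in the development image.
        "only_in_dev": {name: dev[name] for name in dev_names - prod_names},
        # Packages present in both, but at different versions.
        "version_changed": {
            name: (dev[name], prod[name])
            for name in dev_names & prod_names
            if dev[name] != prod[name]
        },
    }


# Hypothetical inventories for illustration only.
dev_image = {"openssl": "1.0.2k", "curl": "7.52.1", "vim": "8.0"}
prod_container = {"openssl": "1.0.2g", "curl": "7.52.1", "bash": "4.4"}

drift = diff_packages(dev_image, prod_container)
print(drift)
```

An empty result in all three buckets would suggest the two environments carry the same packages at the same versions.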

We also see InfoSec people being interested in a tool like this where you can start to gather the data that you need for audits. We know that containers turn over quite quickly. There's a recent report from Datadog where containers are turning over about two days on average.

And so, you know, that's a very short window to be able to put something into production, scan it, and make sure it's the thing that you thought you were going to put into production. And then a month from now, when you've had 15 different versions of that container spin up and spin down, how do you make sure that some OpenSSL vulnerability that existed, very briefly, for three days, three weeks ago -- that you weren't somehow impacted by that?

And tools like Lumogon help you be able to answer questions like that by giving you the data you need, to know what was running and when.
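
Editor's sketch: answering "what was running and when" comes down to keeping scan results with timestamps and querying them later. The Python below is a rough illustration under assumed data: the `history` records, their field names, and the version strings are all hypothetical, not Lumogon output.

```python
from datetime import datetime

# Hypothetical scan history: one record per container lifetime.
# Field names are assumptions, not Lumogon's actual report schema.
history = [
    {"container": "web-v12", "started": datetime(2017, 4, 18, 9, 0),
     "stopped": datetime(2017, 4, 20, 9, 0), "packages": {"openssl": "1.0.2j"}},
    {"container": "web-v13", "started": datetime(2017, 4, 20, 9, 0),
     "stopped": datetime(2017, 4, 23, 9, 0), "packages": {"openssl": "1.0.2k"}},
    {"container": "web-v14", "started": datetime(2017, 4, 23, 9, 0),
     "stopped": datetime(2017, 4, 25, 9, 0), "packages": {"openssl": "1.0.2j"}},
]


def ran_version_during(history, package, version, window_start, window_end):
    """Return containers that ran `version` of `package` at any time in the window."""
    hits = []
    for record in history:
        # Two time intervals overlap when each starts before the other ends.
        overlaps = record["started"] < window_end and record["stopped"] > window_start
        if overlaps and record["packages"].get(package) == version:
            hits.append(record["container"])
    return hits


exposed = ran_version_during(
    history, "openssl", "1.0.2j",
    window_start=datetime(2017, 4, 19), window_end=datetime(2017, 4, 22),
)
print(exposed)
```

Only containers whose lifetime overlaps the window and that carried the given version are returned, so short-lived containers that have long since been replaced still show up.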

And then, of course, also the more traditional operations teams that are trying to do things like understand: "Why are our applications going down? Are we dealing with some buggy third-party libraries? -- you know, we all write bugs so buggy first-party libraries -- What might be the root cause of our applications going down?"

And, of course, a tool like Lumogon that gives you sort of extracted data in a JSON format makes it really easy to feed into your monitoring systems. It can go into that ELK stack or Splunk tool that you're using to conglomerate all of your data to give you that bird's-eye view of your infrastructure.

And this is another tool that can help you take containers and make them a first-class member of that reporting toolchain that you've built for your virtual machine infrastructure.
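
Editor's sketch: as a rough idea of what feeding JSON into a log pipeline can involve, the Python below flattens a nested report into one event per container and package, written as newline-delimited JSON, a shape that log indexers commonly accept. The report structure and the `to_events` helper are assumptions for illustration; the real Lumogon schema may differ.

```python
import json

# Hypothetical report; the real Lumogon schema may differ.
report_json = """
{
  "generated": "2017-05-11T15:00:00Z",
  "containers": [
    {"name": "web-1", "packages": {"openssl": "1.0.2k", "curl": "7.52.1"}},
    {"name": "db-1", "packages": {"openssl": "1.0.2g"}}
  ]
}
"""


def to_events(report):
    """Yield one flat event per (container, package) pair."""
    for container in report["containers"]:
        for package, version in sorted(container["packages"].items()):
            yield {
                "timestamp": report["generated"],
                "container": container["name"],
                "package": package,
                "version": version,
            }


events = list(to_events(json.loads(report_json)))
# Newline-delimited JSON: one self-contained event per line.
ndjson = "\n".join(json.dumps(event, sort_keys=True) for event in events)
print(ndjson)
```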

Carl Caum: Cool. That makes a lot of sense. There's a lot of things that jump out to me there. You know, one: you were talking about containers living for about two days.

I recently attended a talk by Diogo Monica from Docker, and he was talking about some stuff that Docker's working on where they expect a container to live for about 60 seconds; it's just a constant state of refresh, even if there's no update that needs to be refreshed. It's more around security of: "If a container gets compromised at any time, well, you want to kill that as soon as possible. So that if you do get compromised, the threat level is removed very, very quickly."

And he was talking about this, and of course, my head is exploding because it was just amazing. But then that question of: "Well, wait a minute. If we're just killing things every 60 seconds, how do we ever have a sense of what's actually going on? How do we actually know what things are running when and where?" So I'm pretty excited about Lumogon being able to start answering some of those questions.

Another area that really jumps out at me is CI/CD, particularly around the peer-review process. So right now when somebody -- actually even this morning, I was looking at a container that somebody was trying to push through a pipeline -- that somebody may have been me.

But when I went to look into it, it was like: "I'm having a problem with this thing. I need to know what operating system it is. All right. So I open up the Dockerfile, and it has a FROM clause. Okay. Well, let's go find that image. Well, I don't actually know where that GitHub project is so let me go Google for that. So I find that and find its Dockerfile. Oh, wait. It has another FROM image."

And so it's just this rabbit hole of stuff just to answer a very simple question of: What the heck -- what is the OS in this thing? So having this data available as a peer-review process, so we could look at those and say: "Hey, you might want to update this library. Hey, you might want to add this Docker label." You know, having one single source of truth through the pipeline makes a lot of sense.

Kenaz Kwa: Yeah. And just to add to that -- because the data is being represented in a structured format, then you can inject that as an assert statement into your CI process. So that as operations people are participating in deploying containers, they're able to prescribe the policies that they need in order to be able to make those kinds of guarantees to their stakeholders that: "Yeah. You know, that we are running in a safe, secure sort of way."
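
Editor's sketch: a policy check over structured container data might be expressed along these lines in Python. Everything here is hypothetical: the report layout, the `check_policy` helper, and the example policy (a banned package and a required label) are assumptions made for illustration, not part of Lumogon.

```python
# Hypothetical structured report; field names are assumptions.
report = {
    "containers": [
        {"name": "web-1",
         "packages": {"openssl": "1.0.2k", "telnet": "0.17"},
         "labels": {"maintainer": "web-team"}},
        {"name": "db-1",
         "packages": {"openssl": "1.0.2k"},
         "labels": {}},
    ]
}


def check_policy(report, banned_packages, required_labels):
    """Return a list of human-readable policy violations (empty means pass)."""
    violations = []
    for container in report["containers"]:
        name = container["name"]
        installed = set(container["packages"])
        labels = set(container["labels"])
        for package in sorted(banned_packages & installed):
            violations.append(f"{name}: banned package {package}")
        for label in sorted(required_labels - labels):
            violations.append(f"{name}: missing label {label}")
    return violations


violations = check_policy(
    report, banned_packages={"telnet"}, required_labels={"maintainer"}
)
for line in violations:
    print(line)
```

A CI job could fail the build whenever the returned list is non-empty, which is one way an operations team might encode the guarantees it makes to stakeholders.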

Carl Caum: Yeah. That makes a lot of sense. So to that point, Tyler, would you say this is a security tool?

Tyler Pace: Yes and no. We're not designing explicitly for security in mind, but good security starts with an awareness of what's running in your infrastructure.

You have to be able to understand that landscape of: "What does your surface area look like for your applications? What are all the different dependencies that you're relying on, both the ones that you are developing yourself, and the ones that you're inheriting from different vendors and other third-party providers?"

And that, one of the -- sort of a long-term goal that we see with a lot of people stepping into containers is they understand that in the beginning, they -- from the perspective of containers -- kind of have a big surface area.

They're importing a full-fledged operating system like Ubuntu or CentOS into their container, and the hundreds of default packages that you get in an operating system like that -- some of which are probably required for your application, many of which aren't.

And you don't, without an awareness of what your starting point is -- it's hard for you to be able to say, "Okay. This is my big starting point surface area. This was a good place to be for my first application, my first brownfield application, that I moved to a container. But now I need to start reducing that surface area."

And the only way to get there is to understand what you're dealing with, and just so you can start making educated guesses about: "What can I take away? What can I remove from this image?" And then test and remove, and test and remove, so that you can [winnow] that down to a small surface area.

And that reduced surface area, that's a win for everyone. It's easier to understand an application with the minimum number of dependencies. They're easier to run. They're easier to develop, and they're easier to harden and secure.

Carl Caum: Awesome. So how do people get it?

Tyler Pace: Well, you can get Lumogon from lumogon.com. It's also available on GitHub and Docker Hub.

Carl Caum: Great. So I'm assuming people, as they try this thing out, they're going to want to get in touch with the team and learn what they can do -- some of the hidden stuff that might be in there. How do people get feedback?

Tyler Pace: The Lumogon team, we're hanging out all the time in our Slack channel. And so we encourage anyone to come out and say, "Hello," and tell us what they're doing, what they like, and what they'd like to see in the future.

And we'd love to get in touch with you. And you can also reach out at [email protected]

Kenaz Kwa: So it turns out, it's shockingly difficult to name things, especially when everyone is taking the four-letter, five-letter words and domain names.

So, it doesn't mean anything. It's completely made up. But the idea of lum- having some implications of light and some notions of shining a light into your containers, and -gon sort of the suffix for a polygon -- the idea of like: "When you shine that light and you see what sort of shape it takes, that you have a better idea of what actually you're looking at." And so that seems workable. So we're going to go with that.

Carl Caum: Awesome. So you talked a little bit earlier about where we're going with it and being able to understand past states of your infrastructure. Is there any more to it? Like, where is this thing going to go in the future?

Kenaz Kwa: Yeah. For sure. Right now we started with a fairly basic set of metadata that we're collecting from the containers. Over time, we expect that collection of information to grow, both from our conversations with customers and with users as well as potentially community contributions.

The entire CLI tool that is collecting the information is all open-source. And we'll help to define what some of those -- that contribution model -- what [it] might look like.

And the other thing is we're looking at expanding the types of targets that you can run this tool on. Right now, it's basically running it on a single container or a container host. But we know specifically, with things like schedulers -- like Kubernetes -- people are no longer interacting directly with the container host, and instead, all they see are these higher level interfaces like a Kubernetes replication set or a Docker Swarm service.

So we hope to be able to then instead, point the same tool, just at a higher level of abstraction, that produces the collection of information underneath. So you can say, "Okay. In this service, what are actually all the packages that are in there? What are all the labels that have been applied to these containers?"

Carl Caum: Awesome. So that does beg one question that we haven't asked yet. Usually, when you try to understand what's in a container, you look at its image. And that requires a change to your build process. So you have to build something into the container image, and then when you spin it up, it's running in that environment and can then ship data back. Is that how Lumogon is working?

Kenaz Kwa: No, so it actually -- the whole goal is that you don't have to change anything about your infrastructure. That, we know that you're going to do your infrastructure the way you're going to do it. And we want to get out of the way, but provide you as much value as possible by being able to see inside your containers without having to change anything about the build process.

So to that end, all you really have to do is run another container inside, on your container host, and point that container at the running containers, which will harvest and collect information and report it out to you.

Carl Caum: Awesome. Sounds easy.

Kenaz Kwa: It is.

Carl Caum: All right. Well, that's all the questions I have. Anything else we need to -- think we should cover?