10:30am

Mesos will soon reach the 1.0 milestone. In addition to new features, this means a more stable user-facing API and stricter support and release guarantees for operators and framework developers. The aim of this talk is to apprise operators, framework developers, and users of the new API and to discuss the support and compatibility guarantees offered by Mesos going forward.

This talk is a sequel to last year’s MesosCon Seattle talk on “Mesos HTTP API” and continues from where it left off.

This talk will cover the following specific topics:
- Discuss the newly introduced Operator API.
- Update on the recent improvements to the Framework API.
- Update on client libraries for the new Framework API.
- Release cadence for Mesos going forward.
- Support/compatibility guarantees for operators/framework developers, e.g., backporting of patches.
- Master->Agent renaming in the 1.0 API.
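As a rough illustration of what calling the Operator API over HTTP can look like, here is a minimal sketch. It assumes the v1 Operator API accepts a JSON-encoded call posted to the master's `/api/v1` endpoint and that `GET_HEALTH` is an available call type. The host name and the helper `build_operator_call` are hypothetical, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_operator_call(master_url: str, call_type: str) -> urllib.request.Request:
    """Build (but do not send) a v1 Operator API call as an HTTP POST."""
    # Assumption: a call is a JSON object whose "type" field names the call.
    body = json.dumps({"type": call_type}).encode("utf-8")
    return urllib.request.Request(
        url=master_url.rstrip("/") + "/api/v1",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


# Hypothetical master address; replace it with a real one before sending.
request = build_operator_call("http://master.example.com:5050", "GET_HEALTH")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it, which needs a reachable master and is left out of the sketch.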

Vinod Kone is a committer and PMC member of the Apache Mesos project. He is currently a Tech Lead and Engineering Manager @ Mesosphere. Previously, he was a Tech Lead and Manager of the Mesos team @Twitter. Vinod completed his PhD in Computer Science from UC Santa Barbara.

Anand Mazumdar is a software engineer at Mesosphere, where he works on the Apache Mesos project. Prior to that, he worked at a quantitative hedge fund and at Amazon Web Services on scalable data stream processing. He holds a Masters in Computer Science from The University of Texas...

11:30am

Apache Mesos and Apache Aurora have been crucial in growing a microservices culture at Twitter. The initial phase of Aurora adoption at Twitter saw many teams abandon their own deploy tooling and frameworks in favor of concise Aurora configuration files.

But over time a whole new class of bespoke deploy tooling has emerged as service owners had to target large deployment matrices consisting of multiple pre-production environments (for things like performance testing, integration testing, regression testing, canary testing, etc.) across multiple availability zones (i.e. multiple Mesos clusters). As the matrix grows, manual orchestration quickly becomes untenable.

The custom tooling also had a strong emphasis on supporting rolling back services to a previous good state, which only becomes more complicated as your deployment matrix grows.

In this talk I’ll outline the ways in which Mesos and Aurora helped our engineers to implement more sophisticated DevOps processes by making it easy to grow their deployment matrix. I’ll outline how that process surfaced some of the holes in currently available tooling and how it led to a huge amount of duplicate effort for our service owners. Finally I’ll describe a system we’ve built at Twitter to support our CI/CD pipeline that we feel closes those gaps.

2:00pm

Microservice architectures result in environments up to 20 times larger than their monolithic counterparts. In such big, interconnected environments, container metrics tell you about infrastructure health but not service health. Even if you have implemented service health checks to react quickly to service failures, in a resilient system (such as one built on top of Mesos/Marathon or DC/OS) you will see intermediary mushroom cloud effects, with a large number of services temporarily affected. The mushroom cloud shows you all services, containers and hosts affected by a failing component. How do you find out what really caused the problem, and how do you distinguish cause from effect?

In this session Alois will do post-mortem analysis by walking through different cases of failures we've observed in a real-world large e-commerce production environment running on Apache Mesos and show you how to figure out what actually caused the failures.

3:00pm

DC/OS, the recently open-sourced, Mesos-based operating system, allows system administrators and DevOps departments to run entire data centers as a single compute unit. But what about managing your servers and scaling your infrastructure?

With the advent of cloud computing and vastly reduced infrastructure costs, the compute resource available to businesses is virtually limitless. Juju, created by Canonical, allows us to manage our applications flexibly across both cloud and physical infrastructure as if they were the same thing. In this presentation Tom Barber will take you into the new world of application modelling.

Juju is already used to drive many OpenStack deployments. We will take a look at how it can help you model your infrastructure in a way that vastly simplifies managing your DC/OS or Mesos installation and networking, whilst avoiding single-cloud lock-in and the need to manage various services across different vendors and APIs.

Tom Barber is the director of Meteorite BI and Spicule BI. A member of the Apache Software Foundation and regular speaker at ApacheCon, Tom has a passion for simplifying technology. The creator of Saiku Analytics and open source stalwart, when not working for NASA, Tom currently deals...

4:20pm

A CI/CD pipeline running on Mesos needs a dedicated component to store and serve artefacts and their metadata. It is now possible to run JFrog's Artifactory, a universal binary repository manager, in a highly available configuration on Apache Mesos that can support many hundreds of users. Once usage of Artifactory scales beyond the demands of a single node, a highly available configuration uses a primary/secondary architecture to scale to several nodes. These nodes also require access to a relational database and a shared filesystem.

In this presentation, Alexis Tual, a Solution Engineer at JFrog, will review the challenges faced when adapting the existing highly available architecture of JFrog to the world of Mesos: from storage to scheduling. He will also show how you can use Artifactory to push and pull Docker images with Marathon to create CI/CD pipelines for containerized projects. He will also discuss future work for both Mesos and Artifactory that will improve this integration.

Alexis is a versatile Solution Engineer working at JFrog in Toulouse (France). He has a strong Java (Groovy!) / Web Developer background and has dedicated the past few years to automation and CI/CD.

5:20pm

Mesos is never the only OSS you need to run your production datacenter. And just like all of us hanging out together at MesosCon, if you surround Mesos with its OSS friends, you get a happy, highly productive Mesos. But you have to be careful: not all OSS plays well with others.

In this talk, we’ll start by looking at a handful of production Mesos datacenters from major users. We’ll use these real-world examples to abstract a standard Mesos datacenter architecture that contains all of the components needed to run today’s modern, containerized apps with big data and analytics frameworks.

With that harmonized Mesos datacenter architecture, we’ll look at each of the abstract components and discuss the leading OSS projects that fit each piece. We’ll discuss the characteristics of what makes some solutions work well with Mesos, and call out the projects that don’t meet the standard.

We’ll wrap up the talk by showing the complete architecture diagram, and show the single-command trick for bringing all of the best OSS components together and getting them up and running in a production cluster.

Accomplished engineering manager with a passion and drive for building and scaling infrastructure, and utilizing data to solve complex issues. Strong believer in collaborative teamwork -- the sum is greater than its parts. Demonstrated track record of directing fast-paced, high-performing...

10:15am

Process migration (also known as process checkpointing) is the ability to move a group of related running processes from one set of nodes to another. It involves three phases: take a snapshot of all processes, copy the snapshot data over to the target nodes, and restart processes from that snapshot. Process migration has traditionally been used for fault-tolerance in the context of long-running stateful applications. Without it, the application developers need to modify the stateful application to periodically save the state to disk in order to restart in case of a failure. This is inefficient and error-prone!
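The three phases can be sketched with a toy example. This is only an illustration under loud assumptions: the `Worker` class stands in for a stateful process, a plain dictionary stands in for the target node's storage, and JSON serialization stands in for a real process snapshot. All names are hypothetical, and none of this reflects how Mesos or any checkpointing tool actually does it.

```python
import json


class Worker:
    """Hypothetical stand-in for a long-running stateful process."""

    def __init__(self, processed=0, totals=None):
        self.processed = processed
        self.totals = dict(totals or {})

    def step(self, key, value):
        """Consume one input record and update the in-memory state."""
        self.processed += 1
        self.totals[key] = self.totals.get(key, 0) + value


def snapshot(worker):
    """Phase 1: capture the worker's full state as bytes."""
    state = {"processed": worker.processed, "totals": worker.totals}
    return json.dumps(state).encode("utf-8")


def copy_to_target(image, target_store, name):
    """Phase 2: copy the snapshot to the target (here, just a dict)."""
    target_store[name] = bytes(image)


def restart(target_store, name):
    """Phase 3: rebuild a worker on the target from the copied snapshot."""
    state = json.loads(target_store[name].decode("utf-8"))
    return Worker(state["processed"], state["totals"])


def migrate(worker, target_store, name="worker-0"):
    """Run the three phases in order and return the restarted worker."""
    copy_to_target(snapshot(worker), target_store, name)
    return restart(target_store, name)


source = Worker()
source.step("clicks", 2)
source.step("clicks", 3)
moved = migrate(source, target_store={})
print(moved.processed, moved.totals)
```

The restarted worker continues from the captured state without the application having to write its own save and restore logic into its main loop, which is the point the paragraph above makes about developer effort.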

In this talk, we demonstrate process migration within a Mesos cluster for a more enjoyable way to schedule your containers. Apart from fault-tolerance, process migration within Mesos opens up new avenues for implementing better scheduling policies. Some of the other use cases include improved maintenance primitives, debugging, speculative execution and managing “tails” for multi-threaded applications.

Kapil Arya is an Engineer at Mesosphere focussing on the core team. He recently finished his PhD at Northeastern University where he was the lead developer of the open source distributed checkpointing project DMTCP, and contributed to the reversible debugger FReD. Interning at VMware...

11:15am

Using Mesos to run Docker containers at scale is a common practice for many users. In this talk we will give an overview of the different ways to run Docker on top of Mesos, including the differences between running Docker with the different containerizers (i.e., Mesos, Docker, or universal).

As running Docker at large scale poses its own challenges (e.g., how to start up 1000 containers as quickly as possible), we present best practices and common pitfalls we have encountered over the last few years. We also discuss approaches for debugging Docker-related problems.

1:30pm

The next generation distributed data center architecture is making applications more powerful and more responsive. But as many teams are starting to find out, the complexity of securing these applications and monitoring their behavior can be impractical, painful, and sometimes plain impossible.

In this demo-driven presentation, Luca Marturana will take you through the underlying challenges of container operations, cover the current state of the art of container and microservice monitoring, and discuss new techniques such as behavioral monitoring to secure your infrastructure. Using open source tools running in live environments, he will demonstrate how to effectively monitor, troubleshoot, and secure Mesos deployments.

The presentation will feature live interaction with container environments and live demos of all tools and techniques discussed. Special emphasis will be put on using the Mesos portfolio of scheduling and management tools as well as sysdig, an open source container and system troubleshooting tool developed by the presenter, and the open source behavioral security monitor falco.

Alessandro Gallotta is a software engineer at Sysdig. He is a core developer focusing on backend services that deal with big data and high availability issues. He holds an M.Sc. in Computer Engineering from University of Catania, Italy. Prior to Sysdig he worked as web developer...

Gastón Kleiman, Apache Mesos PMC/Committer, is a Staff Software Engineer at Mesosphere. He fell in love with distributed systems and infrastructure automation while contracting for Google, where he got to use Borg, MapReduce and other cool technology. That led him to work at Amazon...

4:00pm

DC/OS is a powerful platform for running containers and resilient microservices architectures at scale. But releasing or upgrading software in production is often stressful because of the risk of performance issues or even downtime. Applying canary patterns to continuous delivery pipelines provides a safety net that makes releasing containers less risky and stressful. Publishing new software versions to only a small percentage of visitors who match specific criteria lets you test, optimise and scale in a controlled and gradual way, without negatively impacting the majority of users. In this presentation we’re going to talk about how VAMP adds powerful open source canary-releasing features to the DC/OS stack, and how to set up a smart continuous delivery pipeline.
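The idea of exposing a new version to a small, criteria-matched share of visitors can be sketched as follows. This is a generic illustration that assumes hash-based bucketing of visitor IDs. The functions `bucket`, `route` and `is_internal` and the visitor fields are hypothetical and are not taken from VAMP or DC/OS.

```python
import hashlib


def bucket(visitor_id: str) -> int:
    """Map a visitor ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(visitor_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def route(visitor: dict, canary_percent: int, criteria=lambda v: True) -> str:
    """Return "canary" for eligible visitors in the canary share, else "stable"."""
    if criteria(visitor) and bucket(visitor["id"]) < canary_percent:
        return "canary"
    return "stable"


def is_internal(visitor: dict) -> bool:
    """Example criterion: only visitors in the 'internal' group are eligible."""
    return visitor.get("group") == "internal"


visitors = [
    {"id": f"user-{i}", "group": "internal" if i % 2 else "external"}
    for i in range(10_000)
]
share = sum(route(v, 5) == "canary" for v in visitors) / len(visitors)
print(f"canary share at 5%: {share:.1%}")
```

Because the bucket depends only on the visitor ID, a visitor who sees the canary at 5% still sees it when the share is raised to 20%, which keeps a gradual rollout consistent for individual users.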

Founder and CTO of Vamp.io (formerly Magnetic.io), builders of Vamp. Vamp is a modern cloud-native solution to continuously release new microservices into production without downtime using advanced AI-based canary testing and releasing features, and delivering smart right-scaling...