AWS Release “Firecracker”, an Open Source Rust-Based microVM for Container and Serverless Workloads

At AWS re:Invent 2018 Amazon announced the release of Firecracker, an open source virtualization technology that is purpose-built for "creating and managing secure, multi-tenant containers and functions-based services". Firecracker is a fork of Chromium OS's Virtual Machine Monitor (crosvm), an open source VMM written in Rust, and the technology is used behind the scenes to power Amazon's AWS Fargate and AWS Lambda services.

According to the AWS Open Source blog, Firecracker is a new virtualization technology that enables engineers to deploy micro Virtual Machines or "microVMs". In much the same way as existing lightweight VM projects, such as Kata Containers and gVisor, Firecracker microVMs aim to combine the security and workload isolation properties of traditional VMs with the speed and resource efficiency enabled by containers. According to Jeff Barr, chief evangelist for AWS, the increasing popularity of the AWS "serverless" offerings was the motivation to create Firecracker:

[When launching AWS Lambda] we used per-customer EC2 instances to provide strong security and isolation between customers. As Lambda grew, we saw the need for technology to provide a highly secure, flexible, and efficient runtime environment for services like Lambda and Fargate. Using our experience building isolated EC2 instances with hardware virtualization technology, we started an effort to build a VMM that was tailored to integrate with container ecosystems.

At its core, Firecracker is a virtual machine monitor (VMM) that uses the Linux Kernel-based Virtual Machine (KVM). Firecracker has a minimalist design. It excludes unnecessary devices and guest-facing functionality in order to reduce the memory footprint and the security attack surface area of each microVM. There are only four emulated devices: virtio-net, virtio-block, a serial console, and a one-button keyboard controller used only to stop the microVM. AWS claim that this, along with a streamlined kernel loading process, enables a sub-125 ms startup time.

The project's GitHub repo contains detailed design decision documents that discuss core architectural choices. For example, each Firecracker process encapsulates one and only one microVM, and this process runs the following threads: API, VMM and vCPU(s). A specification document states runtime guarantees that quantify Firecracker's promise to enable "minimal-overhead execution of container and serverless workloads". These specifications are enforced by integration tests that run for each PR and master branch merge, and are executed against a bare metal I3.metal instance with hyperthreading disabled.

Firecracker provides a RESTful control API (specified in OpenAPI format) that handles resource rate limiting for microVMs and exposes a microVM metadata service for sharing configuration data between the host and guest. The API thread is fully responsible for Firecracker's API server and associated control plane, and it is never in the fast path of the virtual machine.
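To make the control API more concrete, the following is a minimal Python sketch of how a client might configure and start a microVM over a Unix domain socket. The article does not list endpoints, so the paths and field names here (`/machine-config`, `/boot-source`, `/drives/rootfs`, `/actions`) follow the project's published OpenAPI description as best understood and may differ between Firecracker versions; the socket path, file paths, and helper names are hypothetical.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that talks over a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5.0):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def build_boot_requests(kernel_path, rootfs_path, vcpus=1, mem_mib=128):
    """Return the ordered (method, path, body) calls for a minimal boot.

    Endpoint and field names are assumptions based on the OpenAPI spec.
    """
    return [
        ("PUT", "/machine-config",
         {"vcpu_count": vcpus, "mem_size_mib": mem_mib}),
        ("PUT", "/boot-source",
         {"kernel_image_path": kernel_path,
          "boot_args": "console=ttyS0 reboot=k panic=1"}),
        ("PUT", "/drives/rootfs",
         {"drive_id": "rootfs", "path_on_host": rootfs_path,
          "is_root_device": True, "is_read_only": False}),
        ("PUT", "/actions", {"action_type": "InstanceStart"}),
    ]


def send_all(socket_path, requests):
    """Send each request in order and fail fast on a non-2xx response."""
    conn = UnixHTTPConnection(socket_path)
    try:
        for method, path, body in requests:
            conn.request(method, path, body=json.dumps(body),
                         headers={"Content-Type": "application/json",
                                  "Accept": "application/json"})
            resp = conn.getresponse()
            detail = resp.read().decode("utf-8", "replace")
            if resp.status >= 300:
                raise RuntimeError(f"{method} {path}: {resp.status} {detail}")
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical paths; requires a running Firecracker process.
    calls = build_boot_requests("/tmp/vmlinux", "/tmp/rootfs.ext4")
    send_all("/tmp/firecracker.socket", calls)
```

Separating request construction from transport keeps the configuration steps easy to inspect or log before anything is sent to the API thread.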

The VMM thread exposes the "machine model, minimal legacy device model, microVM metadata service (MMDS) and VirtIO device emulated Net and Block devices, complete with I/O rate limiting". There are one or more vCPU threads (one per guest CPU core), and these threads are created via KVM and run the KVM_RUN main loop. They execute synchronous I/O and memory-mapped I/O operations on device models.
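The article does not describe how the I/O rate limiting works internally. One common technique for this kind of limit is a token bucket, sketched below in Python purely as an illustration; the class name, parameters, and the injected clock are hypothetical and should not be read as Firecracker's actual implementation.

```python
class TokenBucket:
    """Generic token bucket: a budget of `capacity` tokens that refills
    linearly, reaching full again after `refill_time_s` seconds."""

    def __init__(self, capacity, refill_time_s, now=0.0):
        self.capacity = capacity
        self.refill_time_s = refill_time_s
        self.tokens = float(capacity)  # start with a full budget
        self.last = now                # time of the last refill

    def _refill(self, now):
        # Add tokens in proportion to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.last)
        added = self.capacity * elapsed / self.refill_time_s
        self.tokens = min(self.capacity, self.tokens + added)
        self.last = now

    def try_consume(self, amount, now):
        """Return True and deduct tokens if the budget allows `amount`."""
        self._refill(now)
        if amount <= self.tokens:
            self.tokens -= amount
            return True
        return False


# Example: a budget of 100 units (e.g. bytes or operations) per second.
bucket = TokenBucket(capacity=100, refill_time_s=1.0)
first = bucket.try_consume(80, now=0.0)   # fits in the initial budget
second = bucket.try_consume(40, now=0.0)  # only 20 tokens remain
third = bucket.try_consume(40, now=0.5)   # half a second refills 50 tokens
```

Passing the current time explicitly, rather than reading a clock inside the class, keeps the sketch deterministic and easy to test.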

Firecracker runs on Linux hosts with 4.14 or newer kernels and with Linux guest OSs, and currently supports Intel CPUs, with planned AMD and Arm support. Firecracker will also be integrated with popular container runtimes, and there is a prototype firecracker-containerd implementation on GitHub. Initially, this project allows the launch of one container per microVM. There is also an open issue for support of Firecracker within HashiCorp's Nomad scheduling framework.

From a security perspective, all Firecracker vCPU threads are considered to be running malicious code as soon as they have been started, and accordingly, "these malicious threads need to be contained". Containment is implemented by nesting several "trust zones" which increment from "least trusted or least safe (guest vCPU threads)" to "most trusted or safest (host)". In production, Firecracker should be started only via the jailer binary (the Firecracker binary can currently be executed directly, but this feature will be removed in a future release).

The Firecracker jailer binary sets up system resources that require elevated permissions (e.g., cgroup, chroot), drops privileges, and then exec()s into the Firecracker binary, which is then run as an unprivileged process. Past this point, Firecracker can only access resources that a privileged third party grants access to, for example, by copying a file into the chroot, or passing a file descriptor. Seccomp filters are used to further limit the system calls Firecracker can use.
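The ordering described above (privileged setup first, then dropping privileges, then exec) can be sketched in Python as follows. This is a simplified illustration and not the jailer's actual code: it omits cgroup and seccomp setup, and the function name, the `sys_ops` parameter, the `DryRun` helper, and all paths and IDs are hypothetical.

```python
import os


def jail_and_exec(chroot_dir, uid, gid, exec_path, argv, sys_ops=os):
    """Confine the process to chroot_dir, drop privileges, then exec.

    `sys_ops` defaults to the real `os` module; a recorder can be
    passed instead to inspect the steps without needing root.
    """
    # 1. Steps that need elevated permissions come first.
    sys_ops.chroot(chroot_dir)
    sys_ops.chdir("/")  # make sure the working directory is inside the jail

    # 2. Drop privileges: supplementary groups and gid before uid,
    #    because changing groups is no longer allowed once uid is dropped.
    sys_ops.setgroups([])
    sys_ops.setgid(gid)
    sys_ops.setuid(uid)

    # 3. Replace the process image; exec_path is resolved inside the chroot.
    sys_ops.execv(exec_path, argv)


class DryRun:
    """Records calls instead of performing them (illustration only)."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def record(*args):
            self.calls.append((name, args))
        return record


# Dry run: shows the order of operations without touching the system.
plan = DryRun()
jail_and_exec("/srv/jailer/vm1/root", 1234, 1234,
              "/firecracker", ["/firecracker"], sys_ops=plan)
step_names = [name for name, _ in plan.calls]
```

The key design point is the ordering: once `setuid` has run, the process can no longer call `chroot` or change its groups, so everything that requires privilege has to happen before the drop.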

One thing missing from firecracker is either the DCO or a CLA. This is very surprising! One of these is typical when accepting any outside contributions. I expect to see this fixed ASAP once external PRs start coming in. Will be interesting to see what they pick.

Gupta responded by thanking Beda for the feedback, and stated that "we'll actively listen to community and customers actively and evolve!"