Google has released "kaniko", an open source tool to build container images inside an unprivileged container or Kubernetes cluster. Although kaniko builds the image from a supplied Dockerfile, it does not depend on a Docker daemon, and instead executes each command completely in userspace and snapshots the resulting filesystem changes.

Building images from a standard Dockerfile typically relies upon interactive access to a Docker daemon, which requires root access on the machine on which it is run. As stated on the Google Cloud Platform blog post announcing the release of kaniko, this can make it difficult to build container images in environments that cannot easily or securely expose their Docker daemons, such as Kubernetes clusters.

To overcome these challenges, kaniko can build a container image from a Dockerfile without requiring privileged root access. Kaniko can be run in a standard Kubernetes cluster (with a Kubernetes secret that contains the auth required to push the final image), Google Container Builder, or locally via Docker and the gcloud SDK.
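As a sketch of the cluster path described above, a minimal pod manifest might look like the following. The pod name, volume layout, secret name, mounted paths, and image tag here are illustrative assumptions, not the project's documented manifest:

```shell
# Hypothetical kaniko build pod; the registry-credentials secret is assumed
# to have been created beforehand (e.g. from a service-account key file).
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: kaniko-build          # illustrative name
spec:
  restartPolicy: Never
  containers:
  - name: kaniko
    image: gcr.io/kaniko-project/executor:latest
    args:
    - --dockerfile=/workspace/Dockerfile
    - --context=/workspace
    - --destination=gcr.io/my-project/my-image:latest  # placeholder registry/image
    volumeMounts:
    - name: registry-creds
      mountPath: /secret      # assumed mount point for push credentials
  volumes:
  - name: registry-creds
    secret:
      secretName: kaniko-secret   # assumed secret name
EOF
```

Because the executor needs no Docker daemon, the pod requires no privileged security context; only the push credentials are supplied, via the mounted secret.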

kaniko is run as a container image that takes three arguments: a Dockerfile, a build context, and the name of the registry to which it should push the final image. This image is built from the scratch image, and contains only a static Go binary plus the configuration files needed for pushing and pulling images. The kaniko executor fetches and extracts the filesystem of the specified base image to the container filesystem root. The "base image" in this context is the image specified by the FROM instruction in the supplied Dockerfile.
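For the local path, an invocation via Docker along these lines is a reasonable sketch. The flag names follow the kaniko README, while the mounted paths, project, and image name are placeholders:

```shell
# Sketch of a local kaniko build run through Docker. The gcloud config mount
# supplies push credentials for a GCR registry; adjust for other registries.
docker run \
    -v "$HOME/.config/gcloud:/root/.config/gcloud" \
    -v "$PWD:/workspace" \
    gcr.io/kaniko-project/executor:latest \
    --dockerfile=/workspace/Dockerfile \
    --context=/workspace \
    --destination=gcr.io/my-project/my-image:latest   # placeholder image name
```

Note that Docker here is only a convenient way to launch the executor container; the build itself still happens entirely inside the executor, without talking to the host's Docker daemon.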

Kaniko then executes each Dockerfile command in the order specified, and takes a snapshot of the file system after each command. The snapshot is created in user-space by walking the filesystem and comparing it to the prior state that was stored in memory. It appends any modifications to the filesystem as a new layer to the base image, and makes any relevant changes to image metadata. After executing every command in the Dockerfile, the executor pushes the newly built image to the desired registry. All of the above steps are conducted completely in user-space within the executor image, which is how it avoids requiring privileged access on the machine: "the docker daemon or CLI is not involved".
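The walk-and-compare idea can be illustrated with a small shell sketch (not kaniko's actual code): checksum every file to record the prior state, run a command that mutates the filesystem, walk again, and treat the changed or added files as the contents of the new layer.

```shell
#!/bin/sh
# Illustrative approximation of snapshot-by-filesystem-walk.
set -e
root=$(mktemp -d)                 # stands in for the container filesystem root
echo "base" > "$root/etc_conf"    # pre-existing file that the "command" will change
echo "keep" > "$root/lib_so"      # pre-existing file that stays unchanged

# Record the prior state: checksum of every file, sorted for stable comparison.
before=$(mktemp)
find "$root" -type f -exec md5sum {} \; | sort > "$before"

# Simulate a Dockerfile command modifying the filesystem.
echo "changed" > "$root/etc_conf"
echo "new"     > "$root/app_bin"

# Walk again; lines unique to the new state are the changed/added files,
# i.e. what would be appended to the image as a new layer.
after=$(mktemp)
find "$root" -type f -exec md5sum {} \; | sort > "$after"
comm -13 "$before" "$after" | awk '{print $2}'
```

The unchanged file drops out of the comparison, so only the modified and newly created files end up in the "layer", which mirrors the behavior the article describes.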

The majority of Dockerfile commands can be executed with kaniko, with the current exception of SHELL, HEALTHCHECK, STOPSIGNAL, and ARG. Multi-stage Dockerfiles are also currently unsupported. The kaniko team has stated that work is underway on both of these limitations.

Similar tools to kaniko include img, orca-build, buildah, FTL, and Bazel rules_docker. img can run as a non-root user from within a container, but requires that the img container has "RawProc access" to create nested containers (kaniko does not create nested containers, and so does not require RawProc access). orca-build depends on runC to build images from Dockerfiles, which cannot run inside a container, and buildah requires the same privileges as a Docker daemon does to run.

FTL and Bazel aim to achieve the fastest possible creation of Docker images for a subset of images, and the kaniko README states that "these can be thought of as a special-case 'fast path' that can be used in conjunction with the support for general Dockerfiles kaniko provides."