Understanding SCHED_DEADLINE

Every task starts as SCHED_OTHER, where each task gets a fair share of the CPU bandwidth.

Then comes SCHED_FIFO, where it's first in, first out, a task will run until it gives up the CPU. SCHED_RR shouldn't be used said SCHED_FIFO because it works between tasks of the same priority.

Steve gave an example of a machine that runs two tasks, one of a nuclear power plant, and one of a washing machine. The point it to show that priorities should be thought of in a system-wide view when using Rate Monotonic Scheduling. It's not as simple as which task is most important.

Earliest Deadline First (EDF)

Earliest deadline first solves some of the issues of RMS, by allowing to run times without missing deadlines.

Steve explained that sched_yield should never be used because it's almost always buggy. Except when using SCHED_DEADLINE of course, where it can be useful.

Multi processors

Steve then introduced a simple example to show Dhall's effect. It shows you can't get over utilization of 1 when using EDF.

If you want to partition EDF, it becomes similar to the packing problem, which is NP complete. A solution is to use global EDF, which constrains the problem, but can solve a special case, and get more than 1 of utilization when using multiple processors.

The limits of SCHED_DEADLINE

It has to run on all CPUs.

It can not fork, because the tasks has been fixed.

It's very hard to calculate the worst case execution time(WCET), and if you get it wrong, it breaks.

Using cgroups, it's possible to configure SCHEAD_DEADLINE affinity, but it's still a long series of commands, and stuff in /proc to do, but this is being worked on, Steve says.

It's possible to use Greedy Reclaim of Unused Bandwidth (GRUB) in order to utilize bandwidth left by some tasks, leaving more leeway to deal with WCET.

Proper APIs to HW video accelerators

by Olivier Crête

There are various types of codecs: software, hardware, and then hardware accelerators. The last ones are the subject of Olivier's talk.

Codecs can be used in a variety of contexts: players, encoders, streamers, transcoders, VoIP systems, content creation software, etc.

The different use cases have different requirements: Broadcast production want high quality, and user generated content will have lower quality for example.

Video calls care mostly about latency. When transcoding, you might care about latency if you're live, or about quality per bit if you want to store it.

Requirements

Exchange formats on the encoded side might need to support variance in packetization, byte stream, etc.

The raw content might have different subsambling, color space, etc.

Then, the memory layout might vary as well: is it planar (RGBRGBRGB) or packed (RRGGBBRRGGBB) ? Are there multiple planes (multiple DMAbuf fds )? Do you have alignment requirements in memory ? You might have tiled formats, with different tiled formats, compressed in-memory formats, padding, etc.

The memory allocation can be internal or external. In Linux, you mostly care about DMAbuf.

A good API, Olivier says, should support push or pull modes for different uses. A good API should be a living, maintained project, with Open Source code as opposed to being just a specification.

Existing (wrong) solutions

OpenMAX IL is everywhere because it's required by Android. But no one implements the full OpenMAX, only the Android subset, validated through CTS. The spec isn't maintained at Khronos anymore, and the last library passing the full test suite was from 2011. It's a fragmented landscape.

OpenMAX has a specific threading and allocation model. The whole framework isn't a good API according to Olivier.

libv4l is a "transparent" wrapper over the kernel API, but it's tied to the kernel API rules, and has limited maintenance, Olivier says.

VA-API is more interesting, albeit Intel-specific. Still, it requires complex code, and is video-only.

GStreamer is a whole multimedia framework, with a specific way of working(threads, allocation, etc.), not a HW acceleration API. It's not designed for low latency.

FFmpeg/libav is kind of OK, Olivier says, but is not focused on the hardware side. MFT on Windows is close to what Olivier is seeking, but tied to Windows.

Simple Plugin API (SPA)

This is library, coming from Pipewire, matches all of Olivier's requirements: no pipeline, no framework, registered buffers, synchronous or asynchronous modes, externally-provided thread contexts, and not limited to codecs.

It's available on github:

https://github.com/PipeWire/pipewire/tree/master/spa

It can work outside PipeWire, although it hasn't been picked-up elswhere yet.

Introduction to the Yocto Project - OpenEmbedded-core

by Mylène Josserand

Why use a build system ?

There are many constraints in embedded systems to match. You can try building everything manually, but despite the flexibility, it's a dependency hell, and lacks reroducibility. Binary distributions are less flexible, harder to customize, and not available on all architectures.

A build system like Buildroot or Yocto is a middle ground between the two.

In yocto, you have multiple tasks, to download, configure, compile, install, the builds. The tasks are grouped in recipes, and you manage recipes with Bitbake.

Many common tasks are already defined in the OpenEmbedded core. Many recipes are available, organised in layers.

OpenEmbedded is co-maintained by the Yocto Project and OE project. It's the base layer, the core of all the magic as Mylène says.

Workflow

Poky is a distribution built on top of the OpenEmbedded Core, and provides Bitbake.

The general workflow is Download -> Configure -> Compile. You download the proper version you want with git clone.

To add applications, add layers (compilations of recipes). There are folders in your poky directory. Always look at existing layers before creating a recipe. Do not edit upstream layer if you don't want breakage when updating.

To configure the build, you first source the Bitbake environment, which moves you to a build folder, and gives you a set of commands in order to do the build. You can then edit the local.conf to set your MACHINE, which describes the hardware and can be found in specific BSP layers, and setup your DISTRO, which represents top-level configuration that will be applied on every build, and brings toolchains, libc, etc. And then the IMAGE, brings the apps, libs, etc.

Creating a layer

When needed, you might create a layer, whether you have custom hardware, or want to integrate your own-application. You can do that with the yocto-layer tool which does the heavy-lifting. It's a good practice to create multiple layers to share common tasks/recipes between projects.

The recipe are created in .bb files, the format that bitbake understands. The naming of the file is application-name_version.bb, and a file is split in header, source, and tasks parts.

Mylène says it's a good practice to always use remote repositories to host app sources to make development quicker. App sources should never be in the layer directly. The folder organization should always be the same in order to find the recipes faster.

Sometimes, you might want to extend an existing recipe, without modifying it. It's possible with the Bitbake engine, when creating .bbappend files. All .bbappend files are version specific. They can be used to add patches, or customize the install process by appending a task.

Creating an image

An image is a top-level recipe, it has the same format as other recipes, with specific variables on top, like IMAGE_INSTALL to list the included package, or IMAGE_FSTYPES for the binary format of images you want (ext4, tar, etc.).

It's a good practice to only install what you need for your system to work.

Creating a machine

The machine describes the hardware. It contains variables related to the architecture, like TARGET_ARCH for the architecture, or KERNEL_IMAGETYPE.

Mainline Linux on AmLogic SoCs

by Neil Armstrong

The AmLogic SoC Family has multimedia capabilities, and is used in many products. They have different products, ranging from the Cortex-A9 to Cortex A53 CPUs.

The SoCs are very cheap at ~7$ when compared to competitors.

Amlogic SoCs are used in many different cheap Android boxes. They are also in community boards from ODroid, the Khadas VIM1/2, NanoPi K2 or Le Potato which has been designed by BayLibre.

The Libre Computer board has been backed on Kickstarter. Mainline support is done by BayLibre, with many peripherals already working.

The upstream support started from 4.1 by independent hackers. From 4.7, BayLibre started working on it. The bulk of the work went in 4.10 and 4.12.

The work was concentrated on 64bit SoCs (the latest ones), but the devices are very similar inside the family.

Drivers so far

Dynamic Voltage and Frequency Scaling is a complex part of the work, since it's done on a specific CPU in the SoC, but ARM changed the protocol after some time and did not publish the old one at first.

SCPI (the DVFS driver on this SoC) is now supported on 4.10 though.

Kevin Hilman wrote a new eMMC host driver from the original implementation and public datasheet. It's very performant.

At the end of 2016, Amlogic did a new variant S905X of those SoCs, and supporting it was easily done through re-architecturing the Device Tree files.

For CVBS (analog video support), support was integrated in 4.10. For HDMI, Amlogic integrated a Synopsys DesignWare HDMI Controller, and a clean dw-hdmi bridge has been published sharing the code between different SoCs family. The PHY was custom, as well as the HPD though.

CEC support was merged using the CEC framework maintained by Hans Verkuil.

The Mali GPU inside the SoC does not have an open driver. The open source kernel driver is available. But the userspace shared binary is delivered as a blob, that has to be compiled by the SoC vendor to customize it.

Work in progress

There is still a lot of work for the Video Display: cursor plane, overlay planes, osd scaling, overlay scaling are missing for example.

DRM Planes only have a single, primary plane, without scaling. Support for scaling, or planes with different sub-sampling (various YUVformats), overlay planes, is still missing.

In Audio land, S/PDIF input and output is missing. I2S is working for output only through HDMI or external DAC, but the embedded stereo DAC in GXL/GXM or I2S input aren't support.

Video Hardware Acceleration, while one of the best feature of the SoC, is still missing, Neil says. There's at least 6 month of development to have a proper V4L2 driver.

Community

There are a lot of hobbyist hacking on the Odroid-C2 board, and running LibreELEC and KODI. Many raspberry-pi oriented projects are also ported to Amlogic boards.

There are upstream contributions from independent hackers. This is also helped by the growing of Single Board Computer (SBC) diversity with these SoCs.

Long-Term Maintenance, or How to (Mis-)Manage Embedded Systems for 10+ Years

by Marc Kleine-Budde

Marc started by asking the audience who had Embedded Systems in the field, for how long, and which ones were still maintained. Then he asked who had to update to fix a vulnerability, and how long it took to deploy.

The context of the talk are systems created by small teams, using custom hardware, and pushing out new products every few years, that need to be supported for more than 10 years.

The traditional Embedded Systems Lifecycle starts with a Component Version Decision, followed by HW/SW development, then the maintenance starts. It's usually the longest phase.

Marc showed graphs of vulnerabilities per-year in the Kernel, glibc and openssl. Despite most vulnerabilities being Denial of Services, it's still a lot. There's also the infamous "rootmydevice" in a proc file that was in a published linux-sunxi kernel from Allwinner.

Don't trust you vendor kernel, Marc says.

Field Observations

vendor kernels are already obsolete at start of project

the workflow for customized pre-built distributions isn't standard

you get the worst of both world if you select "longterm" components but don't have an update concept

if your update process isn't proven, it's bad

there's a critical vulnerability in a relevant component at least twice a year

upstream only maintain components for 2 to 5 years

Server distros are made for admin interaction, and not suited to embedded systems.

It all leads to the conclusion that Continous Maintenance is very important.

Backporting, while simple at its core — you take a patch and apply it — doesn't scale. As you get more products, versions diverge, as you make local modifications test coverage is reduced, and after a few years, it's almost impossible to decide which upstream fixes are relevant.

If you don't want your product to become part of botnet, you need to have a few safeguards. You need to have short time between incident and fix, have low risk of negative side effects, predict maintenance cost, and have this whole process scalable to multiple products.

These are ingredients for a sustainable process: making sure you can upgrade in the field, review security announcements regularly, always use releases maintained by upstream, disable unused components and enable security hardening.

Development Workflow

It's important to submit changes to upstream to reduce maintenance effort.

You need to automate the processes as early as possible: use CI.

When starting a new project, use the development version of upstream projects, so that when you reach completion, it's in stable state, and still maintained, as opposed to already obsoleted.

Every month, do periodic maintenance: integrate maintenance releases in order to be prepared, review security announcements, and evaluate impact on the product.

When you identify a problem, apply the upstream fix, and leverage your automated build, testing and deployment infrastructure to publish.

Marc advises using Jenkins 2 with Pipeline as Code. For test automation, take there's kernelci.org or LAVA. For redundant boot, barebox as bootchooser, u-boot/grub can do it with custom scripts as well as UEFI. For the update system, there is RAUC, OSTree or Swupdate. Finally, there are now many different rollout schedulers like hawkBit, mender.io, resin.io, but you can also use a static server or custom application.

The application discussed in this talk has high real time and performance constraints. It must be able to merge and synchronize images issued by 2 cameras, with safety constraints. Target latency is less than 200ms, with boot time less than 5s.

Christian says that in a previous video application, they worked on an ARM SoC with gstreamer, but it didn't match the safety requirements, so they decided to go with a hybrid FPGA+linux solution.

Target hardware is a PicoZED, an System On Module based on a Xilinx Zynq, which embeds and ARM processor as well as an FPGA in the SoC. Its software environment is yocto-based, and does not use the Xilinx-provided solutions Petalinux or Wind River Pulsar Linux, because of their particular quirks. Yocto is now well known and Christian decided to pick-up the meta-xilinx layer and start from that instead.
All necessary layers are from the OE layer Index.

The FPGA development are made with the Eclipse-based Xilinx Vivado tool, which enables scripting with tcl.

The AXI bus is used to communicate between the Linux host and the FPGA design. It allows adding devices accessible from Linux, extending the capabilities: for example, a new serial line, dedicated hardware. It also allows dynamically changing the video pipeline by changing the parameters.

Boot mechanism

The PicoZed needs a First Stage Boot Loader (FSBL), before u-boot. This FSBL is generated by the Vivado IDE according to the design. The FSBL then starts u-boot, which starts Linux.

The FPGA can't start alone, and it's code (bitstream) is loaded by the FSBL or u-boot. The Xilinx Linux kernel has a drivers for devices programmed in the FPGA. It uses device tree files to describe the specific configuration available at the moment. Vivado generated the whole device tree, not just the part for the Programmable Logic (FPGA), it merges the two in a single system.dts file.

It's a good idea to automate the process of rebuilding the device tree after each change in Vivado, Christian says.

The boot is comprised of several tasks before showing an image, making boot time optimization a complex problem: FSBL, u-boot, bitstream loading, kernel start, etc. Various techniques were used to reduce boot time. Inside u-boot, the bootstage report was activated, some devices init were disabled.

Bootchart was used to profile Linux startup: the kernel size was reduced, the system console removed, and the init scripts reordered. Filesystem checks were bypassed by using a read-only filesystem. SPI bus speed was increased. Other techniques were used, and the 5 second goal was met.

Closing words

While the design of the system was done so that only the part on the FPGA is impacted by the certification process, the bitstream code is still updated through Linux on the network. Therefore code signing was used in the installer and updater mechanisms to protect the integrity of the system.

According to Christian, the project has many unknown before starting, but those were surmounted. The splitted design constraint payed off. The choice of meta-xilinx layer is good one, because of its good quality. You only need to understand that the device tree is not built within the kernel; once you understand the general structure, it's working well, and the distribution is well tailored to the requirements.

Lightning talks

Atom Linux

by Christophe Blaess

Atom Linux is a new embedded linux distro designed by Christophe. It's a binary distribution, but definitely embedded-oriented. It aims to be industrial-quality.

Atom Linux targets small companies, that already have an embedded Linux project, but with poor embedded Linux knowledge. It aims to provide a secure update system (with rollback, factory defaults, etc.). It want to be power-failure proof with a read-only rootfs, and data backup.

It's easy to configure Christophe says. The base system is already compiled. It provides a UI for configuration. It aims to make custom code integration simple by providing a toolchain in a VM or natively if needed.

The user starts by downloading the base image for his target, then installing the configuration tool. The user configures the base image with a few parameters. The configuration tool merges the prebuilt packages and the user custom code in a new root filesystem image.

This image is then stored in the user's repository (update server), and at first boot, the system does an update.

Currently, the base image builder works, as well as u-boot and the update shell scripts. The first version of the configuration tool is Qt-based, but it's very ugly according to Christophe. He still wants to improve the tool, and rewrite the base image builder as a Yocto layer. Christophe is looking for contributors and ask anyone interested to contact him.

Wayland is coming

by Fabien Lahoudere

Fabien started that he is just a user, not a Wayland developer. Wayland is protocol for compositors to talk to its clients. It's aimed as a simpler replacement for X.

Wayland is designed for modern devices, more performant, simpler to use and configure according to Fabien. It's also more secure, supported by toolkits, and the future of Linux distributions. For instance, it prevents keyloggers, that are very easy to implement with X11.

Wayland is more performant, because it has less ping/pong between the compositor and the clients. Weston is the reference implementation. It's a minimal and fast Wayland compositor. You can extend it by using libweston. There's also AsteroidOS and Maynard which are two embedded-oriented Wayland compositors.

It's also possible to use a "legacy" X application through Xwayland. In fact, Fabien did his whole presentation on a small iMX6Solo based board running evince on top of wayland.

Someone from the audience said they recently had to work with QtCompositor, and it was very simple to use.

Process monitoring with systemd

by Jérémy Rosen

Jérémy says systemd is a very good tool for embedded systems. It cost about ~6Mb of disk space when built with yocto. It's already integrated in Yocto and Buildroot.

systemd makes it easy to secure processes with capabilities, and limits system calls; it can bind mount files to control exactly what a process sees. It makes it easy to control resources with cgroups, as well as monitoring processes.

Jérémy compared moving to systemd from init scripts is like going from svn to git. It requires to understand and re-learn a lot of things, but is really worth it in the end.

systemd provide very fine grained control on how to kill a unit: which command to send, which signal to send when it doesn't work, what cleanup command to run, etc. You can define what is a normal or abnormal stop. It can restart an app automatically, and rate-limit this. You can also do coredump management, soft watchdog monitoring, it also monitors itself with a hardware watchdog.

A fine-grained integration of how services work interact is also available. You can react to hardware changes, filesystem changes, use socket activation, etc.

Jérémy said monitoring is a solved problem for him in embedded and he does not want to work on custom solutions anymore.

That's it for Embedded Recipes first edition ! Congratulations on reading this far !

The soundpad

For the Go 1.8 Release Party in Paris I gave a lightning talk on monotonic clocks. It's essentially a talk version of the Russ Cox's design document on monotonic clocks, which is really well written and sourced. You should go read it !

So I have this NAS made by my ISP, that does a lot of things; but recently, I
started having issues with its behavior. Recorded TV shows had lag/jitter while replaying, and the
same happened with other types of videos I put on it. I narrowed it down to ...

So I was in Edinburgh this year, and I took notes as I usually do. These are intended for personal consumption (do no expect LWN-style reports), but as more people were asking me to share them, I thought why not do it in public ?