LUG Community Blogs

Work on Debian for mobile devices, i.e. telephones, tablets, and
handheld computers, continues. During the recent
DebConf17 in Montréal, Canada, more
than 50 people had a meeting to reconsider opportunities and
challenges for Debian on mobile devices.

A number of devices were shown at DebConf:

PocketCHIP: A very small
handheld computer with keyboard, Wi-Fi, USB, and Bluetooth, running
Debian 8 (Jessie) or 9 (Stretch).

Pyra: A modular
handheld computer with a touchscreen, gaming controls, Wi-Fi,
keyboard, multiple USB ports and SD card slots, and an optional
modem for either Europe or the USA. It will come preinstalled with
Debian.

Samsung Galaxy S Relay 4G:
An Android smartphone featuring a physical keyboard, which can
already run portions of Debian userspace on the Android kernel.
Kernel upstreaming is on the way.

ZeroPhone: An
open-source smartphone based on Raspberry Pi Zero, with a small
screen, classic telephone keypad and hardware switches for
telephony, Wi-Fi, and the microphone. It runs the Debian-based
Raspbian OS.

The photo shows all four devices, together with a Nokia N900, which was
the first Linux-based smartphone by Nokia, running the Debian-based Maemo,
and a completely unrelated Gnuk cryptographic token,
which just sneaked into the setting.

Today is Debian's 24th anniversary. If you are close to any of the cities
celebrating Debian Day 2017, you're
very welcome to join the party!

If not, there's still time for you to organize a little celebration or
contribution to Debian. For example, spread the word about Debian Day
with this nice piece of artwork
created by Debian Developer Daniel Lenharo de Souza and Valessio Brito,
taking inspiration from the desktop themes
Lines
and softWaves by Juliette Belin:

If you also like graphic design, or design in general, have a look at
https://wiki.debian.org/Design and join
the team! Or you can visit the general list of Debian Teams
for many other opportunities to participate in Debian development.

Thanks to everybody who has contributed to developing our beloved operating
system over these 24 years, and happy birthday, Debian!

Today, Saturday 12 August 2017, the annual Debian Developers
and Contributors Conference came to a close.
With over 405 people attending from all over the world,
and 169 events including 89 talks, 61 discussion sessions or BoFs,
6 workshops and 13 other activities,
DebConf17 has been hailed as a success.

Highlights included DebCamp with 117 participants,
the Open Day,
where events of interest to a broader audience were offered,
talks from invited speakers
(Deb Nicholson,
Matthew Garrett
and Katheryn Sutter),
the traditional Bits from the DPL, lightning talks and live demos
and the announcement of next year's DebConf (DebConf18 in Hsinchu, Taiwan).

The schedule was updated every day, including 32 ad-hoc new activities
planned by attendees during the whole conference.

For those not able to attend, talks and sessions were recorded and live streamed,
and videos are being made available at the
Debian meetings archive website.
Many sessions also facilitated remote participation via IRC or a collaborative pad.

The DebConf17 website
will remain active for archive purposes, and will continue to offer
links to the presentations and videos of talks and events.

Next year, DebConf18 will be held in Hsinchu, Taiwan, from 29 July 2018
until 5 August 2018. It will be the first DebConf held in Asia.
For the days before DebConf the local organisers will again set up DebCamp
(21 July - 27 July),
a session for some intense work on improving the distribution,
and organise the Open Day on 28 July 2018, aimed at the general public.

About Savoir-faire Linux

Savoir-faire Linux
is a Montreal-based Free/Open-Source Software company
with offices in Quebec City, Toronto, Paris and Lyon. It offers Linux and
Free Software integration solutions in order to provide performance,
flexibility and independence for its clients. The company actively contributes
to many free software projects, and provides mirrors of Debian, Ubuntu, Linux
and others.

About Hewlett Packard Enterprise

Hewlett Packard Enterprise (HPE)
is one of the largest computer companies in the
world, providing a wide range of products and services, such as servers, storage,
networking, consulting and support, software, and financial services.

HPE is also a development partner of Debian,
and provides hardware for port development, Debian mirrors, and other Debian services.

About Google

Google
is one of the largest technology companies in the
world, providing a wide range of Internet-related services and products
such as online advertising technologies, search, cloud computing, software, and hardware.

Google has been supporting Debian by sponsoring DebConf for more than
ten years, at gold level since DebConf12, and at platinum level for this DebConf17.

I used to think I was a programmer who did "sysadmin-stuff". Nowadays I interact with too many real programmers to believe that.

Or rather I can code/program/develop, but I'm not often as good as I could be. These days I'm getting more consistent with writing tests, and I like it when things are thoroughly planned and developed. But too often if I'm busy, or distracted, I think to myself "Hrm .. compiles? Probably done. Oops. Bug, you say?"

I was going to write about working with golang today. The go language is minimal and quite neat. I like the toolset:

Instead I think today I'm going to write about something else. Since having a child a lot of my life is different. Routine becomes something that is essential, as is planning and scheduling.

So an average week-day goes something like this:

6:00AM

Wake up (naturally).

7:00AM

Wake up Oiva and play with him for 45 minutes.

7:45AM

Prepare breakfast for my wife, and wake her up, then play with Oiva for another 15 minutes while she eats.

8:00AM

Take tram to office.

8:30AM

Make coffee, make a rough plan for the day.

9:00AM

Work, until lunchtime which might be 1pm, 2pm, or even 3pm.

5:00PM

Leave work, and take bus home.

Yes I go to work via tram, but come back via bus. There are reasons.

5:40PM

Arrive home, and relax in peace for 20 minutes.

6:00PM-7:00PM

Take Oiva for a walk, stop en route to relax in a hammock for 30 minutes reading a book.

7:00-7:20PM

Feed Oiva his evening meal.

7:30PM

Give Oiva his bath, then pass him over to my wife to put him to bed.

7:30PM - 8:00pm

Relax

8:00PM - 10:00PM

Deal with Oiva waking up, making noises, or being unsettled.

Try to spend quality time with my wife, watch TV, read a book, do some coding, etc.

10:00PM ~ 11:30PM

Go to bed.

In short I'm responsible for Oiva from 6ish-8ish in the morning, then from 6PM-10PM (with a little break while he's put to bed.) There are some exceptions to this routine - for example I work from home on Monday/Friday afternoons, and Monday evenings he goes to his swimming classes. But most working-days are the same.

Weekends are a bit different. There I tend to take him 6AM-8AM, then 1PM-10PM with a few breaks for tea, and bed. At the moment we're starting to reach the peak-party time of year, which means weekends often involve negotiation(s) about which parent is having a party, and which parent is either leaving early, or not going out at all.

Today I have him all day, and it's awesome. He's just learned to say "Daddy" which makes any stress, angst or unpleasantness utterly worthwhile.

I have been working with STM32 chips on-and-off for at least eight, possibly
closer to nine years. About as long as ST have been touting them around. I
love the STM32, and have done much with them in C. But, as my
previous two posts may have hinted, I
would like to start working with Rust instead of C. To that end, I have
been looking with great joy at the work which Jorge Aparicio has
been doing around Cortex-M3 and Rust. I've had many comments in person at
Debconf, and also several people mention on Twitter, that they're glad more
people are looking at this. But before I can get too much deeper into trying
to write my USB stack, I need to sort a few things from what Jorge has done as
demonstration work.

Okay, this is fast, but we need Ludicrous speed

All of Jorge's examples seem to leave the system clocks in a fairly default
state, excepting turning on the clocks to the peripherals needed during the
initialisation phase. Sadly, if we're going to be running the USB at all, we
need the clocks to run a tad faster. Since my goal is to run something
moderately CPU intensive on the end of the USB too, it makes sense to try and
get our STM32 running at maximum clock speed. For the one I have, that's 72MHz
rather than the 8MHz it starts out with. Nine times more cycles to do
computing in makes a lot of sense.

As I said above, I've been doing STM32 in C a lot for many years; and
fortunately I have built systems with the exact chip that's on the blue-pill
before. As such, if I rummage, I can find some old C code which does what
we need...

This code, rather conveniently, uses an 8MHz external crystal so we can almost
direct-port it to the blue-pill Rust code and see how we go. If you're used to
the CMSIS libraries for STM32, then you won't completely recognise the
above since it uses the pre-CMSIS core libraries to do its thing. Library code
from 2008 and it's still good on today's STM32s providing they're in the right
family :-)

A direct conversion to Rust, using Jorge's beautifully easy to work with crates
made from svd2rust results in:
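(A sketch from memory: the accessor and enum variant names below follow svd2rust conventions and vary between crate versions, so treat this as the shape of the code rather than gospel.)

```rust
fn make_go_faster(rcc: &RCC, flash: &FLASH) {
    // Turn on the 8MHz external crystal (HSE) and wait for it to stabilise
    rcc.cr.modify(|_, w| w.hseon().set_bit());
    while rcc.cr.read().hserdy().bit_is_clear() {}
    // Two flash wait states are needed once we run faster than 48MHz
    flash.acr.modify(|_, w| w.latency().two());
    // AHB at full speed, APB1 at half speed (its 36MHz limit), APB2 at full speed
    rcc.cfgr.modify(|_, w| w.hpre().div1().ppre1().div2().ppre2().div1());
    // I cannot set up the ADC prescaler here: the SVD lacks the information,
    // so the generated crate only offers an unsafe bits() for that field
    // Feed the PLL from the HSE and multiply the 8MHz by 9 to reach 72MHz
    rcc.cfgr.modify(|_, w| w.pllsrc().external().pllmul().mul9());
    // Start the PLL and wait for it to lock
    rcc.cr.modify(|_, w| w.pllon().set_bit());
    while rcc.cr.read().pllrdy().bit_is_clear() {}
    // Finally, switch the system clock over to the PLL output
    rcc.cfgr.modify(|_, w| w.sw().pll());
    while !rcc.cfgr.read().sws().is_pll() {}
}
```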

Now, I've not put in the comments which were in the C code, because I'm being
very lazy right now, but if you follow the two together you should be able to
work it through. I don't have timeouts for the waits, and you'll notice a
single comment there: I cannot set up the ADC prescaler because for some
reason the SVD is missing any useful information, so the generated crate only
carries an unsafe function (bits()), and I'm trying to steer clear of
unsafe for now. Still, I don't need the ADC immediately, so I'm okay with
this.

By using this function in the beginning of the init() function of the
blinky example, I can easily demonstrate the clock is going faster since the
LED blinks more quickly.

This function demonstrates just how simple it is to take bit-manipulation from
the C code and turn it into (admittedly bad looking) Rust with relative ease
and without any of the actual bit-twiddling. I love it.

Mess with time, and you get unexpected consequences

Sadly, when you mess with the clock tree on a microcontroller, you throw a lot
of things out of whack. Not least, by adjusting the clock frequency up we end
up adjusting the AHB, APB1, and APB2 clock frequencies. This has direct
consequences for peripherals floating around on those busses. Fortunately
Jorge thought of this and while the blue-pill crate hard-wires those
frequencies to 8MHz, they are, at least, configurable in code in some sense.

If we apply the make_go_faster() function to the serial loopback example, it
simply fails to work because now the bus which the USART1 peripheral is
connected to (APB2) is going at a different speed from the expected power-on
default of 8MHz. If you remember from the function, we did .hpre().div1()
which set HCLK to 72MHz, then .ppre1().div2() which sets the APB1 bus
clock to be HCLK divided by 2, and .ppre2().div1() which sets APB2 bus
clock to be HCLK. This means that we'd need to alter src/lib.rs to
reflect these changes in the clock frequencies, and in theory loopback would
start working once more.
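Purely by way of illustration (the constant names here are invented, not the crate's real ones), the change would amount to something like:

```rust
// Hypothetical names: the real definitions live in the blue-pill crate's
// src/lib.rs and are currently hard-wired to the power-on 8MHz values.
pub const HCLK_HZ: u32 = 72_000_000;  // was 8_000_000
pub const PCLK1_HZ: u32 = 36_000_000; // APB1 = HCLK / 2 after make_go_faster()
pub const PCLK2_HZ: u32 = 72_000_000; // APB2 = HCLK
```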

It'd be awkward to try and demonstrate all that to you since I only have a
phone camera to hand, but if you own a blue-pill then you can clone Jorge's
repo and have a go yourself and see that I'm not bluffing you.

With all this done, it'll be time to see if we can bring the USB peripheral
in the STM32 online, and that will be the topic of my next post in this
discovery series.

DebConf17, the 18th annual
Debian Conference, is taking place in Montreal, Canada
from August 6 to August 12, 2017.

Debian contributors from all over the world have come together at
Collège Maisonneuve
during the preceding week for DebCamp (focused on individual work
and team sprints for in-person collaboration developing Debian),
and the Open Day on August 5th (with presentations and workshops
of interest to a wide audience).

Today the main conference starts with nearly 400 attendees
and over 120 activities scheduled,
including 45- and 20-minute talks and team meetings,
workshops, a job fair, talks from invited speakers,
as well as a variety of other events.

We are very pleased to announce that Google
has committed support to DebConf17 as a Platinum sponsor.

Google is one of the largest technology companies in the
world, providing a wide range of Internet-related services and products
such as online advertising technologies, search, cloud computing, software, and hardware.

Google has been supporting Debian by sponsoring DebConf for more than
ten years, and at gold level since DebConf12.

With this additional commitment as Platinum Sponsor for DebConf17,
Google helps to make our annual conference possible,
and directly supports the progress of Debian and Free Software,
helping to strengthen the community that continues to collaborate on
Debian projects throughout the rest of the year.

Thank you very much Google, for your support of DebConf17!

DebConf17 is starting!

Many Debian contributors are already taking advantage of DebCamp
and the Open Day
to work individually or in groups developing and improving Debian.
DebConf17 will officially start on August 6, 2017.
Visit the DebConf17 website at https://debconf17.debconf.org
for the schedule, live streaming, and other details.

Previously we talked about all the different kinds of
descriptors which USB devices use to communicate their capabilities. This is
important stuff because to write any useful USB device firmware we need to
be able to determine how to populate our descriptors. However, having that
data on the device is entirely worthless without an understanding of how it
gets from the device to the host so that it can be acted upon. To understand
that, let's look at the USB wire protocol.

Note, I'll again be talking mostly about USB2.0 low- and full-speed. I
believe that high speed is approximately the same but with faster wires,
except not quite that simple.

Down to the wire

I don't intend to talk about the actual electrical signalling, though it's not
unreasonable for you to know that USB is a pair of wires forming a
differentially signalled bidirectional serial communications link.
The host is responsible for managing all the framing and timing on the link,
and for formatting the communications into packets.

There are a number of packet types which can appear on the USB link:

| Packet type | Purpose |
| --- | --- |
| Token Packet | When the host wishes to send a message to the Control endpoint to configure the device, read data IN, or write data OUT, it uses this to start the transaction. |
| Data(0/1) Packet | Following a Setup, In, or Out token, a Data packet is a transfer of data (in either direction). The 0 and 1 alternate to provide a measure of confidence against lost packets. |
| Handshake Packet | Following a data packet of some kind, the other end may ACK the packet (all was well), NAK the packet (report that the device cannot, temporarily, send/receive data, or that an interrupt endpoint isn't triggered), or STALL the bus, in which case the host needs to intervene. |
| Start of Frame | Every 1ms (full-speed) the host will send a SOF packet which carries a frame number. This can be used to help keep time on very simple devices. It also divides the bus into frames within which bandwidth is allocated. |

As an example, when the host wishes to perform a control transfer, the
following packets are transacted in turn:

Setup Token - The host addresses the device and endpoint (OUT0)

Data0 Packet - The host transmits a GET_DESCRIPTOR for the device
descriptor

Ack Packet - The device acknowledges receipt of the request

This marks the end of the first transaction. The device decodes the
GET_DESCRIPTOR request and prepares the device descriptor for transmission.
The transmission occurs as the next transaction on the bus. In this example,
we're assuming 8 byte maximum transmission sizes, for illustrative purposes.

In Token - The host addresses the device and endpoint (IN0)

Data1 Packet - The device transmits the first 8 bytes of the descriptor

Ack Packet - The host acknowledges the completion, indicating its own
satisfaction

And thus ends the full control transaction in which the host retrieves the
device descriptor.

From a high level, we need only consider the activity which occurs at the point
of the acknowledgement packets. In the above example:

On the first ACK the device prepares IN0 to transmit the descriptor,
readying whatever low level device stack there is with a pointer to the
descriptor and its length in bytes.

On the second ACK the low levels are still thinking.

On the third ACK the transmission from IN0 is complete and the endpoint
no longer expects to transfer data.

On the fourth ACK the control transaction is entirely complete.
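Seen from the device's side, this is a little state machine which advances on each acknowledgement; a rough sketch:

```rust
// Rough sketch: device-side state for a control read on endpoint zero,
// advanced each time a packet is acknowledged.
enum Ep0State {
    Idle,                                       // waiting for a Setup token
    TxDescriptor { offset: usize, len: usize }, // feeding IN0 chunk by chunk
    StatusStage,                                // awaiting the zero-length status
    Complete,                                   // done; back to Idle next
}
```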

Thinking at the low levels of the control interface

Before we can build a high level USB stack, we need to consider the activity
which might occur at the lower levels. At the low levels, particularly of
the device control interface, work has to be done at each and every packet.
The hardware likely deals with the token packet for us, leaving the data
packets for us to process, and the resultant handshake packets will be likely
handled by the hardware in response to our processing the data packets.

Since every control transaction is initiated by a setup token, let's look at
the setup requests which can come our way...

Setup Packet (Data) Format

| Field Name | Byte start | Byte length | Encoding | Meaning |
| --- | --- | --- | --- | --- |
| bmRequestType | 0 | 1 | Bitmap | Describes the kind of request, and the target of it. See below. |
| bRequest | 1 | 1 | Code | The request code itself; the meanings of the rest of the fields vary by bRequest |
| wValue | 2 | 2 | Number | A 16 bit value whose meaning varies by request type |
| wIndex | 4 | 2 | Number | A 16 bit value whose meaning varies by request type but typically encodes an interface number or endpoint. |
| wLength | 6 | 2 | Number | A 16 bit value indicating the length of the transfer to come. |
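In code, those eight bytes decode naturally into a small structure. A minimal Rust sketch (multi-byte fields little-endian, as noted above):

```rust
/// The 8-byte data payload which follows a Setup token (see table above).
struct SetupPacket {
    bm_request_type: u8,
    b_request: u8,
    w_value: u16,
    w_index: u16,
    w_length: u16,
}

impl SetupPacket {
    /// Decode from the raw bytes; multi-byte fields are little-endian.
    fn parse(raw: &[u8; 8]) -> SetupPacket {
        SetupPacket {
            bm_request_type: raw[0],
            b_request: raw[1],
            w_value: u16::from_le_bytes([raw[2], raw[3]]),
            w_index: u16::from_le_bytes([raw[4], raw[5]]),
            w_length: u16::from_le_bytes([raw[6], raw[7]]),
        }
    }
}
```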

Since bRequest is essentially a switch against which multiple kinds of setup
packet are selected between, here are the meanings of a few...

GET_DESCRIPTOR (Device) setup packet

| Field Name | Value | Meaning |
| --- | --- | --- |
| bmRequestType | 0x80 | Data direction is IN (from device to host), recipient is the device |
| bRequest | 0x06 | GET_DESCRIPTOR (in this instance, the device descriptor is requested) |
| wValue | 0x0100 | This means the device descriptor (descriptor type 0x01 in the high byte, index 0x00 in the low byte) |
| wIndex | 0x0000 | Irrelevant, there's only 1 device descriptor anyway |
| wLength | 18 | This is the length of a device descriptor (18 bytes) |
SET_ADDRESS to set a device's USB address

| Field Name | Value | Meaning |
| --- | --- | --- |
| bmRequestType | 0x00 | Data direction is OUT (from host to device), recipient is the device |
| bRequest | 0x05 | SET_ADDRESS (Set the device's USB address) |
| wValue | 0x00nn | The address for the device to adopt (max 127) |
| wIndex | 0x0000 | Irrelevant for address setting |
| wLength | 0 | There's no data transfer expected for this setup operation |

Most hardware blocks will implement an interrupt at the point that the Data
packet following the Setup packet has been received. This is typically called
receiving a 'Setup' packet, and then it's up to the low levels of the device
stack to determine what to do and dispatch a handler. Otherwise an interrupt
will fire for the IN or OUT tokens and, if the endpoint is zero, the low
level stack will handle it once more.
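In outline, that dispatch looks something like the following sketch (request codes are from the USB spec; SetupPacket is the structure sketched earlier):

```rust
// Request codes defined by the USB specification
const SET_ADDRESS: u8 = 0x05;
const GET_DESCRIPTOR: u8 = 0x06;

fn handle_setup(setup: &SetupPacket) {
    match setup.b_request {
        GET_DESCRIPTOR => {
            // Prepare the requested descriptor for transmission on IN0
        }
        SET_ADDRESS => {
            // Remember the new address, but only apply it once the
            // zero-length status transaction completes (see below)
        }
        _ => {
            // Unknown or unsupported request: STALL the control endpoint
        }
    }
}
```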

One final thing worth noting about SET_ADDRESS is that it doesn't take
effect until the completion of the zero-length "status" transaction following
the setup transaction. As such, the status request from the host will still
be sent to address zero (the default for new devices).

A very basic early "packet trace"

This is an example, and is not guaranteed to be the packet sequence in all
cases. It's a good indication of the relative complexity involved in getting
a fresh USB device onto the bus though...

When a device first attaches to the bus, the bus is in RESET state and so
the first event a device sees is a RESET which causes it to set its address
to zero, clear any endpoints, clear the configuration, and become ready for
control transfers. Shortly after this, the device will become suspended.

Next, the host kicks in and sends a port reset of around 30ms. After this, the
host is ready to interrogate the device.

The host sends a GET_DESCRIPTOR to the device, whose address at this point is
zero. Using the information it receives from this, it can set up the host-side
memory buffers since the device descriptor contains the maximum transfer size
which the device supports.

The host is now ready to actually 'address' the device, and so it sends another
reset to the device, again around 30ms in length.

The host sends a SET_ADDRESS control request to the device, telling it that
its new address is nn. Once the acknowledgement has been sent from the host
for the zero-data status update from the device, the device sets its internal
address to the value supplied in the request. From now on, the device shall
respond only to requests to nn rather than to zero.

At this point, the host will begin interrogating further descriptors, looking
at the configuration descriptors and the strings, to build its host-side
representation of the device. These will be GET_DESCRIPTOR and
GET_STRING_DESCRIPTOR requests and may continue for some time.

Once the host has satisfied itself that it knows everything it needs to about
the device, it will issue a SET_CONFIGURATION request which basically starts
everything up in the device. Once the configuration is set, interrupt
endpoints will be polled, bulk traffic will be transferred, Isochronous
streams begin to run, etc.

Okay, but how do we make this concrete?

So far, everything we've spoken about has been fairly abstract, or at least
"soft". But to transfer data over USB does require some hardware. (Okay,
okay, we could do it all virtualised, but there's no fun in that). The
hardware I'm going to be using for the duration of this series is the STM32
on the blue-pill development board. This is a very simple development
board which does (in theory at least) support USB device mode.

If we view the schematic for the blue-pill, we can see a very "lightweight" USB
interface which has a pullup resistor for D+. This is the way that a device
signals to the host that it is present, and that it wants to speak at
full-speed. If the pullup were on D- then it would be a low-speed device.
High speed devices need a little more complexity which I'm not going to go into
for today.

The USB lines connect to pins PA11 and PA12 which are the USB pins on the
STM32 on the board. Since USB is quite finicky, the STM32 doesn't let you
remap that function elsewhere, so this is all looking quite good for us so far.

The specific STM32 on the blue-pill is the STM32F103C8T6. By viewing its
product page on ST's website we can find the
reference manual for the part. Jumping to section 23 we learn that
this STM32 supports full-speed USB2.0 which is convenient given the past
article and a half. We also learn it supports up to eight endpoints active at
any one time, and offers double-buffering for our bulk and isochronous
transfers. It has some internal memory for packet buffering, so it won't use
our RAM bandwidth while performing transfers, which is lovely.

I'm not going to distill the rest of that section here, because there's a large
amount of data which explains how the USB macrocell operates. However, useful
things to note are:

How IN, OUT, and SETUP transfers work.

How the endpoint buffer memory is configured.

That all bus-powered devices MUST respond to suspend/resume properly.

That the hardware will prioritise endpoint interrupts for us so that
we only need to deal with the most pressing item at any given time.

There is an 'Enable Function' bit in the address register which must be
set or we won't see any transactions at all.

How the endpoint registers signal events to the device firmware.

Next time, we're going to begin the process of writing a very hacky setup
routine to try and initialise the USB device macrocell so that we can see
incoming transactions through the ITM. It should be quite exciting,
but given how complex this will be for me to learn, it might be a little
while before it comes through.

I have been spending time with Jorge Aparicio's
RTFM for Cortex M3 framework for writing Rust to target Cortex-M3
devices from Arm (and particularly the STM32F103 from ST Microelectronics).
Jorge's work in this area has been of interest to me ever since I discovered
him working on this stuff a while ago. I am very tempted by the idea of being
able to implement code for the STM32 with the guarantees of Rust and the
language features which I have come to love such as the trait system.

I have been thinking to myself that, while I admire and appreciate the work
done on the GNUK, I would like to, personally, have a go at implementing
some kind of security token on an STM32 as a USB device. And with the advent
of the RTFM for M3 work, and Jorge's magical tooling to make it easier to
access and control the registers on an M3 microcontroller, I figured it'd be
super-nice to do this in Rust, with all the advantages that entails in terms
of isolating unsafe behaviour and generally having the potential to be more
easily verified as not misbehaving.

To do this though, means that I need a USB device stack which will work in the
RTFM framework. Sadly it seems that, thus far, only Jorge has been working on
drivers for any of the M3 devices his framework supports. And one person can
only do so much. So, in my infinite madness, I decided I should investigate
the complexity of writing a USB device stack in Rust for the RTFM/M3 framework.
(Why I thought this was a good idea is lost to the mists of late night
Googling, but hey, it might make a good talk at the next conference I go to).
As such, this blog post, and further ones along these lines, will serve as a
partial tour of what I'm up to, and a partial aide-memoir for me about learning
USB. If I get something horribly wrong, please DO contact me to correct
me, otherwise I'll just continue to be wrong. If I've simplified something but
it's still strictly correct, just let me know if it's an oversimplification
since in a lot of cases there's no point in me putting the full details into a
blog posting. I will mostly be considering USB2.0 protocol details but only
really for low and full speed devices. (The hardware I'm targeting does
low-speed and full-speed, but not high-speed. Though some similar HW does
high-speed too, I don't have any to hand right now)

A brief introduction to USB

In order to go much further, I needed a grounding in USB. It's a multi-layer
protocol as you might expect, though we can probably ignore the actual
electrical layer since any device we might hope to support will have to have a
hardware block to deal with that. We will however need to consider the packet
layer (since that will inform how the hardware block is implemented and thus
its interface) and then the higher level protocols on top.

USB is a deliberately asymmetric protocol. Devices are meant to be
significantly easier to implement, both in terms of hardware and software, as
compared with hosts. As such, despite some STM32s having OTG ports, I have no
intention of supporting host mode at this time.

USB is arranged into a set of busses which are, at least in the USB1.1 case,
broadcast domains. As such, each device has an address assigned to it by the
host during an early phase called 'configuration'. Once the address is
assigned, the device is expected to only ever respond to messages addressed to
it. Note that since everything is asymmetric in USB, the device can't send
messages on its own, but has to be asked for them by the host, and as such
the addressing is always from host toward device.

USB devices then expose a number of endpoints through which communication can
flow IN to the host or OUT to the device. Endpoints are not bidirectional,
but the in and out endpoints do overlap in numbering. There is a special pair
of endpoints, IN0 and OUT0 which, between them, form what I will call the
device control endpoints. The device control endpoints are important since
every USB device MUST implement them, and there are a number of well
defined messages which pass over them to control the USB device. In theory a
bare minimum USB device would implement only the device control endpoints.

Configurations, and Classes, and Interfaces, Oh My!

In order for the host to understand what the USB device is, and what it is
capable of, part of the device control endpoints' responsibility is to provide
a set of descriptors which describe the device. These descriptors form a
hierarchy and are then glommed together into a big lump of data which the host
can download from the device in order to decide what it is and how to use it.
Because of various historical reasons, where a multi-byte value is used, they
are defined to be little-endian, though there are some BCD fields.
Descriptors always start with a length byte and a type byte because that way
the host can parse/skip as necessary, with ease.

The first descriptor, the device descriptor, is a big one, and looks like
this:
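Sketched here as a packed Rust struct (field names and sizes are those defined by the USB 2.0 spec):

```rust
/// The standard 18-byte USB device descriptor.
#[repr(C, packed)]
struct DeviceDescriptor {
    b_length: u8,             // Size of this descriptor in bytes (18)
    b_descriptor_type: u8,    // Device descriptor type (0x01)
    bcd_usb: u16,             // USB spec version in BCD (e.g. 0x0200)
    b_device_class: u8,       // Class code (0 = specified per-interface)
    b_device_sub_class: u8,   // Subclass code
    b_device_protocol: u8,    // Protocol code
    b_max_packet_size0: u8,   // Maximum packet size for endpoint zero
    id_vendor: u16,           // Vendor ID
    id_product: u16,          // Product ID
    bcd_device: u16,          // Device release number in BCD
    i_manufacturer: u8,       // Index of the manufacturer string
    i_product: u8,            // Index of the product string
    i_serial_number: u8,      // Index of the serial number string
    b_num_configurations: u8, // Number of possible configurations
}
```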

This looks quite complex, but breaks down into two relatively simple halves.
The first eight bytes carry everything necessary for the host to be able to
configure itself and the device control endpoints properly in order to
communicate effectively. Since eight bytes is the bare minimum a device must
be able to transmit in one go, the host can guarantee to get those, and they
tell it what kind of device it is, what USB protocol it supports, and what the
maximum transfer size is for its device control endpoints.

The encoding of the bcdUSB and bcdDevice fields is interesting too. It is
of the form 0xMMmm where MM is the major number, mm the minor. So USB2.0
is encoded as 0x0200, USB1.1 as 0x0110 etc. If the device version is 17.36
then that'd be 0x1736.

Other fields of note are bDeviceClass which can be 0 meaning that
interfaces will specify their classes, and idVendor/idProduct which between
them form the primary way for the specific USB device to be identified. The
Index fields are indices into a string table which we'll look at later. For
now it's enough to know that wherever a string index is needed, 0 can be
provided to mean "no string here".

The last field is bNumConfigurations and this indicates the number of ways in
which this device might function. A USB device can provide any number of these
configurations, though typically only one is provided. If the host wishes to
switch between configurations then it will have to effectively entirely quiesce
and reset the device.

The next kind of descriptor is the configuration descriptor. This one is much
shorter, but starts with the same two fields:

Configuration Descriptor

| Field Name | Byte start | Byte length | Encoding | Meaning |
| --- | --- | --- | --- | --- |
| bLength | 0 | 1 | Number | Size of the descriptor in bytes (9) |
| bDescriptorType | 1 | 1 | Constant | Configuration Descriptor (0x02) |
| wTotalLength | 2 | 2 | Number | Size of the configuration in bytes, in total |
| bNumInterfaces | 4 | 1 | Number | The number of interfaces in this configuration |
| bConfigurationValue | 5 | 1 | Number | The value to use to select this configuration |
| iConfiguration | 6 | 1 | Index | The name of this configuration (0 for unavailable) |
| bmAttributes | 7 | 1 | Bitmap | Attributes field (see below) |
| bMaxPower | 8 | 1 | Number | Maximum bus power this configuration will draw (in 2mA increments) |

An important field to consider here is the bmAttributes field which tells the
host some useful information. Bit 7 must be set, bit 6 is set if the device
would be self-powered in this configuration, bit 5 indicates that the device
would like to be able to wake the host from sleep mode, and bits 4 to 0 must be
unset.

The bMaxPower field is interesting because it encodes the power draw of the
device (when set to this configuration). USB allows for up to 100mA of draw
per device when it isn't yet configured, and up to 500mA when configured. The
value may be used to decide if it's sensible to configure a device if the host
is in a low power situation. Typically this field will be set to 50 to
indicate the nominal 100mA is fine, or 250 to request the full 500mA.
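Both encodings are easy to get wrong, so here is a tiny sketch of each:

```rust
/// Compose the configuration's bmAttributes bitmap described above.
fn bm_attributes(self_powered: bool, remote_wakeup: bool) -> u8 {
    let mut bits = 0x80; // bit 7 must always be set
    if self_powered { bits |= 0x40; }
    if remote_wakeup { bits |= 0x20; }
    bits
}

/// bMaxPower is in 2mA units: 100mA becomes 50, 500mA becomes 250.
fn b_max_power(milliamps: u16) -> u8 {
    (milliamps / 2) as u8
}
```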

Finally, the wTotalLength field is interesting because it tells the host the
total length of this configuration, including all the interface and endpoint
descriptors which make it up. With this field, the host can allocate enough
RAM to fetch the entire configuration descriptor block at once, simplifying
matters dramatically for it.

Each configuration has one or more interfaces. The interfaces group some
endpoints together into a logical function. For example a configuration for
a multifunction scanner/fax/printer might have an interface for the scanner
function, one for the fax, and one for the printer. Endpoints are not shared
among interfaces, so when building this table, be careful.

The important values here are the class/subclass/protocol fields which provide
a lot of information to the host about what the interface is. If the class is
a USB Org defined one (e.g. 0x02 for Communications Device Class) then the host
may already have drivers designed to work with the interface meaning that the
device manufacturer doesn't have to provide host drivers.

The bInterfaceNumber is used by the host to indicate this interface when
sending messages, and the bAlternateSetting is a way to vary interfaces. Two
interfaces with the same bInterfaceNumber but different bAlternateSettings
can be switched between (like configurations, but without resetting the
device).

The bEndpointAddress is a 4 bit endpoint number (so there're 16 endpoint
indices) and a bit to indicate IN vs. OUT. Bit 7 is the direction marker
and bits 3 to 0 are the endpoint number. This means there are 32 endpoints in
total, 16 in each direction, 2 of which are reserved (IN0 and OUT0) giving
30 endpoints available for interfaces to use in any given configuration. The
bmAttributes bitmap covers the transfer type of the endpoint (more below), and
the bInterval is an interval measured in frames (1ms for low or full speed,
125µs in high speed). bInterval is only valid for some endpoint types.
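The address encoding is compact enough to capture in a one-line helper:

```rust
/// Compose bEndpointAddress: bit 7 is the direction (1 = IN),
/// bits 3 to 0 are the endpoint number.
fn endpoint_address(number: u8, is_in: bool) -> u8 {
    let dir = if is_in { 0x80 } else { 0x00 };
    (number & 0x0f) | dir
}
```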

The final descriptor kind is for the strings which we've seen indices for
throughout the above. String descriptors have two forms:

This form (for descriptor 0) is that of a series of language IDs supported by
the device. The device may support any number of languages. When the host
requests a string descriptor, it will supply both the index of the string and
also the language id it desires (from the list available in string descriptor
zero). The host can tell how many language IDs are available simply by
dividing bLength by 2 and subtracting 1 for the two header bytes.

This second form of the string descriptor is simply the string itself, in
what the USB spec calls 'Unicode' format which is, as of 2005, defined to be
UTF-16LE without a BOM or terminator.

Since string descriptors are of a variable length, the host must request
strings in two transactions. First a request for 2 bytes is sent, retrieving
the bLength and bDescriptorType fields which can be checked and memory
allocated. Then a request for bLength bytes can be sent to retrieve the
entire string descriptor.
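The host side of that two-step fetch looks roughly like this sketch, where control_read is a hypothetical stand-in for whatever GET_DESCRIPTOR transfer primitive the host stack provides:

```rust
// Fetch string descriptor `index` in language `lang_id`.
// `control_read` is a hypothetical transfer primitive, not a real API.
fn fetch_string(index: u8, lang_id: u16) -> String {
    let mut header = [0u8; 2];
    control_read(index, lang_id, &mut header[..]); // bLength and bDescriptorType
    let mut full = vec![0u8; header[0] as usize];
    control_read(index, lang_id, &mut full[..]);   // now fetch the whole thing
    // Skip the two header bytes; the rest is UTF-16LE code units
    let units: Vec<u16> = full[2..]
        .chunks_exact(2)
        .map(|c| u16::from_le_bytes([c[0], c[1]]))
        .collect();
    String::from_utf16_lossy(&units)
}
```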

Putting that all together

Phew, this is getting to be quite a long posting, so I'm going to leave this
here and in my next post I'll talk about how the host and device pass packets
to get all that information to the host, and how it gets used.

Today marks the release of Gitano 1.1. Richard(s) and I have spent quite
a lot of time and effort on this release, and there's plenty of good stuff
in it. We also released new versions of Lace, Supple, Luxio, and
Gall to go alongside it, with bugfixes and improvements.

At this point, I intend to take a short break from Gitano to investigate some
Rust-on-STM32 stuff, and then perhaps do some NetSurf work too.

I decided to perform some slightly more realistic benchmarks against lvmcache.

The problem with the initial benchmark was that it only covered 4GiB of data with a 4GiB cache device. Naturally once lvmcache was working correctly its performance was awesome – the entire dataset was in the cache. But clearly if you have enough fast block device available to fit all your data then you don’t need to cache it at all and may as well just use the fast device directly.

I decided to perform some fio tests with varying data sizes, some of which were larger than the cache device.

Test methodology

Once again I used a Zipf distribution with a factor of 1.2, which should have caused about 90% of the hits to come from just 10% of the data. I kept the cache device at 4GiB but varied the data size. The following data sizes were tested:

1GiB

2GiB

4GiB

8GiB

16GiB

32GiB

48GiB

With the 48GiB test I expected to see lvmcache struggling, as the hot 10% (~4.8GiB) would no longer fit within the 4GiB cache device.

…the only difference being that several different job files were used each with a different size= directive. Note that as there are two jobs, the size= is half the desired total data size: each job lays out a data file of the specified size.
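A job file along these lines was the shape of it (a sketch, not the exact file used; shown here with the size= values for the 4GiB total):

```
[global]
ioengine=libaio
direct=1
rw=randread
bs=4k
random_distribution=zipf:1.2

[g1]
size=2g

[g2]
size=2g
```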

For each data size I took care to fill the cache with data first before doing a test run, as unreproducible performance is still seen against a completely empty cache device. This produced IOPS logs and a completion latency histogram. Tests were also run against SSD and HDD to provide baseline figures.

Results
IOPS graphs
All-in-one

Immediately we can see that for data sizes 4GiB and below performance converges quite quickly to near-SSD levels. That is very much what we would expect when the cache device is 4GiB, so big enough to completely cache everything.

Let’s just have a look at the lower-performing configurations.

Low-end performers

For 8, 16 and 32GiB data sizes performance clearly gets progressively worse, but it is still much better than baseline HDD. The 10% of hot data still fits within the cache device, so plenty of acceleration is still happening.

For the 48GiB data size it is a little bit of a different story. Performance is still better (on average) than baseline HDD, but there are periodic dips back down to roughly HDD figures. This is because not all of the 10% hot data fits into the cache device any more. Cache misses cause reads from HDD and consequently end up with HDD levels of performance for those reads.

The results no longer look quite so impressive, with even the 8GiB data set achieving only a few thousand IOPS on average. Are things as bad as they seem? Well no, I don’t think they are, and to see why we will have to look at the completion latency histograms.

Completion latency histograms

The above graphs are generated by fitting a Bezier curve to a scatter of data points each of which represents a 500ms average of IOPS achieved. The problem there is the word average.

It’s important to understand what effect averaging the figures gives. We’ve already seen that HDDs are really slow. Even if only a few percent of IOs end up missing cache and going to HDD, the massive latency of those requests will pull the average for the whole 500ms window way down.

Presumably we have a cache because we suspect we have hot spots of data, and we’ve been trying to evaluate that by doing most of the reads from only 10% of the data. Do we care what the average performance is then? Well it’s a useful metric but it’s not going to say much about the performance of reads from the hot data.

The histogram of completion latencies can be more useful. This shows how long it took between issuing the IO and completing the read for a certain percentage of issued IOs. Below I have focused on the 50% to 99% latency buckets, with the times for each bucket averaged between the two jobs. In the interests of being able to see anything at all I’ve had to double the height of the graph and still cut off the y axis for the three worst performers.

A couple of observations:

Somewhere between 70% and 80% of IOs complete with a latency that’s so close to SSD performance as to be near-indistinguishable, no matter what the data size. So what I think I am proving is that:

you can cache a 48GiB slow backing device with 4GiB of fast SSD and if you have 10% hot data then you can expect it to be served up at near-SSD latencies 70%–80% of the time. If your hot spots are larger (not so hot) then you won’t achieve that. If your fast device is larger than 1/12th the backing device then you should do better than 70%–80%.

If the cache were perfect then we should expect the 90th percentile to be near SSD performance even for the 32GiB data set, as the 10% hot spot of ~3.2GiB fits inside the 4GiB cache. For whatever reason this is not achieved, but for that data size the 90th percentile latency is still about half that of HDD.

When the backing device is many times larger (32GiB+) than the cache device, the 99th percentile latencies can be slightly worse than for baseline HDD.

I hesitate to suggest there is a problem here as there are going to be very few samples in the top 1%, so it could just be showing close to HDD performance.

Conclusion

Assuming you are okay with using a 4.12.x kernel, and assuming you are already comfortable using LVM, then at the moment it looks fairly harmless to deploy lvmcache.

Getting a decent performance boost out of it though will require you to check that your data really does have hot spots and size your cache appropriately.

Measuring your existing workload with something like blktrace is probably advisable, and these days you can feed the output of blktrace back into fio to see what performance might be like in a different configuration.

Full test output

You probably want to stop reading here unless the complete output of all the fio runs is of interest to you.

In other news a while back I slipped in a casual note about having a brain scan done, here in sunny Helsinki.

One of the cool things about that experience, in addition to being told I wasn't going to drop dead that particular day, was that the radiologist told me that I could pay €25 to get a copy of my brain data in DICOM format.

I've not yet played with this very much, but I couldn't resist a brief animation:

So, “real people” don’t care about privacy? All they really want is ease of use and a pretty GUI so that they can chat to all their friends on-line? Only “the enemy” (who is that exactly anyway?) needs encryption? Excuse me for asking, but what have you been smoking? Does the Home Office know about that?

I’m a real person. And I care deeply about privacy. I care enough to fund both my own Tor node and various openVPN servers dotted around the world just to get past your ludicrous attempts at gratuitous surveillance of my (and my family’s) routine use of the ‘net. I care about the security and privacy of my transactions with various commercial enterprises, including my bank (which is why I expect them to use TLS on their website). I care about privacy when I correspond with my Doctor and other professionals. I care about privacy when I use an on-line search engine (which, incidentally, is not Google). I care about privacy because privacy matters. I have the right to freedom of thought and expression. I have the right to discuss those thoughts with others of my choice – when I choose and how I choose. You may not like that, but it’s a fact of life. That doesn’t make me “the enemy”. Get over it.

Every month or two keyring-maint gets a comment about how a key update we say we’ve performed hasn’t actually made it to the active keyring, or a query about why the keyring is so out of date, or is told that although a key has been sent to the HKP interface, and that interface is showing the update as received, it isn’t working when trying to upload to the Debian archive. It’s frustrating to have to deal with these queries, but the confusion is understandable. There are multiple public interfaces to the Debian keyrings and they’re not all equal. This post attempts to explain the interactions between them, and how I go about working with them as part of the keyring-maint team.

First, a diagram to show the different interfaces to the keyring and how they connect to each other:

Public interfaces
rsync: keyring.debian.org::keyrings

This is the most important public interface; it’s the one that the Debian infrastructure uses. It’s the canonical location of the active set of Debian keyrings and is what you should be using if you want the most up to date copy. The validity of the keyrings can be checked using the included sha512sums.txt file, which will be signed by whoever in keyring-maint did the last keyring update.

HKP interface: hkp://keyring.debian.org/

What you talk to with gpg --keyserver keyring.debian.org. Serves out the current keyrings, and accepts updates to any key it already knows about (allowing, for example, expiry updates, new subkeys + uids or new signatures without the need to file a ticket in RT or otherwise explicitly request it). Updates sent to this interface will be available via it within a few hours, but must be manually folded into the active keyring. This in general happens about once a month when preparing for a general update of the keyring; for example b490c1d5f075951e80b22641b2a133c725adaab8.

Why not do this automatically? Even though the site uses GnuPG to verify incoming updates there are still occasions we’ve seen bugs (such as #787046, where GnuPG would always import subkeys it didn’t understand, even when that subkey was already present). Also we don’t want to allow just any UID to be part of the keyring. It is thus useful to retain a final set of human based sanity checking for any update before it becomes part of the keyring proper.

Git mirror

A public mirror of the git repository the keyring-maint team use to maintain the keyring. Every action is recorded here, and in general each commit should be a single action (such as adding a new key, doing a key replacement or moving a key between keyrings). Note that pulling in the updates sent via HKP counts as a single action, rather than having a commit per key updated. This mirror is updated whenever a new keyring is made active (i.e. made available via the rsync interface). Until that point pending changes are kept private; we sometimes deal with information such as the fact someone has potentially had a key compromised that we don’t want to be public until we’ve actually disabled it. Every “keyring push” (as we refer to the process of making a new keyring active) is tagged with the date it was performed. Releases are also tagged with their codenames, to make it easy to do comparisons over time.

The debian-keyring package

This is actually the least important public interface to the keyring, at least from the perspective of the keyring-maint team. No infrastructure makes use of it and while it’s mostly updated when a new keyring is made active we only make a concerted effort to do so when it is coming up to release. It’s provided as a convenience package rather than something which should be utilised for active verification of which keys are and aren’t currently part of the keyring.

Master git repository

The master git repository for keyring maintenance is stored on kaufmann.debian.org AKA keyring.debian.org. This system is centrally managed by DSA, with only DSA and keyring-maint having login rights to it. None of the actual maintenance work takes place here; it is a bare repo providing a central point for the members of keyring-maint to collaborate around.

Private interface
Private working clone

This is where all of the actual keyring work happens. I have a local clone of the repository from kaufmann on a personal machine. The key additions / changes I perform all happen here, and are then pushed to the master repository so that they’re visible to the rest of the team. When preparing to make a new keyring active the changes that have been sent to the HKP interface are copied from kaufmann via scp and folded in using the pull-updates script. The tree is assembled into keyrings with a simple make and some sanity tests performed using make test. If these are successful the sha512sums.txt file is signed using gpg --clearsign and the output copied over to kaufmann. update-keyrings is then called to update the active keyrings (both rsync + HKP). A git push public pushes the changes to the public repository on anonscm. Finally gbp buildpackage --git-builder='sbuild -d sid' tells git-buildpackage to use sbuild to build a package ready to be uploaded to the archive.

Hopefully that helps explain the different stages and outputs of keyring maintenance; I’m aware that it would be a good idea for this to exist somewhere on keyring.debian.org as well and will look at doing so.

Once again, my focus was on Gitano, which we're working toward a 1.1 for.
We had another one of our Gitano developer days which
was attended by Richard Maw and myself. You are invited to read the wiki page
but a summary of what happened, which directly involved me, is:

Whilst anyone can inspect the source code of free software for malicious flaws, most software is distributed pre-compiled to end users.

The motivation behind the Reproducible Builds effort is to permit verification that no flaws have been introduced — either maliciously or accidentally — during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised.

4:4.0.0-2 — Make /usr/bin/redis-server in the primary package a symlink to /usr/bin/redis-check-rdb in the redis-tools package to prevent duplicate debug symbols that result in a package file collision. (#868551)

4:4.0.0-3 — Add -latomic to LDFLAGS to avoid a FTBFS on the mips & mipsel architectures.

Here’s a graph showing the IOPS over time for baseline SSD and lvmcache with a full cache under several different kernel versions. As in previous articles, the lines are actually Bezier curves fitted to the data which is scattered all over the place from 500ms averages.

file and committed on 2017-05-14. By the time we reach commit 6cf4cc8f8b3b we have the long-term good performance that we were looking for.

The first of Joe Thornber’s commits on that day in the dm-cache area was 072792dcdfc8 and stepping back to the commit immediately prior to that one (2ea659a9ef48) we get a kernel representing the moment that Linus designated the v4.12-rc1 tag. Joe’s commits went into -rc1, and without them the performance of lvmcache under these test conditions isn’t much better than baseline HDD.

It seems like some of Joe’s changes helped a lot and then the last one really provided the long term performance.

git bisect procedure

Normally when you do a git bisect you’re starting with something that works and you’re looking for the commit that introduced a bug. In this case I was starting off with a known-good state and was interested in which commit(s) got me there. The normal bisect key words of “good” and “bad” in this case would be backwards to what I wanted. Dominic gave me the tip that I could alias the terms in order to reduce my confusion:

$ git bisect start --term-old broken --term-new fixed

From here on, when I encountered a test run that produced poor results I would issue:

$ git bisect broken

and when I had a test run with good results I would issue:

$ git bisect fixed

As I knew that the tag v4.13-rc1 produced a good run and v4.11 was bad, I could start off with:

The only difference from the other articles was that the run time was reduced to 15 minutes as all of the interesting behaviour happened within the first 11 minutes.

To recap, this fio job specification lays out two 2GiB files of random data and then starts two processes that perform 4kiB-sized reads against the files. Direct IO is used, in order to bypass the page cache.

A Zipfian distribution with a factor of 1.2 is used; this gives a 90/10 split where about 90% of the reads should come from about 10% of the data. The purpose of this is to simulate the hot spots of popular data that occur in real life. If the access pattern were to be perfectly and uniformly random then caching would not be effective.

In previous tests we had observed that dramatically different performance would be seen on the first run against an empty cache device compared to all other subsequent runs against what would be a full cache device. In the tests using kernels with the fix present the IOPS achieved would converge towards baseline SSD performance, whereas in kernels without the fix the performance would remain down near the level of baseline HDD. Therefore the fio tests were carried out twice.

Where to next?

I think I am going to see what happens when the cache device is pretty small in comparison to the working data.

All of the tests so far have used a 4GiB cache with 4GiB of data, so if everything got promoted it would entirely fit in cache. Not only that but the Zipf distribution makes most of the hits come from 10% of the data, so it’s actually just ~400MiB of hot data. I think it would be interesting to see what happens when the hot 10% is bigger than the cache device.

git bisect progress and test output

Unless you are particularly interested in the fio output and why I considered each one to be either fixed or broken, you probably want to stop reading now.

4.12.0-rc1+

I forgot to record the commit revision for this test so all I know is that it’s from somewhere in the first v4.12 release candidate.

The second job against an empty cache shows poor performance (8,210 average IOPS), and both jobs against a full cache show poor performance (2,141 and 730 average IOPS), so this was marked broken.

The first job running against a full cache managed 17,398 average IOPS which isn’t terrible, but the second job only managed 694 average IOPS. Although this is a lot better than the worst performance seen, it is far below the best so this must be considered broken.

A couple of the release candidates with some of Joe’s fixes in had started to produce approaching decent performance so I wanted to go back to the first tagged kernel for this release candidate to get an overview of where Joe had been starting from.

2,069 and 689 average IOPS against a full cache demonstrate the poor performance.

Lintian is a static analysis tool for Debian packages, reporting on various errors, omissions and quality-assurance issues to the maintainer.

I seem to have found myself hacking on it a bit more recently (see my previous installment). In particular, here's the code of mine — which made for a total of 20 bugs closed — that made it into the recent 2.5.52 release:

New tags

Check for the presence of an .asc signature in a .changes file if an upstream signing key is present. (#833585, tag)

Warn when dpkg-statoverride --add is called without a corresponding --list. (#652963, tag)

Check for years in debian/copyright that are later than the top entry in debian/changelog. (#807461, tag)

Trigger a warning when DEB_BUILD_OPTIONS is used instead of DEB_BUILD_MAINT_OPTIONS. (#833691, tag)

Look for "FIXME" and similar placeholders in various files in the debian directory. (#846009, tag)

In the past there used to be a puppet-labs project called puppet-dashboard, which would let you see the state of your managed-nodes. Having even a very basic and simple "report user-interface" is pretty neat when you're pushing out a change, and you want to see it be applied across your fleet of hosts.

There are some other neat features, such as allowing you to identify failures easily, and see nodes that haven't reported-in recently.

This was spun out into a community-supported project which is largely stale:

Inside each directory is a bunch of YAML files which describe the state of the host, and the recipes that were applied. Parsing those is pretty simple, the hardest part would be making a useful/attractive GUI. But happily we have the existing one to "inspire" us.

I think I just need to write down a list of assumptions and see if they make sense. After all the existing installation(s) won't break, it's just a matter of deciding whether it is useful/worthwhile way to spend some time.

Assume you have 100+ hosts running puppet 4.x

Assume you want a broad overview:

All the nodes you're managing.

Whether their last run triggered a change, resulted in an error, or logged anything.

If so what changed/failed/was output?

For each individual run you want to see:

Rough overview.

Assume you don't want to keep history indefinitely, just the last 50 runs or so of each host.

Beyond that you might want to export data about the managed-nodes themselves. For example you might want a list of all the hosts which have "bash" installed on them. Or "All nodes with local user "steve"." I've written that stuff already, as it is very useful for auditing & etc.

The hard part about that is that to get the extra data you'll need to include a puppet module to collect it. I suspect a new dashboard would be broadly interesting/useful but unless you have that extra detail it might not be so useful. You can't point to a slightly more modern installation and say "Yes this is worth migrating to". But if you have extra meta-data you can say:

Give me a list of all hosts running wheezy.

Give me a list of all hosts running exim4 version 4.84.2-2+deb8u4.

And that facility is very useful when you have shellshock, or similar knocking at your door.

Anyway as a hacky start I wrote some code to parse reports, avoiding the magic object-fu that the YAML would usually invoke. The end result is this: