Timers

Timers in Go simply run something after a period of time. They live in the
standard package time.
In particular, the timers are time.Timer,
time.Ticker, and the less obvious timer,
time.Sleep.
It’s not clear from the documentation how exactly timers work. Some people think
that each timer spawns its own goroutine which exists until the timer’s deadline
is reached, because that’s how we’d implement timers in a “naive” way in Go. We can
check that assumption with a small program:
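Below is a minimal reconstruction of such a program (the timer count and durations are arbitrary; run it with GOTRACEBACK=system so the panic trace includes runtime goroutines):

package main

import (
	"os"
	"time"
)

func main() {
	// Without arguments: crash before any timer is created.
	if len(os.Args) == 1 {
		panic("before timers")
	}
	// With an argument: create a bunch of timers first, then crash.
	for i := 0; i < 100; i++ {
		time.AfterFunc(time.Duration(i)*time.Second, func() {})
	}
	panic("after timers")
}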

It prints all goroutine traces before the timers are spawned if run without arguments,
and after the timers are spawned if any argument is passed. We need those shady panics
because otherwise there is no easy way to see runtime goroutines - they’re excluded
from runtime.NumGoroutine and runtime.Stack, so the only way to see them
is to crash (refer to golang/go#9791
for the reasons). Let’s see how many goroutines Go spawns before spawning any timers:

runtime.timer

All timers are based on the same data structure -
runtime.timer.
To add a new timer, you need to instantiate runtime.timer and pass it to the function
runtime.startTimer.
Here is an example from the time package:
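(The snippet below is a sketch of NewTimer, close to time/sleep.go at the time of writing; names of the unexported pieces may differ between versions.)

func NewTimer(d Duration) *Timer {
	c := make(chan Time, 1)
	t := &Timer{
		C: c,
		r: runtimeTimer{
			when: when(d),  // absolute deadline in runtime nanoseconds
			f:    sendTime, // what to run when the deadline is reached
			arg:  c,        // sendTime sends the current time into c
		},
	}
	startTimer(&t.r)
	return t
}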

Note that each new timer takes at least 40 bytes of memory. A large number of timers
can significantly increase the memory footprint of your program.

So, now we understand what timers look like in the runtime and what they are
supposed to do. Let’s see how the runtime stores timers and calls their functions when
it’s time to call them.

runtime.timers

runtime.timers
is just a heap data structure.
A heap is very useful when you want to repeatedly find the extremum (minimum or maximum) among
some elements. In our case the extremum is the timer whose when is closest to the current
time. Very convenient, isn’t it? So, let’s see the worst-case algorithmic complexity of
the operations with timers:

adding a new timer - O(log(n))

deleting a timer - O(log(n))

spawning a timer’s function - O(log(n))

So, if you have 1 million timers, the number of heap operations will usually be
less than 1000 (log2(1,000,000) ≈ 20, but spawning can require multiple minimum deletions,
because multiple timers can reach their deadlines at about the same time).
It’s very fast, and all the work happens in a separate goroutine, so it doesn’t block.
The siftupTimer
and siftdownTimer
functions are used for maintaining the heap properties.
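To illustrate what maintaining the heap properties means, here is a simplified sift-up for a timer min-heap keyed by when. It’s a sketch, not the runtime code (the real siftupTimer also maintains each timer’s heap index, and the runtime heap is quaternary rather than binary):

type timer struct {
	when int64 // deadline in nanoseconds
}

// siftUp moves t[i] toward the root until its parent fires no later than it.
func siftUp(t []*timer, i int) {
	for i > 0 {
		p := (i - 1) / 2 // parent index in a binary heap
		if t[i].when >= t[p].when {
			break
		}
		t[i], t[p] = t[p], t[i]
		i = p
	}
}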
But data structures don’t work on their own; something has to use them. In our
case it’s just one goroutine running the function
timerproc.
It’s spawned on
the first timer start.

runtime.timerproc

It’s kinda hard to describe what’s going on without the source code, so this section
will be in the form of commented Go code. The code comes from the src/runtime/time.go
file with added comments.
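Since the exact code changes between Go versions, what follows is an abridged sketch of timerproc as it looked around Go 1.4/1.5, with race-detector and tracing details dropped but the structure and key lines kept:

// timerproc runs the time-driven events.
// It sleeps until the next event in the timers heap fires.
// If addtimer inserts a new, earlier event, it wakes timerproc early.
func timerproc() {
	timers.gp = getg()
	for {
		lock(&timers.lock)
		timers.sleeping = false
		now := nanotime()
		delta := int64(-1)
		for {
			if len(timers.t) == 0 {
				delta = -1
				break
			}
			// t is the timer with the smallest when (the top of the heap).
			t := timers.t[0]
			delta = t.when - now
			if delta > 0 {
				// Too early even for the closest timer - stop the loop.
				break
			}
			if t.period > 0 {
				// A ticker: leave it in the heap but adjust its deadline.
				t.when += t.period * (1 + -delta/t.period)
				siftdownTimer(0)
			} else {
				// A one-shot timer: remove it from the heap.
				last := len(timers.t) - 1
				if last > 0 {
					timers.t[0] = timers.t[last]
					timers.t[0].i = 0
				}
				timers.t[last] = nil
				timers.t = timers.t[:last]
				if last > 0 {
					siftdownTimer(0)
				}
				t.i = -1 // mark it as removed
			}
			f := t.f
			arg := t.arg
			seq := t.seq
			// Run the timer's function without holding the lock.
			unlock(&timers.lock)
			f(arg, seq)
			lock(&timers.lock)
		}
		if delta < 0 {
			// No timers left - park the goroutine
			// until addtimer calls goready (rescheduling).
			timers.rescheduling = true
			goparkunlock(&timers.lock, "timer goroutine (idle)")
			continue
		}
		// At least one timer is pending. Sleep until its deadline,
		// or until notewakeup if an earlier timer gets added (sleeping).
		timers.sleeping = true
		noteclear(&timers.waitnote)
		unlock(&timers.lock)
		notetsleepg(&timers.waitnote, delta)
	}
}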

There are two variables which I think deserve an explanation: rescheduling and
sleeping. They both indicate that the goroutine was put to sleep, but different
synchronization mechanisms are used; let’s discuss them.

sleeping is set when all “current” timers are processed, but there are more
which we need to spawn in the future. It uses OS-based synchronization: it makes
OS syscalls to put the goroutine to sleep and wake it up, which means OS threads
are involved in this.
It uses the note
structure and the following functions for synchronization:

notetsleepg -
puts the goroutine to sleep until notewakeup is called or some period
of time passes (in the case of timers, the time until the next timer). This func fills timers.waitnote
with a “pointer to the timer goroutine”.

notewakeup might be called in addtimerLocked if the new timer is “earlier” than
the previous “earliest” timer.

rescheduling is set when there are no timers in our heap, so there is nothing to do.
It uses the Go scheduler to put the goroutine to sleep with the function
goparkunlock.
Unlike notetsleepg, it does not consume any OS resources, but it also does not
support a “wakeup timeout”, so it can’t be used instead of notetsleepg in
the sleeping branch.
The goready
function is used for waking up the goroutine when a new timer is added with addtimerLocked.

Conclusion

We learned how Go timers work “under the hood” - the runtime neither uses one goroutine per
timer, nor are timers “free” to use. It’s important to understand how things work
to avoid premature optimization. Also, we learned that it’s quite easy to read the
runtime code, and you shouldn’t be afraid to do so.
I hope you enjoyed this reading and will share it with your fellow Gophers.

Benchmarks

Benchmarks are tests for performance. It’s pretty useful to have them in a
project and to compare their results from commit to commit. Go has very good tooling for
writing and executing benchmarks. In this article I’ll show how to use the package
testing for writing benchmarks.
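Here is what a minimal benchmark looks like (BenchmarkSample and its body are just an illustration; any function can be measured this way):

package sample

import (
	"fmt"
	"testing"
)

func BenchmarkSample(b *testing.B) {
	for i := 0; i < b.N; i++ {
		if x := fmt.Sprintf("%d", 42); x != "42" {
			b.Fatalf("unexpected result: %s", x)
		}
	}
}

Run it with go test -bench=. and you’ll see something like:

PASS
BenchmarkSample 10000000 206 ns/op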

We see here that one iteration takes 206 nanoseconds. That was easy, indeed.
There are a couple more things to know about benchmarks in Go, though.

What can you benchmark?

By default go test -bench=. tests only the speed of your code; however, you can
add the flag -benchmem, which will also measure memory consumption and the
allocation count. It’ll look like:

PASS
BenchmarkSample 10000000 208 ns/op 32 B/op 2 allocs/op

Here we have bytes per operation and allocations per operation. Pretty useful
information, if you ask me. You can also enable those reports per-benchmark with the
b.ReportAllocs() method.
But that’s not all: you can also specify the throughput of one operation with the
b.SetBytes(n int64) method. For example:
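(A sketch: the benchmark below reports how fast we can checksum a 1 KiB buffer; the function under test is arbitrary.)

package sample

import (
	"hash/crc32"
	"testing"
)

func BenchmarkChecksum(b *testing.B) {
	buf := make([]byte, 1024)
	b.SetBytes(int64(len(buf))) // bytes processed per iteration
	for i := 0; i < b.N; i++ {
		crc32.ChecksumIEEE(buf)
	}
}

With SetBytes in place, the output gains a throughput column in MB/s.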

Writing profiles

You can read how to analyze profiles in an awesome blog post on blog.golang.org
here.
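To write the profiles in the first place, the testing package provides standard flags; for example:

$ go test -bench=. -cpuprofile=cpu.out
$ go test -bench=. -memprofile=mem.out

The resulting files can then be fed to go tool pprof.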

Conclusion

Benchmarks are an awesome instrument for a programmer. And in Go, writing and
analyzing benchmarks is extremely easy. New benchmarks allow you to find
performance bottlenecks, weird code (efficient code is often simpler and more
readable) or usage of the wrong instruments. Old benchmarks allow you to be more
confident in your changes and can be another +1 in the review process. So,
writing benchmarks has enormous benefits for a programmer and the code, and
I encourage you to write more. It’s fun!

Finalizers

A finalizer is basically a function which will be called when your object loses
all references and is found by the GC. In Go you can add finalizers to your
objects with the runtime.SetFinalizer
function. Let’s see how it works.
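Here is a small program reconstructing the experiment this section describes (the Test type and the output are illustrative):

package main

import (
	"fmt"
	"runtime"
	"time"
)

type Test struct {
	A int
}

func test() {
	// a loses all its references when test returns
	a := &Test{1}
	runtime.SetFinalizer(a, func(t *Test) {
		fmt.Println("finalizer called for", t.A)
	})
}

func main() {
	test()
	runtime.GC()
	// the finalizer runs in its own goroutine; give it a moment
	time.Sleep(time.Second)
}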

So, we created an object a, which is a pointer, and set a simple finalizer on it. When
the code leaves the test function, all references to it disappear, and therefore the garbage
collector is able to collect a and call the finalizer in its own goroutine. You
can try modifying the test() function to return *Test and print it in main();
then you’ll see that the finalizer won’t be called.
The same happens if you remove the A field from the Test type, because then Test becomes an
empty struct, and an empty struct allocates no memory and can’t be collected by the GC.

Finalizer examples

Let’s try to find finalizer usage in the standard library. There it is used only
for closing file descriptors, like this in the net package:

runtime.SetFinalizer(fd, (*netFD).Close)

So, you’ll never leak a fd, even if you forget to Close a net.Conn.

So, finalizers are probably not such a good idea if even the standard library makes
such limited use of them. Let’s see what the problems can be.

Why you should avoid finalizers

Finalizers are a pretty tempting idea if you come from a language without GC, or one where
you’re not expecting users to write proper code. In Go we have both a GC and
pro-users :) So, in my opinion, an explicit call to Close is always better than a
magic finalizer. For example, there is a finalizer for the fd in os:
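(Roughly, from os/file_unix.go; details differ between versions:)

func NewFile(fd uintptr, name string) *File {
	fdi := int(fd)
	if fdi < 0 {
		return nil
	}
	f := &File{&file{fd: fdi, name: name}}
	runtime.SetFinalizer(f.file, (*file).close)
	return f
}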

and NewFile is called by OpenFile, which is called by Open, so if you’re
opening a file you’ll hit that code. The problem with finalizers is that you have no
control over them, and, more than that, you’re not expecting them. Look at this code:
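(A reconstruction of the kind of code meant here; getFd is a hypothetical helper:)

func getFd(path string) (int, error) {
	f, err := os.Open(path)
	if err != nil {
		return -1, err
	}
	// f is never stored anywhere, so its finalizer may close the fd
	return int(f.Fd()), nil
}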

It’s a pretty common operation to get a file descriptor from a path when you’re
writing some stuff for Linux. But that code is unreliable, because when you
return from getFd, f loses its last reference, and so your file is doomed to
be closed sooner or later (when the next GC cycle comes). The problem here is not
that the file will be closed, but that this is not documented and not expected at all.

Conclusion

I think it’s better to assume that users are smart enough to clean up objects
themselves. At the very least, all methods which call SetFinalizer should document this,
but personally I don’t see any value in this method for myself.

Network namespace

From man namespaces:

Network namespaces provide isolation of the system resources associated with
networking: network devices, IPv4 and IPv6 protocol stacks, IP routing tables,
firewalls, the /proc/net directory, the /sys/class/net directory, port numbers
(sockets), and so on. A physical network device can live in exactly one
network namespace.
A virtual network device ("veth") pair provides a pipe-like abstraction that
can be used to create tunnels between network namespaces, and can be used to
create a bridge to a physical network device in another namespace.

The network namespace allows you to isolate a network stack for your container. Note
that it does not include the hostname - that’s the task of the UTS namespace.

We can create a network namespace with the flag syscall.CLONE_NEWNET in
SysProcAttr.Cloneflags. After namespace creation there are only autogenerated
network interfaces (in most cases only the loopback interface). So we need to inject
a network interface into the namespace which will allow the container to talk to other
containers. We will use veth-pairs for this, as mentioned in the man-page.
It’s not the only way and probably not the best one, but it is the best known and is
used in Docker by default.

Unet

For interface creation we will need a new binary with the suid bit set, because
these are pretty privileged operations. We could create the interfaces with the awesome
iproute2 set of utilities, but I decided to write it all in Go, because it’s fun and I want
to promote the awesome netlink library -
with this library you can do any operations on networking stuff.
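For example, creating a veth pair and moving one end into a container’s network namespace can be sketched like this (assuming the vishvananda/netlink package; the function and interface names are illustrative):

import "github.com/vishvananda/netlink"

func createVethPair(name, peerName string, pid int) error {
	veth := &netlink.Veth{
		LinkAttrs: netlink.LinkAttrs{Name: name},
		PeerName:  peerName,
	}
	if err := netlink.LinkAdd(veth); err != nil {
		return err
	}
	peer, err := netlink.LinkByName(peerName)
	if err != nil {
		return err
	}
	// move the peer end into the network namespace of the process with this PID
	return netlink.LinkSetNsPid(peer, pid)
}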

After all this we will have a “pipe” from the container to the bridge unc0. But it’s not
all that easy; don’t forget that we’re talking about unprivileged containers, so we need
to run all the code as an unprivileged user, while that particular part must be
executed with root rights. We can set the suid bit for this, which will allow an
unprivileged user to run that binary as privileged. I did the following:
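(A sketch of that setup; the binary name unet comes from this section, everything else is an assumption:)

$ go build -o unet .
$ sudo chown root:root unet
$ sudo chmod u+s unet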

That’s all you need to do to run this binary. Actually, you don’t need to run it yourself;
unc will do this :)

Waiting for interface

Now we can create interfaces in the namespace of a specified PID. But the process expects
the network to be ready when it starts, so we need to somehow wait until the
interface is created by unet in the fork part of the program, before calling
syscall.Exec. I decided to use a pretty simple idea for this: just poll the
interface list until the first veth device appears. Let’s modify our
container.Start to put the interface into the namespace after we start the fork-process:
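On the fork side, the waiting loop can be sketched like this (waitForVeth and its timeout are assumptions):

func waitForVeth(timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		ifaces, err := net.Interfaces()
		if err != nil {
			return err
		}
		for _, iface := range ifaces {
			if strings.HasPrefix(iface.Name, "veth") {
				return nil
			}
		}
		time.Sleep(10 * time.Millisecond)
	}
	return errors.New("veth interface did not appear")
}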

They can talk! It’s like magic, right? You can find all the code under the tag
netns.

The end

This is the last post about unprivileged containers (at least about namespaces). We
created an isolated environment for a process, which you can run under an unprivileged
user. Containers, though, are a little more than just isolation - you also want to
specify what a process can do inside a container
(Linux capabilities),
how many resources a process can use
(Cgroups),
and you can imagine many other things. I invite you to look at what we have in
runc/libcontainer;
it’s not very easy code, but I hope that after my posts you will be able to
understand it. If you have any questions feel free to write me, I’m always happy
to share my humble knowledge about containers.

Mount namespace

From man namespaces:

Mount namespaces isolate the set of filesystem mount points, meaning that
processes in different mount namespaces can have different views of the
filesystem hierarchy. The set of mounts in a mount namespace is modified using
mount(2) and umount(2).

So, the mount namespace allows you to give your process a different set of mounts. You
can have a separate /proc, /dev, etc. It’s as easy as passing one more flag to
SysProcAttr.Cloneflags: syscall.CLONE_NEWNS. It has such a weird name because
it was the first namespace introduced, and nobody could imagine that there would be more.
So, if you see CLONE_NEWNS, know that this is the mount namespace.
Let’s try to enter our container with the new mount namespace. We’ll see all the same
mounts as on the host. That’s because the new mount namespace receives a copy of the
parent namespace as its initial mount table. In our case we’re pretty restricted in
what we can do with these mounts; for example, we can’t unmount anything:

$ umount /proc
umount: /proc: not mounted

That’s because we’re using an “unprivileged” namespace. But we can mount a new /proc over the
old one:

mount -t proc none /proc

Now you can see that ps shows only your processes. So, to get rid of the host
mounts and have a nice clean mount table, we can use the pivot_root syscall to change
the root from the host root to another one. But first we need to write some code to
really mount something into the new rootfs.

Mounting inside root file system

So, for the next steps we will need some root filesystem for tests. I will use
the busybox one, because it’s very small but useful. You can take the busybox rootfs from
the Docker official image
here. Just
unpack it to a directory busybox somewhere:
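(For example; the archive name is an assumption:)

$ mkdir busybox
$ tar -C busybox -xf busybox.tar

Now, with rootfs pointing at the unpacked busybox directory, mounting a fresh proc inside it is one syscall (the variable name is an assumption):

// mount a new proc instance inside the future root
if err := syscall.Mount("proc", filepath.Join(rootfs, "proc"), "proc", 0, ""); err != nil {
	return err
}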

Now we have something mounted inside our new rootfs. Time to pivot_root to it.

Pivot root

From man 2 pivot_root:

int pivot_root(const char *new_root, const char *put_old);
...
pivot_root() moves the root filesystem of the calling process to the directory
put_old and makes new_root the new root filesystem of the calling process.
...
The following restrictions apply to new_root and put_old:
- They must be directories.
- new_root and put_old must not be on the same filesystem as the current root.
- put_old must be underneath new_root, that is, adding a nonzero number
of /.. to the string pointed to by put_old must yield the same directory as new_root.
- No other filesystem may be mounted on put_old.

So, it takes the current root, moves it with all its mounts to put_old, and makes
new_root the new root. pivot_root is more secure than chroot; it’s pretty hard
to escape from it. Sometimes pivot_root doesn’t work (for example on Android
systems, because of their special kernel loading process); then you need to
mount “/” with the MS_MOVE flag and chroot there, but we won’t discuss that case here.
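A sketch of how this can be done from Go, respecting the restrictions above (the .pivot_root directory name is an assumption):

func pivotRoot(root string) error {
	// put_old must be underneath new_root
	putOld := filepath.Join(root, ".pivot_root")
	if err := os.MkdirAll(putOld, 0700); err != nil {
		return err
	}
	// new_root must be a mount point, so bind-mount it onto itself
	if err := syscall.Mount(root, root, "", syscall.MS_BIND|syscall.MS_REC, ""); err != nil {
		return err
	}
	if err := syscall.PivotRoot(root, putOld); err != nil {
		return err
	}
	if err := os.Chdir("/"); err != nil {
		return err
	}
	// get rid of the old root
	if err := syscall.Unmount("/.pivot_root", syscall.MNT_DETACH); err != nil {
		return err
	}
	return os.Remove("/.pivot_root")
}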

I hope that everything is clear from the comments; let me know if not. This is all the
code you need to have your own unprivileged container with its own rootfs. You can
try to find other rootfs among the Docker image sources; for example, Alpine Linux
is a pretty exciting distribution. You can also try to mount something more inside the
container.

That’s all for today. The tag for this article on GitHub is
mnt_ns. Remember that you should
run unc as an unprivileged user and from the directory which contains the rootfs. Here
are examples of some commands inside the container (excluding logging):

Setup namespaces

In the previous part we created some namespaces and executed a process in them. It was
cool, but in the real world we need to set namespaces up before the process starts:
for example, set up mounts, make a chroot, set the hostname, create network interfaces,
etc. We need this because we can’t expect the user process to do all that itself;
it wants everything ready for execution.

So, in our case we want to insert some code after namespace creation, but before
process execution. In C this is pretty easy to do, because there is the clone call.
It’s not as easy in Go (but still easy, really). In Go we need to spawn a new process
with our code in the new namespaces. We can do this by executing our own binary
again with different arguments.
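(The idea in a sketch; on Linux, /proc/self/exe points back at our own binary:)

cmd := &exec.Cmd{
	Path:   "/proc/self/exe",
	Args:   append([]string{"unc-fork"}, os.Args[1:]...),
	Stdin:  os.Stdin,
	Stdout: os.Stdout,
	Stderr: os.Stderr,
}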

Here we create an *exec.Cmd which will call the same binary with the same arguments as
the caller, but will replace os.Args[0] with the string unc-fork (yes, you can
specify any os.Args[0], not only the program name).
It will be our keyword, indicating that we want to set up namespaces and
execute the process.

syscall.Exec calls the execve syscall; you can read more about it in
man execve. It receives the path to a binary, arguments and an array of environment
variables. Here we just pass all variables down to the process, but we can change
them in fork() too.
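(For example, executing the requested command from fork() might look like this:)

path, err := exec.LookPath(os.Args[1])
if err != nil {
	log.Fatal(err)
}
// replaces the current process image; it never returns on success
if err := syscall.Exec(path, os.Args[1:], os.Environ()); err != nil {
	log.Fatal(err)
}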

UTS namespace

Let’s do some real work in our new shiny function. Let’s try to set up the hostname
for our “container” (by default it inherits the hostname of the host). Let’s add the
following lines to fork():
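(A sketch; the hostname unc matches the output shown below:)

if err := syscall.Sethostname([]byte("unc")); err != nil {
	log.Fatal(err)
}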

UTS namespaces provide isolation of two system identifiers: the hostname and the NIS domain name.

So let’s isolate our hostname from the host’s hostname. We can create our own UTS
namespace by adding syscall.CLONE_NEWUTS to Cloneflags. Now we’ll see a
successfully changed hostname:

$ unc hostname
unc

Code

The tag on GitHub for this article is uts_setup, and it can be found
here. I added some functions
to separate the steps and created a Cfg structure in the container.go file, so later we
can change the container configuration in one place.
I also added logging with the awesome library
logrus.

Thanks for reading! I hope to see you next week in the part about mount namespaces;
it’ll be very interesting.

Unprivileged namespaces

Unprivileged (or user) namespaces are Linux
namespaces which can
be created by an unprivileged (non-root) user. This is possible only with the use
of user namespaces. Exhaustive info about user namespaces can be found in the
manpage man user_namespaces. Basically, to create your namespaces you need
to create a user namespace first. The kernel can take over the job of creating
namespaces in the right order for you, so you can just pass a bunch of flags to
clone, and the user namespace is always created first and is a parent for the other
namespaces.

User namespace

In a user namespace you can map users and groups from the host into this namespace so
that, for example, your user with uid 1000 can be 0 (root) in the namespace.

Mrunal Patel introduced
support for users and groups to Go,
and Go 1.4.0 included it. Unfortunately, there was a
security fix in Linux kernel 3.18 which prevents group mappings from an
unprivileged user unless the setgroups syscall is disabled. It was
fixed
by me and will be released in 1.5.0 (UPD: already released!).

To execute a process in a new user namespace, you need to create an *exec.Cmd
like this:
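(A sketch matching the description below:)

cmd := exec.Command(os.Args[1], os.Args[2:]...)
cmd.SysProcAttr = &syscall.SysProcAttr{
	Cloneflags: syscall.CLONE_NEWUSER,
	UidMappings: []syscall.SysProcIDMap{
		{ContainerID: 0, HostID: os.Getuid(), Size: 1},
	},
	GidMappings: []syscall.SysProcIDMap{
		{ContainerID: 0, HostID: os.Getgid(), Size: 1},
	},
}
cmd.Stdin = os.Stdin
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr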

Here you can see the syscall.CLONE_NEWUSER flag in SysProcAttr.Cloneflags,
which simply means “please create a new user namespace for this process”; other
namespaces can be specified there too. The mapping fields speak for themselves.
Size means the size of the range of mapped IDs, so you can remap many IDs without
specifying each one.

PID namespaces

From man pid_namespaces:

PID namespaces isolate the process ID number space

That’s it: your process in this namespace has PID 1, which is sorta cool -
you are like systemd, but better. In this first part, ps awux won’t show only
our process, because for that we need a mount namespace and a remounted /proc, but you
can still see PID 1 with echo $$.

First unprivileged container

I am pretty bad at writing big texts, so I decided to split the container creation
into several parts. Today we will see only user and PID namespace creation, which is
still pretty impressive. So, to add a PID namespace we need to modify
Cloneflags:

Cloneflags: syscall.CLONE_NEWUSER | syscall.CLONE_NEWPID

For these articles I created a project on GitHub: https://github.com/LK4D4/unc.
unc means “unprivileged container” and has nothing in common with
runc (well, maybe only a little). I will tag the
code for each article in the repo. The tag for this article is
user_pid. Just compile it with
go1.5 and try to run different commands from an unprivileged user in the
namespaces:

$ unc sh
$ unc whoami
$ unc sh -c "echo \$\$"

It does nothing fancy: it just connects your standard streams to the executed
process and executes it in new namespaces, remapping the current user and
group to the root user and group inside the user namespace. Please read all the code;
there is not much of it for now.

Next steps

The most interesting part of containers is the mount namespace. It allows you to have
mounts separate from the host (/proc, for example). Another interesting namespace
is the network one; it is a little tough for an unprivileged user, because you need to
create network interfaces on the host first, so you need some superpowers
from root for it. In the next article I hope to cover the mount namespace - then it will
be a real container with its own root filesystem.

Thanks for reading! I am learning all this stuff myself right now by writing
these articles, so if you have something to say, please feel free to comment!

It is easy with Docker, though. Let’s see how we can upload our first program
to an Arduino Uno without installing anything apart from Docker.

Kernel Modules

For the Arduino Uno I need to enable

Device Drivers -> USB support -> USB Modem (CDC ACM) support

as a module.

Then I compile and load it with

make modules && make modules_install && modprobe cdc-acm

in my /usr/src/linux. At last I connect the Arduino and see it as /dev/ttyACM0.

Installing ino

For this we just need an image from hub.docker.com:

docker pull coopermaa/ino

It’s slightly outdated, but I sent a
PR to use a new base image,
because that’s how we do it in the opensource world. Anyway, it works great. Let’s
create a script for calling ino through Docker; add the following script to your $PATH:
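(One plausible wrapper; the original script isn’t shown here, so the device path and mounts are assumptions:)

#!/bin/sh
exec docker run --rm --privileged \
  --device=/dev/ttyACM0 \
  -v "$(pwd):/app" -w /app \
  coopermaa/ino ino "$@"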

Uploading program
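With the wrapper on your $PATH, building and uploading a sketch looks the same as with a native ino install (the blink template here is just an example):

$ ino init -t blink
$ ino build
$ ino upload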

Vim integration

I’m using a Vim plugin for ino; you
can easily install it with any plugin manager for Vim. You don’t need anything
special; it’ll just work. You can compile and upload your sketch with
<Leader>ad.

Known issues

To use ino serial you need to add -t to the docker run arguments in your
script. It works pretty weirdly, though: you need to kill the process
/usr/bin/python /usr/local/bin/ino serial by hand every time, but it works
and doesn’t look so bad.

Also, files created by ino init will belong to root, which isn’t very
convenient.

That’s all!

Prelude

Yesterday I finished my first 30-day streak on GitHub.
Most of my contributions were to Docker -
the biggest opensource project in Go. I learned a lot in this month, and it was
really cool. I think that this is mostly because of the Go language. I had been
programming in Python for five years, and I was never so excited about open source,
because Python is not even half as fun as Go.

1. Tools

There are a lot of tools for Go; some of them are just “must have”.

Goimports - like
go fmt, but with cool import handling. I really think that go fmt should
be replaced with goimports in future Go versions.

Vet - analyzes code for
suspicious constructs. With it you can find bad format strings, unreachable
code, mutexes passed by value, etc.
PR about vet errors in Docker.

2. Editor

I love my awesome vim with the awesome vim-go plugin,
which is integrated with the tools mentioned above.
It formats code for me, adds needed imports, removes unused imports, shows
documentation, supports tagbar and more. And my favourite - go to definition. I
really suffered without it :) With vim-go my development rate became faster
than I could imagine. You can see my config in my dotfiles
repo.

3. Race detector

This is one of the most important and most underestimated things.
It is very useful and very easy to use. You can find a description and examples
here. I’ve found many race conditions
with this tool (#1,
#2,
#3,
#4,
#5).

4. Docker specific

Docker has a very smart and friendly community. You can always ask for help about
hacking in #docker-dev on Freenode. But I’ll describe some simple tasks that appear
when you try to hack on Docker for the first time.

Tests

There are three kinds of tests in the docker repo:

unit - unit tests (ah, we all know what unit tests are, right?). These tests are
spread all over the repository and can be run with make test-unit. You can run
tests for one directory by specifying it in the TESTDIRS variable. For example,

TESTDIRS="daemon" make test-unit

will run tests only for the daemon directory.

integration-cli - integration tests that use external docker commands
(for example docker build, docker run, etc.). It is very easy to write this
kind of test, and you should do it if you think that your changes can change
Docker’s behavior from the client’s point of view. These tests are located in the
integration-cli directory and can be run with make test-integration-cli. You can run
one or more specific tests by setting the TESTFLAGS variable. For example,

TESTFLAGS="-run TestBuild" make test-integration-cli

will run all tests whose names start with TestBuild.

integration - integration tests that use internal docker datastructures.
They are deprecated now, so if you want to write tests you should prefer
integration-cli or unit. These tests are located in the integration directory and
can be run with make test-integration.

All tests can be run with make test.

Build and run tests on host

All make commands execute in a Docker container; it can be pretty annoying to
build a container just to run unit tests, for example.

So, to run unit tests on the host machine, you need a canonical Go
workspace. When it’s ready, you can
just symlink the docker repo to src/github.com/dotcloud/docker. But we still
need the right $GOPATH; here is the trick:
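(A plausible version of the trick; the exact paths depend on where your workspace and checkout live, and on the repo’s vendor layout at the time:)

$ ln -s ~/docker $GOPATH/src/github.com/dotcloud/docker
$ export GOPATH=$GOPATH:$GOPATH/src/github.com/dotcloud/docker/vendor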

Useful links

Conclusion

This is all that I wanted to tell you about my first big opensource experience.
Also, just today the Docker folks launched some
new projects, and I am very excited about them.
So, I want to invite you all to the magical world of Go, opensource and,
of course, Docker.

Prelude

This post is based on
real events
in the docker repository.
When I revealed that my 20-percent-cooler refactoring made the Pop function 4x-5x
slower, I did some research and concluded that the problem was in using the
defer statement for unlocking everywhere.

In this post I’ll write a simple program and benchmarks, from which we will see
that sometimes the defer statement can slow down your program a lot.

Let’s create a simple queue with methods Put and Get. The next snippets show such a
queue and benchmarks for it. I also wrote duplicate methods with defer and
without it.
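Those snippets aren’t reproduced here in full, but the shape of them is easy to reconstruct (names are illustrative):

type Queue struct {
	mu    sync.Mutex
	items []int
}

func (q *Queue) PutDefer(item int) {
	q.mu.Lock()
	defer q.mu.Unlock() // convenient, but not free on Go versions of that era
	q.items = append(q.items, item)
}

func (q *Queue) PutNoDefer(item int) {
	q.mu.Lock()
	q.items = append(q.items, item)
	q.mu.Unlock() // explicit unlock on the only exit path
}

func BenchmarkPutDefer(b *testing.B) {
	q := &Queue{}
	for i := 0; i < b.N; i++ {
		q.PutDefer(i)
	}
}

func BenchmarkPutNoDefer(b *testing.B) {
	q := &Queue{}
	for i := 0; i < b.N; i++ {
		q.PutNoDefer(i)
	}
}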

Prelude

There is awesome coverage support in Go. You can read about
it here. But it also has some limitations.
For example, let’s assume that we have the following code structure:

src
├── pkg1
│   ├── pkg11
│   └── pkg12
└── pkg2
    ├── pkg21
    └── pkg22

pkg2, pkg21 and pkg22 use pkg1, pkg11 and pkg12 in various combinations.
So the question is: how can we compute the overall coverage for our code base?

Generating cover profiles

Let’s consider some possible go test commands with -coverprofile:

go test -coverprofile=cover.out pkg2

tests run only for pkg2, and the cover profile is generated only for pkg2

go test -coverprofile=cover.out -coverpkg=./... pkg2

tests run only for pkg2, and the cover profile is generated for all packages

go test -coverprofile=cover.out -coverpkg=./... ./...

boo hoo: cannot use test profile flag with multiple packages

So, what can we do to run tests on all packages and get a cover profile
for all packages?
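One way out, which the next section builds on, is to generate a profile per package and then merge them, for example:

go test -coverprofile=pkg1.out -coverpkg=./... ./pkg1
go test -coverprofile=pkg2.out -coverpkg=./... ./pkg2

(and so on for each package; the file names are arbitrary).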

Merging cover profiles

Now we are able to get an overall profile for each package individually.
It seems that we can merge these files. A profile file has the following structure,
according to the
cover code:

// First line is "mode: foo", where foo is "set", "count", or "atomic".
// Rest of file is in the format
// encoding/base64/base64.go:34.44,37.40 3 1
// where the fields are: name.go:line.column,line.column numberOfStatements count
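So a naive merge keeps the "mode:" line from the first file and concatenates the block lines from all of them. Here is a sketch (only correct when the same block never appears in two profiles; otherwise the counts have to be summed per block):

func mergeProfiles(out io.Writer, files ...string) error {
	for i, name := range files {
		f, err := os.Open(name)
		if err != nil {
			return err
		}
		s := bufio.NewScanner(f)
		for s.Scan() {
			line := s.Text()
			if strings.HasPrefix(line, "mode:") && i > 0 {
				continue // keep the mode line only once
			}
			fmt.Fprintln(out, line)
		}
		f.Close()
		if err := s.Err(); err != nil {
			return err
		}
	}
	return nil
}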