Digital ramblings of a french gopher

31 Jul 2018

Today I am starting an article for arXiv about Go and Go-HEP.
I thought structuring my thoughts a bit (in the form of a blog post) would help smooth the process.

(HEP) Software is painful

In my introductory talk(s) about Go and Go-HEP, such as here, I usually talk about software being painful.
HENP software is no exception.
It is painful.

As a C++/Python developer and former software architect of one of the four LHC experiments, I can tell you from vivid experience that software is painful to develop.
One has to tame deep and complex software stacks with huge dependency lists.
Each dependency comes with its own way to be configured, built and installed.
Each dependency comes with its own dependencies.
When you start working with one of these software stacks, installing them on your own machine is no walk in the park, even for experienced developers.
These software stacks are real snowflakes: they need their unique cocktail of dependencies, with the right version, compiler toolchain and OS, tightly integrated, usually on a single development platform.

Granted, the de facto standardization on CMake and docker did help with some of these aspects, allowing projects to cleanly encapsulate the list of dependencies in a reproducible way, in a container.
Alas, this renders code easier to deploy but less portable: everything is linux/amd64 plus some arbitrary Linux distribution.

In HENP, with C++ now being the lingua franca for everything that is related to frameworks or infrastructure, we get unwieldy compilation times and thus a very unpleasant edit-compile-run development cycle.
Because C++ is a very complex language to learn, read and write - each new revision more complex than the previous one - it is becoming harder to bring new people on board with existing C++ projects that have accumulated a lot of technical debt over the years: there are many layers of accumulated cruft, different styles, different ways to do things, etc…

Also, HENP projects heavily rely on shared libraries: not because of security, not because they are faster at runtime (they are not), but because as C++ is so slow to compile, it is more convenient to not recompile everything into a static binary.
And thus, we have to devise sophisticated deployment scenarios to deal with all these shared libraries, properly configuring $LD_LIBRARY_PATH, $DYLD_LIBRARY_PATH or -rpath, adding yet another moving piece to the machinery.
We did not have to do that in the FORTRAN days: we were building static binaries.

From a user perspective, HENP software is also - even more so - painful.
One needs to deal with:

overly complicated Object Oriented systems,

overly complicated inheritance hierarchies,

overly complicated meta-template programming,

and, of course, dependencies.
It’s 2018 and, when one lives in a C++ world, there is still no simple way to handle dependencies, nor a standard one that would work across operating systems, experiments or analysis groups.
Finally, there is no standard way to retrieve documentation - and here we are just talking about APIs - nor a system that works across projects and across dependencies.

All of these issues might explain why many physicists are migrating to Python.
The ecosystem is much more integrated and standardized with regard to installation procedures, serving, fetching and describing dependencies and documentation tools.
Python is also simpler to learn, teach, write and read than C++.
But it is also slower.

Most physicists and analysts are willing to pay that price, trading reduced runtime efficiency for a wealth of scientific, turn-key pure-Python tools and libraries.
Other physicists strike a different compromise and are willing to trade the relatively seamless installation procedures of pure-Python software for some runtime efficiency by wrapping C/C++ libraries.

To summarize, Python and C++ are no panacea when you take into account the vast diversity of programming skills in HENP, the distributed nature of scientific code development in HENP, the many different team sizes and the constraints coming from the development of scientific analyses (agility, fast edit-compile-run cycles, reproducibility, deployment, portability, …).
To add insult to injury, these languages are rather ill-equipped to cope with distributed programming and parallel programming: either because of a technical limitation (CPython’s Global Interpreter Lock) or because the current toolbox is too low-level or error-prone.

Are we really left with either:

a language that is relatively fast to develop with, but slow at runtime, or

a language that is painful to develop with but fast at runtime?

Mending software with Go

Of course, I think Go can greatly help with the general situation of software in HENP.
It is not a magic wand: you still have to think and put in the work.
But it is a definite, positive improvement.

Go was created to tackle all the challenges that C++ and Python couldn’t overcome.
Go was designed for “programming in the large”.
Go was designed to thrive at scale: software development at Google-like scale but also at 2-3 people scale.

But, most importantly, Go wasn’t designed to be a good programming language, it was designed for software engineering:

Software engineering is what happens to programming
when you add time and other programmers.

Go is a simple language - not a simplistic language - so one can easily learn most of it in a couple of days and be proficient with it in a few weeks.

Go has builtin tools for concurrency (the famed goroutines and channels) and that is what made me try it initially.
But I stayed with Go for everything else, i.e. the tooling that enables:

integrated, simple, build system (go build) that handles dependencies (go get), without messing around with CMakeLists.txt, Makefile, setup.py or pom.xml build files: all the needed information is in the source files,

easiest cross-compiling toolchain to date.

And all these tools are usable from every single editor or IDE.

Go compiles optimized code really quickly.
So much so that the go run foo.go command, which compiles a complete program and executes it on the fly, feels like running python foo.py - but with builtin concurrency and better runtime performance (CPU and memory).
Go produces static binaries that usually do not even require libc.
One can take a binary compiled for linux/amd64, copy it on a Centos-7 machine or on a Debian-8 one, and it will happily perform the requested task.

As a Gedankenexperiment, take a standard centos7 docker image from docker-hub and imagine having to build your entire experiment software stack, from the exact gcc version down to the last wagon of your train analysis.

How much time would it take?

How much effort of tracking dependencies and ensuring internal consistency would it take?

How much effort would it be to deploy the binary results on another machine? on another non-Linux machine?

Gonum is almost at feature parity with the numpy/scipy stack.
Gonum is still missing some tools, like ODE or more interpolation tools, but the chasm is closing.

Right now, in a HENP context, it is not possible to perform an analysis in Go and insert it in an already existing C++/Python pipeline.
At least not easily: while reading is possible, Go-HEP is still missing the ability to write ROOT files.
This restriction should be lifted before the end of 2018.

That said, Go can already be quite useful and usable, now, in science and HENP, for data acquisition, monitoring, cloud computing, control frameworks and some physics analyses.
Indeed, Go-HEP provides HEP-oriented tools such as histograms and n-tuples, Lorentz vectors, fitting, interoperability with HepMC and other Monte-Carlo programs (HepPDT, LHEF, SLHA), a toolkit for a fast detector simulation à la Delphes and libraries to interact with ROOT and XRootD.

I think building the missing scientific libraries in Go is a better investment than trying to fix the C++/Python languages and ecosystems.

math/rand exposes convenience functions (Float32, Float64, ExpFloat64, …) that share a global rand.Rand value, the “default” source of (pseudo) random numbers.
These convenience functions are safe to be used from multiple goroutines concurrently, but this may generate lock contention.
It’s probably a good idea in your libraries to not rely on these convenience functions and instead provide a way to use local rand.Rand values, especially if you want to be able to change the seed of these rand.Rand values.

Note that this has slightly changed the previous "uniform.png" plot: we are sharing the source of random numbers between the 2 histograms.
The sequence of random numbers is exactly the same as before (modulo the fact that we now generate at least twice as many numbers as previously) but they are not associated with the same histograms.

OK, this does generate a gaussian.
But what if we want to generate a gaussian with a mean other than 0 and/or a standard deviation other than 1?

10 Oct 2017

Now, we tackle the L3 LEP data.
L3 was an experiment at the Large Electron Positron collider, at CERN, near Geneva.
Until 2000, it recorded the decay products of e+e- collisions at center of mass energies up to 208 GeV.

An example is the muon pair production:

$$ e^+ e^- \rightarrow \mu^+\mu^-$$

Both muons are mainly detected and reconstructed from the tracking system.
From the measurements, the curvature, charge and momentum are determined.

The file L3.dat contains recorded muon pair events.
Every line is an event, a recorded collision of a \(e^+e^-\) pair producing a \(\mu^+\mu^-\) pair.

The first three columns contain the momentum components \(p_x\), \(p_y\) and \(p_z\) of the \(\mu^+\).
The other three columns contain the momentum components for the \(\mu^-\).
Units are in \(GeV/c\).

Forward-Backward Asymmetry

An important parameter that constrains the Standard Model (the theoretical framework that models our current understanding of Physics) is the forward-backward asymmetry A:

$$ A = (N_F - N_B) / (N_F + N_B) $$

where:

\(N_F\) are the events in which the \(\mu^-\) flies forwards (\(\cos \theta_{\mu^-} > 0\));

\(N_B\) are the events in which the \(\mu^-\) flies backwards.

Given the L3.dat dataset, we would like to estimate the value of \(A\) and determine its statistical error.

In a simple counting experiment, we can write the statistical error as:
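One common estimate, assuming the total number of events \(N = N_F + N_B\) is fixed and each event goes forward or backward independently (binomial statistics), is \(\sigma_A = \sqrt{(1 - A^2)/N}\). This may differ in form from the expression used in the rest of this post. The sketch below computes \(A\) and this error from the two counts. It also assumes the beam axis is along \(z\), so that the sign of \(\cos \theta_{\mu^-}\) is the sign of the \(p_z\) of the \(\mu^-\). The function names and the counts in main are made up.

```go
package main

import (
	"fmt"
	"math"
)

// countForwardBackward counts forward (pz > 0) and backward events from
// the pz component of the mu-. Assumption: the beam axis is along z, so
// cos(theta) has the same sign as pz.
func countForwardBackward(pz []float64) (nF, nB int) {
	for _, v := range pz {
		if v > 0 {
			nF++
		} else {
			nB++
		}
	}
	return nF, nB
}

// asymmetry returns A = (nF - nB) / (nF + nB) and the binomial estimate
// of its statistical error, sqrt((1 - A*A) / (nF + nB)).
// nF + nB must be greater than zero.
func asymmetry(nF, nB int) (a, sigma float64) {
	n := float64(nF + nB)
	a = float64(nF-nB) / n
	sigma = math.Sqrt((1 - a*a) / n)
	return a, sigma
}

func main() {
	// Invented counts, for illustration only.
	a, sigma := asymmetry(60, 40)
	fmt.Printf("A = %.3f +/- %.3f\n", a, sigma)
}
```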

gonum/optimize doesn’t try to automatically numerically compute the first- and second-derivative of an objective function (MINUIT does.)
But using gonum/diff/fd, it’s rather easy to provide it to gonum/optimize.

gonum/optimize.Result only exposes the following information (through gonum/optimize.Location):

With this quite blunt tool, we can analyse some real data from real life.
We will use a dataset pertaining to the salary of European developers, all 1147 of them :).
We have this dataset in a file named salary.txt.

Have a look at the official dis
module documentation for more information.
In a nutshell, LOAD_CONST is the same as our toy OpLoadValue and LOAD_FAST
is the same as our toy OpLoadName.

Simply inspecting this little bytecode snippet shows how conditions and branch-y
code might be handled.
The instruction POP_JUMP_IF_FALSE implements the if x < 5 statement from the
cond() function.
If the condition is false (i.e. x is greater than or equal to 5), the interpreter
is instructed to jump to position 22 in the bytecode stream, i.e. the return "no"
body of the false branch.
Loops are handled pretty much the same way:

The above bytecode dump should be rather self-explanatory.
Except perhaps for the RETURN_VALUE instruction: where does the
instruction return to?

To answer this, a new concept must be introduced: the Frame.

Frames

As the AOSA article puts it:

A frame is a collection of information[s] and context for a chunk of code.

Whenever a function is called, a new Frame is created, carrying a data stack
(the local variables we have played with so far) and a block stack (to handle
control flow such as loops and exceptions.)

The RETURN_VALUE instruction tells the interpreter to pass a value between Frames,
from the callee’s data stack back to the caller’s data stack.
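To make that hand-off concrete, here is a stripped-down sketch. It is not the actual pygo Frame: the Value and Frame types and the returnValue function are simplified stand-ins invented for this example, and the block stack is left out.

```go
package main

import "fmt"

// Value is a placeholder for a Python value (hypothetical type).
type Value interface{}

// Frame is a simplified frame: a data stack plus a link to the caller.
type Frame struct {
	stack []Value
	prev  *Frame // calling frame, nil for the top-level frame
}

func (f *Frame) push(v Value) { f.stack = append(f.stack, v) }

func (f *Frame) pop() Value {
	v := f.stack[len(f.stack)-1]
	f.stack = f.stack[:len(f.stack)-1]
	return v
}

// returnValue mimics RETURN_VALUE: it pops the top of the callee's data
// stack, pushes it on the caller's data stack and returns the caller,
// i.e. the frame in which execution resumes.
func returnValue(callee *Frame) *Frame {
	v := callee.pop()
	caller := callee.prev
	if caller != nil {
		caller.push(v)
	}
	return caller
}

func main() {
	caller := &Frame{}
	callee := &Frame{prev: caller}
	callee.push("no") // the value computed by the called function
	cur := returnValue(callee)
	fmt.Println(cur == caller, cur.pop())
}
```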

I’ll show the pygo implementation of a Frame in a moment.

Pygo components

Still following the blueprints of AOSA and byterun, pygo is built on
the following types:

a VM (virtual machine) which manages the high-level structures (call stack
of frames, mapping of instructions to operations, etc…).
The VM is a slightly more complex version of the previous Interpreter
type from tiny-interp,

a Frame: every Frame value contains a code value and manages some state
(such as the global and local namespaces, a pointer to the calling Frame
and the last bytecode instruction executed),

a Function to model real Python functions: this is to correctly handle
the creation and destruction of Frames,

a Block to handle Python block management onto which control flow and loops
are mapped.

Virtual machine

Each value of a pygo.VM must store the call stack, the Python
exception state and the return values as they flow between frames:
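A possible shape for such a type is sketched below. This is an illustration under my own assumptions, not the real pygo.VM: the field names, the placeholder Frame and Value types and the pushFrame/popFrame helpers are all invented here.

```go
package main

import "fmt"

// Value is a placeholder for a Python value (hypothetical type).
type Value interface{}

// Frame is a placeholder; a real frame also carries code and namespaces.
type Frame struct {
	name string
}

// VM sketches the state described above.
type VM struct {
	frames []*Frame // call stack of frames
	frame  *Frame   // current frame (top of the call stack), or nil
	ret    Value    // return value flowing between frames
	exc    error    // last Python exception, if any
}

// pushFrame enters a new frame, which becomes the current one.
func (vm *VM) pushFrame(f *Frame) {
	vm.frames = append(vm.frames, f)
	vm.frame = f
}

// popFrame leaves the current frame and resumes the previous one.
func (vm *VM) popFrame() {
	vm.frames = vm.frames[:len(vm.frames)-1]
	if len(vm.frames) > 0 {
		vm.frame = vm.frames[len(vm.frames)-1]
	} else {
		vm.frame = nil
	}
}

func main() {
	vm := &VM{}
	vm.pushFrame(&Frame{name: "module"})
	vm.pushFrame(&Frame{name: "cond"})
	fmt.Println("current:", vm.frame.name)
	vm.popFrame()
	fmt.Println("current:", vm.frame.name)
}
```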

The astute reader will probably notice I have slightly departed from
AOSA’s python code.
In the book, each instruction is actually a 2-tuple (Opcode, Value).
Here, an instruction is just a stream of “integers”, being (implicitly) either
an Opcode or an operand.

The CPython interpreter is a stack machine.
Its instruction set reflects that implementation detail and thus,
our tiny interpreter implementation will have to cater for this aspect too:
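A stack of integers is enough for this toy. The sketch below is one straightforward way to write it on top of a slice. The exact definition used by tiny-interp may differ.

```go
package main

import "fmt"

// stack is a simple LIFO stack of integers backed by a slice.
type stack struct {
	data []int
}

// push adds a value on top of the stack.
func (s *stack) push(v int) {
	s.data = append(s.data, v)
}

// pop removes and returns the value on top of the stack.
// It panics if the stack is empty.
func (s *stack) pop() int {
	i := len(s.data) - 1
	v := s.data[i]
	s.data = s.data[:i]
	return v
}

func main() {
	var s stack
	s.push(1)
	s.push(2)
	fmt.Println(s.pop(), s.pop()) // last in, first out
}
```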

Now, the interpreter has to actually run the code, iterating over each
instruction, pushing/popping values to/from the stack, according to
the current instruction.
That’s done in the Run(code Code) method:
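Here is a compact sketch of such a Run method, written from the description above rather than copied from tiny-interp. The opcode set (OpLoadValue, OpAdd, OpPrint), the layout of Code and the sample program are assumptions made for this example. An opcode that takes an operand is simply followed by it in the stream.

```go
package main

import "fmt"

// Opcode identifies an operation of the toy instruction set.
type Opcode int

const (
	OpLoadValue Opcode = iota // push Numbers[operand] on the stack
	OpAdd                     // pop two values, push their sum
	OpPrint                   // pop one value and print it
)

// Code is a program: a flat stream of integers (opcodes and operands)
// plus the table of constants the operands index into.
type Code struct {
	Prog    []int
	Numbers []int
}

// Interpreter is a tiny stack machine.
type Interpreter struct {
	stack []int
}

func (vm *Interpreter) push(v int) { vm.stack = append(vm.stack, v) }

func (vm *Interpreter) pop() int {
	i := len(vm.stack) - 1
	v := vm.stack[i]
	vm.stack = vm.stack[:i]
	return v
}

// Run executes the instructions one after the other.
func (vm *Interpreter) Run(code Code) {
	for pc := 0; pc < len(code.Prog); pc++ {
		switch op := Opcode(code.Prog[pc]); op {
		case OpLoadValue:
			pc++ // the next integer is the operand
			vm.push(code.Numbers[code.Prog[pc]])
		case OpAdd:
			rhs := vm.pop()
			lhs := vm.pop()
			vm.push(lhs + rhs)
		case OpPrint:
			fmt.Println(vm.pop())
		default:
			panic(fmt.Sprintf("unknown opcode %d", op))
		}
	}
}

func main() {
	// Equivalent of: print(7 + 5)
	code := Code{
		Prog: []int{
			int(OpLoadValue), 0,
			int(OpLoadValue), 1,
			int(OpAdd),
			int(OpPrint),
		},
		Numbers: []int{7, 5},
	}
	var vm Interpreter
	vm.Run(code) // prints 12
}
```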

Variables

The AOSA article sharply notices that, even though this tiny-interp interpreter
is quite limited, its overall architecture and modus operandi are quite comparable
to how the real python interpreter works.

Save for variables.
tiny-interp doesn’t do variables.
Let’s fix that.

Consider this code fragment:

a = 1
b = 2
print(a+b)

tiny-interp needs to be modified so that:

values can be associated with names (variables), and

new Opcodes need to be added to describe these associations.

Under these new considerations, the above code fragment would be compiled
down to the following program:

The new opcodes OpStoreName and OpLoadName respectively store the current
value on the stack with some variable name (the index into the Names slice) and
load the value (push it on the stack) associated with the current variable.

The Interpreter now looks like:

type Interpreter struct {
	stack stack
	env   map[string]int
}

where env is the association of variable names with their current value.
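Putting the pieces together, a self-contained sketch of the variable-aware interpreter could look like this. It follows the description above but is my own reconstruction: the numeric opcode values, the slice-based stack and the exact layout of Code (Prog, Numbers, Names) are assumptions.

```go
package main

import "fmt"

type Opcode int

const (
	OpLoadValue Opcode = iota // push Numbers[operand]
	OpStoreName               // pop a value, bind it to Names[operand]
	OpLoadName                // push the value bound to Names[operand]
	OpAdd                     // pop two values, push their sum
	OpPrint                   // pop one value and print it
)

// Code holds the instruction stream, the constants and the variable names.
type Code struct {
	Prog    []int
	Numbers []int
	Names   []string
}

type Interpreter struct {
	stack []int
	env   map[string]int // variable name -> current value
}

func (vm *Interpreter) push(v int) { vm.stack = append(vm.stack, v) }

func (vm *Interpreter) pop() int {
	i := len(vm.stack) - 1
	v := vm.stack[i]
	vm.stack = vm.stack[:i]
	return v
}

func (vm *Interpreter) Run(code Code) {
	if vm.env == nil {
		vm.env = make(map[string]int)
	}
	for pc := 0; pc < len(code.Prog); pc++ {
		switch op := Opcode(code.Prog[pc]); op {
		case OpLoadValue:
			pc++
			vm.push(code.Numbers[code.Prog[pc]])
		case OpStoreName:
			pc++
			vm.env[code.Names[code.Prog[pc]]] = vm.pop()
		case OpLoadName:
			pc++
			vm.push(vm.env[code.Names[code.Prog[pc]]])
		case OpAdd:
			rhs := vm.pop()
			lhs := vm.pop()
			vm.push(lhs + rhs)
		case OpPrint:
			fmt.Println(vm.pop())
		default:
			panic(fmt.Sprintf("unknown opcode %d", op))
		}
	}
}

func main() {
	// a = 1; b = 2; print(a+b)
	code := Code{
		Prog: []int{
			int(OpLoadValue), 0, int(OpStoreName), 0,
			int(OpLoadValue), 1, int(OpStoreName), 1,
			int(OpLoadName), 0, int(OpLoadName), 1,
			int(OpAdd), int(OpPrint),
		},
		Numbers: []int{1, 2},
		Names:   []string{"a", "b"},
	}
	var vm Interpreter
	vm.Run(code) // prints 3
}
```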

07 Sep 2016

In this series of posts, I’ll try to explain how one can write an interpreter
in Go and for Go.
If, like me, you lack a bit of interpreter know-how, you should be
in for a treat.

Introduction

Go is starting to gain traction in the science and data science communities.
And, why not?
Go is fast to compile and run, is statically typed and thus presents a nice
“edit/compile/run” development cycle.
Moreover, a program written in Go is easily deployable and cross-compilable
on a variety of machines and operating systems.

Go is also starting to have the foundation libraries for scientific work:

And the data science community is bootstrapping itself around the gopherds
community (slack channel: #data-science).

For data science, a central tool and workflow is Jupyter and its
notebook.
The Jupyter notebook provides a nice “REPL”-based workflow and the ability
to share algorithms, plots and results.
The REPL (Read-Eval-Print-Loop) allows people to engage in fast exploratory
work on their data, quickly iterating over various algorithms or
different ways to interpret data.
For this kind of work, an interactive interpreter is paramount.

But Go is compiled and even if the compilation is lightning fast, a true
interpreter is needed to integrate well with a REPL-based workflow.

The go-interpreter project (also available
on Slack: #go-interpreter)
is starting to work on that: implement a Go interpreter, in Go and for Go.
The first step is to design this beast a bit: here.

Before going there, let’s do a little detour: writing a (toy) interpreter
in Go for Python.
Why? you ask…
Well, there is a very nice article in the AOSA series:
A Python interpreter written in Python.
I will use it as a guide to gain a bit of knowledge in writing interpreters.

PyGo: A (toy) Python interpreter

In the following, I’ll show how one can write a toy Python interpreter in Go.
But first, let me define exactly what pygo will do.
pygo won’t lex, parse or compile Python code.

No.
pygo will directly take the already compiled bytecode, produced with a
python3 program, and then interpret the bytecode instructions: