We are pleased to announce the release of a second release candidate for OPAM 2.0.0.

This new version brings us very close to a final 2.0.0 release and, in addition to many fixes, features big performance enhancements over RC1.

Among the new features, we have squeezed in full sandboxing of package commands for both Linux and macOS, to protect our users from any misbehaving scripts.

NOTE: if upgrading manually from 2.0.0~rc, you need to run opam init --reinit -ni to enable sandboxing.

The new release candidate also offers the possibility to set up a hook in your shell, so that you won’t need to run eval $(opam env) anymore. This is especially useful in combination with local switches: with the hook enabled, you are guaranteed that running make from a project directory containing a local switch will use it.

A better solution can be found by reading the manual section for simpl (!). It is possible to customize, for each definition, the amount of unfolding performed by simpl. For example, the following instructs simpl to never unfold Z.mul:

Arguments Z.mul : simpl never.

Since this directive only affects simpl, eval compute and PolTac both continue to work.

This article assumes familiarity with Dijkstra's shortest path algorithm. For a refresher, see [1]. The code assumes open Core is in effect and is online here.

The first part of the program organizes our thoughts about what we are setting out to compute. The signature summarizes the notion (for our purposes) of a graph definition in modular form. A module implementing this signature defines a type vertex_t for vertices, a type t for graphs, and a type extern_t: a representation of a t for interaction between an implementing module and its "outside world".

A realization of Graph_sig provides "conversion" functions of_adjacency/to_adjacency between the types extern_t and t and nests a module Dijkstra. The signature of the sub-module Dijkstra requires that concrete modules provide a type state and an implementation of Dijkstra's algorithm in terms of the function signature val dijkstra : vertex_t -> t -> [ `Ok of state | `Error of error ].
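Putting the pieces named so far together, the signature might look like the following hypothetical sketch; only vertex_t, t, extern_t, of_adjacency, to_adjacency, state, error and dijkstra are quoted in the text, the rest is an assumption:

```ocaml
(* Hypothetical reconstruction of Graph_sig from the prose above. *)
module type Graph_sig = sig
  type vertex_t                        (* vertices *)
  type t                               (* graphs *)
  type extern_t                        (* external representation of a t *)
  type error

  val of_adjacency : extern_t -> t
  val to_adjacency : t -> extern_t

  module Dijkstra : sig
    type state
    val dijkstra : vertex_t -> t -> [ `Ok of state | `Error of error ]
  end
end
```

Note how the Dijkstra sub-module can refer to the enclosing vertex_t and t directly, keeping the graph representation and the algorithm's state in one signature.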

For reusability, the strategy for implementing graphs will be generic programming via functors over modules implementing a vertex type.

An implementation of the module type GRAPH defines a module type VERT which is required to provide a comparable type t. It further defines a module type S that is exactly module type Graph_sig above. Lastly, modules of type GRAPH provide a functor Make that maps any module of type VERT to a new module of type S, fixing extern_t to an adjacency-list representation in terms of the native OCaml type 'a list, with float representing weights on edges.

As per the requirements of GRAPH, the module types VERT and S are provided, as is the functor Make. It is the code that is elided by the ... above in the definition of Make that is now the focus.

Modules produced by applications of Make satisfy S. This requires suitable definitions of the types vertex_t, t and extern_t. The modules Map and Set are available due to modules of type VERT being comparable in their type t.

While the external representation extern_t of graphs is chosen to be an adjacency list representation in terms of association lists, the internal representation t is a vertex map of adjacency lists providing logarithmic lookup complexity. The conversion functions between the two representations "come for free" via module Map.
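As an illustrative stdlib sketch (not the article's code, and using string vertices as an assumption), the two representations and their conversions might look like this:

```ocaml
(* Internal representation: a map from vertices to adjacency lists.
   Conversions to and from association lists fall out of the Map API. *)
module VMap = Map.Make (String)

type extern_t = (string * (string * float) list) list
type t = (string * float) list VMap.t

let of_adjacency (e : extern_t) : t =
  List.fold_left (fun m (v, adj) -> VMap.add v adj m) VMap.empty e

let to_adjacency (g : t) : extern_t =
  VMap.bindings g   (* bindings come back sorted by vertex *)
```

Lookup in the map version is logarithmic in the number of vertices, versus linear for the association list.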

At this point the "scaffolding" for Dijkstra's algorithm, that is, the part of GRAPH dealing with the representation of graphs, is implemented.

The interpretation of Dijkstra's algorithm we adopt is functional: the idea is that we loop over vertices, relaxing their edges, until all shortest paths are known. What we know on any recursive iteration of the loop is the current "state" (of the computation), and each iteration produces a new state. The next definition is the formal definition of the type state.

s : Set.t, the set S of nodes for which the lower bound of the shortest path weight is known;

v_s : (vertex_t * float) Heap.t, the set V - {S} of nodes of G for which the lower bound of the shortest path weight is not yet known, ordered on their estimates.

Function invocation init src g computes an initial state for the graph g containing the source node src. In the initial state, d is everywhere ∞ except for src which is 0. Set S (i.e. s) and the predecessor relation π (i.e. pred) are empty and the set V - {S} (i.e. v_s) contains all nodes.
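A hedged sketch of init following the prose (d, s, v_s, pred are the names used above; string vertices and a sorted association list standing in for Core's Heap are my assumptions):

```ocaml
module VSet = Set.Make (String)
module VMap = Map.Make (String)

type state = {
  s : VSet.t;                    (* settled vertices, the set S *)
  v_s : (string * float) list;   (* frontier V - {S}, ordered on estimates *)
  d : float VMap.t;              (* shortest-path estimates *)
  pred : string VMap.t;          (* predecessor relation *)
}

(* Build the initial state: d is infinity everywhere except at src. *)
let init (src : string) (g : 'a VMap.t) : state =
  let d = VMap.mapi (fun v _ -> if v = src then 0.0 else infinity) g in
  { s = VSet.empty;
    v_s = List.sort (fun (_, x) (_, y) -> compare x y) (VMap.bindings d);
    d;
    pred = VMap.empty }
```

Each iteration of the main loop would then extract the minimum of v_s, move it into s, and relax its outgoing edges to produce the next state.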

Jane Street is a proprietary quantitative trading firm, focusing primarily on trading equities and equity derivatives. We use innovative technology, a scientific approach, and a deep understanding of markets to stay successful in our highly competitive field. We operate around the clock and around the globe, employing over 500 people in offices in New York, London and Hong Kong.

The markets in which we trade change rapidly, but our intellectual approach changes faster still. Every day, we have new problems to solve and new theories to test. Our entrepreneurial culture is driven by our talented team of traders and programmers. At Jane Street, we don't come to work wanting to leave. We come to work excited to test new theories, have thought-provoking discussions, and maybe sneak in a game of ping-pong or two. Keeping our culture casual and our employees happy is of paramount importance to us.

We are looking to hire great software developers with an interest in functional programming. OCaml, a statically typed functional programming language with similarities to Haskell, Scheme, Erlang, F# and SML, is our language of choice. We've got the largest team of OCaml developers in any industrial setting, and probably the world's largest OCaml codebase. We use OCaml for running our entire business, supporting everything from research to systems administration to trading systems. If you're interested in seeing how functional programming plays out in the real world, there's no better place.

The atmosphere is informal and intellectual. There is a focus on education, and people learn about software and trading, both through formal classes and on the job. The work is challenging, and you get to see the practical impact of your efforts in quick and dramatic terms. Jane Street is also small enough that people have the freedom to get involved in many different areas of the business. Compensation is highly competitive, and there's a lot of room for growth.

Jane Street is a proprietary quantitative trading firm, focusing primarily on trading equities and equity derivatives. We use innovative technology, a scientific approach, and a deep understanding of markets to stay successful in our highly competitive field. We operate around the clock and around the globe, employing over 500 people in offices in New York, London and Hong Kong.

The markets in which we trade change rapidly, but our intellectual approach changes faster still. Every day, we have new problems to solve and new theories to test. Our entrepreneurial culture is driven by our talented team of traders and programmers. At Jane Street, we don't come to work wanting to leave. We come to work excited to test new theories, have thought-provoking discussions, and maybe sneak in a game of ping-pong or two. Keeping our culture casual and our employees happy is of paramount importance to us.

As Jane Street grows, the quality of the development tools we use matters more and more. We increasingly work on the OCaml compiler itself: adding useful language features, fine-tuning the type system and improving the performance of the generated code. Alongside this, we also work on the surrounding toolchain, developing new tools for profiling, debugging, documentation and build automation.

We're looking to hire a developer with experience working on compilers to join us. That experience might be from working on a production compiler in industry or from working on research compilers in an academic setting. No previous experience with OCaml or functional programming languages is required.

We’re looking for candidates for both our London and New York offices. Benefits and compensation are highly competitive.

You can get it from Alt-Ergo’s website. An OPAM package for it will be published in the next few days.

The major novelty of this release is a new experimental frontend that supports the SMT-LIB 2 language, extended with prenex polymorphism. This extension is implemented as a standalone library and is available here: https://github.com/Coquera/psmt2-frontend

The full list of CHANGES is available here. As usual, do not hesitate to report bugs, to ask questions, or to give your feedback!

In March 2018, I attended my first MirageOS hack retreat in Morocco.
MirageOS is a library operating system which allows everyone to build very small, specialized operating system kernels that are intended to run directly on the virtualization layer.
The application code itself is the guest operating system kernel, and can be deployed at scale without the need for an extra containerization step in between.
It is written in OCaml and each kernel is built only with exactly the code that is necessary for the particular application.
A pretty different approach from traditional operating systems. Linux feels huge all of a sudden.

I flew in from New York via Casablanca to Marrakesh, and then took a cab to the city center, to the main square, Jemaa El Fnaa.
At Cafe de France, Hannes was picking me up and we walked back through the labyrinth of the Medina to the hostel Riad "Priscilla" where we lived with about 20 MirageOS folks, two turtles and a dog.
We ate some food, and there were talks about Mirage's quickcheck-style fuzzing library Crowbar, and an API realized on top of a message queue written in OCaml.

Coming from compiler construction in Haskell and building "stateless" services for information retrieval in Scala, I have a good grasp of functional programming. The funny problem is I don't know much about OCaml yet.

At Etsy, I was part of the Core Platform team, where we first used hhvm (Facebook's HipHop virtual machine) on the API cluster, and then advocated using their gradually typed "hack" language to introduce typing to the gigantic PHP codebase at Etsy. Dan Miller and I added types to the codebase with Facebook's hackificator, but then
PHP 7 added the possibility of type annotations along with great speedups, and PHP's own static analyzer phan was developed by Rasmus Lerdorf and Andrew Morrison to work with PHP's types.
We abandoned the hackification approach.
Why is this interesting? These were my first encounters with OCaml! The hack typechecker is written in OCaml, and Dan and I read it to understand the gradual typing approach.
Also, we played with pfff, a tool written in OCaml that allows structural edits on PHP programs, based on the abstract syntax tree.
I made a list to translate between Haskell and OCaml syntax, and later Maggie Zhou and I used pfff to unify the syntax of several hundred endpoints in Etsy's internal API.

At the MirageOS retreat, I started my week reading "Real World OCaml", but got stuck because the examples did not work with the build system used in the book. Stephen helped me find a workaround; I made a PR to the book, but it was closed since the problem is temporary. I also started reading about OCaml's "lwt" library for concurrent programming. The abbreviation stands for lightweight threads, and the library provides a monadic way to do multithreading, really similar to Twitter futures in Scala. Asynchronous calls can be made in a thread, which then returns at some point when the call has succeeded or failed. We can do operations "inside" lwt with bind (>>=) in the same way we can flatMap over Futures in Scala. The library also provides ways to run multiple threads in sequence or in parallel, and to block and wait.
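A minimal Lwt sketch of that bind style (this assumes the lwt and lwt.unix opam packages; the sleep stands in for a real asynchronous call):

```ocaml
open Lwt.Infix

(* An "asynchronous" computation: sleep briefly, then produce a value. *)
let fetch () : string Lwt.t =
  Lwt_unix.sleep 0.01 >>= fun () ->
  Lwt.return "hello"

(* bind (>>=) sequences steps, much like flatMap on a Scala Future. *)
let () =
  let greeting = Lwt_main.run (fetch () >>= fun s -> Lwt.return (s ^ "!")) in
  print_endline greeting
```

Lwt_main.run blocks until the whole thread has resolved, which is how a program's main function waits for its Lwt computations.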
In the evening, there was a talk about a high-end smart card that, based on a private start value, can provide a succession of keys. The hardware is interesting: the size of a credit card, it has a small keypad and a screen. Some banks already use these cards (for their TAN system?), and we all got a sample card to play with.

One day I went swimming with Lix and Reynir, which was quite the adventure since the swimming pool was closed and we were not sure what to do. We still made it to the part that was still open, swam a lot, and then got a cake for Hannes' birthday, which led to a cake overflow, since there were already multiple cakes and an awesome party with candles, food and live music. :D Thanks everyone for organizing!! Happy birthday Hannes!

I started reading another book, "OCaml from the very beginning", and working through it with Kugg. This book was more focused on algorithms and the language itself than on tooling and libraries, and the exercises were really fun to solve. Fire up OCaml's REPL utop and go! :D

At the same time I started reading the code for solo5 to get an understanding of the underlying hypervisor abstraction layer and the backends we compile to. This code is really a pleasure to read.
It is called solo5 because of MirageOS's system calls, initially a set of 5 calls to the hypervisor, called hypercalls which sounds really futuristic. :D

So that's the other fun problem: I don't know too much about kernel programming yet. I did the Eudyptula (Linux kernel) challenge, an email-based challenge that sends you programming quests to learn about kernel programming.
Over the course of the challenge, I've made my own Linux kernel module that says "Hello world!" but I have not built anything serious yet.

The next things I learned were configuring and compiling a MirageOS unikernel. Hannes showed me how this works.
The config system is powerful and can be tailored to the unikernel we are about to build, via a config file.
After configuring the build, we can build the kernel for a target backend of our choice. I started out with compiling to Unix, which means all network calls go through unix pipes and the unikernel runs as a simple unix binary in my host system, which is really useful for testing.

The next way to run MirageOS that I tried was running it in ukvm. For this setup you have to change the networking approach so that you can talk from the host system to your unikernel inside ukvm. On Linux you can use the Tun/Tap loopback interface to wire up this connection.

We had a session with Jeremie about our vision for MirageOS which was super fun, and very interesting because people have all kinds of different backgrounds but the goals are still very aligned.

Another thing I learned was how to look at network traffic with wireshark. Sidney and I had previously recorded a TLS handshake with tcpdump and looked at the binary data in the pcap file with "hexfiend" next to Wikipedia to decode what we saw.
Derpeter gave me a nice introduction about how to do this with wireshark, which knows about most protocols already and will do the decoding of the fields for us. We talked about all layers of the usual stack, other kinds of internet protocols, the iptables flow, and bgp / peeringDB. Quite interesting and I feel I have a pretty good foundational understanding about how the internet actually works now.

During the last days I wanted to write a unikernel that does something new, and I thought about monitoring, as there is no monitoring for MirageOS yet. I set up Grafana on my computer and sent some simple data packets to it from a unikernel, producing little peaks in a test graph. Reynir and I played with this a bit and restructured the program.

After this, the week was over and I walked back to Jemaa el Fnaa with Jeremie. I feel I learned a ton and yet am still at the very beginning, excited about what to build next. On the way back I got stuck in a weird hotel in Casablanca due to my flight being cancelled, where I bumped into a Moroccan wedding and met some awesome travelling women from Brazil and the US who had also got stuck. All in all a fun adventure!

On March 18th 2018, after more than three years, IPredator, the lender of the Bitcoins, repurposed the 10 Bitcoins for other projects. Initially, we thought that the Piñata would maybe run for a month or two, but IPredator, David, and I decided to keep it running. The update of the Piñata's bounty is a good opportunity to reflect on the project.

The 10 Bitcoins in the Piñata fluctuated in price over time; at their peak they were worth 165,000 €.

From the start of the Piñata project, we published the source code, the virtual machine image, and the versions of the used libraries in a git repository. Everybody could develop their exploits locally before launching them against our Piñata. The Piñata provides TLS endpoints, which require private keys and certificates. These are generated by the Piñata at startup, and the secret for the Bitcoin wallet is provided as a command line argument.

Initially the Piñata was deployed on a Linux/Xen machine, later it was migrated to a FreeBSD host using BHyve and VirtIO with solo5, and in December 2017 it was migrated to native BHyve (using ukvm-bin and solo5). We also changed the Piñata code to accommodate updates, such as the MirageOS 3.0 release, and the discontinuation of floating point numbers for timestamps (asn1-combinators 0.2.0, x509 0.6.0, tls 0.9.0).

Most bug bounty programs require communication via forms and long wait times for
human experts to evaluate the potential bug. This evaluation is subjective,
opaque, and often requires signing a non-disclosure agreement (NDA) even
before the evaluation starts.

Our Piñata automates these parts, getting rid of wait times and NDAs. To get
the private wallet key that holds the bounty, you need to successfully establish
an authenticated TLS session, or find a flaw elsewhere in the stack that allows
reading arbitrary memory. Everyone can track transactions on the blockchain and
see whether the wallet still contains the bounty.

Of course, the Piñata can't prove that our stack is secure, and it can't prove
that the key to the wallet is actually inside. But trust us, it is!

Observations

I still remember vividly the first nights in February 2015, being so nervous that I woke up every two hours and checked the blockchain. Did the Piñata still have the Bitcoins? I was familiar with the code of the Piñata and was afraid there might be a bug that would allow bypassing authentication or leaking the private key. So far, this doesn't seem to be the case.

We analysed the access logs to the Piñata and bucketed them into website traffic and bounty connections. We are still wondering what happened in July 2015 and July 2017, where the graph shows spikes. Could it be a presentation mentioning the Piñata, a new automated tool that tests for TLS vulnerabilities, or an increase in the market price of Bitcoin?

The cumulative graph shows more than 500,000 accesses to the Piñata website, and more than 150,000 attempts at connecting to the Piñata bounty.

You can short-circuit the client and server Piñata endpoint and observe the private wallet key being transferred on your computer, TLS encrypted with the secret exchanged by client and server, using socat -x TCP:ownme.ipredator.se:10000 TCP:ownme.ipredator.se:10002.

If you attempted to exploit the Piñata, please let us know what you tried! Via

Since the start of 2018 we have been developing robust software and systems at robur. If you like our work and want to support us with donations or development contracts, please get in touch at team@robur.io. Robur is a project of the German non-profit Center for the Cultivation of Technology. Donations to robur are tax-deductible in Europe.

The piñata will establish TLS connections only with endpoints presenting a certificate signed by its own, undisclosed certificate authority, but allows an attacker to easily listen to the encrypted traffic. The piñata always sends the same plaintext in such a connection: the private key to a wallet containing approximately 10 bitcoin. If the attacker can decrypt the ciphertext, or trick the piñata into negotiating a TLS connection with another host and disclosing the key, the information (and therefore the money) is theirs.

Crowbar

Crowbar is a library for writing tests. It combines a property-based API (like QuickCheck) with a coverage-driven generator of test cases (like the fuzzer American Fuzzy Lop). Crowbar tries to find counterexamples to stated properties by prioritizing the generation of test cases which touch more code. It is very good at finding counterexamples.

Testing ocaml-x509

TLS connections are usually authenticated via X509 certificates. ocaml-tls uses ocaml-x509 for this purpose, which is written as a standalone library. There is a clear separation of concerns between ocaml-x509 and ocaml-tls, and a straightforward API for certificate operations in ocaml-x509; both features help tremendously in writing tests for certificate handling.

Stating Tests

Of the possible operations in X509, the most interesting in the context of the BTC piñata are those related to certificate validation. We expect the piñata to check whether a certificate provided by the attacker has a trust chain to any CA it is aware of. This suggests a property we might want to find a counterexample to:
* certificates signed by CA m will not be judged valid unless CA m is provided as a trust anchor.

If we can find a counterexample, we’re well on the way to getting the piñata to negotiate a TLS connection with an endpoint presenting a certificate not signed by the CA it generated at boot time.
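The real tests generate certificates via the x509 API, but the shape of a Crowbar test can be shown with a generic property (this assumes the crowbar opam package; the list-reversal property is just a stand-in for the certificate property above):

```ocaml
(* A Crowbar test: a list of generators plus a function checking a property.
   Crowbar searches for counterexamples, guided by coverage when run under
   afl-fuzz. *)
let () =
  Crowbar.(add_test ~name:"reverse is an involution" [ list int ]
    (fun l -> check_eq (List.rev (List.rev l)) l))
```

For the certificate property, the generators would produce certificate chains and trust anchors, and the checked function would assert that verify_chain_of_trust rejects chains not rooted in the provided anchor.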

Generating Certificates

In order to test this property, we need to be able to call X509.Validation.verify_chain_of_trust with something of type X509.CA.t (representing the certificate authority; for the piñata, this is a known and trusted value) and something of type X509.t (representing the certificate; for the piñata, this is presented by the remote host of the TLS connection).

For our tests, we’ll allow Crowbar’s random generators to specify most parts of the certificate, with the important exception of the key material – generating this randomly will cause the execution of each test to be very slow, and it’s not the goal of this testing to try to brute-force the key of our test CA.

The generators for certificates are largely automatically created by ppx_deriving_crowbar, although some manual help is needed to include generators for data types in dependencies used by X509 (and generators for their dependencies and dependencies’ dependencies). A maximalist interpretation can be seen in this iteration of the tests, which uses ppx_deriving to automatically generate equality tests and pretty-printers as well as generators for many of the relevant types.

Increasing Stability

Since we want all randomness in the test execution to be driven by the fuzzer, it’s important to remove other sources of entropy in the code execution. Luckily, ocaml-x509 and the cryptography library underneath it, nocrypto, were written considerately, and there is a facility for providing a constant seed to the pseudorandom number generator. Using the same seed across all test runs removes noise from the measurement of coverage between different test runs.

Running Tests

Tests built with the Crowbar library need to be run via afl-fuzz. To automatically launch afl-fuzz in a manner that uses all available computing resources and reports failures as quickly as possible, we used ocaml-bun. You can see the results of such a test run in Travis CI here.

The number of executions per second is appallingly low for OCaml native code running in afl-fuzz (cryptography is hard!); to compensate I ran this code over a weekend locally instead of briefly on whatever free resources Travis would give me.

As we hoped, certificates signed by the wrong CA with these sets of extensions don't validate. ocaml-x509 isn't tricked by arbitrarily ridiculous sets of extensions in a signed certificate, and I didn't manage to steal any bitcoin.

Future and Related Work

None of this work targets ocaml-tls or any of the more general parts of the stack run by the piñata. Notably, neither tcpip which provides the TCP, IPv4, and Ethernet implementations, nor any of the hypervisor-specific virtual network devices, are examined by this work. (This includes mirage-net-xen, the originator of the only MirageOS security advisory to date.) With releases of all tooling used to test ocaml-x509 available via opam, this is easier to rectify than it previously was!

Acknowledgements

Thanks to OCaml Labs for funding this work, IPredator for stuffing the piñata, and robur.io for future and continuing work in building resilient systems.

The final release of Coq 8.8.0 is
available. It features better performance, tactic improvements, many
enhancements for universe users, a new Export modifier for setting options,
support for goal selectors in front of focusing brackets, and a new experimental
-mangle-names option for linting proof scripts.
Feedback and bug reports are extremely welcome.

As we are preparing to work on the Tezos Protocol, we’re still actively keeping up the pace on the block explorer TZScan.io, adding cool information for baking accounts. We’d like to allow people to see who is contributing to the network and to understand the distribution of rolls, rights, etc.

For starters, we are showing the roll balance used for baking in the current cycle and the rolls history of a baker.

O(1) Labs is a small, well-funded startup aiming to develop the first cryptocurrency protocol that can deliver on the promise of supporting real-world applications and widespread use. Our team is based in San Francisco, and we are funded by top investors (including Polychain, Metastable, and Naval Ravikant).

Cryptocurrency is a domain where correctness really counts. As such, a cornerstone of our approach is a focus on building reliable software through the use of statically-typed functional programming languages. This is reflected in our OCaml codebase and style of structuring code around DSLs, as well as in the design of the smart-contracts platform we’re developing.

There is no need to have prior experience in cryptography, and we're hiring engineers to work on a bunch of exciting projects including:

The design of a virtual machine and higher-level languages for smart contracts (there are a lot of interesting challenges here, since the VM has to be efficient inside SNARKs).

Working on the core networking, cryptography, and reliability aspects of the protocol.

This is a chance to join a small, collaborative team and have a ton of independence while working on fascinating cross-disciplinary problems in computing. We also offer competitive compensation both in salary and equity, as well as top-of-the-market benefits.

Ideal candidates will be interested in any of:

functional programming

programming language theory

security

distributed systems

cryptography

cryptocurrency

There are no hard requirements and we’re more interested in learning about your individual background.

We are committed to building a diverse, inclusive company. People of color, LGBTQ individuals, women, and people with disabilities are strongly encouraged to apply.

On some occasions, using the Coq proof assistant stops resembling a normal software development activity and becomes more similar to puzzle solving.

Similarly to the excellent video games of the Zachtronics studio (TIS-100, SpaceChem, …), the system provides you with puzzles where obstacles have to be side-stepped using a fair amount of tricks and ingenuity, and finally solving the problem has often no other utility than the satisfaction of having completed it.

In this blog post, I would like to present what I think is one such situation: what the puzzle is, how we solved it, and why you probably shouldn’t do that if you like spending your time in a useful manner.

Prelude

A few months ago, I was wondering if it was possible to count the number of exists in front of the goal, using Ltac (the tactic language of Coq). That is, write a tactic that would for example produce 3 on the following goal:

====================================
exists x y z : nat, x + y = z

“That’s easy”, I thought. “I will just write a recursive function in Ltac”. First, we check that we can get the body of the exists by matching on the goal – this seems to work as expected:

At this point, the reasonable choice would be to give up, write specialized versions of the tactic for 0 to 7 exists (who has goals with more than 7 exists anyway?), and wait for one of the successors of Ltac (Ltac2 maybe?) to come up without these limitations. Unreasonably, I sought the help of Cyprien Mangin, my local puzzle game and Coq expert, and we came up with the following solution.

A first idea: “destruct”ing the goal

A first idea: one thing that we can do iteratively on an exists x y z ... term is splitting it using destruct – if we have it as a hypothesis. For example:
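A hedged reconstruction of such a destruct-based counter, written in continuation-passing style with a continuation named kont, might look like this:

```coq
(* Hedged reconstruction: count the [exists] in hypothesis [H] and
   pass the result to the continuation [kont]. *)
Ltac count_ex H kont :=
  lazymatch type of H with
  | ex _ =>
      let x := fresh "x" in
      destruct H as [x H];
      count_ex H ltac:(fun n => kont constr:(S n))
  | _ => kont constr:(O)
  end.

Goal (exists x y z : nat, x + y = z) -> True.
  intro H.
  count_ex H ltac:(fun n => idtac "count:" n).  (* should print 3 *)
Abort.
```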

Notice we had to use the “standard trick” of writing the tactic in continuation-passing style (using kont here as a continuation to return the number of exists). This is required since an Ltac tactic cannot perform side effects on the goal (here, destruct) and return a term at the same time.

Now, we want to count the number of exists in the goal, not in a hypothesis. How could we turn the goal into a hypothesis? After all, these exists are something we need to provide, not something we get. In fact, it is possible to get a sub-goal with a hypothesis of the same type as the goal – simply, this sub-goal won’t be really relevant for proving the goal. We define the following helper lemma:
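A definition matching this description might be the following (a reconstruction, not necessarily the post's exact lemma; its first argument is discarded, mirroring the description below):

```coq
(* Reconstructed helper lemma: the (P -> Q) proof is thrown away. *)
Lemma helper_lemma (P Q : Prop) : (P -> Q) -> Q -> Q.
Proof. intros _ H. exact H. Qed.
```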

Applying this lemma produces an extra sub-goal with a hypothesis of type P. Notice how the proof term corresponding to this sub-goal is completely discarded in the definition of helper_lemma: this sub-goal is only relevant for our Ltac tricks. To get a sub-goal which allows destructing the exists in front of the goal, we do:
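A sketch of this step, assuming the reconstructed signature helper_lemma : forall P Q, (P -> Q) -> Q -> Q:

```coq
(* Apply helper_lemma with P := Q := the goal itself, then introduce
   the hypothesis in the dummy sub-goal. *)
Goal exists x y : nat, x = y.
  match goal with
  | |- ?G => apply (helper_lemma G G); [ intro H | ]
  end.
  (* The first sub-goal now has [H : exists x y : nat, x = y] in its
     context, ready to be destructed; the second is the original goal. *)
Abort.
```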

Second idea: communicating through evars

We are not done yet: we can count the number of exists in the first – dummy – sub-goal, but we need to transmit this information to the main sub-goal.

The second idea is to propagate this information using an “evar”. An evar is a Coq term representing a “hole”: its definition is not known yet, and will be given later in the proof. This discipline only exists when constructing the proof: evars do not appear in the proof term, where everything happens in order.

The idea here is to introduce an evar before applying our auxiliary helper_lemma. This evar will appear in the context of both sub-goals introduced by the lemma: in the first one, we can “instantiate” it (i.e. set its definition) with the number of exists computed, and then use it in the second sub-goal.
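As a minimal illustration of the mechanism (not the post's actual tactic), an evar can be instantiated in one sub-goal and its value is then visible everywhere it occurs:

```coq
(* Minimal illustration: instantiate an evar from a side sub-goal. *)
Goal True.
  evar (n : nat).                  (* introduces a hole [?n : nat] *)
  assert (H : n = 2) by (unfold n; reflexivity).
  (* unification in the side goal instantiated [?n] with [2] *)
  exact I.
Qed.
```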

Third idea: cleaning up

The tactic above indeed works, and successfully counts the number of exists in the goal. However, it is still a bit messy. In particular, the trick of using a helper lemma shows up in the proof term. Using Show Proof after running count_in_ty on the goal yields:

This is mostly noise! Indeed, helper_lemma P Q H1 H2 is equivalent to simply H2 – we only use the lemma for our Ltac tricks, and ideally, this should not appear in the final proof term. We can do better. The third idea is to isolate the messy proof term containing helper_lemma, and simplify it after it has been produced by counting the exists. Isolating the proof term can be achieved with the following pattern:

let n := constr:(ltac:(mytactic) : nat) in ...

constr:() and ltac:() are both quotations: the first one indicates that its contents must be parsed as a Coq term, and the second one that it must be parsed as a tactic. Their combination above indicates that we want to produce a term (of type nat), and to run the tactic mytactic to produce it.

mytactic will run on a goal of type nat, and the proof term it produces by proving this goal will become the definition of n. Let us replace mytactic by our counting tactic (its continuation being simply exact to prove the nat goal using the result of the count):

This time of the year is, just like Christmas time, a time for laughs and magic… although the magic we are talking about, in the OCaml community, is not exactly nice, nor beautiful. Let’s say that we are somehow akin to many religions: we know magic does exist, but that it is satanic and shouldn’t be introduced to children.

Introducing Just The Right Time (JTRT)

Let me first introduce you to the concept of ‘Just The Right Time’ [1]. JTRT is somehow a ‘Just In Time’ compiler, but one that runs at the right time, not at some random moment decided by a contrived heuristic.

How does the compiler know when that specific good moment occurs? Well, it doesn’t, and that’s the point: you certainly know far better. In the OCaml world, we like good performance, like anyone else, but we prefer predictable performance to performance that is sometimes awesome and sometimes really slow. And we are ready to trade a few annotations for better predictability (or is it just me trying to give the impression that my opinion is everyone’s opinion…). Don’t forget that OCaml is a compiled language, hence the average generated code is good enough. Runtime compilation only matters in some subtle situations where a pattern gets repeated a lot, and you don’t know about that pattern before receiving some inputs.

Of course the tradeoff wouldn’t be the same in Javascript if you had to write something like that to get your code to perform decently.

The magical ‘this_is_the_right_time’ function

There are already nice tools for doing that in OCaml. In particular, you should look at MetaOCaml, an extension of the language that has been maintained for years. But it requires you to think a bit about what your program is doing, and to add a few types here and there.

Fortunately, today is the day you may want to try this ugly weekend hack instead.

To add a bit of context, let’s say there are 1/ the Dirty Little Tricks, and 2/ the Other Kind of Ugly Hacks. We are presenting one of the latter; the kind of hacks for which you are both ashamed and a bit proud (but you should really be a lot more ashamed). I’ve made quite a few of those, and this one would probably rank well among the top 5 (and I’m deeply sorry about the other ones that are still in production somewhere…).

The hack is composed of two parts: a small compiler patch and a runtime library. That library exposes only the following single function:
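The signature is not reproduced here; from the description it is presumably something along these lines (the type is my guess):

```
(* Hypothetical interface of the runtime library: hand it a closure,
   get back a recompiled, specialized version of it. *)
val this_is_the_right_time : ('a -> 'b) -> ('a -> 'b)
```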

That’s all. By stating that this is the right time, you told the compiler to take that function and do its magic on it.

How the f**k does that work?!

The compiler patch is quite simple: it annotates every function with enough information about itself – namely, its representation in the Flambda IR. This is just a partial dump of the compiler’s memory state when transforming the Flambda IR to Clambda. I tried to do it in a more ‘disciplined’ way (using some magic to traverse the compiler’s internal memory representation and create a static version of it in the binary), but ‘ld’ was not so happy linking a ~500MB binary. So I went the ‘marshal’ way.

This now means that at runtime the program can find the representation of the closures. To give an example of the kind of code you really shouldn’t write, here is the magic invocation to retrieve that representation:
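The actual invocation is not shown here, but its flavor is unsafe surgery on the closure block with Obj. A hedged sketch (the assumption that the patched compiler stashes the marshalled IR in the last field is mine):

```ocaml
(* Hedged sketch of the kind of code you really shouldn't write:
   peek inside a closure block with Obj. *)
let representation_of (f : 'a -> 'b) : Obj.t =
  let repr = Obj.repr f in
  assert (Obj.tag repr = Obj.closure_tag);
  (* code pointer(s) come first; hypothetically, the IR dump sits in
     the last field of the block *)
  Obj.field repr (Obj.size repr - 1)
```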

With that, we now know the layout of the closure and we can extract all the variables that it binds. We can further inspect the values of those bound variables, and build an IR representation for them. That’s the nice thing about having an untyped IR: you can produce some even when you have lost the types. It will probably be quite wrong, but who cares…

Now that we know everything about our closure, we can rebuild it, and so we will. As we can’t statically build a non-closed function (the flambda IR comes after closure conversion), we will instead build a closed function that allocates the closure for us. For our example, it would look like this:
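Something like this sketch (the name is mine; the post later reveals that the specialized function is fun x -> 33 + x):

```ocaml
(* Hypothetical shape of the rebuilt code: a *closed* function that
   allocates the specialized closure. *)
let build_closure () = fun x -> 33 + x
```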

The user writes some expression, which gets parsed to Add (Const 11, Add (Var, Const 22)); it goes through optimization and results in Add (Const 33, Var). Then you decide that this looks like the right time.
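The running example might be sketched like this (the type and the constant-folding pass are my reconstruction, not the post's code):

```ocaml
(* A tiny expression language with a constant-folding pass. *)
type expr = Var | Const of int | Add of expr * expr

let rec optimize (e : expr) : expr =
  match e with
  | Var | Const _ -> e
  | Add (l, r) ->
      match optimize l, optimize r with
      | Const a, Const b -> Const (a + b)
      (* fold a constant into a neighbouring constant-headed Add *)
      | Const a, Add (Const b, e') | Add (Const b, e'), Const a ->
          Add (Const (a + b), e')
      (* normalize: constant on the left *)
      | Const a, e' | e', Const a -> Add (Const a, e')
      | l', r' -> Add (l', r')
```

On the post's input this yields Add (Const 33, Var), the expression that specialization turns into fun x -> 33 + x.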

Annnnd… nothing happens. The reason being that there is no way to distinguish between mutable and immutable values at runtime, hence the safe assumption is to assume that everything is mutable, which limits optimizations a lot. So let’s enable the ‘special’ mode:

incorrect_mode := true

And MAGIC happens! The code that gets spat out is exactly what we want (that is, fun x -> 33 + x).

Conclusion

Just so that you know, I don’t really recommend using it. It’s buggy, and many details are left unresolved (I suspect that the names you would come up with for those details would often sound like ‘segfault’). Flambda was not designed to be used that way. In particular, there are some invariants that must be maintained, like the uniqueness of variables and functions… which we completely disregarded. That led to some ‘funny’ behaviors (like ‘power 2 8’ returning 512…). It is possible to do this correctly, but that would require far more than a few hours’ hacking. It might be a lot easier with the upcoming version of Flambda.

So this is far from ready, and it’s not going to be anytime soon (supposing that this is a good idea, which I’m still not convinced it is).

The first beta release of Coq 8.8 is available for testing. It features better performance, tactic improvements, many enhancements for universe users, a new Export modifier for setting options, support for goal selectors in front of focusing brackets, and a new experimental -mangle-names option for linting proof scripts.
Feedback and bug reports are extremely welcome.

A new release of Alt-Ergo (version 2.1.0) is available on Alt-Ergo’s website: https://alt-ergo.ocamlpro.com/#releases. An OPAM package for it will be published soon.
In this release, we mainly improved the CDCL-based SAT solver to get performance similar to, or better than, the old Tableaux-like SAT solver. The CDCL solver is now the default Boolean reasoner. The full list of CHANGES is available here.
Despite our various tests, you may still encounter some issues with this new solver. Please, don’t hesitate to report bugs, ask questions, and give your feedback!

Version 8.7.2 of Coq is available. It fixes a critical bug in the VM handling of universes. This bug affected all releases since 8.5.

Other changes include improved support for building with OCaml 4.06.0 and the external num package, many other bug fixes, documentation improvements, and user message improvements (for details, see the 8.7.2 milestone).

OCamlPro is proud to release a first version of TzScan (http://tzscan.io), its Tezos block explorer to ease the use of the Tezos network.

What TzScan can do for you:

– Several charts on blocks, operations, network, volumes, fees, and more,
– Marketcap and Futures/IOU prices from coinmarket.com,
– Blocks, operations, accounts and contracts detail pages,
– Public API to get information about blocks, operations, accounts and more,
– Documentation on different concepts of Tezos like Endorsements, Nonces, etc.

With TzScan, we tried to show the Tezos network differently, to give a better understanding of what is really going on, by highlighting the main points of Proof of Stake. Further enhancements and optimizations are to come, but meanwhile, enjoy and play with our explorer.

If you have suggestions or bugs, please send us reports at contact@tzscan.io !

As a tradition, we took part in this year’s Journées Francophones des Langages Applicatifs (JFLA 2018) that was chaired by LRI’s Sylvie Boldo and hosted in Banyuls the last week of January. That was a nice opportunity to present a live demo of a multisignature smart-contract entirely written in the #Liquidity language designed at OCamlPro, and deployed live on the Tezos alphanet (the slides are now available, see at the end of the post).

Tezos is the only blockchain to use a strongly typed, functional language, with a formal semantics and an interpreter validated by the use of GADTs (generalized algebraic data types). This stack-based language, named Michelson, is somewhat tricky to use as-is, the absence of variables (among other things) necessitating direct manipulation of the stack. For this reason, we have been developing, since June 2017, a higher level language, Liquidity, implementing the type system of Michelson in a subset of OCaml.

In addition to the compiler, which compiles Liquidity programs to Michelson ones, we have developed a decompiler which, from Michelson code, can recover a Liquidity version that is much easier for humans to read and understand. This tool is of some significance considering that contracts will be stored on the blockchain in Michelson format: it makes them more approachable and understandable for end users.

To facilitate designing contracts and foster Liquidity adoption, we have also developed a web application. This app offers somewhat bare-bones editors for Liquidity and Michelson, and allows compiling directly in the browser, deploying Liquidity contracts, and interacting with them (on the Tezos alphanet).

This blog post presents these different tools in more detail.

Michelson

Michelson is a stack-based, functional, statically and strongly typed language. It comes with a set of built-in base types like strings, Booleans, unbounded integers and naturals, lists, pairs, option types, union (of two) types, sets, and maps. There are also a number of domain-specific types like amounts (in tezzies), cryptographic keys and signatures, dates, etc. A Michelson program consists of a structured sequence of instructions, each of which operates on the stack. The program takes as inputs a parameter as well as a storage, and returns a result and a new value for the storage. Programs can fail at runtime with the instruction FAIL, or with another error (call of a failing contract, out of gas, etc.), but most instructions that could fail return an option instead (e.g. EDIV returns None when dividing by zero). The following example is a smart contract which implements a voting system on the blockchain. The storage consists of a map from possible votes (as strings) to integers counting the number of votes. A transaction to this contract must be made with an amount (accessible with the instruction AMOUNT) greater than or equal to 5 tezzies, and a parameter which is a valid vote. If one of these conditions is not respected, the execution, and thus the transaction, fails. Otherwise the program retrieves the previous number of votes in the storage and increments it. At the end of the execution, the stack contains the pair composed of the value Unit and the updated map (the new storage).

Typing a Michelson program is done by propagating types, not à la Milner. Polymorphic types are forbidden, and type annotations are required when a type is ambiguous (e.g. for the empty list).

Functions (lambdas) are pure and are not closures, i.e. they must have an empty environment. For instance, a function passed to another contract as parameter acts in a purely functional way, only accessing the environment of the new contract.

Method calls are performed with the instruction TRANSFER_TOKENS: it requires an empty stack (not counting its arguments). It takes as argument the current storage, saves it before the call is made, and finally returns it after the call together with the result. This forces developers to save anything worth saving in the current storage, while keeping in mind that a reentrant call can happen (the returned storage might be different).

We won’t explain the semantics of Michelson here; a good one, in big-step form, is available here.

The Liquidity Language

Liquidity is also a functional, statically and strongly typed language that compiles down to the stack-based language Michelson. Its syntax is a subset of OCaml and its semantics is given by its compilation schema. By choosing to stay close to Michelson in spirit while offering higher level constructs, Liquidity makes it easy to write legible smart contracts with the same safety guarantees offered by Michelson. In particular, we decided that it was important to keep the purely functional aspect of the language, so that simply reading a contract is not obscured by effects and global state. In addition, the OCaml syntax makes Liquidity immediately accessible to programmers who already know OCaml, while its limited scope keeps the learning curve gentle.

The following example is a Liquidity version of the vote contract. Its inner workings are rather obvious to anyone who has already programmed in an ML-like language.
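A hedged reconstruction of that contract, assembled from the description that follows (the exact primitives and the version number are my guesses and may differ from the post's Liquidity):

```
[%%version 0.3]

type votes = (string, int) map

let%init storage (myname : string) =
  Map.add myname 0 (Map [ "ocaml", 0; "pro", 0 ])

let%entry main (parameter : string) (storage : votes) =
  let amount = Current.amount () in
  if amount < 5.00tz then failwith "Not enough money, at least 5tz to vote"
  else
    match Map.find parameter storage with
    | None -> failwith "Bad vote"
    | Some x ->
        let storage = Map.add parameter (x + 1) storage in
        ( (), storage )
```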

A Liquidity contract starts with an optional version meta-information. The compiler can reject the program if it is written in too old a version of the language, or if the compiler itself is not recent enough. Then comes a set of type and function definitions. It is also possible to specify an initial storage (a constant, or a non-constant storage initializer) with let%init storage. Here we define a type abbreviation votes for a map from strings to integers. It is the structure that we will use to store our vote counts.

The storage initializer creates a map containing two bindings, "ocaml" to 0 and "pro" to 0 to which we add another vote option depending on the argument myname given at deploy time.

The entry point of the program is a function main defined with a special annotation let%entry. It takes as arguments a call parameter (parameter) and a storage (storage) and returns a pair whose first element is the result of the call, and second element is a potentially modified storage.

The above program defines a local variable amount which contains the amount of the transaction which generated the call. It checks that it is greater than 5 tezzies. If not, we fail with an explanatory message. Then the program retrieves the number of votes for the chosen option given as parameter. If the vote is not a valid one (i.e., there is no binding in the map), execution fails. Otherwise, the current number of votes is bound to the name x. Storage is updated by incrementing the number of votes for the chosen option. The built-in function Map.add adds a new binding (here, it replaces a previously existing binding) and returns the modified map. The program terminates, in the normal case, on its last expression which is its returned value (a pair containing () — the contract only modifies the storage — and the storage itself).

Compilation

Encodings

Because Liquidity is a lot richer than Michelson, some types and constructs must be simplified or encoded. Record types are translated to right-associated pairs with as many components as the record has fields. t1 is encoded as t1' in the following example.
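An illustration of this encoding (the field names are my invention, since the post's example is not reproduced here):

```ocaml
(* A record type and its encoding as right-associated pairs. *)
type t1 = { a : int; b : string; c : bool }
type t1' = int * (string * bool)

(* an access like [r.b] becomes a tuple access on the encoded value *)
let b_of (p : t1') = fst (snd p)
```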

Field accesses in a record are translated to accesses in the corresponding tuples (pairs). Sum (or union) types are translated using the built-in variant type (this is the or type in Michelson). t2 is encoded as t2' in the following example.
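An illustration of this encoding (constructor names are my invention), with Michelson's binary or type modelled as a two-constructor variant:

```ocaml
(* The binary variant, i.e. Michelson's [or] type. *)
type ('a, 'b) variant = Left of 'a | Right of 'b

(* A three-constructor sum type and its nested binary encoding. *)
type t2 = A of int | B of string | C of bool
type t2' = (int, (string, bool) variant) variant

let encode : t2 -> t2' = function
  | A n -> Left n
  | B s -> Right (Left s)
  | C b -> Right (Right b)
```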

Liquidity also supports closures while Michelson only allows pure lambdas. Closures are translated by lambda-lifting, i.e. encoded as pairs whose first element is a lambda and second element is the closure environment. The resulting lambda takes as argument a pair composed of the closure’s argument and environment. Adequate transformations are also performed for built-in functions that take lambdas as arguments (e.g. in List.map) to allow closures instead.
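A minimal sketch of this lambda-lifting (names are mine): the closure becomes a pair of a pure lambda and its environment, and the lambda takes an (argument, environment) pair.

```ocaml
(* A function that closes over [k]... *)
let make_adder k = fun x -> x + k

(* ...is encoded as a (pure lambda, environment) pair. *)
let make_adder' k = ((fun (x, env) -> x + env), k)

(* Applying an encoded closure pairs the argument with the environment. *)
let apply_closure (f, env) x = f (x, env)
```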

Compilation schema

This little section is a bit more technical, so if you don’t care how Liquidity is compiled precisely, you can skip over to the next one.

We write Γ, [|x|]d ⊢ X ↑ t for the compilation of the Liquidity instruction x in environment Γ. Γ is a map associating variable names to positions in the stack. The compilation algorithm also maintains the size of the current stack (at the compilation of instruction x), denoted by d in the previous expression. Below is a non-deterministic version of the compilation schema; the one implemented in the Liquidity compiler is a determinized version.

The result of compiling x is a Michelson instruction (or sequence of instructions) X together with a Boolean transfer information t. The instruction Contract.call (or TRANSFER_TOKENS in Michelson) needs an empty stack to evaluate, so the compiler empties the stack before translating this call. However, the various branches of a Michelson program must have the same stack type. This is why we need to maintain this information so that the compiler can empty stacks in some parts of the program.

Some of the rules have parts annotated with ?b. This suffix denotes a potential reset or erasing. In particular:

For sets, Γ?b is ∅ if b evaluates to false, and Γ otherwise.

For integers, d?b is 0 if b evaluates to false, and d otherwise.

For instructions, (X)?b is {} if b evaluates to false, and X otherwise.

For instance, by looking at rule CONST, we can see that compiling a Liquidity constant simply consists in pushing this constant onto the stack. To handle variables in a simple manner, the rule VAR tells us to look in the environment Γ for the index associated with the variable we want to compile. Then, the instruction D(U)iP (i.e. DUU…UP with i occurrences of U) puts at the top of the stack a copy of the element present at depth i. Variables are added to Γ with the Liquidity instruction let ... in, or with any instruction that binds a new symbol, like fun for instance.

Decompilation from Michelson

While Michelson programs are high level compared to other bytecodes, it remains difficult for a blockchain end-user to understand what a Michelson program does exactly by looking at it. However, following the idea that “code is law”, a user should be able to read a contract and understand its precise semantics. Thus, we have developed a decompiler from Michelson to Liquidity, which recovers a much more readable and understandable representation of a program on the blockchain.

The decompilation of Michelson code follows the diagram below where:

Cleaning consists in simplifying the Michelson code to accelerate the whole process and ease the following tasks. For now it consists in erasing instructions whose continuation is a failure.

Symbolic Execution consists in executing the Michelson program with symbolic inputs, replacing every value placed on the stack by a node containing the instruction that generated it. Each node of this graph can be seen as an expression of the target program, which can be bound to a variable name. Edges to this node represent future occurrences of this variable.

Decompilation consists in transforming the graph generated by the previous step into a Liquidity syntax tree. Names for variables are recovered from annotations produced by the Liquidity compiler (in case we decompile a Michelson program that was generated from Liquidity), or are chosen on the fly when no annotation is present (e.g. if the Michelson program was written by hand).

Finally the program is typed (to ensure no mistakes were made), simplified and pretty printed.

This example illustrates some of the difficulties of the decompilation process: Liquidity is a purely functional language where each construction is an expression returning a value, while Michelson handles the stack directly, which is impossible to render literally in Liquidity (values in the stack don’t have the same type, as opposed to values in a list). In this example, depending on the value of parameter, the contract returns either the content of the storage, or the integer 1. In the Michelson code, the programmer used the instruction IF, but its branches do not return a value and only operate by modifying (or not) the stack.

The above translation to Liquidity also contains an if, but it has to return a value. The graph below is the result of the symbolic execution phase on the Michelson program. The IF instruction is decomposed in several nodes, but does not contain any remaining instruction: the result of this if is in fact the difference between the stack resulting from the execution of the then branch and from the else branch. It is denoted by the node N_IF_END_RESULT 0 (if there were multiple of these nodes with different indexes, the result of the if would have been a tuple, corresponding to the multiple changes in the stack).

Try-Liquidity

The first thing to do (if you want to deploy and interact with a contract) is to go into the settings menu. There you can set your Tezos private key (use one that you generated for the alphanet for the moment) or the source (i.e. your public key hash, which is derived from your private key if you set it).

You can also change which Tezos node you want to interact with (the first one should do, but you can also set one of your choosing, such as one running locally on your machine). The timestamp shown next to the node name indicates how long ago the last block it knows of was produced. Transactions that you make on a node that is not synchronized will not be included in the main chain.

You should now see your account with its balance in the top bar:

In the main editor window, you can select a few Liquidity example contracts or write your own. For this small tutorial, we will select multisig.liq which is a minimal multi-signature wallet contract. It allows anyone to send money to it, but it requires a certain number of predefined owners to agree before making a withdrawal.

Clicking on the button Compile should make the editor blink green (when there are no errors) and the compiled Michelson will appear on the editor on the right.

Let’s now deploy this contract on the Tezos alphanet. By going into the Deploy (or paper airplane icon) tab, we can choose our set of owners for the multisig contract and the minimum number of owners to be in agreement before a withdrawal can proceed. Here I put down two account addresses for which I possess the private keys, and I want the two owners to agree before any transaction is approved (2p is the natural number 2).

Then I can either forge the deployment operation, which can then be signed offline and injected into the Tezos chain by other means, or I can directly deploy this contract (if the private key is set in settings). If deployment is successful, we can see both the deployment operation and the new contract on a block explorer by clicking on the provided links.

Now we can query the blockchain to examine our newly deployed contract. Head over to the Examine tab. The address field should already be filled with our contract handle. We just have to click on Retrieve balance and storage.

The contract has 3tz on its balance because we chose to initialize it this way. On the right is the current storage of the contract (in Liquidity syntax) which is a record with four fields. Notice that the actions field is an empty map.

Let’s make a few calls to this contract. Head over to the Call tab and fill in the parameter and the amount. We can send for instance 5.00tz with the parameter Pay. Clicking on the button Call generates a transaction which we can observe on a block explorer. More importantly, if we go back to the Examine tab, we can now retrieve the new information and see that the storage is unchanged but the balance is 8.00tz.

We can also make a call to withdraw money from the contract. This is done by passing a parameter of the form:
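The exact parameter is not reproduced here; based on the description that follows, it is presumably something like this (the constructor and field names are my guesses, the address is the one from the post):

```
Withdraw { destination = tz1brR6c9PY3SSfBDu7Qxdhsz3pvNRDwf68a;
           amount = 2.00tz }
```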

This is a proposition of transfer of funds in the amount of 2.00tz from the contract to the destination tz1brR6c9PY3SSfBDu7Qxdhsz3pvNRDwf68a.

The balance of the contract has not changed (it is still 8.00tz) but the storage has been modified. That is because this multisig contract requires two owners to agree before proceeding. The proposition is stored in the map actions and is associated to the owner who made said proposition.

We can now open a new browser tab and point it to http://liquidity-lang.org/edit, but this time we fill in the private key for the second owner tz1XT2pgiSRWQqjHv5cefW7oacdaXmCVTKrU. We choose the multisig contract in the Liquidity editor and fill in the contract address in the call tab with the same one as in the other session, TZ1XvTpoSUeP9zZeCNWvnkc4FzuUighQj918 (you can double check that the icons for the two contracts are identical). For the withdrawal to proceed, this owner has to make the exact same proposition, so let’s make a call with the same parameter:

The call should also succeed. When we examine the contract, we can now see that its balance is down to 6.00tz and that the field actions of its storage has been reset to the empty map. In addition, we can update the balance of our first account (by clicking on the circle arrow icon in the top bar) to see that it is now up an extra 2.00tz, as it was the destination of the proposed (and agreed on) withdrawal. All is well!

We have seen how to compile, deploy, call and examine Liquidity contracts on the Tezos alphanet using our online editor. Experiment with your own contracts and let us know how that works for you!

Now that the winter holiday break is over, we are starting to see the results of winter hacking among our community.

The first great hack for 2018 is from Sadiq Jaffer, who got OCaml booting on a tiny and relatively new CPU architecture -- the Espressif ESP32. After proudly demonstrating a battery-powered version to the folks at OCaml Labs, he then proceeded to clean it up enough that it can be built with a Dockerfile, so that others can start to do a native code port and get bindings to the networking interface working.

We also noticed that another OCaml generic virtual machine for even smaller microcontrollers has shown up on GitHub. This, combined with some functional metaprogramming, may mean that 2018 is the year of OCaml in all the tiny embedded things...

Since 2017 is just over, now is probably the best time to review what happened during this hectic year at OCamlPro… Here are our big 2017 achievements, in the world of blockchains (the Liquidity smart contract language, Tezos and the Tezos ICO, etc.), of OCaml (with OPAM 2, flambda 2, etc.), and of formal methods (Alt-Ergo, etc.)

In the World of Blockchains

The Liquidity Language for smart contracts

OCamlPro develops Liquidity, a high-level smart contract language for Tezos. Liquidity is a human-readable, purely functional, statically-typed language whose syntax is very close to OCaml's. Programs can be compiled to the stack-based language (Michelson) of the Tezos blockchain.

To garner interest and adoption, we developed an online editor called “Try Liquidity“. Smart-contract developers can design contracts interactively, directly in the browser, compile them to Michelson, run them and deploy them on the alphanet network of Tezos.

Future plans include a full-fledged web-based IDE for Liquidity. Worth mentioning is a neat feature of Liquidity: decompiling a Michelson program back to its Liquidity version, whether or not it was generated from Liquidity code. In practice, this makes it easy to read somewhat obfuscated contracts already deployed on the blockchain.

Tezos and the Tezos ICO

* Work of Grégoire Henry, Benjamin Canou, Çagdas Bozman, Alain Mebsout, Michael Laporte, Mohamed Iguernlala, Guillem Rieu, Vincent Bernardoff (for DLS) and at times all the OCamlPro team in animated and joyful brainstorms.

Since 2014, the OCamlPro team had been co-designing the Tezos prototype with Arthur Breitman based on Arthur’s White Paper, and had undertaken the implementation of the Tezos node and client, a technical and design achievement we have been proud of. In 2017, we developed the infrastructure for the Tezos ICO (Initial Coin Offering) from the ground up, encompassing the web app (back-end and front-end), the Ethereum and Bitcoin (p2sh) multi-signature contracts, as well as the hardware-Ledger-based process for transferring funds. The ICO, conducted in collaboration with Arthur, was a resounding success: the equivalent of 230 million dollars (in ETH and BTC) at the time was raised for the Tezos Foundation!

This work was allowed thanks to Arthur Breitman and DLS’s funding.

In the World of OCaml

Towards OPAM 2.0, the OCaml Package manager

OPAM was born at Inria/OCamlPro with Frederic, Thomas and Louis, and is still maintained here at OCamlPro. Now, thanks to Louis Gesbert’s thorough efforts and the OCaml Labs contribution, OPAM 2.0 is coming!

* opam is now compiled with a built-in solver, improving portability, ease of access and user experience (`aspcud` is no longer a requirement)

* new workflows for developers have been designed, including convenient ways to test and install local sources, and more reliable ways to share development setups

* the general system has seen a large number of robustness and expressivity improvements, such as extended dependencies

* it also provides better caching, and many hooks enabling, among others, setups with sandboxed builds, binary artifact caching, or end-to-end package signature verification.

Flambda Compilation

* Work of Pierre Chambart, Vincent Laviron

Pierre and Vincent’s considerable work on Flambda 2 (the optimizing intermediate representation of the OCaml compiler – on which inlining occurs), in close cooperation with JaneStreet’s team (Mark, Leo and Xavier) aims at overcoming some of flambda’s limitations. This huge refactoring will help make OCaml code more maintainable, improving its theoretical grounds. Internal types are clearer, more concise, and possible control flow transformations are more flexible. Overall a precious outcome for industrial users.

This work is allowed thanks to JaneStreet’s funding.

OCaml for ia64-HPUX

In 2017, OCamlPro also worked on porting OCaml to HPUX-ia64. This came from a request of CryptoSense, a French startup working on an OCaml tool to secure cryptographic protocols. OCaml had a port to Linux-ia64, deprecated before 4.00.0, and, even earlier, a port to HPUX, but not to ia64. So, we expected that the easiest part would be getting the bytecode version running, and the hardest part getting access to an HPUX-ia64 computer: it was quite the opposite. HPUX is an awkward system where most tools (shell, make, etc.) have uncommon behaviors, which made even compiling a bytecode version difficult; on the contrary, it was actually easy to get access to a low-power HPUX-ia64 virtual machine on a monthly basis. We also found a few bugs in the former OCaml ia64 backend, mostly caused by the scheduler, since ia64 uses explicit instruction parallelism. Debugging such code was quite a challenge, as instructions were often re-ordered and interleaved. Finally, after a few weeks of work, we got both the bytecode and native code versions running, with only a few limitations.

This work was mandated by CryptoSense.

The style-checker Typerex-lint

* Work of Çagdas Bozman, Michael Laporte and Clément Dluzniewski.

In 2017, typerex-lint was improved and extended. Typerex-lint is a style-checker that analyzes the sources of OCaml programs and can be extended using plugins. It makes it possible to automatically check the conformance of a code base to a set of coding rules. We added analyses to look for code that doesn't comply with the recommendations made by the SecurOCaml project members. We also added an interactive web output that provides an easy way to navigate typerex-lint results.

Build systems and tools

* Work of Fabrice Le Fessant

Every year in the OCaml world, a new build tool appears. 2017 was no different, with the rise of jbuild/dune. jbuild came with some very nice features, some of which were already in our home-made build tool, ocp-build, like the ability to build multiple packages at once in a composable way; others were new, like the ability to build multiple versions of a package in one run, or the wrapping of libraries using module aliases. We have started to incorporate some of these features into ocp-build. Nevertheless, from our point of view, the two tools belong to two different families: jbuild/dune belongs to the “implicit” family, like ocamlbuild and oasis, with minimal project description; ocp-build belongs to the “explicit” family, like make and omake. We prefer the explicit family, because the build file acts as a description of the project, an entry point to understand the project and its modules. We have also kept working on improving the project description language for ocp-build, something that we think is of utmost importance. Latest release: ocp-build 1.99.20-beta.

Other contributions and software

OCaml bugfixes by Pierre Chambart, Vincent Laviron, and other members of the team.

The ocp-analyzer prototype by Vincent Laviron

In the World of Formal Methods

Alt-Ergo

* By Mohamed Iguernlala

For Alt-Ergo, 2017 was the year of floating-point arithmetic reasoning. Indeed, in addition to the publication of our results at the 29th International Conference on Computer Aided Verification (CAV), Jul 2017, we polished the prototype we started in 2016 and integrated it into the main branch. This is a joint work with Sylvain Conchon (Paris-Saclay University) and Guillaume Melquiond (Inria Lab) in the context of the SOPRANO ANR Project. Another big piece of work in 2017 consisted of investigating a better integration of an efficient CDCL-based SAT solver in Alt-Ergo. In fact, although modern CDCL SAT solvers are very fast, their interaction with the decision procedures and quantifier instantiation should be finely tuned to get good results in the context of Satisfiability Modulo Theories. This new solver should be integrated into Alt-Ergo in the next few weeks. This work has been done in the context of the LCHIP FUI Project.

We also released a new major version of Alt-Ergo (2.0.0) with a modification in the licensing scheme. Alt-Ergo@OCamlPro’s development repository is now made public. This will allow users to get updates and bugfixes as soon as possible.

Towards a formalized type system for OCaml

OCaml is known for its rich type system and strong type inference; unfortunately, such a complex type engine is prone to errors, and it can be hard to get a clear idea of how typing works for some features of the language. For 3 years now, OCamlPro has been working on formalizing a subset of this type system and implementing a type checker derived from this formalization. The idea behind this work is to help the compiler developers ensure some form of correctness of the inference. This type checker takes a Typedtree, the intermediate representation resulting from the inference, and checks its consistency. Put differently, this tool checks that each annotated node of the Typedtree can indeed be given such a type according to the context, its form and its sub-expressions. In practice, we were able to catch some known bugs resulting from unsound programs that were accepted by the compiler.

This type checker is only available for OCaml 4.02 for the moment, and the document describing this formalized type system will be available shortly in a PhD thesis, by Pierrick Couderc.

A few hints about what’s ahead for OCamlPro

2018

At the end of 2017, I resigned from my PostDoc position at University of
Cambridge (in the rems project). Early
December 2017 I organised the 4th MirageOS hack
retreat, with which I'm
very satisfied. In March 2018 the 5th retreat will
happen (please sign up!).

In 2018 I moved to Berlin and started to work for the (non-profit) Center for
the cultivation of technology with our
robur.io project "At robur, we build performant bespoke
minimal operating systems for high-assurance services". robur is only possible
thanks to generous donations in autumn 2017, enthusiastic collaborators, supportive
friends, and a motivated community; thanks to all. We will receive funding from
the prototypefund to work on a
CalDAV server implementation in OCaml
targeting MirageOS. We're still looking for donations and further funding,
please get in touch. Apart from CalDAV, I want to start the year by finishing
several projects which I discovered on my hard drive. This includes DNS, opam
signing, TCP, ... . My personal goal for 2018 is to develop a
flexible mirage deploy, because after configuring and building a unikernel, I
want to get it smoothly up and running (spoiler: I already use
albatross in production).

To kick off (3% of 2018 is already used) this year, I'll talk in more detail
about µDNS, an opinionated from-scratch
re-engineered DNS library, which I've been using since Christmas 2017 in production for
ns.nqsb.io and
ns.robur.io. The
development started in March 2017, and continued over several evenings and long
weekends. My initial motivation was to implement a recursive resolver to run on
my laptop. I had a working prototype in use on my laptop for over 4 months in the
summer of 2017, but that code was not in good shape, so I went down the rabbit
hole and (re)wrote a server (and learned more about GADTs). A configurable
resolver usually needs a server as a local overlay anyway. Furthermore,
dynamic updates are standardised, and thus a configuration interface exists
inside the protocol, even with hmac signatures for authentication!
Coincidentally, I started to solve another issue, namely automated management of let's
encrypt certificates (see this
branch for an
initial hack). On my journey, I also reported a cache poisoning vulnerability,
which was fixed in Docker for
Windows.

But let's get started with some content. Please keep in mind that while the
code is publicly available, it is not yet released (mainly because test
coverage is not yet high enough and documentation is lacking). I appreciate early
adopters, please let me know if you find any issues or find a use case which is
not straightforward to solve. This won't be the last article about DNS this
year - persistent storage, resolver, let's encrypt support are still missing.

What is DNS?

The domain name system is a core Internet
protocol, which translates domain names to IP addresses. A domain name is
easier to memorise for human beings than an IP address. DNS is hierarchical and
decentralised. It was initially "specified" in Nov 1987 in RFC
1034 and RFC
1035. Nowadays it spans more than 20
technical RFCs, 10 security-related ones, 5 best current practices and another 10
informational documents. The basic encoding and mechanisms did not change.

On the Internet, there is a set of root servers (administered by IANA) which
provide the information about which name servers are authoritative for which top level
domain (such as ".com"). They provide the information about which name servers are
responsible for which second level domain name (such as "example.com"), and so
on. There are at least two name servers for each domain name in separate
networks - in case one is unavailable the other can be reached.

The building blocks for DNS are: the resolver, either a stub (gethostbyname provided
by your C library) or a caching forwarding resolver (at your ISP), which sends DNS
packets to another resolver, or a recursive resolver which, once seeded with the
root servers, finds out the IP address of a requested domain name. The other
part are authoritative servers, which reply to requests for their configured
domain.

To introduce some terminology: a DNS client sends a query, consisting of a domain
name and a query type, and expects a set of answers, which are called resource
records, and contain: name, time to live, type, and data. The resolver
iteratively requests resource records from authoritative servers, until the requested
domain name is resolved or fails (name does not exist, server
failure, server offline).
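To make the terminology concrete, here is a minimal sketch of queries and resource records as OCaml types; the type and field names are illustrative, not the actual µDNS definitions:

```ocaml
(* Illustrative sketch only: the actual µDNS types are richer. *)
type qtype = A | NS | CNAME | SOA | MX

type question = {
  q_name : string;  (* the queried domain name, e.g. "example.com" *)
  q_type : qtype;
}

type rr = {
  name : string;  (* owner name *)
  ttl : int32;    (* time to live, in seconds *)
  typ : qtype;
  data : string;  (* type-specific payload *)
}

let example_query = { q_name = "example.com"; q_type = A }
```

A resolver can then be thought of as a function from a `question` to a set of `rr` values, iterating over authoritative servers until it succeeds or fails.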

DNS usually uses UDP as transport, which is not reliable and limited to a 512 byte
payload on the Internet (due to various middleboxes). DNS can also be
transported via TCP, and even via TLS over UDP or TCP. If a DNS packet
transferred via UDP is larger than 512 bytes, it is cut at the 512 byte mark,
and a bit in its header is set. The receiver can decide whether to use the 512
bytes of information, or to throw it away and attempt a TCP connection.

DNS packet

The packet encoding starts with a 16bit identifier followed by a 16bit header
(containing operation, flags, status code), and four counters, each 16bit,
specifying the amount of resource records in the body: questions, answers,
authority records, and additional records. The header starts with a one-bit
operation (query or response), a four-bit opcode, various flags (recursion,
authoritative, truncation, ...), and the last four bits encode the response code.
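As a sketch of the layout just described, the 16-bit flag word can be picked apart with shifts and masks; the function names here are illustrative:

```ocaml
(* Hedged sketch: extracting fields from the 16-bit DNS header word.
   Most significant bit first: QR (1 bit), opcode (4 bits), various
   flags (7 bits), rcode (4 bits). *)
let qr hdr = (hdr lsr 15) land 1        (* 0 = query, 1 = response *)
let opcode hdr = (hdr lsr 11) land 0xf
let rcode hdr = hdr land 0xf

(* a response (QR = 1) with opcode 0 (standard query) and rcode 3 *)
let h = 0b1000_0000_0000_0011
```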

A question consists of a domain name, a query type, and a query class. A
resource record additionally contains a 32bit time to live, a length, and the
data.

Each domain name is a case sensitive string of up to 255 bytes, separated by .
into labels of up to 63 bytes each. A label is either encoded by its length
followed by the content, or by an offset to the start of a label in the current
DNS frame (poor man's compression). Care must be taken during decoding to avoid
cycles in offsets. Common operations on domain names, such as equality,
ordering, and checking whether one domain name is a subdomain of another,
should be efficient. My initial representation was naïvely a list of
strings; now it is an array of strings in reverse order. This speeds up common
operations by a factor of 5 (see test/bench.ml).
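The reversed-order representation can be sketched as follows; this is a toy version (case handling and validation elided), not the library's actual domain name module:

```ocaml
(* "foo.example.com" is stored as [| "com"; "example"; "foo" |], so a
   subdomain check becomes a simple prefix test on arrays. *)
let of_string s =
  Array.of_list (List.rev (String.split_on_char '.' s))

(* is [sub] equal to or below [dom]? *)
let is_subdomain ~sub ~dom =
  let ld = Array.length dom in
  Array.length sub >= ld
  && (let rec eq i = i = ld || (String.equal sub.(i) dom.(i) && eq (i + 1)) in
      eq 0)
```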

The only really used class is IN (for Internet), as mentioned in RFC
6895. Various query types (MD, MF,
MB, MG, MR, NULL, AFSDB, ...) are barely or never used. There is no
need to convolute the implementation and its API with these legacy options (if
you have a use case and see those in the wild, please tell me).

My implemented packet decoding does decompression, only allows valid internet
domain names, and may return a partial parse - to use as many resource records
in truncated packets as possible. No exceptions are raised; the parsing
uses monadic-style error handling. Since label decompression requires the
parser to know absolute offsets, the original buffer and the offset are manually
passed around at all times, instead of using smaller views on the buffer. The
decoder does not allow gaps: if the outer resource data length specifies a
byte count which is not completely consumed by the specific resource data
subparser (an A record must always consume exactly four bytes), decoding fails.
Failing to check this can open a way to exfiltrate data without getting noticed.
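The "no gaps" rule can be sketched for the A record case; the error type and function names below are mine, not the actual µDNS API:

```ocaml
(* Hedged sketch: the subparser must consume exactly the advertised
   rdlength, otherwise decoding fails instead of silently skipping. *)
type error = LeftOver | Partial

(* An A record must consume exactly four bytes. *)
let parse_a buf off len =
  if len <> 4 then Error LeftOver               (* gap or overrun: reject *)
  else if String.length buf < off + 4 then Error Partial
  else
    Ok (Printf.sprintf "%d.%d.%d.%d"
          (Char.code buf.[off]) (Char.code buf.[off + 1])
          (Char.code buf.[off + 2]) (Char.code buf.[off + 3]))
```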

Each zone (a served domain name) contains a SOA "start of authority" entry,
which includes the primary nameserver name, the hostmaster's email address (both
encoded as domain name), a serial number of the zone, a refresh, retry, expiry,
and minimum interval (all encoded as 32bit unsigned numbers in seconds). Common
resource records include A, whose payload is a 32bit IPv4 address. A nameserver
(NS) record carries a domain name as payload. A mail exchange (MX) record
carries a 16bit priority and a domain name. A CNAME record is an alias to
another domain name. These days, there are even certificate authority
authorisation (CAA) records, containing a flag (critical), a tag ("issue")
and a value ("letsencrypt.org").
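As an illustration, the SOA fields listed above map naturally onto an OCaml record; the field names and example values below are mine:

```ocaml
(* Illustrative record for a SOA entry; not the library's definition. *)
type soa = {
  nameserver : string;  (* primary nameserver, encoded as a domain name *)
  hostmaster : string;  (* email address, also encoded as a domain name *)
  serial : int32;       (* zone serial number *)
  refresh : int32;      (* intervals are 32 bit unsigned, in seconds *)
  retry : int32;
  expiry : int32;
  minimum : int32;
}

let example_soa = {
  nameserver = "ns.example.com"; hostmaster = "hostmaster.example.com";
  serial = 1l; refresh = 86400l; retry = 7200l;
  expiry = 1048576l; minimum = 300l;
}
```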

Server

The operation of a DNS server is to listen for a request and serve a reply.
Data to be served can be canonically encoded (the RFC describes the format) in a
zone file. Apart from insecurity in DNS server implementations, another attack
vector is amplification: an attacker crafts a small UDP frame
with a fake source IP address, and the server answers with a large response to
that address, which may lead to a DoS attack. Various mitigations exist,
including rate limiting, serving large replies only via TCP, ...

Internally, the zone file data is stored in a tree (module Dns_trie),
where each node contains two maps: sub, whose keys are labels and values are
subtrees, and dns_map (module Dns_map), whose keys are resource record types
and values are the resource records. Both use the OCaml
Map ("also known
as finite maps or dictionaries, given a total ordering function over the
keys. All operations over maps are purely applicative (no side-effects). The
implementation uses balanced binary trees, and therefore searching and insertion
take time logarithmic in the size of the map").
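The two-map tree can be sketched as follows, with the per-node record collection collapsed to a single string for brevity (the real Dns_trie stores a Dns_map per node, and its API differs):

```ocaml
(* Minimal sketch: lookup walks labels root-first through nested maps. *)
module M = Map.Make (String)

type node = { data : string option; sub : node M.t }

let empty = { data = None; sub = M.empty }

let rec insert labels v t = match labels with
  | [] -> { t with data = Some v }
  | l :: rest ->
    let child = match M.find_opt l t.sub with Some c -> c | None -> empty in
    { t with sub = M.add l (insert rest v child) t.sub }

let rec lookup labels t = match labels with
  | [] -> t.data
  | l :: rest ->
    (match M.find_opt l t.sub with
     | None -> None
     | Some c -> lookup rest c)

(* labels in reverse order, root first: "ns.example.com" *)
let t = insert [ "com"; "example"; "ns" ] "10.0.0.1" empty
```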

The server looks up the queried name, and in the returned Dns_map the queried
type. The found resource records are sent as answer, which also includes the
question and authority information (NS records of the zone) and additional glue
records (IP addresses of names mentioned earlier in the same zone).

Dns_map

Dns_map is the data structure which contains resource record types as keys, and
collections of matching resource records as values. In OCaml the value type of a
map must be homogeneous: using a normal sum type leads to an unnecessary
unpacking step (or to lacking type information).
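A GADT solves this: the key carries the type of its value, so lookups return precisely typed data without an unpacking step. The sketch below is illustrative only; the actual Dns_map has many more keys and richer value types:

```ocaml
(* Hedged sketch of the GADT approach: the key's type parameter is the
   type of the associated value. *)
type _ key =
  | A : int32 list key     (* IPv4 addresses, as 32 bit numbers *)
  | Ns : string list key   (* nameserver names *)
  | Cname : string key     (* alias target *)

(* a binding packs a key together with a value of exactly its type *)
type binding = B : 'a key * 'a -> binding

(* matching on the key refines the value's type: in the Cname branch
   the compiler knows [v : string] *)
let get_cname (B (k, v)) : string option = match k with
  | Cname -> Some v
  | _ -> None
```

The real module exposes a lookup along the lines of `find : 'a key -> t -> 'a option`, so the caller gets back a value already at the right type.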

This helps me to programmatically retrieve tightly typed values from the cache,
which is important when code depends on concrete values (e.g. when there are
domain names, look these up as well and add them as additional records). Look
into server/dns_server.ml for details.

Dynamic updates, notifications, and authentication

Dynamic updates specify in-protocol
record updates (supported for example by nsupdate from ISC bind-tools),
notifications are used by primary servers
to notify secondary servers about updates, which then initiate a zone
transfer to retrieve up to date
data. Shared hmac secrets are used to
ensure that the transaction (update, zone transfer) was authorised. These are
all protocol extensions; there is no need to use out-of-protocol solutions.

The server logic for update and zone transfer frames is slightly more complex,
and includes a dependency upon an authenticator (implemented using the
nocrypto library, and
ptime).

Deployment and Let's Encrypt

To deploy servers without much persistent data, an authentication schema is
hardcoded in the dns-server: shared secrets are also stored as DNS entries
(DNSKEY), and _transfer.zone, _update.zone, and _key-management.zone names
are introduced to encode the permissions. A _transfer key also needs to
encode the IP address of the primary (to know where to request zone transfers)
and secondary IP (to know where to send notifications).
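The permission encoding can be sketched as a suffix check on the key name; this is a hypothetical illustration of the convention described above, not the actual implementation (function names are mine):

```ocaml
(* Hypothetical sketch: a key named "_transfer.<zone>" grants
   zone-transfer rights for <zone>; similarly for "_update.<zone>". *)
let ends_with ~suffix s =
  let ls = String.length s and lsuf = String.length suffix in
  ls >= lsuf && String.sub s (ls - lsuf) lsuf = suffix

let permits ~operation ~key_name ~zone =
  ends_with ~suffix:("_" ^ operation ^ "." ^ zone) key_name
```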

Please have a look at
ns.robur.io and the examples for more details. The shared secrets are provided as boot parameter of the unikernel.

I hacked maker's
ocaml-letsencrypt
library to use µDNS and to send update frames to the given IP address. I
have already used this to have letsencrypt issue various certificates for my domains.

There is no persistent storage of updates yet, but this can be realised by
implementing a secondary (which is notified on update) that writes every new
zone to persistent storage (e.g. disk
or git). I also plan to have an
automated Let's Encrypt certificate unikernel which listens for certificate
signing requests and stores signed certificates in DNS. Luckily the year only
started and there's plenty of time left.

A post like this, standing in the doorway of a new year, invites some thoughts about the future as well as the past. I don’t have time for deep thoughts on this (my old cert expires in 18 minutes as of this writing, and I’d like to deploy this post before then), but I hope that in 2018, we can keep taking our time and attention away from those who would profit off of us, and continue making a better world.

MirageOS Winter Hack Retreat, Marrakesh 2017

This winter, 33 people from around the world gathered in Marrakesh for a Mirage hack retreat. This is fast becoming a MirageOS tradition, and we're a little sad that it's over already! We've collected some trip reports from those who attended the 2017 winter hack retreat, and we'd like to thank our amazing hosts, organisers and everyone who took the time to write up their experiences.

We, the MirageOS community, strongly believe in using our own software: this website has been a unikernel since day one^W^W it was possible to run MirageOS unikernels. In Marrakesh we used our own DHCP and DNS server without trouble. There are many more services under heavy development (including git, ssh, ...), which we're looking forward to using soon ourselves.

A huge fraction of the Solo5 contributors gathered in Marrakesh as well and discussed the future, including terminology, the project scope, and outlined a roadmap for merging branches in various states. Adrian from the Muen project joined the discussion, and in the aftermath they are now running their website using MirageOS on top of the Muen separation kernel.

A complete list of fixes and discussions is not available; please bear with us if we forgot anything above. A sneak preview: there will be another retreat in March 2018 in Marrakesh. The following are texts written by individual participants about their experience.

Mindy Preston

I came to Marrakesh for the hack retreat with one goal in mind: documentation. I was very pleased to discover that Martin Keegan had come with the same goal in mind and fresher eyes, and so I had some time to relax, enjoy Priscilla and the sun, photograph some cats, and chat about projects both past and future. In particular, I was really pleased that there's continued interest in building on some of the projects I've worked on at previous hack retreats.

I was lucky to have a lot of discussions about fuzzing in OCaml, some of which inspired further work and suggestions on some current problems in Crowbar. (Special thanks to gasche and armael for their help there!) I'm also grateful to aantron for some discussions on ppx_bisect motivated by an attempt to estimate coverage for this testing workflow. I was prodded into trying to get Crowbar ready to release by these conversations, and wrote a lot of docstrings and an actual README for the project.

tg, hannes, halfdan, samoht, and several others (sorry if I missed you!) worked hard to get some unikernel infrastructure up and running at Priscilla, including homegrown DHCP and DNS services, self-hosted pastebin and etherpad, an FTP server for blazing-fast local filesharing, and (maybe most importantly!) a local opam mirror. I hope that in future hack retreats, we can set up a local git server using the OCaml git implementation, which got some major improvements during the hack retreat thanks to dinosaure (from the other side of the world!) and samoht.

Finally, the qubes-mirage-firewall got a lot of attention this hack retreat. (The firewall itself incorporates some past hack retreat work by me and talex5.) h01ger worked on the reproducibility of the build, and cfcs did some work on passing ruleset changes to the firewall -- currently, users of qubes-mirage-firewall need to rebuild the unikernel with ruleset changes.

Oh yes, and somewhere in there, I did find time to see some cats, eat tajine, wander around the medina, and enjoy all of the wonder that Priscilla, the Queen of the Medina and her lovely hosts have to offer. Thanks to everyone who did the hard work of organizing, feeding, and laundering this group of itinerant hackers!

Ximin Luo

This was my third MirageOS hack retreat; I continued right where I left off last time.

I've had a pet project for a while to develop an end-to-end secure protocol for group messaging. One of its themes is to completely separate the transport and application layers by sticking an end-to-end secure session layer in between them, with the aim of unifying all the secure messaging protocols that exist today. Like many pet projects, I haven't had much time to work on it recently, and I took the chance to do so this week.

I worked on implementing a consistency checker for the protocol. This allows chat members to verify everyone is seeing and has seen the same messages, and to distinguish between other members being silent (not sending anything) vs the transport layer dropping packets (either accidentally or maliciously). This is built on top of my as-yet-unreleased pure library for setting timeouts, monitors (scheduled tasks) and expectations (promises that can timeout), which I worked on in the previous hackathons.

I also wrote small libraries for doing 3-valued and 4-valued logic, useful for implementing complex control flows where one has to represent different control states like success, unknown/pending, temporary failure, permanent failure, and be able to compose these states in a logically coherent way.

For my day-to-day work I work on Reproducible Builds, and as part of this we write patches and/or give advice to compilers on how to generate output deterministically. I showed Gabriel Scherer our testing framework with our results for various OCaml libraries, and we saw that the main remaining issue is that the build process embeds absolute paths into the output. I explained our BUILD_PATH_PREFIX_MAP mechanism for stripping this information without negatively impacting the build result, and he implemented this for the OCaml compiler. It works for findlib! Next, I need to run some wider tests to see the overall effect on all OCaml packages. Some of the non-reproducibility is due to GCC and/or GAS, and more analysis is needed to distinguish these cases.

I had very enjoyable chats with Anton Bachin about continuation-passing style, call-cc, coroutines, and lwt; and with Gabriel Scherer about formal methods, proof systems, and type systems.

For fun times I carried on the previous event's tradition of playing Cambio, teaching it to at least half of other people here who all seemed to enjoy it very much! I also organised a few mini walks to places a bit further out of the way, like Gueliz and the Henna Art Cafe.

On the almost-last day, I decided to submerge myself in the souks at 9am or so and explored it well enough to hopefully never get lost in there ever again! The existing data on OpenStreetMap for the souks is actually er, topologically accurate shall we say, except missing some side streets. :)

All-in-all this was another enjoyable event and it was good to be back in a place with nice weather and tasty food!

Martin Keegan

My focus at the retreat was on working out how to improve the documentation.
This decomposed into

* encouraging people to fix the build for the docs system

* talking to people to find out what the current state of Mirage is

* actually writing some material and getting it merged

What I learnt was

* which backends are in practice actually usable today

* the current best example unikernels

* who can actually get stuff done

* how the central configuration machinery of mirage configure works today

* what protocols and libraries are currently at the coal-face

* that some important documentation exists in the form of blog posts

I am particularly grateful to Mindy Preston and Thomas Gazagnaire for
their assistance on documentation. I am continuing the work now that I
am back in Cambridge.

The tone and pace of the retreat was just right, for which Hannes is
due many thanks.

On the final day, I gave a brief presentation about the use of OCaml
for making part of a vote counting system, focusing on the practicalities
and cost of explaining to laymen the guarantees provided by .mli
interface files, with an implicit comparison to the higher cost in more
conventional programming languages.

The slides for the talk as delivered are here, but it deserves its own
blog post.

Michele Orrù

This year's Marrakech experience has been a bit less productive than
past years'. I indulged a bit more in chatting with people and pair programming with
them.

I spent some of my individual time getting my hands dirty with the Jsonm
library, hoping to improve the state of my ocaml-letsencrypt library; I also
learned how to integrate a C API in OCaml, improving and updating the
ocaml-scrypt library, used by a fellow mirage user to develop his own password
manager. Ultimately, I'm not sure either direction I took was good: a streaming
JSON library is perhaps not the best choice for an application that exchanges
only a few JSON documents (samoht should have been promoting his easyjson
library more!), and the ocaml-scrypt library has been superseded by the pure
implementation ocaml-scrypt-kdf, which supposedly will make integration in
mirage easier.

The warm atmosphere and the overall positive attitude of the
group make me still think of this experience as a positive learning experience,
and, as they say: failure the best teacher is.

Reynir Björnsson

For the second time this year (and ever) I went to Marrakech to participate in the MirageOS hack retreat / unconference.
I wrote about my previous trip.

The walk from the airport

Unlike the previous trip I didn't manage to meet any fellow hackers at the RAK airport.
Considering the annoying haggling taking a taxi usually involves and that the bus didn't show up last time I decided to walk the 5.3 km from the airport to Priscilla (the venue).
The walk to Jemaa el-Fnaa (AKA 'Big Square') was pretty straightforward.
Immediately after leaving the airport area I discovered every taxi driver would stop and tell me I needed a ride.
I therefore decided to walk on the opposite side of the road.
This made things more difficult, because I then had trouble reading the road signs.
Anyway, I found my way to the square without any issues, although crossing the streets on foot requires cold blood and nerves of steel.

Once at the square I noticed a big café with lots of lights that I recognized immediately.
I went past it thinking it was Café de France.
It was not.
I spent about 30-40 minutes practicing my backtracking skills until I finally gave up.
I went back to the square in order to call Hannes and arrange a pickup.
The two meeting points at the square were a juice stand whose number I couldn't remember and Café de France, so I went looking for the latter.
I quickly realized my mistake, and once I found the correct café the way to Priscilla was easy to remember.

All in all I don't recommend walking unless you definitely know the way and aren't carrying 12-15 kg of luggage.

People

Once there I met new and old friends.
Some of the old friends I had seen at Bornhack while others I hadn't seen since March.
In either case it was really nice to meet them again!
As for the new people it's amazing how close you can get with strangers in just a week.
I had some surprisingly personal conversations with people I had only met a few days prior.
Lovely people!

My goals

Two months prior to the hack retreat I had started work on implementing the ssh-agent protocol.
I started the project because I couldn't keep up with Christiano's awa-ssh efforts in my limited spare time, and wanted to work on something related that might help that project.
My goals were to work on my ocaml-ssh-agent implementation as well as on awa-ssh.

Before going to Marrakech I had had a stressful week at work.
I had some things to wrap up before going to a place without a good internet connection.
I therefore tried to avoid doing anything on the computer the first two days.
On the plane to Marrakech I had taken up knitting again - something I hadn't done in at least two years.
The morning of the first day I started knitting.
Eventually I had to stop knitting because I had drunk too much coffee to keep my hands steady, so I started the laptop despite my efforts not to.
I then looked at awa-ssh, and after talking with Christiano I made the first (and sadly only) contribution to awa-ssh of that trip:
The upstream nocrypto library had been changed in a way that required changes to awa-ssh.
I rewrote the digest code to reflect the upstream changes, and refactored the code on suggestion by Christiano.

In ocaml-ssh-agent I was already using angstrom for parsing ssh-agent messages.
I rewrote the serialization from my own brittle cstruct manipulations to using faraday.
This worked great, except I never quite understood how to use the Faraday_lwt_unix module.
Instead I'm serializing to a string and then writing that string to the SSH_AUTH_SOCK socket.
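For illustration, here is roughly what that "serialize to a string" step amounts to for the outer framing: ssh-agent messages are a big-endian uint32 length followed by the payload. This is a minimal sketch using only the stdlib Buffer rather than faraday; the message number 11 is SSH_AGENTC_REQUEST_IDENTITIES from the protocol draft.

```ocaml
(* Sketch: frame an ssh-agent message as a length-prefixed string.
   Uses only the stdlib [Buffer]; the post uses faraday instead. *)
let frame payload =
  let b = Buffer.create (4 + String.length payload) in
  (* ssh-agent messages start with a big-endian uint32 length *)
  Buffer.add_int32_be b (Int32.of_int (String.length payload));
  Buffer.add_string b payload;
  Buffer.contents b

let () =
  (* "\x0b" = message number 11, SSH_AGENTC_REQUEST_IDENTITIES *)
  assert (frame "\x0b" = "\x00\x00\x00\x01\x0b")
```

The resulting string can then be written to the agent socket in one go, which is exactly the workaround described above.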

GADT !!!FUN!!!

The ssh-agent is a request-response protocol.
Only a certain subset of the responses are valid for each request.
I wanted to encode that relationship into the types so that the user of the library wouldn't have to deal with invalid responses.
In order to do that I got help from @aantron to implement this with GADTs.
The basic idea is that a phantom type is added to the request and response types.
The phantom type, called request_type, is a polymorphic variant that reflects the kind of requests that are possible.
Each response is parameterized with a subset of this polymorphic variant.
For example, every request can fail, so Ssh_agent_failure is parameterized with the whole set,
while Ssh_agent_identities_answer is parameterized with `Ssh_agent_request_identities,
and Ssh_agent_success is parameterized with `Ssh_agent_successable - a collapse of all the request types that can either return success or failure.
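As a reduced, self-contained sketch of this idea (only two request constructors, hypothetical payloads, and a fully polymorphic `'a ssh_agent_response` for `Ssh_agent_failure` standing in for "parameterized with the whole set"), the types could look like this:

```ocaml
(* Reduced sketch of the phantom-typed request/response pair;
   the real ocaml-ssh-agent has many more constructors. *)
type _ ssh_agent_request =
  | Ssh_agent_request_identities
      : [`Ssh_agent_request_identities] ssh_agent_request
  | Ssh_agent_add_identity (* hypothetical payload *)
      : string -> [`Ssh_agent_successable] ssh_agent_request

type _ ssh_agent_response =
  | Ssh_agent_failure (* every request can fail *)
      : 'a ssh_agent_response
  | Ssh_agent_identities_answer
      : string list -> [`Ssh_agent_request_identities] ssh_agent_response
  | Ssh_agent_success
      : [`Ssh_agent_successable] ssh_agent_response

(* The payoff: this handler cannot answer a request with a
   response of the wrong kind; the compiler rejects mismatches. *)
let handle : type a. a ssh_agent_request -> a ssh_agent_response = function
  | Ssh_agent_request_identities -> Ssh_agent_identities_answer []
  | Ssh_agent_add_identity _ -> Ssh_agent_success
```

Answering `Ssh_agent_request_identities` with `Ssh_agent_success` in `handle` would be a type error, which is exactly the guarantee the library user gets for free.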

This worked great except it broke the typing of my parser:
the compiler can't guess what the type parameter should be for the resulting ssh_agent_response.
@gasche helped me work around this by introducing an existential type.
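A self-contained sketch of that workaround, with a single hypothetical request constructor standing in for the real set:

```ocaml
(* Simplified request GADT (hypothetical, reduced to one constructor). *)
type _ ssh_agent_request =
  | Ssh_agent_request_identities
      : [`Ssh_agent_request_identities] ssh_agent_request

(* The existential wrapper: it hides the phantom parameter, so the
   parser can return "some request" without committing to its kind. *)
type any_ssh_agent_request =
  | Any_request : 'req_type ssh_agent_request -> any_ssh_agent_request

(* Stub parser; a real one would inspect the wire bytes. *)
let parse (_buf : string) : any_ssh_agent_request =
  Any_request Ssh_agent_request_identities
```

Pattern matching on `Any_request` recovers the request together with an unknown phantom parameter, which is precisely what makes writing `listen` tricky, as described below.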

Now I want to write a listen function that takes a handler of type 'a ssh_agent_request -> 'a ssh_agent_response, in other words a handler that can only create valid response types.
This unfortunately doesn't type check.
The parser returns an existential
type any_ssh_agent_request = Any_request : 'req_type ssh_agent_request -> any_ssh_agent_request.
This causes a problem: the 'req_type existential would escape its scope.
I do not know how to solve this problem, or if it's possible to solve it at all.
I discussed this issue with @infinity0 after the retreat, and we're not very optimistic.
Perhaps someone in #ocaml on Freenode might know a trick.

Ideas for uses of ocaml-ssh-agent

Besides the obvious use as an ssh-agent client inside an ssh client, the library could be used to write an ssh-agent unikernel.
This unikernel could then be used in Qubes OS in the same way as Qubes Split SSH where the ssh-agent is running in a separate VM not connected to the internet.
Furthermore, @cfcs suggested an extension could be implemented such that only identities relevant for a specific host or host key are offered by the ssh-agent.
When one connects to e.g. github.com using ssh keys all the available public keys are sent to the server.
This allows the server to fingerprint the client, since the set of keys is likely unique to that machine, and it may leak information about keys irrelevant to the service (GitHub).
This requires a custom ssh client which may become a thing with awa-ssh soon-ish.

Saying goodbye

Leaving such lovely people is always difficult.
The trip to the airport was emotional.
It was a chance to spend a few last moments with some of the people from the retreat, knowing it was the final chance this time around.
I will see many of the participants at 34c3 in just 3 weeks, while others I might not see again in the near future.
I do hope to stay in contact with most of them online!