Archive for the ‘Programming’ Category

I recall when people first proposed writing a read-only filesystem for an internal project at work. While I cannot talk much about what we have implemented, I can at least say it made and still makes sense to solve our problem by implementing a filesystem.

Filesystems can be written in many ways and their implementation specifics and problems they try to solve can range from easy to very hard. Filesystems such as BTRFS, ext4 or zfs take years or even decades to write and stabilize. They are implemented in kernel land, hard to debug and use sophisticated data structures for maximum performance and reliability. However with the addition of FUSE in the 2.6.14 Linux kernel and it’s subsequent porting to other POSIX operating systems such as FreeBSD, OpenBSD and MacOS, it became much easier to write and prototype filesystems. These filesystems are userland processes that talk to a simple kernel driver that redirects filesystems calls to the userland process when necessary. In our particular case, we didn’t need maximum performance, and we could heavily rely on kernel side caching, which made writing a fuse filesystem a reasonable choice.

Prototyping

In order to get a proof of concept, I decided to write a first version in Python. fusepy is a well known and stable implementation. Writing the initial prototype was a pleasure. Most things worked out of the box and we knew our idea works. However filesystems aren’t your standard piece of software and come with a set of performance characteristics that made fusepy and python a good choice for a prototype but a rather mediocre choice for a production-ready implementation. Most programs obtain data, do some processing and write the data somewhere. Most of their time is spend in userland running the program. Filesystems are slightly different. A program like ls will try to obtain the information of a list of files by issuing multiple, often hundreds of stat calls to the kernel, causing context switches into the kernel. Once the kernel figured out that we are a fuse filesystem it will hand the task of returning the file information to the fuse kernel module. The fuse module will then take a look into a cache and if necessary issue a request via IPC to the fuse userland daemon and subsequently to the userland program. At that point we already doubled our context-switches in contrast to a regular kernel-filesystem. Now that we spooled up a stat request in userland we try to return it with at least effort as possible (we still have 99 other requests to go) and return it back to the kernel, which then returns the result via the virtual filesystem layer back to the userland process which issued the request, effectively leading to 100% context-switch overhead to traditional kernel filesystems. To make things worse, python will also take a global interpreter lock in the meantime, allowing one request at a time to be served. There is also an overhead in calling a python function compared to a C function, which given the amount of requests, can add up significantly. In addition python will need to do some datastructure juggling. While this barely gives you a full picture as to why python is not necessarily a perfect choice for a production-ready implementation, it’s enough to say, it’s not optimal.

Choosing a language

Now that we have our prototype we want to get a production-ready version fast, but we also want to minimize our overhead to get good performance. At this point we want to consider a reimplementation of the python prototype in a language that is more suitable for high-performance systems programming. What are our common choices? C, C++, Go, Rust, Haskell and D are common competitors in the systems programming world. So let’s see…

C/C++

Without a doubt C/C++ is reasonable choice. They have by far the best support for FUSE as libfuse is written in C. They are great for high-performance applications. However you will need to do your memory management yourself and deal with a class of potential errors which might not even exists in other languages such as a multitude of undefined behavior, memory management and certain types of overflows. In the end while we all agreed they are conservative reasonable choices we opted for languages which we believed get us faster to a working version.

Go

Go seemed interesting but actually fell through quite fast upon realization that calling from C into Go can lead to a 10x overhead due to stack-rewriting. This is caused by Go not following C calling conventions and uses different stack layouts.

Haskell

A great choice, but due to lack of expertise, ramp up time, etc we decided against it.

Rust

Certainly nowadays a great choice. The project I am talking about is a bit older and back then rust was not stable at all and could very well change behavior. So in addition to learning a new language we didn’t want to deal with a constant change of APIs and language features. These days I would certainly reconsider using it.

D and the good parts

As mentioned in earlier blogposts. D feels like Python in many ways but has C performance characteristics. In addition, it’s fairly mature, it uses C calling conventions and we had expertise in using it. It works well on MacOS and Linux which is all we cared (in fact D works pretty well on Windows too). So we went for it. We knew from the beginning we had to write a smaller wrapper around libfuse. D is very similar to C and C++ and really cares about interoperability. So writing a wrapper seemed to be an easy task and in fact within a few days we had a working version that grew as we needed. Once we started rewriting the python prototype on top of the new library, we were quite impressed how similar the languages seem to be and how fast we got to a working version. D’s string handling and it’s build in hashmaps and dynamic arrays made it a pleasure to work with.

The not-so-good parts

One thing we noted fairly quickly is that synchronization between the D wrapper and the C header files can be painful. If they change you will not notice unless you know it changed. The header files can diverge and miss some important calls. We had to add some wrappers in the standard library in order to get what we needed. Worst, if you make a mistake in the data layout your program will compile fine. However your program might randomly crash (that’s the good case) or just perform weird in rare cases as you might have written data to a wrongly aligned field and the C library will happy read whatever you write and take it as something completely else (I can tell you, it’s really fun when you get random results from your stat() syscall).

When porting to MacOSX it turned out that MacOS encodes the information if the operating systems supports 64-bit inodes in C header files and heavily depends on ifdefs around it which we could simply not port to D as it (luckily) doesn’t have a preprocessor. So we had to start creating unpleasant workarounds. Another issue we came along is D’s shared attribute. The basic idea that you have a transitive notion of shared data which can safely be shared across multiple threads. This works great as long as you assume that you have a thread-local store. The D compiler and the runtime supports all this by default, however it’s based on the assumption that the D runtime controls the threading and the data passed around…

A key difference
At this point it became apparent that there is a key difference between your standard D project and what we are trying to do. In most projects, your program is the main program and it calls into other libraries. In the FUSE world your program is wrapped in the libfuse context and only interacts with the world through callbacks you handed into libfuse. At that point you completely pass the control to fuse and your code only get’s called when fuse decides so. And worst, libfuse will also control the threading.

So back to shared. Well we initialize all our datastructures before libfuse starts threading, we then pass the datastructures around. At that point we either make everything shared and hope it somehow works, but that turned out to break half of the code due to incompatible qualifiers in the D libraries or we just go ahead and declare everything as __gshared and basically circumvent the whole shared business. De-facto everything our code is shared anyway. So if we have to use interfaces to that require shared qualifiers we just cast bakck and forth. That wasn’t really pretty and we started realizing that this is just the tip of the iceberg.

One bug came to mind. On a slightly unrelated note: we tried D’s contracts which caused the compiler to crash. I wasn’t necessarily impressed with discovering a compiler bug in a fairly straight forward code. But well, those things happen.

The thing with the runtime

At that point we had everything reimplemented and it worked like a charm. So we decided to enable multithreading in libfuse. Suddenly our program started to crash instantly. We rallied all our D expertise. People who have been working on D itself, it’s libraries, alternative compiler implementation etc and tried to debug what the heck is going on. We soon found out, disabling garbage collection made the problem go away. While that is “okay” for a short living program, it’s a not-so-okay workaround for a filesystem that potentially runs as long as your computer runs. So more digging into the Druntime and eventually after days we figured out that the druntime didn’t know about the threads the fuse library created. So when the garbage collector comes and tries to stop the world (as in, stop everything but the current thread) and perform the GC, it won’t know about the other threads but the current and stops nothing. So all the threads happily write data while the garbage collectors tries to run. BOOM!. However libfuse doesn’t tell you when and how many threads it will create. So how will you be able to tell the druntime about them? In the end upon every call we just check if we know about the current thread and if not attach it and then correctly detach it later. There were some MacOS specific problems around multithreading as osxfuse tried to join certain threads from time to time which the druntime considered to be attached. A long and tough deep-dive into OSXfuse and core.thread followed until eventually with some help from the D folks we figured out that we have to detach the thread and have to use pthread_cleanup to allow us to detach from the druntime.

In the end lessons were learned and in a more general notion, embedding a runtime, in particular one with a garbage collector into a project which get’s called from C/C++ contexts leads to a lot of “interesting” and mostly undocumented issues.

A final note

In the end I am not sure if I would choose D again. Modern C++14 is fairly nice to write and the amount of manual memory management in our case isn’t that bad. However now that we solved all the runtime issues and have a fairly stable version, we get back to the fun and turn around times which we like. So overall there is still a net win in choosing D.

Also D moves more and more towards a safe but gc-less memory model, which is something we for sure want to investigate once more and more of the ref-counting code and allocator code lands in upcoming D versions. I am still happy with D in general, but a bit less optimistic and more critical when choosing it then before, and particularly when writing a filesystem.

Python has been one of my favorite languages since I started contributing to the Mercurial project. In fact Mercurial being in Python instead of git’s C/Bash codebase was an incentive to start working on Mercurial. I admired it’s clean syntax, it’s functional patterns, including laziness through generators and it’s ease to use. Python is very expressive and thanks to battery-included, it’s very powerful right from the beginning. I wanted to write everything in Python, but I reached limitations. CPU intensive code couldn’t be easily parallelized, it’s general performance was limiting and you had to either ship your own python or rely on system python.

For deep system integration into libraries, performance and static binaries I’ve still relied on C. One might argue that I should have used C++ over C but I really loved C’s simplicity. All it’s syntax could fit in my brain and while it’s not typesafe (see void pointer), it’s incredible fast and you can choose from the largest pool of libraries available.

So while C gave me the power and performance, Python gave me memory safety, expressiveness and not to worry about types. I’ve always looked for a middle-ground, something fast, something I can easily hook into C (before there was cython) and something expressive. Last year I found D and were lucky enough tobe able to be allowed to implement a Fuse filesystem in D and an Open Source library around it: dfuse.

For once it feels I found a language that hit’s a sweet-spot between expressiveness, close to the system, and performance. There are drawbacks, for sure. I’ll talk about that in another post.

What is D? D is a programming language developer by Walter Bright and Andrei Alexandrescu over the last 10 years. It’s newest incarnation is D2 which has been developed since 2007 and available since 2010. Both have worked extensively with C++ in the past and incorporated good parts as well as redesigning the bad parts. The current version is 2.066.1. Multiple compilers are available, the reference implementation DMD, a GCC frontend called GDC and an LLVM frontend named LDC. In addition a runtime is distributed which provides thread handling, garbage collection, etc. A standard library is available, called phobos and usually distributed together with the druntime (in fact druntime is compiled into the standard library).

What makes D beautiful? First of all, D is statically typed, but in most cases types are inferred through `auto` (which C++ borrowed later on). Secondly, D has excellent support for interacting with C and C++ (just recently namespace resolution for C++ symbols was added). But most important, D offers a variety of high level data structures as part of the language. This allows for fast and efficient prototyping. Solutions in D tend to be a nice mixture between a “python”-style solution using standard datatypes and a “C”-style approach of using specialized datastructures when needed.

In addition, D offers a very elegant template syntax to provide powerful abstractions. For example the following code generates a highly optimized version of the filter for every lamda you pass in (unlike C where we would need a pointer dereference):

But my arguable most loved feature is the unified function call syntax. Think about a structure that has a set of functions:

but you want to add a new method `drop`, however you can’t modify the original code. UCFS allows you to use the dot-notation to indicate the first argument:

This gives a powerful way to add abstractions on top of existing libraries without having to obtain or fork the source code.

Get Started

If you are interested in giving D a try, a few useful resources:

Compiler: The DMD compiler is the reference compiler. It does not produce as efficient code as GDC but usually supports the most recent features. You can obtain it from http://dlang.org.

DTrace is a dynamic tracing tool build by Sun Microsystems and is available for Solaris, MacOS and FreeBSD. It features a tracing language which can be used to probe certain “probing” points in kernel or userland. This can be very useful to gather statistics, etc. Linux comes with a separate solution called systemtap. It also features a tracing language and can probe both userland and kernel space. A few Linux distributions such as Fedora enable systemtap in their default kernel.

PHP introduced DTrace support with PHP 5.3, enabling probing points in the PHP executable that can be used to simplify probing of PHP applications without having to the PHP implementation details. We enabled probes on function calls, file compilation, exceptions and errors. But this has always been limited to the operating systems that support DTrace. With the popularity of DTrace, Systemap programmers decided to add a DTrace compatibility layer that allows to use DTrace probes as Systemtap probing points as well.

With my recent commit to the PHP 5.5 branch, we allow DTrace probes to be build on Linux, so people can use Systemtap to probe those userland probes.

To compile PHP with userland probes you need to obtain the PHP 5.5 from git:

Software projects choose languages based on idioms of the languages. Languages can provide mechanisms and structures to support object orientation or functional programming. Less time is spent thinking about backwards compatibility of programming language runtimes. While this is usually a non-issue for short living software like websites or software in tightly controlled environment, it becomes an issue for software projects that need to guarantee backwards-compatibility for years. For example: a version control system.

The Mercurial project aims to support Python 2.4 to Python 2.7. It does not support Python 3. Why? Python 3 is a drastic change. Unicode is the default string type, classes removed, etc. The impact of the changes are similar to the change from PHP 4 to PHP 5. Most software projects have adopted these language changes, but for projects that need to support LTS operating systems like RHEL or Solaris 9/10, it can be become an issue. You could drop Python 2.X support and tell existing users of your software to look for something else – a no-go for a version control system. You could simply not support Python 3 at someday, but Python 2.7 already reached it’s EOL. It’s just a matter time until distribution stop shipping Python 2.X. LTS operating systems might still not have Python 3 and rely on Python 2. Writing software that needs to be backwards-compatbile for 8 years can be a problem.

The source of the problem

Why is this a not an issue for Java or C, but for Python, PHP and Ruby? Java and C compile to bytecode that is guaranteed to be stable. C compiles to machinecode. A processor architecture won’t change anymore. If it’s a x86 processor, it will support x86 machinecode. It won’t change with the next software update. If your code needs to support old C code that modern compilers don’t understand anymore, use an old one. Java is similar in that regard. The JVM runtime has a defined set of instructions, which won’t be changed anymore. It doesnt matter which Java compiler you use, in the end it will produce bytecode that will run on any JVM. Sure you still might have problems supporting multiple versions of a library, but at least the JVM will always run your compiled code.

Python and PHP compile to bytecode as well, similar Java. There is, however, one exception: They do it in memory and the VM to interprete the bytecode is bundled with the compiler. This is were the backwards compatibility problem comes in play. You cannot run Python bytecode compiled on Python 3 with a Python 2 interpreter. You cannot compile with PHP 5 and run it on PHP 4. Either the interpreter simply fails to your old code, or your VM implementation is not guaranteed to be stable. That means in Python and PHP the underlying machine that you compile might change with the next update. Let’s compare this to the x86 world. Your next software update might change the x86 instruction set? You would have to recompile all your C code and maybe some of the old C code cannot be compiled with modern C compilers and old C compilers might not be able to get compiled on the new instruction set. Sounds painful, particularly if you really care about backwards-compatibility.

Sidenote

I think that Python, PHP and others did an architectual mistake. They bundled the VM and runtime with the compiler. Thus your language version defines your runtime and the underlying machinecode. If you write a new language, write down a minimum instruction set that you will always support and separate your VM from your compiler. Always support that instruction set. This can lead to interesting problems. The implementation of Java Generics is a good example. Nobody thought about generics when defining the insturctions set. Therefore the bytecode was not designed to retain information about the generic type. Thats why the Java compiler needs to check the generic type information and than transform it, so that the resulting bytecode is compatible with old JVM versions. This is known as type erasure. Python and PHP developer would probably just introduce new bytecodes, not caring about BC. (Well PHP devs would just pretend that PHP is a web language and web projects shouldn’t care about BC at all ;)).

Conclusion
If you seriously care about backward-compatibility for LTS systems that are 8 years old, choose a language which separates the VM from the compiler. Languages like Java (probably C#) do this. Java developer won’t define behavior that requires a new opcode. PHP and Python are wonderful programming languages, but personally I am not sure if it is wise to write something like a VCS in such a language.

Long story short: Language choice matters for BC. If you write your own language, please separate your VM from your compiler. Better (as johannes pointed out) compile to an existing VM like JVM, CLR or LLVM

We recently launched geocommit.com. Geocommit is a service to add geolocation data to your commits. You only need a working WiFi connection. No GPS module is required.

This blogpost gives you an example how to use geocommit and the geocommit.com services. I’ll show how to use geocommits in your Git or mercurial project. How to make github and bitbucket more beautiful with our Chrome and Firefox extensions and how to get a fancy map of your geocommits.

What is geocommit

First of all, geocommit is a text format to attach geolocation data to version control system commits. The geocommit website has detailed information about the geocommit format.
Second, geocommit is a service to store and analyse your geocommit data. We offer a set of tools and a webservice to make geocommit cool. The Git implementation git geo runs on Mac OS X and Linux. The Mercurial implementation hg-geo runs only under Linux. Mac OS support is under way.

This will enable geocommit support in your project. If you commit something with git commit, git geo will try to get your current location and add a geocommit. If no WiFi connection is enabled, no geocommit will be created.

git geo push accepts the same options as git push. It pulls geocommits first, merges them and then pushes geocommits and the given branch to the remote repository.
That’s everything you need. Easy, isn’t it? So let’s see how to enable geocommits on Mercurial and then talk about the Chrome and Firefox extensions.

Deep dive
git geo stores geocommits in git notes. We use the namespace geocommit for that. Git notes have some cool properties. They are metadata and don’t change the commit hash. Therefore they can be added to a commit at anytime. They are displayed on github and can be deleted without any problem. You also can decide yourself when to push geocommits or not. You can delete already pushed geocommits without breaking the repository or changing any commit sha1. The drawback is that it is hard to deal with git notes from time to time. git notes is a new feature in git and not yet fully supported. We have to write a script to merge git notes as git notes merge is not available before git 1.7.7.

Mercurial & geocommit

You can add support for geocommits to Mercurial by installing the hg-geo extension. Clone the extension and enable it in your hgrc:

Deep dive
As Mercurial doesn’t have a way to store metadata, we are adding the geocommit data to the commit message itself. The obvious advantage is that you can use hg-geo with plain Mercurial. You do not need to enable hg-geo on the remote site to push geocommits (like Mercurial bookmarks). The disadvantage is that we modify the commit message and therefore the commit hash. There is no easy way to delete geocommits once they are created.

bitbucket.org and github.com

We can push geocommits easily now. But how to use them? We can install the Firefox or Chrome extension. This will display a map next to your commit!

Firefox
To install the geocommit extension for Firefox you need Greasemonkey. Greasemonkey is a well know and supported extension that enables user scripts to safely modify the displayed website.

Install Greasemonkey from userscripts.org. You can then browse bitbucket.org or github.com and see a map of your geocommit:

Post Hook

We offer a post hook that you can use with github.com and bitbucket.org. Your commits will be tracked by gecommit.com and we will create a global and a project specific map as well as provide further analytics as soon as possible.

github.com
To install the hook go to th eadmin section of your repository and select Service Hooks.

Add http://hook.geocommit.com/api/github as a POST service hook.

on bitbucket.org
Go to the admin seciton of your repository and select Services

Bookmarks is an extension to the Mercurial SCM, which adds git-like branches to Mercurial. The extension is distributed together with Mercurial.
Recently the extension has received a major update. Time to look back.

macports is a widely used ports system for Mac OS. It’s repository contains hundreds of application that can be compiled and installed. The repository contains php 5.3. So if you want to run PHP from subversion you still have to compile it yourself and install it yourself outside your managed ports environment. I created a rather simple Portfile to build it from PHP’s trunk.

For those not following the PHP development. We backported the DTraces probes from the abandoned PHP 6.0 branch, back to the new trunk PHP 5.3.99-dev. It is called 5.3.99 because the PHP dev community has not decided yet on a version number (5.4 or 6.0).