At the embedded hacking event in GBG yesterday I organized a small contest for the attendees. I’ve done something similar several times before, so I wanted to make it a bit different this time to spice things up a bit. A straight-forward N questions in a row and then a puzzle to get the final question was too easy. I wanted to create a maze or a play-field that you would need to traverse somehow in order to reach the final goal. But it is hard to create a maze that you don’t immediately spot the way through or that you can somehow “cheat” and find the way in other means rather than to actually answer the questions and do right by using your skills… Then I realized that with just a couple of things added, I could fulfill my goals and still get a fun contest. So, let me start by taking you through the first slide that details the rules:

Ok, so to make the rules be a bit clearer we take a look at a simplified example play field so that we understand what we’re about to play on:

A short summary:

start on a green box

follow the arrow in the direction that your answer to the question of the box leads you. There’s a compass rose there to help you remember the directions! :-)

each box you visit has a word associated with it, collect the words along the path

when you reach the red box you’ve read the goal and you’re done

then you re-arrange all the box words you’ve collected and create a final question

answer that questions, the fastest to answer wins!

Everything clear? To help the participants, we had both the playfield and the associated questions printed out on two sheets of paper that we handed out together with a pen. The amount of data is just a bit too much to be able to show on a single screen and it may help to use a pen etc to remember the track you take and which words to remember etc. If you want to repeat the exact same situation, you do the same! I did a special black-and-white version of the playfield to make it more printer-friendly. You may want to fire this up in full resolution to get the best experience:

So I flew down to and participated at yet another embedded Linux hacking event that was also co-organized by me, that took place yesterday (November 20th 2013) in Gothenburg Sweden.

The event was hosted by Pelagicore in their nice downtown facilities and it was fully signed up with some 28 attendees.

I held a talk about the current situation of real-time and low latency in the Linux kernel, a variation of a talk I’ve done before and even if I have modified it since before you can still get the gist of it on this old slideshare upload. As you can see on the photo I can do hand-wavy gestures while talking! When I finally shut up, we were fed tasty sandwiches and there was some time to socialize and actually hack on some stuff.

I then continued my tradition and held a contest. This time I did raise the complexity level a bit as I decided I wanted a game with more challenges and something that feels less like a quiz and more like a game or a maze. See my separate post for full details and for your chance to test your skills.

This event was also nicely synced in time with the recent introduction of the foss-gbg mailing list, which is an effort to gather people in the area that have an interest in Free and Open Source Software. Much in the same way foss-sthlm was made a couple of years ago.

Pelagicore also handed out 9 Raspberry Pis at the event to lucky attendees.

On November 20, we’ll gather a bunch of interested people in the same room and talk embedded Linux, open source and related matters. I’ll do a talk about real-time in Linux and I’ll run a contest in the same spirit as I’ve done beforeseveral times.

The curl project has its roots in the late 1996, but we haven’t kept track of all of the early code history. We imported our code to Sourceforge late 1999 and that’s how far back we can see in our current git repository. The exact date is “Wed Dec 29 14:20:26 UTC 1999″. So, almost 14 years of development.

Warning: this blog post contains more useless info and graphs than many mortals can handle. Be aware!

How much old code remain in the current source tree? Or perhaps put differently: how is the refresh rate of the code? We fix bugs, we change things, we add features. Surely we’ll slowly over time rewrite the old code and replace it with new more shiny and better working code? I decided to check this. Here’s what I found!

The tools

We have all code in git. ‘git blame’ is the primary tool I used as it lists all lines of all source code and tells us when it was added. I did some additional perl scripting around it.

The code

I decided to check all code in the src/ and lib/ directories in the curl and libcurl source tree. The source code is used to create both the curl tool and the libcurl library and back in 1999 there was no libcurl like today so we do get a slightly better coverage of history this way.

In total this sums up to some 112000 lines in the current .c and .h files.

To count the total amount of commits done to those specific files through history I ran:

git log --oneline src/*.[ch] lib/*.[ch] | wc -l

6047 commits in total. (if I don’t specify the files and count all commits in the repo it ends up at 16954)

git stats

We run gitstats on the curl repo every day so you can go there for some more and current stats. Right now it tells us that average number of commits is 4.7 per active day (that means days when actually something was committed), or 3.4 per all days over the entire time. There was git activity 3576 days in total. By 224 authors.

Surviving commits

How much of the code would you think still remains that were present already that December day 1999?

How much of the code in the current code base would you think was written the last few years?

Commit vs Author vs Date

I wanted to see how much old code that exists, or perhaps how the age of the code is represented in the current code base. I decided to therefore base my logic on the author time that git tracks. It is basically the time when the author of a change commits it to his/her local tree as then the change can be applied later on by a committer that can be someone else, but the author time remains the same. Sometimes a committer commits multiple patches at once, possibly at a much later time etc so I figured the author time would be a better time stamp. I also decided to track the date instead of just the commit hash so that I can sort the changes properly and also make interesting graphs that are based on that time. I use the time with a second precision so changes done a second apart will be recorded as two separate changes while two commits done with the same author time stamp will be counted as the same time.

I had my script run ‘git blame –line-porcelain’ for all files and had my script sum up all changes done on the same time.

Some totals

The code base contains changes written at 4147 different times. Converted to UTC times, they happened on 2076 unique days. On 167 unique months. That’s every month since the beginning.

We’re talking about 312 files.

Number of lines changed over time

A graph with changes over time. The Y axis is number of lines that were changed on that particular time. (click for higher res)

Ok you object, that doesn’t look very appealing. So here’s the same data but with all the changes accumulated over time.

Do you think the same as I do? Isn’t it strangely linear? It seems that the number of added lines that remain in the code today is virtually the same over time! But fair enough, the changes in the X axis are not distributed according to the time/date they represent so we shouldn’t be fooled by the time, but certainly we can see that changes in general only bring in a certain amount of surviving modified lines.

Another way to count the changes is then to check all the ~4000 change times of the present code, and see how many days between them there are:

Ah, now finally we’re seeing something. Older code that is still present clearly was made with longer periods in between the changes that have lasted. It makes perfect sense to me, since the many years of development probably have later overwritten a lot of code that was written in between.

Also, it is clearly that among the more recent changes that have survived they were often done on the same day or just a few days away from another lasting change.

Grouped on date ranges

The number of modified lines split up on the individual year the change came in.

Interesting! The general trend is clear and not surprising. Two years stand out from the trend, 2004 and 2011. I have not yet investigated what particular larger changes that were made those years that have survived. The bump for 1999 is simply the original import and most of those lines are preprocessor lines like #ifdef and #include or just opening and closing braces { and }.

Splitting up the number of surviving lines on the specific year+month they were added:

This helps us analyze the previous chart. As we can see, the rather tall bars from 2004 and 2011 are actually several months wide and explains the bumps in the year-chart. Clearly we made some larger effort on those periods that were good enough to still remain in the code.

Correlate to added or removed lines?

So, can we perhaps see if some years’ more activity in number of added or removed source lines can be tracked back to explain the number of surviving source code lines? I ran “git diff [hash1]..[hash2] –stat — lib/*.[ch] src/*.[ch]” for all years to get a summary of number of added and removed source code lines that year. I added those number to the table with surviving lines and then I made another graph:

Funnily enough, we see almost an exact correlation there for the first eight years and then the pattern breaks. From the year 2009 the number of removed lines went down but still the amount of surving lines went up quite a bit and then the graphs jump around a bit.

My interpretation of this graph is this boring: the amount of surviving code in absolute numbers is clearly correlating to the amount of added code. And that we removed more code yearly in the 2000-2003 period than what has survived.

But notice how the blue line is closing the gap to the orange/red one over time, which means that percentage wise there’s more surviving code in more recent code! How much?

Here’s the amount of surviving lines/added lines and a second graph looking at surviving lines/(added + removed) to see if the mere source code activity would be a more suitable factor to compare against…

Code committed within the last 5 years are basically 75% left but then it goes downhill down to the 18% survival rate of the 1999 code import.

cURL supports a bunch of crypto backends. In practice, only the OpenSSL, NSS (RedHat) and GnuTLS (Debian) variants seem to see widespread deployment

Originally he mentioned only OpenSSL and GnuTLS there until someone pointed out the massive amount of NSS users and then the page got updated. Quite telling I think. Lots of windows users these days use the schannel backend, Mac OS X users use the darwinssl backend and so on. Again statements based on his view and opinions and most probably without any closer checks done or even attempted.

As a side-node we could discuss what importance (perceived) “widespread deployment” has when selecting what to support or not, but let’s save that for another blog post a rainy day.

there exist examples of code that deadlocks on IPC if cURL is linked against OpenSSL while it works fine with GnuTLS

I can’t argue against something I don’t know about. I’m not aware of any bug reports on something like this. libcurl is not fully SSL layer agnostic, the SSL library choice “leaks” through to applications so yes an application can very well be written to be “forced” to use a libcurl built against a particular backend. That doesn’t seem what he’s complaining about here though.

Thus, application developers have to pray that the cURL version deployed by the distribution is compatible with their needs

Application developers that use a library – any library – surely always hope that it is compatible with their needs!?

it is also rather difficult to replace cURL for normal users if cURL is compiled in the wrong way

Is it really? As most autotools based projects, you just run configure –prefix=blablalba and install a separate build in a customer directory and then use that for your special-need projects. I suppose he means something else. I don’t know what.

For GNUnet, we need a modern version of GnuTLS. How modern? Well, while I write this, it hasn’t been released yet (update: the release has now happened, the GnuTLS guys are fast). So what happens if one tries to link cURL against this version of GnuTLS?

To verify his claims that building against the most recent gnutls is tricky, I tried:

download 3.2.5 tarball

unpack it

configure –prefix=$HOME/build-gnutls-3.2.5

make

make install

cd [curl source tree]

configure –without-ssl –with-gnutls=$HOME/build-gnutls-3.2.5 [and some more options if wanted]

make

invoke “./src/curl -V” to verify that the build is using the latest. Yes it does. Case closed.

How does forking fix it? Easy. First, we can get rid of all of the compatibility issues

That’s of course hard to argue with. If you introduce a brand new library it won’t have any compatibility issues since nobody used it before. Kind of shortsighted solution though, since as soon as someone starts to use it then compatibility becomes something to pay attention to.

Also, since Christian is talking about doing some changes to accomplish this new grand state, I suspect he will do this by breaking compatibility with libcurl in some aspects and then gnurl won’t be libcurl compatible so it will no longer be that easy to switch between them as desired.

Note that this pretty much CANNOT be done without a fork, as renaming is an essential part of the fix.

Is renaming the produced library really that hard to do without forking the project? If I want to produce a renamed output from an open source project out there, I apply a script or hack the makefile of that project and I keep that script or diff in my end. No fork needed. I think I must’ve misunderstood some subtle angle of this…

Now, there might be creative solutions to achieve the same thing within the standard cURL build system, but I’m not happy to wait for a decade for Daniel to review the patches.

Why would he need to send me such patches in the first place? Why would I have to review the patches? Why would we merge them?

That final paragraph is probably the most telling of his entire page. I think he did this entire fork because he is unhappy with the lack of speed in the reviewing and responses to his patches he’s sent to the libcurl mailing list. He’s publicly complained and whined about it several times. A very hostile attitude to actually get the help or review you want.

I want to note that the main motiviations for this fork are technical

Yes sure, they are technical but also based on misunderstandings and just lack of will.

But I like to stress again that I don’t mind the fork. I just mind the misinformation and the statements made as if they were true and facts and represents what we stand for in the actual curl project.

I believe in collaboration. I try to review patches and provide feedback as soon as I can. I wish Christian every success with gnurl.

It was long before Firefox and Chrome and even before the Mozilla browser appeared.

Heck, a lot of things of today didn’t exist those 16 long years ago.

It was a different and in many ways simpler world back then, but I would say that we’ve manage quite good to keep with the times and we’ve progressed fine as a company and as individuals all since then.

Me, Björn and Linus are still going strong with more contacts, more customers and possibly more fun than ever.

Time to submit some more strange emails I’ve received recently. Here’s one I suspect may be someone who spots that curl is being abused against some host. I really wouldn’t even know how to begin to answer this…

Someone is using your code to continually hack small businesses I work
with. How on earth do I stop them?!

A recent twitter discussion I had with Andrei Neculau contributed to his blog post on this subject, basically arguing that I’m wrong but with many words and explanations.

It triggered me to write up my primary reasons for why I strongly object to handle open source issues, questions and patches privately (for free) in open source projects that I have a leading role in.

1. I spend a considerable amount of my spare time on open source projects. I devote some 15-20 unpaid hours a week for those communities. By emailing me and insisting on a PRIVATE conversation you’re suddenly yanking the mutex flag and you’re now requesting that I spend parts of this time on YOU ALONE and not the rest of the community. That’s selfish.

2. By insisting on a private conversation you FORCE me to repeat myself since ideas and questions are rarely unique or done for the first time. So you have a problem or a question that’s very similar to one I just responded to. And the next person will ask the same one tomorrow. By insisting on doing them in public already in the first email, already the second person can read it without me having to write it twice. And the third person who didn’t even realize he was interested in that topic will find out and read it as well (either now when the mail gets sent out or even years later when that user find the archived mailing list on the web). Private emails deny that ability. That’s selfish.

3. By emailing me privately and asking questions and help, you assume that I am the single best person to ask this question at this given time. What if I happen to be on vacation, be under a rough period at work or just not know the particular area of the project very good. I may be the leader or a public person of a project, but I may still not know much about feature X for operating system Z about which you ask. Ask on the list at once and you’ll reach the correct person. That’s more efficient.

4. By emailing me privately, you indirectly put a load on me to reply – or to get off as a rude person. Yes you’re friendly and you ask me nicely and yet even after you remind me after a few days I STILL DON’T RESPOND. Even if I just worked five 16-hour work days and you asked questions I don’t know the answer to… That’s inefficient and rude.

5. Using pull requests on github? That could’ve been good if github would’ve been better. Right now, posting a pull request will only send that to the few persons who are listed as “collaborators” for that repository – which may reach more than a single person so it still better than to a private person but on the mailing list there may be THOUSANDS of people that can read and comment and provide feedback on patches and issues. That’s inefficient and forces OTHERS to have to do the work instead and forward stuff to the list.

6. Yes, you can say that subscribing to an email list can be daunting and flood you with hundreds or thousands of emails per month – that’s completely true. But if you only wanted to send that single question or submit the single issue, then you can unsubscribe again quite soon and escape most of that load. Then YOU do the work instead of demanding someone else to do it for you. When you want to handle a SINGLE issue, it is much better load balancing if you do the extra work and the people who do tens or HUNDREDS of issues per month in the project do less work per issue.

7. You’re suggesting that I could forward the private question to the mailing list? Yes I can, but then I need to first ask for permission to do so (or be a jerk) and if the person who sent me the mail is going to send me another mail anyway, (s)he can just as well spend that time to send the first mail to the list instead of say YES to me and then make me do his or hers work. It’s just more efficient. Also, forwarded questions tend to end up so that replies and follow-up questions don’t find their way back to the original poster and that’s bad.

8. I propose and use different lists for different purposes to ease the problem with too many (uninteresting) emails.

In the middle of my otherwise happy summer vacation, my Nexus 10 had a serious case of depression and took a nose-dive from a little over a meter above floor level and crashed into the mighty fine stone tiles. It got some serious damage and there are cracks all over the screen.

A while ago I posted a service request on Samsung’s site to get it fixed (they manufacture the Nexus-10s and the device have a product number and everything in Samsung’s systems), and they responded and directed me to my nearest “quick support center” for screen and display repairs. The nearest one happens to be located just a few hundred meters from where I work these days.

Today I stepped in there and asked to get my screen fixed.

- “It’ll cost you some 2000-3000 SEK” one of the two service guys says at once. Clearly not really wanting to fix it.

- “Eh, that’s a bit unspecific. Can you tell me with some better accuracy?” I reply. After all, I didn’t pay an awful lot more than 3000 SEK for it as new – in the US and I wanted to figure out if a repair would be worth the money.

- “Okay”, he says and leaves the room through a door and is gone for a while.

- “Do you have the serial number for it?” the guys says returning and yeah, I brought it there in its original box and the serial number is there. The guy leaves again.

- “Did you buy that device here?” He’s back. Without really specifying where “here” means but I figured he means in Sweden so I of course had to tell him

- “No, it’s purchased in the US”

- “Then we can’t fix it.” and then he explains how they can’t order the parts necessary for the device and that’s it. Nothing to do. They can’t.

Okay, so I knew there’s always a risk when “grey-importing” so I’m not really very upset. I’m mostly baffled by their response and also by the fact that there apparently is something different with the Nexus 10 display if it was bought in the US – or at least they say so.

We still have another Nexus 10 in the family and I’m ordering a 2nd gen Nexus 7 now to replace this dead thing with…

Background

Since 2006 we’ve had three major API families in libcurl for doing file transfers:

the multi interface – a non-blocking interface that allows multiple simultaneous transfers (in both directions) in the same thread

the socket_action interface – a brother of the multi interface but designed for use with an event-based library/engine for high performance and large scale transfers

The curl command line tool uses the easy interface and our test suite for curl + libcurl consists of perhaps 80% curl tests, while the rest are libcurl-using programs testing both the easy interface and the multi interface.

Early this year we modified libcurl’s internals so that the functions driving the easy interface transfer would use the multi interface internally. Then all of a sudden all the curl-using tests using the easy interface also then by definition tested that the operation worked fine with the multi interface. Needless to say, this pushed several bugs up to the surface that we could fix.

So the multi and the easy interfaces are tested by many hundred test cases on a large number of various systems every day around the clock. Nice! But what about the third interface? The socket_action interface isn’t tested at all! Time to change this sorry state.

Event-based test challenges

The event-based API has its own set of challenges; like it needs to react on socket state changes (only) and allow smooth interactions with the user’s own choice of event library etc. This is our newest API family and also the least commonly used. One reason for this may very well be that event-based coding is generally harder to do than more traditional poll-based code. Event-based code forces the application into using state-machines all over to a much higher degree and the frequent use of callbacks easily makes the code hard to read and its logic hard to follow.

So, curl_multi_socket_action() acts in ways that aren’t done or even necessary when the regular select-oriented multi interface is used. Code that then needs to be tested to remain working!

Introducing an alternative curl_easy_perform

As I mentioned before, we made the general multi interface widely tested by making sure the easy interface code uses the multi interface under the hood. We can’t easily do the same operation again, but instead this time we introduce a separate implementation (for debug-enabled builds) called curl_easy_perform_ev that instead uses the event-based API internally to drive the transfer.

The curl_multi_socket_action() is meant to use an event library to work really well multi-platform, or something like epoll directly if Linux-only functionality is fine for you. curl and libcurl is quite likely among the most portable code you can find so after having fought with this agony a while (how to introduce event-based testing without narrowing the tested platforms too much) I settled on a simple but yet brilliant solution (I can call it brilliant since I didn’t come up with the idea on my own):

We write an internal “simulated” event-based library with functionality provided by the libcurl internal function Curl_poll() (the link unfortunately goes to a line number, you may need to move around in the file to find the function). It is in itself a wrapper function that can work with either poll() or select() and should therefor work on just about any operating system written since the 90s, and most of the ones since before that as well! Doing such an emulation code may not be the most clever action if the aim would be to write a high performance and low latency application, but since my objective now is to exercise the API and make an curl_easy_perform clone it was perfect!

It should be carefully noted that curl_easy_perform_ev is only for testing and will only exist in debug-enabled builds and is therefor not considered stable nor a part of the public API!

Running event-based tests

The fake event library works with the curl_multi_socket_action() family of functions and when curl is invoked with –test-event, it will call curl_easy_perform_ev instead of curl_easy_perform and the transfer should then work exactly as without –test-event.

The main test suite running script called ‘runtests.pl’ now features the option -e that will run all ~800 curl tests with –test-event. It will skip tests it can’t run event-based – basically all the tests that don’t use the curl tool.

Many sockets is slow if not done with events

This picture on the right shows some very old performance measurements done on libcurl in the year 2005, but the time spent growing exponentially when the amount of sockets grow is exactly why you want to use something event-based instead of something poll or select based.