I decided it was about time to upgrade my main development machine to something modern and snappy. It is 5.5 years ago since I bought my current work horse, a dual-core AMD Athlon 64 X2 5600+ (2.8GHz) equipped thing. I’m using my machine primarily for development. I never game. I decided to go for the higher end of what’s available to get me something to live with for several years to come.

All in all, this has two 120mm chassi fans, one 135mm fan on the big CPU cooler and there’s one fan in the PSU. I hope they won’t be causing too much noise or problems for me. The rather low-end graphics should keep the total power consumption (and thus heat production) at a decent level.

I purchased all the individual parts separately as I dislike how I can’t get an as optimized machine prebuilt from anywhere - I basically have to pay around 50% more, and then I still wouldn’t get the exact set of pieces I’d like. This way I also avoid the highly disturbing Microsoft tax prebuilt systems come with.

Unfortunately I got some bad luck included too, as when I first put everything together and pressed the power button nothing happened. Well, a single led was turned on but nothing else happened. It took me a while and some sweat to figure out where the problem lied and once I replaced the broken motherboard it would start properly and then I could proceed and install it.

Once my new machine (which now goes under the name Moo) gets settled, my old box will become my daughter’s new machine as hers existing tired old PIII machine isn’t really fun to do a lot with.

Utveckling och trender av multicorekretsar inom halvledarindustrin

Reverse engineering – egen kod på andras hårdvara

Keeping up with our fine tradition, we will be present at that huge open source conference called FOSDEM in Brussels Belgium at the beginning of February 2013. It will then become our… 4th (?) visit there. I don’t have any talk planned yet, but possibly I’ll suggest something later.

Fosdem is several thousand open source geeks in a massive scale conference with something like twenty different parallel tracks, where each room basically is organized and planned independently. There’s no registration and no entrance fee. I usually enjoy network and security related rooms and of course the embedded room, which unfortunately seems to be stuck in a very large room of the campus with the worst sound system and audio conditions…

I look forward to meet friends there and have a great time with open source talks and good Belgian beers at night! If you’ll be there too, let us know and we can meet up.

Within the twelve page document they discuss flaws in various APIs and other certificate checking software, and for libcurl they say:

Internally, it uses OpenSSL to verify the chain of trust and verifies the hostname itself. This functionality is controlled by parameters CURLOPT_SSL_VERIFYPEER (default value: true) and CURLOPT_SSL_VERIFYHOST (default value: 2). This interface is almost perversely bad. The VERIFYPEER parameter is a boolean, while a similar-looking VERIFYHOST parameter is an integer.

(The fact that libcurl supports no less than nine(!) different SSL library backends seems to have been ignored but is irrelevant.)

The final part is their focus. It is an integer option but it looks like it could be similar to the VERIFYPEER option which could be considered a boolean option. They go on to explain:

Well-intentioned developers not only routinely misunderstand these parameters, but often set CURLOPT_SSL_VERIFY HOST to TRUE, thereby changing it to 1 and thus accidentally isabling hostname verification with disastrous consequences

They back up their claim with some snippets from PHP programs showing wrong use in chapter 7.

What did the authors do to try to fix the problem before posting rude comments in a report? Nothing. At. All. They could’ve emailed, tweeted or posted a bug report or patch but none of that happened.

They also only post examples of the bad use made by PHP code. The PHP code uses the PHP/CURL binding and a change could easily be done in the PHP binding. I don’t know PHP internals, but perhaps the option could be made to not accept a boolean value instead of a numerical there.

We’re now discussing this topic on the libcurl mailing list. If you have ideas or suggestions or just comments, feel free to join in!

During our embedded Linux hacking event in Stockholm on October 20th I ran a little contest for the ones who wanted to participate. I created it entirely by myself to allow as many people as possibly to participate with them knowing me or me knowing them etc limiting the fun.

For your amusement I include the full contest here. If you want to try it out, then make sure you don’t attempt to google for any answers or otherwise use a machine/computer as a help.

Here I just want to mention that, as is shown in the above example question, ‘ace‘ is the correct character sequence and the letters should then be kept in that order in the final question. Also note that a character sequence can legally contain a dash as well. You will get 16 similar sequences of 1 to 3 letters, and those 16 sequences should be moved around to form the 17th question.

… at this point I fired off all the questions one by one at about 15-20 seconds per question. In this blog post I’ll take a shortcut and instead show you the final page I made that showed all questions at once, which I then left displayed for the remainder of the competition time. Click the image to get a full resolution version that is perfectly readable:

My take away from this contest is that it was harder than I anticipated and took a longer time to crack than I thought. I gave away few additional clues and hints as the time went on, but in the end I believe there were several persons who were very close to breaking it at almost the same time. In the end, Klas and Jonas presented the correct answer first and won the bottle of Champagne. I’m sure you appreciate their efforts after having tried this yourself!

The answers? Are you really sure? The correct answers and the final question with its answer is available…

I had a great time creating the competition and I believe the competitors appreciated it.

On September 10th, I sent out the invite to the foss-sthlm community for an embedded hacking event just before lunch. In just four hours, the 40 available tickets had been claimed and the waiting list started to get filled up as well… I later increased the amount to 46, we had some cancellations and I handed out more tickets and we had 46 people signed up at the day of the event (I believe 3 of these didn’t show up). At the day the event started, we still had another 20 people in the waiting list with hopes of getting a spot!

(All photos in this post are scaled down versions, click the picture to see a slightly higher resolution version!)

In Enea we had found an excellent sponsor for this event. They provided the place, the food, the raspberry pis, the coffe, the tshirts, the infrastructure and everything else that had to be there to make it an awesome day.

We started off the event at 10:00 on October 20 in the Enea offices in Kista, Stockholm Sweden. People dropped in one by one and were handed their welcome present containing a raspberry pi board, a 2GB SD card and a USB-to-serial cable to interface/power the board with. People then found their seats in the room.

There were fruit, candy, water and coffee to start off and keep the mood high. We experienced some initial wifi and internet access problems but luckily we had no less than two dedicated Enea IT support people present and they could swiftly fix the little hiccups that occurred.

Once everyone seemed to have landed, I welcomed everyone and just gave a short overview of what to expect from the day, where the toilets are and so on.

In order to try to please everyone who couldn’t be with us at this event, due to plans or due to simply not having got one of the attractive 40 “tickets”, Enea helped us arrange a video camera which we used during the entire day to film all talks and the contest. I can’t promise any delivery time for them but I’ll work on getting them made public as soon as possible. I’ll make a separate blog post when there’s something to see. (All talks were in Swedish!)

At 11:30 I started off the day for real by holding the first presentation. We used one of the conference rooms for this, just next to the big room where everyone say hacking. This day we had removed all tables and only had chairs in the room movie theater style and it turned out we could fit just about all attendees in the room this way. I think that was good as I think almost everyone sat down to hear and see me:

Open Source in Embedded Systems

I did a rather non-technical talk about a couple of trends in the embedded operating systems market and how I see the upcoming future and then some additional numbers etc. The full presentation (with most of the text in Swedish) can be found on slideshare.

I got good questions and I think it turned out an interesting discussion on how things run and work these days.

After my talk (which I of course did longer than planned) we served lunch. Three different sallads, bread and stuff were brought out. Several people approached me to say how they appreciated the food so I must say that Enea managed really well on that account too!

Development and trends in multicore CPUs

Jonas Svennebring from Freescale was up next and talked about current multicore CPU development trends and what the challenges are for the manufacturers are today. It was a very good and very technical talk and he topped it off by showing off his board with T4240 running, Freecale’s latest flagship chip that is just now about to become available for companies outside of Freescale.

On this photo on the left you see the power supply in the foreground and the ATX board with a huge fan and cooler on top of the actual T4240 chip.

T4240 is claimed to have a new world record in coremark performance, features 12 multi-threaded ppc cores in up to 1.8GHz.

There were some good questions to Jonas and he delivered good and well thought out answers. Then people walked out in the big room again to continue getting some actual hacking done.

We then took the opportunity to hand out the very nice-looking tshirts to all attendees, again kindly done so by Enea.

The Contest

The next interruption was the contest. Designed entirely by me to allow everyone to participate, even my friends and Enea employees etc. On the photo on the right you can see I now wear the tshirt of the day.

The contest was hard. I knew it was hard as I wanted it really make it a race that was only for the ones who really get embedded linux and have their brain laid out properly!

I posted the entire contest in separate blog post, but the gist of it was that I presented 16 questions with 3 answer alternatives. Each alternative had a sequence of letters. So after 16 questions you had 16 letter sequences you had to put in the right order to get a 17th question. The first one to give a correct answer to that 17th question would win.

A whole bunch of people gave up immediately but there was a core group who really fought hard, long and bravely and in the end we got a winner. The winner had paired up so the bottle of champagne went jointly to Klas and Jonas. It was a very close call as others were within seconds of figuring it out too.

I think the competition was harder than I thought. Possibly a little too hard…

Your own code on others’ hardware

Linus from Haxx (who shouldn’t be much of a stranger to readers of this blog) then gave some insights on how he reversed engineered mp3 players for the Rockbox project. Reverse engineering is a subject that attracts many people and I believe it has some sort of magic aura around it. Again many good questions and interested people in the room.

On the photo on the right you can see Linus’ stripped down hardware which he explained he had ripped off all components from in order to properly hunt down how things were connected on the PCB.

Coffee

We did not keep the time schedule so we had to get the coffee break in after Linus, and there were buns and so on.

Yocto

Björn from Haxx then educated the room on the Yocto Project. What it is, why it is, who it is and a little about how it is designed and how it works etc.

I think perhaps people started to get a little soft in their brain as we had now blasted through all but one of the talks, and as a speaker finale we had Henrik…

u-boot on Allwinner A10

Henrik Nordström did a walk-through explaining some u-boot basics and then explained what he had done for the Allwinner targets and related info.

I believe the talks were kind of the glue that made people stick around. Once Henrik was done and there was no more talks planned for the day, it was obvious that it was sort of the signal for people to start calling it a day even though there was still over one hour left until the official end time (20:00).

Of course I don’t blame anyone for that. I had hardly had any time myself to sit down or do anything relaxing during the day so I was kind of exhausted myself…

Summary

I got a lot of very positive comments from people when they left the facilities with big smiles on their faces, asking for more of these sorts of events in the future.

I am very happy with the overly positive response, with the massive interest from our community to come to such an event and again, Enea was an awesome sponsor for this.

I didn’t get anything done on the raspberry pi during this day. As a matter of fact I never even got around to booting my board, but I figure that wasn’t a top priority for me this day.

The crowd size felt really perfect for these facilities and 40 something also still keeps the spirit of familiarity and it doesn’t feel like a “big” event or so.

Will I work on making another event similar to this again? Sure. It might not happen immediately, but I don’t see why it can’t be made again under similar circumstances.

Credits

All photos on this page were taken by me, Björn Stenberg, Kjell Ericson, Mats Lidell and Mia Åkerberg.

Thanks to Jonas, Björn, Linus and Henrik for awesome talks.

Thanks to Enea for sponsoring this event, and Mia then in particular for being a good organizer.

Today I disconnected my ADSL modem through which I still had the landline telephone service running. I’ve cancelled the service so at the end of the month it will die and I’m now instead signing up to one of them lesser known companies that offer cheap IP-telephony that is network independent: Affinity telecom. Staying with only phone service over the copper was not a good idea since the cheapest service is still expensive over that medium.

Why have a landline phone at all? I’ve decided to stick with a landline telephone for a few reasons: first, our phone number is distributed widely and it is convenient to keep it working and keeping it reach our house instead of a specific person. Secondly, we have two kids who like to phone at times and they can use this fine and it’ll end up cheaper than if they’d use their own mobile phones. And thirdly, the parents-in-law factor. My parents and my wife’s parents etc like calling landlines instead of mobile phones, I think partly due to old habits but also partly because it is cheaper for them…

When I researched which service to pick I discovered that not a single operator exists in Sweden that only charges for usage. They all have a minimum monthly fee, so I went with one of the cheapest I could find on prisjakt for my usage pattern.

I simply yanked the RJ11 from the ADSL modem and inserted into my new “Ping Communications Voice Catcher 201E” device (see picture) that I received. The instructions that come with the device say I should connect it directly to internet and then let my existing LAN traffic route through it. I don’t want to add yet another box between me and the internet and I fear that a cheapo box like this might cause problems in one way or another if I do, so I just plugged this new toy into my router and after I while I could happily confirm that it worked just as nicely. It got an IP from my DHCP server and I can call my old-style analog phone now (I love number portability) and I can use it to call out to my mobile.

Microsoft admits the WSApoll function is broken but won’t do anything about it. Unless perhaps if customers keep nagging them.

Doing portable socket programming has always meant using a bunch of #ifdefs and similar. A program needs to be built on many systems and slowly adjusted to work really well all over. For ages, for example, Windows only supported select() and not poll() while all sensible systems[*] out there supported poll(). There are several reasons to prefer poll to select when writing code.

Among the many improvements to the Winsock API shipping in Vista is the new WSAPoll function. Its primary purpose is to simplify the porting of a sockets application that currently uses poll() by providing an identical facility in Winsock for managing groups of sockets.

If you have experience developing applications using poll(), WSAPoll will be very familiar. It is designed to behave just like poll().

Emphasis added by me. It was (of course) made to work like poll, and that’s why the API is made like that. Why would you introduce something that is almost like poll() except in minor details?

Since the new function only was available in Vista and later, it took a while until libcurl users in a more wider scale got to use it but over time Windows XP users are slowly shifting away and more and more libcurl Windows users therefore use the WSApoll based builds. Life seemed to be good. Some users noticed funny things and reported bugs we couldn’t report (on other platforms) but nothing really stood out and no big alarm bells went off.

During July 2012, a user of libcurl on Windows, Jan Koen Annot experienced such problems and he didn’t just sigh and move on. He rolled up his sleeves and decided to get to the bottom. Perhaps he could fix a bug or two while at it? (It seems reasonable that he thought so, I haven’t actually asked him!) What he found was however not a bug in libcurl. He found out that WSApoll did indeed not work like poll (his initial post to curl-library on the problem)! On August 1st he submitted a support issue to Microsoft about it. On August 7 we pushed the commit to curl that removed our use of WSApoll.

It turns out Microsoft already knew about this bug, which they apparently have named “Windows 8 Bugs 309411 – WSAPoll does not report failed connections”. The ticket has been resolved as Won’t Fix… (I haven’t found any public access of this.)

Jan argued for the case that since WSApoll is designed and used as a plain poll() replacement it would make sense to actually make it also work the same way:

First, it will cost much time to find out that some ‘real-life’ issue can be traced back to this WSAPoll bug. In my case we were lucky to have a regression test which triggered when we started using a slightly different cURL-library configuration on Windows. Tracing back that the test was triggered because of this bug in WSAPoll took several hours. Imagine what it would cost, if some customer in the field reported annoying delays, to trace such a vague complaint back to a bug in the WSAPoll function!

Second, even if we know beforehand about this bug in WSAPoll, then it is difficult to determine in which situations in your code you can safely use WSAPoll and in which situations you might suffer from this bug. So a better recommendation would be to simply not use WSAPoll. (…)

Third, porting code which uses the poll() function to the Windows sockets API is made more complex. The introduction of WSAPoll was meant specifically for this, so it should have compatible behaviour, without a recommendation to not use it in certain circumstances.

Fourth, your recommendation will only have effect when actively promoted to developers using WSAPoll. A much better approach would be to repair the bug and publish an update. Microsoft has some nice mechanisms in place for that.

So, my conclusion is that, even if in our case the business impact may be low because we found the bug in an early stage, it is still important that Microsoft fixes the bug and publishes an update.

In my eyes all very good and sensible arguments. Perhaps not too surprisingly, these fine reasons didn’t have any particular impact on how Microsoft views this old and known bug that “has been like this forever and people are already used to it.”. It will remain closed, and Microsoft motivated this decision to Jan quite clearly and with arguments one can understand:

A discussion has been conducted around this topic and the taken decision was not to have the fix implemented due to the following reasons:

This issue since Vista

no other Microsoft customer has asked for a Hotfix since Vista timeframe

fixing this old issue might have some application compatibility risk (for those customers who might have somehow taken a dependency on WSAPoll failing with a timeout in the cases of connect failure as opposed to POLLERR).

This API will become more irrelevant as the Windows versions increase; the networking APIs will move away from classic select/poll to more advanced I/O completion mechanisms.

Argument one and two are really weak and silly. Microsoft users are very rarely complaining to Microsoft and most wouldn’t even know how to do it. Also, this problem may certainly still affect many users even if nobody has asked for a fix.

The compatibility risk is a valid point, but that’s a bit of a hard argument to have. All bugs that are about behavior will of course risk that users have adapted to the wrong behavior so a bug fix may break those. All of us who write and maintain stable APIs are used to this problem, but sticking to the buggy way of working because it has been doing this for so long is in my eyes only correct if you document this with very large letters and emphasis in all documentation: WSApoll is not fully emulating poll – beware!

The fact that they will focus more on other APIs is also understandable but a beside the point. We want reliable APIs that work as documented. Applications that are Windows-only probably already very rarely use WSApoll, it will probably remain being more important for porting socket style programs to Windows.

Jan also especially highlights a funny line from this Microsoft person:

Okay, so even if they have motives why they won’t fix this bug they seem to hint that if more customers nag them about it they might change their minds. Fair enough. But the users of libcurl who for five years perhaps experienced funny effects are extremely unlikely to ever report and complain to Microsoft about this. They are way more likely to complain to us, or possibly to just work around the issue somehow.

Of course, users of WSApoll can adapt to the differences and make conditional code that handles them and that could be what we end up with in the curl project in the future if we just get volunteers to adapt the code accordingly. In the mean time we’ve just reverted to the old select()-using code instead, since select() does in fact mimic the “real” select much better…

[*] = clearly Mac OS X is not a sensible system since its poll() implementation is even worse than Windows and is mostly broken or just unreliable. Subject for another blog post another time.

My first speed checks using the Swedish bredbandskollen service were a bit disappointing since it only showed something like 50mbit down and 80mbit up. I decided to ignore that fact for the moment as things were new and I had some other more annoying issues.

I detected that sometimes, on some specific sites I had problems to get HTTP! The TCP connection would get connected and data would get sent, but it would stall somewhere and then get disconnected. This showed up when for example my wife tried to download a Spotify client and my phone got trouble to download some of my favorite podcasts. Some, not all. Possibly one out of ten or twenty sites showed the problem. Most of the sites I use frequently worked flawlessly and I would only ever see the problem when I tried HTTP.

Puzzling!

I could avoid the problem by setting up a SSH connection to my server and running that as a SOCKS proxy, and so I could still get service even from the sites I apparently couldn’t quite use. I tried to collect the problematic sites and I tried traceroute etc on all the ones that failed in order to get data that would help me and my operator to pinpoint the problem. I reported this problem to my ISP really early on, but they too were puzzled and it never got far in their end.

I was almost convinced they had some kind of traffic shaping thing in the middle that was broken for HTTP somehow.

Time to fix it once and for all

Finally one day I stopped being nice with the support people and I demanded that they would send over a guy and fix it. It wasn’t good enough that they would find that everything is OK from remote. I clearly had problems and I could escape the problems by switching over to my old ADSL so I knew it wasn’t due to my computers’ configs or due to my own firewalls or routers.

I was also convinced my ISP would get me some cable guy coming over when the problem wasn’t really in the cable, but sure it would be a necessary first step towards finding the real error.

They sent a cable guy, and it took him like three minutes to detect a bad signal level on the fiber, meaning that the problem was certainly not in my house anyway. He then drove down to the “station” that terminates my fiber, and after having “polished” that end of the fiber connection too he called me and said that he couldn’t spot any problem anymore and asked me to verify…

Now my checks showed me 80-90 mbit in both directions and the sites that used to give me problems all worked just fine.

It was the cable all along. Bad signal level. “A mystery it worked at all for you” my cable person said.

I’m left with my over-analyzed problem suspecting all sorts of high level stuff. But why did this only affect some sites? How come I could circumvent this with SOCKS? Gah, my brain hurts trying to answer these questions…

Back in 2002 I realized that having libcurl not do SSL server verification by default basically meant that everyone writing libcurl apps would inherit that flaw, simply because most people always just let the defaults remain unless they really have to read up on what something does and then modify them. If things work, things will just remain. So when we shipped libcurl 7.10 on the first of October that year, libcurl started verifying server certs by default.

Fast forward about ten years.

Surely SSL clients everywhere now do the right thing?

One day a couple of months ago, I was referred to this bug report for the pyssl module in Python which identifies that it doesn’t verify server certs by default! The default SSL handler in Python doesn’t verify the certificate properly. It makes all python programs that use this without special attention vulnerable for man in the middle attacks.

So let’s look at the state of another popular language: PHP. A plain standard PHP program opens a ssl:// or tls:// stream. Unless the author of said program knows and understands these things, it too runs without verifying server certs. If a program instead decides to use the PHP/CURL binding for HTTPS or similar, it will use libcurl’s default which verifies it (as I explained above).

But not everything is gloomy. Some parts of our community have decided to do the right thing:

I was told (and proven) that Ruby now does the right thing, but I don’t know how recent that is and thus how many older Ruby programs that suffer.

The same problem existed with perl’s major HTTPS using module, the LWP, for a very long time. The perl camp however already modified LWP to do verification by default with the release of libwww-perl 6.00, released in March 2011.

Side-note: in the curl project we make it easy for everyone on the Internet to use Firefox’s excellent CA cert bundle to verify server certs by providing the Firefox CA cert collection converted to PEM – the preferred format for OpenSSL, GnuTLS and others.

Conclusion:

Even today, lots and lots of applications and scripts will remain insecure – even though they probably think they’re fairly safe when they switch to a HTTPS or SSL using protocol – and might be subject for man-in-the-middle attacks without even being able to spot it. I think it is pretty sad.