Monday, November 10, 2008

Another great source of information on practical, real-world concurrency is the "Effective Concurrency" column that Herb Sutter is writing for Dr. Dobb's Journal. I've been reading the columns in the print issues, but you can find them by searching for "Herb Sutter" on the web site. The latest column, in the issue that arrived today at the Palatial Overclock Estate (a.k.a. the Heavily-Armed Overclock Compound), is "Understanding Parallel Performance".

Sunday, November 09, 2008

Bryan Cantrill and Jeff Bonwick, both of Sun Microsystems, have written "Real-world Concurrency", published online in the September edition (6.5) of ACMQueue. It's a great set of tips, tricks, and lessons learned for those of us that have to deal with multi-threaded code. I've already passed it along to several of my colleagues working in C, C++ or Java on embedded or multi-core platforms. Queue is the Association for Computing Machinery's online journal targeted at near-term real-world computing issues.

Saturday, November 08, 2008

That's a remark made by my mentor Bob Dixon when he was my thesis advisor circa 1984. It's one of those pieces of wisdom that we accumulate and carry around in our brains for the rest of our lives. It had reason to percolate to the surface as I read Why Current Publication Practices May Distort Science by N. Young et al., published in the Medicine blog of the Public Library of Science. Young and his coauthors discuss the effects of "The Winner's Curse" on how scientific results are (or are not) published.

When an item is auctioned, potential buyers make offers, or bids, on the item. Although the dollar amounts of the bids may not be normally distributed, for sure they are spread out. Some bidders bid low, some bid high, depending on how each perceives the value of the item being auctioned. What is the actual market value of the item? One definition might be the average of all the bids. But who actually gets the item being auctioned? The highest bidder. That means, by this definition of market value, the winner of the auction always pays above market value for the item. That's "The Winner's Curse".
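The arithmetic behind this definition can be checked with a small simulation. This is only a sketch: the number of bidders, the uniform spread of bids between 50 and 150 dollars, and the function name `run_auction` are all assumptions made for illustration, not anything taken from Young et al.

```python
import random
from statistics import mean


def run_auction(num_bidders, rng):
    """Simulate one auction; return (market value, winning bid).

    Market value is defined, as in the text, as the average of all bids.
    The uniform 50..150 spread of bids is an illustrative assumption.
    """
    bids = [rng.uniform(50.0, 150.0) for _ in range(num_bidders)]
    return mean(bids), max(bids)


rng = random.Random(2008)  # fixed seed so the run is repeatable
overpayments = []
for _ in range(1000):
    market_value, winning_bid = run_auction(10, rng)
    overpayments.append(winning_bid - market_value)

# The highest bid can never be below the average of all the bids.
min_overpayment = min(overpayments)
average_overpayment = mean(overpayments)
print(f"smallest overpayment: {min_overpayment:.2f}")
print(f"average overpayment:  {average_overpayment:.2f}")
```

However the bids are distributed, the smallest overpayment is never negative, because a maximum cannot be less than a mean.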

Of course, it's not that simple. The winner may be bidding higher because they believe they have special knowledge or capabilities that will let them leverage the item to greater value than the other bidders. Or they may be desperate. Or just foolish. (I personally have fallen into that category at one art auction, and ended up $250 poorer because of it.) But in general, auctions are good for the seller, maybe not so good for the buyer.

Young et al. apply this to scientific publications. Research with the most spectacular results tends to be what gets published. And, broadly speaking, research with positive results has a much better chance of getting published than research with negative results. Researchers whose projects yield positive results win the auction for space in refereed journals.

This gives a false perception of the state of a particular area of research, since projects that yield negative results are not published and hence are not part of the "average" perceived by those who read the scientific journals. In a "publish or perish" academic climate, there is a strong motivation to produce only positive results, to quickly terminate research that yields negative results, or even to phrase results to make them appear positive. Surely negative results are just as valuable, since they can prune the tree of possible avenues of inquiry for other researchers. But such research is seldom published.
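The distortion of the perceived "average" can also be sketched numerically. Everything here is a made-up illustration: the true effect of zero, the Gaussian noise, and the publication threshold of 1.0 are assumptions, not figures from the paper.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the run is repeatable

TRUE_EFFECT = 0.0          # assumed: the thing being studied has no real effect
PUBLISH_THRESHOLD = 1.0    # assumed: only results this "positive" get published

# Each study measures the true effect plus random noise.
results = [rng.gauss(TRUE_EFFECT, 1.0) for _ in range(10_000)]

# Only the spectacular results win space in the journals.
published = [r for r in results if r > PUBLISH_THRESHOLD]

all_mean = mean(results)
published_mean = mean(published)
print(f"mean of all studies:       {all_mean:+.3f}")
print(f"mean of published studies: {published_mean:+.3f}")
print(f"fraction published:        {len(published) / len(results):.1%}")
```

A reader who sees only the published studies would estimate an effect above the threshold even though, by construction, there is none.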

Long time readers are just waiting for me to say this: this is yet another example of measurement dysfunction.

I've been accused by my friend and occasional colleague Rodney Black of having an academic bent. (He sincerely meant it as a compliment, and I don't deny it.) So I appreciate this issue that is perhaps of interest mostly to those who spend most of their time on the R end of the R&D spectrum. (Although since this affects much medical research, including the clinical trials done by pharmaceutical companies, it probably should be of great concern to all of us.)

But there is an equally strong bias at the D end of the spectrum too. Technology efforts that crash and burn seldom get discussed, seldom get written up, are usually quietly buried and the participants sent to the career equivalent of Guantanamo Bay, or maybe Gulag Archipelago. But like those research projects yielding negative results, these failed efforts would be valuable as object lessons, cautionary tales, and as post mortems on what to do differently next time.

This too leads to a false sense of the "average" of the state of the art. It causes artificially inflated expectations, frequently on the part of upper management. All they see in the in-flight magazine are the success stories. "History is written by the victors", as Winston Churchill once said.

I'm as guilty of this as anyone else. I've had a few spectacular failures in my career. But you'd have to ply me with more than a few gin and tonics to get me to talk about them. (I encourage you to try.) That may be human nature, but it's wrong.

Two-time Nobel Prize winner Linus Pauling once said "The best way to have a good idea is to have lots of ideas". Meaning, most of your ideas aren't going to work out. If you have "a track record of success", it's only because you've done a very good job at hiding your failures.

Not only is failure an option, it's downright necessary.

(Update 2009-02-18)

I was having breakfast the other day with two of my oldest friends, Brian and Paul, who were the closest thing to homies I had when I was in college in the 1970s. We were discussing our career ups and downs. I was a little surprised to hear someone say that the few real disasters I had had in my career were in hindsight the best things that had ever happened to me. I was even more surprised to realize it was me saying it. But it's true. Funny how life works out.

(Update 2012-12-21)

I eventually did write about the most spectacular failure of my career, and how it was simultaneously my biggest stroke of luck, in Dead Man Walking.

Sunday, November 02, 2008

In From Diminuto to Arroyo I described Arroyo, another version of my Linux-based project to teach embedded software design and assembly language programming on a commercially available evaluation board, the ARM-based AT91RM9200-EK. My earlier project, Diminuto, was completely memory resident. It is a good example of smaller "diskless" embedded Linux systems. Arroyo uses a Secure Digital (SD) card as its root file system "disk", and can similarly use USB drives too. It is a good example of larger "diskful" embedded Linux systems.

Understand that Diminuto thinks it has a disk, when all it has is RAM. Arroyo thinks it has a disk, but is actually using flash-based devices that conspire with their device drivers to emulate IDE and SCSI disks. There's not a traditional spinning disk (or any other moving part) in either system. Diminuto still has all the drivers and utilities needed to mount and use SD cards and USB drives; it just doesn't need them to run.

Arroyo supports a variety of file systems like EXT2 and EXT3 (also popular in PC-based Linux systems) and VFAT (or FAT-32, popular in Windows systems). Most SD and USB storage devices come right out of the package pre-formatted for VFAT. Although VFAT doesn't support some traditional UNIX features like soft links, these devices can still be used directly without reformatting or reinitialization on Diminuto or Arroyo. I like the EXT3 file system in particular, because like EXT2, EXT3 supports all the standard UNIX features. But unlike EXT2, it is a journaled file system. This makes it extremely robust, especially in the face of the kind of treatment that consumer devices have to expect, like being powered off in the middle of operations.

The latest version of the Arroyo root file system image includes a full set of EXT2 and EXT3 file system maintenance utilities that allow you to fix and update your "disk-based" Arroyo file systems without resorting to using the same tools off-line on a server (which is how I originally built the first Arroyo root file system on an SD card). It also includes a tool that allows you to examine, initialize, and modify partitions on your SD and USB storage devices. In this article I'll show you how to download a new Arroyo file system image from a server and install it using nothing but tools on Arroyo itself.

We will begin by using an SD card as our original root file system,

create an EXT3 file system on a USB drive,

download the new root file system image (which is just a file) from a server to that USB drive,

install the root file system image on a second USB drive,

reboot the system and temporarily use the USB drive as our new root file system,

install the same root file system image on our original SD card, and

reboot the system to use the new root file system image back on our original SD card.

Notice at no point do I actually boot the Linux operating system off an SD card or USB drive. Both Arroyo and Diminuto still depend on booting the Linux kernel from a TFTP server, and Diminuto's root file system image is embedded inside its kernel image. We use a TFTP server because the version of the U-Boot boot loader on the AT91RM9200-EK board doesn't have the capability of booting from a storage device. Later versions of U-Boot have this capability, and I have used it to good effect on other projects. That would eliminate the need for a TFTP server for Arroyo. But updating to a new version of U-Boot requires some expensive hardware, like a flash programmer, and is a story for another time.

(Blogger wraps or even truncates long lines of preformatted text. If you want to see a complete verbatim log of this process, including all of my typos and other mistakes, you can find one here.)

Insert a USB drive into one of the two host USB ports on the EK board. Note that the kernel hotplug facility detects it automatically after a few seconds and gives it the name "/dev/sda1". That's short for "SCSI Disk #A Partition #1", because the USB storage device driver in Linux emulates a SCSI device.

    This filesystem will be automatically checked every 28 mounts or
    180 days, whichever comes first.  Use tune2fs -c or -i to override.

Create an EXT3 journal file on the USB drive. This turns it into an EXT3 file system. Other than the journal file (which is an invisible file by virtue of having no directory entry), EXT2 and EXT3 file systems are bit for bit compatible. You can mount an EXT3 file system as an EXT2 file system (but then the journal won't be used).

    # tune2fs -j /dev/sda1
    tune2fs 1.41.2 (02-Oct-2008)
    Creating journal inode: done
    This filesystem will be automatically checked every 28 mounts or
    180 days, whichever comes first.  Use tune2fs -c or -i to override.

Mount the newly minted EXT3 file system so that we can use it.

    # dd if=arroyo-root.ext2 of=/dev/sdb1 bs=1M
    127+1 records in
    127+1 records out

Run a file system check on the second USB drive, which now appears to have an EXT2 file system on it. It will be a small file system, a little over a hundred megabytes, because that is the size of the root file system image that we downloaded.

    # e2fsck -f /dev/sdb1
    e2fsck 1.41.2 (02-Oct-2008)
    Filesystem did not have a UUID; generating one.

    /dev/sdb1: ***** FILE SYSTEM WAS MODIFIED *****
    /dev/sdb1: 4438/7168 files (0.2% non-contiguous), 111875/130700 blocks

Resize the file system on the second USB drive so that it consumes the entire USB drive, which is about two gigabytes.
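The numbers in the dd and e2fsck logs above can be cross-checked with a little arithmetic. This sketch assumes the EXT2 image uses 1024-byte blocks (the logs do not state the block size) and that dd's "1M" means 1,048,576 bytes.

```python
KIB = 1024
MIB = 1024 * KIB

BLOCK_SIZE = KIB       # assumption: 1024-byte EXT2 blocks
TOTAL_BLOCKS = 130700  # total blocks reported by e2fsck
USED_BLOCKS = 111875   # used blocks reported by e2fsck

image_bytes = TOTAL_BLOCKS * BLOCK_SIZE
image_mib = image_bytes / MIB
used_mib = USED_BLOCKS * BLOCK_SIZE / MIB

# dd with bs=1M reports "<full records>+<partial records>".
full_records, leftover_bytes = divmod(image_bytes, MIB)
partial_records = 1 if leftover_bytes else 0

print(f"image size: {image_mib:.1f} MiB, of which {used_mib:.1f} MiB is in use")
print(f"dd would report {full_records}+{partial_records} records")
```

Under those assumptions the image works out to 127 full one-megabyte records plus one partial record, which agrees with the "127+1 records" that dd reported.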

Reboot the system and temporarily use the USB drive as our new root file system.

This is an important step, and it doesn't show up in this log: remove the first USB drive, the one to which we FTPed the root file system image. Leave the second USB drive containing the new root file system. It doesn't matter which USB host port it's in. I'll explain why below.

    # sync
    # sync
    # sync
    # reboot
    The system is going down NOW!
    Sending SIGTERM to all processes
    Sending SIGKILL to all processes
    Requesting system reboot
    Restarting system.

    Environment size: 459/8188 bytes

Change the bootargs variable in the U-Boot bootloader so that it boots from "/dev/sda1". Why "/dev/sda1" and not "/dev/sdb1"? The naming of the USB storage devices by the hotplug process depends on the order in which the devices are detected, not on the USB host port into which they are inserted. If you boot the system with two USB devices inserted, the order in which they will be detected is non-deterministic, depending on the timing of the USB hardware inside the drives enumerating to the USB controller on the EK board. We only leave the USB drive that we want to use as our root file system inserted, so we can be sure to know what name to provision here. We also set the root delay to 5 seconds. This causes the boot process to pause long enough for the hotplug facility in the kernel to detect the USB drive before we try to mount it as our root file system.
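As a rough sketch of what that change amounts to, the kernel command line can be thought of as a string of space-separated options. The `root=` and `rootdelay=` options are standard Linux kernel parameters, and `setenv` is the U-Boot command for setting an environment variable. The helper name `build_bootargs` is hypothetical, and the actual bootargs string from my log, which surely contained other options as well, is not reproduced here.

```python
def build_bootargs(root_device, root_delay_seconds, extra_options=()):
    """Assemble a kernel command line from its parts (illustrative helper).

    root=      tells the kernel which device holds the root file system.
    rootdelay= tells the kernel how many seconds to wait before mounting it,
               giving hotplugged devices time to be detected.
    """
    options = [f"root={root_device}", f"rootdelay={root_delay_seconds}"]
    options.extend(extra_options)
    return " ".join(options)


# Only one USB drive is inserted, so it will be detected as /dev/sda.
usb_bootargs = build_bootargs("/dev/sda1", 5)

# What one might then type at the U-Boot prompt.
print(f"setenv bootargs {usb_bootargs}")
```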

    This program is distributed in the hope that it will be useful, but WITHOUT ANY
    WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
    PARTICULAR PURPOSE.  See the GNU General Public License for more details.

       Device Boot      Start         End      Blocks   Id  System
    /dev/mmcblk0p1p1        1         247     1983996   83  Linux
    Warning: Partition 1 does not end on cylinder boundary.

    Command (m for help): q

Install the file system image on the SD card much like we did on the USB drive. Run a file system check on it, and resize it.

Reboot the system to use the new root file system image on our original SD card.

We just use the standard "arroyo" and "start" scripts that we used when we first brought the system up. But this time, the root file system will be the one we just installed.

    # sync
    # sync
    # sync
    # reboot
    The system is going down NOW!
    Sending SIGTERM to all processes
    Sending SIGKILL to all processes
    Requesting system reboot
    Restarting system.


       Device Boot      Start         End      Blocks   Id  System
    /dev/sdb1p1             1         243     1951866   83  Linux
    Warning: Partition 1 does not end on cylinder boundary.

    Command (m for help): q
    # mount
    rootfs on / type rootfs (rw)
    /dev/root on / type ext2 (rw)
    proc on /proc type proc (rw)
    devpts on /dev/pts type devpts (rw,gid=5,mode=620)
    tmpfs on /tmp type tmpfs (rw)
    #

Part of the beauty of using Linux as an embedded operating system is that the tools and techniques you use to work on it are the same as those you use on your server. In fact, the commands I used to build my first root file system on an SD card for Arroyo on my Ubuntu-based Dell server were almost identical to those I used here on Arroyo itself, differing only slightly in the device names because of differences in hardware between the two systems.

Sunday, October 19, 2008

In Can unit testing be a waste? pbielicki argues that it is far more efficient to test at the highest possible level of abstraction and rely on your code coverage tools to ensure that all code paths have been tested. That might work in his world, but just a couple of days after having read that article I ran into a situation where developers had been doing just that and missed a bug completely.

The C++ code was implementing an algorithm in an open specification for the encoding of an identity string into a binary format for transmission to a remote piece of equipment. The identity string is a globally unique fifteen digit decimal number. The string is broken up into several separate fields and each field is encoded from decimal into binary before being stuffed into a protocol packet destined for the remote equipment. There are several encoding algorithms depending on the length of the specific field, which ranges from one to three decimal digits.

Looking at the couple of dozen lines of C++ code in the code base, I thought "It would be reasonable to assume that if any valid identity string were to be encoded into binary, then decoded back into a decimal string, the result would match exactly the original fifteen digit string." So knowing exactly nothing yet about the specific algorithms from the spec, I wrote a little unit test to do precisely that.

How many of the possible values did I test for each encoding algorithm? All of them, of course. On a 2.8 gigahertz Pentium 4, cycles are easy to come by.

What did I find? Twenty percent of the possible values failed this simple unit test.

I am pleased to report, however, that code coverage was one hundred percent. In fact, testing any single input value for any of the algorithms would have yielded code coverage of one hundred percent. Given that eighty percent of the possible values passed unit testing, it would be easy, in fact, likely, that you would not catch this bug just testing a few selected values. Conceivably you could test a lot of values, and still not have any failures, if you just happened to stay within the eighty percent that worked.

It wasn't a matter of code coverage. It was a matter of input range coverage.
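Here is a minimal sketch of that kind of exhaustive round-trip test. It is not the code or the specification I was working with: the encoder, both decoders, and the deliberately planted bug (dropping leading zeros) are hypothetical stand-ins chosen to show the same effect.

```python
def encode_field(digits):
    """Hypothetical encoder: pack a decimal digit string into an integer."""
    return int(digits)


def decode_field_buggy(value, width):
    """Hypothetical decoder with a planted bug: it drops leading zeros."""
    return str(value)


def decode_field_fixed(value, width):
    """Hypothetical corrected decoder: pads back to the field width."""
    return str(value).zfill(width)


def round_trip_failures(encode, decode, width):
    """Encode and decode every possible field value; return the mismatches."""
    failures = []
    for number in range(10 ** width):
        original = str(number).zfill(width)
        if decode(encode(original), width) != original:
            failures.append(original)
    return failures


# Fields range from one to three decimal digits, so test every value of each.
for width in (1, 2, 3):
    buggy = round_trip_failures(encode_field, decode_field_buggy, width)
    fixed = round_trip_failures(encode_field, decode_field_fixed, width)
    total = 10 ** width
    print(f"width {width}: buggy fails {len(buggy)}/{total}, fixed fails {len(fixed)}/{total}")
```

In this toy version, any single input executes every line of the buggy decoder, so line coverage is complete after one test case, yet one hundred of the one thousand three-digit values (those with a leading zero) do not survive the round trip. Only walking the whole input range finds them.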

I have written unit tests that ran for many minutes, testing a huge range of input values. For an algorithm that did time and date calculations, I had a unit test that ran for days. Fortunately, I didn't have to run it very often. But when I did run it, and it completed successfully, I was pretty darn sure that the underlying code worked.

Cycles are cheap. When dealing with what are fundamentally mathematical algorithms, there is no reason not to test a lot of values. When in doubt, test all of them.

Friday, September 26, 2008

I recently took a nine day motorcycle trip from Denver Colorado to Dayton Ohio to visit family and friends. I rode my BMW R1100RT about 2500 miles (maybe 4000 kilometers) round trip. When I'm riding solo eight to ten hours a day without benefit of distractions like music or conversation, I spend a lot of time paying attention to what goes by and listening to what Mrs. Overclock (a.k.a. Dr. Overclock, Medicine Woman) calls my ThinkMan™. Here is some of my play list for that trip.

Route US-36 splits off from Interstate 70 just East of Denver. US-36 has a completely different character depending on what state you're traveling through. In Eastern Colorado it's kind of spooky, passing through ghost towns of shuttered buildings. In Kansas it's a two lane road with lots of local traffic and a little town every fifty miles or so. In Missouri, it's a four-lane divided controlled-access highway. In Illinois, it is either nothing but two-lane asphalt going through endless corn fields, or it merges with I-72. US-36 ends in Indianapolis Indiana, where I hopped onto I-70 for the final run to Dayton. Whereas I-70 is just a vast ribbon of asphalt from Denver to Dayton, US-36 has enough variety to keep your attention.

I like staying in the little Mom and Pop motels along the way, where you ring the bell at the counter and someone leaves the television in their living room to check you in. I was traveling with my Nokia N810, so tiny it takes up almost no room in my motorcycle luggage. The internet tablet and the availability of WiFi made it easy to keep up with my email or do a little web cruising. I was amazed at how many of these little motels had WiFi. For example, both motels in Smith Center Kansas (near the geographic center of the contiguous United States) had hand-written signs "WiFi Internet" in their office windows.

I realized it was cable television that made the WiFi easy. When cable television became a "must have" for every little motel, it forced them to put in the broadband infrastructure. Once that was done, getting internet access over the cable and installing a LinkSys WAP was a piece of cake. And nearly all of them were LinkSys: seldom did I see the WAP SSID changed from its out-of-the-box default.

At first I thought that seeing URLs on billboards advertising farm equipment in the middle of Kansas was a sign that the web really had gone mainstream. But I suspect that farmers in the middle of Kansas may have realized the usefulness of the web and the internet long before many city dwellers did. When you're relatively isolated, being able to mouse-up Amazon.com and order just about anything from a Long Tail selection larger than any urban store might seem especially important. And since the internet was originally designed so that military communication would survive a nuclear war, being able to stay in touch in the aftermath of tornadoes and blizzards is pretty darned useful too.

My experience is that farmers and ranchers are pragmatic adopters and exploiters of not just bio-tech and agri-tech, but just about any useful-tech. A friend of mine once worked for a company that developed a local area network technology designed to transmit across barbed wire. Another worked with a rancher who tagged his cattle with RFID-like devices to automatically track how long each animal stood at the feeding trough. This is pretty cool stuff.

The ghost towns along US-36 got me to thinking, is there some critical mass that is necessary to keep a town alive? Did the conversion of the more southern Route US-40 to I-70 kill the towns in Colorado, but not those in Kansas and Missouri?

When I was but a lad growing up in Ohio, shortly after the glaciers receded, I had reason to spend some part of most summers in rural Eastern Kentucky. No running water, one broadcast television channel, electricity most of the time. What time wasn't spent on chores was spent reading, shooting, crashing a dirt bike, or just goofing around. A twenty minute drive in either direction on the road would get you, depending on which direction you turned out of the gravel driveway, to Grayson or Sandy Hook. Both towns had main streets maybe a block or two long with no building taller than two stories. Sandy Hook had a bit of an edge, since it was the seat of the county where my family owned land, and had the high school and the funeral home.

Decades later Mrs. Overclock and I flew into Lexington Kentucky and rented a car to drive to Sandy Hook for a funeral, while staying at a motel in Grayson. I was startled by the contrast between the two towns. Grayson was huge and sprawling, with many choices of chain hotels, restaurants, and retail. Sandy Hook looked like a ghost town, with nearly all the buildings shuttered and boarded up. Only the high school, the funeral home, and the county buildings remained.

It was the interstate highway of course. In the intervening decades, I-64 had been built, and it had a Grayson exit. Also, the U. S. Army Corps of Engineers undertook a huge flood control project between the two towns (incidentally permanently flooding the property where my mom's family originally had their farm). The resulting dam, reservoir and lake, and no doubt the beautiful Appalachian scenery, motivated Kentucky to develop a state park and recreation area. Regardless of the fact that the lake was pretty much in between both towns, it was named "Grayson Lake State Park". Tourism joined tobacco and coal mining as a major industry in the area. Grayson prospered. Sandy Hook mostly disappeared.

During the intervening decades between wandering in the woods with a firearm and flying around the country with Mrs. Overclock, I was a student of Computer Science at Wright State University. The WSU main campus is in Fairborn Ohio on what was once farm land. (Indeed, my thesis advisor and mentor Bob Dixon was the first faculty member hired by the University, and his initial office was in a farm house standing on the property.) And when I was there, it still mostly looked like farm land: woods and rolling hills. Unfortunately, that was about all there was around campus. If you wanted to dine off campus, you had to get in a car and drive at least fifteen minutes even to get fast food.

Then Interstate 675, a bypass around Dayton Ohio connecting I-75 and I-70, was built, with several exits for campus and the nearby Wright-Patterson Air Force Base. You can guess how this story ends. I barely recognize the area now. Enormous development, condos, retail, dining. And no doubt at least in part due to the greater accessibility that I-675 provided the University, the campus must be about triple the size it was when I was a student, with about 16,000 students (the majority of whom, interestingly enough, are women).

It probably goes without saying that accessibility is a key factor in the ability to develop and grow. And of course, it's not just physical accessibility, but virtual accessibility; maybe even more so. Many years ago, my old comrade Doug Supp (who is still at Wright State, but is surely thinking about retirement by now) and I wrote a proposal to bring the internet to the University. It seems laughable now, but way back then, back in the 1980s, we actually had to sell it. Not everyone was convinced that this newfangled internet thing was worth the money, or that it would amount to anything. After all, this predated the World Wide Web, so we were really talking about technologies like electronic mail, telnet, USENET, and file transfer being the killer applications.

Doug and I used the growth we saw that was so evident outside of our office windows as a result of I-675 as a rationale for the project, which we called TURNPIKE, trying to draw the analogy between physical accessibility and virtual accessibility as important for the growth of the University. Fortunately for all involved, the University bought into it. It was by no means a sure thing. By the way, TURNPIKE stood for The University Resource Network for the Pursuit of Information, Knowledge and Education. Yeah, we had to rack our brains for that one.

(Wright State University is also known for having a handicap-accessible campus right from its very inception in the late 1960s. Another victory for accessibility.)

Tooling along US-36 cross country with the hum of the 1100cc opposed twin engine in my ears, I had a lot of time to think about how important accessibility is.

Thursday, September 25, 2008

If you've lasted this long reading my articles on Diminuto, you know it's my attempt to create an environment for teaching real-time software design, embedded development, and assembly language programming, using commercially available hardware and open source software. Diminuto uses the Atmel AT91RM9200-EK evaluation kit (EK), a single board computer (SBC) that has an ARM9 processor with memory management and a host of peripherals, including Ethernet, USB ports, and a Secure Digital (SD) card slot. You may also recall that Diminuto was built using uClibc, a reduced memory footprint C library, and has a root file system that is memory resident.

(Not that everyone doesn't know what these look like, but below is a photograph of an SD card, which is a little larger than the size of my thumbnail, and a USB storage drive, which is about the size of my thumb. Both of these particular examples hold two gigabytes, a fact that an old guy like me finds astounding.)

Now it's time for Diminuto Phase II, which I call Arroyo. Arroyo has a vastly larger root file system that runs "disk" resident on an SD card. I'm using an EXT3 journaled file system on a 2GB SD card; the root file system utilizes less than 10% of the card. Arroyo is still based on BusyBox (1.11.2), but it uses the full Standard C Library and includes a complete Bash (3.2) shell. Because Arroyo doesn't use a RAM-resident root file system, it has a memory footprint not much larger than Diminuto. Like Diminuto, Arroyo supports USB storage drives as well, making it easy for students to take their projects with them.

(Below is a photograph of the EK SBC with the SD card in its card slot in the left rear and two USB drives in the host USB ports in the center front. Both Diminuto and Arroyo can access these devices just like disk drives. Very slow disk drives. Arroyo uses the SD card for its root file system, although in practice, as with any other Linux system, just about everything commonly used ends up cached in memory.)

Arroyo was not built using Buildroot, although it does make use of the genext2fs host utility created by the same folks to build the EXT2 file system image. Arroyo was built using individual open source components including the Linux 2.6.26.3 kernel. The only patch required to any of the open source software was to the kernel to address the wrong board type code reported by U-Boot that I've discussed before.

Arroyo was created using a pre-built tool chain, Sourcery G++ Lite, provided by ARM, Ltd. and the folks at CodeSourcery. The tool chain can be downloaded for free, and provides full support for the later ARM Embedded Application Binary Interface (EABI).

I've ported the Desperado embedded C++ library to Arroyo, including John Sadler's Ficl embeddable Forth interpreter, and run most of the unit tests.

Here is an example of booting Arroyo on the EK SBC. Note that I overrode the U-Boot bootargs environment variable to point the kernel at the SD card for its root file system. Since the SD card device driver is hot plugged into the system when you insert the SD card, I told the kernel to delay for a couple of seconds to allow the device to come online. (You can also see this log file here.)

Diminuto and Arroyo are both excellent teaching examples of embedded Linux systems at opposite ends of the resource spectrum, and both run on the identical EK hardware and firmware. I'll be writing more about both Diminuto and Arroyo in articles to come.

Metadata

Chip Overclock® is the pen name of John Sloan, a freelance product developer based in Denver Colorado USA who specializes in very large and very small systems: distributed, real time, high performance, embedded, concurrent, parallel, close to bare metal.