Tuesday, August 31, 2010

I guess I'm still just a boy at heart, because I love to read stories like this one, about the Second Avenue subway project that's underway in New York City.

Here's a quote to entice you, although you'll probably want to go just for the pictures!

The drill is placed on a track where an array of hydraulic pistons presses it forward. As the drill is applied to the rock, the steel discs on its face shred the stone in its path. The resulting debris, called slag, is collected through a set of channels on the drill’s face and ferried on conveyor belts to the rear of the machine. As more of the rock is drilled, the machine creeps forward along its track. Think of it like a giant earthworm.

Monday, August 30, 2010

When one twists these two wires, the resulting double strand gets shorter and shorter (Fig. 1B). At a certain point, no more rotations can be added without deforming the material or introducing intrusions – this is where the curve changes from blue to red. The maximally rotated geometry is universal and therefore independent of the skills of the craftsman.

Thursday, August 26, 2010

I was telling my friend Cal about this amazing, hilarious, delightful soccer game that I was watching the other night.

It was between Real Salt Lake and Cruz Azul, and they were playing in the late afternoon at Estadio Azul, in Mexico City.

An important thing to know about this stadium, as Wikipedia notes, is that:

A peculiarity of this stadium is that it is built as a pit; the playing field is below street level.

Anyway, I tuned into the game yesterday afternoon, somewhat on a whim, looking for something fun to watch. It was the middle of the first half, and the rain had started to come down. Well, that doesn't describe it very well, so let's allow Andrea Canales of Goal.com to take over the description:

It was already raining, but around this point, the storm escalated, and pools of standing water began to form on the field, turning much of the action comical, as the ball would alternately stop or skip on the liquid, leaving the players befuddled.

Normal play did not apply in these situations, as Horacio Cervantez in the 44th minute found out to his dismay when he attempted a backpass to his goalkeeper and the ball held up in the water. Saborio reacted quickly, rounding the defender and eluding the goalkeeper to put RSL in the lead for the first time.

As the rain continued to pour down, Real Salt Lake held their advantage to the half.

The rain came down, and came down, and came down.

Pretty much every fan in residence at Estadio Azul hit the road, although a few hundred hardy souls simply shed all their clothing, pulled out their drums, trumpets, vuvuzelas, flags, and capes, and settled in for the long haul.

And a long haul it was! At halftime, they halted the game for an amazing 22-minute rain delay! I've watched a lot of soccer, but I've never seen a rain delay like this. The grounds crew came out and did their best to squeegee the standing water to the sidelines, and finally the referee decided that they might as well play a second half.

Although the rain seemed to taper off somewhat, the conditions were beyond belief. As Canales tells it,

At times more of a mud-wrestling match than soccer, the game looked likely to injure players. Beckerman, in order to get power behind his kicks, would often hit the ball half-falling into a puddle.

“Steadily the rain just came and it wouldn’t let up," team captain Kyle Beckerman said. "Both teams had to play in the stuff. I think we had some mental lapses and they made us pay for it. That’s all it came down to. We made them pay for some lapses as well, but they made us pay for one more than we did.”

Oh, wait, did I tell you the score? It was 3-1 after 70 minutes, and it looked like the rain and the conditions were going to keep the game pretty stable. But then, as the game drew to a close, both teams decided to score like mad: Cruz Azul scored in the 75th, 88th, 89th, and 97th minutes, with a Real Salt Lake goal sprinkled in for good measure. I could barely sit down between goals!

It ended up 5-4, a sluice-gate was opened at the end of the pit, all the players floated down into the locker rooms, and they called it quits for the night.

For the most part, I've been extremely happy with the Java platform: it's solid and reliable, the performance of the underlying compilers and virtual machines has steadily improved, and the class library is rich and well-documented.

However, it's been a persistent aggravation that Java performance tuning is much harder than it seems like it needs to be. I spent several fruitless years in a previous job struggling to improve the performance of some quite complex Java software. Time and time again I'd construct a benchmark, execute it under various profiling regimens, study the results, attempt to improve the performance, and be aggravated to discover that my changes had no visible effect.

Other times, guided either by instinct or by the knowledge and suspicions of fellow developers, I'd make performance-related changes that, according to the profilers, should have had no benefit at all, only to see substantial improvements in benchmark results.

So I was quite interested to happen upon this recent paper: Evaluating the Accuracy of Java Profilers. The paper does a very thorough and approachable job of explaining why current Java profilers produce misleading results, as well as suggesting ways that future profilers could be improved.

From the paper:

The contributions of this paper are as follows:

1. We show that commonly-used profilers often disagree with each other and thus are often incorrect.

2. We use causality analysis to determine whether or not a profiler is producing an actionable profile. This approach enables us to evaluate a profiler without knowing the “correct” profile; prior work on evaluating the accuracy of profilers assumes the existence of a “correct” profile (Section 8.1). For example, Buytaert et al. use a profile obtained using HPMs and frequent sampling to evaluate the accuracy of other profilers.

3. We show, using causality analysis, that commonly-used profilers often do not produce actionable profiles.

4. We show that the observer effect biases our profiles. In particular, dynamic optimizations interact with a profiler’s sampling mechanism to produce profiler disagreement.

5. We introduce a proof-of-concept profiler that addresses the above-mentioned source of profiler bias. As a consequence our profiler produces actionable profiles.

I thought that the most interesting part of the paper was section 6: "Understanding the cause of profiler disagreement". The key observation that is made in this section has to do with the interaction between the profiler's sampling technique and the underlying JVM's implementation of "yield points":

To understand why our profilers were not randomly picking samples from the program run, we took a closer look at their implementation. We determined that all four profilers take samples only at yield points. More specifically, when a profiler wishes to take a sample, it waits for the program’s execution to reach a yield point.

Yield points are a mechanism for supporting quasi-preemptive thread scheduling; they are program locations where it is “safe” to run a garbage collector (e.g., all the GC tables are in a consistent state). Because yield points are not free, compilers often optimize their placement. For example, as long as application code does not allocate memory and does not run for an unbounded amount of time, the JVM can delay garbage collection until after the application code finishes; thus, a compiler may omit yield points from a loop if it can establish that the loop will not do any allocation and will not run indefinitely. This clearly conflicts with the goal of profilers; in the worst case, the profiler may wish to take a sample in a hot loop, but because that loop does not have a yield point, the profiler actually takes a sample sometime after the execution of the loop. Thus, the sample may be incorrectly attributed to some method other than the one containing the loop.

The authors prototype an alternative sampling-based profiler, which they call tprof, test it on several common benchmarks, and demonstrate dramatically improved profiling results. Unfortunately, they also run up against a number of limitations in their profiling technique, due to the behaviors of current JVMs and JITs, but it seems like these limitations should be surmountable, so I really hope that this paper gets some attention in the Java profiling community and leads to improved profiler behaviors.

Tuesday, August 24, 2010

So I had a certain 'deja vu' sensation when I read the recent reports about the massive traffic jams in China on National Highway 110, which leads west from Beijing. Isn't that the road that Hessler drove, I thought? Sure enough, it is.

Hessler's book is fascinating, and very well written. Give it a try, if you're looking for that next book to read.

But, really, the answer here is simple, and, as usual, it's Jason Kottke who lets us know about Tim Carmody's insight that, while the Web might be ailing, it's definitely not dead, because the Web is in deep symbiosis with Brett Favre.

Saturday, August 14, 2010

I'm confused about the ideas behind the Oracle-Google lawsuit announced last week. The facts are clear enough: Oracle sued Google in the US District Court for the Northern District of California, for infringing on copyrights and patents related to Java, saying that:

Android (including without limitation the Dalvik VM and the Android software development kit) and devices that operate Android infringe one or more claims of each of United States Patents Nos. 6,125,447; 6,192,476; 5,966,702; 7,426,720; RE38,104; 6,910,205; and 6,061,520.

What's considerably less clear is what Oracle is really mad about, and what they want Google to do. In the software industry, patent law is a clumsy tool, and when it is wielded in this fashion, as this Forbes blog suggests, it usually indicates that something else is going on behind the scenes.

As Bruce Perens points out, the Java Language Specification includes a broad patent grant. Is Oracle claiming that Google violated that grant?

Or, as suggested by this ComputerWorld analysis, is Oracle claiming that Google, who have hired many prominent engineers from the original Sun Java team, could not possibly have followed true "clean-room" techniques when building their Dalvik/Android system?

Or, as Bruce Perens notes in a follow-up article, is the debate about whether Google's version of Java is really compliant to the Java standards and specifications, and, if not, whether that means they cannot call their implementation "Java", similar to the objections that Sun had over Microsoft's not-quite-the-same version of Java years ago?

Or, as suggested by this article, is Oracle trying to save whatever is left of Java Micro Edition, which had achieved a moderate amount of success in the cell phone market, but which is now being crushed by Android's runaway adoption? And, as this article speculates, will this just play into the hands of Microsoft and their lagging, but technically very advanced, .NET mobile platform?

It's all very confusing.

When Android first came out, I was excited by the possibility of an open Java-based mobile computing platform, and I downloaded the Android development tools and tried to write some applications. But I was quite shocked to see that Derby, which is one of the most portable, highest-quality Java applications in existence anywhere, would not run on the Android platform:

Derby's use of common Java APIs, such as javax.naming, ran afoul of Android's selective inclusion of the JDK API set

After those discoveries, I gave up in disgust, so there may be more such obstacles lurking. But those incompatibilities were large and severe, and I can see that Sun/Oracle would be upset by them. It brings back memories of the Sun-Microsoft fights over Java last century, when Microsoft similarly produced their own not-quite-the-same version of Java, and Sun fought bitterly to force Microsoft to abandon that technology.

What will happen here? Does it make sense to you? If you have any ideas to share, or pointers to clear and well-written analysis, please let me know!

We drove up from the Bay Area on Saturday afternoon, and spent the night in Weaverville at the Red Hill Cabins. Sunday morning we arose early, picked up sandwiches and coffee at the local Tops market, and headed up California 3 toward the Trinity Alps. At Trinity Center, where Swift Creek enters Trinity Lake (formerly Clair Engle Lake, which is a story all to itself), we left the paved road and drove 8 miles on Forest Service dirt roads up to the Poison Canyon trailhead. Trinity Center is at about 2,500 feet elevation, while the trailhead is at 4,100 feet elevation; the Forest Service road took almost 45 minutes to drive.

When we got to the trailhead, we were a trifle concerned when we noticed that the brand new Toyota 4Runner parked there had been apparently attacked by a bear: there were paw marks on the windows; two fiberglass side panels were torn off the car and lying on the road; and when we looked into the car window, a cooler was sitting open on the back seat. We had no food to leave in the truck, so we carefully moved our overnight bags of spare clothes into the center-front of the truck bed, covered them with a grey blanket, closed the truck up tight, and hoped for the best.

The Poison Canyon trail leads upward along the North Fork of Swift Creek, staying close enough to hear the water, but not close enough to see the creek. The trail is steep: it climbs 1,800 feet in just 2 miles, which is about a 17% average grade. Happily, the trail leads mostly through shaded woods and pleasant meadows, so even though it is steep and unrelenting, we took our time and enjoyed the walk. Poison Canyon is apparently named for a type of grass that grows there, which is indigestible to livestock. Of course, we backpackers refrained from chewing the plants...

As the trail rose through the canyon, we started to see views of towering Ycatapom Peak, the 7,600-foot primary landmark of the area.

After a long break for lunch, we arrived at Lilypad Lake in mid-afternoon, and spent a few hours selecting the best campsite and setting up camp. As the guide books had predicted, we had the canyon to ourselves, and so we picked a great campsite high on the boulders above the lake, near the feeder stream with its pleasant running water. Firewood is abundant in this area of the Alps, and we had an enjoyable campfire before an early night, as we were all quite tired.

Monday morning we headed up the Thumb Rock trail above Lilypad Lake. Again, this trail is very steep, but since we were only carrying light day packs it was much more tolerable, and we made good progress. The trail takes you quickly up 600 feet above the lake, where the views of Mount Shasta are tremendous! We climbed a bit further, past the junction with the Deer Flat trail, to a knoll above Poison Canyon with a spectacular view, at about 7,000 feet of altitude. After lunch, we hiked a bit further, to the crest of the Parker Divide, where we got a 360-degree view of the eastern Trinity Alps, including the Parker Creek canyon, Cub Wallow, and Deer Flat. We split up, with Roger and Rich heading off to climb Thumb Rock's neighbor peak (elev. 7735), while Mike, Deb, and I returned back down the switchbacks to Lilypad Lake for some rest, book reading, and wildlife study.

The shooting stars were great on Monday night! I think we were 3 days early for the Perseids, but we still saw lots of meteors.

Tuesday morning dawned cold and grey, with heavy clouds overhead. Tired and a bit discouraged, we cancelled our plans to take the 8 mile (round trip) hike over to Boulder Lake, and instead decided to play campground games: horseshoe golf, poker, and rock skipping were all favorites. Unfortunately Rich didn't get to use his fishing pole; maybe next year!

Wednesday we broke camp and headed back down to the trailhead. Happily, Mr. Bear had decided that there was nothing worth fighting for in the truck, and we sailed back down the dirt road, through Weaverville, back to I-5, and (eventually) made it home to the Bay Area for the conclusion of another great trip.

Friday, August 6, 2010

People often ask me why I get so passionate about fixing bugs, and why, in particular, I routinely take the time to fix "minor", "unimportant", or "harmless" bugs. Sometimes the discussions get somewhat heated, since many programmers, quite justifiably, are very concerned about the risk of bug fixing, and about how to manage that risk effectively. Each bug fix introduces volatility into the code base, and there is no denying that what appear to be simple bug fixes can often be more complicated than they appear, and may break other behaviors unexpectedly, or have unintended side effects. And the volatility itself can be frustrating, because when you are working in a team environment volatility of the code has an expense in and of itself.

But I'm a very strong believer in fixing all bugs, even minor bugs, if at all possible, and, for me at least, this belief is deeply rooted in a theory of social behavior known as the Broken Windows Theory. This theory dates from a famous article published in the Atlantic magazine in 1982, by George Kelling and James Wilson, in which they report on several long and detailed studies of neighborhood policing. In the article, Kelling and Wilson observe:

Social psychologists and police officers tend to agree that if a window in a building is broken and is left unrepaired, all the rest of the windows will soon be broken. This is as true in nice neighborhoods as in rundown ones. Window-breaking does not necessarily occur on a large scale because some areas are inhabited by determined window-breakers whereas others are populated by window-lovers; rather, one unrepaired broken window is a signal that no one cares, and so breaking more windows costs nothing.

This is an incredibly important observation, and I believe that it's deeply and fundamentally true.

Moreover, I believe that it's deeply and fundamentally true about software, just as it is about neighborhoods:

one unrepaired broken window is a signal that no one cares

Replace "broken window" with "software defect", in that conclusion, and you'll see what I mean.

So, that's why I get passionate about fixing bugs. But, how do we manage the risk?

It's very hard to follow this discipline, but it gets easier the more you do it. And, in the end, you'll find that it makes your programming more fun, as well as more successful and productive, if you just follow the basic Martin Fowler-based rules:

Don't be afraid to change the software

Integrate early, integrate often

Always write a test

Always run your tests

Always pay attention when your tests fail

Repair your broken windows

Climbing down off my soapbox now, and heading for the mountains to think about things other than software for a week...

Today is the three-month anniversary of the "Flash Crash", the peculiar trading behavior that occurred during the afternoon of May 6th, wildly disrupting the markets and resulting in thousands of cancelled trades.

Today, the Wall Street Journal has a fascinating, long, and detailed article about what is known, and what still isn't known, about the Flash Crash. If you aren't a subscriber, you may be able to get to the article by searching for it via Google, since something about the redirect-via-Google-search triggers the WSJ to allow you to read the full article rather than stopping at the paywall.

Some of the details in the article were quite new to me, for example:

At one point, Apple traded for nearly $100,000 a share on Arca, according to NYSE officials, after a buy order for 5,000 shares entered the market and only 4,105 shares were available. When Arca's computers saw that no more shares were available to sell, the system automatically assigned a default price of $99,999 to the remaining 895 shares. Those trades later were cancelled.

A default price of one hundred thousand dollars per share? That's ridiculous, and is just as ridiculous as the one penny per share "stub quotes" that also occurred during the crash.

According to the WSJ, the investigation continues, and "the SEC and Commodity Futures Trading Commission expect to issue a final report on the flash crash within a few months." So hopefully we will soon know more about the details and can study them and discuss what they mean.

In the meantime, the WSJ article has some fascinating tidbits about the interactions between humans and their computers during the crash:

Rumors swirled about of an erroneous "fat-finger" order by a trader at Citigroup Inc.—that the trader mistakenly entered extra zeros, turning millions into billions. Citigroup and regulators later said such an errant trade did not appear to have taken place.

But the rumor helped stabilize the market. If the massive decline was the result of a mistake and not some terrible news, that meant there were bargains to be had.

I'm intrigued by the notion of the rumor stabilizing the market. It is very interesting to reflect on what that tells us about how we reason about our computers, and our algorithms, and our observations of systematic behavior.

The WSJ also confirms what others said about the disturbing behavior of so-called "stop-loss" orders during the Flash Crash:

Some of the biggest ETF traders are firms that try to profit from discrepancies between prices of ETFs and the stocks that they track. But as questions mounted about pricing of individual stocks, these firms pulled back from trading. This hurt small investors who had placed "stop-loss orders," aimed at protecting against big losses by automatically selling once prices fell below a certain level. Those orders hit the market when there were virtually no buyers to be found.

Unfortunately, as the article notes,

Exchanges are unlikely to be able to prevent high-frequency trading firms or statistical-arbitrage firms from bailing out of the market en masse.

So it's still quite unclear exactly what went wrong, and what should be done to improve market behavior, other than continuing to educate the public about how modern markets operate, and requiring consumer-facing financial firms to provide appropriate tools, and appropriate training, for the individual consumers who are trying to participate in the market themselves (e.g., me, with my Rollover IRA).

Also, if you're still listening, the gang over at Nanex are continuing to explore visualizations of some of the algorithmic trading behaviors that they observed during the Flash Crash. I'm not quite sure what to make of this, but the displays are quite intriguing to scroll through. They also maintain a short links page with some further reading.

Tuesday, August 3, 2010

If you're even slightly interested in Java performance tuning, don't miss Jeremy Manson's great post on why it is incredibly hard to make a Java performance profiling tool which gives the right answers.

For years I was frustrated by the behavior that Manson describes:

there are many profilers that report a great deal of your time spent in blocking mechanisms, simply because threads frequently become scheduled as they exit blocking functions. However, those sorts of threads are all "ready to be scheduled" rather than "actually running". As a result, CPU time is charged to those threads inaccurately.

Manson describes the fundamental mechanism that his team at Google are investigating to build a better Java profiler, and also points to a new paper from a team at CU-Boulder describing their studies of a collection of current state-of-the-art Java profilers:

If two profilers disagree, at least one must be incorrect.

Definitely words to live by.

Hopefully this will lead to some improvement in the Java profiling tools world. Building a better Java profiler is a very leveraged activity, given the amount of Java code in the world and the number of people who are sweating away trying to figure out what their code is doing and how to improve it.

Monday, August 2, 2010

Does your software trust the other software it uses? Does your software trust the hardware it runs on? Does your software trust the users who use it?

At my day job, we've been engaged in a fascinating discussion recently regarding the limits of software trust. Our product is used to store objects of incredible value, and our users depend completely on us to preserve and retain that value. And when you try to build software that approaches the level of reliability required to preserve tens of millions of files holding dozens of petabytes of content, you start encountering situations where simple measures of trust break down.

If your program writes a block of data to the disk, how confident are you that the block was actually written, with exactly the contents you provided, to exactly the location you specified?

More intriguingly, if your program reads a block of data from the disk, how confident are you that it is the same block of data that your program wrote there, some time ago?

Unfortunately, hardware fails, software fails, and users make mistakes. Bits rot, gamma rays and voltage spikes occur, cables fray and wiggle, subtle bugs afflict drivers, and well-meaning administrators mis-type commands while signed on with high privileges. So once you have fixed all the easy bugs, and closed all the simple design holes, and built all the obvious tests, and created all the routine infrastructure, a program which really, really cares about these issues finds itself facing these core issues of trust.

There's only one technique for addressing these problems, and it's two-part:

Provide recoverability of your data via backups and/or replication,

and have a way to detect damaged data.

The first part ensures that you are able to repair damaged data; the second part ensures that you know which data need to be repaired.

Most people know a lot about backups and replication, but by themselves that's not enough. If you can't detect damaged data, then your backups have no value.

Most organizations that I've dealt with depend, without really realizing it, on detecting damaged data at the time that it's next used. Unfortunately, that is far, far too late:

By then, the damage is impacting your business; you've got downtime due to the damage.

Worse, by then, the damaged data may have already made it into your backups; your backups themselves may be damaged. And then you're done for.

The sooner you detect damaged data, and the closer to the source of the damage that you detect it, the faster you can repair it, and the more likely you are to be able to track down the source of the damage (buggy software, user error, intermittent power fluctuation, overheating DIMM board, etc.).

Computer science has one basic technique for detecting damaged data, though it goes by several names: checksums, CRCs, hashes, digests, fingerprints, and signatures among others. Here's a nice summary from Wikipedia:

A cryptographic hash function is a deterministic procedure that takes an arbitrary block of data and returns a fixed-size bit string, the (cryptographic) hash value, such that an accidental or intentional change to the data will change the hash value. The data to be encoded is often called the "message", and the hash value is sometimes called the message digest or simply digest.

The basic idea is that:

At some point, when you are confident that your data is valid, you compute the checksum and save it away.

Later, when you want to know if your data has been damaged, you re-compute the checksum and compare it against the checksum that you saved before. If they don't match, your data has been damaged (or, perhaps, your checksum has been damaged).

Note that there is a natural symmetry to the use of checksums:

Checksum the data when you write it out, then verify the checksum when you read it back in.

Checksum the data when you send it over the net, then verify the checksum when you receive the data over the net.

Given that there are reliable, free, widely-available libraries of checksum code out there, this should be the end of the discussion. But of course life is never so simple. Programs fail to checksum data at all, or they checksum it but don't later verify the checksum, or they checksum the data after the damage has already happened. Or they use a weak checksum, which doesn't offer very extensive protection. Or they do all the checksumming properly, but don't tie the error checking into the backup-recovery system, so they have no way to prevent damaged data from "leaking" into the backups, leading to damaged and useless backups and permanent data loss. There is a world of details, and it takes years to work through them and get them all right.
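To make the write/verify symmetry concrete, here's a minimal Python sketch using the standard hashlib library. The helper names and file layout are my own invention, not any particular product's scheme; a real system would store the digest in separate metadata, not a local variable.

```python
import hashlib
import os
import tempfile

def write_with_checksum(path: str, data: bytes) -> str:
    """Write data to path and return its SHA-256 digest, to be stored separately."""
    digest = hashlib.sha256(data).hexdigest()
    with open(path, "wb") as f:
        f.write(data)
    return digest

def read_and_verify(path: str, expected_digest: str) -> bytes:
    """Read the data back; fail loudly if it no longer matches its checksum."""
    with open(path, "rb") as f:
        data = f.read()
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_digest:
        raise IOError(f"checksum mismatch for {path}: "
                      f"expected {expected_digest}, got {actual}")
    return data

# Checksum on write, verify on read -- the symmetry described above.
path = os.path.join(tempfile.mkdtemp(), "block.dat")
saved_digest = write_with_checksum(path, b"objects of incredible value")
assert read_and_verify(path, saved_digest) == b"objects of incredible value"
```

The crucial design point is that the digest must be stored apart from the data it protects, so that a single failure (a lost disk block, a mistyped command) can't silently corrupt both the data and the evidence of the corruption.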

If you've got this far, and you're still interested in learning more about this subject, here are two resources to explore next:

In the Internet, that means that we are sending large volumes of incorrect data without anyone noticing. ... The data suggests that, on average, between one packet in 10 billion and one packet in a few millions will have an error that goes undetected. ... Our conclusion is that vital applications should strongly consider augmenting the TCP checksum with an application sum.

Valerie Aurora's position paper, An Analysis of Compare-by-Hash, on the use of cryptographic hashes as surrogates for data, particularly in places such as disk de-dup, file replication, and backup. Here's an excerpt from her paper:

On a more philosophical note, should software improve on hardware reliability or should programmers accept hardware reliability as an upper bound on total system reliability? What would we think of a file system that had a race condition that was triggered less often than disk I/O errors? What if it lost files only slightly less often than users accidentally deleted them? Once we start playing the game of error relativity, where do we stop? Current software practices suggest that most programmers believe software should improve reliability -- hence we have TCP checksums, asserts for impossible hardware conditions, and handling of I/O errors. For example, the empirically observed rate of undetected errors in TCP packets is about 0.0000005%. We could dramatically improve that rate by sending both the block and its SHA-1 hash, or we could slightly worsen that rate by sending only the hash.
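As a rough back-of-envelope check of that last claim (my own arithmetic, not from Aurora's paper), if we assume corruption scrambles a block into essentially random bits, then an n-bit hash lets an error slip through with probability about 2^-n, which is astronomically smaller than the observed TCP checksum escape rate:

```python
# Observed undetected-error rate for TCP checksums, per the paper:
# 0.0000005%, expressed here as a fraction.
tcp_undetected = 0.0000005 / 100

# Chance that a random corruption happens to match a 160-bit SHA-1 digest,
# under the simplifying assumption that corrupted data hashes uniformly.
sha1_escape = 2 ** -160

print(f"TCP checksum undetected-error rate: {tcp_undetected:.1e}")
print(f"SHA-1 accidental-match probability: {sha1_escape:.1e}")
print(f"improvement factor: {tcp_undetected / sha1_escape:.1e}")
```

So "dramatically improve" is, if anything, an understatement: the gap is dozens of orders of magnitude, which is why augmenting the TCP checksum with an application-level hash is such an attractive trade.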

Sunday, August 1, 2010

You know it's almost time for Tackle Football here in America when you start reading stories like this one:

Tackles Mario Henderson and Langston Walker: These are your starters on Day One in Napa for an offensive line that Yahoo! Sports recently called the worst in the league.

...

Walker is back for another stint, and the Cal product has dropped some weight and is down to 360 pounds. He looked fresh at minicamp and quickly moved into the lineup, dispatching Khalif Barnes to get some work at guard. Walker's emergence could also signal a lot more power blocking under new offensive coordinator Hue Jackson, in addition to the zone blocking that Cable favored.