September 4

(0.0; 1185.4 total, 988.6 to go; -15.0 from pace, -134.6 overall)

I intend to get up and hike today, but laziness and a ready Internet connection distract me. I spend most of the day reading email and feeds as well as working on a web tech blog post about a feature I implemented shortly before starting this hike: the DOM Text.wholeText and Text.replaceWholeText APIs. It was an interesting little bit of hacking I did in an attempt to pick up as many easy Acid3 points as possible for Firefox 3 with as little effort as possible. I have more to say on this topic, but at the request of a few people I have split it into an extended, separate post so that my thru-hike ramblings don’t distract from it. Beyond writing the web-tech post and catching up on things, one other minor anecdote sticks out from today: at one point Lydia, the daughter, has a screaming fit. Red Wing tells me how he responded: he told her that she should be quiet because her stuffed rabbit was trying to sleep, and it worked. Heh. 🙂

The Honeymooners walk in later in the day to again catch up to me (not unexpectedly, as I knew the Four State Challenge would not be an efficient way to make good miles in the long run), and I round out the day again by taking advantage of the same $25 hostel deal available last night. Also, since I ended up staying this extra day at the hostel, I’m now slightly rushed to meet up with family at the south end of Shenandoah National Park, at around 1325 miles down the trail. I’m currently at 1185, with the plan being to meet them at the end on September 10, so I’ve eaten up my margin for error today: The Hike Must Go On again in earnest tomorrow.

September 5

(18.3; 1203.7 total, 970.3 to go; +3.3 from pace, -131.3 overall)

Yesterday was a recovery day, so today it’s back to business, as I work to remain mostly even with the Honeymooners and to catch back up to Smoothie. Of course, that still doesn’t stop me from dallying in the morning, and after I post the web-tech article alluded to yesterday, I finally roll out of the hostel at around noon, well after the Honeymooners leave.

Not much sticks out in today’s hiking. I don’t see the Honeymooners again, which is a little odd since I’d assumed they would plan to hike further than I intended to hike, having left several hours before I did. The first bit of the day is just getting out of the Roller Coaster (a 13.5 mile stretch of trail with ten viewless ascents and descents necessitated by a narrow trail corridor; see also my previous entry), and there’s not a lot to see. By the time I get through to Rod Hollow Shelter at its end it’s just about 17:00. I consider stopping for the day, but I haven’t even gone ten miles at this point, and I still have daylight and energy left in me; onward another 8.4 miles I go to the next shelter.

I have to keep up the pace to get there before it gets too ridiculously dark, but it’s a nice bit of hiking. Later on I pass through Sky Meadows State Park as dusk hits; I feel a sprinkle every so often, providing further incentive to keep moving to avoid real rain if it happens. I get to the shelter as darkness hits, and it’s an unusual one — probably the most unusual since Hexacuba Shelter in Vermont. Dick’s Dome Shelter is on private land, was constructed by a PATC member out of (as best as I recall) fairly artificial materials, and — strangest of all — is shaped like a d20 (icosahedron, for the culturally challenged) with three adjacent faces omitted to serve as an opening. It’s small (claimed to sleep four), and luckily I’m the only person in it for the night. I fill up on water from a small stream passed en route to the shelter, and I hang my food bag from a bear cable placed between trees a little distance from the shelter — it’s great not to have to search around for a plausible tree branch in the darkness. Rain falls at a moderate rate — no longer sprinkles, but not in particular earnest — as I head to sleep. 18 miles for the day is a reasonable distance given my late start, but I have 122.2 miles to hike in the next five days to meet family at the south end of Shenandoah National Park, and an 18-mile day just doesn’t cut it if I want to hike those days without feeling rushed.

September 6

(18.0; 1221.7 total, 952.3 to go; +3.0 from pace, -128.3 overall)

I wake up in the morning to the same rain from last night, and it shows no signs of stopping. (I eventually learn that this rain is the continuation of Hurricane Hanna, explaining the rain’s persistence over the next several days.) At least it’s not turning into a downpour, but this won’t be much fun to hike through. Off I head into the rain; it’s not stopping, and I can’t stop either.

Rain continues up to the first shelter stop of the day at Manassas Gap Shelter, where I take an opportunity to duck inside and out of the rain for a bit. The shelter has a note prominently posted in it talking about a semi-residential rattlesnake, noting that anyone who sees it (I do not) should mention it in the register. I fill up my water bottle with rainwater pouring off a corner of the shelter roof (I still purify, of course) before returning to the rain. I continue hiking through the rain to the Jim and Molly Denton Shelter, where I again stop out of the rain for a bit. I save its register from being soaked beyond its current state; someone’s left it on one of the porch benches, fully exposed to rain. Past that the trail passes by a fenced-in National Zoological Park Research Center, which the Companion says occasionally provides views of exotic animals; I see none in this weather.

A few more miles of walking and drizzle take me to Tom Floyd Wayside, the first shelter in the Shenandoah National Park section of the trail as delimited by the Companion. (Technically, I have a little more hiking before I’m inside the park proper.) The site has a nice cable for hanging food, and as usual these days I have the shelter to myself. It’s somewhat odd, this being a Saturday night when usually others are out camping, but I suppose the near proximity of Shenandoah makes the difference: if you’re going out for a weekend trip, you’re probably not going to go just next to a national park but rather into it. Among the shelter’s decorations: a charming poster warning of the possible dangers of accidentally inhaling fecal dust from mice or rats infected with disease.

Today’s hiking would have been shorter and more pleasant if I didn’t have the looming deadline to get out of Shenandoah to meet family. Given the remaining distance, however, I couldn’t make any further curtailments; 104.2 miles in the remaining four days is already pushing pretty hard, and the rainy weather makes that even worse. Why didn’t I learn, the last time I had a hard deadline to make, that deadlines are bad?

September 7

(23.6; 1245.3 total, 928.7 to go; +8.6 from pace, -119.7 overall)

Today’s hiking is much more pleasant than yesterday’s soggy mess as I enter Shenandoah National Park.

Mid-day view of Skyline Drive in Shenandoah NP

Shenandoah National Park consists, roughly speaking, of a hundred-mile road called Skyline Drive that follows the ridges of the Appalachians in the area, surrounded by a fair amount of forest, cut through with hiking trails, horse riding paths, and other nature-y things of that nature. Sounds vaguely nice, right? Well, yes and no. First, it’s a national park, which means, relative to its attractions, it receives an outsize number of visitors. I don’t mind people when I’m hiking, but I’d prefer there not be too many people, and national parks can push it, even on hiking trails. Second, it’s a national park whose chief attraction is a road. Many, possibly most, visitors drive down Skyline Drive looking at scenery, maybe stopping at overlooks, and consider that their park experience. Fine, you go do that if you want and miss out on all the interesting bits, but leave me out of it. Unfortunately, for this park, hiking on the A.T., there’s no choice: the trail roughly parallels Skyline Drive for roughly 100 miles of trail, crossing back and forth over it 28 times according to the Companion. Thus, you’re never very far from something approximating civilization. Great national park experience, eh?

The only good thing about being near a road so much is that you’re also near Shenandoah’s “waysides”, convenience stores along the road at which it’s possible to resupply. (The stores also sell bottles of wine, which seems like an incredibly stupid idea given clueless tourists’ penchant for careless littering.) This is almost convenient, except that the waysides are all run by a single entity, Aramark, so you get markedly higher pricing than you’d get at any other resupply point in the area. Some of the difference is due to the waysides’ remoteness, to be sure, but some is certainly the result of Aramark’s government-licensed monopoly on services in the park.

More mid-day views from Shenandoah

Shenandoah’s a bit different from most of the Trail, for backpackers, in that you have to get a permit to backpack through it. The permit’s basically a formalism: if you enter Shenandoah via the A.T. you pass by a small sign-in station. There you pick up a carbon-copy form, fill it out with your rough itinerary of camping locations and dates, deposit one copy at the station, and visibly fasten the remainder on the outside of your backpack with a small wire. There’s no fee for doing this (people who drive into the park have to pay for that privilege), which is pleasantly surprising. I pick mine up at the station, which is just a mile into today’s hiking.

Much of today’s hiking consists of me marveling at so many people as I cross and recross Skyline Drive. Parking lots filled with cars and the occasional people disembarking from them present a marked contrast to anything I’ve seen this close to the trail since probably Bear Mountain in New York. The highlight of the day happens around noon when I get a definite sighting of a black bear. I’d seen what might have been one in New Jersey, but I get a good, long view of this one. He’s eastward of the trail maybe fifteen or twenty yards over, and ten or fifteen yards up. See a bear like this, and you’ll realize why the prospect of climbing a tree to get away from a bear is such an utterly ridiculous idea. The bear’s up there, just nibbling away on leaves or acorns or whatever it is they like to eat, certainly aware of the people around him but not sufficiently rushed to stop eating immediately. After a minute or so some people I’d just passed catch up, and they stop to watch the bear as well. Shortly after, the bear leisurely and gracefully climbs down the tree. He turns, looks at us briefly, then ambles off into the trees and brush in the opposite direction. Good stuff; now if only I’d remembered to pull out the camera before he was walking away…. Shenandoah is the big spot on the Trail for seeing bears: if you see one, you’re probably going to see it here. I see a few more bears through my hiking here, and I hear what are probably about an equal number crashing through the bushes and trees running away from me. This one was really one of the more fearless bears I saw both in Shenandoah and on the entire hike.

My goal for the first bit of the day is to get to Elkwallow Wayside, 16 miles south, so I can resupply. The Companion notes this is the last opportunity for northbounders to get blackberry shakes, so I’m planning on getting one to see what it’s like. Once there I resupply and discover the aforementioned prices. Where usually Knorr noodles (the mainstay of my dinners) go for $1.25-1.50 or so, here, as best as I recall, they’re around $2.30. Other items are similarly marked up; had I known beforehand I would have made an effort to avoid resupplying here, since resupply here was more a matter of convenience than anything else. The attached burger/milkshake fast-food counter is similarly overpriced, with the milkshake going for around $4.00 as I recall (and it’s unremarkable to boot), and the entire meal coming to around $11 including tax. Oh well, at least it’s only the once.

After lazing around for a bit eating and relaxing, watching people pass by, it’s time to start hiking again, newly burdened with food to last through the rest of SNP. It’s only 7.4 miles to go, and I hit a good pace and the miles go by effortlessly. I walk up to Pass Mountain Hut (as shelters are named in SNP) with some day to spare.

I share the shelter with three other people. One is a middle-aged man who says he’s doing a long-distance northbound hike, starting from some location I can’t quite remember, maybe the south end of Virginia. The other two men are in their thirties or forties and, as best as I can recall, are hiking together. One of them, upon hearing at some point that I’m going into the software industry after I finish my thru-hike, vigorously attempts to dissuade me from such, based upon his experiences at IBM (in sales or something like that, I hasten to note), wherein he discovered just how far he was willing to betray his principles. (The furthest such case involved eating food from the foot of a quadriplegic client, or something approximating that scenario. I am not making this up!) I’m not sure if I ever stated that I was going to work for Mozilla. I’m not sure it would have made a difference even if I had, given the extent of his bitter-and-jadedness.

The shelter register here provides good entertainment in the form of stories from other insane thru-hikers. Several northbounders, when they passed through, arbitrarily decided to complete a Twenty-Four Hour Challenge. Their full day of hiking ended at this shelter, after a sixty-plus mile day including a few-hour detour into nearby Luray for food and entertainment resulting in an entertaining picture of, of all things, clogging, which they left in the register. Another potential challenge, maybe? This one doesn’t require any particular time or location to attempt, so it’s easier to work into the hike any time it’s convenient. We’ll see…

The other fun thing about Shenandoah is that they have somewhat unusual food-storage requirements at campsites: large, dozen-foot poles stuck in the ground with hooks at the top, from which you’re to hang bags of smellables using an attached pole. (If you’re stealth-camping and not staying at an established campsite you hang your smellables in the usual way, with rope over tree branches.) It takes some dexterity to get my pack up, mostly because it’s still mostly filled with food. It’s not necessary to send the full pack up, but — particularly in a national park — the shelter mice population will be considerable. Even emptying a pack out isn’t proof against mice chewing their way in after smells of foods since removed. Opening up all zippers and pockets as far as possible helps, but it’s no guarantee like removing it from their reach completely.

September 8

(26.8; 1272.1 total, 901.9 to go; +11.8 from pace, -107.9 overall)

Today it’s up and out pretty early since I have so much distance to cover: nearly 27 miles to where I plan to stay for the night, Bearfence Mountain Hut. As a result I scare a couple bears down from trees as I walk back along the trail to the shelter to get back to the A.T. itself. These bears are both a good ten or fifteen yards up, and they’re down on the ground in two or three seconds bolting the other way, yet again demonstrating that climbing trees to escape bears is pointless and that the bears are more afraid of you than you are of them.

Continuing on, I meet my first truly unwelcome wildlife in Shenandoah: a deer on the trail who won’t move. (The trail’s too narrow to safely walk around it given its hind legs.) As I approach the deer is clearly aware of my presence, but it makes no effort to move. I keep walking until I’m perhaps a dozen feet from it, and it still hasn’t moved! This is ridiculous. I yell at it a little, and it remains unfazed (and unmoved). I take a step or two toward it while attempting to appear as aggressive as I can; it backs up (WIN), but then it takes a step or two toward me (FAIL), and I just as quickly take a couple steps back. This deer is clearly used to Not Taking Nothing From Nobody Nohow. Eventually it takes a step or two halfway into the bushes next to the trail, and I decide that if I move as far to the left on the trail as possible I’ll be comfortably far from the deer’s hind legs, so I pass the “wild” deer on the momentarily-congested trail.

Mm, doesn't that venison look tasty?

This isn’t the only deer I see in SNP that’s overcomfortable with people. The situation eventually gets so bad that I start counting the number of deer I see each day I’m hiking through, and I don’t think I had a day where the number was in single digits. Sometimes the deer run; mostly they stand and watch lazily. The deer know I can’t do anything to them, and indeed they hope for the opposite: that I’ll give them food. There are signs everywhere telling visitors not to feed the deer (I took a moment to admire one while at Elkwallow Wayside yesterday), but effectively prohibiting deer-feeding is about as likely as being able to effectively administer Prohibition. Through this food, the deer have lost much of their natural respect (with some amount of fear) for humans. Now to be sure, education is a worthy goal, and the fewer people who actually do feed the deer, the better — but it’s not realistic to ever think education alone will cure the deer of their fearlessness.

What SNP really needs, and what would address this problem, is the introduction of limited hunting. Shenandoah is a national park, so hunting in it is prohibited. This doesn’t have to be the case! A very little bit of hunting, carefully overseen by the park, would very quickly make the deer realize that humans are not risk-free potential sources of food, that humans can mean danger, and that it’s best to maintain a healthy distance from them. I don’t suggest open season — but if portions of the park were periodically opened to hunting for very short periods of time (so as to minimize disruption to the activities of other visitors), the deer would learn quickly enough how to behave in a way that minimizes dangerously close interactions between deer and humans. Limited hunting for such purposes is not an unusual concept; when I lived in Michigan we had a similar problem at some state parks (except even worse, because those parks had far more trail and road coverage than SNP does), and the problem got so bad that the state actually hired sharpshooters to come in to thin the herds and reduce excessive interaction. (Why they hired sharpshooters rather than opening it up for local hunters, who would have done the same work for free, is beyond me.) Unfortunately, getting traction on the problem will be a horrendous matter of politics, so I don’t expect such sensible measures to be taken any time soon.

Morning hiking proceeds slowly as usual. At one point the trail passes by the Pinnacles Picnic Ground, with picnic tables, water, restrooms, and so on, and I stop for a bite to eat. While there I meet a man who, upon hearing what I’m doing, says he thru-hiked the Appalachian Trail back in the 1970s when the A.T. was thru-hiked much less often; his year saw maybe a dozen or so thru-hikes (or perhaps completions, memory hazy), total. (For comparison, there were 419 reported thru-hikes going north to south or south to north in 2008 when I completed my hike, a further 39 hikes which took some other route covering the entire trail [say, starting halfway walking north, then returning to finish hiking south], and 96 full-trail hikes completed in multiple segments.) I continue hiking toward Skyland, a lodge roughly 11 miles into my hiking for the day, arriving sometime after 13:00.

Skyland is a small lodge/resort/restaurant/tap room in SNP that dates to the 1800s; it precedes the park itself, and its owner, George Freeman Pollock, was a strong advocate for creating a national park in the area by taking the necessary land from its owners using eminent domain. From what I understand he took this position not because he thought the country needed a national park, or SNP in particular, but because he figured it would be a good way to drum up extra business at his resort. It’s ironic, then, that through his success in seeing SNP created he himself was among those who had his resort and land taken from him. It’s a dirty little story I doubt you’d see mentioned in many of the displays there, a strong warning to those considering harnessing leviathan for private gain.

In other circumstances I would be taking this opportunity to visit the tap room at Skyland, the better to fully enjoy the unique experience of a good backcountry restaurant. But that’s out of the question now if I have to get to the south end of the park in two days. 🙁 (Have I mentioned how incredibly bad deadlines are?) Instead I buy a (vending-machine) bottle of root beer, glance at a nearby newspaper (Hurricane Hanna and the nationalization of Fannie Mae and Freddie Mac dominate headlines), and search for a phone to call family and update them on current progress. I have almost 70 miles to go, another 16 still for today, to meet family where I’d intended to meet them. We discuss for a little, but there really isn’t a good place to meet up other than the end. There are a number of random overlooks at which we might be able to meet, but that’s kind of a dicey plan. There’s one major road crossing that’s feasible, except that it’s so much further from the end that getting there in two days would be absurdly easy. In the end I decide there’s really nothing for it but to hike through to where we were going to meet originally. Serves me right for setting a deadline.

Call completed, it’s back out to hiking again. It’s getting close to 15:00, so I need to make miles quickly to avoid night hiking. Thankfully, as usual my afternoon-hiking legs kick in, and I churn out the next 16 miles of trail almost without stopping, to arrive at Bearfence Mountain Hut as darkness hits around 20:30. I have the shelter to myself, it being a Monday night, and there’s a very convenient water source. After that it’s off to sleep to the sound of more-present-than-typical shelter mice (zippers and pockets opened) — two more big-mile days until family…

September 9

(25.8; 1297.9 total, 876.1 to go; +10.8 from pace, -97.1 overall)

It’s up and out for a long day today, either thirty-plus miles to Blackrock Hut or something less than that with stealth camping. It’s drizzly and rainy for much of the day, making hiking more drudgery than otherwise. Still, it could be worse — I pass Smoothie, hiking in the opposite direction, hoping to find a camera he thinks he dropped somewhere back on the trail. (I find out in shelter register reading tomorrow that he successfully retrieved it, getting a ride back to where he was on the trail and thus leapfrogging me.) Hiking continues into the afternoon; I see several turtles on the trail:

A turtle (possibly an eastern box turtle) on the trail

The trail drags today; eventually I find myself at a road crossing with a ranger station (and, more importantly, a water spigot) 0.2 miles away, and I head there to consider my options. It’s getting pretty late in the afternoon, and I’m not much above halfway for the day if I wanted to go to Blackrock Hut. For the moment I decide to punt on a decision and make and eat dinner. I hope maybe that’ll help me decide what to do. While eating I have perhaps my most disgusting deer interaction of the trip, as a deer walks up to within a dozen paces of me and proceeds to pace back and forth, looking at my Knorr rice dinner with obvious interest, clearly begging for food. This deer knows what he can get from stupid tourists, but I’m not one.

The extra energy from dinner (I really ought to use this tactic more often), however, helps me decide what to do: hike another seven miles or so to an area of trail nestled carefully between a campground and a wayside (safely further than the required 0.25 miles distance from either) and stealth-camp there, or hike further if there’s no obvious camping space. By now it’s late enough that I know I’ll be night-hiking. With the extra energy I have now, that just makes it all the more fun — after all, if you don’t night-hike, how will you ever get to wonder if that crashing noise from up in a tree just off the trail is a bear fleeing your presence or not? (This actually happens to me tonight; highly recommended. Remember: the bears fear you more than you fear them.) My pace isn’t fast, particularly due to encroaching darkness, but that no longer matters mentally, so it’s all okay.

My aim brings me to a field with somewhat high grasses and a fair number of trees. The trees provide good branches to hang smellables, which makes it basically adequate for me — hanging food in a bad location is a bigger chore than suffering a little while sleeping. This is definitely an area where I’d have been out of luck with the tent; I’m glad to have the bivy sack.

September 10

(28.0; 1325.9 total, 848.1 to go; +13.0 from pace, -84.1 overall)

Sleep last night wasn’t very comfortable, nor was it very dry. This bivy sack may keep out water when properly set up, but if the hood at top is mis-deployed it’s hopeless. I find if I roll over slightly I’m in a half-puddle of water, fun times. I get up, put on not-dried socks and wet boots, and start hiking with a minimum of delay. Hiking is slow this morning (what’s new?) as I head toward the first shelter of the day at Blackrock Hut, eating a Pop-Tart breakfast (“breakfast”? so it goes) as I walk. This trail is interesting because it was the site of a controlled burn in the spring, according both to signs and scorched, er, “blackery”. There’s not a whole lot of green through here, certainly none of it as trees, and I’m sure some northbounders had to skip trail in this area to avoid the fires when they were originally started. I reach Blackrock Hut having covered seven or so miles in way too long, and by the time I leave it’s past noon, and I have twenty miles to go to meet family at an unspecified time at a location that hopefully isn’t too hard to figure out (it’s a large road crossing, but beyond that I have no idea).

I continue hiking, being careful about what I eat because I have little of it — a handful or so of large Snickers bars is about it. This doesn’t help hiking speed much, but the problem is likely more mental than physical. I’m helped out a bit when, in talking to some passing day hikers (Trail fans, one a section hiker as I recall; they talk a bit about the Mayor with me), they give me an extra Clif bar to eat. After eating that and getting a little mental boost from talking and explaining where I have to be at end of the day, hiking pace picks up again to its normal top speed, and the miles start flying again. At this pace I should finish hiking just before dark.

The miles pass as I hike out of SNP. As I approach the first shelter just outside the park I see wild turkeys off the side of the trail; they’re making me hungry. The shelter’s far enough off-trail, and I’m in enough of a groove, and my time is just limited enough, that I keep moving past the shelter and don’t bother stopping. (An idle idea: could you carry a road bike in along the Appalachian Trail, past the gates, to avoid paying an entry fee?) I reach the southern self-registration backpacker kiosk, where northbounders would have registered. There’s a note waiting from Dad, maybe 15-30 minutes ago, saying they were waiting with the car at the road perhaps 0.8 miles away. A bit more walking and I’m there! The very first task is to quickly run to the promised nearby convenience store that closes at 20:00 in the hopes of getting some proper ice cream, but the store’s empty (and clearly has been for some time, sigh; another thing fixed in more recent Companions).

That done, and greetings complete (I’m told quite accurately that I reek), it’s on the road to drive to the resort where we’re staying (with a grocery store stop along the way, to get that half gallon of ice cream I’d been hoping to eat — and it’s not a proper half gallon either 🙁). It’s about an hour away (perils of trying to meet a hiker who doesn’t have a planned schedule), but once we arrive it’s dinnertime — barbecued ribs tonight, as I recall. For kicks I pull out the wrappers from all the various food items I ate today and count calories; the total is upwards of 3000 calories, maybe just under 4000 — and all of it except maybe the Clif bar was junk food. 🙂 Fun times…

September 11

(0.0; 1325.9 total, 848.1 to go; -15.0 from pace, -99.1 overall)

Today’s pretty lackadaisical, and mostly it’s just a chance to relax. We don’t make an effort to do very much. Shopping for supplies for the next section of hike is the biggest task I remember (although it seems like we still spent a fair amount of time running errands even if we didn’t do much). As usual the candy haul (thirty-odd bars) gets me some looks. This next section of trail’s fairly remote, arguably the most so since the Hundred Mile Wilderness. While there are towns off-trail, they’re all a fair distance along the roads, so the most convenient destination for someone not interested in dealing with the unpredictable delay hitchhiking entails is 134 miles south at Daleville. My food supply, therefore, is probably the third-largest I end up carrying during the entire trip. (The two long stretches in Maine are the only larger hauls.) Once back at the resort we take the opportunity to swim in the resort pool; for me it’s the first swimming since Massachusetts. But mostly, today’s just a day to relax, without deadline or plan to fulfill.

I don’t know it at the time, of course, but I have 44 days of hiking to go to reach Springer…

In September 2008 I wrote a web tech blog post about Text.wholeText and Text.replaceWholeText. These are two DOM APIs which I implemented in Gecko before I graduated from MIT and took five months to thru-hike the Appalachian Trail. Implementing whole-text functionality was an interesting little bit of hacking, done in an attempt to pick up as many easy Acid3 points as possible for Firefox 3, with as little effort as possible. The functionality didn’t quite make 3.0, but aside from the missed point I think that mattered little.

The careful reader might think the post contains a slight derision for Text.wholeText and Text.replaceWholeText — and he would be right to think so. As I note in the last paragraph of the post, Node.textContent (or in the real world of the web, innerHTML) is generally better-suited for what you might use Text.wholeText to implement. In those situations where it isn’t, direct DOM manipulation is usually much clearer.

The whole-text approach of Text.wholeText and Text.replaceWholeText is arcane. Its relative usefulness is an artifact of the weird way content is broken up into a DOM that can contain multiple adjacent text nodes, in which node references persist across mutations. It is an approach motivated by fundamental design flaws in the DOM: Text.wholeText and Text.replaceWholeText are a patch, not new functionality. Further, Text.replaceWholeText’s semantics are complicated, so it’s not particularly easy to use it to good effect. (Note the rather contorted example I gave in the post.)
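To make the "multiple adjacent text nodes" point concrete, here is a minimal sketch in TypeScript of what a whole-text read does. It is a toy model, not the browser DOM: `ToyNode`, `wholeText`, and `exampleSiblings` are hypothetical names invented for illustration, and the model only handles the simple case of text nodes that are contiguous siblings. It deliberately ignores Text.replaceWholeText, whose semantics are more involved.

```typescript
// Toy stand-in for a list of sibling DOM nodes (hypothetical, not the real DOM).
type ToyNode =
  | { kind: "text"; data: string }
  | { kind: "element"; tag: string };

// Concatenate the data of the text node at `index` with the data of every
// text node contiguous with it among `siblings`, in order.
function wholeText(siblings: ToyNode[], index: number): string {
  const self = siblings[index];
  if (!self || self.kind !== "text") {
    throw new Error("wholeText is only defined for text nodes");
  }
  // Walk left to the first node of the contiguous run of text nodes.
  let start = index;
  while (start > 0 && siblings[start - 1]?.kind === "text") start--;
  // Walk right to the last node of the run.
  let end = index;
  while (end < siblings.length - 1 && siblings[end + 1]?.kind === "text") end++;
  // Join the run's data.
  let out = "";
  for (let i = start; i <= end; i++) {
    const node = siblings[i];
    if (node && node.kind === "text") out += node.data;
  }
  return out;
}

// Three adjacent text nodes, an element, then one more text node.
const exampleSiblings: ToyNode[] = [
  { kind: "text", data: "Hello, " },
  { kind: "text", data: "wor" },
  { kind: "text", data: "ld" },
  { kind: "element", tag: "br" },
  { kind: "text", data: "!" },
];

console.log(wholeText(exampleSiblings, 1)); // the run containing index 1
console.log(wholeText(exampleSiblings, 4)); // the run containing index 4
```

Asking any of the first three nodes for its whole text gives the same joined string, which is the entire point of the API: it papers over the fact that one visible run of text can be split across several nodes.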

Fundamentally, the only reason I implemented whole-text functionality is because it was in Acid3. I believe this is the only reason WebKit implemented it, and I believe it is quite probably the only reason other browser engines have implemented it. This is the wrong way to determine what features to implement. Features should be implemented on the basis of their usefulness, of their “aesthetics” (an example lacking such: shared-state threads with manual locks, rather than shared-nothing worker threads with message passing), of their ability to make web development easier, and of what they make possible that had previously been impossible (or practically so). I know of no browser engine that implemented whole-text functionality because web developers demanded it. Nevertheless, its being in a well-known test mandated its implementation; in an arms race, cost-benefit analysis must be discarded. (The one bright spot for Mozilla: in contrast to at least some of their competitors, they didn’t have to spend money, or divert an employee, contractor, or intern already more productively occupied, to implement this — beyond review time and marginal overhead, at least.)

The requirement of whole-text functionality, despite its non-importance, is one example of what I think makes Acid3 a flawed test. Acid3 went out of its way to test edge cases. Worse, it tested edge cases where differences posed little cost for web developers. Acid3 often didn’t test things web authors wanted, but instead it tested things that were broken or not implemented regardless of whether anyone truly cared.

The other Acid3 bugs I fixed were generally just as unimportant as whole-text functionality. (Due to time constraints of classes and graduation, this correlation shouldn’t be very surprising, of course, but each trivial test was a missed opportunity to include something developers would care about.) Those bugs were:

The UTF-16 bug was exactly the sort of thing to test, especially for its potential security implications; disagreement here is frankly dangerous. (Still, I remain concerned that third-party specification inexactness caused Acid3 to permit several different semantics, listed beneath “it would be permitted to do any of the following” in Acid3’s source. This concern will be addressed in WebIDL, among other places, in the future.) cursor:none was an arguably reasonable test, but it probably wasn’t important to web developers because it had a trivial workaround: use a transparent image. (The same goes for other unrecognized keywords, if with less fidelity to the user’s browser conventions, therefore lending the testing of these keywords greater reasonableness.) But the other tests are careful spec-lawyering rather than reflections of web author needs. (This is not to say that spec-lawyering is not worthwhile — I enjoy spec-lawyering immensely — but the real-world impact of some non-compliance, such as the toString example noted below, is vanishingly small.) Nitpicking the exact exceptions thrown when trying to create elements with patently malformed names doesn’t really matter, because in a world of HTML almost no one creates elements with novel names. (Even in the world of XML languages, element names are confined to the vocabulary of namespaces.) Effectively no one uses Element.attributes, and the removeNamedItemNS method of it even less, preferring instead {has,get,set}Attribute{,NS}. The bug in question — that null was returned rather than an exception being thrown for non-existent attributes — was basic spec compliance but ultimately not useful functionality for web developers. Similarly, the impact of an incorrect difference between (3.14).toString() and (3.14).toString(undefined) is nearly negligible. The escape-parsing bug was an interesting quirk, but since other browsers produced a syntax error it had little relevance for developers.
All these issues were worth fixing, but should they have been in Acid3? How many developers salivated in anticipation of the time when eval("var v\\u0020 = 1;") would properly throw a syntax error?

Other Acid3-tested features fixed by others often demonstrated similar unconcern for real-world web authoring needs. (NB: I do not mean to criticize the authors or suggesters of mentioned tests [I’m actually in the latter set, having failed to make these opinions clear at the time]; their tests are generally valid and worth fixing. I only suggest that their tests lacked sufficient real-world importance to merit inclusion in Acid3.) One test examined support for getSVGDocument(), a rather ill-advised method on frames and objects added by the SVG specification, whose return value, it was eventually determined (after Acid3-spawned discussion), would be identical to the sibling contentDocument property. Another examined the values of various properties of DocumentType nodes in the DOM, notwithstanding that web developers use document types — at source level only, not programmatically — almost exclusively for the purpose of placing browser engines in standards mode. Not all tested features were unimportant; one clear counterexample in Acid3, TTF downloadable font support, was well worth including. But if Acid3 gave web authors that, why test SVG font support? (Dynamically-modifiable fonts don’t count: they’re far beyond the bounds of what web authors might use regularly.) SVG font use through CSS was an after-the-fact rationalization: SVG fonts were only intended for use in SVG. (If one wanted to write an acid test specifically for SVG renderers, testing SVG font support at the same time might be sensible. Acid3, despite its inclusion of a few SVG tests, was certainly not such a test.)

But Acid tests don’t have to test trivialities! Indeed, past Acid tests usefully prodded browsers to implement functionality web developers craved. I can’t speak to the original as it was way before my time, but Acid2 did not have these shortcomings. The features Acid2 tested were in demand among web authors before the existence of Acid2, a fortiori desirable independent of their presence in Acid2.

I have hope Acid4 will not have these shortcomings. This is partly because the test’s author recognizes past errors as such. With the advent of HTML5 and a barrel of new standards efforts (workers, WebGL, XMLHttpRequest, CSS animations and transitions, &c. to name a few that randomly come to mind), there should be plenty of useful functionality to test in future Acid tests without needing to draw from the dregs. Still, we’ll have to wait and see what the future brings.

(A note on the timing of this post: it was originally to be a part of my ongoing Appalachian Trail thru-hike posts, because I wrote the web tech blog post on whole-text functionality during the hike. However, at the request of a few people I’ve separated it out into this post to make it more readable and accessible. [This post would have been in the next trail update, to be posted within a week.] This post would indisputably have been far more timely a while ago, but I write only as I have time. [I wouldn’t even have bothered to post given the delay, but I have a certain amount of stubbornness about finishing up the A.T. post series. Since in my mind this belongs in that narrative, and as I’ve never omitted a memorable topic even if (if? —ed.) it interested no one but me, I feel obliged to address this even this far after the fact.] Now, if you skipped this post’s contents for this explanation, return to the start and read on.)

15.06.09

Astute observers of the recent Firefox 3.0.11 release will note that this release gains a point on the Acid3 test, specifically fixing test 68, involving a difference in how Mozilla processes UTF-16 from what the Unicode specification requires. (Indeed, it seems Wikipedia’s Acid3 article was updated to reflect this on the very day of the 3.0.11 release, and that edit has even been reverted and un-reverted once. One does wonder at times whether these people don’t have better things to do with their time. 🙂 ) What did test 68 check, why was it fixed, and why was it fixed in a dot release of a stable version? If you’re curious about this and are willing to dive into technical details at some depth, keep reading. If you aren’t, I write here for the pleasure of a presumptively-interested audience whether or not any such thing exists for the topic at hand, so you’ll forgive me for not caring too much what you do. 🙂

Representing text through Unicode and UTF-16

Without descending into too much detail, modern computer programs represent text as a sequence of numbers, one number for each “character” in the text. Unicode defines each such number as a code point and assigns each number a particular meaning (e.g. 99 is “LATIN SMALL LETTER C”, that is, the letter ‘c’). Such an arbitrary sequence can’t simply be represented as itself partly because computers can only efficiently handle numbers whose values are within small, fixed ranges, and code points cover too large a range (0 to 1114111) to handle directly. Unicode therefore defines a small number of well-known, widely adopted processes to convert an idealized sequence of code points into computer-friendly sequences.
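As a quick illustration of the numbers quoted above, Python exposes code points directly. This is only a sketch for checking the figures; the variable names are arbitrary.

```python
import sys
import unicodedata

code_point = ord("c")                  # character -> code point
name = unicodedata.name("c")           # the meaning Unicode assigns to that number
largest = sys.maxunicode               # highest code point a Python string can hold
fits_in_16_bits = largest <= 0xFFFF    # False, which is why encodings like UTF-16 exist
```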

UTF-16 is one way to represent a sequence of code points, using a sequence of 16-bit numbers. Broadly speaking, code points in the range 0 to 65535 are represented using the identical number; code points 65536 and greater are represented by a pair of 16-bit numbers. There’s an obvious problem: how do you determine if a 16-bit number encodes a single code point or half of one? Basically, not every 16-bit number is a code point; there are intentionally no code points from hexadecimal 0xD800 through 0xDFFF (55296 through 57343). With a little care, we can use two (exactly two, no more, no fewer) numbers from this range to represent a code point. For example, the hexadecimal number 0xD863, 55395 in decimal, does not correspond to a valid Unicode code point, nor does 0xDDDD, 56797 in decimal. The sequence 0xDDDD 0xD863 also corresponds to no Unicode code point or sequence of code points. However, 0xD863 0xDDDD represents the Unicode code point with the hexadecimal value 0x28DDD, 167389 in decimal.
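The pairing arithmetic can be sketched in a few lines. This is a minimal illustration of the standard UTF-16 scheme, not Mozilla's code; the function names encode_utf16 and decode_pair are hypothetical.

```python
def encode_utf16(code_point):
    """Return the list of 16-bit code units that represent one code point."""
    if 0xD800 <= code_point <= 0xDFFF or not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not an encodable code point: %#x" % code_point)
    if code_point < 0x10000:
        return [code_point]                      # represented as itself
    offset = code_point - 0x10000                # 20 bits remain
    high = 0xD800 + (offset >> 10)               # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)              # bottom 10 bits
    return [high, low]


def decode_pair(high, low):
    """Combine a high surrogate and a low surrogate into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
```

The reserved range is split in two, 0xD800 through 0xDBFF for the first unit and 0xDC00 through 0xDFFF for the second, which is why 0xD863 0xDDDD is a valid pair while 0xDDDD 0xD863 is not.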

Acid3 test 68 and its fix

The failing test 68 examined how to interpret a purportedly UTF-16 value which actually was in error. Specifically, consider a sequence of 16-bit values like so, to be interpreted as UTF-16:

| Index | Value description | Example |
|-------|-------------------|---------|
| 0 | Value appearing to be the first in a valid pair (a high surrogate) | 0xD863 (decimal 55395) |
| 1 | Value not appearing to be the second in a valid pair (instead appearing to be a value in range [0, 65536) validly represented as itself) | 0x61 (decimal 97) |
| 2 | Value in range [0, 65536) validly represented as itself | 0x61 (decimal 97) |

How should such “UTF-16” data be interpreted? First, note that Unicode includes the concept of a “replacement character”, used to fill in for the malformed parts of ostensibly-UTF-16 data when interpreting it. With that concept in mind, we have three plausible ways to interpret this sequence of values:

In fact if you look at test 68 itself, you’ll see a variety of responses are (to put it as conservatively as possible, since the specs are currently insufficiently precise to admit only one correct behavior) “not prohibited” by the relevant specifications. Mozilla at the time chose to interpret such data as the last of the three possibilities, thinking that the pair of 16-bit values was invalid rather than merely the first of them (in which case the second of the “pair” would be interpreted as its own value or start of a pair). However, the Unicode standard didn’t permit this choice, and neither did Acid3; further considering that other browsers had correct (and just as important, different) behavior, it made sense to change our interpretation in this case. I pushed a fix to the mozilla-central repository very nearly a year ago, and I made no attempt to get it in 3.0, for two reasons. First, the bug didn’t matter in the real world (web developers are very unlikely to have relied on the previous behavior, and the fix enables no new, desirable functionality); second, I knew we didn’t have enough time before the release to sniff out all potential regressions from such a low-level change and be confident nothing had been broken.
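The difference between the two readings contrasted in that paragraph can be sketched as a single flag in a toy decoder. This is not Mozilla's decoder, and the name decode_utf16 and its swallow_bad_pair parameter are hypothetical. It also assumes that a rejected "pair" collapses to exactly one replacement character, which the text above does not spell out.

```python
REPLACEMENT = 0xFFFD  # U+FFFD REPLACEMENT CHARACTER


def decode_utf16(units, swallow_bad_pair):
    """Decode 16-bit units to code points, replacing malformed parts."""
    out = []
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:                      # high surrogate
            nxt = units[i + 1] if i + 1 < len(units) else None
            if nxt is not None and 0xDC00 <= nxt <= 0xDFFF:
                out.append(0x10000 + ((unit - 0xD800) << 10) + (nxt - 0xDC00))
                i += 2
                continue
            out.append(REPLACEMENT)
            # The disputed step: discard the following unit as part of an
            # invalid pair, or re-examine it as its own value?
            i += 2 if (swallow_bad_pair and nxt is not None) else 1
        elif 0xDC00 <= unit <= 0xDFFF:                    # low surrogate with no high before it
            out.append(REPLACEMENT)
            i += 1
        else:
            out.append(unit)                              # represented as itself
            i += 1
    return out
```

For the units in the table, the swallowing policy yields two code points and the other policy yields three, while well-formed input decodes identically under both.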

Acid3 redux

Fast-forward a year later, however, and suddenly the fix for test 68 is in 3.0.11. The test, while important to fix eventually for a number of reasons, is not especially important for real-world behavior (a flaw of many aspects of Acid3, notwithstanding its many useful tests of desirable functionality, but I digress), so why fix it now rather than in the 3.5 release? Surely such a non-essential bugfix is better left to 3.5 as it already had been, right? That’s certainly true enough if our assumptions are correct — but in this case, curiously, they’re not.

One of the things that makes changing how character encoding works so exciting is that it affects a lot of other code. A small bugfix to such code usually has no effect on properly formed input, because such input is the normal case that receives regular testing. Improperly formed input, however, may cause immediate problems (which are usually simple to smoke out through careful creation and use of automated tests) or problems in other code at miles of distance (which are far more difficult to discover). Suppose data is interpreted by two different decoding implementations, producing two different idealized representations — what are the consequences? Maybe the lengths will differ. If those implementations are used in code written by memory nazis, perhaps a string will be copied incorrectly and result in a buffer overflow. Algorithmic differences might also throw off hashing schemes that use string length in computing a hash key. Of course, if lengths differ the characters must differ as well. That might cause a CSS selector to not apply correctly, or it might introduce a vector for slipping forbidden characters through an anti-XSS filter. Character encoding and decoding is fundamental to any number of other systems which rely on precise, correct behavior in order to work properly. If you don’t have that behavior, all bets are off.

In this case, as it turns out, two separate bugs were uncovered which my proposed patch fixed: bug 490513 and bug 489041. If you read the details in each bug, you’ll note that there’s very little in either bug to suggest why my change might have any effect. To be sure, both testcases deal with strings containing problematic sequences as above, but nothing in the testcase explicitly suggests UTF-16 decoding is happening.

A useful first step in examining any bug in a bug database is to look at its ancestry. Curiously, bug 490513 is a Bugzilla clone of bug 439206, with the same steps to reproduce and the same testcase. From there we proceed to the fix for the bug. The patch is small, and if one understands the relevant code it’s similarly easy to understand, but —

The comment in the patch looks strangely familiar.

In fact I seem to remember exactly that comment in my patch for bug 421576 that fixed test 68. (Lest I be misconstrued, this comment-cribbing is perfectly acceptable in open source code, indeed is even stylistically better for demonstrating consistent intent in the code. However, your mileage may vary in other code or if perchance you happen to work for Microsoft.)

Also suspicious: this bug was filed roughly a week after I pushed 421576 to mozilla-central. A little more investigation confirms the obvious conclusion: bug 439206 was a regression from bug 421576, vindicating my initial thoughts in bug 421576 that “we couldn’t reasonably take this now and expect to be able to sniff out all possible regressions”. It seems that I updated most of our decoding code to handle lone high surrogates correctly, but I missed the spot being fixed in this patch. The code being patched here handles string hashing within Mozilla’s hash tables, and if you read the other bug comments you can see the testcase is causing a string to be hashed using two different algorithms (a failure mode I mentioned earlier), and as the computed hashes differed things went awry.
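A toy model shows why two decodings of the same malformed data are fatal for a hash table. The hash function below is an arbitrary stand-in and not Mozilla's real one, and the two decoded sequences are assumed outputs for the units 0xD863 0x61 0x61 under the two policies discussed earlier.

```python
def hash_code_points(code_points, buckets=64):
    """Toy string hash over decoded code points (a stand-in, not the real function)."""
    h = 0
    for cp in code_points:
        h = (h * 31 + cp) & 0xFFFFFFFF
    return h % buckets


# Assumed outputs of two decoders for the malformed units 0xD863 0x61 0x61.
updated_decoding = [0xFFFD, 0x61, 0x61]   # only the lone surrogate replaced
stale_decoding = [0xFFFD, 0x61]           # surrogate and its neighbor replaced together

# Store the string through one code path...
table = {}
table.setdefault(hash_code_points(updated_decoding), []).append("atom")

# ...and look it up through another that decodes differently.
found = table.get(hash_code_points(stale_decoding), [])
lengths_differ = len(updated_decoding) != len(stale_decoding)
```

The lookup lands in a different bucket and finds nothing, even though both code paths started from the same sequence of 16-bit values.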

Here’s where Denmark turned rotten: this change was deemed important to fix for 3.0 point releases, but because no one noticed it was a trunk-only regression and thus didn’t need to be fixed in a security update, it was backported to 3.0. The problem on trunk was that I only updated half the decoding algorithms in bug 421576, so fixing bug 439206 fixed the other half and brought them into sync. What did fixing bug 439206 in 3.0 do? It updated the other half of the decoding algorithms and took that half out of sync with what the other half was before 421576! Our trunk problem — that we changed one decoding algorithm but not a second that needed to be synchronized with it — had been ported in mirror image to 3.0 point releases. This problem, then, triggered the filing of bug 490513 (incidentally regressing bug 489041 as well), and for precisely the same reasons bug 439206 was marked as security-sensitive until being fixed, bug 490513 was marked as security-sensitive.

Let’s recap:

I fix bug 421576 (and Acid3 test 68) in mozilla-central.

This causes the security-sensitive regression bug 439206.

Bug 439206 is investigated and fixed in mozilla-central.

This fixes the potential vulnerability I introduced.

People recognize 439206 as potentially dangerous but not as a regression, so the same fix is added to 3.0 point release code.

This causes the security-sensitive regression bug 490513 (and bug 489041 as well), because we’re missing the first half of the code (in 421576) that caused 439206.

That bug is investigated by the reviewer of my fix for 421576, who correctly hypothesizes that my fix will fix that bug without determining exactly why.

Both halves (421576 and 439206) are now fixed in mozilla-central and in 3.0 point release code.

…and people wonder why we’re so hesitant to fix anything except security bugs in updates to stable releases, instead deferring to the subsequent major release. The potential for error, even in code that’s been written and reviewed by four different people across the two bugs, introduces a far higher cost than can be offset by the value in fixing nearly any non-security, non-stability bug.
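The recap can be replayed as a small state machine. The labels "decoder" and "hasher" are shorthand of mine for the two halves of the decoding code (the part changed in 421576 and the hashing code changed in 439206), and the replay function is hypothetical. The only rule modeled is that a branch is safe when both halves agree.

```python
def replay(events):
    """Apply fixes in order and report whether each branch's halves agree."""
    state = {
        "trunk": {"decoder": "old", "hasher": "old"},
        "3.0": {"decoder": "old", "hasher": "old"},
    }
    log = []
    for bug, branch, component in events:
        state[branch][component] = "new"
        in_sync = state[branch]["decoder"] == state[branch]["hasher"]
        log.append((bug, branch, in_sync))
    return log


EVENTS = [
    ("421576", "trunk", "decoder"),   # Acid3 test 68 fix lands on trunk
    ("439206", "trunk", "hasher"),    # regression fix brings trunk back in sync
    ("439206", "3.0", "hasher"),      # backport alone: mirror-image mismatch
    ("421576", "3.0", "decoder"),     # second backport restores agreement
]

history = replay(EVENTS)
```

Each half was correct in isolation; the failures came only from shipping one without the other.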

Lessons for the future

All this mess is now behind us with the 3.0.11 release. However, a number of the causes for failure here are not peculiar to the precise bugs fixed here. What can we learn from this that can be applied in the future?

Make string encoding/decoding code simpler

First, and most pertinent to the case at hand, string encoding and decoding are complex, and we need to do everything we can to make this code simpler. A large part of the reason 439206 was missed when fixing 421576 was that the relevant code was not part of Mozilla’s string code — instead of residing in xpcom/string, it was in xpcom/ds. It’s difficult to argue that a data structure used to hold atomized strings (that is, strings which uniquely identify a sequence of characters, making a comparison of two atomized strings as fast as comparing two numbers) shouldn’t reside in a data structures (ds) directory. However, the code to compute the hash of the string (a single number that summarizes a string’s contents, hopefully uniquely but possibly non-uniquely) must simultaneously decode the string, and that code should be in xpcom/string with all other string decoding code.

In response to this state of relative disarray, I’ve filed bug 497204 to reorganize and consolidate Mozilla’s string code, with the primary goal of getting it all in one location (so that even if you’re unfamiliar with it, you at least know all the code you might need to read) and with the secondary goal of organizing it in a clearer and simpler fashion (to make the consolidated code easier to read). We may still end up making mistakes in how we handle string encoding and decoding even with that work complete, but those mistakes will be easier to find, diagnose, and completely fix.

Write more automated tests, and make it possible to include them in security fixes

Second, we need to move as much testing like this out of human hands and into “computer hands” as possible. One reason we introduced a regression was that we relied on fallible human testing to varying degrees throughout this whole process. I relied on some informal testing of my original patch in deeming it complete; the regression fix did likewise as manual QA testing verified the problem had been fixed on trunk and then later discovered the bug not to be fixed in 3.0.x releases (after the unnecessary backport). Yet I didn’t uncover my omission, and QA didn’t notice the backport caused the branch regression rather than fixed it. Some amount of manual testing, formalized through QA or otherwise through masses of nightly testers, is both desirable and unavoidable. Most of the time, however, we are much better off with automated tests that can be run with much less effort, on a much shorter time scale, with much greater rigor, and with much less chance of mistakes in their execution.

Now, to be fair, automated tests have less value here than in many other cases. I included automated tests in my original patch, but they only addressed the bug at hand and not the regression (else that regression would never have occurred). It’s true that the followup would have been helped by an automated test, because when it was ported to branch it would have failed, and the backport would have been reverted pending further investigation. (Whether this investigation would have led to discovering bug 439206 to be a trunk-only regression is unclear, but it certainly seems plausible.) Here, however, we encounter the large problem that we currently can’t include automated tests in security bug fixes because we don’t want to tip our hands before a release with the fix is available. (In cases such as this one this worry is perhaps more paranoid than well-founded, but in many other cases where a testcase is half a step from a full exploit it’s vital; mrbkap can elaborate on this at length.) The testcase is often committed after the release, particularly when the problem is memory corruption in ways that appear difficult to control (the apparent case here), but by that time the damage has already been done.

Automated tests are not a panacea, as my original fix shows. Nevertheless, consistent use of them here would have at least eliminated the 3.0.11 regression if not the trunk regression. For this to happen we must have security bugs include automated tests that run, essentially, as soon as the fix lands. The solution that I believe best meets this need as I envision it consists of one separate, private Mercurial repository per actively maintained release to which security bug tests may be committed. Access to these repositories would be strictly limited to developers with access to security bugs plus accounts for use by tinderboxen. We would then add additional steps to the build process on those tinderboxen to pull and run the security tests, reporting either PASS or FAIL for the lot of them. Detailed information would not be publicly displayed but would be available in some unspecified manner to developers who cause security bug tests to fail. Then, as security bugs are fixed, we would require two-phase commits for security bugs: first the fix to the public repository, then the test to the restricted repository. As failures would turn the main tinderbox orange, this ensures regressions get attention between the initial commit to the security repository and their eventual migration to the main repository — early enough to make the difference in cases like this. Another plus: I believe this would dovetail nicely with work in bug 383136 to make it possible to run tests against prepackaged builds.
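The "PASS or FAIL for the lot of them" reporting could look roughly like the sketch below. Everything here is hypothetical: the function run_security_suite, the test names, and the idea that the private tests are plain callables. It shows only the split between a public verdict and restricted details, not any existing tinderbox feature.

```python
def run_security_suite(tests):
    """Run private tests; return (public_status, private_details)."""
    failures = []
    for name, test in sorted(tests.items()):
        try:
            test()
        except AssertionError as exc:
            failures.append((name, str(exc)))
    # Only this single word would be shown publicly.
    public_status = "FAIL" if failures else "PASS"
    # The details would go only to developers with security-bug access.
    return public_status, failures


def passing_test():
    assert 1 + 1 == 2


def failing_test():
    assert False, "details that stay private"


status, details = run_security_suite({"bug-a": passing_test, "bug-b": failing_test})
```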

There may be a different solution which meets the needs I specify here; I’m less concerned about the process than about results. However, I haven’t heard another proposal that I believe would work well enough.

Manual testing: actually, I don’t know what the lesson is here

Third, we have manual testing, imperfect but still necessary to some degree. I don’t know what the lesson for QA and manual testing is. Sure, they could be more diligent about checking that a bug exists pre-commit and is fixed post-commit, but in isolation every such action is reasonable. Is the decrease in available time worth the benefit of potentially catching a problem like this one every so often? I don’t know the exact processes they follow these days or how they otherwise spend their time, so I really can’t evaluate this. I’ll let QA consider this situation and decide how to adjust, because I’m certainly not qualified to do so.

Conclusion

The particular bugs at issue here are now fixed, so for the moment we’re back to steady state. As explained above, however, it’s possible that further errors might happen for the same basic reasons unless we make an effort to eliminate those reasons, and we still have work to do to fix the root problems. I have hope that this article may spur improvements in processes that will make mistakes like this much harder to make, but it will take more than just me to make these changes happen. In the meantime, to Firefox 3.0.11 users, enjoy the gift of an unintended Acid3 point; it came at a much higher cost than we’d have been willing to pay if we had known the full story and never made either 3.0.x backport.