Patching the Newton

This one ends with me going into a Good Guys electronics store in a suburb of Boston, flashing my employee badge and saying “I’m from Apple. Take me to your Newtons.” I wanted to use a deep G-man voice, but I didn’t have the build for it. Also, I was wearing the slightly redonkulous khaki-pants and polo shirt that we’d been instructed to wear for the Newton debut the next day, and I looked like a rah-rah Apple sales guy. Appearances are deceiving.

It’s August, 1993, the day before Apple’s Newton MessagePad goes on sale, and overnight we’ve heard that some people who managed to get units ahead of time are having trouble with them, and someone heard that someone said they thought they had a unit whose software wasn’t the right version. There’s a little uncertainty. Since we shipped the final Newton ROM in June we’ve spent all summer fixing bugs and making patches. The patches are applied in the factory, toward the end of the production line, and they fix critical bugs. If the units don’t have these fixes, the Newt won’t work very well.

The patches live in the battery-protected low-power RAM of the Newton, and they’re theoretically immortal as long as power holds out. This is why the battery compartment has a wacky mechanical locking system meant to discourage people from simultaneously removing both the main and the backup batteries. It’s a byzantine contraption of sliders and buttons molded in Holy Shit Yellow, and it’s meant to scare people into being cautious. But early adopters aren’t necessarily normal people, and maybe they’ve been mucking with stuff they shouldn’t be. Who knows?

As the morning progresses it becomes clear that some number of the Newtons that are already at stores and about to be sold either didn’t get patched at the factory, or (worse) are losing their patches in transit. So here I am, cruising around Boston in a crappy rental car, fumbling with maps and getting lost on streets that are not laid out in a grid. It’s also about 90 degrees and humid, reminding me of why I don’t live on the East Coast any more.

I find the Good Guys store manager. “I’m from Apple, and you have a bunch of Newtons that you can’t sell unless we take a look at them first.”

“Um, okay.”

They lead me into the back room, and there’s a cage with about twenty Newtons in it. It’s the first time I’ve seen one in an actual retail box. It’s kind of neat. We made this? Holy crap, people are about to buy them.

“I’m going to have to open each one and inspect it, okay?”

“Sure.” No pushback at all. They get a stockroom kid to help me. We open each box, I check the version number that each Newt displays on its startup screen, and (lo!) there are a couple whose versions are “1.0”. I record the serial numbers of these units, stick the upgrader PCMCIA card into them (hacked up Very Quickly by Newtonites who didn’t get to go to Boston), and verify that the units take the patch. We re-box everything (I resist the urge to include a scribbled note, something like “Inspected by #18706”, my Apple badge number). Half an hour later, we’re done.

“That’s it. You can sell these now.”

The Guy from Apple exits Good Guys, checks it off the list, and proceeds to the next store on his assignment. G-Man is surprised what he was able to do by flashing his Apple badge and adopting some bluster; just walk in, make some demands and start cracking open merchandise. That’s kind of disturbing, really.

There are perhaps a dozen of us spread out over the Boston metro area, patching Newtons that have somehow lost their minds. We don’t know the cause of the failure yet. We report serial numbers back to Cupertino. It’s the usual mode of operation of Newton; shit happens, we fix it just in time, and there are unsung heroes. A number of engineers back in Infinite Loop did an all-nighter last night and made that magic re-patching PCMCIA card. And here I am stuck in Boston traffic. It’s hawt.

None of this is going to matter.

—-

Back in February or March, the Newton OS group knew we had to solve a tough problem. The Newton’s ROM actually is read-only. The days of shipping flashable devices as a common practice were a few years off. The Newton’s OS is decided forever by its silicon mask when the chip is fabbed, and it can’t be changed, ever. Also, the lead time for ROMs is on the order of ten weeks, so the software will have to be done earlier, which means it will have more bugs. Huh.

So, how do you fix bugs in a ROM, if you can’t change the image? The basic idea is that you litter the code with indirect jumps that go through a jump table that’s been copied to RAM. When you need to patch something, chances are that you can do tricky and unnatural things to get control at the right spot, fix things up and continue.
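A minimal sketch of that indirection, in Python rather than ARM assembly. This is not the Newton code: the routine names and the `call` helper are made up for illustration, and a tuple stands in for the unchangeable ROM image.

```python
# Hypothetical "ROM" routines; the names are invented for this sketch.
def rom_draw_text(s):
    return "draw:" + s

def rom_beep():
    return "beep"

# The table as burned into ROM: a tuple, so it cannot be modified.
ROM_JUMP_TABLE = (rom_draw_text, rom_beep)
DRAW_TEXT, BEEP = 0, 1  # slot numbers that callers are built against

# At startup the table is copied into RAM, where entries can be replaced.
ram_jump_table = list(ROM_JUMP_TABLE)

def call(slot, *args):
    """Every call goes through the RAM copy of the table, never straight to ROM."""
    return ram_jump_table[slot](*args)

def patched_draw_text(s):
    # The patch gets control first, fixes things up, then continues
    # into the original ROM routine.
    return rom_draw_text(s.strip())

before = call(DRAW_TEXT, " hi ")           # unpatched behaviour
ram_jump_table[DRAW_TEXT] = patched_draw_text
after = call(DRAW_TEXT, " hi ")            # same call site, patched behaviour
```

The call site never changes; only the RAM table entry does, which is what makes a frozen ROM patchable at all.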

The brain-dead way to do this is to give each and every function in the ROM (except for the critical startup stuff) a jump table entry. You copy the whole table to RAM when the system starts up. For the first Newton the size of the jump table was about 40K, which is about twice the budget we had for patches (the first Newton had 512K of memory, a fair amount of which was reserved for storage of user data, with the rest used for NewtonScript heaps, screen buffers, thread stacks, networking, working storage for handwriting recognition and so on).

Okay, no problem; the Newton has an MMU, and we used it extensively already. So we put the jump table into ROM, and then only RAM-back the pages of it that contain function stubs that we need to modify. But this doesn’t work because pretty often a function you need to patch will be in a page you haven’t used yet. Even a moderate number of patches will quickly drag in the whole jump table again, and you haven’t even included the actual patch code yet.
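A rough way to see why page-at-a-time RAM backing falls apart. The 4K page size and 4-byte table entries are assumptions for the sake of the arithmetic, not remembered Newton figures; only the roughly 40K table size comes from the story.

```python
PAGE_SIZE = 4096            # assumed page size
ENTRY_SIZE = 4              # assumed: one 32-bit branch per table entry
TABLE_SIZE = 40 * 1024      # "about 40K" jump table
ENTRIES_PER_PAGE = PAGE_SIZE // ENTRY_SIZE
TOTAL_PAGES = TABLE_SIZE // PAGE_SIZE

def pages_to_copy(patched_slots):
    """Jump-table pages that must be RAM-backed to patch the given slots."""
    return {slot // ENTRIES_PER_PAGE for slot in patched_slots}

def ram_cost(patched_slots):
    """Bytes of RAM spent on table pages alone, before any patch code."""
    return len(pages_to_copy(patched_slots)) * PAGE_SIZE

# Three patches that happen to share a page are cheap...
clustered = [3, 17, 900]
# ...but five patches scattered across the table each drag in their own page.
scattered = [3, 1500, 2600, 5000, 9000]

clustered_cost = ram_cost(clustered)
scattered_cost = ram_cost(scattered)
```

Under these assumptions, five unlucky patches already cost 20K of table pages, which is half the whole table and roughly the entire patch budget, with no actual fix code included yet.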

And that’s where we were roughly in March, knowing that we had a problem, but with nobody really working on it because all we had were the old, dumb ideas that didn’t work. Time hadn’t run out yet, but people outside the team were getting a little antsy.

—-

The OS group was in one “pod” of Infinite Loop’s Building One (we were one of the first groups to move in, having done our bit of purgatory in an off-campus building on Bubb Road for a number of years). The floor layout was a big improvement over a sea of cubicles; the building architects had designed very workable areas in each corner of the building. Each corner pod had about a dozen offices (with doors!) surrounding a living-room sized area that had couches and *lots* of whiteboard space. If you left your office door open you could halfway participate in the conversations happening in the central area, and if you wanted privacy you just closed your door (which was a nice, solid hunk of wood that stopped sound pretty well). The setup was designed to let people both collaborate and to get quiet think-time. After years of working in cubicles, it was great.

The discussion one afternoon was yet another bull session on how to do patches. I was typing away (bringing our flash object store to life) and not listening too closely, but . . . to this day I don’t know exactly where the idea came from. We were big on randomized policy in Newton because Bob, one of our OS heavies, maintained that it was the ultimate in fairness. Maybe that was it.

I popped my head out of my office and let my mouth talk by itself. “Why don’t we randomize the jump table?”

“What?”

It wasn’t clear in my head yet. This half-baked crazy idea was zooming around upstairs, trying to get out. I went to the whiteboard and started drawing boxes. Whenever an idea is trying to escape, start by drawing boxes and arrows — it’s great bait. The little critters can’t resist a whiteboard with unfulfilled boxes and arrows.

“We randomize the function placement in virtual memory.” Boxes. “We spread them out over a few megabytes but tile them sort of skewed so they use only 40K or so of actual physical memory.” Arrows. “So when you patch a function, you copy it to RAM and map it where it needs to be, but the rest of the page is available in virtual space for the code.”

One of my cow-orkers caught the idea and ran with it. He wasn’t a whiteboard kind of guy, though, so someone else picked up a pen and drew what he was saying.

Greg: “Yeah, we can re-use the same physical page over and over again until we run out of space.” Boxes. “We write a tool to figure out the optimal packing.”

Bob: “We need to make sure that patches are really well protected. REALLY well. Maybe we shouldn’t ever map them anywhere as writable.”

Daniel: “We need some kind of patch-building tool. And we have to read it from somewhere to get patches installed, and mapped-in when we boot. I’ll do the patch manager work.”

[conversations not accurately remembered, but that was the gist]

In short order we figured out the work, and a couple of weeks later all the tooling and build stuff was in place. When we told some of the old Macintosh patching veterans about the technique the reaction was usually a thoughtful look, then a little giggling, and then they wanted to try to write a patch.

Usually a patch would first be written by the engineer who fixed the problem in the mainline sources, and it would go to one of the patching gurus (hi, Andy!) who could distill someone else’s hard-won 16 instruction patch into three instructions of such utter genius and evil that shivers would run down your spine.

I was busy on other stuff and only had time to watch the work happen, but it was pretty cool.

—-

[Here’s what was trying to come out. In virtual space, patches would be in a big honking address space, with functions distributed around it (not really randomly, but that was how we got the idea). Each line of dashes below is a page boundary:

jump A
-
-
-
----
-
jump B
-
-
----
-
-
jump C
-
----
-
-
-
jump D

Note that the virtual address of each jump table entry crawls forward a little on each successive page. So physically we can pack the page like this:

jump A
jump B
jump C
jump D

and map it multiple times.

Now let’s say that we need to patch functions B and C. We’d make a new page, distributed as the patch itself, that contains the new code for B and C at the right locations:

The actual fixes might involve jumping to the original code, or could be *significantly* more tricky and convoluted. What mattered was that the patch pages could be packed with code for fixes for different routines. Also, the skewed jump table format was a little more complex, so that we could often include the fixes inline in just a few instructions.

Anyway, the technique saved us a significant amount of RAM, and let us get ROMs out in time for the Big Day.]
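The bracketed aside above can be played with as a toy model. This is only a sketch under assumptions: one table entry per virtual page, 4K pages, 4-byte entries, and Python dicts standing in for physical pages and the MMU page table. The real format was more complex, as noted.

```python
PAGE_SIZE = 4096   # assumed page size
ENTRY_SIZE = 4     # assumed entry size

names = ["A", "B", "C", "D"]

def virtual_address(i):
    """Entry i lives in virtual page i, one slot further down than the last."""
    return i * PAGE_SIZE + i * ENTRY_SIZE

# One physical ROM page holds every entry, packed back to back.
rom_page = {i * ENTRY_SIZE: "jump " + n for i, n in enumerate(names)}

# Toy page table: every virtual page of the table maps to that same ROM page.
page_table = {i: rom_page for i in range(len(names))}

def fetch(vaddr):
    """What the CPU would find at a virtual address, via the page table."""
    page, offset = divmod(vaddr, PAGE_SIZE)
    return page_table[page][offset]

unpatched = [fetch(virtual_address(i)) for i in range(len(names))]

# Patch B and C with ONE RAM page: each fix sits at its entry's own offset,
# and the same RAM page is mapped over both virtual pages.
patch_page = {1 * ENTRY_SIZE: "fix B", 2 * ENTRY_SIZE: "fix C"}
page_table[1] = patch_page
page_table[2] = patch_page

patched = [fetch(virtual_address(i)) for i in range(len(names))]

# Everything else in the RAM page is free for the actual patch code.
free_bytes = PAGE_SIZE - len(patch_page) * ENTRY_SIZE
```

In this model two patches cost one RAM page instead of two, and A and D keep coming from ROM untouched.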

—-

Ultimately the Newton was a market failure. If you look at the competition at the time we did maybe seven things wrong that the other guys got right or chose not to try to address:

1. Price point. The Newton was about a thousand dollars. Ouch. The Palm Pilot was about $300.

2. Form factor. The Newton was large and weighed about a pound. The other guy fit in a shirt pocket.

3. Handwriting recognition wasn’t quite there (and the production Newton had a digitizer noise issue that we found out about only later that Fall, which dramatically reduced the recognition rate). The Pilot used a much more primitive letter-by-letter system, but people got it and it worked well after a little training. The 2.0 version of the Newton did much better here (but I had left Apple before this shipped).

4. No real apps . . . and you had to pay Apple a commission. For a totally new platform, charging your developers to develop software is an uphill battle. (Of course, all this changed with the iPhone, but the demand was very high). The commission idea was just another stupid and reality-disconnected idea from upper management; instead of attracting developers (who already had to pony up over a thousand bucks for a dev kit) I think it discouraged folks from even considering developing for the Newton.

5. The “MessagePad” wasn’t all that good at actually messaging anything. It was awkward to connect a Newton to anything (via a cable or PCMCIA card), and the modes of communication were either inconvenient or expensive (wireless pay-by-the-kilobyte). The Newton was about 15 years ahead of the technology curve here.

6. Releasing on a magic date. The Newton probably needed another three months or so of bake time before we shipped it. By November or so we’d patched nearly all of the software issues that critics (rightly) complained about, and I like to think that the reception would have been much kinder if we’d worked just a little longer. To this day I have a deep mistrust of magic ship dates and the motivation behind them.

7. Language? I actually think that NewtonScript was a great idea, but the tools were rocky and never really approached the quality that Microsoft soon showed off in Visual Basic. Apple never released a native dev kit for the Newton, but given the state of its native runtime (gnarly) that was probably a good idea. In any event, developing on the Pilot was a lot like programming a 68000-based Mac, and pretty familiar to many programmers, while NewtonScript was something from another world (even if it did have many cool aspects).

I don’t regret working on the Newton at all. The people I worked with were top-notch, I learned a hell of a lot about technology and business, and I’m still pretty proud of all the neat stuff we put in under the hood. Ultimately I don’t think that tweaking what we got wrong would have saved the product, but it might have left more of a mark than it did.

Some day I’ll talk about Wayfarer…

—-

I should add: I don’t think we ever found out why those initial units lost their patches. I think we figured they never had them to begin with (it *was* a new production line of a 1.0 product, after all).

8 Responses to Patching the Newton

I remember most of the Global Village sales team coming back from MacWorld Boston that year with Newtons in tow. I must admit at the time I was asking, “Wait, you paid how much for something that does what?”

I think one of the things that Palm got right was their reasonably smooth linkage with your desktop/laptop. Keeping those in sync, along with the small form factor, got me to stop carrying around a 3×5 card printed out from whatever the contact manager software I was using back then. Killed that software niche as well.

Yeah. I bent heaven and earth and did some of the longest hours in my career making the Newton’s storage system bulletproof from crashes and other failures, while Palm’s solution was to make it really easy for the Pilot to talk to a PC (whereupon you got backups of that data we were trying to protect AND you got a communications story that didn’t utterly suck).

That was a valuable lesson: Even coders at the coal face of your product need to have a product-wide view.

I had to read that through a few times, it does sound like a neat hack. So if I’ve understood it right, in the unpatched case, the jump table is spread out across say 10 pages of virtual address space, jmps A-F are accessed by addresses in page 1, G-P in page 2, etc, but all 10 pages are actually a single page of ROM mapped multiple times because A-F’s page offsets don’t overlap with those of G-P.

Then when you want to fixup function G, you copy the single page into RAM, map it over page 2 of the jump table, make a fixup that doesn’t overlap any of the jumps H-P, and since you don’t need to remap page 1 you could use the space used by jumps A-F in the fix for G. If you later needed to fix function A you could make a new fixup page and map it in page 1, or reshuffle the fixup page you already had.

Great stuff. I worked briefly on Newton Toolkit (itself, not with) at 1 Infinite Loop, and later, did a fair amount of Palm Pilot development myself… you’re right on about developing for Newton vs. the early Palm devices… not only did the first Palm device have very nearly the same machine constraints as the early Macs, it even used ResEdit for resource editing.

I also had to manually recreate a feature of the Apple Mac toolchain for the Palm Pilot once my app became more than the 64K that the linker could generate offsets for: branch islands.