Spacecraft Driven Development: The Story of How We Took Computers to The Moon

At the dawn of the space program, computers were enormous, bulky, delicate things, with extremely limited processing capability. Today, we have unmanned spacecraft exploring the outer reaches of the solar system.

This is the story of how we got here from there.

Spacecraft driven development: a series of incremental improvements and learnings

It’s easy to forget that even the most sophisticated technology is rooted in a series of incremental developments – learning from failure – and the story of NASA’s earliest onboard guidance computers is no different. Along the way, they more or less invented the discipline of software engineering, along with many of the best practices we know and love today.

If you’ve come here for stories of sci-fi-like last-minute repairs and incredible feats of genius, you may be disappointed. The people in these stories were incredibly smart, yes, but they were also figuring things out as they went along – it’s a hazard of doing things no one’s ever done before.

Project Mercury & the Human Computer

The year was 1958. The space race had just kicked off with the launch of Sputnik, and the newly formed National Aeronautics and Space Administration was under a lot of pressure.

Cue Project Mercury, the first U.S. human spaceflight program. The goal was simply to put an astronaut into orbit, then return him (all American astronauts were male until Sally Ride’s first flight in 1983) safely to the ground.

Project Mercury spacecraft were essentially well-sealed tin cans filled with air, plus some instrumentation and small positional thrusters. Ascent was controlled by the angle of the rocket at launch, and descent was handled by a parachute and water landing. There was no guidance computer, and the only way to communicate with the ground was by radio.

The astronaut still needed to perform small adjustments in mid-flight to control his orbit and ensure he re-entered the atmosphere at the correct angle. And so they needed some way to relay data to the ground, use NASA’s computers to make those incredibly complex flight calculations, and relay the results back to space.

The solution they came up with was a logistical feat that boggles the mind. It was a global relay race:

The astronaut would take readings from his instruments and radio the results to the ground.

That radio call would be received by the Worldwide Tracking Network, a chain of 18 stations around the equator in various nations and continents. Each station was in range of the spacecraft for approximately 7 minutes as it passed overhead.

The Worldwide Tracking Network relayed the data from the spacecraft to NASA staffers at the control centers in Maryland and Florida, who ran the calculations on their (firmly non-portable) computers.

NASA then phoned the results of their calculations to the WTN (after calculating which station was currently in range of the spacecraft).

The WTN station radioed the results back up to the spacecraft.

The astronaut would make the appropriate adjustments, take new readings, and start the cycle all over again.

Each flight needed a total of about 18,000 support personnel on the ground.

Project Gemini & the First In-Flight Computer

Fast forward to 1961 and Project Gemini. NASA, having successfully orbited people around the Earth, had a new, more ambitious goal – a manned spacecraft that could rendezvous in orbit with an upper-stage rocket, which would be sent up separately. The other rocket would deliver additional fuel after the manned spacecraft reached orbit, allowing it to perform much more complex flight maneuvers than those possible with Project Mercury spacecraft.

To accomplish that goal, they needed greater precision than the Worldwide Tracking Network/NASA relay could provide. So they imagined something new: a computer installed onboard the spacecraft itself, capable of taking input from the ship’s instruments and performing the calculations needed to get a crew safely home.

The resulting Gemini Guidance Computer, built by IBM, weighed a whopping 60 pounds and took 420 milliseconds to multiply two numbers together. This was, of course, state of the art at the time.

To IBM’s surprise, however, building the hardware turned out to be the easy part. Software engineering was a very young discipline at this point – and indeed wasn’t even called software engineering! IBM’s engineers assumed that they would be able to quickly crank out the code needed to guide the spacecraft through its trip.

Once they started coding, though, their software promptly ran up against the limits of the hardware.

Even though they were using a variant of assembly language that was extremely low-level and memory-efficient, the tiny memory was nowhere near large enough to hold all the programs needed for take-off, in-flight maneuvers, re-entry, and landing. The solution was an external tape storage system that could hold programs not currently in use and load them back into the working memory as needed.

Unfortunately, the tape system was prone to errors and misreads during the vibrations of takeoff and landing. They stabilized the tape with weights to minimize the effect of the vibrations, and implemented a three-way voting system where each bit of a program would be read simultaneously from three copies. If a particular bit on one tape was corrupted or misread, the computer had two other good copies to overrule the bad one.

The downside of this safety measure was time: it took nearly 6 minutes to load a new program from tape. But it reduced errors to acceptable levels.
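That voting step is the same idea we'd now call triple modular redundancy. A minimal sketch in Python – the function names and the bit-list representation are my own illustration, not how the Gemini tape hardware actually worked:

```python
def vote(bit_a: int, bit_b: int, bit_c: int) -> int:
    """Majority vote across three copies of the same bit:
    any single corrupted copy is overruled by the other two."""
    return 1 if (bit_a + bit_b + bit_c) >= 2 else 0

def read_word(tape_a, tape_b, tape_c):
    """Read one word by voting on each bit position across three tapes."""
    return [vote(a, b, c) for a, b, c in zip(tape_a, tape_b, tape_c)]

# One flipped bit on the second tape is corrected by the other two copies.
good      = [1, 0, 1, 1, 0, 0, 1, 0]
corrupted = [1, 0, 0, 1, 0, 0, 1, 0]  # bit 2 misread
assert read_word(good, corrupted, good) == good
```

The cost of the scheme is obvious: three reads (and three copies of every program on tape) for every one bit of output – reliability bought with time and storage, which is exactly the trade-off described above.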

Another new challenge was how to organize all this code.

The engineers originally envisioned each flight as one big procedural program, like a long essay. This plan lasted until they realized that the giant program would need to be completely rewritten and retested for each subsequent flight.

They eventually settled on a module system organized by the phase of the flight, since some parts of the code changed more than others between flights.
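In modern terms, that's splitting a monolith along its axis of change. A hypothetical sketch of the idea in Python – the phase names and version labels are invented for illustration, not the actual Gemini module layout:

```python
# Programs grouped by flight phase, so a new mission only swaps out
# the phases whose requirements changed since the last flight.
mission_modules = {
    "ascent":     "ascent_v1",      # rarely changed between flights
    "orbit":      "orbit_v1",
    "rendezvous": "rendezvous_v3",  # changed often as mission goals evolved
    "reentry":    "reentry_v1",
}

def load_program(phase: str) -> str:
    """Load only the module for the current flight phase
    (analogous to pulling one program off the tape unit)."""
    return mission_modules[phase]

assert load_program("rendezvous") == "rendezvous_v3"
```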

The last major obstacle IBM’s engineers faced was how to test their flight code. Clearly, the programs needed to be error-free; just as clearly, they couldn’t actually be tested in flight. The result was a 3-level test system:

First, each hardcoded equation was verified to ensure that it produced the correct output.

Next up were “man-in-the-loop” simulations, something like what we might call unit tests – given a certain input, does a piece of a program produce the correct output?

Lastly, they would run a series of manual integration tests using the real hardware and crew interface.
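That first level – checking each hardcoded equation against precomputed reference values – is essentially what we'd write today as a table-driven test. A sketch using a stand-in equation (free-fall time, not an actual Gemini flight equation):

```python
import math

def time_of_free_fall(height_m: float, g: float = 9.81) -> float:
    """Stand-in 'hardcoded equation': t = sqrt(2h / g)."""
    return math.sqrt(2.0 * height_m / g)

# Verify the equation against precomputed reference values,
# analogous to how each flight equation was checked individually.
reference_cases = [(0.0, 0.0), (4.905, 1.0), (19.62, 2.0)]
for height, expected_t in reference_cases:
    assert abs(time_of_free_fall(height) - expected_t) < 1e-9
```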

The Apollo Project: Taking Computers to the Moon

The next opportunity to build space computers came a few years later with the launch of the Apollo program. MIT rose to the task and produced the Apollo Guidance Computer.

The MIT team unfortunately fell victim to some of the exact same problems that plagued the IBM team for the Gemini missions – namely, the hardware team vastly underestimated the amount of memory required to hold the ever-increasing complexity of the guidance software. The hardware memory had to be increased 18 times during the course of the project, and topped out at what, in modern units, would be about 72 KB.

In an effort to make their software as memory-efficient as possible, the software team even wrote their own language: they created a variant of assembly language (now known as AGC assembly), which used less memory at the price of sacrificing some speed. To get a feel for what the code looked like, check out the code for the Apollo 11 mission on GitHub.

The hardware team did learn from some of the mistakes of the Gemini team, though. Given the issues the Gemini team had with vibrations and power fluctuations during takeoff and landing, they decided to use an extremely reliable type of memory called core rope memory.

Since core rope memory is read-only, it wasn’t vulnerable to data corruption from the rigors of spaceflight. However, the fact that it couldn’t be changed or updated without manufacturing an entirely new piece of memory created a whole new set of challenges for the programmers.

The software workflow went something like this:

An engineer wrote or edited a section of the guidance software in AGC assembly language.

The AGC assembly code was assembled into binary and printed to paper.

This printed binary went off to a manufacturing facility, where workers hand-wove it into core rope memory.

The rope memory came back from manufacturing and underwent testing for software bugs or manufacturing flaws.

When problems were found, the cycle started over again from the beginning.

The length of the feedback cycle, plus the fact that the software team couldn’t easily develop or test changes with the actual flight hardware, meant that the process needed to be carefully managed.

According to some sources, Margaret Hamilton coined the term software engineer while working on the AGC software – it’s easy to see why this complex, error-prone process might have inspired the term.

To Infinity…

So what’s the point of all this, other than to share cool facts about space and old computers? Well, if you squint you can see the parallels with common problems that plague software projects today:

How do we build something that’s never been built before? What do we do when our software maxes out the resources allotted to it? How do we test software that guides a space flight without launching a rocket every time we change something?

The main takeaway, though, is that it’s easy to forget that big advances happen in small steps. It’s easy for us application developers to joke that our job isn’t rocket science, but one of the most interesting things I learned while researching this piece is just how many mistakes happened on the way to the moon.