Why is it that when we solve a proportion, we multiply two elements, and then divide by a third?

For example, with a proportion like:

$\frac{2}{3} = \frac{x}{180}$

the method I was taught was to cross-multiply the two numbers that are diagonally across from each other (2 and 180), and divide by the number opposite x (3). We solve for x with $x = \frac{2 \cdot 180}{3}$, but why?

These things weren’t explained. We were just told, “When you have this situation, do X.” It works. It produces what we’re after (which school “math” classes see as the point), but once I got into college, I talked with a fellow student who had a math minor, and he told me while he was taking Numerical Analysis that they explained this sort of stuff with proofs. I thought, “Gosh, you can prove this stuff?” Yeah, you can.

I’ve picked up a book that I’d started reading several years ago, “Mathematics in 10 Lessons,” by Jerry King, and he answers Question 1 directly, and Question 2 indirectly. I figured I would give proofs for both here, since I haven’t found mathematical explanations for this stuff in web searches.

I normally don’t like explaining stuff like this, because I feel like I’m spoiling the experience of discovery, but I found as I tried to answer these questions myself that I needed a lot of help (from King). My math-fu is pretty weak, and I imagine it is for many others who had a similar educational experience.

I’m going to answer Question 2 first.

The first thing he lays out in the section on rational numbers is the following definition:

$\frac{a}{b} = \frac{c}{d} \Leftrightarrow ad = bc$

I guess I should explain the double-arrow symbol I’m using (and the right-arrow symbol I’ll use below). It means “implies,” but in this case, with the double-arrow, both expressions imply each other. It’s saying “if X is true, then Y is also true. And if Y is true, then X is also true.” (The right-arrow I use below just means “If X is true, then Y is also true.”)

In this case, if you have two equal fractions, then the product equality in the second expression holds. And if the product equality holds, then the equality for the terms in fraction form holds as well.

When I first saw this, I thought, “Wait a minute. Doesn’t this need a proof?”

Well, it turns out, it’s easy enough to prove it.

The first thing we need to understand is that you can do anything to one side of an equation so long as you do the same thing to the other side.

We can take the equal fractions, $\frac{a}{b} = \frac{c}{d}$, and multiply both sides by the product of their denominators:

$\frac{a}{b} \cdot bd = \frac{c}{d} \cdot bd$

by cancelling like terms, we get:

$ad = bc$

This explains Question 2, because if we take the proportion I started out with, and translate it into this equality between products, we get:

$2 \cdot 180 = 3x$

To solve for x, we get:

$x = \frac{2 \cdot 180}{3}$

which is what we’re taught, but now you know the why of it. It turns out that you don’t actually work with the quantities in the proportion as fractions. The fractional form is just used to relate the quantities to each other, metaphorically. The way you solve for x uses the form of the product equality relationship.
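As a quick check of the product form, here is a minimal Python sketch. The function name `solve_proportion` and its parameter names are my own, chosen for illustration; it assumes a proportion written as a/b = x/d.

```python
from fractions import Fraction

def solve_proportion(a, b, d):
    """Solve a/b = x/d for x by way of the product form a*d = b*x."""
    if b == 0 or d == 0:
        raise ValueError("denominators must be nonzero")
    # a/b = x/d  <=>  a*d = b*x, so x = (a*d)/b
    return Fraction(a * d, b)

x = solve_proportion(2, 3, 180)
print(x)                                   # 120
print(Fraction(2, 3) == Fraction(x, 180))  # True
```

Using exact fractions rather than floating point keeps the check honest: the two sides of the proportion compare equal exactly, not approximately.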

To answer Question 1, we have to establish a couple other things.

The first is the concept of the multiplicative inverse.

For every x (with x ≠ 0), there’s a unique v such that xv = 1, which means that $v = \frac{1}{x}$.

From that, we can say:

From this, we can say that the inverse of x is unique to x.

King goes forward with another proof, which will lead us to answering Question 1:

Theorem 1:

Proof:

by cancelling like terms, we get:

(It’s also true that , but I won’t get into that here.)

Now onto Theorem 2:

By Theorem 1, we can say:

Then,

By cancelling like terms, we get:

Therefore,

And there you have it. This is why we invert and multiply when dividing fractions.
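For readers who want the whole argument in one place, here is a sketch of one standard derivation. It is not necessarily the route King takes. It assumes only that b, c, and d are nonzero, and that dividing by a number means multiplying by its multiplicative inverse.

```latex
\begin{align*}
\frac{c}{d} \cdot \frac{d}{c} &= \frac{cd}{dc} = 1
  && \text{so } \frac{d}{c} \text{ is the multiplicative inverse of } \frac{c}{d} \\
\frac{a}{b} \div \frac{c}{d} &= \frac{a}{b} \cdot \frac{1}{\frac{c}{d}}
  && \text{division is multiplication by the inverse} \\
&= \frac{a}{b} \cdot \frac{d}{c} = \frac{ad}{bc}
  && \text{because the inverse is unique}
\end{align*}
```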

Edit 1/11/2018: King says a bit later in the book that, given the definition above (if there’s an equality between fractions, there’s also an equality between products of their terms) and Theorem 1, it is mathematically correct to say that division is just a restatement of multiplication. Interesting! This does not mean that you get equal results between division and multiplication: $\frac{a}{b} \neq a \cdot b$, except when b equals 1 or -1. It means that there’s a relationship between products and rational numbers.

Some may ask, since the mathematical logic for these truths is fairly simple, from an algebraic perspective, why don’t math classes teach this? Well, it’s because they’re not really teaching math…

Note for commenters:

WordPress supports LaTeX. That’s how I’ve been able to publish these mathematical expressions. I’ve tested it out, and LaTeX formatting works in the comments as well. You can read up on how to format LaTeX expressions at LaTeX — Support — WordPress. You can read up on what LaTeX formatting commands to use at Mathematical expressions — ShareLaTeX under “Further Reading”.

HTML codes also work in the comments. If you want to use HTML for math expressions, just a note, you will need to use specific codes for ‘<‘ and ‘>’. I’ve seen cases in the past where people have tried using them “naked” in comments, and WordPress interprets them as HTML tags, not how they were intended. You can read up on math HTML character codes here and here. You can read up on formatting fractions in HTML here.

One of my favorite shows when I was growing up, which I tried not to miss every week, was “The Computer Chronicles.” It was shown on PBS from 1983 to 2002. The last couple years of the show fell flat with me. It seemed to lack direction, which was very unlike what it was in the prior 17 years. Still, I enjoyed it while it lasted.

Randy Kindig did an interview with Stewart Cheifet, the show’s host. They talked about the history of how the show got started, and how it ended. Cheifet told a bunch of notable stories about what certain guests were like, and some funny stories. Some screw-ups happened on the show, and some happened before they taped a product demonstration. It was great hearing from Cheifet again.

This is the clearest explanation I’ve seen Alan Kay give for what the ARPA community’s conception of computer science was. His answer on Quora was particularly directed at someone who was considering entering a CS major. Rather than summarize what he said, I’m just going to let him speak for himself. I’d rather preserve this if anything happens to Quora down the road. There is a lot of gold here.

If you are just looking to get a job in computing, don’t bother to read further.

First, there are several things to get clear with regard to any field.

What is the best conception of what the field is about?

What is the best above threshold knowledge to date?

How incomplete is the field; how much will it need to change?

When I’ve personally asked most people for a definition of “Computer Science” I’ve gotten back an engineering definition, not one of a science. Part of what is wrong with “CS” these days is both a confusion about what it is, and that the current notion is a weak idea.

The good news is that there is some above threshold knowledge. The sobering news is that it is hard to find in any undergrad curriculum. So it must be ferreted out these days.

Finally, most of the field is yet to be invented — or even discovered. So the strategies for becoming a Computer Scientist have to include learning how to invent, learning how to change, learning how to criticize, learning how to convince.

Most people in the liberal arts would not confuse learning a language like English and learning to read and write well in it, with the main contents of the liberal arts — which, in a nutshell, are ideas. The liberal arts spans many fields, including mathematics, science, philosophy, history, literature, etc. and we want to be fluent about reading and writing and understanding these ideas.

So programming is a good thing to learn, but it is most certainly not at the center of the field of computer science! When the first ever Turing Award winner says something, we should pay attention, and Al Perlis — who was one of if not the definer of the term said: “Computer Science is the Science of Processes”, and he meant all processes, not just those that run on digital computers. Most of the processes in the world to study are more complex than most of the ones we’ve been able to build on computers, so just looking at computers is looking at the least stuff you can look at.

Another way to figure out what you should be doing, is to realize that CS is also a “blank canvas” to “something” kind of field — it produces artifacts that can be studied scientifically, just as the building of bridges has led to “bridge science”. Gravitational and weather forces keep bridge designers honest, but analogous forces are extremely weak in computing, and this allows people who don’t know much to get away with murder (rather like the fashion field, where dumb designs can even become fads, and failures are not fatal). Getting into a “learned science of designs that happen to be dumb” is not the best path!

We (my research community) found that having an undergraduate degree in something really difficult and developed helped a lot (a) as a bullshit detector for BS in computing (of which there is a lot), (b) as a guide to what a real “Computer Science” field should be and could be like, and (c) to provide a lot of skills on the one hand and heuristic lore on the other for how to make real progress. Having a parallel interest in the arts, especially theater, provides considerable additional perspective on what UI design is really about, and also in the large, what computing should be about.

So I always advise young people -not- to major in computing as an undergraduate (there’s not enough “there there”) but instead to actually learn as much about the world and how humans work as possible. In grad school you are supposed to advance the state of the art (or at least this used to be the case), so you are in a better position with regard to an unformed field.

Meanwhile, since CS is about systems, you need to start learning about systems, and not to restrict yourself just those on computers. Take a look at biology, cities, economics, etc just to get started.

Finally, at some age you need to take responsibility for your own education. This should happen in high school or earlier (but is rare). However, you should not try to get through college via some degree factory’s notion of “certification” without having formed a personal basis for criticizing and choosing. Find out what real education is, and start working your way through it.

When I’ve looked over other material written by Kay, and what has been written about his work, I’ve seen him talk about a “literate” computing science. This ties into his long-held view that computing is a new medium, and that the purpose of a medium is to allow discussion of ideas. I’ve had a bit of a notion of what he’s meant by this for a while, but the above made it very clear in my mind that a powerful purpose of the medium is to discuss systems, and the nature of them, through the fashioning of computing models that incorporate developed notions of one or more specific kinds of systems. The same applies to the fashioning of simulations. In fact, simulations are used for the design of said computer systems, and their operating systems and programming languages, before they’re manufactured. That’s been the case for decades. This regime can include the study of computer systems, but it must not be restricted to them. This is where CS currently falls flat. What this requires, though, is some content about systems, and Kay points in some directions for where to look for it.

Computing is a means for us to produce system models, and to represent them, and thereby to develop notions of what systems are, and how they work, or don’t work. This includes regarding human beings as systems. When considering how people will interact with computers, the same perspective can apply, looking at it as how two systems will interact. I’m sure this sounds dehumanizing, but from what I’ve read of how the researchers at Xerox PARC approached developing Smalltalk and the graphical user interface as a learning environment, this is how they looked at it, though they talked about getting two “participants” (the human and the computer) to interact/communicate well with each other (See discussion of “Figure 1” in “Design Principles Behind Smalltalk”). It involved scientific knowledge of people and of computer system design. It seems this is how they achieved what they did.

Prologue

SICP reaches a point, in Chapter 3, where for significant parts of it you’re not doing any coding. It has exercises, but they’re all about thinking about the concepts, not doing anything with a computer. It has you do substitutions to see what expressions result. It has you make diagrams that focus in on particular systemic aspects of processes. It also gets into operational models, talking about simulating logic gates, how concurrent processing can work (expressed in hypothetical Scheme code). It’s all conceptual. Some of it was good, I thought. The practice of doing substitutions manually helps you really get what your Scheme functions are doing, rather than guessing. The rest didn’t feel that engaging. One could be forgiven for thinking that the book is getting dry at this point.

It gets into some coding again in Section 3.3, where it covers building data structures. It gets more interesting with an architecture called “streams” in Section 3.5.

One thing I will note is that the only way I was able to get the code in Section 3.5 to work in Racket was to go into “Lazy Scheme.” I don’t remember what language setting I used for the prior chapters, maybe R5RS, or “Pretty Big.” Lazy Scheme does lazy evaluation on Scheme code. One can be tempted to think that this makes using the supporting structures for streams covered in this section pointless, because the underlying language is doing the delayed evaluation that this section implements in code. Anyway, Lazy Scheme doesn’t interfere with anything in this section. It all works. It just makes the underlying structure for streams redundant. For the sake of familiarity with the code this section discusses (which I think helps in preserving one’s sanity), I think it’s best to play along and use its code.

Another thing I’ll note is this section makes extensive use of knowledge derived from calculus. Some other parts of this book do that, too, but it’s emphasized here. It helps to have that background.

I reached a stopping point in SICP, here, 5 years ago, because of a few things. One was I became inspired to pursue a history project on the research and development that led to the computer technology we use today. Another is I’d had a conversation with Alan Kay that inspired me to look more deeply at the STEPS project at Viewpoints Research, and try to make something of my own out of that. The third was Exercise 3.61 in SICP. It was a problem that really stumped me. So I gave up, and looked up an answer for it on the internet, in a vain attempt to help me understand it. The answer didn’t help me. It worked when I tried the code, but I found it too confusing to understand why it produced correct results. The experience was really disappointing, and disheartening. Looking it up was a mistake. I wished that I could forget I’d seen the answer, so I could go back to working on trying to figure it out, but I couldn’t. I’d seen it. I worked on a few more exercises after that, but then I dropped SICP. I continued working on my history research, and I got into exploring some fundamental computing concepts on processors, language parsing, and the value of understanding a processor as a computing/programming model.

I tried an idea that Kay told me about years earlier, of taking something big, studying it, and trying to improve on it. That took me in some interesting directions, but I hit a wall in my skill, which I’m now trying to get around. I figured I’d continue where I left off in SICP. One way this diversion helped me is I basically forgot the answers for the stuff I did. So, I was able to come back to the problem, using a clean slate, almost fresh. I had some faded memories of it, which didn’t help. I just had to say to myself “forget it,” and focus on the math, and the streams architecture. That’s what finally helped me solve 3.60, which then made 3.61 straightforward. That was amazing.

A note about Exercise 3.59

It was a bit difficult to know at first why the streams (cosine-series and sine-series) were coming out the way they were for this exercise. The cosine-series is what’s called an even series, because its powers are even (0, 2, 4, etc.). The sine-series is what’s called an odd series, because its powers are odd (1, 3, 5, etc.). However, when you’re processing the streams, they just compute a general model of series, with all of the terms, regardless of whether the series you’re processing has values in each of the terms or not. So, cosine-series comes out (starting at position 0) as: [1, 0, -1/2, 0, 1/24, …], since the stream is computing $a_0 x^0 + a_1 x^1 + a_2 x^2 + \dots$, where $a_i$ is each term’s coefficient, and some of the coefficients are negative, in this case. The coefficients of the terms that don’t apply to the series come out as 0. With sine-series, it comes out (starting at position 0) as: [0, 1, 0, -1/6, 0, 1/120, …].
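To see where those zeros come from, here is a small Python sketch that lists the coefficients straight from the Taylor formulas for cosine and sine. It is not the streams solution to the exercise, just a way to check the expected values; the generator names are my own.

```python
from fractions import Fraction
from itertools import count, islice
from math import factorial

def cosine_coefficients():
    """Yield a0, a1, a2, ... for cos(x) written as a sum of a_n * x^n."""
    for n in count(0):
        if n % 2 == 0:
            yield Fraction((-1) ** (n // 2), factorial(n))
        else:
            yield Fraction(0)  # odd powers do not appear in cos(x)

def sine_coefficients():
    """Yield a0, a1, a2, ... for sin(x) written as a sum of a_n * x^n."""
    for n in count(0):
        if n % 2 == 1:
            yield Fraction((-1) ** ((n - 1) // 2), factorial(n))
        else:
            yield Fraction(0)  # even powers do not appear in sin(x)

print([str(c) for c in islice(cosine_coefficients(), 6)])
# ['1', '0', '-1/2', '0', '1/24', '0']
print([str(c) for c in islice(sine_coefficients(), 6)])
# ['0', '1', '0', '-1/6', '0', '1/120']
```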

What’s really interesting is that exercises 3.59, 3.61, and 3.62 are pretty straightforward. Like with some of the prior exercises, all you have to do is translate the math into stream terms (and it’s a pretty direct translation from one to the other), and it works! You’re programming in mathland! I discovered this in the earlier exercises in this section, and it amazed me how expressive this is. I could pretty much write code as if I was writing out the math. I could think in terms of the math, not so much the logistics of implementing computational logic to make the math work. At the same time, this felt disconcerting to me, because when things went wrong, I wanted to know why, computationally, and I found that working with streams, it was difficult to conceptualize what was going on. I felt as though I was just supposed to trust that the math I expressed worked, just as it should. I realize now that’s what I should have done. It was starting to feel like I was playing with magic. I didn’t like that. It was difficult for me to trust that the math was actually working, and if it wasn’t, that it was because I was either not understanding the math, and/or not understanding the streams architecture, not because there was something malfunctioning underneath it all. I really wrestled with that, and I think it was for the good, because now that I can see how elegant it all is, it looks so beautiful!

Exercise 3.60

This exercise gave me a lot of headaches. When I first worked on it 5 years ago, I kind of got correct results with what I wrote, but not quite. I ended up looking up someone else’s solution to it, which worked. It was kind of close to what I had. I finally figured out what I did wrong when I worked on it again recently: I needed to research how to multiply power series, but in a computationally efficient manner. It turns out you need to use a method called the Cauchy product, but there’s a twist, because of the way the streams architecture works. A lot of math sources I looked up use the Cauchy product method, anyway (including a source that covers power series multiplication that I cite below), but the reason it needs to be used is it automatically collects all like terms as each term of the product is produced. The problem you need to work around is that you can’t go backwards through a stream, except by using stream-ref and indexes, and I’ve gotten the sense by going through these exercises that when it comes to doing math problems, you’re generally not supposed to be doing that, though there’s an example later where they talk about a method Euler devised for accelerating computation of power series where they use stream-ref to go backwards and forwards inside a stream. I think that’s an exception.

One hint I’ll leave here is that there is a way to do the Cauchy product without using mutable state in variables, nor is it necessary to do anything particularly elaborate. You can do it just using the stream architecture. Thinking about the operations involved, you don’t have to do the computations strictly left to right, because all of the operations are commutative.

Here are a couple sources that I found helpful when trying to work this out:

Formal Power Series, from Wikipedia. Pay particular attention to the section on “Operations on Formal Power Series.” It gives succinct descriptions on multiplying power series, inverting them (though I’d only pay attention to the mathematical expression of this in Exercise 3.61), and dividing them (which you’ll need for Exercise 3.62).

I found this video on the Cauchy product very helpful in coming up with my solution for this exercise. Notice the pattern by which the products are computed. This pattern should influence your thinking in figuring out how to carry out the Cauchy product using streams.

It is crucial that you get the solution for this exercise right, because the next two exercises (3.61 and 3.62) will give you no end of headaches if you don’t. They build on each other, and this exercise.

The exercise says you can try out mul-series using sin(x) and cos(x), squaring both, and adding them together, and you should get 1. How is “1” represented in streams? Well, it makes sense that it will be in the constant-term position (the zeroth position) in the stream. That’s where a scalar value would be in a power series.

There is a related concept to working with power series called a unit series. As Exercise 3.61 uses the term, it’s a power series whose constant term is 1, which is why it has you write a function called “invert-unit-series”. Keep in mind that the streams in these exercises hold just the coefficients of a power series, not the powers of x.

The unit series equivalent for the scalar value 1 is [1, 0, 0, 0, 0, …].
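If you want to check the expected answer independently of your stream code, here is a Python sketch that applies the textbook Cauchy product formula to finite lists of coefficients. It indexes backwards into lists, so it is deliberately not the streams approach the exercise is after; the function names are my own.

```python
from fractions import Fraction
from math import factorial

def cauchy_product(a, b):
    """Leading coefficients of the product of two power series.

    a and b are coefficient lists; term n of the product is
    c_n = a_0*b_n + a_1*b_(n-1) + ... + a_n*b_0.
    """
    n_terms = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_terms)]

def cos_coeffs(n_terms):
    return [Fraction((-1) ** (n // 2), factorial(n)) if n % 2 == 0 else Fraction(0)
            for n in range(n_terms)]

def sin_coeffs(n_terms):
    return [Fraction((-1) ** ((n - 1) // 2), factorial(n)) if n % 2 == 1 else Fraction(0)
            for n in range(n_terms)]

N = 8
cos_c, sin_c = cos_coeffs(N), sin_coeffs(N)
total = [s + c for s, c in zip(cauchy_product(sin_c, sin_c),
                               cauchy_product(cos_c, cos_c))]
print(total == [1, 0, 0, 0, 0, 0, 0, 0])  # True
```

The first N coefficients of a product depend only on the first N coefficients of each factor, which is why a finite prefix is enough for this check.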

A note about Exercise 3.62

The exercise talks about using your div-series function to compute tan-series. Here was a good source for finding out how to do that:

It’s been a while since I’ve worked on this. I used up a lot of my passion for working on this series in the first three parts. Working on this part was like eating my spinach, not something I was enthused about, but it was a good piece of history to learn.

Nevertheless, I am dedicating this to Bob Taylor, who passed away on April 13. He got the ball rolling on building the Arpanet, when a lot of his colleagues in the Information Processing Techniques Office at ARPA didn’t want to do it. This was the predecessor to the internet. When he started up the Xerox Palo Alto Research Center (PARC), he brought in people who ended up being important contributors to the development of the internet. His major project at PARC, though, was the development of the networked personal computer in the 1970s, which I covered in Part 3 of this series. He left Xerox and founded the Systems Research Center at Digital Equipment Corp. in the mid-80s. In the mid-90s, while at DEC, he developed the internet search engine AltaVista.

He and J.C.R. Licklider were both ARPA/IPTO directors, and both were psychologists by training. Among all the people I’ve covered in this series, both of them had the greatest impact on the digital world we know today, in terms of providing vision and support to the engineers who did the hard work of turning that vision into testable ideas that could then inspire entrepreneurs who brought us what we are now using.

Part 2 covers the history of how packet switching was devised, and how the Arpanet got going. You may wish to look that over before reading this article, since I assume that background knowledge.

My primary source for this story, as it’s been for this series, is “The Dream Machine,” by M. Mitchell Waldrop, which I’ll refer to occasionally as “TDM.”

The Internet

Ethernet/PUP: The first internetworking protocol

Bob Metcalfe, from University of Texas at Austin

In 1972, Bob Metcalfe was a full-time staffer on Project MAC, building out the Arpanet, using his work as the basis for his Ph.D. in applied mathematics from Harvard. While he was setting up the Arpanet node at Xerox PARC, he stumbled upon the ALOHAnet System, an ARPA-funded project that was led by Norman Abramson at the University of Hawaii. It was a radio-based packet-switching network. The Hawaiian islands created a natural environment for this solution to arise, since telephone connections across the islands were unreliable and expensive.

The Arpanet’s protocol waited for a gap in traffic before sending packets. On ALOHAnet, this would have been impractical. Since radio waves were prone to interference, a sending terminal might not even hear a gap. So, terminals sent packets immediately to a transceiver switch. A network switch is a computer that receives packets and forwards them to their destination. In this case, it retransmitted the sender’s signal to other nodes on the network. If the sending terminal received acknowledgement that its packets were received, from the switch, it knew they got to their destination successfully. If not, it assumed a collision occurred with another sending terminal, garbling the packet signal. So, it waited a random period of time before resending the lost packets, with the idea of trying to send them when other terminals were not sending their packets.

It had a weakness, though. The scheme Abramson used only operated at 17% of capacity. The random wait scheme could become overloaded with collisions if it got above that. If use kept increasing to that limit, the system would eventually grind to a halt. Metcalfe thought this could be improved upon, and ended up collaborating with Abramson. Metcalfe incorporated his improvements into his Ph.D. thesis. He came to work at PARC in 1972 to use what he’d learned working on ALOHAnet to develop a networking scheme for the Xerox Alto. Instead of using radio signals, he used coaxial cable, and found he could transmit data at a much higher speed that way. He called what he developed Ethernet.
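To get a feel for why a send-immediately scheme tops out at such a low fraction of capacity, here is a Python sketch of the standard textbook model for an unslotted (“pure”) ALOHA channel. This is an assumption on my part: I don’t know which model the 17% figure above came from. In the textbook model, with G transmission attempts per packet time, the fraction of channel time carrying successful packets is G·e^(-2G).

```python
import math

def pure_aloha_throughput(offered_load):
    """Textbook pure-ALOHA model: S = G * exp(-2G).

    offered_load (G) is the average number of transmission attempts,
    new packets plus resends, per packet time.
    """
    return offered_load * math.exp(-2.0 * offered_load)

# Scan offered loads from 0.01 to 3.00 and find where throughput peaks.
loads = [i / 100 for i in range(1, 301)]
best = max(loads, key=pure_aloha_throughput)
print(best, round(pure_aloha_throughput(best), 4))  # 0.5 0.1839

# Past the peak, more attempts mean less gets through.
print(round(pure_aloha_throughput(3.0), 4))  # 0.0074
```

The model’s ceiling of about 18% is in the same range as the 17% figure, and the second number shows the “grind to a halt” effect: as resends pile up, nearly every packet collides.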

The idea behind it was to create local area networks, or LANs, using coaxial cable, which would allow the same kind of networking one could get on the Arpanet, but in an office environment, rather than between machines spread across the country. Rather than having every machine on the main network, there would be subnetworks that would connect to a main network at junction points. Along with that, Metcalfe and David Boggs developed the PUP (PARC Universal Packet) protocol. This allowed Ethernet to connect to other packet switching networks that were otherwise incompatible with each other. The developers of TCP/IP, the protocol used on the internet, would arrive at these same goals a little later, and begin their own research on how to accomplish them.

In 1973, Bob Taylor set a goal in the Computer Science Lab at Xerox PARC to make the Alto computer networked. He said he wanted not only personal computing, but distributed personal computing, at the outset.

The creation of the Internet

Bob Kahn, from the ACM

Another man who enters the picture at this point is Bob Kahn. He joined Larry Roberts at DARPA in 1972, originally to work on a large project exploring computerized manufacturing, but that project was cancelled by Congress the moment he got there. Kahn had been working on the Arpanet at BBN, and he wanted to get away from it. Well…Roberts informed him that the Arpanet was all they were going to be working on in the IPTO for the next several years, but he persuaded Kahn to stay on, and work on new advancements to the Arpanet, such as mobile satellite networking, and packet radio networking, promising him that he wouldn’t have to work on maintaining and expanding the existing system. DARPA had signed a contract to expand the Arpanet into Europe, and networking by satellite was the route that they hoped to take. Expanding the packet radio idea of ALOHAnet was another avenue in which they were interested. The idea was to look into the possibility of allowing mobile military forces to communicate in the field using these wireless technologies.

Kahn got satellite networking going using the Intelsat IV satellite. He planned out new ideas for network security and digital voice transmission over Arpanet, and he planned out mobile terminals that the military could use, using packet radio. He had the thought, though, how are all of these different modes of communication going to communicate with each other?

As you’ll see in this story, Bob Kahn with Vint Cerf, and Bob Metcalfe, in a sense, ended up duplicating each other’s efforts; Kahn working at DARPA, Cerf at Stanford, and Metcalfe at Xerox in Palo Alto. They came upon the same basic concepts for how to create internetworking, independently. Metcalfe and Boggs just came upon them a bit earlier than everyone else.

Kahn thought it would seem easy enough to make the different networks he had in mind communicate seamlessly. Just make minor modifications to the Arpanet protocol for each system. He also wondered about future expansion of the network with new modes of communication that weren’t on the front burner yet, but probably would be one day. Making one-off modifications to the Arpanet protocol for each new communications method would eventually make network maintenance a mess. It would make the network more and more unwieldy as it grew, which would limit its size. He understood from the outset that this method of network augmentation was a bad management and engineering approach.

Instead of integrating the networks together, he thought a better method was to design the network’s expansion modularly, where each network would be designed and managed separately, with its own hardware, software, and network protocol. Gateways would be created via hybrid routers that had Arpanet hardware and software on one side, and the foreign network’s hardware and software on the other side, with a computer in the middle translating between the two.

Kahn recognized how this structure could be criticized. The network would be less efficient, since each packet from a foreign network to the Arpanet would have to go through three stages of processing, instead of one. It would be less reliable, since there would be three computers that could break, instead of one. It would be more complex, and expensive. The advantage would be that something like satellite communication could be managed completely independently from the Arpanet. So long as both the Arpanet and the foreign network adhered to the gateway interface standard, it would work. You could just connect them up, and the two networks could start sending and receiving packets. The key design idea is neither network would have to know anything about the internal details of the other. Instead of a closed system, it would be open, a network of networks that could accommodate any specialized network. This same scheme is used on the internet today.

Kahn needed help in defining this open concept. So, in 1973, he started working with Vint Cerf, who had also worked on the Arpanet. Cerf had just started work at the computer science department of Stanford University.

Vint Cerf, from Wikipedia

Cerf came up with an idea for transporting packets between networks. Firstly, a universal transmission protocol would be recognized across the network. Each sending network would encode its packets with this universal protocol, and “wrap” them in an “envelope” that used “markings” that the sending network understood in its protocol. The gateway computer would know about the “envelopes” that the sending and receiving networks understood, along with the universal protocol encoding they contained. When the gateway received a packet, it would strip off the sending “envelope,” and interpret the universal protocol enclosed within it. Using that, it would wrap the packet in a new “envelope” for the destination network to use for sending the packet through itself. Waldrop had a nice analogy for how it worked. He said it would be as if you were sending a card from the U.S. to Japan. In the U.S., the card would be placed in an envelope that had your address and the destination address written in English. At the point when it left the U.S. border, the U.S. envelope would be stripped off, and the card would be placed in a new envelope, with the source and destination addresses translated to Kanji, so it could be understood when it reached Japan. This way, each network would “think” it was transporting packets using its own protocol.
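The envelope idea can be sketched in a few lines of Python. Everything here is hypothetical and for illustration only: the class names, the `wrap` and `gateway` functions, and the network names are mine, and the “markings” string is just a stand-in for whatever header format a real network would use.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UniversalPacket:
    """The 'card': addressed in the protocol every network agrees on."""
    source: str
    destination: str
    payload: str

@dataclass(frozen=True)
class Envelope:
    """The wrapper one particular network understands."""
    network: str    # which network's markings this envelope uses
    markings: str   # stand-in for that network's own header format
    inner: UniversalPacket

def wrap(packet, network):
    """Put a universal packet into an envelope for the given network."""
    markings = f"{network}:{packet.source}->{packet.destination}"
    return Envelope(network=network, markings=markings, inner=packet)

def gateway(envelope, from_network, to_network):
    """Strip the sending network's envelope and re-wrap for the next one."""
    if envelope.network != from_network:
        raise ValueError("envelope is not from the network this gateway serves")
    packet = envelope.inner          # strip off the old envelope
    return wrap(packet, to_network)  # the card itself is never altered

card = UniversalPacket("host-a", "host-b", "hello")
outbound = wrap(card, "arpanet")
relayed = gateway(outbound, "arpanet", "satnet")
print(relayed.markings)       # satnet:host-a->host-b
print(relayed.inner == card)  # True
```

The point of the design shows up in `gateway`: it only needs to know the two envelope formats and the universal one. Neither network needs to know anything about the other.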

Kahn and Cerf also understood the lesson from ALOHAnet. In their new protocol, senders wouldn’t look for gaps in transmission, but send packets immediately, and hope for the best. If any packets were missed at the other end, due to collisions with other packets, they would be re-sent. This portion of the protocol was called TCP (Transmission Control Protocol). A separate protocol was created to manage how to address packets to foreign networks, since the Arpanet didn’t understand how to do this. This was called IP (Internet Protocol). Together, the new internetworking protocol was called TCP/IP. Kahn and Cerf published their initial design for the internet protocol in 1974, in a paper titled, “A Protocol for Packet Network Interconnection.”
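Here is a minimal Python sketch of the "send immediately, and re-send what got lost" idea. It is a toy, and it assumes a lot: the sender waits for an acknowledgment after each packet, and the "channel" is a made-up function that drops the first attempt of whichever packets I tell it to. Real TCP is far more involved than this, so take it as an illustration of the principle only.

```python
def make_lossy_channel(drop_first_attempt_of):
    """Build a hypothetical channel that loses the first attempt of some packets."""
    already_dropped = set()
    received = []

    def channel(seq, data):
        if seq in drop_first_attempt_of and seq not in already_dropped:
            already_dropped.add(seq)
            return False          # packet lost, no acknowledgment comes back
        received.append(data)
        return True               # packet arrived and was acknowledged

    return channel, received


def send_with_retransmit(packets, channel, max_attempts=5):
    """Send each packet right away, re-sending any that are not acknowledged."""
    attempts = {}
    for seq, data in enumerate(packets):
        for attempt in range(1, max_attempts + 1):
            attempts[seq] = attempt
            if channel(seq, data):
                break
        else:
            raise RuntimeError(f"packet {seq} was never acknowledged")
    return attempts


channel, received = make_lossy_channel({1})
attempts = send_with_retransmit(["a", "b", "c"], channel)
print(attempts)
print(received)
```

In this run, the second packet is lost once and sent again, and the receiver still ends up with all three packets in order.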

Cerf started up his internetworking seminars at Stanford in 1973, in an effort to create the first implementation of TCP/IP. Cerf described his seminars as being as much about consensus as technology. He was trying to get everyone to agree to a universal protocol. People came from all over the world, in staggered fashion. It was a drawn out process, because each time new attendees showed up, the discussion had to start over again. By the end of 1974, the first detailed design was drawn up in a document, and published on the Arpanet, called Request For Comments (RFC) 675. Kahn chose three contractors to create the first implementations: Cerf and his students at Stanford, Peter Kirstein and his students at University College in London, and BBN, under Ray Tomlinson. All three implementations were completed by the end of 1975.

Bob Metcalfe, and a colleague named John Shoch from Xerox PARC, eagerly joined in with these seminars, but Metcalfe felt frustrated, because his own work on Ethernet, and a universal protocol they’d developed to interface Ethernet with Arpanet and Data General’s network, called PUP (PARC Universal Packet), was proprietary. He was able to make contributions to TCP/IP, but couldn’t overtly contribute much, or else it would jeopardize Xerox’s patent application on Ethernet. He and Shoch were able to make some covert contributions by asking some leading questions, such as, “Have you thought about this?” and “Have you considered that?” Cerf picked up on what was going on, and finally asked them, in what I can only assume was a humorous moment, “You’ve done this before, haven’t you?” Also, Cerf’s emphasis on consensus was taking a long time, and Metcalfe eventually decided to part with the seminars, because he wanted to get back to work on PUP at Xerox. In retrospect, he had second thoughts about that decision. He and David Boggs at Xerox got PUP up and running well before TCP/IP was finished, but it was a proprietary protocol. TCP/IP was not, and by making his decision, he had cut himself off from influencing the direction of the internet into what it eventually became.

PARC’s own view of Ethernet was that it would be used by millions of other networks. The initial vision of the TCP/IP group was that it would be an extension of Arpanet, and as such, would only need to accommodate the needs that DARPA had for it. Remember, Kahn set out for his networking scheme to accommodate satellite, and military communications, and some other uses that he hadn’t thought of yet. Metcalfe saw that TCP/IP would only accommodate 256 other networks. He said, “It could only work with one or two nets per country!” He couldn’t talk to them about the possibility of millions of networks using it, because that would have tipped others off to proprietary technology that Xerox had for local area networks (LANs). Waldrop doesn’t address this technical point of how TCP/IP was expanded to accommodate beyond 256 networks. Somehow it was, because obviously it’s expanded way beyond that.
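The 256 figure is what you get if the network number in an address is a single 8-bit field, which, as far as I can tell, is how the early address format worked. A likely part of the answer to how it grew past that is the address "classes" in the 1981 IP specification (RFC 791), which let the network number take up 7, 14, or 21 bits of the 32-bit address. The arithmetic below is just a sketch of that, with variable names I made up.

```python
# An 8-bit network number can only distinguish this many networks.
NETWORK_FIELD_BITS = 8
early_limit = 2 ** NETWORK_FIELD_BITS

# Bits available for the network number in each classful address type
# (the leading bits that mark the class itself are not counted here).
CLASSFUL_NETWORK_BITS = {"A": 7, "B": 14, "C": 21}
network_counts = {cls: 2 ** bits for cls, bits in CLASSFUL_NETWORK_BITS.items()}

print(early_limit)
print(network_counts)
print(sum(network_counts.values()))
```

That gets the count of possible networks from the hundreds into the millions, which is the scale Metcalfe had in mind.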

The TCP/IP working group worked on porting the network stack (called the Kahn-Cerf internetworking protocol in the early days) to many operating systems, and migrating subprotocols from the Arpanet to the Kahn-Cerf network, from the mid-1970s into 1980.

People on the project started using the term “Internet,” which was just a shortened version of the term “internetworking.” It took many years before TCP/IP was considered stable enough to port the Arpanet over to it completely. The Department of Defense adopted TCP/IP as its official standard in 1980. The work of converting the Arpanet protocols to use TCP/IP was completed by January 1, 1983, thus creating the Internet. This made it much easier to expand the network than was the case with the Arpanet’s old protocol, because TCP/IP incorporates protocol translation into the network infrastructure. So it doesn’t matter if a new network comes along with its own protocol. It can be brought into the internet, with a junction point that translates it.

We should keep in mind as well that at the point in time that the internet officially came online, nothing had changed with respect to the communications hardware (which I covered in Part 2). 56 Kbps was still the maximum speed of the network, as all network communications were still conducted over modems and dedicated long-distance phone lines. This became an issue as the network expanded rapidly.

The next generation of the internet

While the founding of the internet would seem to herald an age of networking harmony, this was not the case. Competing network standards proliferated before, during, and after TCP/IP was developed, and the networking world was just as fragmented when TCP/IP got going as before. People who got on one network could not transfer information, or use their network to interact with people and computers on other networks. The main reason for this was that TCP/IP was still a defense program.

The National Science Foundation (NSF) would unexpectedly play the critical role of expanding the internet beyond the DoD. The NSF was not known for funding risky, large-scale, enterprising projects. The people involved with computing research never had much respect for it, because it was so risk-averse, and politicized. Waldrop said,

Unlike their ARPA counterparts, NSF funding officers had to submit every proposal for “peer-review” by a panel of working scientists, in a tedious decision-by-committee process that (allegedly) quashed anything but the most conventional ideas. Furthermore, the NSF had a reputation for spreading its funding around to small research groups all over the country, a practice that (again allegedly) kept Congress happy but most definitely made it harder to build up a Project MAC-style critical mass of talent in any one place.

…

Still, the NSF was the only funding agency chartered to support the research community as a whole. Moreover, it had a long tradition of undertaking … big infrastructure efforts that served large segments of the community in common. And on those rare occasions when the auspices were good, the right people were in place, and the planets were lined up just so, it was an agency where great things could happen. — TDM, p. 458

In the late 1970s, researchers at “have not” universities began to complain that while researchers at the premier universities had access to the Arpanet, they didn’t. They wanted equal access. So, the NSF was petitioned by a couple of schools in 1979 to solve this, and in 1981, the NSF set up CSnet (Computer Science Network), which linked smaller universities into the Arpanet using TCP/IP. This was the first time that TCP/IP was used outside of the Defense Department.

Steve Wolff, from Internet2

The next impetus for expanding the internet came from physicists, who thought of creating a network of supercomputing centers for their research, since using pencil and paper was becoming untenable, and they were having to beg for time on supercomputers at Los Alamos and Lawrence Livermore that were purchased by the Department of Energy for nuclear weapons development. The NSF set up such centers at the University of Illinois, Cornell University, Princeton, Carnegie Mellon, and UC San Diego in 1985. With that, the internet became a national network, but as I said, this was only the impetus. Steve Wolff, who would become a key figure in the development of the internet at the NSF, said that as soon as the idea for these centers was pitched to the NSF, it became instantly apparent to them that this network could be used as a means for communication among scientists about their work. That last part was something the NSF just “penciled in,” though. It didn’t make plans for how big this goal could get. From the beginning, the NSF network was designed to allow communication among the scholarly community at large. The NSF tried to expand the network outward from these centers to K-12 schools, museums, “anything having to do with science education.” We should keep in mind how unusual this was for the NSF. Waldrop paraphrased Wolff, saying,

[The] creation of such a network exceeded the foundation’s mandate by several light-years. But since nobody actually said no—well, they just did it. — TDM, p. 459

The NSF declared TCP/IP as its official standard in 1985 for all digital network projects that it would sponsor thenceforth. The basic network that the NSF set up was enough to force other government agencies to adopt TCP/IP as their network standard. It forced computer manufacturers, like IBM and DEC, to support TCP/IP as well as their own networking protocols that they tried to push, because the government was too big a customer to ignore.

The expanded network, known as “NSFnet,” came online in 1986.

And the immediate result was an explosion of on-campus networking, with growth rates that were like nothing anyone had imagined. Across the country, those colleges, universities, and research laboratories that hadn’t already taken the plunge began to install local-area networks on a massive scale, almost all of them compatible with (and having connections to) the long-distance NSFnet. And for the first time, large numbers of researchers outside the computer-science departments began to experience the addictive joys of electronic mail, remote file transfer and data access in general. — TDM, p. 460

That same year, a summit of network representatives came together to address the problem of naming computers on the internet, and assigning network addresses. From the beginnings of the Arpanet, a paper directory of network names and addresses had been updated and distributed to the different computer sites on the network, so that everyone would know how to reach each other’s computers on the network. There were so few computers on it, they didn’t worry about naming conflicts. By 1986 this system was breaking down. There were so many computers on the network that assigning names to them started to be a problem. As Waldrop said, “everyone wanted to claim ‘Frodo’.” The representatives came to the conclusion that they needed to automate the process of creating the directory for the internet. They called it the Domain Name System (DNS). (I remember getting into a bit of an argument with Peter Denning in 2009 on the issue of the lack of computer innovation after the 1970s. He brought up DNS as a counter-example. I assumed DNS had come in with e-mail in the 1970s. I can see now I was mistaken on that point.)

Then there was the problem of speed:

“When the original NSFnet went up, in nineteen eighty-six, it could carry fifty-six kilobits per second, like the Arpanet,” says Wolff. “By nineteen eighty-seven it had collapsed from congestion.” A scant two years earlier there had been maybe a thousand host computers on the whole Internet, Arpanet included. Now the number was more like ten thousand and climbing rapidly. — TDM, p. 460

This was a dire situation. Wolff and his colleagues quickly came up with a plan to increase the internet’s capacity by a factor of 30, increasing its backbone bandwidth to 1.5 Mbps, using T1 lines. They didn’t really have the authority to do this. They just did it. The upgrade took effect in July 1988, and usage exploded again. In 1991, Wolff and his colleagues boosted the network’s capacity by another factor of 30, going to T3 lines, at 45 Mbps.
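To check the "factor of 30" figures, here is the arithmetic in Python, using the rounded speeds quoted in this post (56 Kbps, 1.5 Mbps, and 45 Mbps). The variable names are mine, and the exact line rates of T1 and T3 circuits differ slightly from these round numbers, so treat the results as approximate.

```python
# Backbone speeds as quoted in the text, in kilobits per second.
ORIGINAL_KBPS = 56
T1_KBPS = 1_500
T3_KBPS = 45_000

first_upgrade = T1_KBPS / ORIGINAL_KBPS    # the 1988 upgrade
second_upgrade = T3_KBPS / T1_KBPS         # the 1991 upgrade
overall = T3_KBPS / ORIGINAL_KBPS          # both upgrades combined

print(round(first_upgrade, 1), round(second_upgrade, 1), round(overall, 1))
```

So the first jump was a bit under 30 times, the second was 30 times, and together they made the backbone roughly 800 times faster within about three years.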

About Al Gore

Before I continue, I thought this would be a good spot to cover this subject, because it relates to what happened next in the internet’s evolution. Bob Taylor said something about this in the Q&A section in his 2010 talk at UT Austin. (You can watch the video at the end of Part 3.) I’ll add some more to what he said here.

Gore became rather famous for supposedly saying he “invented” the internet. Since politics is necessarily a part of discussing government R&D, I feel it’s necessary to address this issue, because Gore was a key figure in the development of the internet, but the way he described his involvement was confusing.

First of all, he did not say the word “invent,” but I think it’s understandable that people would get the impression that he said he invented the internet, in so many words. This story originated during Gore’s run for the presidency in 1999, when he was Clinton’s vice president. What I quote below is from an interview with Gore on CNN’s Late Edition with Wolf Blitzer:

During my service in the United States Congress, I took the initiative in creating the Internet. I took the initiative in moving forward a whole range of initiatives that have proven to be important to our country’s economic growth, environmental protection, improvements in our educational system, during a quarter century of public service, including most of it coming before my current job. I have worked to try to improve the quality of life in our country, and our world. And what I’ve seen during that experience is an emerging future that’s very exciting, about which I’m very optimistic …

The video segment I cite cut off after this, but you get the idea of where he was going with it. He used the phrase, “I took the initiative in creating the Internet,” several times in interviews with different people. So it was not just a slip of the tongue. It was a talking point he used in his campaign. There are some who say that he obviously didn’t say that he invented the internet. Well, you take a look at that sentence and try to see how one could not infer that he said he had a hand in making the internet come into existence, and that it would not have come into existence but for his involvement. In my mind, if anybody “took the initiative in creating the Internet,” it was the people involved with ARPA/IPTO, as I’ve documented above, not Al Gore! You see, to me, the internet was really an evolutionary step made out of the Arpanet. So the internet really began with the Arpanet in 1969, at the time that Gore had enlisted in the military, just after graduating from Harvard.

The term “invented” was used derisively against him by his political opponents to say, “Look how delusional he is. He thinks he created it.” Everyone knew his statement could not be taken at face value. His statement was, in my own judgment, a grandiose and self-serving claim. At the time I heard about it, I was incredulous, and it got under my skin, because I knew something about the history of the early days of the Arpanet, and I knew it didn’t include the level of involvement his claim stated, when taken at face value. It felt like he was taking credit where it wasn’t due. However, as you’ll see in statements below, some of the people who were instrumental in starting the Arpanet, and then the internet, gave Gore a lot of slack, and I can kind of see why. Though Gore’s involvement with the internet didn’t come into view until the mid-1980s, apparently he was doing what he could to build political support for it inside the government all the way back in the 1970s, something that has not been visible to most people.

Gore’s approach to explaining his role, though, blew up in his face, and that was the tragedy in it, because it obscured his real accomplishments. What would’ve been a more precise way of talking about his involvement would’ve been for him to say, “I took initiatives that helped create the Internet as we know it today.” The truth of the matter is he deserves credit for providing and fostering political, intellectual, and financial support for a next generation internet that would support the transmission of high-bandwidth content, what he called the “information superhighway.” Gore’s father, Al Gore, Sr., sponsored the bill in the U.S. Senate that created the Interstate highway system, and I suspect that Al Gore, Jr. wanted, or those in his campaign wanted him to be placed right up there with his father in the pantheon of leaders of great government infrastructure projects that have produced huge dividends for this country. That would’ve been nice, but seeing him overstate his role took him down a peg, rather than raising him up in public stature.

Here are some quotes from supporters of Gore’s efforts, taken from a good Wikipedia article on this subject:

As far back as the 1970s Congressman Gore promoted the idea of high-speed telecommunications as an engine for both economic growth and the improvement of our educational system. He was the first elected official to grasp the potential of computer communications to have a broader impact than just improving the conduct of science and scholarship […] the Internet, as we know it today, was not deployed until 1983. When the Internet was still in the early stages of its deployment, Congressman Gore provided intellectual leadership by helping create the vision of the potential benefits of high speed computing and communication. As an example, he sponsored hearings on how advanced technologies might be put to use in areas like coordinating the response of government agencies to natural disasters and other crises.

— a joint statement by Vint Cerf and Bob Kahn

The sense I get from the above statement is that Gore was a political and intellectual cheerleader for the internet in the 1970s, spreading a vision about how it could be used in the future, to help build political support for its future funding and development. The initiative Gore talked about I think is expressed well in this statement:

A second development occurred around this time, namely, then-Senator Al Gore, a strong and knowledgeable proponent of the Internet, promoted legislation that resulted in President George H.W. Bush signing the High Performance Computing and Communication Act of 1991. This Act allocated $600 million for high performance computing and for the creation of the National Research and Education Network. The NREN brought together industry, academia and government in a joint effort to accelerate the development and deployment of gigabit/sec networking.

These quotes are more of a summary. Here is some more detail from TDM.

In 1986, Democratic Senator Al Gore, who had a keen interest in the internet, and the development of the networked supercomputing centers created in 1985, asked for a study on the feasibility of getting gigabit network speeds using fiber optic lines to connect up the existing computing centers. This created a flurry of interest from a few quarters. DARPA, NASA, and the Energy Department saw opportunities in it for their own projects. DARPA wanted to include it in their Strategic Computing Initiative, begun by Bob Kahn in the early 1980s. DEC saw a technological opportunity to expand upon ideas it had been developing. To Gore’s surprise, he received a multiagency report in 1987 advocating a government-wide “assault” on computer technology. It recommended that Congress fund a billion-dollar research initiative in high-performance computing, with the goal of creating computers in several years whose performance would be an order of magnitude greater than the fastest computers available at the time. It also recommended starting the National Research and Education Network (NREN), to create a network that would send data at gigabits per second, an order of magnitude greater than what the internet had just been upgraded to. In 1988, after getting more acquainted with these ideas, he introduced a bill in Congress to fund both initiatives, dubbed “the Gore Bill.” It put DARPA in charge of developing the gigabit network technology, and officially authorized and funded the NSF’s expansion of NSFnet. The bill also gave NSFnet the explicit mission of connecting the whole federal government, and the university system of the United States, to the internet. It ran into a roadblock, though, from the Reagan Administration, which argued that these initiatives for faster computers and faster networking should be left to the private sector.

This stance causes me to wonder if perhaps there was a mood to privatize the internet back then, but it just wasn’t accomplished until several years later. I talked with a friend a few years ago about the internet’s history. He’s been a tech innovator for many years, and he said he was lobbying for the network to be privatized in the ’80s. He said he and other entrepreneurs he knew were chomping at the bit to develop it further. Perhaps there wasn’t a mood for that in Congress.

Edit 5/10/17: I deleted a paragraph here, because I realized Waldrop may have had some incorrect information. He said on Page 461 of TDM that the Gore Bill was split into two, one passed in 1991, and another bill that was passed in 1993, when Gore became Vice President. I can’t find information on the 1993 bill he talks about. So, I’m going to assume for the time being that instead, the Gore Bill was just passed later, in 1991, as the High-Performance Computing Act (HPCA). It’s possible that the funding for it was split up, with part of it appropriated in 1991, and the rest being appropriated in 1993. If I get more accurate information, I will update this history.

Waldrop said, though, that the defeat of the Gore Bill in 1988 was just a bump in the road, and it hardly mattered, because the agencies were quietly setting up a national network on their own. Gordon Bell at the NSF said that their network was getting bigger and bigger. Program directors from the NSF, NASA, DARPA, and the Energy Department created their own ad hoc shadow agency, called the Federal Research Internet Coordinating Committee (FRICC). If they agreed on a good idea, they found money in one of their agencies’ budgets to do it. This committee would later be reconstituted officially as the Federal Networking Council. These agencies also started standardizing their networks around TCP/IP. By the beginning of the 1990s, the de facto national research network was officially in place.

Marc Andreessen said that the development of Mosaic, the first publicly available web browser, and the prototype for Netscape’s browser, couldn’t have happened when it did without funding from the HPCA. “If it had been left to private industry, it wouldn’t have happened. At least, not until years later,” he said. Mosaic was developed at the National Center for Supercomputing Applications (NCSA) at the University of Illinois in 1993. Netscape was founded, as Mosaic Communications Corporation, in April 1994.

The 2nd generation network takes shape

Recognizing in the late ’80s that the megabit speeds of the internet were making the Arpanet a dinosaur, DARPA started decommissioning their Interface Message Processors (IMPs), and by 1990, the Arpanet was officially offline.

By 1988, there were 60,000 computers on the internet. By 1991, there were 600,000.

CERN in Switzerland had joined the internet in the late 1980s. In 1990, an English physicist working at CERN, named Tim Berners-Lee, created his first system for hyperlinking documents on the internet, what we would now know as a web server and a web browser. Berners-Lee had actually been experimenting with data linking long before this, long before he’d heard of Vannevar Bush, Doug Engelbart, or Ted Nelson. In 1980, he linked files together on a single computer, to form a kind of database, and he had repeated this metaphor for years in other systems he created. He had written his World Wide Web tools and protocol to run on the NeXT computer. It would be more than a year before others would implement his system on other platforms.

How the internet was privatized

Wolff, at NSF, was beginning to try to get the internet privatized in 1990. He said,

“I pushed the Internet as hard as I did because I thought it was capable of becoming a vital part of the social fabric of the country—and the world,” he says. “But it was also clear to me that having the government provide the network indefinitely wasn’t going to fly. A network isn’t something you can just buy; it’s a long-term, continuing expense. And government doesn’t do that well. Government runs by fad and fashion. Sooner or later funding for NSFnet was going to dry up, just as funding for the Arpanet was drying up. So from the time I got here, I grappled with how to get the network out of the government and instead make it part of the telecommunications business.” — TDM, p. 462

The telecommunications companies, though, didn’t have an interest in taking on building out the network. From their point of view, every electronic transaction was a telephone call that didn’t get made. Wolff said,

“As late as nineteen eighty-nine or ’ninety, I had people from AT&T come in to me and say—apologetically—‘Steve, we’ve done the business plan, and we just can’t see us making any money.’” — TDM, p. 463

Wolff had planned from the beginning of NSFnet to privatize it. After the network got going, he decentralized its management into a three-tiered structure. At the lowest level were campus-scale networks operated by research laboratories, colleges, and universities. In the middle level were regional networks connecting the local networks. At the highest level was the “backbone” that connected all of the regional networks together, operated directly by the NSF. This scheme didn’t get fully implemented until they did the T1 upgrade in 1988.

Wolff said,

“Starting with the inauguration of the NSFnet program in nineteen eighty-five,” he explains, “we had the hope that it would grow to include every college and university in the country. But the notion of trying to administer a three-thousand-node network from Washington—well, there wasn’t that much hubris inside the Beltway.” — TDM, p. 463

It’s interesting to hear this now, given that HealthCare.gov is estimated to be between 5 and 15 million lines of code, and it is managed by the federal government, seemingly with no plans to privatize it. (No official figures are available, to my knowledge. This is just what’s been disclosed on the internet by an apparent insider.) For comparison, that’s between the size of Windows NT 3.1 and the flight control software of the Boeing 787.

Wolff also structured each of the regional service providers as non-profits. Wolff told them up front that they would eventually have to find other customers besides serving the research community. “We don’t have enough money to support the regionals forever,” he said. Eventually, the non-profits found commercial customers—before the internet was privatized. Wolff said,

“We tried to implement an NSF Acceptable Use Policy to ensure that the regionals kept their books straight and to make sure that the taxpayers weren’t directly subsidizing commercial activities. But out of necessity, we forced the regionals to become general-purpose network providers.” — TDM, p. 463

This structure became key to how the internet we have today unfolded. Waldrop notes that around 1990, there were a number of independent service providers that came into existence to provide internet access to anyone who wanted it, without restrictions.

In 1991, a dispute erupted between one of the regional networks and NSF over who should run the top level network (NSF had awarded the contract to a company called ANS, a consortium of private companies). After a series of investigations, which found no wrongdoing on NSF’s part, Congress passed a bill in 1992 that allowed for-profit Internet Service Providers to access the top level network. Over the next few years, the subsidy the NSF provided to the regional networks tapered off, until on April 30, 1995, NSFnet ceased to exist, and the internet was self-sustaining. The NSF continued operating a much smaller, high-speed network, connecting its supercomputing centers at 155 Mbps.

Where are they now?

Bob Metcalfe left PARC in 1975. He returned to Xerox in 1978, working as a consultant to DEC and Intel, to try to hammer out an open standard agreement with Xerox for Ethernet. He started 3Com in 1979 to sell Ethernet technology. Ethernet became an open standard in 1982. He left 3Com in 1990. He then became a publisher, and wrote a column for InfoWorld Magazine for ten years. He became a venture capitalist in 2001, and is now a partner with Polaris Venture Partners.

Bob Kahn founded the Corporation for National Research Initiatives, a non-profit organization, in 1986, after leaving DARPA. Its mission is “to provide leadership and funding for research and development of the National Information Infrastructure.” He served on the State Department’s Advisory Committee on International Communications and Information Policy, the President’s Information Technology Advisory Committee, the Board of Regents of the National Library of Medicine, and the President’s Advisory Council on the National Information Infrastructure. Kahn is currently working on a digital object architecture for the National Information Infrastructure, as a way of connecting different information systems. He is a co-inventor of Knowbot programs, mobile software agents in the network environment. He is a member of the National Academy of Engineering, and is currently serving on the State Department’s Advisory Committee on International Communications and Information Policy. He is also a Fellow of the IEEE, a Fellow of AAAI (Association for the Advancement of Artificial Intelligence), a Fellow of the ACM, and a Fellow of the Computer History Museum. (Source: The Corporation for National Research Initiatives)

Vint Cerf joined DARPA in 1976. He left to work at MCI in 1982, where he stayed until 1986. He created MCI Mail, the first commercial e-mail service that was connected to the internet. He joined Bob Kahn at the Corporation for National Research Initiatives, as Vice President. In 1992, he and Kahn, among others, founded the Internet Society (ISOC). He re-joined MCI in 1994 as Senior Vice President of Technology Strategy. The Wikipedia page on him is unclear on this detail, saying that at some point, “he served as MCI’s senior vice president of Architecture and Technology, leading a team of architects and engineers to design advanced networking frameworks, including Internet-based solutions for delivering a combination of data, information, voice and video services for business and consumer use.” There are a lot of accomplishments listed on his Wikipedia page, more than I want to list here, but I’ll highlight a few:

Cerf joined the board of the Internet Corporation for Assigned Names and Numbers (ICANN) in 1999, and served until the end of 2007. He has been a Vice President at Google since 2005. He is working on the Interplanetary Internet, together with NASA’s Jet Propulsion Laboratory. It will be a new standard to communicate from planet to planet, using radio/laser communications that are tolerant of signal degradation. He currently serves on the board of advisors of Scientists and Engineers for America, an organization focused on promoting sound science in American government. He became president of the Association for Computing Machinery (ACM) in 2012 (for 2 years). In 2013, he joined the Council on Cybersecurity’s Board of Advisors.

Steve Wolff left the NSF in 1994, and joined Cisco Systems as a business development manager for their Academic Research and Technology Initiative. He is a life member of the Institute of Electrical and Electronics Engineers (IEEE), and is a member of the American Association for the Advancement of Science (AAAS), the Association for Computing Machinery (ACM), and the Internet Society (ISOC). (Sources: Wikipedia, and Internet2)

Conclusion

This is the final installment in my series.

The point of this series has been to tell a story about the research and development that led to the technology that we are conscious of using today, and some of the people who made it happen. I know for a fact I’ve left out some names of contributors to these technologies. I have tried to minimize that, but trying to trace all of that down has been more tedious than I have wanted to delve into to write this. My point about naming names has not been so much to give credit where it’s due (though I feel an obligation to be accurate about that). It’s been to make this story more real to people, to let people know that it wasn’t just some abstract “government effort” that magically made it all happen; that real flesh and blood people were involved. I’ve tried a bit to talk about the backgrounds of these individuals to illustrate that most of them did not train specifically to make these ideas happen. The main point of this review of the history has been to get across where the technology came from; a few of the ideas that led to it, and what research structure was necessary to incubate it.

One of my hopes with this series was to help people understand and appreciate the research culture that generated these ideas, so as to demystify it. The point of that is to help people understand that it can be done again. It wasn’t just a one-shot deal of some mythical past that is no longer possible. However, to have that research culture requires having a different conception of what the point of research is. I talked about this in “Are we future-oriented?” Neil deGrasse Tyson nicely summarized it at the end of this 2009 interview:

Scott Adams has a great rule, that it’s a waste of time to talk about doing something, as opposed to just doing it (though a lot of organizations get into talking about what they should do, instead of doing it). What motivates “just doing it,” though, is different from talking about it. I started out this series talking about how the Cold War created the motivation to do the research, because our government realized that while it had working nuclear weapons, it was not really prepared to handle the scenario where someone else had them as well, and that part of being prepared for it required automating the process of gathering information, and partly automating the process of acting on it. The problem the government had was the technology to do that didn’t exist. So, it needed to find out how to do it. That’s what motivated the government to “just do it” with respect to computer research, rather than talk about doing it.

I started out feeling optimistic that reviewing this history would not only correct ignorance about it, but would also create more support for renewing the research culture that once existed. I am less optimistic about that second goal now. I think that human nature, at least in our society, dictates that circumstances primarily drive the motivation to fund and foster basic research, and it doesn’t have to be funded by government. As I’ve shown in this series, some of the most important research in computing was funded totally in the private sector. What mattered was the research environment that was created, wherever it was done.

I must admit, though, that the primary motivation for me to write this series was my own education. I very much enjoyed coming to understand the continuum between the technology I enjoyed, and was so inspired by when I was growing up, and the ideas that helped create them. For most of my life, the computer technology I used when I was younger just seemed to pop into existence. I thought for many years that the computer companies created their own conception of computers, and I gave the companies, and sometimes certain individuals in them all the credit. In a few cases they seemed brilliant. A little later I learned about Xerox PARC and the graphical user interface, and a bit about the lineage of the internet’s history, but that was it.

This view of things tends to create a mythology, and what I have seen that lead to is a cargo cult, of sorts. Cargo cults don’t create the benefits of the things that brought them. As Richard Feynman said, “The planes don’t land.” It’s a version of idol worship that looks modern and novel to the culture that engages in it. It has the cause and effect relationship exactly backwards. What creates the benefits is ideas generated from imagination and knowledge intersecting with the criticism produced through powerful outlooks. What I’ve hoped to do with this series is dispel the mythology that generates this particular cargo cult. I certainly don’t want to be part of it any longer. I was for many years. What’s scary about that is for most of that time, I didn’t even recognize it. It was just reality.

Edit 7/17/2017: One thing I’ve been meaning to note: my primary source of information for this series was “The Dream Machine,” as I noted earlier. Even though there is a lot of material in this series, I did not just summarize the whole book. I summarized portions of it. The time period it covers is pretty vast. It starts in the 1920s, and quite a bit happened with computational theory in the 1930s, which Waldrop documents, setting the stage for what happened later. One thing I’ve hoped is that this series inspires readers to explore historical sources, including this book.

I found out about Antic–The Atari 8-bit podcast through a vintage computer group on Google+. They have some great interviews on there. At first blush, the format of these podcasts sounds amateurish. The guys running the show sound like your stereotypical nerds, and sometimes the audio sounds pretty bad, because the guest responses are being transmitted through Skype, or perhaps using cell phones, but if you’re interested in computer history like I am, you will get information on the history of Atari, and the cottage industry that was early personal computing and video gaming that you’re not likely to find anywhere else. The people doing these podcasts are also active contributors to archive.org, keeping these interviews for posterity.

What comes through is the business was informal. It was also possible for one or two people to write a game, an application, or a complete operating system. Those days are gone.

Most of the interviews I’ve selected cover the research and development Atari did from 1977 until 1983, when Warner Communications (now known as Time-Warner) owned the company, because it’s the part of the history that I think is the most fascinating. So much of what’s described here is technology that was only hinted at in old archived articles I used to read. It sounded so much like opportunity that was lost, and indeed it was, because even though it was ready to be produced, the higher-ups at Atari didn’t want to put it into production. The details make that much more apparent. Several of these interviews cover the innovative work of the researchers who worked under Alan Kay at Atari. Kay left Xerox PARC, and worked at Atari as its chief scientist from 1981 to 1984. In 1983, Atari, and the rest of the video games industry went through a market crash. Atari’s consumer electronics division was sold to Jack Tramiel in 1984. Atari’s arcade game business was spun off into its own company. Both used the name “Atari” for many years afterward. For those who remember this time, or know their history, Jack Tramiel founded Commodore Business Machines in the 1950s as a typewriter parts company. Commodore got into producing computers in the late 1970s. Jack Tramiel had a falling out with Commodore’s executives over the future direction of the company in 1984, and so he left, taking many of Commodore’s engineers with him. He bought Atari the same year. As I’ve read the history of this period, it sounds literally as if the two companies did a “switch-a-roo,” where Amiga’s engineers (who were former Atari employees) came to work at Commodore, and many of Commodore’s engineers came to work at Atari. 1984 onward is known as the “Tramiel era” at Atari. I only have one interview from that era, the one with Bob Brodie. That time period hasn’t been that interesting to talk about.

I’ll keep adding more podcasts to this post as I find more interesting stuff. I cover in text a lot of what was interesting to me in these interviews. I’ve provided links to each of the podcasts as sources for my comments.

Show #4 – An interview with Chris Crawford. Crawford is a legend in the gaming community, along with some others, such as Sid Meier (who worked at Microprose). Crawford is known for his strategy/simulation games. In this interview, he talked about what motivated him to get into computer games, how he came to work at Atari, his work at Atari’s research lab, working with Alan Kay, and the games he developed through his career. One of the games he mentioned that seemed a bit fascinating to me is one that Atari didn’t release. It was called “Gossip.” He said that Alan encouraged him to think about what games would be like 30 years in the future. This was one product of Crawford thinking about that. I found a video of “Gossip.”

It’s not just a game where you fool around. There is actual scoring in it. The point of the game is to become the most popular, or “liked” character, and this is scored by what other characters in the game think of you, and of other characters. This is determined by some mathematical formula that Crawford came up with. AI is employed in the game, because the other characters are played by the computer, and each is competing against you to become the most popular. You influence the other players by gossiping about them, to them. The other characters try to influence you by sharing what they think other characters are thinking about you or someone else. Since Atari computers had limited resources, there is no actual talking, though they added the effect of you hearing something that resembles a human voice as each computer character gossips. You gossip via a “facial expression vocabulary.” It’s not easy to follow what’s going on in the game. You call someone. You tell them, “This is how I feel about so-and-so.” They may tell you, “This is what X feels about Y,” using brackets and a pulsing arrow to denote, “who feels about whom.” The character with brackets around them shows a facial expression to convey what the character on the other end of the phone is telling you.

Interview #7 with Bill Wilkinson, founder of Optimized Systems Software (OSS), and author of the “Insight: Atari” column at Compute! Magazine. He talked about how he became involved with the projects to create Atari Basic, and Atari DOS, working as a contractor with Atari through a company called Shepardson Microsystems. He also talked about the business of running OSS. There were some surprises. One was Wilkinson said that Atari Basic compiled programs down to bytecode, line by line, and ran them through a virtual machine, all within an 8K cartridge. I had gotten a hint of this in 2015 by learning how to explore the memory of the Atari (running as an emulator on my Mac) while running Basic, and seeing the bytecodes it generated. After I heard what he said, it made sense. This runs counter to what I’d heard for years. In the 1980s, Atari Basic had always been described as an interpreter, and the descriptions of what it did were somewhat off-base, saying that it “translated Basic into machine code line by line, and executed it.” It could be that “bytecode” and “virtual machine” were not part of most programmers’ vocabularies back then, even though there had been such systems around for years. So the only way they could describe it was as an “interpreter.”

Another surprise was that Wilkinson did not paint a rosy picture of running OSS. He said of himself that in 20/20 hindsight, he shouldn’t have been in business in the first place. He was not a businessman at heart. All I take from that is that he should have had someone else running the business, not that OSS shouldn’t have existed. It was an important part of the developer community around Atari computers at the time. Its Mac/65 assembler, in particular, was revered as a “must have” for any serious Atari developer. Its Action! language compiler (an Atari 8-bit equivalent of the C language) was also highly praised, though it came out late in the life of the Atari 8-bit computer line. Incidentally, Antic has an interview with Action!’s creator, Clinton Parker, and with Stephen Lawrow, who created Mac/65.

Wilkinson died on November 10, 2015.

Interview #11 with David Small, columnist for Creative Computing, and Current Notes. He was the creator of the Magic Sac, Spectre 128, and Spectre GCR Macintosh emulators for the Atari ST series computers. It was great hearing this interview, just to reminisce. I had read Small’s columns in Current Notes years ago, and usually enjoyed them. I had a chance to see him speak in person at a Front Range Atari User Group meeting while I was attending Colorado State University, around 1993. I had no idea he had lived in Colorado for years, and had operated a business here! He had a long and storied history with Atari, and with the microcomputer industry.

Interview #30 with Jerry Jessop. He worked at Atari from 1977 to 1984 (the description says he worked there until 1985, but in the interview he says he left soon after it was announced the Tramiels bought the company, which was in ’84). He worked on Atari’s 8-bit computers. There were some inside baseball stories in this interview about the development of the Atari 1400XL and 1450XLD, which were never sold. The most fascinating part is where he described working on a prototype called “Mickey” (also known as the 1850XLD) which might have become Atari’s version of the Amiga computer had things worked out differently. It was supposed to be a Motorola 68000 computer that used a chipset that Amiga Inc. was supposed to develop for Atari, in exchange for money Atari had loaned to Amiga for product development. There is also history which says that Atari structured the loan in a predatory manner, thinking that Amiga would never be able to pay it off, in which case Atari would’ve been able to take over Amiga for free. Commodore intervened, paid off the loan, got the technology, and the rest is history. This clarified some history for me, because I had this vague memory for years that Atari planned to make use of Amiga’s technology as part of a deal they had made with them. The reason the Commodore purchase occurred is Amiga ran through its money, and needed more. It appears the deal Atari had with Amiga fell through when the video game crash happened in 1983. Atari was pretty well out of money then, too.

Interview #37 with David Fox, creator of Rescue on Fractalus, and Ballblazer at Lucasfilm Games. An interesting factoid he talked about is that they wrote these games on a DEC Vax, using a cross-assembler that was written in Lisp. I read elsewhere that the assembler had the programmer write code using a Lisp-like syntax. They’d assemble the game, and then download the object code through a serial port on the Atari for testing. Fox also said that when they were developing Ballblazer, they sometimes used an Evans & Sutherland machine with vector graphics to simulate it.

Interview #40 with Doug Carlston, co-founder and CEO of Brøderbund Software. Brøderbund produced a series of hit titles, both applications and games for microcomputers. To people like me who were focused on the industry at the time, their software titles were very well known. Carlston talked about the history of the company, what led to its rise, and fall. He also had something very interesting to say about software piracy. He was the first president of the Software Publishers Association, and he felt that they took too harsh a stand on piracy, because from his analysis of sales in the software industry, it didn’t have the negative effect on it that most people thought it did. He saw piracy as a form of marketing. He thought it increased the exposure of software to people who would have otherwise not known about it, or thought to buy it. He said that when he compared the growth of software sales from this era to when software was first released on CDs, during a time when recording to CDs was too expensive for most consumers, and the volume of data held on CDs would have taken up a very large chunk of users’ hard drive space, if not exceeded it (so piracy was nearly impossible), he didn’t notice that big of a difference in the sales growth curves. Nevertheless, some developers at Brøderbund were determined to foil pirates, and a story he told about one developer’s efforts was pretty funny. 🙂 Though, he thought it went “too far,” so they promptly took out the developer’s piracy-thwarting code.

Interview #49 with Curt Vendel and Marty Goldberg. They spoke in detail about the history of how Atari’s video game and computer products developed, while Atari was owned by Warner Communications. A surprising factoid in this interview is they revealed that modern USB technology had its beginning with the development of the Atari 8-bit’s Serial Input/Output (SIO) port. This port was used to daisy-chain peripherals, such as disk drives, serial/parallel ports, and printers to the Atari computer. The point was it was an intelligent port architecture that was able to identify what was hooked up to the computer, without the user having to configure hardware or software. A driver could be downloaded from each device at bootup (though not all SIO devices had built-in drivers). Each device was capable of being ready to go the moment the user plugged it into either the computer, or one of the other SIO-compatible peripherals, and turned on the computer.

Interview #58 with Jess Jessop, who worked in Atari’s research lab. In one project, called E.R.I.C., he worked on getting the Atari 800 to interact with a laserdisc player, so the player could run video segments that Atari had produced to market the 800, and allow people to interact with the video through the Atari computer. The laserdisc player would run vignettes on different ways people could use the computer, and then have the viewer press a key on the keyboard, which directed the player to go to a specific segment. He had a really funny story about developing that for the Consumer Electronics Show. I won’t spoil it. You’ll have to listen to the interview. I thought it was hilarious! 😀 If you listen to some of the other podcasts I have here, Kevin Savetz, who conducted most of these interviews, tried to confirm this story from other sources. All of them seem to have denied it, or questioned it (questioned whether it was possible). So, this story, I think, is apocryphal.

Here’s a bit of the laserdisc footage.

Jessop also mentioned Grass Valley Lab (Cyan Engineering), a separate research facility that worked on a variety of technology projects for Atari, including “pointer” technology, using things like tablet/stylus interfaces, and advanced telephony. Kevin Savetz interviewed a few other people from Grass Valley Lab in other podcasts I have here.

Jessop said the best part of his job was Alan Kay’s brown bag lunch sessions, where all Alan would do was muse about the future, what products would be relevant 30 years down the road, and he’d encourage the researchers who worked with him to do the same.

The really interesting part was listening to Jessop talk about how the research lab had worked on a “laptop” project in the early 1980s with Alan Kay. He said they got a prototype going that had four processors in it. Two of them were a Motorola 68000 and an Intel processor. He said the 68000 was the “back-end processor” that ran Unix, and the Intel was a “front-end processor” for running MS-DOS, which he said was for “interacting with the outside world” (i.e., running the software everyone else was using). He said the Unix “back end” ran a graphical user interface, “because Alan Kay,” and the “front-end processor” ran DOS in the GUI. He said the prototype wasn’t going to end up being a “laptop,” since it got pretty heavy, ran hot with all the components in it, and had loud cooling fans. It also had a rather large footprint. The prototype was never put into production, according to Jessop, because of the market crash in 1983.

Jessop expressed how fun it was to work at Atari back when Warner Communications was running it. That’s something I’ve heard in a few of these interviews with engineers who worked there. They said it spoiled people, because most other places have not allowed the free-wheeling work style that Atari had. Former Atari engineers have since tried to recreate that, or find someplace that has the same work environment, with little success.

Interview #65 with Steve Mayer, who helped create Grass Valley Labs, and helped design the hardware for the Atari 2600 video game system, and the Atari 400 and 800 personal computers. The most interesting part of the interview for me was where he talked about how Atari talked to Microsoft about writing the OS for the 400 and 800. They were released in 1979, and Mayer was talking about a time before that when both computers were still in alpha development. He said Microsoft had a “light version of DOS,” which I can only take as an analogy to some other operating system they might have been in possession of, since PC-DOS didn’t even exist yet. I remember Bill Gates saying in the documentary “Triumph of the Nerds” that Microsoft had a version of CP/M on some piece of hardware (the term “softcard” comes to mind) that it had licensed from Digital Research by the time IBM came to them for an OS for their new PC, in about 1980. Maybe this is what Mayer was talking about.

Another amazing factoid he talked about is Atari considered having the computers boot into a graphical user interface, though he stressed that they didn’t envision using a mouse. What they looked at instead was having users access features on the screen using a joystick. What they decided on instead was to have the computer either boot into a “dumb editor” called “Memo Pad,” where the user could type in some text, but couldn’t do anything with it, or, if a Basic cartridge was inserted, to boot into a command line interface (the Basic language).

Mayer said that toward the end of his time at Atari, they were working on a 32-bit prototype. My guess is this is the machine that Jess Jessop talked about above. From documentation I’ve looked up, it seems there were two or three different 32-bit prototypes being developed at the time.

Interview #73 with Ron Milner, engineer at Cyan Engineering from 1973 to 1984. Milner described Cyan as a “think tank” that was funded by Atari. You can see Milner in the “Splendor in the Grass” video above. He talked about co-developing the Atari 2600 VCS for the first 15 minutes of the interview (in the podcast). For the rest, he talked about experimental technologies. He mentioned working on printer technology for the personal computer line, since he said printers didn’t exist for microcomputers when the Atari 400/800 were in development. He worked on various pointer control devices under Alan Kay. He mentioned working on a touch-screen interface. He described a “micro tablet” design, where they were thinking about having it be part of the keyboard housing (sounding rather like the trackpads we have on laptops). He also described a walnut-sized optical mouse he worked on. These latter two technologies were intended for use on the 8-bit computers, before Apple introduced their GUI/mouse interface in 1983 (with the Lisa), though Atari never produced them.

The most striking thing he talked about was a project he worked on shortly after leaving Atari, I guess under a company he founded, called Applied Design Laboratories (ADL). He called it a “space pen.” His setup was a frame that would fit around a TV set, with what he called “ultrasonic transducers” on it that could detect a pen’s position in 3D space. A customer soon followed that had made investments in virtual reality technology, and which used his invention for 3D sculpting. This reminded me so much of VR/3D sculpting technology we’ve seen demonstrated in recent years.

Interview #75 with Steve Davis, Director of Advanced Research for Atari, who worked at Cyan Engineering under Alan Kay. He worked on controlling a laserdisc player from an Atari 800, which has been talked about above (E.R.I.C.), a LAN controller for the Atari 800 (called ALAN-K, which officially stood for “Atari Local Area Network, Model-K,” but it was obviously a play on Alan Kay’s name), and “wireless video game cartridges.” Davis said these were all developed around 1980, but at least some of this had to have been started in 1981 or later, because Kay didn’t come to Atari until then. Davis mentioned working on some artificial intelligence projects, but as I recall, didn’t describe any of them.

He said the LAN technology was completed as a prototype. They were able to share files, and exchange messages with it. It was test-deployed at a resort in Mexico, but it was never put into production.

The “wireless video game cartridge” concept was also a completed prototype. This was only developed for the Atari 2600. From Davis’s description, you’d put a special cartridge in the 2600. You’d choose a game from a menu, pay for it, the game would be downloaded to the cartridge wirelessly, and you could then play it on your 2600. It sounded almost as if you rented the game remotely, downloading the game into RAM in the cartridge, because he said, “There was no such thing as Flash memory,” suggesting that it was not stored permanently on the cartridge. The internet didn’t exist at the time, and even if it did, it wouldn’t have allowed this sort of activity on it, since all commercial activity was banned on the early internet. He said Atari did not use cable technology to transmit games, but cooperated with a few local radio stations to act as servers for these games, which would be transmitted over FM subcarrier. He didn’t describe how this system handled paying for the games, either. In any case, it was not something that Atari put on the broad market.

Kevin Savetz, who’s done most of these interviews, has described APX, the Atari Program eXchange, as perhaps the first “app store,” in the sense that it was a catalog of programs sponsored by Atari, exclusively for Atari computer users, with software written by Atari employees, and non-Atari (independent) authors, who were paid for their contributions based on sales. Looking back on it now, it seems that Atari pioneered some of the consumer technology and marketing concepts that are now used on modern hardware platforms, using distribution technology that was cutting edge at the time (though it’s ancient by today’s standards).

Davis said that Warner Communications treated Atari like it treated a major movie production. This was interesting, because it helps explain a lot about why Atari made its decisions. With movies, they’d spend millions, sometimes tens of millions of dollars, make whatever money they made on it, and move on to the next one. He said Warner made a billion dollars on Atari, and once the profit was made, they sold it, and moved on. He said Warner was not a technology company that was trying to advance the state of the art. It was an entertainment company, and that’s how they approached everything. The reason Atari was able to spend money on R&D was due to the fact that it was making money hand over fist. A few of its projects from R&D were turned into products by Atari, but most were not. A couple I’ve heard about were eventually turned into products at other companies, after Atari was sold to Jack Tramiel. From how others have described the company, nobody was really minding the store. During its time with Warner, it spent money on all sorts of things that never earned a return on investment. One reason for that was it had well-formed ideas that could be turned into products, but they were in markets that Atari and Warner were not interested in pursuing, because they didn’t apply to media or entertainment.

Edit 7/30/2017: After some consultation with a historical researcher, I’ve decided to remove a podcast that I had placed here, of an interview with Tim McGuinness. What I learned was that he’s not a trustworthy source of information on work done at Atari, nor about Atari. Some circumstantial evidence I looked into a few days ago seemed to confirm this. I apologize if what I posted here has misled anyone.

Interview #77 with Tandy Trower, who worked with Microsoft to develop a licensed version of Microsoft Basic for the Atari computers. Trower said that Atari’s executives were so impressed with Microsoft that they took a trip up to Seattle to talk to Bill Gates about purchasing the company! Keep in mind that at the time, this was entirely plausible. Atari was one of the fastest growing companies in the country. It had turned into a billion-dollar company due to the explosion of the video game industry, which it led. It had more money than it knew what to do with. Microsoft was a small company, whose only major product was programming languages, particularly its Basic language. Gates flatly refused the offer, but it’s amazing to contemplate how history would have been different if he had accepted!

Trower told another story of how he left to go work at Microsoft, where he worked on the Atari version of Microsoft Basic from the other end. He said that before he left, an executive at Atari called him into his office and tried to talk him out of it, saying that Trower was risking his career working for “that little company in Seattle.” The executive said that Atari was backed by Warner Communications, a mega-corporation, and that he would have a more promising future if he stayed with the company. As we now know, Trower was onto something, and the Atari executive was full of it. I’m sure at the time it would’ve looked to most people like the executive was being sensible. Listening to stories like this, with the hindsight we now have, is really interesting, because it just goes to show that when you’re trying to negotiate your future, things in the present are not what they seem.

Interview #132 with Jerry Jewel, a co-founder of Sirius Software. Their software was not usually my cup of tea, though I liked “Repton,” which was a bit like Defender. This interview was really interesting to me, though, as it provided a window into what the computer industry was like at the time. Sirius had a distribution relationship with Apple Computer, and what he said about them was kind of shocking, given the history of Apple that I know. He also tells a shocking story about Jack Tramiel, who at the time was still the CEO of Commodore.

The ending to Sirius was tragic. I don’t know what it was, but it seemed like the company attracted not only some talented people, but also some shady characters. The company entered into business relationships with a series of the latter, which killed the company.

Interview #201 with Atari User Group Coordinator, Bob Brodie. He was hired by the company in about 1989. This was during the Tramiel era at Atari. I didn’t find the first half of the interview that interesting, but he got my attention in the second half when he talked about some details on how the Atari ST was created, and why Atari got out of the computer business in late 1993. I was really struck by this information, because I’ve never heard anyone talk about this before.

Soon after Jack Tramiel bought Atari in 1984, he and his associates met with Bill Gates. They actually talked to Gates about porting Windows to the Motorola 68000 architecture for the Atari ST! Brodie said that Windows 1.0 on Intel was completely done by that point. Gates said Microsoft was willing to do the port, but it would take 1-1/2 years to complete. That was too long of a time horizon for their release of the ST, so the Tramiels passed. They went to Digital Research, and used their GEM interface for the GUI, and a port of CP/M for the 68000; what Atari called “TOS” (“The Operating System”). From what I know of the history, Atari was trying to beat Commodore to market, since they were coming out with the Amiga. This decision to go with GEM, instead of Windows, would be kind of a fateful one.

Brodie said that the Tramiels ran Atari like a family business. Jack Tramiel’s three sons ran it. He doesn’t say when the following happened, but I’m thinking this would’ve been about ’92 or ’93. When Sam Tramiel’s daughter was preparing to go off to college, the university she was accepted at said she needed a computer, and it needed to be a Windows machine. Sam got her a Windows laptop, and he was so impressed with what it could do, he got one for himself as well. It was more capable, and its screen presentation looked better than what Atari had developed with their STacy and ST Book portable models. Antonio Salerno, Atari’s VP of Application Development, had already quit. Before heading out the door, he said to them that, “they’d lost the home computer war, and that Windows had already won.” After working with his Windows computer for a while, Sam probably realized that Salerno was right. Brodie said, “I think that’s what killed the computer business,” for Atari. “Sam went out shopping for his daughter, and for the first time, really got a good look at what the competition was.” I was flabbergasted to hear this! I had just assumed all those years ago that Atari was keeping its eye on its competition, but just didn’t have the engineering talent to keep up. From what Brodie said, they were asleep at the switch!

Atari had a couple next-generation “pizza box” computer models (referring to the form factor of their cases, similar to NeXT’s second-generation model) that were in the process of being developed, using the Motorola 68040, but the company cancelled those projects. There was no point in trying to develop them, it seems, because it was clear Atari was already behind. What I heard at the time was that Atari’s computer sales were flat as well, and that they were losing customers. After ’93, Atari focused on its portable and console video game business.

Interview #185 with Ted Kahn, who created the Atari Institute for Education Action Research, while Warner owned Atari. This was Antic’s best interview. Basically Kahn’s job with Atari was to make connections with non-profits, such as foundations, museums, and schools across the country, and internationally, to see if they could make good use of Atari computers. His job was to find non-profits with a mission that was compatible with Atari’s goals, and donate computers to them, for the purpose of getting Atari computers in front of people, as a way of marketing them to the masses. It wasn’t the same as Apple’s program to get Apple II computers into all the schools, but it was in a similar vein.

What really grabbed my attention is I couldn’t help but get the sneaking suspicion that I was touched by his work as a pre-teen. In 1981, two Atari computers were donated to my local library by some foundation whose name I can’t remember. Whether it had any connection to the Atari Institute, I have no idea, but the timing is a bit conspicuous, since it was at the same time that Kahn was doing his work for Atari.

He said that one of the institutions he worked with in 1981 was the Children’s Museum in Washington, D.C. My mom played a small part in helping to set up this museum in the late 1970s, while we lived in VA, when I was about 7 or 8 years old. She moved us out to CO in 1979, but I remember we went back to Washington, D.C. for some reason in the early 1980s, and we made a brief return visit to the museum. I saw a classroom there that had children sitting at Atari computers, working away on some graphics (that’s all I remember). I wondered how I could get in there to do what they were doing, since it looked pretty interesting, but you had to sign up for the class ahead of time, and we were only there for maybe a half-hour. This was the same classroom that Kahn worked with the museum to set up. It was really gratifying to hear the other end of that story!

Interview #207 with Tom R. Halfhill, former editor with Compute! Magazine. I had the pleasure of meeting Tom in 2009, when I just happened to be in the neighborhood, at the Computer History Museum in Mountain View, CA, for a summit on computer science education. I was intensely curious to get the inside story of what went on behind the scenes at the magazine, since it was my favorite when I was growing up in the 1980s. We sat down for lunch and hashed out the old times. This interview with Kevin Savetz covers some of the same ground that Tom and I discussed, but he also goes into some other background that we didn’t. You can read more about Compute! at Reminiscing, Part 3. You may be interested to read the comments after the article as well, since Tom and some other Compute! alumni left messages there.

A great story in this interview is how Charles Brannon, one of the editors at the magazine, wrote a word processor in machine language, called SpeedScript, that was ported and published in the magazine for all of their supported platforms. An advertiser in the magazine, called Quick Brown Fox, which sold a word processor by the same name for Commodore computers, complained to the magazine that what they were publishing was in effect a competing product, and selling it for a much lower price (the price of a single issue). They complained it was undercutting their sales. Tom said he listened to their harangue, and told them off, saying that SpeedScript was written in a couple months by an untrained programmer, and if people were liking it better than a commercial product that was developed by professionals, then they should be out of business! Quick Brown Fox stopped advertising in Compute! after that, but I thought Tom handled that correctly!

This is going to strike people as a rant, but it’s not. It’s a recommendation to the fields of software engineering, and by extension, computer science, at large, on how to elevate the discussion in software engineering. This post was inspired by a question I was asked to answer on Quora about whether object-oriented programming or functional programming offers better opportunities for modularizing software design into simpler units.

So, I’ll start off by saying that what software engineering typically calls “OOP” is not what I consider OOP. The discussion that this question was about is likely comparing abstract data types, in concept, with functional programming (FP). What they’re also probably talking about is how well each programming model handles data structures, how the different models implement modular functionality, and which is preferable for dealing with said data structures.

Debating stuff like this is so off from what the discussion could be about. It is a fact that if an OOP model is what is desired, one can create a better one in FP than what these so-called “OOP” proponents are likely talking about. It would take some work, but it could be done. It wouldn’t surprise me if there already are libraries for FP languages that do this.

The point is that the discussion about which programming model is better is being hampered by the ideas that are being discussed, and more broadly, by the goals. A modern, developed engineering discipline would understand this. Yes, both programming models are capable of decomposing tasks, but the whole discussion about which does it “better” is off the mark. It has a weak goal in mind.

I remember recently answering a question on Quora regarding a dispute between two people over which of two singers, who had passed away in recent years, was “better.” I said that you can’t begin to discuss which one was better on a reasonable basis until you look at what genres they were in. In this case, I said each singer used a different style. They weren’t trying to communicate the same things. They weren’t using the same techniques. They were in different genres, so there’s little point in comparing them, unless you’re talking about what style of music you like better. By analogy, each programming model has strengths and weaknesses relative to what you’re trying to accomplish, but in order to use them to best effect, you have to consider whether each is even a good fit architecturally for the system you’re trying to build. There may be different FP languages that are a better fit than others, or maybe none of them fit well. Likewise, for the procedural languages that these proponents are probably calling “OOP.” Calling one “better” than another is missing the point. It depends on what you’re trying to do.

Comparisons about programming models in software engineering tend to wind down to familiarity, at some point; which languages can function as a platform that a large pool of developers know how to use, because no one wants to be left holding the bag if developers decide to up and leave. I think this argument misses a larger problem: How do you replace the domain knowledge those developers had? The most valuable part of a developer’s knowledge base is their understanding of the intent of the software design; for example, how the target business for the system operates, and how that intersects with technical decisions that were made in the software they worked on. Usually, that knowledge is all in their heads. It’s hardly documented. It doesn’t matter that you can replace those developers with people with the same programming skills, because sure, they can read the code, but they don’t understand why it was designed the way it was, and without that, it’s going to be difficult for them to do much that’s meaningful with it. That knowledge is either with other people who are still working at the same business, and/or the ones who left, and either way, the new people are going to have to be brought up to speed to be productive, which likely means the people who are still there are going to have to take time away from their critical duties to train the new people. Productivity suffers even if the new people are experts in the required programming skills.

The problem with this approach, from a modern engineering perspective, goes back to the saying that if all someone knows how to use is a hammer, every problem looks like a nail to them. The problem is with the discipline itself; that this mentality dominates the thinking in it. And I must say, computer science has not been helping with this, either. Rather than exploring how to make the process of building a better-fit programming model easier, they’ve been teaching a strict set of programming models for the purpose of employing students in industry. I could go on about other nagging problems that are quite evident in the industry that they are ignoring.

I could be oversimplifying this, but my understanding is modern engineering has much more of a form-follows-function orientation, and it focuses on technological properties. It does not take a pre-engineered product as a given. It looks at its parts. It takes all of the requirements of a project into consideration, and then applies analysis techniques and this knowledge of technological properties to finding a solution. It focuses a lot of attention on architecture, trying to make efficient use of materials and labor, to control costs. This focus tends to make the scheduling and budgeting for the project more predictable, since the engineers are basing estimates on known constraints. They also operate on a principle that simpler models (but not too simple) fail less often, and are easier to maintain than overcomplicated ones. They use analysis tools that help them model the constraints of the design, again, using technological properties as the basis, taking cognitive load off of the engineers.

Another thing is they don’t think about “what’s easier” for the engineers. They think about “what’s easier” for the people who will be using what’s ultimately created, including engineers who will be maintaining it, at least within cost constraints, and reliability requirements. The engineers tasked with creating the end product are supposed to be doing the hard part of trying to find the architecture and materials that fit the requirements for the people who will be using it, and paying for it.

Clarifying this analogy, what I’m talking about when I say “architecture” is not, “What data structure/container should we use?” It’s more akin to the question, “Should we use OOP or FP,” but it’s more than that. It involves thinking about what relationships between information and semantics best suit the domain for which the system is being developed, so that software engineers can better express the design (hopefully as close to a “spec” as possible), and what computing engine design to use in processing that programming model, so it runs most efficiently. When I talk about “constraints of materials,” I’m analogizing that to hardware, and software runtimes, and what their speed and load capacity is, in terms of things like frequency of requests, and memory. In short, what I’m saying is that some language and VM/runtime design might be necessary for this analogy to hold.

What this could accomplish is ultimately documenting the process that’s needed—still in a formal language—and using that documentation to run the process. So, rather than requiring software engineers to understand the business process, they can instead focus on the description and semantics of the engine that allows the description of the process to be run.

What’s needed is thinking like, “What kind of system model do we need,” and an industry that supports that kind of thinking. It needs to be much more technically competent to do this. I know this is sounding like wishful thinking, since people in the field are always trying to address problems that are right in front of them, and there’s no time to think about this. Secondly, I’m sure it sounds like I’m saying that software engineers should be engaged in something that’s impossibly hard, since I’m sure many are thinking that bugs are bad enough in existing software systems, and now I’m talking about getting software engineers involved in developing the very languages that will be used to describe the processes. That sounds like I’m asking for more trouble.

I’m talking about taking software engineers out of the task of describing customer processes, and putting them to work on the syntactic and semantic engines that enable people familiar with the processes to describe them. Perhaps I’m wrong, but I think this reorientation would make hiring programming staff based on technical skills easier, in principle, since not as much business knowledge would be necessary to make them productive.

Thirdly, “What about development communities?” Useful technology cannot just stand on its own. It needs an industry, a market around it. I agree, but I think, as I already said, it needs a more technically competent industry around it, one that can think in terms of the engineering processes I’ve described.

It seems to me one reason the industry doesn’t focus on this is we’ve gotten so used to the idea that our languages and computing systems need to be complex, that they need to be like Swiss Army knives that can handle every conceivable need, because we seem to need those features in the systems that have already been implemented. The reality is the reason they’re complex is a) they have been built using semantic systems that are not well suited to the problem they’re trying to solve, and b) they’re really designed to be “catch all” systems that anticipate a wide variety of customer needs. So, the problem you’re trying to solve is but a subset of that. We’ve been coping with the “Swiss Army knife” designs of others for decades. What’s actually needed is a different knowledge set that eschews the features we don’t need for the projects we want to complete, that focuses on just the elements that are needed, with a practice that focuses on design, and its improvement.

Very few software engineers and computer scientists have had the experience of using a language that was tailored to the task they were working on. We’ve come to think that we need feature-rich languages and/or feature-rich libraries to finish projects. I say no. That is a habit, thinking that programming languages are communication protocols not just with computers, but with software engineers. What would be better is a semantic design scheme for semantic engines, having languages on top of them, in which the project can be spec’d out, and executed.

As things stand, what I’m talking about is impractical. It’s likely there are not enough software engineers around with the skills necessary to do what I’m describing to handle the demand for computational services. However, what’s been going on for ages in software engineering has mostly been a record of failure, with relatively few successes (the vast majority of software projects fail). Discussions, like the one described in the question that inspired this post, are not helping the problem. What’s needed is a different kind of discussion. I suggest using the topic scheme I’ve outlined here.

I’m saying that software engineering (SE) needs to take a look at what modern engineering disciplines do, and do its best to model that. CS needs to wonder what scientific discipline it can most emulate, which is what’s going to be needed if SE is going to improve. Both disciplines are stagnating, and are being surpassed by information technology management as a viable solution scheme for solving computational problems. However, that leads into a different problem, which I talked about 9 years ago in IT: You’re doing it completely wrong.

Vladislav Zorov, a Quora user, brought this to my attention. It is a worthy follow-up to Jerry King’s “The Art of Mathematics,” called Lockhart’s Lament. Lockhart really fleshes out what King was talking about. Both are worth your time. Lockhart’s lament reminds me a lot of a post I wrote, called The challenge of trying to get a real science of computing in our schools. I talked about an episode of South Park to illustrate this challenge, where a wrestling teacher is confronted with a pop culture in wrestling that’s “not real wrestling” (i.e., WWE), as an illustration of the challenge that computer scientists have in trying to get “the real thing” into school curricula. The wrestling teacher is continually frustrated that no one understands what real wrestling is, from the kids who are taking his class, to the people in the community, to the administrators in his school. There is a “the inmates have taken over the asylum” feeling to all of this, where “the real thing” has been pushed to the margins, and the pop culture has taken over. The people who see “the real thing,” and value it, are on the outside, looking in. Hardly anybody on the inside can understand what they’re complaining about, but some of them are most worried that nothing they’re doing seems to be enough to make a big dent in improving the lot of their students. Quite the conundrum. It looks idiotic, but it’s best not to dismiss it as such, because the health and welfare of our society is at stake.

Two issues come to mind from Lockhart’s lament (and the “sequel”). First, it seems that since we don’t have a better term for what’s called “math” in school, it’s difficult for a lot of people to disambiguate mathematics from the benefits that a relative few students ultimately derive from “math.” I think that’s what the critics hang their hat on: Even though they’ll acknowledge that what’s taught in school is not what Lockhart wishes it was, it does have some benefits for “students” (though I’d argue it’s relatively few of them), and this can be demonstrated, because we see every day that some number of students who take “math” go on to productive careers that use that skill. So, they will say, it can’t be said that what’s taught in school is worthless. Something of value is being transmitted. Though, I would encourage people to take a look at the backlash against requiring “math” in high school and college as a counterpoint to that notion.

Secondly, I think Lockhart’s critics have a good point in saying that it is physically impossible for the current school system, with the scale it has to accommodate, to do what he’s talking about. Maybe a handful of schools would be capable of doing it, by finding knowledgeable staff, and offering attractive salaries. I think Lockhart understands that. His point, that doesn’t seem to get through to his critics, is, “Look. What’s being taught in schools is not math, anyway! So, it’s not as if anyone would be missing out more than they are already.” I think that’s the sticking point between him and his critics. They think that if “math” is eliminated, and only real math is taught in a handful of schools (the capacity of the available talent), that a lot of otherwise talented students would be missing out on promising careers, which they could benefit from using “math.”

An implicit point that Lockhart is probably making is that real math has a hard time getting a foothold in the education system, because “math” has such a comprehensive lock on it. If someone offered to teach real math in our school system, they would be rejected, because their method of teaching would be so outside the established curriculum. That’s something his critics should think about. Something is seriously wrong with a system when “the real thing” doesn’t fit into it well, and is barred because of that.

I’ve talked along similar lines with others, and a persistent critic on this topic, who is a parent of children who are going through school, has told me something similar to a criticism Lockhart anticipated. It goes something like, “Not everyone is capable of doing what you’re talking about. They need to obtain certain basic skills, or else they will not be served well by the education they receive. Schools should just focus on that.” I understand this concern. What people who think about this worry about is that in our very competitive knowledge economy, people want assurances that their children will be able to make it. They don’t feel comfortable with an “airy-fairy, let’s be creative!” account of what they see as an essential skill. That’s leaving too much to chance. However, a persistent complaint I used to hear from employers (I assume this is still the case) is that they want people who can think creatively out of the box, and they don’t see enough of that in candidates. This is precisely what Lockhart is talking about (he doesn’t mention employers, though it’s the same concern, coming from a different angle). The only way we know of to cultivate creative thinkers is to get people in the practice of doing what’s necessary to be creative, and no one can give them a step-by-step guide on how to do that. Students have to go through the experience themselves, though of course adult educators will have a role in that.

A couple parts of Lockhart’s account that really resonated with me were where he showed how one can arrive at a proof for the area of a certain type of triangle, and where he talked about students figuring out imaginary problems for themselves, getting frustrated, trying and failing, collaborating, and finally working it out. What he described sounds so similar to what my experience was when I was first learning to program computers, when I was 12 years old, and beyond. I didn’t have a programming course to help me when I was first learning to do it. I did it on my own, and with the help of others who happened to be around me. That’s how I got comfortable with it. And none of it was about, “To solve this problem, you do it this way.” I can agree that would’ve been helpful in alleviating my frustration in some instances, but I think it would’ve risked denying me the opportunity to understand something about what was really going on while I was using a programming language. You see, what we find through exploration is that we often learn more than we bargained for, and that’s all to the good. That’s something we need to understand as students in order to get some value out of an educational experience.

By learning this way, we own what we learn, and as such, we also learn to respond to criticism of what we think we know. We come to understand that everything that’s been developed has been created by fallible human beings. We learn that we make mistakes in what we own as our ideas. That creates a sense of humility in us, that we don’t know everything, and that there are people who are smarter than us, and that there are people who know what we don’t know, and that there is knowledge that we will never know, because there is just so much of it out there, and ultimately, that there are things that nobody knows yet, not even the smartest among us. That usually doesn’t feel good, but it is only by developing that sense of humility, and responding to criticism well that we improve on what we own as our ideas. As we get comfortable with this way of learning, we learn to become good at exploring, and by doing that, we become really knowledgeable about what we learn. That’s what educators are really after, is it not, to create lifelong learners? Most valuable of all, I think, is learning this way creates independent thinkers, and indeed, people who can think out of the box, because you have a sense of what you know, and what you don’t know, and what you know is what you question, and try to correct, and what you don’t know is what you explore, and try to know. Furthermore, you have a sense of what other people know, and don’t know, because you develop a sense of what it actually means to know something! This is what I see Lockhart’s critics are missing the boat on: What you know is not what you’ve been told. What you know is what you have tried, experienced, analyzed, and criticized (wash, rinse, repeat).

On a related topic, Richard Feynman addressed something that should concern us re. thinking we know something because it’s what we’ve been told, vs. understanding what we know, and don’t know.

What seems to scare a lot of people is even after you’ve tried, experienced, analyzed, and criticized, the only answer you might come up with is, “I don’t know, and neither does anybody else.” That seems unacceptable. What we don’t know could hurt us. Well, yes, but that’s the reality we exist in. Isn’t it better to know that than to presume we know something we don’t?

The fundamental disconnect between what people think is valuable about education and what’s actually valuable in it is they think that to ensure that students understand something, it must be explicitly transmitted by teachers and instructional materials. It can’t just be left to chance in exploratory methods, because they might learn the wrong things, and/or they might not learn enough to come close to really mastering the subject. That notion is not to be dismissed out of hand, because that’s very possible. Some scaffolding is necessary to make it more likely that students will arrive at powerful ideas, since most ideas people come up with are bad. The real question is finding a balance of “just enough” scaffolding to allow as many students as possible to “get it,” and not “too much,” such that it ruins the learning experience. At this point, I think that’s more an art than a science, but I could be wrong.

I’m not suggesting just using a “blank page” approach, where students get no guidance from adults on what they’re doing, as many school systems have done (which they mislabeled “constructivism,” and which doesn’t work). I don’t think Lockhart is talking about that, either. I’m not suggesting autodidactic learning, nor is Lockhart. There is structure to what he is talking about, but it has an open-ended broadness to it. That’s part of what I think scares his critics. There is no sense of having a set of requirements. Lockhart would do well to talk about thresholds that he is aiming for with students. I think that would get across that he’s not willing to entertain lazy thinking. He tries to do that by talking about how students get frustrated in his scheme of imaginative work, and that they work through it, but he needs to be more explicit about it.

He admits in the “sequel” that his first article was a lament, not a proposal of what he wants to see happen. He wanted to point out the problem, not to provide an answer just yet.

The key point I want to make is that the perception that drives people to avoid taking a risk is in fact taking a big risk itself. It risks creating people, and therefore a society, that only know so much, that don’t know how to exceed the thresholds that are necessary to come up with big leaps that advance our society, and that may even lack the basic understanding necessary to retain the advances that have already been made. A conservative, incremental approach to existing failure will not do.

It’s been 5 years since I’ve written something on SICP–almost to the day! I recently provided some assistance to a reader of my blog on this exercise. In the process, I of course had to understand something about it. Like with Exercise 1.19, I didn’t think this one was written that clearly. It seemed like the authors assumed that the students were familiar with the Miller-Rabin test, or at least modular mathematics.

The exercise text made it sound like one could implement Miller-Rabin as a modification to Fermat’s Little Theorem (FLT). From providing my assistance, I found out that one could do it as a modification to FLT, though the implementation looked a bit convoluted. When I tried solving it myself, I felt more comfortable just ignoring the prior solutions for FLT in SICP, and completely rewriting the expmod routine. Along with that, I wrote some intermediate functions, which produced results that were ultimately passed to expmod. My own solution was iterative.
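As a rough illustration of the "modification to FLT" route, here is a minimal sketch in Python rather than Scheme. It is not the iterative solution described above. It assumes the usual reading of the exercise: keep the recursive, repeated-squaring shape of `expmod`, and have it return 0 as a signal whenever a squaring step uncovers a non-trivial square root of 1. The helper names `miller_rabin_round` and `is_probably_prime`, and the default of 20 rounds, are made up for this example.

```python
import random

def expmod(base, exp, m):
    """Compute base**exp % m by repeated squaring.

    Returns 0 as a signal if a non-trivial square root of 1 modulo m
    is discovered during a squaring step.
    """
    if exp == 0:
        return 1
    if exp % 2 == 0:
        half = expmod(base, exp // 2, m)
        square = (half * half) % m
        # half is a non-trivial root when half^2 = 1 (mod m)
        # but half itself is neither 1 nor m - 1.
        if square == 1 and half != 1 and half != m - 1:
            return 0
        return square  # a 0 signal from deeper calls stays 0 here
    return (base * expmod(base, exp - 1, m)) % m

def miller_rabin_round(n, a):
    """One round: a^(n-1) must be 1 mod n, with no signal raised."""
    return expmod(a, n - 1, n) == 1

def is_probably_prime(n, rounds=20):
    """Probabilistic primality check built from repeated rounds."""
    if n < 2:
        return False
    if n < 4:
        return True   # 2 and 3
    if n % 2 == 0:
        return False
    return all(miller_rabin_round(n, random.randrange(2, n - 1))
               for _ in range(rounds))
```

The Carmichael number 561 fools the plain Fermat test for every base that shares no factor with it, but with base 2 the check inside `expmod` catches a non-trivial root, so `miller_rabin_round(561, 2)` comes back false.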

I found these two Wikipedia articles, one on modular arithmetic, the other on Miller-Rabin, to be more helpful than the SICP exercise text in understanding how to implement it. Every article I found on Miller-Rabin, doing a Google search, discussed the algorithm for doing the test. It was still a bit of a challenge to translate the algorithm to Scheme code, since they assumed an imperative style, and there is a special case in the logic.

It seemed like when it came right down to it, the main challenge was understanding how to calculate factors of n – 1 (n being the candidate tested for primality), and how to handle the special case, while still remaining within the confines of what the exercise required (to do the exponentiation, use the modulus function (remainder), and test for non-trivial roots inside of expmod). Once that was worked out, the testing for non-trivial roots turned out to be pretty simple.
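To make those two pieces concrete, here is a hedged sketch of the textbook form of a single Miller-Rabin round, in Python rather than Scheme. It does not follow the exercise's constraint of doing everything inside expmod (it leans on Python's built-in `pow` for the modular exponentiation), and the function names are invented for this example. I am also assuming that the "special case" is the one the standard algorithm has, where the first power is already 1 or n – 1.

```python
def decompose(n_minus_1):
    """Write n - 1 as 2**s * d with d odd, and return (s, d)."""
    s, d = 0, n_minus_1
    while d % 2 == 0:
        s += 1
        d //= 2
    return s, d

def passes_round(n, a):
    """One Miller-Rabin round for odd n > 2 and a base a with 1 < a < n - 1.

    Returns False when a proves n composite, True otherwise.
    """
    s, d = decompose(n - 1)
    x = pow(a, d, n)              # a**d mod n
    if x == 1 or x == n - 1:
        return True               # special case: squaring can reveal nothing
    for _ in range(s - 1):
        x = (x * x) % n
        if x == n - 1:
            return True           # reaches 1 next only via the trivial root -1
        if x == 1:
            return False          # the previous x was a non-trivial root of 1
    return False                  # never reached n - 1, so n is composite
```

For example, `decompose(560)` gives `(4, 35)`, and `passes_round(561, 2)` is false. A single base can still be fooled: 2047 = 23 × 89 passes with base 2 but fails with base 3, which is why the test is repeated with several bases.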

If you’re looking for a list of primes to test against your solution, here are the first 1,000 primes.

I thought I’d share the video below, since it has some valuable insights on what computer science should be, and what education should be, generally. It’s all integrated together in this presentation, and indeed, one of the projects of education should be integrating computer science into it, but not with the explicit purpose to create more programmers for future jobs, though it could always be used for that by the students. Alan Kay presents a different definition of CS than is pursued in universities today. He refers to how Alan Perlis defined it (Perlis was the one to come up with the term “computer science”), which I’ll get to below.

This thinking about CS and education provides, among other things, a pathway toward reimagining how knowledge, literature, and art can be presented, organized, dissected, annotated, and shared in a way that’s more meaningful than can be achieved with old media. (For more on that, see my post “Getting beyond paper and linear media.”) As with what the late Carl Sagan often advocated, Kay’s presentation here advocates for a general model for the education of citizens, not just professional careerists.

Another reason I thought of sharing this is several years ago I remember hearing that Kay was working on rethinking computer science curriculum. What he presents is more about suggestion, “Start thinking about it this way.” (He takes a dim view of the concept of curriculum, as it suggests setting students on a rigid path of study with the intent to create minds with a cookie cutter, though he is in favor of classical liberal arts education, with the prescriptions that entails.)

As he says in the presentation, there’s a lot to develop in the practice of education in order to bring this into fruition.

This is from 2015:

I wrote notes below for some of the segments he talks about, just because I think his presentation bears some interpretation for many readers. He uses metaphors a lot.

The bicycle

This is an analogy for how an apparatus, or a subject, is typically modified for education. We take the optimized, or the adult version of something, and add compensators, which make it so that beginners can use it without falling all over themselves. It’s seen as easier to present it this way, and as a skill-building experience, where in order to learn how to do something, you need to use “the real thing.” Beginners can put on a good show of using this sort of apparatus, or modified subject, but the problem is that it doesn’t teach a beginner how to become good at really using the real thing at its full potential. The compensators become a crutch. He said a better idea is to use an apparatus, or a component of the subject, that allows a beginner to get a feel for how to use the thing in a way that gets across significant aspects of its potential, but without the optimizations, or the way adults use it, which make it too complicated for a beginner to use under their own power. In this case, a lowbike is better. This beginner apparatus, or component, is more like the first version of the thing you’re trying to teach. Bicycles were originally more like scooters, without pedals, or a chain, where you’d sit in the seat, push it along with your legs, kind of “running while sitting,” glide, and turn by shifting your weight, and turning into the turn. Once a beginner gets a feel for that, they can deal with the optimized version, or a scaled down adult version, with pedals and a chain to drive the bike, because all that adds is the ability to get more power out of it. It doesn’t change any of the fundamentals of how you use it.

This gets to an issue of pedagogy, that learners need to learn something in components, rather than dealing with the whole thing at once. Once they learn one capacity, they can move on to the next challenge in learning the “whole thing.”

Radiation vs. nouns

He put forward a proposition for his talk, which is that he’s mixing a bunch of ideas together, because they overlap. This is a good metaphor, because most of his talk is about what we are as human beings, and how society should situate and view education. Only a little of it is actually on computer science, but all of it is germane to talking about computer science education.

He also gives a little advice to education reformers. He pointed out what’s wrong with education as it is right now, but rather than cursing it, he said one should make a deliberate effort to “build a tribe” or coalition with those who are causing the problem, or are in the midst of the problem, and suggest ways to bring them into a dignified position, perhaps by sharing in their honors, or, as his example illustrated, their shame. I draw some of this from Plato’s Cave metaphor.

Cooperation and competition in society

I once heard Kay talk about this many years ago. He said that, culturally, modern corporations are like the ancient hunter-gatherers. They exploit the resources of an area for a while, and once it’s exhausted, they move on, and that as a culture, they have yet to learn about democracy, which requires more of a “settlement” mentality toward what they’re doing. Here, he used an agricultural metaphor to talk about a cooperative society that creates the wealth that is then used by competitive forces within it. What he means by this is that the true wealth is the knowledge that’s ultimately used to develop new products and services. It’s not all developed inside the marketplace. He doesn’t talk about this, but I will. Even though a significant part of the wealth (as he said, you can think of it as “potential energy”) is generated inside research labs, what research labs tend to lack is knowledge of what the members of society can actually understand of this developed knowledge. That’s where the competitive forces in society come in, because they understand this a lot better. They can negotiate between how much of the new knowledge to put into a product, and how much it will cost, to reach as many people as possible. This is what happened in the computer industry of the past.

I think I understand what he’s getting at with the agricultural metaphor, though perhaps I need to be filled in more. My understanding of what he means is that farmers don’t just want to reap a crop for one season. Their livelihood depends on maintaining fertility on their land. That requires not just exploiting what’s there season after season, or else you get the dust bowl. If instead, practices are modified to allow the existing land to become fertile again, or, in the case of hunter-gathering, aggressively managing the environment to create favorable grazing to attract game, then you can create a cycle of exploitation and care such that a group can stay in one area for a long time, without denying themselves any of the benefits of how they live. I think what he suggests is that if corporations would modify their behavior to a more settled, agricultural model, to use some of their profits to contribute to educating the society in which they exist, and to funding scientific research on a private basis, that would “regenerate the soil” for economic growth, which can then fuel more research, creating a cycle of renewal. No doubt the idea he’s presenting includes the businesses who would participate in doing this. They should be allowed to profit (“reap”) from what they “sow,” but the point is they’re not the only ones who can profit. Other players in the marketplace can also exploit the knowledge that’s generated, and profit as well. That’s what’s been done in the past with private research labs.

He attributes the lack of this to culture, of not realizing that the economic model that’s being used is not sustainable. Eventually, you use up the “soil,” and it becomes “infertile,” and “blows away,” and, in the case of hunter-gathering, the “good hunting grounds” are used up.

He makes a crucial point, though, that education is not just about jobs and competitiveness. It’s also about inculcating what citizenship really means. I’m sure if he was asked to drill down on this more, he would suggest a classical education for this, along with a modified math and science curriculum that would give students a sense of what those subjects really are like.

The sense I get is he’s advocating something close to the Andrew Carnegie model of corporate stewardship. After Carnegie made his money, he directed his philanthropy to building schools and libraries. Kay would just add science labs to that mix. (He mentions this later in his talk.)

I feel it necessary to note that not all for-profit entities would be able to participate in funding these cooperative activities, because their profit margins are so slim. I don’t think that’s what he’s expecting out of this.

What we are, to the best of our knowledge

He gives three views into human mental capacity: the way we perceive (theatrical), how much we can perceive at any moment in time (7 ± 2), and how we as educators should perceive ourselves psychologically and mentally (more primate and mammalian). This relates to neuroscience, and to some extent, evolutionary psychology.

The raison d’être of computer science

The primary purpose of computer science should be developing a science of systems in process, and that means all processes: mechanical processes, technological processes, social processes, biological processes, mental processes, etc. This relates to my earlier post, “Beginning the journey of becoming a computer scientist.” It’s partly about developing a new kind of mathematics for modeling processes. Alan Turing did it, with his Turing Machine concept, though he was trying to model the process of generating mathematical statements, and doing mathematical tests on them.

Shipping the design

Kay talks about how programmers today don’t have access to anything like what designers in other fields have, where they’re able to model their design, simulate it, and then have a machine fabricate a prototype that you can actually show and use.

I want to clarify this one point, because I don’t think he articulated it well (I found out about this because he expressed a similar thought on Quora, and I finally understood what he meant), but at one point he said that students at UCLA, one of the Top 10 CS schools, use “vi terminal emulators” (he sounds like he said “bi-terminal emulators”), emulating punched cards. What he meant by this was that students are logging in to a Unix/Linux system, bringing up an X-Windows terminal window, which is 80 columns wide (hence the punched card metaphor he used, because punch cards were also 80 columns wide), and using the “vi” text editor (or more likely “vim,” an improved clone of vi) to write their C++ code, the primary language they use.

I had an epiphany about this gulf between the tools that programmers use and the tools that engineers use, about 8 or 9 years ago. I was at a friend’s party, and there were a few mechanical engineers among the guests. I overheard a couple of them talking about the computer-aided design (CAD) software they were using. One talked about a “terrible” piece of CAD software he used at work. He said he had a much better CAD system at home, but it was incompatible with the data files that were used at work. As much as he would’ve loved to use it, he couldn’t. He said the system he used at work required him to group pieces of a design together as he was building the model, and once he did that, those pieces became inflexible. He couldn’t just redesign one piece of it, or separate one out individually from the model. He couldn’t move the pieces around on the model, and have them fit. Once they were grouped, that was it. It became this static thing. He said in order to redesign one piece of it, he had to take the entire model apart, piece by piece, redesign the part, and then redesign all the other pieces in the group to make the new part fit. He said he hated it, and as he talked about it, he acted like he was so disgusted with it, he wanted to throw it in the trash, like it was a piece of garbage. He said on his CAD system at home, it was wonderful, because he could separate a part from a model any time he wanted, and the system would adjust the model automatically to “make sense” out of the part being missing. He could redesign the part, and move it to a different part of the model, “attach it” somewhere, and the system would automatically adjust the model so that the new part would fit. The way he described it gave it a sense of fluidity. Whereas the system he used at work sounded rigid. 
It reminded me of the programming languages I had been using, where once relationships between entities were set up, it was really difficult to take pieces of it “out” and redesign them, because everything that depended on that piece would break once I redesigned it. I had to go around and redesign all the other entities that related to it to adjust to the redesign of the one piece.

I can’t remember how this worked, but another thing the engineer talked about was the system at work had some sort of “binding” mechanism that seemed to associate parts by “type,” and that this was also rigid, which reminded me a lot of the strong typing system in the languages I had been using. He said the system he had at home didn’t have this, and to him, it made more sense. Again, his description lent a sense of fluidity to the experience of using it. I thought, “My goodness! Why don’t programmers think like this? Why don’t they insist that the experience be like this guy’s CAD system at home?” For the first time in my career, I had a profound sense of just what Alan Kay talked about, here, that the computing field is retrograde. It has not advanced anywhere close to the kind of engineering that exists in other fields, where they would insist on this sort of experience. We accept so much less, whereas modern engineers have a hard time standing for it, because they know they have better options.

Don’t be fooled by large efforts below threshold

Before I begin this part, I want to share a crucial point that Kay makes, because it’s one of the big ones:

Think about what thinking is. Thinking is not being logical. Thinking is choosing the environment that you’re going to think in before you start rationalizing.

Kay had something very interesting, and startling, to say about the Apollo space program, using that as a metaphor for large reform initiatives in education generally. I recently happened upon a video of testimony he gave to a House committee on educational computing back in 1982, chaired by then-congressman Al Gore, and Kay talked about this same example back then. He said that the way the Apollo rockets were designed was a “hack.” They were not the best design for space travel, but it was the most expedient for the mission of getting to the Moon by the end of the 1960s. Here, in this presentation, he talks about how each complete rocket was the height of a 45-story building (1-1/2 football fields in length), most of it high explosives, with only a tiny capsule at the top that could fit 3 astronauts. This is not a model that could scale to what was needed for space travel.

It became this huge worldwide cultural event when NASA launched it, and landed men on the Moon, but Kay said it was really not a great accomplishment. I remember Rep. Gore said in jest, “The walls in this room are shaking!” The camera panned around a bit, showing pictures on the wall from NASA. How could he say such a thing?! This was the biggest cultural event of the century, perhaps in all of human history. He explained the same thing here: that the Apollo program didn’t advance space travel beyond the mission to the Moon. It was not technology that would get us beyond that, though in hindsight we can say technology like it enabled launching probes throughout the Solar System.

Now, what he means by “space travel,” I’m not sure. Is it manned missions to the outer planets, or to other star systems? Kay is someone who has always thought big. So, it’s possible he was thinking of interstellar travel. What he was talking about was the problem of propulsion, getting something powerful enough to make significant discoveries in space exploration possible. He said chemical propellant just doesn’t do it. It’s good enough for launching orbital vehicles around our planet, and launching probes, but that’s really it. The rest is just wasting time below threshold.

Another thing he explained is that large initiatives which don’t cross a meaningful threshold can be harmful to efforts to advance any endeavor, because large initiatives come with extended expectations that the investment will continue to be used, and they must be satisfied, or else there will be no cooperation in doing the initial effort. The participants will want their return on investment. He said that’s what happened with NASA. The ROI had to play out, but that ruined the program, because as that happened, people could see we weren’t advancing the state of the art that much in space travel, and the science that was being produced out of it was usually nothing to write home about. Eventually, we got what we now see: People are tired of it, and have no enthusiasm for it, because it set expectations so low.

What he was trying to do in his House committee testimony, and what he’s trying to do here, is provide some perspective that science offers, vs. our common sense notion of how “great” something is. You cannot get qualitative improvement in an endeavor without this perspective, because otherwise you have no idea what you’re dealing with, or what progress you’re building on, if any. Looking at it from a cultural perspective is not sufficient. Yes, the Moon landing was a cultural milestone, but not a scientific or engineering milestone, and that matters.

Modern science and engineering have a sense of thresholds, that there can come a point where some qualitative leap is made, a new perspective on what you’re doing is realized that is significantly better than what existed prior. He explains that once a threshold has been crossed, you can make small improvements which continue to build on that significant leap, and those improvements will stick. The effort won’t just crash back down into mediocrity, because you know something about what you have, and you value it. It’s a paradigm shift. It is so significant, you have little reason to go back to what you were doing before. From there, you can start to learn the limits of that new perspective, and at some point, make more qualitative leaps, crossing more thresholds.

“Problem-finding”/finding the goal vs. problem-solving

Problem solving begins with a current context, “We’re having a problem with X. How do we solve it?” Problem finding asks, “Do we even have a good idea of what the problem is?” Maybe the reason for the problems we’ve been seeing has to do with the fact that we haven’t solved a different problem we don’t know about yet. “Let’s spend time trying to find that.”

Another way of expressing this is a concept I’ve heard about from economists, called “opportunity cost,” which, in one context, gets across the idea that by implementing a regulation, it’s possible that better outcomes will be produced in certain categories of economic interactions, but it will also prevent certain opportunities from arising which may also be positive. The rub is these opportunities will not be known ahead of time, and will not be realized, because the regulation creates a barrier to entry that entrepreneurs and investors will find too high of a barrier to overcome. This concept is difficult to communicate to many laymen, because it sounds speculative. What this concept encourages people cognizant of it to do is to “consider the unseen,” to consider the possibilities that lie outside of what’s currently known. One can view “problem finding” in a similar way, not as a way of considering the unseen, but exploring it, and finding new knowledge that was previously unknown, and therefore unseen, and then reconsidering one’s notion of what the problem really is. It’s a way of expanding your knowledge base in a domain, with the key practice being that you’re not just exploring what’s already known. You’re exploring the unknown.

The story he tells about MacCready illustrates working with a good modeling system. He needed to be able to fail with his ideas a lot, before he found something that worked. So he needed a simple enough modeling system that he could fail in, where when he crashed with a particular model, it didn’t take a lot to analyze why it didn’t work, and it didn’t take a lot to put it back together differently, so he could try again.

He made another point about Xerox PARC, that it took years to find the goal, and it involved finding many other goals, and solving them in the interim. I’ve written about this history at “A history lesson on government R&D” Part 2 and Part 3. There, you can see the continuum he talks about, where ARPA/IPTO work led into Xerox PARC.

This video with Vishal Sikka and Alan Kay gives a brief illustration of this process, and what was produced out of it.

Erosion gullies

There are a couple metaphors he uses to talk about the lack of flexibility that develops in our minds the more we focus our efforts on coping, problem solving, and optimizing how we live and work in our current circumstances. One is erosion gullies. The other is the “monkey trap.”

Erosion gullies channel water along a particular path. They develop naturally as water erodes the land it flows across. These “gullies” seem to fit with what works for us, and/or what we’re used to. They develop into habits about how we see the world–beliefs, ideas which we just accept, and don’t question. They allow some variation in the path that’s followed, but they provide boundaries that don’t allow the water to go outside the gully (leaving aside the exception of floods, for the sake of argument). He uses this to talk about how “channels” develop in our minds that direct our thinking. The more we focus our thoughts in that direction, the “deeper” the gully gets. Keep at it too long, and the “gully” won’t allow us to see anything different than what we’re used to. He says that it may become inconceivable to think that you could climb out of it. Most everything inside the “gully” will be considered “right” thinking (no reason why), and anything outside of it will be considered “wrong” (no reason why), and even threatening. This is why he mentions that wars are fought over this. “We’re all in different erosion gullies.” They don’t meet anywhere, and my “right” is your “wrong,” and vice-versa. The differences are irreconcilable, because the idea of seeing outside of them is inconceivable.

He makes two points with this. One is that we have erosion gullies re. stories that we tell ourselves, and beliefs that we hold onto. Another is that we have erosion gullies even in our sensory perceptions that dictate what we see and don’t see. We can see things that don’t even exist, and typically do. He uses eyewitness testimony to illustrate this.

I think what he’s saying with it is we need to watch out for these “gullies.” They develop naturally, but it would be good if we had the flexibility to be able to eventually get out of our “gully,” and form a new “channel,” which I take is a metaphor for seeing the world differently than what we’re used to. We need a means for doing that, and what he proposes is science, since it questions what we believe, and tests our ideas. We can get around our beliefs, and thereby get out of our “gullies” to change our perspective. It doesn’t mean we abandon “gullies,” but just become aware that other “channels” (perspectives) are possible, and we can switch between them, to see better, and achieve better outcomes.

Regarding the “monkey trap,” he uses it as a metaphor for us getting on a single track, grasping for what we want, not realizing that the very act of being that focused, to the exclusion of all other possibilities, is not getting us anywhere. It’s a trap, and we’d benefit by not being so dogged in pursuing goals if they’re not getting us anywhere.

“Fast” vs. “slow”

He gets into some neuroscience that relates to how we perceive, what he called “fast” and “slow” response. You can train your mind through practice in how to use “fast” and “slow” for different activities, and they’re integral to our perception of what we’re doing, and our reactions to it, so that we don’t careen into a catastrophe, or miss important ideas in trying to deal with problems. He said that cognitive approaches to education deal with the “slow” systems, but not the “fast” ones, and it’s not enough to help students in really understanding a subject. As other forms of training inherently deal with the “fast” systems, educators need to think about how the “fast” systems respond to their subjects, and incorporate that into how they are taught. He anticipates this will require radically redesigning the pedagogy that’s typically used.

He says that the “fast” systems deal with the “atoms” of ideas that the “slow” system also deals with. By “atoms,” I take it he means fundamental, basic ideas or concepts for a subject. (I think of this as the “building blocks of molecules.”)

The way I take this is that the “slow” systems he’s talking about are what we use to work out hard problems. They’re what we use to sit down and ponder a problem for a while. The “fast” systems are what we use to recognize or spot possible patterns/answers quickly, a kind of quick, first-blush analysis that can make solving the problem easier. To use an example, you might be using “fast” systems now to read this text. You can do it without thinking about it. The “slow” systems are involved in interpreting what I’m saying, generating ideas that occur to you as you read it.

This is just me, but “fast” sounds like what we’d call “intuition,” because some of the thinking has already been done before we use the “slow” systems to solve the rest. It’s a thought process that takes place, and has already worked some things out, before we consciously engage in a thought process.

Science

This is the clearest expression I’ve heard Kay make about what science actually is, not what most people think it is. He’s talked about it before in other ways, but he just comes right out and says it in this presentation, and I hope people watching it really take it in, because I see too often that people take what they’ve been taught about what science is in school and keep reiterating it for the rest of their lives. This goes on not only with people who love following what scientists say, but also in our societal institutions that we happen to associate with science.

…[Francis] Bacon wrote a book called “The Novum Organum” in 1620, where he said, “Hey, look. Our brains are messed up. We have bad brains.” He called the ways of messing up “idols.” He said we get serious errors because of our genetics. We get serious errors because of the culture we’re in. We get serious errors because of the languages we use. They don’t represent what’s actually out there. We get serious errors from the way that academia hangs on to bad ideas, and teaches them over again. These are his four “idols.” Anyone ever read Bacon? He said we need something to get around our bad brains! A set of heuristics, is the term we’d use today.

What he called for was … science, because that’s what “Novum Organum,” the rest of the title, was: “A new way of dealing with knowledge.”

Science is not the knowledge, because knowledge is in this context. What science is is a negotiation between what’s out there and what we can represent.

This is the big idea. This is the idea they don’t teach in school. This is the idea we should be teaching. It’s one of the biggest ideas of all time.

It isn’t the knowledge. It’s the relationship, because what’s out there is only knowable by a phenomena that is being filtered in every possible way. We don’t even know if our brain is capable of representing the stuff.

So, to think about science as the truth is completely wrong! It’s not the right way to look at it. But if you think about it as a negotiation between the best you can do right now and stuff that’s out there, where you’re not completely sure, you’re in a very strong position.

Science has been the most important, powerful thought system humans have ever invented, because it gave up the idea of truth, and it substituted for it a thousand variations of false, some of which are incredibly powerful. This is the big idea.

So, if we’re going to think about computing, this is one way … of thinking about, “Wow! Computers!” They are representers. We can learn about representations. We can simulate ideas. We can get a very good–much better sense of dealing with thinking about these complexities.

“Getting there”

The last part demonstrates what I’ve seen with exploration. You start out thinking you’re going to go from Point A to Point B, but you take diversions, pathways that are interesting, but related to your initial search, because you find that it’s not a straight path from Point A to Point B. It’s not as straightforward as you thought. So, you try other ways of getting there. It is a kind of problem solving, but it’s really what Kay called “problem finding,” or finding the goal. In the process, the goal is to find a better way to get to the goal, and along the way, you find problems that are worth solving, that you didn’t anticipate at all when you first got going. In that process, you’ll find things you didn’t expect to learn, but which are really valuable to your knowledge base. In your pursuit of trying to find a better way to get to your destination, you might even get through a threshold, and find that your initial goal is no longer worth pursuing, but there are better goals to pursue in this new perception you’ve obtained.