Telling a computer what to do is actually quite simple when broken down.

Programming for all

Computers are ubiquitous in modern life. They offer us portals to information and entertainment, and they handle the complex tasks needed to keep many facets of modern society running smoothly. Chances are, there is not a single person in Ars' readership whose day-to-day existence doesn't rely on computers in one manner or another. Despite this, very few people know how computers actually do the things that they do. How does one go from what is really nothing more than a collection—a very large collection, mind you—of switches to the things we see powering the modern world?

We've arranged a civilization in which most crucial elements profoundly depend on science and technology. We have also arranged things so that almost no one understands science and technology. This is a prescription for disaster. We might get away with it for a while, but sooner or later this combustible mixture of ignorance and power is going to blow up in our faces.

—Carl Sagan

At their base, even though they run much of the world, computers are one thing: stupid. A computer knows nothing. Its brain is little more than a large collection of on/off switches. The fact that you can play video games, browse the Internet, and pump gas at a gas station is thanks to the programs the computers have been given by a human. In this article, we'll take a look at some of the basic concepts of computer programming: how a person teaches a computer something and how the ideas encapsulated in the program go from something we can understand to something a computer understands.

First, it needs to be said that programming is not some black art, something arcane that only the learned few may ever attempt. It is a method of communication whereby a person tells a computer what, exactly, they want it to do. Computers are picky and stupid, but they will indeed do exactly as they are told. Therefore, each program you write should be like an elegant recipe that anyone—including a computer—can follow. Ideally, each step in a program should be clearly described and, if it is complicated, broken down into smaller steps to remove all doubt about what is to happen.

Programming is about problem solving and thinking in a methodical manner. Like many other disciplines, it requires someone to be able to look at a complex problem and start whittling away at it, solving easier pieces first until the whole thing is tackled. It is this whittling away, this identification of smaller challenges and developing solutions to them, that requires the real talent—and creativity. If you set out to solve a problem no one has solved before, or to create a program no one has written before, all the book knowledge in the world won't give you the answer. A creative mind might.

Think of programming like cooking: you learn the basic rules and then you can let your creativity run wild. Few will go on to become the rock stars of the kitchen, but that's OK. The barrier to entry is not high, and once you are in, you are then limited only by your desire and creativity.

Thinking in base 10

Like many things in the modern world, to understand computer programming and how computers function in general, you have to start with numbers—integers, to be precise. The nineteenth-century mathematician Leopold Kronecker is credited with the phrase "God made the integers; all else is the work of man." Anyone intending to understand computer programming must start the journey by looking at simple integers.

In the modern world, we count using what is known as base 10; that is, there are ten distinct digits (0, 1, 2, 3, 4, 5, 6, 7, 8, 9). If we want to go higher than those, we prepend a digit representing the number of tens we have. For instance, the number 13 really tells us we have one of the tens and three of the ones. This structure can go on and on. If we write 586, we are really saying that we have five of the one hundreds, eight of the tens, and six of the ones.

Before looking at how we can translate that to the on and off switches of computers, let's take a step back and look at our description of a number in a bit more detail. Let's use 53,897 as our example. Following what we did above, we are really saying that this is five of the ten thousands, three of the one thousands, eight of the hundreds, nine of the tens, and seven of the ones. If we write this out in a more mathematical notation we would arrive at 5*10,000 + 3*1,000 + 8*100 + 9*10 + 7*1. This would evaluate to the number 53,897.

Looking at this summation a little more closely, one might notice that the numbers on the right-hand side of each multiplication (the multiplicands) are all powers of 10: 10,000 is 10^4, 1,000 is 10^3, 100 is 10^2, 10 is 10^1, and 1 is 10^0. Rewriting our expression would give 5*10^4 + 3*10^3 + 8*10^2 + 9*10^1 + 7*10^0. In the more general sense, any base-10 number AB,CDE (where A, B, C, D, and E are some arbitrary digits from 0 through 9) is really saying that we have the following summation: A*10^4 + B*10^3 + C*10^2 + D*10^1 + E*10^0.
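
To see that summation run, here is a minimal sketch in Python (the language is our choice for illustration; the article prescribes none) that rebuilds 53,897 from its digits:

    # Rebuild 53,897 digit by digit as digit * 10^power.
    digits = [5, 3, 8, 9, 7]

    value = 0
    for power, digit in enumerate(reversed(digits)):
        value += digit * 10 ** power   # 7*10^0 + 9*10^1 + 8*10^2 + 3*10^3 + 5*10^4

    print(value)   # 53897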

All your base... oh, nevermind

Taking the generalization one step further, we can note that the multiplicand is always 10 to some power when we are working in base 10. That is not a coincidence. If we want to work in any other base number system, we would write any number AB,CDE (where again, A, B, C, D, and E are valid digits in the base we are working with) as the following summation: A*base^4 + B*base^3 + C*base^2 + D*base^1 + E*base^0.

If we one day encounter an alien civilization that uses base five for everything and they threaten us by stating that they have 342 battle cruisers en route to Earth, we'd know that we would count that as 3*5^2 + 4*5^1 + 2*5^0, or 97 ships headed to destroy us. (We'll just send out the winner of the USS Defiant vs. the Millennium Falcon to take them on, no problem.)
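
For the skeptical, a short Python sketch (the helper function is hypothetical, written here just to mirror the general summation) evaluates a digit string in any base:

    # Evaluate a digit string in an arbitrary base via the general summation.
    def from_base(digit_string, base):
        value = 0
        for ch in digit_string:
            value = value * base + int(ch)   # shift one place left, add the digit
        return value

    print(from_base("342", 5))   # 97 battle cruisers
    print(int("342", 5))         # Python's built-in parser agrees: 97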

Now, there is no reason that we must count in base 10; it is simply convenient for us. Given our ten fingers and ten toes, it is only natural that we use a base-10 number system. Past societies have used others: the Babylonians, for instance, used a base-60 number system in which the same symbol represented 1, 60, and 3,600 (a similar ambiguity existed for 61, 3,601, and 3,660).

What does this all have to do with computers? Well, as I mentioned above, computers are nothing more than a big collection of switches. A monstrous set of on/off items—things with only two possible states. If we find counting with ten digits to be natural due to our biology, then it stands to reason that base-two numbers make the most sense for a computer that can only know on or off.

That means the binary number system. The binary number system is base two, and the only numbers available for counting are zero (0) and one (1). Even though you only have 0s and 1s, any number can be represented, as we just showed above. This gives us a way to tell a computer—a big collection of switches—all about numbers. Counting to 10 (base 10) in binary would go as follows: 0 (0), 1 (1), 10 (2), 11 (3), 100 (4), 101 (5), 110 (6), 111 (7), 1000 (8), 1001 (9), 1010 (10).
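
That counting table is easy to reproduce with a two-line Python sketch (again, the language choice is ours):

    # Print 0 through 10 in binary, with the base-10 value alongside.
    for n in range(11):
        print(format(n, "b"), "(" + str(n) + ")")   # ..., "1010 (10)"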

More generally, any binary number can be computed similarly to how we wrote the general sum for base-10 numbers. The number ABCDE in binary would represent the number A*2^4 + B*2^3 + C*2^2 + D*2^1 + E*2^0. As an example, the number 10011010 in binary would be 1*2^7 + 0*2^6 + 0*2^5 + 1*2^4 + 1*2^3 + 0*2^2 + 1*2^1 + 0*2^0, or 154 in base-10 terminology. (As a nerdy aside, such counting is the origin of the joke "There are only 10 kinds of people in this world. Those who understand binary and those who do not.")
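
Spelled out in code, the same positional sum (a quick Python check, nothing more):

    # The positional sum for 10011010, bit by bit.
    bits = "10011010"

    value = 0
    for power, bit in enumerate(reversed(bits)):
        value += int(bit) * 2 ** power   # 0*2^0 + 1*2^1 + ... + 1*2^7

    print(value)          # 154
    print(int(bits, 2))   # 154 again, via the built-in parser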

Before moving on, a quick aside about terminology. In computer parlance, a single 0 or 1 (a single on/off value) is termed a "bit." When you group eight bits together, you get a byte (four bits is called a nybble, but you won't see that around much anymore). Therefore, if you have a 1GB drive, that means you have 1024*1024*1024 bytes of information, or 8*1024*1024*1024 individual on/off storage spots to which information can be written. When someone in the computer world says something is 16-bit or 32-bit or 64-bit, they are talking about how many bits are available, by default, for representing an integer (or memory address location) on that computer.
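
The arithmetic in that paragraph is easy to verify (a Python sketch; the 16/32/64-bit figures shown are the largest unsigned values, one common interpretation of those word sizes):

    # A 1GB drive, counted in bytes and in individual bits.
    bytes_per_gb = 1024 * 1024 * 1024
    print(bytes_per_gb)       # 1073741824 bytes
    print(8 * bytes_per_gb)   # 8589934592 on/off storage spots

    # The largest unsigned integer an n-bit word can represent.
    for width in (16, 32, 64):
        print(width, "bits:", 2 ** width - 1)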

Early days

With a method to represent numbers, it became possible to use these collections of switches to do things with those numbers. Even with all the innovations in CPU design and technology that the past 50+ years have brought, at the heart of things, there are only a handful of things that a CPU can do with those numbers: simple arithmetic and logical operations. Just as importantly, it can also move bits/bytes around in memory.

Each of those simple operations could be given a numeric code, and a sequence of these operations could then be issued to a computer for it to execute—this sequence is termed a program. In the beginning, developing a program quite literally involved a programmer sitting in front of a large board of switches, setting some up (on/1) and some down (off/0) one at a time, thereby entering the computation sequence into the computer directly. Once all the operations in the sequence had been entered, the program could be executed.
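
To make the idea concrete, here is a toy machine in Python. The opcode numbers and the instruction set are invented for this sketch; real hardware defines its own encodings:

    # A program is just a sequence of numeric codes, executed in order.
    LOAD, ADD, PRINT, HALT = 1, 2, 3, 0

    program = [1, 40, 2, 2, 3, 0]   # LOAD 40; ADD 2; PRINT; HALT

    def run(program):
        accumulator = 0
        pc = 0                          # program counter: where we are in the sequence
        while True:
            op = program[pc]
            if op == LOAD:              # put the next number in the accumulator
                accumulator = program[pc + 1]; pc += 2
            elif op == ADD:             # add the next number to it
                accumulator += program[pc + 1]; pc += 2
            elif op == PRINT:           # show the result
                print(accumulator); pc += 1
            elif op == HALT:            # stop
                break

    run(program)   # prints 42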

After entering programs by literally flipping switches became tedious (this didn't take too long), hardware manufacturers and software developers came up with the concept of assembly languages. The hardware really only knows combinations of ones and zeros (the machine language), and these are difficult for humans to work with. Assembly languages gave a simple name to each possible operation that a given piece of hardware could carry out.

The Fortran Automatic Coding System for the IBM 704, the first Programmer's Reference Manual for Fortran.

Assembly languages are considered low-level programming languages, in that you are working directly with operations that the hardware can do rather than anything more complex or abstract built on top of these commands. For instance, a simple CPU could understand assembly for adding (ADD) two numbers, subtracting (SUB) two numbers, and moving memory between locations (MOV). The development of assembly allowed programmers to create software that was somewhat removed from the absolute binary nature of the machine and quite a bit easier to understand by reading.
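
At its simplest, an assembler is little more than a lookup table from those mnemonics to the numeric codes. A sketch, reusing the invented opcodes from the toy machine above (not any real instruction set):

    # Translate mnemonic instructions into the toy machine's numeric codes.
    OPCODES = {"LOAD": 1, "ADD": 2, "PRINT": 3, "HALT": 0}

    source = [("LOAD", 40), ("ADD", 2), ("PRINT",), ("HALT",)]

    machine_code = []
    for instruction in source:
        machine_code.append(OPCODES[instruction[0]])   # mnemonic -> number
        machine_code.extend(instruction[1:])           # operands pass through

    print(machine_code)   # [1, 40, 2, 2, 3, 0] -- the same program as before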

Even though this got programmers one step removed from the numeric codes that actually run the computer, the fact that each different piece of hardware from each manufacturer could have its own, non-overlapping assembly language meant it was still difficult to create complex software that carried out large-scale tasks in a general manner.

It wasn't until 1956 that John Backus came down from the mountain with a host of punch cards inscribed with "The FORTRAN Automatic Coding System for the IBM 704" (PDF). FORTRAN (FORmula TRANslator) was the first ever high-level language, a language in which scientists and engineers could interact with a computer without needing to know the down and dirty specifics of the commands the machine actually supported.

129 Reader Comments

Great article on computers and technology. I think Carl Sagan's statement is just as applicable to nuclear energy and all the associated environmental fear-mongering. Ars could have a series of such articles on technology and a realistic explanation of the mystique behind it.

A very nice introduction, though I personally think the part about the binary number system could have been skipped in favour of simply using the normal boolean "true"/"false".

This is because I have observed many people - including my peers at the time - having trouble with all of this when we were in school. I basically brute-forced through the math until it all became clear AFTER I took advanced math modules in logic and number theory, and my first thought was -

"Why did my O+A level math have high-order derivatives but nothing on number theory - i.e. functions, sets and rings - which would have helped me so much?"

Like my old math professor told us, "People struggle in math not because they are stupid but because there are gaping holes in their understanding". But I guess I am digressing.

Concisely explained. I'll have to show this to my niece. I've been showing her how to navigate BASH and have been a bit unsure how to explain everything that goes on beneath the command interpreter and shell.

She's been asking tough, but insightful questions. Maybe I shouldn't have got her that abacus as a tot.

Great article, but would it not have been better to use a real language so that beginning students could follow along? I understand that you want to avoid the silly language flame wars of course. Merry Christmas !

This is because I have observed many people - including my peers at the time - having trouble with all of this when we were in school. I basically brute-forced through the math until it all became clear AFTER I took advanced math modules in logic and number theory, and my first thought was -

I remember as an electronics tech student spending what was at least 2 weeks on base conversion exercises. The theory seeped in via osmosis and writer's cramp. I remember being able to successfully explain it to a girlfriend, which was always a sign that I had grasped the fundamentals. I could never do that with Laplace transforms, though.

I've got something of a handle on four high-level languages, and it still messes with me that when it gets down to it, processors are just massive collections of switches.

Just makes me happy that I don't have to be super hardcore and learn assembly, although at some point I might attempt taking on C++ or some other sort of lower-level language.

You definitely should give assembly a shot! Despite what you generally see people say, it's a very simple language (well, family of languages). The challenge is when you look at what it provides, you generally fool yourself into thinking 'there must be more!' when in reality that's it. Getting to the point where you can write simple toy programs in assembly is a decent enough stopping point, unless you feel the urge to see how far you can go with it.

And for a 'low level' language, I'd recommend C over C++. C is pretty much the lowest high level language (back in the day, it was a high level language!) still in common use today. It's not a one-to-one mapping between assembly and C, but it's pretty close for the most part. It's also a good playground to learn how pointers work and all the tricks you can do with them (and likely burn yourself countless times in the process, so you'll learn to appreciate all the nice things modern high level languages give you). While I'd recommend C++ over C for actual development (others would argue differently), C++ is quite a bit more complicated in the backend (vtables and all that!), making it much more difficult to go C++ <-> assembly than C <-> assembly.

And while we're on the you-should-learn front... learning a functional programming language is always good for opening your eyes to a different perspective than the OOP and imperative/procedural approaches. Learning the pros and cons of your tools (paradigms) is always good.

Of course, all this just emphasizes the ultimate point that learning more, new, and different things is always a good idea when you're a programmer.

"Why did my O+A level math have high-order derivatives but nothing on number theory - i.e. functions, sets and rings - which would have helped me so much?"

This is a problem that goes all the way down to kindergarten/primary school - teaching people how to count and understand numbers properly. (The most basic being knowing to count from 0 to 9, not 1 to 10 - and starting with 0, not 1 is fundamental for understanding every basic numerical system, including binary/hexadecimal etc...).

I remember when I was a first-year grad student (1960), Martin Schwarzschild handed me a Fortran manual one afternoon and asked me to write a program for numerical integration of some diff equations. "You will have four hours on the IBM 360 next week to run the program."

Though yes, computers execute binary programs, I think the digression into base 10 and base 2 was not really necessary.

Yeah - sometimes I do bit-level arithmetic - but only after I decide there's a performance reason to do so - which is to say, incredibly rarely.

And no, while programming is not a black art, the set of people who are able to do it even at a basic level of competence is surprisingly small. It's not really even an intelligence thing. I've known a number of very intelligent people who couldn't program to save their souls.

Programming just appears to require an odd balance of creativity and structured thought that's rare.

Disk drive sizes use salesman's math: KB=10^3, MB=10^6, GB=10^9. Usually there is a footnote in very fine print somewhere on the package explaining that the stated size differs from the actual size, with a very strange-looking decimal number for the actual size. The number looks strange because the sector size is usually a power of 2 (256, 512, 1024) and the actual size of the disk is the sector size * the number of sectors. This is rounded off to a neat power-of-10 number that is then printed in huge letters as the size of the disk, with a disclaimer that after formatting the operating system will report a smaller size. This difference is usually blamed on OS overhead, but in reality it is due to the difference between the salesman's base-10 notation and the computer industry's base-2 notation.

There have been exceptions over the years. The famous MS-DOS formatted 1.44MB floppy was a mixture of these notations: 1,440 (base 10) kilobytes of 1,024 (2^10) bytes each. And the Commodore 8x50 drives had stated capacities of 512kB and 1MB per floppy (they used the common 360kB MS-DOS floppy) with an actual capacity a little bit greater than those measures.
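
A quick Python check of that gap (the 500GB drive here is a hypothetical example):

    # A drive marketed in powers of 10 vs. an OS counting in powers of 2.
    marketed = 500 * 10**9      # "500GB" in salesman's math
    print(marketed / 2**30)     # about 465.66 -- the familiar shortfall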

Computers are not limited to binary switches. Binary is the most common simply because it is the easiest to use with current electronics. The first programmable computer design used mechanical logic and was a base-10 computer that used punch cards to load programs and enter data (Babbage's Analytical Engine). The linked content includes Analytical Engine emulators if you would like to try your hand at programming the earliest known digital computer design.

I've got something of a handle on four high-level languages, and it still messes with me that when it gets down to it, processors are just massive collections of switches.

Just makes me happy that I don't have to be super hardcore and learn assembly, although at some point I might attempt taking on C++ or some other sort of lower-level language.

You don't really understand a computer until you learn pointers. At the heart of it, programming is just moving memory around. C is much cleaner and better for understanding pointers, so I would suggest you dive right in.

Oh and there's nothing particularly hardcore about assembly. It's just a different level of abstraction from object-oriented structures. In fact, programming in assembly makes you aware of the amount of bloat embedded in object-oriented design.

And no, while programming is not a black art, the set of people who are able to do it even at a basic level of competence is surprisingly small. It's not really even an intelligence thing. I've known a number of very intelligent people who couldn't program to save their souls.

Programming just appears to require an odd balance of creativity and structured thought that's rare.

It definitely requires an aptitude, and you have to enjoy spending a lot of time in your own head. But then, I don't have the aptitude for fixing cars, or designing circuits, or stuff like that.

Eric Raymond claimed that most hackers fall into the INTP/INTJ personality types on the Myers-Briggs spectrum, which are relatively uncommon in the general population. I can believe that, based on what I know about my fellow developers and some classmates. I went through the program with people who were bright, motivated, and loved working with computers, but could never really grok writing code. There were only really a few people who truly seemed to get it, and these were the people who'd be in the computer lab all night hacking out stupid little ASCII-art games (VT220 terminals attached to a VAX/VMS cluster) in C or Fortran.

It definitely requires an aptitude, and you have to enjoy spending a lot of time in your own head. But then, I don't have the aptitude for fixing cars, or designing circuits, or stuff like that.

Eric Raymond claimed that most hackers fall into the INTP/INTJ personality types on the Myers-Briggs spectrum, which are relatively uncommon in the general population. I can believe that, based on what I know about my fellow developers and some classmates. I went through the program with people who were bright, motivated, and loved working with computers, but could never really grok writing code. There were only really a few people who truly seemed to get it, and these were the people who'd be in the computer lab all night hacking out stupid little ASCII-art games (VT220 terminals attached to a VAX/VMS cluster) in C or Fortran.

I think I would even go so far as to say that programming isn't all that amenable to teaching. Yes, you can teach syntax, algorithms and theory, but either somebody gets it, or they don't, and you can't teach that.

I just had to create an account to comment after reading this really nice article. Lately I've been trying to learn how to program by myself. A year ago I read a book about Python to see how it works. At first it felt like an enormous amount of information to store - like all this stuff about lists, strings, floats, functions, etc. In some ways, just understanding many of these areas in programming took quite a long time for me, and I lost a bit of confidence that I could really do programming, and decided to walk away from it.

Still, I was intrigued about trying to code, but wondered how to go about learning it. Months later, I decided on a different language that was more flexible, like Perl, and instead tried to read along, type my own examples, and run programs. I think I got a bit better, but I kept getting headaches over understanding it, so I walked away from it.

Then just a week ago, I learned of a site that teaches newbies how to code by setting simple tasks up for people to perform. So far, I'm starting to understand programming better, and found Ruby to be the language I like of the ones they had self paced courses for.

From this, I guess everyone has their own way of learning. Perhaps I'm attuned to interactive media - so the interactive website helped me more than merely reading books. If you're passionate about learning to code, try a variety of ways to do it and don't give up!

So far, I pretty much understand the programs of "Magic Happy Fun" - the syntax looks pretty similar to the languages I tried above. Although, for many newcomers, the whole last page has a huge set of information already... But I do like how this goofy language separates having to learn all the precise _syntax and punctuation_ in programming from the actual _concept_ of it. Who knows, maybe it was from seeing multiple languages that I understood its concepts...

I've got something of a handle on four high-level languages, and it still messes with me that when it gets down to it, processors are just massive collections of switches.

Just makes me happy that I don't have to be super hardcore and learn assembly, although at some point I might attempt taking on C++ or some other sort of lower-level language.

I would say that I really didn't understand programming until I learned assembly. Understanding how code and data are loaded, how program execution really works, register manipulation, interrupts, etc. An example of what knowing assembly might shed light on is buffer overflows. Everyone knows that a buffer overflow is when data is written to a section of memory that exceeds the size of the destination. But the interesting part is how that can cause code to be executed. Maybe it overwrote a function pointer, maybe the return address of a stack frame. These are things that understanding, say, PHP, Java, or C# will not give you low-level insight into.

Man, I never can get myself to sit down and code anything start to finish.

I've decided to just take the plunge and jump right into the LWJGL from an introductory Java course (which actually requires two prerequisite programming courses at my college, so it's not so much introductory programming as much as it is "You don't know Java, but you know something else") and hope I can get something on screen before getting distracted. I think my goal's going to be to display a basic 3D polygon with a single light source with camera controls bound to WASD.

Then I'll implement a Newtonian physics system, let the user change the cube's materials, put the cube inside a bigger cube, allow the user to strike the cube with the mouse pointer with an arbitrary amount of force the user keys in, and by the time I finish trying to code that the project will be completely broken or I'll have something reusable elsewhere!

Interesting article. Definitely aiming low for a lot of Ars' typical readers, I'd guess. I'm a little puzzled about the time spent explaining the binary number system. There really wasn't anything later in this article that got anywhere close to needing that information. Offhand, I don't see the series heading to where it would really need to make use of that information either.

Carl Sagan's quote definitely rings true and probably applies to most everything within reach of anyone reading this article, and I'm not just talking about computers showing up in everything. Just think about plastics and how much goes into making them: gathering the raw material, purifying and processing it, designing and building the molds for the plastic part, manufacturing it, assembling it. Every one of these has a dozen steps or more and is often isolated from the next step, so that no one knows all the steps, and one person probably couldn't know them all, because there is too much to learn. Our society truly is built upon the knowledge base built up by our predecessors and only survives based on the communal knowledge of the whole.

Your article had a few typos, which is fine, but also a few kludgy statements with too many parentheses, pop-culture references, and digressions. If this is meant to be a serious introduction -- and I'm not sure what else it could be -- it would be better served without fluff, and with absolutely clear writing. That's what you're essentially trying to teach, after all.

Want to echo first commenter on saying I wouldn't mind a series of articles in such a vein.

I do like what you have said here. But from my too many years of experience, I would say that the single thing that most differentiates a programmer from a non-programmer is this:

If and only if A, then B.

That's it. Iff A, then B. The idea that B happens only if A, and stringing that together into C, D, E, is the single most important thing a good programmer can know. It seems trivial, but most people do not have this skill, and will try to leap right from A to E. And they will fail. This is not a natural way of thinking. We're goddamn freaks!

I do see that this is what you're aiming at, but never quite state it explicitly.

Ummmmm, it's a bit of a jump from MHF to recommending SICP and LtU as further reading. Just sayin'!

Other than that, I whole-heartedly support the idea of sites like Ars explaining programming to beginners. Back in the day, we used to have computer magazines that did this, which is how I learned to program, but these days there doesn't really seem to be an equivalent where somebody interested in technology can serendipitously bump into an article explaining how to program.

It would be nice to see some articles on higher-level stuff too, though, for those "professional" programmers that never went through a formal CS program and so haven't learned how a compiler actually works, for example; this type of knowledge makes the difference between being an average programmer and a highly productive guru!

I've been trying to learn programming for a while myself. I've worked in IT since forever, and beyond very basic scripting, I know nothing. Sometimes I face situations where I wish I could program, as it would save me lots of time and make my boss happier.

I've tried some online resources, but I always get stuck. Results take too long to come; sometimes I even fail the basic exercises - the compiler or programming environment returns an unexpected error, etc. - and I can't carry on.

I have good feelings about the series, let's see.

In the meantime, does anyone have any suggestions on how to begin? I'd like to learn C++, but if you have any advice on how to begin, maybe some other, simpler language where I could grab the basics?

I would say that I really didn't understand programming until I learned assembly. Understanding how code and data are loaded, how program execution really works, register manipulation, interrupts, etc.

I agree 100%, Heap. I commented on an article here called 'Zen of programming' (or something like that), in which I stated that programming zen is knowing assembly. One programmer after another commented that zen is knowing your current language to its fullest. Yet it does not matter what language you use; it is all going to compile into machine-readable object code, and if one does not know how the compiler took the high-level code and compiled it into machine code, and if one does not understand that machine code, then one will never reach programming zen, because they really have no clue how computers process data. They only know how their particular language does it, not the computer. And that language is usually very, very far away and isolated from the processor and/or direct memory access.

Once you learn assembly, you not only learn how the processor works, but you almost instantly understand how ALL languages work at once. If not, then write some code in a high-level language, compile it, disassemble it, and compare it with your source. You will see how a language creates a list of assembly instructions out of your high-level code. And there will be one shining detail about the generated disassembly instructions... all languages convert to them.

As the author stated, there are no 'do all' high level programming languages. The only language that does everything you would ever want any language to do for you... is assembly. Learn it, and consider yourself enlightened.

Just makes me happy that I don't have to be super hardcore and learn assembly,

My first computer was a Sinclair Spectrum - awesomely cheap and pretty powerful for its time. The most wonderful aspect of it was that it was, by current standards, insanely simple. I worked my way through a 236-page book that broke down the entire operating system's machine code. It was an incredible feeling to know that I had a grasp of how every aspect of the machine worked at the basic code level - well worth the experience.

Unfortunately the vast majority of current OSes are orders of magnitude more complex and learning the source for a Linux kernel would be a significantly harder undertaking (let alone the code for the rest of a distro stacked on top), which is a pity in some ways.

Still, don't let that put you off assembler, which isn't really that hard at all (especially not compared to the complex operations available in high-level languages). Write short modules with plenty of clear comments and it's really pretty easy to write and understand.

Love the image! I keep trying to tell people it's like Legos - you have some bricks and you just have to put them together the right way. Hopefully, future generations have a basic competence in programming, even if it's not their primary occupation.

I would have loved to have been able to consult an online programming resource back in the day. Of course, those were the days of the BBC, Sinclair ZX81, Acorn Electron, Spectrum, and Commodore 64, so I learnt by typing in programs from early computer magazines, complete with typos, theirs or mine!

I'm curious to see how much I remembered and how much I can still learn.

Matt Ford / Matt is a contributing writer at Ars Technica, focusing on physics, astronomy, chemistry, mathematics, and engineering. When he's not writing, he works on realtime models of large-scale engineering systems.