When it comes to processor architecture, we still don't have clear agreement on which design philosophies should be followed. How do you make a faster general-purpose processor? That is a question about architecture.

Processor perfection

If you want to start a really good argument among a group of hardware enthusiasts, just mention that such-and-such a processor is better than some other!

When it comes to processor architecture we don’t even have a clear agreement on what sort of design philosophies should be followed to produce a good one. What we do have is a collection of techniques and approaches that all promise to deliver processor perfection.

In the early days of computer design the big problem was simply that any sort of processor took so much electronics to build that the main concern was keeping things simple. Even when integrated circuits became possible it was still difficult to achieve the sort of component density needed to implement a complete processor on one chip.

As a result the early microprocessors were very primitive, even by the standards of computer design of the day. This is the reason they were called "microcontrollers" or "microprocessors" - even the idea that they could be used to create a small computer system was thought to be slightly silly.

The very first processor design philosophy was just the simple idea that more is better. Designers attempted to make a processor do more at each step and tried to make each step take less and less time.

Most processors are “synchronous” – that is, they use a clock to time when instructions occur. There are asynchronous designs, where different things occur at different times and rates across the processor, but so far these are mostly experimental systems.

At its simplest, a synchronous processor carries out one instruction (or at least the same number of instructions) per clock cycle. Hence you can get more processor power simply by increasing the clock frequency.

Initially processors operated at 1MHz (i.e. 1 million pulses per second) but over the years clock rates have shot up to 1GHz (i.e. 1 thousand million pulses per second) or more. What this means is that, if nothing else had changed, the processors we use today would be 1000 times faster than the ones we used back in the 1980s.

It is worth thinking about these figures for a moment. Electronics becomes more difficult to design as the frequency it handles increases, and much above 1MHz the signals start to radiate as radio waves. For a sense of scale, 1MHz is classed as medium frequency, 1GHz is UHF and is used for TV signals, and around 2.4GHz is the band used by WiFi and microwave ovens. So building a computer that runs at 1GHz and above gets increasingly difficult as the clock rate goes up.

It is still true that increasing the clock rate is one of the most effective ways of speeding things up – but it’s not the most interesting. Recently the effort to increase clock rates has run out of steam, so much so that we can no longer rely on simple clock speed increases to keep our bloated software running faster each year.

CISC and RISC

So the really important question is -

How can you make a processor faster without increasing the clock speed?

The most obvious way is to increase the amount done per clock pulse. This is so obvious that it’s what happened to processors without anyone really deciding that it was the best thing to do!

You could say that

performance = clock speed x instructions per clock pulse

Over time processors supported more and more instructions that did more and more.

For example, early processors only supported addition and subtraction instructions and to multiply and divide you had to write a small program that implemented general arithmetic using nothing but addition and subtraction. Of course today’s processors have special numerical hardware built into them that can do high precision arithmetic in a single operation. This is an example of each instruction doing more and implementing what used to be multiple instructions within one instruction.

More, bigger, complex instructions make the processor do more per clock pulse.

How could such an approach be wrong?

Surely it must be better to have a powerful multiply instruction than to have to create one from feeble addition instructions?

Well, to a great extent the obviousness of this argument is based on a misunderstanding of what computers actually do.

Back in the early 1970s John Cocke at the IBM research labs analysed exactly which instructions a processor used most often. He discovered that of the, say, 200 instructions a processor might support, only 10 or so were used at all often. In fact these 10 instructions accounted for over two thirds of the processor’s time.

Later this became enshrined in the 80/20 rule – 80% of the work is done by 20% of the instructions.

What this implied was that, by making these core instructions work as fast as possible, the processor could be greatly speeded up. In fact, given that only 10 or so simple instructions were really used, why bother worrying about the rest!

This gave rise to the idea of a RISC, or Reduced Instruction Set Computer, which implemented a very small set of instructions as efficiently as possible.

The simple instructions did very little, but they could be executed at a very fast rate! This is almost the basic principle of a computer, which does very little, very quickly!

The idea is a little more subtle than it is usually painted. John Cocke's idea was really not to increase the complexity of a machine unless an instruction was used often enough to provide a payback. That is, it isn't that short, simple instructions are better on principle; it is more that you should tailor the instruction set of a computer so that the instructions most used are the best implemented.