An embedded system is a special-purpose computer system built into a larger device. An embedded system is typically required to meet very different requirements than a general-purpose personal computer.

Two major areas of differences are cost and power consumption. Since many embedded systems are produced in the tens of thousands to millions of units range, reducing cost is a major concern. Embedded systems often use a (relatively) slow processor and small memory size to cut costs.

The slowness is not just clock speed. The whole architecture of the computer is often intentionally simplified and degraded to lower costs. For example, embedded systems often use peripherals controlled by synchronous serial interfaces, which are ten to hundreds of times slower than comparable peripherals used in PCs.

Programs on an embedded system often must run with real-time constraints.
Usually there is no disk drive, operating system, keyboard or screen.

There are many different CPU architectures used in embedded designs. This in contrast to the desktop computer market, which as of this writing (2003) is limited to just a few competing architectures, chiefly Intel's x86, and the Apple/Motorola/IBM PowerPC, used in the Apple Macintosh.

One common configuration for embedded systems is the system on a chip, an application-specific integrated circuit, for which the CPU was purchased as intellectual property to add to the IC's design.

Debugging is usually performed with an in-circuit emulator, or some type of debugger that can interrupt the microcontroller's internal microcode.

The microcode interrupt lets the debugger operate in hardware in which only the CPU works. The CPU-based debugger can be used to test and debug the electronics of the computer from the viewpoint of the CPU. This feature was pioneered on the PDP-11.

Developers should insist on debugging which shows the high-level language, with breakpoints and single-stepping, because these features are widely available. Also, developers should write and use simple logging facilities to debug sequences of real-time events.

PC or mainframe programmers first encountering this sort of programming often become confused about design priorotoes and acceptable methods. Mentoring, code-reviews and egoless programming are recommended.

At the project's inception, the apollo guidance computer was considered the riskiest item in the apollo project.

The first mass-produced embedded system was the guidance computer for the minuteman missile. It also used integrated circuits, and was the first volume user of them. Without this program, integrated circuits might never have reached a usable price-point.

The crucial design features of the minuteman computer were that its guidance algorithm could be reprogrammed later in the program, to make the missile more accurate, and the computer could also test the missile, saving cable and connector weight.

All embedded systems have start-up code. Usually it disables interrupts, sets up the electronics, tests the computer (RAM, CPU and program), and then starts the application code. Many embedded systems recover from short-term power failures by skipping the self-tests if the software can prove they were done recently. Restart times under a 1/10 of a second are commonplace.

Many designers have found a software-controlled light useful to indicate errors. One common way to handle it is to have the electronics turn it off (which looks broken) at reset. The software turns it on at the first opportunity to prove the light works. After that, the code blinks it during normal operation, and maybe in patterns for errors. This reassures many users and technicians.

In this design, the software simply has a loop. The loop calls subroutines. Each subroutine manages a part of the hardware or software. Interrupts generally set flags, or update counters that are read by the rest of the software.

A simple API disables and enables interrupts. Done right, it handles nested calls in nested subroutines, and restores the preceding interrupt state in the outermost enable. This is one of the simplest methods of creating an exokernel.

Typically, there's some sort of subroutine in the loop to manage a list of software timers, using a periodic real time interrupt. When a timer expires, an associated subroutine is run, or flag is set.

Any expected hardware event should be backed-up with a software timer. Hardware events fail about once in a trillion times. That's about once a year with modern hardware. With a million mass-produced devices, leaving out a software timer is a business disaster.

State machines are implemented with a function-pointer per state-machine (in C++, C or assembly, anyway). A change of state stores a different function into the pointer. The function pointer is executed every time the loop runs.

Many designers recommend reading each IO device once per loop, and storing the result so the logic acts on consistent values.

Many designers prefer to design their state machines to check only one or two things per state. Usually this is a hardware event, and a software timer.

Designers recommend that hierarchical state machines should run the lower-level state machines before the higher, so the higher run with accurate information.

Complex functions like internal combustion controls are often handled with multi-dimensional tables. Instead of complex calculations, the code looks up the values. The software can interpolate between entries, to keep the tables small and cheap.

Some designers keep a utility program to turn data files into code, so that they can include any kind of data in a program.

Most designers also have utility programs to add a checksum or CRC to a program, so it can check its program data before executing it.

One major weakness of this system is that it does not guarantee a time to respond to any particular hardware event. Another is that it can become complex to add new features.

The strength is that it's simple, and on small pieces of software the loop is usually so fast that nobody cares that it's unpredictable.

Another advantage is that this system guarantees that the software will run. There's no mysterious operating system to blame for bad behavior.

Careful coding can easily assure that nothing disables interrupts for long. Thus interrupt code can run at very precise timings.

This system is very similar to the above, except that the loop is hidden in an API. One defines a series of tasks, and each task gets its own subroutine stack. Then, when a task is idle, it calls an idle routine (usually called "pause", "wait" or etc.).

An architecture with similar properties is to have an event queue, and have a loop that removes events and calls subroutines based on a field in the queue-entry.

The advantages and disadvantages are very similar to the control loop, except that adding new software is easier. One simply writes a new task, or adds to the queue-interpreter.

Take any of the above systems, but add a timer system that runs subroutines from a timer interrupt. This adds completely new capabilities to the system. FOr the first time, the timer routines can occur at a guaranteed time.

Also, for the first time, the code can step on its own data structures at unexpected times. The timer routines must be treated with the same care as interrupt routines.

Take the above nonpreemptive task system, and run it from a preemptive timer or other interrupts.

Suddenly the system is quite different. Any piece of task code can damage the data of another task- they must be precisely separated. Access to shared data must be rigidly controlled, with message queues or semaphores.

Often, at this stage, the developing organization buys a real-time operating system. This can be a wise decision if the organization lacks people with the skills to write one, or if the port of the operating system to the hardware will be used in several products. Otherwise, be aware that it usually adds six to eight weeks to the schedule, and forever after programmers can blame delays on it.

These are popular for embedded projects that have no systems budget. In the opinion of at least one author of this article, they are usually a mistake. Here's the logic.

Operating systems are specially-packaged libraries of reusable code. If the code does something useful, the designer saves time and money. If not, it's worthless.

Operating systems for business systems lack interfaces to embedded hardware. For example, if one uses Linux to write a motor controller or telephone switch, most of the real control operations end up as numbered functions in an IOCTL call. Meanwhile, the normal read, write, fseek, interface is purposeless. So the operating system actually interferes with development.

Operating systems protect the hardware from user programs. That is, they interfere with embedded systems development profoundly.

Since most embedded systems do not perform office work, most of the code of an office operating systems is waste. For example, most embedded systems never use a file system or screen, so file system amd GUI logic is waste.

Operating systems must invariably be ported to an embedded system. That is, the hardware driver code must always be written anyway. Since this is the most difficult part of the operating system, little is saved by using one.

Last, the genuinely useful, portable features of operating systems are small pieces of code. For example, a basic TCP/IP interface is about 3,000 lines of C code. Likewise, a simple file system. So, if Aa design needs these, they can be had for less than 10% of the typical embedded system's development budget, without a royalty, just by writing them. Also, if the needed code is sufficiently generic, the back of embeded systems magazines typically have vendors selling royalty-free C implmentation.

A notable, beloved exception to all of these objections is DOS for an IBM-PC. If you use a single-card computer, the BIOS is done, thus no drivers. DOS permits code to write to hardware. Finally, DOS doesn't do much, so it's compact.

Some systems require safe, timely, reliable or efficient behavior unobtainable with the above architectures. There are well-known tricks to construct these systems:

Hire a real system programmer. They cost a little more, but can save years of debugging, and the associated loss of revenue.

RMA, rate monotonic analysis, can be used to find whether a set of tasks can run under a defined hardware system. In its simplest form, the designer assures that the quickest-finishing tasks have the highest priorities, and that on average, the CPU has at least 30% of its time free.

Harmonic tasks optimize CPU efficiency. Basically, designers assure that everything runs from a heartbeat timer. It's hard to do this with a real-time operating system, because these usually switch tasks when they wait for an I/O device.

Systems with exactly two levels of priority (usually running, and interrupts-disabled) cannot have inverted priorities, in which a lower priority task waits for a higher-priority task to release a semaphore or other resource.

Systems with monitors can't have deadlocks. A monitor locks a region of code from interrupts or other preemption. If the monitor is only applied to small, fast pieces of code, this can work acceptably well.

This means that systems that use dual priority and monitors are safe and reliable because hey lack both deadlocks and priority inversion. If they use harmonic tasks, they can even be fairly efficient. However, RMA can't characterize these systems, and levels of priority had better not exist anywhere, including in the operating system.

User interfaces for embedded systems vary wildly, and thus deserve some special comment.

Designers recommend testing the user interface for usability at the earliest possible instant. A quick, dirty test is to ask an executive secretary to use cardboard models drawn with magic markers, and manipulated by an engineer. The videotaped result is likely to be both humorous and very educational. In the tapes, every time the engineer talks, the interface has failed: It would cause a service call.

Exactly one person should approve the user interface. Ideally, this should be a customer, the major distributor or someone directly responsible for selling the system. The decisionmaker should be able to decide. The problem is that a committee will never make up its mind, and neither will some people. Not doing this causes avoidable, expensive delays. A usability test is more important than any number of opinions.

Interface designers at PARC, Apple Computer, Boeing and HP minimize the number of types of user actions. For example, use two buttons (the absolute minimum) to control a menu system (just to be clear, one button should be "next menu entry" the other button should be "select this menu entry"). A touch-screen or screen-edge buttons also minimize the types of user actions.

Another basic trick is to minimize and simplfy the type of output. Designs should consider using a status light for each interface plug, or failure condition, to tell what failed. A cheap variation is to have two light bars with a printed matrix of errors that they select- the user can glue on the labels for the language that she speaks.

For example, Boeing's standard test interface is a button and some lights. When you press the button, all the lights turn on. When you release the button, the lights with failures stay on. The labels are in basic english.

For another example, look at a small computer printer. You might have one next to your computer. Notice that the lights are labelled with stick-on labels that can be printed in any language. Really look at it.

Another essential trick is to make any modes absolutely clear on the user's display.

If an interface has modes, they must be reversible in an obvious way.

Most designers prefer the display to respond to the user. The display should change immediately after a user action. If the machine is going to do anything, it should start within 7 seconds, or give progress reports.

If a design needs a screen, many designers use plain text. It can be sold as a temporary expedient. Why is it better than pictures? Users have been reading signs for years. A GUI is pretty and can do anything, but typically adds a year from artist, approval and translator delays and one or two programmers to a project's cost, without adding any value. Often, a clever GUI actually confuses users.

If a design needs to point to parts of the machine (as in copiers), these label these with numbers on the actual machine, that are visible with the doors closed.

A network interface is just a remote screen. It needs the same caution as any other user interface.

One of the most successful general-purpose screen-based interfaces is the two menu buttons and a line of text in the user's native language. It's used in pagers, medium-priced printers, network switches, and other medium-priced situations that require complex behavior from users.

When there's text, there are languages. The default language should be the one most widely understood. Right now this is English. French and Spanish follow.

Most designers recommend that one use the native character sets, no matter how painful. People with peculiar character sets feel coddled and loved when their language shows up on machinery they use.

Text should be translated by professional translators, even if native speakers are on staff. Marketing staff have to be able to tell foreign distributors that the translations are professional.

A foreign organization should give the highest-volume distributor the duty to review and correct any translations in his native language. This stops critiques by other native speakers, who tend to believe that no foreign organization will never know their language as well as they.