Vanishing Visibility

By Jack Ganssle

I take great satisfaction in my tools. My fingers are not
strong enough to remove a bolt, but give me a wrench and my hand can perform
amazing new feats. We computer folks like to consider the PC a mind-tool that
increases the power and reach of one's brain. Conventional hand tools give us a
similar ability to manipulate the mechanical world in ways impossible via the
unaided human body.

I'm a fanatic about woodworking tools, keeping them
clean and sharp, buying only the best, collecting the cream of the technology of
yesteryear that, while now out of style, may still be the best solution to a
problem. Though power tools with big motors that hurl sawdust like a swirling
gale satisfy my testosterone craving for brute mechanical power, high quality
chisels and planes are among my favorite possessions. A hand plane works well
only if you take time to understand the wood, molding its use to the grain,
hardness, and even moisture content of the work. In contrast, a 2 horsepower
electric plane blindly tears through any obstacle leaving its marks of
destruction behind in telltale chatter-gouges. Yet you can't beat an electric
plane for removing lots of wood fast.

The same goes for the embedded world. This magazine
bulges with ads for all sorts of virtual assistants, each of which is aimed at
one part of the development process. Just as the hand and electric planes have
valid though different applications, no single embedded instrument is the silver
bullet for all circumstances. One of the skills of the engineer is the judicious
selection and use of the right mix of tools for each project.

This fascination with tools of all kinds led me to
start an emulator company back in the 80s. It was a wild and fascinating ride,
made much more interesting by the opportunity to look into the work of thousands
of developers, and to see how we grapple with the bugs that plague even the most
well designed systems. Recently, though, I decided to move on and sell the
company.

Yet the problem of getting products to market still
fascinates me. My love of tools of all sorts is undiminished. No longer holding
any equity in the tool business, and thus with no conflict of interest with this
column, I feel freer to examine some of the issues that are surfacing in the 90s.

And I'm concerned. Scared, really, for the future
of developers in the embedded industry. It's driven by relentless forces none
of us can control and can sometimes barely understand. The twin forces of
technology advancements and frenetic business are backing engineers into a
metaphorical corner of impossible demands with terribly limited resources.

Now systems are more complex than ever, with new
breeds of bugs. Timing problems, once restricted to hardware, are an ever more
problematic firmware fact. RTOS complexities and excruciatingly complex
algorithms fan the fire of bugs.

Bugs will never go away. Better development
methodologies can (will? Not until we individual developers create a personal
passion for improvement) reduce the error rate, but never to zero. Debuggers -
of many types - will always be important tools.

Debuggers do one fundamental thing: provide
visibility into your system. Features vary, but all we ask of a debugger is
"tell me what is going on!" Sometimes we're interested in procedural flow
(single stepping, breakpointing); other times function timing or dependencies or
memory allocation. Regardless, we simply expect our tools to reveal hidden
system behavior. Only after we see
what's going on can we use our brains to understand "why that happened",
and then apply a fix.

My fear is we're removing our ability to look into the
systems. The visibility we take for granted is being eroded.

Technology Tribulations

In embedded systems, in-circuit emulators (ICEs) have always been one of the
choice weapons in the war on bugs. Yet for as long as I can remember pundits
have been predicting their death. Though it seems as quaint as IBM's 1950s
prediction that the worldwide market for computers was merely a couple of dozen,
in fact 20 years ago many people believed that the 4 MHz Z80 would spell
doom for ICEs. "4 MHz is just too fast," they proclaimed, "no one can
run those speedy signals down a cable."

Time proved them wrong, of course. Today's units run at
60+ MHz on processors with single-clock memory cycles, an astonishing
achievement.

The imagined speed limit applies to more than ICEs: ROM
emulators and other debuggers that use a physical target system interface all
suffer from similar problems.

Is an end yet in sight? I believe so, though the
limiting frequency is a bit hazy. Today's approach of putting all or much of
the ICE's electronics on the pod removes the cabling and bus driver problems,
but electrons do move at a finite speed and even the fastest of circuits have
non-zero propagation delays.

CPU vendors squeeze the last bit of clock rate from
their creations partly by tuning their chips ever more exquisitely to the rest
of the system's memory and I/O. A danger signal is the current problem with PC
motherboards: it is so difficult to design a high speed Pentium-based
motherboard that Intel has had to assume that role. They are reportedly now the
largest producer of PC motherboards. In effect the computer is so tightly
coupled to the processor that only the CPU vendor can produce a reliable system
based on the chip! Clearly, an intrusion by any sort of development tool will at
best be problematic. Yes, today's Pentium emulators do work. Will tomorrow's
units be able to handle the continued push into stratospheric clock rates? I
have doubts.

Packages are creating another sort of problem. Heat,
speed, and size constraints have yielded a proliferation of packaging styles
that challenge any sort of probing for debugging. If you've ever tried to use
a scope on a 208 pin PQFP device, or, worse, a 100 pin TQFP,
you know what I mean. Yes, some tremendously innovative probing systems
exist - notably those from Emulation Technology and HP. Despite these, it's
still difficult at best to establish a reliable connection between a target CPU
and any sort of hardware debugger, from a voltmeter to an ICE.

Traditional (how can a few-year-old technology have
traditions?) surface mount devices have exposed pins that you at least have a
prayer of getting to. Newer devices don't. The BGA (Ball Grid Array) package,
which is suddenly gaining favor, connects to a PC board via hundreds of little
bumps on the underside of the package - where they are completely inaccessible.
Other technologies bond the silicon itself under a dab of epoxy directly to the
board. All of these trends offer various system benefits; all make it difficult
to impossible to troubleshoot software and hardware.

OK, you smirk, these issues only apply to the high
end of the embedded market, where clock rates - and production costs - soar
with the eagles. Other, subtle influences, though, are wreaking havoc on the low
end.

Take microcontrollers, for example. These CPUs have
ROM and RAM on-board, giving a very simple, very inexpensive one-chip solution
for simple 8 and 16 bit applications. The 8051 is the classic example of this,
and indeed has been an amazing success that has survived twenty years of assault
by other, perhaps more capable, processors.

Single chip solutions are tough to debug, though,
since the on-board memory means there's generally no address/data bus coming
to the outside world. An extreme example is Microchip's 8 pin PIC part. 8
pins! The only ins and outs are I/O.

Various debugging solutions exist, but the
traditional solution is the bond-out chip, a special version of the processor
with extra pins that brings all important signals to the outside world,
especially those oh-so-critical address and data lines needed to track program
execution. With a proper bond-out-based ICE you can track everything the code
does, in real time, with no compromises. Perfect, no?

Well, a few wrinkles are starting to surface. For
one, the chip vendors hate making
bond-outs. The market is essentially zero, yet every time the processor's mask
gets revised a new bond-out is needed. In the old days chip vendors swallowed
hard, but did make them reasonably available.

Now this is less common. With the 386EX (which is not
a microcontroller, but which benefits from a bond-out) Intel announced that only
a handful of vendors would get access to the special version of the part,
probably to some extent increasing the cost of tools. Is this an indication of
the beginning of the end of generally available bond-out parts?

Sometimes the bond-out is not kept to current mask
revisions. I know of at least one case where a vendor provides bond-outs that
will not run at full speed, essentially removing the critical visibility of
real-time execution from developers. This situation puts you in the awful
conundrum of deciding "should I buy an expensive tool that forces me to run
at half speed, no doubt destroying all timing relationships?"
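
To see why half speed wrecks timing, consider a rough sketch in Python. Every
number here is a made-up assumption for illustration (a 16 MHz part, a
1200-cycle interrupt handler, bytes arriving at 115,200 baud); none comes from
a real chip or tool. The point is only that the outside world doesn't slow down
just because the bond-out does:

```python
# Illustrative only: clock rates, cycle count, and baud rate are assumptions.

def handler_time_us(cycles: int, clock_mhz: float) -> float:
    """Time to run an interrupt handler, assuming one cycle per clock."""
    return cycles / clock_mhz

def keeps_up(cycles: int, clock_mhz: float, event_period_us: float) -> bool:
    """True if the handler finishes before the next external event arrives."""
    return handler_time_us(cycles, clock_mhz) <= event_period_us

# A serial byte (10 bits with start and stop) at 115,200 baud arrives
# roughly every 86.8 microseconds, regardless of the CPU clock.
BYTE_PERIOD_US = 10 / 115_200 * 1_000_000
HANDLER_CYCLES = 1200  # hypothetical handler length

full_speed_ok = keeps_up(HANDLER_CYCLES, 16.0, BYTE_PERIOD_US)  # 75 us handler
half_speed_ok = keeps_up(HANDLER_CYCLES, 8.0, BYTE_PERIOD_US)   # 150 us handler

print(f"byte period: {BYTE_PERIOD_US:.1f} us")
print(f"full speed keeps up: {full_speed_ok}")
print(f"half speed keeps up: {half_speed_ok}")
```

Under these assumptions the handler that comfortably keeps up at full speed
overruns at half speed, so the system you observe through the slowed-down tool
is not the system you will ship.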

Sometimes - often - the bond-outs will not run at
reduced voltages. Your 3 volt system might require a pod which is a convoluted
mix of 3 and 5 volt technologies, creating additional propagation delays as
voltages get translated. In effect a non-intrusive tool becomes subtly more
intrusive, in ways that are hard to predict. Voltages are declining fast - some
CPUs now run at sub-one volt levels - so the problem can only get worse.

A very scary development is the incredible
proliferation of CPUs. Vendors are proud of their ability to crank out a new
chip by pressing a few buttons on a CAD system, changing the mix of peripherals
and memory, producing variant number 214 in a particular processor family.
Variants are a sign of a good, healthy line of parts (look at that mind boggling
array of 8051 parts), but are a nightmare for tool vendors. Each requires new
hardware, software, support, evaluation boards, and the like. In the "good old
days", when we saw only a few new parts per year per family, support was easy
to find. Now my friends who make microcontroller tools complain of the frantic
pace needed to support even a subset of the parts.

As tool consumers you probably don't care about the
woes of the vendors. But part proliferation creates a problem that hits a bit
closer to home: for any specific variant there may only be a handful of
customers. Tool support may never exist for that part if vendors feel there's
not a big enough market. An odd fact of the tool market (from compilers to ICEs)
is that the health of the market is a function of the number of customers using
a chip, not the number of chips used. CPU vendors are happy to get one or two
huge design wins, say an automotive company that sucks up millions of parts per
year. Tool folks might only sell a couple of units to such a customer, far too
few to pay their huge development costs.

I know of dozens of big companies left stranded by
CPUs with no support, who have had to in some cases build their own tools. Some
even write custom compilers! There can't be a more expensive way to do a
project.

Non-Computers

As one interested in the philosophical implications of our
business I'm fascinated with the drift to "virtual" implementations of,
well, everything. Hey, your cell phone is a fascinating connection - without
wires - to a billion other phones on the planet; it's part of the biggest
machine on the planet, yet looks like nothing more than a few bucks of
electronics. Similar virtual connections underlie just about everything on the
Internet as well. Now we're seeing a move to "virtual" microprocessors.
Today, you can buy a micro, that, well, just
has no physical being. It's a file of VHDL equations.

Buy a virtual Z80 or 186 and then incorporate that
into your own design. Burn it into a big gate array or FPGA or ASIC. The idea is
to reduce chip count by integrating the processor into the ASIC along with all
of your proprietary circuits. It keeps costs and board size down.

We're used to software being a rather ethereal
"thing", with no real physical implementation. Now we can buy "hardware"
equally as ethereal. It's software hardware. Hardware software. Or something.

Some of the vendors promoting these ghostly CPUs
promise the ability to customize the processor. Add instructions with the click
of a mouse! It would seem a magic solution to precisely matching computational
power to your application's needs.

But, how will you use the new instructions? Code in
assembly language only? Write your own compiler?

Worse, with the CPU buried inside of a big chip, how
do you plan to troubleshoot your code?

Cache, prefetchers, superscalar designs, and lots of
other ever-more-common processor features all create debugging headaches. My
point here is not a complaint against the technology; it's to voice a concern
that we dare not blindly design in the latest cool thing without understanding
how we'll find our bugs. We've got to realize that these new features have
both benefits and perils. I have seen too many designers in the flush of initial
project optimism forget that soon they'll be up to their eyeballs in bugs, and
that they will need some sort of tool to give them
visibility into their code.

Technological problems are a funny thing. The
barriers rarely stand for long. Customer needs quickly translate into solutions.
One only has to look at IBM's PowerPC parts, some of which include a built-in
debug port that even supports real time trace, to see what the future might
bring.

Modern-day Luddites fear technology, thinking it's
advancing much faster than we poor humans can adjust to the changes. In last
month's column I metaphorically walked a bit in their shoes, expounding my
concerns that our embedded technology is moving at a faster rate than our
ability to efficiently develop new systems. I see our infatuation with faster,
smaller, and higher integration to be leading us down a path where the costs of
development are skyrocketing.

CPU cores hidden away inside ASICs give fabulously
small systems, yet that buried processor is all but impossible to probe. Couple
bus cycles within fractions of a nanosecond to a peripheral and you leave no
margin for your tools. One-off CPUs, whether from burying a VHDL virtual
processor inside a high integration part, or from the huge explosion of
derivatives of popular parts, are often tool-orphans. Tool vendors, after all,
won't invest huge sums in developing products for a particular CPU unless they
see a large, healthy market for their offerings.

Even seemingly boring issues like device packaging
further isolate us from the processor. If we can't probe it, we can't see
what's going on. We lose the visibility needed to find bugs.

We see some glimmers of "solutions" to these
technology problems. An example is
Motorola's Background Debug Mode (BDM). Their recent processors include a
special serial port used only for debugging. Transistors are cheap - it makes
sense to integrate extra onto the processor as a special debug port.

Similarly, other vendors are putting variants of JTAG
(IEEE 1149.1) ports onboard their fastest CPUs, sometimes to aid product
testing, other times for debugging. Like BDM, these are all essentially
serial interfaces that give one access to the core while using only a minimum
number of pins.

A debugger on-board the chip eliminates all speed
issues. It functions despite cache's complications. Even when the CPU is
hidden in a huge ASIC, if just a few pins come out for the serial debugger, then
designers will have some ability to troubleshoot their code.

JTAG/BDM lets you set simple breakpoints, single
step, and examine and change memory and I/O... in short, everything you can do
with a normal PC design environment, like Microsoft's Visual C++.

BDM-like solutions are a reasonable subset of a
debugging methodology. They're so inexpensive that every developer can have
the toolset. Some tool vendors properly promote these as nothing more than
debugging adjuncts, devices designed for working on certain non-real time
sections of code. Their message is to "use the right tool for the right job -
a BDM where it makes sense, and a full-function emulator for real time
troubleshooting."

Unhappily, too many of us are so taken with the
prospect of cheap tools that we hear the good news about BDM/JTAG but somehow
don't listen to the second part of the message. I believe chip vendors,
frustrated by the difficulty of providing real emulation for their latest
creations, promote these limited serial debuggers as the "perfect" embedded
debugging tool.

Cheap serial debuggers give us about the same
debugging resources used by the millions of our programming brethren working on
PCs, UNIX machines, and mainframes. So, why am I complaining?

Though the database programmers of the world have
never had real-time debugging tools, we need them;
our problems are quite different. Our systems are riddled with interrupts and
DMA. We run preemptive multitasking with closely coupled tasks. Most embedded
systems have a critical real time component to their nature, so we need
tools that let us work in the time domain.

Though the new breed of serial debugging devices does
make it possible to get our systems operating properly, they don't deal with
the complex nature of our products. We're forced to work heroically due to
tool limitations.

But there's much more to the tool woes. I see developers
squeezed between technology problems - which may ultimately prove to be solvable
- and much more insidious business issues, some of which I honestly can't
understand.

The Cost of Money

As an ex-tool vendor I can't count the times I've heard
"well, we really need decent equipment, but my boss won't let me spend the
money."

It matters little what equipment we're talking
about. Once I wrote an off-hand comment about companies who won't upgrade
computers. An avalanche of email filled my electronic in-box, from developers
saddled with 386-class machines in the Pentium age. We live in front of our
computers, spending hours per day with them. It's incomprehensible to me that a
business won't provide very expensive engineers new machines every two years.
I've seen compile times shrink from tens of minutes to tens of seconds when
transitioning just one generation of computers; surely this translates
immediately into real payroll savings and faster development times!
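
A back-of-the-envelope sketch of that payroll claim, in Python. The build
counts, working days, and hourly cost below are assumptions picked only to make
the arithmetic concrete; plug in your own figures:

```python
# All inputs are illustrative assumptions, not measurements.

def annual_wait_cost(build_seconds, builds_per_day, hourly_cost, work_days=230):
    """Yearly payroll cost of one engineer waiting on builds."""
    hours_waiting = build_seconds * builds_per_day * work_days / 3600
    return hours_waiting * hourly_cost

# "Tens of minutes" versus "tens of seconds": say 10 minutes versus 30 seconds,
# 20 builds a day, at a hypothetical loaded cost of $50 per engineer-hour.
old_cost = annual_wait_cost(build_seconds=600, builds_per_day=20, hourly_cost=50)
new_cost = annual_wait_cost(build_seconds=30, builds_per_day=20, hourly_cost=50)
savings = old_cost - new_cost

print(f"old machine: ${old_cost:,.0f} per year spent waiting")
print(f"new machine: ${new_cost:,.0f} per year spent waiting")
print(f"savings:     ${savings:,.0f} per engineer per year")
```

The sketch counts only idle waiting. It ignores the harder-to-measure cost of
losing your train of thought during a ten minute build, so if anything it
understates the case.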

Yes, we have an insatiable appetite for new goodies.
Glimmering new scopes, emulators, logic analyzers, and software tools fill our
thoughts much as kids dream of Tonkas and Barbies. Very often, though, the gap
between what we want and what we get is as wide as the Grand Canyon.

Now, I know the cost and scarcity of capital. Just
try going to the bank, hat humbly in hand, looking for working capital when you
really need it. Venture capital is the seed of high tech, but is much less
available than people realize.

There's never enough money, especially in smaller
businesses, so every decision is a financial trade-off between competing needs.

I also know the cost of payroll. It's by far the
biggest expense in most technology businesses. Yet many managers view payroll as
a sunk cost. Years ago my boss told me "I have to pay you anyway, but to buy
that scope costs me real money."

Well, no, actually, he didn't have to pay me or any
of the engineers. He had options: do less engineering with fewer people and save
on salary. Use us inefficiently and ignore the costs. Work to improve our
efficiency and either get products out faster or get the same work done with
fewer people.

This concept of payroll as a fixed cost is a myth, one that
destroys too many technology companies. Managers do have the ability to manage
this cost, the biggest one of all, effectively. It's not easy and it's never
"done"; effective management requires an intimate understanding of the
processes involved, a willingness to experiment and tune, and a dedication to a
never-ending quest to find lots of 1 and 2% improvements, as the magic 20%
efficiency improvement silver bullet does not exist.

Our culture of absorbing payroll as a fixed expense
means we battle for weeks over $10,000 tool costs while ignoring, or accepting,
a million dollars in salary costs.

Perhaps this is symptomatic of uninformed managers,
and exhibits itself in every area of development. One friend who makes a living
designing products as a contractor tells me story after story of companies who
happily spend a quarter million dollars on tooling for the product's plastic
box yet balk at a quote for $30k in custom firmware.

I see an increasing number of companies embracing the
noble ideal of "doing more with less" without understanding that sometimes
spending a bit on tools is the fastest route to that ideal.

Time To Market

You can't pick up a trade magazine today without seeing
the industry's mantra - Time To Market - gracing every article and ad. All
sorts of studies indicate getting a product out first is the best way to gain
market share and profitability. Whether this is true or not makes little
difference; the important point is that management has universally bought into
the concept, leaving it up to engineering to somehow "make it so".

The time to market furor explains surveys that show
development time to be the number one priority of many engineering departments,
with cost usually running third after quality. Whether we agree with the goals
or not, it is at least a reasonable ranking of priorities.

Get it done fast. Do a good job. And then, worry
about costs. These are the constraints we're working under, in order.

But we can't develop a realistic plan without
considering all of the facts. One is that salaries continue to rise, especially
now, and especially for highly trained and scarce engineers. None of us can
control this.

Fast, gotta be fast. Cheap, too, somehow we have to save
bucks wherever we can. OK - now what?

Astonishingly, more and more companies are making
decisions like: no tools. Poor tools. Or, let's pick a chip that has no tools,
or for which decent tools are but a dream.

How on earth are we supposed to be fast with
inadequate tools? Won't costs skyrocket as we spend more time struggling
to find bugs - bugs that are more evasive than ever as products get more complex
- using what amounts to toys?

In the face of increasing salaries, more complex
products, and terrifying schedules, all too often the question "how are we
going to get the work done" never gets answered
honestly.

Yet, as you read this today,
hundreds of companies pursue development strategies that are
doomed to cost too much and take too long. Some use custom microprocessors - for
good reasons and bad - and build their own compilers and debuggers. I'm not
saying this is necessarily wrong, it's just costly. Some of these businesses
understand and manage the issues; others just yell louder at the developers to
meet the schedule.

I've seen months spent gluing CPUs inaccessibly
into the core of a monster ASIC, without the least thought given to debugging,
and then the hardware guys present the firmware folks with this fait accompli
and only two months left in the schedule.

We must
look at the technology challenges posed by the parts we choose, and then at our
options for building the system and then finding bugs. We must find or invent
ways of achieving our fast-quality-cheap goals
before committing to a difficult or impossible technology.

And, management must understand that time costs
money, real money, not just sunk costs. Further, crummy development environments
never yield faster product introductions.

Are We the Problem?

This is not a Dilbert-like rant against managers. We're
all infatuated with the latest technology, and we all are convinced that, this
time, bugs won't be as big a problem as last time.

Embedded processors through what's left of the 90s,
and on into the next century, will continue to get faster and more highly
integrated, and will generally become much tougher to work on than those of
yesteryear. That's a fact as sure as salary inflation and time to market
pressures.

It's largely up to the developers doing the work to
educate management, and to make intelligent decisions yielding debuggable
products.

Often we are perceived as wanting everything without
decent justifications. Faster computers, private offices, better software tools.
Without educating our bosses about how these things save them money we'll lose
most battles.

A common joke is the "capital equipment
justification," all too often more an exercise in creative writing than in
fact gathering and analysis. Sometimes tool vendors will present you with
spreadsheets of savings from using their latest widget, but none of us really
trust these figures. It's far better to use hard-hitting, quantitative data
accumulated from your own hard-won experience. Don't have any? Shame on you!

One well-known bug reducer is recording each bug,
stopping and thinking for a few seconds about how you could have avoided making
the mistake in the first place. Take this a step further and think through (and
record!) how you found it, using what tools. Log it all in an engineering
notebook as you work; it's a matter of a few seconds' time, yet will
help you improve the way you work. This notebook will also serve as the raw data
for your cost justifications. If that cruddy freeware compiler generated a bad
opcode that took a day to find, a little math will quickly show how much money a
multi-thousand dollar commercial package would save.
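
One way to turn such a notebook into numbers is sketched below in Python. The
record fields, the sample entries, the $50 hourly cost, and the $3,000 tool
price are all hypothetical; only the day lost to a bad opcode echoes the example
above:

```python
from dataclasses import dataclass

@dataclass
class BugRecord:
    description: str
    hours_to_find: float
    tool_used: str
    tool_at_fault: bool  # would a better tool plausibly have avoided this?

def tool_cost_summary(log, hourly_cost, tool_price):
    """Compare hours lost to tool problems against a tool's break-even point."""
    lost_hours = sum(bug.hours_to_find for bug in log if bug.tool_at_fault)
    return {
        "lost_hours": lost_hours,
        "lost_dollars": lost_hours * hourly_cost,
        "break_even_hours": tool_price / hourly_cost,
    }

# Hypothetical notebook entries.
notebook = [
    BugRecord("compiler generated a bad opcode", 8.0, "freeware compiler", True),
    BugRecord("off-by-one in ring buffer index", 1.5, "simulator", False),
    BugRecord("compiler mangled an interrupt prologue", 12.0, "freeware compiler", True),
]

summary = tool_cost_summary(notebook, hourly_cost=50.0, tool_price=3000.0)
print(summary)
```

With these made-up entries the log shows 20 hours lost so far against a 60-hour
break-even, so the case is not yet made. A few more weeks of honest logging will
settle it one way or the other, which is exactly the kind of data a boss can't
wave away.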

As you educate management, educate yourself, and
remember those lessons when you're the boss!

I'll end as I started. No longer in the tool biz I
have no vested interest in seeing anyone spend a nickel on the latest widget. I
am a great believer, though, that the human condition has always been improved
by our use of tools, from mastering fire to learning to use the simplest of
plows, to today's amazingly complex and sophisticated embedded tools.

Bugs will never go away. Better development
methodologies can (will? Not until we individual developers create a personal
passion for improvement) reduce the error rate, but never to zero. Debuggers -
of many types - will always be important tools. Make sure your hardware design
and processor selection will allow you to use tools effectively, and then make
intelligent decisions about which to get.
