You may redistribute this newsletter for noncommercial purposes. For commercial use contact info@ganssle.com.
To subscribe or unsubscribe go to http://www.ganssle.com/tem-subunsub.html or drop Jack an email.

How do you get projects done faster? Improve quality! Reduce bugs. This is the central observation of the quality movement that totally revolutionized manufacturing. The result is a win-win-win: faster schedules, lower costs and higher quality.

Yet the firmware industry has largely missed this notion. Deming et al. showed that you simply can't bolt quality onto an extant system, yet in firmware there's too much focus on fixing bugs rather than on getting things right from the outset.

In fact it is possible to accurately schedule a project, meet the deadline, and drastically reduce bugs. Learn how at my one-day, fast-paced Better Firmware Faster class, presented at your facility. There's more info here.

Thanks for the support for the Muse's new format! Without exception the email was very positive.

Bob Paddock sent a link to Analog Devices' application handbook. He also mentioned that they have most of their seminar handbooks online here.

The folks at The Microprocessor Report did a look-back at the state of the industry 25 years ago, and made this interesting observation: the hot technology of the time was the 386, which was built in 1.5 micron geometry with 275,000 transistors on a 103 mm2 die. Today's Ivy Bridge has 1.4 billion transistors at the 22 nm node on a die less than twice the size of the 386's. If the 386 were built at the 22 nm node it would occupy just 0.02 mm2.

Chuck Petras wrote about electronics education: The folks over at Digilent have been advancing the state of electronics learning; check out their "Real Analog," "Electronics 101," and "Real Digital" offerings. Products that I think are real fun are the "Analog Discovery" lab-in-a-dongle and their "Electronics Explorer Board." It's really getting to the point that a person with a smart phone anywhere in the world can learn this stuff. Only the hands-on portion would be prohibitive for learners in the developing world.

Quotes and Thoughts

Documentation is a love letter that you write to your future self. - Damian Conway

Tools and Tips

Feel free to submit your ideas for neat tools you love or hate. Peter McConaghy noted that the URL in the last issue for Bray's Terminal no longer works. The correct one is https://sites.google.com/site/terminalbpp/ .

Is there a new theory of General Relativity that explains dark matter and dark energy?

Methodology Tools

In the embedded space, UML has a zero percent market share.

In the embedded space, the Capability Maturity Model (CMM) has a zero percent market share (other than CMM1, which is chaos).

The Shlaer-Mellor process tags right along at zero percent, as does pretty much every other methodology you can name.

Rational Unified Process? Zilch. Design patterns? Nada.

(To be fair, the zero percent figure is my observation from visiting hundreds of companies building embedded systems and corresponding with thousands of engineers. And when I say zero, I mean tiny, maybe a few percent, in the noise. No doubt an army of angry vendors will write in protesting my crude approximation, but I just don’t see much use of any sort of formal process in real embedded development).

There’s a gigantic disconnect between the typical firmware engineer and methodologies. Why? What happens to all of the advances in software engineering?

Mostly they’re lost, never rising above the average developer’s horizon. Most of us are simply too busy to reinvent our approach to work. When you’re sweating 60 hours a week to get a product out the door it’s tough to find weeks or months to institute new development strategies.

Worse, since management often views firmware as a necessary evil rather than a core competency of the business they will invest nothing into process improvement.

But with firmware costs pushing megabucks per project even the most clueless managers understand that the old-fashioned techniques (read: heroics) don't scale. Many are desperate for alternative approaches. And some of these approaches have a lot to offer; properly implemented they can greatly increase product quality while reducing time to market.

Unfortunately, the methodology vendors do a lousy job of providing a compelling value proposition. Surf their sites; you’ll find plenty of heartwarming though vague tales of success. But notably absent are quantitative studies. How long will it take for my team to master this tool/process/technique? How much money will we save using it? How many weeks will it shave off my schedule?

Without numbers the vendors essentially ask their customers to take a leap of faith. Hard-nosed engineers work with data, facts and figures. Faith is a tough sell to the boss.

Will UML save you time and money? Maybe. Maybe even probably, but I’ve yet to see a profit and loss argument that makes a CEO’s head swivel with glee. The issues are complex: tool costs are non-trivial. A little one-week training course doesn’t substitute for a couple of actual practice projects. And the initial implementation phase is a sure productivity buster for some block of time.

Developers buy tools that are unquestionably essential: debuggers, compilers, and the like. Few buy methodology and code quality products. I believe that’s largely because the vendors do a poor job of selling – and proving – their value proposition.

Give us an avalanche of successful case studies coupled with believable spreadsheets of costs and time. Then, Mr. Vendor, developers will flock to your doors, products will fly off the shelves, and presumably firmware quality will skyrocket as time-to-market shrinks.

What do you think? Turned off – or on – by methodology tools? Why?

Battle of the CPUs: Cortex M4 vs. M0

In the last few years the industry has increasingly embraced the notion of using multiple processors, often in the form of multicore. Though symmetric multiprocessing - the use of two or more identical cores that share memory - has received a lot of media attention, many embedded systems are making use of heterogeneous cores. A recent example is ARM's big.LITTLE approach, which is specifically targeted to smart phones. A big Cortex-A15 processor does the heavy lifting, but when computational demands are slight it goes to sleep and a more power-frugal A7 runs identical code.

NXP's LPC43xx also has two ARM cores: a capable Cortex-M4 and a smaller M0. Since power constraints are hardly unique to phones, my question was: "if we mirror the big.LITTLE philosophy, what is the difference in performance between the M4 and the M0?"

It's challenging to measure the difference in power used by the cores as there's no way to isolate power lines going to the LPC4350 on the Hitex board I was using. The board consumes about 0.25 amp at five volts, but most of that goes to the memories and peripherals. To isolate the LPC4350's changing power needs I put a 5 ohm resistor in the ground lead to the board, and built the circuit in figure 1. The pot nulls out the nominal 0.25 amp draw, and multiplies any difference from nominal by 50. The output is monitored on an oscilloscope.

Figure 1: Current monitor circuit

The cores run a series of tests, each designed to examine one aspect of performance. The cores run the tests alternately, going to sleep when done; thus, after initialization only one core is ever active at a time. When running a test the core sets a unique GPIO bit which is monitored on the scope to see which core is alive, and how long the test takes to run. One of those GPIO bits is assigned, by the board's design, to an LED; I removed the LED so its consumption would not affect the results. All of the tests use a compiler optimization level of -O3 (the highest). The tests are identical on each processor, with one minor exception noted later.

Figure 2 is an example of the data. The top, yellow, trace is the M4's GPIO bit, which is high when that processor is running. The middle, green, trace is the bit associated with the M0. Note how much faster the M4 runs. The lower, blue, trace is the amplified difference in consumed power. I attribute the odd waveform to distributed capacitance on the board, and it's clear that the results are less quantitative than one might wish. But it's also clear the M4, with all of its high-performance features, sucks more milliamps than the M0. So the current numbers I'll quote are indicative rather than precise, sort of like an impressionistic painting.

Figure 2: The FIR test results

The first test put both CPUs to sleep, which reduced the board's power consumption by about 10 mA; that is, both CPUs running together consume somewhere around 10 mA. First impression: this part is very frugal with power.

In test 0 the processors take 300 integer square roots, using an algorithm from Math Toolkit for Real-Time Programming by Jack Crenshaw. Being integer, this algorithm is designed to examine the cores' behavior running generic C code. The M4 completes the roots in 1.842 msec, 21 times faster than the M0's 38.626 msec, but the M0 uses only a quarter of the current.

The next test ran the same algorithm using floating point. The M4 shined again, showing off its FPU, coming in 12 times faster than the M0 but with twice the power-supply load. There's considerable non-FPU activity in that code; software that uses floating point more aggressively will see even better numbers.

Test 3 also took 300 floating point square roots, and is the only one where the code varied slightly between cores. On the M4 it uses the __sqrtf() intrinsic instead of the M0's conventional C function sqrt(). The former invokes the FPU's VSQRT instruction, and that CPU just screamed with 174 times the performance of the M0. It was so fast the power measurements were completely swamped by the board's capacitance.

One of the Cortex-M4's important features is its SIMD instructions. To give them a whirl I implemented an FIR algorithm that made use of the SMLAD SIMD instruction. Since the M0 doesn't have this I used the SMLAD macro from ARM's CMSIS library, which requires several lines of C. Not surprisingly, the M4 blew the M0 out of the water, completing 20 executions of the filter in 5.15 msec, 10 times faster than the M0 and for 9 times as many milliamps.

But I was surprised the results weren't even better, considering how much the M0 has to do to emulate the M4's single-cycle SMLAD. So I modified the program with a SIMD_ON #define. If TRUE, the code ran as described. If FALSE, the SMLADs were removed and replaced by simple assignment statements. The result: the M4 still ran in 5.15 msec. There was no difference, indicating that essentially all of the time was consumed in other parts of the FIR code. In other words, code making heavier use of the SIMD instructions will run vastly faster.

One note: in many cases the M4 consumed less power than the M0, despite the higher current consumption, since the M4 ran so much faster than the M0. The M4 was asleep most of the time. However, in many systems a CPU has to be awake to take care of routine housekeeping functions. It makes little sense to use the M4 for these operations when the M0 can do them with a smaller power budget, and even handle some of the more complex tasks at the same time.

Though the LPC43xx is positioned as a fast processor with extensions for DSP-like applications, coupled with a smaller CPU for taking care of routine control needs, it's also a natural for deeply embedded big.LITTLE-like situations where a dynamic tradeoff between speed and power makes sense.

The IDE was from Keil, which has pretty good support for debugging two CPUs over a single shared JTAG connection. I found it quite functional, though it took a lot of clicking around to go back and forth between cores, and the flashing of windows during each transition was a bit annoying. A better solution would be two separate IDEs sharing that JTAG instead of a single shared window, especially for those of us running multiple monitors.

I remember sending you an email circa 7-18-2007 regarding the use of Doxygen and its usefulness for documenting code.

Now 5 years later, here are my observations:

It IS a great tool, but you DO have to use it (I know, what a surprise); documentation doesn't create itself.

You need to get into the habit of adding the documentation comments when you create a new function, class, or whatever the case may be. Don't wait; later will never come. Though it's good practice from the beginning, the habit takes a while to form, so just keep at it.

It is necessary to have a controlling document (I use documentation.h); just as a company needs leadership, so does documentation. The controlling document contains anything and everything that tells Doxygen how the documentation gets pieced together, plus overall commentary such as the author(s), milestones, and links to any other documentation.
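A minimal sketch of what such a controlling header might look like; the project name, version and section names here are placeholders, not the correspondent's actual file:

```c
/* documentation.h -- controlling document for Doxygen (sketch).
   Never compiled into the product; it exists only to give the
   generated documentation a front page and overall commentary. */

/**
 * @mainpage Example Widget Firmware
 * @author   J. Developer
 * @version  1.2.0
 *
 * @section intro Overview
 * Top-level description of the firmware and how the pieces fit together.
 *
 * @section miles Milestones
 * - 1.0.0: first release
 * - 1.2.0: added CAN driver
 *
 * @section links Related documents
 * Pointers to the hardware schematic, requirements, and other docs.
 */
```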

One may need to document libraries separately to keep things
structured; Doxygen has methods to do this.

Use the documentation. Doxygen generates lots of useful information such as call graphs, use graphs, structure information, etc. Use that information to see if what you are doing makes sense (which surprised me, as I thought I knew my code, but apparently not as well as I had been inclined to think).

The documentation is what it is: it will demonstrate how organized you are and/or how cryptic you are. Both are important. If you don't see what you want or need, that shows something was neglected and should be added or corrected.

You can support change logs etc. using Doxygen; this turned out to be very useful, especially when references to functions in the change log link into the main documentation. To support this I have used files like "change_log_X.h", one per minor version change. I have used this to establish a version hierarchy as well as to link from the change log to the program documentation. This was not an easy thing to implement in Doxygen, but it is possible through the 'group' commands; it is important to read the documentation carefully regarding this.
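A sketch of how Doxygen's group commands can tie a change-log file into the rest of the documentation; the file name and function name here are illustrative, not from the correspondent's project:

```c
/* change_log_1_2.h -- one change-log file per minor version (sketch). */

/**
 * @defgroup changes_1_2 Changes in version 1.2
 * Entries in this group appear as a module in the generated docs,
 * and any documented function named in them becomes a hyperlink.
 * @{
 */

/**
 * @brief 1.2.0: reworked uart_send() to be interrupt-driven.
 * The reference to uart_send() links to that function's page.
 */

/** @} */
```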

Basic version information cannot be easily (or at all) extracted from your source code, so you have to tell Doxygen about it specifically each time it changes.
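One common place to keep that version string is the Doxyfile itself; PROJECT_NAME and PROJECT_NUMBER are standard Doxygen settings, though the values below are just placeholders:

```
# Doxyfile fragment (sketch): Doxygen can't mine the version from
# the code, so set it by hand -- or have the build system rewrite
# this line as part of a release.
PROJECT_NAME   = "Example Widget Firmware"
PROJECT_NUMBER = 1.2.0
```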

It is OK to support two separate methods of documentation; just be sure to use both and to organize them so they work together. That is the most difficult lesson I've learned and the most difficult part to maintain (one tends to favor one or the other).

One can generate PDFs from Doxygen's output, but they can be (very) large, so it is a good idea to be careful about what you include in your documentation if you intend to produce a PDF; my 'small' program created close to 2,300 pages. Be certain to include the source as part of the documentation; although it adds a lot of volume, it also keeps a record of what is in the documentation (and what was being documented).

Jobs!

Let me know if you’re hiring embedded engineers. No recruiters please, and I reserve the right to
edit ads to fit the format and intents of this newsletter. Please keep it to 100 words.

A young engineer was leaving the office at 3.45 p.m. when he found the Acting CEO standing in front of a shredder with a piece of paper in his hand.

"Listen," said the Acting CEO, "this is a very sensitive and important document, and my secretary is not here. Can you make this thing work?"

"Certainly," said the young engineer. He turned the machine on, inserted the paper, and pressed the start button.

"Excellent, excellent!" said the Acting CEO as his paper disappeared inside the machine, "I just need one copy."

About The Embedded Muse

The Embedded Muse is Jack Ganssle's newsletter. Send complaints, comments, and
contributions to me at jack@ganssle.com.

The Embedded Muse is supported by The Ganssle Group, whose mission is to help embedded folks get
better products to market faster. We offer seminars at your site offering hard-hitting ideas - and action - you
can take now to improve firmware quality and decrease development time. Contact us at info@ganssle.com for more information.

Do you need to eliminate bugs in your firmware? Shorten schedules? My one-day Better Firmware Faster seminar will teach your team how to operate at a world-class level, producing code with far fewer bugs in less time. It's fast-paced, fun, and covers the unique issues faced by embedded developers. Here's information about how this class, taught at your facility, will measurably improve your team's effectiveness.