Error Code Boggled the Boiler

We have a Munchkin condensing boiler configured for natural gas that, with routine maintenance, ran reliably for three years. One morning, it stopped heating, and the controller display showed a code of F13.

The code on the display meant that the blower in this boiler wasn't spinning fast enough to safely open the gas valve and begin ignition. The problem was that, according to the diagnostics display on the controller, it was actually spinning at least 200 RPM faster than the lowest speed for ignition. If I reset the controller, it would begin ignition, heat the water, satisfy the demand, and then shut down with that mysterious error code again. Sometimes, it would inexplicably shut down before demand was met.

I decided that I needed to make sure all the sensors were working as they should. One by one, I tested them to make sure the controller responded appropriately. The controller reacted pretty much as I would have expected. While doing this, I noticed that many key parts weren't made by Munchkin. The blower was made by EBM Pabst of Germany; the gas solenoid was made by Dungs of Denmark; the controller was made by S.I.T. Controls of Italy; the rest was apparently assembled in the US. This thing reminded me of the Tower of Babel story in the Bible.

EBM Pabst has a website with lots of data about its blowers. I had presumed that the blower was under some sort of control because, when I unplugged it while powered up, it went to full speed. From EBM Pabst, I found out that if I grounded the PWM speed control line, it should stop and go to a standby mode. I tried that, and the blower didn't shut down. Now I had reason to spend hundreds on a new blower. After replacing the blower and checking for leaks, I put it to the test. Everything worked properly again.

Looking back, I have to wonder whether the folks at S.I.T. Controls considered this failure mode. The F13 code was being triggered because the blower speed was free-running, not because it was under-speed, as the error code documentation suggested. The controller also wasn't consistent about why or how it tripped, leaving everyone wondering what the controller was balking at.
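The distinction between "too slow" and "not following the command" can be made explicit in a fault check. The following is a minimal Python sketch, not S.I.T. Controls' actual logic: the function name, fault labels, and tolerance value are all hypothetical, chosen only to show how separate labels would have pointed at a free-running blower.

```python
# Hypothetical tolerance for illustration only; not a Munchkin or S.I.T. setting.
TRACKING_TOLERANCE_RPM = 300


def classify_blower(commanded_rpm: int, measured_rpm: int) -> str:
    """Compare measured blower speed with the commanded speed.

    Returns a label that says which way the blower is off, instead of
    lumping every speed problem under a single code.
    """
    error = measured_rpm - commanded_rpm
    if abs(error) <= TRACKING_TOLERANCE_RPM:
        return "OK"
    if error < 0:
        # Blower is slower than requested.
        return "UNDER_SPEED"
    # Blower is faster than requested, e.g. it ignores a stop command.
    return "NOT_FOLLOWING_COMMAND"
```

With a check like this, a blower that is still turning at, say, 3200 RPM after being told to stop would be reported as NOT_FOLLOWING_COMMAND rather than as an under-speed fault.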

Despite detailed performance data, the boiler manuals did not have a Control Narrative section. The Control Narrative explains in gory detail exactly what the automation does in response to all stimuli in every automation state that it could have. It is a key deliverable to clients. As a registered controls engineer myself, I wouldn't dream of designing a potentially dangerous device such as a gas boiler without explicit documentation of this sort.
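To give a feel for what such a narrative pins down, here is a rough Python sketch that treats it as a table of (state, stimulus) pairs. The states, stimuli, and actions are invented for illustration and are not taken from any Munchkin or S.I.T. documentation; the point is only that a table like this can be checked for gaps.

```python
from itertools import product

# Hypothetical states and stimuli, for illustration only.
STATES = ("STANDBY", "PURGE", "FIRING", "LOCKOUT")
STIMULI = ("heat_demand", "blower_at_speed", "demand_satisfied",
           "blower_fault", "reset")

# (state, stimulus) -> (next state, action taken)
NARRATIVE = {
    ("STANDBY", "heat_demand"):      ("PURGE",   "start blower"),
    ("PURGE",   "blower_at_speed"):  ("FIRING",  "open gas valve, ignite"),
    ("PURGE",   "blower_fault"):     ("LOCKOUT", "stop, show fault code"),
    ("FIRING",  "demand_satisfied"): ("STANDBY", "close gas valve, stop blower"),
    ("FIRING",  "blower_fault"):     ("LOCKOUT", "close gas valve, show fault code"),
    ("LOCKOUT", "reset"):            ("STANDBY", "clear fault code"),
}


def respond(state: str, stimulus: str) -> tuple:
    """Return the documented (next state, action) for this pair."""
    try:
        return NARRATIVE[(state, stimulus)]
    except KeyError:
        raise LookupError(
            f"narrative is silent on {stimulus!r} in state {state!r}"
        ) from None


def undocumented_pairs() -> list:
    """List every (state, stimulus) pair the narrative does not cover."""
    return [pair for pair in product(STATES, STIMULI) if pair not in NARRATIVE]
```

Calling undocumented_pairs() on this toy table lists the combinations nobody has written down yet, which is exactly the kind of gap a complete narrative is meant to close.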

Without that narrative, even experienced, reputable service firms have little idea what the unit is supposed to be doing, why an error code might be there, or what tests one might use to diagnose it. It is not enough to failsafe on devices like this. Customers or their designated repair firms deserve a reasonable explanation of what kept a device from working.

This entry was submitted by Jacob Brodsky and edited by Rob Spiegel.

I really wanted to do a post mortem on the blower, but it was assembled in a way that made removal of the control board very difficult.

To make a long story short, the motor control circuit board actually penetrates inside the motor. The screws holding this circuit board to the motor were obscured by the impeller casing, and the casing could not be removed until the impeller was removed. The impeller was press-fit using components that I did not have the tools to remove.

My best guess is that one or more of the power devices in the H-Bridge shorted out. This might cause the motor to be less controllable, without actually failing.

As an aside, I would like to note that the world's market for power devices has always been fraught with cheap imitations that are not as robust. In earlier days I used to find fake 2N3055-type bipolar power transistors in failed power supplies. From what I've read in various electronics magazines, these bogus devices continue to find their way into too many distribution channels.

This goes to show just how difficult it can be to build a reliable product. And where profit margins are very narrow, most people simply do not want to be bothered with this level of diligence.

Good job figuring this out. To echo another's post, did you do a post mortem on the blower?

When replacing expensive components at home, I always try to fix the old one to keep as a spare to avoid future expense (takes a little extra time, not always successful, but I usually learn something about the equipment I'm maintaining).

As to the Tower of Babel, remember that this is a global economy. People usually buy strictly on price, globally. This includes "primes" like the boiler nameplate. It's hard for a company to justify the cost of fully understanding all the features of each component, or of writing a well-designed troubleshooting guide. If the product outlasts the warranty period, they have an adequate design. Sad, but it's the state of the times (and of the technology).

While the code explanation may not have included this exact situation, it sounds like it was directing any plug and play service response to replacing the blower. And that was the correct solution, wasn't it? Or did I miss something in the story?

Stephen & GTOLover: While I agree in principle with your comments, I think one factor that has been overlooked in these blog entries as well as other similar ones is that it is extremely probable that the EBM PABST blower was a proprietary unit, and therefore not well documented in the public domain. This is not a new phenomenon. Manufacturers for eons have been coding common parts w/ their own internal numbers, forcing customers to return to them for replacement items. If one thinks back, the most blatant example of this is the semiconductor application sector. How many common diodes, transistors, and integrated circuits have had unique numbers applied? No doubt, countless! For example, I have a bin box full of MOTOROLA TO-92 transistors marked "9706". These ARE EXACTLY equivalent to their catalogued part, MPSA64 (P-N-P Darlington). In my lengthy professional career, I've been responsible for the specification of many parts, from capacitors to transformers to "you name it," that were functionally EXACTLY the same as catalog items, but my employers sought to bolster their Repair Depts. with "proprietary" components. This works especially well when the product that you're marketing is sold throughout the world. By the same token, I'm sure that company executives "overseas" use this "tool" also.

One need look no further than NTE Corporation. They've built an extremely successful business by cross referencing & second sourcing a wealth of electronic components, from simple diodes to transistors to integrated circuits to mechanical relays, and so on.

Mr. Brodsky: I wouldn't hold my breath IF I were you for this climate of quick exchange to be surmounted by a more robust policy of providing manuals, troubleshooting guides, etc. I suspect that there has been more than one high-level corporate meeting involving the attorneys urging their companies to go in the opposite direction for fear of some hypothetical liability claim. That, to me, IS the saddest comment of the current state of affairs.

Your remark about Italian controls struck a nerve. Bavelloni makes CNC-controlled cutters. The documentation is there but often puzzling, and parts supplied from Italy are very expensive, even though some are off-the-shelf items.

The thing that had me most scratching my head was their choice of wire colors.

To most of the world, power wiring that is green implies frame ground. In their machines, it often means the 24 volt supply. There is an obvious world of difference in troubleshooting between the two.

I can't disagree with you that there have been and always will be third party assemblies of one sort or another in every complex system. And in and of itself, this is not a problem as long as the assembly inputs and outputs are documented and the narrative for the control system is well defined.

The problem is that neither is commonplace. So now technicians have become board swappers and sensor shotgun experts. And I'll bet at least 1/2 to 3/4 of what they replace was working fine and did not need to be replaced.

I am no Luddite. I design control systems for industrial processes. What got me started down this garden path was that I was busy, so my wife called upon three different, reputable service firms to fix this boiler. The first one simply reset the unit. The second gave me the equivalent of an intelligent shrug, saying that "we don't service these things any more," and the third told me that he really thought I had a bad blower (though he couldn't explain why), but that he really wanted to replace the whole unit to the tune of nearly $11k.

These firms were not populated by fools. They were all knowledgeable, reputable, and capable people who wanted to do the right thing. What they lacked was any way to reason through this problem. My primary concern is that with this Tower of Babel going on inside this boiler, there may not have been a way that Munchkin themselves could have known that the F13 code documentation was wrong. And it turns out that Munchkin is not alone in this. Many firms build their products out of the pieces and parts from other companies.

There were other hints that the controls board from S.I.T. Controls had been adapted from other projects that didn't really resemble this one. There were mysterious registers with instructions not to change them. There were hints of an external Modbus control system, but no indications of what the interface might look like or what each register might do.

This is the hazard of modern controls: unless you know and define exactly what a system is supposed to do at every step of the way and what stimuli it's expecting, there is no way to know that anything you have is in specification or that it is even broken.

In contrast, one of the projects I worked on where I'm employed is a filter backwash system. We have backwash control narratives with 16 steps. We carefully document each of these steps for the operators. If a step gets stuck there will be an alarm and the operators can know exactly how to respond to get things moving again.
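As a rough illustration of the idea, the Python sketch below gives each documented step a time limit and an operator instruction, and raises an alarm message when a step overruns. The three steps, their time limits, and the operator actions are made up for this example; they are not the actual 16-step backwash narrative.

```python
from typing import NamedTuple, Optional


class Step(NamedTuple):
    name: str
    max_seconds: float      # how long the step may run before it counts as stuck
    operator_action: str    # what the narrative tells the operator to check


# Hypothetical steps and limits, for illustration only.
STEPS = [
    Step("close influent valve", 60, "check valve actuator and limit switch"),
    Step("drain filter to backwash level", 300, "check drain valve and level sensor"),
    Step("start backwash pump", 30, "check pump starter and flow switch"),
]


def stuck_step_alarm(step_number: int, elapsed_seconds: float) -> Optional[str]:
    """Return an alarm message if the step (numbered from 1) has overrun, else None."""
    step = STEPS[step_number - 1]
    if elapsed_seconds <= step.max_seconds:
        return None
    return (f"Step {step_number} ({step.name}) stuck after "
            f"{elapsed_seconds:.0f} s: {step.operator_action}")
```

The alarm text names the step and the documented response, so the operator knows where the sequence stopped and what to check first.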

The lack of any such documentation in this application shows that this company was too sloppy to care, and they're hardly alone in this. In the last several years I had problems with a dryer, an oven, and a cooktop. The common thread in all of these products is a missing set of documents to explain how the thing is supposed to do its job. It's no wonder modern "technicians" are reduced to board swapping; it's all they can do.

The only reason I was able to fix this boiler was that I reasoned my way through the entire process narrative. In effect I had to reverse engineer this thing. Expecting a technician to spend time doing that is, well, optimistic. Were it not for the motive of saving over $10k, I don't think I would have bothered either.

I'm old enough to recall the days when every radio or TV set had a service manual and a schematic diagram pasted on the inside of the unit. Owners were expected to be able to replace tubes and even conduct minor repairs. We need something similar for control systems, or there will be many more disgusted customers like me. Had I been less aware of this issue, I'd have spent $11k on another boiler with yet another Tower of Babel in it. As efficiency demands increase, complexity increases. We need better diagnostics. We cannot afford to replace such appliances just because nobody can figure out how they're supposed to work.

I agree. However, finding good techs that diagnose issues is getting harder as dealerships are about making money. Service is not defined by fixing your vehicle cheap, but by fixing it with a profit!

Myself, I read the codes and diagnose the issue much as the author of this article did. Get the meter out, look up specifications from the OEM of the sensors and actuators, try to figure out why the computer/controller displays an error code. There have been several Made by Monkeys posts that seem to share this theme. The science of troubleshooting systems is becoming a lost skill. We read a code, look it up in the book (or online), then proceed to fix the code. Even when we find the bad component, we replace it and all is good. But why did the component fail? I know age, vibration, and wear contribute, but what was the failure? I know in this article, the blower was replaced, but why did the blower fail? Electrical surge, wear, vibration (fatigue), design flaw?

By the way, like the link. Another good resource to learn how to troubleshoot more effectively.
