Do you refactor embedded software?

Software refactoring is an activity where software is transformed in such a way that preserves the external behavior while improving the internal software structure. I am aware of software development tools that assist with refactoring application software, but it is not clear whether design teams engage in software refactoring for embedded code – especially for control systems.

Refactoring was not practiced in the projects I worked on; in fact, the team philosophy was to make only the smallest change necessary when working with a legacy system to effect the change needed. First, we never had the schedule or budget needed just to make the software “easier to understand or cheaper to modify.” Second, changing the software for “cosmetic” purposes could cause an increase in downstream engineering efforts, especially in the area of verifying that the changes did not break the behavior of the system under all relevant operating conditions. Note that many of the control projects I worked on were complex enough that it was difficult just to ascertain whether the system worked properly or just coincidentally looked like it did.

Most of the material I read about software refactoring assumes the code targets the application layer, is not tightly coupled to a specific hardware target, and is implemented in an object-oriented language, such as Java or C++. Are embedded developers performing software refactoring? If so, do you perform it on all types of software or are there types of software that you definitely include or exclude from a refactoring effort?

This entry was posted on Wednesday, February 29th, 2012 at 4:58 pm and is filed under Question of the Week, Software Techniques.

98 Responses to “Do you refactor embedded software?”

Many times embedded systems, especially realtime embedded systems, have some type of life safety and/or stringent certification requirements which are often quite expensive to carry out. Although it would appear that refactoring would lower costs, those savings are often offset by certification costs (medical, nuclear, aviation, voting systems, etc.). I was recently involved with some embedded software for electronic voting machines. We touched as little code as possible due to the outrageous costs of recertifying changes, which could reach $1 million or more.

I’ve learned many times over that what you may see on the surface is almost never the entire picture, unfortunately.

Seems like “refactoring” as a term came along with web oriented languages, where managers would ask for one feature set in a finished program, then totally revise the function of the product, necessitating huge changes in design structure. Sometimes this happens in embedded design, but I have found that those managers (usually engineers, not just coders) know better what their product should do from the start, and don’t so often restructure everything from the ground up. Perhaps the need for huge “refactoring” efforts is an indicator of sloppy, rushed specification at the start.

The biggest “refactoring” tasks I’ve had to undertake involved changing the structure of the code from superloop to tasking, or vice-versa, when changing platforms or operating systems.

I used the term “Migration” as my research revolves around the transition of architectures in embedded systems from event-triggered to time-triggered designs. This is about changing the underlying architecture from a pre-existing system with multiple interrupts enabled to a system based on a single interrupt.
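As a rough illustration of that kind of migration (a minimal sketch, not the commenter's actual design), the C fragment below keeps a single timer interrupt that only counts ticks, and runs work that might previously have lived in separate interrupt handlers as periodic tasks from the main loop. The names `tt_task`, `timer_isr`, `tt_dispatch`, `poll_keypad` and `poll_uart` are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical task table entry: names and periods are illustrative only. */
typedef struct {
    void (*run)(void);   /* task body, expected to return quickly */
    uint32_t period;     /* period in timer ticks, must be >= 1 */
    uint32_t countdown;  /* ticks left until the next release, must start >= 1 */
} tt_task;

static volatile uint32_t pending_ticks; /* written only by the one timer ISR */

/* The only interrupt handler in the system: it just records a tick. */
void timer_isr(void) { pending_ticks++; }

/* Called from the main loop; releases every task whose period has elapsed. */
void tt_dispatch(tt_task *tasks, size_t count)
{
    while (pending_ticks > 0) {
        pending_ticks--;   /* on real hardware, protect this against the ISR */
        for (size_t i = 0; i < count; i++) {
            if (--tasks[i].countdown == 0) {
                tasks[i].countdown = tasks[i].period;
                tasks[i].run();
            }
        }
    }
}

/* Example tasks standing in for former interrupt-driven handlers. */
static unsigned keypad_polls, uart_polls;
void poll_keypad(void) { keypad_polls++; }
void poll_uart(void)   { uart_polls++; }
unsigned keypad_poll_count(void) { return keypad_polls; }
unsigned uart_poll_count(void)   { return uart_polls; }
```

With a table such as `{ poll_keypad, 10, 10 }` and `{ poll_uart, 2, 2 }`, the keypad is polled every tenth tick and the UART every second tick, so the timing of each activity is fixed by the table rather than by when external events happen to arrive.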

I looked after the legacy gas detection instruments at Neotronics for a while. Most of them used 8048 & 8049 processors, and when a customer required their own special alarm levels or feature sets it meant editing the source & burning a new master ROM to be sent down to production. Even changing the short term exposure limit required editing the code – the time span was hardwired into the algorithm. Needless to say, modifying the code, making sure everything fit into the finicky 256-byte code pages, and testing took a significant amount of time.

As a side project, I reworked the codebase to automatically rearrange functions to get the best fit in each code page, changed the STEL algorithm to be data driven and grouped all configurable items into one small configuration file. The end result brought the average release cycle down from two days to two hours.
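To make the “data driven” idea concrete, here is a minimal C sketch under stated assumptions: the comment does not describe the actual STEL algorithm, so this uses a simple moving average over a configurable number of samples, and the names `stel_config`, `stel_state`, `stel_init` and `stel_update` are invented for illustration. The point is only that the time span and alarm level come from a small data record instead of being hardwired into the code.

```c
#include <stdint.h>

#define STEL_MAX_SAMPLES 64u   /* illustrative buffer bound */

/* Everything a customer variant might change lives in this one record. */
typedef struct {
    uint16_t window_samples;  /* averaging span in samples, 1..STEL_MAX_SAMPLES */
    uint16_t alarm_level;     /* alarm threshold, same units as the readings */
} stel_config;

typedef struct {
    const stel_config *cfg;
    uint16_t history[STEL_MAX_SAMPLES];
    uint32_t sum;      /* running sum of the samples currently in the window */
    uint16_t head;     /* next slot to overwrite */
    uint16_t filled;   /* number of valid samples so far */
} stel_state;

void stel_init(stel_state *s, const stel_config *cfg)
{
    s->cfg = cfg;
    s->sum = 0;
    s->head = 0;
    s->filled = 0;
}

/* Adds one reading; returns 1 if the windowed average has reached the alarm level. */
int stel_update(stel_state *s, uint16_t reading)
{
    uint16_t n = s->cfg->window_samples;
    if (s->filled == n) {
        s->sum -= s->history[s->head];   /* drop the oldest sample */
    } else {
        s->filled++;
    }
    s->history[s->head] = reading;
    s->sum += reading;
    s->head = (uint16_t)((s->head + 1u) % n);
    return (s->sum / s->filled) >= s->cfg->alarm_level;
}
```

A customer-specific variant then becomes a different `stel_config` value (for example `{ 4, 50 }` for a four-sample window with an alarm level of 50) and the algorithm itself is left untouched.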

My boss at Neos supported this task, but in every other place I have worked refactoring has had to be done as a skunkworks project, because at the executive level it is rarely appreciated that what looks like an exercise in Engineering navel-gazing can give real cost reductions a few months down the road.

While I never used the term “refactoring”, in the dim past I have rewritten legacy software and my own (in-development) code more than a few times. When I did, I was motivated by 1) the need to make the code robust (work for all newly known situations), 2) the need to make the code fit into available codespace, 3) the need to make the code more maintainable (reduce the probability of future changes causing problems). I’m sure there were other reasons, but these three come to mind quickly.

My management was wise enough and trusted my judgement enough to grit their teeth and let me get the job done. My need for quality derived from their need for quality and I worked to factor time and cost effects into these decisions.

Change for the sake of change is never a good idea. Far too often I have seen bugs introduced by refactoring. This is usually the case in old, badly commented code where things have been done in a seemingly odd way, but for a reason, though the reason was sadly not documented at the time.

I work in safety systems now, where every change must be reviewed and recertified, so we definitely only change code that needs to be changed and those changes are kept to a minimum. Certification is a long and very expensive process.

Although Jack has listed legitimate motivations for refactoring I fear the usage of the term is not quite what Robert meant in his original question.
Refactoring is not just redesigning, or rewriting, your software. It is a process that necessarily requires an associated automated test harness, preferably designed into the code from the start. It is usually used with OO languages but could be applied to procedural languages as well. It involves iterative cycles, each of small changes followed by automatic test. How you select the steps to take comes down to experience but Martin Fowler’s book, “Refactoring: Improving the Design of Existing Code” gives a good introductory tutorial for those using OO languages.
My team is using refactoring techniques on safety critical real time embedded OO applications. As Gregory pointed out, the regulatory process is very expensive. However we do the refactoring with every code review, with the motivation of making the code maintainable. In any safety critical application all code must be reviewed, and this happens long before any submission is made. If the reviewed code has associated test harnesses that give the reviewer a high confidence that the code is running correctly he can make changes to the code during the review and be confident that these changes have not broken the code. This process gives the reviewer a much deeper understanding of the code they are reviewing and helps add real value in a process that we have to follow anyway.
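A minimal sketch of one such small-change-then-test cycle in C is shown below. The routine, its constants and the harness are hypothetical examples, not code from the commenter's project: a legacy conversion with magic numbers is kept alongside its refactored version, and the harness checks that both agree on every possible input.

```c
#include <stdint.h>

/* Hypothetical legacy routine: converts a 10-bit ADC count to millivolts
   for a 5000 mV reference, written with magic numbers. */
uint32_t adc_to_mv_legacy(uint16_t raw)
{
    return ((uint32_t)raw * 5000u) / 1023u;
}

/* Refactored version: same arithmetic, named constants. */
enum { ADC_FULL_SCALE = 1023, VREF_MV = 5000 };

uint32_t adc_to_mv(uint16_t raw)
{
    return ((uint32_t)raw * (uint32_t)VREF_MV) / (uint32_t)ADC_FULL_SCALE;
}

/* Harness: returns the number of inputs on which the two versions disagree. */
unsigned count_mismatches(void)
{
    unsigned mismatches = 0;
    for (uint16_t raw = 0; raw <= ADC_FULL_SCALE; raw++) {
        if (adc_to_mv(raw) != adc_to_mv_legacy(raw)) {
            mismatches++;
        }
    }
    return mismatches;
}
```

If `count_mismatches()` returns 0 after each small edit, the reviewer has direct evidence that external behavior is unchanged for this routine. Exhaustive comparison is only practical for small input domains; larger ones would need selected test vectors instead.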

However, I’d also question leaving poorly commented code or misleading documentation as is. I consider bad documentation to be a risk which is only temporarily asleep.

I even built a “change comment only” capability into a code editor I made back in the early 1970s. Following any “just comments” edit, new/old executables would be verified as identical (no other testing required).
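The same check can be approximated today with a byte-for-byte comparison of the two build outputs. The helper below is a small sketch with a hypothetical name, `files_identical`, using only standard C file I/O; it is not a reconstruction of the editor described above.

```c
#include <stdio.h>

/* Returns 1 if both files can be opened and hold the same bytes, 0 otherwise. */
int files_identical(const char *path_a, const char *path_b)
{
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    int same = (a != NULL && b != NULL);

    while (same) {
        int ca = fgetc(a);
        int cb = fgetc(b);
        if (ca != cb) {
            same = 0;          /* differing byte, or one file is shorter */
        } else if (ca == EOF) {
            break;             /* both files ended together */
        }
    }
    if (a != NULL) fclose(a);
    if (b != NULL) fclose(b);
    return same;
}
```

One caveat: some toolchains embed build dates or paths in the output, in which case the images differ even when the generated code does not, so the comparison may need to be made on a build with those features disabled.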

I guess to a large extent it depends how active the code is; if it is written once, dusted off every now and then for a minor feature modification then put away then there is no justification for eliminating the nasty little kludges that find their way into the code when programming under the pressure of an imminent deadline.

If however the code is in constant use generating product after product, with multiple teams all working across the whole codebase, then there certainly comes a time when the accumulation of blue wires makes it extremely difficult to be sure that a change has no unexpected side effects. In this environment periodic reviews and clean ups can extend the life of the code, accelerate development, remove performance bottlenecks and eliminate traps for the unwary.

Ideally, to refactor/recode or not is a decision to be made on a situation by situation basis.

Your organization’s Defect Tracking and Management process might be one way to mitigate this, by tracking problems in chunks of code which exist but which, for risk, schedule and/or economic reasons, will be deferred until some future massive product redesign. The defect resolution could require explicitly documented operational limitations and user training where applicable.

I strongly favor documenting code limitations within source code comments, but can see potential for alternative code documentation wrappers. Unfortunately, my experience with inherited, separated code documentation has too often been pretty negative.

Hmmm, that’s what we always do, to be perfectly blunt. All our software functions in control devices have “exactly” the same function call interface, thus any function can be interconnected with any other function. The objects all have various icons, where we actually “draw” the interconnection of the functions to create product functionality, which we call “User Application Programs” (UAPs). We “reuse” these UAPs over and over and over again across products, sometimes modifying them slightly, sometimes creating new ones. Since all functions in a product use the same interface, and since all product functionality uses these functions, we also reuse test plans, test code and so on to “retest” UAPs when changed. I could go on and on, but effectively we treat these “assets” (functions, objects, UAPs, test plans, etc.) as reusable entities. How do you reduce project schedules and costs? By reusing tested assets. We never start over on a new project, and since most functions are in ‘C’ they port quite well. Ya, we have to rewrite drivers for new CPUs. Sorry, have to go.
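One way to picture “exactly the same function call interface” in C is a single function-pointer type shared by every block, so that any block's output can feed any other block. This is only a guess at the general shape; the commenter does not show the real interface, and `block_fn`, `block`, `block_gain`, `block_clamp_0_100` and `run_chain` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Every function block shares this one signature (hypothetical). */
typedef int32_t (*block_fn)(int32_t input, void *state);

typedef struct {
    block_fn fn;
    void *state;   /* block-private data, may be NULL */
} block;

/* Two example blocks. */
int32_t block_gain(int32_t input, void *state)
{
    return input * *(const int32_t *)state;   /* state points to the gain */
}

int32_t block_clamp_0_100(int32_t input, void *state)
{
    (void)state;   /* this block needs no private data */
    if (input < 0) return 0;
    if (input > 100) return 100;
    return input;
}

/* "Wiring": feed each block's output into the next one. */
int32_t run_chain(const block *chain, size_t count, int32_t input)
{
    for (size_t i = 0; i < count; i++) {
        input = chain[i].fn(input, chain[i].state);
    }
    return input;
}
```

Because every block has the same signature, a test routine written once against `block_fn` can exercise any block or any chain, which is the property that makes reusing test plans and test code plausible.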

I’ve refactored embedded systems firmware a few times, refining the design in a way to make the code easier to maintain and extend, and sometimes also to improve the efficiency. In my opinion refactoring is becoming more and more common, due to the increased complexity of embedded systems code. Performance level and memory availability are making firmware look more like software than it did before. Maybe the real-time domain is less affected by this evolution, but the concept of ubiquitous computing is distributing the processing power that once was associated with a desktop PC to embedded devices (things that think). Maintainability is becoming more important because embedded applications are required to be somewhat flexible, and so is testability (as part of maintainability). I definitely agree that instrumenting the code with test harnesses is important to reduce the cost associated with the consequent validation. As Don pointed out, I think that using a common API can be a step further towards having validation done in a shorter time, just as having JTAG interfaces on your PCB can reduce the time to prepare a validation system.

I have refactored embedded real time systems – but not for an existing product (as noted the cost of verification is high). The refactoring was done while porting old code to a new product, to make it more maintainable and flexible for the next one.

Oh how I dislike that euphemism. Stated another way, and more to the point – Have you ever had to re-write software because of some unexpected oversight? (say because it didn’t comply with coding standards, memory budget, time budget etc.)

Of course I have. The previous developer failed to analyze the problem correctly.

Wow, this thread is dense with good comments. I too have a story to tell on this.

I tend to dislike the term, but yes, I did that. More than once, actually.
I agree that major rewrites of the entire codebase just for the sake of “updating the structure” hardly makes sense in deeply embedded industrial systems, and one of the reasons is due to major re-qualification and certification costs.
One recent such system, based on an 8051, had been ported with minimum intervention over the last 18 years, during which period it saw 4 generations of hardware and many engineering teams that retouched the code.
The problem was, the product performed perfectly in the field, so the code was deemed “field proven for the last 15 years” by higher management.
That code was so hard to maintain, with so many violations of good design practices, that it took the best out of very talented engineers to keep it, and eventually all of them ended up hating the job.

Recently we decided to do an utter and complete redesign on that product line, using the OOD code designed from scratch for a different ARM based instrument product line. We redesigned absolutely *everything* from scratch, keeping the case, connectors and panel largely untouched.

The whole redesign of the code took us just 6 man-weeks, and 50% of that time was spent on class and dataflow design to avoid ad-hoc solutions and to ensure concern separation on all modules. The result was a coherent code base that is now being used as the core for all of our product lines, designed from scratch to meet safety qualification, and built for maintenance.

Over at least the past 5 years, the old product has drained engineering resources costing many times more than the new, cleaner code would have.

So yes, refactoring should be done sooner than later in some cases. The advantages of having a better product, that consumes less engineering, and can be expanded at lower costs, will probably be worth the effort right off the bat.

That is also true for subsystem design on an ongoing project. Especially in realtime systems, the lower levels of the system (drivers and protocols, timer services and scheduler, for example) sometimes offer an invaluable opportunity to make the whole system perform better. We do that as part of ongoing system evolution, between release points.

I like this discussion and there are many life stories that I can relate to. In many companies that I worked for, refactoring is dreaded by management. They delayed it, they minimized it, and they reduced the scope. In one project, the greatly anticipated refactoring work did not allow us to remove years of bad patching or do any risky redesign work.

Having said that, it’s better to adopt a strong process in your team/department to reduce the erosion and decay of your legacy code.

@Alexander – “everyday little by little” puts the liability on you. It’s like making improvements to a bathroom, and then a big renovation project makes your improvements a waste of money and effort.

Sebastien raises an important point – a test harness. It gives you a great deal more confidence that you haven’t broken something along the way!

Back to the original question. Yes, I do refactor. As so many others have pointed out, the “status” of the software (test, certifications etc) greatly influences when you can do it, but there are often natural opportunity points (new release, adding new functionality) to do it.

Bouncing off Johnny Doin’s comment, I wonder how many of us would admit to recommending a ground up redesign over slogging through another revision of aging code, supported by the (perhaps weak) reason of using more modern hardware? Come on, admit it!

An even worse situation requiring “refactoring” is this. I wrote a program which morphed into several products over perhaps a dozen years. It was all structured as state machines, and that was natural for the application (LCD user interface and communications). The code was easy to expand because of it.

My customer had another internal programmer take that program and modify it for another application. He had apparently never used a state machine, so he globbed on about 30 flags and bits of code here and there to accomplish the task, and they started shipping the product. So it has the outward appearance of a well structured program, but the internals of a procedural disaster. Spaghetti to the tenth power.

Then they wanted me to do a few small changes to the program, since I wrote the initial incarnation and the other programmer is gone. But the program has now reached a state of *unmaintainability*, if that’s a word. It is impossible to make any change without screwing up something else. The number of possible states of the 30 various flags has resulted in an exponential number of conditions that could never be tested for, much less fixed if found defective.
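The scale of that problem is easy to put a number on. As a small illustrative sketch (the function and enum names are made up), 30 independent boolean flags allow up to 2^30 = 1,073,741,824 combinations, whereas an explicit state machine only has as many states as its enumeration lists.

```c
#include <stdint.h>

/* Upper bound on distinct settings of n independent boolean flags (n < 64). */
uint64_t flag_combinations(unsigned n)
{
    return (uint64_t)1 << n;
}

/* For contrast, an explicit state machine: adding a state adds exactly one
   value that the code and tests have to cover. States here are illustrative. */
typedef enum { UI_IDLE, UI_MENU, UI_EDIT, UI_CONFIRM, UI_STATE_COUNT } ui_state;
```

Not every flag combination is necessarily reachable in practice, but without an explicit state list there is no simple way to know which ones are, which is what makes the modified program so hard to test.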

I would have had to refactor just to get back up to zero, then do the mods to recreate the required functions. Yikes. Seems like the term refactoring encompasses justified reworks by professionals, as well as fixing problems caused by people who should not be programming.

@Hank : Wish all it took was a brief conversation instead of the metaphorical “guts”

I am interested to know if anyone here has been able to successfully convince their manager/customer for dedicated time for refactoring; especially the technically challenged type who can’t see beyond the immediate deadline?

In my case, all of our code is in assembler, and mostly undocumented. Our legacy systems have had to be “patched” many times to address new customer needs, and the system has reached a high level of complexity. Management finally allowed me the time necessary to re-implement large portions of the code to address many lessons learned to make the underlying code more efficient and understandable, removing unused code, lines commented out years ago with no explanation, addressing errata sheets from the silicon manufacturer, etc. This refactoring exercise has been expensive, frustrating and finally successful. It is not something I wish to repeat, and I am making sure all of the lessons learned are going into the requirements and design of the next system.

Just a question: how many here ever “tried” to create an “internal software architecture” that is totally reusable across products of different CPUs and for different industries and “tried” to treat your code as reusable assets? In effect, treat code just like a piece of electronic hardware, where you develop, test, and catalog the code and reuse all the core objects on every product you do? I hesitate to post this question since I have always been told it couldn’t be done and until I had my own company, was unable to do it! So although I know this could get extremely opinionated, and I know it may not apply to all products (I hear this all the time until we prove it wrong), why does this never occur? Is it because of the short-sighted “ship it” mindset, is it because of “job security”, is it because of “competence to do it” or what? I have been part of the “reining in” of multiple product networks (more than you want to know) down to three core networks, some OS standardization to reduce “some” support issues, an object based network interface to converge product differences where commonality was possible and desired by customers. I am sure there are many reasons, but why aren’t the creation, usage, and support of real-time embedded firmware assets treated “like a business”, like the hardware guys do all the time?

Thanks, Don – as long as some portion of the platform remains constant, we have been able to head in that direction, but, alas, there are always exceptions. Our current systems use different communications protocols, and we have variants on different hardware and software platforms, making it difficult to develop standard libraries of functions that can be used across the entire product line. Shipping product supports my continued employment, so it is difficult to get these types of changes into the product for the sake of “carry-over”, especially as we migrate from one 16 bit hardware architecture in assembler to a 32 bit architecture with hardware floating point in C. One of the main reasons we selected this new platform is to enable just that sort of consistency and re-use of major portions of the software. Our goal is to enable support of PowerPC and TMS320 based products from the same code base with as little hardware dependent code as possible. One ditty from my early career often cited was “not every system is a VAX” – meaning beware of non-standard, single-platform implementations. Portability should not be viewed as a goal, but as an implementation requirement, with non-portable parts isolated and well documented for the eventuality that either the compiler or the hardware base will change eventually. A strong version management package and a commitment to effective configuration management is an underlying necessity if this goal of portable, re-usable code, documentation, test, feature sets and rapid development of new systems is to become a reality. Many have tried, I know of very few successes.

I will not debate the good reasons already mentioned to not refactor / redesign, and yes, although sometimes it is not possible, affordable or worthwhile, in my opinion these are constraints, difficulties to solve. I mean, redesign should be a common best practice to follow, not an option to consider only when there is time, budget, opportunity… Under that way of thinking, there are always “good” reasons to not redesign. I have heard those arguments too many times…

Hank has mentioned a personal experience more common than not, one of the reasons that precisely breaks arguments like “it’s working, isn’t it?” or “No! What if you introduce new bugs?” or any other excuse. You design perfectly and develop maintainable, adaptable code and without bugs on the first attempt, don’t you?

The attitude should be “how can we refactor”, not “should we refactor”.
My personal attitude is towards quality, a word that I love for professional activities. Isn’t there anyone concerned about quality and improvement in software engineering, preferably in management positions? Fortunately yes, but maybe not enough…

@Sibin Thomas wrote: «… if anyone here has been able to successfully convince their manager/customer for dedicated time for refactoring; especially the technically challenged type who can’t see beyond the immediate deadline?»

When I joined my current company, I was told “we need to launch a new product line, which has the same inner functionality as the current line. You just have to program some new fieldbus protocols and maybe design a larger eeprom memory. You must use the same code, because it is working perfectly in the field for the past 15 years.”

After a few weeks of analysis on the “field proven code” I unearthed some nasty bugs that were just in plain view, and documented enough of the datapath to show several very dangerous cases of promiscuous re-use of global variables by different code modules. I did that to back up my position to redesign *from scratch* the whole system.

I drafted a new system architecture and highlighted a few really important buzzwords like “aspect oriented design”, “concern separation”, “object oriented design”, and defended the new view.

The catalyst for the complete turnover of design methodology was some very timely and critical bugs that happened in the field at just that time. The bugs talk much louder than any nicely explained software architecture. What I did was dedicate some of the best engineering staff to maintain the old code with surgical precision, but instead of sweeping the dirt under the carpet, we exposed every weakness in the old code base. That maintenance kept the product alive while a completely new system was built from scratch.

Today the code base for all new product lines is the clean and portable new system, and we are refactoring the existing lines, now without having to hide the effort.

I think the bottom line is: usually *you* are the firmware specialist, not the management. Instead of accepting weak reasons for doing a bad job, you should try to show the good reasons for not doing it. When you succeed, you have one season of real good and enjoyable work ahead. When you fail, you can work 9 to 5 and keep looking for a real job.

I worked on a line of instruments for HP many years ago where the management had bitten that bullet: we designed a common (re-usable) firmware framework for several different divisions and across a global team. The spectrum analyzers and scopes had enough in common to do this from-scratch design that each team could tweak for their specialized needs. So management DOES sometimes allow this type of effort, but from the rest of my career, I would say it is not common.

For ten years, I’ve been working on a small team maintaining 150K CSLOC that was mainly written by one person (no longer on the team) who seemed to believe code was self commenting and that sr, sk, and sl were descriptive variable names. For ten years, we have been refactoring this code in order to modularize it, make it more readable, and to add comments.

We have spun off several derivative products, and the code has become much more stable. Our refactoring has included total rewrites of certain modules, breaking out countless subroutines to shorten huge switch statements, and reorganizing code for better data hiding.

We don’t call the work “done”, but we’re taking what we’ve learned into the next generation, a completely new platform, which hopefully won’t cry for so much rework.

@Don P. – It’s not part of the company’s mandate to make a system completely portable and diverse because the project will never end. A designer’s job is to design it well with as much foresight and flexibility as possible. Only open source projects can afford to think “ideally”.

@Johnny and Vicky – as a developer, your refactoring motive is to rebuild a better system and to improve your life. When you are a manager, life is already better when the system becomes stable and the customer level is high. I never see eye to eye with upper management when it comes to refactoring work. So I’m very surprised that you have such positive support.

@Johnny D – You are a wise and true developer at heart. I know of so many developers who became managers and it’s like swimming with the sharks. Climbing the ladder means keeping a steady profile and avoiding any career-killing projects.

@Julio – I share your sentiment and I enjoy perfecting my craft. Life’s experience taught me to pick a workplace that shares my passion rather than staying loyal.

@Jonny : Fortunately for you, you were aided by ‘timely’ on-field failures; how would your story have gone without those demonstrable failures?

@Everyone :
If I were to rephrase my question –
As an experienced person you see early signs of the problem and you know exactly how this is going to unfold – usually an implosion or a less dramatic slow dilapidation, eventually necessitating an expensive new construction. Now, how do you convince someone that corrective measures taken soon (‘fixing’ it when it ain’t ‘completely’ broken) will be hugely more beneficial in the long term?

Using an analogy from the medical world – you notice early stage cancer markers in the bloodstream, and you can estimate that this person will remain blissfully asymptomatic for likely the next few years. How do you convince this person to undergo preemptive treatment even before the symptoms begin to show? Note that the preemptive treatment is much costlier than doing nothing, also note that the treatment during late stage cancer is much costlier than the preemptive treatment and the risk of mortality is much higher too.
[From anecdotal evidence and some Googling I have learnt that the success rate in these cases too is much lower than what the conscientious doctor would like it to be]

Looking for a Gandhian/Zen approach as opposed to the usual flight/fight response.

@David Lightstone : I began reading the “Death March”…I am extremely concerned about the frequency with which I nod in agreement with what is written in the 1st chapter! The quote from Scott Adams does put things in the right light though

@Sibin: Maybe you’re right, I was saved by the bugs. But actually, one of the reasons to redesign is that the design is not good. Sooner or later the bugs will come.
One co-worker at the time told me that I was making lemonade with the lemons we’ve got.
I realised some time ago that you can build on your successes, and in embedded engineering your successes are the legacy from past projects. Today I try to create an environment for my team, where instead of looking backwards, maintaining bad legacy code, I sell to higher management the vision that the code base is a valuable asset, and when that asset is frozen it de-values rapidly.
One of the reasons that “legacy code” semantics translate to pain and suffering is that it’s “other people’s code”.
The value of a code base that is under full control of the engineering is that you can turn it into wealth faster and more often.
Engineering, in an information or manufacturing company, is the place where you create potential value from thin air. Here is the place that transformed society in the past 200 years. We create the potential real value, and the rest of the company may turn that into wealth, if the impedance is matched in the path to market.

We better keep the products in a state of live development, so we keep the technology current and ready to generate by-products, or to outpace the competition.

That is the frame of mind that can turn upper management into an ally on the refactor quest. And place the term “challenge” in a new context for us all.

P.S. Anti-Dilbert mantras should be chanted every morning and before any mgmnt meeting.

Because it is not visible or deliverable, refactoring is not often a management priority. But as so many of you, like @Alexander, have pointed out, it helps the long-term viability of the product code base. In cases where I have done or have been aware of refactoring, it’s generally been when introducing a new variant, or porting to a new RTOS and/or CPU. We already know there will be extensive testing on the new code so the fear factor is reduced. Unless you are redesigning from scratch like @Jonny, reverse engineering and refactoring tools can be a big help. Understand C++ has been helpful to me because it lets me view the source as a unit. I love the ‘butterfly’ call graph that shows all calls to a function and all calls that function makes. There are other tools that go farther by reverse engineering the actual design so you can visualize it. That helps you decide how much needs to be done.

@ Sibin Thomas
FYI – Those who read Death March may conclude that he is suggesting such strategies be followed. I think of it more so than anything as an observation of the form – if project management continues its evolutionary path, here are the consequences.

One must wonder if the Agile movement is a continuation of the evolution, or a redirection. My opinion is that it is a continuation, with the goals of continuous integration and refactoring being manifestations of it.

Two initial “refactoring” strategies (assuming the code is totally worthless) are:
(1) For Assembler
Make extensive use of macros, for the purposes of encapsulating data and isolating simple (very simple) algorithms. You can place the macros just about anywhere so as to achieve a preferred structure. The advantage is that you are not making any changes at all to the binary image (i.e. the binary images are identical).

Eventually this strategy will cease to be advantageous. When that occurs, you know that any change you make needs to be tested (for both temporal and algorithmic behavior).

(2) For C
Basically the same game, except now you exploit inline coding (if the compiler permits) or the C macro capabilities (if the compiler does not inline).
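
As a rough illustration of strategy (2), here is a minimal sketch. It assumes a hypothetical scaling expression for a 10-bit ADC that had been pasted at several call sites; every name and constant below is invented for the example and does not come from any real project. Whether the binary image really stays identical depends on the compiler and its optimization settings, so it is worth comparing the listing or map file before and after.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values for illustration only (10-bit ADC, 5 V reference). */
#define ADC_FULL_SCALE_COUNTS 1023u
#define ADC_REFERENCE_MV      5000u

/* Before: this expression was pasted wherever a reading was scaled. */
static uint32_t channel_mv_before(uint16_t raw)
{
    return ((uint32_t)raw * 5000u) / 1023u;
}

/* After, option A: one named inline function, if the compiler inlines. */
static inline uint32_t adc_counts_to_mv(uint16_t raw)
{
    return ((uint32_t)raw * ADC_REFERENCE_MV) / ADC_FULL_SCALE_COUNTS;
}

/* After, option B: a macro, if the compiler does not inline. */
#define ADC_COUNTS_TO_MV(raw) \
    ((((uint32_t)(raw)) * ADC_REFERENCE_MV) / ADC_FULL_SCALE_COUNTS)

/* Exhaustive check over the 10-bit input range that the algorithmic
 * behavior is unchanged. Timing still has to be checked on the target. */
static void check_scaling_unchanged(void)
{
    for (uint16_t raw = 0u; raw <= ADC_FULL_SCALE_COUNTS; raw++) {
        assert(adc_counts_to_mv(raw) == channel_mv_before(raw));
        assert(ADC_COUNTS_TO_MV(raw) == channel_mv_before(raw));
    }
}
```

The input range here is small enough to test every value, which is the cheap case. For wider inputs you would have to fall back on boundary values and samples.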

@all: I poked a bit at everyone’s soul because we all have been there and most comments likely put a smile on most of our faces. I have been fortunate in my prior corporate life and worked directly with the top guys in major controls companies. Despite what some may think, the guy on the top ALWAYS cares as he just wants to improve time to market, increase quality and, bottom line, make more money. Depending upon the company, the ideal goals break down as you proceed down the political chains for the reasons many of you stated. In some of these companies I have been one of those “change agents” and as the old Dilbert cartoon showed: “How do you know a change agent? He is face down in the mud with arrows in his back!” How true. But despite all the accurate comments from everyone, I see there are still some who don’t think you can make reusable, tested software components, partition these for significant reuse, and, primarily through well architected software and reuse, reduce time to market and improve quality. The old “denial syndrome” shows in some of the comments. All I can say is we have been doing it for nearly ten years, got it patented, and product ship durations won’t even be mentioned here since most would not believe it. But I will say this: no matter how SHORT the schedule and how LOW the development cost, most customers will always want it shorter and cheaper. I will also say I could NOT pull it off in corporate America due to cultural issues and had to do it in my own company, being in 100% control, for many of the reasons you guys stated. It also takes incremental evolution, but always starting from exactly the same core code. So a rephrased question appeared which effectively asked: how do you change this mindset? This is my life getting “new customers”, and even when you “have the solution” and can “open the hood” and show how and why it can be done, you could often hear a pin drop in the room.
But those who “buy the solution” are those who manage the schedules and budgets, and very often they do so against the recommendations of some engineers who think it’s snake oil. So if you want to “sell the idea” you have to “do it”. And if some of you actually think you can do it, when you try, observe your co-workers and how many “help” and how many “hinder” your attempts. You will soon discover that it is not “just a management issue”. I have always said: “If they could have, they would have. If they did it then they have it. If they don’t have it, then either they can’t do it or they don’t want to do it”. You be the judge as to why you have these problems in your companies. No more soap box, got to pay the bills.

There are many truths in this discussion:
1) company culture
- As a young company, developers strive for innovation and great design. You can include refactoring in your daily tasks or schedule it for later. Management supports that energy.
- As a profitable company, developers are pushed to continue the leadership with new features while maintaining stability. Whatever it takes is the policy, schedules are overloaded, groups are divided, training is lacking, and process control breaks down. Management cannot overlook large unnecessary code commits. New developers do their best to fix bugs without proper feedback either.
- As an older company, those products are bread-and-butter and leadership has changed hands many times. I see very little support for refactoring work from the development group or from management. It requires aligning good people to make this effort a reality, but it also requires a quantitative delivery. Stability is measured by the bug list, which is not reflective of what is under the hood.

> “How do you change the mindset?”

It takes 3 or more innovators to sustain the drive for change.

>”If they could have, they would have. If they did it then they have it. If they don’t have it, then either they can’t do it or they don’t want to do it”.

Idealistic and capable people will “do it”, “want it”, and “care for it” even in the face of negative feedback. The challenge is to satisfy the marketing team and you can achieve autonomy in your development group.

We actually “pulled off” management support for “refactoring” a LONG time ago in one large company. One of us made it apparent to some senior VPs how much money they were leaving on the table due to NEVER going back and “cost reducing” existing “cash cow” products. Thus it made financial sense to make updates, where obviously most “cost reductions” were due to hardware changes, but sometimes software changes were involved as well. In fact, as I recall, part of the VP’s bonus was based upon cost reductions, so you know it got done. However, all good reward systems must only be temporary. I was actually in a meeting on a “new product” where one individual said jokingly, “who cares about the initial product cost, xyz will get a bonus when we reduce it later!” Hmmm, although a joke at the time, you can see where reward systems can break down. Relative to “refactoring firmware” without a manufactured cost savings? Guess that’s why it seldom occurs. Heck, despite me being such an architecture and reuse bigot, for my own products I try to do stuff right the first time just to reduce maintenance costs. But even still, at some point in time you have to “shoot the engineer and ship it” or “close the doors”. If people want to truly understand the “ship it” mindset, just run your own business for a while. One last idea about getting management support: get them to let you do one product “right the first time”, then reduce project time/cost on the second product, and reduce it even further on the third product. Track this stuff and make doing it right visible and you will get the support you need.

I am just reading a book titled “Test-Driven Development for Embedded C” by James Grenning. He is a big agile proponent as well as a TDD proponent. TDD is sometimes used in agile. The use of TDD in some form or other predates agile, though. I mention the book because he also promotes refactoring. Frankly, I think it is necessary. If the software does not conform to good software engineering standards then it should not go through the certification tests that many safety critical systems require. That is just my opinion. I have worked on satellite and military systems where code correctness was paramount. Code was also long lived. Peter Maloy had a good example of where “refactoring”, or paying attention to how code is structured, saved time.

It is an excellent book. I’m far from sold on either Agile or TDD as “processes”, and not too keen on some of the baggage that goes with them, but I would recommend the book to anyone involved in embedded development.

@ Louis Giokas
Certification has absolutely nothing to do with software code quality (in the sense of whether it does or does not need refactoring).

Irrespective of your experience with military systems, to give you a good measure: on a DO-178B Level B project (C-130 aircraft) I was most disheartened to discover that there were no coding standards.

@David: You are absolutely correct. My point was, considering the high cost of certification, why would you certify code you knew to be poorly structured? Certification does not guarantee correctness. Certification means that the code conforms to certain standards.

I was thrown in to a satellite project many years ago which was instructive. The lead on the CMS (as the operating system was called) intentionally created a system with self-modifying code. He did this so that he would be indispensable. Well, they finally fired him. I spent a couple of weeks figuring it out and redoing (refactoring) it. By the way, so as not to scare me off, the management did not tell me any of the history until after I had finished.

As another example, I have worked on several projects where customers have gone overseas for software development. This is not a hit on outsourcing, but these are the examples I have run into recently. Without standards, the code received was awful. It was poorly written and not maintainable. There was no documentation. I lay it at the feet of the customer. They wanted cheap code. Code is code, right? One that I heard of very recently involved an if statement that was “2 pages” long. It was replaced by a statement that was two lines long. That is radical refactoring. At a real-time systems conference last year one of the speakers talked about a military fire (as in gun) control system that had a problem. It was 500K lines of code and not documented. They just gave him a listing. Perhaps it worked at one time. Then it was left as is. It should never have been allowed into the acceptance test.

What I am getting at is that code quality is just as important as functionality in my opinion. Didn’t the auto industry go through that a few decades ago?

@Louis Giokas
Certification is related to fitness of function and performance characteristics of that functionality. It does not address how that functionality is achieved. White box vs black box testing. Refactoring is a white box sort of thing. Certification a black box sort of thing.

In DO-178, poor code (i.e. code that badly needs refactoring) will probably fail the Requirements to Code traceability review criterion at SOI 2, so it is not as if the nasty stuff can escape notice. It’s just a question of whether the nasty stuff will be tolerated.

This discussion illustrates that you don’t get quality code without quality people. Sounds obvious, but it’s assumed and ignored too often. All you guys and gals are experts at the top of your game, but there are a lot of people out there writing code otherwise.

I worked with an agile team that was writing a program to interface to my embedded device. They would puzzle for hours over problems, then declare “time to refactor!” and disappear for days. All the scrums and pairs of programmers typing on one keyboard still produced code indicating a fundamental lack of understanding of the problem at hand. And they could only count in decimal. The formal system is useless in the hands of folks like that, though they had good intentions. It is that experience the term “refactoring” calls to my mind.

@Hank: You are 100% correct. I recall one study I read years ago that found a productivity difference between the worst engineer and the best engineer of 20:1, and I would think we have all seen individuals at both ends of this range. It appears many on this site may be the “go to guys” when something needs to get done, and everyone in an organization knows who they are. In a prior life I was in on many management discussions where I would often hear the comment “we need our employees to act more like business people” and similar comments. I always said: “If you want them to behave that way, why don’t you treat them that way?” Bottom line is I ended up generating one of the “corporate America proposals” to minimally “force the obvious acknowledgment of people capabilities.” In just a few sentences, the bottom line of the proposal was: 1) pay all engineers a “fixed rate” to survive, 2) have groups of engineers bid on a project, 3) if they get the project, whatever money remains once it is completed they split based upon their “inter engineer contracts”. There was quite a bit more, but that is the idea. Upon reading the proposal my boss said something like: “But Don, if we did that then you and “x” and “y” would team up and get most of the projects and “a” and “b” and xxxxx wouldn’t get any.” Hmmm, I sat back and said something like: “So you clearly understand the problem, yet those other people are still here!” The whole idea of the proposal was that if you worked very hard and got stuff done early, you made a LOT of money and so did the company. Bad projects would die a quick death. Low bid projects would go belly up when people quit. “Moving target” requirements would seldom be accepted by engineers unless they felt future revenue would be the result OR marketing would have to come up with more money to implement the changes. If you were no good, you didn’t make base rate and you would likely quit. Seemed like this was treating engineers like businessmen.
I thought the proposal died a quick death. Then at a Christmas party, months later, one of the “top two” in the corporation came up and asked me some questions about the proposal, which he liked by the way. I almost fell over; I thought it had died and never seen the light of day. Bottom line is, the proposal NEVER happened. The proposal had stuff about how marketing was rewarded, how product success rewarded the engineer on a decreasing scale, where due to incentives some “marketing plans” might go with a “no bid” from engineers and so on. In fact, in another life I actually did a DFD and STDs on the new product development process, where the STDs clearly showed why things stopped, where decisions weren’t made and so on. In no uncertain terms I was told to “lose it and stop all further work” on that analysis. I guess sometimes the truth hurts. You can probably see why I finally started my own company, as my success is 100% dependent upon my performance and capability. So bottom line is, someone earlier asked about how you “fix the system”. You “brave guys” might want to use your design skills to analyze your current development process and create a proposal! But when you present it to your boss, remember what they say in the movie: “You can’t handle the truth”.

@Don – The summary of your life’s experience would be that a good engineer can either remain constantly bewildered by corporate “culture” or head out and start on his own?
Would you go so far as to state that there is nothing in between?

@Sibin: Nope, going on your own is NOT for the “faint of heart”. Living in corporate America has its rewards: you get vacation time, you don’t have to look for the next customer, you have sick time, you get paid while you post emails and stuff like that. Most engineers are “not” salesmen and thus would likely not survive out here. Having worked for some of the better control companies and interacted with many engineers on a weekly basis while deeply involved in ODVA, I know a LOT of VERY GOOD engineers who make a good living working for someone. Heck, I did it for a long time before I split off on my own. It was nice working normal hours, talking around the coffee machine while getting paid, working with some “best of breed” people every day and so on. On the other side, at medium and large companies (to their detriment) you are put in a specific area: software, hardware, packaging, manufacturing, firmware, project management and so on. You don’t “do it all”, so you lose the sense of cost trade-offs and functional trade-offs and so on. I do work for many different companies and I see it all. When working with “good engineers” it couldn’t be better. BUT the time that gets “wasted” in many companies amazes me, and I was once one of those “guilty parties”. I could give TONS of examples, all of which add up, where as a consultant I always clearly give my opinion at least once, but come end of the day, my customers are paying the bills and they make the decisions. I just “refactored” the hardware in a project where I had “discussed” the pros and cons of high and low side driver FETs with the hardware engineer and he “did what he knew”. A few years later, the consequences I commented on are now obvious, and I am redesigning (refactoring) the hardware, thus eliminating the problems. As I often said in the democratic decision processes of my prior lives: “Don’t confuse the issue with facts!” So on your own, you live or die by your decisions.
Ya, I do hardware, actually lay out boards, actually build prototypes, do objects in firmware, write specs, do schedules, do DFDs and STDs, do PC software test tools, do patents, pay bills, buy paper, and effectively do it all. Personally I love it, which is why I continue to do it. Now I am doing my own products, for cash flow reasons, and that “reawakened” me to manufacturing issues and so on. I had some of the best technical jobs my employers had to offer and those were good jobs, and at times they look pretty good, but then I get the next order and lose those thoughts. Back to work, I need to get paid and now have to work later because of posting this response.

@Don: It is strange how economics rules at the company/customer interface, but not between internal departments in a company. I suggested that sort of black box engineering department at one company and it was rejected. Perhaps, if economics ruled, then any poor inputs to engineering process would be highlighted as much as poor outputs, creating discomfort for management.

And then perhaps the creative process cannot be codified, requiring the liberty of a little loose iterative project work, wasting some money, until the engineers find out what they can build and the marketing people what they can sell. Hey — we’re back to refactoring!

@Sibin: I went to a seminar put on by a creativity institute of some type, and they noted that the most productive person is the one working alone, because it eliminates all the overhead of communication. However, projects get so big, so fast that it’s not practical, so we suffer with the reduced productivity of larger teams as a requirement to get the job done! But without teams, Scott Adams would be washing cars for a living.

Back to refactoring work. In my experience, it wasn’t difficult to present my case for this project by showing the bug tracking history, maintenance overhead, architectural deficiency etc. The problem was timing and willingness to get it scheduled. The challenges were:
1) Marketing had a long list of new features to stay competitive
2) Refactoring work is N times more difficult than a feature project. You are essentially replacing one black box with another that is as stable or better. You don’t necessarily promise better performance but only a cleaner and more efficient infrastructure for lower maintenance cost and readiness for new features.
3) There is no glory in refactoring work compared to new features. New features brings more revenue and exposure to new markets. Everybody wants a piece of that.
4) I got plenty of bonuses for my performance and not related to the type of projects. Otherwise, people will move to better pasture. A team that is stuck with legacy firmware can still stay competitive if there is fairness.
5) Changing people’s mindset requires enough people (innovators) to do refactoring work correctly. Even though the refactoring project was real, my manager postponed it until he got promoted out. The next manager de-scoped the project and became biased against the senior developers. New developers got overwhelmed and the senior developers were brought in later. It was a failure from the start, because no one else wanted it as badly as you did.
6) From the start, refactoring was misunderstood. The project was de-scoped to modify certain modules but it failed to examine the whole infrastructure and to question the whole architecture.
7) I agree with David that refactoring is a white box affair and you need to understand the whole system as the sum of the parts before you start. Don’t be afraid to move the pieces around before thinking of risk and time management.

I have refactored embedded real time systems – I normally avoid making unnecessary changes, but with some inherited code it has been much better to improve large portions of the code to make adding major enhancements easier, and to make testing easier too. I have also done it to my own code when the product changes and earlier design decisions were proving limiting.

I usually just do this as Tony says because it saves me time now and in the future, but sometimes I have done it with my customer’s agreement, when they had several major changes to a product which required it. To be fair I didn’t give them much choice.

On the whole my customers don’t care what the software is like, but they do notice that my code is delivered with very few bugs, and I can usually add changes easily.

Yes, I have successfully convinced my manager to refactor embedded software BUT only by producing a convincing cost-benefit analysis.
If the project has an immediate deadline then you need to meet that deadline in order to get paid (c’est la vie).
If you can argue that refactoring would produce a higher quality software output (that is more likely to pass system test first time around) at the cost of an additional day or so in development, then you might have a convincing cost-benefit analysis for refactoring as part of the project to meet the immediate deadline.

Having done a lot of contracting in my time (though I try not to now), I’ve seen some awful code. It’s typical of contracting life that one gets called in only when the project is badly screwed up, instead of being drafted in earlier, as a proper consultant with an unbiased view who could probably have helped prevent the mess in the first place.

What experienced contractors tend to do is “refactor” (aka redesign) the code, piece by piece, from the bottom up, by stealth. Unfortunately, we rarely get our hands on the high level design or the requirements documents, but every little helps.

Contractors have a big advantage over employees in these matters, especially at the beginning of a contract. None of the employees has much time to devote to you and it’s expected that you’ll spend at least two weeks on “familiarisation”, wading through code and documents. At this stage you get left alone and just raising the odd searching question is sufficient to keep everyone around feeling warm and fuzzy. So you can plan a campaign to improve the software while fixing it. Then you get to implement your plan. You have enough slack time to do this and still do what’s expected of you because, unlike the employees, against whose pace your own is unconsciously measured, you don’t have to go to pep talks or appraisals, or answer the phone, or endure meetings about the pension scheme or scheduled training on “Diversity” or “Avoiding RSI” or some such. And nobody knows you well, so you can easily limit the amount of casual conversation you have to whatever suits you. Also, the managers are aware that you’re seriously burning their budget and are there for a limited time, so you’re more likely than an employee to obtain any resources you need, when you need them.

The next phase is the friendly phase. You lose some of that slack time because other people are now interacting with you. However, you’ve already improved the code, so it’s much easier to work with. You also know the system quite well by now. So you get to continue the good work.

In the third phase, you’re part of the furniture and indistinguishable from your employed colleagues. Time to withdraw before you’re bullied into the kind of behaviour which starts making things worse again, instead of better. But if you’ve done a good professional job, you might get invited back to fix a different project. Either way you have the warm glow of knowing (or at least believing) that you’ve improved someone’s anarchic software a little. Perhaps to the point where it now almost works properly!

I am into supporting my software, both bug fixes and for new features, well into the future. This gives me a good incentive to ensure I don’t have bugs to fix later at my own expense, and that I can add new features easily and without too much pain. Refactoring legacy code is therefore often to my advantage.

I did it on one system I inherited to simplify adding new features, and said nothing to the customer. They were delighted, because the printing was much quicker and smoother, and the missing rows of dots had disappeared. They had been told by two different developers it was impossible to eliminate the stuttering sound from the printer and the occasional missing row of dots. After that I had no problems refactoring code for that customer.

I refactor for my benefit. It also benefits my customers, even if they are not aware that it does, but that is not my primary motivation. I like doing new work, and enhancing systems, not struggling to add simple additions just because the software is a mess. It is the new features that I earn money for, so that is a double incentive to be able to add new features easily and reliably. I find it bizarre that some people suggest that it is too expensive, or too time consuming, to write good software.

Often the best time to refactor is when adding new features. You can sometimes find that the changes required to the existing code to add the new feature can be substantial enough to warrant redesigning part of the system. Continually bolting on new bits without reviewing the existing code is a recipe for an unmaintainable system.

@John: absolutely. Quite often, new features extend existing structures and processes, and show caveats of the existing code. The right approach is to review the entire chain of processing related to the feature, and redesign it properly, instead of creating strange delegate code to “handle” the “new” problems.

When this is done frequently, the impact of the changes is minimal, the code base retains the original design, and the system is kept robust as it scales.

At my current position, my job is explicitly to refactor the code to increase stability and speed. Making the code portable to other environments is a side effect of this on which I am keeping my eye.

I believe that refactoring should be an ongoing project in any company. If you don’t, you can wind up with cruft from old code that is either no longer required, obsolete (w.r.t. hardware no longer supported, etc) or inefficient in the new environment.

I am sure that in some places (medical comes to mind) refactoring is not allowed for certification reasons. In these cases not even a skunkworks type project is allowed. Having worked in places where that could be the case, I have seen where code has become filled with places where it no longer fills a need and can’t be changed due to regulatory issues.

Refactor if you can, and continuously if possible. That is what I think.

Hank’s comment about “fixing problems caused by people who should not be programming” pressed one of my hot buttons. Most of my firmware assignments involve sustaining or enhancing a product that was developed by one or more previous programmer(s). I’ve seen such poor and/or almost non-existent documentation and code comments that I have a hard time “justifying” it as just bowing to a tight schedule. I have to spend/waste so much time just trying to understand what the current code does before I can intelligently add new functionality. By then, I realize just how badly organized the original code is; some of the coding I’ve seen is embarrassing, and there’s no schedule time to really make things production worthy.

Also I hate seeing sections of code commented out with no explanation.
You never know whether it is redundant or whether someone just forgot to put it back.

One annoying bug I found once was an error in a calculation that, when fixed, stopped the application from performing correctly, as everything else had been adjusted to compensate for the inaccurate result!

@John – I was always taught not to comment out code but to use #if to disable it if appropriate. The reason being that it is easy to comment out a larger section while testing and then bring back additional commented-out code that busts the software, because a commented-out line should have been removed years before.

@Eric: that practice (using conditional compilation to comment out code) created a nightmare of legacy code with scattered «#if 0» in large sections, leading to an awfully hard to read source.

To avoid that type of quality decay, we follow some simple and strict rules.

Although useful during debugging, commented-out code is a plague. Disabling code (via // or #if) should be done only during debugging/testing, by a single developer. A source file must not be published/committed with any commented-out code or compilation/build/link warnings.

For deprecated code, we keep it commented out with // for one sub-version only, and then it is removed from the mainstream source. Any such code is flagged and removed from any release candidate. Released code should not have dead stubs or conditional compilation alternatives.

Tony I’m with you on this one. Make incremental improvements as part of your daily efforts. As others have said, ‘refactoring’ can be a hot button for customers and management so I don’t really like to use that term. I have some skepticism about refactoring tools but that may be because of my lack of experience with them.

The way I see it, grand, ‘big bang’ refactoring efforts are expensive and risky and at their worst display a lack of appreciation of those who came before.

It is easy to understand why management fears that a grand refactoring effort will result in ‘throwing the baby out with the bathwater’, i.e. that code base has many man-years of effort in it and you want to make big changes?

On the other hand all code bases can benefit from continuous incremental rework.

I would say that refactoring is best performed as a series of incremental transformations. Each time the code is altered to fix a bug or add functionality there is an opportunity to make it slightly better. Small transformations are easier to understand and test. Small transformations are generally low risk and low cost.
The accumulated effect of small transformations can be remarkably positive. That positive effect manifests as fewer bugs, easier debugging, better performance, enhanced maintainability.

While fixing a bug or adding functionality there’s often some low risk, low cost, improvement that can be made.
Here’s some of what I think about when tromping around in code.
1) Reduce the scope of data and functions.
2) Rename poorly named data or functions. (What’s in a name? To the next person working with that code, and it may be you six months from now, the name can mean everything!)
3) Make repeating snippets of code into functions. It is amazing how often cut & paste gets used by people who surely know better.
4) Reduce the replication of data.
5) Increase cohesion, decrease coupling.
6) Reduce the use of ‘clever’ mechanisms.
7) Dig into areas that ‘everyone knows’ shouldn’t be touched. The entire source should be accessible/understandable/changeable. Where everyone avoids an ‘unchangeable’ subsystem, related subsystems become ever more horrible as they strive to make up for inadequacies in the untouchable one.
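
To make items 1 to 3 concrete, here is a small sketch under invented assumptions: a clamp-and-count snippet that had been pasted into two command calculations, with its counter exposed as a global. All function names, limits and gains are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Item 1: the counter used to be a global; 'static' limits it to this file. */
static uint16_t s_overrange_count;

/* Item 3: the pasted clamp-and-count snippet becomes one function.
 * Item 2: the name says what it does. */
static int16_t clamp_i16(int32_t value, int16_t lo, int16_t hi)
{
    assert(lo <= hi); /* precondition: an inverted range is a caller bug */
    if (value < lo) {
        s_overrange_count++;
        return lo;
    }
    if (value > hi) {
        s_overrange_count++;
        return hi;
    }
    return (int16_t)value;
}

/* Hypothetical call sites that previously each carried their own copy. */
int16_t motor_command_from_error(int32_t error)
{
    return clamp_i16(error * 2, -1000, 1000);
}

int16_t valve_command_from_error(int32_t error)
{
    return clamp_i16(error / 4, 0, 255);
}

/* Read-only access replaces direct use of the former global. */
uint16_t overrange_count(void)
{
    return s_overrange_count;
}
```

Each of these steps is small enough to review and test on its own, which is the point of doing them while you are already in the file for a bug fix.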

I have always felt that the scourge of commented out code was either a result of
- poor understanding of SCM principles, or
- poor knowledge of the various features of SCM tools, or
- poorly designed/implemented SCM tools which make the “proper” handling of situations which necessitate such hacks more difficult.

If your or your team’s problem was only with the third point then the (relatively) new generation of DVCS tools should help. (Let us assume that everyone’s understanding of SCM principles and knowledge of earlier SCM tools is par excellence )

These tools (Git or Hg) make branching and merging back so simple and so cheap that no one could possibly have an excuse for leaving commented out code in the trunk/master-branch/production-code; or for that matter, an excuse to even comment out code per se – you either boldly modify code or if you are unsure you create a new branch and test out your ideas/hunches.

@Tom – Thanks. I was worried that everyone missed the point about refactoring in the true sense that the Agile proponents have taught us.

The main points about refactoring that these proponents believed in, and took care to explain in a book (sincere gratitude to Martin Fowler), are:
* Refactoring is almost always a certainty, so make sure that you are prepared for it right from the word ‘go’. This primarily means that you should have a solid suite of tests for your code; this is the only way you can prove to yourself and others that your refactoring hasn’t broken any existing feature/functionality.
* Refactoring shouldn’t be an ex post facto decision triggered by a major failure or catastrophe; the whole point of refactoring is to prevent us from getting into such situations. Like Tom said – it should be “a series of incremental transformations”.

@Sibin: Your comment “you should have a solid suite of tests …” hits the nail on the head. In our development process we list detailed requirements. If a requirement cannot be tested and verified, it is “not” a requirement. Following the requirement is the “test plan”, which is intended to verify conformance to the requirement. The test plan(s) supersede the requirements, where the customer determines if the test has sufficient coverage. Then the test results follow, or are referenced, to verify the requirement, where these detailed tests and captured results supersede the test plan. Ideally you “timestamp” the test results. Probably everyone on this site has seen the old “but when will it be done?” So, for those who don’t want to read, we effectively enter the requirement in black text, meaning it hasn’t been tested or verified. Then we write the test plan in red text, followed by “test results: FAIL”; since the tests haven’t been run and captured, we assume failure. Then when the stuff is actually run and passes, we change the text to blue. When asked “is it done yet” the response is simple: “Is the device profile completely blue?” Unless one is color blind, it takes little knowledge to understand the status of the project. Obviously completing projects is a matter of assumed risk versus time and money. When we have “embedded code” to test, say, a new object class or object function, we leave it in the code and comment it out. Thus when a code change is made, we remove the comments and rerun the tests, usually improving the test code. Same thing for external PC based test equipment. I don’t know about others here, but my belief is all code has defects, you just haven’t found them yet. The issue is, what is the consequence of the defect, or what is the scenario that reveals the defect? If you can identify the “scenario” you can relatively easily find the defect.
A little wordy response relative to testing, but if you can’t prove it works under the expected scenarios, then someone is assuming the risk, which should be the project manager or whoever “signs off” on the “ship it”.

Some may think this is VERY time consuming, but to be honest, a ton is done through “reuse” and thus it is not as difficult as you might think. Additionally you “incrementally evolve” to such approaches and the support documentation, test plans and so on, and after a while you find you focus on new functionality and seldom have to refactor anything. Where we do “refactor” is generally when we create a new function for an existing object class for a specific application of that object class, test it, and it works fine. Later we use that function in a “new way” not envisioned when the function was written. It is this new “expected behavior” that was not an initial requirement that is considered a “defect” in the new application. Obviously it is not a defect in the old product as it was not “used that way”. However, extend the functionality of the old product applications to use the object function in the new way, and the “old code” must be updated. So since all object classes (code, declarations, ..) are in separate files, doing a code compare and fix is usually the first step, which usually fixes the problem.

Should a customer pay for “maintenance” to do this updating? Sure, I would. How many do so? Hmmm, only during functionality updates.

@Eric: We don’t remove the commented out diagnostic code since “what you can’t find you can’t reuse, so we try to make finding stuff easy”.

I agree #if 0 should not be there for a long period of time because it becomes a permanent fixture. Many modern editors would grey out that code and therefore I don’t see the difference between #if 0 and //. But there has to be a decision to remove them at the end of the project.
Also, you can never trust this kind of code and think it’s re-useable someday. It is different if it was unused #ifdef FEATURE_X code.
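To make that distinction concrete, here is a small C sketch. `FEATURE_FILTER`, `smooth()`, `next_reading()` and `adc_scale()` are hypothetical names used only for illustration. Code behind a named feature flag gets compiled and tested in at least one build configuration, while code inside `#if 0` is never seen by the compiler and can quietly rot.

```c
#include <stdint.h>

/* FEATURE_FILTER is a hypothetical build option: define it on the
 * compiler command line to compile the filter in. */
#ifdef FEATURE_FILTER
/* Built and tested in at least one configuration, so it stays honest. */
static uint16_t smooth(uint16_t previous, uint16_t sample)
{
    /* Simple 3:1 weighted average, done in 32 bits to avoid overflow. */
    return (uint16_t)(((uint32_t)previous * 3u + sample) / 4u);
}
#endif

uint16_t next_reading(uint16_t previous, uint16_t sample)
{
#ifdef FEATURE_FILTER
    return smooth(previous, sample);
#else
    (void)previous;   /* unused when the filter is compiled out */
    return sample;
#endif
}

#if 0
/* Dead code: never compiled, so nothing notices when it stops matching
 * the rest of the file. adc_scale() may not even exist any more. */
uint16_t next_reading_old(uint16_t sample)
{
    return adc_scale(sample);
}
#endif
```

If the old version is ever needed again, the version control history is a more trustworthy place to find it than the `#if 0` block.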

Doing some refactoring within my daily chores was what I used to believe in. I used to do it consciously, with permission from the manager or sometimes not, but this is what I faced:
1) More code change, more risk. Sooner or later, you introduce a bug or behaviour change. Good deeds go unnoticed but a bad mistake is well documented.
2) I butted heads with my colleagues; even if they agreed with the change, I needed to respect my group’s risk assessment.
3) I ran the risk of getting branded as a high risk taker, not manageable, or worse.
4) I could overrun a project because I refactored where I thought it was essential to the project.

I hear so much about the good reasons for daily refactoring. Can anyone share their negative experience?

Seriously, it’s the largest source of refactoring work in the known universe. If you want a job that pays poorly but has an infinite amount of ongoing work, search for GNU CODE REFACTORER on the job boards. Wait! That’s part of all our job descriptions… 1) download, 2) refactor for days, 3) test, test, test, 4) scrap huge sections of code and rewrite, 5) unwind spaghetti code, 6) debug, 7) retest, 8) deliver working code that your manager called FREE!

I’ve spent hours digging through gnarly nested #ifdef’s, all based on five or ten deep #defined labels (all in different include files), with the original intention of running the code on anything from a Cray to a Linux box to an IBM 360 to a Univac to a wristwatch. Now THAT’s refactoring!

I worked in a company that had an embedded display library for avionics use. We had a core product that we spun off into individual products for various customer projects. Each time we shipped, we had to do a complete retest/recertification so there was no additional cost if we did refactor the code. Refactoring did not alter functionality so our requirements based test didn’t have to change.

We did several significant refactorings, all of which removed run-time switches in situations where the code actually run never changed. This removed unused code that was never executed by any of the tests and simplified the recertification efforts.
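The comment does not say how the switches were removed, so the following C sketch is only one possible shape of such a change, with invented names (`PRODUCT_METRIC`, `altitude_display`). A run-time flag that never changes for a given product is replaced by a build-time selection, so the branch that is never taken is no longer in the binary and no longer needs to be covered or justified.

```c
#include <stdbool.h>
#include <stdint.h>

/* Before: a run-time switch, even though any one shipped product only
 * ever takes one of the two branches. */
int32_t altitude_display_before(int32_t altitude_ft, bool use_metric)
{
    if (use_metric) {
        return (altitude_ft * 3048) / 10000;   /* feet to metres, integer math */
    }
    return altitude_ft;
}

/* After: the variant is chosen at build time (PRODUCT_METRIC is a
 * hypothetical per-product define), so only one body is compiled. */
#ifdef PRODUCT_METRIC
int32_t altitude_display(int32_t altitude_ft)
{
    return (altitude_ft * 3048) / 10000;
}
#else
int32_t altitude_display(int32_t altitude_ft)
{
    return altitude_ft;
}
#endif
```

The requirements-based tests call the same function with the same expected results for that product, which matches the point above that the tests did not have to change.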

This was a case where refactoring reduced costs, but outside of such a certification environment, I would expect – as others have pointed out – for refactoring to increase costs in an embedded product.

I worked for a printer manufacturer with a failed piece of firmware. It was easy to get permission and time to redesign it.

That same organization eventually sprang for a total redesign of -all- the printer firmwares, and all the printers, too. However, there was a reason: Management had made a lovely top-to-bottom corporate communication scheme: Everybody e-mailed their accomplishments, plans and problems each week. Managers summarized for their team, repeat all the way to the top. Then, the admins at each level sent it -down-, tacking on stuff each boss thought we should know.

You’d get a thirty page e-mail every Tuesday, but if you read it, you knew everything that your chain of command cared about.

Engineers discovered that hundreds of nearly identical firmware modules were repeated in each printer product, and tens of nearly identical hardware blocks. These were all being redesigned, retested, reintegrated etc. for each printer model with only the crudest debug tools.

The first idea was to do each one once, well, and then reuse it. A better design, reused, would be cheaper. People could work on any project, because the projects would all be more similar.

The new architecture was first a document, then became something like a series of “knobs” that marketing could turn to make a new product. Engineering started adding new “settings” to the knobs.

“Agile engineering” worked really well: the bespoke printers (a formerly hated backwater in the engineering organization) all became very reliable products, sometimes with lovely unusual combinations of features. We started producing entire product lines with similar features but different print engines.

Also, the quality department was actually able to communicate details to engineering without sovereign acts from God-like executives. Particular quick-failing parts of printers (e.g. ribbon-handling gearboxes) were replaced by very sturdy, cheaper stuff (stepper motors), and the improvements quickly migrated.

The final cool thing was that manufacturing discovered nearly a hundred million dollars of cash tied up in in-process inventory, and worked it off into cash.

I worked at a company where a certain software product had to be “transferred” from one programming language to another (because of the execution speed, maintainability, …). The product was in the field of electrical grid control and power distribution. I am talking about millions of lines of source code, bearing in mind that the software had been developed over years by dozens of software engineers. The code was written in Pascal and had to be transferred to C. Every engineer in a team had a part of the system to examine, the part he/she was most involved in. Of course, some kind of automated tool was used, but that was just a first step. The tool generated new source files, the majority of them not able to compile. The second step was to go manually through each generated C source file, correct compile errors, and somehow try to look for sources of possible errors. When a module/part was able to compile, it had to be tested. There was no real unit/white-box testing strategy, only functionality testing, proving the system “does” what it had done before, with numerous use case scenarios. As you can guess, some tests were successful, many weren’t. In the process of transferring the code, and later examining it, a lot of method names changed, comments changed etc., but functionality remained 100%. It was a good experience. My question to all would be whether you count this as code refactoring?

Zoran – I haven’t written Pascal in quite some time (last occasion was using early versions of Borland Delphi), but I recall that at the time Pascal compiled faster, and executed faster than equivalent C code. I think that this is probably because the language itself was more constrained and requires the programmer to put more work into making the code compile cleanly.

I suspect that the move to C was to gain access to a larger pool of experienced engineers rather than for purely technical reasons.

In my experience, it is almost always a mistake to do a direct translation of a legacy system (automated or otherwise); better to use the legacy system as the reference standard specifying the behaviour of the new system, which should be architected with the benefit of knowing the required finished result rather than being cluttered with the questionable kludges from the old system.

@Zoran: I have to agree with @Hank, @Eric and @Peter, Pascal is a very compiler-friendly language, and generally can be optimized, much in the same way as C can.
Was this a decision made by knowledgeable engineers, or by project management guys with little experience in such porting?
The use of automated tools to translate the project, on such a large-scale design, should have rung many bells at the very beginning of the job, when a different path could still be trodden. When the first modules came out of the grinding tool, the code quality must have been awful.

It was a huge project, and I believe it still is, in development for more than 20 years. It includes a lot of modules (running in separate processes), very often working independently of other modules in the system. It is a distributed system spread across several servers. Actually, it’s a SCADA. A number of teams all over the world were involved in development; CORBA was used, along with C, C++, Pascal and, recently, Java. So, I guess some people in leading positions decided to unify the development process, maybe using the same compilers and tools, or the reason could also be (@Peter Maloy) that new generations of software engineers are less familiar with Pascal than older generations are. It wasn’t my decision, and my question still remains the same: how would such a process be classified, refactoring or something else, if there is a special category or phrase for such a process in software engineering?

I see in these pages that it is not the first time that some practices are referred to out of context.

Refactoring is a practice that is a must for agile teams, including embedded agile teams, DURING THE DEVELOPMENT. The reason is that the software architecture also emerges day by day during test driven development or, better, during behavior driven development. Well architected embedded software is not all tightly coupled to the hardware; it should be designed with only the lowest layer tightly coupled to the hardware. @Robert: refactoring is not a cosmetic activity nor “better looking source code”; it is a better implementation, a more efficient implementation.

What I have learned in the last 25 years is that we have some fear of changing embedded software because we could alter the system behavior, and so we would spend more time debugging a piece of software that was working before. This is a bad thing; my idea is that such software is like a crystal glass, nice but very fragile.

So if you are an agile developer you will use BDD and TDD as your development guide and refactoring as the way to correct/optimize what you wrote. To perform these activities while working in a team you need a continuous build tool. This and good source version control are the best tools to use to raise your development quality.

Outside an agile context, refactoring is used to describe all practices that rewrite a piece of code that is already working in order to improve it. This means that the code already exists, and most of the time it exists from a previously delivered release. You can refactor it, but it is better to have a regression test tool and perform the tests before and after refactoring to be sure you don’t alter its behavior. I don’t know how many projects have a complete regression test fixture, but it is one of the best investments you can make.
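One lightweight form of such a before-and-after check can be sketched in C as follows. The checksum routine and the test vectors are invented for illustration: the legacy version is kept as the reference, and the refactored version must agree with it on every vector before the legacy copy is retired.

```c
#include <stddef.h>
#include <stdint.h>

/* Legacy routine, kept unchanged as the reference (hypothetical example). */
static uint8_t checksum_legacy(const uint8_t *data, uint32_t len)
{
    uint8_t sum = 0;
    uint32_t i = 0;
    while (i < len) {
        sum = (uint8_t)(sum + data[i]);
        i = i + 1;
    }
    return (uint8_t)(~sum + 1);      /* two's complement of the byte sum */
}

/* Refactored routine: same contract, simpler loop and negation. */
static uint8_t checksum_refactored(const uint8_t *data, uint32_t len)
{
    uint8_t sum = 0;
    for (uint32_t i = 0; i < len; ++i) {
        sum = (uint8_t)(sum + data[i]);
    }
    return (uint8_t)(0u - sum);
}

/* Returns 0 when both versions agree on every vector, otherwise the
 * 1-based index of the first vector on which they differ. */
int checksum_regression(void)
{
    static const uint8_t vectors[][4] = {
        { 0x00, 0x00, 0x00, 0x00 },
        { 0x01, 0x02, 0x03, 0x04 },
        { 0xFF, 0xFF, 0xFF, 0xFF },
        { 0x80, 0x7F, 0x01, 0x00 },
    };
    const size_t count = sizeof vectors / sizeof vectors[0];

    for (size_t i = 0; i < count; ++i) {
        if (checksum_legacy(vectors[i], 4) != checksum_refactored(vectors[i], 4)) {
            return (int)(i + 1);
        }
    }
    return 0;
}
```

A handful of vectors is not a complete regression fixture, but even this much turns “it still seems to work” into a check that can be rerun after every change.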

When software maintenance is required, it means that there is a “technical debt” in its design, so it wasn’t done as well as possible. In any case, before starting maintenance it is necessary to establish a regression testing platform; about 25% of errors are introduced by maintenance done without extended tests.

Frequent refactoring makes better code, more robust systems, and cheaper maintenance (as Frank said). I believe strongly in continuous improvement, when the opportunity is available. Along the way, everyone learns. To imagine that one’s code is ‘perfect’ and can’t be improved by those who come later is pure arrogance. The ‘paralysis of fear’ leads back to ignorance, and has no place in engineering.

@Tom: I have to agree in totum.
It is my perception that legacy codebases become frozen and tend to evolve into unmaintainable monsters due to lack of control over all aspects of the system. The greatest fear that drives code stagnancy is the fear of losing track of deployed versions. “If it is released, do not change it unless it is broken.”

Which brings us to Source Code Management. Once you have a Release, you can start refactoring and designing new aspects of the system, targeting the next features in an architecturally-balanced design.

Sometimes it is hard to do that when Management has a poor vision of the technology process, but Engineering must drive to do it.

While I believe strongly that code “that works” should evolve carefully into more robust representations, I don’t want to minimize the real-world risk that any change to a complex system will impact the short-term bottom-line. This risk is compounded by simultaneous feature creep.

Software Zen:

The inability of Management to correctly allocate time for design of “non-profit” standard unit tests, system tests, and field tests, which themselves introduce a different class of failure, is legendary. I suspect that unexpected software refactoring horror stories lead non-practicing Software Management to conclude that software engineers are prima donnas whose pride of ownership and OCD regarding “elegant” code must be quietly discouraged.

The difference in attitude concerning the quality of a “working” product can be experienced first-hand by the non-civil-engineering traveller as roads transition between the United States and Mexico. Software, not so much.

The psychological aspect of Software Engineering puts a premium on all aspects of interpersonal relationships, communication skills and good general mental health.

Though our argot is subtle, and slightly subversive, it’s universal. On the other hand, being profoundly wise doesn’t meet deadlines by itself.

I have to agree with Frank – refactoring imposes a risk and that has to be balanced so that the business risk is acceptable, i.e. it is not a universally good thing to do.

I tend to leave working legacy code alone, however much I hate it and however much I think it could be improved – why do more work and increase risk when there is no need, why expose your business to risk for no gain?

If I am changing code I may refactor some of the code involved if it makes my task easier, if it reduces the risk of errors, if it improves my ability to maintain and support the software in future.

This risk based approach means – I change only what is absolutely essential if a release is small and most of the code has been tested and is stable.

I do low risk refactoring if the change to the software is small, and the risk is small enough to justify the improvement.

I do larger refactoring for major releases, to reduce my risk and reduce my effort now or in the future, or pave the way for changes I expect to receive later.

Sometimes what a customer wants is a significant change that requires refactoring and also justifies a lot of testing, in which case I may do a considerable amount of refactoring, and then usually with my customer’s full agreement. (That way they share some of the risk, but also understand some of the benefit to their product.)

Refactoring on a large scale for a minor enhancement release is foolish – and a risk I will not take. I am in business, not just a techy.

I guess I am saying balance the effort and risk of refactoring against the benefit of the release it is being added to. Your customers do not want refactoring they just want new features, so this is a commercial and technical decision, that has to balance benefit and risk.

@Jonny, Frank & Eric: I have been watching, and commenting on, this issue for some time. You guys are “saying it like it is” and I just have some minor additions. First, I agree with all of you, with some tweaks. I provide licensed, high quality, reusable assets to my customers and then interconnect them to make products. I send a “bill” for the licensing and the “work”. Thus the “cost of quality” can be measured. I sometimes deal with customers with no electronic product development knowledge and thus I see “normal human behavior”, since those who haven’t designed complex hardware or software can’t even relate to what you do every day. In effect, a product ships when the customer does not want to spend any more testing money. Companies with experience (generally large with deep pockets) tend to “do it right”. I have worked for two MAJOR control companies and when some division has a “recall” the “cost of lack of quality” can also be measured. Product quality is a “risk assessment issue” where I have NEVER seen any theories or calculations on “this issue” for an engineer to place in front of the “ship it” decision makers. It is like congress and the deficit and the budget: most people will put their head in the sand, determine what their next day will look like if they ship it, and kick the can down the road. This is not true for those who understand the risks and open issues in a product. So what manager do you guys know who was fired after a bad product was shipped? If there are no consequences, this behavior does not change. However, you probably know of some engineers who paid the price since they “did it wrong”. Shipping a bad product, if it meant career disaster, might impact “just ship it” decisions. The old “cause & effect” discussion.

Product quality is a “risk assessment issue” where I have NEVER seen any theories or calculations on “this issue” for an engineer to place in front of the “ship it” decision makers.
————
Well, I certainly have.

So what manager do you guys know who was fired after a bad product was shipped? If there are no consequences, this behavior does not change. However, you probably know of some engineers who paid the price since they “did it wrong”. Shipping a bad product, if it meant career disaster, might impact “just ship it” decisions.
——-
Fred Brooks wasn’t fired – but he was one of our country’s greatest pioneers. Anyone who repeats his mistakes once acquainted with them, should see a shrink for shock therapy.

Ya, perhaps I was a little too emotional in my statement, but the issue remains relative to the consequences of shipping a bad product. I actually thought about this some more last night and know of various “ship it” scenarios where an individual, and there were quite a few, would ask at times during a test phase: “But does anyone know of any identified defect right now?” If you ask the question often enough during testing, there will be times when no identified defects exist, although testing is not complete. Then when there is no response, the product is shipped! My guess is this is somewhat common, as I have seen it a few times. These are the “cases” where some consequences should exist, as it is a kind of knowing negligence. Stuff goes wrong in products when an unforeseen scenario occurs that no one was “intuitive enough” to foresee and thus a defect is uncovered. This is what I refer to as the ship it syndrome, and how would your typical engineer (with a career) confront this behavior? I have also seen projects where there is poor engineering and you can test forever and it may never ship, where the root problem is different. So what advice do you give those that ask, since CYA is not the issue, fixing the problem is? Anyone can identify the issue; how do you guys fix it? “Not my job man” or “above my pay grade” is not the answer I am looking for here.

@Don:
Your question (I presume it is not rhetorical) is central to the theme.
I am directly impacted by this paradoxically possible scenario:
- we need to have bug-free products;
- the route to that is good engineering coupled to comprehensive testing;
- a bug is, by stipulation, a software defect that hit the field;
- if you never release a product, you have a bug-free product;

Of course, you *must* release on a regular basis, so the company makes money etc.
The danger of Management forcing an early release of a firmware that is not sufficiently tested can be very damaging, especially in industrial systems, where bugs can potentially generate high losses and liability. Nevertheless, that scenario you described is surprisingly common.

What I did a few years ago, in my current company, is to force a minimum test coverage for *any* firmware, and declare that *all* firmwares have the highest criticality, i.e., operation on mission-critical environments. The test scenarios vary, but are based on the aspects changed in the firmware for the current release, on top of the checklist and regression tests currently in place. We simply do not sign off a release candidate that did not undergo the “minimum period” of testing.

The approval of the version involves an exposition of the aspects and associated risks of failure, so Management partakes in the decision to release, and Engineering applies pressure not to release early.

This process resulted in a significant reduction of fielded bugs. We have two product lines that are based on the 8051, and for which the releases were driven by field bug detections. After this protocol, we started to do planned refactoring and preemptive bug fixing, and almost eliminated bug impacts in the field.

@Jonny: Unfortunate that you also agree this is a common behavior. With defect fixes on a shipping product there is buy-in up front, as the defect is currently negatively impacting “the company”, and as you say, test plans, test equipment and so forth already exist. The critical area testing is “generally followed” by the majority of my customers, likely all of them if I thought about it. I generally don’t disclose too much about the RTOS, LLC processes, but since “us contractors” deal with all types of customers, from those “skilled in the art” to those with literally no product development expertise, we do the following with “limited success”. In the primary detailed design document, called the Device Profile, we include requirements, followed by test plans, followed by test results and/or references to other documents containing these items when “large”. When we enter a requirement it is done in “black”. When we write a (detailed) test plan we change the requirement and test plan to red text. The test result, in the Device Profile, will minimally follow the test plan and minimally state PASS or FAIL. This text remains “red” until it “passes”, at which time we turn the related material to “blue”. When asked the test status we merely reply, “is the entire document blue?” There are some customers/industries that regularly turn them completely blue, which is the exception. For some customers the product ships and the document is black, or almost totally black, meaning “informal testing may have occurred” but there is no proof of testing. I don’t know about others’ experience, but I find that when doing test plans/testing/capturing test results one’s brain must use “different neuron paths” or something, as I ALWAYS find what I would call a defect. Sometimes critical and sometimes not, but I seem to ALWAYS find stuff.
It’s been my experience that you have a more “global perspective” of interrelated product functionality during testing and a more “focussed view” when implementing, and during testing I generally find “inter-dependency issues” more than anything else, generally due to “evolution of requirements or refinement of requirements” during implementation, where you often “miss crossing the t’s and dotting all the i’s”. Since I obviously have a “black, red and blue” overall view of project status, even without a “single clue” about the content of the Device Profile you get an immediate sense of the project status. Yep, I have lived in the world of “defect databases”, spreadsheets and all that “tracking” stuff, and I don’t know about others, but it never seemed to give me a good feeling relative to the coverage of testing. Of course our definition of a requirement is “if you can’t test to it and verify it is met, it is NOT a requirement”. Just wondering, how many “really test to requirements” and how many “test for conformance to all requirements”? Also, any suggestions on a “better way” to indicate test status that actually indicates the level of test coverage? (By the way, your word processor, that everyone has on their computer, is the only “tool” needed when you do it with colors, versus using the defect reporting tools that most of my customers have no access to and thus won’t use.)

@Don:
As you said, it takes a keen mind to capture multidisciplinary contexts, model dependencies and effects, design and implement hardware and firmware to fit the visualized functions, and then analyze, find the design problems and better the design on a continuous basis.
What we are talking about is dedicated, focused engineers. People that continuously interest themselves in improving.
The process of improving and creating better designs is part of what constitutes excellent engineering.
And the attitude of taking ownership of the design, in the sense that you are responsible, as an engineer, for the performance and safety of the system, counts when you are defending a seemingly more difficult decision to refactor “perfectly good code” or complete “expensive and time-consuming tests”. That attitude, when coupled with common sense, is very convincing to the customer and Management.

There are several methodologies and quality assurance tools that can be used to formalize this process, even ones as simple and effective as the color coding you described.
But the truth is that, when you are fortunate to make the process work, everybody in the team feels proud of the work. That is one hell of a good indicator. Non-technical people that are distant from the implementation may be proud of bad products, but not the engineers who can see the hardware and firmware details.

It is not only about refactoring, either. If you have that attitude (continuous improvement), your current design will be at least as good as the last one, but probably better, in several aspects. You must be able to pick at least 5 aspects from your current work that you can improve. If not now, in the next one. But when you see those aspects, fix them in the current design.

Don’t let the response from the field drive this process; rather, make the field benefit from it.