Order In The Software Universe

Sunday, June 24, 2018

Intro

I have been in the software development business for well over twenty-five years and have seen many successful and unsuccessful projects. The past ten years have been spent on teams following Agile principles to varying degrees. The teams have always been composed of different levels of experience and expertise, but I have been very fortunate to work on teams heavily populated not only with experienced developers, but with open-minded ones.

On these teams, I have seen patterns of behavior that, on the surface, seem beneficial to the team, the product, and the company, but in reality come with a lot of unrecognized costs. These costs are unmanaged: they are not planned for, they are not quantified, and oftentimes they are never reconciled after the fact.

There is one particular behavioral offender in this developer ecosystem and it might surprise you. I'm talking about the hero developer.

Hercules

Meet Hercules. He is a member of our mythical development team. Hercules fits pretty well into the team. He communicates reasonably. He is a competent developer. He gets his work done consistently. But where he really shines is when there are looming, seemingly impossible deadlines. He is the one who stays up into the wee hours, throwing himself across the finish line just in time.

He is the hero. He is the one who gets the adoration of managers and that good ole back slap with an "attaboy". Honestly, he really loves the attention and approval from his manager and the awe of the junior developers who hope to grow up to be like Hercules someday.

Managers see the deeds of Hercules and conclude that this is how successful software is created. If only they had a bunch of developers just like Hercules, their software problems would be solved.

A Deeper Analysis

First, in the context of Hercules, let's acknowledge that dedication to your team and hard work are assets, and that these assets are not found in all developers. But when we look at Herculean deeds from an Agile point of view, we start to see issues that can undo all of the gains that came from Hercules' heroics.

Hercules specializes in cranking out code in a short amount of time. Sometimes it seems the more pressure, the better. Hercules knows that code. Due to time pressure, he didn't get to do a lot of the unit or integration tests, but through sheer ability to focus, he delivered today's functionality, and that's what's important, right?

Hercules sure knows that code, but the other developers don't. The agile developer wants group code ownership. They want group input on design choices. They want the most well-thought-out, flexible code that the company and project can afford. The agile development team wants to de-silo the implementation, meaning that multiple developers are familiar with the code so that future changes occur at a reasonable cost. But given the single-source effort, the odds are against this goal.

A good hero sees heroics as an aberration - something to be avoided if possible. But I have seen cases where heroes use heroics as a path to job security. They hoard information by remaining the sole custodians of said code, such that others recognize them as the resident experts. They want to make sure that there would be a large, painful hole in the development team should they ever leave. Naturally, job security like this is good for them and really bad for the company.

Another issue involves scaling the team. If you have a Hercules, it is almost impossible to clone him or her. You should be able to find developers readily enough, but if you need one who fits into a chaotic, barely-managed environment, you need another Hercules, and they are both harder to find and cost more money.

The final drawback is a bit more subtle. Tom DeMarco's book "Slack" (https://www.amazon.com/Slack-Getting-Burnout-Busywork-Efficiency/dp/0767907698) does a good job of discussing the issue. In a nutshell, if heroics are your normal means of delivering software, your heroes are >100% busy - and that means time for creativity and for giving back (like mentoring) falls by the wayside. This cost is never recognized by management but can have a very large impact on the business over time.

Parting Advice

Tuesday, June 5, 2018

A Brief Lexicon of the Software Quality Landscape

The quality of a software product is a multi-dimensional measurement, spanning such things as functionality, correctness, performance, documentation, ease of use, flexibility, maintainability, and many others. Many of these qualities are difficult to measure, difficult to see, and hence difficult to manage. The result is that they are ignored by all but the most enlightened in management.

The topic of interest for this screed is going to be correctness. This is not correctness in the end-user sense, meaning the correct meeting of requirements. It is going to mean "is the code doing what we think it should be doing." Since we don't generally have provably correct programs, it is a matter of convincing ourselves, through lack of contrary evidence, that our programs are working as we'd like. This precarious situation is perfectly captured by Edsger Dijkstra's observation that program testing can be used to show the presence of bugs, but never to show their absence.

So we have several levels of ways to convince ourselves of correctness. They are, from most detailed to most abstract: unit testing, integration testing, functional testing, and system testing. Note that there is no industry-wide agreement concerning these exact terms, but the general concepts are recognized.

In typical object-oriented designs, unit testing involves isolating a given class, driving its state and/or behavior, and verifying that we see what we expect. Integration testing is a layer above that, where we use multiple classes in the tests. Functional testing is above that again, where we deploy our programs in the natural components they would inhabit in production, like a server or a process. Finally, system testing covers testing in a full production-like environment.

Unit Testing

Our focus will be on the lowest level, namely unit testing. The expectation is that unit tests are both numerous (think many hundreds or even thousands) and extremely fast (think milliseconds). To be effective, these tests should be run on every single compile on every developer's machine across the organization. The goal is that the unit tests precisely capture the design intent behind the implementation of the class code, and that any violation of those intents results in immediate feedback to the developer making code changes.
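To make that concrete, here is a minimal sketch of such a test. The class and its discount rule are invented for illustration, and a hand-rolled check stands in for a real framework like JUnit:

```java
// The class under test -- the name and the discount rule are invented
// for this illustration; they come from no real project.
class PriceCalculator {
    // Design intent: orders of 10 or more units get a 5% discount.
    // Prices are in integer cents to avoid floating-point surprises.
    static int totalCents(int units, int unitPriceCents) {
        int raw = units * unitPriceCents;
        return units >= 10 ? raw * 95 / 100 : raw;
    }
}

public class PriceCalculatorTest {
    // Hand-rolled checks standing in for a framework like JUnit.
    public static void main(String[] args) {
        check(PriceCalculator.totalCents(1, 200) == 200, "no discount below 10 units");
        check(PriceCalculator.totalCents(10, 200) == 1900, "5% discount at 10 units");
        System.out.println("all unit tests passed");
    }

    static void check(boolean ok, String intent) {
        if (!ok) throw new AssertionError("violated design intent: " + intent);
    }
}
```

Note that each check names the design intent it protects; when a future change breaks the rule, the failure message says exactly which intent was violated.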

I'd like to tell you that every developer is doggedly focused on both the quality of the production logic and the thoroughness of the unit tests that back that logic. In reality, through a combination of poor training, lack of emphasis at the management level, and just plain laziness, developers produce tests that span from great all the way down to downright destructive (more on that in another blog entry). One of the easiest ways to track this testing externally is through code coverage.

Code Coverage

Code coverage is a set of metrics that can give developers and other project stakeholders a sense of how much of the production logic is exercised by the unit tests. The simplest metric is "covered lines of code", aka line coverage. It is usually expressed as a percentage: if a class has 50 lines of code and 60% line coverage, then 30 lines of that production logic are executed as part of running the unit tests for that class. There are other coverage metrics that can help you gauge the goodness of your tests, like branch coverage, class coverage, and method coverage. But here we will focus on line coverage, since it is the most widely used.
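The arithmetic behind that example is simple enough to sketch (the method name here is mine, not from any coverage tool):

```java
public class Coverage {
    // Line coverage = executed lines / total executable lines, as a percentage.
    static double lineCoveragePercent(int coveredLines, int totalLines) {
        return 100.0 * coveredLines / totalLines;
    }

    public static void main(String[] args) {
        // The example from the text: 30 of 50 lines executed by the tests.
        System.out.println(lineCoveragePercent(30, 50) + "%"); // prints 60.0%
    }
}
```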

The general, common-sense assumption is that "more is better", so misguided management and deranged architects insist on 100% code coverage, thinking that this gives maximum confidence that the quality of the code is high. If we had an infinite amount of time and money to spend on projects, this conception might represent the optimum. Since that luxury has never existed in the last 4 billion years, we have to spend our money wisely. And this changes things drastically.

The truth is that it might cost M dollars and time to achieve, say, 80% line coverage, but it might take M *more* dollars and time to get the last 20%. In some cases, getting the last few percentage points can be extremely expensive. The reasons for this non-linear cost are several.

First, production logic should be tested through its public interface where possible, rather than through a protected or private interface. It can be laborious to construct the conditions necessary to hit a line of code buried in try/catches and conditional logic behind a public interface. This cost can be lowered by refactoring the code towards better testability, but this is a continuous struggle as new code is produced. There is a truism among veteran developers that increasing the testability of production logic improves its design.
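As a sketch of what refactoring towards testability can look like (all names here are invented): a validation rule buried behind a public method's try/catch is extracted into a seam that a unit test can drive directly, instead of having to manufacture the failure conditions above it.

```java
public class ConfigReader {
    // Public entry point: error handling and fallback live here.
    public int readPortOrDefault(String raw) {
        try {
            return parsePort(raw);
        } catch (NumberFormatException e) {
            return 8080; // fall back to a default on bad input
        }
    }

    // Extracted seam: the interesting logic is now directly testable
    // without constructing the failure conditions that surround it.
    static int parsePort(String raw) {
        int port = Integer.parseInt(raw.trim());
        if (port < 1 || port > 65535) {
            throw new NumberFormatException("port out of range: " + port);
        }
        return port;
    }
}
```

A test can now hit every branch of parsePort with one-line calls, while a single test of readPortOrDefault covers the catch path.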

Second, some code is highly complex, as measured by cyclomatic complexity. Arguably this code should be refactored, but projects do carry a certain percentage of high-cyclomatic-complexity code forward from sprint to sprint.

The third reason is a bit technical. Java source code is compiled into bytecode, and the coverage tools run off an analysis of the bytecode, not the source code. The Java compiler consumes the source code and may emit bytecode with extra logic in it, meaning code with extra branches. It might not be possible to control the conditions that would take one path or the other through such an invisible branch. Further complicating this, the invisible logic can change from one Java compiler release to the next, putting a burden on the test logic to reverse engineer the conditions needed to cover it.
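One classic case (the details vary by compiler version and coverage tool) is a finally block, which javac typically duplicates in the bytecode for the normal and exceptional exits. A sketch:

```java
public class InvisibleBranches {
    static StringBuilder log = new StringBuilder();

    static int readAndClose(boolean explode) {
        try {
            if (explode) throw new IllegalStateException("boom");
            return 1;
        } finally {
            // javac typically duplicates this block in the bytecode for the
            // normal and the exceptional exit; a test suite that drives only
            // one of those exits can leave "invisible" paths uncovered.
            log.append("closed;");
        }
    }

    public static void main(String[] args) {
        readAndClose(false);
        try { readAndClose(true); } catch (IllegalStateException e) { /* expected */ }
        System.out.println(log); // prints closed;closed;
    }
}
```

A suite that never triggers the exception may report the source line as covered while bytecode-level branch coverage still shows a miss.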

Summary

Based on the above discussion, achieving 100% line coverage can be very expensive. On the teams I have worked on over the years, a reasonable line coverage target would be 70% or more. But you should let the development team determine this limit. If you force your teams to get to 100% line coverage, you are spending money that might be better spent on automation tests. In addition, I have seen cases where developers short-circuit the unit tests by writing tests whose only purpose is increasing the coverage number. You can readily identify these tests because they have no assertion or verification check in them - they just make a call and never check the result.
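The difference is easy to see side by side. A hypothetical sketch, shown without a test framework:

```java
public class CoverageGaming {
    static int discountPercent(int units) {
        return units >= 10 ? 5 : 0;
    }

    // Coverage-only "test": executes the code, checks nothing.
    // It bumps the line-coverage number but would still pass
    // even if discountPercent returned -999.
    static void coverageOnlyTest() {
        discountPercent(10);
    }

    // Real test: the same call, but with a verification check.
    static void realTest() {
        if (discountPercent(10) != 5) {
            throw new AssertionError("expected a 5% discount at 10 units");
        }
    }

    public static void main(String[] args) {
        coverageOnlyTest();
        realTest();
        System.out.println("both pass -- but only one proved anything");
    }
}
```

Both tests produce identical coverage numbers; only the second one can ever fail.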

In short, you should be careful what you ask for. Make sure you involve the development team in the decision about code coverage. Spending another 50% of scarce testing dollars on that last 10% of coverage is unlikely to bring a return on investment.

Sunday, May 21, 2017

The Problem

On my current team, we use a common tool stack of Jenkins, Maven, Robot Framework, and Java. Our build system is centralized, with a couple of dozen projects. We have suffered build and developer friction, namely:

* Our version of Jenkins is older than the Internet and cannot be easily upgraded.

* Upgrading a plugin/tool risks incompatibility with other plugins/tools.

* We have accumulated snippets of shell scripts embedded in POM files.

* Developers sometimes use our continuous integration (CI) system to do compiles and test code.

* We sometimes have trouble with local builds passing and CI builds failing.

These and more are the Usual Suspects on almost any build system that has grown over time. Having been on this team for around two years now, I have identified a direction we should take to modernize our build system.

A Solution

Docker and containerized builds have been around since at least 2013. There are some compelling reasons to consider using containerized builds. This is my attempt to clearly enumerate the advantages and to convince my chain-of-command to make the jump. Hopefully this will help any readers to do the same.

* Moving to containers means we can deploy onto a cloud with minimal work. This can address scaling issues effectively. Note that some builds will still depend on lab access to local devices, and those dependencies will not scale.

* Containerizing the build pipeline means easier upgrading. For example, running a component in its own container isolates it, so other containers that depend on it are forced through a standard, explicit interface.

* Containerizing the build means better partitioning. Instead of making environments that contain all possible tools and libraries, a container uses only those needed for its specific purpose. This has the side effect of reducing defects due to interacting 3rd-party components.

* Containerizing the build means a clean reset state. Instead of writing reset scripts, the container is discarded and resurrected from a fixed image. This is a phoenix (burn-it-down), immutable view of containers, and it forces all build and configuration steps to be explicit rather than accumulating in snowflake instances.

* Containerizing means 100% consistency between local development and the build system, which should eliminate the but-it-works-on-my-machine problem.

* Containerizing the build means effective postmortems for build failures, potentially leaving failed runs in the exact state of failure rather than relying solely on extracted log files.

* Containerizing the build means building and installing an intermediate, shared artifact once, instead of 52 times, which potentially speeds up the build cycle.

* Containerizing the build means that some tests can make temporary checkpoints via a temporary image/container and roll back to that state rather than tearing down and rebuilding, affording a quicker build.

* Judicious use of containers might help with diagnosing hard-to-reproduce issues in the field. I have seen instances of technical support sending/receiving VM images to/from customers. Containers would be both simpler and potentially a lot smaller.

* Containerizing the build is considered a modern best practice and affords access to all kinds of build workflow and management tools.

That's it! Good luck convincing your management when the time comes to modernize your build systems.

Monday, April 3, 2017

Recently I was given the opportunity to work on an open source project. My team had developed a new additional feature and we were eager to get it pushed upstream before a code freeze milestone. Our team was closed source and our project was proprietary, so this effort was essentially to push a portion of it upstream as open source.

We decided to do three pushes due to various time and resource constraints: push the basic feature, then an enhancement, then some automated integration tests. Our team was not familiar with the process or requirements of the upstream project, but two thirds of the code had been written under contract to us by a former major committer to that upstream project.

The push of the basic feature was rougher than I expected, given the authorship of much of the code. Most of the issues were stylistic, and our reliance on the contractor to "do the right thing" had been a bit shaky. After some corrective actions, the first merge happened. Based on the code review issues raised on the first push, I revised the second batch of code and submitted a pull request. As expected, a few minor issues were raised, which I corrected.

So this project has checkstyle enabled as warnings. But the build is so full of noise and warnings (that is deserving of another rant on its own) that developers tend to ignore warnings, especially those involving things like whitespace, which happens a lot with checkstyle.

So I submitted the pull request and sort of expected an approval and merge. This was highly desirable because the code freeze deadline was only a couple of days away. Instead what I got was a request to move the checkstyle setting from warning level (the project default) to error level.

This was a stunning request for several reasons.

* First, making the code freeze deadline was now unlikely.

* Second, this would be a new requirement on our code that was not met by any of the hundreds of thousands of lines of code that already existed. Plus our previous commit was accepted without that obligation.

* Third, the time to enable something like checkstyle is day zero of a project. Turn it on then and let issues be fixed as part of every single commit. If you have never turned an analyzer like checkstyle loose on an unsuspecting code base, the results are dismal. Our commit was probably a few thousand lines of code, but checkstyle emitted about 250 "errors" - gems such as "there is a space between your method name and the parameter list". Manually fixing all of these would be tedious and expensive - probably several days of work.

There were a half-dozen "functional" issues, like "unused import" and "catching naked exceptions", that were entirely reasonable to ask us to fix. Those functional fixes were done, and the new pull request has now been awaiting approval for three days. I guess in open source, not everyone has a sense of urgency. Complicating this, your reviewer might actually work for a competing company, so politics might intervene.

Some observations...

I wanted to point out an issue of open versus closed source. In the past, on smaller closed source teams, we never enforced cosmetic coding standards. Everyone's slightly different coding style acted as a fingerprint, so you could tell without looking who the developer was. But in an open source context, where there can be large numbers of developers, it makes a lot of sense to have strict coding standards, and enforcement of those standards should be enabled by default.

Readability is subjective. The wars of bracket placement, and other whitespace issues, happen all the time and clearly have no real impact to the business. To me, a perfect solution would be that whitespace is always managed by the tools locally to each developer. For example, a developer's whitespace style is applied to the code base upon "get", does not show up as a "diff", and is removed as part of a "push".

The imperfect solution is the best we have today. You can get plug-ins for IntelliJ or Eclipse that will show you violations in "real time" so that you can take instant corrective action. Unfortunately, there is no comprehensive code reformatting tool to manage style violations. Many of them are amenable to automated fixes, but not all of them.

We have some ways to go before we can have our cake and eat it too. Maybe it will happen when we have spent enough expensive developer time chasing down that stray space in our code-under-review.

Monday, March 27, 2017

Recently I had to commit work upstream to the public Open Daylight project. As part of that effort, I had to learn some new tools, particularly Gerrit. This is a code review/management tool that works with Git, which I was already familiar with.

Here are some ad hoc things I learned as part of committing work upstream...

Gerrit (via its CI plugin) has some non-obvious controls, namely that you can kick off another build by putting "recheck" as your comment. Others are "rerun" and "verify".

Upstream coders usually add the prefix "WIP: " to their bug report to let other developers know things are not ready for prime time reviews yet.

Mechanically, you can review the issues by using the "Diff against:" drop list to pick a revision to start at, then going into the first file and using the green upper-right arrows to navigate between files. (It'd be nice if Gerrit had a "next feedback item" navigation.)

The master branch on ODL can and does break, causing lots of downtime with broken builds (days, even).

Full builds take 10 hours and verifying-before-pushing is an evolving process.

If you are using "git review" and you forgot the "--amend" option on your commit, you will be prompted to squash and will end up with a new branch. You can recover using "git rebase -i HEAD~2" then using the force option and then abandoning the new branch.

Along with master breaking, master can also produce stale artifacts, so don't assume that clearing your m2 cache, creating a subproject repo and building will give you up-to-date code.

The patch workflow is not very good at all for having multiple people working on an item; traditional git branches and management are superior in this respect so expect some pain.

If your patch won't build because master is broken, you can rebase it to pick up a fix temporarily. When you do a "git review" to push it, you will be prompted because there are two commits (your new patch revision and the fix). You do not want to merge that fix into your patch.

Hopefully, some of these tips will find their way to helping someone else along the way.

Saturday, November 19, 2016

One of the things I enjoy doing is looking at other developers' code and dissecting it. You can find a lot of advice on how to refactor code (including this blog). One that caught my eye recently was this one. It shows a small C# example and how to refactor/improve it.

Go ahead and read it. It'll take 5 minutes.

I'll wait.

So I agree that those changes were probably improvements to the code. But when I see code, I see layers and coupling and dependencies. So I thought I'd post a refactoring here that is very different from what you'll see in most places.
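The original listing is not reproduced here, but the shape I have in mind looks roughly like this. Only the InvoiceBasics name comes from the example; the collaborator interfaces are invented for illustration:

```java
// Collaborators are injected, so logging and notification can be swapped
// or stubbed out in tests (composition instead of inheritance).
interface Logger { void info(String msg); }
interface Notifier { void invoiceSent(String customer); }

class InvoiceBasics {
    private final Logger log;
    private final Notifier notifier;

    InvoiceBasics(Logger log, Notifier notifier) {
        this.log = log;
        this.notifier = notifier;
    }

    // The business logic stays simple and clean; the cross-cutting
    // concerns are delegated to the injected collaborators.
    void send(String customer, long amountCents) {
        log.info("sending invoice to " + customer + " for " + amountCents + " cents");
        // ... build and transmit the invoice here ...
        notifier.invoiceSent(customer);
    }
}

public class InvoiceDemo {
    public static void main(String[] args) {
        InvoiceBasics invoice = new InvoiceBasics(
                msg -> System.out.println("LOG: " + msg),
                customer -> System.out.println("NOTIFY: " + customer));
        invoice.send("Acme", 12_50);
    }
}
```

In a test, the two lambdas become recording stubs, which is what makes this arrangement so much more testable than logging and notifying inline.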

I bet you weren't expecting that. I have been a developer for over 30 years, and it has taken some time to see software the way I do. So let's evaluate this. Is the code better? It depends.

* The code above has separation of logging, error flow, and notification, and emphasizes composition instead of inheritance.

* The main business logic (found in InvoiceBasics) is very simple and clean.

* More functionality can be added without disturbing everything else.

* It is far more testable than the original.

* It conforms more to the Single Responsibility Principle than the original.

But there is a cost. There are more classes (that each do less). It is probably easier to understand what happens in the original code, whereas you'll have to think a bit more with the newly refactored code. The refactoring also took more time to do, but you gain in testability, flexibility, etc. So you have to take all of this into account. You, as the designer, have to decide how much money to spend on this code and how long you expect it to live. It is not always easy to decide.

Saturday, November 12, 2016

So you want to be the best developer you can be? Warning - your career needs resemble those of a living plant more than that inert diploma sitting on your wall. Uncle Bob of Clean Coders said that developers should spend upwards of 20 hours per week "sharpening the saw". I'll let that sink in for a minute - that is half of your normal work week - on top of your real work. Now you know what you're up against.

So here are my recommendations to anyone who wants to get better…

On Software Craftsmanship

You are paid to make decisions. One of the critical ones is how much money to spend on developing a chunk of logic. Be thoughtful about taking shortcuts, because you are just pushing even bigger costs into the future. Those costs will be borne by you, your colleagues, and your company. The trick is to "do the right thing" without over-engineering (overspending). Here are some great technical resources you should consider: