Five Lessons Open Source Developers Should Learn from Extreme Programming

Extreme Programming, or XP, isn't so much revolutionary as it is
evolutionary. Developers have known the value of code reviews,
testing, and good communication for decades, though we've ignored
that knowledge far too often in practice. An earlier article, "Five
Lessons You Should Learn from Extreme Programming," explained several XP
practices that apply to non-XP projects. A little common sense, a
bit of learning from failure, and a lot of discipline can improve
your team.

It's harder to see how XP can apply to open source projects,
especially those that apparently lack a formal customer and are
generally immune from budget and schedule pressures. The other
challenges of software development apply, though. Sustained
development requires managing complexity. To build a successful
open source project, you must solve many of the same problems you'd
face with a successful in-house project.

As you'd expect, there are several lessons open source
developers can learn from Extreme Programming.

1. Test, Test, Test

The second most valuable artifact in any project is an automated
test suite. (The first is the code itself.) Because few open source
projects have the luxury of pair programming or even mentoring with
any reasonable physical proximity, a good test suite is invaluable
to understanding the code.

There are two parts to a test suite. The first and most
important is the set of customer tests. Think of this as executable
specifications. The second part is the set of programmer tests.
These exercise individual pieces (functions, classes, and methods)
individually. They're also important, but secondary to the customer
tests.

For example, in my Mail::SimpleList project, customer
tests simulate sent and received email messages. That's it; I only
need to mock up the system's entry and exit points. From there, the
essential, user-visible behaviors can be tested.
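
To make the idea concrete, here is a minimal sketch of a customer test in
Python. It is not the Mail::SimpleList API; the Message and TinyList names
and the relay-to-other-members behavior are assumptions invented for
illustration. The point is only that the test fakes the entry point (an
arriving message) and the exit point (outgoing mail) and checks what a user
would see.

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender: str
    recipient: str
    subject: str
    body: str


class TinyList:
    """Toy mailing list (hypothetical): relays each post to the other members."""

    def __init__(self, address, members, deliver):
        self.address = address
        self.members = list(members)
        self.deliver = deliver  # exit point: whatever actually sends mail

    def receive(self, msg):
        """Entry point: handle one incoming message."""
        for member in self.members:
            if member != msg.sender:  # don't echo a post back to its author
                self.deliver(Message(self.address, member, msg.subject, msg.body))


def test_post_reaches_other_members():
    outbox = []  # fake exit point: collect mail instead of sending it
    members = ["a@example.com", "b@example.com", "c@example.com"]
    mailing_list = TinyList("list@example.com", members, outbox.append)

    # Fake entry point: hand the list a message as if it had just arrived.
    mailing_list.receive(
        Message("a@example.com", "list@example.com", "Hello", "First post")
    )

    assert [m.recipient for m in outbox] == ["b@example.com", "c@example.com"]
    assert all(m.subject == "Hello" and m.body == "First post" for m in outbox)


test_post_reaches_other_members()
```

Nothing in the test knows how TinyList works inside, so the implementation
can change freely as long as the visible behavior stays the same.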

Examining the customer tests allows new hackers to get a feel
for how the system works and the features it supports. Anything not
tested doesn't exist. Don't count on it working in the next
version. Don't count on it working in this version.

The programmer tests operate at a much lower level. They're
slightly harder to write because they expose more details. They
help immensely while debugging, though, as they can often pinpoint
bugs to certain lines of code. As well, programmer tests help keep
developers honest. From the developer's point of view, a class,
function, or method's necessary behavior is enshrined in an
executable test. You can be more confident developing if you know
that the tests will catch accidental breakage.

The de facto testing tools in the XP world are all open source.
Given the existence of so many excellent open-source testing
frameworks, testing is the best place to start. Explore the
appropriate testing framework for your language. Write a new
feature in a test-first fashion. After all your tests pass and you
can't think of any more tests to write, refactor what you've just
written. It takes a while to get the hang of testing, but even a
few naive tests are much better than nothing.

Retrofitting tests to an existing project is difficult. Instead,
write a test every time you touch code, whether you fix a bug or
add a feature. This will concentrate your tests where you need them
the most. As a side benefit, after you've fixed a few bugs in the
same section of code, you'll likely have enough tests to refactor
it; bugs tend to congregate.
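
As a sketch of what "write a test every time you touch code" can look like,
suppose a bug report says that an address-splitting helper chokes on a
quoted local part containing an "@". The split_address function and the bug
itself are hypothetical, made up for this example. The regression test is
written first to reproduce the report, then the code is fixed until it
passes.

```python
import unittest


def split_address(address):
    """Split 'user@host' into (user, host).

    Hypothetical history: the first version unpacked address.split("@"),
    which raised ValueError for a local part that contains an "@".
    Splitting on the last "@" with rpartition fixes that.
    """
    user, sep, host = address.rpartition("@")
    if not sep or not user or not host:
        raise ValueError(f"not an address: {address!r}")
    return user, host


class SplitAddressRegression(unittest.TestCase):
    def test_plain_address(self):
        self.assertEqual(
            split_address("alice@example.com"), ("alice", "example.com")
        )

    def test_at_sign_in_quoted_local_part(self):
        # Written to reproduce the bug report; it failed before the fix.
        self.assertEqual(
            split_address('"a@b"@example.com'), ('"a@b"', "example.com")
        )

    def test_missing_host_is_rejected(self):
        with self.assertRaises(ValueError):
            split_address("alice@")
```

Saved in a file, these tests run with "python -m unittest". Each fixed bug
leaves one more test behind, which is how coverage accumulates in the
troublesome parts of the code.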

Of course, you need a well-designed system to get the most use
out of your tests. Test-driven development will help. So will the
next lesson.

2. Practice Simplicity

The Unix philosophy of writing simple tools that each solve a
single problem simply and can be combined easily and flexibly has
worked well for decades. Simplicity's not limited to operating
systems. While it's possible to go overboard, perhaps linking
against dozens of libraries to avoid writing a few lines of code,
too many projects commit the opposite sin.

XP promotes simplicity in two stages of the development process.
First, the scope of each release is kept small. Each release
represents a fixed amount of development time. Only work that is
estimated to fit into this time can be scheduled. Second, each
development task is implemented as simply as possible to pass the
tests. No unrequested features are added unless the schedule is
readjusted.

Simplicity of Planning and Scheduling

Open source projects usually don't have the time or budget constraints that
require hard and fast release dates, but getting frequent feedback from users
and customers is vital to the survival of the project. Since "customers" are
often potential developers, having a good feedback loop can increase the
resources at your disposal. Keeping the source code public with regular
snapshots or anonymous CVS or Subversion access helps, but if features take a
long time to land or to stabilize, it can be difficult to know when the code is
worth using.

As with in-house projects, soliciting feedback and scrutiny can be scary.
It's integral to solving real problems correctly, though.

Anyone can come into the code at any point, so keep the code
accessible. Keep the main source tree passing all of its tests. Fix
any problems as soon as they occur. Several large projects,
including Mozilla and Perl, have regular smoke tests that run the
full test suite as often as possible on as many platforms as
possible. It's much easier to track down errors if you can narrow
down the breakage to a single change or set of changes. (Andreas
Koenig has a script that finds previously unknown regressions in
Perl by performing a binary search on changesets. It's very
handy.)
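
The binary search idea can be sketched in a few lines of Python. This is
not Andreas Koenig's script, just an illustration of the technique; the
first_bad function and the is_good callback (which would check out a
changeset and run the test suite) are assumptions. It also assumes the
oldest changeset passes, the newest fails, and the tests stay broken once
they break.

```python
def first_bad(changesets, is_good):
    """Return the earliest changeset for which is_good() is False.

    changesets must be ordered oldest to newest, with the first one
    known good and the last one known bad.
    """
    lo, hi = 0, len(changesets) - 1  # invariant: lo is good, hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(changesets[mid]):
            lo = mid  # breakage happened after mid
        else:
            hi = mid  # breakage happened at or before mid
    return changesets[hi]


# Pretend changesets 100..119 exist and number 113 introduced the bug.
tested = []


def is_good(changeset):
    tested.append(changeset)  # stand-in for "check out and run the tests"
    return changeset < 113


culprit = first_bad(list(range(100, 120)), is_good)
print(culprit, "found after", len(tested), "test runs")
```

Twenty changesets take five test runs instead of up to eighteen, and the
savings grow quickly as the range of suspect changes gets longer.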

Work in steps as small as possible. Minimize each set of
changes. Not only is this less to test and less to debug, but
there's less migration between changes. Watch how Linus manages big
changes to the Linux kernel; he prefers small, steady patches.
They're easier to read as they change one thing at a time.

By working in small steps, it's easier to have regular releases.
Subversion is a good example. While they have goals for beta and
final releases, they release a new snapshot every three weeks.
Scheduling releases can be difficult, with random contributions,
but the bulk of development likely comes from a few dedicated
coders anyway. You can't control what outside developers produce,
but you can focus their contributions into small, manageable
pieces.

Simplicity of Development

A small, simple application that does one thing well is much
more valuable than a hundred applications with lofty goals that
never actually do anything. There's nothing wrong with writing a
framework, provided you really need it. It's much easier to
generalize a framework out of working projects than to design a
framework to fit future, uncoded projects.

A few projects have survived despite going the other way around.
Mozilla comes to mind, giving the appearance of building everything
but a decent web browser for a couple of years. Don't count on a
combination of luck, skill, funding, and determination pulling you
through.

Practicing simplicity can be difficult. Test-driven development
helps, at least when you're implementing a feature. It's harder to
design features simply. If you have other developers, talk about
upcoming features and try to find the simplest way to implement
them. It may take a couple of rounds of brainstorming on IRC to
find a better approach than any one person's best idea, but it will
happen.

3. Refactor, Don't Rewrite

Refactoring is changing the design of your code without
changing its behavior. If you work in small steps and have good
test coverage, you can clean up amazing amounts of code without
changing externally visible behaviors. Think of it as reducing a
complex algebraic equation; you may start with something horribly
complex, but you simplify, one step at a time, according to
well-known rules and guides, ending up with something you can
understand at a glance.

Having maintainable, clean code is an excellent goal. Keep it as
a high priority, especially as you write new code. Write tests and
practice simplicity. On the other hand, having code that works
today is much more important than having beautiful, perfect code
that might work again when it's finished. Too many projects rewrite
themselves from scratch every few versions as the authors come up
with new ideas, forget how their code works, or decide the existing
code base isn't worth salvaging.

Rewriting seems tempting; it seems faster and easier. For very
simple cases, it may be. If you've been writing tests as you go and
if you can demonstrate that your software meets the customer needs
(because it passes the customer tests), it's almost always less
work to improve existing, working code. Would you build a new house
just because your kitchen is full of dirty dishes?

It's also tempting to start over from scratch when you come
across code you don't immediately understand, whether or not you
wrote it. You'll be better off learning how to read code, though.
Even if you're only coding for your own pleasure and education,
you'll likely learn more by exploring how other people have already
solved problems you might not even realize you'd have encountered.
Very few programming problems are as simple as they first seem.

Of course, if you're migrating to a different language or
platform, if you don't have any users, if you have licensing
conflicts, if you don't have any tests, or if all efforts to work a
critical feature into the existing codebase have been unsuccessful
or too expensive, sometimes rewriting from scratch is worth the
cost. Don't start without giving serious consideration to what you
can reuse, though.

Several large and influential projects, including Mozilla, Apache 2,
Enlightenment, and Perl 6, have opted for rewrites. It's hard to say whether
large-scale refactorings would have worked better, but it's easy to see common
drawbacks, including slow migration rates to the new versions and questions of
the quality of unknown, unproven new code. Splitting development efforts
across two major branches may, as in the case of Perl 5 and Perl 6, spur on
extra development effort and help recruit new developers. It's also possible,
as in the case of Mozilla versus Netscape 4, that your developers won't want to
maintain the old codebase and will provide only the minimum possible upgrades
for months or years. An extended period of low maintenance and little visible
progress can frustrate existing users and discourage potential new users.

4. Release Frequently

Common wisdom says "Release early, release often." The earlier
you release working code, the better the chance of finding other,
like-minded people to give you feedback and to refine your ideas.
The more often you release code, the more often you can receive
feedback from users. XP projects deliver code to the customer every
three weeks or so. These aren't alphas, betas, or even release
candidates. They're stable, high-quality releases, capable of being
delivered to end users immediately.

You might think XP developers would go crazy trying to get
everything done. The secret is threefold.

First, all features are broken into small pieces that can be
completed in a day or two. These iterations are also scheduled and
monitored closely. Not only does the schedule hold only as much
work as the developers estimate they can accomplish, but work is
rescheduled if the schedule is too conservative or too liberal.
Second, comprehensive programmer and customer tests help identify
when features are truly finished. Finally, any new features that
require data conversion (such as database schema changes) must be
accompanied by migration utilities.

Since the software is always kept in a working, ready-to-release
state, it's easy to release regularly. This helps keep migration
risks low and the feedback loop between developers and users short.
It pays to automate as much of the release process as possible,
from smoke tests to packaging to installation tests.

Several projects have predictable release schedules, from
Mozilla to Subversion to OpenBSD. Though there are no hard and fast
deadlines and a project needs no financial backing (or even users)
to survive, managing schedules and change wisely can only help a
project survive.

5. Be the Customer, When Appropriate

XP gives the power to manage the schedule to the customer.
That'd be scary, if it didn't also give the power to estimate
development tasks to developers. A scheduling meeting generally has
the developers saying, "We can do X hours of work in the next three weeks. Here is a list of tasks and the amount of time we estimate each will take. Please choose enough tasks to add up to X."
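
The arithmetic of such a meeting is simple enough to sketch in Python. In
real XP the customer does the choosing, not an algorithm; this example only
shows how developer estimates and customer priorities combine under a fixed
budget. The Task fields, the plan_iteration function, and the sample
numbers are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    estimate_hours: int  # supplied by the developers
    priority: int        # supplied by the customer; 1 is most important


def plan_iteration(tasks, budget_hours):
    """Take tasks in customer priority order, skipping any that no longer fit."""
    chosen, remaining = [], budget_hours
    for task in sorted(tasks, key=lambda t: t.priority):
        if task.estimate_hours <= remaining:
            chosen.append(task)
            remaining -= task.estimate_hours
    return chosen, remaining


tasks = [
    Task("search", 30, 1),
    Task("export", 40, 2),
    Task("themes", 15, 3),
]
chosen, left_over = plan_iteration(tasks, budget_hours=60)
print([t.name for t in chosen], left_over)  # ['search', 'themes'] 15
```

Here the second-priority task doesn't fit in what's left of the budget, so
it waits for the next iteration or gets split into smaller pieces.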

Few open source projects have the luxury of a single customer
who can set development priorities. (It's nice to get one piece of
feedback a month from a happy user, let alone a useful feature
request!) That leaves the lead developers to wear the customer hat
when appropriate.

Solving Problems

XP customers decide what features to request based on actual
business problems. The whole point of the software is to make their
lives easier by helping them get work done. It may take some work
to describe the needs of your project in those terms, especially if
you're writing a game, but it's possible.

XP customers use stories to communicate feature requests to
developers. A story is just a sentence or two describing the
feature from the customer's point of view. Stories have to be
short, concrete, and testable; they must fit into the normal
release cycle and they should suggest customer tests.

Make your goals clear. Keep them small. They're easier to
explain to other people and they're easier to schedule. If they're
public, they're easier for other people to do for you; I
occasionally look through projects I use to look for small,
well-defined tasks to do in an afternoon.

Scheduling Features

Scheduling volunteers is hard. You don't know what they'll work
on, unless they tell you. You also don't know how much time they'll
have to contribute. Of course, you probably don't have financial
pressures to release in a given quarter, though Perl 5, Python, and
Ruby developers have all been scurrying to release the latest
versions in time for integration in Panther (Mac OS X 10.3).

There's nothing like a code freeze to bring out latent brilliant
ideas for potentially risky new features. If you wait for idea
after idea to materialize, your release cycle can stretch out far
longer than you anticipated.

If you build schedules around stories, you can adjust the amount
of work scheduled for each release. It's okay to add features to a
release as volunteers appear with patches and ideas, but, if you
release once a month, it's easy to delay work on an idea for a
couple of weeks until the start of a new release cycle.

Knowing When You're Done

If you're the customer, you have the authority to say when a
release is ready. Rely on your customer tests; the first task of
any story card should be to write customer tests. Even if you never
ship these tests, they're a great way to keep track of your status,
and well-written ones are invaluable debugging aids.

Every story scheduled for a release needs customer tests. If the
tests aren't even written, no one's worked on the story yet. If the
tests are written and they fail, the story is started. If the tests
are passing, the story is finished.

When all of the tests for the current stories are written and
all of the tests in the system pass, make a release.
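
These rules are mechanical enough to write down. The following Python
sketch assumes each story maps to a list of customer test results (True for
a pass) and that a separate flag reports whether the rest of the test suite
passes; the function names and data layout are hypothetical.

```python
def story_status(test_results):
    """Classify a story from its customer test results (True means pass)."""
    if not test_results:
        return "not started"  # no tests written yet
    if all(test_results):
        return "finished"     # every customer test passes
    return "started"          # tests exist, but at least one fails


def ready_to_release(stories, suite_passes):
    """True when every scheduled story is finished and the full suite passes."""
    every_story_done = bool(stories) and all(
        story_status(results) == "finished" for results in stories.values()
    )
    return every_story_done and suite_passes


stories = {
    "subscribe by email": [True, True],
    "unsubscribe by email": [True, False],
    "digest mode": [],
}
for name, results in stories.items():
    print(name, "->", story_status(results))
print("release?", ready_to_release(stories, suite_passes=True))
```

With one story still failing a test and another not yet started, the answer
is no, and it is obvious which stories still need work.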

After a release, make a list of the next features you'd like to
add. Write them as stories, from the point of view of the user.
Give each story a rough time estimate and arrange the stories by
priority, again, from the user's point of view. Then choose the two
or three most important stories and schedule the next release based
on their estimates. It may be a small release, but if you resist
the temptation to add features without going through the scheduling
process, it will be predictable. You may have to automate your
release process, but that's a good thing!

Applying These Principles

It's hard to adapt "traditional" software development
processes to account for the realities of open source development.
The twin goals of excellent software are the same, though: to
write high-quality, maintainable software that meets the customer's
real needs.

Several projects practice these lessons. Some have come about
after XP was introduced. Some came about before. One good example
of project management is Subversion. They have good tests,
they reuse lots of good code from other projects, and they have a
stable, predictable release schedule.

As before, the best weapon in your arsenal is the knowledge and
talent of your development team. Find the most pressing problem and
solve it. If you're a typical open source project, you'll likely
benefit from one or more of the above lessons.
