Getting people involved in free software projects

I'm currently running a Mozilla Bug
Week to help new people get involved
with working on the Mozilla code. The purpose of this article is both to
plug it, and to discuss whether it's possible to lower the barrier to
entry to such a complex project, or whether I am doomed :-)

My perception has long been that the barrier to entry at
the Mozilla project is too high. Mozilla is an extremely complex
project, requiring 1.5GB of space and multiple hours on a normal box to
build a debug build, and our docs are pretty terrible.

Patch Maker was one part of the solution to that. It lets you contribute
patches to interpreted files (JS, CSS, XUL) using only a nightly build,
by mapping paths in the install to paths in CVS and munging the patches
before upload to Bugzilla. It works on Linux and, more importantly,
Win32. Buying MSVC++ 6 is not an option for many people, and the hope is
that this will widen the field of people who are able to contribute
patches.
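
To make the path-mapping idea concrete, here is a minimal sketch in Python. It is not Patch Maker's actual code: the mapping table, the function names and the example paths are all assumptions made up for illustration. It only shows the general technique of translating install paths to CVS paths and rewriting the header lines of a unified diff.

```python
import re

# Illustrative mapping from install-relative path prefixes to CVS paths.
# These entries are assumptions for the example, not the real tool's table.
INSTALL_TO_CVS = {
    "chrome/navigator/content/": "mozilla/xpfe/browser/resources/content/",
    "chrome/global/skin/": "mozilla/themes/classic/global/",
}

# Matches the "--- path" and "+++ path" header lines of a unified diff.
# Simplification: a removed line whose own text starts with "-- " would
# also match, so a real tool would have to track hunk boundaries.
HEADER = re.compile(r"^(---|\+\+\+) (\S+)(.*)$")


def to_cvs_path(install_path):
    """Translate an installed-file path to a CVS path, or None if unmapped."""
    # Try longer prefixes first so the most specific mapping wins.
    for prefix in sorted(INSTALL_TO_CVS, key=len, reverse=True):
        if install_path.startswith(prefix):
            return INSTALL_TO_CVS[prefix] + install_path[len(prefix):]
    return None


def munge_patch(patch_text):
    """Rewrite the header lines of a unified diff to use CVS paths."""
    out = []
    for line in patch_text.splitlines():
        match = HEADER.match(line)
        if match:
            mapped = to_cvs_path(match.group(2))
            if mapped is not None:
                line = f"{match.group(1)} {mapped}{match.group(3)}"
        out.append(line)
    return "\n".join(out) + "\n"
```

Header lines whose paths have no entry in the table are passed through unchanged, so an unmapped file shows up as an obviously wrong path in the patch rather than being silently dropped.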

Bug Week
is another piece of the plan. Basically, I set up a rota of experienced
Mozilla hackers to hang out in the channel to help anyone who wants to
get involved with the Mozilla code. I then sent the announcement to as
many people as I could think of, and
Slashdot picked it up this morning. The idea is that people turn up
and are walked through making their first patch, in the hope that they
can then go away and make more without too much help.

So the question is: will it work? Will a load of people turn up, see the
Win32 build
requirements and go home again, or do we have any chance of holding
onto the people who become interested? What else can be done to lower
the barrier to entry?

Mozilla is a cross-platform product, running on multiple kinds of
*nixen, Mac OS, BeOS and Windows, using a cross-platform application
framework that was developed for Mozilla itself and which is now being
used to build other applications.

I helped with the release of ZooLib as open source. It
is another cross-platform application framework.

When I wrote the ZooLib
website to put on SourceForge,
I wrote the following essay to help people understand the importance of
cross-platform applications to developers and the community:

I'd love to actively contribute to Mozilla but my other commitments
prevent me from doing so. However, I'm doing my part in a small way by
running it under Debian PowerPC, Slackware, and a talkback build on
Windows.

If you're not using Mozilla yet, you should know that it's a good idea
to do so for the single reason that it gives you control over your
cookies. You can set it to ask permission before setting a cookie, then
it will remember your answer. In most other browsers, if you set it to
ask first it will ask repeatedly rather than remembering.

That way Advogato and Slashdot can have cookies for your logins, but
DoubleClick can't, so it can't track you. There's lots of other stuff,
but that's the main reason I've finally completed the switch from
Netscape.

As far as Win32 build dependencies being scary goes, one thing that
strikes me is that most of them appear to be free software of one
form or another, so there's nothing stopping someone putting together a
single distribution of most of the tools that would install everything
required. You should also be able to integrate stuff like getting the
environment variables set up and possibly initial CVS downloads into
that. Something like this would make the process of getting to the
point at which one can at least build something rather more direct.
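
As a rough sketch of the "is everything set up?" step such a distribution could run, something like the following Python would do. The function name is hypothetical, and the tool and variable names passed to it are placeholders; I am not claiming these are the actual Mozilla Win32 build requirements.

```python
import os
import shutil


def check_build_environment(tools, env_vars, environ=None):
    """Return (missing_tools, missing_vars) for the given requirements.

    tools    -- executable names that must be found on PATH
    env_vars -- environment variable names that must be set and non-empty
    environ  -- mapping to check variables in (defaults to os.environ)
    """
    environ = os.environ if environ is None else environ
    # shutil.which returns None when the executable is not on PATH.
    missing_tools = [tool for tool in tools if shutil.which(tool) is None]
    missing_vars = [var for var in env_vars if not environ.get(var)]
    return missing_tools, missing_vars
```

An installer could call `check_build_environment(["cvs", "make", "perl"], ["EXAMPLE_SRC_DIR"])` (the variable name here is made up) and tell the newcomer exactly what is still missing instead of letting the build fail halfway through.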

This assumes, of course, that getting things set up is a serious
problem. Given the number of people who seem to find build environment
setup scary it's quite possible that it is.

The whole "one single point of distribution" thing is part of what
makes kitchen sink systems like Debian attractive -
it's very easy to get most free software working on Debian since
the vast majority of useful stuff is already packaged and available.

For a long time (even before Netscape went opensource), I've
felt that, in my opinion as a professional software architect who's
worked with very big (up to 1.3 million lines of C code) projects, Netscape showed all the signs of a project that did
not have any acceptable sort of software infrastructure.

Especially not for a project of the size that it appeared to be.
AFAICT (and I may be wrong, please correct me if I am), Mozilla
started from a Netscape codebase, and tried (very hard, I'm not
putting down the efforts that have been put in, they have been
impressive) to fix it.

However, if there's anything I know as a professional software architect, it's
that large projects with flawed architecture are unfixable.
You cannot take a large bulk of code, and retrofit lessons
learned onto that bulk of code. You don't get much benefit from doing that, and it costs you far more time and energy than you get in terms of results.

The best you can do, is re-spec and re-design from scratch,
and reuse a very small fraction (typically 5-10 percent if you're very lucky and the original code is well designed and written) of the
existing code that happens to do exactly what you want.

Take Opera, for example... I downloaded it the other week - and its
.deb file is less than 3MB. And, AFAICT, it does more than Netscape
does, and does it better.

Why? Because, I'm fairly certain, they did some rigorous analysis
of the features that existed in browsers, thought about what they
wanted, came up with some new ones (Mouse Gestures are just a genius
idea) that fitted, then built it.

So... I'm sorry, this isn't intended to denigrate your efforts - I
think it's great that people are willing to put in such effort,
but I really think that the entire project is misdirected.

And no, this isn't Not Invented Here syndrome... quite the opposite,
I recommend a thorough analysis of the functionality (not the design - the functionality), and a redesign based on the
lessons learned from that analysis and the existing problems with the code.

Then, if you can reuse some of the code, great, do so... but I would be very surprised if it was found that more than 10 percent of the code was reusable, given a proper reanalysis and redesign.

Mozilla started from a Netscape codebase, and tried very hard... to
fix it.

Yes and no. Mozilla started from a Netscape codebase. After nine months,
at the end of 1998, they did exactly what you said - threw it all away
and started again, reusing only about 5% of it.

that large projects with flawed architecture are unfixable

I don't think Mozilla's new, current architecture is flawed but, even if
it is, that's reasonably unrelated to the topic at hand. :-) As with any
project, there are a few bits that could be written better.

Take Opera, for example... I downloaded it the other week - and its
.deb file is less than 3MB. And, AFAICT, it does more than Netscape
does, and does it better.

Opera's a fine product. But it certainly doesn't do more than Mozilla
0.9.5, and in many areas it doesn't do better. Mozilla just supports
More Stuff. Small example: I tried to use string_var.match(<regexp>) in
JS the other day; doesn't work in Opera.

Gestures aren't hard - some guy knocked that up as an installable addon
for Mozilla in the space of about a week. I don't have the URL to hand,
though, as I don't use it.

I really think that the entire project is misdirected.

This conclusion seems to me to be based on the faulty assumption you
made at the beginning. Anyway, with Mozilla just passing from usable
into excellent, now is not the time to throw it all away and start again.

Does anyone have any comments on good ways of getting people involved in
free software projects? ;-) (At the end of the week, I'll report back
with my experiences.)

Personally I think encouraging interested users to become
contributors is important: not only does the code base improve, but it
seems the surest way of getting people to understand the freedoms of Free
Software.

A lot of projects don't seem to pay that much attention to reducing
barriers to entry, presumably because in the past the people
who worked on projects were experienced hackers. The chance now
is to also introduce completely new people to the traditions.

Some things I think people could do would be:

Do what you can to make the tools of development easy.
Looking at GNOME, the vicious-build scripts, which take the pain out of
forming the development environment, are very important in making it
easy to set up and maintain a working system. I'd
really like to see more documents and help on the various technologies
of the project as they help people to see the big picture.

Documentation on how to contribute.
Most people seem to get stuck on 'where to start' so documentation on
how to get involved is important. Whether that be 'hacking' text files
or more formal Web pages. For GNOME 2.0 I think the Web pages on developer.gnome.org have
really helped.

Mentors you can go to
Having a mentor to help when there are problems, or just someone to give
encouragement and guidance is really good. Often 'lead developers'
don't want to do this but being approachable is a real key. With GNOME
Chema has done a
lot of work on GNOME-LOVE
which is a project to help new hackers. From what I've seen it's been
good as new hackers can find someone to ask.

Part of the problem I think is that projects can sometimes give the
impression that there is a cast of thousands working on them and that
every little bit done isn't a help: so people sit there thinking that
there's no point.

Whereas from my experience, even 10 minutes spent on helping a project,
be that hacking, testing, writing a bug or helping with a document, can
have an effect: it's the cumulative effect of all those little bits
of work!

Gerv - I hope you'll repost with your thoughts when your bug week is
done. I'd be particularly interested in any things that really got a
response from new hackers, documents etc.

I wish you lots of luck, but I think the problem
is really intractable. Bug fixing is a skill that
requires some significant experience to master and
is not taught anywhere. Especially if it's code
that's written by other people.

In fact one of my co-workers is extremely frustrated
by this. We just seem to "know" where to look for
bug X. Mostly, because we have seen this before
or something similar. She often asks me how I know
where to look or start and I have the damnedest time
answering her.

There are two skills a good bug fixer requires:

1. The ability to read other people's code.

2. The ability to recognize patterns.

Both benefit from experience, but the second
requires that you have a large store of previous
bug-fixing patterns to draw from. An inexperienced
person can spend weeks on a bug that a more seasoned
programmer can dispatch in an hour or two.

I think your efforts might be better focused on making
your experienced bug fixers more productive. What are
they doing that could be done by less experienced
people? Or that could be automated in some way?
The PatchMaker stuff is an excellent example
of this.

The other hurdle you face is that people are much
more interested in adding features than fixing bugs.
People that fix bugs in open source mostly arrive
to the project by contributing some feature and
in this way they learn enough of the architecture
to find and fix bugs. If I wanted to draw people
into an open src program, I would work on creating
a list of "features" that need implementing.

Okay... So I indeed predicated my conclusion on a false premise, that
Mozilla hadn't yet thrown it all away and started again. Good to
hear that it happened.

Nevertheless, if your build area takes up 1GB plus, then you need a lot
of software architecture, and you really *really* need a controlled
change process.

As bbense and blah have said... You need some way to make
your experienced people more productive, by having a process for them
to be able to farm out the easier stuff to newer people, and for being able to help those newer people become more experienced.

You don't have to have the same person diagnose the bug as analyses it, and you don't have to have the same person analyse the bug as fixes it.
I don't know how much of that goes on at the moment... but you'll find
that it's a lot easier for newcomers to the project to pick up an already analysed bug (to the level of, the problem is in that function, it needs to do this instead), and implement it, than it is to dive into
a vast code area with no direction.

If there isn't a directed way for newcomers to be able to start small,
they won't start at all. That's part of the reason people come to the project with new features rather than bugfixes - it's easier to bolt on something new (regardless of whether it fits cleanly with the rest of the architecture) than it is to look at the code as a whole.

And remember - software architect grade people are rare. Very few people that come to the project are *capable* of grokking the whole piece of software as a coherent thing, understanding all its ins and outs and the way data and events flow from place to place, the implications of the internal APIs and how they reflect on everything from speed of code to size of code and programmer mentality...

It's important to identify those that *do* have that understanding, and to make them *direct* the rest of the process, by providing analysis
of bugs and leaving the implementation to others, and by having them
look at architecture level features, instead of getting caught up in
the nitty gritty of "that gui button is in the wrong place".

I know absolutely nothing about the Mozilla development process (as witnessed by my previous comment), so I can't say whether you guys are
doing any of this already... but it is stuff you *should* be doing, if you're not.

The size of your codebase is suggestive that there are architectural problems, though. There are probably ways that the build process could be refactored to be smaller and better, and it is probably time to do that, before the project gets even bigger and less possible for any human to understand.

I downloaded the sources a month ago, tried reading the code and
gave up after seeing the sheer size of it. I'd like to help but it
looks like I will need a class browser or a tree showing all the classes
and interfaces. Documentation similar to the MFC or
BeBook docs would
definitely be a great way to help Mozilla newbies like me.

Have you seen the Tinderbox system? If
that's not a method of controlling change, I don't know what is :-)

The size of your codebase is suggestive that there are architectural
problems, though.

Leaving aside the fact that in any codebase that size, there are always
going to be things that could be done better, perhaps the reason for the
enormous codebase size is that it does a lot? :-)

I was talking to some other guys this evening - we couldn't think of a
bigger cross-platform free software project. I think our codebase is
significantly larger than Open Office's (the only other real contender)
and evolves at a pretty impressive rate. Mozilla is having to solve
scaling problems that no-one else is even encountering.

bbense said: If I wanted to draw people into an open
src program, I would work on
creating a list of "features" that need implementing.

Features are the last thing we need at the moment. :-) In fact, we are
in a feature freeze leading up to 1.0. But the idea is sound - I hope
we'll get more people involved this way after 1.0.