Rules of thumb for building frameworks

Here are some things I've learned from frameworks I've used and frameworks I've worked on. Please add your own lessons and guidelines in the comments.

Don't force users to use all your functionality.

Users should be able to use only some of the features without having to use them all, especially if you're providing a solution for everything. This makes the framework easier to use and easier to test, and promotes a cleaner design. Also, who says you came up with the right solution for everything? Let the user decide!
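As a rough illustration, here is a minimal Python sketch of two pieces of a made-up mini-framework that can each be used without the other. The names parse_query and Router are hypothetical, invented only for this example.

```python
# Hypothetical mini-framework: every piece is usable on its own.

def parse_query(query_string):
    """Parse 'a=1&b=2' into a dict. Needs nothing else from the framework."""
    result = {}
    for item in query_string.split("&"):
        if item:
            key, _, value = item.partition("=")
            result[key] = value
    return result


class Router:
    """Maps paths to handlers. Does not depend on parse_query."""

    def __init__(self):
        self._routes = {}

    def add(self, path, handler):
        self._routes[path] = handler

    def dispatch(self, path, *args):
        # Look up the handler registered for this path and call it.
        return self._routes[path](*args)
```

A user who only wants query parsing never has to create a Router, and a user who brings their own parser can still use the Router.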

Design for easy testing.

From day one, your code should have tests, and should be implemented in such a way that makes testing easy. This makes refactoring easy, and promotes a clean, untangled design. It also provides a source of examples for users until you've written documentation, and can uncover use cases you hadn't thought of.
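One common way to make code easy to test is to let callers pass in the things it depends on. The sketch below is only an assumption about what that might look like: RateLimiter and FakeClock are hypothetical names, and the real clock (time.monotonic) is just a default that a test can replace.

```python
import time


class RateLimiter:
    """Allows an action at most once every min_interval seconds."""

    def __init__(self, min_interval, clock=time.monotonic):
        self._min_interval = min_interval
        self._clock = clock  # injected so tests can control time
        self._last = None

    def allow(self):
        now = self._clock()
        if self._last is None or now - self._last >= self._min_interval:
            self._last = now
            return True
        return False


class FakeClock:
    """Test double: a clock whose time only moves when the test says so."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now
```

With the fake clock, a test can check the limiter's behaviour without sleeping, and the same test doubles as a usage example.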

Your coding standard is your friend.

A clean, consistent look to your code will make it easier for new users and developers to get started. A messy, complex and inconsistent code base will make developers think twice before joining your project.

Never declare APIs stable before they've been used by other people.

Your first design is probably not going to do the right thing. You need other people to find all the edge cases you haven't considered.

Users will use unstable APIs, and they may complain when you change them.

This keeps you from changing APIs for no reason, and helps you find the use cases your new API no longer supports.

You should only build a framework in response to a need created by an
application. After all, YAGNI.

Often, the best way to write a framework is to use someone else's.

Just because it's Not Invented Here doesn't mean that it's not usable.

Your app may be developing its own framework behind your back.

As applications grow, they tend to form layers. Designing the lower layers as if they are a framework has a whole bunch of associated advantages, including reuse.

Also, coding standard is extremely important. Not only is consistency critical, but so is the choice of standard. People are less likely to use a framework that uses Hungarian notation. Perhaps a rule of thumb would be to use a standard similar to that of the primary language's standard libraries?

jml said Not Invented Here is no reason. This is right, but more important is the reciprocal: just because it was invented somewhere else is no reason to use it. Study options carefully, but if the options all suck, reinvent it. A careful mix of "use the work of someone else" and "rewrite from scratch" needs to be arrived at here. Here are some of the reasons I have found rewriting often useful:

Licensing.

Unclean dependencies (for example, the people who did it never released it formally).

Versioning problems -- if the code you're trusting has different versions, with compatibility problems, often it's easier rewriting.

Requires too many compromises.

Note that all of the above cases assume you already have some valid Not Invented Here candidates for doing what you need. Don't be afraid of writing code, even if there is code out there.

I seriously hope that code reuse will always be an issue. Just because you're legally entitled to take random parts from other projects and stick them in your code does not mean it magically becomes a smart idea. If you have an issue which requires you to modify the code you're depending on, here's a list of things to do:

Don't. Consider that distributions hate carrying two versions of the same thing.

If the language/app allows it, try doing stuff like subclassing or using various patterns.

Remember that you'll have to stick with fixing bugs in this code.

Remember that it might cause problems, unless very carefully done, for 3rd party stuff trying to use your stuff and the original project.

Consider: can the changes you do be submitted upstream?

If not, consider: can you make changes upstream (perhaps a refactoring) which will remove your need to change their code?

Talk it over with the people writing the original code. Maybe they had a reason to do it this way.
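On the subclassing suggestion in the list above, here is a small sketch using Python's standard json module as the stand-in for "code you depend on". Rather than patching the encoder, the example subclasses json.JSONEncoder and overrides its default() hook. DateAwareEncoder is a hypothetical name chosen for illustration.

```python
import json
from datetime import date


class DateAwareEncoder(json.JSONEncoder):
    """Adds date support without touching the upstream encoder's source."""

    def default(self, obj):
        # default() is the documented extension hook for unknown types.
        if isinstance(obj, date):
            return obj.isoformat()
        # Fall back to upstream behaviour (raises TypeError) for the rest.
        return super().default(obj)


encoded = json.dumps({"released": date(2002, 5, 1)}, cls=DateAwareEncoder)
```

The upstream module stays untouched, so there is only one version of it to ship and its bug fixes keep arriving for free.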

Getting users is the most important thing you need to do. Beg, borrow, steal or entice with fancy promises, but get them users. Without users, you will always be guessing what kind of API is useful.

Be Experienced

You can't write a framework before having a significant amount of experience. The kind of experience most useful is that of using frameworks, because that gives you a natural head start at designing APIs -- never design an API you would hate to use yourself.

If you don't have experience, get it. One good way to get it is to become a co-developer (or even a user, or BEST OF ALL, a documentor) for another framework. Documentation is good because it forces you to read the API (thus teaching you first hand which APIs are horrible), it forces you to understand how to use it and to warn about the pitfalls in the API, and, best of all, the developers will love you and so will be willing to fix your mistakes in documentation.

I agree with much of what's been said here. There is one guiding rule that I feel bears mentioning though, which has been pretty much my one invariant design rule specifically for frameworks. A wise man once gave me this rule and it has never proved wrong.

Design for use, not re-use.

Re-use is something you do with garbage, or hand-me-downs. You want to design an interface that seems shiny-new and spiffy to your prospective users, not just workable but huge and clunky. Don't design your application as if it's going to be used by very smart programmers who are re-using a ton of code, but by interns writing their first serious project. Polish things and write explanations. Set a good example.

To be more specific: re-use is re-purposing a component that was designed for a specific use but can be used more generally. Use is simply taking a component that was designed for general use and applying it. If you find you are actually re-using a lot of code that you originally intended to be for one purpose only, it could be that you are a very good designer, but it is more likely that you are being lazy about refactoring. How could that code be made more general? Even if the code is already general, is it clear from its placement and naming that it is general-purpose? Will developers looking for this functionality locate it easily?

This distinction is subtle but quite powerful. One of the consequences, which is a hallmark of good quality in a framework, is small, deep interfaces. If you are focusing on use, you will try to minimize the number of functions and data structures that developers will have to know about in order to get the most out of your framework. If you are focusing on re-use, you will find yourself writing junk code and tossing it out in the hopes that someone will find every little bit useful in some different way. (This also creates a maintenance nightmare, since you have to publish updates about every single tiny internal interface change.)

One other useful tool in this mindset is to try to imagine at least a simple user-interface for each bit of code you write, even if you never actually write one. Thinking of it as a tool that a user will use will prevent you from depending on the programming skills of your audience to locate or understand the code.

Once you define the central task, make it work really really well before giving serious consideration to all the peripheral tasks which support it. For example, Mozilla is a browser; it also allows me to compose, to work as an e-mail client, as a model for XUL programming, and so forth. Making it work as cleanly, as quickly, and as intuitively as a browser is the foremost objective. That way, you can "hang" peripheral tasks onto the central one, knowing that it is solidly built.

About this, I would point out that using more than once is re-using. Maybe what glyph meant was "re-use is implementation inheritance", or something along these lines. Making code more general is effectively designing for re-use.

If you're going to be pedantic about definitions, at least be correct about them. pphaneuf's definition is a common misconception, but it is that confusion of terminology that's at the heart of the problem. dict(1) says this of it:

From WordNet (r) 1.7 [wn]:

reuse

v : use again after processing; "We must recycle the cardboard boxes" [syn: {recycle}, {reprocess}]

So, it is not merely to use something more than once. Different dictionaries will disagree about whether this is connotational or denotational meaning, but FOLDOC says, of this particular application of the term:

Using code developed for one {application program} in another application. Traditionally achieved using program libraries.

Notice that it specifically says "for one program in another". What I'm saying is, don't develop your code for one program and use it in another; develop it as its own program and invoke it from many others.

The myth of "re-use" (and the reason that "re-use" initiatives fail in so many companies under so many circumstances) is that effort can magically be recycled from one project to the next. It can't. If you want to design for generality you have to specifically have a project whose goal is to be general; you cannot expect to have a project whose goal is to produce a billing system and have one of its outputs be a general transaction-processing framework. The exception to this is when that project has the freedom to start a subproject whose members will be able to focus on the vagaries of general-purpose transaction processing and not on the task at hand.

Sometimes, it is OK to forget about generality, forget about other people using your code in unexpected circumstances, and just get the job done. This is the basis of YAGNI in XP. Even in the context of framework development, you need to stay focused on what your framework is supposed to be providing, and not on making every ancillary bit of code "reusable" in new and exciting ways.

Not only should you design for easy testing, but you should also provide a full suite of regression tests, both as proof that the framework is fully functional and as a development tool for future enhancements.

I've found that coding the test first helps get the API stable fast, and it's quite easy to spot errors that break code when you make them.
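A tiny sketch of what coding the test first can look like, using Python's unittest. The function slugify and its behaviour are hypothetical, made up for the example. The test class was written first to pin down the API, and the implementation underneath is the least code that satisfies it.

```python
import unittest


class SlugifyTest(unittest.TestCase):
    # Written before slugify existed: this is where the API got decided.

    def test_lowercases_and_joins_words(self):
        self.assertEqual(slugify("Rules of Thumb"), "rules-of-thumb")

    def test_collapses_extra_whitespace(self):
        self.assertEqual(slugify("  Be   Experienced "), "be-experienced")


def slugify(title):
    """Smallest implementation that makes the tests above pass."""
    return "-".join(title.lower().split())
```

If a later change breaks either behaviour, the failing test names exactly which promise was broken.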

jml says "Do not design frameworks for the sake of it". This is important. This is Rule 0.

A framework is a library whose architecture and domain model I have to make a real commitment to before I can get any use out of it. A library, on the other hand, I can just use.

If you can design a library instead of a framework, design a library, not a framework. If you can't design a library instead of a framework, go away and think very hard about the problem for a while, then repeat the exercise. The more functionality you can provide that doesn't require the users to marry you, the more use your creation will get.
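The difference can be sketched in a few lines of Python. Everything here is hypothetical (word_count, Job, CountJob) and only meant to show the shape of the two styles: with the library the caller stays in charge, with the framework the caller has to fit into someone else's structure.

```python
# Library style: the caller just calls a function, no commitment needed.
def word_count(text):
    return len(text.split())


# Framework style: the caller must subclass and let the framework drive.
class Job:
    def run(self, lines):
        self.setup()
        return [self.process(line) for line in lines]

    def setup(self):
        # Optional hook; subclasses may override it.
        pass

    def process(self, line):
        raise NotImplementedError("subclasses must implement process()")


class CountJob(Job):
    """A user's code, shaped to fit the framework's lifecycle."""

    def process(self, line):
        # The framework part can still be built on the library part.
        return word_count(line)
```

Note that the library function remains usable by people who never touch Job, which is the "doesn't require the users to marry you" part.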

My interpretation for this one is simply code them like a small library, similar to a module. It may look like a Shared Object in Unix or a Dynamic Link Library in Windows. Keeping them self-contained and available as a component is probably the best way of testing them.

glyph: ah, I understand your point better. Still, I feel that when someone says something like "designing for re-use", it kind of implicitly moves from being developed for one program to being developed as its own program/library.

Still, there's a bazillion ways to get this wrong, and we see them all the time. You can re-factor something into a more general piece, but just writing re-usable code all the time is (probably) stupid.

Also, your point about a subproject is quite subtle, but very important. Many companies' idea of re-use is to copy/paste a class from one project into another, leading to duplication, bugs getting fixed in only one place, and so on. No! Make a clear split and have this one library/program "used" (rather than "re-used") by the other projects.

Never underestimate the value of the operating system as the framework.

An OS has most of the APIs and services required by most framework tasks - more than is often initially obvious. Unix systems, especially, are developed as a grand, open framework - designed to be pieced together and augmented by server-class functionality.

Never underestimate the value of the language and standard libraries as the framework.

Again, the language and its libraries are another obvious toolset that many people forget about. Take a C compiler and standard library, for example; augmented with a few additional libraries and you have a solid system framework (especially if one were to leverage the OS as well).

Transparency and clean modularity are the key to a long life.

Take a good look at some powerful, wise Unix applications (or Unix itself). These well-aged systems are transparent and cleanly modular - with very simple interfaces. Transparency refers to the ability to see and touch the inner workings, like rc scripts in System V, for example. And simple modularity weighs the specialization of the interface against its ubiquity - the command-line interface is a good example of that balance (siding with ubiquity).

Always pick Open Standards and Liberated over 'Invented-here' or Proprietary.

All software frameworks and systems, including commercial ones, will be more likely to succeed given a firm grounding. Proprietary parts of a system are destined to die, as corporate entities are not eternal - and they rarely consider the users of their libraries after death. I've lost support for at least half of the proprietary libraries I've used in systems over the years - which either killed the project, or cost it terribly. Open standards have a similar, yet more subtle, effect. Don't invent or use non-standard approaches unless it is the vital aspect of the project. I've always wondered what the total cost to software systems was when the LZW part of GIF suddenly wasn't free. Don't kill your system with un-liberated components or concepts ;-)
