Despite that, until today, all that a newbie would get when going to that page was two links: one to the list of approved licenses alphabetically, and another by category. This is obviously not ideal – it provides the newcomer with information useful only to an expert, so they lose; and OSI misses an opportunity to educate and inform, so we lose.

Because of this, in the middle of last year I sent an email to license-discuss proposing a revision to the page, and followed up several times in the second half of the year. Yesterday, I took the revision live.

gives context: what is an open source license? what does OSI-approved mean? These give a newcomer to the list a fighting chance of figuring out what the lists mean.

provides a less-overwhelming list of licenses: using the “popular, widely used, or have strong communities” list created by the 2006 Proliferation Report, it gives people pointers to several useful licenses immediately, while still providing access to the full lists.

is progress: OSI can be, and often should be, a very change-averse organization. But it is nice to score a small win here and there – I hope this will be the first of many while I chair the license committee.

And what it doesn’t do:

change the world: I’m blogging about this because it’s significant. But I also want to be clear that it is only a small win, and hopefully one that in 2-3 years OSI will look back on and have a good chuckle about.

change, update or revise the license categories: The original license proliferation committee license categories, from 2006, have been useful to many people, and were instrumental in slowing the pace of license proliferation. So they make sense to use as the (relatively neutral) basis for the list that is now prominent on /licenses/. But they’re showing their age – notably by including CDDL in “popular/widely used” but in other ways as well (primarily, by not categorizing a variety of new licenses). OSI’s licensing committee (aka the license-discuss list, with input from others) will be gradually investigating how to address this over the course of the next year or so. This process has already started, somewhat, with my calls for quantitative criteria for license analysis. I intend to continue to push the list (including hopefully new members!) to think through the issue and its implications.

Interesting research question/bleg: for a reasonably comprehensive set of important “open source + foo” terms, like collaboration, licensing, etc., where do search results point? How many go to opensource.org? .com? other sites? Is there a tool that will do this sort of analysis automatically?
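I don't know of a ready-made tool, but the counting half of the question is easy to sketch. The Python below assumes the result URLs for each term have already been collected some other way (by hand, or from whatever search API you have access to); the function name `tally_domains` and the sample data are made up for illustration and are not real search results.

```python
from collections import Counter
from urllib.parse import urlparse


def tally_domains(results):
    """Count result hostnames per search term.

    `results` maps a search term to a list of result URLs that were
    collected separately; fetching them is out of scope for this sketch.
    """
    tallies = {}
    for term, urls in results.items():
        hosts = []
        for url in urls:
            host = (urlparse(url).hostname or "").lower()
            # Treat "www.example.org" and "example.org" as the same site.
            if host.startswith("www."):
                host = host[4:]
            if host:
                hosts.append(host)
        tallies[term] = Counter(hosts)
    return tallies


# Hypothetical, hand-made sample data; not real search results.
sample = {
    "open source licensing": [
        "https://opensource.org/licenses",
        "https://www.opensource.org/osd",
        "https://opensource.com/resources",
    ],
}

for term, counts in tally_domains(sample).items():
    print(term, counts.most_common())
```

Collecting the URLs is the hard (and terms-of-service-sensitive) part; once you have them, comparing opensource.org against .com and everyone else is just a matter of reading off the counters.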

I want to heartily unendorse Simon Phipps’ Infoworld article about Github and licensing. Simon’s article makes it sound like no one benefits from sloppy licensing practices, and that is simply not true. Specifically, lawyers benefit! I regularly get calls from clients saying “I have no idea if I’m allowed to use <project X>, because it is on github but doesn’t have a license.” When that happens, instead of money going to developers where it could actually build something productive, I get to spend my time and the client’s money fixing a problem that the original author could have easily avoided by slapping an Apache license on the thing in the first place – or that github could have avoided by adding default terms.

So, support your local open source lawyer today – publish source code without a license![1]

[1] Tongue firmly in cheek, in case that isn’t obvious. Seriously, lawyers are the only ones who benefit from this situation, unless you count the handful of seconds you saved by not running “git add LICENSE”. Always license your code, kids!

I don’t currently do much heavily collaborative writing, but I’m still very interested in the process of creating very collaborative works. So one of the many stimulating discussions at Monktoberfest was a presentation by two awesome O’Reilly staffers about the future (and past) of authorship. Needless to say, collaborative authoring was a major theme. What particularly jumped out at me in the talk and the discussion afterwards was a nagging fear that any text authored by multiple people would necessarily lack the coherence and vision of the best single-author writing.

I’ve often been very sympathetic to this concern. Watching groups of people get together and try to collaboratively create work is often painful. Those groups that have done best, in my experience, are often those with some sort of objective standard for the work they’re creating. In software, that’s usually “it compiles,” followed (in the best case) by “it passes all the tests.” Where there aren’t objective standards all team members can work with – as is often the case with UI – the process tends to fall apart. Where there are really detailed objective standards that every contribution can be measured against – HTTP, HTML – open source is often not just competitive, but dominant.

Wikipedia is another very large exception to the “many cooks” argument. It is an exception because most written projects can’t possibly have a rule of thumb so straightforward and yet effective as “neutral point of view,” because most written projects aren’t factual, dry or broken-up-into-small-chunks. In other words, most written projects aren’t encyclopedias and so can’t be written “by rule.”

Or at least that’s what I was thinking during the talk. In response to this, someone commented during the post-talk Q&A that essentially all TV shows are collaboratively written, and yet manage to be coherent. In fact, in our new golden age of TV drama they’re often more than coherent – they’re quite good, despite extremely complex plots sprawling over several years of effort. This has stuck in my head ever since because it goes against all my hard-learned instincts.

I really don’t know what the trick is, since I’m not a TV writer. I suspect that in most cases the showrunner does it by (1) having a very clear vision of where the show is going (often not the case in software) and (2) clearly articulating and communicating that vision – i.e. having a good show bible and sticking to it.

If you’re not looking carefully, this looks a lot like what Aaron has rightly called a cult of personality. But I think, after being reminded about showrunners and show bibles, it is important to distinguish the two. It is a fine line, but there is a real difference between what Aaron is concerned about and skilled leadership. Maybe a good test is to ask that leader: where is your show bible? What can I read to understand the vision, and help flesh it out like the writer of an episode? If the answer is “follow whatever I’m thinking about this month,” or “I’m too busy leading to write it down”, then you’ve got problems. But if your leadership can explain, don’t throw the baby out with the bathwater – that’s a person who has thought seriously about what they’re doing and how you can help them build something bigger and better than you could each do alone, not a cult leader.

As part of a general drive to get rid of stuff, I’ve recently become increasingly willing to part with my old books. This has been a painful process – books have many happy memories for me – but I think also a good and focusing one. As part of my emotional reaction to this, I’ve become increasingly interested in making beautiful, printed texts – things that stand up better to the test of time than the paperbacks I’ve been thinning out.

This gave me the idea to thank the most involved contributors to the MPL with a hand-made, printed copy of the text of the license.

The wonderful Kim Vanderheiden, of Painted Tongue, worked with me over the course of several months to plan this process, and then she and her team put them together. First, we designed the layout, not just of the text, but of the relatively unusual accordion-fold binding, which allowed the final product to be displayed like an A-Frame or by hanging the entire (very long!) thing from a wall. Then we picked paper for the text, and cloth and ribbon for the bindings (the ribbon symbolising both the fact that these are gifts and traditional bindings for legal documents). Kim’s team then hand printed them on their presses, and Kim used watercolors to paint the colored highlights (including the yellow highlighting that replaces the ALL CAPS text). Finally, they were bound.

The end result has been fifteen copies of beautiful, tangible, printed words, which I am now in the slow process of distributing to various contributors. I hope that this token of the maintainers’ appreciation for their assistance (in a variety of ways) is appreciated.

The thanks and colophon are as follows:

Thank You!

This revision of the MPL would not have happened without your help. Please accept this hand-crafted printing of the license as a token of our appreciation, and a reflection of the effort and care you put into your contributions to the license.

The type was set in Equity by Matthew Butterick (typo.la/equity – used with permission of the typographer) and Droid Sans Mono by Google (droidfonts.com – used under the Apache 2.0 license). The book is printed on Somerset Velvet Radiant White and covered in Duo Cloth Birch.

Design, printing, binding, and painting were done with care by the excellent team at Painted Tongue Press, Oakland, California (paintedtonguepress.com).

This edition of MPL 2.0 was printed in August 2012 to celebrate the publication of, and thank contributors to, MPL 2.0. You are holding copy # __ of 15.

When I was at Monktoberfest, our esteemed host reminded me that I’d disagreed with his article “AGPL: Solution In Search of a Problem”, and nudged me to elaborate on the point. Here goes nothing. TL;DR: for most developers, AGPL is really about preventing free riding, not fragmentation – so as long as there is concern about free riding people will use AGPL.

Stephen makes a few key points in his article (mistakes in paraphrasing mine):

1. AGPL’s alleged benefit (the “problem that doesn’t exist”) is the prevention of fragmentation.

2. Permissive licenses are on the rise, so using a super-strong copyleft is counter-productive when you’re looking to attract developers.

3. By being so aggressive, it courts FUD about all open source licenses, which could be counter-productive to open source generally.

Let me take these in order.

Urban Fragments, by APM Alex under CC-BY 2.0

Issue #1 is based on a misapprehension: I don’t think it’s correct to think of the purpose of any copyleft (Affero or otherwise) as preventing fragmentation. GPL has never prevented fragmentation – there have been forks of many GPL projects (and complaints about same) for about as long as GPL has been around. (*cough*emacs*cough*)

Critically for many developers, what GPL does attempt to prevent is free riding – taking a benefit without contributing back. GPL means any valuable improvements in forks (whether or not incompatible) are available to integrate back under the same license terms. This means you can’t “cheat” the primary developers by building your business around proprietary forks of “their” work – they can always reincorporate the valuable bits if they want to.

The frequent use of AGPL in commercial dual-licenses also suggests that free riding is the problem being attacked by strong copylefts, not fragmentation. The logic is simple: AGPL means users usually pay some cost (i.e., not free ride) to participate: either by buying a commercial license, or by sharing code. In contrast, if the goal was to limit fragmentation, the license would say something like “your patches have to be accepted back into the core, or else you have to write a check”, or even better “you have to pass a compatibility test, or else you have to write a check.”

It is important to note that “cheat” is in quotes above. In many cases, people have realized that maintaining proprietary forks isn’t actually cheating the primary developers. Often, forking primarily cheats the forkers: many users of the Linux kernel, for example, have learned the hard way that running an old fork + a small proprietary module leads to very high maintenance costs. In other cases, the permissive license actually helps fund the primary developers by enabling an open-core model (even if those aren’t trendy at the moment). In yet other cases, the primary author is making their money from other tools or services and so doesn’t care if anyone free-rides on their open source components. 37 Signals and Rails are probably the poster children for this. And of course, much of the industry has simply gotten more mature and less possessive about their software – realizing that whether or not they are “cheated” is usually a silly concern.

This leads to my response to issue #2: in my opinion, the recent increase in permissive licenses is driven as much by the decreasing concern about “cheating” developers (aka free riding) as it is by increased interest in adoption. In that light, the use case for AGPL is straightforward: AGPL makes sense if you’ve got a good reason to be concerned about free riding (say, if your revenue is directly tied to the tool you’re choosing a license for). This is a decreasing number of people, for the reasons described above, but it’s still far from zero. For those folks, increasing adoption may not actually be useful – it’s a case of “we lose money on every sale, but we’ll make it up on volume”.

On Issue #3 (increased FUD risk): this certainly seems like a possibility, but in my practice, I’ve seen only a single instance of confusion caused by AGPL spilling onto other licenses, and it was quick and easy to clear up. There is certainly plenty of worry about AGPL, but the worriers are quite clear that this stems from requirements other licenses don’t share. Maybe there will be more confusion if/when someone drafts another Affero-style license, but it doesn’t appear to me to currently be an issue. (By way of contrast, the confusion about the various patent clauses, and who licenses what to whom when, is a recurring theme of discussion with any company that is both filing patents and doing open source.)

Finally, it’s important to note that both my post and Steve’s are about the costs, benefits, and freedoms accorded to developers. As I’ve mentioned before, when thinking about what “problem” is being solved by a license, it’s always important to remember that for some people (particularly the authors of the AGPL) the analysis begins and ends with problems for users. A full analysis of that issue has to wait for another day (it may be reminiscent of bike helmets) but suffice to say that neither of us is attempting it here, and we should always be cognizant of that.

Mark Pilgrim had a great post[1] a little while ago where he talked about Docbook as ‘The Format of Forever’, but HTML as the ‘Format of Now.’ He also argued that (since technical books were constantly outdated) generating technical books in the Format of Now instead of the Format of Forever made a lot of sense.

I’m working on a project that I’d like to see as a long-term, Format of (nearly) Forever kind of work. Specifically, it is my grandfather’s autobiography, which I’d like to see as a long-term enough work that I can give it to my own grandkids some day. As a result, I’ve been wrestling on and off with two questions: (1) what is the right ‘Format of Forever’ and (2) once you’ve chosen that source format, what is the best ‘Output Format of Now’? Thoughts welcome in comments; my own mumblings below.

Grandpa, of course, wrote in the ultimate in formats of forever: typewriter. I scanned and OCRed it shortly after he passed away using the excellent gscan2pdf[2], and have been slowly collecting other materials to use to supplement what he wrote – mostly pictures and scans of his Apollo memorabilia, but also family photos, like Grandpa’s Grandpa, Lewis Hannum, pictured above.

I’ve converted that to what I think may be the right ‘Format of Forever’: pandoc markdown, plus printed, easily re-scannable hard-copy. I’m thinking that markdown is the right source for a couple of reasons. Primarily: plain, simple ASCII text is hard to beat for future-proofing. Markdown is also easier to edit than HTML[3].

The downside with markdown is that, while it is terrific for a very simple document (as Grandpa’s writing is), I’d like to experiment with some slightly non-traditional media inclusion. For example, it would be nice to include an audio recording of my brother at the 1982 Columbia Shuttle launch, or a scan of Grandpa’s patent. Markdown has some facilities for including other files, but they appear to be non-standard (i.e., each post-processor handles them differently). Even image inclusion and basic formatting often feel wonky. HTML would make me happier in that direction, I suspect. And of course styling the output is a pain, though I have various ideas on how to do that.

Thoughts? Tips?

[1] vanished since I originally drafted this, but link kept for reference

[2] Which, for the record, was roughly 1,000 times better than Canon’s bundled scanning crapware.

[3] which is sort of pathetic; how come we still don’t have a decent simple HTML editor?