Jimmy Wales on a dozen things that WILL be free

Jimmy Wales, international man of mystery, father of Wikipedia and non-resident Berkman fellow, joined his fellow Berkmaniacs in Cambridge yesterday, and filled us in on his new intellectual project, “10 Things that Will be Free”.

Jimmy’s list is inspired by David Hilbert’s address to the International Congress of Mathematicians in Paris, 1900, where he proposed 23 critical unsolved problems in mathematics. This list was enormously influential in shaping mathematical research over the 20th century, and most of the problems have been resolved. (Hilbert’s 23 Problems should not be confused with Jay-Z’s 99 Problems which, though also influential, have had less of an influence on academic research.)

Jimmy’s list is, like Hilbert’s, an outline of what we don’t know how to do yet in the world of free culture, and a call to action. It’s also, to a certain extent, a prediction of the future – Jimmy makes the point that it’s 10 things that will be free in the next ten to twenty-five years, not should be free.

The list Jimmy presented yesterday was slightly different from the list blogged by Ross Mayfield (compiled from Jimmy’s posts on Larry Lessig’s blog), and from his Wikimania keynote presentation – this indicates that the list is changing and expanding over time, eventually approaching a Hilbert/Jay-Z limit, perhaps. Yesterday’s list:

1) Free the Encyclopedia – Wikipedia is probably how this will be accomplished, though the Wikipedia goal involves a freely licensed, high quality encyclopedia in every language – while we’re more or less there for people who speak English or German and have broadband net access, it’s a long way away for speakers of Arabic, Hindi or Bengali…

2) Free the Dictionary – While Wiktionary is working on this problem, it’s proved harder to accomplish than Wikipedia. One reason – dictionary data is highly structured – every entry has certain things (an authoritative spelling, a derivation, a pronunciation…) while encyclopedia articles are less structured. A new version of MediaWiki software that better supports structured data is in development, and Jimmy thinks this will move the project forward.
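To make the structure point concrete, here is a minimal sketch in Python. The three required fields come straight from Jimmy's examples (spelling, derivation, pronunciation); the class name, the `definitions` field and the completeness check are my own assumptions for illustration, not Wiktionary's or MediaWiki's actual data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DictionaryEntry:
    """Hypothetical structured entry; not Wiktionary's real schema."""
    spelling: str
    derivation: str
    pronunciation: str
    definitions: List[str] = field(default_factory=list)  # assumed extra field

    def missing_fields(self) -> List[str]:
        """Return the names of required fields that are still empty."""
        required = {
            "spelling": self.spelling,
            "derivation": self.derivation,
            "pronunciation": self.pronunciation,
        }
        missing = [name for name, value in required.items() if not value.strip()]
        if not self.definitions:
            missing.append("definitions")
        return missing


# An encyclopedia article, by contrast, can be modeled as free text:
# any prose at all is a structurally valid article.
article = "Paris is the capital of France."

# A half-finished dictionary entry is detectably incomplete.
entry = DictionaryEntry(spelling="wiki", derivation="", pronunciation="")
print(entry.missing_fields())
```

A wiki that stores only free text cannot run a check like `missing_fields`, which may be part of why better structured-data support matters more for a dictionary than for an encyclopedia.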

3) Free the Curriculum – Free textbooks and curricula, from kindergarten through the university level. Jimmy’s done some work on this with the WikiBooks project, though the project is, again, not taking off with the same rapidity as Wikipedia.

I made the argument that it’s harder to get someone to commit to writing a book – or even outlining a book for someone else to contribute to – than it is to get them to write a Wikipedia entry, suggesting that Wikibooks is a project where the wiki model might not scale. Jimmy conceded that this may be an issue, and that Wikibooks has moved to a “book module” model, encouraging people to write sections of books rather than the whole thing. Jimmy believes that public school textbooks in some US states would be easily built under the module model, since the modules are clearly specified by state standards – this would allow teachers to contribute small sections of curriculum and rapidly create free books.

4) Free the Music. Most of the great works of classical music are in the public domain. But most recordings of them aren’t. And many scores and arrangements aren’t. The Free the Music project would encourage community orchestras to create freely licensed recordings of great works.

5) Free the Art. Again, many of the great sculptures and paintings that represent our collective cultural heritage are no longer copyrighted. But many photos of these works ARE copyrighted. Jimmy tells a story about receiving complaints from museums that Wikipedia contains “unlicensed reproductions” of works that they hold in their collections. These complaints aren’t quite cease and desist letters, because the images on Wikipedia might be photos taken by Wikipedia users and released under a free license. But they are threats, designed to deter users from reproducing works of art that are in the public domain. Jimmy’s response to these letters is to write back letters encouraging museum directors to feel a sense of shame in locking away cultural works from the public… he’s not gotten any responses to these letters.

(I personally love the idea of empowering an army of wikimuseumfolken to take digital cameras into the world’s museums and begin creating a comprehensive collection of artwork. There are museums and artists that I’d be prepared to sign up for and begin photographing when Jimmy gives the word.)

6) Free the File Formats – Jimmy argues that proprietary file formats are worse than proprietary software. If your data is in a proprietary format, you’re trapped if you want to stop using a particular piece of software. Wikipedia uses Ogg Vorbis instead of MP3 due to patent concerns and fears of being locked into a proprietary format.

7) Free the Maps – As Google Map hackers are proving, there’s tremendous interest in building GIS-enabled services. Open source hackers are concerned about building services on Google Maps because Google owns the underlying data – Jimmy believes that hackers will build their own maps database and start creating GIS and GPS enabled services on top of this data.

I’m skeptical, if only because Google Maps is a) free, as in free beer, b) good and c) has a good API. My sense is that open information projects succeed when what they’re replacing is frustratingly bad – Wikipedia is a success in part because Encyclopedia Britannica’s web presence was so poor. Jimmy offers Apache as an example – people loved to hack on Apache because Microsoft’s competing product was expensive and bad. Will ideological purity be sufficient to get people to build an open alternative to Google Maps?

8) Free Product Identifiers – If you link to a book on Amazon.com, you have two choices in constructing your URL – an ISBN (non-proprietary) or an ASIN (proprietary). Jimmy recommends you link using an ISBN, so if you decide not to continue selling books as an Amazon affiliate, you can migrate to another bookseller, rather than being locked in by proprietary product identifiers. He’d like to see a world where there’s a full set of free product identifiers where people could more easily participate in the world of “long tail” sales by getting an LTIN: a “long-tail identification number”. There was more than a little skepticism from the group on this one – yes, it’s worrisome that Amazon numbers are non-transferable, but will open product ID numbers really help people sell to a global market?
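Here is a rough sketch, in Python, of why a non-proprietary identifier makes migration easy: the same ISBN can be checked and dropped into any seller's link format. The checksum is the standard ISBN-10 rule (weights 10 down to 1, total divisible by 11). The seller names and URL templates are hypothetical placeholders I made up for illustration; real booksellers define their own link formats.

```python
def normalize_isbn10(isbn: str) -> str:
    """Strip hyphens and spaces and upper-case a trailing 'x'."""
    return isbn.replace("-", "").replace(" ", "").upper()


def is_valid_isbn10(isbn: str) -> bool:
    """Check the standard ISBN-10 checksum."""
    chars = normalize_isbn10(isbn)
    if len(chars) != 10:
        return False
    total = 0
    for position, char in enumerate(chars):
        if char == "X" and position == 9:
            value = 10  # 'X' stands for 10 and is only legal as the check digit
        elif char in "0123456789":
            value = int(char)
        else:
            return False
        total += (10 - position) * value
    return total % 11 == 0


# Hypothetical URL templates, assumed for this example only.
SELLER_TEMPLATES = {
    "seller-a": "https://seller-a.example/isbn/{isbn}",
    "seller-b": "https://seller-b.example/books?isbn={isbn}",
}


def book_link(isbn: str, seller: str) -> str:
    """Build a link to a book at the given (hypothetical) seller."""
    if not is_valid_isbn10(isbn):
        raise ValueError(f"not a valid ISBN-10: {isbn!r}")
    return SELLER_TEMPLATES[seller].format(isbn=normalize_isbn10(isbn))


# Switching booksellers means changing one argument, not relabeling every book.
print(book_link("0-306-40615-2", "seller-a"))
print(book_link("0-306-40615-2", "seller-b"))
```

With a proprietary identifier, the table of templates would have exactly one usable row, which is the lock-in Jimmy is worried about.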

9) Free the search engine – Jimmy believes we’ll see an open, transparent, ad supported search engine in the future. Unlike Google et al., its ranking algorithms will be published and won’t rely on security via obscurity. Jonathan Zittrain wonders if this engine will rely on the long-promised “semantic web” – Jimmy explains that this is more a prediction/call for a non-proprietary search engine à la Google.

Having worked on search engines a little in my distant technical past, I wonder about the feasibility of this one. Someone, somewhere needs to build the index for a search engine, a process that requires huge amounts of disk space and processor time. Will folks really volunteer this many machines and disks? Jimmy points out that this is a problem that may be solved by doubling processor speed a few more times – just as video over the Internet seemed pretty absurd in 1994, building a search engine capable of indexing the web on a PC might not seem unreasonable in 2010.

10) Free the Communities – The terms of service agreements at many online community sites (like my former venture, Tripod) include text giving the community host either ownership of or a perpetual license to any content you create. Jimmy believes that projects like WikiCities will start creating new community spaces where users own their content and can decide whether or not hosts can use it.

Two bonus “will be frees”:

– TV listings. If you want to build your own digital video recorder, like MythTV, you need a good source of data for what programs are on when. It’s not hard to believe that a group of end users could discover and enter this data on a free basis.

– Academic publishing. Jimmy’s slowly but surely coming around to the Open Access model for academic publishing advocated by Peter Suber and others.

One interesting proposition not included in Jimmy’s list: Free the News. While he supports the WikiNews project, he says he’s not convinced that a Wikipedia-like community can produce a meaningful competitor to AP or Reuters, despite some huge successes the WikiNews community has had thus far. Very interesting. I’m pretty firmly convinced that professional journalists – with travel budgets and legal departments – will be able to do reporting that wikireporters won’t be able to replicate for a long time to come.

I wonder whether there aren’t some organizing principles that could help unite Jimmy’s 10 or 12 propositions. Based on a rollicking conversation that ran well over the time allotted for Jimmy’s presentation, I offer the following half-baked possible assertions (how’s that for a set of disclaimers!):

– When users have a strong personal incentive to collect information, they’re more likely to do so. I create metadata about webpages in del.icio.us because it makes it easier for me to find these pages in the future – the fact that you might benefit from this metadata is a happy coincidence, but it’s the benefit to me that makes me do it, not a utilitarian impulse. This suggests to me that projects like Television Information will succeed, as they’re analogous to projects like CDDB, or its open alternative, MusicBrainz.

– Projects where users can work on bite-sized chunks are likely to succeed, while projects that require massive organizational effort from one or more individuals are less likely to succeed. This, I believe, is why Wikipedia has had so much traction, while Wikibooks is having less luck. It’s one thing to commit to writing a 500-word encyclopedia essay – it’s another thing entirely to commit to writing a book and giving it away, or even to outlining a book and asking others to commit to fleshing it out.

This second assertion got me into a good debate with Luis Villa, Berkman’s new senior geek in residence. Luis points out that open source projects require massive amounts of time from people with highly specialized skills – kernel hackers, for instance – and that the success of these projects suggests that Wikibooks might succeed. I argued that many open source hackers build the tools they need to get their jobs done, and that most scholars don’t “need” a basic history textbook, and might not develop one without economic incentives.

But hey, I’ve been wrong about almost everything else concerning Wikipedia, so there’s a good chance I’ve got this one wrong as well.

10 Responses to Jimmy Wales on a dozen things that WILL be free

To add to your half-baked assertions, I offer a couple not-even-fully-mixed thoughts.

A dozen things
I like this list. It’s definitely skewed a bit toward the wiki-world, and could be expanded, especially in software terms toward needed free projects. Open, transparent software for government use, for example.

For example, let’s assume the Congressional Budget Office uses a particular software package to manage most of the nation’s budget. Now let’s assume that an open source project creates a package that matches this functionality, for free. Now, let’s take it one step further, and build in public, web-based access to any of the CBO’s data. Given that the software worked, and was free, the government would have a hard time not adopting it (cost-effectiveness regulations and all), and transparency would come along for the ride.

Now granted, that’s pretty pipe-dreamy. But there’s more to free than education and reference information. There’s also the knowledge that’s hidden in institutions. In a free-focused world, we need to figure out ways to get that information out.

Wikibooks / WikiDictionary
One other possible reason for the rapid growth of Wikipedia vs. WikiBooks is simply how widely useful the information is. More widely useful = more users = more content.

There’s also the relation between user and creator. After I’ve looked up a few entries on Wikipedia, I’ve found it useful – and I feel some sort of obligation to contribute to that knowledge. Since I’m a reasonably intelligent person with some bits of specialized knowledge, it’s likely I can contribute somewhere. With Wikibooks, the end users (students) aren’t the people who generally create the content, so that sense of shared use/creation never happens.

Now, this explanation doesn’t fly for WikiDictionary, but in that case, the type of knowledge is the limiting factor. I can really only contribute to WikiDictionary if I’m really confident in the meaning of a particular word, which effectively limits me to being a linguist or an instructor of a given language. Most people don’t have that sort of precise definition for the words they use. (I can’t count the number of times I’ve failed the “define x without using the word x” test.) On Wikipedia, I just have to know about a particular topic. This makes it easier.

I think the best shot for Wikibooks is for some educational institution to give its full backing. Full backing means first providing a good seed of content to Wikibooks. But second and more importantly, the institution has to commit to using the Wikibooks-based textbooks for instructing its own classes. Without actual use, they’ll be textbooks that joeRandom school board/university is wary of taking on. With the backing (and use) of a major institution, the ‘selling’ process gets much easier.

Google Maps
This is quite an interesting example. Effectively, Google has created a standard by being the first to the market with a good API (à la some of Microsoft IE’s proprietary HTML extensions). But what makes this situation unique is Google’s “Don’t be Evil” motto.

Now whether or not Google is actually evil isn’t the issue. The issue is that once thousands of developers have created large apps with the Google Maps API, that API becomes a de-facto standard for GIS information online, and Google will have a hard time changing the licensing to something more “evil”. It’s obviously not a technical issue (this is Google, after all), but a social one. If the Maps API is entrenched, Google is smart enough to know that they’d have a PR backlash by trying to switch to a non-free license, or less-functional API. They’re also smart enough to know that rivals would be waiting with competing products in the case of such a slip up.

What you end up with is a commercial product, extended by the community beyond its original design, that is semi-transparent and free to use. Best of all, the product is sustained by social stigma and a bit of competition. While this surely isn’t open source, it’s not business as usual.

In a way, it’s almost open-source coercion. It will be interesting to see if future APIs emulate this fate, or try to avoid it.

I got interviewed for ac/net article about wikibooks the other day and had an epiphany about the challenges facing wikibooks, at least in Biology. Figures. Biology textbooks need great figures. I’m thinking about how to put together a project to fix it.

This blog entry is extremely well timed, as I’m currently in London at a hacker conference writing an open book about wireless networks in the developing world. I’m not actually writing the book; I just gathered the right people in one room, and in the amazing space of 4 days we’ve gone from a vague idea to a pretty well fleshed-out outline, with a few chapters already written for good measure. We’re calling the model a BookSprint, even though it’ll take us at least a month after the event to complete the writing and edit the book. The intriguing thing is that we may have solved exactly the problem that you allude to, by gathering consensus about an outline and a vision. Instead of a large 300-page blob of text, we’ve essentially converted this project into about 20 independent articles, plus some concerted editing to make it all read like a book….