The first version of Redis was released to the public almost one year ago (10 months ago actually). This seems like a good time to look back at the development process.

Redis can't be considered a successful project yet, it's just too early. However, the following are a few concepts I learned in the last months that I'd like to share with you, in the hope that I'll be able to apply these ideas in the future, and that programmers interested in creating a new open source project will find something interesting here and possibly avoid some mistakes.

Develop something that you think you'll use for years

Redis is not my first open source project. My main past projects are hping and the Jim interpreter. Still, Redis is the first project I'm sure I'll develop for years, assuming it will be successful, because this time I was wise enough to select something that I want to use myself for years to come.

I stopped the development of hping when I quit security, and I stopped the development of Jim when I quit Tcl. I'm confident that I'll not quit databases, as I have been a MySQL user for ten years, more or less. When you know you'll need what you are building for years to come, there is a profound shift in vision. You feel your efforts are not wasted even if no one wants to use your code, and it's likely that your users will not be left alone in a few months.

So make a careful choice when starting the development of a new open source project: don't pick something you are interested in today, but something you'll probably be interested in for the next decade.

Early adopters

Early adopters are vital for your project: the mass will not use what you build as long as there isn't a solid initial user base. This sounds like the chicken-and-egg problem, but actually it is not: there are smart guys who will be brave enough to use what you build if it's worthwhile.

Your early adopters are not brave because they are irresponsible; they are simply able to evaluate something without the need to follow the mass. So where should you search for your early adopters? Among the smartest guys around. Posting your code on Hacker News once you have something complete enough to give an initial feeling is a good idea.

A wonderful side effect of all this is that, at least in the initial stages, you'll have a terrific community. I love the people around Redis: they are on average incredibly smart and interesting. I enjoy providing help via the Google group and trying to fix bugs in very little time, because with such a community it's worth the effort.

Simplicity matters

Users don't like to read zillions of pages of documentation just to get started using your new open source project. They don't like compilation errors, nor complex ideas or protocols.

Your project should be trivial to run, and the first page of your documentation should include instructions on how to try a Hello World example in a few trivial steps. Once users have a working hello world they'll be willing to learn more and read documentation, but most of the time not before.

If the libs you are using are not included in Debian/Ubuntu apt-get and/or in the Mac OS X package systems, it's better to include the libs inside the code. Your users should not need more than five minutes to go from the download to the working hello world usage example.

I suspect that a few facts are playing a very important role in the relatively good adoption Redis is experiencing, considering how young it is: Redis is one of the rare cases of a NoSQL database that will compile almost everywhere just with make, that will run without a configuration file, with default settings, just with ./redis-server, and that uses a protocol simple enough that you can understand and implement it in minutes (so that I could claim many client libs since the first weeks).
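
To give a feeling of how little code a client needs for a protocol of this kind, here is a minimal sketch in Python. It assumes the request and reply framing documented today as RESP, which may differ in details from the protocol of the first Redis releases. The names encode_command and decode_reply are made up for this illustration and do not come from any client library.

```python
def encode_command(*args):
    """Encode a command as an array of bulk strings (RESP request framing)."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        # Each argument is sent as: $<byte length>\r\n<bytes>\r\n
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def decode_reply(buf):
    """Decode one complete reply from buf.

    Returns a (value, remaining_bytes) pair. Assumes buf already holds
    the whole reply; a real client would also handle partial reads.
    """
    line, _, rest = buf.partition(b"\r\n")
    kind, payload = line[:1], line[1:]
    if kind == b"+":                      # simple string, e.g. +OK
        return payload.decode("utf-8"), rest
    if kind == b"-":                      # error reply
        raise RuntimeError(payload.decode("utf-8"))
    if kind == b":":                      # integer reply
        return int(payload), rest
    if kind == b"$":                      # bulk string, -1 means "no value"
        size = int(payload)
        if size == -1:
            return None, rest
        return rest[:size], rest[size + 2:]
    if kind == b"*":                      # array of nested replies
        count = int(payload)
        if count == -1:
            return None, rest
        items = []
        for _ in range(count):
            item, rest = decode_reply(rest)
            items.append(item)
        return items, rest
    raise ValueError("unknown reply type: %r" % kind)
```

The bytes returned by encode_command("SET", "key", "value") could be written to a TCP socket connected to a Redis server (port 6379 by default), and the bytes read back passed to decode_reply. Error handling and buffering are left out on purpose to keep the sketch short.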

Simplicity also matters in the concepts your users are required to understand to get started. Redis is trivial, but I'm always surprised by the number of people who don't get it. I don't even want to think about how hard it is for the average user to understand a more complex NoSQL database.

Of course there are also people who told me they don't like Redis because it's too simple to be powerful enough for their use cases. I don't trust this argument, and anyway I think simplicity is a good tradeoff, but be prepared to hear this kind of argument if you take the simplicity path.

Be conservative about adding features

It's very hard to understand whether a user request should or should not be implemented. It's not a matter of development time: even if the feature request comes with a patch, maybe the right thing to do is not to merge it.

Every user has specific needs. They are legitimate from the point of view of the user, but possibly not from the point of view of the project: maybe there are other ways to solve the problem, or the problem in the first instance is the result of a design error the user is making.

Many times it's just that the feature request is too particular and specific: it's something legitimate that only 1/1000 of users will actually need, but the feature still adds complexity and code to your project. Saying no to these feature requests is almost always the right thing to do.

Other times you instead feel that the feature request is ok, general enough, not too hard to implement, and that there are no good ways to address the problem otherwise: this may be a good feature to implement, and yet it is a good idea to wait a few weeks at least, to see if after some time the addition still appears to be good. Basically every non trivial feature should stay in the TODO list for some time before getting implemented.

Be pragmatic about your roadmap

Real programmers love to solve hard problems, so it's easy to fall into the trap of implementing what's the most fun to code instead of implementing what's useful. Actually, if you love the problem domain, most things will be fun to code in the end, but getting the roadmap wrong is a huge mistake.

For instance, I was quite convinced I should implement redis-cluster (a layer that gives automatic sharding and fault tolerance among N nodes), as it is a very interesting problem to solve. There are new things to study, and possibly some new algorithm variant to design that will work well with the Redis semantics and data model. But actually, in the short term most people will need much more the ability to use datasets bigger than RAM, that is, the Virtual Memory feature. I changed the plans and I'm going to work on VM for the whole first part of 2010. This means that most people will have a much simpler upgrade path once their datasets get bigger (assuming accesses are not evenly distributed). Even if redis-cluster is nice and will be the next big thing after VM to get inside Redis, it is not as important for most users. Implementing one feature or the other first can change how users feel about your project, so here the rule is: solve problems according to the number of people who will benefit from the new implementations.

Don't expect tons of code that you can actually merge

There is a fable in the open source world that you get a lot of code once you start to have a user base. This is not how it works: be prepared to write 95% of the code of your project for the first years. There will actually be somebody who contributes code, but most of the time this code will be about features you don't want to implement, or will not look sane enough to be merged without a profound review, or will solve a good problem in a way that is not general enough, or simply the coder does not understand enough of the Redis internals or of your future plans to provide an implementation that is acceptable.

From time to time actually it's possible to merge a patch as it is, but this is rare. The idea of "let's implement a solid base so that other programmers will build all the rest" will not ever work.

BSD can be a strength even in the business side

If you are going to develop something that targets not just end users but companies, BSD can be the best pick, as in many business environments it's much more comfortable to use code with a license that allows for internal developments without having to deal with the distribution of the changes.

Most of the time it's not that these companies don't want to share their changes with the rest of the community, but that these changes are not ready for prime time or well documented, or may reveal too much about corporate secrets, and so forth.

The good thing is that you'll be free to provide a closed source version of your project, for instance, even if you accept external patches. This can be a viable business model in many ways: the commercial version can include only things that are marginally useful for most users but important in corporate environments, or may have special features that are too specific to get inside the "real" project but that it's ok to support commercially, and so forth.

Basically the BSD license does not mean that it will be impossible to do business with your project, but your users can be much more comfortable using something that can't experience problems similar to the ones MySQL experienced lately.

Congratulations on the first year and thanks for doing it. I think you should include one last paragraph: "Talk a lot and show commitment to the project". When I first heard of Redis, I did a quick read of the page and followed you on Twitter, but didn't start using it. Then I saw you tweeting so much and so passionately about it that I decided to give it a try... and now I'm in love with Redis, which is the 2nd thing from Italy I like most (1st is Pizza :D)

Redis arrives when RDBMS'es are not going anywhere.
How does it arrive? With ANSI C and a very simple installation (the first goal).
Having 16 GB of RAM in servers is quite common; in addition, modern web apps have created new problem domains in data storage and querying.

Redis is an alternative that did not exist before (IMHO the second goal).
Part of Redis's future basically lies in efforts to disseminate key-value concepts.
Non-relational stores are not relational.
The key/value store does not know the abstraction of tables, rows, ..
Redis is an evolution of the key-value store. Getting data structures "shared and persistent" from my apps
by adding only the network layer is simple, maybe too simple, but I have never seen it before.
A list push operation with O(1) complexity is simple, and that is your conceptual niche.
Some Rules:
Don't try to implement everything that the MySQL driver implements.
Use the strengths of the alternative store.

Add to all this advice one piece specific to NoSQL projects: (seamless) integration with existing frameworks. A presence around the major frameworks will encourage people to try it out, which will result in new use cases and finally lead to more and easier adoption. I posted a longer article on this subject just a few days back http://nosql.mypopescu.com/post/299775877/nosql-to... Right now, according to the data I can gather on MyNoSQL (http://nosql.mypopescu.com), it looks like Redis is one of the most actively tried NoSQL projects. So, keep it going!

Your comments about BSD are based on a misinterpretation of the GPL. The GPL does not require you to release anything externally. All that it requires is that IF you release binaries externally, THEN you must release source. Anything you do in the privacy of your company can stay within the company. It is only once you decide to release binaries externally that the GPL kicks in.

The GPL is perfectly fine for internal development. This isn't ambiguous in the GPL -- what I stated is the interpretation of the GPL that the FSF subscribes to, as well as lawyers at every major Internet company. Google does a ton of internal development on a GPL codebase for internal use without releasing changes. Most of the other major Linux-based Internet companies do the same thing.

I favor GPL for projects like this for a number of reasons. The most substantial one is that it does give you the option of dual-licensing later, which can make a nice profit stream eventually. I do like "GPLv2 or later" or same for LGPL licenses, to avoid offending both the GPLv2 die-hards and the GPLv3 die-hards.

Note: I am not a lawyer, but I have studied technology law under a number of people, including Lessig, and am friends with RMS, know Eben Moglen.

@Peter: example, I want to build an appliance that runs a web application. I sell this appliance that contains Redis binaries internally. I think this qualifies to require source code distribution for the GPL license, while I consider this internal use.

Also I don't like the dual licensing strategy, it's something like the proof that GPL code is not really *free*. Just a point of view btw.

What made you quit TCL? Seeing your past projects (Jim, picol) you seemed very fond of it, and it made me check it out for myself. Did you ditch it for another language, or do you just not write the kind of code that you used to use TCL for anymore?

@Justin: I'm now using Ruby instead of Tcl to write mostly the same kind of code I used to write with Tcl. I still use Tcl for "scripting" instead of bash, or to write throwaway networking code, as I think Tcl is still more practical in many ways for this problem domain.

The reason I quit Tcl is that the TCT (Tcl Core Team) has ideas about how to handle and evolve the language that are, in my opinion, not compatible with the evolution Tcl needs. Compatibility with the past is held in great regard even when changes that break with the past are absolutely needed (for instance, arrays should be first class objects).

Also numerous design errors were made in the fields of namespaces, I/O, and so forth.

This wrong development path also led to a very impoverished community, which means fewer libs, less documentation, less everything.

Basically Tcl needed its benevolent dictator with a unique, consistent, modern vision. The TCT can't work this way for a number of reasons, so Tcl is a language without hope IMHO.

When I realized this I looked around and started trying other dynamic languages, and I picked what I think is the best, Ruby. I loved Smalltalk and Lisp, and I found part of both, and a consistent simple design, in Ruby.

@Mario P. security at some point was no longer fun. It started to become a product, and there were no longer great new attacks to discover. Even worse, a security researcher has to do consultancy to make a living, and security consultancy was (is?) particularly boring: go into a big corp and try to figure out how their network works to add layers of security, firewalls, proxies, ..., or on the other side source code auditing, which is even worse :)

I believe the real success can be determined by the subject of your blog / website. If you offer quality, you will have just quality visitors, instead of low quality one. And this seems to be the most important reason to offer quality.

