Meet the Project Founder: Roman Shaposhnik

This installment of “Meet the Project Founder” features Apache Bigtop founder and PMC Chair/VP Roman Shaposhnik.

What led you to your project idea(s)?

Conceptually, Apache Bigtop can be traced as far back as my time at Sun Microsystems in 2007-2008. I was assisting the team responsible for coming up with a 100% community-driven, open source Solaris distribution (which eventually became OpenSolaris) that could also be used as the basis for an enterprise-grade commercial product offering. I then joined Yahoo! Inc. as a manager of a small team of extremely talented engineers tasked with integration efforts around Yahoo’s internal cloud offering based on Hadoop. Our project was called HIT (Hadoop Integration Testing) and we were known as “HIT-men”.

Aside from doing the initial commit, what is your definition of the project founder’s role across the lifespan of the project? Benevolent dictator, referee, silent partner?

Honestly, my role model is Linus Torvalds. He’s somebody who’s deeply passionate about the state of the community, yet he still finds enough time to be involved with most technical aspects of the Linux kernel on a daily basis. And at the end of the day, he’s just plain fun to be around. Of course, the governance framework of the Apache Software Foundation is quite different from the governance model of the Linux kernel. I use that excuse when I can’t quite measure up to Linus where influence is concerned.

What has surprised you the most about how your project has evolved/matured?

The elevator pitch for Bigtop has always been: Bigtop is to Hadoop what Debian is to Linux. The most surprising development to me was how well that message resonates with the commercial vendors in the Big Data space. I’m still amazed at how quickly the “Powered by Bigtop” list is growing.

What is the major work yet to be done, from your perspective as the project’s founder?

Developers, developers, developers!

We have to grow the community by leaps and bounds if we want to be remembered as the Debian of Hadoop. This translates into investing in outreach activities, but also (and maybe primarily) into creating enough value in the project itself so that external developers get hooked. Just steer clear of trying to boil the ocean by yourself — make the project interesting enough, and the developer community will come to the party.

What is your philosophy, if you have one, for balancing quality versus quantity with respect to contributions?

That’s a tough one. In an ideally balanced community, everybody keeps an eye on all the proposed changes and casts +/-1 as needed. Hence there’s a self-throttling process that also provides a learning opportunity for newcomers. Fundamentally though, developers don’t like to review patches; they like to write code. Making review duty appealing is like making broccoli appealing to your toddler. The bottom line is: You have to get creative (and personally brace yourself for way more reviews than direct code contributions).

At the same time, accepting somebody’s patches is a great way of keeping that contributor around. Hence, personally, I try to err on the side of openness and community growth over polishing each patch ad infinitum. And my personal coding philosophy is: Commit early, commit often, and refactor mercilessly.

Any other advice for other potential project founders?

Attachment is the root of all suffering; don’t get attached to your own code or ideas. The only thing that lasts is community. At the ASF, we are reminded of the “community over code” mantra all the time, but it’s not just a phrase. It’s for real.

P.S.: Oh, and here’s one more crucial bit of advice: Before naming your project, make sure to check that the vanity license plate with its name is available in your state.