What do you do at Cloudera, and in which Apache projects are you involved?

I am a software engineer here at Cloudera, working on the security aspects of the platform. Specifically, I work on and actively contribute to the Apache Sentry (incubating) project, which is part of the Project Rhino effort with Intel to bring comprehensive security for data protection to Hadoop. I am also a committer and a PPMC member of the project.

Sentry is a system for enforcing fine-grained, role-based authorization to data and metadata stored on a Hadoop cluster. It currently integrates seamlessly with Apache Hive, Impala, and Apache Solr to provide authorization control for their users.

Why do you enjoy your job?

It has been a great experience so far. Learning something new every day, working with some of the brightest minds in the industry, working in open source — and moreover, working with a fun-loving, dedicated team — is very rewarding. I also really like the fact that Cloudera encourages public knowledge sharing, and thus engineers get to talk at various meetups and conferences about the work they are doing.

What is your favorite thing about Apache Hadoop?

As Cloudera rightly says in one sentence: It lets you “ask bigger questions”. If you think about how much data we produce every day versus how much we actually process, it is astonishing to imagine how many ways the world could benefit if we had the software capabilities to easily store, process, measure, and learn from all of it. And with more and more new datasets becoming available daily, it is very important for the software to evolve rapidly in terms of scale, performance, usability, and security.

I think this pace of software development in the Hadoop ecosystem is only possible because of the open source community, and I am very glad to be a part of that community as well as to work with the leader, Cloudera.

What is your advice for someone who is interested in participating in any open source project for the first time?

There are numerous high-impact, interesting open source projects out there. I think it would be best to pick a couple of projects that interest you the most (even better if you are already using them). Subscribe to the users@ and dev@ mailing lists, follow the activity, and start using the project. Most projects have newbie JIRAs, which are relatively self-contained bug fixes. These are ideal candidates for getting started as a contributor.

You should feel completely free to file bugs and contribute patches when you hit problems. Each project also has a “How to contribute” page with detailed instructions for new contributors.

At what age did you become interested in programming, and why?

I started coding in C when I was around 17, and I instantly felt like I had earned supernatural powers to do much more than what I could do with limited resources like physical strength and time. Coming from a sports background (tennis), I used to train my body for six hours a day, so the return on investment (in terms of time and energy) I saw in programming was very, very exciting!

In tennis, once you reach the point of having good technique, all you need to do is keep your body in the best possible condition (which is no easy task) and experiment with your game strategy. But in programming, I can’t imagine a time when I will not have anything new to learn. That is something that keeps me excited every day.