My summer of code and galaxies

Monday, October 13, 2014

Today we have a post from Doris Lee, a 2014 Google Summer of Code student for the Laboratory for Cosmological Data Mining. Doris talks about her fascinating summer project exploring the galaxy.

I first learned about Google Summer of Code (GSoC) through an informational session at my school, UC Berkeley. I was interested because of the program’s project-based nature, which sounded like a lot of fun. I started by listing my favorite projects from each organization’s ideas page, but I became so engrossed in one project and the coding problem associated with its application that I ended up submitting just one proposal, to the Laboratory for Cosmological Data Mining. The Laboratory was founded in 2002 at the University of Illinois by Professor Robert J. Brunner to develop and apply computational technologies that extract cosmological information from the large astrophysical data sets being generated within the community.

During GSoC’s community bonding period, my mentor and I discovered my initial project proposal had been completed by another contributor. Together, we came up with an alternative project involving creating image mosaics of galaxies. Although I wasn’t as familiar with the subject matter of this new project, I was still very excited and couldn’t wait to get started!

I soon realized that not knowing much about the topic was actually a good thing. I was always learning something new throughout the summer, which kept the work both interesting and challenging. It was also the first time that I had undertaken such a large individual project.

But what I enjoyed most about GSoC was the freedom to define the direction of my own work. The initial goal of the project was to make mosaics of large, bright galaxies from the Sloan Digital Sky Survey. However, in our first attempts, some of the pictures that we made of the sky contained no galaxy at all! We discovered this was due to inaccuracies in the catalog’s coordinate values, because the catalog was quite old. My GSoC project then shifted to developing an algorithm that uses newer imaging data to correct these coordinates and then mosaics the images into pretty pictures. In the end, I designed a pipeline that lets users automatically generate multi-band color images for any catalog of their choice. The pipeline is also designed so that it can be used on data taken in the future.
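To give a rough feel for the coordinate correction described above, here is a minimal Python sketch. It is not the pipeline's actual algorithm. It assumes that a cutout of the newer image is available as a plain grid of pixel brightness values, and that the galaxy can be located as the brightness-weighted centroid of the pixels above a chosen threshold. The function names `bright_centroid` and `catalog_offset` and the `threshold` parameter are hypothetical and exist only for illustration.

```python
def bright_centroid(image, threshold):
    """Return the (row, col) brightness-weighted centroid of all pixels
    brighter than `threshold`, or None if no pixel qualifies."""
    total = 0.0
    row_sum = 0.0
    col_sum = 0.0
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if value > threshold:
                total += value
                row_sum += r * value
                col_sum += c * value
    if total == 0:
        return None  # nothing bright enough: the cutout may contain no galaxy
    return (row_sum / total, col_sum / total)


def catalog_offset(image, catalog_pos, threshold):
    """Return the (d_row, d_col) shift from the old catalog position to the
    measured centroid, or None if no bright source is found."""
    centroid = bright_centroid(image, threshold)
    if centroid is None:
        return None
    return (centroid[0] - catalog_pos[0], centroid[1] - catalog_pos[1])


# Toy 5x5 cutout: the catalog says the galaxy is at the center (2, 2),
# but the only bright pixel is at (3, 1).
cutout = [[0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0],
          [0, 10, 0, 0, 0],
          [0, 0, 0, 0, 0]]
print(catalog_offset(cutout, (2, 2), threshold=1))
```

In this toy cutout the measured position is one row below and one column to the left of the catalog position, so a mosaic centered on the corrected position would contain the galaxy. A real pipeline would work in sky coordinates and would have to handle noise and neighboring sources, which this sketch ignores.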

Overall, my mentor was very helpful in guiding me through my project and addressing my questions. I feel lucky to have worked on a project that I was so excited about and that I truly wanted to see working. Since my project made heavy use of open source software developed by other programmers and members of the scientific community, I learned a lot about how open source software projects are managed, documented, distributed and maintained. This was especially useful when I was developing the user interface and documentation to present my final GSoC project and making the code open source. In addition, I learned that in the free and open source software community, effective documentation and readable code can be just as important as getting the code to work. The value of publicly available code derives from how other users can benefit from it. You can view the work here on GitHub.

So many organizations participate in GSoC that it would be hard not to find one that is up your alley. I would encourage any interested student to look at the GSoC organizations and ideas lists when they are posted in February. GSoC enables students of all skill sets and levels to learn from and contribute to the open source community and to develop skills in real-world software coding and design. Not to mention, it’s a great way to spend your summer!