Hi Basheer! We love hearing about new and interesting uses of InterMine. Can you give us a brief non-technical intro to FlyCAGE? What inspired you to make it?

FlyCAGE is a web application that allows users to search for genes in Drosophila melanogaster that follow a specific mRNA expression pattern. The user can either enter a known gene name to find other genes with similar expression profiles, or enter a custom expression pattern based on experimental data to find genes that follow that pattern. FlyCAGE would be useful for identifying candidate genes involved in a given process and for discovering regulatory interactions in genetic networks.

Tell us a little about yourself.

My name is Basheer Becerra, and I am currently an undergraduate junior at Illinois State University double majoring in Computer Science & Statistics and minoring in Biological Sciences. Programming is something I love doing, specifically web development and data science. What makes me even more excited is using programming and mathematics to answer difficult questions in biology! When I’m not on the computer, you will usually see me reading, training for upcoming marathons, or spending time with friends and family.

You’ve been a friendly presence in our Twitter feed for a while now. How did you hear about InterMine originally?

InterMine was originally introduced to me by my advisor, Dr. Nathan Mortimer, as a tool to help scientists quickly review information about any gene. While InterMine is a helpful tool for scientists to browse gene information, I’ve also realized that it is incredibly useful for developers and data scientists. When my advisor and I came up with the idea of FlyCAGE, we chose InterMine as the best solution for retrieving data because of its data integration features and ease of use.

Can you tell us a bit about the technical implementation of FlyCAGE?

The technologies used to implement FlyCAGE include Spring Framework (Java) for the web back-end, Thymeleaf for templating, and HTML/CSS/Bootstrap and JS/jQuery for front-end development. After the expression information is extracted from the entered gene or pattern, Pearson’s correlation is computed between that input and the expression profile of every gene stored in FlyBase for Drosophila melanogaster. The genes with the highest Pearson’s correlation coefficient relative to the input expression are returned to the user. InterMine plays a significant role in the operation of FlyCAGE, since FlyMine is the only resource used to retrieve the gene information and mRNA expression data. With FlyMine’s modern HTTP API, only a single query is needed to retrieve all the data FlyCAGE needs to operate. Without FlyMine’s data integration, FlyCAGE would have to manually integrate data from several different sources such as FlyBase, FlyAtlas, BDGP, etc., which would slow down development significantly.
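
For readers curious about the ranking step, here is a minimal sketch in Python of scoring genes by Pearson's correlation against a query pattern. It is not FlyCAGE's actual code (which is written in Java); the function names `pearson` and `rank_by_correlation`, the gene names, and the expression values are all made up for illustration, and it assumes every profile has one value per condition in the same order as the query.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences.

    Returns None when either sequence is constant, because the
    coefficient is undefined when a variance is zero.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def rank_by_correlation(query, profiles, top_n=10):
    """Return up to top_n (gene, r) pairs, highest correlation first."""
    scored = []
    for gene, profile in profiles.items():
        r = pearson(query, profile)
        if r is not None:  # skip genes with a flat (constant) profile
            scored.append((gene, r))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical expression profiles: one value per condition, same order.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [5.0, 5.0, 5.0, 5.0],  # constant, so it is skipped
}
query = [10.0, 20.0, 30.0, 40.0]

ranked = rank_by_correlation(query, profiles, top_n=3)
for gene, r in ranked:
    print(f"{gene}: r = {r:.3f}")
```

Because Pearson's correlation compares the shape of two profiles rather than their absolute levels, the query above matches geneA perfectly even though its values are ten times larger.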

What are your future plans for FlyCAGE? Are you going to expand to other organisms apart from flies?

As far as the program logic goes, the plan is to include more complex analysis of expression data, such as using other data features to help determine “gene similarity”, predicting regulatory interactions and unknown gene functions, and explaining subtle differences in gene pairs with correlated expression patterns. There has also been a lot of interest in expanding CAGE to other organisms, such as plants. With InterMine’s standardized API across its several resources, I expect that extending CAGE to other organisms will be a relatively feasible task.

FlyCAGE is currently in alpha and can be accessed at this link. However, the link is likely to change as FlyCAGE nears release. If you would like to stay up to date with FlyCAGE, or if you would like to help us with usability tests to improve it, please enter your email in this Google Form. If you’ve already looked at FlyCAGE and would like to send some feedback, send me a quick note at bbecer2@ilstu.edu. Any help is appreciated!