Implementing a simple web-log based recommender system

I’ve now implemented such a system as an extension to Catalyst, the open source Perl web framework. The system isn’t yet ready for general distribution, but I’d like to share my approach.

First, I’ve gathered ten years of web access logs from WormBase, a generic model organism database where I work as the project manager.

Next, I correlated IP addresses with requests and tried to trace browsing patterns from one object to the next. This isn’t an exact science since we haven’t historically tried to uniquely identify users.

Data is loaded into a simple MySQL schema with object and object2related tables. Expediently simple.

About Todd Harris

Do you have a data management, analysis, or visualization problem you need some help with? Do you need to connect with the best people to build out your team of data scientists, bioinformaticians, or curators? Drop me a line -- I'd be happy to chat with you about your project.

Welcome!
My name is Todd Harris. A geneticist by training, I now work at the intersection of biology and computer science developing tools and systems to organize, visualize, and query large-scale genomic data across a variety of organisms.

I'm driven by the desire to accelerate the pace of scientific discovery and to improve the transparency and reproducibility of the scientific process.