Monday, January 6, 2014

An Easy Way to Bridge Between Python and Vowpal Wabbit

Python is a great programming language. It has a clean syntax, tremendous user community support, and excellent machine learning libraries. Unfortunately, it is SLOW! So, when the situation calls for it, I prefer to drop down to machine code to run the actual machine learning algorithm.

One fast and amazing machine learning tool that I have used on a number of projects is Vowpal Wabbit. It was developed by researchers at Yahoo! Research and later at Microsoft Research. It supports many types of learning problems, automatically consumes and vectorizes text, can do recommendations, predictions, and classifications (single and multi-class), supports namespaces and instance weighting, and the list goes on.

The problem with wrappers is that they don't always expose all the features you want to use. Vowpal has a lot of features. So, after a bit of hemming and hawing, I did a "slash and burn" then wrote what I needed. This is how I currently use Vowpal Wabbit with Python. Instead of a wrapper, I offer you code snippets which can be tailored to your specific needs.
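To make the "no wrapper" idea concrete, here is a minimal sketch of assembling a vw command line by hand so that any option can be passed straight through. The helper name build_vw_command and the file names are hypothetical and are not part of the code in this post; it assumes the vw binary is installed and on the PATH.

```python
import subprocess

def build_vw_command(data_path, model_path, extra_args=()):
    """Return the argument list for a vw training run (hypothetical helper)."""
    # -d names the input data file and -f the file the trained model is saved to.
    command = ["vw", "-d", data_path, "-f", model_path]
    # Any other vw option is passed through untouched.
    command.extend(extra_args)
    return command

# File names are placeholders; -c enables the cache that multiple passes need.
command = build_vw_command("train.vw", "model.vw", ["--passes", "20", "-c"])
print(" ".join(command))

# To actually train, with vw installed and on the PATH:
# subprocess.check_call(command)
```

Because the command is just a list of strings, a feature that a wrapper does not expose is only one more list element away.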

This code assumes you know how to use Python and Pandas. It runs on Linux and uses the matrix factorization feature (recommendation engine) of Vowpal.
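As a rough illustration of what matrix factorization input can look like, the sketch below formats (rating, user, item) rows as Vowpal Wabbit text lines with one namespace per side. The namespace names user and item, the helper to_vw_line, and the toy rows are assumptions made for this example. In practice the rows could come from a DataFrame, for instance via itertuples(index=False).

```python
def to_vw_line(rating, user_id, item_id):
    """Format one (rating, user, item) triple as a Vowpal Wabbit input line."""
    # The label comes first, followed by one namespace per side of the interaction.
    return "{} |user {} |item {}".format(rating, user_id, item_id)

# Hypothetical toy data: (rating, user_id, item_id).
rows = [
    (5, 1, 10),
    (3, 1, 20),
    (4, 2, 10),
]

vw_lines = [to_vw_line(rating, user, item) for rating, user, item in rows]
print("\n".join(vw_lines))
```

With namespaces that start with u and i, the user-item interaction would typically be requested with vw's -q ui option together with --rank for the number of latent factors.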

Performance: With over 43 million rows, it took about 16 minutes to generate the inputs in the Pandas DataFrame, but only 9 minutes to train with 20 passes. (Intel i7-2600K)

Enjoy!

Steve Geringer

##########################################################################
# Here are the essential ingredients. You'll have to fill in the
# rest...;)
##########################################################################

import os
from time import asctime, time
import subprocess
import csv
import numpy as np
import pandas as pd
.
.
.
#############################################################
# Parameters and Globals
#############################################################
environmentDict = dict(os.environ,
                       LD_LIBRARY_PATH='/usr/local/lib')
# Hat Tip to shrikant-sharat for this secret incantation
# Note: only needed if you rebuilt vowpal and the new libvw.so is in
# /usr/local/lib