Pomegranate: Fast and Flexible Probabilistic Modeling in Python

Jacob Schreiber, Paul G. Allen School of Computer Science, University of Washington

Audience level:
Intermediate
Topic area:
Modeling

Description

We will describe pomegranate, a Python package for flexible probabilistic modeling. We will highlight several supported models, including mixtures, hidden Markov models, and Bayesian networks. At each step we will show how this flexibility makes complex models easy to construct. We will also demonstrate the parallel and out-of-core APIs.

Abstract:

In this talk we will describe pomegranate, a flexible probabilistic modeling package for Python. We will highlight its wide library of probability distributions and compositional models, such as mixture models, Bayes classifiers, hidden Markov models, and Bayesian networks. At each step we will emphasize the flexibility pomegranate provides and how it simplifies the construction of more complicated models. We will compare these implementations to others in the open source community. In addition, we will show how the underlying modularity of the code allows models to be stacked to produce models such as mixtures of Bayesian networks, or HMMs with complicated mixture emissions. Lastly, we will show how easy it is to use the built-in out-of-core and parallel APIs for multithreaded training on massive amounts of data that can't fit in memory, all without the user having to think about any implementation details. An accompanying Jupyter notebook will allow users to follow along, see code examples for all figures presented, and try out modifications.

Some highlights of the tutorial are the following:

General

BLAS and Cython are used to speed up calculations

multithreaded parallel processing is natively supported

out-of-core computing for large data sets
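As a rough illustration of the out-of-core idea, the sketch below fits a one-dimensional Gaussian by accumulating sufficient statistics one batch at a time, so the full data set never has to be held in memory at once. It uses only the Python standard library. The class RunningNormal, its method names, and the batches helper are hypothetical names chosen for this sketch; they are not a claim about pomegranate's actual API or internals.

```python
import math


class RunningNormal:
    """Hypothetical helper: fits a 1-D Gaussian from batches of data."""

    def __init__(self):
        self.n = 0.0    # number of samples seen so far
        self.sx = 0.0   # running sum of x
        self.sxx = 0.0  # running sum of x squared

    def summarize(self, batch):
        # Fold one batch into the sufficient statistics, then discard it.
        for x in batch:
            self.n += 1
            self.sx += x
            self.sxx += x * x

    def from_summaries(self):
        # Turn the accumulated statistics into maximum likelihood estimates.
        mean = self.sx / self.n
        var = self.sxx / self.n - mean * mean
        return mean, math.sqrt(max(var, 0.0))


def batches(data, size):
    # Stand-in for reading chunks from disk (assumption: data arrives in pieces).
    for i in range(0, len(data), size):
        yield data[i:i + size]


data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
est = RunningNormal()
for b in batches(data, 2):
    est.summarize(b)
mean, std = est.from_summaries()
```

Because the statistics are simple sums, batches could also be summarized by separate workers and added together afterwards, which is one way the same pattern extends to parallel training.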

Models

models can be stacked within one another, for example a Bayes classifier of mixtures or a mixture of hidden Markov models

models are faster and more representationally flexible than their counterparts in other packages
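The stacking idea can be sketched without any library: if every model exposes the same log-probability method, a mixture can serve as a component of a Bayes classifier just as a single distribution can. The classes Normal, Mixture, and BayesClassifier below are minimal stand-ins written for this sketch, with hypothetical method names and made-up parameters; they are not pomegranate's own classes.

```python
import math


def logsumexp(values):
    # Numerically stable log(sum(exp(v))) over a list of log values.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


class Normal:
    def __init__(self, mu, sigma):
        self.mu, self.sigma = mu, sigma

    def log_probability(self, x):
        z = (x - self.mu) / self.sigma
        return -0.5 * z * z - math.log(self.sigma * math.sqrt(2 * math.pi))


class Mixture:
    """Weighted sum of components; each component only needs log_probability."""

    def __init__(self, components, weights):
        self.components, self.weights = components, weights

    def log_probability(self, x):
        return logsumexp([math.log(w) + c.log_probability(x)
                          for c, w in zip(self.components, self.weights)])


class BayesClassifier:
    """One model per class; a model may be a distribution or a mixture."""

    def __init__(self, models, priors):
        self.models, self.priors = models, priors

    def predict_proba(self, x):
        # Posterior over classes via Bayes' rule, computed in log space.
        joint = [math.log(p) + m.log_probability(x)
                 for m, p in zip(self.models, self.priors)]
        total = logsumexp(joint)
        return [math.exp(j - total) for j in joint]

    def predict(self, x):
        probs = self.predict_proba(x)
        return max(range(len(probs)), key=probs.__getitem__)


# Class 0 is bimodal (a mixture); class 1 is a single Gaussian.
clf = BayesClassifier(
    [Mixture([Normal(-2.0, 1.0), Normal(2.0, 1.0)], [0.5, 0.5]),
     Normal(6.0, 1.0)],
    [0.5, 0.5],
)
```

Calling clf.predict(2.0) selects the mixture class and clf.predict(6.0) selects the single-Gaussian class; the classifier never needs to know which kind of model it holds.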