January 17, 2017 @ 6:30 pm - 8:00 pm

Doors Open at 6:30 pm

Presentation Starts at 7:00 pm

Abstract:

Decision-tree models, particularly random forests and gradient-boosted tree models, are popular for both classification and regression tasks. For many tasks, they offer not only accurate predictions but also interpretable ones, that is, insights into the relationships between the input variables and the output variable: Which inputs are most influential? Which inputs combine non-additively? For practical purposes, answering these questions is often more important than the accuracy of the predictions per se. This talk will briefly introduce decision trees, random forests, and gradient-boosted tree models. It will then explain and demonstrate (in Python) feature importances, which answer the first question, and feature interactions, which answer the second. For quantifying feature interactions, the Friedman-Popescu H-statistic in particular will be discussed.
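To give a flavor of the two analyses the talk covers, here is a minimal sketch in Python (assuming scikit-learn and NumPy are available; the dataset, model, and helper functions are illustrative choices, not necessarily those used in the talk). It reads off a fitted model's impurity-based feature importances, then estimates the Friedman-Popescu H-statistic for a pair of features by brute-force averaging of partial dependences over the data:

```python
# A sketch of feature importances and the Friedman-Popescu H-statistic,
# using scikit-learn's synthetic Friedman #1 regression problem.
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_friedman1(n_samples=500, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Which inputs are most influential? Impurity-based importances sum to 1.
print(model.feature_importances_)

def partial_dependence(model, X, cols):
    """Empirical partial dependence of the model on the features in `cols`,
    evaluated at each row of X (brute-force averaging over the data)."""
    pd_vals = np.empty(len(X))
    for i, row in enumerate(X):
        X_mod = X.copy()
        X_mod[:, cols] = row[cols]       # clamp the chosen features
        pd_vals[i] = model.predict(X_mod).mean()
    return pd_vals - pd_vals.mean()      # center, per Friedman-Popescu

def h_statistic(model, X, j, k):
    """Friedman-Popescu H^2 for features j and k: the fraction of the
    variance of the joint partial dependence not explained by the two
    one-dimensional partial dependences."""
    pd_jk = partial_dependence(model, X, [j, k])
    pd_j = partial_dependence(model, X, [j])
    pd_k = partial_dependence(model, X, [k])
    return np.sum((pd_jk - pd_j - pd_k) ** 2) / np.sum(pd_jk ** 2)

# In the Friedman #1 function, features 0 and 1 interact (via a
# sin(pi * x0 * x1) term) while features 0 and 3 enter additively,
# so the first H-statistic should be much larger than the second.
print(h_statistic(model, X[:100], 0, 1))
print(h_statistic(model, X[:100], 0, 3))
```

An H-statistic near 0 indicates the two features contribute additively; values well above 0 indicate the model uses them jointly in a non-additive way.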