5
Objectives of the Algorithm
– Exact results
  – Common approaches would instead either collect a sample of the data, or build independent models at each site and apply centralized meta-learning on top of them; neither yields exact results
– Communication efficiency
  – Naive approach: collecting exact statistics for each tree node would result in gigabytes of communication

6
Decision Tree in a Teaspoon
– A tree where, at each level, the learning samples are split according to one attribute's value
– A hill-climbing heuristic is used to induce the tree
  – The attribute that maximizes a gain function is chosen
  – Gain functions: Gini or Information Gain (a sketch of both follows below)
– There is no real need to compute the gain exactly; it is enough to know which attribute maximizes it
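To make the gain functions concrete, here is a minimal Python sketch of Gini impurity and Information Gain computed from per-class sample counts; the function names and the counts-as-lists representation are illustrative assumptions, not part of the presented algorithm.

import math

def gini(counts):
    # Gini impurity of a node, given per-class sample counts.
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)

def entropy(counts):
    # Shannon entropy of a node, given per-class sample counts.
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def gain(parent_counts, child_counts_list, impurity=gini):
    # Gain of a split: parent impurity minus the weighted impurity
    # of the children the split produces.
    n = sum(parent_counts)
    weighted = sum(sum(child) / n * impurity(child) for child in child_counts_list)
    return impurity(parent_counts) - weighted

# 10 samples (6 of class A, 4 of class B):
print(gain([6, 4], [[6, 0], [0, 4]]))                    # perfect split -> 0.48
print(gain([6, 4], [[3, 2], [3, 2]], impurity=entropy))  # uninformative -> 0.0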

7
Main Idea
– Infer deterministic bounds on the gain of each attribute
– Improve the bounds until the best attribute is provably better than the rest (see the selection-loop sketch below)
– Communication efficiency is achieved because the bounds require only limited data
  – Partial statistics for the promising attributes
  – Rough bounds for the irrelevant attributes
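A minimal sketch of "improve bounds until provably better", assuming each attribute carries a deterministic (lower, upper) gain interval; tighten() is a hypothetical callback standing in for whatever request fetches extra data to narrow an interval.

def best_attribute(bounds, tighten):
    # bounds: dict mapping attribute -> (lower, upper) deterministic gain bounds.
    # tighten(attr): requests more data and returns a narrower (lower, upper).
    while True:
        # Candidate: the attribute with the highest lower bound so far.
        cand = max(bounds, key=lambda a: bounds[a][0])
        lo = bounds[cand][0]
        # Contenders: attributes whose upper bound still exceeds cand's lower bound.
        contenders = [a for a in bounds if a != cand and bounds[a][1] > lo]
        if not contenders:
            return cand  # cand is now provably the best attribute
        # No clear separation yet: tighten only the intervals blocking the decision.
        for a in contenders + [cand]:
            bounds[a] = tighten(a)

# Toy usage: this tighten() just shrinks an interval toward its midpoint.
bounds = {"age": (0.2, 0.9), "zip": (0.1, 0.5)}
def tighten(a):
    lo, hi = bounds[a]
    mid = (lo + hi) / 2
    return ((lo + mid) / 2, (hi + mid) / 2)
print(best_attribute(bounds, tighten))  # -> "age"

Note the asymmetry this enables: only the attributes whose intervals overlap the leader's need precise statistics, while the rest can keep rough bounds.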

8
Hierarchical Algorithm
– At each level of the hierarchy (a protocol sketch follows below):
  – Wait for reports from all descendants; each report contains upper and lower bounds on the gain of every attribute, plus the number of samples from each class
  – Use the descendants' reports to compute cumulative bounds
  – If there is no clear separation, request that the descendants tighten their bounds by sending more data
  – In the worst case, all of the data is gathered
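The per-level protocol could be sketched as follows; the Report shape, the descendant interface (report() / refine()), and the combine() hook are all assumptions, and in particular the rule for merging per-site intervals into cumulative bounds on the global gain is algorithm-specific and deliberately left abstract here.

from dataclasses import dataclass

@dataclass
class Report:
    # One descendant's report, mirroring the slide:
    bounds: dict          # attribute -> (lower, upper) gain bounds
    class_counts: list    # number of samples from each class

def separated(bounds):
    # True when one attribute's lower bound beats every other upper bound.
    cand = max(bounds, key=lambda a: bounds[a][0])
    lo = bounds[cand][0]
    return all(bounds[a][1] <= lo for a in bounds if a != cand), cand

def coordinate(descendants, combine):
    # descendants: objects offering report() and refine(attrs) (hypothetical
    # interface). combine(reports) merges Reports into cumulative per-attribute
    # bounds; the actual merging rule is algorithm-specific and left abstract.
    reports = [d.report() for d in descendants]      # wait for all reports
    while True:
        cumulative = combine(reports)                # cumulative bounds
        done, winner = separated(cumulative)
        if done:
            return winner                            # clear separation reached
        # No clear separation: ask for tighter bounds on the blocking attributes.
        # In the worst case this escalates until all the data is gathered.
        unresolved = [a for a in cumulative
                      if a != winner and cumulative[a][1] > cumulative[winner][0]]
        reports = [d.refine(unresolved) for d in descendants]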