Decision Trees

All decision tree algorithms have the same basic structure: a divide-and-conquer algorithm, very similar to the game of 20 Questions.

1. Take the entire data set as input.
2. Search for a split that maximizes the "separation" of the classes. A split is any test that divides the data in two (e.g. if attribute5 < 10).
3. Apply the split to the input data, dividing it into two parts (the "divide" step).
4. Re-apply steps 1 through 3 to each side of the split (the recursive "conquer" step).
5. Stop when you meet a stopping criterion.
6. (Optional) Clean up the tree in case you went too far doing splits (called "pruning").
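
The steps above can be sketched in a few dozen lines of Python. This is only an illustrative sketch under several assumptions that the description does not fix: "separation" is measured with Gini impurity, the stopping criteria are a pure node or a maximum depth, the tree is stored as nested dictionaries, and all function names (gini, best_split, build_tree, predict) are hypothetical. The optional pruning step is not shown.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means all one class)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Step 2: try every 'attribute < threshold' test and keep the one
    with the lowest weighted impurity. Returns None if no test splits the data."""
    best = None  # (score, attribute_index, threshold)
    n = len(rows)
    for attr in range(len(rows[0])):
        for threshold in sorted({row[attr] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[attr] < threshold]
            right = [y for row, y in zip(rows, labels) if row[attr] >= threshold]
            if not left or not right:
                continue  # not a real split: one side is empty
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, attr, threshold)
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Steps 1 and 3 to 5: split the input, then recurse on each side."""
    majority = Counter(labels).most_common(1)[0][0]
    # Step 5 (assumed stopping criteria): pure node or depth limit reached.
    if depth >= max_depth or gini(labels) == 0.0:
        return {"leaf": majority}
    split = best_split(rows, labels)
    if split is None:
        return {"leaf": majority}
    _, attr, threshold = split
    # Step 3: the "divide" step.
    left = [(r, y) for r, y in zip(rows, labels) if r[attr] < threshold]
    right = [(r, y) for r, y in zip(rows, labels) if r[attr] >= threshold]
    # Step 4: the recursive "conquer" step.
    return {
        "attr": attr,
        "threshold": threshold,
        "left": build_tree([r for r, _ in left], [y for _, y in left],
                           depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right],
                            depth + 1, max_depth),
    }


def predict(tree, row):
    """Walk from the root to a leaf, answering one question per level."""
    while "leaf" not in tree:
        side = "left" if row[tree["attr"]] < tree["threshold"] else "right"
        tree = tree[side]
    return tree["leaf"]
```

For example, build_tree([[1], [2], [8], [9]], ["a", "a", "b", "b"]) finds the single test "attribute 0 < 8", and predict then follows that test down to a leaf, much like answering one question in 20 Questions.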