How do we decide what tests to use?
• There are many variations (CART, ID3, C4.5) but the basic
idea is to recursively partition the space by greedily picking
the best test from a fixed set of possible tests at each step.
– We need to decide on the set of possible tests
• Consider splits at every coordinate of every point in the training
data (i.e. all axis-aligned hyper-planes that touch datapoints)
• We could consider all possible hyper-planes, but this is usually too
expensive in both fitting time and model complexity.
– We need a measure of how good a test is (“purity”)
• For regression, compute the sum-squared error of the targets around the
mean of their partition, summed over all partitions the test produces
(lower is better).
• For classification, it is a bit more complicated.
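
As a rough illustration of the regression case above, here is a minimal sketch of one greedy step: it tries an axis-aligned threshold at every coordinate of every training point and keeps the test with the lowest total sum-squared error. The names `sse` and `best_axis_aligned_split`, the `<=` convention for the left side, and the toy data are assumptions made for this example, not part of any particular algorithm (CART, ID3, C4.5) named above.

```python
import numpy as np

def sse(y):
    """Sum of squared errors of y around its own mean (0 for an empty set)."""
    if len(y) == 0:
        return 0.0
    return float(np.sum((y - np.mean(y)) ** 2))

def best_axis_aligned_split(X, y):
    """Try a threshold at every coordinate of every training point.

    Returns (total_sse, feature_index, threshold) for the best test,
    or None if no test separates the data.
    """
    best = None
    n_samples, n_features = X.shape
    for j in range(n_features):
        for t in np.unique(X[:, j]):
            left = X[:, j] <= t      # assumed convention: left side is x_j <= t
            right = ~left
            if not right.any():
                continue             # this test does not split anything off
            total = sse(y[left]) + sse(y[right])
            # Strict '<' keeps the first test found when there are ties.
            if best is None or total < best[0]:
                best = (total, j, float(t))
    return best

# Toy data (hypothetical): the target depends only on feature 0.
X = np.array([[1.0, 5.0], [2.0, 4.0], [3.0, 7.0], [4.0, 6.0]])
y = np.array([1.0, 1.0, 3.0, 3.0])
total, feature, threshold = best_axis_aligned_split(X, y)
print(total, feature, threshold)
```

Applying the same search again to the points on each side of the chosen test would give the recursive, greedy partitioning described above. This sketch recomputes the error from scratch for every candidate, which is simple but slow; practical implementations typically sort each feature once and update the error incrementally.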