Classification
Regression
Clustering
Structure Learning
Decision Tree (ID3)
Recursively splits the training data on the attribute that gives the highest information gain, building a tree that can be traversed at classify time.
Induction of Decision Trees
J.R. Quinlan · Machine Learning, Vol. 1 · 1986
Algorithm
ID3 (Iterative Dichotomiser 3) builds a decision tree top-down by greedily selecting, at each node, the attribute that maximises information gain — the reduction in entropy when the data is partitioned by that attribute.
Algorithm: ID3 BuildTree
Information Gain
Shannon Entropy
Numeric attributes are handled with binary splits — the algorithm tries every midpoint between adjacent sorted values and picks the threshold with the highest information gain.
Theory → Code
Four implementation steps map to Quinlan's paper:
1
Entropy — measure impurity of a set of instances
2
Best split — find the threshold/value with maximum information gain
3
Recursive tree building — greedy top-down splitting
4
Classify — traverse the tree by following branches
Theory
Complexity
Complexity
Training
— n instances, d attributes, h max depthQuery
— single root-to-leaf traversalSpace
— tree nodes proportional to training setAt each node, scoring all attributes over the remaining instances costs ; with nodes per level and depth , total training is . Numeric attributes add a sort per attribute per node.
Parameters
Notes
Overfitting:ID3 grows trees until leaves are pure, which often overfits noisy training data. The max-depth limit is the only regularisation here. C4.5 (Quinlan's successor) adds post-pruning to address this.
Bias toward high-cardinality attributes: Information gain favours attributes with many values. C4.5 corrects this with Gain Ratio instead of raw gain.
machinelearning.js.org · open source · MIT · Marin's Web Site