Logistic Regression
Models the probability of each class using a sigmoid (logistic) function, trained by minimising cross-entropy loss with gradient descent.
The Regression Analysis of Binary Sequences
D.R. Cox · Journal of the Royal Statistical Society, Series B, Vol. 20 · 1958
Algorithm
Logistic Regression is a discriminative linear model. For binary classification it models P(y=1|x) = σ(w·x + b), where σ is the sigmoid function. The weights w and bias b are learned by minimising binary cross-entropy loss via gradient descent.
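Sigmoid
σ(z) = 1 / (1 + e^(-z))
Loss (binary cross-entropy), zᵢ = w·xᵢ + b
L(w, b) = -(1/n) Σᵢ [ yᵢ log σ(zᵢ) + (1 - yᵢ) log(1 - σ(zᵢ)) ]
Gradient
∂L/∂w = (1/n) Σᵢ (σ(zᵢ) - yᵢ) xᵢ,  ∂L/∂b = (1/n) Σᵢ (σ(zᵢ) - yᵢ)
Update
w ← w - η ∂L/∂w,  b ← b - η ∂L/∂b, where η is the learning rate and n is the number of training examples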
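The sigmoid, loss, gradient, and update named above can be sketched in a few lines of JavaScript. This is a minimal illustration, not the library's code: it assumes the loss is the mean binary cross-entropy over the training set, that labels are 0/1, and that the learning rate (`eta`) is fixed. All function names are hypothetical.

```javascript
// Sketch under assumptions: X is an array of numeric feature rows, y holds 0/1 labels.
// Function names are illustrative, not the library's API.

// Sigmoid: maps any real z into (0, 1).
function sigmoid(z) {
  return 1 / (1 + Math.exp(-z));
}

// Linear score z = w·x + b for one example.
function score(w, b, x) {
  let z = b;
  for (let j = 0; j < x.length; j++) z += w[j] * x[j];
  return z;
}

// Mean binary cross-entropy over the data set.
// eps clamps probabilities away from 0 and 1 so Math.log stays finite.
function loss(w, b, X, y, eps = 1e-12) {
  let total = 0;
  for (let i = 0; i < X.length; i++) {
    const p = Math.min(Math.max(sigmoid(score(w, b, X[i])), eps), 1 - eps);
    total += y[i] * Math.log(p) + (1 - y[i]) * Math.log(1 - p);
  }
  return -total / X.length;
}

// Gradient of the mean loss: the average of (sigmoid(z_i) - y_i) * x_i for w,
// and the average of (sigmoid(z_i) - y_i) for b.
function gradient(w, b, X, y) {
  const gw = new Array(w.length).fill(0);
  let gb = 0;
  for (let i = 0; i < X.length; i++) {
    const err = sigmoid(score(w, b, X[i])) - y[i];
    for (let j = 0; j < w.length; j++) gw[j] += err * X[i][j];
    gb += err;
  }
  const n = X.length;
  return { gw: gw.map((g) => g / n), gb: gb / n };
}

// One gradient-descent update with learning rate eta.
function step(w, b, X, y, eta) {
  const { gw, gb } = gradient(w, b, X, y);
  return { w: w.map((wj, j) => wj - eta * gw[j]), b: b - eta * gb };
}

// Usage: a one-feature toy problem; the loss should fall as updates are applied.
const X = [[0.0], [0.2], [0.8], [1.0]];
const y = [0, 0, 1, 1];
let model = { w: [0], b: 0 };
const before = loss(model.w, model.b, X, y);
for (let t = 0; t < 500; t++) model = step(model.w, model.b, X, y, 0.5);
const after = loss(model.w, model.b, X, y);
console.log(before.toFixed(4), after.toFixed(4));
```

With zero initial weights every prediction is 0.5, so the starting loss is ln 2. Comparing `gradient` against a finite-difference estimate of `loss` is a quick way to check that the two formulas agree.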
Multi-class problems are handled with one-vs-rest (OvR): one binary classifier is trained per class, and the class with the highest sigmoid score wins.
Theory → Code
1. Sigmoid function: squashes any real number into (0, 1)
2. Gradient descent on cross-entropy for one binary classifier
3. One-vs-rest: train one binary classifier per class value
4. Classify: return the class with the highest sigmoid score
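The four steps above can be put together roughly as follows. This is a sketch under stated assumptions rather than the site's implementation: it uses full-batch gradient descent with a fixed learning rate and iteration count (the defaults `eta = 0.5` and `iterations = 2000` are arbitrary), zero-initialised weights, and features already scaled to [0, 1]. The function names are hypothetical.

```javascript
// Illustrative sketch; names and defaults are assumptions, not the library's API.

// Step 1: sigmoid squashes any real number into (0, 1).
const sigmoid = (z) => 1 / (1 + Math.exp(-z));

// Dot product w·x.
const dot = (w, x) => w.reduce((sum, wj, j) => sum + wj * x[j], 0);

// Step 2: batch gradient descent on mean cross-entropy for one binary classifier.
function trainBinary(X, y, eta = 0.5, iterations = 2000) {
  const d = X[0].length;
  let w = new Array(d).fill(0);
  let b = 0;
  for (let t = 0; t < iterations; t++) {
    const gw = new Array(d).fill(0);
    let gb = 0;
    for (let i = 0; i < X.length; i++) {
      const err = sigmoid(dot(w, X[i]) + b) - y[i]; // prediction minus target
      for (let j = 0; j < d; j++) gw[j] += err * X[i][j];
      gb += err;
    }
    w = w.map((wj, j) => wj - (eta * gw[j]) / X.length);
    b -= (eta * gb) / X.length;
  }
  return { w, b };
}

// Step 3: one-vs-rest. For each class value, relabel it as 1 and every other class as 0.
function trainOneVsRest(X, labels, eta, iterations) {
  const classes = [...new Set(labels)];
  return classes.map((c) => ({
    label: c,
    ...trainBinary(X, labels.map((l) => (l === c ? 1 : 0)), eta, iterations),
  }));
}

// Step 4: classify by returning the class whose classifier gives the highest sigmoid score.
function classify(models, x) {
  let best = null;
  let bestScore = -Infinity;
  for (const m of models) {
    const s = sigmoid(dot(m.w, x) + m.b);
    if (s > bestScore) {
      bestScore = s;
      best = m.label;
    }
  }
  return best;
}

// Usage: three well-separated classes in two dimensions.
const X = [
  [0.0, 0.1], [0.1, 0.0], // class "a"
  [1.0, 0.1], [0.9, 0.0], // class "b"
  [0.5, 1.0], [0.4, 0.9], // class "c"
];
const labels = ["a", "a", "b", "b", "c", "c"];
const models = trainOneVsRest(X, labels);
console.log(
  classify(models, [0.05, 0.05]),
  classify(models, [0.95, 0.05]),
  classify(models, [0.45, 0.95])
);
```

Because each one-vs-rest classifier is trained independently, the per-class sigmoid scores do not sum to 1. They are compared only to pick the largest.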
Complexity
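Training: O(K · T · n · d), one binary classifier per class (one-vs-rest)
Query: O(K · d), one sigmoid score per class
Space: O(K · d), one weight vector per class
Here K is the number of classes, T the number of gradient-descent passes over the data, n the number of training examples, and d the number of numeric features.
Parameters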
Notes
Only numeric attributes are used as features in this implementation. Nominal attributes are currently ignored — convert them to numeric (e.g. one-hot encoding) before loading if they carry signal.
Feature normalisation is applied automatically: each numeric attribute is scaled to [0, 1] using training-set min/max. This is essential for gradient descent to converge reliably across attributes with different scales.
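The two preprocessing steps mentioned in these notes can be sketched as below. This is an illustration under assumptions, not the library's code: the helper names (`oneHot`, `fitMinMax`, `scaleRow`) are hypothetical, and mapping a constant attribute to 0 is a choice made here to avoid dividing by zero.

```javascript
// Illustrative sketch; helper names are hypothetical, not the library's API.

// One-hot encode a nominal column: one 0/1 indicator per distinct value.
function oneHot(values) {
  const categories = [...new Set(values)];
  const encoded = values.map((v) => categories.map((c) => (v === c ? 1 : 0)));
  return { categories, encoded };
}

// Learn per-attribute min and max from the training rows only.
function fitMinMax(rows) {
  const d = rows[0].length;
  const min = new Array(d).fill(Infinity);
  const max = new Array(d).fill(-Infinity);
  for (const row of rows) {
    for (let j = 0; j < d; j++) {
      min[j] = Math.min(min[j], row[j]);
      max[j] = Math.max(max[j], row[j]);
    }
  }
  return { min, max };
}

// Scale one row with the training-set statistics.
// Assumption of this sketch: a constant attribute (max === min) maps to 0.
function scaleRow(row, { min, max }) {
  return row.map((v, j) => (max[j] === min[j] ? 0 : (v - min[j]) / (max[j] - min[j])));
}

// Usage: two numeric attributes on very different scales, plus one nominal column.
const train = [[150, 40], [180, 90], [165, 65]];
const scaler = fitMinMax(train);
const scaledTrain = train.map((r) => scaleRow(r, scaler));
const colours = oneHot(["red", "green", "red"]);
console.log(scaledTrain, colours.encoded);
```

In this sketch a query value outside the training range scales to a number outside [0, 1]. Whether the site's implementation clamps such values is not stated in the notes.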
machinelearning.js.org · open source · MIT · Marin's Web Site