MachineLearning.js

Logistic Regression is a discriminative linear model. For binary classification it models P(y=1|x) = σ(w·x + b), where σ is the sigmoid function. The weights w and bias b are learned by minimising binary cross-entropy loss via gradient descent.

Sigmoid

\sigma(z) = \frac{1}{1 + e^{-z}}

Loss (binary cross-entropy), zᵢ = w·xᵢ + b

L(\mathbf{w}, b) = -\frac{1}{n}\sum_{i}\bigl[y_i\log\sigma(z_i) + (1-y_i)\log(1-\sigma(z_i))\bigr]

Gradient

\frac{\partial L}{\partial \mathbf{w}} = \frac{1}{n}X^\top\!\bigl(\sigma(X\mathbf{w}) - \mathbf{y}\bigr), \qquad \frac{\partial L}{\partial b} = \frac{1}{n}\sum_{i}(\sigma(z_i) - y_i)

Update

\mathbf{w} \leftarrow \mathbf{w} - \eta\,\frac{\partial L}{\partial \mathbf{w}}, \qquad b \leftarrow b - \eta\,\frac{\partial L}{\partial b}

Multi-class problems are handled with one-vs-rest (OvR): one binary classifier is trained per class, and the class with the highest sigmoid score wins.

Theory → Code

Sigmoid function — squashes any real number into (0, 1)

function sigmoid(z) { return 1 / (1 + Math.exp(-z)); }

Gradient descent on cross-entropy for one binary classifier

for (let epoch = 0; epoch < epochs; epoch++) { const dw = new Array(d).fill(0); let db = 0; for (let i = 0; i < n; i++) { const err = sigmoid(w.reduce((s, wj, j) => s + wj * X[i][j], b)) - y[i]; for (let j = 0; j < d; j++) dw[j] += err * X[i][j]; // ∂L/∂w db += err; // ∂L/∂b } for (let j = 0; j < d; j++) w[j] -= (learningRate / n) * dw[j]; b -= (learningRate / n) * db; }

One-vs-rest — train one binary classifier per class value

const classifiers = classValues.map(cv => { const y = instances.map(r => r[classIndex] === cv ? 1 : 0); // 1 = this class, 0 = rest return { cv, ...trainBinary(X, y, d, learningRate, epochs) }; });

Classify — return the class with the highest sigmoid score

export function classify(model, instance) { const x = featureIdxs.map((fi, j) => ((instance[fi] ?? 0) - mins[j]) / ranges[j]); let best = null, bestScore = -Infinity; for (const { cv, w, b } of classifiers) { const score = sigmoid(w.reduce((s, wj, j) => s + wj * x[j], b)); if (score > bestScore) { bestScore = score; best = cv; } } return best; }

Theory

Lemma 1.The binary cross-entropy loss

L(\mathbf{w}, b)

is convex in

(\mathbf{w}, b)

— gradient descent converges to the global minimum (assuming it exists; it may not when the data are linearly separable).

Proof sketch.The Hessian is

H = \tfrac{1}{n}X^\top \operatorname{diag}(\sigma_i(1-\sigma_i)) X

. Since

\sigma(z)(1-\sigma(z)) > 0

for all

z \in \mathbb{R}

, the diagonal matrix is positive definite, and

X^\top(\cdot)X

is positive semi-definite. Therefore

H \succeq 0

and

L

is convex.□

Lemma 2.When the training data are linearly separable, no finite weight vector minimises the cross-entropy loss — weights grow unboundedly to push

\sigma(z_i) \to 1

for all positive examples. Regularisation (L2 penalty) is required to recover a finite solution.

Complexity

Training

O(n \cdot d \cdot e \cdot c)

— one binary classifier per class (one-vs-rest)

Query

O(c \cdot d)

— one sigmoid score per class

Space

O(c \cdot d)

— one weight vector per class

Parameters

Learning rate (η)

Step size for gradient updates (default 0.1). Too large → diverge; too small → slow convergence.

Epochs

Number of full passes over the training data (default 500).

Notes

Only numeric attributes are used as features in this implementation. Nominal attributes are currently ignored — convert them to numeric (e.g. one-hot encoding) before loading if they carry signal.

Feature normalisation is applied automatically: each numeric attribute is scaled to [0, 1] using training-set min/max. This is essential for gradient descent to converge reliably across attributes with different scales.

Logistic Regression provides calibrated probability estimates — unlike k-NN and Decision Trees, the sigmoid output can be interpreted as a confidence score.

On this page

Original Paper Algorithm Theory → Code Theory Complexity Parameters Notes

Decision Tree (ID3)Support Vector Machine

machinelearning.js.org · open source · MIT · Marin's Web Site