MachineLearning.js

k-Means is an unsupervised algorithm that groups instances into k clusters. It uses only numeric attributes (the class attribute is ignored). MacQueen's paper gave the method its name; the batch procedure used here is commonly known as Lloyd's algorithm:

Algorithm: Lloyd's k-Means

1. Initialise k centroids by random sampling from the dataset

2. repeat until no assignment changes (or maxIter reached):

a. Assign each instance to its nearest centroid (Euclidean distance)

b. Recompute each centroid as the mean of its assigned instances

3. Report assignments, centroid values, cluster sizes, and WCSS

Lloyd's algorithm is guaranteed to terminate in a finite number of steps because the number of possible assignments is finite and WCSS never increases from one iteration to the next; provided ties are broken consistently, no assignment is visited twice.

Theory → Code

Five stages of the algorithm map to the implementation:

Initialization — random instances as starting centroids

// Shuffle the dataset and take the first k instances as seeds.
// Simple, never picks the same instance twice, sufficient for small-to-medium datasets.
// Note: sorting with a random comparator does not give a uniform shuffle;
// a Fisher-Yates shuffle would be unbiased.
const shuffled = [...instances].sort(() => Math.random() - 0.5);
let centroids = shuffled.slice(0, k).map(inst => idxs.map(i => inst[i] ?? 0));

Assignment step — argmin distance to centroid
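The assignment step can be sketched as follows. This is a minimal illustration, not the library's own code: the function names `squaredDistance` and `assignToNearest` and the parameter name `points` are hypothetical. It compares squared distances, which yields the same nearest centroid as Euclidean distance while skipping the square root.

```javascript
// Hypothetical helper: squared Euclidean distance between two equal-length vectors.
function squaredDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return sum;
}

// Hypothetical helper: for each point, return the index of its nearest centroid.
// Ties go to the lowest centroid index, because only a strictly smaller
// distance replaces the current best.
function assignToNearest(points, centroids) {
  return points.map(p => {
    let best = 0;
    let bestDist = Infinity;
    centroids.forEach((c, j) => {
      const d = squaredDistance(p, c);
      if (d < bestDist) {
        bestDist = d;
        best = j;
      }
    });
    return best;
  });
}

const demoAssign = assignToNearest([[0, 0], [1, 0], [9, 9]], [[0, 0], [10, 10]]);
console.log(demoAssign); // [ 0, 0, 1 ]
```

Breaking ties the same way on every iteration matters for the convergence argument, since it makes the assignment a deterministic function of the centroids.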

Update step — recompute centroids as cluster means
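A minimal sketch of the update step, under stated assumptions: the name `recomputeCentroids` is hypothetical, and the choice to keep the previous centroid for a cluster that ends up empty is an assumption of this example, not something the page specifies.

```javascript
// Hypothetical helper: recompute each centroid as the mean of its assigned points.
// points: array of numeric vectors; assignments[i]: cluster index of points[i].
function recomputeCentroids(points, assignments, oldCentroids) {
  const dims = oldCentroids[0].length;
  const sums = oldCentroids.map(() => new Array(dims).fill(0));
  const counts = new Array(oldCentroids.length).fill(0);

  points.forEach((p, i) => {
    const c = assignments[i];
    counts[c] += 1;
    for (let d = 0; d < dims; d++) sums[c][d] += p[d];
  });

  // Assumption: a cluster with no points keeps its previous centroid,
  // which avoids a division by zero.
  return sums.map((s, c) =>
    counts[c] === 0 ? oldCentroids[c].slice() : s.map(v => v / counts[c])
  );
}

const demoCentroids = recomputeCentroids(
  [[0, 0], [2, 0], [9, 9]],
  [0, 0, 1],
  [[0, 0], [10, 10]]
);
console.log(demoCentroids); // [ [ 1, 0 ], [ 9, 9 ] ]
```

The mean is the point that minimises the sum of squared distances to a fixed set of points, which is why this step cannot increase WCSS.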

Convergence check — stop when no assignments change

const changed = newAssign.some((a, i) => a !== assignments[i]);
assignments = newAssign;
if (!changed) break; // Lloyd's algorithm is guaranteed to converge

WCSS — within-cluster sum of squares (objective function)

$$\text{WCSS} = \sum_{j=1}^{k}\sum_{\mathbf{x}\,\in\, C_j} \|\mathbf{x} - \boldsymbol{\mu}_j\|^2$$

const wcss = projected.reduce(
  (s, p, j) => s + idxs.reduce(
    (s2, _, ii) => s2 + (p[ii] - centroids[assignments[j]][ii]) ** 2, 0),
  0);
// Lower WCSS → tighter clusters; only compare across models with the same k.

Theory

Theorem 1. (Convergence of Lloyd's algorithm) The k-Means algorithm terminates in a finite number of iterations.

Proof sketch. There are $k^n$ possible assignments of $n$ points to $k$ clusters, a finite set. The WCSS objective never increases: the assignment step minimises WCSS for fixed centroids (each point goes to its nearest centre), and the update step (setting centroids to cluster means) minimises WCSS for fixed assignments. With ties broken consistently (for example, always towards the lowest cluster index), an iteration that leaves WCSS unchanged also leaves the centroids unchanged, so the next assignment step changes nothing and the loop stops. Otherwise WCSS strictly decreases, so no earlier assignment can recur and the algorithm must terminate. □

Lemma 1. Lloyd's algorithm converges to a local minimum of WCSS, not necessarily the global minimum. The result depends on initialisation, so standard practice is to run several restarts with different random seeds and keep the run with the lowest WCSS.
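The restart strategy can be sketched as a small wrapper. Everything named here is hypothetical: `bestOfRestarts` is not part of the library, and `runOnce` stands for any function that runs k-Means once from a fresh random initialisation and returns an object with a numeric `wcss` field.

```javascript
// Hypothetical wrapper: run k-Means several times and keep the lowest-WCSS result.
// runOnce(r) is an assumed callback that performs one full run and returns
// an object containing at least { wcss: number }.
function bestOfRestarts(runOnce, restarts = 10) {
  let best = null;
  for (let r = 0; r < restarts; r++) {
    const result = runOnce(r);
    if (best === null || result.wcss < best.wcss) best = result;
  }
  return best;
}

// Usage with a stub in place of a real clustering run.
// The WCSS values below are made up purely to exercise the wrapper.
const stubWcss = [12.5, 7.25, 9.0];
const bestRun = bestOfRestarts(r => ({ run: r, wcss: stubWcss[r] }), 3);
console.log(bestRun); // { run: 1, wcss: 7.25 }
```

All runs must use the same k, since WCSS values are only comparable between models with the same number of clusters.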

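Lemma 2. The best achievable WCSS is a non-increasing function of $k$: writing $\text{WCSS}(k)$ for the minimum WCSS over all clusterings with $k$ clusters, $\text{WCSS}(k+1) \le \text{WCSS}(k)$ always. An individual run can violate this inequality, because Lloyd's algorithm only finds a local minimum. Therefore WCSS alone cannot determine the optimal $k$; use the elbow method or gap statistic instead.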

Complexity

Per iteration: $O(n \cdot k \cdot d)$. Assign each of n points to the nearest of k centroids, with d numeric attributes per point.

Total: $O(n \cdot k \cdot d \cdot t)$. t iterations until convergence.

Space: $O((n + k) \cdot d)$. Data plus centroid storage.

In practice $t \ll n$: most datasets converge in under 20 iterations. k-Means++ initialisation (choose each new centroid with probability proportional to its squared distance from the nearest existing centroid) typically improves both quality and convergence speed but is not implemented here.
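For readers who want to experiment, k-Means++ seeding can be sketched as below. This is not part of the library: the function name `kMeansPlusPlusSeeds` and the injectable `random` parameter are hypothetical, and the uniform fallback for the case where every point already coincides with a seed is an assumption of this sketch.

```javascript
// Hypothetical k-Means++ seeding sketch.
// points: array of numeric vectors; random: function returning a number in [0, 1).
function kMeansPlusPlusSeeds(points, k, random = Math.random) {
  if (k < 1 || k > points.length) {
    throw new RangeError("k must be between 1 and the number of points");
  }
  const sqDist = (a, b) => a.reduce((s, v, i) => s + (v - b[i]) ** 2, 0);

  // First seed: chosen uniformly at random.
  const seeds = [points[Math.floor(random() * points.length)].slice()];
  // nearest[i]: squared distance from points[i] to its closest seed so far.
  const nearest = points.map(p => sqDist(p, seeds[0]));

  while (seeds.length < k) {
    const total = nearest.reduce((s, d) => s + d, 0);
    let chosen = -1;
    if (total === 0) {
      // Assumption: if every point coincides with a seed, pick uniformly.
      chosen = Math.floor(random() * points.length);
    } else {
      // Sample an index with probability nearest[i] / total.
      let target = random() * total;
      for (let i = 0; i < points.length; i++) {
        if (nearest[i] === 0) continue; // zero weight, can never be picked
        chosen = i;
        target -= nearest[i];
        if (target < 0) break;
      }
    }
    const seed = points[chosen].slice();
    seeds.push(seed);
    points.forEach((p, i) => {
      nearest[i] = Math.min(nearest[i], sqDist(p, seed));
    });
  }
  return seeds;
}

console.log(kMeansPlusPlusSeeds([[0], [1], [10]], 2).length); // 2
```

Passing a deterministic `random` function makes the seeding reproducible, which is convenient when comparing runs.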

Parameters

Number of clusters (k): try several values and compare WCSS (elbow method) to choose.

Max iterations: safety cap (default 100). Most real datasets converge in under 20 iterations.

Output

The Cluster tab reports cluster assignments for every instance, centroid values per attribute, cluster sizes, and the within-cluster sum of squares (WCSS).

WCSS generally decreases as k increases, so a lower WCSS at a larger k does not by itself indicate a better clustering. Compare values across k with an elbow plot: look for the point where adding another cluster gives diminishing returns.
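One way to make "diminishing returns" concrete is to compute the relative WCSS reduction gained by each extra cluster. The sketch below is an illustration only: `relativeDrops` is a hypothetical helper, and the input numbers are made up to show the shape of the output, not results from any dataset.

```javascript
// Hypothetical helper: relative WCSS reduction for each step up in k.
// wcssByK: array of { k, wcss } sorted by ascending k.
function relativeDrops(wcssByK) {
  const drops = [];
  for (let i = 1; i < wcssByK.length; i++) {
    const prev = wcssByK[i - 1].wcss;
    const curr = wcssByK[i].wcss;
    // Guard against division by zero when the previous WCSS is already 0.
    drops.push({ k: wcssByK[i].k, drop: prev > 0 ? (prev - curr) / prev : 0 });
  }
  return drops;
}

// Made-up values for illustration.
const illustrative = [
  { k: 1, wcss: 100 },
  { k: 2, wcss: 40 },
  { k: 3, wcss: 30 },
  { k: 4, wcss: 27 },
];
console.log(relativeDrops(illustrative));
```

In this made-up series the drop shrinks from 60% at k = 2 to 25% and then 10%, so the elbow would be read at k = 2. A plot remains the more reliable guide when the drops decline gradually.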


machinelearning.js.org · open source · MIT · Marin's Web Site