No labels anywhere: K-Means invents structure by alternating two steps — assign every point to its nearest centroid, then move each centroid to its cluster's mean.
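Neither step can raise inertia (the within-cluster sum of squared distances), so the algorithm always converges, but only to a local optimum that depends on the random start. n_init=10 reruns it from ten different starts and keeps the lowest-inertia result.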
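To see the local-optimum trap in code, here is a minimal sketch that hands K-Means a deliberately bad start and compares it with a restarted run. The toy data, the blob positions and the name bad_start are assumptions made up for illustration; only KMeans and its init / n_init parameters come from scikit-learn.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Toy data (assumed): two neighbouring piles on the left, one far away on the right.
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [4, 0], [20, 0]],
    cluster_std=0.5,
    random_state=0,
)

# Hypothetical bad start: one centroid between the two left piles,
# two centroids squeezed inside the right pile.
bad_start = np.array([[2.0, 0.0], [20.0, -0.3], [20.0, 0.3]])

# A single run from the bad start: no second chance.
stuck = KMeans(n_clusters=3, init=bad_start, n_init=1).fit(X)

# Ten independent starts; scikit-learn keeps the run with the lowest inertia.
restarted = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)

print("inertia from the bad start:", stuck.inertia_)
print("inertia, best of 10 starts:", restarted.inertia_)
```

Both runs converge, but the bad start is expected to settle with the two left piles merged and the right pile split in half, which shows up as a higher inertia.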
It assumes round, similar-sized clusters; crescent moons and nested rings break it (that is DBSCAN territory).
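A quick way to check this claim is to run both algorithms on crescent moons and score them against the known halves. This is a sketch under assumptions: the dataset comes from make_moons, and the DBSCAN settings (eps=0.2, min_samples=5) are values chosen for this toy data, not general recommendations.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score

# Two interleaved crescents; y_true records which crescent each point belongs to.
X, y_true = make_moons(n_samples=300, noise=0.05, random_state=0)

# K-Means can only cut the plane with a straight boundary between two centroids.
km_labels = KMeans(n_clusters=2, n_init=10, random_state=42).fit_predict(X)

# DBSCAN follows dense chains of points, so it can trace a curved shape.
db_labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)

# Adjusted Rand index: 1.0 means perfect agreement with the true crescents.
ari_km = adjusted_rand_score(y_true, km_labels)
ari_db = adjusted_rand_score(y_true, db_labels)
print("K-Means ARI:", ari_km)
print("DBSCAN ARI:", ari_db)
```

On data like this, K-Means tends to slice each crescent in two, while DBSCAN should recover the crescents almost exactly.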
You choose k, the data does not: elbow plots and silhouette scores advise, business meaning decides.
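The two advisers can be consulted in a single loop. The sketch below assumes toy data with three well-separated piles and sweeps k from 2 to 6; the range, the variable names and the data are all illustrative choices.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Toy data (assumed): three well-separated piles.
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [8, 0], [0, 8]],
    cluster_std=1.0,
    random_state=0,
)

results = {}
for k in range(2, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    # Inertia feeds the elbow plot; silhouette rewards tight, well-separated clusters.
    results[k] = (km.inertia_, silhouette_score(X, km.labels_))
    print(f"k={k}  inertia={results[k][0]:.1f}  silhouette={results[k][1]:.3f}")

# The silhouette's suggestion: the k with the highest average score.
best_k = max(results, key=lambda k: results[k][1])
print("silhouette suggests k =", best_k)
```

Inertia keeps shrinking as k grows, so it can only show where the improvement flattens out. The silhouette score gives a peak instead, but it is still advice: a k that scores slightly lower may be the one the business can actually use.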
In the Test Kitchen: step the algorithm by hand and watch a bad initial spoon placement land in a worse final sort.
Don't just read the recipe — taste it. Drag, click and break things below.
A delivery just dumped ingredients all over the floor. K-means sorts them onto k shelves (the ✕ marks) with two alternating moves: ASSIGN each item to its nearest shelf, then UPDATE each shelf to the middle of its pile. When the shelves stop moving, that's convergence.
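The same two moves can be written out by hand in NumPy. This is a teaching sketch, not how scikit-learn implements K-Means: the function names assign, update and kmeans_once are made up here, the starting shelves are simply k random items, and the floor data is randomly generated.

```python
import numpy as np

def assign(X, shelves):
    """ASSIGN: label each item with its nearest shelf; also return the inertia."""
    d2 = ((X[:, None, :] - shelves[None, :, :]) ** 2).sum(axis=2)  # shape (n, k)
    return d2.argmin(axis=1), d2.min(axis=1).sum()

def update(X, labels, shelves):
    """UPDATE: move each shelf to the mean of its pile (empty piles stay put)."""
    new = shelves.copy()
    for j in range(len(shelves)):
        pile = X[labels == j]
        if len(pile):
            new[j] = pile.mean(axis=0)
    return new

def kmeans_once(X, k, seed, max_iter=100):
    rng = np.random.default_rng(seed)
    # Assumed initialisation: pick k distinct items as the starting shelves.
    shelves = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    history = []
    for _ in range(max_iter):
        labels, inertia = assign(X, shelves)
        history.append(inertia)
        moved = update(X, labels, shelves)
        if np.allclose(moved, shelves):  # shelves stopped moving: converged
            break
        shelves = moved
    return shelves, labels, history

# Toy floor (assumed): three piles of 50 items each.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=c, scale=0.7, size=(50, 2))
    for c in ([0, 0], [5, 0], [0, 5])
])

shelves, labels, history = kmeans_once(X, k=3, seed=1)
print("inertia after each ASSIGN:", [round(h, 1) for h in history])
```

The printed inertia values should never go up from one round to the next, which is the reason the loop is guaranteed to stop.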
FIG V.2: K-MEANS CLUSTERING — FAINT LINES SHOW CURRENT MEMBERSHIP
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

X = StandardScaler().fit_transform(dishes)   # distance-based → scale!
km = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
print(km.labels_[:10])   # which buffet table each dish joined
print(km.inertia_)       # within-cluster squared distance (lower=tighter)
# sweep k and plot inertia → the 'elbow' suggests how many tables exist