Code & Curry ML
CURRICULUM: S5 · E2 · 14 min
VERIFIED BLUEPRINT

K-Means: The Buffet Sorter

TIME: 14 min
🍽️ YIELD: 1 clustering algorithm you can run in your head
📓 CHAPTER: S5E2

The Idea

CONCEPT
An unlabelled buffet scatter. Three crossed-spoon centroids drop in at random; dashed arrows assign each dish to its nearest spoon, then the spoons slide to the centre of their crowd. Loop arrow labelled 'assign → recentre → repeat until nothing moves'.

No labels anywhere: K-Means invents structure by alternating two steps — assign every point to its nearest centroid, then move each centroid to its cluster's mean.
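The two alternating steps can be sketched in a few lines of NumPy. This is a minimal sketch, not scikit-learn's implementation: the helper name kmeans_sketch, the dish coordinates and the choice to start from randomly picked data points are all assumptions made for illustration.

```python
import numpy as np

def kmeans_sketch(X, k, seed=0, max_iter=100):
    """Hypothetical helper: assign, recentre, repeat until nothing moves."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random (one simple choice).
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(max_iter):
        # ASSIGN: squared distance from every point to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # UPDATE: move each centroid to the mean of its members
        # (an empty cluster simply keeps its old position).
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):
                new_centroids[j] = members.mean(axis=0)
        # Stop once no centroid moved.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Two obvious piles of dishes (made-up coordinates).
dishes = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [9.0, 9.0], [9.0, 10.0], [10.0, 9.0]])
labels, centroids = kmeans_sketch(dishes, k=2)
print(labels)     # cluster id per dish (which pile gets id 0 depends on the start)
print(centroids)  # one row per centroid
```

Note that no label array is ever passed in: the grouping comes entirely from distances.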

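Neither step can raise inertia (the within-cluster sum of squared distances), and there are only finitely many ways to assign the points, so the loop always converges. What it converges to is a local optimum that depends on the random start, which is why n_init=10 runs ten different starts and keeps the lowest-inertia result.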
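To see the dependence on the start, one option is to run several single-start fits and keep the best by hand. The sketch below assumes synthetic data from make_blobs and uses init="random" so the starts differ visibly; it mimics the spirit of n_init rather than reproducing scikit-learn's internal seeding.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for the buffet (assumed data, five blobs).
X, _ = make_blobs(n_samples=300, centers=5, cluster_std=1.0, random_state=0)

# Ten independent single-start runs, each from a different random start.
single_runs = [
    KMeans(n_clusters=5, init="random", n_init=1, random_state=seed).fit(X)
    for seed in range(10)
]
inertias = [run.inertia_ for run in single_runs]

# Keep the run with the lowest inertia, as a restart strategy would.
best = min(single_runs, key=lambda run: run.inertia_)

for seed, value in enumerate(inertias):
    print(f"seed {seed}: inertia {value:.1f}")
print(f"best of ten: {best.inertia_:.1f}")
```

If some seeds print a noticeably higher inertia than others, those runs settled into a worse local optimum.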

It assumes round, similar-sized clusters; crescent moons and nested rings break it (that is DBSCAN territory).
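A quick way to watch the round-cluster assumption fail is to feed both algorithms two crescent moons. This is a sketch under assumptions: the data comes from make_moons, and the DBSCAN settings eps=0.2 and min_samples=5 are illustrative guesses for this toy data, not recommended defaults.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score

# Two interleaved crescents with a little noise (synthetic data).
X, y_true = make_moons(n_samples=400, noise=0.05, random_state=0)

km_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
db_labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)

# Adjusted Rand index: 1.0 means the true moons were recovered exactly.
# The true labels are used only to score the result, never to fit.
ari_km = adjusted_rand_score(y_true, km_labels)
ari_db = adjusted_rand_score(y_true, db_labels)
print(f"K-Means ARI: {ari_km:.2f}")
print(f"DBSCAN  ARI: {ari_db:.2f}")
```

K-Means splits the plane with a straight boundary, so it tends to cut each moon in half, while a density-based method can follow the curve.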

You choose k, the data does not: elbow plots and silhouette scores advise, business meaning decides.
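The two advisers can be gathered in one sweep. The sketch below assumes synthetic data from make_blobs and an arbitrary range of k from 2 to 7; it only prints the numbers, and reading the elbow or picking a k remains a judgment call.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the dishes (assumed data).
X_raw, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=42)
X = StandardScaler().fit_transform(X_raw)

results = {}
for k in range(2, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    # Inertia keeps falling as k grows; silhouette (between -1 and 1) need not.
    results[k] = (km.inertia_, silhouette_score(X, km.labels_))

for k, (inertia, silhouette) in results.items():
    print(f"k={k}: inertia={inertia:.1f}  silhouette={silhouette:.3f}")
```

Inertia alone would always vote for more clusters, which is why it is read for its bend rather than its minimum.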

In the Test Kitchen: step the algorithm by hand and watch a bad initial spoon placement land in a worse final sort.
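A case small enough to check in your head: four dishes at the corners of a 4-by-1 rectangle, k = 2. The coordinates and both starting placements below are made up for illustration. Passing an array as init tells scikit-learn to start from exactly those centroids.

```python
import numpy as np
from sklearn.cluster import KMeans

# Four dishes on the corners of a wide rectangle (made-up coordinates).
dishes = np.array([[0.0, 0.0], [0.0, 1.0], [4.0, 0.0], [4.0, 1.0]])

# Good start: one spoon at each short end of the rectangle.
good_start = np.array([[0.0, 0.5], [4.0, 0.5]])
# Bad start: both spoons in the middle, one low and one high.
bad_start = np.array([[2.0, 0.0], [2.0, 1.0]])

good = KMeans(n_clusters=2, init=good_start, n_init=1).fit(dishes)
bad = KMeans(n_clusters=2, init=bad_start, n_init=1).fit(dishes)

# Good start pairs left with left and right with right:
# every dish sits 0.5 from its spoon, so inertia is 4 * 0.25.
print(good.labels_, round(good.inertia_, 6))
# Bad start pairs bottom with bottom and top with top:
# every dish sits 2 from its spoon, so inertia is 4 * 4.
print(bad.labels_, round(bad.inertia_, 6))
```

The bad start is already a fixed point: each spoon sits at the mean of its own row, so nothing moves and the worse sort is final.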

⚗️ The Test Kitchen

INTERACTIVE LAB

Don't just read the recipe — taste it. Drag, click and break things below.

EXP 01

The Buffet Sorter

"assign, recentre, repeat"

A delivery just dumped ingredients all over the floor. K-means sorts them onto k shelves (the ✕ marks) with two alternating moves: ASSIGN each item to its nearest shelf, then UPDATE each shelf to the middle of its pile. When the shelves stop moving, that's convergence.


FIG V.2: K-MEANS CLUSTERING — FAINT LINES SHOW CURRENT MEMBERSHIP

The Recipe

CODE
REQUIRED SPICES: k-means, centroids, inertia, unsupervised, initialisation
K-Means with the safety rails on
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# dishes: assumed to be a numeric array of shape (n_dishes, n_features) loaded earlier
X = StandardScaler().fit_transform(dishes)     # distance-based → scale!
km = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)

print(km.labels_[:10])        # which buffet table each dish joined
print(km.inertia_)            # sum of squared distances to the nearest centroid (lower = tighter)
# sweep k and plot inertia → the 'elbow' suggests how many tables exist