Code & Curry ML
CURRICULUM: S2 · E8 · 12 min
VERIFIED BLUEPRINT

Distance & Similarity: How Far Is That Flavour?

TIME: 12 min
🍽️ YIELD: 1 metric-picking instinct for KNN, K-Means & embeddings
📓 CHAPTER: S2E8

The Idea

CONCEPT
Two dishes A and B on a chilli/sweetness map. Three routes drawn between them: straight purple line (Euclidean), amber staircase along the counters (Manhattan), and a teal angle arc from the origin (cosine — direction only). Margin: 'double the portion, cosine doesn't blink.'

Distance is a modelling decision, not a fact. KNN, K-Means and vector search all inherit whichever ruler you hand them.

Euclidean cares about magnitude; cosine only about direction — which is why text embeddings (long vs short documents) almost always use cosine.
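To taste the long-vs-short-document effect, here is a minimal sketch. The four-word vocabulary, the term counts and the helper name `cosine_sim` are all made up for illustration; real text embeddings are far higher-dimensional.

```python
import numpy as np

def cosine_sim(u, v):
    """Cosine similarity: dot product divided by the product of the vector lengths."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical term counts over the vocabulary [curry, chilli, sugar, cake].
short_curry = np.array([2.0, 1.0, 0.0, 0.0])    # short note about curry
long_curry = np.array([20.0, 12.0, 1.0, 0.0])   # long article on the same topic
short_cake = np.array([0.0, 1.0, 2.0, 1.0])     # short note about dessert

eucl_to_long = float(np.linalg.norm(short_curry - long_curry))
eucl_to_cake = float(np.linalg.norm(short_curry - short_cake))
cos_to_long = cosine_sim(short_curry, long_curry)
cos_to_cake = cosine_sim(short_curry, short_cake)

# Euclidean is swayed by length: the dessert note looks nearer than the long curry article.
print(eucl_to_long > eucl_to_cake)   # True
# Cosine ignores length: the long curry article is the closer match.
print(cos_to_long > cos_to_cake)     # True
```

Under these assumed counts, the two rulers disagree about which document is the nearest neighbour of the short curry note.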

Unscaled features rig the vote: a column measured in grams out-shouts one in kilograms simply because its numbers are larger. Standardise features before fitting any distance-based model.
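A small sketch of the rigged vote, assuming four invented dishes with sugar in grams and chilli in kilograms. The data and the helper `nearest_to_first` are hypothetical, and z-scoring is only one common way to standardise.

```python
import numpy as np

# Hypothetical dishes: column 0 is sugar in grams, column 1 is chilli in kilograms.
X = np.array([
    [100.0, 0.001],   # row 0: the query dish
    [105.0, 0.009],   # row 1: similar sugar, very different chilli
    [140.0, 0.001],   # row 2: different sugar, identical chilli
    [300.0, 0.005],   # row 3: far away on both
])

def nearest_to_first(data):
    """Row index (1, 2 or 3) closest to row 0 under Euclidean distance."""
    dists = np.linalg.norm(data[1:] - data[0], axis=1)
    return int(np.argmin(dists)) + 1

# Standardise: subtract each column's mean, then divide by its standard deviation.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

raw_choice = nearest_to_first(X)        # the grams column decides almost alone
std_choice = nearest_to_first(X_std)    # both columns now get a comparable say
print(raw_choice, std_choice)           # 1 2
```

On the raw numbers the chilli column barely registers, so row 1 wins; after standardising, its large chilli gap counts and row 2 becomes the neighbour.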

In high dimensions, pairwise distances concentrate: the nearest and the farthest point end up almost equally far away (the curse of dimensionality), so distances lose contrast and neighbourhoods lose meaning.
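The loss of contrast can be sketched with random points. This assumes uniformly random data in a unit cube and a made-up helper, `relative_contrast`; real datasets with structure usually degrade more gently.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable

def relative_contrast(dim, n_points=500):
    """(farthest - nearest) / nearest distance from one random query to random points."""
    points = rng.random((n_points, dim))
    query = rng.random(dim)
    dists = np.linalg.norm(points - query, axis=1)
    return float((dists.max() - dists.min()) / dists.min())

contrasts = {dim: relative_contrast(dim) for dim in (2, 10, 100, 1000)}
for dim, contrast in contrasts.items():
    print(f"{dim:>4} dims: relative contrast = {contrast:.2f}")
```

A large value means the nearest point is clearly nearer than the farthest; as the dimension grows the value shrinks towards zero, which is the "equally far apart" effect.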

In the Test Kitchen: park A and B on the same ray from the origin — Euclidean stays large while cosine hits 1.000.
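The same-ray experiment can also be tried outside the lab. A minimal sketch, with coordinates chosen purely for illustration:

```python
import numpy as np

a = np.array([2.0, 3.0])   # hypothetical dish A
b = 3 * a                  # dish B on the same ray from the origin: (6, 9)

eucl = float(np.linalg.norm(a - b))
cos = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(round(eucl, 2))   # 7.21: the points are far apart
print(round(cos, 3))    # 1.0: they point in exactly the same direction
```

Any positive multiple of A gives the same cosine, which is the "double the portion" effect in numbers.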

⚗️ The Test Kitchen

INTERACTIVE LAB

Don't just read the recipe — taste it. Drag, click and break things below.

EXP 01

How Far Apart Are Two Flavours?

"distance is a design decision"

Two dishes, A and B, plotted by chilli and sweetness. Click to move whichever is nearer your cursor. Euclidean (purple) is the straight walk between them; Manhattan (amber) walks the counter-tops; cosine (teal arc) asks only "do they point the same way from bland (0,0)?", so doubling a recipe leaves cosine unchanged. The choice of metric changes who your neighbours are.

Axes: chilli → (horizontal), sweetness ↑ (vertical); points A and B
EUCLIDEAN (L2): 5.41
MANHATTAN (L1): 7.50
COSINE SIM: 0.738

FIG L.10: DISTANCE METRICS — SAME TWO DISHES, THREE DIFFERENT ANSWERS. KNN & K-MEANS INHERIT THIS CHOICE

The Recipe

CODE
REQUIRED SPICES: euclidean · manhattan · cosine · normalisation · curse of dimensionality
One pair of dishes, three answers
import numpy as np
a, b = np.array([2.5, 6.5]), np.array([7.0, 3.5])

eucl = np.linalg.norm(a - b)                       # 5.41 straight line
manh = np.abs(a - b).sum()                         # 7.50 city blocks
cos  = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))   # 0.738 direction only
print(round(eucl, 2), manh, round(cos, 3))               # 5.41 7.5 0.738
cos_doubled = (2 * a) @ b / (np.linalg.norm(2 * a) * np.linalg.norm(b))   # scale a ×2
print(np.allclose(cos, cos_doubled))                     # True: cosine unchanged