Code & Curry ML
CURRICULUM: S5 · E4 · 15 min
VERIFIED BLUEPRINT

PCA: The Carving Angle

TIME: 15 min
🍽️ YIELD: 1 dimensionality reducer + the variance-kept instinct
📓 CHAPTER: S5E4

The Idea

CONCEPT
An elongated cloud of dishes with a skewer rotating through its centre. Each dish drops a perpendicular shadow onto the skewer. At the wrong angle the shadows bunch up (information lost); at PC1 they spread widest. Margin: 'compression = keeping the widest shadow.'

PCA asks one question: along which direction does the data vary most? That direction is PC1. PC2 is the direction of greatest remaining variance among those at 90° to PC1.
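The question can be answered by brute force on a small cloud. The sketch below is only an illustration: the 55 synthetic dishes, the column names and the 0.5° grid are assumptions, not the lab's actual data. It sweeps the skewer through every angle, records the variance of the shadows, and keeps the widest one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the lab's data: 55 dishes, two correlated flavours.
richness = rng.normal(size=55)
tang = 0.8 * richness + 0.4 * rng.normal(size=55)
X = np.column_stack([richness, tang])
X = X - X.mean(axis=0)               # centre the cloud on the skewer's pivot


def variance_along(angle_deg: float) -> float:
    """Variance of the shadows cast on a skewer at the given angle."""
    theta = np.deg2rad(angle_deg)
    direction = np.array([np.cos(theta), np.sin(theta)])   # unit vector
    shadows = X @ direction                                  # one number per dish
    return float(shadows.var(ddof=1))


angles = np.arange(0.0, 180.0, 0.5)
variances = np.array([variance_along(a) for a in angles])
best_angle = float(angles[variances.argmax()])

total_variance = float(np.trace(np.cov(X, rowvar=False)))
pc1_share = variance_along(best_angle) / total_variance
pc2_share = variance_along(best_angle + 90.0) / total_variance

print(f"widest shadow at {best_angle:.1f} deg, keeping {pc1_share:.1%}")
print(f"the perpendicular keeps the remaining {pc2_share:.1%}")
```

In two dimensions the two shares always sum to 100%, which is why PC2 needs no search of its own once PC1 is fixed.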

Projecting onto a few components compresses each row to a handful of numbers while keeping most of the variance: the recipe card instead of the full pantry.

Standardise first: variance is scale-sensitive, so a column measured in milligrams will dominate PC1 purely because its numbers are large.
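A small experiment makes the scale problem visible. This is a sketch on made-up data: the three columns (`sweet`, `rich`, `salt_mg`) and their distributions are assumptions chosen so that one column has much larger raw numbers than the others.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n = 200

# Hypothetical columns: two correlated flavour scores on a small scale,
# plus an unrelated salt column measured in milligrams (large numbers).
sweet = rng.normal(5.0, 1.0, n)
rich = 0.9 * sweet + rng.normal(0.0, 0.4, n)
salt_mg = rng.normal(1500.0, 300.0, n)
dishes = np.column_stack([sweet, rich, salt_mg])

# PC1 weights on [sweet, rich, salt_mg], without and with standardisation.
raw_pc1 = PCA(n_components=1).fit(dishes).components_[0]
scaled = StandardScaler().fit_transform(dishes)
scaled_pc1 = PCA(n_components=1).fit(scaled).components_[0]

print("PC1 weights, raw:   ", np.round(np.abs(raw_pc1), 3))
print("PC1 weights, scaled:", np.round(np.abs(scaled_pc1), 3))
```

On the raw data PC1 points almost entirely along the milligram column, even though that column is unrelated to the other two. After `StandardScaler`, PC1 should instead pick up the shared sweet/rich direction.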

The components are eigenvectors of the covariance matrix. Each one is a linear combination of the original features, which costs some interpretability.
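The eigenvector claim can be checked by hand with NumPy alone. The sketch below assumes a small synthetic dataset (the `mixing` matrix and sample size are hypothetical) and uses `np.linalg.eigh`, which is intended for symmetric matrices such as a covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: 100 dishes, 3 flavours mixed from 3 hidden factors.
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.5, 1.0, 0.3],
                   [0.0, 0.3, 0.5]])
X = rng.normal(size=(100, 3)) @ mixing
X = X - X.mean(axis=0)                       # centre each column

cov = np.cov(X, rowvar=False)                # 3 x 3 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues come back ascending
order = np.argsort(eigvals)[::-1]            # largest variance first
eigvals = eigvals[order]
components = eigvecs[:, order].T             # rows are PC1, PC2, PC3

Z = X @ components.T                         # scores: one column per component

print("eigenvalues:           ", np.round(eigvals, 3))
print("variance of each score:", np.round(Z.var(axis=0, ddof=1), 3))
print("PC1 as a feature mix:  ", np.round(components[0], 3))
```

Each eigenvalue equals the variance of the data projected onto its eigenvector, and the last line shows the interpretability cost: PC1 is a weighted blend of all three flavours rather than any single one.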

In the Test Kitchen: rotate the skewer yourself and try to beat FIND PC1. You cannot — that is the eigenvector's whole job.

⚗️ The Test Kitchen

INTERACTIVE LAB

Don't just read the recipe — taste it. Drag, click and break things below.

EXP 01

The Carving Angle

"keep the widest shadow"

Fifty-five dishes plotted by two correlated flavours. You may keep only one number per dish: its shadow on the amber skewer. Grab the skewer and rotate it — every angle you try is traced onto the variance curve below. PCA is nothing more than the angle at the top of that curve. Stuck? FIND PC1 lets the eigenvector glide there itself.

[Interactive plot: richness (x-axis) vs tang (y-axis). Below it: VARIANCE KEPT vs SKEWER ANGLE, traced as you rotate. Readout at capture: 90°, VARIANCE KEPT 34.7%]

FIG L.9: PCA — PROJECT TO ONE DIMENSION, KEEP MAXIMUM VARIANCE. PC2 IS SIMPLY THE PERPENDICULAR (DASHED)

EXP 02

The Great Flatten

"watch 2-D become 1-D"

Same fifty-five dishes — now watch the compression actually happen. Each dish slides onto the skewer (the red threads are the information thrown away), then the skewer lays flat and the whole dataset becomes one number per dish. Run it along PC1, then along a bad angle: same data, same recipe, wildly different amount of flavour kept.


FIG L.10: DIMENSIONALITY REDUCTION = PROJECT, THEN FORGET THE PERPENDICULAR. PCA PICKS THE DIRECTION THAT FORGETS THE LEAST
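The flatten step can be written out in a few lines. This is a sketch under assumptions: the dishes are synthetic, `flatten` is a hypothetical helper, and the two angles are illustrative rather than values read from the lab. It splits each dish into its shadow on the skewer and the perpendicular thread that gets thrown away.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical stand-in for the fifty-five dishes.
richness = rng.normal(size=55)
tang = 0.8 * richness + 0.4 * rng.normal(size=55)
X = np.column_stack([richness, tang])
X = X - X.mean(axis=0)


def flatten(angle_deg: float):
    """Project onto the skewer; return the 1-D scores and the discarded part."""
    theta = np.deg2rad(angle_deg)
    u = np.array([np.cos(theta), np.sin(theta)])   # unit vector along the skewer
    scores = X @ u                       # one number per dish
    shadows = np.outer(scores, u)        # where each dish lands on the skewer
    threads = X - shadows                # the perpendicular part thrown away
    return scores, threads


total = float((X ** 2).sum())
kept_by_angle = {}
for angle in (35.0, 125.0):              # illustrative: one good angle, one bad
    scores, threads = flatten(angle)
    kept = float((scores ** 2).sum()) / total
    lost = float((threads ** 2).sum()) / total
    kept_by_angle[angle] = kept
    print(f"{angle:5.1f} deg: kept {kept:.1%}, discarded {lost:.1%}")
```

Kept and discarded always add up to 100%, so choosing the direction that keeps the most is the same as choosing the one that forgets the least.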

The Recipe

CODE
REQUIRED SPICES: PCA · explained variance · projection · eigenvectors · standardisation
Thirty flavours, two numbers
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# dishes: array of shape (n_dishes, 30), assumed to be loaded already
X = StandardScaler().fit_transform(dishes)   # PCA chases variance —
pca = PCA(n_components=2).fit(X)             # unscaled features cheat

print(pca.explained_variance_ratio_)         # e.g. [0.87, 0.09]
Z = pca.transform(X)                         # 30 flavours → 2 numbers/dish
# rule of thumb: keep enough components for ~95% cumulative variance
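One way to apply the 95% rule of thumb is sketched below. The `dishes` array here is a synthetic stand-in (200 rows, 30 columns driven by 4 hidden factors, all assumed for illustration). The first option reads the cumulative curve directly; the second relies on scikit-learn accepting a fraction between 0 and 1 for `n_components` when `svd_solver="full"`.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)

# Hypothetical stand-in for `dishes`: 200 dishes, 30 flavours, 4 hidden factors.
factors = rng.normal(size=(200, 4))
loadings = rng.normal(size=(4, 30))
dishes = factors @ loadings + 0.3 * rng.normal(size=(200, 30))

X = StandardScaler().fit_transform(dishes)

# Option 1: fit every component, then read the cumulative curve.
full = PCA().fit(X)
cumulative = np.cumsum(full.explained_variance_ratio_)
k = int(np.searchsorted(cumulative, 0.95, side="right") + 1)

# Option 2: hand scikit-learn the target fraction and let it choose.
auto = PCA(n_components=0.95, svd_solver="full").fit(X)

print("components needed for 95%:", k)
print("chosen by scikit-learn:   ", auto.n_components_)
print("variance actually kept:   ", round(float(cumulative[k - 1]), 4))
```

Plotting `cumulative` against the component count is often worth the extra line: a sharp elbow suggests the data has a few dominant directions, while a slow climb suggests 95% may be an expensive target.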