PCA asks one question: along which direction does the data vary most? That direction is PC1; PC2 is the best remaining direction at 90° to it.
Projecting onto few components compresses each row to a handful of numbers while keeping most of the variance — the recipe card instead of the full pantry.
Standardise first: variance is scale-sensitive, and a milligram column will hog PC1 for free otherwise.
The components are eigenvectors of the covariance matrix (computed on the standardised data, if you scaled first). Each one is a linear combination of the original features, which costs some interpretability.
In the Test Kitchen: rotate the skewer yourself and try to beat FIND PC1. You cannot — that is the eigenvector's whole job.
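The standardisation point in the recap can be checked with a few lines of plain Python. This is a minimal sketch, not the Test Kitchen's own code: the `salt_g` and `sugar_g` columns, the seed and the helper names are all made up for illustration. It measures the angle of PC1 for two correlated columns, then re-expresses one column in milligrams.

```python
import math
import random


def pc1_angle_deg(xs, ys):
    """Angle of PC1 in degrees from the x-axis, for two columns of data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Closed form for the leading eigenvector of [[sxx, sxy], [sxy, syy]].
    return math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy))


def standardise(values):
    """Shift to zero mean and rescale to unit sample variance."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


rng = random.Random(0)
# Hypothetical data: 55 dishes, two correlated flavours, both in grams.
salt_g = [rng.gauss(5.0, 1.0) for _ in range(55)]
sugar_g = [s + rng.gauss(0.0, 0.5) for s in salt_g]
sugar_mg = [1000.0 * g for g in sugar_g]  # same information, different unit

angle_same_units = pc1_angle_deg(salt_g, sugar_g)
angle_mixed_units = pc1_angle_deg(salt_g, sugar_mg)
angle_standardised = pc1_angle_deg(standardise(salt_g), standardise(sugar_mg))

print(f"grams vs grams:      {angle_same_units:.2f} deg")
print(f"grams vs milligrams: {angle_mixed_units:.2f} deg")
print(f"both standardised:   {angle_standardised:.2f} deg")
```

With the milligram column, PC1 should swing almost onto that column's axis even though nothing about the dishes has changed. After standardising, both columns have unit variance, so for two positively correlated features PC1 sits on the diagonal.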
Don't just read the recipe — taste it. Drag, click and break things below.
Fifty-five dishes plotted by two correlated flavours. You may keep only one number per dish: its shadow on the amber skewer. Grab the skewer and rotate it — every angle you try is traced onto the variance curve below. PCA is nothing more than the angle at the top of that curve. Stuck? FIND PC1 lets the eigenvector glide there itself.
FIG L.9: PCA — PROJECT TO ONE DIMENSION, KEEP MAXIMUM VARIANCE. PC2 IS SIMPLY THE PERPENDICULAR (DASHED)
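The skewer experiment can be reproduced offline. The sketch below assumes synthetic stand-in data (55 points, two correlated columns, an arbitrary seed) rather than the figure's real dishes. It sweeps the skewer through every angle in 0.1 degree steps, records the variance of the shadows, and compares the best swept angle with the one given by the leading eigenvector of the 2x2 covariance matrix.

```python
import math
import random

rng = random.Random(1)
# Hypothetical stand-in for the fifty-five dishes: two correlated flavours.
xs = [rng.gauss(0.0, 1.0) for _ in range(55)]
ys = [0.8 * x + rng.gauss(0.0, 0.4) for x in xs]

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
cx = [x - mx for x in xs]  # centred columns
cy = [y - my for y in ys]


def projected_variance(theta):
    """Variance of the shadows cast on a skewer at angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    shadows = [c * x + s * y for x, y in zip(cx, cy)]
    return sum(t * t for t in shadows) / (n - 1)


# Brute force: the variance curve, sampled every 0.1 degree over half a turn.
angles = [math.radians(d / 10) for d in range(1800)]
best_swept = max(angles, key=projected_variance)

# Eigenvector route: closed form for a symmetric 2x2 covariance matrix.
sxx = sum(x * x for x in cx) / (n - 1)
syy = sum(y * y for y in cy) / (n - 1)
sxy = sum(x * y for x, y in zip(cx, cy)) / (n - 1)
theta_pc1 = 0.5 * math.atan2(2 * sxy, sxx - syy)
top_eigenvalue = (sxx + syy) / 2 + math.hypot((sxx - syy) / 2, sxy)

print(f"best swept angle: {math.degrees(best_swept):.1f} deg")
print(f"eigenvector angle: {math.degrees(theta_pc1):.2f} deg")
print(f"variance at PC1: {projected_variance(theta_pc1):.4f}")
print(f"top eigenvalue:  {top_eigenvalue:.4f}")
```

No swept angle should exceed the top eigenvalue, and the variance measured along the eigenvector angle should equal it up to rounding. That is the numerical version of "you cannot beat FIND PC1".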
Same fifty-five dishes — now watch the compression actually happen. Each dish slides onto the skewer (the red threads are the information thrown away), then the skewer lays flat and the whole dataset becomes one number per dish. Run it along PC1, then along a bad angle: same data, same recipe, wildly different amount of flavour kept.
FIG L.10: DIMENSIONALITY REDUCTION = PROJECT, THEN FORGET THE PERPENDICULAR. PCA PICKS THE DIRECTION THAT FORGETS THE LEAST
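"Project, then forget the perpendicular" can be written out as bookkeeping. The following sketch again uses made-up data and a hypothetical `compress` helper. For a given skewer angle it splits each centred point into the one number kept and the leftover vector (the red thread), then totals the variance in each part.

```python
import math
import random

rng = random.Random(2)
# Hypothetical stand-in for the fifty-five dishes.
xs = [rng.gauss(0.0, 1.0) for _ in range(55)]
ys = [0.8 * x + rng.gauss(0.0, 0.4) for x in xs]

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
cx = [x - mx for x in xs]
cy = [y - my for y in ys]


def compress(theta):
    """Project onto a skewer at angle theta; return (kept, discarded) variance."""
    c, s = math.cos(theta), math.sin(theta)
    kept = discarded = 0.0
    for x, y in zip(cx, cy):
        z = c * x + s * y               # the one number kept for this dish
        rx, ry = x - z * c, y - z * s   # the red thread: what gets thrown away
        kept += z * z
        discarded += rx * rx + ry * ry
    return kept / (n - 1), discarded / (n - 1)


sxx = sum(x * x for x in cx) / (n - 1)
syy = sum(y * y for y in cy) / (n - 1)
sxy = sum(x * y for x, y in zip(cx, cy)) / (n - 1)
total_variance = sxx + syy
theta_pc1 = 0.5 * math.atan2(2 * sxy, sxx - syy)

kept_pc1, lost_pc1 = compress(theta_pc1)
# An arbitrary "bad angle", 60 degrees away from PC1.
kept_bad, lost_bad = compress(theta_pc1 + math.radians(60))

print(f"PC1:       kept {kept_pc1 / total_variance:.1%} of the variance")
print(f"bad angle: kept {kept_bad / total_variance:.1%} of the variance")
```

Kept plus discarded always adds up to the total variance, whatever the angle. Choosing the direction that keeps the most is therefore the same as choosing the one that forgets the least.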
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# dishes: array of shape (n_dishes, 30), one row per dish (assumed defined elsewhere)
# PCA chases variance, so unscaled features cheat: standardise first
X = StandardScaler().fit_transform(dishes)
pca = PCA(n_components=2).fit(X)
print(pca.explained_variance_ratio_)   # e.g. [0.87, 0.09]
Z = pca.transform(X)                   # 30 flavours → 2 numbers/dish
# rule of thumb: keep enough components for ~95% cumulative variance
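The 95% rule of thumb amounts to a running sum. Below is a small sketch with a hypothetical helper, `components_for`, and made-up ratios that extend the "e.g. [0.87, 0.09]" above with two invented trailing values. In practice the ratios would come from `explained_variance_ratio_` of a PCA fitted with all components kept.

```python
from itertools import accumulate


def components_for(ratios, target=0.95):
    """Smallest number of leading components whose cumulative ratio reaches target."""
    for k, cumulative in enumerate(accumulate(ratios), start=1):
        if cumulative >= target:
            return k
    # Target not reachable with the ratios supplied: fall back to all of them.
    return len(ratios)


# Hypothetical explained-variance ratios, largest first (illustration only).
ratios = [0.87, 0.09, 0.03, 0.01]
k = components_for(ratios)
print(k)
```

scikit-learn's `PCA` also accepts a fraction between 0 and 1 for `n_components` and makes a similar selection internally. The explicit loop is shown here only to make the rule visible.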