A distribution is the recipe; your dataset is a finite number of scoops. Small samples lie about the pot — the histogram only converges with volume.
Normal shows up wherever many small independent effects add up (measurement noise); exponential where you wait for events that arrive independently at a steady average rate; uniform where every value in a bounded range is equally plausible.
The Central Limit Theorem says averages of many independent draws from almost any distribution with finite variance come out approximately normal, which is why the normal earns its name even though most raw data is not normal.
Skewed pot? Log-transform or Box-Cox it (both need strictly positive values) before feeding linear models; their standard error estimates and tests assume roughly symmetric noise.
In the Test Kitchen: 10 scoops look like noise, 500 scoops become the purple curve. That gap is why sample size matters.
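The skewed-pot advice above can be checked with a small sketch. The data here is hypothetical (log-normal draws standing in for a right-skewed feature), and `skewness` is a helper defined only for this illustration, not a NumPy function.

```python
import numpy as np

rng = np.random.default_rng(0)

def skewness(x):
    """Sample skewness: the mean cubed z-score (near 0 for a symmetric shape)."""
    z = (x - x.mean()) / x.std()
    return float((z ** 3).mean())

# Hypothetical skewed pot: strictly positive values with a long right tail
raw = rng.lognormal(mean=0.0, sigma=0.8, size=10_000)

# The log transform requires strictly positive inputs
logged = np.log(raw)

print(f"skew before log: {skewness(raw):.2f}")
print(f"skew after log:  {skewness(logged):.2f}")
```

The log works cleanly here because the example data is log-normal by construction; real data rarely cooperates that neatly. Box-Cox (for instance `scipy.stats.boxcox`) fits a power transform instead of fixing it to the log, and it also needs strictly positive values.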
Don't just read the recipe — taste it. Drag, click and break things below.
The pot has a true recipe — the purple curve. Each ladle scoop is one random draw from it. A handful of scoops looks like noise; a few hundred and the histogram becomes the curve. Models assume a shape for this curve, so try all three shapes before trusting one.
FIG L.4: SAMPLING — THE EMPIRICAL HISTOGRAM CONVERGES TO THE TRUE DENSITY (PURPLE)
import numpy as np

rng = np.random.default_rng(42)

heights = rng.normal(5, 1.2, 10_000)      # bell: errors, heights
waits   = rng.exponential(2.0, 10_000)    # ski slope: time between orders
picks   = rng.uniform(2, 8, 10_000)       # flat: anything equally likely

# CLT party trick: means of ugly samples look normal anyway
means = rng.exponential(2.0, (10_000, 30)).mean(axis=1)
print(means.mean(), means.std())          # ≈ 2.0, ≈ 2/sqrt(30)
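To put a number on the gap between 10 scoops and 500 scoops, here is a minimal sketch that compares histogram heights against the true density. It assumes the bell-shaped recipe (mean 5, standard deviation 1.2); the bin layout, the repeat count, and the `histogram_error` helper are choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(7)

# 20 equal-width bins on [0, 10]; compare at the bin centers
bins = np.linspace(0, 10, 21)
centers = (bins[:-1] + bins[1:]) / 2

# True recipe (assumed): normal density with mean 5 and standard deviation 1.2
mu, sigma = 5.0, 1.2
true_pdf = np.exp(-0.5 * ((centers - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def histogram_error(n, repeats=200):
    """Average absolute gap between histogram heights and the true density."""
    errors = []
    for _ in range(repeats):
        scoops = rng.normal(mu, sigma, n)
        # density=True rescales the bar heights so the bars integrate to 1
        heights = np.histogram(scoops, bins=bins, density=True)[0]
        errors.append(np.abs(heights - true_pdf).mean())
    return float(np.mean(errors))

for n in (10, 500):
    print(n, round(histogram_error(n), 4))
```

Averaging over repeats keeps one lucky or unlucky handful of scoops from deciding the comparison. The gap never reaches exactly zero, because a bar's height is an average over its bin while the curve is evaluated at a single point.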