Logistic Regression: Golden or Burnt?

⏱TIME: 15 min

🍽️YIELD: 1 probability machine (not just a classifier)

📓CHAPTER: S4E3

The Idea

CONCEPT

Pakoras along a fry-time axis: golden ones sitting on the floor (y=0), burnt ones on the ceiling (y=1). An S-shaped sigmoid sweeps between them; a vertical amber line where p=0.5 labelled 'decision boundary'. Margin: 'the model outputs a probability — the threshold is YOUR choice.'

Despite the name, it is a classifier: it passes a linear score w·t + b through the sigmoid, so the output lands in (0, 1) and reads as a probability.
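A minimal sketch of that two-step pipeline, assuming a single feature (fry time in minutes). The weight `w` and intercept `b` below are hypothetical values picked for illustration, not fitted ones.

```python
import numpy as np

def sigmoid(z):
    """Squash any real-valued score into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weight and intercept, chosen only for illustration.
w, b = 1.0, -5.0

fry_minutes = np.array([1.0, 5.0, 9.0])
score = w * fry_minutes + b      # linear score: any real number
p_burnt = sigmoid(score)         # same ordering, now squeezed into (0, 1)
print(p_burnt)
```

With these made-up numbers the score is zero at 5 minutes, so p(burnt) there is 0.5; shorter fries fall below it and longer ones rise above it.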

It trains on log-loss, which punishes confident wrong answers brutally; accuracy is computed afterwards, at whatever threshold you pick.
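To see how brutal "brutally" is, here is a small sketch of the per-example log-loss (binary cross-entropy with the natural log). The three probabilities are invented for illustration; all three pakoras are assumed to be truly burnt.

```python
import numpy as np

def log_loss_per_example(y_true, p_pred):
    """Binary cross-entropy for each example, using the natural log."""
    y_true = np.asarray(y_true, dtype=float)
    p_pred = np.asarray(p_pred, dtype=float)
    return -(y_true * np.log(p_pred) + (1 - y_true) * np.log(1 - p_pred))

# Three predictions for pakoras that really are burnt (y = 1).
y = [1, 1, 1]
p = [0.99, 0.60, 0.01]   # confident and right, hesitant and right, confident and wrong
losses = log_loss_per_example(y, p)
print(losses)
```

The losses come out at roughly 0.01, 0.51 and 4.6. Accuracy at a 0.5 threshold would score the first two identically as "correct", while log-loss separates them and charges the confident miss several times more than the hesitant hit.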

The default 0.5 threshold is not sacred: a kitchen that fears burnt food moves it down, accepting more false alarms in exchange for fewer missed burnt batches.
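A sketch of that trade, using made-up probabilities and labels (neither comes from a fitted model). The helper `count_errors` is a hypothetical name introduced here.

```python
import numpy as np

# Hypothetical p(burnt) values and true labels, for illustration only.
p_burnt = np.array([0.10, 0.35, 0.45, 0.55, 0.80])
y_true = np.array([0, 0, 1, 1, 1])

def count_errors(p, y, threshold):
    """Return (missed burnt, false alarms) when calling 'burnt' at p >= threshold."""
    y_pred = (p >= threshold).astype(int)
    missed_burnt = int(np.sum((y_pred == 0) & (y == 1)))
    false_alarms = int(np.sum((y_pred == 1) & (y == 0)))
    return missed_burnt, false_alarms

for threshold in (0.5, 0.3):
    print(threshold, count_errors(p_burnt, y_true, threshold))
```

On this toy batch, lowering the threshold from 0.5 to 0.3 catches the burnt pakora that was missed, at the price of flagging one golden pakora. The probabilities never change; only the verdicts do.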

Coefficients live in log-odds: each extra minute multiplies the odds of a burnt pakora, p/(1−p), by e^w, where w is the fry-time coefficient.
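A quick numerical check of that claim, assuming a one-feature model p = σ(w·t + b). The values of `w`, `b` and `t` are hypothetical; any others should give the same ratio.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def odds(p):
    """Odds of burnt: p / (1 - p)."""
    return p / (1.0 - p)

w, b = 0.7, -3.5   # hypothetical coefficient and intercept
t = 4.0            # any starting fry time, in minutes

odds_now = odds(sigmoid(w * t + b))
odds_next = odds(sigmoid(w * (t + 1) + b))   # one extra minute in the oil
ratio = odds_next / odds_now
print(ratio, np.exp(w))                      # the two numbers should agree
```

The ratio does not depend on the starting time `t`, which is why a coefficient can be read as a single odds multiplier. The probability itself does not change by a fixed amount per minute; only the odds scale that cleanly.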

In the Test Kitchen: minimise the log-loss by hand with two sliders — then appreciate what the optimiser does in milliseconds.

⚗️ The Test Kitchen

INTERACTIVE LAB

Don't just read the recipe — taste it. Drag, click and break things below.

EXP 01

Golden or Burnt?

"be the optimiser for a minute"

Dots on the floor are golden pakoras (0), dots on the ceiling are burnt ones (1). The purple S-curve is your model's p(burnt). Slide the midpoint and steepness to push the log-loss as low as you can: you are doing, by hand, what logistic regression's optimiser does.

MIDPOINT m: 3.0m · STEEPNESS k: 0.8

log-loss 0.517 · accuracy 69%

FIG L.5: LOGISTIC REGRESSION — σ(k(t−m)) MAPS FRY TIME TO PROBABILITY; THE BOUNDARY SITS WHERE p = 0.5
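For readers without the sliders, here is a rough offline version of the experiment. It assumes the eight pakoras from the recipe's code stand in for the lab's dots (the lab's own data is not listed, so the readout will differ from the one above), and it replaces a real optimiser with a crude grid search. The function names are hypothetical. Note that σ(k(t−m)) is the usual w·t + b form with w = k and b = −k·m.

```python
import numpy as np

# Fry times (minutes) and labels borrowed from the recipe's code section.
t = np.array([1.2, 2.4, 3.6, 4.6, 5.2, 5.9, 7.0, 8.4])
y = np.array([0, 0, 0, 1, 0, 1, 1, 1])

def p_burnt(t, m, k):
    """The lab's curve: sigmoid of k * (t - m)."""
    return 1.0 / (1.0 + np.exp(-k * (t - m)))

def mean_log_loss(m, k):
    # Clip to avoid log(0) when the curve is very steep.
    p = np.clip(p_burnt(t, m, k), 1e-12, 1 - 1e-12)
    return float(np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p))))

def accuracy(m, k, threshold=0.5):
    return float(np.mean((p_burnt(t, m, k) >= threshold) == y))

# Slider positions shown in the lab readout.
print(mean_log_loss(3.0, 0.8), accuracy(3.0, 0.8))

# Crude stand-in for the optimiser: try every slider combination on a grid.
grid = [(m, k) for m in np.arange(1.0, 9.01, 0.1) for k in np.arange(0.1, 5.01, 0.1)]
best_m, best_k = min(grid, key=lambda mk: mean_log_loss(*mk))
print(best_m, best_k, mean_log_loss(best_m, best_k), accuracy(best_m, best_k))
```

A real solver reaches a comparable answer by following gradients rather than checking every grid cell, which is where the milliseconds come from.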

The Recipe

CODE

REQUIRED SPICES: logistic regression · sigmoid · log-odds · log-loss · decision boundary

Probabilities, then verdicts

from sklearn.linear_model import LogisticRegression
import numpy as np

X = np.array([[1.2],[2.4],[3.6],[4.6],[5.2],[5.9],[7.0],[8.4]])
y = np.array([0, 0, 0, 1, 0, 1, 1, 1])        # burnt?

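clf = LogisticRegression().fit(X, y)          # defaults include L2 regularisation
print(clf.predict_proba([[5.0]])[0, 1])       # p(burnt) at 5 minutes
print(clf.predict([[5.0]])[0])                # verdict at the default 0.5 threshold
# boundary = where p = 0.5, i.e. w·t + b = 0:
print(-clf.intercept_[0] / clf.coef_[0, 0])   # fry time where p(burnt) crosses 0.5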

