Loss Functions

How does a network know it is wrong?

The Story

You built a network. It produces output. But how do you know if the output is good? That is what a loss function measures — the gap between "what the network predicted" and "what the correct answer should be".

Mean Squared Error

One of the simplest loss functions: take the difference between each prediction and its target, square it, and average the squared differences over all examples.

def mse(predicted, actual):
    # Average of the squared differences; both sequences need the same, non-zero length.
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)
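
As a quick check, here is the same function applied to a handful of made-up numbers. The values in `predicted` and `actual` are assumptions chosen only for illustration; they do not come from any real model or dataset.

```python
def mse(predicted, actual):
    """Mean squared error between two equal-length sequences of numbers."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)

# Hypothetical predictions and targets, picked so the arithmetic is easy to follow.
predicted = [2.5, 0.0, 2.0]
actual = [3.0, -0.5, 2.0]

# Differences: -0.5, 0.5, 0.0  ->  squares: 0.25, 0.25, 0.0  ->  mean: 0.5 / 3
loss = mse(predicted, actual)
print(loss)
```

A perfect prediction gives a loss of exactly 0, and the loss grows as the predictions drift away from the targets.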

Why Square It?

Squaring does two things: (1) it makes every error non-negative, so positive and negative errors cannot cancel each other out, and (2) it punishes big errors disproportionately. A prediction that is off by 4 contributes 16 to the sum, four times as much as the 4 contributed by a prediction that is off by 2.
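
Both effects can be seen with a few lines of arithmetic. This is a minimal sketch; the error values are hypothetical and exist only to make the point.

```python
# Two hypothetical errors of equal size and opposite sign.
errors = [4.0, -4.0]

# Averaging the raw errors hides the problem: they cancel to zero.
mean_signed = sum(errors) / len(errors)

# Averaging the squared errors does not cancel.
mean_squared = sum(e ** 2 for e in errors) / len(errors)

# Doubling the size of an error quadruples its squared penalty.
penalty_small = 2 ** 2
penalty_large = 4 ** 2
ratio = penalty_large / penalty_small

print(mean_signed, mean_squared, ratio)
```

The signed average reports no error at all even though both predictions missed by 4, while the squared average reports 16.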

Cross-Entropy Loss

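For classification problems, we use cross-entropy. It measures how "surprised" the model is by the correct answer: the loss is the negative logarithm of the probability the model assigned to the correct class. A probability close to 1 gives a loss close to 0, and a probability close to 0 gives a very large loss. Lower surprise = better model.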

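import math

def cross_entropy(predicted_probs, true_label_idx):
    # Negative log of the probability assigned to the correct class.
    # That probability must be greater than 0, or math.log raises ValueError.
    return -math.log(predicted_probs[true_label_idx])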
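
To see the "surprise" idea in numbers, the sketch below scores three made-up probability vectors against the same correct class. The vectors and the name `true_label_idx = 1` are assumptions for illustration only.

```python
import math

def cross_entropy(predicted_probs, true_label_idx):
    """Negative log of the probability assigned to the correct class."""
    return -math.log(predicted_probs[true_label_idx])

true_label_idx = 1  # hypothetical: the correct class is the second one

# Three hypothetical model outputs over three classes (each sums to 1).
confident_right = [0.05, 0.90, 0.05]
unsure = [0.30, 0.40, 0.30]
confident_wrong = [0.90, 0.05, 0.05]

loss_right = cross_entropy(confident_right, true_label_idx)   # about 0.105
loss_unsure = cross_entropy(unsure, true_label_idx)           # about 0.916
loss_wrong = cross_entropy(confident_wrong, true_label_idx)   # about 2.996

print(loss_right, loss_unsure, loss_wrong)
```

The loss rises sharply as the probability on the correct class shrinks, so a confidently wrong model is penalized far more than an unsure one.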

Primary Source: 3Blue1Brown — Gradient descent, how neural networks learn