This is a short distillation of the motivation for Kullback-Leibler (KL) divergence.

It is based on this nice video by Kapil Sachdeva.

(I’m having trouble getting all of the MathJax to render properly, so this is an image for now.)