Notes

I draw visuals to understand concepts. Friends have sometimes found them helpful, so I thought I'd make them public.

These are raw notes from when I was learning each subject (sometimes many, many years ago!), so they may contain mistakes. If you have a question about something, feel free to ask.

How probability distributions relate to information & entropy
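For quick reference (my shorthand here, not necessarily the notation in the visual): entropy is the expected information content of a sample, where rarer outcomes carry more information.

```latex
H(X) = \mathbb{E}\left[-\log p(X)\right] = -\sum_{x} p(x) \log p(x)
```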

Why continuous distributions have infinite entropy
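A one-line sketch of the argument: discretize a density $f$ into bins of width $\Delta$; the discrete entropy picks up a $-\log\Delta$ term that diverges as the bins shrink, leaving the differential entropy $h(X)$ as the only finite part.

```latex
H(X_\Delta) \approx -\sum_i f(x_i)\,\Delta \,\log\!\big(f(x_i)\,\Delta\big)
  = \underbrace{-\sum_i f(x_i)\,\Delta\,\log f(x_i)}_{\to\; h(X)} \;-\; \log\Delta
  \;\xrightarrow[\Delta \to 0]{}\; \infty
```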

Cross-entropy
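The way I think of it: cross-entropy is the expected code length when you encode samples from $p$ using a code optimized for $q$, so it decomposes into the entropy of $p$ plus the KL penalty for using the wrong distribution.

```latex
H(p, q) = -\sum_{x} p(x) \log q(x) = H(p) + D_{\mathrm{KL}}(p \,\|\, q)
```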

Conditional entropy
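In symbols: conditional entropy is the remaining uncertainty in $Y$ once $X$ is known, which is also the chain-rule gap between the joint and marginal entropies.

```latex
H(Y \mid X) = -\sum_{x, y} p(x, y) \log p(y \mid x) = H(X, Y) - H(X)
```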

Independence
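Several equivalent ways to say it (for discrete $X$ and $Y$):

```latex
X \perp Y
\;\iff\; p(x, y) = p(x)\,p(y) \text{ for all } x, y
\;\iff\; H(X, Y) = H(X) + H(Y)
\;\iff\; I(X; Y) = 0
```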

That moment when it clicks why the KL divergence between P(X,Y) and P(X)P(Y) equals the mutual information I(X;Y)! Thanks to Chris Olah for the rain/sun, coat/no-coat joint distribution example shown here.
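The identity in one line: plugging the product of marginals into the KL definition gives exactly the usual sum for mutual information, so I(X;Y) measures how far the joint is from what independence would predict.

```latex
D_{\mathrm{KL}}\big(P(X,Y) \,\|\, P(X)P(Y)\big)
  = \sum_{x, y} p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)}
  = I(X; Y)
```

And a minimal numerical sketch in the same spirit as the rain/sun, coat/no-coat example; the joint probabilities below are made up for illustration, not taken from the original visual.

```python
import numpy as np

# Hypothetical 2x2 joint distribution over (weather, clothing).
# Rows: rain, sun. Columns: coat, no coat. Illustrative numbers only.
joint = np.array([[0.30, 0.05],
                  [0.15, 0.50]])

p_weather = joint.sum(axis=1, keepdims=True)  # marginal P(X), shape (2, 1)
p_clothes = joint.sum(axis=0, keepdims=True)  # marginal P(Y), shape (1, 2)
independent = p_weather * p_clothes           # P(X)P(Y): the joint if X, Y were independent

# Mutual information as KL(P(X,Y) || P(X)P(Y)), in bits.
mi = float(np.sum(joint * np.log2(joint / independent)))
print(f"I(X;Y) = {mi:.4f} bits")
```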