Probability Distribution3: Beta and Dirichlet distributions
Introduction

The sixth post of my probability theory series focuses on the beta distribution. The earlier posts in the series are:

Basic Probability
Two Random Variables
Chain Rule of Probability Theory
Probability Distribution1
Probability Distribution2: Normal distribution

The beta distribution has a multivariate version called the Dirichlet distribution. The Dirichlet distribution was very popular in the Bayesian and natural language processing literature before the LLM era.

A common one-liner explanation of the beta distribution is "a distribution over a probability". I hope all readers are confused, because so was I when I heard this for the first time.

Here is an example. I have visited Paris every year for the past 10 years. According to my travel history, that makes my chance of visiting Paris next year 100%. The beta distribution asks this question: "how confident are we in this 100% chance of me visiting Paris next year?" It is a distribution ("our confidence") over probabilities ("...