The Bernoulli Distribution
Outline:
- What is a probability distribution?
- The Bernoulli Distribution
- Summary
1. What is a probability distribution?
Our world is full of random phenomena, that is, processes with a set of possible outcomes. The randomness stems from the fact that we don’t know which of the possible outcomes will occur when the experiment is carried out. Let us remove this randomness by assuming we have a store that sells only apples, and that every individual who enters the store has to buy one and only one apple. This phenomenon translates to:
P(X) = 1
Where X is the event of buying an apple. Here we are saying that the probability that an individual who enters our store buys an apple is 100%. This is essentially a probability distribution, because it assigns a probability to each possible outcome of our experiment. In our case there is only one outcome, ‘Buying an apple’, and its probability is 100%, because we assumed that every individual who enters the store has to buy an apple. A more realistic scenario is that the customer is free to choose whether or not to buy the apple. In that case, assuming the two outcomes (to buy or not to buy) are equally likely and the customer won’t be tempted by our sexy red apples, our probability distribution becomes:
P(Buy the apple) = 0.5
P(Doesn’t buy the apple) = 0.5
Okay, if we have grasped this basic concept, then no fancy probability distribution will be intimidating, because each one is essentially a mathematical function that maps the set of possible outcomes to their corresponding probabilities, just as in our example. In our example:
Set of possible outcomes = {Buy the apple, Doesn’t buy the apple}
Set of corresponding probabilities = {0.5,0.5}
Now, instead of dealing with each outcome textually, let us define a random variable X that assigns a number to each outcome: X = 1 if the outcome is Buy and X = 0 if the outcome is Doesn’t buy:
P(X = 1) = 0.5
P(X = 0) = 0.5
Now we have all the ingredients necessary to start prepping our main course for this article, the Bernoulli distribution.
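The mapping from outcomes to probabilities described in this section can be sketched in Python as a plain dictionary. This is only an illustration of the idea: the names BUY, NO_BUY, distribution and to_random_variable are made up for this sketch and are not part of any library.

```python
# Hypothetical names, chosen only for this illustration.
BUY = "Buy the apple"
NO_BUY = "Doesn't buy the apple"

# A probability distribution as a mapping: outcome -> probability.
distribution = {BUY: 0.5, NO_BUY: 0.5}

# Sanity checks: probabilities are non-negative and sum to 1.
assert all(prob >= 0 for prob in distribution.values())
assert abs(sum(distribution.values()) - 1.0) < 1e-12


def to_random_variable(outcome: str) -> int:
    """Encode an outcome as a value of X: 1 for a purchase, 0 otherwise."""
    return 1 if outcome == BUY else 0


# The same distribution expressed over the values of X.
distribution_x = {to_random_variable(o): prob for o, prob in distribution.items()}
print(distribution_x)  # {1: 0.5, 0: 0.5}
```

Replacing the text labels with 0 and 1 changes nothing about the probabilities; it only makes the outcomes easier to plug into a formula.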
2. The Bernoulli Distribution
The Bernoulli distribution is a discrete probability distribution, meaning it assigns probabilities to a discrete set of outcomes, exactly like the example we have mentioned. There, the set of outcomes was {Buy, Doesn’t buy}, a countable, discrete set. Moreover, the set of outcomes has to be binary, i.e. it must consist of exactly two possible outcomes.
It differs from the equal-likelihood distribution above only in the probabilities assigned to each outcome. There we assumed the two outcomes are equally likely, but the customer entering our store might actually be tempted by our sexy red apples! How tempted, you might ask? Well, we don’t know. Hence, we assign an unspecified probability to each possible outcome: the customer buys the apple with probability p and skips buying it with probability q. Since these two outcomes are mutually exclusive and are the only possible ones, then:
p + q = 1
And consequently:
q = 1 - p
P(X = 1) = p
P(X = 0) = q = 1 - p
We can write our probability distribution more concisely as:
P(X = x) = p^x (1 - p)^(1 - x), for x in {0, 1}
Where p is the probability of the event X = 1, and (1 - p) is q, the probability of the other event, X = 0. The variable x is the value that the random variable X takes, which represents the event that took place. If we are interested in the probability of X = 1, we plug x = 1 into the formula and obtain P(X = 1) = p^1 (1 - p)^0 = p. Similarly, if we are interested in the probability of X = 0, we plug in x = 0 and get P(X = 0) = p^0 (1 - p)^1 = 1 - p, which is q. These are exactly the probabilities we assigned to the two outcomes.
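The plug-in step above can be checked with a few lines of Python. This is a minimal sketch: the function name bernoulli_pmf and the value p = 0.7 are assumptions made for the illustration, not something taken from the article.

```python
def bernoulli_pmf(x: int, p: float) -> float:
    """Return P(X = x) = p**x * (1 - p)**(1 - x) for x in {0, 1}."""
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return p**x * (1 - p) ** (1 - x)


p = 0.7  # assumed value, chosen only for illustration

# Plugging in x = 1 leaves only the factor p.
print(bernoulli_pmf(1, p))

# Plugging in x = 0 leaves only the factor (1 - p), i.e. q.
print(bernoulli_pmf(0, p))

# The two probabilities cover every outcome, so they sum to 1.
assert abs(bernoulli_pmf(0, p) + bernoulli_pmf(1, p) - 1.0) < 1e-12
```

The exponents act as switches: whichever of x and 1 - x is zero turns its factor into 1, so a single expression covers both cases.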
Essentially, the Bernoulli distribution tells us that if an event occurs with probability p, then the other outcome occurs with probability 1 - p. This might seem redundant or useless, but the power of the Bernoulli distribution lies in the other distributions that can be built on top of it, and in helping us estimate the value of p for an observed random phenomenon that can be modeled with a Bernoulli distribution, as we will see later on.
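To give a feel for what estimating p from observations might look like, here is a small simulation sketch using only Python’s standard library. Everything in it is an assumption made for illustration: the helper names sample_bernoulli and estimate_p, the "true" value 0.3, and the choice of the observed fraction of purchases as the estimate.

```python
import random


def sample_bernoulli(p: float, n: int, seed: int = 0) -> list[int]:
    """Draw n independent Bernoulli(p) values (1 = buy, 0 = doesn't buy)."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    return [1 if rng.random() < p else 0 for _ in range(n)]


def estimate_p(samples: list[int]) -> float:
    """Estimate p as the fraction of observations equal to 1."""
    return sum(samples) / len(samples)


true_p = 0.3  # assumed for the simulation; unknown in a real store
visits = sample_bernoulli(true_p, n=10_000)

# The estimate should land close to true_p, though not exactly on it.
print(estimate_p(visits))
```

Note that sum(visits) counts how many of the simulated customers bought an apple. A count of successes over repeated Bernoulli trials is what the binomial distribution describes, which is one example of a distribution built on top of the Bernoulli.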
Summary
- Probability distribution: Our attempt at modeling a random phenomenon.
- Probability distribution: A function that maps the set of possible outcomes to corresponding probabilities.
- The Bernoulli distribution: A function that maps a set of discrete binary outcomes to their corresponding probabilities.