We usually contemplate the realm of probability from the frequentist point of view. If we have a die with six sides, we assume that each side has one chance in six of appearing every time we throw the die (assuming it's a fair die and all its sides are equally likely).
If we have doubts as to whether the die is fair, what we do is throw the die a huge number of times until we're able to calculate how often each side appears, thus estimating its probability. But in both cases, once we have the data, we keep it forever. Whatever happens, we'll continue saying that the probability of getting a five is one in six.
But sometimes the probability can change and become different from the one set at first. An initial probability can change if we feed the system with new information, and it may depend on events that happen over time. This gives rise to the Bayesian statistical viewpoint, based largely on Bayes' rule, in which the probability of an event can be updated over time. Let's see an example.
Suppose, for instance, that we have three coins. But they are three very special coins, as only one of them is fair (one head and one tail, HT). Of the other two, one has two heads (HH) and the other has two tails (TT). Now we put the three coins in a bag and draw one of them without looking. The question is: what is the probability of drawing the coin with two heads?
How easy, most of you will think! It's a simple case of favorable events divided by possible events. As there's one favorable event (HH) and three possible ones (HH, TT and HT), the probability is one-third. We have a 33% chance of drawing the coin with two heads.
But what happens if I tell you that I have tossed the coin and gotten heads? Would I still have the same one-third chance of having the two-heads coin in my hands? The answer is obviously no. So what is now the probability of having drawn the two-heads coin? To calculate it we cannot use the quotient between favorable and possible events; we have to use Bayes' rule. Let's deduce it.
The probability of two independent events A and B both occurring is equal to the probability of A times the probability of B. If the two events are dependent, the probability of A and B is equal to the probability of A times the probability of B given A:
P(A and B) = P(A) x P(B|A)
Applying this to our coins, the probability of getting heads and having the two-heads coin can be expressed as
P(H and HH) = P(H) x P(HH|H) (the probability of getting heads multiplied by the probability of having HH given that we have gotten heads).
But we can also express it in the opposite way:
P(H and HH) = P(HH) x P(H|HH) (the probability of having the HH coin multiplied by the probability of getting heads with the HH coin).
So, we can equate the two expressions and obtain our Bayes’ rule:
P(H) x P(HH|H) = P(HH) x P(H|HH)
P(HH|H) = [P(HH) x P(H|HH)] / P(H)
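Expressed in code, the rule is a one-liner. Here is a minimal sketch in Python (the function name is my own invention, nothing standard):

```python
def bayes_posterior(prior: float, likelihood: float, evidence: float) -> float:
    """Bayes' rule: P(A|B) = P(A) * P(B|A) / P(B)."""
    return prior * likelihood / evidence
```

Plug in the prior P(A), the likelihood P(B|A) and the overall evidence P(B), and it returns the updated probability P(A|B).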
We will now calculate our chances of having drawn the HH coin if we have gotten heads. We know that P(HH) = 1/3. P(H|HH) = 1: if you have a coin with two heads you have a 100% chance of getting heads. What is P(H)?
The chance of getting heads is equal to the probability of having drawn the TT coin times the probability of getting heads with the TT coin, plus the probability of having drawn the HH coin times the probability of getting heads with the HH coin, plus the probability of having drawn the fair coin times the probability of getting heads with it:
P(H) = (1/3 x 0) + (1/3 x 1/2) + (1/3 x 1) = 1/2
So, P(HH|H) = (1/3 x 1) / (1/2) = 2/3 ≈ 0.66
That means that if we have tossed the coin and gotten heads, the probability that we drew the two-heads coin from the bag rises from 33% to 66% (and the probability of having the two-tails coin drops from 33% to 0).
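These numbers are easy to check directly. A quick Python sketch (the variable names are illustrative, not from any library):

```python
# Prior probability of each coin (one of three drawn at random from the bag)
priors = {"TT": 1/3, "HT": 1/3, "HH": 1/3}
# Probability of getting heads with each coin
heads_prob = {"TT": 0.0, "HT": 0.5, "HH": 1.0}

# Total probability of heads on the first toss: P(H) = 1/2
p_h = sum(priors[c] * heads_prob[c] for c in priors)

# Bayes' rule: P(HH|H) = P(HH) * P(H|HH) / P(H) = 2/3
posterior_hh = priors["HH"] * heads_prob["HH"] / p_h

# The two-tails coin is ruled out by the observed head: P(TT|H) = 0
posterior_tt = priors["TT"] * heads_prob["TT"] / p_h
```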
Do you see how the probability is updated? What if we toss the coin again and get heads? What would then be the probability of having drawn the two-heads coin? Let's calculate it following the same reasoning:
P(HH|H) = [P(HH) x P(H|HH)] / P(H)
In this case, P(HH) is no longer 1/3, but 2/3. P(H|HH) is still 1. Finally, P(H) has also changed: we have already ruled out the possibility of having drawn the two-tails coin, so the chance of getting heads on the second toss is the probability of having HH times the probability of getting heads with HH, plus the probability of having the fair coin times the probability of getting heads with it:
P(H) = (2/3 x 1) + (1/3 x 1/2) = 5/6
So, P(HH|H) = (2/3 x 1) / (5/6) = 4/5 = 0.8
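The second-toss arithmetic can be sketched the same way in Python (again, the names are mine):

```python
# After the first head, the posteriors become the new priors
prior_hh, prior_fair = 2/3, 1/3     # the TT coin is already ruled out

# Total probability of heads on the second toss: P(H) = 5/6
p_h = prior_hh * 1.0 + prior_fair * 0.5

# Bayes' rule again: P(HH|H) = (2/3 * 1) / (5/6) = 4/5
posterior_hh = prior_hh * 1.0 / p_h
```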
If we get heads on the second toss, the probability of having the two-heads coin rises from 66% to 80%. Logically, if we keep repeating the experiment, every time we get heads we'll be more certain of having the two-heads coin, but we'll never have total certainty. Of course, the experiment ends the moment we get tails, at which point the probability of having HH drops automatically to zero (and the chance of having the fair coin goes to 100%).
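And if you don't trust the algebra, the whole experiment is easy to simulate. A sketch in Python (the setup and names are my own; with a different seed or trial count the estimate will wobble slightly around the analytic 4/5):

```python
import random

def estimate_p_hh_given_two_heads(n_trials: int = 200_000, seed: int = 1) -> float:
    """Draw a coin at random, toss it twice; among the trials that give
    two heads, return the fraction in which the coin was the HH one."""
    rng = random.Random(seed)
    coins = {"HH": (1, 1), "HT": (1, 0), "TT": (0, 0)}   # 1 = heads, 0 = tails
    names = list(coins)
    two_heads = with_hh = 0
    for _ in range(n_trials):
        name = rng.choice(names)
        faces = coins[name]
        # Toss the drawn coin twice
        if rng.choice(faces) == 1 and rng.choice(faces) == 1:
            two_heads += 1
            if name == "HH":
                with_hh += 1
    return with_hh / two_heads

print(estimate_p_hh_given_two_heads())   # close to 4/5 = 0.8
```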
As you can see, probability is not as immutable as it seems.
And here we stop playing with coins for today. Let's just say that, even though it's less well known than the frequentist approach, Bayesian statistics is of great use. There are textbooks, dedicated software and specific methods for the analysis of results that incorporate information derived from the study itself. But that's another story…