Combinatory and probability
It’s a funny thing, but there’s a statistical quiz which I meet very often lately. I find it reading articles, doing my courses and studying my books. So, why not, I’m going to share it with you, although it’s a bit difficult to understand and many of you may already know it.
I’m talking about the birthday problem, a kind of puzzle that is used to demonstrate that our intuition can often deceive us when we are dealing with concepts of probability, especially if large numbers are involved.
Suppose I go to a movie theater one day. It strikes me how few people are in the room, so I expect the worst from the film. Unfortunately, my fears are confirmed and the movie is a crap, so my mind starts to wander, starting to count how many people are in the room. I see that there’re only 35 people in the room and then I make the big question: what is the probability that at least two of these 35 people had the same birthday?
What do you think?. At first, it seem quite difficult to calculate, but do you think the probability is high or low?. Our intuition tells us that the probability should not be too high, as there are only 35 people to distribute in matches between no less than 365 days in a year (we forget leap years). However, as the title of this post says, intuition can sometimes deceive us. Let’s calculate what the actual probability is that at least two people in the room had the same birthday.
To calculate the probability of an event we have to divide the number of favorable events between the number of possible events. For example, to calculate the probability of rolling a six on a die we have to divide one (the number of results we’re interested in, six) by six (the number of possible outcomes when rolling a die, from one to six). Well, in our case we’ll do the same. In the numerator we have to put the number of existing combinations with at least one match and in the denominator the total number of sets of 35 birthdays that can be made with 365 days.
Our first problem lies in the numerator. The possible number of matches includes those sets with one match, two matches, three…, lots of them. This can be very complex to calculate, so let’s use a little trick that is very often used in probability.
If you think about it, there may be two situations: there is at last one match or there is none. Therefore, the probability of the two events is equal to one (100%). So why not calculate the probability that there are no matches and subtract the result from one?
P(at least one match) = 1 – P(none match)
We are going to build our quotient to calculate the probability we are looking for and, finally, we’ll calculate its complementary value.
Factorials and combinatory
Let’s start with the denominator, which is far simple. How many sets of 35 birthdays can we make with 365 days?. Here we are dealing with permutations allowing replacement, because we consider the possibility that there’re two matches in the same day. So, it would be 365x365x365…x365 35 times or, put it in another way, 36535.
Let’s go with the numerator. In how many different ways can we distribute 365 days among 35 people with no matches?. In this case, we are dealing with the multiple combinations without replacement, so we can calculate it as 365 factorial (you know, 365x364x363x…x2x1) divided by the factorial number of the difference between the number of days in a years and the number of persons: 330.
We have built our formula to calculate the probability:
Now we only have to solve it. Don’t try to do it with your pocket calculator, because it can explode. I used the software R and I even had to do a little algebra to simplify factorials first. The result is 0.18.
Well, we know that the probability that there’re no matches in birthdays among people in the room is 0.18. If we subtract this value from one, we get 0.82. This means that there is an 82% chance that at least two people had his birthday the same day. It’s amazing how our intuition can deceive us. If you don’t believe me, go to the movies one day and do the test.
And I think it is time to finish this post. We could have given more detail about how to calculate the numbers of the numerator and denominator of our formula of probability, explaining the involved concepts of combinatorics. For those who do not know, combinatorics is a set of mathematical tools which serves, among other things, to count elements. But that’s another story…