White, dark, filled, in squares, as drinking chocolate, powdered, in ice cream, with hazelnuts, with almonds, with fruit, milk, pure, fondant, bitter, in pies, in candy, in hot or cold drinks, etc., etc., etc. I like them all.
So you can easily imagine my joy when my RSS reader showed me the title of an article in the New England Journal of Medicine saying that there was a relationship between chocolate consumption and Nobel prizes. I could see myself eating chocolate galore with a copy of the paper in my pocket, ready to silence anyone who came to spoil my party by saying that I was going over the top with calories, fat, sugar, or whatever. At the end of the day, what could be more important than working to get a Nobel Prize?
At this point you can just as easily imagine my frustration when I read the paper and saw that the title was fishy. It turns out that it was an ecological study.
In the epidemiological studies that we're most used to reading, the units of analysis are usually individuals. In ecological studies, however, the units are aggregates of individuals.
For each aggregate, a summary measure of the frequency of exposure and of the effect among its individuals is calculated, and the study then checks whether exposure and effect are associated across the different units.
There are two types of ecological studies. On one side are those that study frequency measures, such as incidence, mortality, etc., looking for geographical patterns that may be related to social, economic, genetic or other factors. On the other are those that study variations in frequency over time in order to detect temporal trends and try to explain their cause.
These studies are usually simple and quick to perform, and they are often based on data already available in records or yearbooks, so they are usually not too expensive either. The problem with ecological studies is that an association among the units of analysis does not necessarily mean that the association also exists at the level of individuals. If we take this association for granted at the individual level, we run the risk of committing a sin known by the beautiful name of ecological fallacy. You can keep comparing every variable you can think of with the frequency of a particular disease until you find a significant association, but it may then be impossible to find a plausible mechanism to explain it. In our example, it could even be the case that, at the individual level, the more chocolate you eat the duller your wits become, moving you further away from the coveted Nobel Prize.
And for those who do not believe me, let's look at a totally absurd and invented example. Suppose we want to know if there is a relationship between watching television for more than four hours a day and being a strict vegetarian. It turns out that we have data from three surveys in three cities, which we will call A, B and C to avoid getting into any more trouble.
If we calculate the prevalence of vegetarianism and of TV addiction, we'll see that each is 0.4 in A, 0.5 in B and 0.6 in C. It's pretty clear: in cities where more people are addicted to the boob tube there are more strict vegetarians, which may indicate that watching television is even more dangerous than previously thought.
But these are aggregate results. What happens at the individual level? We see that the odds ratios are 0.33 in A and C and 0.44 in B. So, surprisingly, even though cities with more couch potatoes have more vegetarians, the odds of being a strict vegetarian among people who bear the couch potato stigma are only 33-44% of the odds among everyone else. So we see how important it is that the results of an ecological study are subsequently investigated with other analytical study designs in order to explain them properly.
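For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The counts are an assumption: the surveys are invented and the text above gives no 2x2 tables, so I have chosen hypothetical counts (100 respondents per city) that happen to reproduce the prevalences and odds ratios quoted above. The names `CITIES` and `summarize` are likewise made up for the illustration.

```python
# Hypothetical 2x2 counts per city (100 respondents each). These are NOT
# given in the post; they are chosen only so that the prevalences and
# odds ratios match the figures quoted in the text.
# Each tuple: (tv_veg, tv_nonveg, notv_veg, notv_nonveg)
CITIES = {
    "A": (10, 30, 30, 30),
    "B": (20, 30, 30, 20),
    "C": (30, 30, 30, 10),
}


def summarize(tv_veg, tv_nonveg, notv_veg, notv_nonveg):
    """Return (TV prevalence, vegetarian prevalence, odds ratio) for one city."""
    total = tv_veg + tv_nonveg + notv_veg + notv_nonveg
    prev_tv = (tv_veg + tv_nonveg) / total    # aggregate exposure frequency
    prev_veg = (tv_veg + notv_veg) / total    # aggregate outcome frequency
    # Individual-level association: cross-product ratio of the 2x2 table
    odds_ratio = (tv_veg * notv_nonveg) / (tv_nonveg * notv_veg)
    return prev_tv, prev_veg, odds_ratio


results = {city: summarize(*counts) for city, counts in CITIES.items()}

for city, (prev_tv, prev_veg, odds_ratio) in results.items():
    print(f"City {city}: TV {prev_tv:.2f}, vegetarian {prev_veg:.2f}, OR {odds_ratio:.2f}")
```

Across cities both prevalences climb together from 0.4 to 0.6, while inside every city the odds ratio stays below 1. Same data, opposite stories, depending on whether you look at the aggregates or at the individuals.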
Just two more points before ending this post. First, may vegetarians forgive me, even the strict ones, and, why not, may those who watch too much TV forgive me too. Second, we have seen that the chocolate fallacy is actually an ecological fallacy. But even when the data come from individual units, we must always remember that neither correlation nor association is synonymous with causation. But that's another story…