Science without sense…double nonsense

Pills on evidence-based medicine


King Kong versus Godzilla

What a mess these two make when they are let loose and cross paths! In this story, almost as old as me (please, do not run to look up what year the movie was made), poor King Kong, who must have traveled more than Tarzan, leaves his Skull Island to defend a village from an evil giant octopus and drinks a potion that leaves him sound asleep. Some Japanese gentlemen then seize the opportunity to take him to their country. I, who have visited Japan, can imagine the effect it had on the poor ape when he woke up, so he had no choice but to escape, with the misfortune of running into Godzilla, who had also escaped from the iceberg where he had been frozen. So they tangle and the fight begins, stones over here, atomic rays over there, until things get out of control and King Kong ends up attacking Tokyo, I do not remember exactly why. I swear I have not taken any hallucinogens; the film is like that, and I will not reveal more so as not to spoil the ending in the unlikely event that you want to see the film after what I have told you. What I do not know is what the screenwriters had taken before coming up with this story.

At this point you will be thinking about how today’s post may be related to this story. Well, the truth is that it has nothing to do with what we are going to talk about, but I could not think of a better way to start. Well, it may actually be related, because today we are going to talk about a family of monsters within epidemiological studies: the ecological studies. It’s funny that when you read something about ecological studies, it always starts by saying that they are simple. Well, I do not think so. The truth is that they have a lot to get our teeth into and we are going to try to explain them in a simple way. I thank my friend Eduardo (to whom I dedicate this post) for the effort he made to describe them intelligibly. Thanks to him I could understand them. Well… a little bit.

Ecological studies are observational studies with the peculiarity that the unit of study is not the individual subject but groups of subjects (clusters), so the level of inference of their estimates is also aggregated. They tend to be cheap and quick to perform (I suppose that hence their supposed simplicity), since they usually use data from secondary sources already available, and they are very useful when it is not possible to measure the exposure at the individual level or when the effect can only be measured at the population level (such as the results of a vaccination campaign, for example).

The problem comes when we want to make inferences at the individual level based on their results, since they are subject to a series of biases that we will discuss later on. In addition, since they are usually descriptive studies based on historical data, it can be difficult to establish the temporal sequence between the exposure and the effect studied.

We will look at their specific characteristics in relation to three aspects of their methodology: types of variables and analyses, types of studies and biases.

Ecological variables are classified into aggregate variables and environmental variables (also called global variables). The aggregate ones summarize individual observations. They are usually averages or proportions, such as the mean age at which the first King Kong movie is seen or the rate of geeks per 1000 moviegoers, to name two absurd examples.
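
To make the idea concrete, here is a minimal Python sketch of how aggregate variables arise from individual observations. The records, the city names and the field names are all invented for illustration; nothing here comes from a real dataset.

```python
# Invented individual records: (city, age at first King Kong movie, is a geek?)
records = [
    ("A", 8, True),
    ("A", 12, False),
    ("A", 10, False),
    ("A", 10, True),
    ("B", 15, False),
    ("B", 9, False),
    ("B", 12, False),
    ("B", 12, True),
]


def aggregate(records):
    """Collapse individual observations into one summary row per group."""
    groups = {}
    for city, age, geek in records:
        g = groups.setdefault(city, {"n": 0, "age_sum": 0, "geeks": 0})
        g["n"] += 1
        g["age_sum"] += age
        g["geeks"] += int(geek)
    return {
        city: {
            "mean_age": g["age_sum"] / g["n"],              # an average
            "geeks_per_1000": 1000 * g["geeks"] / g["n"],   # a proportion, scaled
        }
        for city, g in groups.items()
    }


summary = aggregate(records)
for city, row in summary.items():
    print(city, row)
```

Once the data are collapsed like this, the unit of analysis is the city, and who exactly was a geek inside each city is no longer visible.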

On the other hand, environmental measures are characteristics of a specific place. They can have a parallel at the individual level (for example, the levels of environmental pollution, related to the crap that each of us breathes in) or be attributes of groups with no equivalent at the individual level (such as water quality, to name just one).

As for the analysis, it can be done at the aggregate level, using data from groups of participants, or at the individual level, but preferably without mixing the two types. Moreover, if data of both types are collected, it will be more convenient to transform them to a single level. The simplest option is to aggregate the individual data, although it can also be done the other way around, and it is even possible to analyze both levels at once with hierarchical multilevel statistical techniques, within reach of only a few privileged minds.

Obviously, the level of inference we want to apply will depend on our objective. If we want to study the effects of a risk factor at the individual level, the inference will be individual. An example would be to study the relationship between the number of hours of television watched and the incidence of brain cancer. On the other hand, and following a very pediatric example, if we want to know the effectiveness of a vaccine, the inferences will be made at the aggregate level from the data on vaccination coverage in the population. And to complicate things even further, we can measure an exposure factor in both ways, individual and grouped: for example, the density of Mexican restaurants in a population and the frequency of antacid intake. In this case we would make a contextual inference.

Regarding the type of ecological studies, we can classify them according to the exposure method and the grouping method.

According to the exposure method, things are relatively simple and we can find two types of studies. If we do not measure the exposure variable, or we measure it only partially, we talk about exploratory studies. Otherwise, we are dealing with an analytical study.

According to the grouping method, we can consider three types: multiple-group (when several areas are selected), time-trend (measurements are made over time) and mixed (a combination of both).

The complexity begins when the two dimensions (exposure and grouping) are combined, since we can then come across a series of more complex designs. Thus, multiple-group studies can be exploratory (the exposure factor is not measured, but the effect is) or analytical (the most frequent; here we measure both). Time-trend studies, not to be outdone, can also be exploratory or analytical, in a similar way to the previous ones, but following a trend over time. Finally, there are mixed studies that compare the time trends of several geographical areas. Simple, isn’t it?

Well, this is nothing compared to the complexity of the statistical techniques used in these studies. Until recently the analyses were very simple and based on measures of association or linear correlation, but in recent times we have seen the development of numerous techniques based on regression models and more exotic things such as log-linear multiplicative models or Poisson regression. The merit of all these techniques is that, starting from the grouped measures, they allow us to estimate how many exposed or unexposed subjects have the effect, thus allowing the calculation of rates, attributable fractions, etc. Do not fear, we will not go into detail, but there is literature available for those who want to rack their brains.
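
Just to give a flavor of how grouped measures can yield rates for exposed and unexposed subjects, here is a minimal Python sketch of a linear ecological regression. It rests on a strong assumption: that the rate among the exposed is the same in every group, and likewise among the unexposed, so that each group's rate is a straight line in its proportion of exposed people. The data are invented, and the least-squares fit is unweighted for simplicity (a real analysis would probably weight by population size or use one of the models mentioned above).

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented data: one value per region.
proportion_exposed = [0.1, 0.2, 0.4, 0.5]
rate_per_1000 = [2.4, 2.8, 3.6, 4.0]

intercept, slope = fit_line(proportion_exposed, rate_per_1000)

# Extrapolate the fitted line to a region where nobody is exposed (x = 0)
# and to a region where everybody is exposed (x = 1).
rate_unexposed = intercept
rate_exposed = intercept + slope
rate_ratio = rate_exposed / rate_unexposed

print(round(rate_unexposed, 3), round(rate_exposed, 3), round(rate_ratio, 3))
```

Both estimated rates are extrapolations to values of the exposure that no region in the data actually reaches, which is one reason why these estimates deserve some caution.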

To finish with the methodological aspects of ecological studies, we will list some of their most characteristic biases, favored by the use of aggregate units of analysis.

The most famous of all is the ecological bias, also known as the ecological fallacy. It occurs when the grouped measure does not reflect the biological effect at the individual level, so that the individual inference drawn from it is wrong. This bias became famous with the study published in the New England Journal of Medicine that found a relationship between chocolate consumption and Nobel prizes. However amusing that example may be, the problem is that the ecological fallacy is the main limitation of this type of study.
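
The fallacy is easy to reproduce with made-up numbers. The following Python sketch uses three invented groups in which, inside every group, the outcome goes down as the exposure goes up, while the group averages go up together. The data and variable names are purely illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mean(values):
    return sum(values) / len(values)


# Invented data: (exposures, outcomes) for the individuals of each group.
groups = {
    "group 1": ([1, 2, 3], [5, 4, 3]),
    "group 2": ([4, 5, 6], [8, 7, 6]),
    "group 3": ([7, 8, 9], [11, 10, 9]),
}

# Individual level: correlation inside each group.
within = {name: pearson(xs, ys) for name, (xs, ys) in groups.items()}

# Ecological level: correlation between the group averages.
mean_exposure = [mean(xs) for xs, _ in groups.values()]
mean_outcome = [mean(ys) for _, ys in groups.values()]
ecological = pearson(mean_exposure, mean_outcome)

print(within)
print(ecological)
```

Anyone looking only at the three group averages would conclude that more exposure means more outcome, which is exactly the opposite of what happens to the individuals inside each group.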

Another bias with some peculiarities in this type of study is confounding bias. In studies dealing with individual units, confounding occurs when a third variable is related to both the exposure and the effect without being part of the causal chain between the two. This ménage à trois is a bit more complex in ecological studies. A factor can act as a confounder at the ecological level but not at the individual level and, vice versa, confounding factors at the individual level may not produce confounding at the aggregate level. In any case, as in the rest of the studies, we must try to control the confounding factors, for which there are two fundamental approaches.

The first is to include the possible confounding variables in the mathematical model as covariates and perform a multivariate analysis, which makes it more complicated to study the effect. The second is to adjust or standardize the rates of the effect by the confounding variables and fit the regression model with the adjusted rates. For this to work, all the variables introduced in the model must also be adjusted for the same confounding variable and the covariances of the variables must be known, which is not always the case. In any case, and not to discourage you, many times we cannot be sure that the confounding factors have been adequately controlled, even using the most recent and sophisticated multilevel analysis techniques, since the source of the confounding may lie in unknown characteristics of the distribution of the data among groups.
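
As a small illustration of the second approach, here is a Python sketch of direct standardization by age, one common way of adjusting rates for a confounder. The regions, the age strata and the standard population are all invented; the two regions have identical age-specific rates and differ only in how old their populations are.

```python
# Invented data: (cases, population) per age stratum in each region.
regions = {
    "Region A": {"young": (8, 8000), "old": (20, 2000)},
    "Region B": {"young": (2, 2000), "old": (80, 8000)},
}

# Hypothetical standard population used as the common reference.
standard_population = {"young": 5000, "old": 5000}


def crude_rate(strata):
    """Overall rate per 1000, ignoring the age structure."""
    cases = sum(c for c, _ in strata.values())
    population = sum(p for _, p in strata.values())
    return 1000 * cases / population


def standardized_rate(strata, standard):
    """Rate per 1000 the region would have with the standard age structure."""
    expected_cases = sum(
        standard[age] * cases / population
        for age, (cases, population) in strata.items()
    )
    return 1000 * expected_cases / sum(standard.values())


crude = {name: crude_rate(s) for name, s in regions.items()}
adjusted = {
    name: standardized_rate(s, standard_population) for name, s in regions.items()
}

print(crude)
print(adjusted)
```

The crude rates differ a lot between the two regions, but once both are referred to the same age structure the difference disappears, which tells us it was all due to age. The adjusted rates are the ones that would then go into the regression model.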

Other gruesome aspects of ecological studies are the temporal ambiguity bias (as we have already mentioned, it is often difficult to ensure that exposure precedes the effect) and collinearity (the difficulty of assessing the separate effects of two or more exposures that occur simultaneously). In addition, although information biases are not specific to ecological studies, these studies are very susceptible to them.

You can see that I was right at the beginning when I told you that ecological studies seem to me many things, but not simple. In any case, it is worth understanding what their methodology is based on because, with the development of new analysis techniques, they have gained in prestige and power, and it is more than likely that we will come across them more and more often.

But do not despair, the important thing for us, consumers of medical literature, is to understand how they work so that we can make a critical appraisal of the articles when we deal with them. Although, as far as I know, there are no checklists as structured as CASP has for other designs, the critical appraisal will be done following the usual general scheme according to our three pillars: validity, relevance and applicability.

The study of VALIDITY will be done in a similar way to other types of cross-sectional observational studies. The first thing will be to check that there is a clear definition of the population and the exposure or effect under study. The units of analysis and their level of aggregation will have to be clearly specified, as well as the methods of measuring the effect and exposure, the latter, as we already know, only in analytical studies.

The sample of the study should be representative, for which we will have to review the selection procedures, the inclusion and exclusion criteria and its size. These data will also influence the external validity of the results.

As in any observational study, the measurement of exposure and effect should be done blindly and independently, using valid instruments. The authors must present the data completely, reporting any losses or out-of-range values. Finally, there must be a correct analysis of the results, with control of the typical biases of these studies: ecological, information, confounding, temporal ambiguity and collinearity.

In the RELEVANCE section we can begin with a quantitative assessment, summarizing the most important result and reviewing the magnitude of the effect. We must look for, or calculate ourselves if possible, the most appropriate impact measures: differences in incidence rates, attributable fraction in the exposed, etc. If the authors do not offer these data but do provide the regression model, it is possible to calculate the impact measures from the coefficients of the independent variables of the model. I’m not going to put the full list of formulas here so as not to make this post even more unfriendly, but you know that they exist in case one day you need them.
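
For the curious, a couple of these measures fit in a few lines of Python. The sketch below assumes a log-linear model of the form log(rate) = b0 + beta × exposure, in which the exponential of the coefficient is the rate ratio for a one-unit increase in exposure. The function names, the coefficient and the rates are hypothetical and are not taken from any particular study.

```python
import math


def rate_difference(rate_exposed, rate_unexposed):
    """Absolute difference between the rates of exposed and unexposed."""
    return rate_exposed - rate_unexposed


def attributable_fraction_exposed(rate_ratio):
    """Share of the rate among the exposed attributable to the exposure."""
    return (rate_ratio - 1) / rate_ratio


def rate_ratio_from_log_linear(beta):
    """Rate ratio per unit of exposure in log(rate) = b0 + beta * exposure."""
    return math.exp(beta)


beta = math.log(3)  # hypothetical coefficient reported by the authors
rr = rate_ratio_from_log_linear(beta)
af = attributable_fraction_exposed(rr)
rd = rate_difference(6.0, 2.0)  # hypothetical rates per 1000

print(round(rr, 3), round(af, 3), rd)
```

With a rate ratio of 3, two thirds of the rate among the exposed would be attributable to the exposure, always provided that the association is causal and free of the biases described above.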

Then we will make a qualitative assessment of the results, trying to assess the clinical interest of the main outcome measure, the interest of the effect size and the impact it may have on the patient, the health system or society.

We will finish this section with a comparative assessment (looking for similar studies and comparing the main outcome measure and other alternative measures) and an assessment of the relationship between benefits, risks and costs, as we would do with any other type of study.

Finally, we will consider the APPLICABILITY of the results in clinical practice, taking into account aspects such as adverse effects, economic cost, etc. We already know that the fact that a study is well done does not mean that we are obliged to apply it in our setting.

And here we are going to leave it for today. When you read or carry out an ecological study, be careful not to fall into the temptation of drawing causal conclusions. Regardless of the pitfalls that the ecological fallacy may hold for you, ecological studies are observational, so they can be used to generate hypotheses of causality, but not to confirm them.

And now we’re leaving. I did not tell you who won the fight between King Kong and Godzilla so as not to be a spoiler, but surely the smartest of you have already guessed. After all, and to his misfortune, only one of the two later traveled to New York. But that is another story…