When nothing bad happens, is everything okay?

I have a brother-in-law who is increasingly afraid of getting on a plane. He is able to make road trips of several days in a row just to avoid taking off the ground. But it turns out that the poor guy has to make a transcontinental trip, and this time he has no choice but to travel by plane.

At the same time, my brother-in-law, in addition to being fearful, is a resourceful fellow. He has been counting the number of flights of the different airlines and the number of accidents each one has had, in order to calculate the probability of a mishap with each of them and fly with the safest. The matter is very simple if we remember the classic definition: probability equals favorable cases divided by possible cases.

And it turns out that he is happy, because there is an airline that has made 1500 flights and has never had an accident, so the probability of having an accident flying on its planes will be, according to my brother-in-law, 0/1500 = 0. He is now so calm that he has almost lost his fear of flying: mathematically, it is almost certain that nothing will happen to him. What do you think of my brother-in-law's reasoning?

Many of you will already be thinking that using brothers-in-law for these examples has its problems. We all know how brothers-in-law are… But don't be unfair to them. As the famous humorist Joaquín Reyes says, "all of us are brothers-in-law", so just remember that. What is beyond doubt is that we will all agree that my brother-in-law is wrong: the fact that there was no mishap in those 1500 flights does not guarantee that the next plane will not crash. In other words, even if the numerator of the proportion is zero, it would be incorrect to keep zero as the estimate of the real risk.

This situation occurs with some frequency in biomedical research. To leave airlines and aerophobes alone, imagine we have a new drug with which we want to prevent that terrible disease, fildulastrosis. We take 150 healthy people, give them antifildulin for one year and, after this follow-up period, we do not detect a single new case of the disease. Can we conclude, then, that the treatment prevents the development of the disease with absolute certainty? Obviously not. Let's think about it a little.

Making inferences about probabilities when the numerator of the proportion is zero can be somewhat tricky, since we tend to think that the non-occurrence of events is something qualitatively different from the occurrence of one, a few or many events, and this is not really so. A numerator equal to zero neither means that the risk is zero nor prevents us from making inferences about the size of the risk, since we can apply the same statistical principles we use with non-zero numerators.

Returning to our example, suppose that the incidence of fildulastrosis in the general population is 3 cases per 2000 people per year (1.5 per thousand, 0.15% or 0.0015). Can our experiment tell us whether taking antifildulin increases, decreases or does not modify the risk of suffering fildulastrosis? Following the familiar adage: yes, we can.

We will keep our habit of stating the null hypothesis as one of equal effect, so that the risk of disease is not modified by the new treatment. Thus, the risk of each of the 150 participants becoming ill throughout the study will be 0.0015. In other words, the risk of not getting sick will be 1 – 0.0015 = 0.9985. What will be the probability that nobody gets sick during the year of the study? Since these are 150 independent events, the probability that all 150 subjects stay healthy will be 0.9985^150 ≈ 0.8. We see, therefore, that even if the risk is the same as that of the general population, with this number of patients we have an 80% chance of not detecting a single event (a case of fildulastrosis) during the study, so it would actually be more surprising to find a sick patient than not to find any. And the most striking thing is this: the probability of having no sick subjects in our sample is not 0 (0/150), as my brother-in-law would think, but 80%!
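If you prefer not to trust my reheated neurons, a couple of lines of Python reproduce the figure (a minimal sketch; the variable names are mine and the calculation assumes 150 independent subjects):

```python
risk = 0.0015   # yearly incidence: 3 cases per 2000 people
n = 150         # participants followed for one year

# Probability that none of the n independent subjects gets sick
p_no_cases = (1 - risk) ** n
print(f"P(0 cases in {n} subjects) = {p_no_cases:.2f}")  # ~0.80
```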

And the worst part is that, given this result, pessimism invades us: it is even possible that the risk of disease with the new drug is greater and we are simply not detecting it. Let's assume that the risk with the medication is 1% (compared with the 0.15% of the general population). The probability of nobody being sick would be (1 – 0.01)^150 ≈ 0.22. Even with a 2% risk, the probability of not seeing any disease at all is (1 – 0.02)^150 ≈ 0.048. Remember that 5% is the value we usually adopt as the "safe" limit to reject the null hypothesis without making a type 1 error.

At this point, we can ask ourselves whether we have been so unlucky as to detect no cases even though the risk is high or whether, on the contrary, we are not so unfortunate and the risk must really be low. To clear this up, we can return to our usual 5% limit and see for which hypothetical risks of getting sick with the treatment the probability of observing no patients at all stays above 5%:

– Risk of 1.5/1000: (1 – 0.0015)^150 ≈ 0.8.

– Risk of 1/1000: (1 – 0.001)^150 ≈ 0.86.

– Risk of 1/200: (1 – 0.005)^150 ≈ 0.47.

– Risk of 1/100: (1 – 0.01)^150 ≈ 0.22.

– Risk of 1/50: (1 – 0.02)^150 ≈ 0.048.

– Risk of 1/25: (1 – 0.04)^150 ≈ 0.002.
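The same series can be reproduced with a short loop (again a sketch, assuming independent subjects; the labels are mine):

```python
n = 150
risks = [("1.5/1000", 0.0015), ("1/1000", 0.001), ("1/200", 0.005),
         ("1/100", 0.01), ("1/50", 0.02), ("1/25", 0.04)]

# Probability of observing zero events in n subjects for each true risk
for label, risk in risks:
    p_zero = (1 - risk) ** n
    print(f"risk {label:>8}: P(no cases) = {p_zero:.3f}")
```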

As we see in the series above, our 5% "security" limit is crossed when the risk reaches 1/50 (2% or 0.02). This means that, with a 5% probability of being wrong, the risk of fildulastrosis while taking antifildulin is equal to or less than 2%. In other words, the 95% confidence interval of our estimate would range from 0 to 0.02 (and would not be simply 0, as the naïve calculation suggests).

To prevent our reheated neurons from eventually melting, let's see a simpler way to automate this process. For this we use what is known as the rule of 3: if we do the study with n patients and none of them presents the event, we can affirm that the probability of the event is not zero, but less than or equal to 3/n. In our example, 3/150 = 0.02, the same value we obtained with the laborious method above. We arrive at this rule by solving the equation we used with the previous method:

(1 – maximum risk)^n = 0.05

First, we rewrite it:

1 – maximum risk = 0.05^(1/n)

If n is greater than 30, 0.05^(1/n) approximates (n – 3)/n, which is the same as 1 – (3/n). This is because ln 0.05 ≈ –3, so 0.05^(1/n) = e^(–3/n) ≈ 1 – 3/n when n is large. In this way, we can rewrite the equation as:

1 – maximum risk = 1 – (3/n)

With which we can solve the equation and get the final rule:

Maximum risk = 3/n.

You will have noticed that we required n to be greater than 30. This is because, below 30, the rule tends to overestimate the risk slightly, which we will have to take into account if we use it with small samples.
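A quick numerical check of that caveat (a sketch comparing the exact bound with the approximation for a few sample sizes):

```python
# Exact 95% upper bound, 1 - 0.05**(1/n), versus the rule-of-3 value, 3/n
for n in [10, 30, 150, 1000]:
    exact = 1 - 0.05 ** (1 / n)
    rule3 = 3 / n
    print(f"n = {n:>4}: exact = {exact:.4f}, rule of 3 = {rule3:.4f}")
```

For n = 10 the rule gives 0.30 against an exact 0.26; from n = 30 onwards the gap shrinks quickly.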

And with this we will end this post, though not without some final considerations. First, and as is easy to imagine, statistical programs calculate confidence intervals for the risk without much effort even if the numerator is zero. Similarly, it can be done by hand, and much more elegantly, by resorting to the Poisson probability distribution, although the result is similar to that obtained with the rule of 3.
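For the curious, here is a sketch of that Poisson route (my own minimal version, not necessarily what any particular program does): with zero observed events, the one-sided 95% upper limit for the expected number of events λ satisfies e^(–λ) = 0.05, which is exactly where the 3 of the rule comes from.

```python
from math import log

n = 150
# P(0 events) = exp(-lambda) = 0.05  =>  lambda = -ln(0.05) ~ 3
lambda_upper = -log(0.05)
risk_upper = lambda_upper / n  # upper bound on the individual risk
print(f"lambda = {lambda_upper:.3f}, risk upper bound = {risk_upper:.4f}")
# lambda = 2.996, risk upper bound = 0.0200
```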

Second, what happens if the numerator is not 0 but a small number? Can a similar rule be applied? The answer, again, is yes. Although there is no general rule, extensions have been developed for up to 4 events. But that's another story…

Don't let them trick you with cheese

If you have at home a bottle of wine that has turned a bit vinegary, take my advice and don't throw it away. Wait until you receive one of those scrounging visits (I didn't mention any brother-in-law!) and offer it to your guest. But you have to pair it with a rather strong cheese: the stronger the cheese, the better the wine will taste (you, with any excuse, can drink something else). Well, this trick, almost as old as the human species, has its parallels in the presentation of the results of scientific work.

Let's suppose we conduct a clinical trial to test a new antibiotic (call it A) for the treatment of a serious infection we are interested in. We randomize the selected patients and give them either the new treatment or the usual one (our control group), as chance dictates. Finally, we count how many of our patients have a treatment failure (how many present the event we want to avoid).

Thirty-six of the 100 patients receiving drug A presented the event we wanted to avoid. Therefore, we can conclude that the risk or incidence of presenting the event in the exposed group (Ie) is 0.36 (36 out of 100). Moreover, 60 of the 100 controls (we call them the non-exposed group) presented the event, so the risk or incidence in the non-exposed (Io) is 0.6.

We see at first glance that the risks are different in each group, but as in science we have to measure everything, we can divide the risk in the exposed by the risk in the non-exposed to get the so-called relative risk or risk ratio (RR = Ie/Io). A RR = 1 means that the risk is the same in both groups. If RR > 1, the event is more likely in the exposed group (and the exposure we're studying will be a risk factor for the production of the event); and if RR is between 0 and 1, the risk will be lower in the exposed. In our case, RR = 0.36/0.6 = 0.6. It's easier to interpret the RR when its value is greater than one. For example, a RR of 2 means that the probability of the event is twice as high in the exposed group. Following the same reasoning, a RR of 0.3 would tell us that the event is 70% less frequent in the exposed than in the controls.

But what interests us is how much our intervention decreases the risk of presenting the event, in order to estimate how much effort is needed to prevent each one. So we can calculate the relative risk reduction (RRR) and the absolute risk reduction (ARR). The RRR is the difference in risk between the two groups relative to the control group (RRR = [Io – Ie] / Io). In our case its value is 0.4, which means that the tested intervention reduces the risk by 40% compared to standard therapy.

The ARR is simpler: it's the difference between the control's and the exposed's risks (ARR = Io – Ie). In our case it is 0.24, which means that for every 100 patients treated with the new drug, there will be 24 fewer events than if we had used the control therapy. But there's more: we can know how many patients we have to treat with the new drug to prevent each event just by using a rule of three (24 is to 100 as 1 is to x) or, more easily remembered, by calculating the inverse of the ARR. Thus, we come up with the number needed to treat (NNT) = 1/ARR ≈ 4.2. In our case we would have to treat about four patients to avoid one adverse event. The clinical context will tell us the clinical relevance of this figure.
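All these measures come straight from the raw counts, so they are easy to script. A minimal sketch (the function name and layout are mine, not any standard library's):

```python
def risk_measures(events_exposed, n_exposed, events_control, n_control):
    # RR, RRR, ARR and NNT from raw event counts in the two groups
    ie = events_exposed / n_exposed   # incidence in the exposed
    io = events_control / n_control   # incidence in the non-exposed
    rr = ie / io                      # relative risk
    rrr = (io - ie) / io              # relative risk reduction
    arr = io - ie                     # absolute risk reduction
    nnt = 1 / arr                     # number needed to treat
    return rr, rrr, arr, nnt

# Drug A: 36 events among 100 treated vs 60 among 100 controls
rr, rrr, arr, nnt = risk_measures(36, 100, 60, 100)
print(f"RR = {rr:.2f}, RRR = {rrr:.2f}, ARR = {arr:.2f}, NNT = {nnt:.1f}")
# RR = 0.60, RRR = 0.40, ARR = 0.24, NNT = 4.2
```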

As you can see, the RRR, although technically correct, tends to magnify the effect and doesn't clearly quantify the effort required to obtain the result. In addition, it may be similar in situations with totally different clinical implications. Let's look at another example. Suppose another trial with a drug B in which we get three events among the 100 patients treated and five among the 100 controls.

If you do the calculations, the RR is 0.6 and the RRR is 0.4, as in our previous example, but if you compute the ARR you'll come up with a very different result (ARR = 0.02) and a NNT of 50. It's clear that the effort needed to prevent one event is much higher (four vs. 50) despite the matching RR and RRR.
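Running the hypothetical risk_measures helper sketched above on drug B's counts, risk_measures(3, 100, 5, 100), returns RR = 0.60, RRR = 0.40, ARR = 0.02 and NNT = 50, matching the hand calculation.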

So, at this point, let me give you a piece of advice. Since the data needed to calculate the RRR are the same as those needed to calculate the simpler ARR (and NNT), if a scientific paper offers you only the RRR and hides the ARR, distrust it, and do as you would with the brother-in-law who offers you wine and strong cheese: ask him for an Iberian ham pincho instead. Well, what I really mean is that you'd better ask yourselves why they don't give you the ARR, and compute it yourself from the information in the article.

One final thought to close the topic. There's a tendency to confusion when using or analyzing another measure of association employed in some observational studies: the odds ratio. Although the two can sometimes be comparable, as when the prevalence of the effect is very small, in general the odds ratio has a different meaning and interpretation. But that's another story…