This post is also available in: Spanish

We live in a frantic and highly competitive world. We are continually inundated with messages about how good it is to be the best in this and that. As indeed it is. But most of us soon realize that it is impossible to be the best in everything we do. Gradually, we even realize that it is very hard to be the best at something, and not only in general. In the end, sooner or later, ordinary mortals have to conform to the minimum of not be the worst at what one does.

But this is not that bad. You can’t always be the best and indeed, you certainly do not have to. Consider, for example, we have a great treatment for a very bad disease. This treatment is effective, inexpensive, easy to use and well tolerated. Are we interested in change to another drug?. Probably not. But think now, for example, that it produces an irreversible aplastic anemia in 3% of those who take it. In this case we would like to find a better treatment.

Better?. Well, not really better. If only it were the same in all but except the production of aplasia, we’d change to the new treatment.

The most common goal of clinical trials is to show the superiority of an intervention against a placebo or the standard treatment. But, increasingly, trials are performed with the sole objective to show that the new treatment is equal to the current. The planning of these equivalence trials should be careful and paying attention to a number of aspects.

First, there is no equivalence from an absolute point of view, so you must take much care in keeping the same conditions in both arms of the trial. In addition, we must first set the sensitivity level that we will need in the study. To do this, we first define the margin of equivalence, which is the maximum difference between the two interventions to be considered acceptable from a clinical point of view. Second, we will calculate the sample size needed to discriminate the difference from the point of view of statistical significance.

It is important to understand that the margin of equivalence is marked by the investigator based on the clinical significance of what is being valued. The narrower the margin, the larger the needed sample size to achieve statistical significance and reject the null hypothesis that the differences we observe are due to chance. Contrary to what may seem at first sight, equivalence studies usually require larger samples than studies of superiority.

After obtaining the results, we’ll analyze the confidence intervals of the differences in effect between the two interventions. Only those intervals not crossing the line of no-effect (one for relative risks and odds ratio and zero for mean differences) are statistically significant. If they are also included within the predefined equivalence margins, they will be considered equivalents with the probability of error chosen for the confidence interval, usually 5%. If an interval falls outside the range of equivalency, the intervention is considered not equivalent. In the case of crossing any of the limits of the margin of equivalence, the study is not conclusive as to prove or reject the equivalence of the two interventions, although we should assess the extent and distribution of the interval regarding to the margins of equivalence to rate its possible relevance from a clinical point of view. Sometimes, not statistically significant results or those outside the equivalence range limits may also provide useful clinical information.

Look at the example of the figure to better understand what we have said so far. We have the intervals of nine studies represented with its position regarding the line of no-effect and the limits of equivalence. Only studies A, B, D, G and H show a statistically significant difference, because they are not crossing the line of no-effect. A’s intervention is superior, whereas H’s is showed inferior. However, only in case of D’s can we conclude equivalence of the two interventions, while B’s and G’s are inconclusive with regard to equivalence.

You can also conclude equivalence of the two interventions of E study. Notice that, although the difference obtained in D is statistically significant, is not to exceed the limits of equivalence: it’s superior to E from the statistical point of view, but it seems that the difference has no clinical relevance.

Besides the studies B and G already mentioned, C, F and I are inconclusive regarding equivalence. However, C will probably not be inferior and F could be Inferior. We could even estimate the probability of these assumptions based on the amount of the intervals that fall within the limits of equivalence.

An important aspect of equivalence studies is the method used to analyze results. We know that the intention to treat analysis is always preferable to the per protocol analysis as it keeps the advantages of randomization of known and unknown variables that may influence the results. The problem is that the intention to treat analysis favors the null hypothesis, minimizing the differences, if any. This is an advantage in superiority studies: finding a difference reinforces de result. However, this is not so advantageous in the case of equivalence studies. Otherwise, the per protocol analysis would tend to increase any difference, but this is not always the case and may vary depending on what motivated the protocol violations, losses or mistakes of assignment between the two arms of the trial. For these reason, it’s usually advised to analyze results in both ways and to check that interventions showed equivalents with both methods. We’ll also take into account losses during study and analyze the information provided by the participants who don’t follow the original protocol.

A particular case of this type of trial is the non-inferiority. In this case, researchers are contented to demonstrate that the new intervention is not worse than the comparison. All we have said about equivalence is valid here, but considering only the lower limit of the range of equivalence.

One last thing. Studies of superiority are to demonstrate superiority and equivalence studies are to demonstrate equivalence. One of the designs is not useful to show the goal of the other. Furthermore, if a study fails to demonstrate superiority, it does not exactly mean that the two procedures are equivalent.

We have reached the end without speaking anything about other characteristic equivalence studies: bioequivalence studies. These are phase I trials conducted by pharmaceutical companies to test the equivalence of different presentations of the same drug, and they have some design specifications. But that’s another story…

## Leave a Reply