# Critical appraisal of network meta-analysis

The methodological aspects of the network meta-analysis and the criteria for critically reading the studies based on this design are described.

When Georg Cantor set out to develop set theory, he could hardly have imagined everything that would come after it, probably at the hands of mathematicians as dedicated as he was. I am thinking of the curious case of binary relations, which the older ones among you will remember from the time when children learned things at school.

It turns out that some mathematical genius started thinking and described a series of properties. The first is the reflexive property. This means that any number x is equal to itself: x equals x. In case anyone has not understood, let us give an anatomical example: my right hand is my right hand. I believe that the genius who invented the reflexive property needed a long recovery in some spa after such a huge mental strain.

It was in this spa where he decided to do something more intense, so he described the symmetric property, which is much more complex: whenever a number x equals y, then y equals x. Going back to the anatomical simile, if my arms and legs are my extremities, you will have to agree that my extremities are my arms and my legs. Algebra is fascinating.

Luckily, in the end, if only to go through the motions and save face, our anonymous genius invented the **transitive property**, which goes more or less like this: if x is related to y, and y is related to z, the relation is transitive if x is also related to z.

Again, to the anatomy: if my leg is mine and my foot belongs to my leg, my foot is also mine. After that, more properties were derived from these three, but we shall leave it here for the moment, because today we are going to use the power of the transitive property to find out which of two things that have never actually been compared with each other is the better one.

Think, for example, of a crazed mob running into a shopping center on the first day of sales. They look at everything before deciding what to buy, but they do not need to compare all the products two by two to know which one they like best.

In medicine something similar happens. The usual thing is that there are several options to treat the same disease (although those of us who have been in the business for a long time know that the more there are, the more likely it is that none works at all). Clinical trials, and meta-analyses of clinical trials, usually compare only pairs of treatments, and it may happen that no one has compared the two we have at our disposal or that we want to know which is, in theory, the best of all those available.

## Network meta-analysis

Well, for that purpose a methodological design called **network meta-analysis** (NMA), also known as multiple-treatments meta-analysis or mixed treatment comparison meta-analysis, has been invented. And this last term, mixed comparisons, holds the crux of the matter, because it turns out that there are several types of comparisons. Let’s see them.

Let’s assume we have three possible treatments that, after deep reflection, I decided to call A, B and C. The simplest situation is to compare two of them, A and B, for example, with a conventional clinical trial. We would be making a **direct comparison** between the two interventions. But it may happen that we do not have any trial that directly compares A and B, but there are two different trials that compare each of them with a third intervention, C (you can see it in the attached figure).

In this case we can resort to the power of the transitive property and make an **indirect comparison** between A and B based on their relative efficacy against C. For example, if A reduces mortality by 50% compared to C and B reduces it by 75% compared to C, we can say that B reduces mortality by 50% relative to A (the risk ratios against C are 0.50 and 0.25, and 0.25/0.50 = 0.50). Of course, in order to do this, **transitivity** has to be fulfilled, something that we cannot take for granted.

For example, if I like pork and pigs like to wallow in mud, that does not mean that I like to wallow in mud. Transitivity is not fulfilled in this case (I think).
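The arithmetic behind an indirect comparison through a common comparator can be sketched in a few lines of Python. This is only a minimal sketch of the usual adjusted indirect comparison (often attributed to Bucher), assuming that transitivity holds and that the A-C and B-C estimates come from independent trials. The risk ratios and intervals are made up for the illustration, and the function names are mine.

```python
import math

Z95 = 1.96  # normal quantile used for 95% intervals


def log_effect(rr, ci_low, ci_high):
    """Return the log risk ratio and its standard error, recovered from a 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
    return math.log(rr), se


def indirect_comparison(a_vs_c, b_vs_c):
    """Compare A with B through the common comparator C.

    Each argument is a tuple (risk ratio, CI lower limit, CI upper limit).
    """
    log_ac, se_ac = log_effect(*a_vs_c)
    log_bc, se_bc = log_effect(*b_vs_c)
    log_ab = log_ac - log_bc                    # effects subtract on the log scale
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)  # variances add up
    ci = (math.exp(log_ab - Z95 * se_ab), math.exp(log_ab + Z95 * se_ab))
    return math.exp(log_ab), ci, se_ab


# Hypothetical trial results: A versus C and B versus C
rr_ab, ci_ab, se_ab = indirect_comparison((0.80, 0.65, 0.98), (0.50, 0.38, 0.66))
print(f"Indirect RR of A versus B: {rr_ab:.2f} "
      f"(95% CI {ci_ab[0]:.2f} to {ci_ab[1]:.2f})")
```

Note that, because the variances add up, the indirect estimate is always less precise than either of the two direct estimates it is built from.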

Well, an NMA is nothing more than a series of direct, indirect and mixed comparisons that allow us to compare the relative effects of several interventions. These multiple comparisons are typically represented as a network diagram in which we can see the direct, indirect and mixed comparisons.

Each node in the network, which can vary in size according to the amount of evidence behind it (number of studies or participants), corresponds to one of the treatments compared, while the lines joining the nodes represent the direct comparisons made in the primary studies. The complete network will represent all the treatment comparisons identified from the primary studies of the review that incorporates our NMA.

As with the other types of meta-analyses coupled with a systematic review, the validity of the NMA will depend on the validity of the primary studies, the heterogeneity among them and the possible existing information biases, factors that will condition the quality of the direct comparisons.

In addition, indirect comparisons are considered observational and require, as we have already mentioned, that the researcher judge the transitivity of the interventions based on their knowledge about them, about the disease and about the designs of the primary studies.

Another specific aspect of the NMA is that of coherence or **consistency**, which refers to the level of agreement between the evidence coming from direct comparisons and that coming from indirect comparisons. This level of agreement, which can be measured with specific statistical methods, must be high for the summary result measure to be valid. The results of the comparisons must go in the same direction; they cannot be divergent. When this is not fulfilled, the cause probably lies in the poor methodological quality of the primary studies, in their heterogeneity or in the presence of biases.
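One simple way of putting a number on this agreement is to take the difference between the direct and the indirect estimate of the same comparison and check whether it is compatible with zero. The following is just a sketch of that idea, assuming both estimates are on the log scale and come from independent sets of studies. The values are invented, and real NMA software uses more elaborate approaches.

```python
import math


def inconsistency_test(direct, se_direct, indirect, se_indirect):
    """Compare the direct and indirect estimates of one comparison (log scale)."""
    difference = direct - indirect
    se_difference = math.sqrt(se_direct ** 2 + se_indirect ** 2)
    z = difference / se_difference
    # Two-sided p-value under a normal approximation
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return difference, z, p_value


# Hypothetical log odds ratios of A versus B
difference, z, p_value = inconsistency_test(-0.40, 0.15, -0.10, 0.20)
print(f"Difference {difference:.2f}, z = {z:.2f}, p = {p_value:.3f}")
```

A large p-value does not prove consistency, since these tests tend to have little power, so it should be read together with the clinical plausibility of the transitivity assumption.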

As in other meta-analyses, the result of the NMA is expressed with a summary result measure that can be an odds ratio, a mean difference, a risk ratio, etc. This point estimate is accompanied by an interval that gives us information about the precision of the estimate. The statistical analysis of the NMA can use frequentist methods (the ones we usually see in conventional clinical trials) or Bayesian methods.

The latter are based on assigning a prior probability to the effect of the treatment before analyzing the data and then obtaining a posterior probability after the analysis. For what interests us here, the frequentist methods will assess the precision of the point estimate by means of the well-known **confidence intervals** (usually 95%), while the Bayesian methods will provide **credible intervals** (also 95%), which are read in a similar way.
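To see why both intervals usually tell a similar story, here is a toy sketch for a single estimate, assuming a normal prior and a normal likelihood on the log odds ratio scale (the conjugate case, in which the posterior is also normal). A real Bayesian NMA fits the whole network at once, usually by simulation, so this is only an illustration, and the prior values are arbitrary assumptions of mine.

```python
import math

Z95 = 1.959964  # normal quantile for a 95% interval


def confidence_interval(log_or, se):
    """Frequentist 95% confidence interval, returned on the odds ratio scale."""
    return math.exp(log_or - Z95 * se), math.exp(log_or + Z95 * se)


def credible_interval(log_or, se, prior_mean=0.0, prior_sd=10.0):
    """Bayesian 95% credible interval with a normal prior on the log odds ratio."""
    prior_precision = 1 / prior_sd ** 2
    data_precision = 1 / se ** 2
    posterior_var = 1 / (prior_precision + data_precision)
    posterior_mean = posterior_var * (prior_precision * prior_mean
                                      + data_precision * log_or)
    posterior_sd = math.sqrt(posterior_var)
    return (math.exp(posterior_mean - Z95 * posterior_sd),
            math.exp(posterior_mean + Z95 * posterior_sd))


# Hypothetical estimate: odds ratio 0.70 with standard error 0.15 on the log scale
log_or, se = math.log(0.70), 0.15
ci = confidence_interval(log_or, se)
vague = credible_interval(log_or, se)                    # nearly flat prior
skeptical = credible_interval(log_or, se, prior_sd=0.2)  # prior centred on no effect
print("Confidence interval:", ci)
print("Credible interval, vague prior:", vague)
print("Credible interval, skeptical prior:", skeptical)
```

With a vague prior the two intervals are practically the same, while an informative prior centred on no effect pulls the credible interval towards an odds ratio of 1.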

With all these data we will obtain a ranking of the compared treatments, with the best heading the list. But do not be overconfident: these rankings have to be looked at carefully for several reasons. First, the best treatment in one situation may not be so in another. Second, we must take into account other factors such as cost, availability, knowledge of the clinician, etc. Third, these rankings do not take into account the magnitude of the differences between the different elements. And fourth, chance can play tricks on us and put in a good position a treatment that, in reality, is not as good as it may seem.

Having reviewed, at a glance, the peculiarities of the NMA, what can we say about its critical appraisal? Just as we have a checklist for the systematic review with conventional meta-analysis, the PRISMA statement, there is a specific extension for the NMA, the PRISMA-NMA.
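The role of chance in these rankings can be illustrated with a small simulation. This is a rough sketch that assumes each treatment effect (log risk ratio against a common reference, lower is better) follows an independent normal distribution. In a real NMA the estimates are correlated and the ranking probabilities come from the fitted model, so take the function and its numbers as hypothetical.

```python
import random


def rank_probabilities(estimates, n_draws=10000, seed=0):
    """Estimate the probability that each treatment occupies each rank.

    estimates maps a treatment name to (log risk ratio, standard error).
    Lower values are better, so position 0 is the best rank.
    """
    rng = random.Random(seed)
    names = list(estimates)
    counts = {name: [0] * len(names) for name in names}
    for _ in range(n_draws):
        draws = {name: rng.gauss(mu, se) for name, (mu, se) in estimates.items()}
        for rank, name in enumerate(sorted(names, key=draws.get)):
            counts[name][rank] += 1
    return {name: [c / n_draws for c in row] for name, row in counts.items()}


# Hypothetical estimates: A has the best point estimate but is very imprecise
estimates = {"A": (-0.50, 0.30), "B": (-0.45, 0.10), "C": (-0.10, 0.10)}
probabilities = rank_probabilities(estimates)
for name, row in probabilities.items():
    print(name, [round(p, 2) for p in row])
```

In this made-up example A heads the list by point estimate, yet its imprecision means it is far from certain to be the best and, unlike B, it has a real chance of being the worst of the three.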

This list includes, as specific items, aspects such as the description of the geometry of the treatments network, the consideration of the transitivity and consistency assumptions and the description of the methods used to analyze the structure of the network and the suitability of the comparisons, in case some may have a lower degree of evidence. All this will be facilitated if the authors provide the graph with the study network and briefly explain its characteristics.

Anyway, you know that I’d rather resort to CASP’s tools for the critical appraisal of documents. Although there is no specific one for NMA, I advise you to use the one for systematic reviews with conventional meta-analysis and, later, to make some considerations about the specific aspects of the NMA.

## Critical appraisal of network meta-analysis

So as not to make this post too long, we will skip the whole part that an NMA shares with any other systematic review and go directly to its specific aspects. You can consult the corresponding post where we reviewed the critical appraisal of a systematic review. As always, we will follow our three pillars of wisdom: validity, relevance and applicability.

Regarding **VALIDITY**, we will ask three specific questions.

**Does the review respond to a well-defined clinical question that justifies carrying out an NMA?** This question has the classic components of the PICO question, although the intervention and the comparison will encompass the multiple comparisons of the network.

**Was an exhaustive search of the relevant studies carried out?** This aspect is important to avoid publication bias and to make sure that all the important information available is included. Missing studies can affect the consistency of the comparisons.

**Is there a clear specification of the target population, the treatments evaluated and the outcome measures used?** All these aspects can condition the validity of indirect comparisons. If we want to infer the relationship between the effects of A and B by comparing their individual effects with respect to C, it is essential that A and B are handled similarly in their comparisons with C, that the A-C and B-C comparisons are made with similar patients, that the same outcome measures are used and that the risk of bias in the studies is low. The latter can be assessed with the usual tools, such as the Cochrane risk-of-bias tool.

To finish this section, we will check that the results are analyzed and presented in an appropriate way: which statistical method has been used (frequentist or Bayesian), and whether confidence or credible intervals, the analysis of the network, etc. are provided.

Although we will not go into it, we will say that there are multiple types of networks (star, loop, line …). For comparisons to be more valid, indirect comparisons must be supported by direct ones. This can be seen in the network scheme by the presence of triangles similar to the graph that I attached at the beginning of the post (or other closed geometric shapes). Other things being equal, and leaving aside the factors we have already mentioned, the more triangles we see, the more valid the comparisons will be.

As a last aspect, we will evaluate whether the authors have used appropriate methods to assess heterogeneity and the possible existence of inconsistency: sensitivity analysis, meta-regression, etc.
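If we want to look for those triangles without squinting at the figure, the network can be described as a simple graph. The sketch below uses a hypothetical list of head-to-head comparisons of my own invention, and finds both the closed triangles and the pairs of treatments that can only be compared indirectly through a common comparator.

```python
from itertools import combinations


def build_network(comparisons):
    """Map each treatment to the set of treatments it was directly compared with."""
    network = {}
    for a, b in comparisons:
        network.setdefault(a, set()).add(b)
        network.setdefault(b, set()).add(a)
    return network


def closed_triangles(network):
    """Return the trios of treatments in which every pair has a direct comparison."""
    return [trio for trio in combinations(sorted(network), 3)
            if all(y in network[x] for x, y in combinations(trio, 2))]


def indirect_only_pairs(network):
    """Return the pairs with no head-to-head trial but with a common comparator."""
    return [(a, b) for a, b in combinations(sorted(network), 2)
            if b not in network[a] and network[a] & network[b]]


# Hypothetical head-to-head comparisons found in the primary studies
trials = [("A", "C"), ("B", "C"), ("A", "B"), ("C", "D")]
network = build_network(trials)
print("Closed triangles:", closed_triangles(network))
print("Indirect-only pairs:", indirect_only_pairs(network))
```

A star-shaped network, in which every treatment has been compared only with a single common comparator, would return no triangles at all, so none of its indirect comparisons could be checked against direct evidence.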

Going to the **RELEVANCE** section, we will assess the results of the meta-analysis. Here we will consider five specific aspects:

**What is the result?** As in any other meta-analysis, we will assess the result and its importance from the clinical point of view.

It will be necessary to assess how the result could have been influenced by the risk of bias in the primary studies: the greater the risk of bias, the farther our estimate can be from the truth.

**Are the results precise?** In this sense, we must assess the width of the confidence or credible intervals, taking into account how the conclusions of the study would be affected at each end of the interval.

**Is there consistency of results among different studies?** There may be variability by pure chance or by heterogeneity among the studies. We can assess it by observing the shape of the forest plots and with the help of the usual statistical methods, such as I².

**Are indirect comparisons reliable?** We return again to the concept of transitivity, which must be taken into account together with the other factors that we have previously commented on and whose absence may increase the risk of bias: homogeneous populations, common outcome variables and comparators, etc.

**Is there consistency between direct and indirect comparisons?** We will have to check for closed geometric shapes within the network (our triangles or loops), as well as rule out causes of inconsistency, which are the same we have already mentioned as causing heterogeneity and intransitivity.
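For those who have forgotten where the I² mentioned among these questions comes from, it is derived from Cochran's Q statistic of a fixed-effect pooling. A minimal sketch with invented effects on the log scale could look like this.

```python
def i_squared(effects, standard_errors):
    """Return the pooled fixed-effect estimate, Cochran's Q and I² (as a percentage)."""
    weights = [1 / se ** 2 for se in standard_errors]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I² is the share of Q that exceeds what chance alone would explain
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical log risk ratios from three studies of the same comparison
pooled, q, i2 = i_squared([-0.6, -0.1, 0.2], [0.1, 0.1, 0.1])
print(f"Pooled = {pooled:.2f}, Q = {q:.1f}, I² = {i2:.0f}%")
```

Values of I² close to zero suggest that the variability among studies is compatible with chance, while high values point to heterogeneity that deserves an explanation.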

Finally, we will finish our critical appraisal by making some special considerations regarding the **APPLICABILITY** of the results.

In addition to taking into account, as usual, whether all the important effects and variables for the patient have been considered and whether the patients are similar to those of our environment, we will ask the questions specifically related to the use of an NMA, such as whether the network has considered all the treatment options or whether the different comparison subgroups that have been established are credible from the clinical point of view.

## We’re leaving…

And here we will leave it for today. A difficult beast to tame, this NMA. And that is without having said anything about its statistical methodology, which is quite complex but which computer packages carry out without flinching. In addition, we could have talked a lot about the types of networks and the comparisons that can be drawn from each of them. But that’s another story…