# Sensitivity analysis

Sensitivity analysis lets us assess how the results are affected by the type of data used, by missing or extreme values, and by other choices made in the analysis.

Doing things with sensitivity usually guarantees good results. But you have probably never thought that this also applies to science: whenever we use the scientific method, we have to analyze our results with sensitivity to ensure their validity. Well, actually, we have to do a **sensitivity analysis**.

## Sensitivity analysis

It turns out that, when performing biomedical studies, we sometimes make certain assumptions, and these assumptions, which are often related to the analysis method or the models used, can influence the results we get. Whenever we can ask ourselves whether the results would change if we changed one of the definitions of the study, the method of analysis, or the way we deal with missing data, non-compliance or protocol violations, the validity of our results may be compromised. To address this we can perform a sensitivity analysis and, if the results remain unchanged, we can say that our conclusions are robust.

A sensitivity analysis is, therefore, the method we use to determine the robustness of an assessment by examining to what extent the results would be influenced by changes in the methodology or the models used in the study.

So, if our results are based on assumptions that may affect their impact, we will be forced to do a sensitivity analysis, whose methodology will depend on each specific clinical scenario.

## Outliers

One example is the presence of extreme values, or outliers, which can skew the mean of a sample and alter the estimates made from it. The easiest way to check for outliers is to draw a boxplot and, if there are any, to run the analysis with and without them to see how the results change.
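As a rough sketch of that with-and-without comparison, the snippet below flags outliers with the usual boxplot rule (anything beyond 1.5 times the interquartile range from the quartiles) and computes the mean both ways. The blood pressure readings and the function names are made up for illustration, and the 1.5 multiplier is just the conventional default.

```python
import statistics

def tukey_fences(values, k=1.5):
    """Return the (lower, upper) fences drawn by a conventional boxplot."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def mean_with_and_without_outliers(values, k=1.5):
    """Compare the mean of all values with the mean after dropping outliers."""
    low, high = tukey_fences(values, k)
    kept = [v for v in values if low <= v <= high]
    return {
        "mean_all": statistics.mean(values),
        "mean_without_outliers": statistics.mean(kept),
        "outliers": [v for v in values if v < low or v > high],
    }

# Hypothetical systolic blood pressure readings (mmHg); 250 looks suspicious.
readings = [118, 122, 125, 119, 130, 127, 121, 124, 250, 126]
result = mean_with_and_without_outliers(readings)
print(result)
```

If the two means lead to the same conclusion, the outlier is not driving the result; if they do not, both analyses deserve to be reported.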

In other cases there is a lack of adherence to the intervention, or there are protocol violations, that may dilute the effect of the intervention. In these cases we can perform both an intention-to-treat analysis and a **per-protocol analysis** and check whether there are any differences.
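A minimal sketch of the two analyses side by side might look like this. The `Participant` record, the risk difference as effect measure and the toy trial are all assumptions chosen for illustration, not something taken from a real study.

```python
from dataclasses import dataclass

@dataclass
class Participant:
    assigned: str   # arm assigned at randomization: "treatment" or "control"
    adhered: bool   # True if the participant followed the protocol
    event: bool     # True if the outcome (say, a relapse) occurred

def risk(participants, arm):
    """Proportion of participants in the given arm who had the event."""
    group = [p for p in participants if p.assigned == arm]
    return sum(p.event for p in group) / len(group)

def risk_difference(participants):
    return risk(participants, "treatment") - risk(participants, "control")

def intention_to_treat(participants):
    """Everyone is analyzed in the arm they were randomized to."""
    return risk_difference(participants)

def per_protocol(participants):
    """Only participants who adhered to the protocol are analyzed."""
    return risk_difference([p for p in participants if p.adhered])

# Hypothetical trial: two non-adherent treated participants, both with events.
trial = (
    [Participant("treatment", True, True)] * 2
    + [Participant("treatment", True, False)] * 6
    + [Participant("treatment", False, True)] * 2
    + [Participant("control", True, True)] * 5
    + [Participant("control", True, False)] * 5
)

print("ITT risk difference:", intention_to_treat(trial))
print("Per-protocol risk difference:", per_protocol(trial))
```

In this invented example the per-protocol estimate looks more favorable than the intention-to-treat one, which is exactly the kind of discrepancy the comparison is meant to reveal.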

At other times, the definition of outcomes can be arbitrary, so it may be useful to review what conclusions we could obtain using different cutoff points.
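One way to sketch this is to recompute the same comparison over a small grid of cutoff points. The measurements, the cutoffs and the function names below are hypothetical; the outcome is simply "value at or above the cutoff".

```python
def proportion_at_or_above(values, cutoff):
    """Share of values classified as positive for a given cutoff."""
    return sum(v >= cutoff for v in values) / len(values)

def cutoff_sensitivity(treated, control, cutoffs):
    """Return (cutoff, treated share, control share, difference) per cutoff."""
    rows = []
    for cutoff in cutoffs:
        p_treated = proportion_at_or_above(treated, cutoff)
        p_control = proportion_at_or_above(control, cutoff)
        rows.append((cutoff, p_treated, p_control, p_treated - p_control))
    return rows

# Hypothetical systolic blood pressure values (mmHg) at the end of follow-up.
treated = [128, 132, 135, 138, 141, 144, 129, 150, 126, 133]
control = [136, 142, 145, 139, 152, 148, 131, 155, 141, 137]

rows = cutoff_sensitivity(treated, control, cutoffs=[130, 140, 150])
for cutoff, p_treated, p_control, diff in rows:
    print(f"cutoff {cutoff}: treated {p_treated:.2f}, control {p_control:.2f}, diff {diff:+.2f}")
```

If the direction and rough size of the difference hold across reasonable cutoffs, the conclusion does not hinge on an arbitrary definition.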

With cluster sample designs, as in multicenter studies, we can compare the results of the global analysis with those obtained within each cluster, with and without adjusting for the cluster each participant belongs to. This is because homogeneity within clusters and between clusters may differ.
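To make the idea concrete, here is a small sketch that compares a pooled difference in means with the per-center differences and with a simple size-weighted average of them. The weighting scheme is only one crude way of adjusting for the center (a real analysis would more likely use a mixed or stratified model), and the two centers and their data are invented.

```python
import statistics

def difference_in_means(treatment, control):
    return statistics.mean(treatment) - statistics.mean(control)

def pooled_difference(centers):
    """Ignore the clustering and analyze everyone together."""
    treatment = [v for c in centers.values() for v in c["treatment"]]
    control = [v for c in centers.values() for v in c["control"]]
    return difference_in_means(treatment, control)

def per_center_differences(centers):
    """Difference in means computed separately inside each center."""
    return {
        name: difference_in_means(c["treatment"], c["control"])
        for name, c in centers.items()
    }

def size_weighted_difference(centers):
    """Average the per-center differences, weighting by center size."""
    weighted_sum = 0.0
    total = 0
    for c in centers.values():
        n = len(c["treatment"]) + len(c["control"])
        weighted_sum += n * difference_in_means(c["treatment"], c["control"])
        total += n
    return weighted_sum / total

# Hypothetical outcome scores from two centers with different baseline levels.
centers = {
    "A": {"treatment": [10, 12, 11], "control": [8, 9, 10]},
    "B": {"treatment": [20, 22], "control": [19, 21, 20, 20]},
}

print("pooled:", pooled_difference(centers))
print("per center:", per_center_differences(centers))
print("size-weighted:", size_weighted_difference(centers))
```

With these made-up numbers the treatment looks better inside each center but slightly worse in the pooled analysis, simply because the centers differ in level and in how many participants each arm contributes.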

A special situation is that of competing risks. For instance, if we assess infarction, angina and death as outcome variables, death will preclude the later occurrence of the first two, so the survival analysis could be compromised. To deal with this there are methods of analysis that use **Kaplan-Meier curves** and censor the competing events. In any case, the sensitivity analysis should adjust for the competing risk.
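The sketch below contrasts two estimates of the probability of infarction by the end of follow-up: one minus the Kaplan-Meier estimate with deaths treated as censored, and a cumulative incidence estimate that accounts for death as a competing event. The second estimator is our own choice for the comparison (the text does not name a specific method), and the eight follow-up records are invented.

```python
def naive_km_incidence(records, cause):
    """1 - Kaplan-Meier for `cause`, treating every other outcome as censoring."""
    survival = 1.0
    for t in sorted({time for time, _ in records}):
        at_risk = sum(1 for time, _ in records if time >= t)
        events = sum(1 for time, status in records if time == t and status == cause)
        survival *= 1 - events / at_risk
    return 1 - survival

def cumulative_incidence(records, cause):
    """Cumulative incidence of `cause` in the presence of competing events."""
    event_free = 1.0  # probability of no event of any kind just before time t
    incidence = 0.0
    for t in sorted({time for time, _ in records}):
        at_risk = sum(1 for time, _ in records if time >= t)
        cause_events = sum(1 for time, s in records if time == t and s == cause)
        any_events = sum(1 for time, s in records if time == t and s != "censored")
        incidence += event_free * cause_events / at_risk
        event_free *= 1 - any_events / at_risk
    return incidence

# Hypothetical (time in months, status) records for eight patients.
records = [
    (1, "death"), (2, "infarction"), (3, "death"), (4, "infarction"),
    (5, "censored"), (6, "infarction"), (7, "death"), (8, "censored"),
]

naive = naive_km_incidence(records, "infarction")
adjusted = cumulative_incidence(records, "infarction")
print(f"1 - KM (deaths censored): {naive:.3f}")
print(f"cumulative incidence:     {adjusted:.3f}")
```

Censoring the deaths implicitly assumes those patients could still have had an infarction later, so the naive figure comes out higher than the one that accounts for the competing risk.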

Something similar occurs when there are differences in baseline characteristics between the control and intervention groups. In these cases, the usual analysis should be complemented by an analysis adjusting for these differences, usually using a multivariate regression model.

## Frequency distributions

And finally, two prickly problems related to the statistical analysis. The first one concerns the kind of frequency distribution we use to analyze the data. It’s often assumed that continuous variables follow a normal distribution, discrete counts a Poisson distribution, and binary variables a binomial distribution. We usually check that the data fit the chosen distribution but, if we want more certainty about its validity, we should repeat the analysis assuming different distributions, such as Student’s t instead of the normal or the negative binomial instead of the Poisson.
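As an illustration for count data, the following sketch compares how well a Poisson and a negative binomial describe the share of zeros in a sample. The negative binomial is fitted by the method of moments, which is a shortcut we assume here to keep things simple (maximum likelihood would be the more usual route), and the counts are invented.

```python
import math
import statistics

def poisson_zero_probability(mean):
    """P(X = 0) under a Poisson with the given mean."""
    return math.exp(-mean)

def negative_binomial_zero_probability(mean, variance):
    """P(X = 0) under a negative binomial fitted by the method of moments."""
    if variance <= mean:
        raise ValueError("no overdispersion: variance must exceed the mean")
    size = mean ** 2 / (variance - mean)  # from variance = mean + mean**2 / size
    return (size / (size + mean)) ** size

def compare_count_models(counts):
    mean = statistics.mean(counts)
    variance = statistics.variance(counts)
    return {
        "dispersion_index": variance / mean,  # about 1 if Poisson holds
        "observed_zero_share": counts.count(0) / len(counts),
        "poisson_zero_share": poisson_zero_probability(mean),
        "negbin_zero_share": negative_binomial_zero_probability(mean, variance),
    }

# Hypothetical number of asthma attacks per patient in one year.
attacks = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2, 3, 8]
summary = compare_count_models(attacks)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

A dispersion index well above 1, together with a Poisson that badly underestimates the zeros, suggests that conclusions drawn under the Poisson assumption should be rechecked with the negative binomial.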

## Missing data

The second issue is the problem of missing data. Here we have two options: perform a complete-case analysis, using only the records without missing data, or assume a value for the missing ones (people who know about these topics call it imputation) so that all records can be included in the analysis. We run the risk of bias with both options, depending mostly on the cause of the missing data and on whether data are missing at random or not at random. We can do both the complete-case and the imputed analysis and compare the results obtained with each method.
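Here is a deliberately simple sketch of that comparison. It uses single-value imputation under a best-case and a worst-case scenario, which is our own simplification for illustration (it is much cruder than methods such as multiple imputation), and the quality-of-life scores are made up.

```python
import statistics

def complete_case_mean(values):
    """Mean of the records that have an observed value."""
    return statistics.mean(v for v in values if v is not None)

def imputed_mean(values, fill_value):
    """Mean after replacing every missing value with `fill_value`."""
    return statistics.mean(fill_value if v is None else v for v in values)

def missing_data_sensitivity(values, scenarios):
    """Compare the complete-case mean with the mean under each scenario."""
    results = {"complete_case": complete_case_mean(values)}
    for label, fill_value in scenarios.items():
        results[label] = imputed_mean(values, fill_value)
    return results

# Hypothetical quality-of-life scores; None marks a missing measurement.
scores = [70, 65, None, 80, None, 75, 60, None, 85, 90]
observed = [v for v in scores if v is not None]

results = missing_data_sensitivity(
    scores,
    scenarios={"worst_case": min(observed), "best_case": max(observed)},
)
print(results)
```

If the conclusion survives even the extreme scenarios, the missing data are unlikely to change it; if it flips, the cause of the missingness needs a closer look.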

## We’re leaving…

And this is, broadly speaking, what a sensitivity analysis is. We didn’t say much about data imputation, but we could write a fat book about it. This is because, although the optimal situation is to prevent missing data, once we have them we can make them up in a lot of different ways. But that’s another story…