Have you read the novel by Henry James? I recommend it. A classic of the horror genre, with a dead and evil governess who appears as a ghost and murky relationships in the background. Today, however, I’m not going to tell you a horror story. Instead, I’ll give another turn of the screw to diagnostic tests, although some of them can be even scarier than a John Carpenter film.
We know there’s no such thing as a perfect diagnostic test. All of them are wrong at some point, either by classifying a healthy person as sick (false positive, FP) or by yielding a negative result in someone who has the disease (false negative, FN). This is why a number of parameters have been developed to characterize diagnostic tests and give us an idea of their performance in daily clinical practice.
Assessment of diagnostic tests
The best known are sensitivity (S) and specificity (Sp). We know they are intrinsic properties of the test and that they inform us about its ability to correctly classify sick patients (S) and healthy people (Sp). The problem is that what we really need to know is the probability of being sick or not given a positive or negative result. These probabilities, conditioned on the result of the test, are provided by the positive and negative predictive values.
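As a minimal sketch of how these four values come out of a 2x2 table, the function below takes the four cell counts and returns S, Sp and both predictive values. The function name and the counts are hypothetical, chosen only for illustration (100 sick and 900 healthy people).

```python
def basic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity and predictive values from a 2x2 table."""
    return {
        "S": tp / (tp + fn),    # sick people with a positive result
        "Sp": tn / (tn + fp),   # healthy people with a negative result
        "PPV": tp / (tp + fp),  # probability of being sick given a positive
        "NPV": tn / (tn + fn),  # probability of being healthy given a negative
    }


# Hypothetical counts: 100 sick and 900 healthy people (prevalence 10%)
metrics = basic_metrics(tp=90, fp=45, fn=10, tn=855)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up numbers the test looks excellent on S and Sp, yet only two out of three positives are truly sick, which is why the predictive values matter at the bedside.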
These pairs of values can characterize the worth of the test, but we’d prefer to be able to define the value of the test with a single number. We may use the likelihood ratios, both positive and negative, which give us an idea of how much more likely a given result is in people who have the disease than in those who don’t, but these ratios bear an ancient curse: they are little known and even less understood by clinicians.
For these reasons, some people have tried to develop other indicators to characterize the validity of diagnostic tests. One of them is the so-called accuracy or precision of the test, which reflects the probability that the test has made a correct diagnosis.
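For readers who have not used them, here is a short sketch of the likelihood ratios using their standard definitions, LR+ = S / (1 - Sp) and LR- = (1 - S) / Sp, plus the usual odds conversion to go from pretest to post-test probability. The function names and the example values (S = 0.90, Sp = 0.95, pretest probability 10%) are assumptions made for illustration.

```python
def likelihood_ratios(s, sp):
    """Positive and negative likelihood ratios from sensitivity and specificity."""
    lr_pos = s / (1 - sp)   # how much a positive result raises the odds of disease
    lr_neg = (1 - s) / sp   # how much a negative result lowers them
    return lr_pos, lr_neg


def post_test_probability(pre_test_prob, lr):
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)


# Hypothetical test: S = 0.90, Sp = 0.95, pretest probability of 10%
lr_pos, lr_neg = likelihood_ratios(0.90, 0.95)
p_after_positive = post_test_probability(0.10, lr_pos)
p_after_negative = post_test_probability(0.10, lr_neg)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.3f}")
print(f"Probability of disease after a positive: {p_after_positive:.3f}")
print(f"Probability of disease after a negative: {p_after_negative:.3f}")
```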
To calculate it, we build a quotient with all the true results in the numerator, true positives (TP) and true negatives (TN), and all possible results in the denominator, according to the following formula:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
This indicator tells us in what percentage of cases the diagnostic test is right, but its value can be difficult to translate into a tangible clinical concept.
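A quick sketch may help to see why this number is hard to interpret on its own. It computes accuracy as correct results over all results, first for a hypothetical test and then for a useless "test" that labels everybody as healthy. All counts are invented for the example.

```python
def accuracy(tp, fp, fn, tn):
    """Proportion of all results (positive and negative) that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical test applied to 100 sick and 900 healthy people
acc_test = accuracy(tp=90, fp=45, fn=10, tn=855)

# A "test" that calls everyone healthy in the same population
acc_all_negative = accuracy(tp=0, fp=0, fn=100, tn=900)

print(f"Accuracy of the test: {acc_test:.3f}")
print(f"Accuracy of calling everyone healthy: {acc_all_negative:.3f}")
```

When the disease is uncommon, the second figure stays high even though not a single sick person is detected, so accuracy alone says little about clinical usefulness.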
Another parameter to measure the overall effectiveness of the test is Youden’s index, which adds up the proportions of sick and healthy people correctly classified by the test and subtracts one, according to the following formula:
Youden’s index = S + Sp - 1
It’s not a bad index as an approximation to the overall performance of the test, but using it as the only parameter to evaluate a diagnostic test is not recommended.
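As a small illustration, Youden’s index can be computed directly from S and Sp. The sketch below also checks the equivalent reading of the index as one minus the sum of the false negative and false positive rates. The function name and the example values are hypothetical.

```python
def youden_index(s, sp):
    """Youden's index: S + Sp - 1."""
    return s + sp - 1


# Hypothetical test with S = 0.90 and Sp = 0.95
j_good = youden_index(0.90, 0.95)

# Same value seen from the error side: 1 - (FN rate + FP rate)
j_from_errors = 1 - ((1 - 0.90) + (1 - 0.95))

# A test no better than tossing a coin (S = Sp = 0.5) scores zero
j_coin = youden_index(0.5, 0.5)

print(f"Youden's index: {j_good:.2f}")
print(f"Coin toss: {j_coin:.2f}")
```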
Some authors go one step further and try to develop parameters that function in an analogous way to the number needed to treat (NNT) of treatment studies. Thus, two parameters have been developed.
Number needed to diagnose
The first one is the number needed to diagnose (NND). If the NNT is the inverse of the proportion of patients who improve with treatment minus the proportion who improve with the control intervention, let’s build an NND by placing in the denominator the difference between the proportion of sick patients with a positive result and the proportion of healthy people with a positive result.
S gives us the proportion of sick patients who test positive, and the complement of Sp gives us the proportion of healthy people who test positive. So:
NND = 1 / [S - (1 - Sp)]
If we simplify the denominator by removing the parentheses, we’ll have:
NND = 1 / (S + Sp - 1)
That is, indeed, the inverse of Youden’s index we saw before:
NND = 1 / Youden’s index
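The calculation can be sketched in a few lines, taking NND as 1 / (S + Sp - 1), with the whole sum in the denominator. The function name, the example values and the decision to raise an error when the index is zero or negative are assumptions of this sketch.

```python
def number_needed_to_diagnose(s, sp):
    """NND as the inverse of Youden's index, 1 / (S + Sp - 1)."""
    youden = s + sp - 1
    if youden <= 0:
        # Assumption of this sketch: a test that performs no better than
        # chance has no meaningful NND, so we refuse to compute one.
        raise ValueError("Youden's index must be positive to compute the NND")
    return 1 / youden


# Two hypothetical tests
nnd_good = number_needed_to_diagnose(0.90, 0.95)
nnd_fair = number_needed_to_diagnose(0.70, 0.80)

print(f"NND of the good test: {nnd_good:.2f}")
print(f"NND of the fair test: {nnd_fair:.2f}")
```

The closer the NND is to 1, the better the test; a perfect test (S = Sp = 1) would have an NND of exactly 1.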
Number needed to misdiagnose
The second parameter is the number of patients we need to test to misdiagnose one (NNMD). To calculate it, we place in the denominator the complement of the accuracy index that we talked about earlier:
NNMD = 1 / (1 - Accuracy)
If we substitute the accuracy for its actual value and simplify the equation, we’ll get:
NNMD = 1 / [1 - S × Pr - Sp × (1 - Pr)]
where Pr is the prevalence of disease (pretest probability). This parameter gives the number of diagnostic tests we have to perform to be wrong once, so the higher the index, the better the test. This index and the previous one are much easier for clinicians to grasp, although both of them share the same drawback: FP and FN are given the same level of importance, which does not always fit the clinical context in which we apply the diagnostic test.
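To see how prevalence enters the picture, here is a short sketch that writes the accuracy as S × Pr + Sp × (1 - Pr) and takes NNMD as the inverse of its complement. The function name and the example values are hypothetical.

```python
def number_needed_to_misdiagnose(s, sp, pr):
    """NNMD = 1 / (1 - accuracy), with accuracy = S*Pr + Sp*(1 - Pr)."""
    acc = s * pr + sp * (1 - pr)
    error_rate = 1 - acc  # equals Pr*(1 - S) + (1 - Pr)*(1 - Sp)
    if error_rate == 0:
        # A perfect test never misdiagnoses anyone
        return float("inf")
    return 1 / error_rate


# Same hypothetical test (S = 0.90, Sp = 0.95) at two prevalences
nnmd_low_prev = number_needed_to_misdiagnose(0.90, 0.95, pr=0.10)
nnmd_high_prev = number_needed_to_misdiagnose(0.90, 0.95, pr=0.50)

print(f"NNMD at 10% prevalence: {nnmd_low_prev:.1f}")
print(f"NNMD at 50% prevalence: {nnmd_high_prev:.1f}")
```

Unlike the NND, this figure changes with the population in which the test is used, even if S and Sp stay the same.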
And these are all the parameters I know of, but surely there are more and, if not, someone will invent some soon. I wouldn’t want to finish without a clarification about Youden’s index, on which we have barely spent any time. This index is not only important for assessing the overall performance of a diagnostic test. It’s also a useful tool for deciding the best cutoff point on a ROC curve, since its maximum value indicates the point of the curve that is farthest from the diagonal. But that’s another story…