Science without sense…double nonsense

Pills on evidence-based medicine

King of Kings

There is no doubt that when doing research in biomedicine we can choose from a large number of possible designs, each with its advantages and disadvantages. But in such a diverse and populous court, among jugglers, wise men, gardeners and purple flautists, the true Crimson King of epidemiology reigns over them all: the randomized clinical trial.

The clinical trial is an interventional analytical study, with anterograde direction and concurrent temporality, and with sampling of a closed cohort with control of exposure. In a trial, a sample of a population is selected and divided randomly into two groups. One of the groups (intervention group) undergoes the intervention that we want to study, while the other (control group) serves as a reference to compare the results. After a given follow-up period, the results are analyzed and the differences between the two groups are compared. We can thus evaluate the benefits of treatments or interventions while controlling the biases of other types of studies: randomization tends to distribute possible confounding factors, known or not, evenly between the two groups, so that if in the end we detect any difference, it can be attributed to the intervention under study. This is what allows us to establish a causal relationship between exposure and effect.

From what has been said up to now, it is easy to understand that the randomized clinical trial is the most appropriate design to assess the effectiveness of any intervention in medicine and is the one that provides, as we have already mentioned, the highest-quality evidence of a causal relationship between the intervention and the observed results.

But to enjoy all these benefits it is necessary to be scrupulous in the approach and methodology of the trials. There are checklists published by experts who know a lot about these issues, such as the CONSORT list, which can help us assess the quality of a trial’s design. But among all these aspects, let us give some thought to those that are crucial for the validity of the clinical trial.

Everything begins with a knowledge gap that leads us to formulate a structured clinical question. The only objective of the trial should be to answer this question, and answering a single question properly is enough. Beware of clinical trials that try to answer many questions since, in many cases, they end up answering none of them well. In addition, the approach must be based on what the inventors of methodological jargon call the equipoise principle, which means nothing more than that, deep in our hearts, we do not really know which of the two interventions is more beneficial for the patient (from the ethical point of view, it would be anathema to make a comparison if we already knew with certainty which of the two interventions is better). It is curious in this sense how the trials sponsored by the pharmaceutical industry are more likely to breach the equipoise principle, since they have a preference for comparing with placebo or with “non-intervention” in order to demonstrate the efficacy of their products more easily.

Then we must carefully choose the sample on which we will perform the trial. Ideally, all members of the population should have the same probability not only of being selected, but also of ending up in either of the two branches of the trial. Here we are faced with a small dilemma. If we are very strict with the inclusion and exclusion criteria, the sample will be very homogeneous and the internal validity of the study will be strengthened, but it will be more difficult to extend the results to the general population (this is the explanatory attitude of sample selection). On the other hand, if we are not so rigid, the results will be more similar to those of the general population, but the internal validity of the study may be compromised (this is the pragmatic attitude).

Randomization is one of the key points of the clinical trial. It is the one that assures us that we can compare the two groups, since it tends to distribute the known variables equally and, more importantly, also the unknown variables between the two groups. But do not relax too much: this distribution is not guaranteed at all, it is only more likely to happen if we randomize correctly, so we should always check the homogeneity of the two groups, especially with small samples.
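
To make the idea of "randomize, then check" a little more tangible, here is a minimal sketch in Python. The participants, their ages and the seed are all invented for illustration; the sketch only shows simple randomization into two equal branches followed by a quick look at one baseline variable.

```python
import random
from statistics import mean

rng = random.Random(2024)  # fixed seed, only so that the sketch is reproducible

# Hypothetical participants as (id, age); the ages are made up for illustration
participants = [(i, rng.randint(18, 80)) for i in range(40)]

# Simple randomization: shuffle the list and split it in two halves
shuffled = participants[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
intervention, control = shuffled[:half], shuffled[half:]

# Homogeneity check on a known baseline variable (age)
mean_age_int = mean(age for _, age in intervention)
mean_age_ctl = mean(age for _, age in control)
print(f"mean age: intervention={mean_age_int:.1f}, control={mean_age_ctl:.1f}")
```

With only 20 participants per branch the two means can easily differ by a few years purely by chance, which is why the check matters most with small samples.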

In addition, randomization allows us to perform masking appropriately, so that the response variable is measured without bias and information biases are avoided. The results of the intervention group can be compared with those of the control group in three ways. One of them is to compare with a placebo. The placebo should be a preparation with physical characteristics indistinguishable from those of the intervention drug but without its pharmacological effects. This serves to control the placebo effect (which depends on the patient’s personality, their feelings towards the intervention, their love for the research team, etc.), but also the side effects that are due to the intervention and not to the pharmacological effect (think, for example, of the percentage of local infections in a trial with medication administered intramuscularly).

The second way is to compare with the treatment accepted as the most effective so far. If there is a treatment that works, the logical (and more ethical) thing is to use it to investigate whether the new one brings benefits. It is also the usual comparison method in equivalence or non-inferiority studies. Finally, the third possibility is to compare with non-intervention, although in reality this is a far-fetched way of saying that only the usual care that any patient would receive in their clinical situation is applied.

It is essential that all participants in the trial are subjected to the same follow-up schedule, which must be long enough to allow the expected response to occur. All losses that occur during follow-up should be detailed and analyzed, since they can compromise the validity and power of the study to detect significant differences. And what do we do with those who are lost or end up in a different branch from the one assigned? If there are many, it may be more reasonable to reject the study. Another possibility is to exclude them and act as if they had never existed, but this can bias the results of the trial. A third possibility is to include them in the analysis in the branch of the trial in which they actually participated (there is always someone who gets confused and takes what he should not), which is known as as-treated (or per-protocol) analysis. And the fourth and last option we have is to analyze them in the branch that was initially assigned to them, regardless of what they did during the study. This is called the intention-to-treat analysis, and it is the only one of the four possibilities that allows us to retain all the benefits that randomization had previously provided.
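
The difference between analyzing by the branch assigned and by the branch actually followed can be sketched with a toy example. The eight records below are hypothetical (arm A is the intervention, arm C the control, and two participants ended up in the wrong branch); the helper `risk_by_arm` is just an illustrative name.

```python
# Hypothetical records: (assigned arm, arm actually received, event occurred)
records = [
    ("A", "A", False), ("A", "A", False), ("A", "A", True), ("A", "C", True),
    ("C", "C", True),  ("C", "C", True),  ("C", "C", False), ("C", "A", False),
]

def risk_by_arm(records, arm_index):
    """Event risk per arm, grouping by assigned arm (0) or received arm (1)."""
    totals, events = {}, {}
    for rec in records:
        arm = rec[arm_index]
        totals[arm] = totals.get(arm, 0) + 1
        events[arm] = events.get(arm, 0) + int(rec[2])
    return {arm: events[arm] / totals[arm] for arm in totals}

itt = risk_by_arm(records, 0)         # intention-to-treat: by assigned arm
as_treated = risk_by_arm(records, 1)  # by the arm actually received

print("intention-to-treat:", itt)
print("as treated:", as_treated)
```

In this invented data set the intention-to-treat risks are 0.5 in both arms, while the as-treated risks are 0.25 and 0.75: two participants changing branches are enough to manufacture a difference that randomization did not produce.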

As a final phase, we would have to analyze and compare the data to draw the conclusions of the trial, using for this the measures of association and impact that, in the case of the clinical trial, are usually the response rate, the risk ratio (RR), the relative risk reduction (RRR), the absolute risk reduction (ARR) and the number needed to treat (NNT). Let’s see them with an example.

Let’s imagine that we carried out a clinical trial in which we tried a new antibiotic (let’s call it A so as not to rack our brains) for the treatment of a serious infection of whatever location we are interested in studying. We randomize the selected patients and give them the new drug or the usual treatment (our control group), according to what corresponds to them by chance. In the end, we measure how many of our patients fail treatment (present the event we want to avoid).

Thirty-six out of the 100 patients receiving drug A present the event to be avoided. Therefore, we can conclude that the risk or incidence of the event in those exposed (Ie) is 0.36. On the other hand, 60 of the 100 controls (we call them the group of not exposed) have presented the event, so we quickly calculate that the risk or incidence in those not exposed (Io) is 0.6.

At first glance we already see that the risk is different in each group, but as in science we have to measure everything, we can divide the risk in the exposed by the risk in the not exposed, thus obtaining the so-called risk ratio (RR = Ie / Io). An RR = 1 means that the risk is equal in the two groups. If RR > 1 the event will be more likely in the group of exposed (the exposure we are studying will be a risk factor for the production of the event) and if RR is between 0 and 1, the risk will be lower in those exposed. In our case, RR = 0.36 / 0.6 = 0.6. It is easier to interpret an RR > 1. For example, an RR of 2 means that the probability of the event is twice as high in the exposed group. Following the same reasoning, an RR of 0.3 would tell us that the event is about a third as frequent in the exposed as in the controls. You can see in the attached table how these measures are calculated.

But what we are interested in is knowing how much the risk of the event decreases with our intervention, to estimate how much effort is needed to prevent each one. For this we can calculate the RRR and the ARR. The RRR is the risk difference between the two groups relative to the risk in the controls (RRR = [Ie – Io] / Io). In our case it is 0.4 (we ignore the negative sign), which means that the intervention tested reduces the risk by 40% compared to the usual treatment.

The ARR is simpler: it is the difference between the risks of exposed and controls (ARR = Ie – Io). In our case it is 0.24 (we ignore the negative sign), which means that out of every 100 patients treated with the new drug there will be 24 fewer events than if we had used the control treatment. But there is still more: we can know how many patients we have to treat with the new drug to avoid one event by just doing the rule of three (24 is to 100 as 1 is to x) or, easier to remember, calculating the inverse of the ARR. Thus, the NNT = 1 / ARR ≈ 4.2. In our case we would have to treat about four patients to avoid one event. The context will always tell us the clinical importance of this figure.

As you can see, the RRR, although it is technically correct, tends to magnify the effect and does not clearly quantify the effort required to obtain the results. In addition, it may be similar in different situations with totally different clinical implications. Let’s see it with another example that I also show you in the table. Suppose another trial with a drug B in which we obtain three events in the 100 treated and five in the 100 controls. If you do the calculations, the RR is 0.6 and the RRR is 0.4, as in the previous example, but if you calculate the ARR you will see that it is very different (ARR = 0.02), with an NNT of 50. It is clear that the effort to avoid an event is much greater (50 versus 4) despite the same RR and RRR.
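
For those who prefer to let the computer do the arithmetic, the two examples (drug A and drug B) can be reproduced with a short Python sketch. The function name `effect_measures` is just an illustrative choice, and the reductions are computed as control minus treated so that they come out positive.

```python
def effect_measures(events_treated, n_treated, events_control, n_control):
    """Return RR, RRR, ARR and NNT for a two-arm trial (event = outcome to avoid)."""
    ie = events_treated / n_treated    # risk in the exposed (intervention) group
    io = events_control / n_control    # risk in the not exposed (control) group
    rr = ie / io                       # risk ratio
    rrr = (io - ie) / io               # relative risk reduction, equal to 1 - RR
    arr = io - ie                      # absolute risk reduction
    nnt = 1 / arr                      # number needed to treat (undefined if ARR is 0)
    return {"RR": rr, "RRR": rrr, "ARR": arr, "NNT": nnt}

drug_a = effect_measures(36, 100, 60, 100)
drug_b = effect_measures(3, 100, 5, 100)

for name, measures in (("A", drug_a), ("B", drug_b)):
    print(name, {key: round(value, 2) for key, value in measures.items()})
```

Both drugs give an RR of 0.6 and an RRR of 0.4, but the ARR falls from 0.24 to 0.02 and the NNT rises from about 4 to 50.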

So, at this point, let me give you some advice. As the data needed to calculate the RRR are the same as those needed to calculate the easier ARR (and NNT), if a scientific paper offers you only the RRR and hides the ARR, distrust it and do as with the brother-in-law who offers you wine and cured cheese: ask him why he does not put out a skewer of Iberian ham instead. Well, what I really wanted to say is that you’d better ask yourselves why they don’t give you the ARR, and compute it using the information from the article.

So far everything we have said refers to the classical design of parallel clinical trials, but the king of designs has many faces and, very often, we can find papers in which it appears in a slightly different form, which may imply that the analysis of the results has its own peculiarities.

Let’s start with one of the most frequent variations. If we think about it for a moment, the ideal design would be one that allowed us to observe in the same individual the effect of the study intervention and of the control intervention (the placebo or the standard treatment), since the parallel trial is an approximation that assumes that the two groups respond equally to the two interventions, which always implies a risk of bias that we try to minimize with randomization. If we had a time machine we could try the intervention in all of them, write down what happens, turn back the clock and repeat the experiment with the control intervention so we could compare the two effects. The problem, as the more alert among you will have already guessed, is that the time machine has not been invented yet.

But what has been invented is the cross-over clinical trial, in which each subject is their own control. As you can see in the attached figure, in this type of trial each subject is randomized to a group, subjected to the intervention, allowed to undergo a wash-out period and, finally, subjected to the other intervention. Although this solution is not as elegant as that of the time machine, the defenders of cross-over trials argue that variability within each individual is smaller than variability between individuals, so the estimate can be more precise than that of the parallel trial and, in general, smaller sample sizes are needed. Of course, before using this design you have to make a series of considerations. Logically, the first intervention should not produce irreversible changes or very prolonged effects, because they would interfere with the effect of the second. In addition, the wash-out period must be long enough to avoid any residual effects of the first intervention.

It is also necessary to consider whether the order of the interventions can affect the final result (sequence effect), with which only the results of the first intervention would be valid. Another problem is that, having a longer duration, the characteristics of the patient can change throughout the study and be different in the two periods (period effect). And finally, beware of the losses during the study, which are more frequent in longer studies and have a greater impact on the final results than in parallel trials.

Imagine now that we want to test two interventions (A and B) in the same population. Can we do it with the same trial and save costs of all kinds? Yes, we can: we just have to design a factorial clinical trial. In this type of trial, each participant undergoes two consecutive randomizations: first they are assigned to intervention A or to placebo (P) and, second, to intervention B or placebo, so that we will have four study groups: AB, AP, BP and PP. As is logical, the two interventions must act by independent mechanisms to be able to assess the results of the two effects independently.
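
The double randomization can be sketched in a few lines of Python. This is only an illustration with simple (unrestricted) randomization and an arbitrary seed; the function name `factorial_assign` is made up, and the group labels follow the ones used above.

```python
import random
from collections import Counter

rng = random.Random(7)  # fixed seed, only so that the sketch is reproducible

def factorial_assign(n_participants):
    """Assign each participant through two independent randomizations."""
    groups = []
    for _ in range(n_participants):
        first = rng.choice(["A", "P"])    # first randomization: A or placebo
        second = rng.choice(["B", "P"])   # second randomization: B or placebo
        label = first + second            # "AB", "AP", "PB" or "PP"
        groups.append("BP" if label == "PB" else label)
    return groups

groups = factorial_assign(200)
print(Counter(groups))
```

Each of the four groups should end up with roughly a quarter of the participants; a real trial would normally use some restricted scheme to keep the groups balanced.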

Usually, one intervention related to a more plausible and mature hypothesis is studied together with another one related to a less well-tested hypothesis, making sure that the evaluation of the second does not influence the inclusion and exclusion criteria of the first. In addition, it is not advisable for either of the two interventions to have many annoying effects or to be badly tolerated, because poor compliance with one treatment usually leads to poor compliance with the other. In cases where the two interventions are not independent, the effects could be studied separately (AP versus PP and BP versus PP), but the advantages of the design are lost and the necessary sample size increases.

At other times it may happen that we are in a hurry to finish the study as soon as possible. Imagine a very bad disease that kills lots of people and for which we are trying a new treatment. We want to have it available as soon as possible (if it works, of course), so after every certain number of participants we will stop and analyze the results and, if we can already demonstrate the usefulness of the treatment, we will consider the study finished. This is the design that characterizes the sequential clinical trial. Remember that in the parallel trial the correct thing is to calculate the sample size in advance. In this design, with a more Bayesian mentality, a statistic is established whose value determines an explicit termination rule, so that the size of the sample depends on the previous observations. When the statistic reaches the predetermined value we feel confident enough to reject the null hypothesis and we finish the study. The problem is that each stop and analysis increases the probability of rejecting the null hypothesis when it is true (type 1 error), so it is not recommended to do many interim analyses. In addition, the final analysis of the results is complex because the usual methods do not work, but there are others that take the interim analyses into account. This type of trial is very useful with very fast-acting interventions, so it is common to see it in titration studies of opioid doses, hypnotics and similar poisons.
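
The inflation of the type 1 error with repeated looks at the data can be illustrated with a small simulation. This is only a sketch under loud assumptions: the outcome is a standardized difference drawn from a normal distribution with a true mean of zero (so the null hypothesis is true), the looks take place at invented sample sizes, and the same naive threshold of 1.96 is used at every look, which is precisely what a real sequential design would not do.

```python
import random
from math import sqrt

rng = random.Random(1)           # fixed seed, only for reproducibility
Z_CRIT = 1.96                    # two-sided 5% threshold for a single test
LOOKS = [20, 40, 60, 80, 100]    # hypothetical sample sizes at each analysis
N_SIM = 2000                     # number of simulated trials

def simulate_once():
    """One trial under the null: (rejected at any look, rejected at the final look)."""
    total, n = 0.0, 0
    any_reject, final_reject = False, False
    for look in LOOKS:
        while n < look:
            total += rng.gauss(0.0, 1.0)   # observation with true mean 0 and SD 1
            n += 1
        z = total / sqrt(n)                # z statistic for the mean with known SD
        reject = abs(z) > Z_CRIT
        any_reject = any_reject or reject
        if look == LOOKS[-1]:
            final_reject = reject
    return any_reject, final_reject

results = [simulate_once() for _ in range(N_SIM)]
rate_any = sum(a for a, _ in results) / N_SIM
rate_final = sum(f for _, f in results) / N_SIM
print(f"type 1 error, single final analysis: {rate_final:.3f}")
print(f"type 1 error, stopping at any of {len(LOOKS)} looks: {rate_any:.3f}")
```

The rate with a single final analysis should stay near the nominal 5%, while the rate when stopping at the first "significant" look is necessarily at least as high and, in practice, clearly higher.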

There are other occasions when individual randomization does not make sense. Imagine we have taught the doctors of a center a new technique to better inform their patients and we want to compare it with the old one. We cannot tell the same doctor to inform some patients in one way and others in another, since there would be a high chance of the two interventions contaminating each other. It would be more logical to teach the doctors in one group of centers, not to teach those in another group, and compare the results. Here what we would randomize is the centers, to train their doctors or not. This is the trial with group assignment design (also called a cluster trial). The problem with this design is that we do not have many guarantees that the participants within each group behave independently, so the size of the sample needed can increase a lot if there is great variability between the groups and little within each group. In addition, an aggregate analysis of the results has to be done, because if it is done individually, the confidence intervals are falsely narrowed and we can find false statistical significance. The usual thing is to calculate a weighted summary statistic for each group and make the final comparisons with it.
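
How much the sample size grows is usually quantified with the so-called design effect, a standard concept that this post does not name: DEFF = 1 + (m – 1) x ICC, where m is the number of participants per group and ICC is the intracluster correlation coefficient (high when there is much variability between groups and little within them). A minimal sketch, with invented numbers and assuming groups of equal size:

```python
import math

def design_effect(cluster_size, icc):
    """Design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc

def inflated_sample_size(n_individual, cluster_size, icc):
    """Sample size for a cluster trial, given the size needed with individual randomization."""
    # round() guards against floating-point noise before rounding up
    return math.ceil(round(n_individual * design_effect(cluster_size, icc), 6))

# Hypothetical values: 200 participants needed with individual randomization,
# 20 patients per center, intracluster correlation of 0.05
print(design_effect(20, 0.05))
print(inflated_sample_size(200, 20, 0.05))
```

With these made-up figures the required sample almost doubles, and with an ICC of zero (participants behaving independently) nothing is lost compared with individual randomization.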

The last of the series that we are going to discuss is the community trial, in which the intervention is applied to population groups. Since they are carried out in real conditions on populations, they have great external validity and their results often allow cost-efficient measures to be adopted. The problem is that it is often difficult to establish control groups, it can be more difficult to determine the necessary sample size and it is more complex to make causal inferences from their results. It is the typical design for evaluating public health measures such as water fluoridation, vaccinations, etc.

I’m done now. The truth is that this post has been a bit long (and I hope not too hard), but the King deserves it. In any case, if you think that everything is said about clinical trials, you have no idea of all that remains to be said about types of sampling, randomization, etc., etc., etc. But that is another story…

From the hen to the egg

Surely someone overflowing with genius has asked you on some occasion, with a smug look, what came first, the hen or the egg? Well, the next time you meet someone like this, you can answer with another question: who says that the hen and the egg have anything to do with each other? Because we must first know not only whether, to have hens, we need to have eggs first, but also how likely we are to end up having hens, with or without eggs (some twisted mind will say that the question could be raised backwards, but I am among those who think that the first thing we have to have, no offense, are eggs).

This approach would lead us to the design of a case-control study, which is an observational and analytical study in which sampling is done on the basis of presenting a certain disease or effect (the cases) and that group is compared with another group that does not present it (the controls), in order to determine if there is a difference in the frequency of exposure to a certain risk factor between the two groups. These studies are of retrograde directionality and of mixed temporality, so most of them are retrospective, although, as was the case with cohort studies, they can also be prospective (perhaps the most useful key to distinguish between the two is the sampling of each one, based on the exposure in cohort studies and on the effect in case-control studies).

In the attached figure you can see the typical design of a case-control study. These studies start from a specific population, from which a sample of cases, usually including all diagnosed and available cases, is compared with a control group consisting of a balanced sample of healthy subjects from the same population. However, it is increasingly common to find variations on the basic design that combine characteristics of cohort and case-control studies, comparing the cases that appear in a stable cohort over time with controls from a partial sample extracted from that same cohort.

The best known of these mixed designs is the case-control study nested in a cohort. In this design, we start with a well-known cohort in which we identify the cases as they occur. Each time a case appears, it is paired with one or more controls also taken from the initial cohort. If we think about it briefly, it is possible that a subject initially selected as a control becomes a case over time (develops the disease under study). Although it may seem that this could bias the results, it should not, since the aim is to measure the effect of the exposure at the time of the analysis. This design can be done with smaller cohorts, so it can be simpler and cheaper. In addition, it is especially useful in very dynamic cohorts with many entries and exits over time, especially if the incidence of the disease under study is low.

Another variant of the basic design is the case-cohort study. In this type, we initially have a very large cohort from which we select a smaller sub-cohort. The cases will be the patients that arise in either of the two cohorts, while the controls will be the subjects of the smaller (and more manageable) sub-cohort. These studies have a somewhat more complicated method of analysis than the basic designs, since they have to compensate for the fact that the cases are overrepresented because they come from the two cohorts. The great advantage of this design is that it allows several diseases to be studied at the same time, comparing the different cohorts of patients with the sub-cohort chosen as control.

Finally, one last variation that we are going to discuss is that of the many-named case-crossover studies, also known as crossed case-control studies or self-controlled case studies. In this paired design, each individual serves as their own control, comparing the exposure during the period of time closest to the onset of the disease (case period) with the exposure during the previous period of time (control period). This study approach is useful when the exposure is short, has a foreseeable time of action and produces a disease of short duration. These studies are widely used, for example, to study the adverse effects of vaccines.

As in cohort studies, case-control studies allow the calculation of a whole series of association and impact measures. Of course, here we have a fundamental difference with cohort studies. In cohort studies we started from a cohort without patients in which the patients appeared during the follow-up, which allowed us to calculate the risk of becoming ill over time (incidence). Thus, the quotient between the incidences in exposed and not exposed gave us the risk ratio, the main measure of association.

However, as can be deduced from the design of case-control studies, in these studies we cannot make a direct estimate of the incidence or prevalence of the disease, since the proportion of patients is determined by the selection criteria of the researcher and not by the incidence in the population (a fixed number of cases and controls are selected at the beginning, but we cannot calculate the risk of being a case in the population). Thus, given the impossibility of calculating the risk ratio, we will resort to the calculation of the odds ratio (OR), as you can see in the second figure.

The OR has a similar interpretation to that of the risk ratio and can take values from zero to infinity. An OR = 1 means that there is no association between exposure and effect. An OR < 1 means that exposure is a protective factor against the effect. Finally, an OR > 1 indicates that the exposure is a risk factor, all the stronger the higher the value of the OR.
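
Since the figure is not reproduced here, a minimal sketch of the calculation may help: the OR is the odds of exposure among the cases divided by the odds of exposure among the controls. The counts below are invented for illustration.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio from the four cells of a case-control 2x2 table."""
    odds_cases = exposed_cases / unexposed_cases            # odds of exposure in cases
    odds_controls = exposed_controls / unexposed_controls   # odds of exposure in controls
    return odds_cases / odds_controls

# Hypothetical study: 40 of 100 cases and 20 of 100 controls were exposed
or_example = odds_ratio(40, 60, 20, 80)
print(round(or_example, 2))
```

In this made-up table the OR is greater than 1 (about 2.7), so the exposure would behave as a risk factor.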

Anyway, and only for those who like getting into trouble, I will tell you that it is possible to estimate incidences from the results of a case-control study. If the incidence of the disease under study is low (below 10%), the OR and the risk ratio can be comparable, so we can estimate the incidence in an approximate way. If the incidence of the disease is greater, the OR tends to overestimate the risk ratio, so we cannot consider them to be equivalent. In either situation, if we previously know the incidence of the disease in the population (obtained from other studies), we can calculate the incidences in exposed and not exposed using the following formulas:

I0 = It / (OR x Pe + P0)

Ie = I0 x OR,

where It is the total incidence, Ie the incidence in exposed, I0 the incidence in not exposed, Pe the proportion of exposed, and P0 the proportion of not exposed.
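
These formulas come from writing the total incidence as the weighted sum It = Ie x Pe + I0 x P0 and replacing Ie with I0 x OR, which places the whole sum (OR x Pe + P0) in the denominator. A small sketch with invented values, assuming that the OR is an acceptable approximation of the risk ratio and that P0 = 1 – Pe:

```python
def incidences_from_or(total_incidence, odds_ratio, prop_exposed):
    """Estimate incidence in exposed (Ie) and not exposed (I0) from It, OR and Pe."""
    prop_unexposed = 1 - prop_exposed                                       # P0
    i0 = total_incidence / (odds_ratio * prop_exposed + prop_unexposed)     # I0
    ie = i0 * odds_ratio                                                    # Ie
    return ie, i0

# Hypothetical values: population incidence of 5%, OR of 2, 30% of the population exposed
ie, i0 = incidences_from_or(0.05, 2.0, 0.3)
print(f"Ie = {ie:.4f}, I0 = {i0:.4f}")

# Consistency check: the weighted sum must give back the total incidence
print(f"Ie x Pe + I0 x P0 = {ie * 0.3 + i0 * 0.7:.4f}")
```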

Although the OR allows us to estimate the strength of the association between the exposure and the effect, it does not tell us what effect eliminating the exposure would have on the health of the population. For this, we will have to resort to the measures of attributable risk (as we did with cohort studies), which can be absolute or relative.

There are two absolute measures of attributable risk. The first is the attributable risk in exposed (ARE), which is the difference between the incidence in exposed and not exposed and represents the amount of incidence that can be attributed to the risk factor in the exposed. The second is the population attributable risk (PAR), which represents the amount of incidence that can be attributed to the risk factor in the general population.

On the other hand, there are also two relative measures of attributable risk (also known as attributable or etiological proportions or fractions). First, the attributable fraction in exposed (AFE), which represents the difference in risk relative to the incidence in the group exposed to the factor. Second, the population attributable fraction (PAF), which represents the difference in risk relative to the incidence in the general population.

In the attached table I show you the formulas for the calculation of these parameters, which is somewhat more complex than in the case of cohort studies.

The problem with these impact measures is that they can sometimes be difficult for the clinician to interpret. For this reason, and inspired by the calculation of the number needed to treat (NNT) of clinical trials, a series of measures called impact numbers have been devised, which give us a more direct idea of the effect of the exposure factor on the disease under study. These impact numbers are the number of impact in exposed (NIE), the number of impact in cases (NIC) and the number of impact in exposed cases (NIEC).

Let’s start with the simplest one. The NIE would be the equivalent of the NNT and would be calculated as the inverse of the absolute risk reduction or of the risk difference between exposed and not exposed. The NNT is the number of people who should be treated to prevent a case compared to the control group. The NIE represents the average number of people who have to be exposed to the risk factor so that a new case of illness occurs compared to the people who are not exposed. For example, a NIE of 10 means that out of every 10 exposed there will be a case of disease attributable to the risk factor studied.

The NIC is the inverse of the PAF, so it defines the average number of sick people among which a case is due to the risk factor. An NIC of 10 means that for every 10 patients in the population, one is attributable to the risk factor under study.

Finally, the NIEC is the inverse of the AFE. It is the average number of patients among which a case is attributable to the risk factor.

In summary, these three parameters measure the impact of exposure among all exposed (NIE), among all patients (NIC) and among all patients who have been exposed (NIEC). It will be useful for us to try to calculate them if the authors of the study do not do so, since they will give us an idea of the real impact of the exposure on the effect. In the second table I show you the formulas that you can use to obtain them.

As a culmination to the previous three, we could estimate the effect of the exposure on the entire population by calculating the number of impact on the population (NIP), for which we only have to calculate the inverse of the PAR. Thus, a NIP of 3000 means that for every 3,000 subjects in the population there will be one case of illness due to the exposure.
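
As the tables are not reproduced here, this is a minimal sketch of how the attributable risk measures and the impact numbers fit together, following the definitions in the text (differences of incidences, the same differences relative to an incidence, and their inverses). It assumes that the incidences in exposed and not exposed are already known or have been estimated, which a case-control study does not give us directly; all the numbers are invented.

```python
def impact_measures(ie, i0, prop_exposed):
    """Attributable risks and impact numbers from Ie, I0 and the proportion exposed."""
    it = ie * prop_exposed + i0 * (1 - prop_exposed)   # total incidence in the population
    are = ie - i0      # attributable risk in exposed
    par = it - i0      # population attributable risk
    afe = are / ie     # attributable fraction in exposed
    paf = par / it     # population attributable fraction
    return {
        "ARE": are, "PAR": par, "AFE": afe, "PAF": paf,
        "NIE": 1 / are,    # number of impact in exposed
        "NIC": 1 / paf,    # number of impact in cases
        "NIEC": 1 / afe,   # number of impact in exposed cases
        "NIP": 1 / par,    # number of impact on the population
    }

# Hypothetical values: Ie = 8%, I0 = 4%, 25% of the population exposed
measures = impact_measures(0.08, 0.04, 0.25)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

With these made-up figures, one in every 25 exposed people, one in every 5 patients, one in every 2 exposed patients and one in every 100 people in the population would be a case attributable to the exposure.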

In addition to assessing the measures of association and impact, when appraising a case-control study we will have to pay special attention to the presence of biases, since they are the observational studies that have the greatest risk of presenting them.

Case-control studies are relatively simple to carry out, generally cost less than other observational studies (including cohort studies), allow us to study several exposure factors at the same time and to know how they interact, and are ideal for diseases or exposure factors with very low frequency. The problem with this type of design is that you have to be extremely careful when selecting cases and controls, as it is very easy to fall into a list of biases that, to this day, has no known end.

In general, the selection criteria should be the same for cases and controls, but as to be a case one has to be diagnosed and be available for the study, it’s very likely that cases are not fully representative of the population. For example, if the diagnostic criteria are not sensitive and specific enough we’ll get many false positives and negatives, with the consequent dilution of the effect of the exposure to the factor.

Another possible problem depends on the selection of incident (newly diagnosed) or prevalent cases. Prevalence-based studies favor the selection of survivors (as far as is known, no dead person has ever agreed to participate in a study) and, if survival is related to the exposure, the risk identified will be lower than with incident cases. This effect is even more evident when the exposure factor is associated with a good prognosis, a situation in which prevalence studies produce a greater overestimation of the association. As an example to better understand these issues, let’s suppose that the risk of suffering a heart attack is higher the more one smokes. If we include only prevalent cases we’ll exclude the people who died of the most severe heart attacks, who would probably be the ones who smoked most, so the effect of smoking could be underestimated.

But if selecting cases seems complicated, it’s nothing compared to making a good selection of controls. Ideally, controls should have had the same likelihood of exposure as cases or, to put it another way, should be representative of the population from which the cases were drawn. In addition, this must be combined with the exclusion of those who have any illness related positively or negatively to the exposure factor. For example, if we want to waste our time and study the association between thrombophlebitis in air passengers and prior aspirin ingestion, we must exclude from the study the controls that have any other disease being treated with aspirin, even if they had not taken it before the journey.

We also have to be careful with some habits of control selection. For instance, patients who go to the hospital for reasons other than the one under study are at hand and tend to be very cooperative and, being sick, they surely recall past exposure to risk factors better. But the problem is that they are ill, so their pattern of exposure to risk factors can differ from that of the general population.

Another resource is to include neighbors, friends, relatives, etc. These are usually very comparable and cooperative, but we run the risk that they share exposure habits with the cases, which can alter the study results. All these problems are avoided by taking controls from the general population, but this is more costly in effort and money, and they are usually less cooperative and, above all, much more forgetful (healthy people recall less about past exposures to risk factors), so the quality of the information we obtain from cases and controls can be very different.

Just one more comment to end this most enjoyable topic. Case-control studies share a characteristic with the rest of the observational studies: they detect the association between exposure and effect, but they do not allow us to establish causal relationships with certainty, for which we need other types of studies such as randomized clinical trials. But that is another story…

One about Romans

What fellows, those Romans! They came, they saw and they conquered. With those legions, each one with ten cohorts, each cohort with almost five hundred Romans with their skirts and strappy sandals. The cohorts were groups of soldiers within reach of the speech of the same boss. They always went forward, never retreating. This is how you can conquer Gaul (though not entirely, as is well known).

In epidemiology, a cohort is also a group of people who share something, but instead of the boss’s harangue it is the exposure to a factor that is studied over time (neither the skirt nor the sandals are essential). Thus, a cohort study is an observational, analytical design, of antegrade directionality and of concurrent or mixed temporality, that compares the frequency with which a certain effect occurs (usually a disease) in two different groups (cohorts), one of them exposed to a factor and the other not exposed to the same factor (see attached figure).

Therefore, sampling is related to exposure to the factor. Both cohorts are studied over time, which is why most of the cohort studies are prospective or of concurrent temporality (they go forward, like the Roman cohorts). However, it is possible to do retrospective cohort studies once both the exposure and the effect have occurred. In these cases, the researcher identifies the exposure in the past, reconstructs the experience of the cohort over time and attends in the present to the appearance of the effect, which is why they are studies of mixed temporality.

We can also classify the cohort studies according to whether they use an internal or external comparison group. Sometimes we can use two internal cohorts belonging to the same general population, classifying the subjects in one or another cohort according to the level of exposure to the factor. However, other times the exposed cohort will interest us because of its high level of exposure, so we will prefer to select an external cohort of subjects not exposed to make the comparison between both.

Another important aspect when classifying the cohort studies is the time of inclusion of the subjects in the study. When we only select the subjects that meet the inclusion criteria at the beginning of the study, we speak of a fixed cohort, whereas we will speak of an open or dynamic cohort when subjects continue to enter the study throughout the follow-up. This aspect will be important, as we will see later, when calculating the association measures between exposure and effect.

Finally, and as a curiosity, we can also do a study with a single cohort if we want to study the incidence or evolution of a certain disease. Although we can always compare the results with other known data from the general population, this type of design lacks a comparison group in the strict sense, so it is included among the longitudinal descriptive studies.

When followed up over time, the cohort studies allow us to calculate the incidence of the effect between exposed and not exposed, calculating from them a series of association measures and specific impact measures.

In studies with closed cohorts in which the number of participants does not change, the measure of association is the relative risk (RR), which is the ratio between the incidence of exposed (Ie) and unexposed (I0): RR = Ie / I0.
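As a minimal sketch of this calculation, assuming a closed cohort summarized by four counts (the figures below are invented for illustration and do not come from any real study, and the function names are hypothetical):

```python
def incidence(cases, total):
    """Cumulative incidence: new cases divided by people at risk at baseline."""
    if total <= 0:
        raise ValueError("total must be positive")
    return cases / total


def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """RR = Ie / I0 for a closed cohort."""
    ie = incidence(cases_exposed, n_exposed)
    i0 = incidence(cases_unexposed, n_unexposed)
    if i0 == 0:
        raise ZeroDivisionError("RR is undefined when I0 is 0")
    return ie / i0


# Hypothetical data: 30 cases among 100 exposed, 10 cases among 100 unexposed.
rr = relative_risk(30, 100, 10, 100)
print(f"RR = {rr:.2f}")
```

With these made-up counts the incidence in the exposed is three times that of the unexposed, so the exposure would behave as a risk factor.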

As we know, the RR can take values from 0 to infinity. An RR = 1 means that there is no association between exposure and effect. An RR < 1 means that exposure is a factor of protection against the effect. Finally, an RR > 1 indicates that exposure is a risk factor, all the more so the greater the value of the RR.

The case of studies with open cohorts, in which participants can enter and leave during the follow-up, is a bit more complex, since instead of incidences we will calculate incidence densities, a term that refers to the number of cases of the effect or disease that occur relative to the total follow-up time accumulated by the people followed (for example, number of cases per 100 person-years). In these cases, instead of the RR we will calculate the incidence density ratio, which is the quotient of the incidence density in the exposed divided by the density in the unexposed.
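The same idea can be sketched for an open cohort. This is only an illustration under assumed, invented figures, with person-years as the unit of follow-up; the function names are hypothetical:

```python
def incidence_density(cases, person_years):
    """Cases divided by the total follow-up time accumulated by the group."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return cases / person_years


def incidence_density_ratio(cases_exp, py_exp, cases_unexp, py_unexp):
    """Incidence density in the exposed divided by that in the unexposed."""
    id_unexp = incidence_density(cases_unexp, py_unexp)
    if id_unexp == 0:
        raise ZeroDivisionError("ratio is undefined when the unexposed density is 0")
    return incidence_density(cases_exp, py_exp) / id_unexp


# Hypothetical data: 20 cases over 400 person-years in the exposed,
# 10 cases over 500 person-years in the unexposed.
id_exp = incidence_density(20, 400)
id_unexp = incidence_density(10, 500)
idr = incidence_density_ratio(20, 400, 10, 500)

print(f"Exposed:   {100 * id_exp:.1f} cases per 100 person-years")
print(f"Unexposed: {100 * id_unexp:.1f} cases per 100 person-years")
print(f"Incidence density ratio = {idr:.2f}")
```

Note that the denominators are amounts of follow-up time, not numbers of people, which is what lets participants contribute different lengths of follow-up.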

These measures allow us to estimate the strength of the association between the exposure to the factor and the effect, but they do not inform us about the potential impact that exposure has on the health of the population (the effect that eliminating this factor would have on the health of the population). For this, we will have to resort to the measures of attributable risk, which can be absolute or relative.

There are two absolute measures of attributable risk. The first is the attributable risk in exposed (ARE), which is the difference between the incidence in exposed and not exposed and represents the amount of incidence that can be attributed to the risk factor in the exposed. The second is the population attributable risk (PAR), which represents the amount of incidence that can be attributed to the risk factor in the general population.

On the other hand, there are also two relative measures of attributable risk (also known as proportions or attributable or etiological fractions). First, the attributable fraction in exposed (AFE), which represents the difference of risk relative to the incidence in the group of exposed to the factor. Second, the population attributable fraction (PAF), which represents the difference in risk relative to the incidence in the general population.

In the attached table you can see the formulas that are used for the calculation of these impact measures.
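A minimal sketch of these four measures, assuming the usual textbook formulas (ARE = Ie - I0, PAR = It - I0, AFE = ARE / Ie and PAF = PAR / It, where It is the incidence in the whole population), which match the verbal definitions above but may be laid out differently in the table. The data are invented:

```python
def attributable_risks(ie, i0, it):
    """Absolute and relative attributable-risk measures.

    ie: incidence in the exposed
    i0: incidence in the unexposed
    it: incidence in the whole population (exposed and unexposed together)
    """
    if ie <= 0 or it <= 0:
        raise ValueError("ie and it must be positive")
    return {
        "ARE": ie - i0,          # attributable risk in exposed
        "PAR": it - i0,          # population attributable risk
        "AFE": (ie - i0) / ie,   # attributable fraction in exposed
        "PAF": (it - i0) / it,   # population attributable fraction
    }


# Hypothetical population: 100 exposed (30 cases) and 100 unexposed (10 cases).
ie, i0 = 30 / 100, 10 / 100
it = (30 + 10) / (100 + 100)
measures = attributable_risks(ie, i0, it)
for name, value in measures.items():
    print(f"{name} = {value:.3f}")
```

In this invented example half of the exposed people belong to the population total, so the population measures are smaller than the measures restricted to the exposed.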

The problem with these impact measures is that they can sometimes be difficult for the clinician to interpret. For this reason, and inspired by the calculation of the number needed to treat (NNT) of clinical trials, a series of measures called impact numbers have been devised, which give us a more direct idea of the effect of the exposure factor on the disease under study. These impact numbers are the number of impact in exposed (NIE), the number of impact in cases (NIC) and the number of impact in exposed cases (NIEC).

Let’s start with the simplest one. The NIE would be the equivalent of the NNT and would be calculated as the inverse of the absolute risk reduction or of the risk difference between exposed and not exposed. The NNT is the number of people who should be treated to prevent a case compared to the control group. The NIE represents the average number of people who have to be exposed to the risk factor so that a new case of illness occurs compared to the people who are not exposed. For example, a NIE of 10 means that out of every 10 exposed there will be a case of disease attributable to the risk factor studied.

The NIC is the inverse of the PAF, so it defines the average number of sick people among which a case is due to the risk factor. An NIC of 10 means that for every 10 patients in the population, one is attributable to the risk factor under study.

Finally, the NIEC is the inverse of the AFE. It is the average number of exposed patients among whom one case is attributable to the risk factor.

In summary, these three parameters measure the impact of exposure among all exposed (NIE), among all patients (NIC) and among all patients who have been exposed (NIEC). It will be useful for us to try to calculate them if the authors of the study do not do so, since they will give us an idea of the real impact of the exposure on the effect. In the second table I show you the formulas that you can use to obtain them.

As a culmination to the previous three, we could estimate the effect of the exposure on the entire population by calculating the number of impact on the population (NIP), for which we only have to take the inverse of the PAR. Thus, an NIP of 3000 means that for every 3,000 subjects of the population there will be one case of illness due to the exposure.
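The four impact numbers can be sketched as simple inverses of the attributable-risk measures. This assumes the usual formulas for those measures (ARE = Ie - I0, PAR = It - I0, AFE = ARE / Ie, PAF = PAR / It) and uses invented incidences; the function name is hypothetical:

```python
def impact_numbers(ie, i0, it):
    """Impact numbers from the incidence in exposed (ie), unexposed (i0)
    and the whole population (it). Only meaningful for a risk factor."""
    are = ie - i0   # attributable risk in exposed
    par = it - i0   # population attributable risk
    if are <= 0 or par <= 0:
        raise ValueError("impact numbers need ie > i0 and it > i0")
    afe = are / ie  # attributable fraction in exposed
    paf = par / it  # population attributable fraction
    return {
        "NIE": 1 / are,   # exposed people per attributable case
        "NIC": 1 / paf,   # cases per attributable case
        "NIEC": 1 / afe,  # exposed cases per attributable case
        "NIP": 1 / par,   # people in the population per attributable case
    }


# Hypothetical incidences: 30% in exposed, 10% in unexposed, 20% overall.
numbers = impact_numbers(0.30, 0.10, 0.20)
for name, value in numbers.items():
    print(f"{name} = {value:.1f}")
```

With these made-up figures, for instance, one in every five exposed people would develop the disease because of the exposure.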

Another aspect that we must take into account when dealing with cohort studies is their risk of bias. In general, observational studies have a higher risk of bias than experimental studies, as well as being susceptible to the influence of confounding factors and effect modifying variables.

The selection bias must always be considered, since it can compromise the internal and external validity of the study results. The two cohorts should be comparable in all aspects, in addition to being representative of the population from which they come.

Another very typical bias of cohort studies is classification bias, which occurs when participants are wrongly classified in terms of their exposure or the detection of the effect (basically, it is just another information bias). Classification bias can be non-differential, when the error occurs randomly and independently of the study variables. This type of classification bias favors the null hypothesis, that is, it makes it harder for us to detect the association between exposure and effect, if it exists. If, despite the bias, we detect the association, then nothing bad will happen, but if we do not detect it, we will not know whether it does not exist or whether we do not see it because of the poor classification of the participants. On the other hand, classification bias is differential when it occurs differently in the two cohorts and is related to some of the study variables. In this case there is no forgiveness or possibility of amendment: the direction of this bias is unpredictable and it mortally compromises the validity of the results.
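To see why non-differential misclassification pushes the result toward the null, here is a small worked sketch. It assumes a simplified scenario that is not in the original text: the same fraction of truly exposed and truly unexposed participants is wrongly labelled, regardless of whether they develop the effect. All figures are invented:

```python
def observed_rr(n_exposed, n_unexposed, risk_exposed, risk_unexposed, misclassified):
    """Expected RR when a fraction `misclassified` of each true exposure
    group is given the wrong exposure label, independently of the outcome."""
    m = misclassified
    # Group labelled "exposed": correctly labelled exposed + mislabelled unexposed.
    n_lab_exp = n_exposed * (1 - m) + n_unexposed * m
    cases_lab_exp = n_exposed * (1 - m) * risk_exposed + n_unexposed * m * risk_unexposed
    # Group labelled "unexposed": correctly labelled unexposed + mislabelled exposed.
    n_lab_unexp = n_unexposed * (1 - m) + n_exposed * m
    cases_lab_unexp = n_unexposed * (1 - m) * risk_unexposed + n_exposed * m * risk_exposed
    return (cases_lab_exp / n_lab_exp) / (cases_lab_unexp / n_lab_unexp)


# Hypothetical cohorts of 1000 people each, true risks 20% and 10% (true RR = 2).
for m in (0.0, 0.1, 0.2, 0.5):
    rr = observed_rr(1000, 1000, 0.20, 0.10, m)
    print(f"misclassified = {m:.0%}  ->  observed RR = {rr:.2f}")
```

As the mislabelled fraction grows, the observed RR slides from the true value toward 1, which is the dilution toward the null hypothesis described above.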

Finally, we should always be alert to the possibility of confounding bias (due to confounding variables) or interaction (due to effect-modifying variables). The ideal is to prevent them in the design phase, but it is not superfluous to control confounding in the analysis phase, mainly through stratified analyses and multivariate analysis.

And with this we come to the end of this post. We see, then, that cohort studies are very useful to calculate the association and the impact between effect and exposure but, careful, they do not serve to establish causal relationships. For that, other types of studies are necessary.

The problem with cohort studies is that they are difficult (and expensive) to perform adequately, often require large samples and sometimes long follow-up periods (with the consequent risk of losses). In addition, they are not very useful for rare diseases. And we must not forget that they do not allow us to establish causal relationships with sufficient certainty, although in this respect they are better than their cousins, the case-control studies. But that is another story…

Which family do you belong to?

As we already know from previous posts, the systematic process of evidence-based medicine begins with a knowledge gap that moves us to ask a structured clinical question. Once we have formulated the question, we will use its components to do a bibliographic search and obtain the best available evidence to resolve our doubt.

And here comes, perhaps, the most feared task of evidence-based medicine: the critical appraisal of the evidence found. Actually, it is not such a big deal since, with a little practice, critical reading consists only of systematically applying a series of questions to the article that we are analyzing. The problem sometimes lies in knowing which questions we have to ask, since they differ according to the design of the study that we are evaluating.

Here, by design we understand the set of procedures, methods and techniques used with the study participants, during the data collection and during the analysis and interpretation of the results to obtain the conclusions of the study. And there are a myriad of possible study designs, especially in recent times, when epidemiologists have taken to designing mixed observational studies. In addition, the terminology can sometimes be confusing and use terms that do not make clear which design we have in front of us. It’s like when we go to the wedding of someone from a large family and meet a cousin without knowing where he comes from. Even if we look for physical similarities, we will most likely end up asking him: and you, which family do you belong to? Only then will we know if he belongs to the groom’s side or to the bride’s.

What we are going to do in this post is something similar. We will try to establish a series of criteria for classifying studies to finally establish a series of questions whose answers allow us to identify which family they belong to.

To begin with, the type of clinical question that the work tries to answer can give us some guidance. If the question is of a diagnostic nature, we will most likely be faced with what is called a diagnostic test study, which is usually a design in which a series of participants undergo, in a systematic and independent way, both the test under study and the reference standard (the gold standard). It is a type of design made especially for this type of question, but do not take this for granted: sometimes we can see diagnostic questions that authors try to answer with other types of studies.

If the question is about treatment, it is most likely that we are facing a clinical trial or, sometimes, a systematic review of clinical trials. However, there are not always trials on everything we look for and we may have to settle for an observational study, such as a case-control or a cohort study.

In case of questions of prognosis and etiology/harm we may find ourselves reading a clinical trial, but the most usual thing is that it is not possible to carry out trials and we only have observational studies.

Once this aspect has been analyzed, we may still have doubts about the type of design we are facing. It will then be time to turn to our questions about six criteria related to the methodological design: general objective of the clinical question, direction of the study, type of sampling of the participants, temporality of the events, assignment of the study factors and units of study used. Let’s see in detail what each one of these six criteria means, which you can see summarized in the attached table.

According to the objective, the studies can be descriptive or analytical. A descriptive study is one that, as the name suggests, only has the descriptive purpose of telling how things are, but without intending to establish causal relationships between the risk factor or exposure and the effect studied (a certain disease or health event, in most cases). These studies answer fairly simple questions such as how many?, where? or to whom?, so they are usually simple and serve to generate hypotheses that will later need more complex studies to be tested.

By contrast, analytical studies do try to establish such relationships, answering questions like why?, how to deal with? or how to prevent? Logically, to establish such relationships they will need a group with which to compare (the control group). This will be a useful clue to distinguish between analytical and descriptive studies if we have any doubt: the presence of a comparison group is typical of analytical studies.

The directionality of the study refers to the order in which the exposure and the effect of such exposure are investigated. The study will have an antegrade directionality when the exposure is studied before the effect and a retrograde directionality when the opposite is done. For example, if we want to investigate the effect of smoking on coronary mortality, we can take a set of smokers and see how many die of coronary disease (antegrade) or, conversely, take a set of deaths from coronary heart disease and look back to see how many smoked (retrograde). Logically, only studies with antegrade directionality can ensure that the exposure precedes the effect in time (I’m not saying that one is the cause of the other). Finally, we can sometimes find studies in which exposure and effect are studied at the same time; we then speak of simultaneous directionality.

The type of sampling has to do with how the study participants are selected. They can be chosen because they are subject to the exposure factor that interests us, because they have presented the effect, because of a combination of the two, or even by criteria other than exposure and effect.

Our fourth criterion is temporality, which refers to the relationship in time between the researcher and the exposure factor or the effect studied. A study will have a historical temporality when effect and exposure have already occurred when the study begins. On the other hand, when these events take place during the study, it will have a concurrent temporality. Sometimes the exposure can be historical and the effect concurrent, speaking then of mixed temporality.

Here I would like to make a point about two terms used by many authors that will be more familiar to you: prospective and retrospective. Prospective studies would be those in which exposure and effect have not yet occurred at the beginning of the study, while those in which the events have already occurred at the time of the study would be retrospective. To complicate things further, when both situations are combined we talk about ambispective studies. The problem with these terms is that they are sometimes used interchangeably to express directionality or temporality, which are different concepts. In addition, they are usually associated with specific designs: prospective with cohort studies and retrospective with case-control studies. It may be better to use the specific criteria of directionality and temporality, which express the aspects of the design more precisely.

Two other terms related to temporality are those of cross-sectional and longitudinal studies. Cross-sectional studies are those that provide us with a snapshot of how things are at a given moment, so they do not allow us to establish temporal or causal relationships. They tend to be prevalence studies and are always of a descriptive nature.

On the other hand, in longitudinal studies variables are measured over a period of time, so they do allow temporal relationships to be established, but the researcher does not control how the exposure is assigned to participants. These may have an antegrade (as in cohort studies) or retrograde (as in case-control studies) directionality.

The penultimate of the six criteria that we are going to take into account is the assignment of the study factors. In this sense, a study will be observational when the researchers are mere observers who do not act on the assignment of the exposure factors. In these cases, the relationship between exposure and effect may be affected by other factors, known as confounders, so these studies do not allow conclusions to be drawn about causality. On the other hand, when the researcher assigns the exposure in a controlled manner according to a previously established protocol, we talk about experimental or intervention studies. These experimental studies with randomization are the only ones that allow cause-effect relationships to be established and are, by definition, analytical studies.

The last of the criteria refers to the study units. The studies can be carried out on individual participants or on population groups. The latter are ecological studies and community trials, which have specific design characteristics.

In the attached figure you can see a scheme of how to classify the different epidemiological designs according to these criteria. When you have doubts about which design corresponds to the work you are evaluating, follow this scheme. The first step is to decide whether the study is observational or experimental. This is usually simple, so we move on to the next point. A descriptive observational study (without a comparison group) will correspond to a case series or a cross-sectional study.

If the observational study is analytical, we will look at the type of sampling, which may be by disease or study effect (case-control study) or by exposure to the risk or protection factor (cohort study).

Finally, if the study is experimental, we will check whether the exposure or intervention has been assigned randomly and with a comparison group. If so, we will be looking at a randomized controlled clinical trial. If not, it is probably an uncontrolled trial or another type of quasi-experimental design.
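The decision scheme just described can be sketched as a small function. This is only an illustration of the steps in the text: the function name, the parameter names and the fallback label for designs that fit none of the branches are assumptions of mine, not part of the scheme in the figure:

```python
def classify_design(experimental, has_comparison_group=True,
                    sampling=None, randomized=False):
    """Walk through the scheme: experimental or observational first, then
    comparison group, then type of sampling (for observational studies)
    or randomization (for experimental ones).

    sampling: "effect" (chosen by disease or effect) or "exposure".
    """
    if experimental:
        if randomized and has_comparison_group:
            return "randomized controlled clinical trial"
        return "uncontrolled trial or other quasi-experimental design"
    if not has_comparison_group:
        # Descriptive observational study.
        return "case series or cross-sectional study"
    if sampling == "effect":
        return "case-control study"
    if sampling == "exposure":
        return "cohort study"
    return "other analytical observational design"  # assumed fallback


print(classify_design(experimental=False, sampling="effect"))
print(classify_design(experimental=False, sampling="exposure"))
print(classify_design(experimental=True, randomized=True))
print(classify_design(experimental=False, has_comparison_group=False))
```

Real papers do not always fit so neatly, so the output should be read as a first guess to be confirmed against the methods section.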

And here we will stop for today. We have seen how to identify the most common types of methodological designs. But there are many more. Some with a very specific purpose and their own design, such as economic studies. And others that combine characteristics of basic designs, such as case-cohort studies or nested studies. But that is another story…

The hereafter

We have already seen in previous posts how to search for information in Pubmed in different ways, from the simplest, the simple search, to advanced search methods and the filtering of results. Pubmed is, in my modest opinion, a very useful tool for professionals who have to look for biomedical information among the maelstrom of papers that are published daily.

However, Pubmed should not be our only search engine. Yes, ladies and gentlemen, not only does it turn out that there is life beyond Pubmed, but there is a lot of it, and it is interesting.

The first engine I can think of, because of its similarity to Pubmed, is Embase. This is an Elsevier search engine that has about 32 million records from about 8500 journals from 95 countries. As with Pubmed, there are several search options that make it a versatile tool, somewhat more specific than Pubmed for European studies and for drugs (or so they say). The usual approach when you want to do a thorough search is to use two databases, the combination of Pubmed and Embase being a frequent one, since each search engine will provide us with records that the other has not indexed. The big drawback of Embase, especially when compared to Pubmed, is that access is not free. Anyway, those who work in large health centers may be lucky enough to have a subscription paid for through the center’s library.

Another useful tool is provided by the Cochrane Library, which includes multiple resources, among them the Cochrane Database of Systematic Reviews (CDSR), the Cochrane Central Register of Controlled Trials (CENTRAL), the Cochrane Methodology Register (CMR), the Database of Abstracts of Reviews of Effects (DARE), the Health Technology Assessment Database (HTA) and the NHS Economic Evaluation Database (EED). In addition, Spanish speakers can resort to the Cochrane Library Plus, which translates the works of the Cochrane Library into Spanish. Cochrane Plus is not free, but in Spain we enjoy a subscription kindly paid for by the Ministry of Health, Equality and Social Services.

And since we are speaking of resources in Spanish, let me pull the ember toward my own sardine, as we say in Spain, and tell you about two search engines that are very dear to me. The first is Epistemonikos, which is a source of systematic reviews and other types of scientific evidence. The second is Pediaclic, a search tool for child health information resources, which classifies the results into a series of categories such as systematic reviews, clinical practice guidelines, evidence-based summaries, and so on.

In fact, Epistemonikos and Pediaclic are meta-searchers. A meta-searcher is a tool that searches in a series of databases and not in a single indexed database like Pubmed or Embase.

There are many meta-search engines but, without a doubt, the king of all and one not to be missed is TRIP Database.

TRIP (Turning Research Into Practice) is a free-access meta-search engine that was created in 1997 to facilitate the search for information from evidence-based medicine databases, although it has evolved and nowadays also retrieves information from image banks, documents for patients, electronic textbooks and even Medline (Pubmed’s database). Let’s take a look at how it works.

In the first figure you can see the top of the TRIP home page. In the simplest form, we select the link “Search” (it is the one active by default when we open the page), write the English terms we want to search for in the search window and click on the magnifying glass on the right, after which the search engine will show us the list of results.

Although the latest version of TRIP includes a language selector, it is probably best to enter the terms in English in the search window, trying not to put more than two or three words to get the best results. Here we can use the same logical operators we saw in Pubmed (AND, OR and NOT), as well as the truncation operator “*”. In fact, if you type several words in a row, TRIP automatically includes the AND operator between them.
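As a small sketch of how such a search string could be put together, assuming only the operators mentioned above (AND, OR, NOT and the “*” truncation). The helper name is hypothetical, and grouping the OR terms in parentheses is an assumption of mine that is worth checking against the search engine’s own help:

```python
def build_query(required, alternatives=(), excluded=()):
    """Combine search terms with AND, OR and NOT into one query string.

    required:     terms that must all appear (joined with AND)
    alternatives: terms of which at least one must appear (joined with OR)
    excluded:     terms that must not appear (each added with NOT)
    """
    if not required:
        raise ValueError("at least one required term is needed")
    parts = [" AND ".join(required)]
    if alternatives:
        # Parentheses are assumed to group the OR terms.
        parts.append("(" + " OR ".join(alternatives) + ")")
    query = " AND ".join(parts)
    for term in excluded:
        query += f" NOT {term}"
    return query


# Hypothetical searches using truncation to catch word variants.
print(build_query(["asthma", "obesity"], excluded=["adult*"]))
print(build_query(["asthma"], alternatives=["child*", "adolescen*"]))
```

Keeping the number of terms small, as recommended above, also keeps strings like these easy to read and to adjust.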

Next to “Search” you can see a link that says “PICO”. This opens a search menu in which we can select the four components of the structured clinical question separately: patients (P), intervention (I), comparison (C) and outcomes (O).

To the right there are two more links. “Advanced” allows advanced searches by record fields such as journal name, title, year, etc. “Recent” allows us to access the search history. The problem is that, in the latest versions, these two links are reserved for licensed users. In previous versions of TRIP they were free, so I hope that this little flaw will not spread to the whole search engine and that TRIP will not soon end up being a paid resource.

There are video tutorials available on the search engine’s website about how the various modalities of TRIP work, but the most attractive thing about TRIP is its way of ordering the search results, since it does so according to the source, the quality and the frequency of appearance of the search terms in the articles found. To the right of the screen you can see the list of results organized into a series of categories, such as systematic reviews, evidence-based medicine synopses, clinical practice guidelines, clinical questions, Medline articles filtered through Clinical Queries, etc.

We can click on one of the categories and restrict the list of results. Once this is done, we can restrict the list even further based on subcategories. For example, if we select systematic reviews we can later restrict to only those from Cochrane. The possibilities are many, so I invite you to try them.

Let’s look at an example. If I write “asthma obesity children” in the search box, I get 1117 results and the list of resources sorted to the right, as you see in the second figure. If I now click on the category “systematic reviews” and then on “Cochrane”, I’ll have a single result, although I can recover the rest just by clicking on any of the other categories. Have you ever seen such a combination of simplicity and power? In my humble opinion, with a decent command of Pubmed and the help of TRIP you can find everything you need, no matter how hidden.

And to finish today’s post, you’re going to allow me to ask you a favor: do not use Google to do medical searches or, at least, do not depend exclusively on Google, not even Google Scholar. This search engine is good for finding a restaurant or a hotel for holidays, but not for a controlled search for reliable and relevant medical information as we can do with other tools we have discussed. Of course, with the changes and evolutions that Google has accustomed us to, this may change over time and, maybe, in the future I will have to rewrite this post to recommend it (God forbid).

And here we will leave the topic of bibliographic searches. Needless to say, there are countless more search engines, and you can use the one you like the most or the one you have access to on your computer or at your workplace. In some cases, as already mentioned, it is almost mandatory to use more than one, as in the case of systematic reviews, in which the two large ones (Pubmed and Embase) are often used and combined with Cochrane’s and some others that are specific to the subject matter. Because all the search engines we have seen are general, but there are specific ones for nursing, psychology, physiotherapy, etc., as well as disease-specific ones. For example, if you do a systematic review on a tropical disease it is advisable to use a subject-specific database, such as LILACS, as well as local journal search engines, if any. But that is another story…

Gathering the gold nuggets

I was thinking about today’s post and I could not help remembering the gold-seekers of the Alaskan gold rush of the late nineteenth century. They traveled to the Yukon, looking for a good creek like the Bonanza and collecting tons of mud. But that mud was not the last step of the quest. From the sediments they had to extract the longed-for gold nuggets, for which they carefully filtered the sediments to keep only the gold, when there was any.

When we look for the best scientific evidence to answer our clinical questions we do something similar. Normally we choose one of the Internet search engines (like Pubmed, our Bonanza Creek) and we usually get a long list of results (our great deal of mud) that, finally, we will have to filter to extract the gold nuggets, if there are any among the search results.

We have already seen in previous posts how to do a simple search (the least specific and which will provide us with more mud) and how to refine the searches by using the MeSH terms or the advanced search form, with which we try to get less mud and more nuggets.

However, the usual situation is that, once we have the list of results, we have to filter it to keep only what interests us most. Well, for that there is a very popular tool within Pubmed that is, oh surprise, the use of filters.

Let’s see an example. Suppose we want to seek information about the relationship between asthma and obesity in childhood. The ideal would be to build a structured clinical question to perform a specific search, but to show more clearly how filters work we will do a simple, “badly designed” search in natural language, to obtain a greater number of results.

I open Pubmed’s home page, type asthma and obesity in children in the search box and press the “Search” button. I get 1169 results, although the number may vary if you do the search at another time.

You can see the result in the first figure. If you look closely, in the left margin of the screen there is a list of text with headings such as “Article types”, “text availability”, etc. Each section is one of the filters that I have selected to be shown on my results screen. You see that there are two links below. The first one says “Clear all” and serves to unmark all the filters that we have selected (in this case, still none). The second one says “Show additional filters” and, if we click on it, a screen with all the available filters appears so that we can choose which ones we want displayed on the screen. Take a look at all the possibilities.

When we want to apply a filter, we just have to click on the text under each filter header. In our case we will filter only the clinical trials published in the last five years and of which the full free text is available (without having to pay a subscription). To do this, click on “Clinical Trial”, “Free full text” and “5 years”, as you can see in the second figure. You can see that the list of results has been reduced to 11, a much more manageable figure than the original 1169.

Now we can remove filters one by one (by clicking on the word “clear” next to each filter), remove them all (by clicking “Clear all”) or add new ones (clicking on the filter we want).
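If you prefer to see what the filters amount to, here is a minimal sketch in Python. It assumes that each filter we click is equivalent to adding a tagged term to the search string with AND. The tags used (`[pt]`, `[sb]`, `[dp]`) and the helper names are my assumptions for illustration; they are not shown in the figures of this post.

```python
# Hypothetical mapping from the filter labels used in this post to query tags.
# The tags are assumptions about Pubmed's search syntax, not taken from the post.
FILTER_TAGS = {
    "Clinical Trial": "clinical trial[pt]",
    "Free full text": "free full text[sb]",
    "5 years": '"last 5 years"[dp]',
}


def apply_filters(query, active_filters):
    """Return the query narrowed by every active filter, joined with AND."""
    parts = [f"({query})"] + [FILTER_TAGS[name] for name in active_filters]
    return " AND ".join(parts)


def clear_filter(active_filters, name):
    """Remove one filter, like clicking "clear" next to it."""
    return [f for f in active_filters if f != name]


active = ["Clinical Trial", "Free full text", "5 years"]
filtered_query = apply_filters("asthma and obesity in children", active)
print(filtered_query)
```

Each active filter can only shrink the list of results, which is why the original 1169 articles drop so sharply once three filters are combined.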

Two precautions to take into account with the use of filters. First, filters remain active until we deactivate them. If we do not notice and forget to deactivate them, they will be applied to the searches we do later and we will get fewer results than expected. Second, filters are built using the MeSH terms that have been assigned to each article at the time of indexing, so very recent articles, which have not been indexed yet and therefore have not had their MeSH terms assigned, will be lost when applying the filters. That is why it is advisable to apply the filters at the end of the search process, which is better made more specific using other techniques such as the use of MeSH or advanced search.

Another option we have with filters is to automate them for all the searches but without reducing the number of results. To do this we have to open an account in Pubmed by clicking on “Sign in to NCBI” in the upper right corner of the screen. Once we use the search engine as a registered user, we can click on a link at the top right that says “Manage filters” and select the filters we want. In the future, the searches that we do will be without filters, but at the top right you will see links to the filters that we have selected, with the number of results in parentheses (you can see it in the first two figures that I have shown). By clicking on them, we will filter the list of results in a similar way as we did with the other filters, which are accessible without registering.

I would not like to leave the topic of Pubmed and its filters without talking about another search resource: Clinical Queries. You can access them from the “Pubmed Tools” section on the home page of the search engine. Clinical Queries are a kind of filter built by Pubmed developers that restricts the search so that only articles related to clinical research are shown.

We type the search string in the search box and we obtain the results distributed in three columns, as you see in the third figure attached. In the first column they are sorted according to the type of study (etiology, diagnosis, treatment, prognosis and clinical prediction guidelines) and the scope of the search that may be more specific (“Narrow”) or less (“Broad”). If we select “treatment” and narrow range (“Narrow”), we see that the search is limited to 25 articles.

The second column lists systematic reviews, meta-analyses, reviews of evidence-based medicine, etc. Finally, the third focuses on papers on genetics.

If we want to see the complete list we can click on “See all” at the bottom of the list. We will then see a screen similar to the results of a simple or advanced search, as you see in the fourth attached figure. If you look at the search box, the search string has been slightly modified. Once we have this list we can modify the search string and press “Search” again, reapply the filters that suit us, etc. As you can see, the possibilities are endless.

And with this I think we’re going to say goodbye to Pubmed. I encourage you to investigate the many other options and tools that are explained in the tutorials of the website, some of which will require you to have an account at NCBI (remember it’s free). You can, for example, set alerts so that the search engine warns you when something new related to a certain search is published, among many other possibilities. But that’s another story…



We already know what Pubmed MeSH terms are and how an advanced search can be done with them. We saw that the search method by selecting the descriptors can be a bit laborious, but allowed us to select very well, not only the descriptor, but also some of its subheadings, including or not the terms that depended on it in the hierarchy, etc.

Today we are going to see another method of advanced search a little faster when it comes to building the search string, and that allows us to combine several different searches. We will use the Pubmed advanced search form.

To get started, click on the “Advanced” link under the search box on the Pubmed home page. This brings us to the advanced search page, which you can see in the first figure. Let’s take a look.

First there is a box with the text “Use the builder below to create your search” in which, initially, we cannot write. This is where the search string that Pubmed will use when we press the “Search” button is built. The string can be edited by clicking on the “Edit” link below and to the left of the box, which allows us to remove text from or add text to the search string built so far, in natural or controlled language, so that we can click the “Search” button and repeat the search with the new string. There is also a link below and to the right of the box that says “Clear”, with which we can erase its contents.

Below this text box we have the search string constructor (“Builder”), with several rows of fields. In each row we will introduce a different descriptor, so we can add or remove the rows we need with the “+” and “-” buttons to the right of each row.

Within each row there are several boxes. The first, which is not shown in the first row, is a dropdown with the boolean search operator. By default it marks the AND operator, but we can change it if we want. The following is a drop-down where we can select where we want the descriptor to be searched. By default it marks “All Fields”, all the fields, but we can select only the title, only the author, only last author and many other possibilities. In the center is the text box where we will enter the descriptor. On its right, the “+” and “-” buttons of which we have already spoken. And finally, in the far right there is a link that says “Show index list”. This is a help from Pubmed, because if we click on it, it will give us a list of possible descriptors that fit with what we have written in the text box.

As we enter terms in the boxes, create the rows we need and select the boolean operator of each row, the search string is formed. When we are finished, we have two options.

The most common will be to press the “Search” button and do the search. But there is another possibility, which is to click on the link “Add to history”, whereupon the search is stored at the bottom of the screen, where it says “History”. This will be very useful, since the saved searches can be entered as a block in the descriptor field when making a new search and combined with other searches or with series of descriptors. Do you think this is a little messy? Let’s make it clear with an example.

Suppose I treat my infants with otitis media with amoxicillin, but I want to know if other drugs, specifically cefaclor and cefuroxime, could improve the prognosis. Here are two structured clinical questions. The first one would say “Does cefaclor treatment improve the prognosis of otitis media in infants?” The second one would say the same but changing cefaclor to cefuroxime. So there would be two different searches, one with the terms infants, otitis media, amoxicillin, cefaclor and prognosis, and another with the terms infants, otitis media, amoxicillin, cefuroxime and prognosis.

What we are going to do is to plan three searches. A first one about articles on the prognosis of otitis media in infants; a second one about cefaclor; and a third one about cefuroxime. Finally, we will combine the first with the second and the first with the third in two different searches, using the boolean AND.

Let us begin. We write otitis in the text box of the first search row and click on the link “Show index list”. A huge drop-down appears with the list of related descriptors (when we see a word followed by a slash and another word, it means that it is a subheading of the descriptor). If we look down the list, there is an option that says “otitis / media infants” that fits well with what we are interested in, so we select it. We can now close the list of descriptors by clicking the “Hide index list” link. Now in the second box we write prognosis (we must follow the same method: write part of the word in the box and select the term from the index list). We have a third row of boxes (if not, press the “+” button). In this third row we write amoxicillin. Finally, we will exclude from the search those articles dealing with the combination of amoxicillin and clavulanic acid. We write clavulanic and click on “Show index list”, which shows us the descriptor “clavulanic acid”, which we select. Since we want to exclude these articles from the search, we change the boolean operator of that row to NOT.

In the second screen capture you can see what we have done so far. You see that the terms are in quotes. That’s because we’ve chosen the MeSHs from the index list. If we write the text directly in the box it will appear without quotes, which will mean that the search has been done with natural language (so the accuracy of the controlled language of MeSH terms will have been lost). Note also that in the first text box of the form the search string that we have built so far has been written, which says (((“otitis/media infants”) AND prognosis) AND amoxicillin) NOT “clavulanic acid”. If we wanted, we have already said that we could modify it, but we will leave it as it is.
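For those who like to see the mechanics, the way the builder nests the rows can be imitated with a few lines of Python. This is only a sketch of the pattern visible in the string above (each new row wraps everything built so far in parentheses); the function name and the row format are mine, and the `quoted` flag simply reproduces which terms appear in quotes in the string of this example.

```python
def build_query(rows):
    """Fold builder rows into one search string.

    rows is a list of (operator, term, quoted) tuples; the operator of the
    first row is ignored, as in the form, where the first row has none.
    """
    if not rows:
        return ""

    def fmt(term, quoted):
        return f'"{term}"' if quoted else term

    _, term, quoted = rows[0]
    query = fmt(term, quoted)
    for operator, term, quoted in rows[1:]:
        # Everything built so far is wrapped before the next row is appended.
        query = f"({query}) {operator} {fmt(term, quoted)}"
    return query


rows = [
    (None, "otitis/media infants", True),
    ("AND", "prognosis", False),
    ("AND", "amoxicillin", False),
    ("NOT", "clavulanic acid", True),
]
builder_query = build_query(rows)
print(builder_query)
```

The parentheses matter: the NOT row is applied to the result of the three previous rows taken together, not just to amoxicillin.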

Now we could click “Search” and make the search or directly click on the “Add to history” link. To see how the number of articles found can be reduced, click on “Search”. I get a list with 98 results (the number may depend on when you do the search). Very well, click on the link “Advanced” (at the top of the screen) to return to the advanced search form.

At the bottom of the screen we can see the first search saved, numbered as # 1 (you can see it in the third figure).

What remains to be done is simpler. We write cefaclor in the text box and click on the link “Add to history”. We repeat the process with the term cefuroxime. You can see the result of these actions in the fourth screen capture. You see how Pubmed has saved all three searches in the search history. If we now want to combine them, we just have to click on the number of each one (a window will open for us to choose the boolean we want, in this case all will be AND).

First we click on # 1 and # 2, selecting AND. You see the product in the fifth capture. Notice that the search string has become somewhat complicated: (((((otitis/media infants) AND prognosis) AND amoxicillin) NOT clavulanic acid)) AND cefaclor. As a curiosity I will tell you that, if we write this string directly in the simple search box, the result would be the same. It is the method used by those who totally dominate the jargon of this search engine. But we have to do it with the help of the advanced search form. We click on “Search” and we obtain seven results that will (or so we expect and hope) compare amoxicillin with cefaclor for the treatment of otitis media in infants.

We click again on the link “Advanced” and in the form we see that there is a further search, the # 4, which is the combination of # 1 and # 2. You can already have an idea of how complicated the searching could become combining searches with each other, adding or subtracting according to the boolean operator that we choose. Well, we click on # 1 and # 3 and press “Search”, finding five articles that should deal with the problem we are looking for.
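The game of saving searches and combining them by number can also be sketched in Python. The class below is a toy model that I have made up for illustration; it is not how Pubmed stores the history, and it adds one pair of parentheses fewer than the form does, which does not change the logic of the string.

```python
class SearchHistory:
    """Toy model of the "History" section: numbered, reusable searches."""

    def __init__(self):
        self._searches = []

    def add(self, query):
        """Store a search and return its number (#1, #2, ...)."""
        self._searches.append(query)
        return len(self._searches)

    def get(self, number):
        """Return the search string saved under the given number."""
        return self._searches[number - 1]

    def combine(self, first, second, operator="AND"):
        """Combine two saved searches and store the result as a new one."""
        combined = f"({self.get(first)}) {operator} {self.get(second)}"
        return self.add(combined)


history = SearchHistory()
history.add("(((otitis/media infants) AND prognosis) AND amoxicillin) NOT clavulanic acid")
history.add("cefaclor")
history.add("cefuroxime")
with_cefaclor = history.combine(1, 2)      # saved as #4
with_cefuroxime = history.combine(1, 3)    # saved as #5
print(history.get(with_cefaclor))
print(history.get(with_cefuroxime))
```

The point of the history is precisely this reuse: search # 1 is written once and combined twice, instead of building the whole string again for each drug.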

We are coming to the end of my comments for today. I think it has been fully demonstrated that the use of MeSH terms and advanced search yields more specific results than simple search. The usual thing with the simple search with natural language is to obtain endless lists of articles, most of them without interest for our clinical question. But we have to keep one thing in mind. We have already mentioned that a number of people are dedicated to assigning the MeSH descriptors to articles that enter the Medline database. Of course, some time passes between the moment an article enters the database and the moment it is indexed (the MeSH terms are assigned), and during that time we cannot find it using MeSH terms. For this reason, it might not be a bad idea to do a natural language search after the advanced one and see if there are any articles at the top of the list that might interest us and are not indexed yet.

Finally, note that searches can be stored by downloading them to your disk (by clicking the link “download history”) or, much better, by creating an account in PubMed by clicking on the link at the top right of the screen that says “Sign in to NCBI”. This is free and allows us to save the search from one time to another, which can be very useful to use other tools such as Clinical Queries or search filters. But that is another story…

The jargon of the search engine


We saw in a previous post how to do a Pubmed search using the simplest system, which is to enter natural language text in the simple search box and press the “Search” button. This method is quite easy and even works quite well when we are looking for something about very rare diseases but, in general, it will give us a very sensitive and unspecific results list, which in this context means that we will get a large number of articles, but many of them will have little to do with what we are looking for.

In these cases we will have to use some tool to make the result more specific: fewer articles, and more closely related to the problem that originated the search. One of the ways is to perform an advanced search instead of a simple search, but for this we will have to use the search engine’s own jargon, the so-called thematic descriptors of controlled language.

A descriptor is a term used to construct indexes, also called thesauri. Instead of using the words of the natural language, they are selected or grouped under specific terms, which are to serve as a key in the index of the search engine database.

The thesaurus, formed by the set of descriptors, is specific to each search engine, although many terms may be common. In the case of Pubmed the descriptors are known as MeSH terms, which are the initials of Medical Subject Headings.

This thesaurus or list of terms with controlled vocabulary has also been developed by the National Library of Medicine and constitutes another database with more than 30,000 terms that are updated annually. Within the National Library there are a number of people whose mission is to analyze the new articles that are incorporated into the Medline database and assign them the descriptors that best fit their content. Thus, when we search using a particular descriptor, we will find the articles that are indexed with this descriptor.

But the matter of the descriptors is a little more complicated than it may seem, since they are grouped in hierarchies (MeSH Tree Structures) and the same descriptor can belong to several hierarchies. In addition, descriptors have subheadings, so that we can search using the general MeSH term or further narrow the search using one of its subheadings. The truth is that reading all this makes us want to forget about searching with the thesaurus, but we cannot afford that luxury: the search using the MeSH database is the most effective and accurate, since the language has been controlled to eliminate the inaccuracies and synonyms of natural language.

Also, the thing is not so complicated when we get to work with it. Let’s see it with the example we used to illustrate the simple search. We want to compare the efficacy of amoxicillin and cefaclor on the duration of otitis media in infants. After elaborating the structured clinical question we obtain our five search terms, in natural language: otitis, infants, amoxicillin, cefaclor and prognosis.

Now we can go to the Pubmed home page (remember the shortcut: type pubmed in the browser bar and press control-enter). Below the simple search window we saw that there are three columns. We look at the one on the right, “More Resources”, and click on the first option, “MeSH Database”, which gives us access to the homepage of the descriptor database (as seen in the first figure). If we write otitis in the search window we see that Pubmed lends us a hand by displaying a list of terms that look like what we are writing. One of them is otitis media, which is what we are interested in, so we select it and Pubmed takes us to the next page, where there are several options to choose from. At the moment I do the search there are three options: “Otitis Media”, “Otitis Media, Suppurative” and “Otitis Media with Effusion”. Notice that Pubmed defines each term, so that we understand well what each one means. These are the three MeSH terms that fit what we asked for, but we have to choose one.

The simplest thing we can do from this window is to check the selection box to the left of the term that interests us and click the button on the right side of the screen that says “add to search builder”. If we do this, Pubmed begins to construct the search string starting with the chosen term. If we do this with the first term in the list you will see that the text “Otitis Media”[Mesh] appears in the text box “Pubmed Search Builder”, on the top right of the screen (as you can see in the attached figure). But remember that we have said that the MeSH terms have subheadings. To get them, instead of marking the selection box of the term “Otitis Media”, we click on the term, opening the window with the subheadings, as you can see in the second figure.

Each of the terms with their selection box on the left corresponds to a subheading of the descriptor “Otitis Media”. For example, if we were interested in doing a search directed to the cost of the treatment, we could mark the subheading “economics” and then press the button to add to the search. The text that would appear in the text box of the search string would be “Otitis Media / economics” [Mesh] and the search result would be a bit more specific.

Before leaving the MeSH term window let’s look at a few details. In addition to the subheadings, which can be more or less numerous, the bottom of the page shows the hierarchy of the descriptor (MeSH Tree Structure). Our descriptor is in bold, so we can see which terms it depends on and which ones depend on it. In some cases we may be more interested in using a higher term for the search, so we will have to click on it to go to its own window. If we do this, in general, the search will be more sensitive and less specific (more empty vessels).

We can also click on a term that is below the hierarchy, making the search more specific and decreasing the number of results.

And it does not end here. If we select a MeSH term for the search, it includes the terms that are below it in the hierarchy. For example, if we select the descriptor “Otitis Media”, Pubmed will include in the search all those that hang from it (mastoiditis, otitis with effusion, suppurative otitis and petrositis, which may not interest us at all). This can be avoided by checking the box that says “Do not include MeSH terms found under this term in the MeSH hierarchy”.

Well, I think we’re going to finish with this example, if there is still someone reading at this point. Let’s say we chose the simplest way: we go to “Otitis Media” and add it to the search. Next we write the second search term in the search window of the database: infants. We get 14 possibilities, select the first (“Infant”) and add it to the search. We do the same with “Amoxicillin”, “Cefaclor” and “Prognosis”. When we have added all of them to the search string (note that the default boolean operator is AND, but we can change it), the search string is as follows: ((((“Otitis Media”[Mesh]) AND “Infant”[Mesh]) AND “Amoxicillin”[Mesh]) AND “Cefaclor”[Mesh]) AND “Prognosis”[Mesh].
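To fix ideas, here is a minimal Python sketch of how a string like this is put together from the chosen descriptors. The function names are mine. The `[Mesh:NoExp]` tag is my assumption of what the “Do not include MeSH terms found under this term” box produces, and the subheading is written without spaces around the slash; check both against the search builder before relying on them.

```python
def mesh_term(descriptor, subheading=None, explode=True):
    """Format one descriptor as a tagged MeSH search term.

    explode=False is assumed to correspond to the box that excludes the
    terms found below the descriptor in the hierarchy.
    """
    heading = f"{descriptor}/{subheading}" if subheading else descriptor
    tag = "Mesh" if explode else "Mesh:NoExp"
    return f'"{heading}"[{tag}]'


def and_chain(terms):
    """Join terms with AND, nesting parentheses the way the builder does."""
    query = terms[0]
    for term in terms[1:]:
        query = f"({query}) AND {term}"
    return query


descriptors = ["Otitis Media", "Infant", "Amoxicillin", "Cefaclor", "Prognosis"]
mesh_query = and_chain([mesh_term(d) for d in descriptors])
print(mesh_query)
print(mesh_term("Otitis Media", subheading="economics"))
print(mesh_term("Otitis Media", explode=False))
```

Once the string is written this way, adding a subheading or switching off the lower terms of the hierarchy is just a change in one of the five pieces.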

Finally, click the “Search PubMed” button and get the search result, which in this case is a bit more restricted than we obtained with natural language (this is usually the case).

If we wanted to remove the articles about the treatment with clavulanic acid, as we did in the example with the simple search, we could add the term clavulanate as we added the other terms, but changing the boolean operator AND for the NOT operator. But there is another way that is even simpler. If you notice, when Pubmed gives us the list of results, the search string that has been used is written in the Pubmed search window, and we can add or remove terms from this string, using MeSH or natural language terms, whichever we prefer. So, in our example, we would add NOT clavulanate to the text string in the search box and click on the “Search” button again.

And here we are going to leave it for today. Just saying that there are other ways to use MeSH terms, using the advanced search form, and we can further narrow the results using some resources, like the Clinical Queries or using limits. But that is another story…

The oyster with the thousand pearls


We saw in a previous post that our ignorance as doctors is huge, which forces us to ask ourselves questions about what to do with our patients on numerous occasions.

At this point, we will be interested in seeking and finding the best available evidence on the subject that occupies us, for which we will have to do a good bibliographical search. Although the bibliographic search is defined as the set of manual, automatic and intellectual procedures aimed at locating, selecting and retrieving references or works that respond to our question, the vast majority of the time we simplify the process and we just do a digital search.

In these cases we will have to resort to one of the many biomedical databases available to find the pearl that clarifies our doubt and helps remedy our ignorance. Of all these databases, there is no doubt that the most widely used is Medline, the database of the National Library of Medicine. The problem is that Medline is a very large database, with about 16 million articles from more than 4800 scientific journals. So, as is easy to assume, finding what you are looking for may not be a simple task on many occasions.

In fact, when we use Medline what we use is a tool that is known as Pubmed. This is a project developed by the National Center for Biotechnology Information (NCBI for friends), which allows access to three National Library of Medicine databases: Medline, PreMedline and AIDS. These databases are not filtered, so we will need critical reading skills to evaluate the results (there are other resources that give the information already filtered), since the searcher provides nothing more (and nothing less) than the article reference and, in many cases, a brief summary. Best of all, it’s free, which is not the case with all the search tools available.

So, if we want to explore this oyster with thousands of pearls, we will have to learn how to use Pubmed to find the pearls we are looking for. You can enter Pubmed by clicking on this link, although a shortcut is to type pubmed in the address bar of the browser and press control-enter. The browser will know where we want to go and will redirect us to the Pubmed home page. Let’s take a look before we start using it (see the first attached figure) (Pubmed’s look changes from time to time, so something may have changed since I wrote this post, probably for the better).

The first thing we see is the simple search box, where we can type the search terms to get the results by clicking the “Search” button. You see that under this box there is a link that says “Advanced”, with which we will access the advanced search screen, which we will talk about another day. Today we will focus on the simple search.

Below are three columns. The first one says “Using PubMed.” Here you can find help on the use of this tool, including tutorials on the different search modalities and tools that Pubmed includes. I advise you to dive into this section to discover many more possibilities of this search engine than the few that I will tell you about in this post.

The second column is the “PubMed Tools”. Here are two of special interest, the “Single Citation Matcher”, to find the reference in PubMed of a specific article knowing some aspects of its bibliographic citation, and “Clinical Queries”, that allow us to filter the results of the searches according to the type of studies or their characteristics.

The third column shows the search engine’s resources, such as the MeSH database, which is nothing more than the thesaurus of search terms that Pubmed uses.

Well, let’s get some practice. Let us think, for example, that I want to know if it is better to use amoxicillin or cefaclor for the treatment of otitis in infants so that the evolution of the disease is less prolonged. Logically, I cannot write this as it is. First I have to build my structured clinical question and then use the components of the question as search terms.

My question would be: in (P) infants with otitis, does (I) treatment with cefaclor, (C) compared to treatment with amoxicillin, (O) reduce the duration of the disease? So, with this example, we could use five search terms: otitis, infants, amoxicillin, cefaclor and duration.

In the simple search we will simply enter the words in the search box (natural language) and click on the “Search” button.

The search box supports boolean operators, which are “and”, “or” and “not” (they are often capitalized: AND, OR and NOT). When we put several words in a row without any boolean operators, Pubmed understands that the words are separated by AND. Thus, if we have a term consisting of two words and we want it to be considered as one, we will have to write it in quotation marks. For example, if we write acute appendicitis and we want it to count as a single term, we will have to introduce “acute appendicitis”.
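These rules are easy to put into a small Python sketch. It only builds the text that we would type in the search box; the function names are mine and nothing here talks to Pubmed.

```python
def quote_phrase(term):
    """Wrap multi-word terms in quotes so they count as a single term."""
    if " " in term and not term.startswith('"'):
        return f'"{term}"'
    return term


def simple_query(terms, exclude=()):
    """Join terms with an explicit AND and append one NOT per excluded term."""
    query = " AND ".join(quote_phrase(t) for t in terms)
    for term in exclude:
        query += f" NOT {quote_phrase(term)}"
    return query


pico_terms = ["otitis", "infants", "amoxicillin", "cefaclor", "duration"]
print(simple_query(pico_terms))
print(simple_query(["acute appendicitis", "children"]))
print(simple_query(["otitis", "infants"], exclude=["clavulanate"]))
```

Writing the AND explicitly is not strictly necessary, since the words in a row are joined by AND anyway, but it makes the string easier to read when OR and NOT start to appear.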

Another useful operator is truncation, which consists of placing an asterisk (a wildcard) at the end of the root of the word to search for all words that begin with that root. For example, infan* will search for infant, infancy…
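If the idea of truncation is not clear, this little Python sketch imitates it on a made-up list of words. It uses the wildcard matching of the standard `fnmatch` module as a stand-in; it illustrates the concept and is not a description of how Pubmed expands the asterisk internally.

```python
from fnmatch import fnmatchcase


def matches_truncated(pattern, words):
    """Return the words that match a truncated term such as 'infan*'."""
    return [w for w in words if fnmatchcase(w.lower(), pattern.lower())]


# Made-up vocabulary, only for illustration.
vocabulary = ["infant", "infancy", "infants", "inflammation", "child"]
truncated_hits = matches_truncated("infan*", vocabulary)
print(truncated_hits)
```

The shorter the root we leave before the asterisk, the more words will match, so truncation makes the search more sensitive and less specific.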

Let’s go with our example. We write otitis AND infants AND amoxicillin AND cefaclor AND course and click on “Search” (see the second attached figure). We were lucky: we get only 11 results (you can get a different number if you do the search at another time).

We take a look and see that the works are more or less adjusted to what we are looking for. The only drawback is that it includes articles that study the effect of amoxicillin-clavulanate, which we are not interested in. Well, we’re going to take them off. To the text of search we add NOT clavulanate, and we get an even more limited search.

We only have to select or click on the works that interest us to get the summary (if available) and, in some cases, even get access to the full text, although this will depend on whether the text is freely accessible or on the permissions or subscriptions of the institution from which we access Pubmed.

So far we have seen the simplest way to search with Pubmed: simple search with free text. The problem is that this form of search is not always going to give such a specific result. It will be much more frequent that we get thousands of results, most of them without any interest for us. In these cases we will have to resort to other resources such as advanced search, the use of MeSH terms or the use of Pubmed Clinical Queries. But that is another story…

Those who have no questions…


…will never have answers. My biochemistry teacher taught me this almost two lives ago, when I was a freshman in medicine. I don’t remember what else she taught me, but I have this etched into my memory because, I don’t want to remember how many years later, it still remains valid.

And it turns out that the wheel of evidence-based medicine starts spinning with a question. Of course, the problem is that, in medicine, we do not always get an answer for a lot of questions, and according to some, in four out of five times we will not get a satisfactory answer, no matter how well we look for it.

We physicians, let’s face it, are pretty ignorant, and anyone who thinks otherwise does so because he doesn’t know how ignorant he is, which is much worse and more dangerous. We are often challenged by gaps in our knowledge that we want to fill with the available information. It has been estimated that, at Primary Care level, we ask two questions for every 10 patients we receive, increasing this number to five for each patient admitted to Hospital Care. It is easy to understand that we cannot do a bibliographic search every time we have a question, so we will have to set priorities.

At our beginnings, when we are very, very ignorant, the questions are quite general. These are called background questions, seeking information on general aspects of diseases and treatments. They are usually composed of a root with a word like how, how much, when or something similar, and a verb followed by the disease or whatever we are dealing with. Questions of this kind are, for example, “what germ causes risperidiosis?” or “how do we treat a dander attack?”

In general, the answer to background questions can be found in textbooks or review articles. There are digital sources of reviews on general topics, such as the one that is undoubtedly one of the most worshiped: UpToDate. We will all meet some uptodater, who are people easily recognizable because, in the first hour of the morning, they already have the latest information obtained from UpToDate, so they give you the answer even before you have asked yourself the question.

But, as we become wiser, the questions that we ask start to involve specific aspects of treatment, prognosis or whatever of a disease in a given patient or population. These advanced or foreground questions often have characteristics that differ qualitatively from those of the background questions: they are usually asked as part of clinical decision making, when we are seeking information about a problem we are interested in.

Therefore, it’s essential to set them properly and formulate them clearly because, if not, they won’t serve to plan the search strategy or to make the right decision that we’re looking for. They are formed by what is known as a structured clinical question, also known in the jargon of evidence-based medicine as a PICO question, after the initials of its components, as we can see below.

P stands for patient, but also for the problem of interest or the clinical description of the situation that we are studying. We must define very well the most relevant characteristics of the group of patients or the population that originated the question, trying not to restrict too much the characteristics of the group, because it may happen that later we find nothing that answers the question. It is often preferable to select the population more generally and, if the search is unspecific (we have many results), we can always restrict it later.

I represents the main intervention, which can be a treatment, a diagnostic test, a risk factor or exposure, etc. C refers to the comparison with which we contrast the intervention, and may be another treatment, placebo or, sometimes, doing nothing. This component is not mandatory in the structure of the question, so we can omit it when we do not need it.

Finally, O represents the outcome of clinical interest in our question, whether in terms of symptoms, complications, quality of life, morbidity, mortality, or any other outcome variable we choose. It is important to emphasize that the outcome we choose should matter from the clinical point of view and, especially, from the point of view of the patient. For example, in a study to prevent coronary disease, we can measure the effect as a decrease in troponin, but the patient will certainly appreciate it more if we estimate the decrease in mortality from myocardial infarction.

Sometimes, as I have already said, it’s not relevant to do any comparison with anything, so PICO becomes PIO. Some people add a fifth parameter, the time, and PICO becomes PICOt. You can also see it as PECO or PECOt if you prefer to say exposure rather than intervention. But, no matter what letters you use, the important thing is to divide the question into its components, because these elements will be the ones that will determine the keywords for the search of information and the type of study design that we’ll need to find the answer (some people add the type of study design as a fifth or sixth letter to PICO).

It’s very important to find a good balance between the scope and the precision of the question. For instance, the question “in infants with cranial trauma, does treatment with corticosteroids improve the prognosis?” may be too general to be of any use. Conversely, “in 3-6 month-old infants who fall from a crib 20 centimeters high and hit the left side of the forehead against a carpeted floor, can we improve the prognosis using methylprednisolone at a dose of 2 mg/kg/day for five days?” seems to me too specific to be used in the search strategy or to be useful for clinical decision making. A better way to structure the question would be something like “in infants with minor cranial trauma (the criteria for minor trauma must be established beforehand), does steroid treatment improve the prognosis?” P would be the infants who suffer the trauma, I the treatment with corticosteroids, C would be, in this case, not giving steroids and, finally, O would be the prognosis (which could be replaced by something more specific such as the probability of hospital admission, time until discharge, death, etc.).

Let’s see another example: in (P) infants with bronchiolitis, does the use of intravenous corticosteroids (I), instead of inhaled ones (C), decrease the risk of hospital admission (O)? Or this one: in (P) infants with otitis, does the use of antibiotics (I) shorten the duration of illness (O)?
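To make the decomposition more tangible, here is a minimal Python sketch of how the components of these two questions could be stored and turned into search keywords. The class name PicoQuestion and the method to_query are hypothetical, invented just for illustration, and the result is a generic Boolean string, not the syntax of any particular database.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PicoQuestion:
    """Hypothetical container for the components of a structured clinical question."""
    patient: str
    intervention: str
    outcome: str
    comparison: Optional[str] = None  # optional: without it, PICO becomes PIO

    def to_query(self) -> str:
        """Join the components that are present with AND (generic Boolean string)."""
        parts = [self.patient, self.intervention, self.comparison, self.outcome]
        return " AND ".join(f"({part})" for part in parts if part)


# The two example questions from the text
bronchiolitis = PicoQuestion(
    patient="infants with bronchiolitis",
    intervention="intravenous corticosteroids",
    comparison="inhaled corticosteroids",
    outcome="hospital admission",
)
otitis = PicoQuestion(
    patient="infants with otitis",
    intervention="antibiotics",
    outcome="duration of illness",
)

print(bronchiolitis.to_query())
print(otitis.to_query())
```

The otitis question has no comparison, so its string contains only three terms. In a real search each term would still have to be replaced by the right descriptor for the database being used.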

Depending on the type of answer that they are looking for, clinical questions can be classified into four basic types: diagnosis, treatment, prognosis and etiology/harm. Diagnostic questions are about how to select and interpret diagnostic tests. Treatment questions deal with choosing the treatment that provides more benefits than risks at the lowest cost in money and resources. Prognosis questions give us the probability of a certain clinical course and help us anticipate complications. Finally, etiology/harm questions are those that serve to identify the causes of diseases, including iatrogenic ones.

The type of question is important because it will define the type of study design that is most likely to answer our question. Thus, diagnostic questions are best answered with studies specifically designed for the evaluation of diagnostic tests. Treatment or harm questions can be answered with clinical trials (ideally) or with observational studies. Prognostic questions, however, usually require observational studies to find the answer. Finally, it is worth mentioning that there are other types of clinical questions besides the four basic ones, such as frequency questions (which are answered using systematic reviews and observational studies) or cost-benefit questions (which need economic evaluation studies).
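The correspondence between the type of question and the study design can be summarized as a simple lookup table. The following Python sketch only restates the paragraph above; the names PREFERRED_DESIGNS and preferred_designs are hypothetical, and the order within each list just follows the order in which the text mentions the designs.

```python
from typing import Dict, List

# Assumption: keys and labels are illustrative and simply mirror the text
PREFERRED_DESIGNS: Dict[str, List[str]] = {
    "diagnosis": ["diagnostic test evaluation study"],
    "treatment": ["clinical trial", "observational study"],
    "etiology/harm": ["clinical trial", "observational study"],
    "prognosis": ["observational study"],
    "frequency": ["systematic review", "observational study"],
    "cost-benefit": ["economic evaluation study"],
}


def preferred_designs(question_type: str) -> List[str]:
    """Return the study designs the text associates with a type of question."""
    key = question_type.strip().lower()
    if key not in PREFERRED_DESIGNS:
        raise ValueError(f"Unknown type of clinical question: {question_type!r}")
    return PREFERRED_DESIGNS[key]


print(preferred_designs("Treatment"))
print(preferred_designs("prognosis"))
```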

A well-structured clinical question can help us to solve a clinical problem, but it also often raises more questions, with which we can fill the gaps in our knowledge and become a little less ignorant. In addition, if we don’t structure our question in its different components, it will be practically impossible to find useful information. Those of you who don’t believe what I’m saying, just write “asthma” in the search field of PubMed or any other search engine and see the number of results. Some search engines, such as Trip Database, even allow searching using the PICO structure of the structured clinical question. But unfortunately, in most cases we’ll have to find synonyms for each component and the right descriptor for the database where we are doing the search, usually using advanced search techniques. But that’s another story…