Author: Chad Cook, PT, PhD, FAPTA

Physical therapists commonly compare two or more things to one another. For example, I’ve frequently heard the diagnostic accuracy of one test compared to another when defending or rejecting the use of a special test. I’ve also heard reports that one intervention is more effective than another; in most cases, incorrectly. Often these judgments are not apples-to-apples comparisons and depend markedly on the context and type of the comparison group. If you’ll indulge me, I’ll give a non-physical-therapy example to reinforce my point.

I routinely mention that I score “The Last Jedi” as a 6 out of 10 when comparing it to “The Empire Strikes Back”, the film I believe is the best in the Star Wars saga. Nonetheless, I score “The Empire Strikes Back” as a 3 out of 10 when I compare it to my favorite movie, John Carpenter’s “The Thing”. When a detail such as this is provided (two very different comparisons), the score for “The Last Jedi” doesn’t look so good. In truth, compared to “The Thing”, I would score “The Last Jedi” in the negatives (it looks like I’m not alone on this), and that’s not even possible. Yet it scored a 6 out of 10 in one of the comparisons above.

Having a ranked perspective when considering physical therapy interventions would be nice. Unfortunately, we rarely provide this much detail when we report the effectiveness of selected interventions relative to their comparators. Frequently, we hear vague comments such as “exercise is as good as or better than other treatment options”. Well, what exactly does this mean? Is this information beneficial for clinicians? Is this a testimony in support of exercise or in spite of it? I believe there are two ways to improve how we message comparisons: 1) as researchers, improve the way we report the findings; and 2) as research consumers, read beyond the abstract (if the paper is available).

Improving how we report findings: The Cochrane handbook [1] emphasizes the importance of reporting details about the comparator groups. When describing comparisons of specific interventions, it defines three primary types of comparators:

  1. Intervention versus placebo (e.g., placebo drug, sham surgical procedure, psychological placebo);
  2. Intervention versus control (e.g., no intervention, wait-list control, usual care);
  3. Intervention A versus Intervention B (this may include the same intervention with different timing or dosage parameters, or two different interventions).

The first two types of comparisons aim to establish the effectiveness of an intervention, whereas the last aims to compare the efficacy of two interventions [1]. I would argue that Cochrane could have divided these into four primary types of comparators, since usual care is a form of “active” intervention (in other words, the patient is doing something). In a musculoskeletal population, usual care will likely exhibit an effect that is different from no intervention or a wait-list control. In addition, we are typically interested in determining whether a new intervention is worth adopting, and a comparison to usual care is a way of assessing this.

As researchers, we must sub-classify the comparator types. However, the Cochrane handbook [1] recognizes that there are often limited subsets of comparative types and recommends combining interventions first, then sub-typing if sufficient studies are available. The aforementioned “three primary types of comparators” is a great place to start when considering sub-typing. If possible, sub-type all three areas, especially when doing so addresses an important clinical and research question.

As research consumers, if the paper is available, read beyond the abstract: We are at a point when there are almost as many systematic reviews as there are studies included in those reviews. In fact, it is safe to say that most systematic reviews are misleading and unlikely to reflect the underlying truth [2]; the majority include inferior articles or a menagerie of comparative types that have no business being combined. Reading beyond the abstract is also essential to determine whether the intervention and comparator groups are appropriate, whether the GRADE scores suggest very low confidence in the findings, when the outcomes were measured, whether bias is notable (and what type of bias), whether the study interventions had appropriate fidelity, and so on. This is of critical importance when comparison groups are labeled placebo or sham but fail to meet the requirements of these types of studies, which is very common in sham manual therapy attempts [3]. Frankly, those who haven’t read the paper and have only looked at the abstract shouldn’t be summarizing it in public or making critical clinical decisions from it. Abstracts are notoriously laced with spin [4] (which does influence readers’ perceptions of the findings) and are often erroneous [5].

Summary: I’ve indicated before that a dirty little secret among physical therapists is that there really aren’t any superior interventions in musculoskeletal care at medium- and long-term follow-up [6-10]. In fact, differences are as likely to be associated with methodological nuances as with the effects of the interventions. This is why understanding the comparisons is so essential. If you hear that something has an effect versus a wait-list or no-treatment control, it means much less than if it has an effect against another efficacious intervention, given potential shared pain mechanisms. If you hear that something has a similar effect to placebo or sham, make sure to check that the placebo and sham truly reflect what these terms mean; most times, they do not. If you are told that something is similar to or better than usual care, consider the interventions’ costs, risks, and patient burden before adopting or abandoning a technique. When considering risk, ask which approach is least invasive and may attain the desired effect with the least risk of progressive disability related to surgery or medication. If you hear “exercise is as good as or better than other treatment options”, then recognize that this likely means all comparisons were dumped in a bucket without considering the details of the patients’ symptomatic response, and you can’t make heads or tails of what is happening.

References

  1. McKenzie JE, Brennan SE, Ryan RE, Thomson HJ, Johnston RV, Thomas J. Chapter 3: Defining the criteria for including studies and how they will be grouped for the synthesis. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.3 (updated February 2022). Cochrane, 2022. Available from training.cochrane.org/handbook.
  2. Ioannidis JP. The Mass Production of Redundant, Misleading, and Conflicted Systematic Reviews and Meta-analyses. Milbank Q. 2016 Sep;94(3):485-514.
  3. Puhl AA, Reinhart CJ, Doan JB, Vernon H. The quality of placebos used in randomized, controlled trials of lumbar and pelvic joint thrust manipulation: a systematic review. Spine J. 2017 Mar;17(3):445-456.
  4. Jankowski S, Boutron I, Clarke M. Influence of the statistical significance of results and spin on readers’ interpretation of the results in an abstract for a hypothetical clinical trial: a randomised trial. BMJ Open. 2022 Apr 8;12(4):e056503.
  5. Nascimento DP, Ostelo RWJG, van Tulder MW, Gonzalez GZ, Araujo AC, Vanin AA, Costa LOP. Do not make clinical decisions based on abstracts of healthcare research: A systematic review. J Clin Epidemiol. 2021 Apr 8:136-157.
  6. Chmielewski TL, George SZ, Tillman SM, et al. Low- versus high-intensity plyometric exercise during rehabilitation after anterior cruciate ligament reconstruction. Am J Sports Med. 2016;44:609–617.
  7. Cook CE, George SZ, Keefe F. Different interventions, same outcomes? Here are four good reasons. Br J Sports Med. 2018;52(15):951-952.
  8. Day MA, Ehde DM, Burns J, et al. A randomized trial to examine the mechanisms of cognitive, behavioral and mindfulness-based psychosocial treatments for chronic pain: Study protocol. Contemp Clin Trials. 2020;93:106000.
  9. Di Blasi Z, Harkness E, Ernst E, Georgiou A, Kleijnen J. Influence of context effects on health outcomes: a systematic review. Lancet. 2001;357(9258):757-762.
  10. O’Keeffe M, Purtill H, Kennedy N, et al. Comparative Effectiveness of Conservative Interventions for Nonspecific Chronic Spinal Pain: Physical, Behavioral/Psychologically Informed, or Combined? A Systematic Review and Meta-Analysis. J Pain. 2016 Jul;17(7):755-774.