Is Myofascial Pain Syndrome a Legitimate Primary Diagnosis?
By: Chad E Cook, Damian Keter, Ken Learman
Background
Myofascial Pain Syndrome (MPS) is hypothesized to be either a primary or a secondary chronic pain disorder that can refer symptoms to other parts of the body. MPS is relatively common, affecting millions of people worldwide, particularly those who have experienced muscle overuse, trauma, or stress [1]. MPS can significantly impact daily activities and quality of life, as the persistent pain and discomfort can be both physically and emotionally draining [2]. Despite its notable impact on health and wellness, MPS is a controversial diagnosis, mainly because of the lack of consensus on its diagnostic criteria and underlying mechanisms. The objective of this blog is to identify whether MPS meets current criteria as a unique diagnosis, using the four criteria from the World Health Organization (WHO).
Diagnostic Criteria
The WHO, through its International Classification of Diseases (ICD), provides a global standard for diagnostic health data, facilitating international comparisons and collaborations in healthcare. For each unique diagnosis, the WHO requires four criteria [3]: 1) specificity, 2) consistency, 3) significance, and 4) diagnostic stability. These criteria have allowed the WHO to differentiate between competing conditions such as influenza and COVID-19, and to recognize new diseases/syndromes such as e-cigarette, or vaping, product use-associated lung injury (EVALI) or Post-Traumatic Stress Disorder (PTSD) due to complex trauma in childhood.
Specificity suggests that the condition must have a clear and specific set of symptoms and characteristics that distinguish it from other conditions. Consistency requires that the symptoms and characteristics should be reliably observed across different patients and settings. Significance involves its impact on the individual’s health, functioning, or quality of life. Diagnostic Stability suggests that the diagnosis should remain stable over time, meaning that the condition does not frequently change or evolve into another condition.
Based on the WHO criteria, is MPS a stand-alone, primary diagnosis? The answer is both “yes” and “no”.
According to the WHO, MPS refers to a musculoskeletal disorder characterized by pain originating from tight muscles and the surrounding fascia, often presenting as sensitive “trigger points” that can cause localized pain and referred pain to other areas of the body; this pain can be chronic and is often associated with repetitive motions, poor posture, or stress [4]. Under the ICD-11, MPS is classified under both chronic primary pain and chronic secondary musculoskeletal pain. The criteria for chronic primary pain include persistent or recurrent pain for at least three months, with significant emotional distress or functional disability. For chronic secondary musculoskeletal pain, the pain is associated with a musculoskeletal condition and persists beyond the usual recovery period. Despite these descriptions from the WHO, neither classification (chronic primary or chronic secondary) meets all four of the original WHO criteria.
In truth, MPS does not have a clear, consistent set of signs and symptoms that distinguishes it from other diagnoses (it lacks specificity), which undermines consistency as well. These are the reasons it is difficult to differentiate MPS from other diagnostic conditions such as fibromyalgia, tension-type headaches, and chronic fatigue syndrome. MPS is also traditionally categorized as a nociceptive pain condition, but there is growing evidence suggesting it can also involve neuropathic or nociplastic pain components. Further, because there are no specific laboratory indicators or imaging findings, there are no unified diagnostic criteria for MPS, adding to the confusion with other diseases [5].
Summary
Given the prevalence of MPS, it is likely that most physical therapists and chiropractors see a high percentage of these individuals in their outpatient practices. As currently defined by ICD coding, MPS is highly likely to contribute to patients’ pain experience as a secondary condition [6], which can be very debilitating to the individual [7]. This is likely why other global healthcare groups, such as the International Association for the Study of Pain, support the contribution of MPS, which they characterize as local and referred pain perceived as deep, dull, pressure, and aching, along with the presence of myofascial trigger points in any part of the body [8]. As the complex nature of pain and associated pain conditions is further unraveled, MPS may find a better home as a primary pain condition. Currently, however, MPS is difficult to differentiate from other conditions and is likely a secondary contributor to most musculoskeletal conditions seen by rehabilitation providers.
References
1. Li X, Lin Y, He P, Wang Q. Efficacy and safety of low-intensity ultrasound therapy for myofascial pain syndrome: a systematic review and meta-analysis. BMC Musculoskelet Disord. 2024 Dec 23;25(1):1059.
2. Jaeger B. Myofascial trigger point pain. Alpha Omegan. 2013;106(1–2):14–22.
3. Hebert O, Schlueter K, Hornsby M, Van Gorder S, Snodgrass S, Cook C. The diagnostic credibility of second impact syndrome: A systematic literature review. J Sci Med Sport. 2016 Oct;19(10):789-94.
4. Qureshi N, Hamoud AA, Gazzaffi IMA. Myofascial Pain Syndrome: A Concise Update on Clinical, Diagnostic and Integrative and Alternative Therapeutic Perspectives. International Neuropsychiatric Disease Journal. 2019 Mar;13(1):1-14.
5. Cao QW, Peng BG, Wang L, Huang YQ, Jia DL, Jiang H, Lv Y, Liu XG, Liu RG, Li Y, Song T, Shen W, Yu LZ, Zheng YJ, Liu YQ, Huang D. Expert consensus on the diagnosis and treatment of myofascial pain syndrome. World J Clin Cases. 2021 Mar 26;9(9):2077-2089. doi: 10.12998/wjcc.v9.i9.2077.
6. Plaut S. Scoping review and interpretation of myofascial pain/fibromyalgia syndrome: An attempt to assemble a medical puzzle. PLoS One. 2022 Feb 16;17(2):e0263087.
7. Lam C, Francio VT, Gustafson K, Carroll M, York A, Chadwick AL. Myofascial pain – A major player in musculoskeletal pain. Best Pract Res Clin Rheumatol. 2024 Mar;38(1):101944.
8. International Association for the Study of Pain. Myofascial Pain: Fact Sheet 14. Downloaded December 27, 2024 at: https://www.iasp-pain.org/wp-content/uploads/2022/10/14.-Myofascial-Pain-Fact-Sheet-Revised-2017.pdf.
Risk of Bias Measures can be Biased
By: Chad E Cook, Damian Keter, Ken Learman
Navigating the Literature: Navigating the ever-growing healthcare literature can be challenging [1]. The sheer amount of new research, articles, and guidelines published regularly can be overwhelming. The number of biomedical publications has been steadily increasing over the years; as of 2022, there were approximately 3.3 million scientific and technical articles published worldwide [2]. The volume of information, combined with the time constraints of a busy clinician, can lead to information overload. This is particularly important because it can be difficult to determine which information is relevant and credible amidst the vast amount of available content.
In publishing, risk of bias measures are tools and methods used to assess the likelihood that the results of a study are influenced by systematic errors or biases. With the very high number of systematic reviews, which are designed to summarize overall results into a common understanding, risk of bias measures are crucial for evaluating the quality, reliability, and trustworthiness [3-5] of research findings. This, along with a focus on transparency in research, has led to the proliferation of risk of bias measures and their adoption into publication practice. However, these measures have limitations that may blunt their utility in reconciling the literature. The purpose of this blog is to: 1) outline the limitations of risk of bias measures and 2) discuss the best ways of interpreting the literature when risk of bias measures yield conflicting interpretations.
Limitations of Risk of Bias Measures: Risk of bias measures are useful tools that assist in guiding evidence synthesis, particularly in systematic reviews and meta-analyses. Risk of bias measures aid in selecting high-quality studies and weighting their contributions appropriately, leading to more reliable conclusions. Nonetheless, there are limitations to current risk of bias measures, which include: 1) subjectivity of raters, 2) elevating risk when reporting is actually the problem, 3) overemphasis on selected scoring areas and failure to recognize other notable contributors, and 4) interpretation issues (meaningful scaling) within and between instruments.
Subjectivity of raters: Assessments of risk of bias often involve subjective judgments, which can vary between reviewers. Best practice involves two independent reviews and a consensus of findings, but assessment requires appropriate training to ensure that reviewers truly understand each item of the risk of bias scale. A recent study [6] examined the inter-rater reliability of several risk of bias tools for non-randomized studies and found variability in the assessments that was attributed to differences in the complexity and clarity of the criteria used in the tools. Furthermore, analyses using multiple tools on the same article can yield differing interpretations of the trustworthiness of a causal inference [7]. For this reason, it is common practice for systematic review guidelines to mandate that two independent reviewers complete risk of bias assessments and come to consensus on discrepancies [8].
Elevating risk when reporting is actually the problem: Reporting checklists in publishing are essential tools used to improve the transparency, completeness, and quality of research reporting. Common examples include CONSORT for randomized controlled trials, PRISMA for systematic reviews and meta-analyses, and STROBE for observational studies. Unfortunately, not all studies are written using reporting checklists as a guide, which makes it impossible to discriminate whether the study design excluded a risk of bias component or whether it was simply omitted from the report. Risk of bias can only be evaluated based on what is reported, and if what is reported is poor or omitted (despite being performed in the study), the risk of bias may be artificially inflated [9]. A counterfactual problem also exists: investigators can use a checklist to report that design elements occurred when they did not, or when they were poorly applied. This brings investigator intent into play, which we can never accurately assess but which exists nonetheless.
Overemphasis on selected scoring areas: In an effort to reduce administration burden, most risk of bias scales overemphasize selected areas (e.g., randomization, allocation concealment) and underemphasize others (e.g., interventional fidelity, blinding of outcomes, incomplete outcome data). Certainly, the underemphasized areas are as important as, or potentially more important than, those that are historically emphasized [9].
Interpretation issues: There are three major considerations when interpreting the results of a risk of bias tool. First, most risk of bias scales provide a summary score, but it is questionable whether this score actually reflects a meaningfully elevated risk, especially if the values are not weighted. For example, on the PEDro scale, a commonly used measure in physical therapy studies, total scores of 0-3 are considered ‘poor’, 4-5 ‘fair’, 6-8 ‘good’, and 9-10 ‘excellent’; it is important to note that these classifications have not been validated [10]. Second, the actual impact of bias may vary depending on its direction: two biases may move the outcome in opposite directions, offsetting each other and producing minimal, if any, net effect on the inference. Third, best practice suggests that a sensitivity analysis or a subgroup analysis is appropriate when variations in risk of bias are identified in a synthesis-based review (e.g., systematic review). Sensitivity analyses help determine how the inclusion or exclusion of studies with high risk of bias affects the overall results, and subgroup analyses explore whether studies with low, moderate, or high risk of bias yield different results [9]; a sketch of this idea appears below.
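To make the sensitivity analysis concrete, here is a minimal sketch in Python. All effect sizes, standard errors, and risk-of-bias ratings are hypothetical values invented for illustration, and the pooling method shown (inverse-variance, fixed-effect) is one common choice, not the only one.

```python
# A minimal sketch of a risk-of-bias sensitivity analysis: pool study effects
# with inverse-variance (fixed-effect) weighting, then re-pool after dropping
# the high risk-of-bias study. All study values below are hypothetical.

def pooled_effect(studies):
    """Inverse-variance fixed-effect pooled estimate from (effect, SE) pairs."""
    weights = [1 / se**2 for _, se in studies]
    effects = [eff for eff, _ in studies]
    total_w = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_w
    pooled_se = (1 / total_w) ** 0.5
    return pooled, pooled_se

# (effect size, standard error, risk-of-bias rating) -- invented for illustration
trials = [
    (0.45, 0.10, "low"),
    (0.30, 0.15, "low"),
    (0.80, 0.20, "high"),  # large effect, but high risk of bias
]

all_studies = [(e, se) for e, se, _ in trials]
low_rob_only = [(e, se) for e, se, rob in trials if rob == "low"]

print("All studies:  %.2f (SE %.2f)" % pooled_effect(all_studies))
print("Low RoB only: %.2f (SE %.2f)" % pooled_effect(low_rob_only))
```

In this toy example, the pooled effect drops from about 0.46 to 0.40 once the high risk-of-bias trial is excluded, exactly the kind of shift a sensitivity analysis is designed to surface.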
Summary
Risk of bias measures provide additional data for determining study bias or quality, but these tools are not gospel and should not be taken as absolute, unquestionable truth. As with many tools used in interpreting publications, there are limitations to their use. As such, labeling a study “good” or “bad”, or “trustworthy” or “not trustworthy”, purely from a risk of bias score is not recommended.
References
1. https://www.pharmacytimes.com/view/tips-tricks-for-staying-up-to-date-with-medical-literature-guidelines-as-a-busy-pharmacist
2. https://ncses.nsf.gov/pubs/nsb202333/publication-output-by-region-country-or-economy-and-by-scientific-field
3. Riley SP, Flowers DW, Swanson BT, Shaffer SM, Cook CE, Brismée JM. ‘Trustworthy’ systematic reviews can only result in meaningful conclusions if the quality of randomized clinical trials and the certainty of evidence improves: an update on the ‘trustworthy’ living systematic review project. J Man Manip Ther. 2024 Aug;32(4):363-367.
4. Flowers DW, Swanson BT, Shaffer SM, Clewley DJ, Riley SP. Is there ‘trustworthy’ evidence for using manual therapy to treat patients with shoulder dysfunction?: A systematic review. PLoS One. 2024 Jan 18;19(1):e0297234.
5. Riley SP, Swanson BT, Shaffer SM, Flowers DW, Cook CE, Brismée JM. Why do ‘Trustworthy’ Living Systematic Reviews Matter? J Man Manip Ther. 2023 Aug;31(4):215-219.
6. Kalaycioglu I, Rioux B, Briard JN, Nehme A, Touma L, Dansereau B, Veilleux-Carpentier A, Keezer MR. Inter-rater reliability of risk of bias tools for non-randomized studies. Syst Rev. 2023 Dec 7;12(1):227.
7. Jüni P, Witschi A, Bloch R, Egger M. The Hazards of Scoring the Quality of Clinical Trials for Meta-analysis. JAMA. 1999;282(11):1054–1060. doi:10.1001/jama.282.11.1054.
8. Checklists for systematic reviews and research synthesis. https://jbi.global/sites/default/files/2020-08/Checklist_for_Systematic_Reviews_and_Research_Syntheses.pdf
9. Higgins JPT, Altman DG, Sterne JAC (editors). Chapter 8: https://training.cochrane.org/handbook/current/chapter-08
10. Assessing risk of bias in included studies. In: Higgins JPT, Churchill R, Chandler J, Cumpston MS (editors), Cochrane Handbook for Systematic Reviews of Interventions version 5.2.0 (updated June 2017), Cochrane, 2017. Available from www.training.cochrane.org/handbook.
Why Isn’t Everyone Using Stepped Care for Musculoskeletal Injuries?
By: Chad E. Cook PT, PhD, FAPTA
Resource efficiency models
Musculoskeletal (MSK) outcomes have shown some concerning trends over the last decade. Conditions like low back pain, neck pain, and joint pain have become more prevalent, contributing to the overall burden of MSK disorders [1]. According to a report analyzing medical claims data from 2010 to 2020, MSK healthcare costs have doubled, despite the number of individuals reporting MSK disorders remaining relatively constant. This increase in costs is driven by a rise in per-member costs and the growing number of health plan members [2], and it has prompted a number of novel management models that emphasize cost-effectiveness rather than the currently dominant fee-for-service strategy (which rewards higher utilization and does not penalize providers when outcomes are not optimized). These novel “resource efficiency models” focus on the optimal use of resources, such as time, personnel, equipment, and finances, to achieve patient outcomes comparable or superior to a traditional approach.
What is Stepped Care?
Stepped care for MSK conditions is a tailored and structured approach to treatment that starts with the least intensive, most cost-effective interventions (Figure 1). Care steps up to more intensive treatments as needed [3], only when selected clinical criteria are not met or when the patient is at risk of worsening without a dedicated treatment approach. The earliest stepped care options were developed for mental health disorders, diabetes, and other behavioral conditions, and thus far there is emerging evidence to support stepped care treatments for individuals with different forms of MSK disorders [4-9].
It works off the premise that there are logical first-line and second-line approaches to MSK conditions, as well as a series of assumptions [10]. These assumptions include: 1) equivalence of clinical outcomes across the different levels of care (the steps within the model are assumed to be equally effective in achieving clinical outcomes); 2) efficiency in resource use (using the least intensive, yet effective, intervention first is assumed to optimize resource use and reduce costs); 3) acceptability of minimal interventions (patients and providers are assumed to accept and adhere to less intensive interventions before moving to more intensive ones; watchful waiting has merit); 4) a self-correcting nature (if an intervention is not effective, the next step in the care pathway will be more intensive and may be a better “match” for the patient); and 5) reduction of overtreatment (overtreatment in MSK conditions is the provision of medical interventions that are unnecessary or excessive given the patient’s condition).
Why Isn’t Everyone using Stepped Care?
Thus far, stepped care appears to offer both clinical effectiveness and cost-effectiveness. If so, especially in light of the rather stagnant results we have seen globally in the management of MSK conditions, “why isn’t everyone using stepped care?”. The answer for the United States is threefold. First, care within the United States is fragmented, often leading to poor communication across different types of providers. Second, the parties involved as first-point providers are often those who provide the most invasive and potentially highest-cost care (a proverbial fox guarding the chicken coop). Third, there are no financial incentives for adopting stepped care in a fee-for-service system, the payment system that dominates the United States. In fact, it is likely that fee-for-service providers would lose business, and market share, to lower-cost providers.
Summary
Stepped care has significant potential for improving the management of MSK conditions in the future. By providing tailored interventions that match the patient’s needs, stepped care can enhance treatment outcomes, reduce healthcare costs, and improve patient satisfaction. This model allows for early intervention with less intensive treatments, reserving more resource-intensive options for those who do not respond to initial therapies. Additionally, stepped care promotes a more efficient use of healthcare resources and encourages a collaborative approach among healthcare providers. As research continues to support its effectiveness, and as payment models are adjusted, stepped care could become a cornerstone of MSK management, leading to better overall health outcomes for patients.
References
1. GBD 2021 Other Musculoskeletal Disorders Collaborators. Global, regional, and national burden of other musculoskeletal disorders, 1990-2020, and projections to 2050: a systematic analysis of the Global Burden of Disease Study 2021. Lancet Rheumatol. 2023 Oct 23;5(11):e670-e682.
2. Hinge Health. State of MSK Report 2021. Downloaded on December 15, 2024 from: https://healthactioncouncil.org/getmedia/a738c3c5-7c23-4739-bb8d-069dd5f7406b/Hinge-Health-State-of-MSK-Report-2021.pdf
3. Kongsted A, Kent P, Quicke JG, Skou ST, Hill JC. Risk-stratified and stepped models of care for back pain and osteoarthritis: are we heading towards a common model? Pain Rep. 2020 Sep 23;5(5):e843.
4. Garcia AN, Cook CE, Rhon DI. Adherence to Stepped Care for Management of Musculoskeletal Knee Pain Leads to Lower Health Care Utilization, Costs, and Recurrence. Am J Med. 2021 Mar;134(3):351-360.e1.
5. Rhon DI, Greenlee TA, Fritz JM. The Influence of a Guideline-Concordant Stepped Care Approach on Downstream Health Care Utilization in Patients with Spine and Shoulder Pain. Pain Med. 2019 Mar 1;20(3):476-485.
6. Kroenke K, Bair M, Damush T, Hoke S, Nicholas G, Kempf C, Huffman M, Wu J, Sutherland J. Stepped Care for Affective Disorders and Musculoskeletal Pain (SCAMP) study: design and practical implications of an intervention for comorbid pain and depression. Gen Hosp Psychiatry. 2007 Nov-Dec;29(6):506-17.
7. Kroenke K, Krebs E, Wu J, Bair MJ, Damush T, Chumbler N, York T, Weitlauf S, McCalley S, Evans E, Barnd J, Yu Z. Stepped Care to Optimize Pain care Effectiveness (SCOPE) trial study design and sample characteristics. Contemp Clin Trials. 2013 Mar;34(2):270-81.
8. Mylenbusch H, Schepers M, Kleinjan E, Pol M, Tempelman H, Klopper-Kes H. Efficacy of stepped care treatment for chronic discogenic low back pain patients with Modic I and II changes. Interv Pain Med. 2023 Nov 15;2(4):100292.
9. Boyd L, Baker E, Reilly J. Impact of a progressive stepped care approach in an improving access to psychological therapies service: An observational study. PLoS One. 2019 Apr 9;14(4):e0214715.
Figure 1. Example of a Stepped Care Model for Musculoskeletal Conditions.
Three Ways That Recruitment in Randomized Controlled Trials May Not Reflect Real Life
By: Chad Cook, Amy McDevitt, Derek Clewley, Bryan O’Halloran
As we wind up a year of recruitment on the SS-MECH trial [1], we are compelled to reflect on our recruitment strategies and study participants. Our study has included four recruitment sites and we’ve enrolled over 110 participants, which is nearly 85% of our targeted sample. We are using well-rehearsed and successful strategies at our work sites, providing access to a wide range of individuals with chronic neck disorders. As an example, the recruitment process at Duke University uses the electronic medical record to identify individuals who have recently been seen for neck related conditions, who are not seeking a physical therapist’s care at the given time. This process and the processes at all recruitment sites have been very effective, leading to high conversion rates (enrollment) and strong study retention. The study investigators provide care for both arms, which increases the fidelity of the interventions, as each of us has a vested interest in doing this right. Further, thanks to generous external funding (https://foundation4pt.org/), we have financial support for our six-month follow-ups, which has also been instrumental in a very high completion rate.
All of this sounds like wonderful news for any clinical trialist. And indeed, by mid 2025, we will complete the last six-month follow-ups for the SS-MECH trial and will be able to report on our findings. In fact, of the >20 randomized clinical trials (RCTs) that we’ve independently been involved in, this one has one of the strongest implementation plans and efforts toward improving the study quality. However, we would be remiss if we did not outline some of the concerns for ALL RCTs, concerns that are not specific to our study but should be considered when reading any published paper. The purpose of this blog is to outline the potential limitations of the samples in RCTs.
Concern Number One: All RCTs have specific inclusion/exclusion criteria, which may influence the type of participant seen in the trial. This can lead to selection bias, which occurs when the volunteers for a study differ from those who do not volunteer. RCTs often select a more homogeneous group of patients to reduce variability. The homogeneity of the sample reduces the generalizability of the results, that is, whether the results reflect the broader patient population seen in everyday clinical practice. All RCTs identify a sample representative of a pre-specified target population [2], which may be dissimilar to the general population with chronic neck pain presenting to clinicians. Individuals who agree to participate in a study are often healthier, live close to the study site, are younger, have higher health literacy, and have higher socioeconomic status [3]. All of these features are also moderators of an outcome and could influence the results of the study. An example of selection bias in our study is our requirement that research participants not attend physical therapy during the time of their treatment. This is likely to increase enrollment of non-care seekers, a very different population from care seekers [4]; care seekers tend to have more severe symptoms and may be more motivated to pursue a change in their status.
Concern Number Two: Non-pragmatic RCTs are conducted under idealized and controlled conditions, which may not accurately represent the complexities and variability of real-world clinical settings. This often increases patient compliance and reduces dropouts, influencing a study’s results. Participants in RCTs are often more compliant with treatment protocols and follow-up visits compared to the general patient population, leading to differences in outcomes. Study dropouts can introduce bias, reduce power, and lead to missing data. This can lead to an overestimation or underestimation of the treatment effect. With fewer participants completing the study, the statistical power to detect a difference between treatment groups is reduced. Lastly, missing data from dropouts can complicate the analysis and interpretation of results, requiring the use of statistical methods to handle the missing information.
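To illustrate the power point above, here is a rough sketch using the standard normal approximation for a two-arm comparison; the effect size (0.5) and the per-arm sample sizes are my own hypothetical numbers, not values from any cited trial.

```python
# A rough sketch (normal approximation, two-sided alpha = 0.05) of how
# dropouts erode statistical power in a two-arm trial.
from math import erf, sqrt

def approx_power(n_per_arm, effect_size, z_alpha=1.96):
    """Approximate power to detect a standardized effect in a two-arm trial."""
    z = effect_size * sqrt(n_per_arm / 2) - z_alpha
    return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF of z

d = 0.5  # assumed standardized effect size (hypothetical)
for n in (64, 51, 45):  # enrolled, then after roughly 20% and 30% dropout
    print(f"n per arm = {n}: power = {approx_power(n, d):.2f}")
```

Under these assumptions, a trial enrolled for roughly 80% power falls to about 71% and 66% power after 20% and 30% attrition, before even considering the bias that non-random dropout can introduce.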
Concern Number Three: Because of costs, nearly all RCTs have shorter follow-up periods than what might be observed in clinical practice, potentially missing long-term effects and outcomes. The typical follow-up time for physical therapy-led RCTs can vary, but it often ranges from 6 months to 1 year [5,6]. Short-term outcomes can lead to limited insight into long-term efficacy, failure to capture recurrence rates, and a poorer understanding of variability in patient response. Past studies on trajectories demonstrate that outcomes change markedly over a 1-year period [7]. Lastly, short-term outcomes fail to capture the potential behavioral changes that occur because of the treatment and, conversely, the potential lack of implementation of self-management strategies over the long term. Participants might alter their behavior or adherence to treatment protocols once the trial ends, affecting long-term outcomes.
Summary: This blog highlights three concerns about RCTs germane to all studies. We emphasize the importance of closely examining the inclusion/exclusion criteria to determine if the study population accurately reflects the patients that clinicians encounter in clinical practice. Additionally, consider the demographics, social status, and other relevant factors that describe the sample. How you integrate the findings into your workflow and care plan should be guided by a clear understanding of these limitations.
References
1. Cook CE, O’Halloran B, McDevitt A, Keefe FJ. Specific and shared mechanisms associated with treatment for chronic neck pain: study protocol for the SS-MECH trial. J Man Manip Ther. 2024;32(1):85-95.
2. Stuart EA, Bradshaw CP, Leaf PJ. Assessing the generalizability of randomized trial results to target populations. Prev Sci. 2015;16(3):475-85.
3. Holmberg MJ, Andersen LW. Adjustment for Baseline Characteristics in Randomized Clinical Trials. JAMA. 2022;328(21):2155-2156.
4. Clewley D, Rhon D, Flynn T, Koppenhaver S, Cook C. Health seeking behavior as a predictor of healthcare utilization in a population of patients with spinal pain. PLoS One. 2018;13(8):e0201348.
5. Herbert RD, Kasza J, Bø K. Analysis of randomised trials with long-term follow-up. BMC Med Res Methodol. 2018;18:48.
6. Llewellyn-Bennett R, Bowman L, Bulbulia R. Post-trial follow-up methodology in large randomized controlled trials: a systematic review protocol. Syst Rev. 2016;5:214.
7. Nim C, Downie AS, Kongsted A, Aspinall SL, Harsted S, Nyirö L, Vach W. Prospective Back Pain Trajectories or Retrospective Recall-Which Tells Us Most About the Patient? J Pain. 2024 Nov;25(11):104555.
Pros and Cons of Paying Peer Reviewers
By: Juliana Ancalmo, Chad E Cook PT, PhD, FAPTA, Ciara Roche
Background
Critical appraisal is a hallmark of peer-reviewed publishing. It provides an analytical evaluation of whether the results of a study can be believed and whether they can be transferred appropriately into other environments for use in policy, education, or clinical practice [1]. Historically, critical appraisal has been performed by peer reviewers who are either content or research experts (or both). Peer reviewers, who are not currently paid for this service, have viewed the work as an obligation to science, especially those who benefit from peer review as authors.
Recent pushback from qualified peer reviewers has ignited discussion around paying for reviewing services. Although this topic has been debated before, a new wave of conversation was ignited when researcher and Chief Scientific Officer James Heathers [2] argued for a $450 fee for a peer review in an editorial published on Medium. This, coupled with the challenges many researchers have faced post-COVID, has spurred people on both sides of the argument to speak out. In this blog we outline the pros and cons of this debate and discuss the complexity of the issue at hand.
Pros of Paying Peer Reviewers
We propose several benefits of paying peer reviewers for their critical appraisals. Since the COVID-19 pandemic, there has been a notable decline in acceptance rates combined with an increase in submission rates in academic journals, creating a large imbalance within the peer review process [3]. Compensation could increase reviewer buy-in and decrease this imbalance [2]. Interestingly, it may also increase the diversity of peer reviewers. Peer reviewers often reflect those who populate their fields of study, which are frequently dominated by men. Theoretically, paying for peer review may improve the representation of women and of reviewers from lower-income countries, especially if they are specifically targeted [4].
Beyond the lack of reviewer diversity, publishing companies are generating record profits, and compensating reviewers may reduce the associated negative optics. For example, arguably the biggest academic publishing company in the world is Elsevier, which generates $3.35 billion in revenue with a profit margin of around 40% [5]. It is arguable that the five major publishing companies (Elsevier, John Wiley & Sons, Taylor & Francis, Springer Nature, and SAGE), which control 50% of all revenue in the global academic publishing industry, could draw up a solution to financially compensate underpaid and overworked reviewers [5]. Quite frankly, asking someone to do a lot of work for free is a tough sell during times of record profits. Finally, we believe reviewers simply deserve to be paid. Good reviewers spend a lot of time peer reviewing papers, a process that improves the final manuscript and strengthens the science. Experts deserve to be compensated, and asking people to work for free is an archaic and offensive stance.
Cons of Paying Peer Reviewers
There are also several arguments against paying peer reviewers. One often cited is that compensation may lead to unethical reviews being submitted. It is not a stretch to imagine reviewers taking advantage of a monetary system for their own financial benefit; this could degrade the quality of reviews as reviewers churn through as many as they can for an easy cash grab. This leads into another concern: there is currently no threshold for what constitutes a “good review.” Nowadays it can take several months to receive feedback on a paper, only to get a couple of lines from a reviewer and a rejection from the editor. Does a two-line review deserve the same compensation as one from someone who spent hours reading the manuscript and giving critical feedback?
It is clear there would need to be notable training and standardization in submitting a review that would qualify for compensation; however, this process would further limit individuals who could submit a review and may cause further delays in this process. Additionally, processing of the payments would likely be a disaster at first. Considering it can sometimes take a year for these journals to publish a review, it is not unreasonable to believe that a payment system for ongoing peer reviewers would result in lost, incorrect or delayed payments.
Finally, there is uncertainty about whether journals, or the industry, could even afford to pay reviewers in the first place. Publishing consultant Tim Vines argued that, at an average of 2.2 reviews per article, each reviewed article would cost $990, assuming the $450 fee proposed by Heathers [6]. Additionally, for a journal with a 25% acceptance rate, the cost of reviewing for each accepted paper would be $3,960 [6]. This additional cost would almost double research journals’ expenditures, which may lead journals to increase article-processing charges and subscription fees to cover the difference.
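The arithmetic behind these figures is easy to verify; the following minimal sketch simply recomputes Vines's numbers from the inputs quoted above ($450 per review, 2.2 reviews per article, 25% acceptance):

```python
# A quick check of Tim Vines's arithmetic [6], using only the figures in
# the paragraph above.

fee_per_review = 450       # USD, Heathers's proposed fee per review
reviews_per_article = 2.2  # average reviews per submitted article
acceptance_rate = 0.25     # share of submissions accepted

cost_per_submission = fee_per_review * reviews_per_article  # $990
cost_per_accepted = cost_per_submission / acceptance_rate   # $3,960

print(f"Review cost per submission: ${cost_per_submission:,.0f}")
print(f"Review cost per accepted paper: ${cost_per_accepted:,.0f}")
```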
Our Thoughts
In theory, we support payment for peer review. However, the traditional practice of peer review may be resistant to change due to system inertia, which is the resistance of an organization to change despite its necessity [7]. We support several steps before a model flip can occur. These include reducing unnecessary burdens on reviewers, such as: 1) requests to review papers that have fatal flaws and no chance of acceptance; 2) requests that are outside the reviewer’s scope; 3) multiple requests at one time; and 4) unrealistic review turnarounds. Simply put, there are too many submissions, reflecting a focus on quantity over quality. Predatory journals and journals that support weak science are interested mainly in publishing (and article processing fees) and little in science. We acknowledge the complexity of institutional reform in the presence of system inertia. Once these elements are sorted, we can return to the discussion of paying for peer review.
References
1. Katrak P, Bialocerkowski AE, Massy-Westropp N, Kumar VS, Grimmer KA. A systematic review of the content of critical appraisal tools. BMC Med Res Methodol. 2004;4:22.
2. Heathers J. The 450 Movement. Medium. Available at: https://jamesheathers.medium.com/the-450-movement-1f86132a29bd
3. Künzli N, Berger A, Czabanowska K, Lucas R, Madarasova Geckova A, Mantwill S, von dem Knesebeck O. I Do Not Have Time - Is This the End of Peer Review in Public Health Sciences? Public Health Rev. 2022;43. https://doi.org/10.3389/phrs.2022.1605407
4. Cheah PY, Piasecki J. Should peer reviewers be paid to review academic papers? Lancet. 2022;399(10335):1601.
5. Curcic D. Academic Publishers Statistics. WordsRated. Available at: https://wordsrated.com/academic-publishers-statistics/
6. Brainard J. The $450 question: Should journals pay peer reviewers? Science. Available at: https://www.science.org/content/article/450-question-should-journals-pay-peer-reviewers
7. Coiera E. Why system inertia makes health reform so difficult. BMJ. 2011;342:d3693.
Yes, Peer Review is Broken, but It’s Probably Worse than You Think
By: Chad E. Cook PT, PhD, FAPTA
We have problems: There are countless publications, editorials, and blogs indicating that we have a notable problem with the peer review system used in scientific publications [1-4]. Concerns have included its inconsistency, its slow pace, and the biases of reviewers (especially reviewer two) who have an axe to grind. These limitations, and the knowledge that publishing companies are making record profit margins [5] off the free labor of reviewers while authors are required to pay to publish, are especially difficult to stomach. This problem has been ongoing for some time but, in my opinion, it has worsened recently. Having been immersed in publishing for over 25 years as an author, and over 20 years as an editor-in-chief or associate editor for four journals, I’d like to outline the concerns that qualify my statement in the title that it’s “probably worse than you think”.
Journals are overwhelmed and, subsequently, unresponsive: The last three publications I submitted to peer-reviewed journals took 11 months, 10 months, and 6 months to receive the first set of reviewers’ comments. For those who are not familiar with peer-reviewed publishing, this is a very long time to wait for your first set of reviews. We pulled the paper that took 11 months from the journal over 6 months into the process (because we were tired of the journal’s lack of responsiveness) and informed the editor-in-chief that we had removed it from review, but they kept it in their system anyway and eventually provided the reviews (11 months after submission). By then, the paper had already been accepted in a different journal. For the paper that took 6 months, the editor-in-chief informed us that they had reached out to 60 reviewers to obtain two reviewers’ comments; they eventually used the names of reviewers we recommended. Two of the three examples were review articles, and the editors had the audacity to recommend an updated search!
Quality has been sacrificed for quantity: It is estimated that there are 30,000 medical journals published around the world [6]. In 2016, about 1.92 million papers were indexed by the Scopus and Web of Science publication databases; in 2022, that number jumped to 2.82 million [7]. This equates to approximately two papers uploaded to PubMed every minute [8]. Consequently, it is no secret that quantity has replaced quality. This is especially prevalent in open access journals, in which revenue depends on article processing charges (APCs) and volume. An average APC of $1,626 USD has been reported [9]. While this may not seem unreasonable, some journals charge over $11,000 USD (Nature Neuroscience [10]), and others (PLOS One [11]) have published over 30,000 papers in a single year. One is hard-pressed to argue that enough useful science is being created to demand 2.82 million unique papers.
Reviewers are overwhelmed and abused: I feel it is my responsibility to review for journals, since I am a user of the peer review system, and I do so without compensation. It generally takes me an hour to do a meaningful and respectful review; sometimes longer if I need to check the trial registration, review attached appendices, or read some of the more important references. Although I serve as an associate editor for a journal, I try to limit my reviews to two manuscripts a week. Apparently, this isn’t enough. From March 1st through March 31st of 2024, I was asked to review 67 papers for scientific journals. That is an average of more than two requests per day, including non-business days. Interestingly, one journal in particular, in which I had just published a paper (after 10 months of waiting for the first review), requested my review services 13 times. I averaged more than four requests a week from this journal until I finally stopped responding. It is important to recognize that reviewers are overwhelmed and should be compensated for their work. Those who agree to review understand the sarcastic phrase “no good deed goes unpunished”.
Editors are Often Underpaid, Overworked, and Pressured to Publish: A 2020 survey found that more than one third of editors surveyed from core clinical journals did not receive compensation for their editorial roles [12]. As an editor-in-chief from 2006 through 2012, I contributed over 20 hours a week to the journal, and did receive a small stipend for my efforts. I calculated an average hourly salary of a little over three dollars. Further, previous work has exposed the pressure editors have to publish work [13], especially those who run open access journals, in which payment is required to publish within the journal. This leads to the acceptance of inferior work and a flooding of review requests for papers that should have likely been triaged by the editor.
Fake journals are numerous and getting difficult to discriminate: Predatory journals are open-access publishers that actively solicit and publish articles for a fee, with little or no real peer review [14]. I have written about these before and even wrote a fake paper (with Josh Cleland and Paul Mintken) about a dead person being brought back to life with spinal manipulation, to show how these journals will accept anything [15]. Some estimates suggest there are 15,000 predatory journals in existence [16]. A popular publishing company, MDPI, was recently placed on Predatory-Reports.com’s predatory publishing list because of concerning behaviors in the peer-review process [17]. It is worth noting that many borderline predatory behaviors have made the identification of predatory journals more difficult, as competition to secure submissions has ramped up along with the number of newly created journals. Publishing low-quality or questionable work has also undermined the promotion and tenure process in academic settings, as appointment, promotion and tenure (APT) committee members are often asked to review portfolios of individuals outside their professional field.
Retraction rates are on the rise: A retraction occurs when a previously published paper in an academic journal is flagged as seriously flawed, to the extent that its results and/or conclusions are no longer valid. Retractions occur because of plagiarism, data manipulation, and conflicts of interest [18]. Overall, they are not very common: for every 10,000 papers, about 2.5 are retracted. Journals self-govern (with external assistance) and often identify flawed work and retract the papers; as such, most retractions occur in higher-level journals. To date, data simply do not exist to estimate how many flawed papers are present in journals with no real peer review (predatory journals) and in those that are not predatory but have questionable behaviors.
This sounds awful, so what should we do: I realize this blog is negative, but it is important to understand the context around peer review, especially if you have not had the opportunity to publish, review, or edit within the peer review system. There are strategies that may help navigate these challenges. First, I recommend reading work from reputable journals that are affiliated with reputable societies (e.g., JOSPT, Physical Therapy, Journal of Physiotherapy, etc.). Second, it is healthy and reasonable to question results that are notably different from known information, results obtained by a group with a vested interest in the outcome of the study, and results that are substantially better than the comparison group, because that is just not very common or likely. Third, it is appropriate to support the current momentum toward paying reviewers for their efforts, as long as their work is of high quality. Fourth, it is good when editors triage papers that are unlikely to be published (or that should not be published), as this reduces the burden on peer review. Lastly, it is important to recognize that someone has to pay for open access journals; typically, it is the author.
References
1. Smith R. Peer review: a flawed process at the heart of science and journals. J R Soc Med. 2006 Apr;99(4):178-82.
2. Flaherty C. The Peer-Review Crisis. Inside Higher Ed. Available at: https://www.insidehighered.com/news/2022/06/13/peer-review-crisis-creates-problems-journals-and-scholars
3. Malcom D. It’s Time We Fix the Peer Review System. Am J Pharm Educ. 2018 Jun;82(5):7144.
4. Subbaraman N. What’s wrong with peer review. Wall Street Journal. Available at: https://www.wsj.com/science/whats-wrong-with-peer-review-e5d2d428
5. Ansede M. Scientists paid large publishers over $1 billion in four years to have their studies published with open access. El Pais. Available at: https://english.elpais.com/science-tech/2023-11-21/scientists-paid-large-publishers-over-1-billion-in-four-years-to-have-their-studies-published-with-open-access.html
6. Gower T. What Are Medical Journals? WebMD. Available at: https://www.webmd.com/a-to-z-guides/medical-journals
7. (No author.) Scientists are publishing too many papers, and that’s bad for science. ScienceAdviser. Available at: https://www.science.org/content/article/scienceadviser-scientists-are-publishing-too-many-papers-and-s-bad-science
8. Landhuis E. Scientific literature: Information overload. Nature. 2016;535:457-458.
9. Morrison H. Open access article processing charges 2011-2021. Sustaining the Knowledge Commons / Soutenir les savoirs communs. 2021-06-24. Retrieved 2022-02-18.
10. Du JS. Opinion: Is Open Access Worth the Cost? The Scientist. Available at: https://www.the-scientist.com/opinion-is-open-access-worth-the-cost-70049
11. Graham K. Thanking Our Peer Reviewers. EveryONE (Blogs.plos.org). January 6, 2014. Retrieved 2015-05-17.
12. Lee JCL, Watt J, Kelsall D, Straus S. Journal editors: How do their editing incomes compare? F1000Res. 2020;9:1027.
13. De Vrieze J. Open-access journal editors resign after alleged pressure to publish mediocre papers. Science. Available at: https://www.science.org/content/article/open-access-editors-resign-after-alleged-pressure-publish-mediocre-papers
14. Cook CE, Cleland JA, Mintken PE. Manual Therapy Cures Death: I Think I Read That Somewhere. J Orthop Sports Phys Ther. 2018 Nov;48(11):830-832.
15. Cook CE, Cleland J, Mintken P. Temporal Effect of Repeated Spinal Manipulation on Mortality Ratio: A Case Report. ARCH Women Health Care. 2018;1(1):1-4.
16. Freeman E, Kurambayev B. Rising number of ‘predatory’ academic journals undermines research and public trust in scholarship. The Conversation. Available at: https://theconversation.com/rising-number-of-predatory-academic-journals-undermines-research-and-public-trust-in-scholarship-213107
17. (Anonymous author.) Is MDPI a predatory publisher? Publishing with Integrity. Available at: https://predatory-publishing.com/is-mdpi-a-predatory-publisher/
18. Conroy G. The biggest reason for biomedical research retractions. Nature Index. Available at: https://www.nature.com/nature-index/news/the-biggest-reason-for-biomedical-retractions
On Mastery
By: Seth Peterson, PT, DPT, OCS, FAAOMPT
“I don’t know how they can sleep at night.” I was getting chewed out in a hallway in my first year of residency training. My mentor was speaking in general terms, but it was painfully clear that “they” meant me. I had just seen an 11-year-old girl with an ankle sprain. I had given her a painful balance exercise in standing (because the evidence showed it was more effective) and we had talked about pain neurophysiology, which was cutting-edge at the time. Her problem with what she’d just witnessed was that, despite me applying “evidence-based care,” she hadn’t really seen me apply that care to the individual. She hadn’t seen me think.
Looking back, my lack of thinking about the interventions was made worse by the fact that I was doing so much thinking about the simple things. While my mentor was thinking about the words used to greet someone and deciding what mattered to that person on that day, I was focused on how to sequence an ankle examination. I was focused on the basics—and the basics were something they did unfailingly well. Using the conscious competence learning model, you could say I was at a stage of “conscious incompetence” while they were well into the “unconscious competence” stage. Another way to say it is they had “mastered” the basics, while I was just beginning to grasp them.
An Exercise in Interpreting Clinical Results
By: Chad E Cook PT, PhD, FAPTA
Randomized Controlled Trials
In clinical research, treatment efficacy (the extent to which a specific intervention, such as a drug or therapy, produces a beneficial result under ideal conditions) and effectiveness (the degree to which an intervention achieves its intended outcomes in real-world settings) are studied using randomized controlled trials. Randomized controlled trials compare the average treatment effects (ATEs) of outcomes between two or more interventions [1]. By definition, an ATE represents the average difference in outcomes between treatment groups (those who receive the treatment or treatments) and/or a control group (those who do not receive the treatment) across the entire population. Less commonly, researchers will include a secondary “responder analysis” that looks at the proportion of individuals who meet a clinically meaningful threshold.
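To make the distinction concrete, here is a minimal sketch contrasting the two analyses on made-up improvement scores; the data and the threshold (a hypothetical clinically meaningful difference of 2 points) are illustrations, not values from any trial.

```python
# A minimal sketch of the two analyses described above, on hypothetical
# outcome data: the average treatment effect (difference in group means)
# and a responder analysis (proportion meeting a clinically meaningful
# threshold). All values below are invented for illustration.

treatment = [5, 3, 6, 2, 4, 7, 1, 5]  # improvement scores, treated group
control = [2, 1, 3, 0, 2, 4, 1, 2]    # improvement scores, control group
threshold = 2                          # hypothetical meaningful change

ate = sum(treatment) / len(treatment) - sum(control) / len(control)
responders_tx = sum(x >= threshold for x in treatment) / len(treatment)
responders_ctrl = sum(x >= threshold for x in control) / len(control)

print(f"Average treatment effect: {ate:.2f} points")
print(f"Responders (treatment): {responders_tx:.0%}")
print(f"Responders (control): {responders_ctrl:.0%}")
```

On these toy data the ATE is 2.25 points, while the responder analysis shows 88% of treated participants versus 63% of controls meeting the threshold; the two summaries can tell quite different stories about the same outcomes.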