
Advantages and Disadvantages of Research Metrics used to Evaluate a Researcher’s Impact or Influence

By: Chad E Cook PT, PhD, FAPTA 

Background: Each year, in Duke University’s Division of Physical Therapy, I teach a class on research methodology. One of the topics we discuss is how to measure research impact among physical therapy (and other professions’) researchers. The discussion is complementary to those that occur in the Appointment, Promotion and Tenure (AP&T) committee in the Department of Orthopaedics, of which I am a member. By definition, research impact metrics are quantitative tools used to assess the influence and productivity of researchers and to give some understanding of who the leaders in a field are. Without fail, each year the class debates which method is best. This blog will discuss four of the most common methods and evaluate their advantages and disadvantages. The order presented does not imply superiority, and these methods are not intended for evaluating the impact of a single journal publication.

 

H-index: The h-index was proposed in 2005 by Jorge E. Hirsch, a physicist at UC San Diego, as a tool for determining researchers’ relative quality and is sometimes called the Hirsch index or Hirsch number [1]. The h-index measures both the productivity and citation impact of a researcher’s publications. It is the largest number h for which the researcher has h publications that have each been cited at least h times; for example, a researcher with an h-index of 15 has 15 papers that have each been cited at least 15 times (Figure 1).
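
To make the definition concrete, here is a minimal sketch of the calculation in Python. The citation counts are invented purely for illustration; real values would come from Google Scholar, Web of Science, or Scopus.

```python
def h_index(citations):
    """Return the largest h such that h papers each have at least h citations."""
    counts = sorted(citations, reverse=True)  # most-cited papers first
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank  # the paper at this rank still has at least 'rank' citations
        else:
            break
    return h

# Hypothetical citation counts for one researcher's ten papers
papers = [45, 30, 22, 15, 15, 9, 7, 3, 1, 0]
print(h_index(papers))  # 7 -> seven papers cited at least seven times each
```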

A researcher’s h-index can be found on platforms like Google Scholar, Web of Science, and Scopus, but the values often differ because each platform counts “citations” differently. In most cases, Google Scholar includes a wider range of publications, such as preprints and less traditional sources like textbooks (which is why researchers often have much higher h-indexes on Google Scholar), whereas Web of Science and Scopus focus on more curated, peer-reviewed journals (no textbooks), creating discrepancies in the calculated h-index. Although h-index values vary markedly across professions, Hirsch suggested that an h-index of 3 to 5 is a reasonable standard for appointment to assistant professor, 8 to 12 for associate professor, and 15 to 20 for full professor [1]. 

 

Advantages: 

  • Balances productivity and impact. 
  • Simple and widely recognized. 
  • Useful for comparing researchers in the same field. 

Disadvantages: 

  • Does not give additional credit to very highly cited papers beyond the core publications. 
  • Biased toward senior researchers with long careers. 
  • Ignores the contribution of co-authors or positioning of authorship. 
  • Field-dependent (e.g., higher citation rates in some fields). 

 

M-Index (m-Quotient): Hirsch recognized the limitations of the h-index and its bias toward senior researchers who have had many years to accumulate citations. He subsequently made an adjustment by dividing the h-index by the time (in years) since the researcher’s first publication. Thus, if a researcher has an h-index of 20 and their first publication occurred 25 years ago, their m-index is 0.8 (20/25). Hirsch suggested that an m-index of 1.0 is very good, 2.0 is outstanding, and 3.0 is exceptional [1].
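
Expressed as a tiny helper function, using the same numbers as the example above (illustrative only):

```python
def m_quotient(h_index, years_since_first_publication):
    """m-quotient: the h-index normalized by career length in years."""
    return h_index / years_since_first_publication

# Example from the text: h-index of 20, first paper published 25 years ago
print(m_quotient(20, 25))  # 0.8
```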

 

Advantages: 

  • Normalizes the h-index for career length [2]. 
  • Useful for comparing early-career researchers. 

Disadvantages: 

  • Still inherits many of the limitations of the h-index (e.g., it is less meaningful for researchers with very short careers). 
  • It is less intuitive to interpret than the h-index. 

 

Field-Weighted Citation Impact (FWCI): Field-Weighted Citation Impact (FWCI) is a metric used to measure the citation impact of a researcher’s work compared with the expected citation rate in their specific field [3]. It is calculated by taking the total number of citations received by a researcher’s publications and dividing it by the average number of citations that similar publications (same field, publication type, and publication year) are expected to receive. The FWCI is therefore a ratio of the actual citation count to the expected citation count. For example, if a researcher’s publications have received 50 citations but the expected citation count for similar publications is 25, the FWCI would be 2.0, meaning the researcher has been cited twice as often as expected. A FWCI of 1 indicates that the researcher has been cited exactly as expected, a FWCI greater than 1 indicates higher-than-expected citation impact, and a FWCI less than 1 indicates lower-than-expected citation impact. 
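
The ratio itself is simple; the hard part in practice is obtaining the expected citation count, which the indexing platform estimates from field, publication type, and year. A minimal sketch of the final step, using the numbers from the example above:

```python
def fwci(actual_citations, expected_citations):
    """Field-Weighted Citation Impact: actual citations divided by the
    citations expected for similar publications (same field, type, year)."""
    return actual_citations / expected_citations

print(fwci(50, 25))  # 2.0 -> cited twice as often as expected
print(fwci(20, 25))  # 0.8 -> cited less often than expected
```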

 

Advantages: 

  • Accounts for field-specific citation practices. 
  • Normalizes impact across disciplines. 

Disadvantages: 

  • Requires access to field-specific data. 
  • Available calculators often report values for individual articles rather than for researchers. 
  • Less intuitive for non-specialists. 

 

NIH Relative Citation Ratio (RCR): The National Institutes of Health (NIH) Relative Citation Ratio (RCR) is a metric developed by the NIH Office of Portfolio Analysis to measure the scientific influence of a research paper [4]. The RCR is calculated by first normalizing each paper’s citation rate to its field and publication year. The process involves estimating the citation rate of the researcher’s field using the paper’s co-citation network (a form of field weighting similar in spirit to the FWCI described above). Second, the expected citation rate is calculated by evaluating the rate for NIH-funded papers in the same field and publication years. The RCR then compares the researcher’s papers’ citation rates to these expected citation rates. A researcher with an RCR of 1.0 has received citations at the same rate as the median NIH-funded researcher in their field. Values above 1.0 indicate that the researcher is cited at a rate above the median NIH-funded researcher: an RCR of 1.5 means the researcher is cited 1.5 times more frequently than the median NIH-funded researcher, an RCR of 2.3 means 2.3 times more frequently, and so on. Values below 1.0 suggest they are cited less frequently than the median NIH-funded researcher. 
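
The official computation relies on each paper’s co-citation network, so the sketch below only illustrates the final ratio and how to read it; the rates are invented for illustration.

```python
def relative_citation_ratio(field_normalized_citation_rate, expected_nih_rate):
    """RCR: a field-normalized citation rate divided by the expected rate
    for NIH-funded papers in the same field and publication years."""
    return field_normalized_citation_rate / expected_nih_rate

def interpret(rcr):
    if rcr > 1.0:
        return f"cited {rcr:.1f} times the median NIH-funded rate"
    if rcr < 1.0:
        return "cited below the median NIH-funded rate"
    return "cited at the median NIH-funded rate"

rcr = relative_citation_ratio(3.0, 2.0)
print(rcr, "->", interpret(rcr))  # 1.5 -> cited 1.5 times the median NIH-funded rate
```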

 

Advantages: 

  • Normalizes for field and time. 
  • Useful for comparing researchers within disciplines. 

Disadvantages: 

  • The website is complicated and takes a little time to learn how to navigate. 
  • It is less known than other methods such as the h-index. 

 

Summary: Each metric has its strengths and weaknesses, and no single metric can fully capture a researcher’s impact. All of the methods emphasize citations, although two (FWCI and RCR) compare citation counts to those of others in similar fields. In our AP&T meetings, we consider a combination of metrics, tailored to the specific context (e.g., field, career stage, or type of impact), to be the best approach, and we also look at the number of first-author or senior-author papers; researcher impact metrics that capture authorship position exist as well, but they are used less often. 

 

Disclaimer: Both DeepSeek and Microsoft Copilot were used to assist in writing this blog.

 

References 

 

  1. Shah FA, Jawaid SA. The h-index: An Indicator of Research and Publication Output. Pak J Med Sci. 2023 Mar-Apr;39(2):315-316. 
  2. Kurian C, Kurian E, Orhurhu V, Korn E, Salisu-Orhurhu M, Mueller A, Houle T, Shen S. Evaluating factors impacting National Institutes of Health funding in pain medicine. Reg Anesth Pain Med. 2025 Jan 7:rapm-2024-106132. 
  3. Aggarwal M, Hutchison B, Katz A, Wong ST, Marshall EG, Slade S. Assessing the impact of Canadian primary care research and researchers: Citation analysis. Can Fam Physician. 2024 May;70(5):329-341. 
  4. Vought V, Vought R, Herzog A, Mothy D, Shukla J, Crane AB, Khouri AS. Evaluating Research Activity and NIH-Funding Among Academic Ophthalmologists Using Relative Citation Ratio. Semin Ophthalmol. 2025 Jan;40(1):39-43. 

“It’s not you, it’s us…”: Heterogeneity of treatment effects as a challenge to effectiveness trials.

By: Damian L Keter, PT, DPT, PhD

Background:  

Comparative effectiveness studies are the cornerstone of medicine and health sciences research. Their goal is to find ‘the best’ treatment for each condition of interest. In comparative effectiveness studies, statistical models provide ‘average’ treatment effects, which are often used to establish the standardized mean difference between interventions; however, it is clear across interventional studies that individual patients cannot be expected to consistently experience the ‘average’ effect. While interventional designs focus on central tendency (the mean or median of the population), the dispersion of the data around that point may be even more important to consider.  

Heterogeneity of treatment effects (HTE) refers to the variation in how different individuals respond to the same treatment or intervention. HTE is often represented by standard deviations, which are influenced by outliers: individuals who respond ‘differently’ to the intervention than the ‘average’ result suggests. There are a number of ways in which HTE can be analyzed or managed in secondary analyses, including subgroup analysis of covariates and specific statistical methods to identify heterogeneity [1]. HTE is critically important in the interpretation of results of interventional studies; however, it is often poorly reported [1]. Two factors must be considered when seeking to understand HTE in interventional effectiveness trials: 1) what factors contribute to HTE, and 2) the limitations and challenges in attempting to control HTE.  
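
A minimal simulation, with entirely made-up numbers, illustrates the core idea: an ‘average’ treatment effect can look respectable while masking two very different subgroup responses.

```python
import random

random.seed(1)

def simulated_change(true_effect, sd=5.0):
    """One participant's change score: the true effect plus individual variability."""
    return random.gauss(true_effect, sd)

# Hypothetical trial: 'responders' improve by ~10 points, 'non-responders' by ~2
responders = [simulated_change(10) for _ in range(50)]
non_responders = [simulated_change(2) for _ in range(50)]
everyone = responders + non_responders

mean = sum(everyone) / len(everyone)
sd = (sum((x - mean) ** 2 for x in everyone) / (len(everyone) - 1)) ** 0.5

print(f"'average' treatment effect: {mean:.1f}")              # ~6 points
print(f"standard deviation (a crude HTE signal): {sd:.1f}")   # wider than either subgroup's spread
print(f"responder subgroup mean: {sum(responders) / len(responders):.1f}")
print(f"non-responder subgroup mean: {sum(non_responders) / len(non_responders):.1f}")
```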

 

Factors contributing to heterogeneity of treatment effects: 

A significant proportion of treatment effect may be attributed to non-specific effects, meaning those effects considered outside of the specific treatment mechanisms of the intervention itself [2]. Furthermore, individuals presented with the same stimuli display different responses, which has been proposed to relate to their personal physiological adaptability to pain [3-4]. This has been projected to be one of the reasons orthopaedic manual therapy elicits variable treatment responses (often displayed as ‘responders’ and ‘non-responders’ to treatment), even though it is often reported as efficacious for the ‘average’ individual [5]. Treatment effect should therefore be viewed as the sum of a multitude of contributors that interact with one another and are known to influence outcomes (Fig 1).  

Fig 1: Factors contributing to treatment effectiveness in interventional trials.  

 

Limitations and challenges in attempting to control heterogeneity of treatment effects: 

At first glance, the easy answer is to control HTE through standardization of: 1) patients, by narrowing inclusion criteria; 2) the environment, by using a lab-based design with high levels of control; and 3) the intervention, through a prescriptive design. This approach is challenged by our currently limited understanding of the multitude of factors (patient, provider, environmental, systemic) that influence the outcome and therefore must be considered [6]. Increased standardization also reduces external validity to the clinical setting, where patients, environments, and interventions are highly dynamic and pragmatic. While highly controlled designs may be theorized to reduce HTE, it would be unrealistic to think they would narrow results to a homogeneous treatment effect. Even so, if the study design is controlled to capture more of the ‘average’ responders and omit the outliers, does it make the results any more meaningful than what is reported with HTE? In essence, omitting the outliers from research does not reduce the chance that patients similar to the outliers will walk into the clinic (Fig 2).

Fig 2: The clinical setting cannot narrow inclusion criteria as is possible in clinical trial design, therefore limiting the external validity of highly controlled clinical trials.  

 

Summary:  

An expectation of HTE should be the norm within interventional studies. Comparative effectiveness studies should look to identify which intervention works better for the ‘average’ individual, but should use secondary analyses to establish whether certain subgroups of individuals (clinical phenotypes) respond differently to the intervention based on a cluster of patient, environmental, or interventional factors [7]. Effect modeling procedures have been proposed to capture the interaction between the intervention and baseline covariates, producing absolute treatment effects (those anticipated within a certain population or subgroup) with the goal of improving translational clinical value [8]. During study design, investigators should weigh the benefits and risks of increasing or decreasing control, as well as prospective versus post hoc analysis techniques, and should develop studies appropriately based on their purpose.  

Patient management, including the use of manual therapies [9], has shifted toward person-centered models as research has established the individuality of experiences in healthcare. Research should therefore work within these constructs as well, appreciating that the days of searching for the ‘holy grail’ of interventions are behind us; the focus should instead be on identifying which intervention is best suited for which patient, at what time.  

 

References:  

  1. Gabler NB, Duan N, Liao D, Elmore JG, Ganiats TG, Kravitz RL. Dealing with heterogeneity of treatment effects: is the literature up to the challenge? Trials. 2009;10(1):43. doi:10.1186/1745-6215-10-43 
  2. Ezzatvar Y, Dueñas L, Balasch-Bernat M, Lluch-Girbés E, Rossettini G. Which Portion of Physiotherapy Treatments’ Effect Is Not Attributable to the Specific Effects in People with Musculoskeletal Pain? A Meta-Analysis of Randomized Placebo-Controlled Trials. Journal of Orthopaedic & Sports Physical Therapy. 2024;54(6):391-399. doi:10.2519/jospt.2024.12126 
  3. Wan DWL, Arendt-Nielsen L, Wang K, Xue CC, Wang Y, Zheng Z. Pain Adaptability in Individuals With Chronic Musculoskeletal Pain Is Not Associated With Conditioned Pain Modulation. The Journal of Pain. 2018;19(8):897-909. doi:10.1016/j.jpain.2018.03.002 
  4. Zheng Z, Wang K, Yao D, Xue CCL, Arendt-Nielsen L. Adaptability to pain is associated with potency of local pain inhibition, but not conditioned pain modulation: A healthy human study. Pain. 2014;155(5):968-976. doi:10.1016/j.pain.2014.01.024 
  5. Keter D, Cook C, Learman K, Griswold D. Time to evolve: the applicability of pain phenotyping in manual therapy. J Man Manip Ther. 2022;30(2):61-67. doi:10.1080/10669817.2022.2052560 
  6. Keter D, Loghmani MT, Rossettini G, Esteves JE, Cook CE. Context is Complex: Challenges and opportunities dealing with contextual factors in manual therapy mechanisms research. International Journal of Osteopathic Medicine. Published online January 2, 2025. doi:10.1016/j.ijosm.2025.100750 
  7. Edwards RR, Dworkin RH, Turk DC, et al. Patient phenotyping in clinical trials of chronic pain treatments: IMMPACT recommendations. Pain. 2016;157(9):1851-1871. doi:10.1097/j.pain.0000000000000602
  8. Kent DM, Paulus JK, Van Klaveren D, et al. The Predictive Approaches to Treatment effect Heterogeneity (PATH) Statement. Ann Intern Med. 2020;172(1):35. doi:10.7326/M18-3667 
  9.  Keter D, Hutting N, Vogsland R, Cook CE. Integrating Person-Centered Concepts and Modern Manual Therapy. JOSPT Open. 2023;2(1):60-70. doi:10.2519/josptopen.2023.0812 

Is Myofascial Pain Syndrome a Legitimate Primary Diagnosis?

By: Chad E Cook, Damian Keter, Ken Learman

Background

Myofascial Pain Syndrome (MPS) is hypothesized to be a primary and/or a secondary chronic pain disorder that can refer symptoms to other parts of the body. MPS is relatively common, affecting millions of people worldwide, particularly those who have experienced muscle overuse, trauma, or stress [1]. MPS can significantly impact daily activities and quality of life, as the persistent pain and discomfort can be both physically and emotionally draining [2]. Despite its notable impact on health and wellness, MPS is a controversial diagnosis, mainly because of the lack of consensus on its diagnostic criteria and underlying mechanisms. The objective of this blog is to identify whether MPS meets current criteria as a unique diagnosis, using the four criteria from the World Health Organization (WHO).

Diagnostic Criteria

Historically, the WHO, through its International Classification of Diseases (ICD), has provided a global standard for diagnostic health data, facilitating international comparisons and collaborations in healthcare. For each unique diagnosis, the WHO requires four criteria [3]: 1) specificity, 2) consistency, 3) significance, and 4) diagnostic stability. These criteria have allowed the WHO to differentiate competing conditions such as influenza and COVID-19, and to recognize new diseases and syndromes such as e-cigarette or vaping product use-associated lung injury (EVALI) or Post-Traumatic Stress Disorder (PTSD) due to Complex Trauma in Childhood.

Specificity suggests that the condition must have a clear and specific set of symptoms and characteristics that distinguish it from other conditions. Consistency requires that the symptoms and characteristics be reliably observed across different patients and settings. Significance involves the condition’s impact on the individual’s health, functioning, or quality of life. Diagnostic Stability suggests that the diagnosis should remain stable over time, meaning that the condition does not frequently change or evolve into another condition.

Based on the WHO criteria, is MPS a stand-alone, primary diagnosis? The answer is both “yes” and “no”.

According to the WHO, MPS refers to a musculoskeletal disorder characterized by pain originating from tight muscles and the surrounding fascia, often presenting as sensitive “trigger points” that can cause localized pain and referred pain to other areas of the body; this pain can be chronic and is often associated with repetitive motions, poor posture, or stress [4]. Under the ICD-11, MPS is classified under chronic primary pain and chronic secondary musculoskeletal pain. The criteria for chronic primary pain include persistent or recurrent pain for at least three months, with significant emotional distress or functional disability. For chronic secondary musculoskeletal pain, the pain is associated with a musculoskeletal condition and persists beyond the usual recovery period. Despite these descriptions from the WHO, neither classification (chronic primary or chronic secondary) meets all four of the original WHO criteria.

Truthfully, it is well understood that MPS does not have a clear, consistent set of signs and symptoms that distinguishes it from other diagnoses (it lacks specificity). This undermines consistency as well. These are the reasons it is difficult to differentiate MPS from other diagnostic conditions such as fibromyalgia, tension-type headaches, and chronic fatigue syndrome. MPS is also traditionally categorized as a nociceptive pain condition, but there is growing evidence suggesting it can also involve neuropathic or nociplastic pain components. Further, due to the lack of specific laboratory indicators and imaging evidence, there are no unified diagnostic criteria for MPS, adding to the confusion with other diseases [5].

Summary

Given the prevalence of MPS, it is likely that most physical therapists and chiropractors see a high percentage of these individuals in their outpatient practices. As currently defined by ICD coding, MPS is highly likely to contribute to patients’ pain experience as a secondary condition [6], which can be very debilitating to the individual [7]. This is likely why other global healthcare groups, such as the International Association for the Study of Pain, support the contribution of MPS, which they characterize as local and referred pain perceived as deep, dull, pressure, and aching, along with the presence of myofascial trigger points in any part of the body [8]. As the complex nature of pain and associated pain conditions is further unraveled, MPS may find a better home as a primary pain condition; currently, however, MPS is difficult to differentiate from other conditions and is likely a secondary contributor to most musculoskeletal conditions seen by rehabilitation providers.

References

1. Li X, Lin Y, He P, Wang Q. Efficacy and safety of low-intensity ultrasound therapy for myofascial pain syndrome: a systematic review and meta-analysis. BMC Musculoskelet Disord. 2024 Dec 23;25(1):1059.

2. Jaeger B. Myofascial trigger point pain. Alpha Omegan. 2013;106(1–2):14–22.

3. Hebert O, Schlueter K, Hornsby M, Van Gorder S, Snodgrass S, Cook C. The diagnostic credibility of second impact syndrome: A systematic literature review. J Sci Med Sport. 2016 Oct;19(10):789-94.

4. Qureshi N, Hamoud AA, Gazzaffi IMA. Myofascial Pain Syndrome: A Concise Update on Clinical, Diagnostic and Integrative and Alternative Therapeutic Perspectives. International Neuropsychiatric Disease Journal. 2019 Mar;13(1):1-14.

5. Cao QW, Peng BG, Wang L, Huang YQ, Jia DL, Jiang H, Lv Y, Liu XG, Liu RG, Li Y, Song T, Shen W, Yu LZ, Zheng YJ, Liu YQ, Huang D. Expert consensus on the diagnosis and treatment of myofascial pain syndrome. World J Clin Cases. 2021 Mar 26;9(9):2077-2089. doi: 10.12998/wjcc.v9.i9.2077.

6. Plaut S. Scoping review and interpretation of myofascial pain/fibromyalgia syndrome: An attempt to assemble a medical puzzle. PLoS One. 2022 Feb 16;17(2):e0263087.

7. Lam C, Francio VT, Gustafson K, Carroll M, York A, Chadwick AL. Myofascial pain – A major player in musculoskeletal pain. Best Pract Res Clin Rheumatol. 2024 Mar;38(1):101944.

8. International Association for the Study of Pain. Myofascial Pain: Fact Sheet 14. Downloaded December 27, 2024 at: https://www.iasp-pain.org/wp-content/uploads/2022/10/14.-Myofascial-Pain-Fact-Sheet-Revised-2017.pdf.

Risk of Bias Measures can be Biased

By: Chad E Cook, Damian Keter, Ken Learman

Navigating the Literature: Navigating the ever-growing healthcare literature can be challenging [1]. The sheer amount of new research, articles, and guidelines published regularly can be overwhelming. The number of biomedical publications has been steadily increasing over the years; as of 2022, approximately 3.3 million scientific and technical articles were published worldwide [2]. The volume of information and the time constraints of a busy clinician can lead to information overload. This is particularly important because it can be difficult to determine which information is relevant and credible amidst the vast amount of available content.

In publishing, risk of bias measures are tools and methods used to assess the likelihood that the results of a study are influenced by systematic errors or biases. With the very high number of systematic reviews, which are designed to summarize overall results into a common understanding, the use of risk of bias measures is crucial for evaluating the quality, reliability, and trustworthiness [3-5] of research findings. This, along with a focus on transparency in research, has led to the proliferation of risk of bias measures and their adoption into publication practice. However, there are limitations to risk of bias measures that may diminish their utility in reconciling the literature. The purpose of this blog is to: 1) outline the limitations of risk of bias measures and 2) discuss the best ways of interpreting the literature when risk of bias measures yield conflicting interpretations.

Limitations of Risk of Bias Measures: Risk of bias measures are useful tools that assist in guiding evidence synthesis, particularly in systematic reviews and meta-analyses. Risk of bias measures aid in selecting high-quality studies and weighting their contributions appropriately, leading to more reliable conclusions. Nonetheless, there are limitations to current risk of bias measures, which include: 1) subjectivity of raters, 2) elevating risk when reporting is actually the problem, 3) overemphasis on selected scoring areas and failure to recognize other notable contributors, and 4) interpretation issues (meaningful scaling) within and between instruments.

Subjectivity of raters: Assessments of risk of bias often involve subjective judgments, which can vary between reviewers. Best practice involves two independent reviews and a consensus of findings, but assessment requires appropriate training to assure that reviewers truly understand each item of the risk of bias scale. A recent study [6] examined the inter-rater reliability of several risk of bias tools for non-randomized studies and found variability in the assessments that was attributed to differences in the complexity and clarity of the criteria used in the tools. Furthermore, analyses using multiple tools on the same article can yield differing interpretations of the trustworthiness of a causal inference [7]. For this reason, it is common practice for systematic review guidelines to mandate that two independent reviewers complete risk of bias assessments and come to consensus on discrepancies [8].

Elevating risk when reporting is actually the problem: Reporting checklists in publishing are essential tools used to improve the transparency, completeness, and quality of research reporting. Common examples include CONSORT for randomized controlled trials, PRISMA for systematic reviews and meta-analyses, and STROBE for observational studies. Unfortunately, not all studies are written using reporting checklists as a guide, which makes it difficult to discriminate whether the study design lacked a given risk of bias component or whether it was simply omitted from the report. Risk of bias can only be evaluated based on what is reported, and if what is reported is poor or omitted (despite being performed in the study), the risk of bias may be artificially inflated [9]. A counterfactual argument also exists: investigators can use a checklist and report that design elements meeting the checklist occurred when they did not, or when they were inelegantly applied. This brings investigator intent into the picture, which we can never accurately assess but which exists nonetheless.

Overemphasis on selected scoring areas: In an effort to reduce administration burden, most risk of bias scales overemphasize some areas (e.g., randomization, allocation concealment) and underemphasize others (e.g., interventional fidelity, blinding of outcomes, incomplete outcome data). Certainly, the underemphasized areas are as important as, or potentially more important than, those that are historically supported [9].

Interpretation issues: There are several major considerations when interpreting the results of a risk of bias tool. First, most risk of bias scales provide a summary score, but it is questionable whether this score actually reflects a meaningfully elevated risk, especially if the items are not weighted. For example, on the PEDro scale, a commonly used measure in physical therapy studies, total scores of 0-3 are considered ‘poor’, 4-5 ‘fair’, 6-8 ‘good’, and 9-10 ‘excellent’; it is important to note that these classifications have not been validated [10]. Second, the actual impact of bias may vary depending on the direction of that impact: two biases may move the outcome in opposite directions, offsetting each other and producing minimal, if any, net effect on the inference. Third, best practice suggests that a sensitivity analysis or a subgroup analysis is appropriate when variations in risk of bias are identified in a synthesis-based review (e.g., a systematic review). Conducting sensitivity analyses helps determine how the inclusion or exclusion of studies with high risk of bias affects the overall results, and performing subgroup analyses helps explore whether studies with low, moderate, or high risk of bias yield different results [9].
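
To illustrate the sensitivity-analysis idea in the last point, here is a minimal sketch with invented effect sizes and risk-of-bias ratings, pooling studies with a simple inverse-variance (fixed-effect) weighted mean and then re-pooling after dropping the high risk-of-bias trials.

```python
# Hypothetical trials: (name, effect size, standard error, risk-of-bias rating)
studies = [
    ("Trial A", 0.60, 0.15, "high"),
    ("Trial B", 0.25, 0.10, "low"),
    ("Trial C", 0.30, 0.12, "low"),
    ("Trial D", 0.55, 0.20, "high"),
]

def pooled_effect(rows):
    """Inverse-variance (fixed-effect) weighted mean of the effect sizes."""
    weights = [1 / se ** 2 for _, _, se, _ in rows]
    effects = [es for _, es, _, _ in rows]
    return sum(w * es for w, es in zip(weights, effects)) / sum(weights)

overall = pooled_effect(studies)
low_rob_only = pooled_effect([s for s in studies if s[3] == "low"])

print(f"pooled effect, all studies:      {overall:.2f}")       # ~0.36
print(f"pooled effect, low risk of bias: {low_rob_only:.2f}")  # ~0.27
# A large gap between the two suggests the conclusion is sensitive to study bias.
```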

Summary

Risk of bias measures provide additional data for determining study bias or quality, but these tools are not gospel and should not be taken as absolute, unquestionable truth. As with many tools used in interpreting publications, there are limitations to their use. As such, labeling a study as “good” or “bad,” or “trustworthy” or “not trustworthy,” purely from a risk of bias score is not recommended.

References

1. https://www.pharmacytimes.com/view/tips-tricks-for-staying-up-to-date-with-medical-literature-guidelines-as-a-busy-pharmacist
2. https://ncses.nsf.gov/pubs/nsb202333/publication-output-by-region-country-or-economy-and-by-scientific-field

3. Riley SP, Flowers DW, Swanson BT, Shaffer SM, Cook CE, Brismée JM. ‘Trustworthy’ systematic reviews can only result in meaningful conclusions if the quality of randomized clinical trials and the certainty of evidence improves: an update on the ‘trustworthy’ living systematic review project. J Man Manip Ther. 2024 Aug;32(4):363-367.

4. Flowers DW, Swanson BT, Shaffer SM, Clewley DJ, Riley SP. Is there ‘trustworthy’ evidence for using manual therapy to treat patients with shoulder dysfunction?: A systematic review. PLoS One. 2024 Jan 18;19(1):e0297234.

5. Riley SP, Swanson BT, Shaffer SM, Flowers DW, Cook CE, Brismée JM. Why do ‘Trustworthy’ Living Systematic Reviews Matter? J Man Manip Ther. 2023 Aug;31(4):215-219.

6. Kalaycioglu I, Rioux B, Briard JN, Nehme A, Touma L, Dansereau B, Veilleux-Carpentier A, Keezer MR. Inter-rater reliability of risk of bias tools for non-randomized studies. Syst Rev. 2023 Dec 7;12(1):227.

7. Jüni P, Witschi A, Bloch R, Egger M. The Hazards of Scoring the Quality of Clinical Trials for Meta-analysis. JAMA. 1999;282(11):1054–1060. doi:10.1001/jama.282.11.1054.

8. Checklists for systematic reviews and research synthesis. https://jbi.global/sites/default/files/2020-08/Checklist_for_Systematic_Reviews_and_Research_Syntheses.pdf

9. Higgins JPT, Altman DG, Sterne JAC (editors). Chapter 8: https://training.cochrane.org/handbook/current/chapter-08

10. Assessing risk of bias in included studies. In: Higgins JPT, Churchill R, Chandler J, Cumpston MS (editors), Cochrane Handbook for Systematic Reviews of Interventions version 5.2.0 (updated June 2017), Cochrane, 2017. Available from www.training.cochrane.org/handbook.

Why Isn’t Everyone Using Stepped Care for Musculoskeletal Injuries? 

By: Chad E. Cook PT, PhD, FAPTA

Resource efficiency models 

Musculoskeletal (MSK) outcomes have shown some concerning trends over the last decade. Conditions like low back pain, neck pain, and joint pain have become more prevalent, contributing to the overall burden of MSK disorders [1]. According to a report analyzing medical claims data from 2010 to 2020, MSK healthcare costs have doubled, despite the number of individuals reporting MSK disorders remaining relatively constant. This increase in costs is driven by a rise in per-member costs and the growing number of health plan members [2], and it has prompted a number of novel management models that emphasize cost-effectiveness rather than the currently dominant fee-for-service strategy (which rewards higher utilization and does not penalize the provider when outcomes are not optimized). These novel “resource efficiency models” focus on the optimal use of resources, such as time, personnel, equipment, and finances, to achieve comparable or superior patient outcomes relative to a traditional approach. 

What is Stepped Care?

Stepped care for MSK conditions is a tailored and structured approach to treatment that starts with the least intensive, most cost-effective interventions first (Figure 1). Care steps up to more intensive treatments as and if needed [3], only when selected clinical criteria are not met or if the patient is at risk of worsening without a dedicated treatment approach. The earliest stepped care options were developed for mental health disorders, diabetes, and other behavioral conditions, and thus far there is emerging evidence to support stepped care treatments for individuals with different forms of MSK disorders [4-9].  

It works off the premise that there are logical first-line and second-line approaches to MSK conditions, as well as a series of assumptions [10]. These assumptions include: 1) Equivalence of clinical outcomes across the different levels of care: the steps within the model are assumed to be equally effective in achieving clinical outcomes; 2) Efficiency in resource use: the model assumes that using the least intensive, yet effective, intervention first will optimize resource use and reduce costs; 3) Acceptability of minimal interventions: patients and providers are assumed to accept and adhere to less intensive interventions before moving to more intensive ones (watchful waiting has merit); 4) Self-correcting nature of the model: the model assumes that if an intervention is not effective, the next step in the care pathway will be more intensive and appropriate, and may potentially be a better “match” for the patient; and 5) Stepped care reduces overtreatment: overtreatment in MSK conditions is the provision of medical interventions that are unnecessary or excessive given the patient’s condition, and the model assumes that starting with less intensive care reduces it. 

Why Isn’t Everyone using Stepped Care? 

Thus far, stepped care appears to offer both clinical efficiency and cost-effectiveness. If so, especially in light of the rather stagnant results we’ve seen globally in the management of MSK conditions, why isn’t everyone using stepped care? The answer for the United States is threefold. First, care within the United States is fragmented, often leading to poor communication across different types of providers. Second, the parties involved as first-point providers are often those who provide the most invasive and potentially highest-cost care (a proverbial fox guarding the chicken coop scenario). Last, there are no financial incentives for adopting stepped care in a fee-for-service system, the payment system that dominates the United States. In fact, it is likely that fee-for-service providers would lose business to lower-cost providers and would also lose market share.  

Summary

Stepped care has significant potential for improving the management of MSK conditions in the future. By providing tailored interventions that match the patient’s needs, stepped care can enhance treatment outcomes, reduce healthcare costs, and improve patient satisfaction. This model allows for early intervention with less intensive treatments, reserving more resource-intensive options for those who do not respond to initial therapies. Additionally, stepped care promotes a more efficient use of healthcare resources and encourages a collaborative approach among healthcare providers. As research continues to support its effectiveness, and as payment models are adjusted, stepped care could become a cornerstone of MSK management, leading to better overall health outcomes for patients.  

References

  1. GBD 2021 Other Musculoskeletal Disorders Collaborators. Global, regional, and national burden of other musculoskeletal disorders, 1990-2020, and projections to 2050: a systematic analysis of the Global Burden of Disease Study 2021. Lancet Rheumatol. 2023 Oct 23;5(11):e670-e682.  
  2. Hinge Health. State of MSK Report 2021. Downloaded on December 15, 2024 from: https://healthactioncouncil.org/getmedia/a738c3c5-7c23-4739-bb8d-069dd5f7406b/Hinge-Health-State-of-MSK-Report-2021.pdf 
  3. Kongsted A, Kent P, Quicke JG, Skou ST, Hill JC. Risk-stratified and stepped models of care for back pain and osteoarthritis: are we heading towards a common model? Pain Rep. 2020 Sep 23;5(5):e843 
  4. Garcia AN, Cook CE, Rhon DI. Adherence to Stepped Care for Management of Musculoskeletal Knee Pain Leads to Lower Health Care Utilization, Costs, and Recurrence. Am J Med. 2021 Mar;134(3):351-360.e1. 
  5. Rhon DI, Greenlee TA, Fritz JM. The Influence of a Guideline-Concordant Stepped Care Approach on Downstream Health Care Utilization in Patients with Spine and Shoulder Pain. Pain Med. 2019 Mar 1;20(3):476-485.  
  6. Kroenke K, Bair M, Damush T, Hoke S, Nicholas G, Kempf C, Huffman M, Wu J, Sutherland J. Stepped Care for Affective Disorders and Musculoskeletal Pain (SCAMP) study: design and practical implications of an intervention for comorbid pain and depression. Gen Hosp Psychiatry. 2007 Nov-Dec;29(6):506-17.  
  7. Kroenke K, Krebs E, Wu J, Bair MJ, Damush T, Chumbler N, York T, Weitlauf S, McCalley S, Evans E, Barnd J, Yu Z. Stepped Care to Optimize Pain care Effectiveness (SCOPE) trial study design and sample characteristics. Contemp Clin Trials. 2013 Mar;34(2):270-81.   
  8. Mylenbusch H, Schepers M, Kleinjan E, Pol M, Tempelman H, Klopper-Kes H. Efficacy of stepped care treatment for chronic discogenic low back pain patients with Modic I and II changes. Interv Pain Med. 2023 Nov 15;2(4):100292.  
  9. Boyd L, Baker E, Reilly J. Impact of a progressive stepped care approach in an improving access to psychological therapies service: An observational study. PLoS One. 2019 Apr 9;14(4):e0214715. 

Figure 1. Example of a Stepped Care Model for Musculoskeletal Conditions.  

Three Ways That Recruitment in Randomized Controlled Trials May Not Reflect Real Life

By: Chad Cook, Amy McDevitt, Derek Clewley, Bryan O’Halloran

As we wind up a year of recruitment on the SS-MECH trial [1], we are compelled to reflect on our recruitment strategies and study participants. Our study has included four recruitment sites and we’ve enrolled over 110 participants, which is nearly 85% of our targeted sample. We are using well-rehearsed and successful strategies at our work sites, providing access to a wide range of individuals with chronic neck disorders. As an example, the recruitment process at Duke University uses the electronic medical record to identify individuals who have recently been seen for neck related conditions, who are not seeking a physical therapist’s care at the given time. This process and the processes at all recruitment sites have been very effective, leading to high conversion rates (enrollment) and strong study retention. The study investigators provide care for both arms, which increases the fidelity of the interventions, as each of us has a vested interest in doing this right. Further, thanks to generous external funding (https://foundation4pt.org/), we have financial support for our six-month follow-ups, which has also been instrumental in a very high completion rate.  

All of this sounds like wonderful news for any clinical trialist. And indeed, by mid 2025, we will complete the last six-month follow-ups for the SS-MECH trial and will be able to report on our findings. In fact, of the >20 randomized clinical trials (RCTs) that we’ve independently been involved in, this one has one of the strongest implementation plans and efforts toward improving the study quality. However, we would be remiss if we did not outline some of the concerns for ALL RCTs, concerns that are not specific to our study but should be considered when reading any published paper. The purpose of this blog is to outline the potential limitations of the samples in RCTs.  

Concern Number One: All RCTs have specific inclusion/exclusion criteria, which may influence the type of participant seen in the trial. This can lead to selection bias, which occurs when the volunteers for the study differ from those who do not volunteer. All RCTs may select a more homogeneous group of patients to reduce variability. The homogeneity of the sample reduces the generalizability of the results, that is, whether the results reflect the broader patient population seen in everyday clinical practice. All RCTs identify a sample representative of a pre-specified target population [2], which may be dissimilar to the general population with chronic neck pain presenting to clinicians. Individuals who agree to participate in a study are often healthier, live close to the study site, are younger, have higher health literacy, and have higher socioeconomic status [3]. All of these features are also moderators of an outcome and could influence the results of the study. An example of selection bias in our study is our requirement that research participants not attend physical therapy during the time of their treatment. This is likely to increase non-care-seeker enrollment, which is a very different population than a care-seeking one [4]; care-seekers tend to have more severe symptoms and may be more motivated to pursue a change in their status.  

Concern Number Two: Non-pragmatic RCTs are conducted under idealized and controlled conditions, which may not accurately represent the complexities and variability of real-world clinical settings. This often increases patient compliance and reduces dropouts, influencing a study’s results. Participants in RCTs are often more compliant with treatment protocols and follow-up visits compared to the general patient population, leading to differences in outcomes. Study dropouts can introduce bias, reduce power, and lead to missing data. This can lead to an overestimation or underestimation of the treatment effect. With fewer participants completing the study, the statistical power to detect a difference between treatment groups is reduced. Lastly, missing data from dropouts can complicate the analysis and interpretation of results, requiring the use of statistical methods to handle the missing information.  

Concern Number Three: Because of costs, nearly all RCTs have shorter follow-up periods than what might be observed in clinical practice, potentially missing long-term effects and outcomes. The typical follow-up time for physical therapy-led randomized controlled trials (RCTs) varies, but it often ranges from 6 months to 1 year [5,6]. Short-term outcomes can lead to limited insight into long-term efficacy, failure to capture recurrence rates, and a poorer understanding of variability in patient response. Past studies on trajectories demonstrate that outcomes change markedly over a 1-year period [7]. Lastly, short-term outcomes fail to capture the potential behavioral changes that occur because of the treatment and, conversely, the potential lack of implementation of self-management strategies over the long term. Participants might alter their behavior or adherence to treatment protocols once the trial ends, affecting long-term outcomes.  

Summary: This blog highlights three concerns about RCTs germane to all studies. We emphasize the importance of closely examining the inclusion/exclusion criteria to determine if the study population accurately reflects the patients that clinicians encounter in clinical practice. Additionally, consider the demographics, social status, and other relevant factors that describe the sample. How you integrate the findings into your workflow and care plan should be guided by a clear understanding of these limitations.  

References 

  1. Cook CE, O’Halloran B, McDevitt A, Keefe FJ. Specific and shared mechanisms associated with treatment for chronic neck pain: study protocol for the SS-MECH trial. J Man Manip Ther. 2024;32(1):85-95. 
  2. Stuart EA, Bradshaw CP, Leaf PJ. Assessing the generalizability of randomized trial results to target populations. Prev Sci. 2015;16(3):475-85. 
  3. Holmberg MJ, Andersen LW. Adjustment for Baseline Characteristics in Randomized Clinical Trials. JAMA. 2022;328(21):2155-2156. 
  4. Clewley D, Rhon D, Flynn T, Koppenhaver S, Cook C. Health seeking behavior as a predictor of healthcare utilization in a population of patients with spinal pain. PLoS One. 2018;13(8):e0201348. 
  5. Herbert RD, Kasza J, Bø K. Analysis of randomised trials with long-term follow-up. BMC Med Res Methodol 2018;18:48.  
  6. Llewellyn-Bennett R, Bowman L, Bulbulia R. Post-trial follow-up methodology in large randomized controlled trials: a systematic review protocol. Syst Rev 2016;5:214.  
  7. Nim C, Downie AS, Kongsted A, Aspinall SL, Harsted S, Nyirö L, Vach W. Prospective Back Pain Trajectories or Retrospective Recall-Which Tells Us Most About the Patient? J Pain. 2024 Nov;25(11):104555. 

Pros and Cons of Paying Peer Reviewers

By: Juliana Ancalmo, Chad E Cook PT, PhD, FAPTA, Ciara Roche

Background

Critical appraisal is a hallmark of peer-reviewed publishing. Critical appraisal provides analytical evaluation of whether the results of a study can be believed and whether they can be transferred appropriately into other environments for use in policy, education, or clinical practice [1]. Historically, critical appraisal has been performed by peer reviewers who are either content or research experts (or both). Peer reviewers, especially those who benefit from peer review as authors, have viewed this act as an obligation to science, and they are not currently paid for this service.

Recent concerns raised by qualified peer reviewers have ignited discussion around paying for reviewing services. Although this topic has been highly debated previously, a new wave of conversation was reignited when researcher and Chief Scientific Officer James Heathers [2] argued for a $450 fee for a peer review in an editorial published on Medium. This, coupled with the challenges many researchers faced post-COVID, has spurred people on both sides of the argument to speak out. In this blog we will outline the pros and cons of this debate and discuss the complexity of the issue at hand.

Pros of Paying Peer Reviewers

We propose several benefits of paying peer reviewers for their critical appraisals. Since the COVID-19 pandemic, there has been a notable decline in reviewers’ acceptance of review invitations combined with an increase in submission rates at academic journals, creating a large imbalance within the peer review process [3]. Compensation could increase reviewer buy-in and decrease this imbalance [2]. Interestingly, it may also increase the diversity of peer reviewers. Peer reviewers often reflect those who populate their field of study, which is often dominated by men. Theoretically, paying for peer review may improve the representation of women and of reviewers from lower-income countries, especially if they are targeted [4].

Beyond the lack of reviewer diversity, publishing companies are generating record profits, and compensating reviewers may reduce the associated negative optics. For example, arguably the biggest academic publishing company in the world is Elsevier, which generates $3.35 billion in revenue with a profit margin of around 40% [5]. It is arguable that among the five major publishing companies, Elsevier, John Wiley & Sons, Taylor & Francis, Springer Nature, and SAGE, who control 50% of all revenue of the academic publishing industry globally, a solution could be drawn up to financially compensate underpaid and overworked reviewers [5]. Quite frankly, asking someone to do a lot of work for free is a tough sell during times of record profits. Finally, we believe reviewers simply deserve to be paid. Good reviewers spend a lot of time peer reviewing papers. This process improves the final manuscript and strengthens the science. Experts deserve to be compensated, and asking people to work for free is an archaic and offensive stance.

Cons of Paying Peer Reviewers

There are also several arguments that can be made against paying peer reviewers. One often cited is that compensating reviewers may lead to unethical reviews being submitted. It is not a stretch to consider how reviewers might take advantage of such a monetary system for their own financial benefit; this could impact the quality of the reviews submitted, as reviewers churn out as many reviews as they can for an easy cash grab. This leads into another concern regarding the payment of peer reviewers: there is currently no threshold on what constitutes a “good review.” Nowadays it can take several months to wait for feedback on a paper, only to receive a couple of lines from a reviewer and a rejection from the editor. Does this two-line review deserve the same compensation as one from someone who spent hours reading and giving critical feedback?

It is clear there would need to be notable training and standardization in submitting a review that would qualify for compensation; however, this process would further limit individuals who could submit a review and may cause further delays in this process. Additionally, processing of the payments would likely be a disaster at first. Considering it can sometimes take a year for these journals to publish a review, it is not unreasonable to believe that a payment system for ongoing peer reviewers would result in lost, incorrect or delayed payments.

Finally, there is uncertainty about whether journals or the industry could even afford to pay reviewers in the first place. Publishing consultant Tim Vines argued that if there is an average of 2.2 reviews per article, each reviewed article would cost $990, assuming the $450 fee proposed by Heathers is adopted [6]. Additionally, for a journal with a 25% acceptance rate, the review cost for each accepted paper would be $3,960 [6]. This additional cost would almost double research journals’ expenditures, which may lead journals to increase article-processing charges and subscription fees to cover these additional expenses.
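
The arithmetic behind Vines’ figures, laid out explicitly:

```python
# Reproducing the cost estimate attributed to Vines in the paragraph above.
fee_per_review = 450          # Heathers' proposed fee (USD)
reviews_per_article = 2.2     # average number of reviews per submitted article
acceptance_rate = 0.25        # journal acceptance rate

cost_per_reviewed_article = fee_per_review * reviews_per_article
cost_per_accepted_paper = cost_per_reviewed_article / acceptance_rate

print(f"{cost_per_reviewed_article:.2f}")  # 990.00 USD in review fees per submitted article
print(f"{cost_per_accepted_paper:.2f}")    # 3960.00 USD in review fees per accepted paper
```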

Our Thoughts

In theory, we support payment for peer review. However, the traditional practice of peer review may be resistant to change due to system inertia, which is the resistance of an organization to change despite its necessity [7]. We support the need for additional steps before a model flip can occur. These include reducing unnecessary burdens on reviewers, such as: 1) requests to review papers that have fatal flaws and no chance of acceptance; 2) requests that are outside the reviewer’s scope; 3) multiple requests at one time; and 4) unrealistic review turnaround times. Simply put, there are too many submissions, reflecting a focus on quantity over quality. Predatory journals and journals that support weak science are interested only in publishing (and article processing fees) and care little about the science. We acknowledge the complexity of institutional reform in the presence of system inertia. Once these elements are sorted, we can return to the discussion of paying for peer review.

References

  1. Katrak P, Bialocerkowski AE, Massy-Westropp N, Kumar VS, Grimmer KA. A systematic review of the content of critical appraisal tools. BMC Medical Research Methodology. 2004;4:22.
  2. Heathers, J. The 450 Movement. Medium. Available at: https://jamesheathers.medium.com/the-450-movement-1f86132a29bd
  3. Künzli N, Berger A, Czabanowska K, Lucas R, Madarasova Geckova A, Mantwill S, von dem Knesebeck O. I Do Not Have Time -Is This the End of Peer Review in Public Health Sciences? Public health reviews. 2022;43. https://doi.org/10.3389/phrs.2022.1605407
  4. Cheah PY, Piasecki J. Should peer reviewers be paid to review academic papers? Lancet, 2022;399(10335):1601.
  5. Curcic D. Academic Publishers Statistics. WordsRated. Available at: https://wordsrated.com/academic-publishers-statistics/
  6. Brainard J. The $450 question: Should journals pay peer reviewers?. Science https://www.science.org/content/article/450-question-should-journals-pay-peer-reviewers
  7. Coiera E. Why system inertia makes health reform so difficult. BMJ (Clinical research ed.), 2011;342:d3693.

Yes, Peer Review is Broken, but It’s Probably Worse than You Think

By: Chad E. Cook PT, PhD, FAPTA

We have problems: There are countless publications, editorials, and blogs indicating we have a notable problem with the peer review system used in scientific publications [1-4]. Concerns have included its inconsistency, its slow process, and the biases associated with reviewers (especially reviewer two) who have an axe to grind. These limitations, and the knowledge that publishing companies are making record profit margins [5] off the free labor of reviewers while authors are required to pay to publish, are especially difficult to stomach. This problem has been ongoing for some time, but in my opinion it seems to have worsened recently. Having been immersed in publishing for over 25 years as an author, and over 20 years as an editor-in-chief or associate editor for four journals, I’d like to outline the concerns that qualify my statement in the title that it’s “probably worse than you think”.

Journals are overwhelmed and, subsequently, unresponsive: The last three publications I’ve submitted to peer-reviewed journals took 11 months, 10 months, and 6 months to receive the first set of reviewers’ comments. For those who are not familiar with peer-reviewed publishing, this is a very long time to wait for a first set of reviews. We pulled the paper that took 11 months from the review process over 6 months ago (because we were tired of the journal’s lack of responsiveness) and informed the editor-in-chief, but they kept it within their system anyway and eventually provided the reviews (11 months later). It had already been accepted in a different journal by then. We were informed by the editor-in-chief handling the paper that took 6 months that they had reached out to 60 reviewers in order to receive two reviewers’ comments. They eventually used the names of reviewers that we recommended. Two of the three examples were review articles, and the editors had the audacity to recommend an updated search!

Quality has been sacrificed for quantity: It is estimated that there are 30,000 medical journals published around the world [6]. In 2016, about 1.92 million papers were indexed by the Scopus and Web of Science publication databases; in 2022, that number jumped to 2.82 million [7]. This equates to approximately two papers uploaded to PubMed every minute [8]. Subsequently, it is no secret that quantity has replaced quality. This is especially prevalent in open access journals, in which revenue depends on article processing charges (APCs) and volume. On average, an APC of $1,626 USD has been reported [9]. While this may not seem unreasonable, some journals charge over $11,000 USD (Nature Neuroscience [10]), and others (PLOS One [11]) have published over 30,000 papers in a given year. I think one would be hard-pressed to argue that enough useful science is being created to demand 2.82 million unique papers.

Reviewers are overwhelmed and abused: I feel it is my responsibility to review for journals, since I’m a user of the peer review system, and I do so without compensation. It generally takes me an hour to do a meaningful and respectful review; sometimes it takes longer if I need to check the trial registration, review attached appendices, or read some of the more important references. Although I serve as an associate editor for a journal, I try to limit my reviews to two manuscripts a week. Apparently, this isn’t enough. From March 1st through March 31st of 2024, I was asked to review 67 papers for scientific journals. That’s an average of nearly 2.2 requests per day, including non-business days. Interestingly, one journal in particular, in which I had just published a paper (after 10 months of waiting for the first review), requested my review services 13 times. I averaged more than four requests a week from this journal until I finally stopped responding. It is important to recognize that reviewers are overwhelmed and should be compensated for their work. Those who agree to review understand the sarcastic phrase “no good deed goes unpunished”.

Editors are often underpaid, overworked, and pressured to publish: A 2020 survey found that more than one third of editors surveyed from core clinical journals did not receive compensation for their editorial roles [12]. As an editor-in-chief from 2006 through 2012, I contributed over 20 hours a week to the journal and did receive a small stipend for my efforts; I calculated an average hourly salary of a little over three dollars. Further, previous work has exposed the pressure editors face to publish work [13], especially those who run open access journals, in which payment is required to publish within the journal. This leads to the acceptance of inferior work and a flood of review requests for papers that should likely have been triaged by the editor.

Fake journals are numerous and are getting difficult to discriminate: Predatory journals are open-access publishers that actively solicit and publish articles for a fee, with little or no real peer review [14]. I’ve written about these before and even wrote a fake paper (with Josh Cleland and Paul Mintken) about a dead person being brought back to life with spinal manipulation to show how these journals will accept anything [15]. There are some estimates that 15,000 predatory journals are in existence [16]. A popular publishing company, MDPI, has recently been placed on Predatory-Reports.com’s predatory publishing list because of concerning behaviors in the peer-review process [17]. It is worth noting that many borderline predatory behaviors have made the identification of predatory journals more difficult, as competition to secure submissions has ramped up along with the number of new journals being created. Publishing low-quality or questionable work has also undermined the promotion and tenure process in academic settings, as appointment, promotion and tenure (APT) committee members are often asked to review portfolios of individuals outside of their professional field.

Retraction rates are on the rise: A retraction occurs when a previously published paper in an academic journal is flagged as so seriously flawed that its results and/or conclusions are no longer valid. Retractions occur because of plagiarism, data manipulation, and conflicts of interest [18], and overall they are not very common: roughly 2.5 of every 10,000 papers are retracted. Journals self-govern (with external assistance) and often identify flawed work and retract the papers; as such, most retractions occur in higher-level journals. To date, data simply do not exist to estimate how many flawed papers sit in journals with no real peer review (predatory journals) or in journals that are not predatory but exhibit questionable behaviors.

This sounds awful, so what should we do? I do realize this blog is negative, but it is important to understand the context around peer review, especially if you have not had the opportunity to publish, review, or edit within the peer review system. There are strategies one can adopt that may help navigate these challenges. First, I would recommend reading work from reputable journals that are affiliated with reputable societies (e.g., JOSPT, Physical Therapy, Journal of Physiotherapy). Second, I think it is healthy and reasonable to question results that are notably different from known information, results obtained by a group with a vested interest in the outcome of the study, and results that are substantially better than those of the comparison group, because such findings are just not very common or likely. Third, it is appropriate to support the current momentum toward paying reviewers for their efforts, as long as their work is of high quality. Fourth, it is good when editors triage papers that are unlikely to be published (or that should not be published), as this reduces the burden on peer review. Lastly, it is important to recognize that someone has to pay for open access journals; it is typically the author who pays.

References

1. Smith R. Peer review: a flawed process at the heart of science and journals. J R Soc Med. 2006 Apr;99(4):178-82.
2. Flaherty C. The Peer-Review Crisis. Inside Higher Ed. Available at: https://www.insidehighered.com/news/2022/06/13/peer-review-crisis-creates-problems-journals-and-scholars
3. Malcom D. It’s Time We Fix the Peer Review System. Am J Pharm Educ. 2018 Jun;82(5):7144.
4. Subbaraman N. What’s wrong with peer review. Wall Street Journal. Available at: https://www.wsj.com/science/whats-wrong-with-peer-review-e5d2d428
5. Ansede M. Scientists paid large publishers over $1 billion in four years to have their studies published with open access. El Pais. Available at: https://english.elpais.com/science-tech/2023-11-21/scientists-paid-large-publishers-over-1-billion-in-four-years-to-have-their-studies-published-with-open-access.html
6. Gower T. What Are Medical Journals? WebMD. Available at: https://www.webmd.com/a-to-z-guides/medical-journals
7. (no author) Scientists are publishing too many papers—and that’s bad for science. Science Advisor. Available at: https://www.science.org/content/article/scienceadviser-scientists-are-publishing-too-many-papers-and-s-bad-science#:~:text=In%20recent%20years%2C%20the%20number,had%20jumped%20to%202.82%20million.
8. Landhuis E. Scientific literature: Information overload. Nature. 2016;535:457–458.
9. Morrison H. Open access article processing charges 2011–2021. Sustaining the Knowledge Commons / Soutenir les savoirs communs; June 24, 2021.
10. Du JS. Opinion: Is Open Access Worth the Cost? The Scientist. Available at: https://www.the-scientist.com/opinion-is-open-access-worth-the-cost-70049
11. Graham K. Thanking Our Peer Reviewers. EveryONE (Blogs.plos.org); January 6, 2014.
12. Lee JCL, Watt J, Kelsall D, Straus S. Journal editors: How do their editing incomes compare? F1000Res. 2020;9:1027.
13. De Vrieze J. Open-access journal editors resign after alleged pressure to publish mediocre papers. Science Advisor. Available at: https://www.science.org/content/article/open-access-editors-resign-after-alleged-pressure-publish-mediocre-papers
14. Cook CE, Cleland JA, Mintken PE. Manual Therapy Cures Death: I Think I Read That Somewhere. J Orthop Sports Phys Ther. 2018 Nov;48(11):830-832.
15. Cook CE, Cleland J, Mintken P. Temporal Effect of Repeated Spinal Manipulation on Mortality Ratio: A Case Report. ARCH Women Health Care. 2018;1(1):1–4.
16. Freeman E, Kurambayev B. Rising number of ‘predatory’ academic journals undermines research and public trust in scholarship. The Conversation. Available at: https://theconversation.com/rising-number-of-predatory-academic-journals-undermines-research-and-public-trust-in-scholarship-213107#:~:text=That%20is%20roughly%20the%20same,there%20were%2015%2C000%20predatory%20journals
17. (no author) Is MDPI a predatory publisher? Publishing with Integrity. Available at: https://predatory-publishing.com/is-mdpi-a-predatory-publisher/
18. Conroy G. The biggest reason for biomedical research retractions. Detection software is not enough. Nature Index. Available at: https://www.nature.com/nature-index/news/the-biggest-reason-for-biomedical-retractions

On Mastery

By: Seth Peterson, PT, DPT, OCS, FAAOMPT

“I don’t know how they can sleep at night.” I was getting chewed out in a hallway in my first year of residency training. My mentor was speaking in general terms, but it was painfully clear that “they” meant me. I had just seen an 11-year-old girl with an ankle sprain. I had given her a painful balance exercise in standing (because the evidence showed it was more effective) and we had talked about pain neurophysiology, which was cutting-edge at the time. My mentor’s problem with what she’d just witnessed was that, despite my applying “evidence-based care,” she hadn’t really seen me apply that care to the individual. She hadn’t seen me think.

Looking back, my lack of thinking about the interventions was made worse by the fact that I was doing so much thinking about the simple things. While my mentor was thinking about the words used to greet someone and deciding what mattered to that person on that day, I was focused on how to sequence an ankle examination. I was focused on the basics—and the basics were something they did unfailingly well. Using the conscious competence learning model, you could say I was at a stage of “conscious incompetence” while they were well into the “unconscious competence” stage. Another way to say it is they had “mastered” the basics, while I was just beginning to grasp them.

(more…)

An Exercise in Interpreting Clinical Results

By: Chad E Cook PT, PhD, FAPTA

Randomized Controlled Trials

In clinical research, treatment efficacy (the extent to which a specific intervention, such as a drug or therapy, produces a beneficial result under ideal conditions) and effectiveness (the degree to which an intervention achieves its intended outcomes in real-world settings) are studied using randomized controlled trials. Randomized controlled trials compare the average treatment effects (ATEs) of outcomes between two or more interventions [1]. By definition, an ATE represents the average difference in outcomes between treatment groups (those who receive the treatment or treatments) and/or a control group (those who do not receive the treatment) across the entire population. Less commonly, researchers will include a secondary “responder analysis” that looks at the proportion of individuals who meet a clinically meaningful threshold.
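To make the distinction concrete, here is a minimal sketch (in Python, with hypothetical data and an assumed clinically meaningful threshold) contrasting an ATE, which is a difference in group means, with a responder analysis, which reports the proportion of individuals in each arm who meet the threshold.

```python
# Minimal sketch with simulated data; the outcome scale and the 20-point
# threshold are assumptions for illustration, not values from a real trial.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pain-reduction scores (points) for two trial arms.
treatment = rng.normal(loc=22, scale=10, size=100)  # intervention arm
control = rng.normal(loc=15, scale=10, size=100)    # control arm

# Average treatment effect: the average difference in outcomes between groups.
ate = treatment.mean() - control.mean()

# Responder analysis: proportion of each arm meeting an assumed
# clinically meaningful threshold of 20 points.
mcid = 20
responders_treatment = np.mean(treatment >= mcid)
responders_control = np.mean(control >= mcid)

print(f"ATE: {ate:.1f} points")
print(f"Responders, treatment arm: {responders_treatment:.0%}")
print(f"Responders, control arm: {responders_control:.0%}")
```

The two summaries can tell different stories: a modest average difference can coexist with a meaningful difference in responder proportions, which is why some trials report both.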

(more…)