By: Chad Cook PT, PhD, FAPTA
Introduction: I recently responded to a very supportive post on LinkedIn that discussed a study we published two years ago on patient-reported experience measures [1]. In that observational study, we found that nearly all of the 50,000-plus physical or occupational therapy patients scored a near-perfect “experience” on the Select patient experience measure as well as a near-perfect Net Promoter Score (likelihood of recommending the clinic to others). This is great for the clinics (it suggests that the care is considered valuable) but very difficult for running analyses and distinguishing differences across clinicians, clinics, or care packages. In fact, we are experiencing something nearly identical in our work with the SS-MECH trial [2], involving a lack of data spread in measures such as therapeutic alliance and patient engagement. More on this later.
Why this is a Problem: In musculoskeletal rehabilitation, patient-reported outcome measures (PROMs) are essential tools. They help us understand how patients perceive their pain, function, and quality of life. But PROMs are only as useful as the data they produce, and sometimes the data simply can’t tell us what we need to know. Three of the biggest limitations in PROMs are ceiling effects, floor effects, and the broader issue of restricted score variability (a lack of data spread). These measurement problems don’t just frustrate researchers; they directly affect how clinicians interpret progress, evaluate treatment effectiveness, and make decisions about care.
A ceiling effect occurs when a large portion of patients score near the top of a scale; a floor effect occurs when a large portion score near the bottom. In both cases, the measure loses its ability to detect meaningful change [3]. These limitations create a number of challenges. If an outcome measure is used to evaluate the influence of a treatment, ceiling or floor effects may underestimate treatment benefit. Imagine using a functional scale where most patients with shoulder pain already score 90 out of 100 at baseline. Even if treatment improves strength, coordination, or confidence, the PROM may not budge. The patient improves, but the data don’t show it. These effects may also mask individual variability, obscuring nuanced differences across groups, and they may limit the ability to distinguish responders from non-responders among individuals who received a given care type. Perhaps most importantly, they may distort effect sizes in clinical trials, contributing to the litany of similar outcomes we see across studies.
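To make the shoulder-pain example concrete, here is a minimal sketch (simulated, hypothetical numbers in Python/NumPy, not data from any real trial) of how a 0–100 scale censors improvement for patients who start near the top:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical "true" shoulder function on a 0-100 scale, already near the top
true_baseline = rng.normal(88, 6, n)
true_change = rng.normal(8, 4, n)            # real treatment benefit
true_followup = true_baseline + true_change

# The PROM can only record 0-100, so anything above 100 is censored
observed_baseline = np.clip(true_baseline, 0, 100)
observed_followup = np.clip(true_followup, 0, 100)

print(f"True mean change:     {true_change.mean():.1f}")
print(f"Observed mean change: {(observed_followup - observed_baseline).mean():.1f}")
# Observed change is smaller: improvement above the ceiling is invisible.
```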
Limited data spread can create similar challenges. Even when scores aren’t stuck at the top or bottom, some PROMs suffer from restricted variability. If most patients cluster tightly around a narrow range, the measure struggles to distinguish between individuals or detect subtle but clinically important changes [4]. This lack of spread is especially problematic when studying treatment effect heterogeneity. If everyone’s scores look the same, it becomes nearly impossible to identify which patient characteristics predict better or worse outcomes. This happens more than we think; it is especially challenging with patient satisfaction or net promoter scores, where patients are asked if they would recommend the care to others.
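A quick illustration of why restricted spread hides predictors, again with simulated data (the effect size and cutoff below are arbitrary assumptions for the sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
predictor = rng.normal(0, 1, n)                  # a patient characteristic
outcome = 0.5 * predictor + rng.normal(0, 1, n)  # outcome truly related to it

full_r = np.corrcoef(predictor, outcome)[0, 1]

# Now keep only scores in a narrow band, mimicking restricted spread
mask = np.abs(outcome - outcome.mean()) < 0.3
restricted_r = np.corrcoef(predictor[mask], outcome[mask])[0, 1]

print(f"Correlation, full spread:       {full_r:.2f}")        # ~0.45
print(f"Correlation, restricted spread: {restricted_r:.2f}")  # much closer to 0
```

The relationship between the patient characteristic and the outcome is real in both cases; the restricted band simply leaves too little variability for the association to be visible.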
In our SS-MECH trial [2], nearly every participant reported an exceptional therapeutic alliance score with their treating clinician, leaving almost no spread in the data. Our goal was to determine whether therapeutic alliance was a mediator of outcomes and whether it influenced outcomes more in a manual therapy-oriented or an exercise-oriented approach. Because there was no spread, and because nearly everyone reported a near-perfect alliance with their clinician, the measure cannot “mediate” outcomes. If we had a study in which the alliance values were much lower (the patients had little time to build a relationship with their provider), we might have truly seen alliance act as a mediator. In this “real world” care, the issue is one of measurement limitations, not of the construct of therapeutic alliance itself.
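The statistical intuition can be sketched with simulated data (the scale, values, and effect sizes below are hypothetical, not the SS-MECH data): in a standard mediation model the indirect effect is the product of path a (treatment → mediator) and path b (mediator → outcome), and when the mediator barely varies, path a collapses toward zero no matter how strong path b is.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400
treatment = rng.integers(0, 2, n)  # e.g., manual therapy vs. exercise arm

# Hypothetical alliance scores: nearly everyone near the top of the scale
alliance = np.clip(rng.normal(47.0, 0.8, n), 0, 48)

# Path a: regression slope of the mediator on treatment
a = np.polyfit(treatment, alliance, 1)[0]
print(f"Path a (treatment -> alliance): {a:.3f}")  # ~0: no variance to explain

# With a ~ 0, the indirect effect a * b is ~0 for any plausible path b,
# so alliance cannot statistically mediate outcomes in these data.
```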
In other words, limited variability doesn’t just blur the picture: it erases it.
Why This Matters for Clinicians: Ceiling effects, floor effects, and limited data spread create real challenges for clinicians who rely on patient‑reported measures to track day‑to‑day progress because they make meaningful change nearly impossible to detect. When a patient starts at the top or bottom of a scale, or when most scores cluster tightly in a narrow band, there’s simply no room for the measure to reflect subtle but important shifts in pain, confidence, or function. A patient may be sleeping better, moving with less hesitation, or tolerating activity more easily, yet their PROM score stays frozen, giving the false impression that nothing is improving. This disconnect can undermine clinical reasoning, obscure early treatment effects, and complicate shared decision‑making. In fast‑moving musculoskeletal care, where small daily gains often matter most, a measure that can’t register those changes risks misleading both the clinician and the patient about the true trajectory of recovery [3].
Recommendations: For starters, researchers should look closely at the spread of their data, report this information in a table, and consider it as a possible limitation. Researchers who use continuous outcomes may wish to consider analyses that account for a lack of spread, such as quantile regression or proportion-based (responder) analyses. Lastly, researchers should consider within-person variability, confidence intervals around change, or minimally important difference (MID) thresholds to avoid over- or under-interpreting small numerical shifts.
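As a hedged sketch of what those analyses might look like (Python, using statsmodels’ quantile regression; the data, the 95-point responder threshold, and the group effect are all hypothetical):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 300

# Hypothetical 0-100 PROM bunched near the ceiling; group 1 gets a small boost
group = rng.integers(0, 2, n)
score = np.clip(rng.normal(90 + 3 * group, 8, n), 0, 100)
df = pd.DataFrame({"score": score, "group": group})

# Quantile regression at the 25th percentile targets the lower tail,
# where a ceiling-compressed measure still has room to vary
res = smf.quantreg("score ~ group", df).fit(q=0.25)
print(res.params)

# A proportion-based (responder) alternative: share of each group at or
# above a hypothetical threshold of 95
print(df.groupby("group")["score"].apply(lambda s: (s >= 95).mean()))
```

Looking at lower quantiles or responder proportions sidesteps the mean’s insensitivity when most scores are pinned near the maximum.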
For clinicians, it is important to have a conversation with the patient about what they meant when scoring the PROM. It shows that you find these measures to be important, and it may help qualify the results that are captured. When PROMs risk ceiling or floor effects, pairing them with performance-based tests (e.g., sit-to-stand, gait speed, grip strength) or objective metrics (e.g., step count, range of motion, sleep data) can provide a more complete picture [5]. This triangulation helps ensure that meaningful clinical change is captured even when the PROM plateaus. When baseline scores cluster tightly, traditional change scores can be misleading. It is critical to remember: a change score is the difference between two imperfect measurements.
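One way to honor that “two imperfect measurements” point is to check a change score against measurement error. A minimal worked example follows (the standard deviation and reliability values are hypothetical; the SEM and MDC formulas themselves are standard psychometrics):

```python
import math

# Hypothetical instrument properties
sd_baseline = 12.0     # standard deviation of baseline scores
reliability = 0.90     # e.g., test-retest ICC

sem = sd_baseline * math.sqrt(1 - reliability)   # standard error of measurement
mdc95 = 1.96 * math.sqrt(2) * sem                # minimal detectable change (95%)

print(f"SEM:   {sem:.1f} points")
print(f"MDC95: {mdc95:.1f} points")
# A change score below the MDC95 may just be noise from two imperfect
# measurements rather than true change.
```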
Portions of this blog were developed with assistance from AI‑based writing tools and were reviewed and edited by the author for accuracy and clarity.
References
1. Garcia AN, Cook CE, Lentz TA, et al. Predictors of Patient Experience for 89,205 Physical and Occupational Therapy Patients Seen for Musculoskeletal Disorders: A Retrospective Cohort Study. JOSPT Open. 2024: Advance publication. https://doi.org/10.2519/josptopen.2024.0071.
2. Cook CE, O’Halloran B, McDevitt A, Keefe FJ. Specific and shared mechanisms associated with treatment for chronic neck pain: study protocol for the SS-MECH trial. J Man Manip Ther. 2024 Feb;32(1):85-95.
3. Terwee CB, Bot SD, de Boer MR, et al. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol. 2007;60(1):34‑42.
4. Angst F, Aeschlimann A, Stucki G. Smallest detectable and minimal clinically important differences of rehabilitation intervention with their implications for required sample sizes using WOMAC and SF‑36 quality of life measurement instruments in patients with osteoarthritis of the lower extremities. Arthritis Rheum. 2001;45(4):384‑391.
5. Stratford PW, Kennedy DM. Performance measures were necessary to capture the benefits of treatment for patients with hip OA. Phys Ther. 2006;86(2):153‑161.