N-of-1 Trials. Are They Appropriate for Evaluating Physical Therapy Interventions?

By: Chad E. Cook, PT, PhD, FAPTA

Background: On this Center of Excellence website, we’ve previously highlighted two key points: (1) individuals do not respond uniformly to otherwise effective interventions [1], and (2) randomized controlled trials (RCTs) estimate average treatment effects rather than individualized treatment effects, making it difficult to determine who benefits from a given treatment [2]. An individualized treatment effect reflects the expected benefit of a treatment for a specific patient, based on their unique characteristics, symptoms, and context. One method for evaluating individualized treatment effects is the N‑of‑1 study.

N‑of‑1 Studies Are Poorly Understood: Although N‑of‑1 studies are considered strong causal designs, they remain poorly understood and are used far less frequently than traditional RCTs [3]. An N‑of‑1 study is a personalized, randomized, crossover experiment conducted within a single individual to determine that person’s causal response to one or more treatments. In this design, the participant undergoes multiple treatment periods (e.g., A-B-A-B, B-A-A-B, or B-A-B-A), typically separated by washout intervals, with the order of treatments randomized to reduce bias. Outcomes are measured repeatedly during each period to estimate that individual’s response to treatment [4]. When multiple N‑of‑1 studies are conducted, their results can be aggregated using hierarchical or time‑series models, but the primary unit of inference remains the individual.
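As a rough illustration of how a randomized treatment order might be generated for one participant, the sketch below randomizes the order of treatments A and B within each pair of periods. The function name `randomized_sequence`, the pair-wise (blocked) scheme, and the seed are assumptions chosen for illustration; they are not taken from the cited studies, and other randomization schemes are possible.

```python
import random


def randomized_sequence(n_pairs, seed=None):
    """Return a treatment order such as ['A', 'B', 'B', 'A', ...].

    Hypothetical helper: each pair of periods contains A and B exactly
    once, in random order, so the two treatments stay balanced.
    """
    rng = random.Random(seed)  # a seed makes the schedule reproducible
    sequence = []
    for _ in range(n_pairs):
        pair = ["A", "B"]
        rng.shuffle(pair)  # randomize the order within this pair
        sequence.extend(pair)
    return sequence


# Example: three pairs give six treatment periods for one participant.
schedule = randomized_sequence(n_pairs=3, seed=7)
print("-".join(schedule))
```

Randomizing within pairs keeps the number of A and B periods equal, which simplifies the within-person comparison. Washout intervals would be scheduled between the periods and are not modeled here.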

N‑of‑1 studies offer a powerful way to deepen our understanding of treatment specificity and advance personalized care, but they also come with important methodological complexities. In this blog, we highlight several key considerations for researchers and clinicians. These include:

  1. The assumptions underlying N‑of‑1 designs, including which conditions and treatments are best suited for this approach;
  2. How aggregated N‑of‑1 studies differ from traditional randomized crossover trials;
  3. The most effective methods for combining (aggregating) results across multiple N‑of‑1 studies; and
  4. How to interpret both individual and aggregated N‑of‑1 findings.

Assumptions for Use of an N-of-1 Study: First, N-of-1 studies should target conditions (pathologies) that are stable over time [5]. Conditions should be chronic, with symptoms stable enough that changes can be attributed to treatment. Appropriate examples may include unchanging chronic pain disorders, mental health conditions such as anxiety or chronic clinical depression, and ongoing syndromes such as fibromyalgia. In contrast, acute or recent-onset conditions that change rapidly during a course of care are inappropriate choices.

In addition, appropriate treatments are those that produce a quick, short-acting, measurable change during the treatment phase being measured [5]. These may include medication, manual therapy, TENS, or other fast-acting approaches that (supposedly) provide only short-term benefits. In contrast, interventions with long-lasting effects are considered inappropriate choices because their effects carry over and influence the next treatment phase. Examples include surgery, graded physical therapy, and psychological therapies that produce fast-acting, adaptive cognitive responses that permanently change a patient.

How N-of-1 studies differ from randomized crossover designs: N-of-1 studies and crossover RCTs look similar because both involve repeated treatment periods (although crossover RCTs typically include only one crossover), randomized sequences, and within-person comparisons. Further, many studies combine N-of-1 results into an aggregate, resulting in a large sample of N-of-1 study participants. Nonetheless, the two designs differ in purpose and analysis. N-of-1 studies focus on individual results, whereas crossover RCTs focus on group results. Aggregated N-of-1 studies use hierarchical models of individual treatment effects, whereas crossover RCTs typically use repeated-measures or mixed-model analyses.

Best methods for combining N-of-1 results: Aggregating N‑of‑1 study results fundamentally changes how a single N‑of‑1 finding is interpreted [6]. Once results are aggregated, the focus shifts from an individual’s treatment response to the average treatment effect across patients, similar to the interpretive lens used in RCTs. The most effective way to combine results from multiple N‑of‑1 trials is through hierarchical (multilevel) modeling, which treats each individual’s estimated treatment effect as its own data point while allowing those estimates to inform one another. In this approach, each participant’s within‑person treatment effect is first estimated from their repeated, randomized treatment periods, and these individual effects are then pooled using a random‑effects or Bayesian hierarchical model. This framework preserves the individualized nature of N‑of‑1 data while producing a population‑level estimate and quantifying heterogeneity in treatment response. When study designs differ across individuals, meta‑analytic techniques or individual‑participant‑data meta‑analysis can also be used to synthesize results.
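One common way to write such a hierarchical model is sketched below. The notation is an assumption introduced here for illustration and is not taken from the cited references: $y_{ij}$ is the $j$-th outcome measurement for participant $i$, $x_{ij}$ equals 1 when that measurement is taken under treatment B and 0 under treatment A, $\alpha_i$ is the participant's baseline level, and $\theta_i$ is the participant's individual treatment effect.

```latex
\begin{aligned}
y_{ij} &= \alpha_i + \theta_i \, x_{ij} + \varepsilon_{ij},
  & \varepsilon_{ij} &\sim \mathcal{N}(0, \sigma^2) \\
\theta_i &\sim \mathcal{N}(\mu, \tau^2)
\end{aligned}
```

Under these assumptions, $\mu$ is the population-level average treatment effect and $\tau^2$ quantifies heterogeneity in treatment response across individuals. This simple form ignores autocorrelation, period effects, and carryover, which a full analysis may need to address.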

Aggregating results requires that each participant completes their own randomized crossover experiment (i.e., an N-of-1 study) and that their individual treatment effect is estimated first. These individual effects are then combined to understand the population‑level effect and the degree of variation across individuals. Conceptually, this is similar to running several small RCTs and then performing a meta‑analysis. In practice, however, hierarchical modeling is preferred because it pools information across individuals, estimates both the average effect and heterogeneity, and can generate predictions for new patients. Although this modeling framework is widely used in random‑effects meta‑analysis and multi‑site trials, it can be technically complex, and collaboration with a biostatistician is often helpful.
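The "several small RCTs plus a meta-analysis" idea can be sketched as a two-stage calculation: estimate each participant's effect, then pool the estimates with a DerSimonian-Laird random-effects model. This is a simplified stand-in for a full hierarchical model, not the method of the cited papers. The pain scores are made up for illustration, the function names are hypothetical, and the sketch assumes independent measurements with no carryover or time trend.

```python
from statistics import mean, variance


def individual_effect(a_scores, b_scores):
    """Mean difference (B minus A) and its sampling variance for one person."""
    effect = mean(b_scores) - mean(a_scores)
    var = variance(a_scores) / len(a_scores) + variance(b_scores) / len(b_scores)
    return effect, var


def pool_random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooling of individual effects."""
    k = len(effects)
    w = [1.0 / v for v in variances]  # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-person variance
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = (1.0 / sum(w_star)) ** 0.5
    return pooled, se, tau2


# Made-up pain scores (0-10) for three hypothetical participants.
participants = {
    "P1": {"A": [6, 7, 6, 7], "B": [4, 5, 4, 5]},
    "P2": {"A": [5, 6, 5, 6], "B": [5, 5, 6, 5]},
    "P3": {"A": [8, 7, 8, 8], "B": [5, 6, 5, 5]},
}

estimates = [individual_effect(p["A"], p["B"]) for p in participants.values()]
effects = [e for e, _ in estimates]
variances = [v for _, v in estimates]
pooled, se, tau2 = pool_random_effects(effects, variances)
print(f"pooled effect: {pooled:.2f}, SE: {se:.2f}, tau^2: {tau2:.2f}")
```

Here the individual effects are kept as separate estimates before pooling, and `tau2` gives a rough measure of how much participants differ. A hierarchical or Bayesian model fitted in one step would additionally shrink each individual estimate toward the average.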

Interpretation of results: Interpreting individual N‑of‑1 studies and aggregated N‑of‑1 studies (also referred to in the literature as N-of-1 trial meta-analyses, N-of-1 series, or pooled N-of-1 trials) requires two different mindsets because each design answers a distinct scientific question. An individual N‑of‑1 study focuses on a single person’s causal response to a treatment, prompting the clinician to ask: “Does this treatment work for this patient, and is the benefit meaningful enough to use in practice?” The emphasis is on whether the treatment produces a consistent, clinically relevant effect for that specific individual. In contrast, when multiple N‑of‑1 studies are combined, the goal shifts from individual decision‑making to population‑level inference and understanding variability in treatment response. Interpretation then centers on questions such as: “What is the average effect across patients, how much do individuals differ, and what factors predict who benefits?” [6]

Summary: Are N-of-1 studies appropriate for evaluating physical therapy interventions? The most honest answer is “it depends”. N-of-1 studies offer an intriguing alternative to RCTs and may be a design that improves our ability to provide personalized care. Nonetheless, it is important to recognize that some conditions and treatments are inappropriate for N-of-1 investigations. Further, a single N‑of‑1 study is interpreted at the level of the individual, focusing on whether a treatment produces a consistent and clinically meaningful benefit for that specific patient. In contrast, aggregated N‑of‑1 studies are interpreted at the population level, emphasizing the average treatment effect across individuals, the degree of heterogeneity in response, and the factors that predict who is most likely to benefit, which is similar to the interpretive lens used in an RCT.

Layperson’s Summary: An N-of-1 study may help identify which treatment approach works best for you. To do this, your researcher may switch between treatments over the course of your care to determine which approach is most meaningful for your recovery.

References

  1. Edwards RR, Dworkin RH, Turk DC, et al. Patient phenotyping in clinical trials of chronic pain treatments: IMMPACT recommendations. Pain. 2016;157(9):1851-1871.
  2. Cook C. Why Individualized Treatment Effects Matter More Than Averages in Musculoskeletal Care. Center of Excellence in Manual and Manipulative Therapy. Accessed February 20, 2026: https://sites.duke.edu/cemmt/2025/12/24/why-individualized-treatment-effects-matter-more-than-averages-in-musculoskeletal-care/
  3. Vohra S, Punja S. N-of-1 trials: individualized medication effectiveness tests. AMA J Ethics. 2013;15(11):947‑952.
  4. Guyatt GH, Keller JL, Jaeschke R, Rosenbloom D, Adachi JD, Newhouse MT. The N‑of‑1 randomized controlled trial: clinical usefulness. J Clin Epidemiol. 1990;43(3):255‑266.
  5. Lillie EO, Patay B, Diamant J, Issell B, Topol EJ, Schork NJ. The N‑of‑1 clinical trial: the ultimate strategy for individualizing medicine? J Clin Epidemiol. 2011;64(6):561‑570.
  6. Zucker DR, Ruthazer R, Schmid CH, et al. Individual (N‑of‑1) trials can be combined to give population comparative treatment effect estimates: methodologic considerations. J Clin Epidemiol. 2010;63(12):1312‑1323.
