S1C – Statistical Methods in Restricted Mean Survival Time

Chair: Hwanhee Hong, PhD (Duke University)
Co-Chair: Kaiyuan Hua, PhD (Duke University)

Abstract: Restricted Mean Survival Time (RMST) is a robust statistical measure for analyzing time-to-event data. It summarizes the mean survival time up to a clinically relevant truncation time. The difference or ratio of RMST between two treatments measures the relative treatment effect in terms of a gain or loss of event-free survival time within a specified time horizon. The hazard ratio, typically estimated from the Cox proportional hazards model, can be a misleading and inappropriate summary of treatment effect when the proportional hazards assumption is violated. In contrast, estimating RMST does not rely on model assumptions. In addition, RMST provides a more straightforward interpretation of survival benefit, making it increasingly popular in recent medical research. This session will feature leading academic experts to discuss recent methodological advancements and practical implementation for RMST analyses. Presentations will highlight innovative applications of RMST in clinical trials, observational studies, and medication evaluation.

Dr. Ludovic Trinquart from Tufts University introduces a novel combination test framework with RMST that empowers patient-centered clinical trial design. Dr. Lihui Zhao from Northwestern University discusses the application of RMST for measuring disease burden. Dr. Kaiyuan Hua from Duke University presents novel methods for enhancing evaluation of treatment effect consistency in multi-regional clinical trials with time-to-event outcomes. Lastly, Dr. Miki Horiguchi from Harvard University presents methods for assessing survival benefits of immunotherapy using RMST.
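
For readers new to the measure, the following is a minimal sketch of how RMST can be estimated as the area under a Kaplan-Meier curve up to a truncation time, and how the between-arm difference and ratio follow from it. It is an illustration only and not the method of any speaker in this session. The function names (km_curve, rmst), the toy data, and the choice of tau are all hypothetical, and no variance or confidence interval is computed.

```python
import numpy as np


def km_curve(time, event):
    """Kaplan-Meier estimate: distinct event times and S(t) just after each."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    event_times = np.unique(time[event])
    surv = np.empty(len(event_times))
    s = 1.0
    for i, t in enumerate(event_times):
        at_risk = np.sum(time >= t)           # subjects still under observation at t
        deaths = np.sum((time == t) & event)  # events occurring exactly at t
        s *= 1.0 - deaths / at_risk
        surv[i] = s
    return event_times, surv


def rmst(time, event, tau):
    """Area under the Kaplan-Meier step function on [0, tau].

    Simplifying assumption: if tau lies beyond the last event time,
    the last Kaplan-Meier value is carried forward flat up to tau.
    """
    t, s = km_curve(time, event)
    keep = t < tau
    grid = np.concatenate(([0.0], t[keep], [tau]))
    heights = np.concatenate(([1.0], s[keep]))
    return float(np.sum(np.diff(grid) * heights))


# Hypothetical toy data (months); event = 1 means observed, 0 means censored.
control_time, control_event = [2, 4, 5, 7, 9, 12], [1, 1, 1, 0, 1, 0]
treated_time, treated_event = [3, 6, 8, 10, 12, 12], [1, 0, 1, 1, 0, 0]
tau = 12.0  # hypothetical truncation time

rmst_control = rmst(control_time, control_event, tau)
rmst_treated = rmst(treated_time, treated_event, tau)
rmst_difference = rmst_treated - rmst_control  # gain in event-free time within tau
rmst_ratio = rmst_treated / rmst_control
print(rmst_control, rmst_treated, rmst_difference, rmst_ratio)
```

In practice, an established survival analysis package would normally be used, since it also supplies standard errors and handles edge cases that this sketch ignores.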

Speaker: Kaiyuan Hua, PhD (Duke University)
Title: Novel Methods for Enhancing Evaluation of Treatment Effect Consistency in Multi-Regional Clinical Trials with Time-to-Event Outcomes
Abstract: Multi-regional clinical trials (MRCTs) play an increasingly crucial role in global pharmaceutical development by accelerating data collection and regulatory approval across diverse patient populations. However, differences in recruitment practices and regional demographics often result in variations in participant characteristics, potentially biasing treatment effect estimates and undermining the assessment of treatment effect consistency across regions. To address this challenge, we propose novel estimators and inference methods using inverse probability of sampling and calibration weighting. These approaches aim to eliminate exogenous regional imbalances while preserving intrinsic differences such as race and genetic variants. Additionally, time-to-event outcomes in MRCTs have received limited attention, with existing methods primarily focusing on hazard ratios. In this paper, we use restricted mean survival time to characterize the treatment effect, offering more interpretable results with fewer assumptions than hazard ratios. Theoretical results for the proposed estimators are established and supported by extensive simulation studies. We further demonstrate the effectiveness of our methods through a real MRCT case study on acute coronary syndromes.

Speaker: Miki Horiguchi, PhD (Harvard University)
Title: Assessing Survival Benefits of Immunotherapy Using Restricted Mean Survival Time
Abstract: For immunotherapy trials, we often see a delayed difference pattern where the proportional hazards (PH) assumption is unlikely to hold. Under the violation of the PH assumption, the conventional test/estimation approach using the log-rank test for between-group comparisons and Cox’s hazard ratio to estimate the magnitude of treatment effects is not optimal. The log-rank test is not the most powerful option, and the interpretation of the resulting hazard ratio is not obvious to clinicians and patients. However, almost all immunotherapy studies have been using the conventional approach. This implies that current immunotherapy treatment assessments are not necessarily optimal. There is a need to develop alternative approaches for assessing the survival benefits of immunotherapies which will help clinicians and patients to make better choices.

Restricted mean survival time (RMST)-based analysis is a good alternative as it can provide a robust and interpretable summary of the treatment effect. We propose RMST-based approaches particularly when a delayed difference pattern is expected. Simulation studies show how effectively the proposed methods can detect the treatment difference, compared to the log-rank test and other tests. Additionally, using past immunotherapy trials published in clinical journals, we estimate the empirical power of the proposed tests, which is determined based on the proportion of trials for which the test would have identified a significant result. We compare the empirical power of the proposed tests with other tests. In addition to the power advantage in the delayed difference scenarios, the proposed methods have test/estimation coherency regarding statistical significance and provide robust estimates of the magnitude of the treatment effect in both absolute and relative terms.

Speaker: Mitchell Paukner, PhD (Wake Forest University)
Title: Window Mean Survival Time
Abstract: In modern oncology clinical trials, it is common for time-to-event data to violate the proportional hazards (PH) assumption. Because of this violation, the common practice of assessing treatment benefit through log-rank tests (LRT) and Cox models may produce lackluster results in terms of power and interpretability. Because it is free of distributional assumptions, restricted mean survival time (RMST) has seen a resurgence of interest as a means of estimating and communicating treatment effects in settings where PH is in doubt. While RMST is interpretable, its power to detect treatment benefit is hindered, and at times abysmal, in scenarios where the treatment effect is delayed. We propose a class of alternative estimates and tests to RMST called window mean survival time (WMST), which improves power in numerous survival scenarios with non-proportional hazards (NPH) while maintaining a level of interpretability similar to that of RMST. WMST, an estimate of mean survival in a specified window of time (characterized by the area under the Kaplan-Meier (KM) curve between time horizons), offers the clinician and statistician great flexibility in choosing window bounds without relying on distributional assumptions that nullify results or interpretation when not satisfied. We cover not only the basic methodology of WMST but also its various extensions (formulation of estimates and tests, combination testing, and trial design), and demonstrate its value through multiple simulation studies and real data examples.
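
The description of WMST as the area under the KM curve between two time horizons can be sketched as follows. This is a hedged illustration based only on that description, not the speaker's implementation: the function names (km_area, wmst), the toy data, and the window bounds are hypothetical, and the tests and extensions mentioned in the abstract are not shown.

```python
import numpy as np


def km_area(time, event, upper):
    """Area under the Kaplan-Meier step function on [0, upper].

    Simplifying assumption: beyond the last event time the curve is
    carried forward flat up to `upper`.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    grid, heights, s = [0.0], [1.0], 1.0
    for t in np.unique(time[event]):          # sorted distinct event times
        if t >= upper:
            break
        at_risk = np.sum(time >= t)           # subjects still under observation at t
        deaths = np.sum((time == t) & event)  # events occurring exactly at t
        s *= 1.0 - deaths / at_risk
        grid.append(float(t))
        heights.append(s)
    grid.append(float(upper))
    return float(np.sum(np.diff(grid) * np.array(heights)))


def wmst(time, event, tau0, tau1):
    """Area under the Kaplan-Meier curve restricted to the window [tau0, tau1]."""
    if not 0 <= tau0 < tau1:
        raise ValueError("window bounds must satisfy 0 <= tau0 < tau1")
    return km_area(time, event, tau1) - km_area(time, event, tau0)


# Hypothetical toy data (months); event = 1 means observed, 0 means censored.
control_time, control_event = [2, 4, 5, 7, 9, 12], [1, 1, 1, 0, 1, 0]
treated_time, treated_event = [3, 6, 8, 10, 12, 12], [1, 0, 1, 1, 0, 0]

# Hypothetical window placed late, where a delayed effect would appear.
wmst_control = wmst(control_time, control_event, 6.0, 12.0)
wmst_treated = wmst(treated_time, treated_event, 6.0, 12.0)
wmst_difference = wmst_treated - wmst_control
print(wmst_control, wmst_treated, wmst_difference)
```

Under this reading, setting the lower bound to 0 recovers the RMST up to the upper bound, which is one way to see WMST as a generalization that lets the analyst exclude an early period in which the curves have not yet separated.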

Speaker: Ludovic Trinquart, PhD, MSc, MPH (Tufts University School of Medicine)
Title: Empowering Patient-Centered Clinical Trial Design: A Novel Combination Test Framework with Restricted Mean Survival Times
Abstract: The log-rank test is widely used for designing and analyzing randomized controlled trials (RCTs) with time-to-event endpoints. However, the log-rank test is no longer optimal under non-proportional hazards (non-PH), and deviations from proportional hazards (PH) are common. Weighted log-rank tests and their combinations, in particular the Max-Combo test, have been proposed to restore efficiency under non-PH. However, interpretation of these test results or the corresponding weighted hazard ratio (HR) can be unclear. Milestone differences in survival probabilities or in restricted mean survival times (RMST) offer clinically interpretable and patient-centric measures of treatment effects. A unified testing framework to harness their strengths remains underexplored. We introduce novel milestone combination tests that integrate the milestone differences in survival probabilities and in RMST, providing robust, clinically meaningful tools for RCTs with time-to-event endpoints under PH and non-PH conditions. In simulation studies, we consider trial designs with a random or fixed number of events, under PH, early treatment effects, late-emerging effects, and crossing survival curves. The proposed combination tests consistently maintained robust type I error control and exhibited superior power under non-PH scenarios compared to traditional methods. Applications in cancer trial datasets further highlighted their clinical relevance, detecting treatment effects at earlier, meaningful timepoints.