Estimands and sensitivity analysis in clinical trials E9(R1)

regulatory

estimands

Author

maj

Published

August 4, 2025

Modified

August 5, 2025

Code

library(data.table)
library(ggplot2)
library(gt)
options(scipen=999)

Introduction

The ICH efficacy guidelines deal with the design, conduct, safety and reporting of clinical trials. E9 covers the statistical methodology applied to (later phase) clinical trials for marketing applications. E9(R1) is an addendum to E9 that introduces the estimand framework, with a focus on the statistical principles related to estimands and sensitivity analyses.

There are six sections in the addendum:

Purpose and scope
Framework introduction
Estimand definitions
Impacts of framework on design and conduct
Impact on analyses
Documentation

Purpose

The addendum is about achieving greater clarity about what treatment effects imply. To do so, it introduces estimands, which are precise descriptions of the treatment effect that reflect the clinical question posed by the trial. At a fundamental level, estimands assume a causal inference perspective:

“An estimand … summarises at a population level what the outcomes would be in the same patients under different treatment conditions being compared.” (ICH E9(R1) training materials)

While this is clearly drawing on concepts from the potential outcomes framework, the guidance never mentions this term explicitly.
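To make the link to potential outcomes concrete, here is a minimal simulated sketch. Everything in it is an assumption made for illustration (the sample size, the outcome distribution and the individual effect of 5); none of it comes from the guideline. Each simulated patient has an outcome under control (`y0`) and under the test treatment (`y1`), the estimand is the population mean of the difference, and a randomised trial only ever reveals one of the two outcomes per patient.

```r
# Minimal sketch: an estimand as a contrast of potential outcomes.
# All numbers are simulated and purely illustrative.
set.seed(1)
n <- 10000

# Hypothetical potential outcomes for the SAME patients under each condition
y0 <- rnorm(n, mean = 50, sd = 10)          # outcome if given control
y1 <- y0 + 5 + rnorm(n, mean = 0, sd = 2)   # outcome if given test treatment

# Estimand: population-level summary of the individual contrasts
estimand_true <- mean(y1 - y0)

# A trial reveals only one potential outcome per patient
trt   <- rbinom(n, size = 1, prob = 0.5)
y_obs <- ifelse(trt == 1, y1, y0)

# Under randomisation the difference in arm means estimates the estimand
estimate <- mean(y_obs[trt == 1]) - mean(y_obs[trt == 0])

round(c(estimand = estimand_true, estimate = estimate), 2)
```

The two numbers should be close, but only the second is available in practice; the first exists here only because the data are simulated.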

The stated purpose of the guidance is to address (1) the limitations of intention-to-treat (ITT) analysis with respect to its utility in decision making; (2) the distinction between missing data and intercurrent events (ICEs; post-randomisation events such as discontinuation of treatment, use of rescue medication, death, etc.), especially when ICEs affect the interpretation of later measurements; (3) which analysis set should be referenced and how per-protocol analysis can be addressed; and (4) the use of sensitivity analyses.

The addendum (i.e. the use of the estimand framework) is relevant irrespective of whether the study is an RCT and of the type of data considered.

Framework introduction

The addendum maintains a waterfall (linear/sequential) perspective on the development of clinical trials. Specifically, objectives should be translated into clinical questions, which are then used to define estimands.

Only once the estimands have been formulated can a method of estimation be chosen and then sensitivity analyses (targeting the same estimand) are used to explore how robust the inference is.

The guideline wants the estimands to be the main determinant of the trial design, conduct and analysis. Working the other way around means that the selected analysis effectively dictates the estimand.

Estimands

The estimand captures a very precise description of the treatment effect that reflects the clinical question.

ICEs are events that occur after treatment initiation and affect either the interpretation or the existence of the measurements used to answer the clinical question. To fully understand the treatment effect, we need to consider ICEs.

For example, if two participants are assigned to the treatment arm of a trial but one receives additional medicine, then the treatment actually received differs between the two participants. Additionally, whether a participant needs to take additional medication, and whether or not they can continue taking the assigned treatment, might depend on which treatment arm they are assigned to. What needs to be decided is whether these events should be considered explicitly when defining the clinical question.

The guidance does not go into much detail about which ICEs it has in mind, so Table 1 gives a (somewhat) generic list:

Code

d_tbl <- fread("data/intercurrent_events_list.csv")
d_tbl[, id_type := .GRP, by = type]

# Order the data
setorder(d_tbl, id_type)

# Blank out repeated 'type' and 'subtype' values for better visual grouping
d_tbl[, my_lag := shift(type, type = "lag")]
d_tbl[, type_display := ifelse(my_lag != type | is.na(my_lag), type, "")]
d_tbl[, my_lag := NULL]


d_tbl[, my_lag := shift(subtype, type = "lag")]
d_tbl[, subtype_display := ifelse(my_lag != subtype | is.na(my_lag), subtype, "")]
d_tbl[, my_lag := NULL]

d_tbl[, .(type, subtype_display, element)] |>
  gt(groupname_col = "type") |>
  gt::text_transform(
    locations = cells_row_groups(),
    fn = function(x) {
      lapply(x, function(x) {
        gt::md(paste0("**", x, "**"))
      })
    }
  ) |>
  cols_label(
    # type = "Type",
    subtype_display = "Subtype",
    element = "Description"
  ) |>
  cols_width(
    subtype_display ~ pct(25),
    element ~ pct(75)
  ) |>
  # tab_style(
  #   style = cell_text(indent = px(20)),
  #   locations = cells_body(columns = subtype_display)
  # ) |>
  fmt_markdown(columns = everything()) |>
  tab_options(
    table.font.size = "x-small"
  )

Table 1: Examples of possible ICEs

Subtype	Description
Participant-Level Events
Treatment Discontinuation	Discontinuation due to adverse events
	Discontinuation due to lack of efficacy
	Discontinuation due to participant choice
	Discontinuation due to investigator decision
Treatment Switching / Change in Assigned Treatment	Switch to another active treatment
	Switch to placebo/control arm
	Escalation or de-escalation of dose outside of protocol-defined rules
Initiation of Rescue or Concomitant Medication	Use of rescue medication for symptom relief
	Use of non-protocol permitted concomitant medication
	Use of alternative (non-protocol) therapy for the condition under study
Non-Adherence to Treatment Regimen	Partial compliance (e.g., missed doses)
	Taking incorrect dose or frequency
	Use of alternative medicine or supplements that interact with treatment on outcome
Withdrawal from Follow-up	Participant withdrawal of consent for follow-up
	Loss to follow-up (cannot be contacted)
	Withdrawal due to logistical/administrative issues (system error, data loss)
Mortality	Death due to study treatment
	Death due to disease under study
	Death due to unrelated causes
	Death with unknown attribution
Clinical and Health Events
Occurrence of a Clinical Event that Alters Outcome Interpretation	Intercurrent illness affecting outcome measurement (e.g. contracting COVID-19)
	Complication or comorbidity unrelated to treatment
	Pregnancy (in trials not focused on pregnancy)
Diagnosis of a New Condition	Diagnosed with a comorbidity during follow-up
	Psychiatric diagnosis altering engagement or assessment
Hospitalisation or Institutionalisation	Emergency hospitalisation unrelated to study treatment
	Admission to rehabilitation, palliative, or psychiatric care
Measurement and Assessment Events
Missing Outcome Data Due to Operational Issues	Missed clinic visit or window for assessment
	Data entry failure, corrupted files, lost records
Change in Method of Outcome Measurement	Different device, temporary staff replacement, non-trial method used
	Incomplete outcome data (data entry mishaps)
Unblinding of Treatment Allocation	Accidental unblinding by staff or participant (clinician unblinded impacting subsequent assessment and care)
	Protocol-mandated unblinding
Participant Non-Cooperation During Assessment	Refusal to undergo a test or procedure
	Inability to perform task due to unrelated cause
Protocol and Trial Conduct Issues
Protocol Deviations / Violations	Incorrect treatment administration
	Enrolment of ineligible participant
	Missed or out-of-window assessments
Trial Discontinuation Events	Study site closure
	Sponsor decision to stop trial early
	Regulatory action halting the study
Randomisation Error	Misallocation to incorrect arm
	System malfunction during randomisation
Contextual and External Events
Environmental or Societal Disruptions	Natural disaster, pandemic, conflict affecting follow-up
	Travel restrictions preventing visits
Change in Standard of Care	Introduction of new competing treatment during the trial
	Public access to investigational treatment
Insurance or Access Issues	Loss of access to trial medication or assessments due to coverage
	Changes in healthcare system or site practices

ICEs can have a range of impacts on the outcome and/or treatment:

Discontinuation of the assigned treatment or the use of additional or alternative therapy can impact the interpretation of outcome measures and thus treatment effect.
Death may lead to the outcome measure being impossible to observe.

Some events will have permanent impacts on the outcome, others may be temporary.

ICEs may also fully (or partially) define principal strata. For example, some participants would never complete treatment, whichever of the treatments they were assigned.

The guideline offers several strategies for handling intercurrent events:

Treatment policy
Hypothetical
Composite
While on treatment
Principal stratum

Treatment policy

Under the treatment policy (TP) strategy, ICEs are considered irrelevant to the treatment effect of interest, so the ICE becomes part of the treatment regimen under comparison. In this sense, TP essentially replicates the ITT principle. However, TP cannot be implemented for terminal events such as death, because the outcome variable will not exist afterwards.
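As a rough illustration, the sketch below uses a small made-up dataset (the column names `arm`, `discontinued` and `y` and all values are hypothetical) in which outcomes were still collected after treatment discontinuation. Under treatment policy every randomised participant contributes their observed outcome, whatever happened to their treatment. A completers-only comparison is shown alongside purely for contrast; it does not target the treatment policy estimand.

```r
# Hypothetical trial data: outcome y was collected for everyone,
# including participants who discontinued their assigned treatment.
trial <- data.frame(
  id           = 1:8,
  arm          = rep(c("control", "test"), each = 4),
  discontinued = c(FALSE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE),
  y            = c(10, 12, 11, 9, 13, 16, 15, 12)
)

# Treatment policy: compare arms as randomised, ignoring the ICE
tp_effect <- mean(trial$y[trial$arm == "test"]) -
  mean(trial$y[trial$arm == "control"])

# For contrast only: dropping participants with the ICE answers a
# different (and generally non-causal) question
completers <- trial[!trial$discontinued, ]
completers_diff <- mean(completers$y[completers$arm == "test"]) -
  mean(completers$y[completers$arm == "control"])

c(treatment_policy = tp_effect, completers_only = completers_diff)
```

The point of the sketch is that treatment policy depends on outcomes being followed up after the ICE; if follow-up stops at discontinuation, the data needed for this estimand are missing.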

Hypothetical

The hypothetical strategy envisages a scenario in which the ICE would not occur. However, this estimand can be difficult to implement. For example, if discontinuation of medication is the event in question, we might ask what the effect of the drug would be if all participants adhered. One of the issues is that we never observe whether participants in the other arms would have adhered had they been assigned to the drug in question.

As another example, we might consider the hypothetical estimand of the effect of a new inhaler were rescue medication not available. In this situation, the scenario is implausible because withholding rescue medication is unrealistic to implement and ethically problematic.
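A small simulation can show why the hypothetical strategy targets a different quantity from treatment policy. In the sketch below all quantities are assumptions made for illustration: a symptom score where lower is better, a rescue threshold of 65, and a fixed 10-point benefit from rescue. Because the data are simulated we can see the outcome each participant would have had without rescue (`y_no_rescue`); in a real trial that value is unobserved for anyone who took rescue, and estimating it needs further assumptions.

```r
# Simulated sketch: hypothetical vs treatment policy targets.
# All parameter values are illustrative assumptions.
set.seed(2)
n   <- 20000
trt <- rbinom(n, size = 1, prob = 0.5)

# Symptom score (lower is better) if rescue medication were not available
y_no_rescue <- rnorm(n, mean = 60 - 8 * trt, sd = 10)

# Participants with poor control take rescue, which improves their score
rescue   <- y_no_rescue > 65
y_actual <- y_no_rescue - 10 * rescue

# Hypothetical target: effect in a world without rescue medication
hyp_effect <- mean(y_no_rescue[trt == 1]) - mean(y_no_rescue[trt == 0])

# Treatment policy target: effect with rescue use as it actually happened
tp_effect <- mean(y_actual[trt == 1]) - mean(y_actual[trt == 0])

round(c(hypothetical = hyp_effect, treatment_policy = tp_effect), 2)
```

Rescue is used more often on control in this setup, so it narrows the observed gap between arms and the treatment policy effect is smaller in magnitude than the hypothetical one.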

Composite

The composite strategy treats the occurrence of an ICE as being informative about the patient's outcome, so the ICE is built into the definition of the endpoint. Death is the classic example, but discontinuation because of toxicity is also a good example, as a form of treatment failure.
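One common way to do this with a binary endpoint is to count anyone who experiences the ICE as a treatment failure. The sketch below assumes a hypothetical dataset (column names and values are made up) in which response is missing for participants who had the ICE, for example because they died before assessment.

```r
# Hypothetical data: 'responded' is NA where the ICE prevented assessment
trial <- data.frame(
  arm       = rep(c("control", "test"), each = 5),
  responded = c(TRUE, FALSE, NA, FALSE, NA, TRUE, NA, TRUE, FALSE, TRUE),
  ice       = c(FALSE, FALSE, TRUE, FALSE, TRUE,
                FALSE, TRUE, FALSE, FALSE, FALSE)
)

# Composite endpoint: success requires a response AND no ICE.
# In R, FALSE & NA evaluates to FALSE, so ICE rows become failures.
trial$success <- !trial$ice & trial$responded

# Population-level summary: difference in success proportions
success_rate <- tapply(trial$success, trial$arm, mean)
risk_diff    <- unname(success_rate["test"] - success_rate["control"])

success_rate
risk_diff
```

No outcome is missing after this step, because the ICE itself determines the value of the composite variable.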

While on treatment

While on treatment is primarily for situations where the outcome variable is measured repeatedly and interest lies in the effect of treatment up to the point where the ICE occurs.
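A minimal sketch of the data handling, assuming a hypothetical long-format dataset with one row per visit and a column `ice_week` giving the week at which the ICE occurred (`Inf` if it never did). Only measurements taken before the ICE are kept, and each participant is summarised by the mean of their on-treatment values; the choice of summary is an assumption for illustration, not something the guideline prescribes.

```r
# Hypothetical repeated-measures data: 4 participants, visits at weeks 4, 8, 12
visits <- data.frame(
  id       = rep(1:4, each = 3),
  arm      = rep(c("control", "control", "test", "test"), each = 3),
  week     = rep(c(4, 8, 12), times = 4),
  y        = c(20, 22, 24,  21, 23, 24,  18, 16, 30,  17, 14, 11),
  ice_week = rep(c(Inf, 6, 10, Inf), each = 3)  # Inf = no ICE
)

# Keep only measurements taken before the ICE occurred
on_trt <- visits[visits$week < visits$ice_week, ]

# Per-participant summary of the on-treatment measurements
per_patient <- aggregate(y ~ id + arm, data = on_trt, FUN = mean)

# Population-level summary: difference in arm means
arm_means  <- tapply(per_patient$y, per_patient$arm, mean)
wot_effect <- unname(arm_means["test"] - arm_means["control"])

per_patient
wot_effect
```

Participants contribute different lengths of observation under this strategy, which is one reason the duration of observation needs care when interpreting the resulting effect.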

Principal stratum

Under the principal stratum strategy, we consider the stratum of participants in whom the ICE would (or would not) occur, and the clinical question relates to the treatment effect only within that stratum. For example, we might be interested in the effect of treatment on the severity of infection in the patients who would become infected after vaccination.

The guideline is very clear on the importance of distinguishing principal stratification (based on potential intercurrent events) from subsetting based on actual ICEs, because the subset of participants who experience an ICE on the test treatment will likely differ from the subset that experiences the same ICE on control. A naive comparison of subsets defined by observed ICEs confounds the effect of treatment with differences in participant characteristics between the subsets, so any apparent difference may not reflect a real treatment effect.
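The following simulated sketch illustrates the problem using the vaccination example. All of it is assumed for illustration: a latent `frailty` variable drives both infection and severity, the vaccine never causes an infection that would not otherwise have happened, and the vaccine lowers severity by exactly 1 unit in everyone. The principal stratum is the set of participants who would be infected under either assignment, which is only identifiable here because both potential infection statuses are simulated.

```r
# Simulated sketch: principal stratum effect vs naive subsetting on observed ICE.
# All parameter values are illustrative assumptions.
set.seed(3)
n       <- 100000
frailty <- rnorm(n)

# Potential infection status under each assignment
s0 <- frailty > -0.5   # would be infected if given control
s1 <- frailty >  0.5   # would be infected if vaccinated (frailest only)

# Potential severity if infected; true vaccine effect is -1 for everyone
sev0 <- 5 + 2 * frailty + rnorm(n)
sev1 <- sev0 - 1

# Randomise and reveal only the outcomes for the assigned arm
trt      <- rbinom(n, size = 1, prob = 0.5)
infected <- ifelse(trt == 1, s1, s0)
severity <- ifelse(trt == 1, sev1, sev0)

# Principal stratum: infected whichever arm they were assigned to
ps        <- s0 & s1
ps_effect <- mean(sev1[ps] - sev0[ps])

# Naive analysis: compare severity among those observed to be infected
naive_diff <- mean(severity[trt == 1 & infected]) -
  mean(severity[trt == 0 & infected])

round(c(principal_stratum = ps_effect, naive_subset = naive_diff), 2)
```

In this setup the infected vaccinated participants are frailer on average than the infected controls, so the naive comparison hides the benefit and can even point the wrong way.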

Components of estimands

To define an estimand, the treatment condition, the population (entire trial population, subgroup, principal stratum, etc.), the endpoint to be obtained for each patient, and the population-level summary need to be considered and precisely specified.

The guideline notes the importance of ensuring that the definition of the treatment effect of interest identifies an effect due to treatment and not due to potential confounders such as differences in the duration of observation or participant characteristics.
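One way to keep these components explicit is to write the estimand down as a small structured object before any analysis code exists. The sketch below is hypothetical throughout: the field names and the example trial are invented, and the strategies chosen for each ICE are placeholders rather than recommendations. It records the attributes listed above together with the handling of each ICE discussed earlier.

```r
# Hypothetical estimand specification; every entry is an illustrative placeholder
estimand <- list(
  treatment  = "test drug vs placebo, each added to standard of care",
  population = "adults meeting the trial eligibility criteria",
  variable   = "change from baseline in symptom score at week 12",
  intercurrent_events = c(
    "treatment discontinuation" = "treatment policy",
    "rescue medication"         = "hypothetical"
  ),
  summary_measure = "difference in means between arms"
)

# Check that each ICE has been given one of the five strategies
strategies <- c("treatment policy", "hypothetical", "composite",
                "while on treatment", "principal stratum")
all(estimand$intercurrent_events %in% strategies)

str(estimand)
```

Writing the specification first fits the ordering the guideline asks for, in which the estimand is fixed before the estimator is chosen.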

Considerations

The main emphasis of the guideline is to specify estimands that are useful to the decision maker in question. For example, a hypothetical estimand that evaluates what the variable of interest would have been in the absence of rescue medication might be important in certain scenarios but completely unrealistic in others.

More to come…