Design Notes

Published

February 2, 2024

Modified

February 23, 2024

Setup
source("./R/init.R")
log_info("Called design-notes notebook")
Note

Why bother with any of this? One reason is to identify the variables that are relevant to the design and how they relate. This helps us formulate a picture of the joint probability distribution which allows us to simulate data from the assumed data generation process. It also allows us to determine which causal effects we can identify. Even if we model all the domains separately, we still need some kind of joint view of the distribution in order to simulate the data properly and share parameters.

Start by supposing we are doing a trial in just the late acute patients. We are interested in surgery (DAIR/revision) and antibiotic duration (long/short), but we are limited in that, for whatever reason, we cannot ethically randomise the type of revision surgery, only revision surgery itself.

Choice of antibiotic duration is conditional on the type of revision surgery used, so surgery type needs to be considered in any joint analysis (alternatively, analyse separately).

I just want to work through from a basic scenario to more involved ones to check my understanding of potential issues. The following ignores much of the complexity; the aim is just to get some basics down as a reference.

Surgery

As a starting point, consider the following:

  • R: randomised surgery - 0: DAIR, 1: revision
  • S: preferred revision - 0: one-stage, 1: two-stage
  • A: allocated surgery - 0: DAIR, 1: one-stage, 2: two-stage
  • Y: treatment success - 0: no, 1: yes

Patients are randomised to a surgery type, \(R\): DAIR or revision. For every patient, if they were to have been allocated to revision, there is some preference/plan for one/two stage, \(S\). The value of \(S\) is determined by the surgeon/patient and I’m considering it here as just an attribute of the patient.

Note

The above determination is actually dependent on the surgeon’s position as well as patient characteristics. In fact, one may dominate the other. For example, would a surgeon choose two-stage for all their patients simply because they have more experience or success with that approach, whereas another might choose one-stage for all of theirs?

Note to self - in practice, I think that what matters is that we know what the selection is, rather than what process occurs to decide it. Additionally, later we see that this consideration becomes redundant.

The actual allocated treatment is a deterministic function of \(R\) and \(S\), i.e. \(A = R \times (S + 1)\). Note that I’m assuming that \(S\) is known for every patient before \(R\) is revealed to the surgeon, irrespective of whether they are eventually assigned to DAIR or revision (if that isn’t the case then perhaps some issues might arise, but I don’t think it matters too much for what’s being considered here, given randomisation).
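As a quick sanity check (a scratch illustration, not part of the design itself), the allocation rule can be tabulated directly:

# Deterministic allocation A = R * (S + 1):
# DAIR whenever R = 0; one-stage (A = 1) if R = 1 & S = 0; two-stage (A = 2) if R = 1 & S = 1.
d_map <- expand.grid(r = 0:1, s = 0:1)
d_map$a <- d_map$r * (d_map$s + 1)
d_map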

%%{
  init:{
    "flowchart":{"htmlLabels": "true"},
    "securityLevel": "loose",
    "theme": "base"
}}%%
flowchart LR
  R --> A --> Y
  S --> A 
  S --> Y
  US((U<sub>S</sub>)) -.-> S
  UR((U<sub>R</sub>)) -.-> R
  UY((U<sub>Y</sub>)) -.-> Y
Figure 1: Scenario 1, the \(U\) denote independent exogenous variables
Note
  • There are no back-door paths for the above, hence no open back-door paths
  • \(Y\) is caused by/depends on some function of \(R\) and \(S\): \(Y = f(R,S)\)
  • Assuming \(S\) causes \(Y\) is a weaker/safer assumption than excluding that link.
  • Maybe there should be some shared \(U\) that influences both \(S\) and \(Y\) rather than assuming \(S\) causes \(Y\)?
  • The graph implies:
    • \(S\) and \(R\) are independent; \(S \mathrel{\unicode{x2AEB}} R\)
    • \(Y\) is conditionally independent of \(R\) given \(A\) and \(S\); \(Y \mathrel{\unicode{x2AEB}} R | A, S\)
  • We can estimate the total effect of \(R\) on \(Y\) where \(R = 1\) is loosely defined as revision with expert selection of one-stage or two-stage procedure (or something along those lines)
  • The planned/intended procedure should be recorded prior to any elements of randomisation being revealed.
  • No adjustment is required to estimate the total effect of \(R\) on \(Y\)

Our intervention is on \(R\); everything downstream from that (revision type, antibiotic use, physiotherapy, complications) is a consequence of the intervention. It’s the overall effect of allocation to \(R=0\) or \(R=1\) that we are trying to (the only thing we can, not necessarily want to) compare.

Note
  • Isn’t revision-type independent of intervention? I don’t think I understand why it is downstream.
  • Not really sure what is meant by the statement “the only thing we can, not necessarily want to, compare”

If \(Y(a)\) is the potential outcome of a patient under surgery of type \(a\) (DAIR, one-stage, two-stage), then:

\[ \begin{aligned} \mathbb{E}[Y(a)] &= \mathbb{E}[Y(a)|R=0] \\ &= \mathbb{E}[Y(a) | R=1] \\ &= \mathbb{E}[Y(a) | A = 0] \\ &= \mathbb{E}[Y(a) | A \in \{1,2\}] \\ &\ne \mathbb{E}[Y(a) | A = j], \quad j\in\{1,2\} \\ \end{aligned} \]

Note
  • Above we establish exchangeability assumptions, which are a partial requirement for identifying causal effects, i.e. it gets you to \(\mathbb{E}[Y(1)] = \mathbb{E}[Y|A=1]\) and \(\mathbb{E}[Y(0)] = \mathbb{E}[Y|A=0]\) whereby you are learning about the potential outcomes from the observed data.
  • All come about due to randomisation of \(R\).
  • Where independence holds (as above for the majority of the cases) knowing the conditioning variable tells you nothing about \(Y(a)\) (the term on the LHS of the conditioning).
  • For example, \(\mathbb{E}[Y(a)] = \mathbb{E}[Y(a)|R=0]\) tells us that the potential outcomes \(Y(a)\) in those who receive DAIR have the same distribution as in those who do not.
  • For \(\mathbb{E}[Y(a)] \ne \mathbb{E}[Y(a) | A = 1]\) and \(\mathbb{E}[Y(a)] \ne \mathbb{E}[Y(a) | A = 2]\) we are saying that the distribution of potential outcomes in those for whom one-stage is planned does not match the distribution in those for whom one-stage is not planned (and similarly for two-stage). This is by virtue of the fact that revision type is selected rather than randomised (see the sketch below).
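A rough empirical check of these statements (a sketch only; it assumes the generate_data_1 function defined in the Example section below, which retains both potential outcomes, y0 and y1):

set.seed(314)
D_chk <- generate_data_1()
# E[Y(1)] is (approximately) unchanged by conditioning on the randomised R ...
c(mean(D_chk$y1), mean(D_chk[r == 0]$y1), mean(D_chk[r == 1]$y1))
# ... but conditioning on the planned revision type changes the distribution of the
# potential outcome, because S is selected rather than randomised.
c(mean(D_chk$y1), mean(D_chk[s == 0]$y1), mean(D_chk[s == 1]$y1))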

The only randomised comparison we can make is \(R=1\) vs \(R=0\), but, given we want to eventually condition on which revision type is selected, we want to include terms for preferred revision type in the model. Assume logistic regression is the true model and specify

\[ \begin{align} \mathbb{E}[Y | R] &= Pr(Y = 1 | R) = \text{expit}(\alpha_0 + \alpha_1R) \\ \end{align} \tag{1}\]

\[ \begin{align} \mathbb{E}[Y | A] &= Pr(Y = 1 | A) = \text{expit}(\beta_0 + \beta_1\mathbb{I}(A=1) + \beta_2\mathbb{I}(A=2)) \\ \end{align} \tag{2}\]

Equivalently, state in terms of \(R\) and \(S\).

\[ \mathbb{E}[Y | R, S] = \text{expit}(\beta_0 + \beta_1R + \beta_2RS) \\ \tag{3}\]

Equation 1 targets the thing we actually want to compare, the log-odds ratio of revision versus DAIR. Equation 3 splits this out into one and two stage which we would need to combine to get the overall revision effect. In Equation 3, the \(\beta_1\) gives the effect of revision under a one-stage procedure and the \(\beta_2\) gives the increment on that term when a two-stage procedure is undertaken. Some weighted combination of these terms will give us a view of the aggregated effect of revision.

Note

What isn’t directly stated above is that the (primary?) reason we need to split \(R\) into one-stage and two-stage effects is so that we can incorporate the duration domain within a single model. The randomised interventions for duration are 12 weeks vs 6 weeks antibiotics for those receiving one-stage surgery and 12 weeks vs 7 days for those receiving two-stage surgery. If a patient received one-stage, the duration intervention parameters associated with two-stage are mostly irrelevant to estimating the log-odds of treatment success for this patient. Moreover, the response for the one-stage patient does not inform the two-stage duration intervention effects. To accommodate this, the one-stage and two-stage duration parameters enter the model independently, e.g.

\[ \mathbb{E}[Y | R, S] = \text{expit}(\beta_0 + \beta_1 \mathbb{I}(R = \text{one-stage} \land D = short) + \beta_2 \mathbb{I}(R = \text{two-stage} \land D = short)) \\ \]

However, this means that the duration-domain reference groups for the one-stage and two-stage duration effects would have the same log-odds of treatment success, and this is very unlikely to be the case. Thus, by splitting \(R\) into one-stage and two-stage effects we allow the reference groups for the one-stage and two-stage duration effects to vary.
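As a sketch of what that looks like in practice (a hypothetical formula, not the final model; the generate_data_2 function in the Duration section below constructs exactly these indicator columns, a1d1 for one-stage & short and a2d1 for two-stage & short):

D_dur <- generate_data_2()   # defined in the Duration section below
# The one-stage and two-stage duration effects enter via separate indicators, while
# r + r:s splits revision into one-stage/two-stage terms so the two duration
# reference groups (one-stage long, two-stage long) are allowed to differ.
fit_dur <- glm(y ~ r + r:s + a1d1 + a2d1, data = D_dur, family = binomial())
coef(fit_dur)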

These models could also adjust for \(S\), i.e. in addition to any actual effect of one/two stage revision, the patients for whom a two-stage would be preferred may differ from those for whom a one-stage is the preference. The interpretation of the model parameters then changes, of course.

\[ \mathbb{E}[Y | R, S] = \text{expit}(\beta_0 + \beta_1R + \beta_2RS + \beta_3S) \\ \tag{4}\]

We don’t really care about the difference due to \(S\), as the revision type effects may still be confounded by other factors anyway. Due to the randomisation, I think that both the version with and the version without \(S\) are targeting the same estimand for revision vs. DAIR (the distribution of \(S\) is the same, in expectation, amongst DAIR and revision patients, so it is not a confounder; obviously, though, \(S\) is known exactly to be \(A-1\) when \(R=1\)).

Note

I think what you are proposing for the \(\beta_3\) term (the effect of \(S\)) boils down to the following:

  1. the effect of \(S\), the clinical selection of one-stage vs two-stage, may be confounded and we can probably do very little about that
  2. however, the estimate we obtain for \(\beta_3\) would be the same in the group that received revision as in the group that did not, due to the fact that DAIR vs revision was randomised

Probably incorrectly, I think the selection effect is meaningless in the sense that receiving DAIR contradicts the possibility of a selection. It is analogous to a model for the number of children whereby you make an adjustment for (randomised) marriage and (self-selected) age at marriage. The main effect of age at marriage is excluded, on the basis that it has no counterpart in reality and therefore no way to inform the effect.

What is the counter-argument? Perhaps something along the lines of:

To justify the inclusion of \(\beta_3\) we argue that the married and non-married groups are balanced across age (and all other characteristics). Therefore, the married and unmarried groups are similar, and the effect for age at marriage estimated in the married group is portable to the group that was not randomised to marriage, in the counterfactual world where they had been.

Ok, I think I have convinced myself (if perhaps no-one else) that it makes sense.

For the second model above, in terms of \(R\) and \(S\), without adjustment for \(S\)

\[ \begin{aligned} \mathbb{E}[Y | R] &= \mathbb{E}[\mathbb{E}[Y | S, R] | R] \\ &= \sum_{s=0}^1 \text{expit}(\beta_0 + \beta_1R + \beta_2Rs)\mathbb{P}(S = s | R) \\ \mathbb{E}[Y|R =0] &= \text{expit}(\beta_0) \\ \mathbb{E}[Y|R = 1] &= \text{expit}(\beta_0 + \beta_1)\mathbb{P}(S=0|R=1) \\ & \quad \ + \text{expit}(\beta_0 + \beta_1 + \beta_2)\mathbb{P}(S=1|R=1) \end{aligned} \]

so the log-odds ratio for the marginal success probability for revision vs. DAIR is

\[ \begin{aligned} \ln\frac{\text{odds}(Y|R=1)}{\text{odds}(Y|R=0)} &= \text{logit}[\text{expit}(\beta_0 + \beta_1)\mathbb{P}(S=0|R=1) + \\ & \quad \quad \ \text{expit}(\beta_0 + \beta_1 + \beta_2)\mathbb{P}(S=1|R=1)] - \beta_0 \end{aligned} \tag{5}\]

We don’t know \(\mathbb{P}(S=s|R)\) so estimate it from the sample. Due to randomisation \(\mathbb{P}(S=s|R) = \mathbb{P}(S=s)\).

Note

This is just creating a standardised probability of treatment success under revision, based on a weighted average of the probability of treatment success in each of the selection groups (one-stage and two-stage), where the weights are formed from the (unknown) distribution of \(S\) (estimated from the sample). The standardised probability of treatment success under revision is then converted to the log-odds of treatment success, and the reference group (DAIR) log-odds of treatment success is subtracted to give the log-OR.

The final line in the above derivation is just saying that we can assume that the probability distribution of \(S\) is independent of treatment group membership and so can be estimated from the full sample rather than conditioning on \(R\). Note that \(\mathbb{P}(S=s)\) is not necessarily indicative of the probability distribution of \(S\) in the population because of our convenience sample, but this is true for all the inference here and in the majority of trials.

An alternative to the above is to consider the “average” conditional log-odds ratio rather than the odds ratio of the marginal probabilities.

\[ \mathbb{E}\left[\ln\frac{\text{odds}(Y|R=1,S)}{\text{odds}(Y|R=0,S)}\right] = \mathbb{E}[\beta_1 + \beta_2S] = \beta_1 + \beta_2\mathbb{E}[S] \tag{6}\]

Note

The expectation of \(S\) above is taken across the sample, but below the weighting is taken from the revision group. Would it not be preferable to use the mean derived from the full sample, or am I thinking about it incorrectly? (A small check follows below.)
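For what it’s worth, under randomisation the two candidate weights should only differ by sampling noise; a tiny hypothetical check (again using generate_data_1 from the Example section below):

set.seed(42)
D_w <- generate_data_1()
# weight from the full sample vs weight from the revision arm only
c(full_sample = mean(D_w$s), revision_arm = mean(D_w[r == 1]$s))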

If the model also adjusts for \(S\), then

\[ \begin{aligned} \mathbb{E}[Y | R] &= \mathbb{E}[\mathbb{E}[Y | S, R] | R] \\ &= \sum_{s=0}^1 \text{expit}(\beta_0 + \beta_1R + \beta_2Rs + \beta_3s)\mathbb{P}(S = s | R) \\ \mathbb{E}[Y|R = 0] &= \text{expit}(\beta_0)\mathbb{P}(S = 0 | R=0) + \text{expit}(\beta_0 + \beta_3)\mathbb{P}(S=1|R=0)\\ \mathbb{E}[Y|R = 1] &= \text{expit}(\beta_0 + \beta_1)\mathbb{P}(S = 0 | R = 1) + \text{expit}(\beta_0 + \beta_1 + \beta_2 + \beta_3)\mathbb{P}(S=1 | R = 1) \\ \ln\frac{\text{odds}(Y|R=1)}{\text{odds}(Y|R=0)} &= \text{logit}(\mathbb{E}[Y|R = 1]) - \text{logit}(\mathbb{E}[Y|R = 0]) \end{aligned} \]

With the introduction of a main effect for \(S\) we can still report on the average conditional log-odds ratio in the same form as previously:

\[ \mathbb{E}\left[\ln\frac{\text{odds}(Y|R=1,S)}{\text{odds}(Y|R=0,S)}\right] = \mathbb{E}[\beta_1 + \beta_2S] = \beta_1 + \beta_2\mathbb{E}[S] \]

i.e. with the same terms as without adjustment for \(S\). However, the treatment effect parameter estimates produced from the model that includes the main effect for \(S\) (and the interaction term) and the model that only has the interaction term are likely to be different.

Example

Herein I am just assuming \(n\approx \infty\), i.e. checking consistency.

Note

Consistency is simply about whether the estimator produces an estimate that gets closer to the true value as the sample size gets bigger; a consistent estimator does not negate the possibility of bias. For example, \(\frac{1}{N-1}\sum_i x_i\) is a consistent but biased estimator of the mean of a random vector, \(x\).
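A small illustration of that point (not part of the design itself): the estimator \(\frac{1}{N-1}\sum_i x_i\) over-estimates a positive mean at any finite \(N\), but the bias shrinks as \(N\) grows.

set.seed(1)
est <- function(N) sum(rnorm(N, mean = 5)) / (N - 1)   # biased but consistent for the mean (5)
# average estimate over repeated samples at increasing N; approaches 5 from above
sapply(c(10, 100, 1000, 100000), function(N) mean(replicate(500, est(N))))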

The default parameterisation for the data generating mechanism is to adopt the functional form from Equation 4, which includes terms for \(R\), \(S\) and an interaction between \(R\) and \(S\).

Generate data scenario 1
# Assume ~ infinite population as just checking consistency
# Precision will of course vary by approach at small sample sizes
generate_data_1 <- function(
    n = 1000000,
    f = function(r, s, x){-1 + s + 0.25 * r + 0.25 * r * s}) {
  s <- rbinom(n, 1, 0.7)
  r <- rbinom(n, 1, 0.5)
  x <- rbinom(n, 1, 0.25)
  a <- r * (s + 1)
  y0 <- rbinom(n, 1, plogis(f(0, s, x)))
  y1 <- rbinom(n, 1, plogis(f(1, s, x)))
  y <- (1 - r) * y0 + r * y1
  w <- mean(s) # Selection probability
  D <- data.table(r = r, x = x, s = s, a = a, y0 = y0, y1 = y1, y = y)[
    ,
    `:=`(
      r_fac = factor(r),
      a_fac = factor(a),
      s_fac = factor(s),
      a_cen = r * (a - 1 - mean(a[r == 1] - 1)),
      s_cen = s - mean(s)
    )
  ]
}
Simulate null effect
set.seed(123)
D <- generate_data_1(f = function(r, s, x){-1 + s})

# Eqn 1
fit1 <- glm(y ~ r, data = D, family = binomial())

# Eqn 3
fit2 <- glm(y ~ r + r:s, data = D, family = binomial())

# Eqn 6?
fit1s <- glm(y ~ r + s, data = D, family = binomial())

# Eqn 4
fit2s <- glm(y ~ r + r:s + s, data = D, family = binomial())
Note

In the use of “true” below, what is meant is the empirical log-odds ratio in the population (approximated by a very large sample). We estimate the effects directly from the data by calculating the difference in the log-odds of treatment success, within the strata of interest, between the treated and control groups.

The table just tries to line up the quantities like with like, as there were some minor differences in the naming.

Null effect quantities
w <- mean(D$a == 2) / mean(D$a %in% c(1, 2))
w_s1 <- mean(D[r == 0]$s == 1)
w_s2 <- mean(D[r == 1]$s == 1)

b1 <- unname(coef(fit1))
b2 <- unname(coef(fit2))
b1s <- unname(coef(fit1s))
b2s <- unname(coef(fit2s))

EY_R0_2 <- expit(b2[1])
EY_R1_2 <- (1 - w) * expit(b2[1] + b2[2]) + w * expit(b2[1] + b2[2] + b2[3])

EY_R0_1_S0 <- expit(b1s[1])
EY_R0_1_S1 <- expit(b1s[1] + b1s[3])
EY_R1_1_S0 <- expit(b1s[1] + b1s[2])
EY_R1_1_S1 <- expit(b1s[1] + b1s[2] + b1s[3])
EY_R0_1 <- (1 - w_s1) * EY_R0_1_S0 + w_s1 * EY_R0_1_S1
EY_R1_1 <- (1 - w_s2) * EY_R1_1_S0 + w_s2 * EY_R1_1_S1

EY_R0_2s_S0 <- expit(b2s[1])
EY_R0_2s_S1 <- expit(b2s[1] + b2s[3])
EY_R1_2s_S0 <- expit(b2s[1] + b2s[2]) # Pr(A = 2 | S = 0) = 0
EY_R1_2s_S1 <- expit(b2s[1] + b2s[2] + b2s[3] + b2s[4]) # Pr(A = 2 | S = 1) = 1
EY_R0_2s <- (1 - w_s1) * EY_R0_2s_S0 + w_s1 * EY_R0_2s_S1
EY_R1_2s <- (1 - w_s2) * EY_R1_2s_S0 + w_s2 * EY_R1_2s_S1

rbind(
  "True conditional (S = 0) log-odds ratio" =
    qlogis(mean(D[r == 1 & s == 0]$y)) - qlogis(mean(D[r == 0 & s == 0]$y)),
  "True conditional (S = 1) log-odds ratio" =
    qlogis(mean(D[r == 1 & s == 1]$y)) - qlogis(mean(D[r == 0 & s == 1]$y)),
  "True marginal log-odds ratio" = qlogis(mean(D[r == 1]$y)) - qlogis(mean(D[r == 0]$y)),
  "True weighted log-odds ratio" =
    (1 - w) * qlogis(mean(D[a == 1]$y)) + w * qlogis(mean(D[a == 2]$y)) -
      qlogis(mean(D[r == 0]$y)),
  "True average weighted log-odds ratio" =
    (1 - w) * (qlogis(mean(D[r == 1 & s == 0]$y)) - qlogis(mean(D[r == 0 & s == 0]$y))) +
      w * (qlogis(mean(D[r == 1 & s == 1]$y)) - qlogis(mean(D[r == 0 & s == 1]$y))),
  "fit1 marginal log-odds ratio" = b1[2],
  "fit2 marginal log-odds ratio" = qlogis(EY_R1_2) - qlogis(EY_R0_2),
  "fit2 conditional (weighted) log-odds ratio" = b2[2] + w * b2[3],
  "fit1s conditional log-odds ratio" = b1s[2], # Don't know what this targets
  "fit1s marginal log-odds ratio" = qlogis(EY_R1_1) - qlogis(EY_R0_1),
  "fit2s conditional (S = 0) log-odds ratio" = qlogis(EY_R1_2s_S0) - qlogis(EY_R0_2s_S0),
  "fit2s conditional (S = 1) log-odds ratio" = qlogis(EY_R1_2s_S1) - qlogis(EY_R0_2s_S1),
  "fit2s marginal log-odds ratio" = qlogis(EY_R1_2s) - qlogis(EY_R0_2s),
  "fit2s average (weighted) log-odds ratio" = b2s[2] + w * b2s[4]
)
                                                   [,1]
True conditional (S = 0) log-odds ratio    -0.003882742
True conditional (S = 1) log-odds ratio     0.003222299
True marginal log-odds ratio                0.002568068
True weighted log-odds ratio               -0.018727631
True average weighted log-odds ratio        0.001097800
fit1 marginal log-odds ratio                0.002568068
fit2 marginal log-odds ratio                0.002568068
fit2 conditional (weighted) log-odds ratio -0.018727631
fit1s conditional log-odds ratio            0.001436139
fit1s marginal log-odds ratio               0.002568068
fit2s conditional (S = 0) log-odds ratio   -0.003882742
fit2s conditional (S = 1) log-odds ratio    0.003222299
fit2s marginal log-odds ratio               0.002568068
fit2s average (weighted) log-odds ratio     0.001097800
Null effect quantities
d_out <- data.table(
  desc = c(
    "conditional (S = 0) log-OR",
    "conditional (S = 1) log-OR",
    "conditional log-OR (?)",
    "conditional (weighted) log-OR",
    "marginal log-OR",
    "weighted log-OR",
    "average (weighted) log-OR"
  ),
  true = c(
    qlogis(mean(D[r == 1 & s == 0]$y)) - qlogis(mean(D[r == 0 & s == 0]$y)),
    qlogis(mean(D[r == 1 & s == 1]$y)) - qlogis(mean(D[r == 0 & s == 1]$y)),
    NA,  
    NA,
    qlogis(mean(D[r == 1]$y)) - qlogis(mean(D[r == 0]$y)),
    (1 - w) * qlogis(mean(D[a == 1]$y)) + w * qlogis(mean(D[a == 2]$y)) -
      qlogis(mean(D[r == 0]$y)),
    (1 - w) * (qlogis(mean(D[r == 1 & s == 0]$y)) - qlogis(mean(D[r == 0 & s == 0]$y))) +
      w * (qlogis(mean(D[r == 1 & s == 1]$y)) - qlogis(mean(D[r == 0 & s == 1]$y)))
  ),
  fit_1 = c(
    NA, 
    NA,  
    NA,
    NA, 
    b1[2], 
    NA, 
    NA
  ),
  fit_2 = c(
    NA, 
    NA,
    NA, 
    b2[2] + w * b2[3], 
    qlogis(EY_R1_2) - qlogis(EY_R0_2), 
    NA,
    NA
  ),
  fit_1s = c(
    NA, 
    NA, 
    b1s[2],
    NA,
    qlogis(EY_R1_1) - qlogis(EY_R0_1), 
    NA,
    NA
  ),
  fit_2s = c(
    qlogis(EY_R1_2s_S0) - qlogis(EY_R0_2s_S0), 
    qlogis(EY_R1_2s_S1) - qlogis(EY_R0_2s_S1), 
    NA, 
    NA,
    qlogis(EY_R1_2s) - qlogis(EY_R0_2s),
    NA,
    b2s[2] + w * b2s[4]
  )
)
Tabulate effect quantities
gt_tbl <- d_out |> 
  gt(rowname_col = c("desc")) |> 
  cols_align(align = "right", columns = 2:6) |>  
  fmt_number(columns = everything(), decimals = 3) |>
  sub_missing(columns = everything(), missing_text = "") |>
  tab_options(table.font.size = "80%") |>
  tab_footnote(
    footnote = "y ~ r",
    locations = cells_column_labels(columns = fit_1)
    ) |>
  tab_footnote(
    footnote = "y ~ r + r:s",
    locations = cells_column_labels(columns = fit_2)
    ) |>
  tab_footnote(
    footnote = "y ~ r + s",
    locations = cells_column_labels(columns = fit_1s)
    ) |>
  tab_footnote(
    footnote = "y ~ r + r:s + s",
    locations = cells_column_labels(columns = fit_2s)
    ) |>
  tab_footnote(
    footnote = md("Should be labelled *conditional (weighted) log-OR*?"),
    locations = cells_stub(rows = c(
      "weighted log-OR"
    ))
  ) |>
  tab_footnote(
    footnote = md("log-OR for R term in model, but interpretation unclear"),
    locations = cells_stub(rows = c(
      "conditional log-OR (?)"))
  )
gt_tbl
                                   true  fit_1 [1]  fit_2 [2]  fit_1s [3]  fit_2s [4]
conditional (S = 0) log-OR       −0.004                                        −0.004
conditional (S = 1) log-OR        0.003                                         0.003
conditional log-OR (?) [5]                                          0.001
conditional (weighted) log-OR                          −0.019
marginal log-OR                   0.003      0.003      0.003       0.003       0.003
weighted log-OR [6]              −0.019
average (weighted) log-OR         0.001                                         0.001

[1] y ~ r
[2] y ~ r + r:s
[3] y ~ r + s
[4] y ~ r + r:s + s
[5] log-OR for R term in model, but interpretation unclear
[6] Should be labelled conditional (weighted) log-OR?
Table 1: Null effects
Simulate effect
set.seed(123)
D <- generate_data_1()
fit1 <- glm(y ~ r, data = D, family = binomial())
fit2 <- glm(y ~ r + r:s, data = D, family = binomial())
fit1s <- glm(y ~ r + s, data = D, family = binomial())
fit2s <- glm(y ~ r + r:s + s, data = D, family = binomial())
Note

Simulate from Equation 4 with a baseline probability of treatment success equal to 0.27 together with \(\beta_1 = 0.25\), \(\beta_2 = 0.25\) and \(\beta_3 = 1\) for the log-ORs associated with treatment, the interaction between treatment and selection and selection, respectively.

Effect quantities
w <- mean(D$a == 2) / mean(D$a %in% c(1, 2))
w_s1 <- mean(D[r == 0]$s == 1)
w_s2 <- mean(D[r == 1]$s == 1)

b1 <- unname(coef(fit1))
b2 <- unname(coef(fit2))
b1s <- unname(coef(fit1s))
b2s <- unname(coef(fit2s))

EY_R0_2 <- expit(b2[1])
EY_R1_2 <- (1 - w) * expit(b2[1] + b2[2]) + w * expit(b2[1] + b2[2] + b2[3])

EY_R0_1_S0 <- expit(b1s[1])
EY_R0_1_S1 <- expit(b1s[1] + b1s[3])
EY_R1_1_S0 <- expit(b1s[1] + b1s[2])
EY_R1_1_S1 <- expit(b1s[1] + b1s[2] + b1s[3])
EY_R0_1 <- (1 - w_s1) * EY_R0_1_S0 + w_s1 * EY_R0_1_S1
EY_R1_1 <- (1 - w_s2) * EY_R1_1_S0 + w_s2 * EY_R1_1_S1

EY_R0_2s_S0 <- expit(b2s[1])
EY_R0_2s_S1 <- expit(b2s[1] + b2s[3])
EY_R1_2s_S0 <- expit(b2s[1] + b2s[2]) # Pr(A = 2 | S = 0) = 0
EY_R1_2s_S1 <- expit(b2s[1] + b2s[2] + b2s[3] + b2s[4]) # Pr(A = 2 | S = 1) = 1
EY_R0_2s <- (1 - w_s1) * EY_R0_2s_S0 + w_s1 * EY_R0_2s_S1
EY_R1_2s <- (1 - w_s2) * EY_R1_2s_S0 + w_s2 * EY_R1_2s_S1

rbind(
  "True conditional (S = 0) log-odds ratio" =
    qlogis(mean(D[r == 1 & s == 0]$y)) - qlogis(mean(D[r == 0 & s == 0]$y)),
  "True conditional (S = 1) log-odds ratio" =
    qlogis(mean(D[r == 1 & s == 1]$y)) - qlogis(mean(D[r == 0 & s == 1]$y)),
  "True marginal log-odds ratio" = qlogis(mean(D[r == 1]$y)) - qlogis(mean(D[r == 0]$y)),
  "True weighted log-odds ratio" =
    (1 - w) * qlogis(mean(D[a == 1]$y)) + w * qlogis(mean(D[a == 2]$y)) -
      qlogis(mean(D[r == 0]$y)),
  "True average (weighted) log-odds ratio" =
    (1 - w) * (qlogis(mean(D[r == 1 & s == 0]$y)) - qlogis(mean(D[r == 0 & s == 0]$y))) +
      w * (qlogis(mean(D[r == 1 & s == 1]$y)) - qlogis(mean(D[r == 0 & s == 1]$y))),
  "fit1 marginal log-odds ratio" = b1[2],
  "fit2 marginal log-odds ratio" = qlogis(EY_R1_2) - qlogis(EY_R0_2),
  "fit2 conditional (weighted) log-odds ratio" = b2[2] + w * b2[3],
  "fit1s conditional log-odds ratio" = b1s[2], # Don't know what this targets
  "fit1s marginal log-odds ratio" = qlogis(EY_R1_1) - qlogis(EY_R0_1),
  "fit2s conditional (S = 0) log-odds ratio" = qlogis(EY_R1_2s_S0) - qlogis(EY_R0_2s_S0),
  "fit2s conditional (S = 1) log-odds ratio" = qlogis(EY_R1_2s_S1) - qlogis(EY_R0_2s_S1),
  "fit2s marginal log-odds ratio" = qlogis(EY_R1_2s) - qlogis(EY_R0_2s),
  "fit2s average (weighted) log-odds ratio" = b2s[2] + w * b2s[4]
)
                                                [,1]
True conditional (S = 0) log-odds ratio    0.2462550
True conditional (S = 1) log-odds ratio    0.4944007
True marginal log-odds ratio               0.4038167
True weighted log-odds ratio               0.4003765
True average (weighted) log-odds ratio     0.4202020
fit1 marginal log-odds ratio               0.4038167
fit2 marginal log-odds ratio               0.4038167
fit2 conditional (weighted) log-odds ratio 0.4003765
fit1s conditional log-odds ratio           0.4285063
fit1s marginal log-odds ratio              0.4038167
fit2s conditional (S = 0) log-odds ratio   0.2462550
fit2s conditional (S = 1) log-odds ratio   0.4944007
fit2s marginal log-odds ratio              0.4038167
fit2s average (weighted) log-odds ratio    0.4202020
Effect quantities
d_out <- data.table(
  desc = c(
    "conditional (S = 0) log-OR",
    "conditional (S = 1) log-OR",
    "conditional log-OR (?)",
    "conditional (weighted) log-OR",
    "marginal log-OR",
    "weighted log-OR",
    "average (weighted) log-OR"
  ),
  true = c(
    qlogis(mean(D[r == 1 & s == 0]$y)) - qlogis(mean(D[r == 0 & s == 0]$y)),
    qlogis(mean(D[r == 1 & s == 1]$y)) - qlogis(mean(D[r == 0 & s == 1]$y)),
    NA,  
    NA,
    qlogis(mean(D[r == 1]$y)) - qlogis(mean(D[r == 0]$y)),
    (1 - w) * qlogis(mean(D[a == 1]$y)) + w * qlogis(mean(D[a == 2]$y)) -
      qlogis(mean(D[r == 0]$y)),
    (1 - w) * (qlogis(mean(D[r == 1 & s == 0]$y)) - qlogis(mean(D[r == 0 & s == 0]$y))) +
      w * (qlogis(mean(D[r == 1 & s == 1]$y)) - qlogis(mean(D[r == 0 & s == 1]$y)))
  ),
  fit_1 = c(
    NA, 
    NA,   
    NA,
    NA, 
    b1[2], 
    NA, 
    NA
  ),
  fit_2 = c(
    NA, 
    NA, 
    NA, 
    b2[2] + w * b2[3], 
    qlogis(EY_R1_2) - qlogis(EY_R0_2), 
    NA,
    NA
  ),
  fit_1s = c(
    NA, 
    NA, 
    b1s[2],
    NA,
    qlogis(EY_R1_1) - qlogis(EY_R0_1), 
    NA,
    NA
  ),
  fit_2s = c(
    qlogis(EY_R1_2s_S0) - qlogis(EY_R0_2s_S0), 
    qlogis(EY_R1_2s_S1) - qlogis(EY_R0_2s_S1),  
    NA, 
    NA,
    qlogis(EY_R1_2s) - qlogis(EY_R0_2s),
    NA,
    b2s[2] + w * b2s[4]
  )
)
Tabulate effect quantities
gt_tbl <- d_out |> 
  gt(rowname_col = c("desc")) |> 
  cols_align(align = "right", columns = 2:6) |>  
  fmt_number(columns = everything(), decimals = 3) |>
  sub_missing(columns = everything(), missing_text = "") |>
  tab_options(table.font.size = "80%") |>
  tab_footnote(
    footnote = "y ~ r",
    locations = cells_column_labels(columns = fit_1)
    ) |>
  tab_footnote(
    footnote = "y ~ r + r:s",
    locations = cells_column_labels(columns = fit_2)
    ) |>
  tab_footnote(
    footnote = "y ~ r + s",
    locations = cells_column_labels(columns = fit_1s)
    ) |>
  tab_footnote(
    footnote = "y ~ r + r:s + s",
    locations = cells_column_labels(columns = fit_2s)
    ) |>
  tab_footnote(
    footnote = md("Should be labelled *conditional (weighted) log-OR*?"),
    locations = cells_stub(rows = c(
      "weighted log-OR"))) |>
  tab_footnote(
    footnote = md("log-OR for R term in model, but interpretation unclear"),
    locations = cells_stub(rows = c(
      "conditional log-OR (?)"))
  )

gt_tbl
                                   true  fit_1 [1]  fit_2 [2]  fit_1s [3]  fit_2s [4]
conditional (S = 0) log-OR        0.246                                         0.246
conditional (S = 1) log-OR        0.494                                         0.494
conditional log-OR (?) [5]                                          0.429
conditional (weighted) log-OR                           0.400
marginal log-OR                   0.404      0.404      0.404       0.404       0.404
weighted log-OR [6]               0.400
average (weighted) log-OR         0.420                                         0.420

[1] y ~ r
[2] y ~ r + r:s
[3] y ~ r + s
[4] y ~ r + r:s + s
[5] log-OR for R term in model, but interpretation unclear
[6] Should be labelled conditional (weighted) log-OR?
Table 2: Effects

So in the basic case, as \(n\to\infty\), these models are all in a sense equivalent, in that they are consistent for the treatment effects of interest.

More generally, we can just use g-computation rather than analytic expressions.

Code
m <- rbind(
  "G-comp marginal mean" =
    c(avg_comparisons(fit2, variables = "r", comparison = "lnoravg")$estimate, NA_real_),
  "G-comp marginal mean (adjust s)" =
    c(avg_comparisons(fit2s, variables = "r", comparison = "lnoravg")$estimate, NA_real_),
  "G-comp average log-odds" =
    c(avg_comparisons(fit2, variables = "r", comparison = "lnor")$estimate, NA_real_),
  "G-comp average log-odds (adjust s)" = 
    c(avg_comparisons(fit2s, variables = "r", comparison = "lnor")$estimate, NA_real_),
  "G-comp conditional (S) average log-odds" =
    avg_comparisons(fit2, variables = "r", comparison = "lnor", by = "s")$estimate,
  "G-comp conditional (S) average log-odds (adjust s)" =
    avg_comparisons(fit2s, variables = "r", comparison = "lnor", by = "s")$estimate
)
colnames(m) <- c("S = 0", "S = 1")
m
                                                        S = 0     S = 1
G-comp marginal mean                                0.4030520        NA
G-comp marginal mean (adjust s)                     0.4024538        NA
G-comp average log-odds                             0.3995870        NA
G-comp average log-odds (adjust s)                  0.4200453        NA
G-comp conditional (S) average log-odds            -0.4764725 0.7744050
G-comp conditional (S) average log-odds (adjust s)  0.2462550 0.4944007
Note

I think that the above are basically making predictions for the comparison of interest at each of the rows in the data set and then averaging to give a marginalised view.

The definitional differences between lnoravg and lnor amount to:

lnor: \(hi, lo) log((hi/(1 - hi))/(lo/(1 - lo)))

vs.

lnoravg: \(hi, lo) log((mean(hi)/(1 - mean(hi)))/(mean(lo)/(1 - mean(lo))))

so (I think, but need to confirm) one does the calculation on every row and then averages, whereas the other averages the predictions first.
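A minimal manual check of that reading (my own sketch, using only predict() rather than the package internals); it should closely reproduce the lnor and lnoravg results above for the revision contrast from fit2:

# hypothetical by-hand version of the two comparisons for r, using fit2 and D from above
D_hi <- copy(D)[, r := 1]   # everyone set to revision
D_lo <- copy(D)[, r := 0]   # everyone set to DAIR
p_hi <- predict(fit2, newdata = D_hi, type = "response")
p_lo <- predict(fit2, newdata = D_lo, type = "response")
# lnor: row-wise log-odds ratios, then average
mean(qlogis(p_hi) - qlogis(p_lo))
# lnoravg: average the predicted probabilities first, then form a single log-odds ratio
qlogis(mean(p_hi)) - qlogis(mean(p_lo))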

Covariate

Suppose we introduce a covariate \(X\) because it’s predictive of the outcome, e.g. sex. Our model, which does not adjust for \(S\), is

\[ \begin{aligned} \mathbb{E}[Y|R,S,X] &= \text{expit}(\beta_0 + \beta_1R + \beta_2RS + \beta_3X) \\ \mathbb{E}[Y|R,X] &= \mathbb{E}[\mathbb{E}[Y|R,S,X]|R,X] \\ &= \sum_{s=0}^1 \text{expit}(\beta_0 + \beta_1R + \beta_2Rs + \beta_3X)\mathbb{P}(S=s|R,X) \\ \mathbb{E}[Y|R=0,X] &= \text{expit}(\beta_0 + \beta_3X) \\ \mathbb{E}[Y|R=1,X] &= \text{expit}(\beta_0 + \beta_1 + \beta_3X)\mathbb{P}(S=0|R=1,X) \\ &\quad \ + \text{expit}(\beta_0 + \beta_1 + \beta_2 + \beta_3X)\mathbb{P}(S=1|R=1,X) \end{aligned} \]

The conditional (on \(X\)) log-odds ratio of marginal success probability for revision versus DAIR depends on the value of \(X\) (i.e. is not the same effect for every \(X=x\)) and cannot be simplified. It is

\[ \text{logit}(\mathbb{E}[Y|R=1,X]) - (\beta_0 + \beta_3X). \]

By marginalising over revision type (which is necessary for the comparison we want), we lose our one-number summary. There is no way to avoid that, other than perhaps fitting separate models for surgery and duration.

To maintain a one-number summary, an alternative is again to consider the average conditional log-odds ratio

\[ \mathbb{E}\left[\ln\frac{\text{odds}(Y|R=1,S,X)}{\text{odds}(Y|R=0,S,X)}\right] = \beta_1 + \beta_2\mathbb{E}[S]. \]

If \(S\) has an effect, say in truth,

\[ \begin{aligned} \mathbb{E}[Y|S,R,X] &= \text{expit}(\beta_0 + \beta_1R + \beta_2RS + \beta_3S + \beta_4X) \\ \mathbb{E}[Y|R,X] &= \mathbb{E}[\mathbb{E}[Y|S,R,X]|R,X] \\ &= \sum_{s=0}^1 \mathbb{E}[Y|S=s,R,X]\mathbb{P}(S=s|R,X) \end{aligned} \]

Then if our model does condition on \(S\),

\[ \begin{aligned} \mathbb{E}[Y|R,X] &= \mathbb{E}[\mathbb{E}[Y|R,X,S]|R,X] \\ &= \sum_{s=0}^1 \text{expit}(\beta_0 + \beta_1R + \beta_2Rs + \beta_3s+\beta_4X) \mathbb{P}(S=s|R,X) \\ \mathbb{E}[Y|R=0,X] &= \text{expit}(\beta_0 + \beta_4X)\mathbb{P}(S=0|R=0,X) + \text{expit}(\beta_0 + \beta_3 + \beta_4X)\mathbb{P}(S=1|R=0,X) \\ \mathbb{E}[Y|R=1,X] &= \text{expit}(\beta_0 + \beta_1 + \beta_4X)\mathbb{P}(S=0|R=1,X) + \text{expit}(\beta_0 + \beta_1 + \beta_2 + \beta_3 + \beta_4X)\mathbb{P}(S=1|R=1,X) \\ \ln\frac{\text{odds}(Y|R=1,X)}{\text{odds}(Y|R=0,X)} &= \text{logit}(\mathbb{E}[Y|R=1,X]) - \text{logit}(\mathbb{E}[Y|R=0,X]) \end{aligned} \]

Due to randomisation, \(\mathbb{P}(S=s|R,X) = \mathbb{P}(S=s|X)\).

The model without adjustment for \(S\) assumes

\[ \begin{aligned} \mathbb{E}[Y|R,S,X] &= \text{expit}(\alpha_0 + \alpha_1R + \alpha_2RS + \alpha_3X) \\ &= \sum_{s=0}^1 \text{expit}(\beta_0 + \beta_1R + \beta_2Rs + \beta_3s + \beta_4X)\mathbb{P}(S=s|R,X) \end{aligned} \]

Example

Note

By definition, \(x\) has a 25% chance of occurrence in the sample data.

Generate data with covariate
set.seed(6124)
D <- generate_data_1(
  f = function(r, s, x){ -1 + s + x + 0.25 * r + 0.25 * r * s}
)
fit1 <- glm(y ~ r + x, data = D, family = binomial())
fit2 <- glm(y ~ r + r:s + x, data = D, family = binomial())
fit1s <- glm(y ~ r + s + x, data = D, family = binomial())
fit2s <- glm(y ~ r + s + r:s + x, data = D, family = binomial())

Using G-computation to marginalise over \(S\) and \(X\).

Effect quantities
tt <- cbind(
  qlogis(mean(D[r == 1]$y)) - qlogis(mean(D[r == 0]$y)),
  NA,
  qlogis(mean(D[r == 1 & x == 0]$y)) - qlogis(mean(D[r == 0 & x == 0]$y)),
  qlogis(mean(D[r == 1 & x == 1]$y)) - qlogis(mean(D[r == 0 & x == 1]$y))
)
rownames(tt) <- "True"
m1 <- rbind(
  avg_comparisons(fit1, variables = "r", comparison = "lnoravg")$estimate,
  avg_comparisons(fit2, variables = "r", comparison = "lnoravg")$estimate,
  avg_comparisons(fit1s, variables = "r", comparison = "lnoravg")$estimate,
  avg_comparisons(fit2s, variables = "r", comparison = "lnoravg")$estimate
)
m2 <- rbind(
  avg_comparisons(fit1, variables = "r", comparison = "lnor")$estimate,
  avg_comparisons(fit2, variables = "r", comparison = "lnor")$estimate,
  avg_comparisons(fit1s, variables = "r", comparison = "lnor")$estimate,
  avg_comparisons(fit2s, variables = "r", comparison = "lnor")$estimate
)
m3 <- rbind(
  avg_comparisons(fit1, variables = "r", comparison = "lnoravg", by = "x")$estimate,
  avg_comparisons(fit2, variables = "r", comparison = "lnoravg", by = "x")$estimate,
  avg_comparisons(fit1s, variables = "r", comparison = "lnoravg", by = "x")$estimate,
  avg_comparisons(fit2s, variables = "r", comparison = "lnoravg", by = "x")$estimate
)
m <- cbind(m1, m2, m3)
cols <- c("Marginal log-odds ratio", "Average log-odds ratio", "Conditional (X = 0)", "Conditional (X = 1)")
colnames(m) <- cols
rownames(m) <- c("fit1", "fit2", "fit1s", "fit2s")
round(rbind(tt, m), 3)
      Marginal log-odds ratio Average log-odds ratio Conditional (X = 0)
True                    0.376                     NA               0.397
fit1                    0.377                  0.391               0.391
fit2                    0.378                  0.405               0.408
fit1s                   0.378                  0.418               0.393
fit2s                   0.378                  0.416               0.398
      Conditional (X = 1)
True                0.367
fit1                0.391
fit2                0.334
fit1s               0.392
fit2s               0.371
Tabulate effect quantities
d_out <- data.table(t(round(rbind(tt, m), 3)))
d_out[, desc := cols]
setcolorder(d_out, "desc")



gt_tbl <- d_out |> 
  gt(rowname_col = c("desc")) |> 
  cols_align(align = "right", columns = 2:6) |>  
  fmt_number(columns = everything(), decimals = 3) |>
  sub_missing(columns = everything(), missing_text = "") |>
  tab_options(table.font.size = "80%") |>
  tab_footnote(
    footnote = "y ~ r + x",
    locations = cells_column_labels(columns = fit1)
    ) |>
  tab_footnote(
    footnote = "y ~ r + r:s + x",
    locations = cells_column_labels(columns = fit2)
    ) |>
  tab_footnote(
    footnote = "y ~ r + s + x",
    locations = cells_column_labels(columns = fit1s)
    ) |>
  tab_footnote(
    footnote = "y ~ r + r:s + s + x",
    locations = cells_column_labels(columns = fit2s)
    ) 

gt_tbl
                          True  fit1 [1]  fit2 [2]  fit1s [3]  fit2s [4]
Marginal log-odds ratio  0.376     0.377     0.378      0.378      0.378
Average log-odds ratio             0.391     0.405      0.418      0.416
Conditional (X = 0)      0.397     0.391     0.408      0.393      0.398
Conditional (X = 1)      0.367     0.391     0.334      0.392      0.371

[1] y ~ r + x
[2] y ~ r + r:s + x
[3] y ~ r + s + x
[4] y ~ r + r:s + s + x
Table 3: G-computation estimates in presence of prognostic covariate

Unmeasured Confounder

The above hides some complexity because we assume everything is correctly specified. Suppose we introduce some unmeasured factor which influences which patients are preferred for a given revision type. Consider the following:

  • R: randomised surgery - 0: DAIR, 1: revision
  • S: preferred revision - 0: one-stage, 1: two-stage
  • A: allocated surgery - 0: DAIR, 1: one-stage, 2: two-stage
  • Y: treatment success - 0: no, 1: yes
  • Z: unmeasured factor, patient attributes/type - continuous

We assume \(Z\) is some immeasurable combination of factors which partly determines a patient’s risk of failure. We also think that this \(Z\) partly determines the surgeon’s choice of one/two stage. Say patients with higher values of \(Z\) are less likely to have successful treatment. However, the surgeon has some knowledge/experience/expertise/intuition which means that they are more likely to prefer a two-stage revision for patients with higher values of \(Z\), as they expect those types of patients will have better outcomes under two-stage. The allocated treatment and the underlying patient risk determine the patient’s outcome, \(Y\).

%%{
  init:{
    "flowchart":{"htmlLabels": "true"},
    "securityLevel": "loose",
    "theme": "base"
}}%%
flowchart LR
  R --> A --> Y
  S --> A
  Z(Z) --> S & A & Y
  US((U<sub>S</sub>)) -.-> S
  UR((U<sub>R</sub>)) -.-> R
  UY((U<sub>Y</sub>)) -.-> Y
Figure 2: Scenario 2, the \(U\) denote independent exogenous variables

Given the randomisation, this does not really change anything, except making explicit that differences between one/two stage may just be due to confounding rather than an effect of treatment. We can’t tell which without adjusting for all confounders.
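To convince myself, a hedged sketch (not part of the code above) with a latent \(z\) that drives both the selection of revision type and the outcome, but with no true effect of revision type itself; the randomised revision-vs-DAIR comparison behaves as expected, whereas the within-arm one-stage vs two-stage contrast picks up the confounding:

generate_data_z <- function(n = 1e6) {
  z <- rnorm(n)                               # unmeasured patient attributes
  s <- rbinom(n, 1, plogis(-0.5 + z))         # two-stage preferred at higher z
  r <- rbinom(n, 1, 0.5)                      # randomised DAIR vs revision
  # higher z -> lower success; revision adds 0.25 on the log-odds scale; no true S effect
  y <- rbinom(n, 1, plogis(-1 - 0.5 * z + 0.25 * r))
  data.table(z = z, s = s, r = r, y = y)
}
set.seed(99)
Dz <- generate_data_z()
# Randomised comparison: marginal log-OR for revision vs DAIR (attenuated relative to the
# conditional 0.25 because of non-collapsibility over z, but not confounded)
qlogis(mean(Dz[r == 1]$y)) - qlogis(mean(Dz[r == 0]$y))
# Selected comparison: two-stage vs one-stage within the revision arm looks "harmful"
# purely through confounding by z, despite S having no effect in this simulation
qlogis(mean(Dz[r == 1 & s == 1]$y)) - qlogis(mean(Dz[r == 1 & s == 0]$y))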

Checkpoint

So, perhaps the easiest quantity to consider is the average conditional log-odds, i.e.

\[ \mathbb{E}\left[\ln\frac{\text{odds}(Y|R=1,S,X,...)}{\text{odds}(Y|R=0,S,X,...)}\right] = \beta_1 + \beta_2\mathbb{E}[S]. \]

Is this sufficiently meaningful?

Note

I think what you are saying is to adopt Equation 4 and then report our effect estimate for revision at the sample mean of observed selection, which in this case fully characterises the distribution anyway.

The terminology used for effects seems to vary a bit throughout, but my label would probably be something more explicit, e.g. conditional log-OR evaluated at the mean selection, whereas I think it has previously been referred to as conditional (weighted) log-OR and average (weighted) log-OR, probably because that is how those quantities were computed.

Duration

Duration, \(D\), is randomised; however, the duration options depend upon assignment to revision, \(R\), and the chosen revision type, \(S\). So it is random conditional on \(R\) and \(S\). Nothing else alters the distribution of \(D\). We expect that duration has an effect on the outcome.

Below we just use \(D=0\) for long and \(D=1\) for short, but note that the meaning of these is conditional on \(R/S\) (i.e. short for one-stage is different to short for two-stage, and duration only applies to revision).

%%{
  init:{
    "flowchart":{"htmlLabels": "true"},
    "securityLevel": "loose",
    "theme": "base"
}}%%
flowchart LR
  R --> A --> Y
  S --> A & D
  R --> D
  D --> Y
  UD((U<sub>D</sub>)) -.-> D
  US((U<sub>S</sub>)) -.-> S
  UR((U<sub>R</sub>)) -.-> R
  UY((U<sub>Y</sub>)) -.-> Y
Figure 3: Scenario 3, the \(U\) denote independent exogenous variables
Note

There might be alternative representations of the above and also the potential for direct and indirect effects of \(S\), see below:

%%{
  init:{
    "flowchart":{"htmlLabels": "true"},
    "securityLevel": "loose",
    "theme": "base"
}}%%
flowchart LR
  R --> A --> Y
  S --> A
  S --> Y
  A --> D
  D --> Y
  UD((U<sub>D</sub>)) -.-> D
  US((U<sub>S</sub>)) -.-> S
  UR((U<sub>R</sub>)) -.-> R
  UY((U<sub>Y</sub>)) -.-> Y
%%{
  init:{
    "flowchart":{"htmlLabels": "true"},
    "securityLevel": "loose",
    "theme": "base"
}}%%
flowchart LR
  R --> D
  S --> D
  S --> Y
  D --> Y
  UD((U<sub>D</sub>)) -.-> D
  US((U<sub>S</sub>)) -.-> S
  UR((U<sub>R</sub>)) -.-> R
  UY((U<sub>Y</sub>)) -.-> Y
Figure 4
Figure 5

Now there are two interventions: \(R\), which has unknown downstream effects partly due to the unknown revision type, \(S\), which is selected; and \(D\), which is randomised.

The simplest approach is to analyse these separately. First, restrict the analysis to those patients who were assigned to one-stage, giving an RCT for duration embedded in that subset. Then, restrict the analysis to only those assigned to two-stage, giving an RCT for duration embedded in that subset (as sketched below). However, we would like to have a joint model so that other effects can be shared (other domains/site/surgeon/age/whatever else). In the joint model, the duration effect needs to be conditional on revision type.
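A sketch of the “analyse separately” option (assuming a dataset like the one produced by generate_data_2 below, with a for allocated surgery and d for duration):

D_sep <- generate_data_2()
# Embedded duration RCT within each revision type, analysed separately
fit_one <- glm(y ~ d, data = D_sep[a == 1], family = binomial())  # one-stage: 12 weeks vs 6 weeks
fit_two <- glm(y ~ d, data = D_sep[a == 2], family = binomial())  # two-stage: 12 weeks vs 7 days
rbind(one_stage = coef(fit_one)["d"], two_stage = coef(fit_two)["d"])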

Say the true model were something like

\[ \mathbb{E}[Y|R,S,D] = \text{expit}(\beta_0 + \beta_1S + \beta_2R + \beta_3RS + \beta_4RD + \beta_5RDS) \tag{7}\]

so

  • \(\beta_2\) is the shift associated with revision and long duration (assuming long duration is the reference group)
  • \(\beta_3\) the additional shift associated with two-stage long duration,
  • \(\beta_4\) the relative shift for short duration given revision,
  • \(\beta_5\) the relative shift for short duration given two-stage.

We might choose to set up the design matrix so that it is orthonormal, so that a priori we don’t assign more uncertainty to a specific revision type. E.g.

\[ \mathbb{E}[Y|R,S] = \text{expit}(\beta_0 + \beta_1(S - 0.5) + \beta_2R + \beta_3R(S-0.5) + \beta_4R(D-0.5) + \beta_5R(D-0.5)(S-0.5)) \]

Note

Is the above missing a \(D\) from the conditioning?

but for simplicity, I am not doing this here.
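For reference, a sketch of what the centred parameterisation would look like (hypothetical; generate_data_2 below already carries s_ort = s - 0.5, and d is centred inline):

D_c <- generate_data_2()
fit_cen <- glm(
  y ~ s_ort + r + r:s_ort + r:I(d - 0.5) + r:s_ort:I(d - 0.5),
  data = D_c, family = binomial()
)

This is only a reparameterisation spanning the same column space as the uncentred version, so the fitted probabilities are unchanged; only the interpretation (and any priors) on the individual coefficients changes.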

From the above, and with a focus on the surgical domain, we could compare (randomised comparison) any of:

  1. revision long vs. DAIR
  2. revision short vs. DAIR, but no other combinations of revision.
  3. one-stage short + two-stage long vs. DAIR
  4. two-stage short + one-stage long vs. DAIR

Here (1) and (2) are likely the most relevant comparisons of interest, but nothing precludes the comparisons stated in (3) and (4). The key point is the explicit statement of the duration type at which the comparison is made, which again is averaged over the empirical distribution of surgical procedure type (one-stage/two-stage).

However, we always need to marginalise over \(S\) (selection/plan):

\[ \begin{aligned} \mathbb{E}[Y|R=0] &= \sum_{s=0}^1 \text{expit}(\beta_0 + \beta_1s)\mathbb{P}(S=s|R=0) \\ \mathbb{E}[Y|R=1,D=0] &= \sum_{s=0}^1 \text{expit}(\beta_0 + \beta_1s + \beta_2 + \beta_3s)\mathbb{P}(S=s|R=1,D=0) \\ \mathbb{E}[Y|R=1,D=1] &= \sum_{s=0}^1 \text{expit}(\beta_0 + \beta_1s + \beta_2 + \beta_3s + \beta_4 + \beta_5s)\mathbb{P}(S=s|R=1,D=1) \end{aligned} \]

Again, can alternatively consider the average conditional log-odds ratio

\[ \begin{aligned} \mathbb{E}\left[\ln\frac{\text{odds}(Y|R=1,S,D)}{\text{odds}(Y|R=0,S,D)}|D_0,D_1\right] &= \beta_2+\beta_3\mathbb{E}[S] + \beta_4D_0 + \beta_5D_1\mathbb{E}[S] \end{aligned} \]

where I’ve made it explicit that \(D_0\) (one-stage short) and \(D_1\) (two-stage short) may differ. However, in practice, these would likely be set to the same level.
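A trivial helper to make the bookkeeping explicit (hypothetical; beta indexed in the order of Equation 7, with \(\mathbb{E}[S]\) estimated by the sample mean of \(S\)):

avg_cond_lor <- function(beta, s_bar, d0 = 0, d1 = 0) {
  # beta = c(b0, b1, ..., b5) as in Equation 7
  beta[3] + beta[4] * s_bar + beta[5] * d0 + beta[6] * d1 * s_bar
}
# e.g. comparison (1), revision (long) vs DAIR:  avg_cond_lor(beta, mean(D$s), d0 = 0, d1 = 0)
#      comparison (2), revision (short) vs DAIR: avg_cond_lor(beta, mean(D$s), d0 = 1, d1 = 1)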

The default data generating mechanism has the functional form:

\[ \mathbb{E}[Y|R,S,D,X] = \text{expit}(\beta_0 + \beta_1S + \beta_2R + \beta_3RS + \beta_4RD + \beta_5RSD + \beta_6X) \]

specified with non-zero effects on all terms.

Generate duration data
generate_data_2 <- function(
    n = 1000000,
    f = function(x, r, s, d){ -1 + x + s + r + 0.5 * r * s - 0.5 * r * d - 0.25 * r * s * d}) {
  x <- rbinom(n, 1, 0.25)
  s <- rbinom(n, 1, 0.7)
  r <- rbinom(n, 1, 0.5)
  a <- r * (s + 1)
  d <- as.numeric((a > 0) * rbinom(n, 1, 0.5))
  y0 <- rbinom(n, 1, plogis(f(x, 0, s, 0)))
  y10 <- rbinom(n, 1, plogis(f(x, 1, s, 0)))
  y11 <- rbinom(n, 1, plogis(f(x, 1, s, 1)))
  y <- (1 - r) * y0 + r * ((1 - d) * y10 + d * y11)
  D <- data.table(x = x, r = r, s = s, a = a, d = d, y0 = y0, y10 = y10, y11 = y11, y = y)[
    ,
    `:=`(
      a1d1 = as.numeric(a == 1 & d == 1),
      a2d1 = as.numeric(a == 2 & d == 1),
      r_fac = factor(r),
      a_fac = factor(a),
      s_fac = factor(s),
      a_cen = r * (a - 1 - mean(a[r == 1] - 1)),
      s_cen = s - mean(s),
      s_ort = s - 0.5
    )
  ]
}
set.seed(1246)
D <- generate_data_2(f = function(x, r, s, d){ -1 + s + x})
fit2 <- glm(y ~ x + r + r:s + r:d + r:s:d, data = D, family = binomial())
fit2s <- glm(y ~ x + s + r + r:s + r:d + r:s:d, data = D, family = binomial())
Note

In the following examples estimation under null effects is considered wherein the true data generation mechanism was \(\mathbb{E}[Y|S,X] = \text{expit}(-1 + S + X)\). That is, where there are no treatment effects in either the surgical or duration domains.

The estimates use G-computation. Specifically, all of the following average over all terms bar the comparison of interest.

Revision vs. DAIR
# Revision vs. DAIR
rbind(
  avg_comparisons(fit2, variables = "r", comparison = "lnoravg"),
  avg_comparisons(fit2s, variables = "r", comparison = "lnoravg")
)

 Term              Contrast Estimate Std. Error     z Pr(>|z|)   S   2.5 %
    r ln(odds(1) / odds(0)) -0.00856    0.00410 -2.09   0.0370 4.8 -0.0166
    r ln(odds(1) / odds(0)) -0.00836    0.00406 -2.06   0.0396 4.7 -0.0163
    97.5 %
 -0.000515
 -0.000400

Columns: term, contrast, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted 
Type:  response 
Revision (long) vs. DAIR and Revision (short) vs. DAIR
# Revision (long) vs. DAIR and Revision (short) vs. DAIR
rbind(
  avg_comparisons(fit2, variables = "r", by = "d", comparison = "lnoravg"),
  avg_comparisons(fit2s, variables = "r", by = "d", comparison = "lnoravg")
)

 Term              Contrast d Estimate Std. Error     z Pr(>|z|)   S   2.5 %
    r ln(odds(1) / odds(0)) 0 -0.00864    0.00473 -1.83   0.0676 3.9 -0.0179
    r ln(odds(1) / odds(0)) 1 -0.00832    0.00473 -1.76   0.0785 3.7 -0.0176
    r ln(odds(1) / odds(0)) 0 -0.00863    0.00469 -1.84   0.0658 3.9 -0.0178
    r ln(odds(1) / odds(0)) 1 -0.00755    0.00469 -1.61   0.1074 3.2 -0.0167
   97.5 %
 0.000627
 0.000949
 0.000562
 0.001641

Columns: term, contrast, d, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 
Type:  response 
Conditional on X
# Conditional on X
rbind(
  avg_comparisons(fit2, variables = "r", by = c("d", "x"), comparison = "lnoravg"),
  avg_comparisons(fit2s, variables = "r", by = c("d", "x"), comparison = "lnoravg")
)

 Term              Contrast d x Estimate Std. Error      z Pr(>|z|)    S
    r ln(odds(1) / odds(0)) 0 0  0.00267    0.00493  0.541   0.5884  0.8
    r ln(odds(1) / odds(0)) 0 1 -0.04726    0.00496 -9.525   <0.001 69.0
    r ln(odds(1) / odds(0)) 1 0  0.00270    0.00493  0.547   0.5842  0.8
    r ln(odds(1) / odds(0)) 1 1 -0.04602    0.00496 -9.277   <0.001 65.6
    r ln(odds(1) / odds(0)) 0 0 -0.00895    0.00490 -1.829   0.0675  3.9
    r ln(odds(1) / odds(0)) 0 1 -0.00909    0.00494 -1.841   0.0656  3.9
    r ln(odds(1) / odds(0)) 1 0 -0.00822    0.00489 -1.680   0.0930  3.4
    r ln(odds(1) / odds(0)) 1 1 -0.00667    0.00494 -1.350   0.1771  2.5
    2.5 %    97.5 %
 -0.00700  0.012332
 -0.05699 -0.037538
 -0.00696  0.012360
 -0.05575 -0.036301
 -0.01855  0.000643
 -0.01878  0.000588
 -0.01782  0.001370
 -0.01635  0.003014

Columns: term, contrast, d, x, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 
Type:  response 
Short vs Long (one-stage) and short vs. long (two-stage)
# Short vs Long (one-stage) and short vs. long (two-stage)
rbind(
  avg_comparisons(fit2, variables = "d", by = "s", comparison = "lnoravg"),
  avg_comparisons(fit2s, variables = "d", by = "s", comparison = "lnoravg")
)

 Term              Contrast s Estimate Std. Error      z Pr(>|z|)   S    2.5 %
    d ln(odds(1) / odds(0)) 0  0.00603    0.00491  1.228    0.220 2.2 -0.00359
    d ln(odds(1) / odds(0)) 1 -0.00174    0.00330 -0.528    0.597 0.7 -0.00821
    d ln(odds(1) / odds(0)) 0  0.00664    0.00538  1.234    0.217 2.2 -0.00390
    d ln(odds(1) / odds(0)) 1 -0.00176    0.00333 -0.527    0.598 0.7 -0.00829
  97.5 %
 0.01565
 0.00472
 0.01718
 0.00478

Columns: term, contrast, s, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 
Type:  response 
Short vs long conditional on x
# Short vs long conditional on x
avg_comparisons(fit2, variables = "d", by = c("s", "x"), comparison = "lnoravg")

 Term              Contrast s x Estimate Std. Error      z Pr(>|z|)   S
    d ln(odds(1) / odds(0)) 0 0  0.00599    0.00488  1.228    0.220 2.2
    d ln(odds(1) / odds(0)) 0 1  0.00712    0.00580  1.228    0.220 2.2
    d ln(odds(1) / odds(0)) 1 0 -0.00184    0.00349 -0.528    0.597 0.7
    d ln(odds(1) / odds(0)) 1 1 -0.00172    0.00326 -0.528    0.597 0.7
    2.5 %  97.5 %
 -0.00357 0.01555
 -0.00425 0.01850
 -0.00868 0.00499
 -0.00812 0.00467

Columns: term, contrast, s, x, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 
Type:  response 
Note

And now consider estimation under effects for all surgery and duration.

Data generation assuming effects in both domains
D <- generate_data_2()
fit2 <- glm(y ~ x + r + r:s + r:d + r:s:d, data = D, family = binomial())
fit2s <- glm(y ~ x + s + r + r:s + r:d + r:s:d, data = D, family = binomial())
Revision vs. DAIR
# Revision vs. DAIR
rbind(
  avg_comparisons(fit2, variables = "r", comparison = "lnoravg"),
  avg_comparisons(fit2s, variables = "r", comparison = "lnoravg")
)

 Term              Contrast Estimate Std. Error   z Pr(>|z|)   S 2.5 % 97.5 %
    r ln(odds(1) / odds(0))     1.03    0.00427 240   <0.001 Inf  1.02   1.03
    r ln(odds(1) / odds(0))     1.03    0.00423 243   <0.001 Inf  1.02   1.03

Columns: term, contrast, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted 
Type:  response 
Revision (long) vs. DAIR and Revision (short) vs. DAIR
# Revision (long) vs. DAIR and Revision (short) vs. DAIR
rbind(
  avg_comparisons(fit2, variables = "r", by = "d", comparison = "lnoravg"),
  avg_comparisons(fit2s, variables = "r", by = "d", comparison = "lnoravg")
)

 Term              Contrast d Estimate Std. Error   z Pr(>|z|)   S 2.5 % 97.5 %
    r ln(odds(1) / odds(0)) 0    1.184    0.00518 228   <0.001 Inf 1.174  1.194
    r ln(odds(1) / odds(0)) 1    0.607    0.00482 126   <0.001 Inf 0.598  0.617
    r ln(odds(1) / odds(0)) 0    1.184    0.00515 230   <0.001 Inf 1.174  1.194
    r ln(odds(1) / odds(0)) 1    0.607    0.00478 127   <0.001 Inf 0.598  0.616

Columns: term, contrast, d, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 
Type:  response 
Conditional on X
# Conditional on X
rbind(
  avg_comparisons(fit2, variables = "r", by = c("d", "x"), comparison = "lnoravg"),
  avg_comparisons(fit2s, variables = "r", by = c("d", "x"), comparison = "lnoravg")
)

 Term              Contrast d x Estimate Std. Error   z Pr(>|z|)   S 2.5 %
    r ln(odds(1) / odds(0)) 0 0    1.239    0.00532 233   <0.001 Inf 1.228
    r ln(odds(1) / odds(0)) 0 1    1.153    0.00554 208   <0.001 Inf 1.142
    r ln(odds(1) / odds(0)) 1 0    0.645    0.00499 129   <0.001 Inf 0.635
    r ln(odds(1) / odds(0)) 1 1    0.573    0.00510 112   <0.001 Inf 0.563
    r ln(odds(1) / odds(0)) 0 0    1.230    0.00528 233   <0.001 Inf 1.219
    r ln(odds(1) / odds(0)) 0 1    1.191    0.00554 215   <0.001 Inf 1.180
    r ln(odds(1) / odds(0)) 1 0    0.635    0.00495 128   <0.001 Inf 0.625
    r ln(odds(1) / odds(0)) 1 1    0.610    0.00509 120   <0.001 Inf 0.600
 97.5 %
  1.249
  1.163
  0.655
  0.583
  1.240
  1.202
  0.645
  0.620

Columns: term, contrast, d, x, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 
Type:  response 
Short vs Long (one-stage) and short vs. long (two-stage)
# Short vs Long (one-stage) and short vs. long (two-stage)
rbind(
  avg_comparisons(fit2, variables = "d", by = "s", comparison = "lnoravg"),
  avg_comparisons(fit2s, variables = "d", by = "s", comparison = "lnoravg")
)

 Term              Contrast s Estimate Std. Error     z Pr(>|z|)   S  2.5 %
    d ln(odds(1) / odds(0)) 0   -0.235    0.00504 -46.6   <0.001 Inf -0.245
    d ln(odds(1) / odds(0)) 1   -0.264    0.00295 -89.7   <0.001 Inf -0.270
    d ln(odds(1) / odds(0)) 0   -0.242    0.00520 -46.6   <0.001 Inf -0.252
    d ln(odds(1) / odds(0)) 1   -0.277    0.00309 -89.7   <0.001 Inf -0.283
 97.5 %
 -0.225
 -0.258
 -0.232
 -0.271

Columns: term, contrast, s, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 
Type:  response 
Short vs long conditional on x
# Short vs long conditional on x
avg_comparisons(fit2, variables = "d", by = c("s", "x"), comparison = "lnoravg")

 Term              Contrast s x Estimate Std. Error     z Pr(>|z|)   S  2.5 %
    d ln(odds(1) / odds(0)) 0 0   -0.246    0.00527 -46.6   <0.001 Inf -0.256
    d ln(odds(1) / odds(0)) 0 1   -0.243    0.00523 -46.5   <0.001 Inf -0.253
    d ln(odds(1) / odds(0)) 1 0   -0.286    0.00319 -89.7   <0.001 Inf -0.292
    d ln(odds(1) / odds(0)) 1 1   -0.212    0.00241 -87.9   <0.001 Inf -0.217
 97.5 %
 -0.235
 -0.233
 -0.280
 -0.207

Columns: term, contrast, s, x, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 
Type:  response 

Other Domains

The desire for a single model comes from the incorporation of multiple silos and domains. Suppose we introduce the antibiotic type (rifampicin) domain, which is denoted by \(F\). Assume everyone is eligible. Our base model would be

\[ \mathbb{E}[Y|R,S,D,F] = \text{expit}(\beta_0 + \beta_1S + \beta_2R + \beta_3RS + \beta_4RD + \beta_5RDS + \beta_6F) \]

Note

\(\beta_2\) is now the shift for revision (under long duration irrespective of surgical type)

Generate rifampicin data
generate_data_3 <- function(
    n = 1000000,
    g = function(x, r, s, d, f){ -1 + x + s + r + 0.5 * r * s - 0.25 * r * d - 0.15 * r * s * d + 0.2 * f}) {
  x <- rbinom(n, 1, 0.25)
  s <- rbinom(n, 1, 0.7)
  r <- rbinom(n, 1, 0.5)
  f <- rbinom(n, 1, 0.5)
  a <- r * (s + 1)
  d <- as.numeric((a > 0) * rbinom(n, 1, 0.5))
  y <- rbinom(n, 1, plogis(g(x, r, s, d, f)))
  D <- data.table(x = x, r = r, s = s, a = a, d = d, f = f, y = y)
}
set.seed(1246)
D <- generate_data_3()
fit2 <- glm(y ~ x + r + f + r:s + r:d + r:s:d, data = D, family = binomial())
fit2s <- glm(y ~ x + s + r + f + r:s + r:d + r:s:d, data = D, family = binomial())
Revision vs. DAIR
# Revision vs. DAIR
rbind(
  avg_comparisons(fit2, variables = "r", comparison = "lnoravg"),
  avg_comparisons(fit2s, variables = "r", comparison = "lnoravg")
)

 Term              Contrast Estimate Std. Error   z Pr(>|z|)   S 2.5 % 97.5 %
    r ln(odds(1) / odds(0))      1.1    0.00441 250   <0.001 Inf   1.1   1.11
    r ln(odds(1) / odds(0))      1.1    0.00437 253   <0.001 Inf   1.1   1.11

Columns: term, contrast, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted 
Type:  response 
Revision vs. DAIR
rbind(
  avg_comparisons(fit2, variables = "r", comparison = "lnor"),
  avg_comparisons(fit2s, variables = "r", comparison = "lnor")
)

 Term              Contrast Estimate Std. Error   z Pr(>|z|)   S 2.5 % 97.5 %
    r ln(odds(1) / odds(0))     1.25    0.00508 247   <0.001 Inf  1.24   1.26
    r ln(odds(1) / odds(0))     1.26    0.00512 246   <0.001 Inf  1.25   1.27

Columns: term, contrast, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 
Type:  response 
Revision vs. DAIR conditional
# Revision vs. DAIR conditional
rbind(
  avg_comparisons(fit2, variables = "r", by = c("f", "x", "d"), comparison = "lnoravg"),
  avg_comparisons(fit2s, variables = "r", by = c("f", "x", "d"), comparison = "lnoravg")
)

 Term              Contrast f x d Estimate Std. Error   z Pr(>|z|)   S 2.5 %
    r ln(odds(1) / odds(0)) 0 0 0    1.245    0.00542 230   <0.001 Inf 1.234
    r ln(odds(1) / odds(0)) 0 0 1    0.945    0.00519 182   <0.001 Inf 0.935
    r ln(odds(1) / odds(0)) 0 1 0    1.159    0.00561 206   <0.001 Inf 1.148
    r ln(odds(1) / odds(0)) 0 1 1    0.861    0.00532 162   <0.001 Inf 0.850
    r ln(odds(1) / odds(0)) 1 0 0    1.225    0.00543 226   <0.001 Inf 1.215
    r ln(odds(1) / odds(0)) 1 0 1    0.926    0.00518 179   <0.001 Inf 0.916
    r ln(odds(1) / odds(0)) 1 1 0    1.145    0.00567 202   <0.001 Inf 1.134
    r ln(odds(1) / odds(0)) 1 1 1    0.852    0.00537 158   <0.001 Inf 0.841
    r ln(odds(1) / odds(0)) 0 0 0    1.232    0.00538 229   <0.001 Inf 1.222
    r ln(odds(1) / odds(0)) 0 0 1    0.931    0.00515 181   <0.001 Inf 0.921
    r ln(odds(1) / odds(0)) 0 1 0    1.192    0.00560 213   <0.001 Inf 1.181
    r ln(odds(1) / odds(0)) 0 1 1    0.896    0.00531 169   <0.001 Inf 0.886
    r ln(odds(1) / odds(0)) 1 0 0    1.221    0.00539 227   <0.001 Inf 1.210
    r ln(odds(1) / odds(0)) 1 0 1    0.922    0.00514 179   <0.001 Inf 0.912
    r ln(odds(1) / odds(0)) 1 1 0    1.188    0.00569 209   <0.001 Inf 1.177
    r ln(odds(1) / odds(0)) 1 1 1    0.894    0.00538 166   <0.001 Inf 0.883
 97.5 %
  1.256
  0.955
  1.170
  0.871
  1.236
  0.936
  1.156
  0.862
  1.243
  0.941
  1.203
  0.907
  1.231
  0.932
  1.200
  0.905

Columns: term, contrast, f, x, d, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 
Type:  response 
Revision (long) vs. DAIR and Revision (short) vs. DAIR
# Revision (long) vs. DAIR and Revision (short) vs. DAIR
rbind(
  avg_comparisons(fit2, variables = "r", by = "d", comparison = "lnoravg"),
  avg_comparisons(fit2s, variables = "r", by = "d", comparison = "lnoravg")
)

 Term              Contrast d Estimate Std. Error   z Pr(>|z|)   S 2.5 % 97.5 %
    r ln(odds(1) / odds(0)) 0    1.182    0.00528 224   <0.001 Inf 1.171  1.192
    r ln(odds(1) / odds(0)) 1    0.889    0.00502 177   <0.001 Inf 0.879  0.899
    r ln(odds(1) / odds(0)) 0    1.182    0.00525 225   <0.001 Inf 1.171  1.192
    r ln(odds(1) / odds(0)) 1    0.890    0.00499 178   <0.001 Inf 0.880  0.899

Columns: term, contrast, d, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 
Type:  response 
Rifampicin vs not
# Rifampicin vs not
rbind(
  avg_comparisons(fit2, variables = "f", comparison = "lnoravg"),
  avg_comparisons(fit2s, variables = "f", comparison = "lnoravg")
)

 Term              Contrast Estimate Std. Error    z Pr(>|z|)   S 2.5 % 97.5 %
    f ln(odds(1) / odds(0))    0.168    0.00387 43.4   <0.001 Inf 0.161  0.176
    f ln(odds(1) / odds(0))    0.168    0.00382 44.0   <0.001 Inf 0.160  0.175

Columns: term, contrast, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted 
Type:  response 
Short vs Long (one-stage) and short vs. long (two-stage)
# Short vs Long (one-stage) and short vs. long (two-stage)
rbind(
  avg_comparisons(fit2, variables = "d", by = "s", comparison = "lnoravg"),
  avg_comparisons(fit2s, variables = "d", by = "s", comparison = "lnoravg")
)

 Term              Contrast s Estimate Std. Error     z Pr(>|z|)     S  2.5 %
    d ln(odds(1) / odds(0)) 0   -0.119    0.00505 -23.6   <0.001 406.7 -0.129
    d ln(odds(1) / odds(0)) 1   -0.121    0.00284 -42.6   <0.001   Inf -0.127
    d ln(odds(1) / odds(0)) 0   -0.120    0.00508 -23.6   <0.001 407.5 -0.130
    d ln(odds(1) / odds(0)) 1   -0.129    0.00302 -42.6   <0.001   Inf -0.134
 97.5 %
 -0.109
 -0.115
 -0.110
 -0.123

Columns: term, contrast, s, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 
Type:  response 
Short vs long conditional on x
# Short vs long conditional on x
avg_comparisons(fit2, variables = "d", by = c("s", "x"), comparison = "lnoravg")

 Term              Contrast s x Estimate Std. Error     z Pr(>|z|)     S
    d ln(odds(1) / odds(0)) 0 0  -0.1250    0.00530 -23.6   <0.001 406.9
    d ln(odds(1) / odds(0)) 0 1  -0.1210    0.00513 -23.6   <0.001 406.4
    d ln(odds(1) / odds(0)) 1 0  -0.1309    0.00307 -42.6   <0.001   Inf
    d ln(odds(1) / odds(0)) 1 1  -0.0946    0.00223 -42.4   <0.001   Inf
   2.5 %  97.5 %
 -0.1354 -0.1146
 -0.1310 -0.1109
 -0.1370 -0.1249
 -0.0989 -0.0902

Columns: term, contrast, s, x, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 
Type:  response 

Silos

The above has considered the late-acute silo in isolation. Suppose we also had a chronic silo with the same limitations: randomise to DAIR vs. revision, revision type is then determined by the surgeon/patient, and duration is randomised. In this new setting nothing really changes; we can simply introduce silo-specific parameters.

For lack of letters, let \(G\) denote group (silo). Then

\[ \mathbb{E}[Y|G=g,R,S,D] = \text{expit}(\beta_{0,g} + \beta_{1,g}R + \beta_{2,g}S + \beta_{3,g}RS + \beta_{4,g}RD + \beta_{5,g}RSD + \gamma_g^\mathsf{T}X) \]

We might choose to assume that some of the conditional effects are equal across groups, e.g. \(\gamma_g=\gamma\) for all \(g\), or \(\beta_{4,g} = \beta_4\) and \(\beta_{5,g}=\beta_5\). It depends on how realistic we think these assumptions are and whether we have sufficient data to meaningfully estimate silo-specific effects.

Perhaps the only new issue is that now \(\mathbb{P}(S=s|G=1)\ne\mathbb{P}(S=s|G=2)\), so we need to weight things within silo.
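As a rough sketch of how this might be expressed (nothing here is fitted; the silo factor g and the object fit_g are hypothetical and not part of the simulated data above):

# hypothetical sketch: g is a silo factor added to the data
f_silo_specific <- y ~ g + g:x + g:r + g:r:s + g:r:d + g:r:s:d  # everything silo-specific
f_shared_dur    <- y ~ g + x + g:r + g:r:s + r:d + r:s:d        # gamma_g = gamma, beta_{4,g} = beta_4, beta_{5,g} = beta_5
# fit_g <- glm(f_shared_dur, data = D, family = binomial())
# averaging comparisons within silo then uses each silo's own distribution of s (and x)
# avg_comparisons(fit_g, variables = "r", by = "g", comparison = "lnoravg")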

Note

The other thing to consider here is that the early silo contributes no information to the parameters characterising the surgical intervention effects. I am not sure the above model addresses this, since \(\beta_{1,g}\) for \(g = \text{early}\) would not be defined.

Revision Type

In all the above I’ve been assuming that the revision type, \(S\), is an attribute of the patient, i.e. all surgeons would choose the same \(S\) for the same patient. More realistically, \(S\) might also partly depend on the surgeon (say one surgeon would choose \(S=1\) for all patients while another would choose \(S=0\) for all of theirs, so that \(S\) is conditional on the surgeon rather than the patient). Assume \(S\) is an attribute of the patient/surgeon combination rather than either alone. Do we need to change anything? How should the “revision effect” be interpreted? As an average effect over patients and surgeons?

I already expect that we should at least condition on the site/surgeon, but do we need anything extra to account for \(S\)? Perhaps the distribution of \(S\) conditional on surgeon?
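One possible sketch of conditioning on surgeon is a varying intercept (assuming lme4 is available; the surgeon identifier below is purely hypothetical and, being generated independently of everything else, carries no real effect — it only illustrates the structure):

library(lme4)
# attach hypothetical surgeon ids to a subset of the simulated data
D2 <- copy(D)[sample(.N, 2e4)]
D2[, surgeon := factor(sample(1:50, .N, replace = TRUE))]
# varying intercept by surgeon; the random-effect variance is noise by construction here
fit_surg <- glmer(y ~ x + r + r:s + r:d + r:s:d + (1 | surgeon),
                  data = D2, family = binomial())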

Randomisation reveal

A number of simplifying assumptions have been made that result in an incomplete representation of the design. For example, at the start of the discussion on the duration domain, we assumed that duration is randomised, but depends on both \(R\) and \(S\). However, note:

  1. \(R\) presupposes eligibility and reveal for the surgical domain, which is not the general case, e.g. early stage infection silo.
  2. Patients may enter into the study without having been entered for a randomised comparison in the surgical domain (e.g. patients with early stage infection), in which case the allocated surgery (\(R_A\)) would have been determined entirely by clinician selection.
  3. Patients enter into the duration domain based on the surgical procedure that occurred. This is expected to usually align with the allocated procedure \(R_A\), but may deviate from that.
  4. \(D\) is random, conditional solely on \(R_P\) rather than both \(R\) and \(S_{R_A}\) (as stated in the duration section).
  5. \(D\) exists only for \(R_P \in \{\text{one-stage}, \text{two-stage}\}\) otherwise reveal never occurs and duration allocated \(D_A\) is determined by clinician selection
  6. While the duration domain for one-stage and two-stage both contain long (reference) vs short levels for duration of antibiotic, the levels are distinct for each procedure.

To incorporate some of these ideas, start with the following definitions:

  • \(E_R\) reveal for surgical domain - 0: no, 1: yes
  • \(E_D\) reveal for duration domain - 0: no, 1: yes
  • \(E_F\) reveal for choice domain - 0: no, 1: yes
  • \(R\) randomised surgery - 0: DAIR, 1: revision
  • \(S_R\) revision type preference (pre-randomisation) - 0: DAIR, 1: one-stage, 2: two-stage
  • \(R_A\) allocated surgery - 0: DAIR, 1: one-stage, 2: two-stage
  • \(R_P\) performed surgery - 0: DAIR, 1: revision
  • \(S_{R_P}\) surgery type performed (post-randomisation) - 0: DAIR, 1: one-stage, 2: two-stage
  • \(D\) randomised duration - 0: long, 1: short
  • \(D_A\) allocated duration - 0: long, 1: short, 2: other
  • \(S_D\) selected duration - 0: long, 1: short, 2: other
  • \(F\) randomised choice - 0: norif, 1: rif
  • \(F_A\) allocated choice - 0: norif, 1: rif, 2: other
  • \(S_F\) selected choice - 0: norif, 1: rif, 2: other
  • \(Y\) treatment success - 0: no, 1: yes

In practice, we expect the allocated \(R_A\) and actual \(R_P\) surgical procedure performed will align, but in some cases they may not.

(Contrived) example one: revision was randomised and the original surgeon only performs two-stage revision. The original surgeon becomes unavailable to do the procedure and another surgeon (who only performs one-stage revision) takes over the case. The patient enters into the duration domain for comparisons within the setting of one-stage revision.

(Contrived) example two: revision was randomised and the surgeon intends to perform a two-stage revision. On the operating table, the surgeon switches to dair for unknown reasons, and the patient will no longer enter into randomised comparisons for the duration domain.

\(R_A\) (allocation) is now determined by additional variables:

\[ R_A = (1-E_R) S_R + E_R R S_R \]

so that when there is no revealed randomisation for the surgical domain, \(R_A\) aligns with the preferred procedure out of all those possible (dair, one-stage, two-stage). When randomisation is revealed for the surgical domain, the first term disappears: for \(R = 0\) the allocation is dair, irrespective of preferred surgery, whereas for \(R = 1\) we get whichever of one-stage (\(S_R = 1\)) or two-stage (\(S_R = 2\)) is preferred.

For duration allocated \(D_A\):

\[ D_A = (1-E_D) S_D + E_D D \]

when there is no reveal, \(E_D = 0\) and \(D_A = S_D\) (long/short/other duration), and when revealed, \(E_D = 1\) and \(D_A = D\) (long/short duration).

For choice allocated \(F_A\):

\[ F_A = (1 - E_F) S_F + E_F F \]

when there is no reveal, \(F_A = S_F\) (norif/rif/other), and when revealed, \(F_A = F\) (norif/rif).
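As a quick, illustrative check of these allocation rules (separate from the full data generation later on), enumerating the surgical rule over all combinations of \(E_R\), \(R\) and \(S_R\):

# enumerate R_A = (1 - E_R) S_R + E_R R S_R
alloc <- CJ(er = 0:1, r = 0:1, sr = 0:2)
alloc[, ra := (1 - er) * sr + er * r * sr]
alloc
# er = 0: ra = sr (clinician selection across dair/one-stage/two-stage)
# er = 1, r = 0: ra = 0 (dair) irrespective of sr; er = 1, r = 1: ra = sr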

Note

Above, I am assuming that you can randomise someone within a domain without first assessing their eligibility status or knowing anything about them other than that they want to enter the platform, which I believe is the intention. Randomisation is only revealed if eligibility is confirmed, and this process is independent of the randomisation process.

The edited DAG is shown below. It still has a number of simplifications relative to the intended approach, but is meant to represent a generalised silo and site of infection, implicitly acknowledging that the outcome will depend on both of these factors. Patients may contribute to some or all domains, which influences the treatment regimen (combination of treatments across the domains) they receive and suggests which causal effects are identifiable.

%%{
  init:{
    "flowchart":{"htmlLabels": "true"},
    "securityLevel": "loose",
    "theme": "base"
}}%%
flowchart LR
  ER(E<sub>R) --> RA(R<sub>A) 
  ER --> SR(S<sub>R)
  ED(E<sub>D) --> DA(D<sub>A) 
  SR --> RA 
  SD(S<sub>D) --> DA 
  R --> RA
  RA --> RP(R<sub>P) 
  SRP(S<sub>R<sub>P) --> RP
  RA --> Y
  RP --> ED
  D --> DA(D<sub>A)
  DA --> Y
  F --> FA(F<sub>A)
  EF(E<sub>F) --> FA
  SF(S<sub>F) --> FA 
  FA --> Y
  UER((U<sub>E<sub>R)) -.-> ER
  UED((U<sub>E<sub>D)) -.-> ED
  UEF((U<sub>E<sub>F)) -.-> EF
  UF((U<sub>F)) -.-> F
  URP((U<sub>R<sub>P)) -.-> SRP
  UD((U<sub>D)) -.-> D
  USR((U<sub>S<sub>R)) -.-> SR
  UR((U<sub>R)) -.-> R
  USD((U<sub>S<sub>D)) -.-> SD
  USF((U<sub>S<sub>F)) -.-> SF
  UY((U<sub>Y)) -.-> Y
Figure 6: Scenario 4, the \(U\) denote independent exogenous variables
Note

There are potentially direct as well as indirect effects of both \(S_R\) and \(S_{R_P}\) on \(Y\), but these have been left out of the DAG for now. \(S_R\) is actually representing multiple ideas:

  1. when the surgical domain is not applicable (randomisation never revealed), e.g. for the chronic patients, \(S_R\) is the selection from dair, one-stage, two-stage
  2. when the surgical domain is applicable (randomisation is revealed), \(S_R\) is the option between one-stage and two-stage that is most preferred.

In essence, \(S_R\) involves a conditional ranking of which surgical procedure is preferred.

For example, say the selection is: dair, two-stage, one-stage, in order of preference. If randomisation is not revealed, dair would be the selection. If randomisation is revealed, two-stage would be selected, as it is the most preferred surgical procedure applicable to revision.
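A minimal sketch of that conditional ranking (the helper select_surgery and the preference vector are purely illustrative):

# resolve the selection from a ranked preference, depending on reveal
select_surgery <- function(pref = c("dair", "two-stage", "one-stage"), revealed = FALSE) {
  if (!revealed) return(pref[1])   # selection over all options
  pref[pref != "dair"][1]          # most preferred option applicable to revision
}
select_surgery(revealed = FALSE)   # "dair"
select_surgery(revealed = TRUE)    # "two-stage"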

Originally, \(D\) was said to depend on some of the selection elements. However, subsequently it was decided that \(D\) (long/short) should be viewed like all other randomisation processes, i.e. independent of all other nodes, but is only manifest through \(D_A\) for certain surgery types.

Again consider intervening on surgical procedure whereby we are interested in the effect of dair vs revision on treatment success. There are no open backdoor paths apparent and therefore no adjustment is required to identify the total effect of \(R\) on \(Y\). Similarly, neither \(D\) nor \(F\) have backdoor paths.

Postulate the following model:

\[ \begin{aligned} \mathbb{E}[Y|L; \beta] &= \text{expit}( \beta_0 + \\ &\quad \beta_1 \mathbb{I}[G = 1] + \beta_2 \mathbb{I}[G = 2] + \beta_3 J + \\ &\quad \beta_4 J \mathbb{I}[G = 1] + \beta_5 J \mathbb{I}[G = 2] + \\ &\quad \beta_6 (1-E_R) + \left[\beta_7 R + \beta_8 R \, \mathbb{I}(S_{R_P} = 2) \right] E_R + \\ &\quad \beta_9 (1-E_D) + \left[\beta_{10} R_P D + \beta_{11} R_P D \, \mathbb{I}(S_{R_P} = 2)\right] E_D + \\ &\quad \beta_{12} (1-E_F) + \beta_{13} F E_F ) \end{aligned} \]

where the \(L\) stands for the set of model variables and \(\beta\) the vector of parameters. The following describe the reference/movements in the log-odds of treatment success:

  • \(\beta_{0}\) baseline log-odds of treatment success in the early silo / knee site
  • \(\beta_{1}\) shift for late-silo membership relative to early
  • \(\beta_{2}\) shift for chronic-silo membership relative to early
  • \(\beta_{3}\) shift for hip
  • \(\beta_{4}\) relative shift for late-silo membership with hip
  • \(\beta_{5}\) relative shift for chronic-silo membership with hip
  • \(\beta_{6}\) shift under non-reveal (surgery1)
  • \(\beta_{7}\) shift under revision that was performed with one-stage procedure for long duration and no-rif
  • \(\beta_{8}\) relative shift under revision that was performed with two-stage procedure for long duration and no-rif
  • \(\beta_{9}\) shift under non-reveal (duration), with no differentiation by surgical or duration preference
  • \(\beta_{10}\) shift for short duration when one-stage was actually performed
  • \(\beta_{11}\) relative shift for short duration when two-stage (rather than one-stage) was actually performed
  • \(\beta_{12}\) shift under non-reveal (choice), with no differentiation by choice preference
  • \(\beta_{13}\) shift for rif

A complicating factor is that the surgical allocation may inform the type of surgery the patient receives (though the two may deviate), and the randomisation that the patient is revealed to in the duration domain is determined by the surgical intervention the patient actually received.

We replace the pre-randomisation preference for surgical type with the post-randomisation surgical type performed and marginalise over this term.

Note

Full disclosure, I am not entirely sure whether the above addresses the full impact of deviations between allocated and performed surgery, or whether the DAG is a sufficiently complete representation of the dependencies.

As previously, for the surgical domain we are interested in the effect of intervening on \(R\). For the duration domain, we are interested in the effect of intervening on \(D|R_P\). For the choice domain, we are interested in the effect of intervening on \(F\).

Note

Note that for the following specification, the design matrix can become singular (linear dependence between some of the variables), e.g. if surgical reveal is equivalent to silo membership. I am assuming a small amount of noise in the reveal so that this isn’t a problem.

Generate complete data
generate_data_4 <- function(
    n = 1e6,
    g = function(p_a, l1, l2, j, er, ed, ef, r, rp, d, srp, f){ 
      -1 + -0.04 * l1 - 0.07 * l2 - 0.02 * j - 0.01 * l1 * j - 0.06 * l2 * j +
        -0.1*(1-er) + (0.2*r + 0.4*r*(srp==2))*(er) +
        -0.05*(1-ed) + (0.4*rp*d + 0.1*rp*d*(srp==2))*(ed) +
        -0.25*(1-ef) + 0.15*f*(ef) 
      }
    ) {
  
  p_a = array(
    c(0.65, 0.55, 0.6, 0.75, 0.6, 0.65),
    dim = c(3, 2), dimnames = list(c("early", "late", "chronic"),c("knee", "hip")))
  
  # silo (l) and joint (j)
  l <- sample(0:2, n, replace = T, prob = c(0.3, 0.5, 0.2)) 
  l1 = as.numeric(l == 1)
  l2 = as.numeric(l == 2)
  
  p_j <- array(c(0.4,0.7,0.5,0.6,0.3,0.5), dim = c(3, 2), 
               dimnames = list(c("early", "late", "chronic"), c("knee", "hip")))
  j <- rbinom(n, 1, p_j[l+1, 2])
  # reveal for late only, with a small number who never get revealed even if they
  # were in late. you can leave this out but the model spec will need to be updated
  # because there will be linearly dependent cols in the design matrix
  er <- as.numeric(l == 1)
  er[l==1][as.logical(rbinom(er[l==1], 1, 0.05))] <- 0
  # randomise dair vs rev
  r <- rbinom(n, 1, 0.5)
  # (approx) 70% chance of clinician choosing two-stage if pt is rand to revision 
  sr <- numeric(n)
  sr[r == 0] <- sample(0:2, sum(r == 0), replace = T, prob = c(0.2, 0.2, 0.6))
  sr[r == 1] <- sample(1:2, sum(r == 1), replace = T, prob = c(0.3, 0.7))
  # pref towards two-stage, assuming revision
  sra <- as.numeric(sr == 2)
  # determine allocation of surgery type
  ra <- er * sr + (1-er) * r * (sr) 
  
  # 10% of the allocated treatments may have switched to a different surg type
  srp <- ra
  ic <- rbinom(n, 1, 0.1)
  srp[as.logical(ic)] <- sample(0:2, sum(ic), replace = T, prob = c(0.2, 0.2, 0.6))
  # was the procedure type dair or revision?
  rp <- as.numeric(srp %in% 1:2)
  
  # non-reveal of duration if rp is dair (0)
  ed <- as.numeric(rp == 1)
  # rand to long (0), short (1) based on surgery received
  d <- as.numeric((rp > 0) * rbinom(n, 1, 0.5))
  # 60% reveal ab choice
  ef <- rbinom(n, 1, 0.6)
  f <- as.numeric((ef == 1) * rbinom(n, 1, 0.5))
  
  y <- rbinom(n, 1, plogis(g(p_a, l1, l2, j, er, ed, ef, r, rp, d, srp, f)))
  # table(y, useNA = "always")
  
  D <- data.table(
    l1, l2, j, 
    er, ed, ef,
    erx = 1-er, edx = 1-ed, efx=1-ef,
    r, sr, sra, ra, 
    ic, rp, srp, srp2 = as.numeric(srp == 2), d,
    f, y
  )
}
set.seed(102)
D <- generate_data_4()

# early silo, should be non-reveal with no r options (default to 0)
# D[, .N, keyby = .(l, j, er, r)]
# those who received rev are revealed for d (ed = 1) and d is 0/1 conditional on revision type (srp)
# D[, .N, keyby = .(rp, ed, srp, d)]
# those not revealed for ab choice (ef = 0) should come from all strata, infec site
# D[ef == 0, .N, keyby = .(l, j, f)]
# those revealed (ef = 1) should also be dist across sample
# D[ef == 1, .N, keyby = .(l, j, f)]

fit2 <- glm(y ~ l1 + l2 + j + l1:j + l2:j +
              erx + er:r + er:r:srp2 +
              edx + ed:rp:d + ed:d:rp:srp2 +
              efx + ef:f, 
            data = D, family = binomial())
summary(fit2)

Call:
glm(formula = y ~ l1 + l2 + j + l1:j + l2:j + erx + er:r + er:r:srp2 + 
    edx + ed:rp:d + ed:d:rp:srp2 + efx + ef:f, family = binomial(), 
    data = D)

Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept)  -0.992730   0.017361 -57.182  < 2e-16 ***
l1           -0.036997   0.016251  -2.277  0.02281 *  
l2           -0.081398   0.009886  -8.234  < 2e-16 ***
j            -0.028103   0.008548  -3.288  0.00101 ** 
erx          -0.101703   0.015480  -6.570 5.04e-11 ***
edx          -0.045398   0.006282  -7.227 4.94e-13 ***
efx          -0.251435   0.005446 -46.167  < 2e-16 ***
l1:j         -0.022724   0.010840  -2.096  0.03605 *  
l2:j         -0.052919   0.013561  -3.902 9.53e-05 ***
er:r          0.187560   0.009727  19.283  < 2e-16 ***
ef:f          0.147332   0.005624  26.198  < 2e-16 ***
er:r:srp2     0.407762   0.010375  39.302  < 2e-16 ***
ed:rp:d       0.414709   0.008222  50.439  < 2e-16 ***
srp2:ed:rp:d  0.085740   0.008749   9.800  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1216852  on 999999  degrees of freedom
Residual deviance: 1179406  on 999986  degrees of freedom
AIC: 1179434

Number of Fisher Scoring iterations: 4
Generate complete data
# linearly_dep_cols(fit2)
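# a base-R sketch of the same kind of check (linearly_dep_cols is the author's helper;
# the lines below are just one alternative, assuming fit2 from above):
# X <- model.matrix(fit2); qr(X)$rank == ncol(X)   # TRUE when no linear dependence
# alias(fit2)$Complete                             # lists aliased terms, if any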
Note

Using g-computation to determine the marginal effects. lnor and lnoravg are used as the comparisons.

The first (lnor) approach (average log odds?)

  1. predicts the log-odds of treatment success for all units with the surgical approach set to revision.
  2. predicts the log-odds of treatment success for all units with the surgical approach set to dair.
  3. computes the per-unit difference on the log-odds scale and takes the mean

This can also be derived from a weighted combination of the parameters from the regression model. What is an accurate interpretation of this parameter? How would you explain it to a non-statistician?

The second (lnoravg) approach (marginal mean?)

  1. predicts the probability of treatment success for all units with the surgical approach set to revision and computes the mean
  2. predicts the probability of treatment success for all units with the surgical approach set to dair and computes the mean
  3. converts the mean probability of treatment success to the log-odds scale and takes the difference
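
In symbols, writing \(\hat{p}_i(r)\) for the fitted probability of treatment success for unit \(i\) with the surgical approach set to \(r\) (just restating the two computations above):

\[ \hat\Delta_{\text{lnor}} = \frac{1}{n}\sum_{i=1}^n \left[ \text{logit}\,\hat{p}_i(1) - \text{logit}\,\hat{p}_i(0) \right], \qquad \hat\Delta_{\text{lnoravg}} = \text{logit}\!\left(\frac{1}{n}\sum_{i=1}^n \hat{p}_i(1)\right) - \text{logit}\!\left(\frac{1}{n}\sum_{i=1}^n \hat{p}_i(0)\right) \]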
Code
# Revision vs. DAIR
cmp <- avg_comparisons(fit2, variables = "r", comparison = "lnor")

# avg_comparisons(fit2, variables = "r", comparison = "lnor", by = "er")

d_new <- copy(D)

# equivalent to using lnor
d_new[, `:=`(r = 1)]
lo1 <- predict(fit2, newdata = d_new)
d_new[, `:=`(r = 0)]
lo0 <- predict(fit2, newdata = d_new)

# in this setting is equivalent to weighted combination of parameters
b <- coef(fit2)

w_srp <- D[er == 1, mean(srp2)]
w_er <- D[, mean(er)]

rbind(
  "predict at r = 0/1" = mean(lo1 - lo0),
  "weighting coef by er and srp (condit on reveal)" = w_er * b["er:r"] + w_er * w_srp * b["er:r:srp2"] , 
  "avg_comparisons (lnor)" = cmp$estimate
)
                                                     er:r
predict at r = 0/1                              0.2136018
weighting coef by er and srp (condit on reveal) 0.2136018
avg_comparisons (lnor)                          0.2136018
Code
cmp

 Term              Contrast Estimate Std. Error    z Pr(>|z|)   S 2.5 % 97.5 %
    r ln(odds(1) / odds(0))    0.214    0.00301 70.9   <0.001 Inf 0.208   0.22

Columns: term, contrast, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 
Type:  response 
Code
# using lnoravg
avg_comparisons(fit2, variables = "r", comparison = "lnoravg")

 Term              Contrast Estimate Std. Error    z Pr(>|z|)   S 2.5 % 97.5 %
    r ln(odds(1) / odds(0))    0.229    0.00314 73.1   <0.001 Inf 0.223  0.235

Columns: term, contrast, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted 
Type:  response 
Code
d_new[, r := 1]
p1 <- predict(fit2, newdata = d_new, type = "response")
d_new[, r := 0]
p0 <- predict(fit2, newdata = d_new, type = "response")
qlogis(mean(p1)) - qlogis(mean(p0))
[1] 0.2292898

And for duration and choice.

Code
# Short vs Long (one-stage) and short vs. long (two-stage)
avg_comparisons(fit2, variables = "d", by = "srp2", comparison = "lnoravg")

 Term              Contrast srp2 Estimate Std. Error    z Pr(>|z|)   S 2.5 %
    d ln(odds(1) / odds(0))    0    0.179    0.00361 49.8   <0.001 Inf 0.172
    d ln(odds(1) / odds(0))    1    0.486    0.00558 87.1   <0.001 Inf 0.475
 97.5 %
  0.187
  0.497

Columns: term, contrast, srp2, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 
Type:  response 
Code
# Rifampicin vs not
avg_comparisons(fit2, variables = "f", comparison = "lnoravg")

 Term              Contrast Estimate Std. Error    z Pr(>|z|)     S  2.5 %
    f ln(odds(1) / odds(0))   0.0894    0.00341 26.2   <0.001 500.7 0.0827
 97.5 %
  0.096

Columns: term, contrast, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted 
Type:  response 

Footnotes

  1. The parameter ignores potential differentiation for surgical type.↩︎