Exercise 14. Non-collapsibility of proportional hazards models


We simulate for time-to-event data assuming constant hazards and then investigate whether we can estimate the underlying parameters. Note that the binary variable \(X\) is essentially a coin toss and we have used a large variance for the normally distributed \(U\).

You may have to install the required packages the first time you use them. You can install a package by install.packages("package_of_interest") for each package you require.

We can use dagitty to graph the causal diagram:

##          y x          u delta
## 1 1.809348 1 -2.0774167 FALSE
## 2 1.251765 1  3.3740868  TRUE
## 3 2.078184 1  0.5236054 FALSE
## 4 3.528797 1 -5.9034201 FALSE
## 5 6.659236 0 -4.9328152 FALSE
## 6 5.366579 0 -0.3293542 FALSE

(a) Fitting models with both \(X\) and \(U\)

For constant hazards, we can fit (i) Poisson regression, (ii) Cox regression and (iii) flexible parametric survival models. With both covariates, these models are expected to estimate the parameters for \(X\) and \(U\), with values close to 1:

## 
## Call:
## glm(formula = delta ~ x + u + offset(log(y)), family = poisson, 
##     data = d)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.9137  -0.3774  -0.1355  -0.0274   3.9389  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -4.97595    0.05344  -93.11   <2e-16 ***
## x            0.96027    0.04372   21.96   <2e-16 ***
## u            0.99087    0.01091   90.78   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 14687.0  on 9999  degrees of freedom
## Residual deviance:  5811.2  on 9997  degrees of freedom
## AIC: 10257
## 
## Number of Fisher Scoring iterations: 7
## Call:
## coxph(formula = Surv(y, delta) ~ x + u, data = d)
## 
##   n= 10000, number of events= 2220 
## 
##      coef exp(coef) se(coef)     z Pr(>|z|)    
## x 0.96227   2.61764  0.04490 21.43   <2e-16 ***
## u 0.99034   2.69215  0.01461 67.81   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##   exp(coef) exp(-coef) lower .95 upper .95
## x     2.618     0.3820     2.397     2.858
## u     2.692     0.3714     2.616     2.770
## 
## Concordance= 0.922  (se = 0.002 )
## Rsquare= 0.525   (max possible= 0.98 )
## Likelihood ratio test= 7448  on 2 df,   p=<2e-16
## Wald test            = 4629  on 2 df,   p=<2e-16
## Score (logrank) test = 5611  on 2 df,   p=<2e-16
## Maximum likelihood estimation
## 
## Call:
## stpm2(formula = Surv(y, delta) ~ x + u, data = d, df = 4)
## 
## Coefficients:
##                        Estimate Std. Error z value     Pr(z)    
## (Intercept)          -14.565979   0.516079 -28.224 < 2.2e-16 ***
## x                      0.960750   0.044878  21.408 < 2.2e-16 ***
## u                      0.990848   0.014600  67.865 < 2.2e-16 ***
## nsx(log(y), df = 4)1   9.424531   0.494145  19.072 < 2.2e-16 ***
## nsx(log(y), df = 4)2   8.786272   0.328904  26.714 < 2.2e-16 ***
## nsx(log(y), df = 4)3  16.534497   1.013387  16.316 < 2.2e-16 ***
## nsx(log(y), df = 4)4   8.562140   0.206188  41.526 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## -2 log L: 8520.997
##          Estimate Std. Error
## Poisson 0.9602659 0.04372345
## Cox     0.9622728 0.04490412
## stpm2   0.9607498 0.04487831

If we fit a time-varying hazard ratio for \(X\), we see that the hazard ratio looks reasonably constant.

(b) Fitting models with only \(X\)

We now model by excluding the variable \(U\). This variable could be excluded when it is not measured or perhaps when the variable is not considered to be a confounding variable – from the causal diagram, the two variables \(X\) and \(U\) are not correlated and are only connected through the time variable \(T\).

## 
## Call:
## glm(formula = delta ~ x + offset(log(y)), family = poisson, data = d)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.1690  -0.8129  -0.5805  -0.1687   4.8126  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -3.19636    0.03377  -94.66   <2e-16 ***
## x            0.51289    0.04341   11.81   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 14687  on 9999  degrees of freedom
## Residual deviance: 14544  on 9998  degrees of freedom
## AIC: 18988
## 
## Number of Fisher Scoring iterations: 7
## Call:
## coxph(formula = Surv(y, delta) ~ x, data = d)
## 
##   n= 10000, number of events= 2220 
## 
##      coef exp(coef) se(coef)     z Pr(>|z|)    
## x 0.48511   1.62436  0.04342 11.17   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##   exp(coef) exp(-coef) lower .95 upper .95
## x     1.624     0.6156     1.492     1.769
## 
## Concordance= 0.565  (se = 0.005 )
## Rsquare= 0.013   (max possible= 0.98 )
## Likelihood ratio test= 127.9  on 1 df,   p=<2e-16
## Wald test            = 124.8  on 1 df,   p=<2e-16
## Score (logrank) test = 127.3  on 1 df,   p=<2e-16
## Maximum likelihood estimation
## 
## Call:
## stpm2(formula = Surv(y, delta) ~ x, data = d, df = 4)
## 
## Coefficients:
##                       Estimate Std. Error z value     Pr(z)    
## (Intercept)          -9.883577   0.473287 -20.883 < 2.2e-16 ***
## x                     0.485236   0.043423  11.175 < 2.2e-16 ***
## nsx(log(y), df = 4)1  7.458585   0.467805  15.944 < 2.2e-16 ***
## nsx(log(y), df = 4)2  6.323162   0.310307  20.377 < 2.2e-16 ***
## nsx(log(y), df = 4)3 13.212493   0.930871  14.194 < 2.2e-16 ***
## nsx(log(y), df = 4)4  5.541993   0.192043  28.858 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## -2 log L: 15844.98
##          Estimate Std. Error
## Poisson 0.5128869 0.04341441
## Cox     0.4851147 0.04342341
## stpm2   0.4852363 0.04342328

We clearly see that the estimate for \(X\) is different in the models without \(U\). If we now allow the the hazard ratio to vary by time:

We see that the hazard ratio is clearly time-varying, starting at an initial value at time = 0 of close to exp(1), but then declining rapidly. As discussed in class, this is an example of unmeasured heterogeneity, which is a normal random effect or, in this context, a log-normal frailty. Initially, there is no selection in the at-risk population, and the estimated marginal hazard ratio from the model without \(U\) is similar to the conditional hazrd ratio from the model that adjusts for \(U\). However, at later times, the at-risk population has been selected for those with a smaller frailty (because they were less likely to have had the event), and for a log-normal frailty the marginal hazard ratio is attenuated towards 1.

Let us stress that \(U\) is not a confounder. The issue is that fitting proportional hazards models with unmodelled heterogeneity for variables that are not confounders can lead to time-varying hazard ratios – and marginal estimates that do not have a causal interpretation.

(c) Rare vs common outcomes

## 
## Call:
## glm(formula = delta ~ x + offset(log(y)), family = poisson, data = d)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.6528  -0.4527  -0.3585  -0.2266   4.5702  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -2.23991    0.06250 -35.839   <2e-16 ***
## x            0.69377    0.07688   9.024   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 6129.1  on 9999  degrees of freedom
## Residual deviance: 6043.0  on 9998  degrees of freedom
## AIC: 7557
## 
## Number of Fisher Scoring iterations: 7
## Call:
## coxph(formula = Surv(y, delta) ~ x, data = d)
## 
##   n= 10000, number of events= 755 
## 
##      coef exp(coef) se(coef)     z Pr(>|z|)    
## x 0.68466   1.98309  0.07688 8.905   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##   exp(coef) exp(-coef) lower .95 upper .95
## x     1.983     0.5043     1.706     2.306
## 
## Concordance= 0.583  (se = 0.009 )
## Rsquare= 0.008   (max possible= 0.737 )
## Likelihood ratio test= 83.85  on 1 df,   p=<2e-16
## Wald test            = 79.3  on 1 df,   p=<2e-16
## Score (logrank) test = 82.45  on 1 df,   p=<2e-16
## Maximum likelihood estimation
## 
## Call:
## stpm2(formula = Surv(y, delta) ~ x, data = d, df = 4)
## 
## Coefficients:
##                        Estimate Std. Error  z value     Pr(z)    
## (Intercept)          -10.096088   0.649528 -15.5437 < 2.2e-16 ***
## x                      0.684636   0.076881   8.9051 < 2.2e-16 ***
## nsx(log(y), df = 4)1   6.453301   0.640095  10.0818 < 2.2e-16 ***
## nsx(log(y), df = 4)2   5.610039   0.429274  13.0687 < 2.2e-16 ***
## nsx(log(y), df = 4)3  11.416226   1.274732   8.9558 < 2.2e-16 ***
## nsx(log(y), df = 4)4   5.043582   0.268584  18.7784 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## -2 log L: 3975.012
##          Estimate Std. Error
## Poisson 0.6937665 0.07687819
## Cox     0.6846584 0.07688225
## stpm2   0.6846362 0.07688112