Introduction to optband

Tom Chen & Sam Tracy

2017-05-09

Classical simultaneous confidence bands for survival functions (Hall-Wellner, Equal Precision, etc) are derived from transformations of the convergence to a Weiner process with a strictly-increasing variance function These transformations are often motivated by factors such as:

This package instead approaches the problem purely from an optimization perspective: given a certain coverage level, obtain bands such that the area between is minimized. While an exact solution cannot be obtained in closed form, optband provides an approximate solution based off local time arguments for both the survival and cumulative-hazard functions, building off of results specified by Kendall et al. (2007).

Usage

optband requires the utils and LambertW packages. Of primary use is the opt.ci function with method opt.ci(survi, conf.level = 0.95, fun = 'surv', tl = NA, tu = NA).

opt.ci takes a survfit object from the survival package with the desired \(1-\alpha\) coverage level, function of interest (either 'surv' for the survival function or 'cumhaz' for the cumulative-hazard function), and optional upper or lower bounds for data truncation. Defaults are \(\alpha=0.05\), fun = 'surv', tl = NA, tu = NA.

Example

First, let’s generate some survival data using the stats package and estimate the Kaplan-Meier curve:

set.seed(1990)
N = 200
x1 <- stats::rweibull(N, 1, 1)
x2 <- stats::rweibull(N, 2, 1)
d <- x1 < x2
x <- pmin(x1,x2)
mydata = data.frame(stop = x, event = d)
S = survival::survfit(Surv(x, d) ~ 1, type="kaplan-meier")

Now we can estimate the optimized confidence bands and plot the curve:

opt_S <- optband::opt.ci(S, conf.level = 0.95, fun = "surv", tl = NA, tu = NA)
plot(opt_S, xlab="time", ylab="KM", mark.time=FALSE)

And we can do the same with the estimated cumulative-hazard function:

opt_H <- optband::opt.ci(S, conf.level = 0.95, fun = "cumhaz", tl = NA, tu = NA)
plot(opt_H, fun="cumhaz", xlab="time", ylab="CH", mark.time=FALSE)

We can further play with the procedure by adjusting the coverage level and also truncating the data:

opt_H <- optband::opt.ci(S, conf.level = 0.90, fun = "cumhaz", tl = .1, tu = .9)
plot(opt_H, fun="cumhaz", xlab="time", ylab="CH", mark.time=FALSE)

And compare the results of this band to the Equal Precision and Hall-Wellner bands (for this we will use the km.ci package):

color <- c("grey", "darkblue", "red", "green")
plot(S, mark.time=FALSE, xlab="time", ylab="KM", col = "white")

lines(opt_S, col=color[2], lty=2, lwd=2, mark.time=F)
e <- km.ci::km.ci(S, conf.level = 0.95, method = "epband")
h <- km.ci::km.ci(S, conf.level = 0.95, method = "hall-wellner")
lines(e, col=color[3], lty=3, lwd=2, mark.time=F)
lines(h, col=color[4], lty=4, lwd=2, mark.time=F)
lines(S,col="grey", lwd=2, mark.time=F)

legend("topright", c("KM", "optband", "epband", "hall-wellner"), 
       lwd=2, lty=1:4, col=color)

Lastly, while the 2-sample survival function difference confidence bands are intractable with this method, we can estimate confidence bands for the cumulative hazard function difference. For this, we’ll use the bladder data set from the survival package, first subsetting upon only the times until first recurrence:

dat <- bladder[bladder$enum==1,]
Hdif = survival::survfit(Surv(stop, event) ~ rx, type="kaplan-meier", data=dat)
opt_Hdif <- optband::opt.ci(Hdif, fun="cumhaz", conf.level = 0.95, samples=2)

plot(opt_Hdif$difference~opt_Hdif$time, xlab="t", ylim=c(-1.5,1),
     main=expression(hat(Lambda)(t)[1]-hat(Lambda)(t)[2]), ylab="", col = "white")
lines(opt_Hdif$difference~opt_Hdif$time, col = "grey", lty=1, lwd=2)
lines(-log(opt_Hdif$upper)~opt_Hdif$time, col=4, lty=3, lwd=2)
lines(-log(opt_Hdif$lower)~opt_Hdif$time, col=4, lty=3, lwd=2)

References

Kendall, W.S., Marin, J.M. and Robert, C.P. (2007). Confidence bands for Brownian motion and applications to Monte Carlo simulation. Statistics and Computing, 17(1), pp.1-10.

Therneau T (2015). A Package for Survival Analysis in S. version 2.38, <URL: https://CRAN.R-project.org/package=survival>.

Terry M. Therneau and Patricia M. Grambsch (2000). Modeling Survival Data: Extending the Cox Model. Springer, New York. ISBN 0-387-98784-3.

Ralf Strobl (2009). km.ci: Confidence intervals for the Kaplan-Meier estimator. R package version 0.5-2. https://CRAN.R-project.org/package=km.ci

Georg M. Goerg (2016). LambertW: An R package for Lambert W x F Random Variables. R package version 0.6.4.

Georg M. Goerg (2011): Lambert W random variables - a new family of generalized skewed distributions with applications to risk estimation. Annals of Applied Statistics 3(5). 2197-2230.

Georg M. Goerg (2014): The Lambert Way to Gaussianize heavy-tailed data with the inverse of Tukey’s h transformation as a special case. The Scientific World Journal.