The dsrrec()
function computes directly standardized rates for recurrent and non-recurrent outcomes.
For recurrent outcomes, confidence intervals are computed using the negative binomial approach to variance estimation for directly standardized rates by Stukel et al. (1994).
For non-recurrent outcomes, the gamma distribution approach outlined in Fay et al. (1997) is used.
The dsrrec()
function expects person-level event counts and unit-times. On each person-level record, the following variables should be present:
Note: For the standard or reference population, the data must be aggregated by the standardization variables and the unit-time variable name must labeled pop. See example below.
The readmission
dataset from frailtypack
contains rehospitalization times (in days) following surgery in patients diagnosed with colorectal cancer. We will use this data to examine directly (sex) standardized rates for rehospitalizations by Dukes’ tumoral stage, a subgroup variable.
Since the data are in a person-period format, we’ll have to calculate the total number of events and total observation time for each person. This will result in a single record per person that summarizes the relevant information.
#Calculate total events and total observation times per person
treadm <- readmission %>%
group_by(id) %>%
filter(max(enum)==enum ) %>%
mutate(events=enum-1,time=t.stop) %>%
select(id, events, time, sex, dukes)
#View first 6 records
knitr::kable(head(treadm), caption='Person-level Dataset')
id | events | time | sex | dukes |
---|---|---|---|---|
1 | 2 | 1037 | Female | D |
2 | 1 | 1182 | Male | C |
3 | 1 | 783 | Male | C |
4 | 4 | 2048 | Female | A-B |
5 | 1 | 1144 | Female | C |
6 | 3 | 1407 | Male | A-B |
The next step is to form the standardized or reference population, which for this analysis, is the total observation time across our study. Since we are standardizing by sex, we’ll compute total observation time by sex. As noted previously, the unit-time must be labelled pop.
#Make the reference data
tref <- treadm %>%
group_by(sex) %>%
mutate(pop=sum(time)) %>%
select(sex, pop) %>%
distinct(sex, pop)
#View
knitr::kable(tref, caption='Reference Dataset')
sex | pop |
---|---|
Female | 180198 |
Male | 233093 |
Finally, lets calculate directly (sex) standardized rates for rehospitalizations by Dukes’ tumoral stage.
require(dsr)
my_analysis <- dsrrec(data=treadm,
event=events,
fu=time,
refdata=tref,
subgroup=dukes,
sex,
sig=0.95,
mp=1000,
decimals=3)
Subgroup | Numerator | Denominator | Cr Rate (per 1000) | Std Rate (per 1000) | 95% LCL (NB) | 95% UCL (NB) | 95% LCL (Gamma) | 95% UCL (Gamma) |
---|---|---|---|---|---|---|---|---|
D | 143 | 31444 | 4.548 | 4.541 | 3.793 | 5.289 | 3.828 | 5.350 |
C | 193 | 170190 | 1.134 | 1.172 | 1.006 | 1.339 | 1.012 | 1.351 |
A-B | 169 | 211657 | 0.798 | 0.782 | 0.663 | 0.900 | 0.668 | 0.909 |
By default, confidence intervals for recurrent (NB) and non-current (Gamma) outcomes are calculated.
Stukel, T. A., Glynn, R. J., Fisher, E. S., Sharp, S. M., Lu-Yao, G and Wennberg, J. E. (1994). Standardized rates of recurrent outcomes. Statistics in Medicine, 13, 1781-1791.
Fay, M.P., & Feuer, E.J. (1997). Confidence intervals for directly standardized rates: a method based on the gamma distribution. Statistics in Medicine, 16, 791-801.