The dsr() function computes crude and directly standardized event rates with a variety of user-defined options.
The dsrr() function can be used to compare directly standardized event rates through rate differences or ratios.
Standard errors for directly standardized rates are calculated using Chiang’s (1961) method.
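For orientation, the quantities involved are often written as below. The notation is ours (d_i events, n_i unit-time, and N_i standard-population unit-time in stratum i), and the variance assumes Poisson-distributed event counts within each stratum. Treat it as a sketch of the usual form of the estimator, not a transcription of the package internals.

```latex
\hat{R} = \sum_{i} w_i \,\frac{d_i}{n_i},
\qquad
w_i = \frac{N_i}{\sum_{j} N_j},
\qquad
\widehat{\operatorname{Var}}(\hat{R}) = \sum_{i} w_i^{2}\,\frac{d_i}{n_i^{2}}
```

The standard error is the square root of the variance term, and multiplying the rate by a constant such as 1000 scales the standard error by the same constant.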
The dsr() and dsrr() functions expect event counts and unit-times aggregated by the standardization variables and, where one is used, the subgroup variable.
If you are working with individual-level data, you will need to aggregate it beforehand. The R package dplyr can do this quite easily.
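As a rough illustration of that aggregation step, the sketch below collapses a small, made-up individual-level data set into one row per stratum. The data frame df_ind and its columns died and years are hypothetical names chosen for this example; only the dplyr verbs are real.

```r
library(dplyr)

# Hypothetical individual-level data: one row per person.
# `died` and `years` are illustrative column names, not package requirements.
df_ind <- data.frame(
  age   = c('00-14', '00-14', '15-34', '15-34', '15-34'),
  sex   = c('m', 'm', 'f', 'f', 'f'),
  state = 'Alaska',
  died  = c(0, 1, 0, 0, 1),            # 1 = event occurred
  years = c(2.0, 0.5, 3.0, 1.5, 1.0)   # follow-up time contributed
)

# One row per age/sex/state stratum, with summed events and unit-time
df_agg <- df_ind %>%
  group_by(age, sex, state) %>%
  summarise(death = sum(died), fu = sum(years), .groups = 'drop') %>%
  as.data.frame()

df_agg
```

The resulting columns death and fu play the same role as the event and unit-time columns used in the rest of this document.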
Note: For the standard or reference population, the data must be aggregated by the standardization variables, and the unit-time variable must be named pop. See the example below.
To illustrate the functions of the package, we recreate an analysis that calculates and compares directly standardized mortality rates for the states of Alaska and Florida.
Additional details of the original analysis and data can be found here.
#Alaska death counts and person-years by Age and Sex
df_a <- data.frame(age=rep(c('00-14','15-34','35-54','55-74','75+'),2),
                   sex=rep(c('m','f'),c(5,5)),
                   death=c(37,68,206,369,556,78,181,395,555,479),
                   fu=c(81205,93662,108615,35139,5491,77203,85412,100386,32118,7701),
                   state='Alaska'
                   )
#Florida death counts and person-years by Age and Sex
df_f <- data.frame(age=rep(c('00-14','15-34','35-54','55-74','75+'),2),
                   sex=rep(c('m','f'),c(5,5)),
                   death=c(1189,2962,10279,26354,42443,906,1234,5630,18309,53489),
                   fu=c(1505889,1972157,2197912,1383533,554632,1445831,1870430,2246737,1612270,868838),
                   state='Florida'
                   )
#Merge state data together
df_all <- rbind(df_a, df_f)
knitr::kable(df_all, caption='State Specific Counts')
age | sex | death | fu | state |
---|---|---|---|---|
00-14 | m | 37 | 81205 | Alaska |
15-34 | m | 68 | 93662 | Alaska |
35-54 | m | 206 | 108615 | Alaska |
55-74 | m | 369 | 35139 | Alaska |
75+ | m | 556 | 5491 | Alaska |
00-14 | f | 78 | 77203 | Alaska |
15-34 | f | 181 | 85412 | Alaska |
35-54 | f | 395 | 100386 | Alaska |
55-74 | f | 555 | 32118 | Alaska |
75+ | f | 479 | 7701 | Alaska |
00-14 | m | 1189 | 1505889 | Florida |
15-34 | m | 2962 | 1972157 | Florida |
35-54 | m | 10279 | 2197912 | Florida |
55-74 | m | 26354 | 1383533 | Florida |
75+ | m | 42443 | 554632 | Florida |
00-14 | f | 906 | 1445831 | Florida |
15-34 | f | 1234 | 1870430 | Florida |
35-54 | f | 5630 | 2246737 | Florida |
55-74 | f | 18309 | 1612270 | Florida |
75+ | f | 53489 | 868838 | Florida |
#Standard population person-years by Age and Sex
df_pop <- data.frame(age=rep(c('00-14','15-34','35-54','55-74','75+'),2),
                     sex=rep(c('m','f'),c(5,5)),
                     pop=c(30854207,40199647,40945028,19948630,6106351,
                           29399168,38876268,41881451,22717040,10494416)
                     )
knitr::kable(df_pop, caption='US Person-Years')
age | sex | pop |
---|---|---|
00-14 | m | 30854207 |
15-34 | m | 40199647 |
35-54 | m | 40945028 |
55-74 | m | 19948630 |
75+ | m | 6106351 |
00-14 | f | 29399168 |
15-34 | f | 38876268 |
35-54 | f | 41881451 |
55-74 | f | 22717040 |
75+ | f | 10494416 |
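Before standardizing, it can help to confirm that the reference table has the shape described in the note earlier and to look at the weight each stratum would carry. The checks and the object name wts below are our own illustration, written in base R; they are not part of the package.

```r
# Standard population, as in the table above
df_pop <- data.frame(age = rep(c('00-14','15-34','35-54','55-74','75+'), 2),
                     sex = rep(c('m','f'), c(5,5)),
                     pop = c(30854207,40199647,40945028,19948630,6106351,
                             29399168,38876268,41881451,22717040,10494416))

# The unit-time column must be called `pop`
stopifnot('pop' %in% names(df_pop))

# Each age/sex stratum should appear exactly once
stopifnot(!any(duplicated(df_pop[, c('age', 'sex')])))

# Share of the standard population in each stratum (hypothetical name `wts`)
wts <- df_pop$pop / sum(df_pop$pop)

# Display the weights next to their strata without modifying df_pop
data.frame(df_pop[, c('age', 'sex')], weight = round(wts, 4))
```

Direct standardization applies these shares to the stratum-specific rates of each state, so every state is evaluated against the same age and sex distribution.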
Directly standardized mortality rates for Alaska and Florida are presented here. A 95% confidence interval is requested using the gamma method. A rate multiplier of 1000 (i.e., per 1000) is also specified.
library(dsr)
my_results <- dsr(data=df_all,
                  event=death,
                  fu=fu,
                  subgroup=state,
                  age, sex,          # standardization variables
                  refdata=df_pop,
                  method="gamma",
                  sig=0.95,
                  mp=1000,
                  decimals=4)
#> Joining, by = c("age", "sex")
knitr::kable(my_results)
Subgroup | Numerator | Denominator | Crude Rate (per 1000) | 95% LCL (Crude) | 95% UCL (Crude) | Std Rate (per 1000) | 95% LCL (Std) | 95% UCL (Std) |
---|---|---|---|---|---|---|---|---|
Alaska | 2924 | 626932 | 4.6640 | 4.4964 | 4.8362 | 8.0693 | 7.7504 | 8.3979 |
Florida | 162795 | 15658229 | 10.3968 | 10.3463 | 10.4474 | 7.7342 | 7.6959 | 7.7726 |
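To make the numbers in this table less of a black box, the following base R sketch recomputes the crude and standardized rates by hand and attaches a gamma-based interval to the standardized rate. The helper manual_dsr() is hypothetical, the variance is the Poisson-based form, and the gamma shape and scale used here are an assumption on our part (a simplified interval in the spirit of Fay and Feuer, 1997), not a statement of what dsr() does internally. It should land close to the values printed above.

```r
# Same inputs as above, rebuilt so that this block runs on its own
age <- rep(c('00-14','15-34','35-54','55-74','75+'), 2)
sex <- rep(c('m','f'), c(5,5))
df_all <- rbind(
  data.frame(age, sex, state = 'Alaska',
             death = c(37,68,206,369,556,78,181,395,555,479),
             fu    = c(81205,93662,108615,35139,5491,
                       77203,85412,100386,32118,7701)),
  data.frame(age, sex, state = 'Florida',
             death = c(1189,2962,10279,26354,42443,906,1234,5630,18309,53489),
             fu    = c(1505889,1972157,2197912,1383533,554632,
                       1445831,1870430,2246737,1612270,868838)))
df_pop <- data.frame(age, sex,
                     pop = c(30854207,40199647,40945028,19948630,6106351,
                             29399168,38876268,41881451,22717040,10494416))

# Attach the standard population to each stratum and form the weights
d <- merge(df_all, df_pop, by = c('age', 'sex'))
d$wt <- d$pop / sum(df_pop$pop)

# Hypothetical helper: rates, variance and gamma limits for one subgroup
manual_dsr <- function(g, sig = 0.95, mp = 1000) {
  crude <- sum(g$death) / sum(g$fu)            # crude rate
  std   <- sum(g$wt * g$death / g$fu)          # directly standardized rate
  v     <- sum(g$wt^2 * g$death / g$fu^2)      # Poisson-based variance
  a     <- (1 - sig) / 2
  # Assumed gamma parameterization: shape std^2/v (+1 for the upper limit)
  lcl   <- qgamma(a,     shape = std^2 / v,     scale = v / std)
  ucl   <- qgamma(1 - a, shape = std^2 / v + 1, scale = v / std)
  round(mp * c(crude = crude, std = std, lcl = lcl, ucl = ucl), 4)
}

# One row per state
manual <- t(sapply(split(d, d$state), manual_dsr))
manual
```

Alaska's crude rate is less than half of Florida's, yet its standardized rate is slightly higher: once both states are weighted to the same age and sex distribution, Florida's older population no longer inflates its rate.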
A rate ratio comparing the directly standardized mortality rate of Alaska to Florida is requested. A 95% log-normal confidence interval is computed.
my_results <- dsrr(data=df_all,
                   event=death,
                   fu=fu,
                   subgroup=state,
                   age, sex,          # standardization variables
                   refdata=df_pop,
                   refgroup="Florida",
                   estimate="ratio",
                   sig=0.95,
                   mp=1000,
                   decimals=4)
#> Joining, by = c("age", "sex")
knitr::kable(my_results)
Comparator | Reference | Std Rate (per 1000) | Rate Ratio (RR) | 95% LCL (RR) | 95% UCL (RR) |
---|---|---|---|---|---|
Alaska | Florida | 8.0693 | 1.0433 | 1.0031 | 1.0835 |
Florida | Florida | 7.7342 | 1.0000 | 0.9930 | 1.0070 |
A rate difference comparing the directly standardized mortality rate of Alaska to Florida is requested. A 95% normal confidence interval is computed.
my_results2 <- dsrr(data=df_all,
                    event=death,
                    fu=fu,
                    subgroup=state,
                    age, sex,          # standardization variables
                    refdata=df_pop,
                    refgroup="Florida",
                    estimate="difference",
                    sig=0.95,
                    mp=1000,
                    decimals=4)
#> Joining, by = c("age", "sex")
knitr::kable(my_results2)
Comparator | Reference | Std Rate (per 1000) | Rate Difference (RD) | 95% LCL (RD) | 95% UCL (RD) |
---|---|---|---|---|---|
Alaska | Florida | 8.0693 | 0.3351 | 0.0108 | 0.6594 |
Florida | Florida | 7.7342 | 0.0000 | -0.0541 | 0.0541 |
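The two comparisons above can also be sketched by hand from the standardized rates and their variances. The code below is an illustration under our own assumptions: Poisson-based variances, independence between the two states, a normal interval for the difference, and a textbook delta-method interval on the log scale for the ratio. The log-scale interval is asymmetric around the ratio, so its limits may not match the dsrr() output digit for digit.

```r
# Same inputs as above, rebuilt so that this block runs on its own
age <- rep(c('00-14','15-34','35-54','55-74','75+'), 2)
sex <- rep(c('m','f'), c(5,5))
df_all <- rbind(
  data.frame(age, sex, state = 'Alaska',
             death = c(37,68,206,369,556,78,181,395,555,479),
             fu    = c(81205,93662,108615,35139,5491,
                       77203,85412,100386,32118,7701)),
  data.frame(age, sex, state = 'Florida',
             death = c(1189,2962,10279,26354,42443,906,1234,5630,18309,53489),
             fu    = c(1505889,1972157,2197912,1383533,554632,
                       1445831,1870430,2246737,1612270,868838)))
df_pop <- data.frame(age, sex,
                     pop = c(30854207,40199647,40945028,19948630,6106351,
                             29399168,38876268,41881451,22717040,10494416))
d <- merge(df_all, df_pop, by = c('age', 'sex'))
d$wt <- d$pop / sum(df_pop$pop)

# Standardized rate and Poisson-based variance per state (per unit-time)
est <- sapply(split(d, d$state), function(g)
  c(rate = sum(g$wt * g$death / g$fu),
    var  = sum(g$wt^2 * g$death / g$fu^2)))

z  <- qnorm(0.975)   # two-sided 95% interval
mp <- 1000           # rate multiplier
rA <- est['rate', 'Alaska'];  vA <- est['var', 'Alaska']
rF <- est['rate', 'Florida']; vF <- est['var', 'Florida']

# Rate difference with a normal interval
rd    <- rA - rF
rd_ci <- rd + c(-1, 1) * z * sqrt(vA + vF)

# Rate ratio with an interval built on the log scale (delta method)
rr     <- rA / rF
se_log <- sqrt(vA / rA^2 + vF / rF^2)
rr_ci  <- exp(log(rr) + c(-1, 1) * z * se_log)

round(mp * c(RD = rd, LCL = rd_ci[1], UCL = rd_ci[2]), 4)
round(c(RR = rr, LCL = rr_ci[1], UCL = rr_ci[2]), 4)
```

Both intervals exclude the null value (0 for the difference, 1 for the ratio), which agrees with the direction of the results reported in the tables.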
Chiang, C. (1961). Standard error of the age-adjusted death rate. US Department of Health, Education and Welfare: Vital Statistics Special Reports, 47, 271-285.
Elandt-Johnson, R. C., & Johnson, N. L. (1980). Survival Models and Data Analysis. New York: John Wiley & Sons.
Fay, M. P., & Feuer, E. J. (1997). Confidence intervals for directly standardized rates: a method based on the gamma distribution. Statistics in Medicine, 16, 791-801.