The DisImpact
R package contains functions that help in determining disproportionate impact (DI) based on the following methodologies:
# From CRAN (Official)
## install.packages('DisImpact')
# From github (Development)
## devtools::install_github('vinhdizzo/DisImpact')
library(DisImpact)
library(dplyr) # Ease in manipulations with data frames
To illustrate the functionality of the package, let's load a toy data set:
# Load fake data set
data(student_equity)
The toy data set can be summarized as follows:
# Summarize toy data
dim(student_equity)
## [1] 20000 4
dSumm <- student_equity %>%
group_by(Cohort, Ethnicity) %>%
summarize(n=n(), Transfer_Rate=mean(Transfer))
dSumm
## # A tibble: 12 x 4
## # Groups: Cohort [2]
## Cohort Ethnicity n Transfer_Rate
## <int> <chr> <int> <dbl>
## 1 2017 Asian 3000 0.687
## 2 2017 Black 1000 0.31
## 3 2017 Hispanic 2000 0.205
## 4 2017 Multi-Ethnicity 500 0.524
## 5 2017 Native American 100 0.43
## 6 2017 White 3400 0.604
## 7 2018 Asian 3000 0.743
## 8 2018 Black 1000 0.297
## 9 2018 Hispanic 2000 0.218
## 10 2018 Multi-Ethnicity 500 0.484
## 11 2018 Native American 100 0.35
## 12 2018 White 3400 0.631
di_ppg
is the main work function, and it can take on vectors or column names the tidy way:
# Vector
di_ppg(success=student_equity$Transfer, group=student_equity$Ethnicity) %>% as.data.frame
## group n success pct reference moe pct_lo
## 1 Asian 6000 4292 0.7153333 0.5264 0.03000000 0.6853333
## 2 Black 2000 607 0.3035000 0.5264 0.03000000 0.2735000
## 3 Hispanic 4000 847 0.2117500 0.5264 0.03000000 0.1817500
## 4 Multi-Ethnicity 1000 504 0.5040000 0.5264 0.03099032 0.4730097
## 5 Native American 200 78 0.3900000 0.5264 0.06929646 0.3207035
## 6 White 6800 4200 0.6176471 0.5264 0.03000000 0.5876471
## pct_hi di_indicator
## 1 0.7453333 0
## 2 0.3335000 1
## 3 0.2417500 1
## 4 0.5349903 0
## 5 0.4592965 1
## 6 0.6476471 0
# Tidy and column reference
di_ppg(success=Transfer, group=Ethnicity, data=student_equity) %>%
as.data.frame
## group n success pct reference moe pct_lo
## 1 Asian 6000 4292 0.7153333 0.5264 0.03000000 0.6853333
## 2 Black 2000 607 0.3035000 0.5264 0.03000000 0.2735000
## 3 Hispanic 4000 847 0.2117500 0.5264 0.03000000 0.1817500
## 4 Multi-Ethnicity 1000 504 0.5040000 0.5264 0.03099032 0.4730097
## 5 Native American 200 78 0.3900000 0.5264 0.06929646 0.3207035
## 6 White 6800 4200 0.6176471 0.5264 0.03000000 0.5876471
## pct_hi di_indicator
## 1 0.7453333 0
## 2 0.3335000 1
## 3 0.2417500 1
## 4 0.5349903 0
## 5 0.4592965 1
## 6 0.6476471 0
Sometimes, one might want to break out the DI calculation by cohort:
# Cohort
di_ppg(success=Transfer, group=Ethnicity, cohort=Cohort, data=student_equity) %>%
as.data.frame
## cohort group n success pct reference moe pct_lo
## 1 2017 Asian 3000 2062 0.6873333 0.5140 0.03000000 0.6573333
## 2 2017 Black 1000 310 0.3100000 0.5140 0.03099032 0.2790097
## 3 2017 Hispanic 2000 410 0.2050000 0.5140 0.03000000 0.1750000
## 4 2017 Multi-Ethnicity 500 262 0.5240000 0.5140 0.04382693 0.4801731
## 5 2017 Native American 100 43 0.4300000 0.5140 0.09800000 0.3320000
## 6 2017 White 3400 2053 0.6038235 0.5140 0.03000000 0.5738235
## 7 2018 Asian 3000 2230 0.7433333 0.5388 0.03000000 0.7133333
## 8 2018 Black 1000 297 0.2970000 0.5388 0.03099032 0.2660097
## 9 2018 Hispanic 2000 437 0.2185000 0.5388 0.03000000 0.1885000
## 10 2018 Multi-Ethnicity 500 242 0.4840000 0.5388 0.04382693 0.4401731
## 11 2018 Native American 100 35 0.3500000 0.5388 0.09800000 0.2520000
## 12 2018 White 3400 2147 0.6314706 0.5388 0.03000000 0.6014706
## pct_hi di_indicator
## 1 0.7173333 0
## 2 0.3409903 1
## 3 0.2350000 1
## 4 0.5678269 0
## 5 0.5280000 0
## 6 0.6338235 0
## 7 0.7733333 0
## 8 0.3279903 1
## 9 0.2485000 1
## 10 0.5278269 1
## 11 0.4480000 1
## 12 0.6614706 0
di_ppg
is also applicable to summarized data; just pass the counts to success
and group size to weight
:
di_ppg(success=Transfer_Rate*n, group=Ethnicity, cohort=Cohort, weight=n, data=dSumm) %>%
as.data.frame
## cohort group n success pct reference moe pct_lo
## 1 2017 Asian 3000 2062 0.6873333 0.5140 0.03000000 0.6573333
## 2 2017 Black 1000 310 0.3100000 0.5140 0.03099032 0.2790097
## 3 2017 Hispanic 2000 410 0.2050000 0.5140 0.03000000 0.1750000
## 4 2017 Multi-Ethnicity 500 262 0.5240000 0.5140 0.04382693 0.4801731
## 5 2017 Native American 100 43 0.4300000 0.5140 0.09800000 0.3320000
## 6 2017 White 3400 2053 0.6038235 0.5140 0.03000000 0.5738235
## 7 2018 Asian 3000 2230 0.7433333 0.5388 0.03000000 0.7133333
## 8 2018 Black 1000 297 0.2970000 0.5388 0.03099032 0.2660097
## 9 2018 Hispanic 2000 437 0.2185000 0.5388 0.03000000 0.1885000
## 10 2018 Multi-Ethnicity 500 242 0.4840000 0.5388 0.04382693 0.4401731
## 11 2018 Native American 100 35 0.3500000 0.5388 0.09800000 0.2520000
## 12 2018 White 3400 2147 0.6314706 0.5388 0.03000000 0.6014706
## pct_hi di_indicator
## 1 0.7173333 0
## 2 0.3409903 1
## 3 0.2350000 1
## 4 0.5678269 0
## 5 0.5280000 0
## 6 0.6338235 0
## 7 0.7733333 0
## 8 0.3279903 1
## 9 0.2485000 1
## 10 0.5278269 1
## 11 0.4480000 1
## 12 0.6614706 0
The user could also pass in custom reference points for comparison (eg, a state-wide rate). di_ppg
accepts either a single reference point to be used or a vector of reference points, one for each cohort. For the latter, the vector of reference points will be taken to correspond to the cohort
variable, alphabetically ordered.
# With custom reference (single)
di_ppg(success=Transfer, group=Ethnicity, reference=0.54, data=student_equity) %>%
as.data.frame
## group n success pct reference moe pct_lo
## 1 Asian 6000 4292 0.7153333 0.54 0.03000000 0.6853333
## 2 Black 2000 607 0.3035000 0.54 0.03000000 0.2735000
## 3 Hispanic 4000 847 0.2117500 0.54 0.03000000 0.1817500
## 4 Multi-Ethnicity 1000 504 0.5040000 0.54 0.03099032 0.4730097
## 5 Native American 200 78 0.3900000 0.54 0.06929646 0.3207035
## 6 White 6800 4200 0.6176471 0.54 0.03000000 0.5876471
## pct_hi di_indicator
## 1 0.7453333 0
## 2 0.3335000 1
## 3 0.2417500 1
## 4 0.5349903 1
## 5 0.4592965 1
## 6 0.6476471 0
# With custom reference (multiple)
di_ppg(success=Transfer, group=Ethnicity, cohort=Cohort, reference=c(0.5, 0.55), data=student_equity) %>%
as.data.frame
## Joining, by = "cohort"
## cohort group n success pct reference moe pct_lo
## 1 2017 Asian 3000 2062 0.6873333 0.50 0.03000000 0.6573333
## 2 2017 Black 1000 310 0.3100000 0.50 0.03099032 0.2790097
## 3 2017 Hispanic 2000 410 0.2050000 0.50 0.03000000 0.1750000
## 4 2017 Multi-Ethnicity 500 262 0.5240000 0.50 0.04382693 0.4801731
## 5 2017 Native American 100 43 0.4300000 0.50 0.09800000 0.3320000
## 6 2017 White 3400 2053 0.6038235 0.50 0.03000000 0.5738235
## 7 2018 Asian 3000 2230 0.7433333 0.55 0.03000000 0.7133333
## 8 2018 Black 1000 297 0.2970000 0.55 0.03099032 0.2660097
## 9 2018 Hispanic 2000 437 0.2185000 0.55 0.03000000 0.1885000
## 10 2018 Multi-Ethnicity 500 242 0.4840000 0.55 0.04382693 0.4401731
## 11 2018 Native American 100 35 0.3500000 0.55 0.09800000 0.2520000
## 12 2018 White 3400 2147 0.6314706 0.55 0.03000000 0.6014706
## pct_hi di_indicator
## 1 0.7173333 0
## 2 0.3409903 1
## 3 0.2350000 1
## 4 0.5678269 0
## 5 0.5280000 0
## 6 0.6338235 0
## 7 0.7733333 0
## 8 0.3279903 1
## 9 0.2485000 1
## 10 0.5278269 1
## 11 0.4480000 1
## 12 0.6614706 0
The margin of error (MOE) in di_ppg
has 2 underlying assumptions (defaults):
To override 1, the user could specify min_moe
in di_ppg
. To override 2, the user could specify use_prop_in_moe=TRUE
in di_ppg
.
# min_moe
di_ppg(success=Transfer, group=Ethnicity, data=student_equity, min_moe=0.02) %>%
as.data.frame
## group n success pct reference moe pct_lo
## 1 Asian 6000 4292 0.7153333 0.5264 0.02000000 0.6953333
## 2 Black 2000 607 0.3035000 0.5264 0.02191347 0.2815865
## 3 Hispanic 4000 847 0.2117500 0.5264 0.02000000 0.1917500
## 4 Multi-Ethnicity 1000 504 0.5040000 0.5264 0.03099032 0.4730097
## 5 Native American 200 78 0.3900000 0.5264 0.06929646 0.3207035
## 6 White 6800 4200 0.6176471 0.5264 0.02000000 0.5976471
## pct_hi di_indicator
## 1 0.7353333 0
## 2 0.3254135 1
## 3 0.2317500 1
## 4 0.5349903 0
## 5 0.4592965 1
## 6 0.6376471 0
# use_prop_in_moe
di_ppg(success=Transfer, group=Ethnicity, data=student_equity, min_moe=0.02, use_prop_in_moe=TRUE) %>%
as.data.frame
## group n success pct reference moe pct_lo
## 1 Asian 6000 4292 0.7153333 0.5264 0.02000000 0.6953333
## 2 Black 2000 607 0.3035000 0.5264 0.02015028 0.2833497
## 3 Hispanic 4000 847 0.2117500 0.5264 0.02000000 0.1917500
## 4 Multi-Ethnicity 1000 504 0.5040000 0.5264 0.03098933 0.4730107
## 5 Native American 200 78 0.3900000 0.5264 0.06759869 0.3224013
## 6 White 6800 4200 0.6176471 0.5264 0.02000000 0.5976471
## pct_hi di_indicator
## 1 0.7353333 0
## 2 0.3236503 1
## 3 0.2317500 1
## 4 0.5349893 0
## 5 0.4575987 1
## 6 0.6376471 0
In cases where the proportion is used in calculating MOE, a proportion of 0 or 1 would lead to a zero MOE. To account for these scenarios, the user could leverage the prop_sub_0
and prop_sub_1
parameters in di_ppg
and ppg_moe
as substitutes. These parameters default to 0.5
.
# Set Native American to have have zero transfers and see what the results
di_ppg(success=Transfer, group=Ethnicity, data=student_equity %>% mutate(Transfer=ifelse(Ethnicity=='Native American', 0, Transfer)), use_prop_in_moe=TRUE, prop_sub_0=0.1, prop_sub_1=0.9) %>%
as.data.frame
## Warning in ppg_moe(n = n, proportion = pct, min_moe = min_moe, prop_sub_0 =
## prop_sub_0, : The vector `proportion` contains 0. This will lead to a zero MOE.
## `prop_sub_0=0.1` will be used in calculating the MOE for these cases.
## group n success pct reference moe pct_lo
## 1 Asian 6000 4292 0.7153333 0.5225 0.03000000 0.68533333
## 2 Black 2000 607 0.3035000 0.5225 0.03000000 0.27350000
## 3 Hispanic 4000 847 0.2117500 0.5225 0.03000000 0.18175000
## 4 Multi-Ethnicity 1000 504 0.5040000 0.5225 0.03098933 0.47301067
## 5 Native American 200 0 0.0000000 0.5225 0.04157788 -0.04157788
## 6 White 6800 4200 0.6176471 0.5225 0.03000000 0.58764706
## pct_hi di_indicator
## 1 0.74533333 0
## 2 0.33350000 1
## 3 0.24175000 1
## 4 0.53498933 0
## 5 0.04157788 1
## 6 0.64764706 0
di_prop_index
is the main work function for this method, and it can take on vectors or column names the tidy way:
# Without cohort
## Vector
di_prop_index(success=student_equity$Transfer, group=student_equity$Ethnicity) %>% as.data.frame
## group n success pct_success pct_group di_prop_index
## 1 Asian 6000 4292 0.407674772 0.30 1.3589159
## 2 Black 2000 607 0.057655775 0.10 0.5765578
## 3 Hispanic 4000 847 0.080452128 0.20 0.4022606
## 4 Multi-Ethnicity 1000 504 0.047872340 0.05 0.9574468
## 5 Native American 200 78 0.007408815 0.01 0.7408815
## 6 White 6800 4200 0.398936170 0.34 1.1733417
## Tidy and column reference
di_prop_index(success=Transfer, group=Ethnicity, data=student_equity) %>%
as.data.frame
## group n success pct_success pct_group di_prop_index
## 1 Asian 6000 4292 0.407674772 0.30 1.3589159
## 2 Black 2000 607 0.057655775 0.10 0.5765578
## 3 Hispanic 4000 847 0.080452128 0.20 0.4022606
## 4 Multi-Ethnicity 1000 504 0.047872340 0.05 0.9574468
## 5 Native American 200 78 0.007408815 0.01 0.7408815
## 6 White 6800 4200 0.398936170 0.34 1.1733417
# With cohort
## Vector
di_prop_index(success=student_equity$Transfer, group=student_equity$Ethnicity, cohort=student_equity$Cohort) %>% as.data.frame
## cohort group n success pct_success pct_group di_prop_index
## 1 2017 Asian 3000 2062 0.401167315 0.30 1.3372244
## 2 2017 Black 1000 310 0.060311284 0.10 0.6031128
## 3 2017 Hispanic 2000 410 0.079766537 0.20 0.3988327
## 4 2017 Multi-Ethnicity 500 262 0.050972763 0.05 1.0194553
## 5 2017 Native American 100 43 0.008365759 0.01 0.8365759
## 6 2017 White 3400 2053 0.399416342 0.34 1.1747539
## 7 2018 Asian 3000 2230 0.413882702 0.30 1.3796090
## 8 2018 Black 1000 297 0.055122494 0.10 0.5512249
## 9 2018 Hispanic 2000 437 0.081106162 0.20 0.4055308
## 10 2018 Multi-Ethnicity 500 242 0.044914625 0.05 0.8982925
## 11 2018 Native American 100 35 0.006495917 0.01 0.6495917
## 12 2018 White 3400 2147 0.398478099 0.34 1.1719944
## Tidy and column reference
di_prop_index(success=Transfer, group=Ethnicity, cohort=Cohort, data=student_equity) %>%
as.data.frame
## cohort group n success pct_success pct_group di_prop_index
## 1 2017 Asian 3000 2062 0.401167315 0.30 1.3372244
## 2 2017 Black 1000 310 0.060311284 0.10 0.6031128
## 3 2017 Hispanic 2000 410 0.079766537 0.20 0.3988327
## 4 2017 Multi-Ethnicity 500 262 0.050972763 0.05 1.0194553
## 5 2017 Native American 100 43 0.008365759 0.01 0.8365759
## 6 2017 White 3400 2053 0.399416342 0.34 1.1747539
## 7 2018 Asian 3000 2230 0.413882702 0.30 1.3796090
## 8 2018 Black 1000 297 0.055122494 0.10 0.5512249
## 9 2018 Hispanic 2000 437 0.081106162 0.20 0.4055308
## 10 2018 Multi-Ethnicity 500 242 0.044914625 0.05 0.8982925
## 11 2018 Native American 100 35 0.006495917 0.01 0.6495917
## 12 2018 White 3400 2147 0.398478099 0.34 1.1719944
di_80_index
is the main work function for this method, and it can take on vectors or column names the tidy way:
# Without cohort
## Vector
di_80_index(success=student_equity$Transfer, group=student_equity$Ethnicity) %>% as.data.frame
## group n success pct reference di_80_index di_indicator
## 1 Asian 6000 4292 0.7153333 0.7153333 1.0000000 0
## 2 Black 2000 607 0.3035000 0.7153333 0.4242777 1
## 3 Hispanic 4000 847 0.2117500 0.7153333 0.2960158 1
## 4 Multi-Ethnicity 1000 504 0.5040000 0.7153333 0.7045666 1
## 5 Native American 200 78 0.3900000 0.7153333 0.5452004 1
## 6 White 6800 4200 0.6176471 0.7153333 0.8634395 0
## Tidy and column reference
di_80_index(success=Transfer, group=Ethnicity, data=student_equity) %>%
as.data.frame
## group n success pct reference di_80_index di_indicator
## 1 Asian 6000 4292 0.7153333 0.7153333 1.0000000 0
## 2 Black 2000 607 0.3035000 0.7153333 0.4242777 1
## 3 Hispanic 4000 847 0.2117500 0.7153333 0.2960158 1
## 4 Multi-Ethnicity 1000 504 0.5040000 0.7153333 0.7045666 1
## 5 Native American 200 78 0.3900000 0.7153333 0.5452004 1
## 6 White 6800 4200 0.6176471 0.7153333 0.8634395 0
# With cohort
## Vector
di_80_index(success=student_equity$Transfer, group=student_equity$Ethnicity, cohort=student_equity$Cohort) %>% as.data.frame
## cohort group n success pct reference di_80_index
## 1 2017 Asian 3000 2062 0.6873333 0.6873333 1.0000000
## 2 2017 Black 1000 310 0.3100000 0.6873333 0.4510184
## 3 2017 Hispanic 2000 410 0.2050000 0.6873333 0.2982541
## 4 2017 Multi-Ethnicity 500 262 0.5240000 0.6873333 0.7623666
## 5 2017 Native American 100 43 0.4300000 0.6873333 0.6256062
## 6 2017 White 3400 2053 0.6038235 0.6873333 0.8785017
## 7 2018 Asian 3000 2230 0.7433333 0.7433333 1.0000000
## 8 2018 Black 1000 297 0.2970000 0.7433333 0.3995516
## 9 2018 Hispanic 2000 437 0.2185000 0.7433333 0.2939462
## 10 2018 Multi-Ethnicity 500 242 0.4840000 0.7433333 0.6511211
## 11 2018 Native American 100 35 0.3500000 0.7433333 0.4708520
## 12 2018 White 3400 2147 0.6314706 0.7433333 0.8495120
## di_indicator
## 1 0
## 2 1
## 3 1
## 4 1
## 5 1
## 6 0
## 7 0
## 8 1
## 9 1
## 10 1
## 11 1
## 12 0
## Tidy and column reference
di_80_index(success=Transfer, group=Ethnicity, cohort=Cohort, data=student_equity) %>%
as.data.frame
## cohort group n success pct reference di_80_index
## 1 2017 Asian 3000 2062 0.6873333 0.6873333 1.0000000
## 2 2017 Black 1000 310 0.3100000 0.6873333 0.4510184
## 3 2017 Hispanic 2000 410 0.2050000 0.6873333 0.2982541
## 4 2017 Multi-Ethnicity 500 262 0.5240000 0.6873333 0.7623666
## 5 2017 Native American 100 43 0.4300000 0.6873333 0.6256062
## 6 2017 White 3400 2053 0.6038235 0.6873333 0.8785017
## 7 2018 Asian 3000 2230 0.7433333 0.7433333 1.0000000
## 8 2018 Black 1000 297 0.2970000 0.7433333 0.3995516
## 9 2018 Hispanic 2000 437 0.2185000 0.7433333 0.2939462
## 10 2018 Multi-Ethnicity 500 242 0.4840000 0.7433333 0.6511211
## 11 2018 Native American 100 35 0.3500000 0.7433333 0.4708520
## 12 2018 White 3400 2147 0.6314706 0.7433333 0.8495120
## di_indicator
## 1 0
## 2 1
## 3 1
## 4 1
## 5 1
## 6 0
## 7 0
## 8 1
## 9 1
## 10 1
## 11 1
## 12 0
Suppose we have a variable that indicates something negative. We could calculate DI on the converse of it:
di_ppg(success=!Transfer, group=Ethnicity, data=student_equity) %>%
as.data.frame
## group n success pct reference moe pct_lo
## 1 Asian 6000 1708 0.2846667 0.4736 0.03000000 0.2546667
## 2 Black 2000 1393 0.6965000 0.4736 0.03000000 0.6665000
## 3 Hispanic 4000 3153 0.7882500 0.4736 0.03000000 0.7582500
## 4 Multi-Ethnicity 1000 496 0.4960000 0.4736 0.03099032 0.4650097
## 5 Native American 200 122 0.6100000 0.4736 0.06929646 0.5407035
## 6 White 6800 2600 0.3823529 0.4736 0.03000000 0.3523529
## pct_hi di_indicator
## 1 0.3146667 1
## 2 0.7265000 0
## 3 0.8182500 0
## 4 0.5269903 0
## 5 0.6792965 0
## 6 0.4123529 1
We can compute the success, group, and cohort variables on the fly:
# Transform success
a <- sample(0:1, size=nrow(student_equity), replace=TRUE, prob=c(0.95, 0.05))
mean(a)
## [1] 0.04915
di_ppg(success=pmax(Transfer, a), group=Ethnicity, data=student_equity) %>%
as.data.frame
## group n success pct reference moe pct_lo
## 1 Asian 6000 4378 0.7296667 0.5496 0.03000000 0.6996667
## 2 Black 2000 677 0.3385000 0.5496 0.03000000 0.3085000
## 3 Hispanic 4000 1004 0.2510000 0.5496 0.03000000 0.2210000
## 4 Multi-Ethnicity 1000 536 0.5360000 0.5496 0.03099032 0.5050097
## 5 Native American 200 87 0.4350000 0.5496 0.06929646 0.3657035
## 6 White 6800 4310 0.6338235 0.5496 0.03000000 0.6038235
## pct_hi di_indicator
## 1 0.7596667 0
## 2 0.3685000 1
## 3 0.2810000 1
## 4 0.5669903 0
## 5 0.5042965 1
## 6 0.6638235 0
# Collapse Black and Hispanic
di_ppg(success=Transfer, group=ifelse(Ethnicity %in% c('Black', 'Hispanic'), 'Black/Hispanic', Ethnicity), data=student_equity) %>% as.data.frame
## group n success pct reference moe pct_lo
## 1 Asian 6000 4292 0.7153333 0.5264 0.03000000 0.6853333
## 2 Black/Hispanic 6000 1454 0.2423333 0.5264 0.03000000 0.2123333
## 3 Multi-Ethnicity 1000 504 0.5040000 0.5264 0.03099032 0.4730097
## 4 Native American 200 78 0.3900000 0.5264 0.06929646 0.3207035
## 5 White 6800 4200 0.6176471 0.5264 0.03000000 0.5876471
## pct_hi di_indicator
## 1 0.7453333 0
## 2 0.2723333 1
## 3 0.5349903 0
## 4 0.4592965 1
## 5 0.6476471 0
It is often the case that the user desires to calculate disproportionate impact across many outcome variables and many disaggregation/group variables. The function di_ppg_iterate
allows the user to specify a data set and the various variables to iterate across:
# Multiple group variables
di_ppg_iterate(data=student_equity, success_vars=c('Transfer'), group_vars=c('Ethnicity', 'Gender'), cohort_vars=c('Cohort'), reference_groups='overall')
## Joining, by = "group_var"
## # A tibble: 16 x 13
## success_variable disaggregation reference_group cohort group n success
## <chr> <chr> <chr> <int> <chr> <dbl> <int>
## 1 Transfer Ethnicity overall 2017 Asian 3000 2062
## 2 Transfer Ethnicity overall 2017 Black 1000 310
## 3 Transfer Ethnicity overall 2017 Hisp~ 2000 410
## 4 Transfer Ethnicity overall 2017 Mult~ 500 262
## 5 Transfer Ethnicity overall 2017 Nati~ 100 43
## 6 Transfer Ethnicity overall 2017 White 3400 2053
## 7 Transfer Ethnicity overall 2018 Asian 3000 2230
## 8 Transfer Ethnicity overall 2018 Black 1000 297
## 9 Transfer Ethnicity overall 2018 Hisp~ 2000 437
## 10 Transfer Ethnicity overall 2018 Mult~ 500 242
## 11 Transfer Ethnicity overall 2018 Nati~ 100 35
## 12 Transfer Ethnicity overall 2018 White 3400 2147
## 13 Transfer Gender overall 2017 Fema~ 5080 2605
## 14 Transfer Gender overall 2017 Male 4920 2535
## 15 Transfer Gender overall 2018 Fema~ 4965 2687
## 16 Transfer Gender overall 2018 Male 5035 2701
## # ... with 6 more variables: pct <dbl>, reference <dbl>, moe <dbl>,
## # pct_lo <dbl>, pct_hi <dbl>, di_indicator <dbl>
# Multiple group variables and different reference groups
bind_rows(
di_ppg_iterate(data=student_equity, success_vars=c('Transfer'), group_vars=c('Ethnicity', 'Gender'), cohort_vars=c('Cohort'), reference_groups=c('White', 'Male'))
, di_ppg_iterate(data=student_equity, success_vars=c('Transfer'), group_vars=c('Ethnicity', 'Gender'), cohort_vars=c('Cohort'), reference_groups='overall')
)
## Joining, by = "group_var"
## Joining, by = "cohort"
## Joining, by = "cohort"
## Joining, by = "group_var"
## # A tibble: 32 x 13
## success_variable disaggregation reference_group cohort group n success
## <chr> <chr> <chr> <int> <chr> <dbl> <int>
## 1 Transfer Ethnicity White 2017 Asian 3000 2062
## 2 Transfer Ethnicity White 2017 Black 1000 310
## 3 Transfer Ethnicity White 2017 Hisp~ 2000 410
## 4 Transfer Ethnicity White 2017 Mult~ 500 262
## 5 Transfer Ethnicity White 2017 Nati~ 100 43
## 6 Transfer Ethnicity White 2017 White 3400 2053
## 7 Transfer Ethnicity White 2018 Asian 3000 2230
## 8 Transfer Ethnicity White 2018 Black 1000 297
## 9 Transfer Ethnicity White 2018 Hisp~ 2000 437
## 10 Transfer Ethnicity White 2018 Mult~ 500 242
## # ... with 22 more rows, and 6 more variables: pct <dbl>, reference <dbl>,
## # moe <dbl>, pct_lo <dbl>, pct_hi <dbl>, di_indicator <dbl>