This vignette is written for baggr users who want to analyse binary data. For more general introduction to meta-analysis concepts and baggr workflow, please read vignette("baggr")
.
Here, we will show how to:
There are certain aspects of modelling binary data that are not yet covered by baggr, but may be important for your data:
Typically, a meta-analysis of binary data is done on summary statistics such as \(\log(OR)\) or \(\log(RR)\). The reason for this is two-fold: 1) they are the statistics most commonly reported by studies and 2) they are approximately normally distributed. The second assumption needs to be treated carefully, as we will show later.
For the first example we will use a simple summary data based on (part of) Table 6 in Yusuf et al. (1985), a famous study of impact of beta blockers on occurrence of strokes and mortality.
df_yusuf <- read.table(text="
trial a n1i c n2i
Balcon 14 56 15 58
Clausen 18 66 19 64
Multicentre 15 100 12 95
Barber 10 52 12 47
Norris 21 226 24 228
Kahler 3 38 6 31
Ledwich 2 20 3 20
", header=TRUE)
In a typical notation, a
(c
) are numbers of events in treatment (control) groups, while n1
(n2
) are total patients in treatment (control) group. We can also calculate \(b = n_1 - a\) and \(d = n_2 - c\), i.e. numbers of “non-events” in treatment and control groups respectively.
In our examples we will use OR metric; \(\log(OR)\) and its SE can easily be calculated from the values (if you’re not familiar with OR, see here):
df <- df_yusuf
df$b <- df$n1i-df$a
df$d <- df$n2i-df$c
df$tau <- log((df$a*df$d)/(df$b*df$c))
df$se <- sqrt(1/df$a + 1/df$b + 1/df$c + 1/df$d)
df
#> trial a n1i c n2i b d tau se
#> 1 Balcon 14 56 15 58 42 43 -0.04546237 0.4303029
#> 2 Clausen 18 66 19 64 48 45 -0.11860574 0.3888993
#> 3 Multicentre 15 100 12 95 85 83 0.19933290 0.4169087
#> 4 Barber 10 52 12 47 42 35 -0.36464311 0.4855042
#> 5 Norris 21 226 24 228 205 204 -0.13842138 0.3147471
#> 6 Kahler 3 38 6 31 35 25 -1.02961942 0.7540368
#> 7 Ledwich 2 20 3 20 18 17 -0.46262352 0.9735052
We use tau
and se
notation for our effect, same as we would for analysing continuous data with baggr()
. In fact, the model we use for logOR is the same default “Rubin” model (Rubin (1974)) with partial pooling. Once \(\log(OR)\) and is SE have been calculated, there are no differences between this process and analysis continuous quantities in baggr.
The two arguments we customised, effect
and group
do not impact results, but make outputs nicer, as we will see in the next section.
As with continuous quantities, the prior is chosen automatically. However, it is important to review the automatic prior choice when analysing logged quantities, because the priors may be needlessly diffuse (in other words, with the command above baggr does not “know” that data are logged).
We summarised our data as log(OR). Alternatively, we could work with log(RR), which is also normally distributed and can be easily calculated from contingency tables (i.e. a
, b
, c
, d
columns). All of the analysis would be analogous in that case – with exception of interpretation of treatment effect, which will then be on risk ratios.
Quick reminder on how risk and odd ratios compare. If an event is rare (rule of thumb: up to 10%), OR and RR will be similar. For really rare events there is no difference. The higher the event rate, the more discrepancy, e.g.
All of the outputs are explained in more detail in the “main” package vignette, vignette("baggr")
. Here, I simply illustrate their behaviour.
bg_model_agg
#> Model type: Rubin model with aggregate data
#> Pooling of effects: partial
#>
#> Aggregate treatment effect (on logarithm of odds ratio):
#> Hypermean (tau) = -0.16 with 95% interval -0.62 to 0.26
#> Hyper-SD (sigma_tau) = 0.248 with 95% interval 0.011 to 0.835
#> Total pooling (1 - I^2) = 0.83 with 95% interval 0.33 to 1.00
#>
#> Treatment effects on logarithm of odds ratio:
#> mean sd pooling
#> Balcon -0.131 0.26 0.76
#> Clausen -0.142 0.24 0.74
#> Multicentre -0.065 0.27 0.75
#> Barber -0.198 0.28 0.79
#> Norris -0.146 0.22 0.68
#> Kahler -0.263 0.35 0.88
#> Ledwich -0.177 0.35 0.92
Default plot (plot(bg_model_agg)
) will show pooled group results. If you prefer a typical forest plot style, you can use forest_plot
, which can also plot comparison with source data (via print
argument):
Hypothetical treatment effect in a new trial is obtained through effect_plot
.
We can transform printed and plotted outputs to show exponents (or other transforms); for example:
gridExtra::grid.arrange(
plot(bg_model_agg, transform = exp) + xlab("Effect on OR"),
effect_plot(bg_model_agg, transform = exp) + xlim(0, 3) + xlab("Effect on OR"),
ncol = 2)
We can compare with no pooling and full pooling models using baggr_compare
# Instead of writing...
# bg1 <- baggr(df, pooling = "none")
# bg2 <- baggr(df, pooling = "partial")
# bg3 <- baggr(df, pooling = "full")
# ...we use this one-liner
bg_c <- baggr_compare(df, effect = "logarithm of odds ratio")
#>
#> attr(,"class")
#> [1] "plot_list"
We can also examine the compared models directly, by accessing them via $models
. This is useful e.g. for effect_plot
:
effect_plot(
"Partial pooling, default prior" = bg_c$models$partial,
"Full pooling, default prior" = bg_c$models$full) +
theme(legend.position = "bottom")
You can use leave-one-out cross-validation to compare the models quantitatively. See documentation of ?loocv
for more details of calculation.
Let’s assume that you have access to underlying individual-level data. In our example above, we do not, but we can create a data.frame with individual level data as we know the contingency table:
df_ind <- data.frame()
for(i in 1:nrow(df_yusuf)) {
df_ind <- rbind(df_ind, data.frame(group = df_yusuf$trial[i],
treatment = c(rep(1, df_yusuf$n1i[i]), rep(0, df_yusuf$n2i[i])),
outcome = c(rep(1, df_yusuf$a[i]), rep(0, df_yusuf$n1i[i] - df_yusuf$a[i]),
rep(1, df_yusuf$c[i]), rep(0, df_yusuf$n2i[i] - df_yusuf$c[i]))))
}
head(df_ind)
#> group treatment outcome
#> 1 Balcon 1 1
#> 2 Balcon 1 1
#> 3 Balcon 1 1
#> 4 Balcon 1 1
#> 5 Balcon 1 1
#> 6 Balcon 1 1
We can now use a logistic regression model
Note: as in other cases,
baggr()
will detect model appropriately (in this case by noting thatoutcome
has only 0’s and 1’s) and notify the user, so in everyday use, it is not necessary to setmodel = "logit"
.
If we denote binary outcome as \(y\), treatment as \(z\) (both indexed over individual units by \(i\)), under partial pooling the logistic regression model assumes that
\[ y_i \sim \text{Bernoulli}(\text{logit}^{-1}[\mu_{\text{group}(i)} + \tau_{\text{group}(i)}z_i]) \]
where \(\mu_k\)’s and \(\tau_k\)’s are parameters to estimate. The former is a nuisance parameter, defining a group-specific mean probability of event, without treatment. The latter are group-specific effects.
Under this formulation, odds ratio of event between treatment and non-treatment are given by \(\tau_k\). This means, the modelling assumption is the same as when working with summarised data. Moreover, baggr’s default prior for \(\tau_k\) is set in the same way for summary and individual-level data (you can see it as bg_model_ind$formatted_prior
). One minor difference is that parameter \(\mu\) is estimated, but unless dealing with more extreme values of \(\mu\), i.e. with rare or common events, this should not impact the modelled result.
Therefore, the result we get from the two models will be very close (albeit with some numerical variation due to short run time):
baggr_compare(bg_model_agg, bg_model_ind)
#> Mean treatment effects:
#> 2.5% mean 97.5% median sd
#> Model 1 -0.620931 -0.163213 0.263314 -0.156234 0.217500
#> Model 2 -0.596394 -0.177299 0.221972 -0.169348 0.211027
#>
#> SD for treatment effects:
#> 2.5% mean 97.5% median sd
#> Model 1 0.01122390 0.247723 0.834529 0.186882 0.225403
#> Model 2 0.00734586 0.246008 0.822508 0.186327 0.218467
#>
#> Transform: NULL
There will be difference in speed – in our example logistic model has to work with 1101 rows of data, while the summary level model only has 7 datapoints. Therefore you are better off summarising your data and using the default model, which I will show next.
To sum up:
Note: as shown above, you can create data for logistic model from
a
,b
/n1
,c
,d
/n2
columns. You cannot do it from OR only, as the denominator is missing.
In the previous example we analysed individual-level data. It is useful to summarise it, either to speed up modelling (see above) or to better understand the inputs. The generic function for doing this in baggr is prepare_ma
. It can be used with both continuous and binary data. In our case we can summarise df_ind
to obtain either odds ratios or risk ratios:
prepare_ma(df_ind, effect = "logOR")
#> group a n1 c n2 b d tau se
#> 1 Balcon 14 56 15 58 42 43 -0.04546237 0.4303029
#> 2 Barber 10 52 12 47 42 35 -0.36464311 0.4855042
#> 3 Clausen 18 66 19 64 48 45 -0.11860574 0.3888993
#> 4 Kahler 3 38 6 31 35 25 -1.02961942 0.7540368
#> 5 Ledwich 2 20 3 20 18 17 -0.46262352 0.9735052
#> 6 Multicentre 15 100 12 95 85 83 0.19933290 0.4169087
#> 7 Norris 21 226 24 228 205 204 -0.13842138 0.3147471
prepare_ma(df_ind, effect = "logRR")
#> group a n1 c n2 b d tau se
#> 1 Balcon 14 56 15 58 42 43 -0.03390155 0.3209310
#> 2 Barber 10 52 12 47 42 35 -0.28341767 0.3779232
#> 3 Clausen 18 66 19 64 48 45 -0.08483888 0.2782276
#> 4 Kahler 3 38 6 31 35 25 -0.89674614 0.6643991
#> 5 Ledwich 2 20 3 20 18 17 -0.40546511 0.8563488
#> 6 Multicentre 15 100 12 95 85 83 0.17185026 0.3598245
#> 7 Norris 21 226 24 228 205 204 -0.12472076 0.2836811
In both cases the effect (OR or RR) and its SE were renamed to tau
and se
, so that the resulting data frame can be used directly as input to baggr()
.
Consider data on four fictional studies:
df_rare <- data.frame(group = paste("Study", LETTERS[1:4]),
a = c(0, 2, 1, 3), c = c(2, 2, 3, 3),
n1i = c(120, 300, 110, 250),
n2i = c(120, 300, 110, 250))
In Study 1 you can see that no events occurred in a
column. Assume, that we have individual-level data for these studies (if not, we can generate it as we did in the example above):
df_rare_ind <- data.frame()
for(i in 1:nrow(df_rare)) {
df_rare_ind <- rbind(df_rare_ind, data.frame(group = df_rare$group[i],
treatment = c(rep(1, df_rare$n1i[i]), rep(0, df_rare$n2i[i])),
outcome = c(rep(1, df_rare$a[i]), rep(0, df_rare$n1i[i] - df_rare$a[i]),
rep(1, df_rare$c[i]), rep(0, df_rare$n2i[i] - df_rare$c[i]))))
}
Function prepare_ma()
can be used to summarise the rates and calculate log OR/RR. It can also be used to apply corrections to event rates:
df_rare_logor <- prepare_ma(df_rare_ind, effect = "logOR")
df_rare_logor
#> group a n1 c n2 b d tau se
#> 1 Study A 0.25 120.25 2.25 120.25 120.25 118.25 -2.213996 2.1121593
#> 2 Study B 2.00 300.00 2.00 300.00 298.00 298.00 0.000000 1.0033501
#> 3 Study C 1.00 110.00 3.00 110.00 109.00 107.00 -1.117131 1.1626923
#> 4 Study D 3.00 250.00 3.00 250.00 247.00 247.00 0.000000 0.8214401
Note that the output of prepare_ma
now differs from the original df_rare
for “Study A”: a (default) value of 0.25 was added, because there were no events in treatment arm. That means \(\log(OR)=\log(0)=-\infty\). It is typical to correct for rare events when analysing summary level data. A great overview of the subject and how different meta-analysis methods perform is provided by Bradburn et al. (2007). You can modify the amount of correction by setting rare_event_correction
argument.
pma01 <- prepare_ma(df_rare_ind, effect = "logOR",
rare_event_correction = 0.1)
pma1 <- prepare_ma(df_rare_ind, effect = "logOR",
rare_event_correction = 1)
When rare events occur, you can try to understand the impact of correction by comparing models with different rare_event_correction
values:
bg_correction01 <- baggr(pma01, effect = "logOR")
bg_correction025 <- baggr(df_rare_logor, effect = "logOR")
bg_correction1 <- baggr(pma1, effect = "logOR")
bg_rare_ind <- baggr(df_rare_ind, effect = "logOR")
bgc <- baggr_compare(
"Correct by .10" = bg_correction01,
"Correct by .25" = bg_correction025,
"Correct by 1.0" = bg_correction1,
"Individual data" = bg_rare_ind
)
bgc
#> Mean treatment effects:
#> 2.5% mean 97.5% median sd
#> Correct by .10 -2.80528 -0.460443 1.78505 -0.446145 1.092570
#> Correct by .25 -2.30077 -0.493740 1.29194 -0.457582 0.866460
#> Correct by 1.0 -2.29699 -0.534690 0.97479 -0.491637 0.818131
#> Individual data -4.37018 -0.984133 1.39341 -0.803722 1.346270
#>
#> SD for treatment effects:
#> 2.5% mean 97.5% median sd
#> Correct by .10 0.0358399 1.448000 5.76854 0.930287 1.524060
#> Correct by .25 0.0428817 1.181230 4.49493 0.803533 1.270460
#> Correct by 1.0 0.0202896 0.971861 3.66782 0.693716 0.955043
#> Individual data 0.0650471 1.927440 7.44016 1.286640 1.960770
#>
#> Transform: NULL
plot(bgc)
#> [[1]]
#>
#> attr(,"class")
#> [1] "plot_list"
However, the issue with modelling rare events is not limited to zero-events. As mentioned, \(\log(OR)\) is approximately Gaussian. The quality of the approximation will depend on probabilities in all cells of the contingency table (which we estimate through a
, b
, c
, d
). Therefore, treating \(\log(OR)\) as Gaussian might lead to different results in the individual-level vs summary-level models if events are rare. With the low counts in our example it will definitely be the case.
Let us generate new data without 0’s, so that you can see how 0’s are not the issue:
df_rare <- data.frame(group = paste("Study", LETTERS[1:4]),
a = c(1, 2, 1, 3), c = c(2, 2, 3, 3),
n1i = c(120, 300, 110, 250),
n2i = c(120, 300, 110, 250))
df_rare_ind <- data.frame()
for(i in 1:nrow(df_rare)) {
df_rare_ind <- rbind(df_rare_ind, data.frame(group = df_rare$group[i],
treatment = c(rep(1, df_rare$n1i[i]), rep(0, df_rare$n2i[i])),
outcome = c(rep(1, df_rare$a[i]), rep(0, df_rare$n1i[i] - df_rare$a[i]),
rep(1, df_rare$c[i]), rep(0, df_rare$n2i[i] - df_rare$c[i]))))
}
df_rare_logor <- prepare_ma(df_rare_ind, effect = "logOR")
bg_rare_agg <- baggr(df_rare_logor, effect = "logOR")
bg_rare_ind <- baggr(df_rare_ind, effect = "logOR")
Let’s compare again, both on group level but also in terms of hypothetical treatment effect:
bgc <- baggr_compare(
"Summary-level (Rubin model on logOR)" = bg_rare_agg,
"Individual-level (logistic model)" = bg_rare_ind
)
bgc
#> Mean treatment effects:
#> 2.5% mean 97.5% median
#> Summary-level (Rubin model on logOR) -1.80662 -0.325010 1.38703 -0.339441
#> Individual-level (logistic model) -2.49763 -0.477669 1.39086 -0.444139
#> sd
#> Summary-level (Rubin model on logOR) 0.764156
#> Individual-level (logistic model) 0.907515
#>
#> SD for treatment effects:
#> 2.5% mean 97.5% median
#> Summary-level (Rubin model on logOR) 0.0272445 0.934829 3.57101 0.653308
#> Individual-level (logistic model) 0.0305191 1.099900 3.95634 0.777828
#> sd
#> Summary-level (Rubin model on logOR) 0.90796
#> Individual-level (logistic model) 1.02539
#>
#> Transform: NULL
plot(bgc)
#> [[1]]
#>
#> attr(,"class")
#> [1] "plot_list"
The results are still a bit different.
This section provides only a short example. To learn more about meta-regression see e.g. Baker et al. (2009)
Two types of covariates may be present in your data:
In both cases we only need to name one extra argument to baggr
: covariates=
, followed by a vector of column names in input data. You should remember that your treatment effect estimate will vary a lot not only with choice of covariates, but also the contrasts that you use. For example:
#let's use the data.frame we created from Yusuf et al earlier
df$study_grouping <- c(1,1,1,0,0,0,0)
df$different_contrasts <- c(1,1,1,0,0,0,0) - .5
bg_cov1 <- baggr(df, covariates = c("study_grouping"), effect = "logarithm of odds ratio")
bg_cov2 <- baggr(df, covariates = c("different_contrasts"), effect = "logarithm of odds ratio")
baggr_compare("No covariate" = bg_model_agg,
"With covariates, 0-1 coding" = bg_cov1,
"With covariates, +-1/2 coding" = bg_cov2)
#> Mean treatment effects:
#> 2.5% mean 97.5% median sd
#> No covariate -0.620931 -0.163213 0.263314 -0.156234 0.217500
#> With covariates, 0-1 coding -1.052960 -0.351897 0.310404 -0.346460 0.335913
#> With covariates, +-1/2 coding -0.653897 -0.176452 0.260016 -0.172070 0.227764
#>
#> SD for treatment effects:
#> 2.5% mean 97.5% median sd
#> No covariate 0.0112239 0.247723 0.834529 0.186882 0.225403
#> With covariates, 0-1 coding 0.0125879 0.294858 1.028340 0.220150 0.273762
#> With covariates, +-1/2 coding 0.0115398 0.284586 0.965491 0.215459 0.257894
#>
#> Transform: NULL
To access the covariate estimates, see fixed_effects()
function. They are also printed out by default:
bg_cov1
#> Model type: Rubin model with aggregate data
#> Pooling of effects: partial
#>
#> Aggregate treatment effect (on logarithm of odds ratio):
#> Hypermean (tau) = -0.35 with 95% interval -1.05 to 0.31
#> Hyper-SD (sigma_tau) = 0.295 with 95% interval 0.013 to 1.028
#> Total pooling (1 - I^2) = 0.79 with 95% interval 0.24 to 1.00
#>
#> Treatment effects on logarithm of odds ratio:
#> mean sd pooling
#> Group 1 -0.36 0.43 0.72
#> Group 2 -0.39 0.42 0.69
#> Group 3 -0.29 0.42 0.71
#> Group 4 -0.34 0.32 0.75
#> Group 5 -0.26 0.27 0.63
#> Group 6 -0.44 0.41 0.85
#> Group 7 -0.35 0.43 0.89
#>
#> Covariate (fixed) effects on logarithm of odds ratio:
#> [,1]
#> mean 0.36
#> sd 0.45
Baker, W. L., C. Michael White, J. C. Cappelleri, J. Kluger, C. I. Coleman, and Health Outcomes, Policy, and Economics (HOPE) Collaborative Group. 2009. “Understanding Heterogeneity in Meta-Analysis: The Role of Meta-Regression.” International Journal of Clinical Practice 63 (10): 1426–34. https://doi.org/10.1111/j.1742-1241.2009.02168.x.
Bradburn, Michael J., Jonathan J. Deeks, Jesse A. Berlin, and A. Russell Localio. 2007. “Much Ado About Nothing: A Comparison of the Performance of Meta-Analytical Methods with Rare Events.” Statistics in Medicine 26 (1): 53–77. https://doi.org/10.1002/sim.2528.
Rubin, Donald B. 1974. “Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies.” Journal of Educational Psychology. https://doi.org/10.1037/h0037350.
Yusuf, S., R. Peto, J. Lewis, R. Collins, and P. Sleight. 1985. “Beta Blockade During and After Myocardial Infarction: An Overview of the Randomized Trials.” Progress in Cardiovascular Diseases 27 (5): 335–71. https://doi.org/10.1016/s0033-0620(85)80003-7.