A Bootstrap-Based Heterogeneity Test for Between-study Heterogeneity in Meta-Analysis

The R package boot.heterogeneity provides functions for testing between-study heterogeneity in meta-analyses of standardized mean differences (d), Fisher-transformed Pearson’s correlations (r), and natural-logarithm-transformed odds ratios (OR).

The following examples show how to use boot.heterogeneity to test between-study heterogeneity for each of the three effect sizes (d, r, OR). Datasets, R code, and output are provided for each example so that applied researchers can easily replicate the examples and adapt the code to their own datasets.

The “Empirical Illustration” section in the main text of the article discusses the three examples in more detail.

Including moderators is optional. It is intended for researchers who want both to measure the between-study heterogeneity itself and to explore factors that may explain systematic between-study heterogeneity.

0. Installation of the package

Install the released version of boot.heterogeneity from CRAN with:

install.packages("boot.heterogeneity")

Or install the development version from GitHub with:

#install.packages("devtools")
library(devtools)
devtools::install_github("gabriellajg/boot.heterogeneity", force = TRUE, build_vignettes = TRUE)
library(boot.heterogeneity)

1. Standardized Mean Differences (d)

boot.d() is the function to test the between-study heterogeneity in meta-analysis of standardized mean differences (d).
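To give a feel for what a bootstrap heterogeneity test does, here is a simplified base R sketch. It is not the procedure implemented in boot.d(): it bootstraps Cochran's Q under the null hypothesis of homogeneity and treats the sampling variances as fixed. The data and the helper cochran_q() are hypothetical.

```r
set.seed(1)

# Hypothetical data (not shipped with the package)
n1 <- c(20, 35, 50, 40, 25, 60)
n2 <- c(20, 30, 45, 40, 30, 55)
d  <- c(0.40, 0.10, -0.25, 0.30, 0.05, 0.15)

# Large-sample sampling variance of a standardized mean difference
v <- (n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2))
w <- 1 / v

# Hypothetical helper: Cochran's Q around the inverse-variance weighted mean
cochran_q <- function(est, w) {
  mu_hat <- sum(w * est) / sum(w)
  sum(w * (est - mu_hat)^2)
}

q_obs  <- cochran_q(d, w)
mu_hat <- sum(w * d) / sum(w)

# Simulate effect sizes under homogeneity (common true effect, sampling error only)
q_boot <- replicate(
  2000,
  cochran_q(rnorm(length(d), mean = mu_hat, sd = sqrt(v)), w)
)

# Bootstrap p-value: share of simulated Q values at least as large as the observed Q
p_boot <- mean(q_boot >= q_obs)
```

A small p_boot suggests more variability among the effect sizes than sampling error alone would produce.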

1.1 Without moderators

Load the example dataset selfconcept first:

selfconcept consists of 18 studies in which the effect of open versus traditional education on students’ self-concept was studied (Hedges et al., 1981). The columns of selfconcept are: sample sizes of the two groups (n1 and n2), Hedges’s g, Cohen’s d, and a moderator X (X not used in the current example).

Extract the required arguments from selfconcept:

If g is a vector of biased estimates of the standardized mean differences from the studies in the meta-analysis, a small-sample adjustment must be applied:
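A minimal sketch of this adjustment in base R, using the common approximate correction factor from Hedges (1981). The sample sizes and estimates below are made up for illustration and are not taken from selfconcept.

```r
# Hypothetical inputs: group sizes and unadjusted (biased) estimates
n1 <- c(20, 35, 50)
n2 <- c(20, 30, 45)
g  <- c(0.40, 0.10, -0.25)

# Approximate small-sample correction factor, with n1 + n2 - 2 degrees of freedom
cm <- 1 - 3 / (4 * (n1 + n2 - 2) - 1)

# Adjusted standardized mean differences
d <- cm * g
```

The factor is always slightly below 1, so the adjustment shrinks each estimate toward zero, and it matters most for small studies.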

Run the heterogeneity test using the function boot.d() and the adjusted effect size d:

Alternatively, the adjustment can be applied to the unadjusted effect size g inside boot.d() by specifying adjust = TRUE:

boot.run and boot.run2 will return the same results:
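For readers who want to try the two equivalent calls on their own numbers, the following is a hedged sketch with made-up data. The argument names est and nrep are assumptions based on the package documentation and should be checked with ?boot.d. The reduced nrep is only there to keep the run short, and the calls are skipped if the package is not installed.

```r
# Hypothetical inputs (not from selfconcept)
n1 <- c(20, 35, 50, 40, 25, 60)
n2 <- c(20, 30, 45, 40, 30, 55)
g  <- c(0.40, 0.10, -0.25, 0.30, 0.05, 0.15)

# Small-sample adjusted estimates, computed by hand
d <- (1 - 3 / (4 * (n1 + n2 - 2) - 1)) * g

if (requireNamespace("boot.heterogeneity", quietly = TRUE)) {
  # Assumed argument names: est (effect sizes), nrep (bootstrap replications)
  run_adjusted <- boot.heterogeneity::boot.d(n1, n2, est = d, nrep = 200)

  # Same test, letting the function apply the adjustment to g
  run_unadjusted <- boot.heterogeneity::boot.d(n1, n2, est = g, adjust = TRUE, nrep = 200)

  print(run_adjusted)
  print(run_unadjusted)
}
```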

1.2 With moderators

Load a hypothetical dataset hypo_moder first:

Three moderators (cov.z1, cov.z2, cov.z3) are included:

Again, run the heterogeneity test using boot.d() with all moderators included in a matrix mods and model type specified as model = 'mixed':

The results in boot.run3 will be in the same format as boot.run and boot.run2:

In the presence of moderators, the function above tests whether the variability in the true standardized mean differences after accounting for the moderators included in the model is larger than sampling variability alone (Viechtbauer, 2010).
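The idea of residual heterogeneity can be illustrated with the classical chi-square Q-test for a meta-regression, sketched below in base R. This is the conventional large-sample test, not the bootstrap test performed by boot.d(), and the data (including the single moderator z) are hypothetical rather than taken from hypo_moder.

```r
# Hypothetical data: 8 studies with one moderator z
n1 <- c(30, 45, 25, 60, 40, 35, 50, 55)
n2 <- c(30, 40, 25, 55, 40, 30, 50, 60)
d  <- c(0.10, 0.35, 0.05, 0.50, 0.30, 0.15, 0.45, 0.55)
z  <- c(1, 3, 1, 5, 3, 2, 4, 5)

# Large-sample sampling variance of a standardized mean difference
v <- (n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2))
w <- 1 / v

# Inverse-variance weighted regression of d on the moderator
fit <- lm(d ~ z, weights = w)

# Residual heterogeneity statistic: weighted squared residuals
QE <- sum(w * residuals(fit)^2)

# Degrees of freedom: number of studies minus number of regression coefficients
df_res <- length(d) - length(coef(fit))

# Chi-square p-value for residual heterogeneity
p_QE <- pchisq(QE, df = df_res, lower.tail = FALSE)
```

A small p_QE indicates that the effect sizes still vary more than sampling error would explain after the moderator is taken into account.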

The following two examples (Fisher-transformed Pearson’s correlations, r; natural-logarithm-transformed odds ratios, OR) include no moderators, but moderators can be added in the same way as in Section 1.2.

2. Fisher-transformed Pearson’s correlations (r)

boot.fcor() is the function to test the between-study heterogeneity in meta-analysis of Fisher-transformed Pearson’s correlations (r).

Load the example dataset sensation first:

Extract the required arguments from sensation:
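For reference, the Fisher transformation of a Pearson correlation and its approximate sampling variance can be computed in base R as follows. The sample sizes and correlations are made up and are not taken from sensation.

```r
# Hypothetical inputs: sample sizes and Pearson correlations
n <- c(50, 80, 120)
r <- c(0.50, 0.20, -0.10)

# Fisher's z-transformation: z = 0.5 * log((1 + r) / (1 - r)), which equals atanh(r)
z <- atanh(r)

# Approximate sampling variance of z, which depends only on the sample size
v <- 1 / (n - 3)
```

Because the variance of z does not depend on the correlation itself, the transformed values are more convenient to pool than raw correlations.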

Run the heterogeneity test using boot.fcor():

The test of between-study heterogeneity has the following results:

3. Natural-logarithm-transformed odds ratio (OR)

boot.lnOR() is the function to test the between-study heterogeneity in meta-analysis of natural-logarithm-transformed odds ratios (OR).

Load the example dataset smoking from R package HSAUR2:

Extract the required arguments from smoking:

The log odds ratios can be computed, but they are not needed by boot.lnOR():
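As a sketch of that computation, the log odds ratio and its large-sample variance follow directly from the four cell counts of each study's 2x2 table. The counts and the cell labels n_11, n_10, n_01, n_00 below are hypothetical and are not taken from smoking.

```r
# Hypothetical 2x2 cell counts for three studies
n_11 <- c(30, 45, 12)  # treated, event
n_10 <- c(70, 55, 38)  # treated, no event
n_01 <- c(20, 30, 10)  # control, event
n_00 <- c(80, 70, 40)  # control, no event

# Log odds ratio per study
lnOR <- log((n_11 * n_00) / (n_10 * n_01))

# Large-sample variance: sum of the reciprocal cell counts
v_lnOR <- 1 / n_11 + 1 / n_10 + 1 / n_01 + 1 / n_00
```

If any cell count is zero, the log odds ratio is undefined, and a continuity correction would be needed before applying these formulas.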

Run the heterogeneity test using boot.lnOR():

The test of between-study heterogeneity has the following results: