Bayesian Mediation Analysis in R

Alexander Rix alexrix@umich.edu

Perform mediation analysis in the presence of high-dimensional mediators based on the potential outcome framework. Bayesian Mediation Analysis (BAMA), developed by Song et al (2019), relies on two Bayesian sparse linear mixed models to simultaneously analyze a relatively large number of mediators for a continuous exposure and outcome assuming a small number of mediators are truly active. This sparsity assumption also allows the extension of univariate mediator analysis by casting the identification of active mediators as a variable selection problem and applying Bayesian methods with continuous shrinkage priors on the effects.

Installation

You can install bama from CRAN

install.packages("bama")

or from Github via devtools

# install.packages(devtools)
devtools::install_github("umich-cphds/bama", built_opts = c())

bama requires the R packages Rcpp and RcppArmadillo, so you may want to install / update them before downloading. If you decide to install bama from source (eg github), you will need a C++ compiler that supports C++11. On Windows this can accomplished by installing Rtools, and Xcode on MacOS.

The Github version may contain new features or bug fixes not yet present on CRAN, so if you are experiencing issues, you may want to try the Github version of the package. ## Example problem bama contains a semi-synthetic example data set, bama.data that is used in this example. bama.data contains a continuous response y and a continuous exposure a that is mediated by 100 mediators, m[1:100].

library(bama)
# print just the first 10 columns
head(bama.data[,1:10])
#>            y          a          m1         m2          m3         m4
#> 1 -0.5077701  1.3979467 -0.75395346 -0.2787043 -0.04471833  0.3422936
#> 2 -0.3239898 -0.2311032 -1.20208195  0.4210638  0.93175992 -0.3699733
#> 3 -1.8553536 -2.4647028 -0.65712133  0.3285993  0.59144748 -0.1307554
#> 4  0.1685455  0.1119932  0.04982723  0.6816996 -0.12956715 -0.8348541
#> 5  0.9070900  0.4994626 -0.99964057 -0.7660710  1.53962908 -0.9308951
#> 6 -1.0357105  0.6359685  1.06954128 -0.2441489 -1.52176072  0.4657214
#>            m5          m6          m7           m8
#> 1 -0.66227113 -0.30925865  1.58001664  0.008326522
#> 2  1.09811497 -0.09969085  1.02369272  0.045104531
#> 3 -0.30196963  0.38853526 -0.05841533 -0.436429826
#> 4  0.08936191 -0.69699157  0.41615473  0.973411472
#> 5  1.12107670  1.07603088  0.37449777 -0.289794580
#> 6 -1.55992443 -0.42705075 -0.98761802 -0.639473238

The mediators have an internal correlation structure that is based off the covariance matrix from the Multi-Ethnic Study of Atherosclerosis (MESA) data. However, bama does not model internal correlation between mediators. Instead, bama employs continuous Bayesian shrinkage priors to select mediators and assumes that all the potential mediators contribute small effects in mediating the exposure-outcome relationship, but only a small proportion of mediators exhibit large effects.

We use no adjustment covariates in this example, so we just include the intercept. Also, in a real world situation, it may be beneficial to normalize the input data.


Y <- bama.data$y
A <- bama.data$a

# grab the mediators from the example data.frame
M <- as.matrix(bama.data[, paste0("m", 1:100)])

# We just include the intercept term in this example.
C1 <- matrix(1, nrow(M), 1)
C2 <- matrix(1, nrow(M), 1)

# Initial guesses for coefficients
beta.m  <- rep(0, ncol(M))
alpha.a <- rep(0, ncol(M))

set.seed(12345)
# It is recommended to pick a larger number for burnin.
bama.out <- bama(Y, A, M, C1, C2, beta.m, alpha.a, burnin = 3000, ndraws = 100)

# rank the mediators by PIP and return the highest 10
summary(bama.out, rank = T)[1:10,]
#>         estimate     ci.lower    ci.upper  pip
#> m65 -0.264486934 -0.328705614 -0.19282056 1.00
#> m12  0.198286890  0.136895446  0.26869921 0.99
#> m89 -0.143646544 -0.224018959 -0.05699812 0.83
#> m93  0.033966711 -0.001209342  0.09014840 0.05
#> m22 -0.023336872 -0.069405421  0.02288249 0.03
#> m67  0.009776543 -0.032492573  0.04565784 0.03
#> m57  0.015029742 -0.022480618  0.05824432 0.02
#> m86  0.018896307 -0.019927875  0.06208343 0.02
#> m90 -0.020909128 -0.069062896  0.01643856 0.02
#> m97 -0.031851891 -0.074532133  0.01493506 0.02

Here, the summary function calculates the posterior inclusion probability (PIP) r1 = r3 = 1 | Data, and ranks each mediator by its PIP. Each mediator’s estimate and 95% credible interval is also calculated.

Reference

Song, Y, Zhou, X, Zhang, M, et al. Bayesian shrinkage estimation of high dimensional causal mediation effects in omics studies. Biometrics. 2019; 1-11. doi:10.1111/biom.13189

Contact

If you would like to report a bug, ask questions, or suggest something, please e-mail Alexander Rix at alexrix@umich.edu.