README

An R package for simulating correlation matrices possibly constrained by acyclic directed and undirected graphs.

Installation

install.packages("gmat")

Alternatively, using the R package devtools one may install the development version:

# install.packages("devtools")
devtools::install_github("irenecrsn/gmat")

The other R packages required for gmat are igraph and gRbase, which can also be installed from CRAN.

Overview

An example of use

First, we generate a random undirected graph with 3 nodes and density 0.5. Then we generate, using our port() function, 2 correlation matrices consistent with such random graphical structure.

ug <- gmat::rgraph(p = 3, d = 0.5)
igraph::print.igraph(ug)
#> IGRAPH c54f1ce U--- 3 2 -- Erdos renyi (gnp) graph
#> + attr: name (g/c), type (g/c), loops (g/l), p (g/n)
#> + edges from c54f1ce:
#> [1] 1--2 1--3
gmat::port(N = 2, ug = ug)
#> , , 1
#> 
#>            [,1]       [,2]       [,3]
#> [1,]  1.0000000 -0.3084469 -0.8615382
#> [2,] -0.3084469  1.0000000  0.0000000
#> [3,] -0.8615382  0.0000000  1.0000000
#> 
#> , , 2
#> 
#>            [,1]       [,2]      [,3]
#> [1,]  1.0000000 -0.7442018 0.4804328
#> [2,] -0.7442018  1.0000000 0.0000000
#> [3,]  0.4804328  0.0000000 1.0000000

We appreciate how the zero pattern is shared by all of the simulated matrices. The return value is an array, and so the individual matrices can be accessed as matrices[, , n], where n is the index of the matrix we want to retrieve from the sample, ranging from 1 to N.

We may also sample correlation matrices using i.i.d. coefficients in their upper Cholesky factor U.

gmat::chol_iid(N = 2)
#> , , 1
#> 
#>            [,1]       [,2]       [,3]
#> [1,]  1.0000000 -0.5736727 -0.3748520
#> [2,] -0.5736727  1.0000000 -0.1880879
#> [3,] -0.3748520 -0.1880879  1.0000000
#> 
#> , , 2
#> 
#>            [,1]       [,2]       [,3]
#> [1,]  1.0000000 -0.2140087 -0.1489077
#> [2,] -0.2140087  1.0000000 -0.1502781
#> [3,] -0.1489077 -0.1502781  1.0000000

dag <- gmat::rgraph(p = 3, d = 0.5, dag = TRUE)
igraph::print.igraph(dag)
#> IGRAPH 390731a D--- 3 1 -- 
#> + edge from 390731a:
#> [1] 1->2
top_sort <- igraph::topo_sort(dag)
peo_sort <- rev(top_sort)
inv_top <- order(top_sort)
inv_peo <- order(peo_sort)
m <- gmat::chol_iid(dag = dag)[, , 1]
# We sort the matrix according to both a topological and a perfect elimination
# ordering
m_top_sort <- m[top_sort, top_sort]
m_peo_sort <- m[peo_sort, peo_sort]
# Transpose of Cholesky factor (lower Cholesky decomposition)
U_chol <- chol(m_peo_sort) 
# Upper Cholesky factor (upper Cholesky decomposition)
U <- gmat::uchol(m_top_sort)
print(U_chol[inv_peo, inv_peo])
#>            [,1] [,2] [,3]
#> [1,]  0.8836329    0    0
#> [2,] -0.4681803    1    0
#> [3,]  0.0000000    0    1
print(U[inv_top, inv_top])
#>           [,1]       [,2] [,3]
#> [1,] 0.8836329 -0.4681803    0
#> [2,] 0.0000000  1.0000000    0
#> [3,] 0.0000000  0.0000000    1
print(m)
#>            [,1]       [,2] [,3]
#> [1,]  1.0000000 -0.4681803    0
#> [2,] -0.4681803  1.0000000    0
#> [3,]  0.0000000  0.0000000    1

The zeros are correctly reflected in the upper Cholesky factor when using a topological sorting, whereas they are reflected in the standard lower Cholesky decomposition factor if we sort the matrix with a perfect elimination ordering.