Introduction

This is a short introduction and test for graphical models with stability selection. We first load the relevant packages. We use package QUIC to fit graphical networks:

library(stabs)
library(huge) # Need package huge for generating test data
## Warning: package 'huge' was built under R version 3.4.1
library(QUIC)
library(igraph)

Generate some test data using huge

N <- 200
set.seed(10010)
dat.hubs <- huge.generator(n=N, d=40, graph="hub")
set.seed(10010)
dat.cluster <- huge.generator(n=N, d=40, graph="cluster")
set.seed(10010)
dat.rand <- huge.generator(n=N, d=40, graph="random")

Next, we can visualize the test data

plot(dat.hubs)

plot of chunk PlotHubs

plot(dat.cluster)

plot of chunk PlotClust

plot(dat.rand)

plot of chunk PlotRand

Stability selection

To run stability selection on a graphical model, we can simply use the dedicated fitfun called quic.graphical_model. Note that we don't supply a y argument to stabsel.

s.hubs <- stabsel(x = dat.hubs$data, fitfun = quic.graphical_model, 
                  cutoff = 0.75, PFER = 10)
s.cluster <- stabsel(x = dat.cluster$data, fitfun = quic.graphical_model, 
                     cutoff = 0.75, PFER = 10)
s.rand <- stabsel(x = dat.rand$data, fitfun = quic.graphical_model, 
                  cutoff = 0.75, PFER = 10)

Plot comparisons

In order to make comparisons possible, we use the following user defined plot function

plot_comparison <- function(stabsout, orig) {
    ## display comparison of original graph and stabs estimation
    j <- orig$omega * 0
    orig.graph <- graph.adjacency(orig$theta != 0, mode = "max", diag = FALSE)
    ut <- upper.tri(j)
    j[ut][stabsout$selected] <- 1
    stabs.graph <- graph.adjacency(j != 0, mode = "max", diag = FALSE)
    layout <- layout.fruchterman.reingold(orig.graph)

    par(mfrow = c(1,2))
    plot(orig.graph, layout = layout, edge.color = "gray50", vertex.color = 'red', 
         main = "Real graph",  vertex.size = 3, vertex.label = NA)
    plot(stabs.graph, layout = layout, edge.color = "gray50", vertex.color = 'red', 
         main = "Stabs estimated graph", vertex.size = 3, vertex.label = NA)
}

Now, we compare the three graphs:

plot_comparison(s.hubs, dat.hubs)

plot of chunk StabsPlotHubs

plot_comparison(s.cluster, dat.cluster)

plot of chunk StabsPlotCluster

plot_comparison(s.rand, dat.rand)

plot of chunk StabsPlotRand As one can see, the original graphs are resembled by the estimated graphs rather well in this example.

User specified fitfun for graphical models

For general instruction on how to create a fitfun please refer to

vignette("Using_stabs", package = "stabs")

For an example for grahical model see

quic.graphical_model
## function (x, y, q, ...) 
## {
##     if (!requireNamespace("QUIC")) 
##         stop("Package ", sQuote("QUIC"), " is required but not available")
##     extraargs <- list(...)
##     empirical.cov <- cov(x)
##     lams <- extraargs$lams
##     if (is.null(lams)) {
##         max.cov <- max(abs(empirical.cov[upper.tri(empirical.cov)]))
##         lams <- getLamPath(max.cov, max.cov * 0.05, len = 40)
##     }
##     est <- QUIC::QUIC(empirical.cov, rho = 1, path = lams, msg = 0)
##     ut <- upper.tri(empirical.cov)
##     qvals <- sapply(1:length(lams), function(idx) {
##         m <- est$X[, , idx]
##         sum(m[ut] != 0)
##     })
##     qq <- qvals >= q
##     if (!any(qq)) {
##         stop("Didn't reach the required number of variables. Try supplying lambda manually")
##     }
##     lamidx <- which.max(qvals >= q)
##     M <- est$X[, , lamidx][ut]
##     selected <- (M != 0)
##     s <- sapply(1:lamidx, function(idx) {
##         m <- est$X[, , idx][ut] != 0
##         return(m)
##     })
##     colnames(s) <- as.character(1:ncol(s))
##     return(list(selected = selected, path = s))
## }
## <bytecode: 0x00000000107a1970>
## <environment: namespace:stabs>
## attr(,"class")
## [1] "function"        "graphical_model"

A speciality of graphical models is that the function has an additional class "graphical_model". You can set this class immediately after having defined your fitfun, say, my.graphical_model as follows:

class(my.graphical_model) <- c(class(my.graphical_model), "graphical_model")

This will avoid that stabsel expects a y argument and will change naming conventions for the resulting selections.