Introduction

This is a short introduction and test for graphical models with stability selection. We first load the relevant packages. We use package QUIC to fit graphical networks:

library(stabs)
library(huge) # Need package huge for generating test data

## Warning: package 'huge' was built under R version 3.4.1

library(QUIC)
library(igraph)

Generate some test data using huge

N <- 200
set.seed(10010)
dat.hubs <- huge.generator(n=N, d=40, graph="hub")
set.seed(10010)
dat.cluster <- huge.generator(n=N, d=40, graph="cluster")
set.seed(10010)
dat.rand <- huge.generator(n=N, d=40, graph="random")

Next, we can visualize the test data

plot(dat.hubs)

plot of chunk PlotHubs

plot(dat.cluster)

plot of chunk PlotClust

plot(dat.rand)

plot of chunk PlotRand

Stability selection

To run stability selection on a graphical model, we can simply use the dedicated fitfun called quic.graphical_model. Note that we don't supply a y argument to stabsel.

s.hubs <- stabsel(x = dat.hubs$data, fitfun = quic.graphical_model, 
                  cutoff = 0.75, PFER = 10)
s.cluster <- stabsel(x = dat.cluster$data, fitfun = quic.graphical_model, 
                     cutoff = 0.75, PFER = 10)
s.rand <- stabsel(x = dat.rand$data, fitfun = quic.graphical_model, 
                  cutoff = 0.75, PFER = 10)

Plot comparisons

In order to make comparisons possible, we use the following user defined plot function

plot_comparison <- function(stabsout, orig) {
    ## display comparison of original graph and stabs estimation
    j <- orig$omega * 0
    orig.graph <- graph.adjacency(orig$theta != 0, mode = "max", diag = FALSE)
    ut <- upper.tri(j)
    j[ut][stabsout$selected] <- 1
    stabs.graph <- graph.adjacency(j != 0, mode = "max", diag = FALSE)
    layout <- layout.fruchterman.reingold(orig.graph)

    par(mfrow = c(1,2))
    plot(orig.graph, layout = layout, edge.color = "gray50", vertex.color = 'red', 
         main = "Real graph",  vertex.size = 3, vertex.label = NA)
    plot(stabs.graph, layout = layout, edge.color = "gray50", vertex.color = 'red', 
         main = "Stabs estimated graph", vertex.size = 3, vertex.label = NA)
}

Now, we compare the three graphs:

plot_comparison(s.hubs, dat.hubs)

plot of chunk StabsPlotHubs

plot_comparison(s.cluster, dat.cluster)

plot of chunk StabsPlotCluster

plot_comparison(s.rand, dat.rand)

plot of chunk StabsPlotRand As one can see, the original graphs are resembled by the estimated graphs rather well in this example.

User specified `fitfun` for graphical models

For general instruction on how to create a fitfun please refer to

vignette("Using_stabs", package = "stabs")

For an example for grahical model see

quic.graphical_model

## function (x, y, q, ...) 
## {
##     if (!requireNamespace("QUIC")) 
##         stop("Package ", sQuote("QUIC"), " is required but not available")
##     extraargs <- list(...)
##     empirical.cov <- cov(x)
##     lams <- extraargs$lams
##     if (is.null(lams)) {
##         max.cov <- max(abs(empirical.cov[upper.tri(empirical.cov)]))
##         lams <- getLamPath(max.cov, max.cov * 0.05, len = 40)
##     }
##     est <- QUIC::QUIC(empirical.cov, rho = 1, path = lams, msg = 0)
##     ut <- upper.tri(empirical.cov)
##     qvals <- sapply(1:length(lams), function(idx) {
##         m <- est$X[, , idx]
##         sum(m[ut] != 0)
##     })
##     qq <- qvals >= q
##     if (!any(qq)) {
##         stop("Didn't reach the required number of variables. Try supplying lambda manually")
##     }
##     lamidx <- which.max(qvals >= q)
##     M <- est$X[, , lamidx][ut]
##     selected <- (M != 0)
##     s <- sapply(1:lamidx, function(idx) {
##         m <- est$X[, , idx][ut] != 0
##         return(m)
##     })
##     colnames(s) <- as.character(1:ncol(s))
##     return(list(selected = selected, path = s))
## }
## <bytecode: 0x00000000107a1970>
## <environment: namespace:stabs>
## attr(,"class")
## [1] "function"        "graphical_model"

A speciality of graphical models is that the function has an additional class "graphical_model". You can set this class immediately after having defined your fitfun, say, my.graphical_model as follows:

class(my.graphical_model) <- c(class(my.graphical_model), "graphical_model")

This will avoid that stabsel expects a y argument and will change naming conventions for the resulting selections.

Introduction

Generate some test data using huge

Stability selection

Plot comparisons

User specified fitfun for graphical models

User specified `fitfun` for graphical models