This is a short introduction and test for graphical models with stability selection. We first load the relevant packages. We use package QUIC to fit graphical networks:
library(stabs)
library(huge) # Need package huge for generating test data
## Warning: package 'huge' was built under R version 3.4.1
library(QUIC)
library(igraph)
N <- 200
set.seed(10010)
dat.hubs <- huge.generator(n=N, d=40, graph="hub")
set.seed(10010)
dat.cluster <- huge.generator(n=N, d=40, graph="cluster")
set.seed(10010)
dat.rand <- huge.generator(n=N, d=40, graph="random")
Next, we can visualize the test data
plot(dat.hubs)
plot(dat.cluster)
plot(dat.rand)
To run stability selection on a graphical model, we can simply use the dedicated
fitfun
called quic.graphical_model
. Note that we don't supply a y
argument
to stabsel.
s.hubs <- stabsel(x = dat.hubs$data, fitfun = quic.graphical_model,
cutoff = 0.75, PFER = 10)
s.cluster <- stabsel(x = dat.cluster$data, fitfun = quic.graphical_model,
cutoff = 0.75, PFER = 10)
s.rand <- stabsel(x = dat.rand$data, fitfun = quic.graphical_model,
cutoff = 0.75, PFER = 10)
In order to make comparisons possible, we use the following user defined plot function
plot_comparison <- function(stabsout, orig) {
## display comparison of original graph and stabs estimation
j <- orig$omega * 0
orig.graph <- graph.adjacency(orig$theta != 0, mode = "max", diag = FALSE)
ut <- upper.tri(j)
j[ut][stabsout$selected] <- 1
stabs.graph <- graph.adjacency(j != 0, mode = "max", diag = FALSE)
layout <- layout.fruchterman.reingold(orig.graph)
par(mfrow = c(1,2))
plot(orig.graph, layout = layout, edge.color = "gray50", vertex.color = 'red',
main = "Real graph", vertex.size = 3, vertex.label = NA)
plot(stabs.graph, layout = layout, edge.color = "gray50", vertex.color = 'red',
main = "Stabs estimated graph", vertex.size = 3, vertex.label = NA)
}
Now, we compare the three graphs:
plot_comparison(s.hubs, dat.hubs)
plot_comparison(s.cluster, dat.cluster)
plot_comparison(s.rand, dat.rand)
As one can see, the original graphs are resembled by the estimated
graphs rather well in this example.
fitfun
for graphical modelsFor general instruction on how to create a fitfun
please refer to
vignette("Using_stabs", package = "stabs")
For an example for grahical model see
quic.graphical_model
## function (x, y, q, ...)
## {
## if (!requireNamespace("QUIC"))
## stop("Package ", sQuote("QUIC"), " is required but not available")
## extraargs <- list(...)
## empirical.cov <- cov(x)
## lams <- extraargs$lams
## if (is.null(lams)) {
## max.cov <- max(abs(empirical.cov[upper.tri(empirical.cov)]))
## lams <- getLamPath(max.cov, max.cov * 0.05, len = 40)
## }
## est <- QUIC::QUIC(empirical.cov, rho = 1, path = lams, msg = 0)
## ut <- upper.tri(empirical.cov)
## qvals <- sapply(1:length(lams), function(idx) {
## m <- est$X[, , idx]
## sum(m[ut] != 0)
## })
## qq <- qvals >= q
## if (!any(qq)) {
## stop("Didn't reach the required number of variables. Try supplying lambda manually")
## }
## lamidx <- which.max(qvals >= q)
## M <- est$X[, , lamidx][ut]
## selected <- (M != 0)
## s <- sapply(1:lamidx, function(idx) {
## m <- est$X[, , idx][ut] != 0
## return(m)
## })
## colnames(s) <- as.character(1:ncol(s))
## return(list(selected = selected, path = s))
## }
## <bytecode: 0x00000000107a1970>
## <environment: namespace:stabs>
## attr(,"class")
## [1] "function" "graphical_model"
A speciality of graphical models is that the function has an additional class
"graphical_model"
. You can set this class immediately after having defined your
fitfun
, say, my.graphical_model
as follows:
class(my.graphical_model) <- c(class(my.graphical_model), "graphical_model")
This will avoid that stabsel expects a y
argument and will change naming
conventions for the resulting selections.