The MXM R package, short for the Latin 'Mens ex Machina' ('Mind from the Machine'), is a collection of utility functions for feature selection, cross-validation and Bayesian networks. MXM offers many feature selection algorithms focused on providing one or more minimal feature subsets, also referred to as variable signatures, that can be used to improve the performance of downstream analysis tasks such as regression and classification by excluding irrelevant and redundant variables.
In this tutorial we will learn how to use the Forward Backward Early Dropping (FBED) algorithm. The algorithm is a variation of the usual forward selection. At every step, the most significant variable enters the set of selected variables. In addition, only the significant variables stay and are further examined; the non-significant ones are dropped. This continues until no variable can enter the set. The user has the option to repeat this phase one or more times (the argument K). In the end, a backward selection is performed to remove falsely selected variables.
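To make the forward phase concrete, here is a minimal, hypothetical sketch of a single Early Dropping round. This is not MXM's implementation: the function name fbed_forward_round and the simple lm()-based p-values are illustrative assumptions only.
### ~ ~ ~ Toy Sketch Of The Forward Phase With Early Dropping (illustration only) ~ ~ ~ ###
fbed_forward_round <- function(y, x, alpha = 0.05) {
  selected  <- integer(0)          # indices of the selected columns of x
  remaining <- seq_len(ncol(x))    # candidate columns still in the game
  repeat {
    # p-value of each candidate, conditional on the already selected set
    pvals <- vapply(remaining, function(j) {
      d  <- data.frame(y = y, x[, c(selected, j), drop = FALSE])
      cf <- summary(lm(y ~ ., data = d))$coefficients
      cf[nrow(cf), 4]              # p-value of the newest candidate
    }, numeric(1))
    keep      <- pvals < alpha
    remaining <- remaining[keep]   # Early Dropping: the rest are discarded for good
    if (length(remaining) == 0) break
    best      <- remaining[which.min(pvals[keep])]
    selected  <- c(selected, best) # the most significant variable enters
    remaining <- setdiff(remaining, best)
  }
  selected
}
This toy version mimics the spirit of one FBED run (K = 0); MXM's fbed.reg(), used below, is of course far more general and efficient.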
For simplicity, in this tutorial, we will use a dataset referred to as “The Wine Dataset”.
The Wine Dataset contains the results of a chemical analysis of wines grown in a specific area of Italy. Three types of wine are represented in the 178 samples, with the results of 13 chemical analyses recorded for each sample. Note that the “Type” variable was transformed into a categorical variable.
So, first of all, for this tutorial analysis, we load the 'MXM' library, together with the 'dplyr' library for easier handling of the dataset; note that the latter is not necessary for the analysis.
### ~ ~ ~ Load Packages ~ ~ ~ ###
library(MXM)
library(dplyr)
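Both packages are available on CRAN; if they are missing from your machine, a one-off installation step (commented out below, since it only needs to run once) takes care of it:
### ~ ~ ~ One-Off Installation, If Needed ~ ~ ~ ###
# install.packages(c("MXM", "dplyr"))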
As a next step, we download and read the dataset, and also define the column names.
### ~ ~ ~ Load The Dataset ~ ~ ~ ###
wine.url <- "ftp://ftp.ics.uci.edu/pub/machine-learning-databases/wine/wine.data"
wine <- read.csv(wine.url,
check.names = FALSE,
header = FALSE)
head(wine)
## V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14
## 1 1 14.23 1.71 2.43 15.6 127 2.80 3.06 0.28 2.29 5.64 1.04 3.92 1065
## 2 1 13.20 1.78 2.14 11.2 100 2.65 2.76 0.26 1.28 4.38 1.05 3.40 1050
## 3 1 13.16 2.36 2.67 18.6 101 2.80 3.24 0.30 2.81 5.68 1.03 3.17 1185
## 4 1 14.37 1.95 2.50 16.8 113 3.85 3.49 0.24 2.18 7.80 0.86 3.45 1480
## 5 1 13.24 2.59 2.87 21.0 118 2.80 2.69 0.39 1.82 4.32 1.04 2.93 735
## 6 1 14.20 1.76 2.45 15.2 112 3.27 3.39 0.34 1.97 6.75 1.05 2.85 1450
str(wine)
## 'data.frame': 178 obs. of 14 variables:
## $ V1 : int 1 1 1 1 1 1 1 1 1 1 ...
## $ V2 : num 14.2 13.2 13.2 14.4 13.2 ...
## $ V3 : num 1.71 1.78 2.36 1.95 2.59 1.76 1.87 2.15 1.64 1.35 ...
## $ V4 : num 2.43 2.14 2.67 2.5 2.87 2.45 2.45 2.61 2.17 2.27 ...
## $ V5 : num 15.6 11.2 18.6 16.8 21 15.2 14.6 17.6 14 16 ...
## $ V6 : int 127 100 101 113 118 112 96 121 97 98 ...
## $ V7 : num 2.8 2.65 2.8 3.85 2.8 3.27 2.5 2.6 2.8 2.98 ...
## $ V8 : num 3.06 2.76 3.24 3.49 2.69 3.39 2.52 2.51 2.98 3.15 ...
## $ V9 : num 0.28 0.26 0.3 0.24 0.39 0.34 0.3 0.31 0.29 0.22 ...
## $ V10: num 2.29 1.28 2.81 2.18 1.82 1.97 1.98 1.25 1.98 1.85 ...
## $ V11: num 5.64 4.38 5.68 7.8 4.32 6.75 5.25 5.05 5.2 7.22 ...
## $ V12: num 1.04 1.05 1.03 0.86 1.04 1.05 1.02 1.06 1.08 1.01 ...
## $ V13: num 3.92 3.4 3.17 3.45 2.93 2.85 3.58 3.58 2.85 3.55 ...
## $ V14: int 1065 1050 1185 1480 735 1450 1290 1295 1045 1045 ...
colnames(wine) <- c('Type', 'Alcohol', 'Malic', 'Ash',
'Alcalinity', 'Magnesium', 'Phenols',
'Flavanoids', 'Nonflavanoids',
'Proanthocyanins', 'Color', 'Hue',
'Dilution', 'Proline')
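Before moving on, a quick (optional) sanity check confirms that the expected 178 samples and 14 columns were read in, and shows how the three wine types are distributed:
### ~ ~ ~ Optional Sanity Check ~ ~ ~ ###
dim(wine)           # expect 178 rows and 14 columns
table(wine$Type)    # number of samples per wine type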
For this tutorial example, we are going to apply the FBED algorithm on the above dataset, using only continuous variables both as predictors and as the target.
The selection of the appropriate conditional independence test is a crucial decision for the validity and success of downstream statistical analysis and machine learning tasks. Currently the MXM R package supports numerous tests for different combinations of target (dependent) and predictor (independent) variables. A detailed summary table to guide you through the selection of the most suitable test can be found in MXM's reference manual (p. 21, “CondInditional independence tests”).
In our example we will use MXM::fbed.reg(), which is the implementation of the FBED algorithm. Since we are going to examine only continuous variables, we will use Fisher's independence test.
dataset
- A numeric matrix (or a data.frame in case of categorical predictors) containing the variables on which to perform the test. The rows should refer to the different samples and the columns to the features. For the purposes of this example analysis, we are going to use only the continuous variables; therefore we remove the “Type” variable from the dataset. Furthermore, we remove the “Nonflavanoids” variable, because we will use it as the target.
### ~ ~ ~ Removing The Categorical ('Type') and The Target ('Nonflavanoids') Variables ~ ~ ~ ###
wine_dataset <- dplyr::select(wine,
-contains("Type"),
-contains("Nonflavanoids"))
head(wine_dataset)
## Alcohol Malic Ash Alcalinity Magnesium Phenols Flavanoids Proanthocyanins
## 1 14.23 1.71 2.43 15.6 127 2.80 3.06 2.29
## 2 13.20 1.78 2.14 11.2 100 2.65 2.76 1.28
## 3 13.16 2.36 2.67 18.6 101 2.80 3.24 2.81
## 4 14.37 1.95 2.50 16.8 113 3.85 3.49 2.18
## 5 13.24 2.59 2.87 21.0 118 2.80 2.69 1.82
## 6 14.20 1.76 2.45 15.2 112 3.27 3.39 1.97
## Color Hue Dilution Proline
## 1 5.64 1.04 3.92 1065
## 2 4.38 1.05 3.40 1050
## 3 5.68 1.03 3.17 1185
## 4 7.80 0.86 3.45 1480
## 5 4.32 1.04 2.93 735
## 6 6.75 1.05 2.85 1450
target
- The class variable containing the values of the target. We should provide either a string, an integer, a numeric value, a vector, a factor, an ordered factor or a Surv object. For the purposes of this example analysis, we are going to use “Nonflavanoids” as the dependent variable.
wine_target <- wine$Nonflavanoids
head(wine_target)
## [1] 0.28 0.26 0.30 0.24 0.39 0.34
This is the first time that we are running the algorithm, so we will explain what each argument refers to:
target
: The class variable. Provide either a string, an integer, a numeric value, a vector, a factor, an ordered factor or a Surv object. As explained above, this will be the dependent variable. If the target is a single integer value or a string, it has to correspond to the column number or to the name of the target feature in the dataset. Here we choose “Nonflavanoids”.
dataset
: The dataset. Provide either a data frame or a matrix. If the dataset (the predictor variables) contains missing (NA) values, they will automatically be replaced by the current variable (column) mean value, with an appropriate warning to the user after the execution. Here we choose the whole wine dataset, except for the “Type” (categorical) and “Nonflavanoids” (target) variables.
test
: The conditional independence test to use. Default value is NULL. Here since our dataset includes only continuous features (remember: Categorical variable “Type” was removed) and our dependent variable is also continuous, we choose 'testIndFisher'.
For more information about which test to use, please visit: https://www.rdocumentation.org/packages/MXM/versions/0.9.7/topics/CondInditional%20independence%20tests.
threshold
: Threshold (suitable values in [0,1]) for the significance of the p-values. The default value is 0.05. Here we choose the default value 0.05.
wei
: A vector of weights to be used for weighted regression. The default value is NULL. It is not suggested when “robust” is set to TRUE. If you want to use the “testIndBinom” test, then supply the successes in the y and the trials here. Here we choose the default value, NULL.
K
: How many times should the process be repeated? The default value is 0. Here we choose 10.
method
: Do you want the likelihood ratio test to be performed (“LR” is the default value), or should the selection be done using the “eBIC” criterion (BIC is a special case)? Here we choose eBIC in the first example and LR for the second, in order to see the differences in the output.
gam
: In case the chosen method is “eBIC”, one can also specify the gamma parameter. The default value is “NULL”, so that the value is automatically calculated. Here, although we choose eBIC as the selection criterion, we do not specify any gamma parameter.
backward
: After the forward Early Dropping phase, the algorithm proceeds with the usual backward selection phase. The default value is TRUE. It is advised to perform this step, since some variables may be false positives that were wrongly selected. The backward phases using the likelihood ratio test and eBIC are implemented as two different functions that can also be called directly by the user; so, if you want, for example, to perform a backward regression with a different threshold value, just use these two functions separately. Here we set the backward argument to TRUE.
### ~ ~ ~ Running FBED with eBIC ~ ~ ~ ###
fbed_cont_eBIC <- MXM::fbed.reg(target = wine_target,
dataset = wine_dataset,
test = "testIndFisher",
threshold = 0.05,
wei = NULL,
K = 10,
method = "eBIC",
gam = NULL,
backward = TRUE)
### ~ ~ ~ Running FBED with LR ~ ~ ~ ###
fbed_cont_LR <- MXM::fbed.reg(target = wine_target,
dataset = wine_dataset,
test = "testIndFisher",
threshold = 0.05,
wei = NULL,
K = 10,
method = "LR",
gam = NULL,
backward = TRUE)
So, the algorithm ran twice… Let's see what information we can get out of it.
The main purpose of running the FBED algorithm is to see which variables should be selected as important. The indices of those variables are stored in the first column of res. Depending on the method, the matrix also holds the eBIC differences (for “eBIC”), or the test statistics together with the associated p-values (for “LR”).
### ~ ~ ~ eBIC results ~ ~ ~ ###
fbed_cont_eBIC$res
## Vars eBIC difference
## [1,] 7 -240.5425
## [2,] 3 -280.7062
## [3,] 5 -291.0249
SelectedVars_names<-colnames(wine_dataset[fbed_cont_eBIC$res[,1]])
SelectedVars_names
## [1] "Flavanoids" "Ash" "Magnesium"
From eBIC, we get as significant the variables “Flavanoids”, “Ash” and “Magnesium”, while from LR …
### ~ ~ ~ LR results ~ ~ ~ ###
fbed_cont_LR$res
## sel stat pval
## 1 7 3.681241 -8.077634
## 2 3 4.847999 -12.795188
## 3 5 4.107912 -9.694011
## 4 11 2.087806 -3.262599
SelectedVars_names<-colnames(wine_dataset[fbed_cont_LR$res[,1]])
SelectedVars_names
## [1] "Flavanoids" "Ash" "Magnesium" "Dilution"
… we get the three previous variables, plus the variable “Dilution”. So, the two testing approaches do not differ much. In this case, the eBIC criterion applied a stricter feature selection, selecting only 3 variables, while LR returned one variable more.
Since the function returns the variables sorted by their significance, we can easily see that the three variables chosen by both approaches are the most important. So, whether the fourth variable should be used in the downstream analysis depends on the initial question and the dataset used.
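For a quick programmatic comparison of the two signatures, one can intersect the selected names (a small convenience sketch; the helper names eBIC_vars and LR_vars are ours):
### ~ ~ ~ Comparing The Two Signatures ~ ~ ~ ###
eBIC_vars <- colnames(wine_dataset)[fbed_cont_eBIC$res[, 1]]
LR_vars   <- colnames(wine_dataset)[fbed_cont_LR$res[, 1]]
intersect(LR_vars, eBIC_vars)   # variables chosen by both approaches
setdiff(LR_vars, eBIC_vars)     # extra variable(s) chosen by LR only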
And, as you may imagine, you can also retrieve the scores themselves. They are all stored (sorted) in the second column.
fbed_cont_eBIC$res[,2]
## [1] -240.5425 -280.7062 -291.0249
fbed_cont_LR$res[,2]
## 1 2 3 4
## 3.681241 4.847999 4.107912 2.087806
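Note that the pval column of the LR output is reported on the natural-log scale (MXM's per-K output later in this tutorial labels this column “log p-value”); if raw p-values are preferred, exponentiating recovers them:
### ~ ~ ~ Back-Transforming The Log p-values ~ ~ ~ ###
exp(fbed_cont_LR$res[, 3])   # raw p-values from the log-scale column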
Perfect! But we see that the function also returned an object called info. What is this?
fbed_cont_eBIC$info
## Number of vars Number of tests
## K=0 2 20
## K=1 3 10
## K=2 4 9
## K=3 4 8
fbed_cont_LR$info
## Number of vars Number of tests
## K=0 3 25
## K=1 4 9
## K=2 4 8
The info matrix describes the number of variables and the number of tests performed (or models fitted) at each round. Remember the value of K, which in this example we set equal to 10? Here the algorithm reached K=10 neither with eBIC nor with LR: there was no change after the 4th (eBIC) or 3rd (LR) run, so the algorithm stopped earlier. We also see that LR needed one iteration less. For each K, the number of selected variables is returned together with the number of tests performed; thus, we see that in the first round, 3 variables were already chosen by LR.
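As a small convenience, summing the second column of info gives the total number of tests performed over the whole forward phase:
### ~ ~ ~ Total Number Of Forward-Phase Tests ~ ~ ~ ###
sum(fbed_cont_eBIC$info[, 2])   # 20 + 10 + 9 + 8 = 47 tests
sum(fbed_cont_LR$info[, 2])     # 25 + 9 + 8 = 42 tests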
Well, all this refers to the forward phase only. So, if the information about the forward phase is stored in the info matrix, where can we find information about the backward phase?
fbed_cont_eBIC$back.rem
## Vars
## 4
fbed_cont_LR$back.rem
## numeric(0)
By calling back.rem, we obtain the variables that were removed during the backward phase. We see that LR did not remove any “false positive” variable, while eBIC removed the variable with index 4, which had been selected during its forward phase (hence its res matrix lists only 3 variables).
In case we are interested in the number of models that were fitted in the backward phase, all we have to do is look at the back.n.tests variable.
fbed_cont_eBIC$back.n.tests
## [1] 7
fbed_cont_LR$back.n.tests
## [1] 4
We see that eBIC performed more tests during the backward phase (7 versus 4). This is consistent with the outputs above: eBIC entered the backward phase with 4 selected variables and removed one, so the remaining 3 were checked again (4 + 3 = 7 tests), whereas LR checked its 4 variables once and removed none.
But which of the two approaches ran faster?
fbed_cont_eBIC$runtime
## user system elapsed
## 0.11 0.02 0.13
fbed_cont_LR$runtime
## user system elapsed
## 0 0 0
Since the dataset is small, we do not see any notable runtime difference.
In this step, we will apply FBED with a categorical target variable, again using both eBIC and LR.
Since the target variable is categorical - more specifically, it is a factor with more than two (unordered) levels - and the features are continuous, according to MXM's reference manual (p. 21, “CondInditional independence tests”) we should use multinomial logistic regression ('testIndMultinom').
In this step, we keep all the features (only the target, “Type”, is removed), in order to also show how to use the algorithm on the full feature matrix.
### ~ ~ ~ Taking The Whole Dataset ~ ~ ~ ###
wine_dataset <- dplyr::select(wine,
-contains("Type"))
head(wine_dataset)
## Alcohol Malic Ash Alcalinity Magnesium Phenols Flavanoids Nonflavanoids
## 1 14.23 1.71 2.43 15.6 127 2.80 3.06 0.28
## 2 13.20 1.78 2.14 11.2 100 2.65 2.76 0.26
## 3 13.16 2.36 2.67 18.6 101 2.80 3.24 0.30
## 4 14.37 1.95 2.50 16.8 113 3.85 3.49 0.24
## 5 13.24 2.59 2.87 21.0 118 2.80 2.69 0.39
## 6 14.20 1.76 2.45 15.2 112 3.27 3.39 0.34
## Proanthocyanins Color Hue Dilution Proline
## 1 2.29 5.64 1.04 3.92 1065
## 2 1.28 4.38 1.05 3.40 1050
## 3 2.81 5.68 1.03 3.17 1185
## 4 2.18 7.80 0.86 3.45 1480
## 5 1.82 4.32 1.04 2.93 735
## 6 1.97 6.75 1.05 2.85 1450
wine_target <- as.factor(wine$Type)
head(wine_target)
## [1] 1 1 1 1 1 1
## Levels: 1 2 3
### ~ ~ ~ Running FBED For Categorical Variable with eBIC~ ~ ~ ###
fbed_categorical_eBIC <- MXM::fbed.reg(target = wine_target,
dataset = wine_dataset,
test = "testIndMultinom",
threshold = 0.05,
wei = NULL,
K = 10,
method = "eBIC",
gam = NULL,
backward = TRUE)
### ~ ~ ~ Running FBED For Categorical Variable with LR~ ~ ~ ###
fbed_categorical_LR <- MXM::fbed.reg(target = wine_target,
dataset = wine_dataset,
test = "testIndMultinom",
threshold = 0.05,
wei = NULL,
K = 10,
method = "LR",
gam = NULL,
backward = TRUE)
## # weights: 18 (10 variable)
## initial value 195.552987
## iter 10 value 22.769414
## iter 20 value 6.859582
## iter 30 value 1.971162
## iter 40 value 1.829873
## iter 50 value 1.163934
## iter 60 value 1.129584
## iter 70 value 1.071912
## iter 80 value 1.064826
## iter 90 value 0.487277
## iter 100 value 0.477455
## final value 0.477455
## stopped after 100 iterations
## # weights: 15 (8 variable)
## initial value 195.552987
## iter 10 value 37.451958
## iter 20 value 27.927832
## iter 30 value 24.837868
## iter 40 value 23.766671
## iter 50 value 23.149697
## iter 60 value 22.993680
## iter 70 value 22.961828
## iter 80 value 22.953286
## iter 90 value 22.934465
## iter 100 value 22.929542
## final value 22.929542
## stopped after 100 iterations
## # weights: 15 (8 variable)
## initial value 195.552987
## iter 10 value 26.857767
## iter 20 value 20.694535
## iter 30 value 18.470545
## iter 40 value 18.092177
## iter 50 value 17.976798
## iter 60 value 17.841829
## iter 70 value 17.768169
## iter 80 value 17.729584
## iter 90 value 17.695714
## iter 100 value 17.660677
## final value 17.660677
## stopped after 100 iterations
## # weights: 15 (8 variable)
## initial value 195.552987
## iter 10 value 38.840089
## iter 20 value 25.181931
## iter 30 value 23.855234
## iter 40 value 23.809249
## iter 50 value 23.802273
## iter 60 value 23.800938
## iter 70 value 23.799450
## iter 80 value 23.798548
## final value 23.798315
## converged
## # weights: 15 (8 variable)
## initial value 195.552987
## iter 10 value 36.994356
## iter 20 value 24.354430
## iter 30 value 22.683756
## iter 40 value 22.163394
## iter 50 value 21.989378
## iter 60 value 21.943926
## iter 70 value 21.896838
## iter 80 value 21.886846
## iter 90 value 21.865875
## iter 100 value 21.849686
## final value 21.849686
## stopped after 100 iterations
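The iteration log above is printed by the underlying multinomial model fitting (it is the familiar trace of nnet::multinom), not by MXM itself. If a quiet run is preferred, one base-R option is to discard the printed output with capture.output() - a sketch, assuming the trace goes to standard output:
### ~ ~ ~ Optional: Silencing The Fitting Trace ~ ~ ~ ###
quiet_log <- capture.output(
  fbed_categorical_LR <- MXM::fbed.reg(target = wine_target,
                                       dataset = wine_dataset,
                                       test = "testIndMultinom",
                                       method = "LR",
                                       backward = TRUE)
)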
So, the algorithm ran once again… Let's see what information we can get out of it.
The main purpose of running the FBED algorithm is to see which variables should be selected as important. The indices of those variables are stored in res.
### ~ ~ ~ eBIC results ~ ~ ~ ###
fbed_categorical_eBIC$res
## Vars eBIC difference
## [1,] 7 89.71506
## [2,] 1 79.17733
## [3,] 13 91.45261
## [4,] 11 87.55535
SelectedVars_names<-colnames(wine_dataset[fbed_categorical_eBIC$res[,1]])
SelectedVars_names
## [1] "Flavanoids" "Alcohol" "Proline" "Hue"
From eBIC, we get as significant the variables “Flavanoids”, “Alcohol”, “Proline” and “Hue”, while from LR …
### ~ ~ ~ LR results ~ ~ ~ ###
fbed_categorical_LR$res
## sel stat pval
## 1 7 44.90417 -22.45209
## 2 1 34.36644 -17.18322
## 3 13 46.64172 -23.32086
## 4 11 42.74446 -21.37223
SelectedVars_names<-colnames(wine_dataset[fbed_categorical_LR$res[,1]])
SelectedVars_names
## [1] "Flavanoids" "Alcohol" "Proline" "Hue"
… exactly the same 4 variables were chosen.
What was stored in the info matrix this time?
fbed_categorical_eBIC$info
## Number of vars Number of tests
## K=0 4 40
## K=1 4 9
fbed_categorical_LR$info
## Number of vars Number of tests
## K=0 4 44
## K=1 4 9
As we see, both approaches needed 2 iterations. The only difference is that LR performed a few more tests in the first round (44 versus 40). Again, this refers to the forward phase only, and for each K the number of selected variables is returned together with the number of tests performed.
And now let us inspect the backward phase.
fbed_categorical_eBIC$back.rem
## numeric(0)
fbed_categorical_LR$back.rem
## numeric(0)
No variable was removed during the backward steps…
fbed_categorical_eBIC$back.n.tests
## [1] 4
fbed_categorical_LR$back.n.tests
## [1] 4
… and both approaches fitted 4 models during the backward phase.
And how quickly did all this happen?
fbed_categorical_eBIC$runtime
## user system elapsed
## 0.63 0.03 0.69
fbed_categorical_LR$runtime
## user system elapsed
## 0.83 0.02 0.95
Really quickly, since the dataset is small.
In case the user wants to run the FBED algorithm for more than one K and compare the differences after each iteration, instead of calling the function with K=0, K=1, K=2 and so on, there is the possibility of running fbed.reg with K=0:2. Then, the selected variables found at K=2, K=1 and K=0 are returned. To make this clearer, we are going to apply again the earlier continuous-target example with LR, but this time we will ask the algorithm to check K = 0:5.
### ~ ~ ~ Running FBED For Many K ~ ~ ~ ###
wine_dataset <- dplyr::select(wine,
-contains("Type"),
-contains("Nonflavanoids"))
wine_target <- wine$Nonflavanoids
fbed_cont_eBIC_manyK <- MXM::fbed.reg(target = wine_target,
dataset = wine_dataset,
test = "testIndFisher",
threshold = 0.05,
wei = NULL,
K = 0:5,
method = "LR",
gam = NULL,
backward = TRUE)
Looking inside the new object fbed_cont_eBIC_manyK, we can find all the information about each K separately. This information is stored in $mod; for example:
### ~ ~ ~ statistics about K=1 ~ ~ ~ ###
fbed_cont_eBIC_manyK$mod$`K=1`
## Vars stat log p-value
## 1 7 3.681241 -8.077634
## 2 3 4.847999 -12.795188
## 3 5 4.107912 -9.694011
## 4 11 2.087806 -3.262599
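Since $mod holds one result matrix per value of K, a one-liner summarises how many variables were selected at each K (a convenience sketch based on the structure shown above):
### ~ ~ ~ Selected Variables Per K ~ ~ ~ ###
sapply(fbed_cont_eBIC_manyK$mod, nrow)   # number of selected variables at each K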
Now you are ready to run your own analysis using MXM's implementation of the FBED algorithm!
Thank you for your attention.
Hope that you found this tutorial helpful.
All analyses have been applied on:
sessionInfo()
## R version 3.6.3 (2020-02-29)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 18363)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=C LC_CTYPE=Greek_Greece.1253
## [3] LC_MONETARY=Greek_Greece.1253 LC_NUMERIC=C
## [5] LC_TIME=Greek_Greece.1253
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] dplyr_0.8.5 MXM_1.4.8
##
## loaded via a namespace (and not attached):
## [1] nlme_3.1-144 ordinal_2019.12-10 doParallel_1.0.15
## [4] RColorBrewer_1.1-2 R.cache_0.14.0 numDeriv_2016.8-1.1
## [7] tools_3.6.3 backports_1.1.6 R6_2.4.1
## [10] rpart_4.1-15 Hmisc_4.4-0 colorspace_1.4-1
## [13] nnet_7.3-12 tidyselect_1.0.0 gridExtra_2.3
## [16] compiler_3.6.3 coxme_2.2-16 cli_2.0.2
## [19] quantreg_5.55 htmlTable_1.13.3 SparseM_1.78
## [22] slam_0.1-47 scales_1.1.0 checkmate_2.0.0
## [25] relations_0.6-9 stringr_1.4.0 digest_0.6.25
## [28] foreign_0.8-75 minqa_1.2.4 R.utils_2.9.2
## [31] base64enc_0.1-3 jpeg_0.1-8.1 pkgconfig_2.0.3
## [34] htmltools_0.4.0 lme4_1.1-21 Rfast2_0.0.5
## [37] htmlwidgets_1.5.1 rlang_0.4.5 rstudioapi_0.11
## [40] visNetwork_2.0.9 generics_0.0.2 energy_1.7-7
## [43] jsonlite_1.6.1 acepack_1.4.1 R.oo_1.23.0
## [46] magrittr_1.5 Formula_1.2-3 Matrix_1.2-18
## [49] Rcpp_1.0.4.6 munsell_0.5.0 fansi_0.4.1
## [52] geepack_1.3-1 lifecycle_0.2.0 RcppZiggurat_0.1.5
## [55] R.methodsS3_1.8.0 ucminf_1.1-4 stringi_1.4.6
## [58] MASS_7.3-51.5 grid_3.6.3 parallel_3.6.3
## [61] bdsmatrix_1.3-4 bigmemory.sri_0.1.3 crayon_1.3.4
## [64] lattice_0.20-38 splines_3.6.3 knitr_1.28
## [67] pillar_1.4.3 boot_1.3-24 codetools_0.2-16
## [70] glue_1.4.0 evaluate_0.14 latticeExtra_0.6-29
## [73] data.table_1.12.8 png_0.1-7 vctrs_0.2.4
## [76] nloptr_1.2.2.1 foreach_1.5.0 MatrixModels_0.4-1
## [79] gtable_0.3.0 purrr_0.3.3 tidyr_1.0.2
## [82] assertthat_0.2.1 ggplot2_3.3.0 xfun_0.12
## [85] Rfast_1.9.9 broom_0.5.5 survival_3.1-12
## [88] tibble_3.0.0 iterators_1.0.12 sets_1.0-18
## [91] cluster_2.1.0 bigmemory_4.5.36 ellipsis_0.3.0
## [94] R.rsp_0.43.2
Borboudakis G. and Tsamardinos I. (2017). Forward-Backward Selection with Early Dropping. https://arxiv.org/pdf/1705.10770.pdf