Introduction

The MXM R package, short for the Latin 'Mens ex Machina' ('Mind from the Machine'), is a collection of utility functions for feature selection, cross-validation and Bayesian networks. MXM offers many feature selection algorithms that return one or more minimal feature subsets, also referred to as variable signatures, which can improve the performance of downstream analysis tasks such as regression and classification by excluding irrelevant and redundant variables.

In this tutorial we will learn how to use the Forward Backward Early Dropping (FBED) algorithm (Borboudakis and Tsamardinos, 2017). The algorithm is a variation of the classical forward selection. At every step, the most significant variable enters the selected set, while all candidates that are not significant are dropped from further consideration (the "early dropping" heuristic). This continues until no variable can enter the set. The user has the option to repeat this process one or more times (the argument K). In the end, a backward selection phase is performed to remove falsely selected variables. A minimal sketch of the forward phase is given below.
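To make the Early Dropping idea concrete, here is a minimal, illustrative sketch of a single forward round for a continuous target, written in base R. The p-values come from refitting a plain linear model at every step, which only approximates the Fisher test that MXM uses; the function fbed_round_sketch and its internals are our own illustration, not part of the MXM package.

### ~ ~ ~ One Forward Early Dropping Round (illustrative sketch) ~ ~ ~ ###
fbed_round_sketch <- function(y, x, threshold = 0.05) {
  selected  <- integer(0)
  remaining <- seq_len(ncol(x))
  while (length(remaining) > 0) {
    # p-value of each remaining candidate, given the already selected set
    pvals <- sapply(remaining, function(j) {
      dat   <- data.frame(y = y, x[, c(selected, j), drop = FALSE])
      coefs <- summary(lm(y ~ ., data = dat))$coefficients
      coefs[nrow(coefs), 4]  # the candidate enters the model last
    })
    keep <- pvals < threshold
    # Early Dropping: non-significant candidates are discarded for good
    remaining <- remaining[keep]
    if (length(remaining) == 0) break
    # the most significant candidate enters the selected set
    best      <- which.min(pvals[keep])
    selected  <- c(selected, remaining[best])
    remaining <- remaining[-best]
  }
  selected
}

Once the data and target are created below, fbed_round_sketch(wine_target, wine_dataset) would return indices in the spirit of fbed.reg's forward phase with K = 0; MXM::fbed.reg remains the proper, optimized implementation.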

For simplicity, in this tutorial we will use a dataset referred to as "The Wine Dataset".

Loading Data

The Wine Dataset contains the results of a chemical analysis of wines grown in a specific area of Italy. Three types of wine are represented in the 178 samples, with the results of 13 chemical analyses recorded for each sample. Note that the “Type” variable was transformed into a categorical variable.

So, first of all, for this tutorial's analysis, we load the 'MXM' library, along with the 'dplyr' library for easier handling of the dataset; note that the latter is not necessary for the analysis itself.

### ~ ~ ~ Load Packages ~ ~ ~ ###
library(MXM) 
library(dplyr)

As a next step, we download and open the dataset, also defining the column names.

### ~ ~ ~ Load The Dataset ~ ~ ~ ###
wine.url <- "ftp://ftp.ics.uci.edu/pub/machine-learning-databases/wine/wine.data"
wine <- read.csv(wine.url,
                 check.names = FALSE,
                 header = FALSE) 
head(wine)
##   V1    V2   V3   V4   V5  V6   V7   V8   V9  V10  V11  V12  V13  V14
## 1  1 14.23 1.71 2.43 15.6 127 2.80 3.06 0.28 2.29 5.64 1.04 3.92 1065
## 2  1 13.20 1.78 2.14 11.2 100 2.65 2.76 0.26 1.28 4.38 1.05 3.40 1050
## 3  1 13.16 2.36 2.67 18.6 101 2.80 3.24 0.30 2.81 5.68 1.03 3.17 1185
## 4  1 14.37 1.95 2.50 16.8 113 3.85 3.49 0.24 2.18 7.80 0.86 3.45 1480
## 5  1 13.24 2.59 2.87 21.0 118 2.80 2.69 0.39 1.82 4.32 1.04 2.93  735
## 6  1 14.20 1.76 2.45 15.2 112 3.27 3.39 0.34 1.97 6.75 1.05 2.85 1450
str(wine)
## 'data.frame':    178 obs. of  14 variables:
##  $ V1 : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ V2 : num  14.2 13.2 13.2 14.4 13.2 ...
##  $ V3 : num  1.71 1.78 2.36 1.95 2.59 1.76 1.87 2.15 1.64 1.35 ...
##  $ V4 : num  2.43 2.14 2.67 2.5 2.87 2.45 2.45 2.61 2.17 2.27 ...
##  $ V5 : num  15.6 11.2 18.6 16.8 21 15.2 14.6 17.6 14 16 ...
##  $ V6 : int  127 100 101 113 118 112 96 121 97 98 ...
##  $ V7 : num  2.8 2.65 2.8 3.85 2.8 3.27 2.5 2.6 2.8 2.98 ...
##  $ V8 : num  3.06 2.76 3.24 3.49 2.69 3.39 2.52 2.51 2.98 3.15 ...
##  $ V9 : num  0.28 0.26 0.3 0.24 0.39 0.34 0.3 0.31 0.29 0.22 ...
##  $ V10: num  2.29 1.28 2.81 2.18 1.82 1.97 1.98 1.25 1.98 1.85 ...
##  $ V11: num  5.64 4.38 5.68 7.8 4.32 6.75 5.25 5.05 5.2 7.22 ...
##  $ V12: num  1.04 1.05 1.03 0.86 1.04 1.05 1.02 1.06 1.08 1.01 ...
##  $ V13: num  3.92 3.4 3.17 3.45 2.93 2.85 3.58 3.58 2.85 3.55 ...
##  $ V14: int  1065 1050 1185 1480 735 1450 1290 1295 1045 1045 ...
colnames(wine) <- c('Type', 'Alcohol', 'Malic', 'Ash', 
                    'Alcalinity', 'Magnesium', 'Phenols', 
                    'Flavanoids', 'Nonflavanoids',
                    'Proanthocyanins', 'Color', 'Hue', 
                    'Dilution', 'Proline')

FBED for Continuous

For this tutorial example, we are going to apply the FBED algorithm on the above dataset, using only continuous variables both as predictors and as target.

Selecting Appropriate Conditional Independence Test

The selection of the appropriate conditional independence test is a crucial decision for the validity and success of downstream statistical analysis and machine learning tasks. Currently, the MXM R package supports numerous tests for different combinations of target (dependent) and predictor (independent) variables. A detailed summary table to guide you through the selection of the most suitable test can be found in MXM's reference manual (p. 21, "CondInditional independence tests"). In our example we will use MXM::fbed.reg(), which is the implementation of the FBED algorithm, and since we are going to examine only continuous variables, we will use Fisher's independence test.

Creating Data & Target Matrices

dataset - A numeric matrix (or a data.frame in case of categorical predictors), containing the variables for performing the test. The rows should refer to the different samples and columns to the features. For the purposes of this example analysis, we are going to use only the continuous variables, therefore we are removing the “Type” variable from the dataset. Furthermore, we are removing the “Nonflavanoids” variable, because we will use it as target.

### ~ ~ ~ Removing The Categorical ('Type') and The Target ('Nonflavanoids') Variables ~ ~ ~ ###

wine_dataset <- dplyr::select(wine,
                              -contains("Type"),
                              -contains("Nonflavanoids")) 
head(wine_dataset)
##   Alcohol Malic  Ash Alcalinity Magnesium Phenols Flavanoids Proanthocyanins
## 1   14.23  1.71 2.43       15.6       127    2.80       3.06            2.29
## 2   13.20  1.78 2.14       11.2       100    2.65       2.76            1.28
## 3   13.16  2.36 2.67       18.6       101    2.80       3.24            2.81
## 4   14.37  1.95 2.50       16.8       113    3.85       3.49            2.18
## 5   13.24  2.59 2.87       21.0       118    2.80       2.69            1.82
## 6   14.20  1.76 2.45       15.2       112    3.27       3.39            1.97
##   Color  Hue Dilution Proline
## 1  5.64 1.04     3.92    1065
## 2  4.38 1.05     3.40    1050
## 3  5.68 1.03     3.17    1185
## 4  7.80 0.86     3.45    1480
## 5  4.32 1.04     2.93     735
## 6  6.75 1.05     2.85    1450

target - The class variable, i.e. the vector of values of the dependent variable. We should provide either a string, an integer, a numeric value, a vector, a factor, an ordered factor or a Surv object. For the purposes of this example analysis, we are going to use "Nonflavanoids" as the dependent variable.

wine_target <- wine$Nonflavanoids
head(wine_target)
## [1] 0.28 0.26 0.30 0.24 0.39 0.34

Function's Arguments

This is the first time that we are running the algorithm, so we are going to explain what each argument refers to:

target : The class variable. Provide either a string, an integer, a numeric value, a vector, a factor, an ordered factor or a Surv object. As explained above, this will be the dependent variable. If the target is a single integer value or a string, it has to correspond to the column number or to the name of the target feature in the dataset. Here we choose "Nonflavanoids".

dataset : The dataset. Provide either a data frame or a matrix. If the dataset (predictor variables) contains missing (NA) values, they will automatically be replaced by the current variable (column) mean value with an appropriate warning to the user after the execution. Here we choose the whole wine dataset, except from the “Type” (categorical) and “Nonflavanoids” (target) variables.

test : The conditional independence test to use. Default value is NULL. Here, since our dataset includes only continuous features (remember: the categorical variable "Type" was removed) and our dependent variable is also continuous, we choose 'testIndFisher'. For more information about which test to use, please visit: https://www.rdocumentation.org/packages/MXM/versions/0.9.7/topics/CondInditional%20independence%20tests.

threshold : Threshold (suitable values in [0,1]) for the significance of the p-values. The default value is 0.05. Here we choose the default value 0.05.

wei : A vector of weights to be used for weighted regression. The default value is NULL. It is not suggested when "robust" is set to TRUE. If you want to use "testIndBinom", then supply the successes in the y and the trials here. Here we choose the default value NULL.

K : How many times should the process be repeated? The default value is 0. Here we choose 10.

method : Do you want the likelihood ratio test to be performed ("LR", the default value), or the selection to be done with the "eBIC" criterion (BIC is a special case)? Here we choose eBIC in the first example and LR in the second, in order to see the differences in the output.

gam : In case the chosen method is "eBIC", one can also specify the gamma parameter. The default value is NULL, in which case the value is calculated automatically. Here, although we choose eBIC as the selection criterion, we do not specify any gamma parameter.

backward : After the Forward Early Dropping phase, the algorithm proceeds with the usual backward selection phase. The default value is TRUE. It is advised to perform this step, since some variables may have been wrongly selected as false positives. The backward phase with the likelihood ratio test and with eBIC are implemented as two separate functions that can also be called directly by the user; so, if you want, for example, to perform a backward regression with a different threshold value, just use these two functions separately, as sketched below. Here we set the backward argument to TRUE.
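Since the two backward functions just mentioned can be useful on their own, here is a hedged sketch of calling them directly; we assume MXM's bs.reg() (LR-based backward selection) and ebic.bsreg() (eBIC-based), and the argument names and accepted test strings should be double-checked in the reference manual. The stricter threshold of 0.01 is an arbitrary choice for illustration.

### ~ ~ ~ Calling The Backward Phase Directly (sketch) ~ ~ ~ ###
# LR-based backward selection with a stricter (assumed) threshold
back_lr <- MXM::bs.reg(target    = wine_target,
                       dataset   = wine_dataset,
                       threshold = 0.01,
                       test      = "testIndFisher")
# eBIC-based backward selection; gam = NULL as in fbed.reg
back_ebic <- MXM::ebic.bsreg(target  = wine_target,
                             dataset = wine_dataset,
                             test    = "testIndFisher",
                             gam     = NULL)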

Testing with eBIC

### ~ ~ ~ Running FBED with eBIC ~ ~ ~ ###
fbed_cont_eBIC <- MXM::fbed.reg(target     = wine_target,
                                 dataset   = wine_dataset, 
                                 test      = "testIndFisher", 
                                 threshold = 0.05,
                                 wei       = NULL,
                                 K         = 10,
                                 method    = "eBIC",
                                 gam       = NULL,
                                 backward  = TRUE)

Testing with LR

### ~ ~ ~ Running FBED with LR ~ ~ ~ ###
fbed_cont_LR <- MXM::fbed.reg(target       = wine_target,
                                 dataset   = wine_dataset, 
                                 test      = "testIndFisher", 
                                 threshold = 0.05,
                                 wei       = NULL,
                                 K         = 10,
                                 method    = "LR",
                                 gam       = NULL,
                                 backward  = TRUE)

So, the algorithm ran twice… Let's see what information we can get out of it.

Comparing Outputs

The main purpose of running the FBED algorithm is to see which variables should be selected as important. The indices of those variables are stored in res. In the same matrix we also see, for each selected variable, either the eBIC difference (eBIC) or the test statistic together with the associated logged p-value (LR).

### ~ ~ ~ eBIC results ~ ~ ~ ###
fbed_cont_eBIC$res
##      Vars eBIC difference
## [1,]    7       -240.5425
## [2,]    3       -280.7062
## [3,]    5       -291.0249
SelectedVars_names <- colnames(wine_dataset[fbed_cont_eBIC$res[, 1]])
SelectedVars_names
## [1] "Flavanoids" "Ash"        "Magnesium"

From eBIC, we get as significant the variables “Flavanoids”, “Ash” and “Magnesium”, while from LR …

### ~ ~ ~ LR results ~ ~ ~ ###
fbed_cont_LR$res
##   sel     stat       pval
## 1   7 3.681241  -8.077634
## 2   3 4.847999 -12.795188
## 3   5 4.107912  -9.694011
## 4  11 2.087806  -3.262599
SelectedVars_names <- colnames(wine_dataset[fbed_cont_LR$res[, 1]])
SelectedVars_names
## [1] "Flavanoids" "Ash"        "Magnesium"  "Dilution"

… we get the three previous variables, plus "Dilution". So, the two testing approaches do not differ much. In this case, the eBIC criterion performed a stricter feature selection, keeping only 3 variables, while LR returned one variable more.

Since the function returns the variables sorted by their significance, we can easily see that the three variables chosen by both approaches are the most important. Whether the fourth variable should also be used in the downstream analysis depends on the initial question and on the dataset at hand; one possible downstream use of a signature is sketched below.
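For instance, one simple downstream use of the eBIC signature is to fit an ordinary linear model on the selected variables only. This is a plain base-R sketch, and the object names signature and downstream_fit are ours.

### ~ ~ ~ Using The Selected Signature Downstream (sketch) ~ ~ ~ ###
signature <- colnames(wine_dataset)[fbed_cont_eBIC$res[, 1]]
# regress the target on the signature variables only
downstream_fit <- lm(wine_target ~ ., data = wine_dataset[, signature])
summary(downstream_fit)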

As you may imagine, you can also retrieve the scores themselves. They are all stored (sorted) in the second column.

fbed_cont_eBIC$res[,2]
## [1] -240.5425 -280.7062 -291.0249
fbed_cont_LR$res[,2]
##        1        2        3        4 
## 3.681241 4.847999 4.107912 2.087806

Perfect! But we see that the function returned an object called info. What is this?

fbed_cont_eBIC$info
##     Number of vars Number of tests
## K=0              2              20
## K=1              3              10
## K=2              4               9
## K=3              4               8
fbed_cont_LR$info
##     Number of vars Number of tests
## K=0              3              25
## K=1              4               9
## K=2              4               8

The info matrix describes the number of selected variables and the number of tests performed (or models fitted) at each round. Remember the value of K, which in this example we set equal to 10? The algorithm reached K=10 neither with eBIC nor with LR: since nothing changed after the 4th (eBIC) or 3rd (LR) round, it stopped earlier. We also see that LR needed one iteration less. For each K, the number of selected variables is returned together with the number of tests performed; for instance, in the very first round LR had already chosen 3 variables.

Well, all this refers to the forward phase only. So, if the information about the forward step is stored in the info matrix, where can we find information about the backward phase?

fbed_cont_eBIC$back.rem 
## Vars 
##    4
fbed_cont_LR$back.rem 
## numeric(0)

By calling back.rem, we get the variables that were removed in the backward phase. Here eBIC removed one variable, the one with index 4 ("Alcalinity"), which had been selected as a "false positive" during the forward phase, while LR removed none. The removed index can be mapped back to a name, as shown below.
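A quick way to see which variable that index corresponds to, assuming (as above) that back.rem holds column positions of the dataset:

### ~ ~ ~ Mapping The Removed Index To A Name ~ ~ ~ ###
colnames(wine_dataset)[fbed_cont_eBIC$back.rem]
# column 4 of wine_dataset is "Alcalinity"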

In case we are interested in the number of models that were fitted in the backward phase, all we have to do is to look for the back.n.tests variable.

fbed_cont_eBIC$back.n.tests 
## [1] 7
fbed_cont_LR$back.n.tests 
## [1] 4

We see that eBIC fitted more models during the backward phase. This is expected: eBIC's forward phase had selected 4 variables, all 4 were tested, one ("Alcalinity") was removed, and the remaining 3 were re-tested, giving 7 tests in total; LR tested its 4 selected variables once, removed none, and stopped at 4 tests.

But which of the two approaches ran faster?

fbed_cont_eBIC$runtime 
##    user  system elapsed 
##    0.11    0.02    0.13
fbed_cont_LR$runtime 
##    user  system elapsed 
##       0       0       0

Since the dataset is small, we do not see any notable runtime difference.

FBED for Categorical

In this step, we will apply FBED to a categorical target, using both eBIC and LR.

Selecting Appropriate Conditional Independence Test

Since the target variable is categorical - more specifically, a factor with more than two (unordered) levels - and the features are continuous, according to MXM's reference manual (p. 21, "CondInditional independence tests") we should use multinomial logistic regression ( 'testIndMultinom' ).
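Before choosing the test, we can quickly verify the target's type with base R; in the raw file "Type" is read as an integer, which is why it is converted to a factor below.

### ~ ~ ~ Checking The Target's Type ~ ~ ~ ###
is.factor(wine$Type)            # FALSE: read.csv stored it as integer
nlevels(as.factor(wine$Type))   # 3 unordered levels -> 'testIndMultinom'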

Creating Data & Target Matrices

In this step, we keep the whole dataset, in order to show how to use the algorithm without subsetting the initial matrix.

### ~ ~ ~ Taking The Whole Dataset ~ ~ ~ ###
wine_dataset <- dplyr::select(wine,
                              -contains("Type")) 
head(wine_dataset)
##   Alcohol Malic  Ash Alcalinity Magnesium Phenols Flavanoids Nonflavanoids
## 1   14.23  1.71 2.43       15.6       127    2.80       3.06          0.28
## 2   13.20  1.78 2.14       11.2       100    2.65       2.76          0.26
## 3   13.16  2.36 2.67       18.6       101    2.80       3.24          0.30
## 4   14.37  1.95 2.50       16.8       113    3.85       3.49          0.24
## 5   13.24  2.59 2.87       21.0       118    2.80       2.69          0.39
## 6   14.20  1.76 2.45       15.2       112    3.27       3.39          0.34
##   Proanthocyanins Color  Hue Dilution Proline
## 1            2.29  5.64 1.04     3.92    1065
## 2            1.28  4.38 1.05     3.40    1050
## 3            2.81  5.68 1.03     3.17    1185
## 4            2.18  7.80 0.86     3.45    1480
## 5            1.82  4.32 1.04     2.93     735
## 6            1.97  6.75 1.05     2.85    1450
wine_target <- as.factor(wine$Type)
head(wine_target)
## [1] 1 1 1 1 1 1
## Levels: 1 2 3

Setting the Arguments

### ~ ~ ~ Running FBED For Categorical Variable with eBIC~ ~ ~ ###
fbed_categorical_eBIC <- MXM::fbed.reg(target = wine_target,
                                 dataset      = wine_dataset, 
                                 test         = "testIndMultinom",
                                 threshold    = 0.05,
                                 wei          = NULL,
                                 K            = 10,
                                 method       = "eBIC",
                                 gam          = NULL,
                                 backward     = TRUE) 
### ~ ~ ~ Running FBED For Categorical Variable with LR~ ~ ~ ###
fbed_categorical_LR <- MXM::fbed.reg(target = wine_target,
                                 dataset    = wine_dataset, 
                                 test       = "testIndMultinom",
                                 threshold  = 0.05,
                                 wei        = NULL,
                                 K          = 10,
                                 method     = "LR",
                                 gam        = NULL,
                                 backward   = TRUE) 
## # weights:  18 (10 variable)
## initial  value 195.552987 
## iter  10 value 22.769414
## iter  20 value 6.859582
## iter  30 value 1.971162
## iter  40 value 1.829873
## iter  50 value 1.163934
## iter  60 value 1.129584
## iter  70 value 1.071912
## iter  80 value 1.064826
## iter  90 value 0.487277
## iter 100 value 0.477455
## final  value 0.477455 
## stopped after 100 iterations
## # weights:  15 (8 variable)
## initial  value 195.552987 
## iter  10 value 37.451958
## iter  20 value 27.927832
## iter  30 value 24.837868
## iter  40 value 23.766671
## iter  50 value 23.149697
## iter  60 value 22.993680
## iter  70 value 22.961828
## iter  80 value 22.953286
## iter  90 value 22.934465
## iter 100 value 22.929542
## final  value 22.929542 
## stopped after 100 iterations
## # weights:  15 (8 variable)
## initial  value 195.552987 
## iter  10 value 26.857767
## iter  20 value 20.694535
## iter  30 value 18.470545
## iter  40 value 18.092177
## iter  50 value 17.976798
## iter  60 value 17.841829
## iter  70 value 17.768169
## iter  80 value 17.729584
## iter  90 value 17.695714
## iter 100 value 17.660677
## final  value 17.660677 
## stopped after 100 iterations
## # weights:  15 (8 variable)
## initial  value 195.552987 
## iter  10 value 38.840089
## iter  20 value 25.181931
## iter  30 value 23.855234
## iter  40 value 23.809249
## iter  50 value 23.802273
## iter  60 value 23.800938
## iter  70 value 23.799450
## iter  80 value 23.798548
## final  value 23.798315 
## converged
## # weights:  15 (8 variable)
## initial  value 195.552987 
## iter  10 value 36.994356
## iter  20 value 24.354430
## iter  30 value 22.683756
## iter  40 value 22.163394
## iter  50 value 21.989378
## iter  60 value 21.943926
## iter  70 value 21.896838
## iter  80 value 21.886846
## iter  90 value 21.865875
## iter 100 value 21.849686
## final  value 21.849686 
## stopped after 100 iterations

So, the algorithm ran once again… Let's see what information we can get out of it.

Comparing Outputs

The main purpose of running the FBED algorithm is to see which variables should be selected as important. The indices of those variables are stored in res.

### ~ ~ ~ eBIC results ~ ~ ~ ###
fbed_categorical_eBIC$res
##      Vars eBIC difference
## [1,]    7        89.71506
## [2,]    1        79.17733
## [3,]   13        91.45261
## [4,]   11        87.55535
SelectedVars_names <- colnames(wine_dataset[fbed_categorical_eBIC$res[, 1]])
SelectedVars_names
## [1] "Flavanoids" "Alcohol"    "Proline"    "Hue"

From eBIC, we get as significant the variables "Flavanoids", "Alcohol", "Proline" and "Hue", while from LR …

### ~ ~ ~ LR results ~ ~ ~ ###
fbed_categorical_LR$res
##   sel     stat      pval
## 1   7 44.90417 -22.45209
## 2   1 34.36644 -17.18322
## 3  13 46.64172 -23.32086
## 4  11 42.74446 -21.37223
SelectedVars_names <- colnames(wine_dataset[fbed_categorical_LR$res[, 1]])
SelectedVars_names
## [1] "Flavanoids" "Alcohol"    "Proline"    "Hue"

… exactly the same 4 variables were chosen.

What was stored this time in the info matrix?

fbed_categorical_eBIC$info
##     Number of vars Number of tests
## K=0              4              40
## K=1              4               9
fbed_categorical_LR$info
##     Number of vars Number of tests
## K=0              4              44
## K=1              4               9

As we see, both approaches needed 2 rounds. The main difference is that LR performed a few more tests in the first round (44 versus 40). Again, this refers to the forward phase only: for each K, the number of selected variables is returned together with the number of tests performed.

And now let us inspect the backward phase:

fbed_categorical_eBIC$back.rem 
## numeric(0)
fbed_categorical_LR$back.rem 
## numeric(0)

No variable was removed during the backward steps…

fbed_categorical_eBIC$back.n.tests 
## [1] 4
fbed_categorical_LR$back.n.tests 
## [1] 4

… and both approaches fitted 4 models during the backward phase.

And how quickly did all this happen?

fbed_categorical_eBIC$runtime 
##    user  system elapsed 
##    0.63    0.03    0.69
fbed_categorical_LR$runtime 
##    user  system elapsed 
##    0.83    0.02    0.95

Really quick, since the dataset is small.

FBED for more than one K

If the user wants to run the FBED algorithm for more than one value of K and compare the differences after each iteration, then instead of calling the function with K=0, K=1, K=2 and so on, fbed.reg can be run with K=0:2. The selected variables found at K=2, K=1 and K=0 are then returned. To make this clearer, we are going to rerun the continuous example from above (the Fisher test with LR), but this time we will ask the algorithm to check K = 0:5.

### ~ ~ ~ Running FBED For Many K ~ ~ ~ ###
wine_dataset <- dplyr::select(wine,
                              -contains("Type"),
                              -contains("Nonflavanoids")) 
wine_target <- wine$Nonflavanoids
fbed_cont_eBIC_manyK <- MXM::fbed.reg(target = wine_target,
                                 dataset   = wine_dataset, 
                                 test      = "testIndFisher", 
                                 threshold = 0.05,
                                 wei       = NULL,
                                 K         = 0:5,
                                 method    = "LR",
                                 gam       = NULL,
                                 backward  = TRUE)

Looking inside the new object fbed_cont_eBIC_manyK, we can find all the information about each K separately. This information is stored in $mod, for example:

### ~ ~ ~ statistics about K=1 ~ ~ ~ ###
fbed_cont_eBIC_manyK$mod$`K=1`
##   Vars     stat log p-value
## 1    7 3.681241   -8.077634
## 2    3 4.847999  -12.795188
## 3    5 4.107912   -9.694011
## 4   11 2.087806   -3.262599
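To compare the signatures across all values of K at once, one can loop over the elements of $mod. This sketch assumes that, as in the printout above, each element is a matrix whose first column holds the selected variable indices.

### ~ ~ ~ Selected Variable Names For Every K (sketch) ~ ~ ~ ###
lapply(fbed_cont_eBIC_manyK$mod,
       function(m) colnames(wine_dataset)[m[, 1]])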

Conclusion

Now you are ready to run your own analysis using the FBED algorithm from the MXM package!
Thank you for your attention.
We hope you found this tutorial helpful.

Session Info {.unnumbered}

All analyses have been applied on:

sessionInfo()
## R version 3.6.3 (2020-02-29)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 18363)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=C                  LC_CTYPE=Greek_Greece.1253   
## [3] LC_MONETARY=Greek_Greece.1253 LC_NUMERIC=C                 
## [5] LC_TIME=Greek_Greece.1253    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] dplyr_0.8.5 MXM_1.4.8  
## 
## loaded via a namespace (and not attached):
##  [1] nlme_3.1-144        ordinal_2019.12-10  doParallel_1.0.15  
##  [4] RColorBrewer_1.1-2  R.cache_0.14.0      numDeriv_2016.8-1.1
##  [7] tools_3.6.3         backports_1.1.6     R6_2.4.1           
## [10] rpart_4.1-15        Hmisc_4.4-0         colorspace_1.4-1   
## [13] nnet_7.3-12         tidyselect_1.0.0    gridExtra_2.3      
## [16] compiler_3.6.3      coxme_2.2-16        cli_2.0.2          
## [19] quantreg_5.55       htmlTable_1.13.3    SparseM_1.78       
## [22] slam_0.1-47         scales_1.1.0        checkmate_2.0.0    
## [25] relations_0.6-9     stringr_1.4.0       digest_0.6.25      
## [28] foreign_0.8-75      minqa_1.2.4         R.utils_2.9.2      
## [31] base64enc_0.1-3     jpeg_0.1-8.1        pkgconfig_2.0.3    
## [34] htmltools_0.4.0     lme4_1.1-21         Rfast2_0.0.5       
## [37] htmlwidgets_1.5.1   rlang_0.4.5         rstudioapi_0.11    
## [40] visNetwork_2.0.9    generics_0.0.2      energy_1.7-7       
## [43] jsonlite_1.6.1      acepack_1.4.1       R.oo_1.23.0        
## [46] magrittr_1.5        Formula_1.2-3       Matrix_1.2-18      
## [49] Rcpp_1.0.4.6        munsell_0.5.0       fansi_0.4.1        
## [52] geepack_1.3-1       lifecycle_0.2.0     RcppZiggurat_0.1.5 
## [55] R.methodsS3_1.8.0   ucminf_1.1-4        stringi_1.4.6      
## [58] MASS_7.3-51.5       grid_3.6.3          parallel_3.6.3     
## [61] bdsmatrix_1.3-4     bigmemory.sri_0.1.3 crayon_1.3.4       
## [64] lattice_0.20-38     splines_3.6.3       knitr_1.28         
## [67] pillar_1.4.3        boot_1.3-24         codetools_0.2-16   
## [70] glue_1.4.0          evaluate_0.14       latticeExtra_0.6-29
## [73] data.table_1.12.8   png_0.1-7           vctrs_0.2.4        
## [76] nloptr_1.2.2.1      foreach_1.5.0       MatrixModels_0.4-1 
## [79] gtable_0.3.0        purrr_0.3.3         tidyr_1.0.2        
## [82] assertthat_0.2.1    ggplot2_3.3.0       xfun_0.12          
## [85] Rfast_1.9.9         broom_0.5.5         survival_3.1-12    
## [88] tibble_3.0.0        iterators_1.0.12    sets_1.0-18        
## [91] cluster_2.1.0       bigmemory_4.5.36    ellipsis_0.3.0     
## [94] R.rsp_0.43.2

References {.unnumbered}

Borboudakis G. and Tsamardinos I. (2017). Forward-Backward Selection with Early Dropping. https://arxiv.org/pdf/1705.10770.pdf