rfPermute estimates the significance of importance metrics for a Random Forest model by permuting the response variable. It produces a null distribution of each importance metric for each predictor variable and a p-value for the observed value. The package also includes several summary and visualization functions for randomForest and rfPermute results.
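The permutation logic described above can be sketched in base R. This is only an illustration of the idea, not the package's implementation: it uses absolute correlation as a stand-in for a Random Forest importance metric, and the names `importance_metric`, `predictors`, `response`, and `n_rep` are hypothetical. The `+ 1` in the p-value is a common permutation-test convention and is an assumption here, not a statement about how rfPermute computes its values.

```r
# Stand-in importance metric (an assumption for illustration):
# absolute correlation between one predictor and the response.
# rfPermute uses Random Forest importance measures instead.
importance_metric <- function(x, y) abs(cor(x, y))

set.seed(42)
n <- 200
predictors <- data.frame(
  signal = rnorm(n),  # related to the response
  noise  = rnorm(n)   # unrelated to the response
)
response <- 2 * predictors$signal + rnorm(n)

# Observed importance of each predictor.
observed <- sapply(predictors, importance_metric, y = response)

# Null distribution: recompute the metric after permuting the response.
n_rep <- 500
null_dist <- replicate(n_rep, {
  permuted <- sample(response)
  sapply(predictors, importance_metric, y = permuted)
})
# null_dist is a matrix: one row per predictor, one column per permutation.

# p-value: share of null values at least as large as the observed value.
p_values <- (rowSums(null_dist >= observed) + 1) / (n_rep + 1)
print(round(p_values, 3))
```

Permuting the response breaks any real association with the predictors while leaving the predictors themselves untouched, so the null values show how large the metric gets by chance alone.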
To install the stable version from CRAN:
install.packages('rfPermute')
To install the latest version from GitHub:
# make sure you have Rtools installed
if (!require('devtools')) install.packages('devtools')
# install from GitHub
devtools::install_github('EricArcher/rfPermute')

casePredictions Return predictions and votes for training cases
classConfInt Classification Confidence Intervals
cleanRFdata Clean Random Forest Input Data
confusionMatrix Confusion Matrix
exptdErrRate Expected Error Rate
impHeatmap Importance Heatmap
pctCorrect Percent Correctly Classified
plotConfMat Heatmap representation of Confusion Matrix
plotImpVarDist Distribution of Important Variables
plotInbag Distribution of sample inbag rates
plotNull Plot Random Forest Importance Null Distributions
plotOOBtimes Distribution of sample OOB rates
plotPredictedProbs Distribution of prediction assignment probabilities
plotRFtrace Trace of cumulative error rates in forest
plotVotes Vote Distribution
plot.rp.importance Plot Random Forest Importance Distributions
proximityPlot Plot Random Forest Proximity Scores
rfPermute Estimate Permutation p-values for Random Forest Importance Metrics
rp.combine Combine rfPermute Objects
rp.importance Extract rfPermute Importance Scores and p-values
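
A minimal usage sketch tying a few of these functions together. It assumes the package and its randomForest dependency are installed, and that `ntree` is forwarded to `randomForest`. The `iris` data set and the small `ntree` and `nrep` values are chosen only to keep the run short. The exact layout of `$null.dist` and `$pval` may differ between package versions, so the sketch only inspects them.

```r
# Assumes rfPermute (and its randomForest dependency) is installed.
library(rfPermute)

set.seed(1)
# iris ships with base R. nrep is the number of permutations of the response.
rp <- rfPermute(Species ~ ., data = iris, ntree = 100, nrep = 20, num.cores = 1)

# Null distributions and p-values are stored on the returned object.
str(rp$null.dist, max.level = 1)
str(rp$pval)

# Plot the null distributions against the observed importance values.
plotNull(rp)
```

`rp.importance`, listed above, extracts the importance scores and p-values in tabular form if the raw elements are not convenient to work with.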

pctCorrect
casePredictions
Added plotConfMat, plotOOBtimes, plotRFtrace, plotInbag, and plotImpVarDist visualizations.
Fixed confusionMatrix so it works when the randomForest model doesn’t have a $confusion element, such as when the model is the result of combine-ing multiple models.
Changed default of num.cores to NULL.
Added type argument to plotVotes to choose between area and bar charts.
Renamed plot.rfPermute to plotNull to avoid clashes and maintain functionality of randomForest::plot.randomForest.
Renamed proximity.plot to proximityPlot, exptd.err.rate to exptdErrRate, and clean.rf.data to cleanRFdata to make the camelCase naming scheme more consistent across the package.
Converted plotNull from base graphics to ggplot2.
symb.metab data set.
Added n argument to impHeatmap.
classConfInt, confusionMatrix, plotVotes, pctCorrect.
Fixed bug in plot.rfPermute that was reporting the p-value incorrectly at the top of the figure.
rfPermute so it works on Windows too.
Added impHeatmap function.
Changed proximity.plot to use ggplot2 graphics.
rfPermute has separate $null.dist and $pval elements, each with results for unscaled and scaled importance measures. See ?rfPermute for more information.
rp.importance and plot.rfPermute now take a scale argument to specify whether or not importance values should be scaled by standard deviations.
If nrep = 0 for rfPermute, a randomForest object is returned.
grid name clashes.
Fixed bug in clean.rf.data where fixed predictors were not removed.
main argument in plot.rp.importance.
Added num.cores argument to rfPermute to take advantage of multi-threading.
calc.imp.pval to keep it from indexing