rfPermute estimates the significance of importance metrics for a Random Forest model by permuting the response variable. It produces a null distribution of each importance metric for each predictor variable, along with a p-value for the observed value. The package also includes several summary and visualization functions for randomForest and rfPermute results.
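The permutation idea described above can be sketched in base R. This is only an illustration, not the package's implementation: absolute correlation stands in for a Random Forest importance metric, the data are simulated, and the names `imp.metric`, `null.dist`, and `pval` are hypothetical. The p-value formula shown is one common convention and is assumed here.

```r
# Sketch only: absolute correlation is a stand-in for an importance metric
set.seed(1)
n <- 100
x <- data.frame(signal = rnorm(n), noise = rnorm(n))
y <- 2 * x$signal + rnorm(n)

# Hypothetical stand-in importance metric, one value per predictor
imp.metric <- function(y, x) sapply(x, function(col) abs(cor(col, y)))

# Observed importance on the unpermuted response
observed <- imp.metric(y, x)

# Null distribution: recompute the metric after permuting the response
nrep <- 200
null.dist <- replicate(nrep, imp.metric(sample(y), x))  # predictors x nrep

# p-value: proportion of null values at least as large as the observed value,
# counting the observed value itself as one draw
pval <- (rowSums(null.dist >= observed) + 1) / (nrep + 1)
pval
```

Permuting the response breaks any real association with the predictors while leaving the predictors themselves intact, so the permuted values show how large the metric can get by chance alone.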
To install the stable version from CRAN:
install.packages('rfPermute')
To install the latest version from GitHub:
# make sure you have Rtools installed
if (!require('devtools')) install.packages('devtools')
# install from GitHub
devtools::install_github('EricArcher/rfPermute')
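Once the package is installed, a minimal run might look like the sketch below. It assumes the built-in iris data and deliberately small `ntree` and `nrep` values to keep it fast. The `$pval` element and `plotNull` are taken from the notes in this README, and the exact structure of the returned object may differ between package versions.

```r
library(rfPermute)

# Small settings chosen only to keep the example quick
set.seed(1)
rp <- rfPermute(Species ~ ., data = iris, ntree = 100, nrep = 50, num.cores = 1)

# Permutation p-values for each predictor and importance metric
rp$pval

# Null distributions of the importance metrics with observed values
plotNull(rp)
```

In practice `nrep` would usually be much larger, since the smallest p-value that can be resolved shrinks as the number of permutations grows.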
casePredictions
Return predictions and votes for training cases
classConfInt
Classification Confidence Intervals
cleanRFdata
Clean Random Forest Input Data
confusionMatrix
Confusion Matrix
exptdErrRate
Expected Error Rate
impHeatmap
Importance Heatmap
pctCorrect
Percent Correctly Classified
plotConfMat
Heatmap representation of Confusion Matrix
plotImpVarDist
Distribution of Important Variables
plotInbag
Distribution of sample inbag rates
plotNull
Plot Random Forest Importance Null Distributions
plotOOBtimes
Distribution of sample OOB rates
plotPredictedProbs
Distribution of prediction assignment probabilities
plotRFtrace
Trace of cumulative error rates in forest
plotVotes
Vote Distribution
plot.rp.importance
Plot Random Forest Importance Distributions
proximityPlot
Plot Random Forest Proximity Scores
rfPermute
Estimate Permutation p-values for Random Forest Importance Metrics
rp.combine
Combine rfPermute Objects
rp.importance
Extract rfPermute Importance Scores and p-values
pctCorrect
casePredictions
plotConfMat, plotOOBtimes, plotRFtrace, plotInbag, and plotImpVarDist visualizations.
Changed confusionMatrix so it will work when a randomForest model doesn’t have a $confusion element, as when the model is the result of combine-ing multiple models.
num.cores to NULL.
Added type argument to plotVotes to choose between area and bar charts.
Renamed plot.rfPermute to plotNull to avoid clashes and maintain functionality of randomForest::plot.randomForest.
Renamed proximity.plot to proximityPlot, exptd.err.rate to exptdErrRate, and clean.rf.data to cleanRFdata to make the camelCase naming scheme more consistent in the package.
Changed plotNull from base graphics to ggplot2.
symb.metab data set.
Added n argument to impHeatmap.
classConfInt, confusionMatrix, plotVotes, pctCorrect.
Fixed plot.rfPermute, which was reporting the p-value incorrectly at the top of the figure.
Changed rfPermute so it works on Windows too.
impHeatmap function.
Changed proximity.plot to use ggplot2 graphics.
rfPermute has separate $null.dist and $pval elements, each with results for unscaled and scaled importance measures. See ?rfPermute for more information.
rp.importance and plot.rfPermute now take a scale argument to specify whether or not importance values should be scaled by standard deviations.
If nrep = 0 for rfPermute, a randomForest object is returned.
grid name clashes.
Fixed clean.rf.data where fixed predictors were not removed.
main argument in plot.rp.importance.
Added num.cores argument to rfPermute to take advantage of multi-threading.
calc.imp.pval to keep it from indexing