iBreakDown plots for regression models

Dariusz Komosinski

2020-07-28

Here we will use the dragons data from DALEX package to present the iBreakDown for regression models.

# devtools::install_github("ModelOriented/DALEX")
library(DALEX)
library(iBreakDown)

# Preview the training data: each row is a dragon described by physical
# measurements and discovery metadata, with life_length as the regression target.
head(dragons)
#>   year_of_birth   height   weight scars colour year_of_discovery
#> 1         -1291 59.40365 15.32391     7    red              1700
#> 2          1589 46.21374 11.80819     5    red              1700
#> 3          1528 49.17233 13.34482     6    red              1700
#> 4          1645 48.29177 13.27427     5  green              1700
#> 5            -8 49.99679 13.08757     1    red              1700
#> 6           915 45.40876 11.48717     2    red              1700
#>   number_of_lost_teeth life_length
#> 1                   25   1368.4331
#> 2                   28   1377.0474
#> 3                   38   1603.9632
#> 4                   33   1434.4222
#> 5                   18    985.4905
#> 6                   20    969.5682
# Select a single observation from the test set; this is the prediction
# we will explain throughout the vignette.
new_observation <- dragons_test[1,]
new_observation
#>   year_of_birth   height   weight scars colour year_of_discovery
#> 1          -938 39.18619 10.02391     4  black              1800
#>   number_of_lost_teeth life_length
#> 1                   30     1375.38

Linear regression

First, we fit a model.

# Fit an ordinary least squares model using all remaining columns as predictors.
m_lm <- lm(life_length ~ . , data = dragons)

To understand the factors that drive predictions for a single observation we use the iBreakDown package.

Now, we create an object of the break_down class. If we want to plot distributions of partial predictions, set keep_distributions = TRUE.

# Compute additive break-down attributions for the selected observation.
# keep_distributions = TRUE stores partial-prediction distributions so they
# can be visualized later with plot(..., plot_distributions = TRUE).
bd_lm <- local_attributions(m_lm,
                            data = dragons_test,
                            new_observation =  new_observation,
                            keep_distributions = TRUE)

We can simply print the result.

# Printing the break_down object lists each variable's contribution,
# starting from the intercept and ending at the final prediction.
bd_lm
#>                               contribution
#> lm: intercept                     1356.562
#> lm: scars = 4                     -235.221
#> lm: number_of_lost_teeth = 30      205.037
#> lm: year_of_birth = -938            22.193
#> lm: height = 39.19                  11.296
#> lm: colour = black                  10.856
#> lm: weight = 10.02                  -9.217
#> lm: year_of_discovery = 1800         4.668
#> lm: life_length = 1375               0.000
#> lm: prediction                    1366.174

Or plot the result, which is clearer.

# Waterfall plot of the contributions listed above.
plot(bd_lm)

Use the baseline parameter to set the origin of plots.

# Anchor the bars at 0 instead of the intercept.
plot(bd_lm, baseline = 0)

Use the plot_distributions parameter to see distributions of partial predictions.

# Requires keep_distributions = TRUE when the object was created.
plot(bd_lm, plot_distributions = TRUE)

For other types of models we proceed analogously. However, sometimes we need to create a custom predict function (see the nnet example).

randomForest

library(randomForest)

# Fit a random forest on the same formula; local_attributions works with it
# out of the box because DALEX knows how to call its predict method.
m_rf <- randomForest(life_length ~ . , data = dragons)

bd_rf <- local_attributions(m_rf,
                            data = dragons_test,
                            new_observation =  new_observation)

# head() shows only the first few contributions of the break_down object.
head(bd_rf)
#>                                                 contribution
#> randomForest.formula: intercept                     1411.296
#> randomForest.formula: scars = 4                     -216.775
#> randomForest.formula: number_of_lost_teeth = 30      191.820
#> randomForest.formula: weight = 10.02                  -1.085
#> randomForest.formula: year_of_birth = -938            11.531
#> randomForest.formula: colour = black                   7.214
plot(bd_rf)

SVM

library(e1071)

# Support vector regression with e1071's default radial kernel.
m_svm <- svm(life_length ~ . , data = dragons)

# Same call pattern as before — no custom predict function needed.
bd_svm <- local_attributions(m_svm,
                            data = dragons_test,
                            new_observation =  new_observation)

plot(bd_svm)

nnet

When you use the nnet package for regression, remember to normalize the response variable so that it lies in the interval \((0,1)\).

In this case, creating a custom predict function is also needed.

library(nnet)

# Build a round-number scaling constant: the maximum absolute response
# rounded to its leading digit (e.g. 1604 -> 2000), so the scaled response
# falls into the (0, 1) range that nnet handles well.
x <- max(abs(dragons$life_length))
digits <- floor(log10(x))
normalizing_factor <- round(x, -digits)
# Fit a single-hidden-layer network (10 units) with a linear output unit
# (linout = TRUE) on the scaled response.
m_nnet <- nnet(life_length/normalizing_factor ~ . , data = dragons, size = 10, linout = TRUE)
#> # weights:  111
#> initial  value 349.630022 
#> iter  10 value 27.517170
#> iter  20 value 27.404945
#> iter  30 value 27.388586
#> iter  40 value 27.374960
#> iter  50 value 27.369169
#> iter  60 value 27.351230
#> iter  70 value 20.500091
#> iter  80 value 15.391575
#> iter  90 value 7.567563
#> iter 100 value 3.902800
#> final  value 3.902800 
#> stopped after 100 iterations
# Custom predict function: rescales the model's raw predictions back to the
# original units of the response.
#
# The original version hard-coded the global `normalizing_factor` inside the
# body; it is now an explicit `scale` argument whose default (lazily
# evaluated) is that same global, so existing two-argument calls such as
# local_attributions(..., predict_function = p_fun) keep working unchanged.
#
# @param model           A fitted model accepted by predict().
# @param new_observation A data frame of observations to predict for.
# @param scale           Multiplier applied to the raw predictions; defaults
#                        to the globally defined normalizing_factor.
# @return Numeric predictions on the original response scale.
p_fun <- function(model, new_observation, scale = normalizing_factor){
  predict(model, newdata = new_observation) * scale
}

# Explain the nnet model, supplying the custom predict function so the
# attributions are reported on the original life_length scale.
bd_nnet <- local_attributions(m_nnet,
                            data = dragons_test,
                            new_observation =  new_observation,
                            predict_function = p_fun)

plot(bd_nnet)