SDMtune - presence absence models

Sergio Vignali, Arnaud Barras & Veronika Braunisch

Train the model
Tune model hyperparameters
Evaluate the final model
References

The other vignettes are based on presence only methods. Here you will learn how to train a presence absence model. The following examples are based on the Artificial Neural Networks method (Venables and Ripley 2002), but you can adapt the code for any of the other supported methods. We use the first 8 environmental variables and the virtualSp dataset selecting the absence instead of the background locations.

library(SDMtune)
library(zeallot)

# Prepare data
files <- list.files(path = file.path(system.file(package = "dismo"), "ex"),
                    pattern = "grd", full.names = TRUE)
predictors <- raster::stack(files)
p_coords <- virtualSp$presence
a_coords <- virtualSp$absence
data <- prepareSWD(species = "Virtual species", p = p_coords, a = a_coords,
                   env = predictors[[1:8]])

# Split data in training and testing datasets
c(train, test) %<-% trainValTest(data, test = 0.2, seed = 25)
cat("# Training  : ", nrow(train@data))
cat("\n# Testing   : ", nrow(test@data))

# Create folds
folds <- randomFolds(train, k = 4, seed = 25)

Train the model

We first train the model with default settings and using 10 neurons:

set.seed(25)
model <- train("ANN", data = train, size = 10, folds = folds)
model

Let’s check the training and testing AUC:

auc(model)
auc(model, test = TRUE)

Tune model hyperparameters

To check which hyperparameters can be tuned we use the function getTunableArgs function:

getTunableArgs(model)

We use the function optimizeModel to tune the hyperparameters:

h <- list(size = 10:50, decay = c(0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5),
          maxit = c(50, 100, 300, 500))

om <- optimizeModel(model, hypers = h, metric = "auc", seed = 25)

The best model is:

best_model <- om@models[[1]]
om@results[1, ]

Evaluate the final model

We now train a model with the same configuration as found by the functionoptimizeModel, without cross validation, using all the train data, and we evaluate it using the held apart testing dataset:

set.seed(25)
final_model <- train("ANN", data = train, size = om@results[1, 1],
                     decay = om@results[1, 2], maxit = om@results[1, 4])
plotROC(final_model, test = test)

References

Venables, W N, and B. D. Ripley. 2002. Modern Applied Statistics with S. Fourth Edi. New York, NY: Springer.