DANN

Package Introduction

DANN is a variation of k nearest neighbors in which the shape of the neighborhood depends on the class labels of the training data. The neighborhood is elongated along class boundaries and shrunk in the direction orthogonal to them. See Discriminant Adaptive Nearest Neighbor Classification by Hastie and Tibshirani. This package implements DANN and the sub-DANN method from section 4.1 of the publication, and is based on Christopher Jenness’s Python implementation.
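To make the neighborhood shape concrete, here is a minimal base R sketch of the local metric described in the Hastie and Tibshirani paper, Sigma = W^(-1/2) [W^(-1/2) B W^(-1/2) + epsilon I] W^(-1/2), where W and B are the within-class and between-class covariance matrices of the points near a query. It is an illustration, not this package's code: the function name dann_metric and its arguments x0 and eps are hypothetical, and the sketch uses unweighted covariance estimates where the paper uses kernel-weighted ones.

```r
# Sketch only: dann_metric, x0, and eps are illustrative names, not the dann package API.
dann_metric <- function(x, y, x0, neighborhood_size = 50, eps = 1) {
  # Select the nearest training points to x0 in plain Euclidean distance.
  d2 <- rowSums(sweep(x, 2, x0)^2)
  idx <- order(d2)[seq_len(min(neighborhood_size, nrow(x)))]
  xn <- x[idx, , drop = FALSE]
  yn <- y[idx]

  p <- ncol(x)
  overall_mean <- colMeans(xn)
  W <- matrix(0, p, p)  # pooled within-class covariance (unweighted simplification)
  B <- matrix(0, p, p)  # between-class covariance
  for (cl in unique(yn)) {
    xc <- xn[yn == cl, , drop = FALSE]
    class_mean <- colMeans(xc)
    W <- W + crossprod(sweep(xc, 2, class_mean)) / nrow(xn)
    B <- B + (nrow(xc) / nrow(xn)) * tcrossprod(class_mean - overall_mean)
  }

  # W^(-1/2) from the eigendecomposition of the symmetric matrix W.
  e <- eigen(W, symmetric = TRUE)
  W_inv_sqrt <- e$vectors %*% diag(1 / sqrt(e$values), p) %*% t(e$vectors)

  # Sphere the data with W, add eps * I to the sphered B, and map back.
  B_star <- W_inv_sqrt %*% B %*% W_inv_sqrt
  W_inv_sqrt %*% (B_star + eps * diag(p)) %*% W_inv_sqrt
}

# Toy data with a vertical class boundary at X1 = 0.
set.seed(1)
x <- matrix(runif(400, -1, 1), ncol = 2)
y <- ifelse(x[, 1] > 0, 2, 1)
Sigma <- dann_metric(x, y, x0 = c(0, 0))
Sigma[1, 1] > Sigma[2, 2]
```

In this toy layout the class means differ only along X1, so the metric penalizes distance across the boundary more heavily than distance along it. Nearest neighbors measured with Sigma therefore come from a region that is stretched along the boundary and narrow across it.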

Arguments

Example: Circle Data

The example below creates a training set and a test set. Class 1 lies inside a circle and class 2 surrounds it. dann is an accurate model for these data.

library(dann)
library(mlbench)
library(magrittr)
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)

######################
# Circle Data
######################
set.seed(1)
train <- mlbench.circle(500, 2) %>%
  tibble::as_tibble()
colnames(train) <- c("X1", "X2", "Y")

ggplot(train, aes(x = X1, y = X2, colour = Y)) +
  geom_point()


xTrain <- train %>%
  select(X1, X2) %>%
  as.matrix()

# Convert the factor labels to a plain numeric vector of class codes.
yTrain <- train %>%
  pull(Y) %>%
  as.numeric() %>%
  as.vector()

test <- mlbench.circle(500, 2) %>%
  tibble::as_tibble()
colnames(test) <- c("X1", "X2", "Y")

xTest <- test %>%
  select(X1, X2) %>%
  as.matrix()

yTest <- test %>%
  pull(Y) %>%
  as.numeric() %>%
  as.vector()

dannPreds <- dann(xTrain = xTrain, yTrain = yTrain, xTest = xTest,
                  k = 3, neighborhood_size = 50, epsilon = 1, probability = FALSE)
mean(dannPreds == yTest) # Proportion of test points classified correctly.
#> [1] 0.96

rm(train, test)
rm(xTrain, yTrain)
rm(xTest, yTest)
rm(dannPreds)