This package provides a goodness-of-fit test of whether a given i.i.d. sample {x_i} is drawn from a given distribution. It works for any distribution whose score function (the gradient of the log-density), ∇_x log p(x), can be provided. The method is based on "A Kernelized Stein Discrepancy for Goodness-of-fit Tests and Model Evaluation" by Liu, Lee, and Jordan, available at http://arxiv.org/abs/1602.03253.
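To illustrate what a score function looks like in practice, the sketch below writes out the analytic score for two familiar densities and checks each against a central finite difference of the log-density. The helper names (score_normal, score_gamma, numeric_score) are hypothetical and not part of this package; only base R is used.

```r
# Analytic score of N(mu, sigma^2): d/dx log p(x) = -(x - mu) / sigma^2
score_normal <- function(x, mu = 0, sigma = 1) -(x - mu) / sigma^2

# Analytic score of Gamma(shape, rate): d/dx log p(x) = (shape - 1) / x - rate
score_gamma <- function(x, shape = 2, rate = 1) (shape - 1) / x - rate

# Central finite difference of a log-density, used only as a sanity check
numeric_score <- function(logdens, x, eps = 1e-5) {
  (logdens(x + eps) - logdens(x - eps)) / (2 * eps)
}

x <- c(0.5, 1, 2.5)

# Largest disagreement between the analytic and numeric scores
err_normal <- max(abs(
  score_normal(x) - numeric_score(function(z) dnorm(z, log = TRUE), x)
))
err_gamma <- max(abs(
  score_gamma(x) -
    numeric_score(function(z) dgamma(z, shape = 2, rate = 1, log = TRUE), x)
))
```

Both errors should be tiny. Note that the normalizing constant of the density never enters the score, which is why the test can be applied to unnormalized models.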
The main function of this package is KSD, which estimates the Kernelized Stein Discrepancy. Its parameters include the sample X and score_function, both of which appear in the examples below.
Other methods are also in this package, including various demos and examples.
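To make concrete what kind of quantity KSD estimates, here is a minimal base-R sketch of the U-statistic form of the kernelized Stein discrepancy described in the referenced paper, restricted to one-dimensional data. It is not the package's implementation: the function name ksd_ustat_1d, the RBF kernel, and the fixed bandwidth h = 1 are all assumptions made for illustration.

```r
# Sketch of a KSD U-statistic for 1-D data with an RBF kernel of bandwidth h.
# `score` is a function returning d/dx log p(x) for the model being tested.
ksd_ustat_1d <- function(x, score, h = 1) {
  n <- length(x)
  s <- score(x)
  s_i <- matrix(s, n, n)              # s_i[i, j] = s(x_i)
  s_j <- t(s_i)                       # s_j[i, j] = s(x_j)
  d <- outer(x, x, "-")               # d[i, j] = x_i - x_j
  k <- exp(-d^2 / (2 * h^2))          # RBF kernel matrix k(x_i, x_j)
  dk_dxj <- (d / h^2) * k             # derivative of k in its second argument
  dk_dxi <- -dk_dxj                   # derivative of k in its first argument
  d2k <- (1 / h^2 - d^2 / h^4) * k    # mixed second derivative of k
  u <- s_i * s_j * k + s_i * dk_dxj + s_j * dk_dxi + d2k
  diag(u) <- 0                        # U-statistic: drop the i == j terms
  sum(u) / (n * (n - 1))
}

set.seed(1)
score_std_normal <- function(x) -x    # score of N(0, 1)

# Sample that matches the model: the statistic should be close to zero
ksd_match <- ksd_ustat_1d(rnorm(200), score_std_normal)

# Sample from N(2, 1) tested against N(0, 1): the statistic should be clearly positive
ksd_shift <- ksd_ustat_1d(rnorm(200, mean = 2), score_std_normal)
```

The statistic alone is not a test. A p-value additionally requires the distribution of the statistic under the null hypothesis, which this sketch does not attempt to approximate.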
KSD requires the user to provide a score function for the computation. For example usage and exploration, the package provides a gmm class, which allows KSD to be tested on Gaussian mixture models.
Consider the following examples:
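For intuition about what a mixture-model score involves, the following base-R sketch computes the score of a one-dimensional Gaussian mixture directly and compares it with a finite-difference derivative of the log-density. The function score_mixture_1d and the mixture parameters are hypothetical and chosen for illustration; this is not how the package's gmm class is implemented.

```r
# Score of a 1-D Gaussian mixture with weights w, means mu, std. deviations sigma.
# d/dx log p(x) = sum_k w_k N(x; mu_k, sigma_k) * (-(x - mu_k) / sigma_k^2) / p(x)
score_mixture_1d <- function(x, w, mu, sigma) {
  sapply(x, function(xi) {
    comp <- w * dnorm(xi, mean = mu, sd = sigma)     # weighted component densities
    sum(comp * (-(xi - mu) / sigma^2)) / sum(comp)   # density-weighted component scores
  })
}

# Illustrative two-component mixture
w <- c(0.3, 0.7)
mu <- c(-2, 1)
sigma <- c(1, 0.5)

# Log-density of the mixture, used only for the numeric check
log_p <- function(x) {
  sapply(x, function(xi) log(sum(w * dnorm(xi, mean = mu, sd = sigma))))
}

x <- c(-3, 0, 1.5)
eps <- 1e-5
fd <- (log_p(x + eps) - log_p(x - eps)) / (2 * eps)  # central finite difference
mix_err <- max(abs(score_mixture_1d(x, w, mu, sigma) - fd))
```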
# Pass in a dataset generated by a Gaussian distribution;
# pass in the computed score rather than the score function
library(KSD)
library(pryr)
#> Warning: package 'pryr' was built under R version 3.2.5
model <- gmm()
X <- rgmm(model, n=100)
score_function <- scorefunctiongmm(model = model, X = X)
result <- KSD(X, score_function = score_function)
result$p
#> [1] 0.899
# Pass in a dataset generated by a Gaussian distribution;
# use the pryr package to pass in the score function itself
library(KSD)
library(pryr)
model <- gmm()
X <- rgmm(model, n=100)
score_function <- pryr::partial(scorefunctiongmm, model = model)
result <- KSD(X, score_function = score_function)
result$p
#> [1] 0.899
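The second example fixes the model argument ahead of time so that what remains is a function of the data alone. If adding a dependency is undesirable, a plain closure achieves the same effect in base R. The sketch below uses a hypothetical score_gauss function and a hypothetical make_score helper, neither of which is part of this package, purely to show the pattern.

```r
# Hypothetical score with extra parameters besides the data argument X
score_gauss <- function(mu, sigma, X) -(X - mu) / sigma^2

# Closure that fixes mu and sigma, leaving a function of X only
make_score <- function(mu, sigma) {
  force(mu)      # evaluate the arguments now rather than lazily
  force(sigma)
  function(X) score_gauss(mu = mu, sigma = sigma, X = X)
}

score_function <- make_score(mu = 1, sigma = 2)
score_values <- score_function(c(1, 3, 5))   # values 0, -0.5, -1
```

The resulting score_function takes only the data, which is the same shape of function that the pryr::partial call above produces.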
Premade demos include the following (note that these demos require additional libraries):
demo_iris()
demo_normal_performance()
demo_simple_gaussian()
demo_simple_gamma()
demo_gmm()
demo_gmm_multi()
A sample run of demo_iris:
library(KSD)
library(datasets)
library(ggplot2)
#> Warning: package 'ggplot2' was built under R version 3.2.3
library(gridExtra)
#> Warning: package 'gridExtra' was built under R version 3.2.3
library(mclust)
#> Warning: package 'mclust' was built under R version 3.2.3
#> Package 'mclust' version 5.1
#> Type 'citation("mclust")' for citing this R package in publications.
library(pryr)
demo_iris()
#> [1] "Fitting GMM with 3 clusters"
#> [1] "Average p value : 0.366"
Currently, the code is available at https://github.com/MinHyung-Kang/KSD/. More download options will be available after CRAN submission.
Minhyung(dot)Daniel(dot)Kang(at)gmail(dot)com