Superfast Likelihood Inference for Stationary Gaussian Time Series

This vignette illustrates the basic functionality of the SuperGauss package by simulating a few stochastic processes and estimating their parameters from regularly spaced data.

Simulation of fractional Brownian motion

A one-dimensional fractional Brownian motion (fBM) \(X_t = X(t)\) is a continuous Gaussian process with \(E[X_t] = 0\) and \(\mathrm{cov}(X_t, X_s) = \tfrac 1 2 (|t|^{2H} + |s|^{2H} - |t-s|^{2H})\), for \(0 < H < 1\). fBM is not stationary but has stationary increments, such that \((X_{t+\Delta t} - X_t) \stackrel{D}{=} (X_{s+\Delta t} - X_s)\) for any \(s,t\). As such, its covariance function is completely determined its mean squared displacement (MSD) \[ \mathrm{\scriptsize MSD}_X(t) = E[(X_t - X_0)^2] = |t|^{2H}. \] When the Hurst parameter \(H = \tfrac 1 2\), fBM reduces to ordinary Brownian motion.

The following R code generates 5 independent fBM realizations of length \(N = 3000\) with \(H = 0.3\). The timing of the “superfast” method (Wood and Chan 1994) provided in this package is compared to that of a “fast” method (e.g., Brockwell and Davis 1991) and to the usual method (Cholesky decomposition of an unstructured variance matrix).

require(SuperGauss)

N <- 3000 # number of observations
dT <- 1/60 # time between observations (seconds)
H <- .3 # Hurst parameter

tseq <- (0:N)*dT # times at which to sample fBM
npaths <- 5 # number of fBM paths to generate

# to generate fbm, generate its increments, which are stationary
msd <- fbm.msd(tseq = tseq[-1], H = H)
acf <- msd2acf(msd = msd) # convert msd to acf

# superfast method
system.time({
  dX <- rSnorm(n = npaths, acf = acf, fft = TRUE)
})

##    user  system elapsed 
##   0.006   0.000   0.007

# fast method (about 3x as slow)
system.time({
  rSnorm(n = npaths, acf = acf, fft = FALSE)
})

##    user  system elapsed 
##   0.019   0.000   0.019

# unstructured variance method (much slower)
system.time({
  matrix(rnorm(N*npaths), npaths, N) %*% chol(toeplitz(acf))
})

##    user  system elapsed 
##   3.869   0.024   3.897

# convert increments to position measurements
Xt <- apply(rbind(0, dX), 2, cumsum)

# plot
clrs <- c("black", "red", "blue", "orange", "green2")
par(mar = c(4.1,4.1,.5,.5))
plot(0, type = "n", xlim = range(tseq), ylim = range(Xt),
     xlab = "Time (s)", ylab = "Position (m)")
for(ii in 1:npaths) {
  lines(tseq, Xt[,ii], col = clrs[ii], lwd = 2)
}

Inference for the Hurst parameter

Suppose that \(\boldsymbol{X}= (N_{X},\ldots,N_{)}\) are equally spaced observations of an fBM process with \(X_i = X(i \Delta t)\), and let \(\Delta\boldsymbol{X}= (N-1_{\Delta X},\ldots,N-1_{)}\) denote the corresponding increments, \(\Delta X_i = X_{i+1} - X_i\). Then the loglikelihood function for \(H\) is \[ \ell(H \mid \Delta\boldsymbol{X}) = -\tfrac 1 2 \big(\Delta\boldsymbol{X}' \boldsymbol{V}_H^{-1} \Delta\boldsymbol{X}+ \log |\boldsymbol{V}_H|\big), \] where \(V_H\) is a Toeplitz matrix, \[ \boldsymbol{V}_H= [\mathrm{cov}(\Delta X_i, \Delta X_j)]_{0 \le i,j < N} = \begin{bmatrix} \gamma_0 & \gamma_1 & \cdots & \gamma_{N-1} \\ \gamma_1 & \gamma_0 & \cdots & \gamma_{N-2} \\ \vdots & \vdots & \ddots & \vdots \\ \gamma_{N-1} & \gamma_{N-2} & \cdots & \gamma_0 \end{bmatrix}. \] Thus, each evaluation of the loglikelihood requires the inverse and log-determinant of a Toeplitz matrix, which scales as \(\mathcal O(N^2)\) with the Durbin-Levinson algorithm. The SuperGauss package implements an extended version of the Generalized Schur algorithm of Ammar and Gragg (1988), which scales these computations as \(\mathcal O(N \log^2 N)\). With careful memory management and extensive use of the FFTW library (Frigo and Johnson 2005), the SuperGauss implementation crosses over Durbin-Levinson at around \(N = 300\).

The `Toeplitz` matrix class

The bulk of the likelihood calculations in SuperGauss are handled by the Toeplitz matrix class. A Toeplitz object is created as follows:

# allocate and assign in one step
Toep <- Toeplitz(acf = acf)
Toep

## Toeplitz matrix of size 3000 
##  acf:  0.0857 -0.0208 -0.00421 -0.00228 -0.0015 -0.00109 ...

# allocate memory only
Toep <- Toeplitz(n = N)
Toep

## Toeplitz matrix of size 3000 
##  acf:  NULL

Toep$setAcf(acf = acf) # assign later

Its primary methods are illustrated below:

all(acf == Toep$getAcf()) # extract acf

## [1] TRUE

# matrix multiplication
z <- rnorm(N)
x1 <- toeplitz(acf) %*% z # regular way
x2 <- Toep %*% z # with Toeplitz class
range(x1-x2)

## [1] -1.332268e-15  1.554312e-15

# system of equations
y1 <- solve(toeplitz(acf), z) # regular way
y2 <- solve(Toep, z) # with Toeplitz class
range(y1-y2)

## [1] -2.160050e-12  1.236344e-12

# log-determinant
ld1 <- determinant(toeplitz(acf))$mod
ld2 <- determinant(Toep) # note: no $mod
c(ld1, ld2)

## [1] -7642.578 -7642.578

Maximum likelihood calculation

The following code shows how to obtain the maximum likelihood of \(H\) and its standard error for a given fBM path. For speed comparisons, the optimization is done both using the superfast Generalized Schur algorithm and the fast Durbin-Levinson algorithm.

dX <- diff(Xt[,1]) # obtain the increments of a given path
N <- length(dX)

# autocorrelation of fBM increments
fbm.acf <- function(H) {
  msd <- fbm.msd(1:N*dT, H = H)
  msd2acf(msd)
}

# loglikelihood using generalized Schur algorithm
Toep <- Toeplitz(n = N) # pre-allocate memory
loglik.GS <- function(H) {
  Toep$setAcf(acf = fbm.acf(H))
  dSnorm(X = dX, acf = Toep, log = TRUE)
}

# loglikelihood using Durbin-Levinson algorithm
loglik.DL <- function(H) {
  dSnormDL(X = dX, acf = fbm.acf(H), log = TRUE)
}

# superfast method
system.time({
  GS.mle <- optimize(loglik.GS, interval = c(.01, .99), maximum = TRUE)
})

##    user  system elapsed 
##   0.024   0.001   0.026

# fast method (about 10x slower)
system.time({
  DL.mle <- optimize(loglik.DL, interval = c(.01, .99), maximum = TRUE)
})

##    user  system elapsed 
##   0.289   0.001   0.290

c(GS = GS.mle$max, DL = DL.mle$max)

##        GS        DL 
## 0.2986292 0.2986292

# standard error calculation
require(numDeriv)

## Loading required package: numDeriv

Hmle <- GS.mle$max
Hse <- -hessian(func = loglik.GS, x = Hmle) # observed Fisher Information
Hse <- sqrt(1/Hse[1])
c(mle = Hmle, se = Hse)

##         mle          se 
## 0.298629202 0.003332506

Caution with `ReferenceClasses`

In order to effectively manage memory in the underlying C++ code, the Toeplitz class is implemented using R’s “Reference Classes”. Among other things, this means that when a Toeplitz object is passed to a function, the function does not make a copy of it: all modifications to the object inside the object are reflected on the object outside the function as well, as in the following example:

T1 <- Toeplitz(n = N)
T2 <- T1 # shallow copy: both of these point to the same memory location

# affects both objects
T1$setAcf(fbm.acf(.5))
T1

## Toeplitz matrix of size 3000 
##  acf:  0.0167 0 1.73e-18 -3.47e-18 0 6.94e-18 ...

T2

## Toeplitz matrix of size 3000 
##  acf:  0.0167 0 1.73e-18 -3.47e-18 0 6.94e-18 ...

loglik <- function(H) {
  T1$setAcf(acf = fbm.acf(H))
  dSnorm(X = dX, acf = T1, log = TRUE)
}

# affects both objects
loglik(H = .3)

## [1] -467.3969

T1

## Toeplitz matrix of size 3000 
##  acf:  0.0857 -0.0208 -0.00421 -0.00228 -0.0015 -0.00109 ...

T2

## Toeplitz matrix of size 3000 
##  acf:  0.0857 -0.0208 -0.00421 -0.00228 -0.0015 -0.00109 ...

Superfast Newton-Raphson

In addition to the superfast algorithm for Gaussian likelihood evaluations , SuperGauss provides such algorithms for the loglikelihood gradient and Hessian functions, leading to superfast versions of many inference algorithms such as Newton-Raphson and Hamiltonian Monte Carlo. An example of the former is given below using the two-parameter exponential autocorrelation model \[ \mathrm{\scriptsize ACF}_X(t \mid \lambda, \sigma) = \sigma^2 \exp(- |t/\lambda|). \]

# autocorrelation function
exp.acf <- function(t, lambda, sigma) sigma^2 * exp(-abs(t/lambda))
# gradient, returned as a 2-column matrix
exp.acf.grad <- function(t, lambda, sigma) {
  ea <- exp.acf(t, lambda, 1)
  cbind(abs(t)*(sigma/lambda)^2 * ea, # d_acf/d_lambda
        2*sigma * ea) # d_acf/d_sigma
}
# Hessian, returned as an array of size length(t) x 2 x 2
exp.acf.hess <- function(t, lambda, sigma) {
  ea <- exp.acf(t, lambda, 1)
  sl2 <- sigma/lambda^2
  hess <- array(NA, dim = c(length(t), 2, 2))
  hess[,1,1] <- sl2^2*(t^2 - 2*abs(t)*lambda) * ea # d2_acf/d_lambda^2
  hess[,1,2] <- 2*sl2 * abs(t) * ea # d2_acf/(d_lambda d_sigma)
  hess[,2,1] <- hess[,1,2] # d2_acf/(d_sigma d_lambda)
  hess[,2,2] <- 2 * ea # d2_acf/d_sigma^2
  hess
}

# simulate data
lambda <- runif(1, .5, 2)
sigma <- runif(1, .5, 2)
tseq <- (1:N-1)*dT
acf <- exp.acf(t = tseq, lambda = lambda, sigma = sigma)
Xt <- rSnorm(acf = acf)

Toep <- Toeplitz(n = N) # storage space

# negative loglikelihood function of theta = (lambda, sigma)
# include attributes for gradient and Hessian
exp.negloglik <- function(theta) {
  lambda <- theta[1]
  sigma <- theta[2]
  # acf, its gradient, and Hessian
  Toep$setAcf(acf = exp.acf(tseq, lambda, sigma)) # use the Toeplitz class
  dacf <- exp.acf.grad(tseq, lambda, sigma)
  d2acf <- exp.acf.hess(tseq, lambda, sigma)
  nll <- -1 * dSnorm(X = Xt, acf = Toep, log = TRUE)
  attr(nll, "gradient") <- -1 * Snorm.grad(X = Xt, acf = Toep, dacf = dacf)
  attr(nll, "hessian") <- -1 * Snorm.hess(X = Xt, acf = Toep,
                                          dacf = dacf, d2acf = d2acf)
  nll
}

# optimization
system.time({
  mle.fit <- nlm(f = exp.negloglik, p = c(1,1), hessian = TRUE)
})

##    user  system elapsed 
##   0.433   0.047   0.482

# display estimates with standard errors
rbind(true = c(lambda = lambda, sigma = sigma),
      est = mle.fit$estimate,
      se = sqrt(diag(solve(mle.fit$hessian))))

##        lambda    sigma
## true 1.448131 1.676776
## est  1.472559 1.690344
## se   0.363845 0.207625

Superfast Likelihood Inference for Stationary Gaussian Time Series

Yun Ling, Martin Lysy

2020-02-27

Simulation of fractional Brownian motion

Inference for the Hurst parameter

The `Toeplitz` matrix class

Maximum likelihood calculation

Caution with `ReferenceClasses`

Superfast Newton-Raphson

References

Superfast Likelihood Inference for Stationary Gaussian Time Series

Yun Ling, Martin Lysy

2020-02-27

Simulation of fractional Brownian motion

Inference for the Hurst parameter

The Toeplitz matrix class

Maximum likelihood calculation

Caution with ReferenceClasses

Superfast Newton-Raphson

References

The `Toeplitz` matrix class

Caution with `ReferenceClasses`