fitsde()
functionThe Sim.DiffProc package implements pseudo-maximum likelihood via the fitsde()
function. The interface and the output of the fitsde()
function are made as similar as possible to those of the standard mle
function in the stats4
package of the basic R system. The main arguments to fitsde
consist:
data
a univariate time series (ts
object).drift
and diffusion
indicate drift and diffusion coefficient of the SDE, is an expression
of two variables t
, x
and theta
names of the parameters, and must be nominated by a vector of theta = (theta[1], theta[2],..., theta[p])
for reasons of symbolic derived in approximation methods.start
must be specified as a named list, where the names of the elements of the list correspond to the names of the parameters as they appear in the drift
and diffusion
coefficient.pmle
argument must be a character
string specifying the method to use, can be either: "euler"
Euler pseudo-likelihood, "ozaki"
Ozaki pseudo-likelihood, "shoji"
Shoji pseudo-likelihood and "kessler"
Kessler pseudo-likelihood.optim.method
select the optimization method ("L-BFGS-B"
is used by default), and further arguments to pass to optim
function.lower
and upper
bounds on the variables for the Brent
or L-BFGS-B
method.The functions of type S3 method
(as similar of the standard mle
function in the stats4
package of the basic R system for the class fitsde
are the following:
coef
: which extracts model coefficients from objects returned by fitsde
.vcov
: returns the variance-covariance matrix of the parameters of a fitted model objects.logLik
: extract log-likelihood.AIC
: calculating Akaike’s Information Criterion for fitted model objects.BIC
: calculating Schwarz’s Bayesian Criterion for fitted model objects.confint
: computes confidence intervals for one or more parameters in a fitted model objects.The following we explain how to use this function to estimate a SDE with different approximation methods.
with \(\theta_{1}=1\), \(\theta_{2}=2\), \(\theta_{3}=0.5\) and \(\theta_{4}=0.3\). Before calling fitsde
, we generate sampled data \(X_{t_{i}}\), with \(\Delta t =10^{-4}\), as following:
R> f <- expression( (1+2*x) ) ; g <- expression( 0.5*x^0.3 )
R> sim <- snssde1d(drift=f,diffusion=g,x0=2,N=10^4,Dt=10^-4)
R> mydata <- sim$X
we set the initial values for the optimizer as \(\theta_{1}=\theta_{2}=\theta_{3}=\theta_{4}=1\), and we specify the coefficients drift and diffusion as expressions. you can use the upper
and lower
limits of the search region used by the optimizer (here using the default method "L-BFGS-B"
), and specifying the method to use with pmle="euler"
.
R> fx <- expression( theta[1]+theta[2]*x ) ## drift coefficient of model
R> gx <- expression( theta[3]*x^theta[4] ) ## diffusion coefficient of model
R> fitmod <- fitsde(data = mydata, drift = fx, diffusion = gx, start = list(theta1=1, theta2=1,
+ theta3=1,theta4=1),pmle="euler")
R> fitmod
Call:
fitsde(data = mydata, drift = fx, diffusion = gx, start = list(theta1 = 1,
theta2 = 1, theta3 = 1, theta4 = 1), pmle = "euler")
Coefficients:
theta1 theta2 theta3 theta4
1.11201 1.95717 0.50243 0.30419
The estimated coefficients are extracted from the output object fitmod
as follows:
R> coef(fitmod)
theta1 theta2 theta3 theta4
1.11201 1.95717 0.50243 0.30419
We can use the summary
function to produce result summaries of output object:
R> summary(fitmod)
Pseudo maximum likelihood estimation
Method: Euler
Call:
fitsde(data = mydata, drift = fx, diffusion = gx, start = list(theta1 = 1,
theta2 = 1, theta3 = 1, theta4 = 1), pmle = "euler")
Coefficients:
Estimate Std. Error
theta1 1.11201 1.495082
theta2 1.95717 0.196385
theta3 0.50243 0.010715
theta4 0.30419 0.010697
-2 log L: -66048
vcov
for variance-covariance matrice, and extract log-likelihood by logLik
:
R> vcov(fitmod)
theta1 theta2 theta3 theta4
theta1 2.235269019 -0.2409196808 -0.0000809562 0.000010246
theta2 -0.240919681 0.0385669931 -0.0000048572 0.000013319
theta3 -0.000080956 -0.0000048572 0.0001148041 -0.000108128
theta4 0.000010246 0.0000133186 -0.0001081279 0.000114421
R> logLik(fitmod)
[1] 33024
R> AIC(fitmod)
[1] -66040
R> BIC(fitmod)
[1] -66030
Computes confidence intervals for one or more parameters in a fitted SDE:
R> confint(fitmod, level=0.95)
2.5 % 97.5 %
theta1 -1.81829 4.04232
theta2 1.57226 2.34208
theta3 0.48142 0.52343
theta4 0.28322 0.32515
It is always possible to transform process \(X_t\) with a constant diffusion coefficient using the Lamperti transform.
Now we consider the Vasicek model, with \(f(x,\theta) = \theta_{1} (\theta_{2}- x)\) and constant volatility \(g(x,\theta) = \theta_{3}\), \[\begin{equation}\label{eq12} dX_{t} = \theta_{1} (\theta_{2}- X_{t}) dt + \theta_{3} dW_{t},\qquad X_{0}=5 \end{equation}\]with \(\theta_{1}=3\), \(\theta_{2}=2\) and \(\theta_{3}=0.5\), we generate sampled data \(X_{t_{i}}\), with \(\Delta t =10^{-2}\), as following:
R> f <- expression( 3*(2-x) ) ; g <- expression( 0.5 )
R> sim <- snssde1d(drift=f,diffusion=g,x0=5,Dt=0.01)
R> HWV <- sim$X
we set the initial values for the optimizer as \(\theta_{1}=\theta_{2}=\theta_{3}=1\), and we specify the coefficients drift and diffusion as expressions. Specifying the method to use with pmle="ozaki"
, which can easily be implemented in R as follows:
R> fx <- expression( theta[1]*(theta[2]- x) ) ## drift coefficient of model
R> gx <- expression( theta[3] ) ## diffusion coefficient of model
R> fitmod <- fitsde(data=HWV,drift=fx,diffusion=gx,start = list(theta1=1,theta2=1,
+ theta3=1),pmle="ozaki")
R> summary(fitmod)
Pseudo maximum likelihood estimation
Method: Ozaki
Call:
fitsde(data = HWV, drift = fx, diffusion = gx, start = list(theta1 = 1,
theta2 = 1, theta3 = 1), pmle = "ozaki")
Coefficients:
Estimate Std. Error
theta1 2.79627 0.366488
theta2 1.97629 0.060255
theta3 0.51036 0.011414
-2 log L: -3113.1
If you want to have confidence intervals \(\theta_{1}\) and \(\theta_{2}\) parameters only, using the command confint
as following:
R> confint(fitmod,parm=c("theta1","theta2"),level=0.95)
2.5 % 97.5 %
theta1 2.0780 3.5146
theta2 1.8582 2.0944
with: \(a(t) = \theta_{1}t\), and we generate sampled data \(X_{t_{i}}\), with \(\theta_{1}=-2\), \(\theta_{2}=0.2\) and time step \(\Delta t =10^{-3}\), as following:
R> f <- expression(-2*x*t) ; g <- expression(0.2*x)
R> sim <- snssde1d(drift=f,diffusion=g,N=1000,Dt=0.001,x0=10)
R> mydata <- sim$X
we set the initial values for the optimizer as \(\theta_{1}=\theta_{2}=1\), and we specify the method to use with pmle="shoji"
:
R> fx <- expression( theta[1]*x*t ) ## drift coefficient of model
R> gx <- expression( theta[2]*x ) ## diffusion coefficient of model
R> fitmod <- fitsde(data=mydata,drift=fx,diffusion=gx,start = list(theta1=1,
+ theta2=1),pmle="shoji",lower=c(-3,0),upper=c(-1,1))
R> summary(fitmod)
Pseudo maximum likelihood estimation
Method: Shoji
Call:
fitsde(data = mydata, drift = fx, diffusion = gx, start = list(theta1 = 1,
theta2 = 1), pmle = "shoji", lower = c(-3, 0), upper = c(-1,
1))
Coefficients:
Estimate Std. Error
theta1 -1.58027 0.3436216
theta2 0.19826 0.0044351
-2 log L: -3313.2
with: \(a(t) = \theta_{1}t\) and \(b(t)=\theta_{2}\sqrt{t}\), the volatility depends on time: \(\sigma(t)=\theta_{3}t\). We generate sampled data of \(X_t\), with \(\theta_{1}=3\), \(\theta_{2}=1\) and \(\theta_{3}=0.3\), time step \(\Delta t =10^{-3}\), as following:
R> f <- expression(3*t*(sqrt(t)-x)) ; g <- expression(0.3*t)
R> sim <- snssde1d(drift=f,diffusion=g,M=1,N=1000,x0=2,Dt=0.001)
R> mydata <- sim$X
we set the initial values for the optimizer as \(\theta_{1}=\theta_{2}=\theta_{3}=1\), and we specify the method to use with pmle="kessler"
:
R> fx <- expression( theta[1]*t* ( theta[2]*sqrt(t) - x ) ) ## drift coefficient of model
R> gx <- expression( theta[3]*t ) ## diffusion coefficient of model
R> fitmod <- fitsde(data=mydata,drift=fx,diffusion=gx,start = list(theta1=1,
+ theta2=1,theta3=1),pmle="kessler")
R> summary(fitmod)
Pseudo maximum likelihood estimation
Method: Kessler
Call:
fitsde(data = mydata, drift = fx, diffusion = gx, start = list(theta1 = 1,
theta2 = 1, theta3 = 1), pmle = "kessler")
Coefficients:
Estimate Std. Error
theta1 1.70762 0.3469224
theta2 0.51601 0.3917016
theta3 0.30409 0.0068078
-2 log L: -8437.3
fitsde()
in practiceWe generate data from true model with parameters \(\underline{\theta}=(2,0.3,0.5)\), initial value \(X_{0}=2\) and \(\Delta t =10^{-4}\), as following:
R> f <- expression( 2*x )
R> g <- expression( 0.3*x^0.5 )
R> sim <- snssde1d(drift=f,diffusion=g,M=1,N=10000,x0=2,Dt=0.0001)
R> mydata <- sim$X
We test the performance of the AIC statistics for the four competing models, we can proceed as follows:
R> ## True model
R> fx <- expression( theta[1]*x )
R> gx <- expression( theta[2]*x^theta[3] )
R> truemod <- fitsde(data=mydata,drift=fx,diffusion=gx,start = list(theta1=1,
+ theta2=1,theta3=1),pmle="euler")
R> ## competing model 1
R> fx1 <- expression( theta[1]+theta[2]*x )
R> gx1 <- expression( theta[3]*x^theta[4] )
R> mod1 <- fitsde(data=mydata,drift=fx1,diffusion=gx1,start = list(theta1=1,
+ theta2=1,theta3=1,theta4=1),pmle="euler")
R> ## competing model 2
R> fx2 <- expression( theta[1]+theta[2]*x )
R> gx2 <- expression( theta[3]*sqrt(x) )
R> mod2 <- fitsde(data=mydata,drift=fx2,diffusion=gx2,start = list(theta1=1,
+ theta2=1,theta3=1),pmle="euler")
R> ## competing model 3
R> fx3 <- expression( theta[1] )
R> gx3 <- expression( theta[2]*x^theta[3] )
R> mod3 <- fitsde(data=mydata,drift=fx3,diffusion=gx3,start = list(theta1=1,
+ theta2=1,theta3=1),pmle="euler")
R> ## Computes AIC
R> AIC <- c(AIC(truemod),AIC(mod1),AIC(mod2),AIC(mod3))
R> Test <- data.frame(AIC,row.names = c("True mod","Comp mod1","Comp mod2","Comp mod3"))
R> Bestmod <- rownames(Test)[which.min(Test[,1])]
R> Bestmod
[1] "True mod"
the estimates under the different models:
R> Theta1 <- c(coef(truemod)[[1]],coef(mod1)[[1]],coef(mod2)[[1]],coef(mod3)[[1]])
R> Theta2 <- c(coef(truemod)[[2]],coef(mod1)[[2]],coef(mod2)[[2]],coef(mod3)[[2]])
R> Theta3 <- c(coef(truemod)[[3]],coef(mod1)[[3]],coef(mod2)[[3]],coef(mod3)[[3]])
R> Theta4 <- c("",round(coef(mod1)[[4]],5),"","")
R> Parms <- data.frame(Theta1,Theta2,Theta3,Theta4,row.names = c("True mod",
+ "Comp mod1","Comp mod2","Comp mod3"))
R> Parms
Theta1 Theta2 Theta3 Theta4
True mod 1.988747 0.30575 0.49435
Comp mod1 1.118869 1.80983 0.30573 0.4944
Comp mod2 0.026459 1.98457 0.30285
Comp mod3 9.113331 0.30498 0.49801
We make use of real data of the U.S. Interest Rates monthly form \(06/1964\) to \(12/1989\) (see Figure 1) available in package Ecdat, and we estimate the parameters \(\underline{\theta}=(\theta_{1},\theta_{2},\theta_{3},\theta_{4})\) of CKLS model. These data have been analyzed by Brouste et all (2014) with yuima package, here we confirm the results of the estimates by several approximation methods.
R> data(Irates)
R> rates <- Irates[, "r1"]
R> X <- window(rates, start = 1964.471, end = 1989.333)
R> plot(X)
The U.S. Interest Rates monthly form \(06/1964\) to \(12/1989\)
we can now use all previous methods by fitsde
function to estimate the parameters of CKLS model as follows:
R> fx <- expression( theta[1]+theta[2]*x ) ## drift coefficient of CKLS model
R> gx <- expression( theta[3]*x^theta[4] ) ## diffusion coefficient of CKLS model
R> pmle <- eval(formals(fitsde.default)$pmle)
R> fitres <- lapply(1:4, function(i) fitsde(X,drift=fx,diffusion=gx,pmle=pmle[i],
+ start = list(theta1=1,theta2=1,theta3=1,theta4=1)))
R> Coef <- data.frame(do.call("cbind",lapply(1:4,function(i) coef(fitres[[i]]))))
R> Info <- data.frame(do.call("rbind",lapply(1:4,function(i) logLik(fitres[[i]]))),
+ do.call("rbind",lapply(1:4,function(i) AIC(fitres[[i]]))),
+ do.call("rbind",lapply(1:4,function(i) BIC(fitres[[i]]))),
+ row.names=pmle)
R> names(Coef) <- c(pmle)
R> names(Info) <- c("logLik","AIC","BIC")
R> Coef
euler kessler ozaki shoji
theta1 2.07695 2.14335 2.11532 2.10150
theta2 -0.26319 -0.27434 -0.26905 -0.26647
theta3 0.13022 0.12598 0.12652 0.13167
theta4 1.45132 1.46917 1.46491 1.45131
R> Info
logLik AIC BIC
euler -237.88 483.76 487.15
kessler -237.78 483.57 486.96
ozaki -237.84 483.67 487.07
shoji -237.88 483.76 487.15
In Figure 2 we reports the sample mean of the solution of the CKLS model with the estimated parameters and real data, their empirical \(95\%\) confidence bands (from the \(2.5th\) to the \(97.5th\) percentile), we can proceed as follows:
R> f <- expression( (2.076-0.263*x) )
R> g <- expression( 0.130*x^1.451 )
R> mod <- snssde1d(drift=f,diffusion=g,x0=X[1],M=500, N=length(X),t0=1964.471, T=1989.333)
R> mod
Itô Sde 1D:
| dX(t) = (2.076 - 0.263 * X(t)) * dt + 0.13 * X(t)^1.451 * dW(t)
Method:
| Euler scheme with order 0.5
Summary:
| Size of process | N = 299.
| Number of simulation | M = 500.
| Initial value | x0 = 3.317.
| Time of process | t in [1964.5,1989.3].
| Discretization | Dt = 0.08343.
R> plot(mod,type="n",ylim=c(0,30))
R> lines(X,col=4,lwd=2)
R> lines(time(mod),apply(mod$X,1,mean),col=2,lwd=2)
R> lines(time(mod),apply(mod$X,1,bconfint,level=0.95)[1,],col=5,lwd=2)
R> lines(time(mod),apply(mod$X,1,bconfint,level=0.95)[2,],col=5,lwd=2)
R> legend("topleft",c("real data","mean path",paste("bound of", 95,"% confidence")),inset = .01,col=c(4,2,5),lwd=2,cex=0.8)
The path mean of the solution of the CKLS model with the estimated parameters and real data
snssdekd()
& dsdekd()
& rsdekd()
- Monte-Carlo Simulation and Analysis of Stochastic Differential Equations.bridgesdekd()
& dsdekd()
& rsdekd()
- Constructs and Analysis of Bridges Stochastic Differential Equations.fptsdekd()
& dfptsdekd()
- Monte-Carlo Simulation and Kernel Density Estimation of First passage time.MCM.sde()
& MEM.sde()
- Parallel Monte-Carlo and Moment Equations for SDEs.TEX.sde()
- Converting Sim.DiffProc Objects to LaTeX.fitsde()
- Parametric Estimation of 1-D Stochastic Differential Equation.Brouste A, Fukasawa M, Hino H, Iacus SM, Kamatani K, Koike Y, Masuda H, Nomura R,Ogihara T, Shimuzu Y, Uchida M, Yoshida N (2014). The YUIMA Project: A ComputationalFramework for Simulation and Inference of Stochastic Differential Equations." Journal of Statistical Software, 57(4), 1-51. URL http://www.jstatsoft.org/v57/i04.
Guidoum AC, Boukhetala K (2020). Sim.DiffProc: Simulation of Diffusion Processes. R package version 4.6, URL https://cran.r-project.org/package=Sim.DiffProc.
Iacus SM (2008). Simulation and Inference for Stochastic Differential Equations: With R Examples. Springer-Verlag, New York.
Department of Probabilities & Statistics, Faculty of Mathematics, University of Science and Technology Houari Boumediene, BP 32 El-Alia, U.S.T.H.B, Algeria, E-mail (acguidoum@usthb.dz)↩
Faculty of Mathematics, University of Science and Technology Houari Boumediene, BP 32 El-Alia, U.S.T.H.B, Algeria, E-mail (kboukhetala@usthb.dz)↩