Function dfddm
evaluates the density function (or probability density function, PDF) for the Ratcliff diffusion decision model (DDM) using different methods for approximating the full PDF, which contains an infinite sum. An overview of the mathematical details of the different approximations is provided in the Math Vignette. Timing benchmarks for the present methods and comparison with existing methods are provided in the Benchmark Vignette. Two examples of using dfddm
for parameter estimation are provided in the Example Vignette.
Our implementation of the DDM has the following parameters: \(a \in (0, \infty)\) (threshold separation), \(v \in (-\infty, \infty)\) (drift rate), \(t_0 \in [0, \infty)\) (non-decision time/response time constant), \(w \in (0, 1)\) (relative starting point), and \(sv \in (0, \infty)\) (inter-trial-variability of drift).
This vignette is more technical in nature and will demonstrate not only the consistency across the different implemented methods but also show that dfddm
is in accordance with other implementations in the literature. In addition to testing all of the methods available in dfddm
, we also include three functions that are considered to be current and widely used. First, we include the function ddiffusion
from the package rtdists
as it is well known for being not only user friendly and feature-rich but also designed specifically for handling data with regard to distributions for response time models. Second, we also test the dwiener
function from the RWiener
package, which is mainly aimed at providing an R
language interface to calculate various functions from the Wiener process. Third, we include some raw R
code that is bundled with the paper written by Gondan, Blurton, and Kesselmeier (2014) about improving the approximation to the density function. As this code was not yet available in an R
package, it is part of fddm
and can be accessed after running source
on the corresponding file as shown below.
First, we perform a simple test in Validating the Density Functions, evaluating each algorithm for the density function of the DDM and comparing their consistency. To ensure rigorous concordance across all algorithms, we test each algorithm throughout a sufficiently large and granular parameter space that can be found fully defined in the subsection Generating Data. Of course these algorithms are approximations, and as such they will provide slightly different results given the same parameter inputs. To ensure that these small differences in density output do not affect the results of fitting parameters to real-world data, we also include testing in an optimization setting in Validating Fitting (Optimization) Using the Density Functions. We use the various algorithms as the bases for log-likelihood functions for the optimization process and include a variety of starting points in the parameter space to ensure rigorous consistency testing.
It is imperative to show that our density functions produce the same results as the current standards. To demonstrate this, we calculate the lower probability density for a granular parameter space and show that all of the results are within an acceptable error tolerance. We only calculate the lower probability density because the upper probability density negates \(v\) (\(v' = -v\)) and takes the complement of \(w\) (\(w' = 1-w\)); our parameter space already includes all of these values so calculating the upper probability density would be redundant. The fully defined parameter space can be found below, and it includes both realistically feasible values and some extreme values for each parameter. Since each of these functions approximates an infinite summation to a desired precision of \(\epsilon\), we allow for a difference of \(2 \cdot \epsilon\) between any pair of calculated densities to account for convergence from above and below the limit of the summation.
Note that we include \(sv\) in all the functions tested below, even those that do not contain \(sv\) themselves (i.e., RWiener
and Gondan, Blurton, and Kesselmeier (2014)). This is possible because variability in drift rate can be added to those functions via the constant \(M\) as described in the Math Vignette.
First we load the necessary packages and code available from the current literature.
library("fddm")
library("rtdists")
library("RWiener")
source(system.file("extdata", "Gondan_et_al_density.R", package = "fddm", mustWork = TRUE))
The following code chunk stores the lower probability densities calculated across the parameter space using the different methods; we also calculate and store the log-probability for consistency testing.
# Define parameter space
RT <- c(0.001, 0.1, 1, 10)
A <- c(0.5, 1, 5)
V <- c(-5, 0, 5)
t0 <- 1e-4 # must be nonzero for RWiener
W <- c(0.2, 0.5, 0.8)
SV <- c(0, 0.5, 1.5)
SV_THRESH <- 1e-6
eps <- 1e-6 # this is the setting from rtdists
nRT <- length(RT)
nA <- length(A)
nV <- length(V)
nW <- length(W)
nSV <- length(SV)
resp <- rep("lower", nRT) # for RWiener
fnames <- c("fs_SWSE_17", "fs_SWSE_14",
"fs_Gon_17", "fs_Gon_14", "fs_Nav_17", "fs_Nav_14",
"fb_Gon_17", "fb_Gon_14", "fb_Nav_17", "fb_Nav_14",
"fl_Nav_09", "RWiener", "Gondan", "rtdists")
nf <- length(fnames)
res <- data.frame(matrix(ncol = 9, nrow = nf*nRT*nA*nV*nW*nSV))
colnames(res) <- c('rt', 'a', 'v', 'w', 'sv', 'FuncName', 'res', 'dif',
'log_res')
start <- 1
stop <- nf
# Loop through each combination of parameters and record results
for (rt in 1:nRT) {
for (a in 1:nA) {
for (v in 1:nV) {
for (w in 1:nW) {
for (sv in 1:nSV) {
# add the rt, v, a, w, and function names to the dataframe
res[start:stop, 1] <- rep(RT[rt], nf)
res[start:stop, 2] <- rep(A[a] , nf)
res[start:stop, 3] <- rep(V[v] , nf)
res[start:stop, 4] <- rep(W[w] , nf)
res[start:stop, 5] <- rep(SV[sv], nf)
res[start:stop, 6] <- fnames
# calculate "lower" density
res[start, 7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "SWSE",
summation_small = "2017", scale = "small",
err_tol = eps)
res[start+1, 7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "SWSE",
summation_small = "2014", scale = "small",
err_tol = eps)
res[start+2, 7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "Gondan",
summation_small = "2017", scale = "small",
err_tol = eps)
res[start+3, 7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "Gondan",
summation_small = "2014", scale = "small",
err_tol = eps)
res[start+4, 7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "Navarro",
summation_small = "2017", scale = "small",
err_tol = eps)
res[start+5, 7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "Navarro",
summation_small = "2014", scale = "small",
err_tol = eps)
res[start+6, 7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "Gondan",
summation_small = "2017", scale = "both",
err_tol = eps)
res[start+7, 7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "Gondan",
summation_small = "2014", scale = "both",
err_tol = eps)
res[start+8, 7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "Navarro",
summation_small = "2017", scale = "both",
err_tol = eps)
res[start+9, 7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "Navarro",
summation_small = "2014", scale = "both",
err_tol = eps)
res[start+10, 7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "",
summation_small = "", scale = "large",
err_tol = eps)
res[start+11, 7] <- dwiener(RT[rt], resp = resp[rt], alpha = A[a],
delta = V[v], tau = t0, beta = W[w],
give_log = FALSE)
res[start+12, 7] <- fs(t = RT[rt]-t0, a = A[a], v = V[v],
w = W[w], eps = eps)
res[start+13, 7] <- ddiffusion(RT[rt], resp[rt], a = A[a], v = V[v],
t0 = t0, z = W[w]*A[a], sv = SV[sv])
if (sv > SV_THRESH) { # multiply to get density with sv
t <- RT[rt] - t0
M <- exp(V[v] * A[a] * W[w] + V[v]*V[v] * t / 2 +
(SV[sv]*SV[sv] * A[a]*A[a] * W[w]*W[w] -
2 * V[v] * A[a] * W[w] - V[v]*V[v] * t) /
(2 + 2 * SV[sv]*SV[sv] * t)) / sqrt(1 + SV[sv]*SV[sv] * t)
res[start+11, 7] <- M * res[start+11, 7] # RWiener
res[start+12, 7] <- M * res[start+12, 7] # Gondan_R
}
# calculate differences
ans <- res[start, 7] # use Foster_2017_small as truth
res[start, 8] <- abs(res[start, 7] - ans)
res[start+1, 8] <- abs(res[start+1, 7] - ans)
res[start+2, 8] <- abs(res[start+2, 7] - ans)
res[start+3, 8] <- abs(res[start+3, 7] - ans)
res[start+4, 8] <- abs(res[start+4, 7] - ans)
res[start+5, 8] <- abs(res[start+1, 7] - ans)
res[start+6, 8] <- abs(res[start+6, 7] - ans)
res[start+7, 8] <- abs(res[start+7, 7] - ans)
res[start+8, 8] <- abs(res[start+8, 7] - ans)
res[start+9, 8] <- abs(res[start+9, 7] - ans)
res[start+10, 8] <- abs(res[start+10, 7] - ans)
res[start+11, 8] <- abs(res[start+11, 7] - ans)
res[start+12, 8] <- abs(res[start+12, 7] - ans)
res[start+13, 8] <- abs(res[start+13, 7] - ans)
# calculate log of "lower" density
res[start, 9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "SWSE",
summation_small = "2017", scale = "small",
err_tol = eps)
res[start+1, 9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "SWSE",
summation_small = "2014", scale = "small",
err_tol = eps)
res[start+2, 9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "Gondan",
summation_small = "2017", scale = "small",
err_tol = eps)
res[start+3, 9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "Gondan",
summation_small = "2014", scale = "small",
err_tol = eps)
res[start+4, 9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "Navarro",
summation_small = "2017", scale = "small",
err_tol = eps)
res[start+5, 9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "Navarro",
summation_small = "2014", scale = "small",
err_tol = eps)
res[start+6, 9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "Gondan",
summation_small = "2017", scale = "both",
err_tol = eps)
res[start+7, 9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "Gondan",
summation_small = "2014", scale = "both",
err_tol = eps)
res[start+8, 9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "Navarro",
summation_small = "2017", scale = "both",
err_tol = eps)
res[start+9, 9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "Navarro",
summation_small = "2014", scale = "both",
err_tol = eps)
res[start+10, 9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "",
summation_small = "", scale = "large",
err_tol = eps)
res[start+11, 9] <- dwiener(RT[rt], resp = resp[rt], alpha = A[a],
delta = V[v], tau = t0, beta = W[w],
give_log = TRUE)
res[start+12, 9] <- log(fs(t = RT[rt]-t0, a = A[a], v = V[v],
w = W[w], eps = eps))
res[start+13, 9] <- log(ddiffusion(RT[rt], resp[rt], a = A[a],
v = V[v], t0 = t0, z = W[w]*A[a],
sv = SV[sv]))
if (sv > SV_THRESH) { # add to get log of density with sv
t <- RT[rt] - t0
M <- V[v] * A[a] * W[w] + V[v]*V[v] * t / 2 +
(SV[sv]*SV[sv] * A[a]*A[a] * W[w]*W[w] -
2 * V[v] * A[a] * W[w] - V[v]*V[v] * t) /
(2 + 2 * SV[sv]*SV[sv] * t) - 0.5 * log(1 + SV[sv]*SV[sv] * t)
res[start+11, 9] <- M + res[start+11, 9] # RWiener
res[start+12, 9] <- M + res[start+12, 9] # Gondan_R
}
# iterate start and stop values
start = start + nf
stop = stop + nf
}
}
}
}
}
Now we test the consistency of the various density function approximations by using the testthat
package. As discussed above, we allow the difference between any two outputs to be twice the original desired accuracy to allow for convergence from either above or below. We have to amend some of the checks because of the instability of some of the methods, and these are documented in the Known Errors section below. First we check that all of the approximations produce densities that are non-negative (we allow a density equal to zero). Next, we check the consistency of the internal dfddm
methods before confirming their accuracy with external code from established packages. Lastly, we run similar tests to check the consistency of the logged results; however, we do not test the logs of densities that are very close to zero (i.e., less than \(\epsilon^2\)) because even a small difference in approximated density can lead to a very large difference in logged density. If all tests pass correctly, there should be no output.
library("testthat")
# Subset results
SWSE <- subset(res, FuncName %in% fnames[c(1,2)])
Gondan_s <- subset(res, FuncName %in% fnames[c(3,4)])
Gondan_b <- subset(res, FuncName %in% fnames[c(7,8)])
Navarro_s <- subset(res, FuncName %in% fnames[c(5,6)])
Navarro_l <- subset(res, FuncName %in% fnames[11])
Navarro_b <- subset(res, FuncName %in% fnames[c(9,10)])
rtdists <- subset(res, FuncName == "rtdists")
RWiener <- subset(res, FuncName == "RWiener")
Gondan_R <- subset(res, FuncName == "Gondan")
# Compensate for KE 1, 2
Nav_s_res_0 <- subset(Navarro_s, res < 0) # KE 1
Nav_s_0 <- min(Nav_s_res_0$rt / Nav_s_res_0$a / Nav_s_res_0$a) # = 12
Nav_l_res_0 <- subset(Navarro_l, res < 0) # KE 2
Nav_l_0 <- max(Nav_l_res_0$rt / Nav_l_res_0$a / Nav_l_res_0$a) # = 0.04
Nav_l_dif_2eps <- subset(Navarro_l, dif > 2*eps) # KE 2
Nav_l_2 <- max(Nav_l_dif_2eps$rt / Nav_l_dif_2eps$a / Nav_l_dif_2eps$a) # = 0.004
# Ensure all densities are non-negative
test_that("Non-negativity of densities", {
expect_true(all(SWSE$res >= 0))
expect_true(all(Gondan_s$res >= 0))
expect_true(all(Gondan_b$res >= 0))
expect_true(all(subset(Navarro_s, rt/a/a < Nav_s_0)$res >= 0)) # see KE 1
expect_true(all(subset(Navarro_l, rt/a/a > Nav_l_0)$res >= 0)) # see KE 2
expect_true(all(subset(Navarro_b, rt/a/a > Nav_l_0)$res >= 0)) # see KE 3
expect_true(all(rtdists$res >= 0))
expect_true(all(RWiener$res >= 0))
expect_true(all(Gondan_R$res >= 0))
})
# Test accuracy within 2*eps (allows for convergence from above and below)
test_that("Consistency among internal methods", {
expect_true(all(SWSE$dif < 2*eps))
expect_true(all(Gondan_s$dif < 2*eps))
expect_true(all(Gondan_b$dif < 2*eps))
expect_true(all(Navarro_s$dif < 2*eps)) # see KE 1
expect_true(all(subset(Navarro_l, rt/a/a > Nav_l_2)$dif < 2*eps)) # see KE 2
expect_true(all(Navarro_b$dif < 2*eps)) # see KE 1,2
})
test_that("Accuracy relative to established packages", {
expect_true(all(rtdists$dif < 2*eps))
expect_true(all(RWiener$dif < 2*eps))
expect_true(all(subset(Gondan_R, sv < SV_THRESH)$dif < 2*eps)) # see KE 4
})
# Test consistency in log vs non-log
test_that("Log-Consistency among internal methods", {
expect_equal(subset(SWSE, res > eps*eps)$log_res,
log(subset(SWSE, res > eps*eps)$res))
expect_equal(subset(Gondan_s, res > eps*eps)$log_res,
log(subset(Gondan_s, res > eps*eps)$res))
expect_equal(subset(Gondan_b, res > eps*eps)$log_res,
log(subset(Gondan_b, res > eps*eps)$res))
expect_equal(subset(Navarro_s, res > eps*eps)$log_res,
log(subset(Navarro_s, res > eps*eps)$res))
expect_equal(subset(Navarro_l, res > eps*eps)$log_res,
log(subset(Navarro_l, res > eps*eps)$res))
expect_equal(subset(Navarro_b, res > eps*eps)$log_res,
log(subset(Navarro_b, res > eps*eps)$res))
})
test_that("Log-Consistency of established packages", {
expect_equal(subset(rtdists, res > eps*eps)$log_res,
log(subset(rtdists, res > eps*eps)$res))
expect_equal(subset(RWiener, res > eps*eps)$log_res,
log(subset(RWiener, res > eps*eps)$res))
expect_equal(subset(Gondan_R, res > eps*eps)$log_res,
log(subset(Gondan_R, res > eps*eps)$res))
})
Navarro small-time approximation is unstable for “large” effective response times, \(\frac{rt}{a^2} \geq 12\). It gives slightly negative densities for effective response times above this threshold. These parameter values should not have an effect on the “both” time scale because the large-time approximation should handle such locations in the parameter space. The negative results, however, are still within \(2 \cdot \epsilon\) of the accepted results (basically zero).
Navarro large-time approximation is unstable for “small” effective response times, \(\frac{rt}{a^2} \leq 0.04\). Similarly to the Navarro small-time approximation above, the large-time approximation gives slightly negative densities for effective response times in this range. Furthermore, it yields inaccurate densities for effective response times of \(\frac{rt}{a^2} \leq 0.004\). These parameter values should not have an effect on the “both” time scale because the small-time approximation should handle such locations in the parameter space.
The Navarro “both” time scale switches between the small-time and large-time approximations. This method relies too much on the unstable large time approximation, leading to its instability; thus we only need to subset to correct for the large time.
4) The Gondan_R approximation divides the error tolerance by the multiplicative term outside of the summation. Since the outside term is different when \(sv > 0\), the approximation uses the incorrect error tolerance for \(sv > 0\). This affects the number of terms required in the summation to achieve the desired precision, thus not actually achieving that desired precision. This issue is fixed in our implementation of the Gondan method (n_terms_small = "Gondan"
, scale = "small"
). For an example of this discrepancy, see the following code:
rt <- 1.5
t <- rt - 1e-4
a <- 0.5
v <- 4.5
w <- 0.5
eps <- 1e-6
sv <- 0.9
sv0 <- exp(-v*a*w - v*v*t/2) / (a*a) # for constant drift rate
sv0_9 <- exp((-2*v*a*w - v*v*t + sv*sv*a*a*w*w)/(2 + 2*sv*sv*t)) /
(a*a*sqrt(1+sv*sv*t)) # for variable drift rate
ks_0 <- ks(t/(a*a), w, eps/sv0) # = 2; the summation will only calculate 2 terms
ks_9 <- ks(t/(a*a), w, eps/sv0_9) # = 5; but the summation needs 5 terms
cat("the summation will only calculate", ks_0, "terms, but it needs", ks_9, "terms.")
#> the summation will only calculate 2 terms, but it needs 5 terms.
Perhaps the most practical use of dfddm
is to use it in an optimization setting, such as fitting DDM parameters to real-world data, and this section will show that all of the methods implemented in dfddm
yield the same results in this setting. The parameters that we will fit are: \(a\) (threshold separation), \(v\) (drift rate), \(t_0\) (non-decision time/response time constant), \(w\) (relative starting point), and \(sv\) (inter-trial-variability of drift). Since the example data we are using consists of two different item types that each require a different correct response (i.e., for one item type the correct response is mapped to the lower boundary and for the other item type the correct response is mapped to the upper boundary), the model includes two different versions of \(v\) (drift rate): \(v_\ell\) for fitting to the truthful lower boundary, and \(v_u\) for fitting to the truthful upper boundary. As many of the approximations in dfddm
have different styles, we only test a subset of all the available methods in dfddm
to avoid unnecessary testing. After using the various density function approximations in fitting the DDM to real-world data, we then validate that the produced parameter estimates are consistent across the various methods within a given error tolerance.
We use a range of different initial parameter values for the optimization function to ensure that none of the methods encounters a problem in the parameter space. However, we need to slightly restrict the starting values to prevent fitting issues with some of the small-time methods; these restrictions are discussed in a later subsection and in Known Error 5. We allow a difference of \(0.0001\) across the methods for each combination of parameters since the density functions can return slightly different results (within \(2 \cdot \epsilon\)), as discussed earlier in this vignette. The following subsections will define all of the functions used to generate the fittings and provide the code to run the full fitting. Since running the full fitting for all of the individuals in the data takes a long time, we will forgo running the fitting in this vignette and instead read pre-fit parameter estimates that used the provided code.
To avoid too many repetitious plots, we will only include six of the methods from the previous section that illustrate the different variations and implementations of the density functions. We include: all three small-time implementations in dfddm
, the two implementations in dfddm
that combine the small-time and large-time, and the one method available in the rtdists
package. We only use a selection of the methods available in dfddm
to avoid redundancy in our testing since most of the methods have multiple sibling approximations that are extremely similar. Moreover, we do not require the RWiener
package nor the Gondan raw R
code because the associated density function approximations do not include an option for variability in drift rate. While we can convert the constant drift rate density to the variable drift rate density using a multiplicative factor, the densities are still potentially calculated incorrectly. For more details on the differences between the constant drift rate density functions and their variable drift rate counterparts, see the Math Vignette.
First, we load the necessary packages.
This code chunk defines the log-likelihood functions used in the optimization algorithm. The log-likelihood functions are fairly straightforward and split the responses and associated response times by the true item status (i.e., the correct response) to enable fitting distinct drift rates (\(v_\ell\) for the items in which the correct response is the lower boundary and \(v_u\) for the items for which the correct response is the upper boundary). In addition, the log-likelihood functions heavily penalize any combination of parameters that returns a log-density of \(-\infty\) (equivalent to a regular density of \(0\)).
ll_fs_SWSE_17 <- function(pars, rt, resp, truth, err_tol) {
rtu <- rt[truth == "upper"]
rtl <- rt[truth == "lower"]
respu <- resp[truth == "upper"]
respl <- resp[truth == "lower"]
# the truth is "upper" so use vu
densu <- dfddm(rt = rtu, response = respu, a = pars[[3]], v = pars[[1]],
t0 = pars[[4]], w = pars[[5]], sv = pars[[6]], log = TRUE,
n_terms_small = "SWSE", summation_small = "2017",
scale = "small", err_tol = err_tol)
# the truth is "lower" so use vl
densl <- dfddm(rt = rtl, response = respl, a = pars[[3]], v = pars[[2]],
t0 = pars[[4]], w = pars[[5]], sv = pars[[6]], log = TRUE,
n_terms_small = "SWSE", summation_small = "2017",
scale = "small", err_tol = err_tol)
densities <- c(densu, densl)
if (any(!is.finite(densities))) return(1e6)
return(-sum(densities))
}
ll_fs_Gon_17 <- function(pars, rt, resp, truth, err_tol) {
rtu <- rt[truth == "upper"]
rtl <- rt[truth == "lower"]
respu <- resp[truth == "upper"]
respl <- resp[truth == "lower"]
# the truth is "upper" so use vu
densu <- dfddm(rt = rtu, response = respu, a = pars[[3]], v = pars[[1]],
t0 = pars[[4]], w = pars[[5]], sv = pars[[6]], log = TRUE,
n_terms_small = "Gondan", summation_small = "2017",
scale = "small", err_tol = err_tol)
# the truth is "lower" so use vl
densl <- dfddm(rt = rtl, response = respl, a = pars[[3]], v = pars[[2]],
t0 = pars[[4]], w = pars[[5]], sv = pars[[6]], log = TRUE,
n_terms_small = "Gondan", summation_small = "2017",
scale = "small", err_tol = err_tol)
densities <- c(densu, densl)
if (any(!is.finite(densities))) return(1e6)
return(-sum(densities))
}
ll_fs_Nav_17 <- function(pars, rt, resp, truth, err_tol) {
rtu <- rt[truth == "upper"]
rtl <- rt[truth == "lower"]
respu <- resp[truth == "upper"]
respl <- resp[truth == "lower"]
# the truth is "upper" so use vu
densu <- dfddm(rt = rtu, response = respu, a = pars[[3]], v = pars[[1]],
t0 = pars[[4]], w = pars[[5]], sv = pars[[6]], log = TRUE,
n_terms_small = "Navarro", summation_small = "2017",
scale = "small", err_tol = err_tol)
# the truth is "lower" so use vl
densl <- dfddm(rt = rtl, response = respl, a = pars[[3]], v = pars[[2]],
t0 = pars[[4]], w = pars[[5]], sv = pars[[6]], log = TRUE,
n_terms_small = "Navarro", summation_small = "2017",
scale = "small", err_tol = err_tol)
densities <- c(densu, densl)
if (any(!is.finite(densities))) return(1e6)
return(-sum(densities))
}
ll_fb_Gon_17 <- function(pars, rt, resp, truth, err_tol) {
rtu <- rt[truth == "upper"]
rtl <- rt[truth == "lower"]
respu <- resp[truth == "upper"]
respl <- resp[truth == "lower"]
# the truth is "upper" so use vu
densu <- dfddm(rt = rtu, response = respu, a = pars[[3]], v = pars[[1]],
t0 = pars[[4]], w = pars[[5]], sv = pars[[6]], log = TRUE,
n_terms_small = "Gondan", summation_small = "2017",
scale = "both", err_tol = err_tol)
# the truth is "lower" so use vl
densl <- dfddm(rt = rtl, response = respl, a = pars[[3]], v = pars[[2]],
t0 = pars[[4]], w = pars[[5]], sv = pars[[6]], log = TRUE,
n_terms_small = "Gondan", summation_small = "2017",
scale = "both", err_tol = err_tol)
densities <- c(densu, densl)
if (any(!is.finite(densities))) return(1e6)
return(-sum(densities))
}
ll_fb_Nav_17 <- function(pars, rt, resp, truth, err_tol) {
rtu <- rt[truth == "upper"]
rtl <- rt[truth == "lower"]
respu <- resp[truth == "upper"]
respl <- resp[truth == "lower"]
# the truth is "upper" so use vu
densu <- dfddm(rt = rtu, response = respu, a = pars[[3]], v = pars[[1]],
t0 = pars[[4]], w = pars[[5]], sv = pars[[6]], log = TRUE,
n_terms_small = "Navarro", summation_small = "2017",
scale = "both", err_tol = err_tol)
# the truth is "lower" so use vl
densl <- dfddm(rt = rtl, response = respl, a = pars[[3]], v = pars[[2]],
t0 = pars[[4]], w = pars[[5]], sv = pars[[6]], log = TRUE,
n_terms_small = "Navarro", summation_small = "2017",
scale = "both", err_tol = err_tol)
densities <- c(densu, densl)
if (any(!is.finite(densities))) return(1e6)
return(-sum(densities))
}
ll_RTDists <- function(pars, rt, resp, truth) {
rtu <- rt[truth == "upper"]
rtl <- rt[truth == "lower"]
respu <- resp[truth == "upper"]
respl <- resp[truth == "lower"]
# the truth is "upper" so use vu
densu <- ddiffusion(rtu, respu, a = pars[[3]], v = pars[[1]],
z = pars[[5]]*pars[[3]], t0 = pars[[4]], sv = pars[[6]])
# the truth is "lower" so use vl
densl <- ddiffusion(rtl, respl, a = pars[[3]], v = pars[[2]],
z = pars[[5]]*pars[[3]], t0 = pars[[4]], sv = pars[[6]])
densities <- c(densu, densl)
if (any(densities <= 0)) return(1e6)
return(-sum(log(densities)))
}
We use a different set of initial parameter values for each combination of data set and method. In a real-life situation we would probably do so using random initial parameter values to avoid local minima. Here we do this with a fixed set of initial parameter values to ensure that the different method produce the same results for different starting points. However, there are a couple of restrictions to these initial values that we must address (we would also have to do so if we picked the initial values randomly). The parameter \(t_0\) must be less than the response time, so we set the initial values for \(t_0\) to be strictly less than the minimum response time according to each individual in the input data.frame. In addition, the optimization algorithm encounters problems for specific values of \(a\) when using the log-likelihood function that is based on the Navarro small-time density function approximation, and this is explained in Known Errors.
Furthermore, we must place a lower bound on \(a\) because the optimization algorithm occasionally evaluates the log-likelihood functions (and thus the underlying density function approximations) using values of \(a\) that are very close to zero in an effort to fully explore the parameter space. In these cases, the small-time approximations struggle because \(a\) inversely scales the input response time and effectively changes the input response time into a very large one. In particular, an extremely small value of \(a\) can cause the “SWSE” small-time density approximation to run for a very long time; to prevent this, we bound \(a\) below by \(0.05\) as it is a very small value such that anything less would be of no psychological significance to the model. In common use this is not an issue because very small values of \(a\) do not make any sense with regard to the psychological interpretation of the parameter, but this issue can arise in an exploratory optimization environment.
The following code chunk defines the function that will run the optimization and produce the fitted parameter estimates for \(v_u\), \(v_\ell\), \(a\), \(t_0\), \(w\), and \(sv\). As discussed above, we only use a selection of methods from those available in dfddm
. This fitting function will run the optimization for each set of initial parameter values and store the resulting convergence code (either \(0\) indicating successful convergence or \(1\) indicating unsuccessful convergence), minimized value of the objective log-likelihood function, and parameter estimates.
rt_fit <- function(data, id_idx = NULL, rt_idx = NULL, response_idx = NULL,
truth_idx = NULL, response_upper = NULL, err_tol = 1e-6) {
# Format data for fitting
if (all(is.null(id_idx), is.null(rt_idx), is.null(response_idx),
is.null(truth_idx), is.null(response_upper))) {
df <- data # assume input data is already formatted
} else {
if(any(data[,rt_idx] < 0)) {
stop("Input data contains negative response times; fit will not be run.")
}
if(any(is.na(data[,response_idx]))) {
stop("Input data contains invalid responses (NA); fit will not be run.")
}
nr <- nrow(data)
df <- data.frame(id = character(nr),
rt = double(nr),
response = character(nr),
truth = character(nr),
stringsAsFactors = FALSE)
if (!is.null(id_idx)) { # relabel identification tags
for (i in 1:length(id_idx)) {
idi <- unique(data[,id_idx[i]])
for (j in 1:length(idi)) {
df$id[data[,id_idx[i]] == idi[j]] <- paste(df$id[data[,id_idx[i]] == idi[j]], idi[j], sep = " ")
}
}
df$id <- trimws(df$id, which = "left")
}
df$rt <- as.double(data[,rt_idx])
df$response <- "lower"
df$response[data[,response_idx] == response_upper] <- "upper"
df$truth <- "lower"
df$truth[data[,truth_idx] == response_upper] <- "upper"
}
# Preliminaries
ids <- unique(df$id)
nids <- max(length(ids), 1) # if inds is null, there is only one individual
init_vals <- data.frame(vu = c( 0, 10, -.5, 0, 0, 0, 0, 0, 0, 0, 0),
vl = c( 0, -10, .5, 0, 0, 0, 0, 0, 0, 0, 0),
a = c( 1, 1, 1, .5, 5, 1, 1, 1, 1, 1, 1),
t0 = c( 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
w = c(.5, .5, .5, .5, .5, .5, .5, .2, .8, .5, .5),
sv = c( 1, 1, 1, 1, 1, 1, 1, 1, 1, .05, 5))
ninit_vals <- nrow(init_vals)
algo_names <- c(rep("fs_SWSE_17", ninit_vals), rep("fs_Gon_17", ninit_vals),
rep("fs_Nav_17", ninit_vals), rep("fb_Gon_17", ninit_vals),
rep("fb_Nav_17", ninit_vals), rep("rtdists", ninit_vals))
nalgos <- length(unique(algo_names))
ni <- nalgos*ninit_vals
# Initilize the result dataframe
cnames <- c("ID", "Algorithm", "Convergence", "Objective",
"vu_init", "vl_init", "a_init", "t0_init", "w_init", "sv_init",
"vu_fit", "vl_fit", "a_fit", "t0_fit", "w_fit", "sv_fit")
res <- data.frame(matrix(ncol = length(cnames), nrow = nids*ninit_vals*nalgos))
colnames(res) <- cnames
# label the result dataframe
res$Algorithm <- algo_names # label algorithms
res$vu_init <- init_vals$vu # label initial vu
res$vl_init <- init_vals$vl # label initial vl
res$a_init <- init_vals$a # label initial a
res$w_init <- init_vals$w # label initial w
res$sv_init <- init_vals$sv # label initial sv
# Loop through each individual and starting values
for (i in 1:nids) {
# extract data for id i
dfi <- df[df$id == ids[i],]
rti <- dfi$rt
respi <- dfi$response
truthi <- dfi$truth
# starting value for t0 must be smaller than the smallest rt
min_rti <- min(rti)
t0_lo <- 0.01*min_rti
t0_me <- 0.50*min_rti
t0_hi <- 0.99*min_rti
init_vals$t0 <- c(rep(t0_me, 5), t0_lo, t0_hi, rep(t0_me, 4))
# label the result dataframe
res$ID[((i-1)*ni+1):(i*ni)] <- ids[i] # label individuals
res$t0_init[((i-1)*ni+1):(i*ni)] <- init_vals$t0 # label initial t0
# loop through all of the starting values
for (j in 1:ninit_vals) {
temp <- nlminb(init_vals[j,], ll_fs_SWSE_17,
rt = rti, resp = respi, truth = truthi, err_tol = err_tol,
# limits: vu, vl, a, t0, w, sv
lower = c(-Inf, -Inf, .05, 0, 0, 0),
upper = c( Inf, Inf, Inf, Inf, 1, Inf))
res$Convergence[(i-1)*ni+0*ninit_vals+j] <- temp$convergence
res$Objective[(i-1)*ni+0*ninit_vals+j] <- temp$objective
res[(i-1)*ni+0*ninit_vals+j, 11:16] <- temp$par
temp <- nlminb(init_vals[j,], ll_fs_Gon_17,
rt = rti, resp = respi, truth = truthi, err_tol = err_tol,
# limits: vu, vl, a, t0, w, sv
lower = c(-Inf, -Inf, .05, 0, 0, 0),
upper = c( Inf, Inf, Inf, Inf, 1, Inf))
res$Convergence[(i-1)*ni+1*ninit_vals+j] <- temp$convergence
res$Objective[(i-1)*ni+1*ninit_vals+j] <- temp$objective
res[(i-1)*ni+1*ninit_vals+j, 11:16] <- temp$par
temp <- nlminb(init_vals[j,], ll_fs_Nav_17,
rt = rti, resp = respi, truth = truthi, err_tol = err_tol,
# limits: vu, vl, a, t0, w, sv
lower = c(-Inf, -Inf, .05, 0, 0, 0),
upper = c( Inf, Inf, Inf, Inf, 1, Inf))
res$Convergence[(i-1)*ni+2*ninit_vals+j] <- temp$convergence
res$Objective[(i-1)*ni+2*ninit_vals+j] <- temp$objective
res[(i-1)*ni+2*ninit_vals+j, 11:16] <- temp$par
temp <- nlminb(init_vals[j,], ll_fb_Gon_17,
rt = rti, resp = respi, truth = truthi, err_tol = err_tol,
# limits: vu, vl, a, t0, w, sv
lower = c(-Inf, -Inf, .05, 0, 0, 0),
upper = c( Inf, Inf, Inf, Inf, 1, Inf))
res$Convergence[(i-1)*ni+3*ninit_vals+j] <- temp$convergence
res$Objective[(i-1)*ni+3*ninit_vals+j] <- temp$objective
res[(i-1)*ni+3*ninit_vals+j, 11:16] <- temp$par
temp <- nlminb(init_vals[j,], ll_fb_Nav_17,
rt = rti, resp = respi, truth = truthi, err_tol = err_tol,
# limits: vu, vl, a, t0, w, sv
lower = c(-Inf, -Inf, .05, 0, 0, 0),
upper = c( Inf, Inf, Inf, Inf, 1, Inf))
res$Convergence[(i-1)*ni+4*ninit_vals+j] <- temp$convergence
res$Objective[(i-1)*ni+4*ninit_vals+j] <- temp$objective
res[(i-1)*ni+4*ninit_vals+j, 11:16] <- temp$par
temp <- nlminb(init_vals[j,], ll_RTDists,
rt = rti, resp = respi, truth = truthi,
# limits: vu, vl, a, t0, w, sv
lower = c(-Inf, -Inf, .05, 0, 0, 0),
upper = c( Inf, Inf, Inf, Inf, 1, Inf))
res$Convergence[(i-1)*ni+5*ninit_vals+j] <- temp$convergence
res$Objective[(i-1)*ni+5*ninit_vals+j] <- temp$objective
res[(i-1)*ni+5*ninit_vals+j, 11:16] <- temp$par
}
}
return(res)
}
As an example data set, we use the med_dec
data that comes with fddm
. This data contains the accuracy condition reported in Trueblood et al. (2018) investigating medical decision making among medical professionals (pathologists) and novices (i.e., undergraduate students). The task of participants was to judge whether pictures of blood cells show cancerous cells (i.e., blast cells) or non-cancerous cells (i.e., non-blast cells). The data set contains 200 decision per participant, based on pictures of 100 true cancerous cells and pictures of 100 true non-cancerous cells. We remove the trials without response (indicated by rt < 0 in the data) before fitting.
Having set up the fitting functions in the above chunks of code, we could pass the med_dec
data to this function to get the parameter estimates. However, as this takes a long time we skip the fitting in this vignette and instead read the pre-fit parameter estimates in the next section.
Now we test the consistency of the fitted parameters using the various density function approximations. As discussed above, we use an allowable error tolerance of \(0.0001\) to account for the slight differences in the output of the different density approximations. First, we define a function to determine the differences in the objective value of the log-likelihood function and the parameter estimates. For each set of initial parameter values, there is one objective value corresponding to each algorithm; we take the minimum of these objective values and use the associated parameter estimates as the best estimates. Then we use this set of best estimates to compare against each set parameter estimates that use the same initial parameter values. At the end of this prep function, we subset the fitted data to remove those fits that used the Navarro small-time approximation with an initial value of \(a \leq .75\) (see Known Error 5 below for more details). The following code chunk defines this function, which will be used next.
fit_prep <- function(fit) {
nr <- nrow(fit)
fit$Obj_diff <- rep(0, nr)
fit$vu_diff <- rep(0, nr)
fit$vl_diff <- rep(0, nr)
fit$a_diff <- rep(0, nr)
fit$t0_diff <- rep(0, nr)
fit$w_diff <- rep(0, nr)
fit$sv_diff <- rep(0, nr)
ids <- unique(fit$ID)
nids <- length(ids)
algos <- unique(fit$Algorithm)
nalgos <- length(algos)
fit_idx <- c(4, 11:16)
dif_idx <- 17:23
ninit <- nrow(subset(fit, ID == ids[1] & Algorithm == algos[1]))
for (i in 1:nids) {
for (j in 1:ninit) {
actual_idx <- seq((i-1)*ninit*nalgos+j, i*ninit*nalgos, by = ninit)
min_obj_idx <- actual_idx[which.min(fit[actual_idx, 4])]
best_fit <- fit[min_obj_idx, fit_idx]
for (k in 0:(nalgos-1)) {
fit[(i-1)*(ninit*nalgos) + k*ninit + j, dif_idx] <-
fit[(i-1)*(ninit*nalgos) + k*ninit + j, fit_idx] - best_fit
}
}
}
fit <- fit[fit$Algorithm != "fs_Nav_17" | fit$a_init > .75,] # KE 5
return(fit)
}
As previously mentioned, we will read the pre-fit data from file as the fits can take a long time to run. Then we run our prep function to expose any significant differences in the fits across the algorithms. As an example of the fitting comparison, we print the results for the first ID in the data set (ID = "experienced 2"
). This sample shows that the agreement across methods for this set of initial parameter values is rather good for the first participant in the study.
# load data, will be in the variable 'fit'
load(system.file("extdata", "valid_fit.Rds", package = "fddm", mustWork = TRUE))
fit <- fit_prep(fit)
print("Results for ID = experienced 2")
#> [1] "Results for ID = experienced 2"
fit[(0:5)*11+1,]
#> ID Algorithm Convergence Objective vu_init vl_init a_init t0_init w_init sv_init
#> 1 experienced 2 fs_SWSE_17 0 42.47181 0 0 1 0.2305 0.5 1
#> 12 experienced 2 fs_Gon_17 0 42.47181 0 0 1 0.2305 0.5 1
#> 23 experienced 2 fs_Nav_17 0 42.47181 0 0 1 0.2305 0.5 1
#> 35 experienced 2 fb_Gon_17 0 42.47181 10 -10 1 0.2305 0.5 1
#> 46 experienced 2 fb_Nav_17 0 42.47181 10 -10 1 0.2305 0.5 1
#> 57 experienced 2 rtdists 0 42.47181 10 -10 1 0.2305 0.5 1
#> vu_fit vl_fit a_fit t0_fit w_fit sv_fit Obj_diff vu_diff
#> 1 5.681305 -2.188657 2.790911 0.3764464 0.4010114 2.281295 1.660993e-09 4.213544e-06
#> 12 5.681307 -2.188655 2.790915 0.3764461 0.4010111 2.281291 1.562547e-09 6.397442e-06
#> 23 5.681299 -2.188660 2.790908 0.3764466 0.4010116 2.281296 1.728765e-09 -1.982117e-06
#> 35 5.681304 -2.188660 2.790912 0.3764464 0.4010115 2.281298 0.000000e+00 0.000000e+00
#> 46 5.681305 -2.188660 2.790913 0.3764464 0.4010115 2.281299 1.469883e-08 1.119686e-06
#> 57 5.681305 -2.188660 2.790912 0.3764464 0.4010115 2.281299 1.469822e-08 1.029976e-06
#> vl_diff a_diff t0_diff w_diff sv_diff
#> 1 8.397604e-07 1.408102e-06 -6.937031e-08 -1.609263e-07 -9.734412e-07
#> 12 3.374579e-06 5.186847e-06 -3.937648e-07 -3.794396e-07 -5.048216e-06
#> 23 -2.418119e-06 -1.014296e-06 7.146087e-08 6.097234e-08 2.327210e-07
#> 35 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
#> 46 -3.610748e-07 3.172648e-07 4.550910e-09 1.692955e-09 8.149610e-07
#> 57 -3.626724e-07 2.295751e-07 3.599953e-09 -7.131966e-09 6.125464e-07
We can see that the fits for this individual participant and this set of initial parameter values are very similar across the different algorithms. As there are many individuals in this dataset, we will not print this style of fit comparison for each combination of participant and initial parameter values; instead we will print only those fits that are worse than the best fit. Output below are all of the runs where either:
the optimization run did not converge (i.e. Convergence = 0); or
the log-likelihood objective value differs from the smallest produced of any method for that specific set of initial parameter values; or
or the parameter estimates differ from the smallest produced of any method for that specific set of initial parameter values.
# Define error tolerance
eps <- 1e-4
out <- fit[unique(which(abs(fit[, c(3, 17:23)]) > eps, arr.ind = TRUE)[, 1]),]
out[, -c(1:2)] <- zapsmall(out[, -c(1:2)])
out
#> ID Algorithm Convergence Objective vu_init vl_init a_init t0_init w_init sv_init
#> 411 experienced 16 fs_Gon_17 1 2966.5346 0.0 0.0 0.5 0.2895 0.5 1
#> 1588 novice 7 fs_SWSE_17 1 2961.5563 0.0 0.0 0.5 0.2690 0.5 1
#> 1597 novice 7 fs_Gon_17 1 414.7291 10.0 -10.0 1.0 0.2690 0.5 1
#> 1599 novice 7 fs_Gon_17 1 2954.4444 0.0 0.0 0.5 0.2690 0.5 1
#> 2400 novice 19 fs_Nav_17 1 819.4197 10.0 -10.0 1.0 0.2520 0.5 1
#> 771 inexperienced 8 fb_Nav_17 0 149.9361 0.0 0.0 1.0 0.0885 0.5 1
#> 1465 novice 5 fs_Gon_17 0 133.0414 10.0 -10.0 1.0 0.2815 0.5 1
#> 1697 novice 8 fb_Nav_17 0 28.7188 -0.5 0.5 1.0 0.0620 0.5 1
#> 1943 novice 12 fs_Nav_17 0 21.6547 0.0 0.0 1.0 0.3623 0.5 1
#> 3513 novice 36 fs_Gon_17 0 7.2392 0.0 0.0 0.5 0.2730 0.5 1
#> 3335 novice 33 fb_Gon_17 0 86.0960 10.0 -10.0 1.0 0.1875 0.5 1
#> vu_fit vl_fit a_fit t0_fit w_fit sv_fit Obj_diff vu_diff vl_diff a_diff t0_diff w_diff
#> 411 0.0001 -0.0001 0.4921 0.3045 0.5000 0.9990 2712.8383 -1.2601 1.3891 -2.2641 -0.1647 0.0599
#> 1588 0.0000 0.0000 0.4915 0.2721 0.5001 0.9997 2617.7618 -0.4114 0.9253 -2.4014 -0.1172 0.0864
#> 1597 -0.0703 -0.8049 2.2278 0.2269 0.6624 0.0064 70.9346 -0.4817 0.1205 -0.6652 -0.1625 0.2487
#> 1599 0.0000 0.0000 0.4999 0.2690 0.5000 1.0000 2610.6499 -0.4114 0.9253 -2.3931 -0.1203 0.0863
#> 2400 1.4632 -1.5287 0.7102 0.4944 0.6476 0.6965 664.3276 -0.3301 0.5084 -1.5907 0.0666 0.1377
#> 771 1.8936 -0.0809 2.1561 0.1313 0.4815 0.0000 1.1808 -0.2726 -0.0229 -0.1671 0.0040 0.0081
#> 1465 0.4940 -1.3330 1.5444 0.5027 0.5181 0.0000 3.2519 -0.2709 0.5393 -0.1972 0.0071 -0.0105
#> 1697 1.9776 1.2560 1.4352 0.0995 0.4137 0.0000 5.0284 -0.7610 -0.6282 -0.2765 0.0065 0.0414
#> 1943 -0.7418 -2.6739 1.5598 0.3376 0.5256 0.0000 1.8148 0.3297 0.5947 -0.1746 0.0053 -0.0254
#> 3513 1.5424 -1.9930 1.5396 0.5175 0.4319 0.0000 1.5007 -0.2841 0.3311 -0.1393 0.0026 0.0039
#> 3335 1.7151 -0.1983 1.3949 0.3317 0.4688 1.1782 0.0000 0.0001 0.0000 0.0000 0.0000 0.0000
#> sv_diff
#> 411 -0.0445
#> 1588 0.3384
#> 1597 -0.6549
#> 1599 0.3387
#> 2400 -1.1265
#> 771 -0.5912
#> 1465 -1.1449
#> 1697 -1.3573
#> 1943 -1.0363
#> 3513 -0.8562
#> 3335 0.0001
We can see that it is mostly the small time methods that have occasional issues with data fitting. Although there are two methods that appear in the above output that use both the small-time and large-time approximations, the method that combines the large-time approximation of Navarro with the small-time approximation of Gondan is actually quite stable. Using this method, there are two offending fitted parameters whose difference from the fits given using rtdists
are not smaller than our given error tolerance; in this case, the differences are only marginally greater than our given error tolerance. These results suggest that the most stable methods are those that use both the small-time and large-time approximations to the density, and in particular the default option for dfddm
(scale = "both", n_terms_small = "Gondan"
) is very stable .
nlminb
claims to have converged. This is likely due to the small value of \(a\) creating a larger “effective response time” \(\frac{rt}{a^2}\), which the small-time approximation may not handle well and thus cause the optimizer to get stuck. This issue does not appear for the density function that uses both the Navarro small-time and large-time approximations (n_terms_small = "Navarro", scale = "both"
) because the large-time approximation will handle this region of the parameter space and produce more accurate results.sessionInfo()
#> R version 4.0.2 (2020-06-22)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 18363)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=C LC_CTYPE=English_United Kingdom.1252
#> [3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
#> [5] LC_TIME=English_United Kingdom.1252
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] testthat_2.3.2 ggplot2_3.3.2.9000 reshape2_1.4.4 microbenchmark_1.4-7
#> [5] RWiener_1.3-3 rtdists_0.11-2 fddm_0.1-1
#>
#> loaded via a namespace (and not attached):
#> [1] Rcpp_1.0.4.6 pillar_1.4.4 compiler_4.0.2 plyr_1.8.6 tools_4.0.2
#> [6] digest_0.6.25 evd_2.3-3 evaluate_0.14 lifecycle_0.2.0 tibble_3.0.1
#> [11] gtable_0.3.0 lattice_0.20-41 pkgconfig_2.0.3 rlang_0.4.6 Matrix_1.2-18
#> [16] yaml_2.2.1 mvtnorm_1.1-1 expm_0.999-4 xfun_0.15 withr_2.2.0
#> [21] dplyr_1.0.0 stringr_1.4.0 knitr_1.29 generics_0.0.2 vctrs_0.3.1
#> [26] tidyselect_1.1.0 grid_4.0.2 ggnewscale_0.4.1 glue_1.4.1 R6_2.4.1
#> [31] survival_3.2-3 rmarkdown_2.3 farver_2.0.3 purrr_0.3.4 magrittr_1.5
#> [36] scales_1.1.1 htmltools_0.5.0 ellipsis_0.3.1 splines_4.0.2 colorspace_1.4-1
#> [41] labeling_0.3 stringi_1.4.6 gsl_2.1-6 munsell_0.5.0 msm_1.6.8
#> [46] crayon_1.3.4
Gondan, Matthias, Steven P Blurton, and Miriam Kesselmeier. 2014. “Even Faster and Even More Accurate First-Passage Time Densities and Distributions for the Wiener Diffusion Model.” Journal of Mathematical Psychology 60: 20–22.
Trueblood, Jennifer S., William R. Holmes, Adam C. Seegmiller, Jonathan Douds, Margaret Compton, Eszter Szentirmai, Megan Woodruff, Wenrui Huang, Charles Stratton, and Quentin Eichbaum. 2018. “The Impact of Speed and Bias on the Cognitive Processes of Experts and Novices in Medical Image Decision-Making.” Cognitive Research: Principles and Implications 3 (1): 28. https://doi.org/10.1186/s41235-018-0119-2.