The package morse is devoted to the analysis of data from standard toxicity tests. It provides a simple workflow to explore and visualize a data set, and to compute estimates of risk assessment indicators. This document illustrates a typical use of morse on survival and reproduction data, which can be followed step by step to analyze new data sets.
The following example shows all the steps to perform survival analysis on standard toxicity test data and to produce estimated values of the \(LC_x\). We will use a data set from the package named cadmium2, which contains both survival and reproduction data from a chronic laboratory toxicity test. In this experiment, snails were exposed to six concentrations of a metal contaminant (cadmium) for 56 days.
The data from a survival toxicity test should be gathered in a data.frame
with a specific layout. This is documented in the paragraph on survData
in the reference manual, and you can also inspect one of the data sets provided in the package (e.g., cadmium2
). First, we load the data set and use the function survDataCheck()
to check that it has the expected layout:
data(cadmium2)
survDataCheck(cadmium2)
## No message
The output ## No message
just informs that the data set is well-formed.
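As a rough illustration of the expected layout, the sketch below builds a small data frame by hand with columns that also appear in the package data sets (`replicate`, `conc`, `time`, `Nsurv`). The values and the object name `toy_data` are made up for illustration, and the exact constraints enforced by `survDataCheck()` should be taken from the reference manual rather than from this sketch.

```r
# Hypothetical toy data set: one replicate, two concentrations, three time points.
toy_data <- data.frame(
  replicate = rep(1, 6),
  conc      = rep(c(0, 50), each = 3),
  time      = rep(c(0, 3, 7), times = 2),
  Nsurv     = c(5, 5, 5, 5, 4, 2)
)

# Hand-made sanity check (an assumption about what a well-formed survival
# data set looks like): within each concentration, the number of survivors
# never increases over time.
non_increasing <- tapply(toy_data$Nsurv, toy_data$conc,
                         function(n) all(diff(n) <= 0))
print(non_increasing)
```

A data frame of this shape is what one would pass to `survDataCheck()` before going further.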
survData object
The class survData corresponds to survival data and is the basic layout used for the subsequent operations. Note that if the call to survDataCheck() reports no error (i.e., ## No message), survData is guaranteed not to fail.
dat <- survData(cadmium2)
head(dat)
## # A tibble: 6 x 6
## conc time Nsurv Nrepro replicate Ninit
## <dbl> <int> <int> <int> <dbl> <int>
## 1 0 0 5 0 1 5
## 2 0 3 5 262 1 5
## 3 0 7 5 343 1 5
## 4 0 10 5 459 1 5
## 5 0 14 5 328 1 5
## 6 0 17 5 742 1 5
The function plot()
can be used to plot the number of surviving individuals as a function of time for all concentrations and replicates.
plot(dat, pool.replicate = FALSE)
Two graphical styles are available, "generic"
for standard R
plots or "ggplot"
to call package ggplot2
(default). If argument pool.replicate
is TRUE
, datapoints at a given time-point and a given concentration are pooled and only the mean number of survivors is plotted. To observe the full data set, we set this option to FALSE
.
By fixing the concentration at a (tested) value, we can visualize one subplot in particular:
plot(dat, concentration = 124, addlegend = TRUE,
pool.replicate = FALSE, style ="generic")
We can also plot the survival rate, at a given time-point, as a function of concentration, with binomial confidence intervals around the data. This is achieved by using function plotDoseResponse()
and by fixing the option target.time
(default is the end of the experiment).
plotDoseResponse(dat, target.time = 21, addlegend = TRUE)
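To make the binomial confidence intervals concrete, here is a small base-R sketch for a single data point. It uses the exact (Clopper-Pearson) interval returned by `binom.test()`; whether `plotDoseResponse()` relies on this particular method is an assumption, so the numbers may differ slightly from those in the plot. The counts (17 survivors out of 30 individuals at concentration 232 on day 21) are read from the `summary(dat)` output of this data set.

```r
n_init <- 30   # individuals at the start (sum over replicates)
n_surv <- 17   # survivors at day 21 for concentration 232

surv_rate <- n_surv / n_init

# Exact binomial 95% confidence interval (assumed method, see text)
ci <- binom.test(n_surv, n_init, conf.level = 0.95)$conf.int

cat("survival rate:", round(surv_rate, 3), "\n")
cat("95% CI:", round(ci[1], 3), "to", round(ci[2], 3), "\n")
```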
Function summary()
provides some descriptive statistics on the experimental design.
summary(dat)
##
## Number of replicates per time and concentration:
## time
## conc 0 3 7 10 14 17 21 24 28 31 35 38 42 45 49 52 56
## 0 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
## 53 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
## 78 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
## 124 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
## 232 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
## 284 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
##
## Number of survivors (sum of replicates) per time and concentration:
## 0 3 7 10 14 17 21 24 28 31 35 38 42 45 49 52 56
## 0 30 30 30 30 29 29 29 29 29 28 28 28 28 28 28 28 28
## 53 30 30 29 29 29 29 29 29 29 29 28 28 28 28 28 28 28
## 78 30 30 30 30 30 30 29 29 29 29 29 29 29 29 29 27 27
## 124 30 30 30 30 30 29 28 28 27 26 25 23 21 18 11 11 9
## 232 30 30 30 22 18 18 17 14 13 12 8 4 3 1 0 0 0
## 284 30 30 15 7 4 4 4 2 2 1 1 1 1 1 1 0 0
Now we are ready to fit a probabilistic model to the survival data, in order to describe the relationship between the concentration of the chemical compound and the survival rate at the target time. Our model assumes the latter is a log-logistic function of the former, and the package delivers estimates of its parameters. Once the parameters are estimated, we can calculate the \(LC_x\) values for any \(x\). All this work is performed by the survFitTT() function, which requires a survData object as input and the levels of \(LC_x\) we want:
fit <- survFitTT(dat,
target.time = 21,
lcx = c(10, 20, 30, 40, 50))
The returned value is an object of class survFitTT providing the estimated parameters as a posterior distribution, which quantifies the uncertainty on their true value. For the parameters of the model, as well as for the \(LC_x\) values, we report the median (as the point estimate) and the 2.5% and 97.5% quantiles of the posterior (as a measure of uncertainty, a.k.a. credible intervals). They can be obtained by using the summary() method:
summary(fit)
## Summary:
##
## The loglogisticbinom_3 model with a binomial stochastic part was used !
##
## Priors on parameters (quantiles):
##
## 50% 2.5% 97.5%
## b 1.000e+00 1.259e-02 7.943e+01
## d 5.000e-01 2.500e-02 9.750e-01
## e 1.227e+02 5.390e+01 2.793e+02
##
## Posteriors of the parameters (quantiles):
##
## 50% 2.5% 97.5%
## b 8.595e+00 4.044e+00 1.636e+01
## d 9.569e-01 9.097e-01 9.858e-01
## e 2.366e+02 2.109e+02 2.536e+02
##
## Posteriors of the LCx (quantiles):
##
## 50% 2.5% 97.5%
## LC10 1.833e+02 1.278e+02 2.143e+02
## LC20 2.014e+02 1.552e+02 2.262e+02
## LC30 2.144e+02 1.759e+02 2.350e+02
## LC40 2.257e+02 1.936e+02 2.437e+02
## LC50 2.366e+02 2.109e+02 2.536e+02
If the inference went well, it is expected that the difference between quantiles in the posterior will be reduced compared to the prior, meaning that the data were helpful to reduce the uncertainty on the true value of the parameters. This simple check can be performed using the summary function.
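The link between the model parameters and the \(LC_x\) can be checked by hand. The sketch below assumes the usual three-parameter log-logistic form, \(f(c) = d / (1 + (c/e)^b)\), which is not spelled out in this document; under that assumption, \(LC_x = e \, (x / (100 - x))^{1/b}\). Plugging in the posterior medians of `b` and `e` gives values close to the posterior medians of the \(LC_x\) in the summary, though not necessarily identical, since the median of a transformed quantity need not equal the transform of the medians. The helper name `lcx_from_params` is hypothetical.

```r
# Assumed log-logistic survival curve: f(c) = d / (1 + (c / e)^b)
# Solving f(LCx) = (1 - x / 100) * d for the concentration gives:
lcx_from_params <- function(x, b, e) {
  e * (x / (100 - x))^(1 / b)
}

b_med <- 8.595   # posterior median of b, copied from summary(fit)
e_med <- 236.6   # posterior median of e, copied from summary(fit)

x_levels <- c(10, 20, 30, 40, 50)
lc <- lcx_from_params(x_levels, b = b_med, e = e_med)
names(lc) <- paste0("LC", x_levels)
print(round(lc, 1))
```

Under this parameterisation, `e` is the \(LC_{50}\) itself, which is consistent with the identical quantiles reported for `e` and `LC50` in the summary.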
The fit can also be plotted:
plot(fit, log.scale = TRUE, adddata = TRUE, addlegend = TRUE)
This representation shows the estimated relationship between concentration of chemical compound and survival rate (orange curve). It is computed by choosing for each parameter the median value of its posterior. To assess the uncertainty on this estimation, we compute many such curves by sampling the parameters in the posterior distribution. This gives rise to the grey band, showing for any given concentration an interval (called credible interval) containing the survival rate 95% of the time in the posterior distribution. The experimental data points are represented in black and correspond to the observed survival rate when pooling all replicates. The black error bars correspond to a 95% confidence interval, which is another, more straightforward way to bound the most probable value of the survival rate for a tested concentration. In favorable situations, we expect that the credible interval around the estimated curve and the confidence interval around the experimental data largely overlap.
A similar plot is obtained with the style "generic"
:
plot(fit, log.scale = TRUE, style = "generic", adddata = TRUE, addlegend = TRUE)
Note that survFitTT()
will warn you if the estimated \(LC_{x}\) lie outside the range of tested concentrations, as in the following example:
data("cadmium1")
doubtful_fit <- survFitTT(survData(cadmium1),
target.time = 21,
lcx = c(10, 20, 30, 40, 50))
## Warning: The LC50 estimation (model parameter e) lies outside the range of
## tested concentrations and may be unreliable as the prior distribution on
## this parameter is defined from this range !
plot(doubtful_fit, log.scale = TRUE, style = "ggplot", adddata = TRUE,
addlegend = TRUE)
In this example, the experimental design does not include sufficiently high concentrations, and we are missing measurements that would have a major influence on the final estimation. For this reason this result should be considered unreliable.
The fit can be further validated using so-called posterior predictive checks: the idea is to plot the observed values against the corresponding estimated predictions, along with their 95% credible interval. If the fit is correct, we expect to see 95% of the data inside the intervals.
ppc(fit)
In this plot, each black dot represents an observation made at a given concentration, and the corresponding number of survivors at the target time is given by the value on the x-axis. Using the concentration and the fitted model, we can produce the corresponding prediction of the expected number of survivors at that concentration. This prediction is given by the y-axis. Ideally, observations and predictions should coincide, so we would expect to see the black dots on the line \(Y = X\). Our model provides a tolerable variation around the predicted mean value, as an interval in which we expect 95% of the dots to lie on average. The intervals are represented in green if they overlap with the line \(Y=X\), and in red otherwise.
The steps for a TKTD data analysis are closely analogous to those described for the analysis at target time. Here the goal is to estimate the relationship between chemical compound concentration, time and survival rate using the GUTS models. GUTS, for General Unified Threshold models of Survival, is a family of toxicokinetic-toxicodynamic (TKTD) models generalising most existing mechanistic models for survival description. For details about GUTS models, see the vignette theory under modelling approaches, and the included references.
Here is a typical session to analyse concentration-dependent time-course data using the so-called “Stochastic Death” (SD) model:
# (1) load data set
data(propiconazole)
# (2) check structure and integrity of the data set
survDataCheck(propiconazole)
## No message
# (3) create a `survData` object
dat <- survData(propiconazole)
# (4) represent the number of survivors as a function of time
plot(dat, pool.replicate = FALSE)
# (5) check information on the experimental design
summary(dat)
##
## Number of replicates per time and concentration:
## time
## conc 0 1 2 3 4
## 0 1 1 1 1 1
## 8.05 1 1 1 1 1
## 11.91 1 1 1 1 1
## 13.8 1 1 1 1 1
## 17.87 1 1 1 1 1
## 24.19 1 1 1 1 1
## 28.93 1 1 1 1 1
## 35.92 1 1 1 1 1
##
## Number of survivors (sum of replicates) per time and concentration:
## 0 1 2 3 4
## 0 20 19 19 19 19
## 8.05 20 20 20 20 19
## 11.91 20 19 19 19 17
## 13.8 20 19 19 18 16
## 17.87 21 21 20 16 16
## 24.19 20 17 6 2 1
## 28.93 20 11 4 0 0
## 35.92 20 11 1 0 0
To fit the Stochastic Death model, we have to specify the model_type
as "SD"
:
# (6) fit the TKTD model SD
fit_cstSD <- survFit(dat, quiet = TRUE, model_type = "SD")
Then, the summary() function provides parameter estimates as medians and 95% credible intervals.
# (7) summary of parameter estimates
summary(fit_cstSD)
## Summary:
##
## Priors of the parameters (quantiles) (select with '$Qpriors'):
##
## parameters median Q2.5 Q97.5
## kd 4.157e-02 2.771e-04 6.236e+00
## hb 1.317e-02 2.708e-04 6.403e-01
## z 1.700e+01 8.171e+00 3.539e+01
## kk 5.727e-03 1.021e-05 3.212e+00
##
## Posteriors of the parameters (quantiles) (select with '$Qposteriors'):
##
## parameters median Q2.5 Q97.5
## kd 2.196e+00 1.589e+00 3.476e+00
## hb 2.676e-02 1.282e-02 4.926e-02
## z 1.691e+01 1.547e+01 1.877e+01
## kk 1.229e-01 7.894e-02 1.902e-01
# OR
fit_cstSD$estim.par
## parameters median Q2.5 Q97.5
## 1 kd 2.19632182 1.58891845 3.47596576
## 2 hb 0.02675996 0.01282189 0.04925564
## 3 z 16.91436325 15.47321989 18.76652335
## 4 kk 0.12289037 0.07894404 0.19017026
Once fitting is done, we can compare the posterior and prior distributions with the function plot_prior_post() as follows:
plot_prior_post(fit_cstSD)
The plot() function provides a representation of the fit for each replicate:
plot(fit_cstSD)
Original data can be removed by using the option adddata = FALSE:
plot(fit_cstSD, adddata = FALSE)
A posterior predictive check is also possible using function ppc()
:
ppc(fit_cstSD)
The Individual Tolerance (IT) model is a variant of TKTD survival analysis. It can also be used with morse
as demonstrated hereafter. For the IT model, we have to specify the model_type
as "IT"
:
fit_cstIT <- survFit(dat, quiet = TRUE, model_type = "IT")
We can first get a summary of the estimated parameters:
summary(fit_cstIT)
## Summary:
##
## Priors of the parameters (quantiles) (select with '$Qpriors'):
##
## parameters median Q2.5 Q97.5
## kd 4.157e-02 2.771e-04 6.236e+00
## hb 1.317e-02 2.708e-04 6.403e-01
## alpha 1.700e+01 8.171e+00 3.539e+01
## beta 1.000e+00 1.259e-02 7.943e+01
##
## Posteriors of the parameters (quantiles) (select with '$Qposteriors'):
##
## parameters median Q2.5 Q97.5
## kd 7.198e-01 5.299e-01 9.399e-01
## hb 1.588e-02 3.643e-03 3.757e-02
## alpha 1.771e+01 1.505e+01 2.024e+01
## beta 6.693e+00 4.907e+00 9.007e+00
# OR
fit_cstIT$estim.par
## parameters median Q2.5 Q97.5
## 1 kd 0.71975488 0.529853426 0.93986992
## 2 hb 0.01588179 0.003642627 0.03757068
## 3 alpha 17.71322253 15.054232417 20.23599168
## 4 beta 6.69268600 4.906969190 9.00708161
And the plot of posteriors vs. priors distributions:
plot_prior_post(fit_cstIT)
plot(fit_cstIT)
ppc(fit_cstIT)
Here is a typical session fitting an SD or an IT model for a data set under a time-variable exposure scenario.
# (1) load data set
data("propiconazole_pulse_exposure")
# (2) check structure and integrity of the data set
survDataCheck(propiconazole_pulse_exposure)
## No message
# (3) create a `survData` object
dat <- survData(propiconazole_pulse_exposure)
# (4) represent the number of survivors as a function of time
plot(dat)
# (5) check information on the experimental design
summary(dat)
##
## Occurence of 'replicate' for each 'time':
## time
## replicate 0 0.96 1 1.96 2 2.96 3 3.96 4 4.96 4.97 5 5.96 6 6.96 7 7.96
## varA 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## varB 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## varC 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## varControl 1 0 1 0 1 0 1 0 1 0 0 1 0 1 0 1 0
## time
## replicate 8 9 9.96 10
## varA 1 1 1 1
## varB 1 1 1 1
## varC 1 1 1 1
## varControl 1 1 0 1
##
## Number of survivors per 'time' and 'replicate':
## 0 0.96 1 1.96 2 2.96 3 3.96 4 4.96 4.97 5 5.96 6 6.96 7
## varA 70 NA 50 NA 49 NA 49 NA 45 NA NA 45 NA 45 NA 42
## varB 70 NA 57 NA 53 NA 52 NA 50 NA NA 46 NA 45 NA 44
## varC 70 NA 70 NA 69 NA 69 NA 68 NA NA 66 NA 65 NA 64
## varControl 60 NA 59 NA 58 NA 58 NA 57 NA NA 57 NA 56 NA 56
## 7.96 8 9 9.96 10
## varA NA 38 37 NA 36
## varB NA 40 38 NA 37
## varC NA 60 55 NA 54
## varControl NA 56 55 NA 54
##
## Concentrations per 'time' and 'replicate':
## 0 0.96 1 1.96 2 2.96 3 3.96 4 4.96 4.97 5
## varA 30.56 27.93 0.00 0.26 NA 0.21 27.69 26.49 0.00 0.18 0.18 NA
## varB 28.98 27.66 0.00 0.27 NA 0.26 0.26 0.26 0.26 0.25 0.25 NA
## varC 4.93 4.69 4.69 4.58 NA 4.58 4.58 4.54 4.54 4.58 4.71 NA
## varControl 0.00 NA 0.00 NA 0 NA 0.00 NA 0.00 NA NA 0
## 5.96 6 6.96 7 7.96 8 9 9.96 10
## varA 0.18 NA 0.14 0.14 0.18 0.18 0.00 0.00 0.00
## varB 0.03 NA 0.00 26.98 26.28 0.00 0.12 0.12 0.12
## varC 4.71 NA 4.60 4.60 4.59 4.59 4.46 4.51 4.51
## varControl NA 0 NA 0.00 NA 0.00 0.00 NA 0.00
# (6) fit the TKTD model SD
fit_varSD <- survFit(dat, quiet = TRUE, model_type = "SD")
## Warning: The estimation of the dominant rate constant (model parameter kd) lies
## outside the range used to define its prior distribution which indicates that this
## rate is very high and difficult to estimate from this experiment !
## Warning: The estimation of Non Effect Concentration threshold (NEC)
## (model parameter z) lies outside the range of tested concentration
## and may be unreliable as the prior distribution on this parameter is
## defined from this range !
# (7) summary of the fit object
summary(fit_varSD)
## Summary:
##
## Priors of the parameters (quantiles) (select with '$Qpriors'):
##
## parameters median Q2.5 Q97.5
## kd 2.683e-02 1.119e-04 6.434e+00
## hb 8.499e-03 1.094e-04 6.606e-01
## z 9.154e-01 3.212e-02 2.608e+01
## kk 5.080e-02 4.342e-06 5.942e+02
##
## Posteriors of the parameters (quantiles) (select with '$Qposteriors'):
##
## parameters median Q2.5 Q97.5
## kd 3.854e+00 1.612e+00 2.879e+01
## hb 2.316e-02 1.625e-02 3.102e-02
## z 2.075e+01 2.524e+00 3.198e+01
## kk 7.726e-02 5.400e-03 6.597e-01
plot_prior_post(fit_varSD)
plot(fit_varSD)
ppc(fit_varSD)
# fit a TKTD model IT
fit_varIT <- survFit(dat, quiet = TRUE, model_type = "IT")
## Warning: The estimation of the dominant rate constant (model parameter kd) lies
## outside the range used to define its prior distribution which indicates that this
## rate is very high and difficult to estimate from this experiment !
## Warning: The estimation of log-logistic median (model parameter alpha)
## lies outside the range of tested concentration and may be unreliable as
## the prior distribution on this parameter is defined from this range !
# (7) summary of the fit object
summary(fit_varIT)
## Summary:
##
## Priors of the parameters (quantiles) (select with '$Qpriors'):
##
## parameters median Q2.5 Q97.5
## kd 2.683e-02 1.119e-04 6.434e+00
## hb 8.499e-03 1.094e-04 6.606e-01
## alpha 9.154e-01 3.212e-02 2.608e+01
## beta 1.000e+00 1.259e-02 7.943e+01
##
## Posteriors of the parameters (quantiles) (select with '$Qposteriors'):
##
## parameters median Q2.5 Q97.5
## kd 5.148e-01 7.799e-04 1.920e+01
## hb 2.446e-02 1.653e-02 3.344e-02
## alpha 1.794e+01 2.724e-01 3.957e+01
## beta 2.440e+00 5.098e-01 2.369e+01
plot_prior_post(fit_varIT)
plot(fit_varIT)
ppc(fit_varIT)
GUTS models can be used to simulate the survival of the organisms under any exposure pattern, using the calibration done with function survFit()
from observed data. The function for prediction is called predict()
and returns an object of class survFitPredict
.
# (1) upload or build a data frame with the exposure profile
# argument `replicate` is used to provide several profiles of exposure
data_4prediction <- data.frame(time = c(1:10, 1:10),
conc = c(c(0,0,40,0,0,0,40,0,0,0),
c(21,19,18,23,20,14,25,8,13,5)),
replicate = c(rep("pulse", 10), rep("random", 10)))
# (2) Use the fit on constant exposure propiconazole with model SD (see previously)
predict_PRZ_cstSD_4pred <- predict(object = fit_cstSD, data_predict = data_4prediction)
From a survFitPredict object, results can be plotted with the function plot():
# (3) Plot the predicted survival rate under the new exposure profiles.
plot(predict_PRZ_cstSD_4pred)
deSolve
With some extreme data sets, the fast method used to compute predictions may return NA values because of numerical errors (e.g. numbers with a magnitude above \(10^{300}\) or below \(10^{-300}\)).
When this happens, the function predict() returns an error whose message explains how to use the robust implementation based on the ODE solver provided by deSolve.
This robust implementation is available through the function predict_ode(). Robustness often comes at the cost of a longer computing time, so by default we use an MCMC chain size of 1000 independent iterations.
predict_PRZ_cstSD_4pred_ode <- predict_ode(object = fit_cstSD, data_predict = data_4prediction)
This new object predict_PRZ_cstSD_4pred_ode
is a survFitPredict
object and so it has exactly the same properties as an object returned by a predict()
function.
Note that since predict_ode() can take a long time to compute, the argument mcmc_size is set to 1000 MCMC iterations by default.
See for instance, with the plot:
plot(predict_PRZ_cstSD_4pred_ode)
While the model has been estimated using the background mortality parameter hb
, it can be interesting to see the prediction without it. This is possible with the argument hb_value
. If TRUE
, the background mortality is taken into account, and if FALSE
, the background mortality is set to \(0\) in the prediction.
# Use the same data set profile to predict without 'hb'
predict_PRZ_cstSD_4pred_hbOUT <- predict(object = fit_cstSD, data_predict = data_4prediction, hb_value = FALSE)
# Plot the prediction:
plot(predict_PRZ_cstSD_4pred_hbOUT)
Following EFSA recommendations, the next functions compute qualitative and quantitative model performance criteria suitable for GUTS, and TKTD modelling in general: the percentage of observations within the 95% credible interval of the Posterior Predictive Check (PPC), the Normalised Root Mean Square Error (NRMSE) and the Survival-Probability Prediction Error (SPPE).
PPC
The PPC compares the predicted median numbers of survivors, together with their uncertainty limits, with the observed numbers of survivors. This can be visualised by plotting the predicted versus the observed values and counting how frequently the confidence/credible limits intersect the 1:1 prediction line (see previous plot). Based on experience, a PPC with less than 50% of the observations within the uncertainty limits indicates poor model performance.
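As a sketch of the counting step, the percentage of observations falling inside their prediction interval can be computed as follows. The vectors are invented for illustration and the names `obs`, `q_low` and `q_high` are hypothetical; they are not the column names used by morse.

```r
# Invented observed survivor counts and 95% prediction limits
obs    <- c(20, 19, 17, 12, 6, 1)
q_low  <- c(18, 17, 15, 13, 3, 0)
q_high <- c(20, 20, 19, 17, 8, 2)

# An observation is "covered" when it lies within its interval, which is
# the same as the interval crossing the 1:1 line in the PPC plot.
inside <- obs >= q_low & obs <= q_high
percent_ppc <- 100 * mean(inside)
print(percent_ppc)
```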
Normalised Root Mean Square Error NRMSE
The NRMSE criterion is also based on the expectation that predicted and observed survival numbers match the 1:1 line in a scatter plot. The criterion is based on the classical root-mean-square error (RMSE), used to aggregate the magnitudes of the prediction errors at the various time-points into a single measure of predictive power. To express the criterion as a percentage, the RMSE is normalised by the mean of the observations.
\[ NRMSE = \frac{RMSE}{\overline{Y}} = \frac{1}{\overline{Y}} \sqrt{\frac{1}{n} \sum_{i=1}^{n} (Y_{obs,i} - Y_{pred,i})^2} \times 100 \]
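The formula translates directly into a few lines of base R. This is a sketch on invented numbers, with a hypothetical helper name `nrmse()`; it is not the code used inside the package.

```r
# NRMSE in percent: RMSE divided by the mean of the observations, times 100
nrmse <- function(y_obs, y_pred) {
  rmse <- sqrt(mean((y_obs - y_pred)^2))
  100 * rmse / mean(y_obs)
}

y_obs  <- c(20, 18, 15, 10)   # invented observed numbers of survivors
y_pred <- c(19, 18, 13, 11)   # invented predicted numbers of survivors

print(nrmse(y_obs, y_pred))
```

A perfect prediction gives an NRMSE of 0%, and the value grows with the typical size of the prediction errors relative to the mean observation.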
Survival Probability Prediction Error (SPPE)
The SPPE indicator is negative (between 0 and -100%) for an underestimation of effects, and positive (between 0 and 100%) for an overestimation of effects. An SPPE value of 0% means an exact prediction of the observed survival probability at the end of the experiment.
\[ SPPE = \left( \frac{Y_{obs, t_{end}}}{Y_{init}} - \frac{Y_{pred, t_{end}}}{Y_{init}} \right) \times 100 = \frac{Y_{obs, t_{end}} - Y_{pred, t_{end}}}{Y_{init}} \times 100 \]
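The sign convention can be checked on a toy case. The sketch below uses a hypothetical helper `sppe()` and invented counts: when the model predicts fewer survivors than observed at the end of the experiment, effects are overestimated and the SPPE is positive.

```r
# SPPE in percent, from numbers of survivors at the end of the experiment
sppe <- function(n_obs_end, n_pred_end, n_init) {
  100 * (n_obs_end - n_pred_end) / n_init
}

# Invented example: 20 individuals at the start, 16 observed survivors
over  <- sppe(n_obs_end = 16, n_pred_end = 14, n_init = 20)  # effects overestimated
under <- sppe(n_obs_end = 16, n_pred_end = 18, n_init = 20)  # effects underestimated
print(c(over = over, under = under))
```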
For NRMSE and SPPE, we need to compute the number of survivors. To do so, we use the function predict_Nsurv()
where two arguments are required: the first argument is a survFit
object, and the other is a data set with four columns (time
, conc
, replicate
and Nsurv
). Contrary to the function predict()
, here the column Nsurv
is necessary.
predict_Nsurv_PRZ_SD_cstTOcst <- predict_Nsurv(fit_cstSD, propiconazole)
## Note that computing can be quite long (several minutes).
predict_Nsurv_PRZ_SD_varTOcst <- predict_Nsurv(fit_varSD, propiconazole)
## Note that computing can be quite long (several minutes).
predict_Nsurv_PRZ_SD_cstTOvar <- predict_Nsurv(fit_cstSD, propiconazole_pulse_exposure)
## Note that computing can be quite long (several minutes).
predict_Nsurv_PRZ_SD_varTOvar <- predict_Nsurv(fit_varSD, propiconazole_pulse_exposure)
## Note that computing can be quite long (several minutes).
predict_Nsurv_ode
For the same reason that a predict_ode function has been implemented as a counterpart of predict using the ODE solver of deSolve, a predict_Nsurv_ode function has been implemented as the equivalent of predict_Nsurv. Its computing time is substantially longer than that of the original function.
predict_Nsurv_PRZ_SD_cstTOcst_ode <- predict_Nsurv_ode(fit_cstSD, propiconazole)
## Note that computing can be quite long (several minutes).
When both functions work well, their results are identical (or highly similar):
plot(predict_Nsurv_PRZ_SD_cstTOcst)
plot(predict_Nsurv_PRZ_SD_cstTOcst_ode)
Then, using an object produced by the function predict_Nsurv(), we can compute PPC, NRMSE and SPPE for all models.
predict_Nsurv_check(predict_Nsurv_PRZ_SD_cstTOvar)
## $Percent_PPC
## replicate PPC
## 1 varA 36.36364
## 2 varB 63.63636
## 3 varC 100.00000
## 4 varControl 81.81818
##
## $Percent_PPC_global
## [1] 70.45455
##
## $Percent_NRMSE
## replicate NRMSE
## 1 varA 31.153292
## 2 varB 20.182053
## 3 varC 5.853125
## 4 varControl 9.130619
##
## $Percent_NRMSE_global
## [1] 17.13552
##
## $Percent_SPPE
## replicate SPPE
## 1 varA 20.000000
## 2 varB 18.571429
## 3 varC 1.428571
## 4 varControl 13.333333
plot(predict_Nsurv_PRZ_SD_cstTOvar)
When plotting a PPC for a survFitPredict_Nsurv object, three types of lines are represented (following EFSA recommendations):
- A plain line corresponding to the 1:1 line (\(y=x\)): predictions match observations perfectly when dots are on this line.
- A band of dashed lines corresponding to the range of 25% deviation.
- A band of dotted lines corresponding to the range of 50% deviation.
ppc(predict_Nsurv_PRZ_SD_cstTOvar)
Following the naming of parameters in the EFSA Scientific Opinion (2018), which differs from our naming of parameters, we add an option to be in agreement with EFSA.
Several parameter names are in use for the TKTD GUTS models, and the R package morse has itself used several names since its GUTS implementation.
For the stability of the algorithms and of the package, we do not change the parameter names in the implemented algorithms. However, we added the argument EFSA_name to use EFSA naming in the summary() functions, in the function priors_distribution(), which provides the distributions of priors (note: distributions of posteriors are obtained with the $mcmc element of a survFit object), and in plot_prior_post(), which plots prior distributions versus posterior distributions.
For instance:
summary(fit_cstSD, EFSA_name = TRUE)
## Summary:
##
## Priors of the parameters (quantiles) (select with '$Qpriors'):
##
## parameters median Q2.5 Q97.5
## kD 4.157e-02 2.771e-04 6.236e+00
## hb 1.317e-02 2.708e-04 6.403e-01
## zw 1.700e+01 8.171e+00 3.539e+01
## bw 5.727e-03 1.021e-05 3.212e+00
##
## Posteriors of the parameters (quantiles) (select with '$Qposteriors'):
##
## parameters median Q2.5 Q97.5
## kD 2.196e+00 1.589e+00 3.476e+00
## hb 2.676e-02 1.282e-02 4.926e-02
## zw 1.691e+01 1.547e+01 1.877e+01
## bw 1.229e-01 7.894e-02 1.902e-01
head(priors_distribution(fit_cstSD, EFSA_name = TRUE))
## hb hb_log10 kD kD_log10 bw bw_log10
## 1 0.0057804767 -2.238036 0.107450477 -0.9687917 2.175829e+01 1.337625
## 2 0.0131357823 -1.881544 0.027261260 -1.5644541 3.901033e-04 -3.408820
## 3 0.0079299129 -2.100732 0.007909079 -2.1018741 7.158086e-04 -3.145203
## 4 0.0003656537 -3.436930 0.001778440 -2.7499607 2.116498e-03 -2.674382
## 5 0.0569237046 -1.244707 0.491991436 -0.3080425 6.753988e-03 -2.170440
## 6 0.0137904714 -1.860421 0.395345819 -0.4030228 1.454708e-03 -2.837224
## zw zw_log10
## 1 13.795565 1.1397395
## 2 10.721956 1.0302740
## 3 16.662281 1.2217345
## 4 8.589012 0.9339432
## 5 6.896206 0.8386102
## 6 18.957316 1.2777768
plot_prior_post(fit_cstSD, EFSA_name = TRUE)
summary(fit_cstIT, EFSA_name = TRUE)
## Summary:
##
## Priors of the parameters (quantiles) (select with '$Qpriors'):
##
## parameters median Q2.5 Q97.5
## kD 4.157e-02 2.771e-04 6.236e+00
## hb 1.317e-02 2.708e-04 6.403e-01
## mw 1.700e+01 8.171e+00 3.539e+01
## beta 1.000e+00 1.259e-02 7.943e+01
##
## Posteriors of the parameters (quantiles) (select with '$Qposteriors'):
##
## parameters median Q2.5 Q97.5
## kD 7.198e-01 5.299e-01 9.399e-01
## hb 1.588e-02 3.643e-03 3.757e-02
## bw 1.771e+01 1.505e+01 2.024e+01
## beta 6.693e+00 4.907e+00 9.007e+00
head(priors_distribution(fit_cstIT, EFSA_name = TRUE))
## hb hb_log10 kd kd_log10 alpha alpha_log10
## 1 0.0301194740 -1.5211526 0.028031750 -1.552349784 15.97910 1.2035523
## 2 0.0182128643 -1.7396217 0.036335835 -1.439664856 17.07419 1.2323400
## 3 0.0008113008 -3.0908181 0.021289131 -1.671842068 9.05069 0.9566817
## 4 0.0001082782 -3.9654588 0.002255436 -2.646769522 21.42763 1.3309742
## 5 0.0020927763 -2.6792772 0.987937713 -0.005270436 40.50333 1.6074907
## 6 0.3388918504 -0.4699389 0.010662900 -1.972124659 20.83557 1.3188055
## beta beta_log10
## 1 1.6781119 0.2248209
## 2 0.2450178 -0.6108023
## 3 41.7127566 1.6202689
## 4 66.9987443 1.8260667
## 5 3.7622789 0.5754510
## 6 33.1338346 1.5202717
plot_prior_post(fit_cstIT, EFSA_name = TRUE)
Compared to the target time analysis, TKTD modelling makes it possible to compute and plot the lethal concentration for any percentage x and at any time-point. The time-point can be specified with time_LCx; by default, the maximal time-point in the data set is used.
# LC50 at the maximum time-point:
LCx_cstSD <- LCx(fit_cstSD, X = 50)
## Warning in regularize.values(x, y, ties, missing(ties)): collapsing to
## unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties)): collapsing to
## unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties)): collapsing to
## unique 'x' values
plot(LCx_cstSD)
# LC50 at time = 2
LCx(fit_cstSD, X = 50, time_LCx = 2) %>% plot()
## Warning in regularize.values(x, y, ties, missing(ties)): collapsing to
## unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties)): collapsing to
## unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties)): collapsing to
## unique 'x' values
## Note the use of the pipe operator, `%>%`, which is a powerful tool for clearly expressing a sequence of multiple operations.
## For more information on pipes, see: http://r4ds.had.co.nz/pipes.html
Warning messages are returned when the range of concentrations is not appropriate for one or more LCx calculation(s).
# LC50 at time = 15
LCx(fit_cstSD, X = 50, time_LCx = 15) %>% plot()
## Warning in regularize.values(x, y, ties, missing(ties)): collapsing to
## unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties)): collapsing to
## unique 'x' values
## Warning in regularize.values(x, y, ties, missing(ties)): collapsing to
## unique 'x' values
# LC50 at the maximum time-point:
LCx_cstIT <- LCx(fit_cstIT, X = 50)
plot(LCx_cstIT)
# LC50 at time = 2
LCx(fit_cstIT, X = 50, time_LCx = 2) %>% plot()
# LC30 at time = 15
LCx(fit_cstIT, X = 30, time_LCx = 15) %>% plot()
# LC50 at time = 4
LCx_varSD <- LCx(fit_varSD, X = 50, time_LCx = 4, conc_range = c(0,100))
plot(LCx_varSD)
# LC50 at time = 30
LCx(fit_varSD, X = 50, time_LCx = 30, conc_range = c(0,100)) %>% plot()
## Warning in regularize.values(x, y, ties, missing(ties)): collapsing to
## unique 'x' values
# LC50 at time = 4
LCx(fit_varIT, X = 50, time_LCx = 4, conc_range = c(0,200)) %>% plot()
## Warning in regularize.values(x, y, ties, missing(ties)): collapsing to
## unique 'x' values
## Warning in pointsLCx(df_dose, X_prop): No 95%sup for survival probability
## of 0.45339146370806 in the range of concentrations under consideration:
## [ 0 ; 200 ]
## Warning: Removed 1 rows containing missing values (geom_point).
# LC50 at time = 30
LCx(fit_varIT, X = 50, time_LCx = 30, conc_range = c(0,100)) %>% plot()
Using prediction functions, GUTS models can be used to simulate the survival rate of organisms exposed to a given exposure pattern. In general, a realistic exposure profile does not result in any related mortality, but a critical question is how far the exposure profile is from an adverse effect, that is, what the “margin of safety” is.
The idea is then to multiply the concentration in the realistic exposure profile by a “multiplication factor”, denoted \(MF_x\), resulting in \(x\%\) (classically \(10\%\) or \(50\%\)) of additional death at a specified time (by default, at the end of the exposure period).
The multiplication factor \(MF_x\) then informs the “margin of safety” that could be used to assess whether the risk should be considered acceptable or not.
Computing an \(MF_x\) is easy with the function MFx(). It only requires a survFit object and the exposure profile, passed through the argument data_predict. The chosen percentage of survival reduction is specified with the argument X (default \(50\)), and the time-point can be specified with time_MFx; by default, the maximal time-point in the data set is used.
There is no explicit formulation of \(MF_x\) (at least for the GUTS-SD model), so the accuracy argument can be used to change the accuracy of the convergence level.
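Because \(MF_x\) has to be found numerically, some search over candidate multiplication factors is needed. The sketch below illustrates the general idea with a plain bisection on a toy problem. The survival function `toy_survival()`, the helper `find_mf()` and its tolerance `tol` are hypothetical stand-ins, and this is not the algorithm implemented in `MFx()`.

```r
# Toy stand-in for a model prediction: survival probability at the final
# time as a decreasing function of the multiplication factor.
toy_survival <- function(mf, k = 0.2) exp(-k * mf)

# Bisection: find the factor at which survival drops to (1 - x / 100)
# of its value without exposure (mf = 0).
find_mf <- function(x = 50, lower = 0, upper = 1000, tol = 0.01) {
  target <- (1 - x / 100) * toy_survival(0)
  while (upper - lower > tol) {
    mid <- (lower + upper) / 2
    if (toy_survival(mid) > target) {
      lower <- mid   # effect still too small: the factor must be larger
    } else {
      upper <- mid   # effect too large: the factor must be smaller
    }
  }
  (lower + upper) / 2
}

mf50 <- find_mf(x = 50)
print(mf50)
```

For this toy function the exact answer is `log(2) / 0.2`, which makes it easy to see how the tolerance controls the precision of the result.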
# (1) upload or build a data frame with the exposure profile
data_4MFx <- data.frame(time = 1:10,
conc = c(0,0.5,8,3,0,0,0.5,8,3.5,0))
# (2) Use the fit on constant exposure propiconazole with model SD (see previously)
MFx_PRZ_cstSD_4MFx <- MFx(object = fit_cstSD, data_predict = data_4MFx)
## q50 1 accuracy: 0.391371 with multiplication factor: 1000
## q50 2 accuracy: 0.391371 with multiplication factor: 500
## q50 3 accuracy: 0.391371 with multiplication factor: 250
## q50 4 accuracy: 0.391371 with multiplication factor: 125
## q50 5 accuracy: 0.391371 with multiplication factor: 62.5
## q50 6 accuracy: 0.391371 with multiplication factor: 31.25
## q50 7 accuracy: 0.391371 with multiplication factor: 15.625
## q50 8 accuracy: 0.3913396 with multiplication factor: 7.8125
## q50 9 accuracy: 0.1967993 with multiplication factor: 3.90625
## q50 10 accuracy: 0.391371 with multiplication factor: 1.953125
## q50 11 accuracy: 0.3131043 with multiplication factor: 2.929688
## q50 12 accuracy: 0.02144811 with multiplication factor: 3.417969
## q50 13 accuracy: 0.1023751 with multiplication factor: 3.662109
## q50 14 accuracy: 0.04447074 with multiplication factor: 3.540039
## q50 15 accuracy: 0.01270335 with multiplication factor: 3.479004
## q50 16 accuracy: 0.004486061 with multiplication factor: 3.448486
## qinf95 1 accuracy: 0.391371 with multiplication factor: 1000
## qinf95 2 accuracy: 0.391371 with multiplication factor: 500
## qinf95 3 accuracy: 0.391371 with multiplication factor: 250
## qinf95 4 accuracy: 0.391371 with multiplication factor: 125
## qinf95 5 accuracy: 0.391371 with multiplication factor: 62.5
## qinf95 6 accuracy: 0.391371 with multiplication factor: 31.25
## qinf95 7 accuracy: 0.391371 with multiplication factor: 15.625
## qinf95 8 accuracy: 0.3913694 with multiplication factor: 7.8125
## qinf95 9 accuracy: 0.2498114 with multiplication factor: 3.90625
## qinf95 10 accuracy: 0.2933304 with multiplication factor: 1.953125
## qinf95 11 accuracy: 0.1996141 with multiplication factor: 2.929688
## qinf95 12 accuracy: 0.05849731 with multiplication factor: 3.417969
## qinf95 13 accuracy: 0.06799025 with multiplication factor: 3.173828
## qinf95 14 accuracy: 0.002383348 with multiplication factor: 3.295898
## qsup95 1 accuracy: 0.391371 with multiplication factor: 1000
## qsup95 2 accuracy: 0.391371 with multiplication factor: 500
## qsup95 3 accuracy: 0.391371 with multiplication factor: 250
## qsup95 4 accuracy: 0.391371 with multiplication factor: 125
## qsup95 5 accuracy: 0.391371 with multiplication factor: 62.5
## qsup95 6 accuracy: 0.391371 with multiplication factor: 31.25
## qsup95 7 accuracy: 0.391371 with multiplication factor: 15.625
## qsup95 8 accuracy: 0.3910594 with multiplication factor: 7.8125
## qsup95 9 accuracy: 0.1378857 with multiplication factor: 3.90625
## qsup95 10 accuracy: 0.4688228 with multiplication factor: 1.953125
## qsup95 11 accuracy: 0.4132493 with multiplication factor: 2.929688
## qsup95 12 accuracy: 0.1166627 with multiplication factor: 3.417969
## qsup95 13 accuracy: 0.03058822 with multiplication factor: 3.662109
## qsup95 14 accuracy: 0.03883372 with multiplication factor: 3.540039
## qsup95 15 accuracy: 0.002390517 with multiplication factor: 3.601074
As the computation can take a long time, the function prints the accuracy reached for each tested multiplication factor, for the median and for the bounds of the 95% credible interval.
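The sequence of factors printed above (1000, 500, 250, ..., then values closing in on the solution) is what a bisection on the interval from 0 to 1000 produces when the upper bound is evaluated first. The following base R sketch reproduces that pattern on a made-up survival curve. The function `search_mfx()`, its arguments, and `toy_survival()` are hypothetical and only mimic the printed behaviour; this is not the code used inside morse.

```r
# Toy survival at the target time as a function of the multiplication factor.
# Made-up, strictly decreasing curve standing in for a model prediction.
toy_survival <- function(mf) 0.9 / (1 + (mf / 3.5)^4)

# Hypothetical bisection on [0, mf_max], evaluating the upper bound first
search_mfx <- function(surv_fun, X = 50, mf_max = 1000,
                       accuracy = 0.01, threshold_iter = 100, quiet = FALSE) {
  target <- (1 - X / 100) * surv_fun(1)  # X % fewer survivors than at MF = 1
  lower <- 0
  upper <- mf_max
  mf <- mf_max
  for (i in seq_len(threshold_iter)) {
    gap <- surv_fun(mf) - target
    if (!quiet) {
      cat(sprintf("step %d accuracy: %g with multiplication factor: %g\n",
                  i, abs(gap), mf))
    }
    if (abs(gap) < accuracy) {
      return(list(MFx = mf, iterations = i))
    }
    # too few survivors: the factor is too large; otherwise it is too small
    if (gap < 0) upper <- mf else lower <- mf
    mf <- (lower + upper) / 2
  }
  warning("iteration threshold reached without convergence")
  list(MFx = NA_real_, iterations = threshold_iter)
}

res <- search_mfx(toy_survival, X = 50)
res$MFx
```

A smaller `accuracy` gives a more precise factor at the cost of more iterations, each of which corresponds to one full model prediction in a real analysis.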
Then, we can plot the survival rate as a function of the tested multiplication factors. Note that the curve is a linear interpolation between the tested multiplication factors (shown as crosses on the graph).
# (3) Plot the survival rate as function of the multiplication factors.
plot(MFx_PRZ_cstSD_4MFx)
## Warning in plot.MFx(MFx_PRZ_cstSD_4MFx): This is not an error message:
## Just take into account that MFx as been estimated with a binary
## search using the 'accuracy' argument. Cross point indicate the
## position of evaluated time series. To improve the shape of the curve, you
## can use X = NULL, and compute time series around the median MFx, with the
## vector `MFx_range`.
In this specific case, the x-axis needs to be log-scaled, which is possible by setting the option log_scale = TRUE:
# (3 bis) Plot the survival rate as function of the multiplication factors in log-scale.
plot(MFx_PRZ_cstSD_4MFx, log_scale = TRUE)
## Warning in plot.MFx(MFx_PRZ_cstSD_4MFx, log_scale = TRUE): This is not an error message:
## Just take into account that MFx as been estimated with a binary
## search using the 'accuracy' argument. Cross point indicate the
## position of evaluated time series. To improve the shape of the curve, you
## can use X = NULL, and compute time series around the median MFx, with the
## vector `MFx_range`.
As indicated, the warning message just reminds you how the multiplication factors and the linear interpolations between them have been computed to obtain the graph.
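The linear interpolation between evaluated points can be illustrated with the base R function `approx()`. In this sketch the multiplication factors are among those tested above, but the survival values attached to them are made-up numbers, used only to show the mechanics.

```r
# Tested multiplication factors with illustrative (made-up) survival rates
tested <- data.frame(MFx = c(1.953125, 2.929688, 3.417969, 3.90625),
                     survival = c(0.82, 0.60, 0.47, 0.35))

# Survival read off the interpolated curve at a factor that was not tested
mf_new <- 3.2
interp <- approx(tested$MFx, tested$survival, xout = mf_new)$y

# Same value computed by hand from the two neighbouring tested points
x0 <- tested$MFx[2]; x1 <- tested$MFx[3]
y0 <- tested$survival[2]; y1 <- tested$survival[3]
by_hand <- y0 + (mf_new - x0) / (x1 - x0) * (y1 - y0)

c(approx = interp, by_hand = by_hand)
```

Because only a few factors are evaluated far from the solution, the interpolated curve can look angular there; evaluating more factors around the region of interest smooths it.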
To compare the initial survival rate (corresponding to a multiplication factor of 1) with the survival rate at the multiplication factor leading to an \(x\%\) reduction in survival (set with the argument X), we can use the option x_variable = "Time". This option makes it possible to visualize the difference in survival rate with and without the multiplication factor.
# (4) Plot the survival rate versus time. Control (MFx = 1) and estimated MFx.
plot(MFx_PRZ_cstSD_4MFx, x_variable = "Time")
The values displayed by the function plot() are directly accessible in the object of class MFx. For instance, to get the median and the bounds of the \(95\%\) credible interval of the returned MFx, we simply extract the element df_MFx, which is the following data.frame:
MFx_PRZ_cstSD_4MFx$df_MFx
## quantile MFx
## 1 median 3.448486
## 2 quantile 2.5% 3.295898
## 3 quantile 97.5% 3.601074
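As a small follow-up, the sketch below rebuilds this data.frame by hand from the values printed above (so that it runs without a fitted model) and derives the width of the credible interval. With a real analysis you would work on `MFx_PRZ_cstSD_4MFx$df_MFx` directly.

```r
# Values copied from the output printed above
df_MFx <- data.frame(
  quantile = c("median", "quantile 2.5%", "quantile 97.5%"),
  MFx = c(3.448486, 3.295898, 3.601074)
)

median_MFx <- df_MFx$MFx[df_MFx$quantile == "median"]
ci <- range(df_MFx$MFx[df_MFx$quantile != "median"])

# Absolute and relative width of the 95% credible interval
ci_width <- diff(ci)
ci_width / median_MFx
```

A narrow interval relative to the median, as here, indicates that the estimated margin of safety is little affected by parameter uncertainty.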
Here is another example, with a multiplication factor inducing a \(10\%\) reduction in survival (X = 10):
# (2 bis) fit on constant exposure propiconazole with model SD (see previously)
MFx_PRZ_cstSD_4MFx_x10 <- MFx(object = fit_cstSD, data_predict = data_4MFx, X = 10)
## q50 1 accuracy: 0.7044677 with multiplication factor: 1000
## q50 2 accuracy: 0.7044677 with multiplication factor: 500
## q50 3 accuracy: 0.7044677 with multiplication factor: 250
## q50 4 accuracy: 0.7044677 with multiplication factor: 125
## q50 5 accuracy: 0.7044677 with multiplication factor: 62.5
## q50 6 accuracy: 0.7044677 with multiplication factor: 31.25
## q50 7 accuracy: 0.7044677 with multiplication factor: 15.625
## q50 8 accuracy: 0.7044364 with multiplication factor: 7.8125
## q50 9 accuracy: 0.5098961 with multiplication factor: 3.90625
## q50 10 accuracy: 0.07827419 with multiplication factor: 1.953125
## q50 11 accuracy: 7.518292e-06 with multiplication factor: 2.929688
## qinf95 1 accuracy: 0.7044677 with multiplication factor: 1000
## qinf95 2 accuracy: 0.7044677 with multiplication factor: 500
## qinf95 3 accuracy: 0.7044677 with multiplication factor: 250
## qinf95 4 accuracy: 0.7044677 with multiplication factor: 125
## qinf95 5 accuracy: 0.7044677 with multiplication factor: 62.5
## qinf95 6 accuracy: 0.7044677 with multiplication factor: 31.25
## qinf95 7 accuracy: 0.7044677 with multiplication factor: 15.625
## qinf95 8 accuracy: 0.7044661 with multiplication factor: 7.8125
## qinf95 9 accuracy: 0.5629082 with multiplication factor: 3.90625
## qinf95 10 accuracy: 0.01976633 with multiplication factor: 1.953125
## qinf95 11 accuracy: 0.01976633 with multiplication factor: 0.9765625
## qinf95 12 accuracy: 0.01976633 with multiplication factor: 0.4882812
## qinf95 13 accuracy: 0.01976633 with multiplication factor: 0.2441406
## qinf95 14 accuracy: 0.01976633 with multiplication factor: 0.1220703
## qinf95 15 accuracy: 0.01976633 with multiplication factor: 0.06103516
## qinf95 16 accuracy: 0.01976633 with multiplication factor: 0.03051758
## qinf95 17 accuracy: 0.01976633 with multiplication factor: 0.01525879
## qinf95 18 accuracy: 0.01976633 with multiplication factor: 0.007629395
## qinf95 19 accuracy: 0.01976633 with multiplication factor: 0.003814697
## qinf95 20 accuracy: 0.01976633 with multiplication factor: 0.001907349
## qinf95 21 accuracy: 0.01976633 with multiplication factor: 0.0009536743
## qinf95 22 accuracy: 0.01976633 with multiplication factor: 0.0004768372
## qinf95 23 accuracy: 0.01976633 with multiplication factor: 0.0002384186
## qinf95 24 accuracy: 0.01976633 with multiplication factor: 0.0001192093
## qinf95 25 accuracy: 0.01976633 with multiplication factor: 5.960464e-05
## qinf95 26 accuracy: 0.01976633 with multiplication factor: 2.980232e-05
## qinf95 27 accuracy: 0.01976633 with multiplication factor: 1.490116e-05
## qinf95 28 accuracy: 0.01976633 with multiplication factor: 7.450581e-06
## qinf95 29 accuracy: 0.01976633 with multiplication factor: 3.72529e-06
## qinf95 30 accuracy: 0.01976633 with multiplication factor: 1.862645e-06
## qinf95 31 accuracy: 0.01976633 with multiplication factor: 9.313226e-07
## qinf95 32 accuracy: 0.01976633 with multiplication factor: 4.656613e-07
## qinf95 33 accuracy: 0.01976633 with multiplication factor: 2.328306e-07
## qinf95 34 accuracy: 0.01976633 with multiplication factor: 1.164153e-07
## qinf95 35 accuracy: 0.01976633 with multiplication factor: 5.820766e-08
## qinf95 36 accuracy: 0.01976633 with multiplication factor: 2.910383e-08
## qinf95 37 accuracy: 0.01976633 with multiplication factor: 1.455192e-08
## qinf95 38 accuracy: 0.01976633 with multiplication factor: 7.275958e-09
## qinf95 39 accuracy: 0.01976633 with multiplication factor: 3.637979e-09
## qinf95 40 accuracy: 0.01976633 with multiplication factor: 1.818989e-09
## qinf95 41 accuracy: 0.01976633 with multiplication factor: 9.094947e-10
## qinf95 42 accuracy: 0.01976633 with multiplication factor: 4.547474e-10
## qinf95 43 accuracy: 0.01976633 with multiplication factor: 2.273737e-10
## qinf95 44 accuracy: 0.01976633 with multiplication factor: 1.136868e-10
## qinf95 45 accuracy: 0.01976633 with multiplication factor: 5.684342e-11
## qinf95 46 accuracy: 0.01976633 with multiplication factor: 2.842171e-11
## qinf95 47 accuracy: 0.01976633 with multiplication factor: 1.421085e-11
## qinf95 48 accuracy: 0.01976633 with multiplication factor: 7.105427e-12
## qinf95 49 accuracy: 0.01976633 with multiplication factor: 3.552714e-12
## qinf95 50 accuracy: 0.01976633 with multiplication factor: 1.776357e-12
## qinf95 51 accuracy: 0.01976633 with multiplication factor: 8.881784e-13
## qinf95 52 accuracy: 0.01976633 with multiplication factor: 4.440892e-13
## qinf95 53 accuracy: 0.01976633 with multiplication factor: 2.220446e-13
## qinf95 54 accuracy: 0.01976633 with multiplication factor: 1.110223e-13
## qinf95 55 accuracy: 0.01976633 with multiplication factor: 5.551115e-14
## qinf95 56 accuracy: 0.01976633 with multiplication factor: 2.775558e-14
## qinf95 57 accuracy: 0.01976633 with multiplication factor: 1.387779e-14
## qinf95 58 accuracy: 0.01976633 with multiplication factor: 6.938894e-15
## qinf95 59 accuracy: 0.01976633 with multiplication factor: 3.469447e-15
## qinf95 60 accuracy: 0.01976633 with multiplication factor: 1.734723e-15
## qinf95 61 accuracy: 0.01976633 with multiplication factor: 8.673617e-16
## qinf95 62 accuracy: 0.01976633 with multiplication factor: 4.336809e-16
## qinf95 63 accuracy: 0.01976633 with multiplication factor: 2.168404e-16
## qinf95 64 accuracy: 0.01976633 with multiplication factor: 1.084202e-16
## qinf95 65 accuracy: 0.01976633 with multiplication factor: 5.421011e-17
## qinf95 66 accuracy: 0.01976633 with multiplication factor: 2.710505e-17
## qinf95 67 accuracy: 0.01976633 with multiplication factor: 1.355253e-17
## qinf95 68 accuracy: 0.01976633 with multiplication factor: 6.776264e-18
## qinf95 69 accuracy: 0.01976633 with multiplication factor: 3.388132e-18
## qinf95 70 accuracy: 0.01976633 with multiplication factor: 1.694066e-18
## qinf95 71 accuracy: 0.01976633 with multiplication factor: 8.470329e-19
## qinf95 72 accuracy: 0.01976633 with multiplication factor: 4.235165e-19
## qinf95 73 accuracy: 0.01976633 with multiplication factor: 2.117582e-19
## qinf95 74 accuracy: 0.01976633 with multiplication factor: 1.058791e-19
## qinf95 75 accuracy: 0.01976633 with multiplication factor: 5.293956e-20
## qinf95 76 accuracy: 0.01976633 with multiplication factor: 2.646978e-20
## qinf95 77 accuracy: 0.01976633 with multiplication factor: 1.323489e-20
## qinf95 78 accuracy: 0.01976633 with multiplication factor: 6.617445e-21
## qinf95 79 accuracy: 0.01976633 with multiplication factor: 3.308722e-21
## qinf95 80 accuracy: 0.01976633 with multiplication factor: 1.654361e-21
## qinf95 81 accuracy: 0.01976633 with multiplication factor: 8.271806e-22
## qinf95 82 accuracy: 0.01976633 with multiplication factor: 4.135903e-22
## qinf95 83 accuracy: 0.01976633 with multiplication factor: 2.067952e-22
## qinf95 84 accuracy: 0.01976633 with multiplication factor: 1.033976e-22
## qinf95 85 accuracy: 0.01976633 with multiplication factor: 5.169879e-23
## qinf95 86 accuracy: 0.01976633 with multiplication factor: 2.584939e-23
## qinf95 87 accuracy: 0.01976633 with multiplication factor: 1.29247e-23
## qinf95 88 accuracy: 0.01976633 with multiplication factor: 6.462349e-24
## qinf95 89 accuracy: 0.01976633 with multiplication factor: 3.231174e-24
## qinf95 90 accuracy: 0.01976633 with multiplication factor: 1.615587e-24
## qinf95 91 accuracy: 0.01976633 with multiplication factor: 8.077936e-25
## qinf95 92 accuracy: 0.01976633 with multiplication factor: 4.038968e-25
## qinf95 93 accuracy: 0.01976633 with multiplication factor: 2.019484e-25
## qinf95 94 accuracy: 0.01976633 with multiplication factor: 1.009742e-25
## qinf95 95 accuracy: 0.01976633 with multiplication factor: 5.04871e-26
## qinf95 96 accuracy: 0.01976633 with multiplication factor: 2.524355e-26
## qinf95 97 accuracy: 0.01976633 with multiplication factor: 1.262177e-26
## qinf95 98 accuracy: 0.01976633 with multiplication factor: 6.310887e-27
## qinf95 99 accuracy: 0.01976633 with multiplication factor: 3.155444e-27
## qinf95 100 accuracy: 0.01976633 with multiplication factor: 1.577722e-27
## Warning in binarySearch_MFx(object = object, spaghetti = spaghetti,
## mcmc_size = mcmc_size, : For qinf95 , the number of iterations reached the
## threshold number of iterations of 100
## qsup95 1 accuracy: 0.7044677 with multiplication factor: 1000
## qsup95 2 accuracy: 0.7044677 with multiplication factor: 500
## qsup95 3 accuracy: 0.7044677 with multiplication factor: 250
## qsup95 4 accuracy: 0.7044677 with multiplication factor: 125
## qsup95 5 accuracy: 0.7044677 with multiplication factor: 62.5
## qsup95 6 accuracy: 0.7044677 with multiplication factor: 31.25
## qsup95 7 accuracy: 0.7044677 with multiplication factor: 15.625
## qsup95 8 accuracy: 0.7041561 with multiplication factor: 7.8125
## qsup95 9 accuracy: 0.4509825 with multiplication factor: 3.90625
## qsup95 10 accuracy: 0.155726 with multiplication factor: 1.953125
## qsup95 11 accuracy: 0.1001525 with multiplication factor: 2.929688
## qsup95 12 accuracy: 0.196434 with multiplication factor: 3.417969
## qsup95 13 accuracy: 0.02867901 with multiplication factor: 3.173828
## qsup95 14 accuracy: 0.04403394 with multiplication factor: 3.051758
## qsup95 15 accuracy: 0.01080081 with multiplication factor: 3.112793
## qsup95 16 accuracy: 0.00815689 with multiplication factor: 3.143311
The warning message just says that the \(2.5\%\) quantile could not be computed: for qinf95, the binary search reached the maximum number of iterations (100) without converging. You can see this in the element df_MFx of MFx_PRZ_cstSD_4MFx_x10. The reason becomes clear when you plot the multiplication factor-response curve:
# Plot with log scale
plot(MFx_PRZ_cstSD_4MFx_x10, log_scale = TRUE)
## Warning in plot.MFx(MFx_PRZ_cstSD_4MFx_x10, log_scale = TRUE): This is not an error message:
## Just take into account that MFx as been estimated with a binary
## search using the 'accuracy' argument. Cross point indicate the
## position of evaluated time series. To improve the shape of the curve, you
## can use X = NULL, and compute time series around the median MFx, with the
## vector `MFx_range`.
To avoid these useless iterations, you can lower the maximum number of iterations with the argument threshold_iter:
# (2 ter) fit on constant exposure propiconazole with model SD (see previously)
MFx_PRZ_cstSD_4MFx_x10_thresh20 <- MFx(object = fit_cstSD, data_predict = data_4MFx, X = 10, threshold_iter = 20)
## q50 1 accuracy: 0.7044677 with multiplication factor: 1000
## q50 2 accuracy: 0.7044677 with multiplication factor: 500
## q50 3 accuracy: 0.7044677 with multiplication factor: 250
## q50 4 accuracy: 0.7044677 with multiplication factor: 125
## q50 5 accuracy: 0.7044677 with multiplication factor: 62.5
## q50 6 accuracy: 0.7044677 with multiplication factor: 31.25
## q50 7 accuracy: 0.7044677 with multiplication factor: 15.625
## q50 8 accuracy: 0.7044364 with multiplication factor: 7.8125
## q50 9 accuracy: 0.5098961 with multiplication factor: 3.90625
## q50 10 accuracy: 0.07827419 with multiplication factor: 1.953125
## q50 11 accuracy: 7.518292e-06 with multiplication factor: 2.929688
## qinf95 1 accuracy: 0.7044677 with multiplication factor: 1000
## qinf95 2 accuracy: 0.7044677 with multiplication factor: 500
## qinf95 3 accuracy: 0.7044677 with multiplication factor: 250
## qinf95 4 accuracy: 0.7044677 with multiplication factor: 125
## qinf95 5 accuracy: 0.7044677 with multiplication factor: 62.5
## qinf95 6 accuracy: 0.7044677 with multiplication factor: 31.25
## qinf95 7 accuracy: 0.7044677 with multiplication factor: 15.625
## qinf95 8 accuracy: 0.7044661 with multiplication factor: 7.8125
## qinf95 9 accuracy: 0.5629082 with multiplication factor: 3.90625
## qinf95 10 accuracy: 0.01976633 with multiplication factor: 1.953125
## qinf95 11 accuracy: 0.01976633 with multiplication factor: 0.9765625
## qinf95 12 accuracy: 0.01976633 with multiplication factor: 0.4882812
## qinf95 13 accuracy: 0.01976633 with multiplication factor: 0.2441406
## qinf95 14 accuracy: 0.01976633 with multiplication factor: 0.1220703
## qinf95 15 accuracy: 0.01976633 with multiplication factor: 0.06103516
## qinf95 16 accuracy: 0.01976633 with multiplication factor: 0.03051758
## qinf95 17 accuracy: 0.01976633 with multiplication factor: 0.01525879
## qinf95 18 accuracy: 0.01976633 with multiplication factor: 0.007629395
## qinf95 19 accuracy: 0.01976633 with multiplication factor: 0.003814697
## qinf95 20 accuracy: 0.01976633 with multiplication factor: 0.001907349
## Warning in binarySearch_MFx(object = object, spaghetti = spaghetti,
## mcmc_size = mcmc_size, : For qinf95 , the number of iterations reached the
## threshold number of iterations of 20
## qsup95 1 accuracy: 0.7044677 with multiplication factor: 1000
## qsup95 2 accuracy: 0.7044677 with multiplication factor: 500
## qsup95 3 accuracy: 0.7044677 with multiplication factor: 250
## qsup95 4 accuracy: 0.7044677 with multiplication factor: 125
## qsup95 5 accuracy: 0.7044677 with multiplication factor: 62.5
## qsup95 6 accuracy: 0.7044677 with multiplication factor: 31.25
## qsup95 7 accuracy: 0.7044677 with multiplication factor: 15.625
## qsup95 8 accuracy: 0.7041561 with multiplication factor: 7.8125
## qsup95 9 accuracy: 0.4509825 with multiplication factor: 3.90625
## qsup95 10 accuracy: 0.155726 with multiplication factor: 1.953125
## qsup95 11 accuracy: 0.1001525 with multiplication factor: 2.929688
## qsup95 12 accuracy: 0.196434 with multiplication factor: 3.417969
## qsup95 13 accuracy: 0.02867901 with multiplication factor: 3.173828
## qsup95 14 accuracy: 0.04403394 with multiplication factor: 3.051758
## qsup95 15 accuracy: 0.01080081 with multiplication factor: 3.112793
## qsup95 16 accuracy: 0.00815689 with multiplication factor: 3.143311
plot(MFx_PRZ_cstSD_4MFx_x10_thresh20, log_scale = TRUE)
## Warning in plot.MFx(MFx_PRZ_cstSD_4MFx_x10_thresh20, log_scale = TRUE): This is not an error message:
## Just take into account that MFx as been estimated with a binary
## search using the 'accuracy' argument. Cross point indicate the
## position of evaluated time series. To improve the shape of the curve, you
## can use X = NULL, and compute time series around the median MFx, with the
## vector `MFx_range`.
After calling the plot() function, you get the following message: Warning message: Removed 1 rows containing missing values (geom_point). This message comes from the ggplot() function (see the ggplot2 package) and echoes the previous warning about the \(2.5\%\) quantile, which could not be computed and is therefore missing from the plot.
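The multiplication factors printed for qinf95 in the two runs above can be checked by hand: each step simply halves the previous factor, starting from 1000. A short base R check, independent of morse:

```r
# Successive halving from the first tested factor
mf_start <- 1000
halving <- mf_start / 2^(0:99)

# Factors at steps 1, 2, 10, 20 and 100
halving[c(1, 2, 10, 20, 100)]
```

Step 20 is about 0.0019 and step 100 about 1.58e-27, which matches the last qinf95 lines printed with threshold_iter = 20 and with the default threshold of 100. Factors that small no longer change the prediction, which is why lowering the threshold loses nothing here.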
The multiplication factor can also be computed for the GUTS-IT model (the option quiet = TRUE suppresses the run messages):
# (2) Use the fit on constant exposure propiconazole with model IT. No print of run messages.
MFx_PRZ_cstIT_4pred <- MFx(object = fit_cstIT, data_predict = data_4MFx, time_MFx = 4, quiet = TRUE)
# (3) Plot the survival rate versus multiplication factors.
plot(MFx_PRZ_cstIT_4pred, log_scale = TRUE)
## Warning in plot.MFx(MFx_PRZ_cstIT_4pred, log_scale = TRUE): This is not an error message:
## Just take into account that MFx as been estimated with a binary
## search using the 'accuracy' argument. Cross point indicate the
## position of evaluated time series. To improve the shape of the curve, you
## can use X = NULL, and compute time series around the median MFx, with the
## vector `MFx_range`.
This last example sets a \(10\%\) reduction of the survival rate, removes the background mortality by setting hb_value = FALSE, and is computed at time time_MFx = 4.
# (2) Use the fit on constant exposure propiconazole with model IT. No print of run messages.
MFx_PRZ_cstIT_4pred <- MFx(object = fit_cstIT, X=10, hb_value = FALSE, data_predict = data_4MFx, time_MFx = 4, quiet = TRUE)
plot(MFx_PRZ_cstIT_4pred, log_scale = TRUE)
## Warning in plot.MFx(MFx_PRZ_cstIT_4pred, log_scale = TRUE): This is not an error message:
## Just take into account that MFx as been estimated with a binary
## search using the 'accuracy' argument. Cross point indicate the
## position of evaluated time series. To improve the shape of the curve, you
## can use X = NULL, and compute time series around the median MFx, with the
## vector `MFx_range`.
plot(MFx_PRZ_cstIT_4pred, x_variable = "Time")
Once we have obtained the multiplication factor inducing the desired \(x\%\) reduction of the survival rate, it can be relevant to explore the sensitivity of this result by computing the survival rate over a range of multiplication factors. This is possible by setting the argument X = NULL and providing the multiplication factors of interest as a vector through the argument MFx_range, as in the following call based on our first example.
# Use the fit on constant exposure propiconazole with model SD.
MFx_PRZ_cstSD_4pred_range <- MFx(object = fit_cstSD, data_predict = data_4MFx, X = NULL, MFx_range = 1:6)
The associated plot is given by:
# Plot survival rate versus the range of multiplication factor.
plot(MFx_PRZ_cstSD_4pred_range)
With the argument x_variable = "Time", the plot shows all computed time series:
# Plot Survival rate as function of time.
plot(MFx_PRZ_cstSD_4pred_range, x_variable = "Time")
To select a specific time series, we can use the element ls_predict, which is a list of objects of class survFitPredict, for which a plot method is defined.
# Plot a specific time series.
plot(MFx_PRZ_cstSD_4pred_range$ls_predict[[4]])
The steps for reproduction data analysis are analogous to those described for survival data. Here, the aim is to estimate the relationship between the concentration of the chemical compound and the reproduction rate per individual-day.
Here is a typical session:
# (1) load data set
data(cadmium2)
# (2) check structure and integrity of the data set
reproDataCheck(cadmium2)
## No message
# (3) create a `reproData` object
dat <- reproData(cadmium2)
# (4) represent the cumulated number of offspring as a function of time
plot(dat, concentration = 124, addlegend = TRUE, pool.replicate = FALSE)
# (5) represent the reproduction rate as a function of concentration
plotDoseResponse(dat, target.time = 28)
# (6) check information on the experimental design
summary(dat)
##
## Number of replicates per time and concentration:
## time
## conc 0 3 7 10 14 17 21 24 28 31 35 38 42 45 49 52 56
## 0 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
## 53 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
## 78 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
## 124 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
## 232 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
## 284 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
##
## Number of survivors (sum of replicates) per time and concentration:
## 0 3 7 10 14 17 21 24 28 31 35 38 42 45 49 52 56
## 0 30 30 30 30 29 29 29 29 29 28 28 28 28 28 28 28 28
## 53 30 30 29 29 29 29 29 29 29 29 28 28 28 28 28 28 28
## 78 30 30 30 30 30 30 29 29 29 29 29 29 29 29 29 27 27
## 124 30 30 30 30 30 29 28 28 27 26 25 23 21 18 11 11 9
## 232 30 30 30 22 18 18 17 14 13 12 8 4 3 1 0 0 0
## 284 30 30 15 7 4 4 4 2 2 1 1 1 1 1 1 0 0
##
## Number of offspring (sum of replicates) per time and concentration:
## 0 3 7 10 14 17 21 24 28 31 35 38 42 45
## 0 0 1659 1587 2082 1580 2400 2069 2316 1822 2860 2154 3200 1603 2490
## 53 0 1221 1567 1710 1773 1859 1602 1995 1800 2101 1494 2126 935 1629
## 78 0 1066 2023 1752 1629 1715 1719 1278 1717 1451 1826 1610 1097 1727
## 124 0 807 1917 1423 567 383 568 493 605 631 573 585 546 280
## 232 0 270 1153 252 30 0 37 28 46 119 19 9 0 0
## 284 0 146 275 18 1 0 0 0 0 0 0 0 0 0
## 49 52 56
## 0 1609 2149 2881
## 53 2108 1686 1628
## 78 2309 1954 1760
## 124 594 328 380
## 232 0 0 0
## 284 0 0 0
# (7) fit a concentration-effect model at target-time
fit <- reproFitTT(dat, stoc.part = "bestfit",
target.time = 21,
ecx = c(10, 20, 30, 40, 50),
quiet = TRUE)
summary(fit)
## Summary:
##
## The loglogistic model with a Gamma Poisson stochastic part was used !
##
## Priors on parameters (quantiles):
##
## 50% 2.5% 97.5%
## b 1.000e+00 1.259e-02 7.943e+01
## d 1.830e+01 1.554e+01 2.107e+01
## e 1.488e+02 7.902e+01 2.804e+02
## omega 1.000e+00 1.585e-04 6.310e+03
##
## Posteriors of the parameters (quantiles):
##
## 50% 2.5% 97.5%
## b 3.837e+00 2.824e+00 5.780e+00
## d 1.778e+01 1.544e+01 2.015e+01
## e 1.368e+02 1.134e+02 1.730e+02
## omega 1.440e+00 8.236e-01 2.785e+00
##
## Posteriors of the ECx (quantiles):
##
## 50% 2.5% 97.5%
## EC10 7.706e+01 5.386e+01 1.166e+02
## EC20 9.524e+01 7.111e+01 1.349e+02
## EC30 1.097e+02 8.562e+01 1.484e+02
## EC40 1.231e+02 9.943e+01 1.607e+02
## EC50 1.368e+02 1.134e+02 1.730e+02
plot(fit, log.scale = TRUE, adddata = TRUE,
cicol = "orange",
addlegend = TRUE)
ppc(fit)
As in the survival analysis, we assume that the reproduction rate per individual-day is a log-logistic function of the concentration. More details and the meaning of the parameters can be found in the modelling vignette.
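As an illustration, the sketch below assumes the three-parameter log-logistic form \(f(c) = d / (1 + (c/e)^b)\), with d the rate in the control, e the \(EC_{50}\) and b a shape parameter; check the modelling vignette for the exact parameterization used by morse. Under that assumption, solving \(f(EC_x) = (1 - x/100)\,d\) gives \(EC_x = e\,(x/(100-x))^{1/b}\). The helper names `loglogistic()` and `ecx()` are hypothetical, and the parameter values are the posterior medians printed in the summary above.

```r
# Assumed log-logistic mean reproduction rate per individual-day
loglogistic <- function(conc, b, d, e) d / (1 + (conc / e)^b)

# Concentration reducing the control rate d by x percent under this form
ecx <- function(x, b, e) e * (x / (100 - x))^(1 / b)

# Posterior medians taken from the summary printed above
b <- 3.837
d <- 17.78
e <- 136.8

x <- c(10, 20, 30, 40, 50)
ec <- ecx(x, b, e)
data.frame(x = x, ECx = ec, rate_at_ECx = loglogistic(ec, b, d, e))
```

Plugging in posterior medians gives values close to, but not necessarily identical with, the posterior medians of the \(EC_x\) reported by summary(), because the latter are computed from the full joint posterior sample.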
For reproduction analyses, we compare one model which neglects the inter-individual variability (named “Poisson”) and another one which takes it into account (named “gamma Poisson”). You can choose either one with the option stoc.part. Setting this option to "bestfit" lets reproFitTT() decide which model fits the data best. The resulting choice can be seen by calling the summary function:
summary(fit)
## Summary:
##
## The loglogistic model with a Gamma Poisson stochastic part was used !
##
## Priors on parameters (quantiles):
##
## 50% 2.5% 97.5%
## b 1.000e+00 1.259e-02 7.943e+01
## d 1.830e+01 1.554e+01 2.107e+01
## e 1.488e+02 7.902e+01 2.804e+02
## omega 1.000e+00 1.585e-04 6.310e+03
##
## Posteriors of the parameters (quantiles):
##
## 50% 2.5% 97.5%
## b 3.837e+00 2.824e+00 5.780e+00
## d 1.778e+01 1.544e+01 2.015e+01
## e 1.368e+02 1.134e+02 1.730e+02
## omega 1.440e+00 8.236e-01 2.785e+00
##
## Posteriors of the ECx (quantiles):
##
## 50% 2.5% 97.5%
## EC10 7.706e+01 5.386e+01 1.166e+02
## EC20 9.524e+01 7.111e+01 1.349e+02
## EC30 1.097e+02 8.562e+01 1.484e+02
## EC40 1.231e+02 9.943e+01 1.607e+02
## EC50 1.368e+02 1.134e+02 1.730e+02
When the gamma Poisson model is selected, the summary shows an additional parameter called omega, which quantifies the inter-individual variability (the higher omega, the higher the variability).
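The effect of omega can be seen in a small simulation. This sketch assumes that, in the gamma Poisson model, the individual reproduction rate is drawn from a gamma distribution with mean f and variance omega times f (shape f/omega, rate 1/omega); that parameterization is an assumption to be checked against the modelling vignette. The number of individual-days, `nindtime`, is a made-up value.

```r
set.seed(1)
n <- 10000
f <- 17.78       # mean rate per individual-day (posterior median of d above)
omega <- 1.44    # overdispersion (posterior median of omega above)
nindtime <- 100  # hypothetical number of individual-days per replicate

# Poisson model: every replicate shares the same rate f
poisson_counts <- rpois(n, f * nindtime)

# Gamma Poisson model: each replicate gets its own rate around f
rates <- rgamma(n, shape = f / omega, rate = 1 / omega)
gp_counts <- rpois(n, rates * nindtime)

c(mean_rate = mean(rates), var_rate = var(rates), omega_times_f = omega * f)
c(var_poisson = var(poisson_counts), var_gamma_poisson = var(gp_counts))
```

Both models give the same expected number of offspring, but the gamma Poisson counts are far more dispersed, and increasingly so as omega grows.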
In morse, reproduction data sets are a special case of survival data sets: a reproduction data set contains the same information as a survival data set, plus the information on reproduction outputs. For that reason, the S3 class reproData inherits from the class survData, which means that any operation on a survData object is also valid on a reproData object. In particular, to use the plot function of the survival analysis on a reproData object, we can first use survData as a conversion function:
dat <- reproData(cadmium2)
plot(survData(dat))
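For readers unfamiliar with S3 inheritance, here is a self-contained base R sketch of the mechanism. The classes `toyRepro` and `toySurv`, the generic `total_survivors()` and the data are all hypothetical; they only mirror the relationship between reproData and survData.

```r
# Toy data with survival and reproduction columns (made-up values)
toy <- data.frame(conc = c(0, 0, 53),
                  time = c(0, 3, 3),
                  Nsurv = c(5, 5, 4),
                  Nrepro = c(0, 262, 120))

# The class vector lists the most specific class first
class(toy) <- c("toyRepro", "toySurv", "data.frame")

# A generic with a method defined only for the parent class
total_survivors <- function(x, ...) UseMethod("total_survivors")
total_survivors.toySurv <- function(x, ...) sum(x$Nsurv)

# No toyRepro method exists, so dispatch falls back to the toySurv method
total_survivors(toy)
inherits(toy, "toySurv")
```

Dispatch walks the class vector from left to right, so a method written for the parent class applies to the child class unless a more specific method is defined.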
In Bayesian inference, the parameters of a model are estimated from the data starting from a so-called prior, which is a probability distribution representing an initial guess on the true parameters, before seeing the data. The posterior distribution represents the uncertainty on the parameters after seeing the data and combining them with the prior. To obtain a point estimate of the parameters, it is typical to compute the mean or median of the posterior. We can quantify the uncertainty by reporting the standard deviation or an inter-quantile distance from this posterior distribution.
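These summaries are easy to compute once a sample from the posterior is available. The sketch below uses simulated draws as a stand-in for an MCMC sample of one parameter; the log-normal shape and its settings are made up for illustration and are not morse output.

```r
set.seed(42)
# Stand-in for posterior draws of one parameter (made-up distribution)
posterior_draws <- rlnorm(5000, meanlog = log(137), sdlog = 0.1)

# Point estimates
post_mean <- mean(posterior_draws)
post_median <- median(posterior_draws)

# Uncertainty: standard deviation and 95% credible interval
post_sd <- sd(posterior_draws)
post_q <- quantile(posterior_draws, probs = c(0.5, 0.025, 0.975))
post_q
```

The three quantiles are reported in the same order (50%, 2.5%, 97.5%) as in the summaries shown in this document.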