Introduction to hesim

2020-05-31

1 Overview

hesim supports three types of health-economic models: (i) cohort discrete time state transition models (cDTSTMs), (ii) N-state partitioned survival models (PSMs), and (iii) individual-level continuous time state transition models (iCTSTMs). cDTSTMs are Markov cohort models and can be time-homogeneous or time-inhomogeneous. iCTSTMs are individual-level simulations that can encompass both Markov and semi-Markov processes. All models are implemented as R6 classes and have methods for simulating disease progression, costs, and QALYs.

Economic model R6 class
N-state partitioned survival model (PSM) hesim::Psm
Cohort discrete time state transition model (cDTSTM) hesim::CohortDtstm
Individual-level continuous time state transition model (iCTSTM) hesim::IndivCtstm

Economic models can, in general, be created in two ways: first, with mathematical expressions using non-standard evaluation (note that this is currently only available for cDTSTMs), and, second, from specific statistical models. Each economic model consists of statistical models of disease progression, costs, and utilities. As shown in the figure, a typical analysis proceeds in a 3-step process:




  1. Parameterization: An economic model is parameterized by estimating statistical models for disease progression, costs, and utilities using “estimation” datasets, such as extracted data from published studies for a meta-analysis or multi-state survival data from a clinical trial.
  2. Simulation: The statistical models estimated in Step 1 are combined to construct an economic model. For a given model structure, disease progression, QALYs, and costs are simulated from “input data”, based on the target population and treatment strategies of interest.
  3. Decision analysis: Simulated outcomes from Step 2 are used to perform decision analysis using approaches such as cost-effectiveness analysis (CEA) and multi-criteria decision analysis (MCDA).

The entire analysis is inherently Bayesian, as uncertainty in the parameters from the statistical models is propagated throughout the economic model and decision analysis with probabilistic sensitivity analysis (PSA). Furthermore, since the statistical and economic models are integrated, patient heterogeneity can be easily introduced with patient level covariates.

2 Treatment strategies, target population, and model structure

Before beginning an analysis, it is necessary to define the treatment strategies of interest, the target population, and the model structure. This can be done in hesim by creating a hesim_data object with the function hesim_data(). Let’s consider an example where we use an iCTSTM to evaluate two competing treatment strategies. We will consider a model with three health states (healthy, sick, and dead) with four transitions (healthy -> sick, sick -> healthy, healthy -> dead, and sick -> dead). Since we are using an individual-level model, we must simulate a target population that is sufficiently large so that uncertainty reflects uncertainty in the model parameters, rather than variability across simulated individuals.

library("hesim")
library("data.table")
strategies <- data.table(strategy_id = c(1, 2))
n_patients <- 1000
patients <- data.table(patient_id = 1:n_patients,
                          age = rnorm(n_patients, mean = 45, sd = 7),
                          female = rbinom(n_patients, size = 1, prob = .51))
states <- data.table(state_id = c(1, 2),
                     state_name = c("Healthy", "Sick")) # Non-death health states
tmat <- rbind(c(NA, 1, 2),
              c(3, NA, 4),
              c(NA, NA, NA))
colnames(tmat) <- rownames(tmat) <- c("Healthy", "Sick", "Dead")
transitions <- create_trans_dt(tmat)
transitions[, trans := factor(transition_id)]
hesim_dat <- hesim_data(strategies = strategies,
                        patients = patients, 
                        states = states,
                        transitions = transitions)
print(hesim_dat)
## $strategies
##    strategy_id
## 1:           1
## 2:           2
## 
## $patients
##       patient_id      age female
##    1:          1 39.24173      0
##    2:          2 41.72205      1
##    3:          3 46.75134      0
##    4:          4 38.41198      1
##    5:          5 44.70204      0
##   ---                           
##  996:        996 52.78174      0
##  997:        997 52.45926      0
##  998:        998 44.04488      1
##  999:        999 46.42657      0
## 1000:       1000 39.03964      1
## 
## $states
##    state_id state_name
## 1:        1    Healthy
## 2:        2       Sick
## 
## $transitions
##    transition_id from to from_name to_name trans
## 1:             1    1  2   Healthy    Sick     1
## 2:             2    1  3   Healthy    Dead     2
## 3:             3    2  1      Sick Healthy     3
## 4:             4    2  3      Sick    Dead     4
## 
## attr(,"class")
## [1] "hesim_data"

3 Parameterization

3.1 Disease progression

As shown in the table below, the statistical model used to parameterize the disease model component of an economic model varies by the type of economic model. For example, multinomial logistic regressions can be used to parameterize a cDTSTM, a set of N-1 independent survival models are used to parameterize an N-state partitioned survival model, and multi-state models can be used to parameterize an iCTSTM.

Economic model (R6 class) Statistical model Parameter object Model fit object
hesim::CohortDtstm Custom hesim::tparams_transprobs hesim::define_model()
Multinomial logistic regressions hesim::params_mlogit_list hesim::multinom_list
hesim::Psm Independent survival models hesim::params_surv_list hesim::flexsurvreg_list
hesim::IndivCtstm Multi-state model (joint likelihood) hesim::params_surv flexsurv::flexsurvreg
Multi-state model (transition-specific) hesim::params_surv_list hesim::flexsurvreg_list

Disease models can either be fit from an explicit statistical model or through mathematical expressions. In the first case, the easiest way to parameterize a disease model is by fitting a statistical model using R. For example, survival models and multi-state models can be fit using flexsurv::flexsurvreg() while multinomial logistic regressions can be fit with nnet::multinom(). In other cases, the disease models will not be fit directly with R, but the estimates of a disease model can be directly stored in parameter (hesim::params) or transformed parameter (hesim::tparams) objects. In the second case, an entire model (encompassing disease progression, costs, and utility) can be defined in terms of mathematical expressions with hesim::define_model().

We will illustrate an example of a statistical model of disease progression fit with R by estimating a multi-state model with a joint likelihood using flexsurv::flexsurvreg().

library("flexsurv")
mstate_data <- data.table(mstate3_exdata$transitions)
mstate_data[, trans := factor(trans)]
fit_wei <- flexsurv::flexsurvreg(Surv(years, status) ~ trans + 
                                                       factor(strategy_id):trans +
                                                       age:trans + 
                                                       female: trans +
                                                       shape(trans), 
                                 data = mstate_data, 
                                 dist = "weibull")

3.2 Costs and utility

Costs and utilities can currently either be modeled using a linear model or using predicted means. The latter is an example of a transformed parameter object since the predicted means are parameters that are presumably a function of the underlying parameters of a statistical model and possibly input data.

Statistical model Parameter object Model fit object
Predicted means hesim::tparams_mean hesim::stateval_tbl
hesim::tparams_mean hesim::define_model()
Linear model hesim::params_lm stats::lm

Linear models are fit using stats::lm(). Predicted means can be constructed from a hesim::stateval_tbl object or as a part of a model defined in term of mathematical expressions with define_model(). The former is a special object used to assign values to health states that can vary across treatment strategies, patients, and/or time intervals. State values can be specified either as moments (i.e., mean and standard error) or parameters (e.g., shape and scale of gamma distribution) of a probability distribution, or by pre-simulating values from a suitable probability distribution. Here we will use hesim::stateval_tbl objects for utility and two cost categories (drug and medical).

# Utility
utility_tbl <- stateval_tbl(data.table(state_id = states$state_id,
                                       mean = mstate3_exdata$utility$mean,
                                       se = mstate3_exdata$utility$se),
                            dist = "beta",
                            hesim_data = hesim_dat)

# Costs
drugcost_tbl <- stateval_tbl(data.table(strategy_id = strategies$strategy_id,
                                       est = mstate3_exdata$costs$drugs$costs),
                            dist = "fixed",
                            hesim_data = hesim_dat) 
medcost_tbl <- stateval_tbl(data.table(state_id = states$state_id,
                                       mean = mstate3_exdata$costs$medical$mean,
                                       se = mstate3_exdata$costs$medical$se),
                            dist = "gamma",
                            hesim_data = hesim_dat)  

4 Simulation

4.1 Constructing an economic model

An economic model consists of a disease model, a utility model, and a set of cost models for each cost category. The utility and cost models are always hesim::StateVals objects, whereas the disease models vary by economic model. The disease model is used to simulate survival curves in a PSM and health state transitions in a cDTSTM and iCTSTM.

Economic model Disease model Utility model Cost model(s)
hesim::CohortDtstm hesim::CohortDtstmTrans hesim::StateVals hesim::StateVals
hesim::Psm hesim::PsmCurves hesim::StateVals hesim::StateVals
hesim::IndivCtstm hesim::IndivCtstmTrans hesim::StateVals hesim::StateVals

Since economic models in hesim are inherently Bayesian, we must specify the number of parameter samples we will use for the PSA before constructing the model.

n_samples <- 1000

4.1.1 Disease model

Models are constructed as a function of parameters (or model fits) and in regression models, input data. The input data must be objects of class expanded_hesim_data, which are data tables containing the covariates for the statistical model. In our multi-state model, each row is a unique treatment strategy, patient, and health-state transition. The ID variables (strategy_id, patient_id, and transition_id) are stored as attributes of the dataset.

An expanded_hesim_data object can be created directly or by expanding an object of class hesim_data using expand.hesim_data(). Here, we will use the latter approach,

transmod_data <- expand(hesim_dat, 
                        by = c("strategies", "patients", "transitions"))
head(transmod_data)
##    strategy_id patient_id transition_id      age female from to from_name
## 1:           1          1             1 39.24173      0    1  2   Healthy
## 2:           1          1             2 39.24173      0    1  3   Healthy
## 3:           1          1             3 39.24173      0    2  1      Sick
## 4:           1          1             4 39.24173      0    2  3      Sick
## 5:           1          2             1 41.72205      1    1  2   Healthy
## 6:           1          2             2 41.72205      1    1  3   Healthy
##    to_name trans
## 1:    Sick     1
## 2:    Dead     2
## 3: Healthy     3
## 4:    Dead     4
## 5:    Sick     1
## 6:    Dead     2
attr(transmod_data, "id_vars")
## [1] "strategy_id"   "patient_id"    "transition_id"

We can now construct the health state transition model, which creates an IndivCtstmTrans object that can be used to simulate health state transitions.

transmod <- create_IndivCtstmTrans(fit_wei, transmod_data,
                                   trans_mat = tmat, n = n_samples)
class(transmod)
## [1] "IndivCtstmTrans" "CtstmTrans"      "R6"

4.1.2 Cost and utility models

Since we are using predicted means for costs and utilities, we do not need to specify input data. Instead, we can construct the cost and utility models directly from the stateval_tbl objects.

# Utility
utilitymod <- create_StateVals(utility_tbl, n = n_samples)

# Costs
drugcostmod <- create_StateVals(drugcost_tbl, n = n_samples)
medcostmod <- create_StateVals(medcost_tbl, n = n_samples)
costmods <- list(drugs = drugcostmod,
                 medical = medcostmod)

4.1.3 Combining the disease progression, cost, and utility models

Once the disease, utility, and cost model have been constructed, we combine them to create the full economic model.

ictstm <- IndivCtstm$new(trans_model = transmod,
                         utility_model = utilitymod,
                         cost_models = costmods)

4.2 Simulating outcomes

Each economic model contains methods (i.e., functions) for simulating disease progression, costs, and QALYs. These methods are listed in the table below.

Economic model (R6 class) Disease progression QALYs Costs
hesim::CohortDtstm $sim_stateprobs() $sim_qalys() $sim_costs()
hesim::Psm $sim_survival() and $sim_stateprobs() $sim_qalys() $sim_costs()
hesim::IndivCtstm $sim_disease() and $sim_stateprobs() $sim_qalys() $sim_costs()

Although all models simulate state probabilities, they do so in different ways. The cDTSTM uses discrete time Markov chains, the PSM calculates differences in probabilities from simulated survival curves, and the iCTSTM aggregates individual trajectories simulated using random number generation. The individual-level simulation is advantageous because it can be used for semi-Markov processes where transition rates depend on time since entering a health state (rather than time since the start of the model).

In the cohort models, costs and QALYs are computed as a function of the state probabilities whereas in individual-level models they are based on the simulated individual trajectories. Like the disease model, the individual-level simulation is more flexible because costs and QALYs can depend on time since entering the health state.

We illustrate with the iCTSTM. First we simulate disease progression for each patient.

ictstm$sim_disease()
head(ictstm$disprog_)
##    sample strategy_id patient_id grp_id from to final time_start time_stop
## 1:      1           1          1      1    1  2     0   0.000000  3.086192
## 2:      1           1          1      1    2  1     0   3.086192  5.108223
## 3:      1           1          1      1    1  2     0   5.108223 16.289004
## 4:      1           1          1      1    2  1     0  16.289004 16.339959
## 5:      1           1          1      1    1  2     0  16.339959 19.802941
## 6:      1           1          1      1    2  1     0  19.802941 20.945194

The disease trajectory can be summarized with $sim_stateprobs().

ictstm$sim_stateprobs(t = c(0:10))
head(ictstm$stateprobs_)
##    sample strategy_id grp_id state_id t  prob
## 1:      1           1      1        1 0 1.000
## 2:      1           1      1        1 1 0.849
## 3:      1           1      1        1 2 0.772
## 4:      1           1      1        1 3 0.690
## 5:      1           1      1        1 4 0.644
## 6:      1           1      1        1 5 0.616

We can then compute costs and QALYs (using a discount rate of 3 percent).

# QALYs
ictstm$sim_qalys(dr = .03)
head(ictstm$qalys_)
##    sample strategy_id grp_id state_id   dr     qalys      lys
## 1:      1           1      1        1 0.03 8.1605070 9.761295
## 2:      1           1      1        2 0.03 1.8982012 2.123352
## 3:      1           2      1        1 0.03 8.2143181 9.825662
## 4:      1           2      1        2 0.03 0.9982393 1.116643
## 5:      2           1      1        1 0.03 3.4363922 9.710236
## 6:      2           1      1        2 0.03 2.2454015 2.245998
# Costs
ictstm$sim_costs(dr = .03)
head(ictstm$costs_)
##    sample strategy_id grp_id state_id   dr category    costs
## 1:      1           1      1        1 0.03    drugs 48806.48
## 2:      1           1      1        2 0.03    drugs 10616.76
## 3:      1           2      1        1 0.03    drugs 98256.62
## 4:      1           2      1        2 0.03    drugs 11166.43
## 5:      2           1      1        1 0.03    drugs 48551.18
## 6:      2           1      1        2 0.03    drugs 11229.99

5 Decision analysis

Once output has been simulated with an economic model, a decision analysis can be performed. Cost-effectiveness analyses can be performed using other R packages such as BCEA or directly with hesim as described in more detail here. hesim does not currently provide support for MCDA.

To perform a CEA, simulated costs and QALYs can be summarized to create a ce object, which contains mean costs and QALYs for each sample from the PSA by treatment strategy.

ce <- ictstm$summarize()
print(ce)
## $costs
##       category   dr sample strategy_id     costs grp_id
##    1:    drugs 0.03      1           1  59423.24      1
##    2:    drugs 0.03      1           2 109423.06      1
##    3:    drugs 0.03      2           1  59781.17      1
##    4:    drugs 0.03      2           2 101015.24      1
##    5:    drugs 0.03      3           1  40831.21      1
##   ---                                                  
## 5996:    total 0.03    998           2 119969.83      1
## 5997:    total 0.03    999           1  55168.78      1
## 5998:    total 0.03    999           2 110556.40      1
## 5999:    total 0.03   1000           1  61038.94      1
## 6000:    total 0.03   1000           2 112301.41      1
## 
## $qalys
##         dr sample strategy_id     qalys grp_id
##    1: 0.03      1           1 10.058708      1
##    2: 0.03      1           2  9.212557      1
##    3: 0.03      2           1  5.681794      1
##    4: 0.03      2           2  4.392129      1
##    5: 0.03      3           1  6.268148      1
##   ---                                         
## 1996: 0.03    998           2  7.713205      1
## 1997: 0.03    999           1  5.457914      1
## 1998: 0.03    999           2  6.019683      1
## 1999: 0.03   1000           1  5.701389      1
## 2000: 0.03   1000           2  5.833973      1
## 
## attr(,"class")
## [1] "ce"

The functions icea() and icea_pw, which perform individualized cost-effectiveness analysis and incremental individualized cost-effectiveness analysis, respectively, can be used.

icea <- icea(ce, dr_qalys = .03, dr_costs = .03)
icea_pw <- icea_pw(ce, dr_qalys = .03, dr_costs = .03, comparator = 1)

For instance, we might want to plot a cost-effectiveness acceptability curve (CEAC) displaying the probability that treatment strategy 2 is more cost-effective than treatment strategy 1 at a given willingness to pay for a QALY.

library("ggplot2")
ggplot2::ggplot(icea_pw$ceac, aes(x = k, y = prob, col = factor(strategy_id))) +
  geom_line() + xlab("Willingness to pay") +
  ylab("Probability most cost-effective") +
  scale_x_continuous(breaks = seq(0, 200000, 100000), label = scales::dollar) +
  theme(legend.position = "bottom") + scale_colour_discrete(name = "Strategy") + 
  theme_minimal()