SEMinR brings a friendly syntax to creating and estimating structural equation models (SEM). The syntax allows applied practitioners of SEM to use terminology that is very close to their familiar modeling terms (e.g., reflective, composite, interactions) instead of specifying underlying matrices and covariances. SEM models can be estimated either using Partial Least Squares Path Modeling (PLS-PM) as popularized by SmartPLS, or using Covariance-Based Structural Equation Modeling (CBSEM) as popularized by LISREL and AMOS. Confirmatory Factor Analysis (CFA) of reflective measurement models is also supported. Both CBSEM and CFA estimation use the Lavaan package.
SEMinR uses its own PLS-PM estimation engine and integrates with the Lavaan package for CBSEM/CFA estimation. It also brings a few methodological advancements not found in other packages or software, and encourages best practices wherever possible.
PLS-PM advances and best-practices in SEMinR:
CBSEM/CFA advances and best-practices in SEMinR:
Briefly, there are three steps to specifying and estimating a structural equation model using SEMinR. The following example is generic to either PLS-PM or CBSEM/CFA.
# Distinguish and mix composite measurement (used in PLS-PM)
# or reflective (common-factor) measurement (used in CBSEM, CFA, and PLSc)
# - We will first use composites in PLS-PM analysis
# - Later we will convert the composites into reflectives for CFA/CBSEM (step 3)
measurements <- constructs(
  composite("Image", multi_items("IMAG", 1:5)),
  composite("Expectation", multi_items("CUEX", 1:3)),
  composite("Value", multi_items("PERV", 1:2)),
  composite("Satisfaction", multi_items("CUSA", 1:3)),
  interaction_term(iv = "Image", moderator = "Expectation")
)
# Quickly create multiple paths "from" and "to" sets of constructs
structure <- relationships(
  paths(from = c("Image", "Expectation", "Image*Expectation"), to = "Value"),
  paths(from = "Value", to = "Satisfaction")
)
# Estimate using PLS-PM from model parts defined earlier
pls_model <- estimate_pls(data = mobi,
                          measurement_model = measurements,
                          structural_model = structure)
summary(pls_model)
# note: PLS requires separate bootstrapping for PLS path estimates
# SEMinR uses multi-core parallel processing to speed up bootstrapping
boot_estimates <- bootstrap_model(pls_model, nboot = 1000, cores = 2)
summary(boot_estimates)
# Alternatively, we could estimate our model using CBSEM, which uses the Lavaan package
# We often wish to conduct a CFA of our measurement model prior to CBSEM
# note: we must convert composites in our measurement model into reflective constructs for CFA/CBSEM
cfa_model <- estimate_cfa(data = mobi, as.reflective(measurements))
summary(cfa_model)
cbsem_model <- estimate_cbsem(data = mobi, as.reflective(measurements), structure)
summary(cbsem_model)
# note: the Lavaan syntax and Lavaan fitted model can be extracted for your own specific needs
cbsem_model$lavaan_syntax
cbsem_model$lavaan_model
SEMinR seeks to combine ease of use, flexible model construction, and high performance. Below, we will cover the details and options of each of the three parts of model construction and estimation demonstrated above.
You must install the SEMinR library once on your local machine:
install.packages("seminr")
And then load it in every session you want to use it:
library(seminr)
You must load your data into a dataframe from any source you wish (CSV, etc.). Column names must be names of your measurement items.
Important: Avoid using asterisks ('*') in your column names (these are reserved for interaction terms).
survey_data <- read.csv("mobi_survey_data.csv")
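Before estimation, it can help to check your column names for the reserved character. The following is a minimal base R sketch, not part of SEMinR; the data frame `example_data` and the column name `IMAG*2` are made up for illustration, and renaming with an underscore is just one possible convention.

```r
# Hypothetical survey data: the second column name contains a reserved '*'
example_data <- data.frame(
  "IMAG1"  = c(7, 10, 7),
  "IMAG*2" = c(5, 9, 7),
  check.names = FALSE  # keep the names exactly as typed
)

# Find column names that contain a literal asterisk
has_asterisk <- grepl("*", names(example_data), fixed = TRUE)
bad_columns <- names(example_data)[has_asterisk]
print(bad_columns)

# One possible fix (an assumption, not a SEMinR requirement): use an underscore
names(example_data) <- gsub("*", "_", names(example_data), fixed = TRUE)
print(names(example_data))
```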
For demonstration purposes, we will start with a dataset bundled with the seminr package: the mobi data frame (also found in the semPLS R package). This dataset comes from a measurement instrument for the European Customer Satisfaction Index (ECSI) adapted to the mobile phone market (Tenenhaus et al. 2005).
You can see a description and sample of what is in mobi:
dim(mobi)
#> [1] 250 24
head(mobi)
#>   CUEX1 CUEX2 CUEX3 CUSA1 CUSA2 CUSA3 CUSCO CUSL1 CUSL2 CUSL3 IMAG1 IMAG2 IMAG3
#> 1     7     7     6     6     4     7     7     6     5     6     7     5     5
#> 2    10    10     9    10    10     8    10    10     2    10    10     9    10
#> 3     7     7     7     8     7     7     6     6     2     7     8     7     6
#> 4     7    10     5    10    10    10     5    10     4    10    10    10     5
#> 5     8     7    10    10     8     8     5    10     3     8    10    10     5
#> 6    10     9     7     8     7     7     8    10     3    10     8     9    10
#>   IMAG4 IMAG5 PERQ1 PERQ2 PERQ3 PERQ4 PERQ5 PERQ6 PERQ7 PERV1 PERV2
#> 1     5     4     7     6     4     7     6     5     5     2     3
#> 2    10     9    10     9    10    10     9    10    10    10    10
#> 3     4     7     7     8     5     7     8     7     7     7     7
#> 4     5    10     8    10    10     8     4     5     8     5     5
#> 5     8     9    10     9     8    10     9     9     8     6     6
#> 6     8     9     9    10     9    10     8     9     9    10    10
SEMinR uses the following functions to describe measurement models:
+ `constructs()` gathers all the construct measurement models
+ `composite()` or `reflective()` define the measurement mode of individual constructs
+ `interaction_term()` specifies interactions and `higher_composite()` specifies higher-order constructs
+ `multi_items()` or `single_item()` define the items of a construct
These functions should feel natural to SEM practitioners and encourage them to explicitly specify the core nature of their measurement models: composite or common-factor (see Sarstedt et al., 2016, and Henseler et al., 2013, for clear definitions).
Let’s take a closer look at the individual functions.
constructs() compiles the measurement model specification list from the user-specified construct descriptions passed as its parameters. You must supply it with any number of individual composite, reflective, interaction_term, or higher_composite constructs. Note that we currently only support higher-order constructs for PLS-PM estimation (i.e., composites).
measurements <- constructs(
  composite("Image", multi_items("IMAG", 1:5), weights = mode_B),
  composite("Expectation", multi_items("CUEX", 1:3), weights = regression_weights),
  composite("Quality", multi_items("PERQ", 1:7), weights = mode_A),
  composite("Value", multi_items("PERV", 1:2), weights = correlation_weights),
  reflective("Satisfaction", multi_items("CUSA", 1:3)),
  reflective("Complaints", single_item("CUSCO")),
  higher_composite("HOC", c("Value", "Satisfaction"), orthogonal, mode_A),
  interaction_term(iv = "Image", moderator = "Expectation", method = orthogonal, weights = mode_A),
  reflective("Loyalty", multi_items("CUSL", 1:3))
)
We are storing the measurement model in the measurements object for later use.
Note that neither a dataset nor a structural model is specified in the measurement model stage, so we can reuse the measurement model object measurements across different datasets and structural models.
composite() or reflective() describe the measurement of a construct by its items.
For example, we can use composite() for PLS models to describe mode A (correlation weights) for the “Expectation” construct with manifest variables CUEX1, CUEX2, and CUEX3:
composite("Expectation", multi_items("CUEX", 1:3), weights = mode_A)
# is equivalent to:
composite("Expectation", multi_items("CUEX", 1:3), weights = correlation_weights)
We can describe composite “Image” using mode B (regression weights) with manifest variables IMAG1, IMAG2, IMAG3, IMAG4 and IMAG5:
composite("Image", multi_items("IMAG", 1:5), weights = mode_B)
# is equivalent to:
composite("Image", multi_items("IMAG", 1:5), weights = regression_weights)
Alternatively, we can use reflective() for CBSEM/CFA/PLSc to describe the reflective, common-factor measurement of the “Satisfaction” construct with manifest variables CUSA1, CUSA2, and CUSA3:
reflective("Satisfaction", multi_items("CUSA", 1:3))
For covariance-based SEM and CFA, you will want constructs to be reflective common factors. If you already have composite constructs or measurement models, you may use them for CBSEM/CFA after converting them to reflective versions. The as.reflective() function can convert either a single construct or an entire measurement model into reflective form.
# Coerce a composite into reflective form
img_composite <- composite("Image", multi_items("IMAG", 1:5))
img_reflective <- as.reflective(img_composite)
# Coerce all constructs of a measurement model into reflective form
mobi_composites <- constructs(
  composite("Image", multi_items("IMAG", 1:5)),
  composite("Expectation", multi_items("CUEX", 1:3)),
  reflective("Complaints", single_item("CUSCO"))
)
mobi_reflective <- as.reflective(mobi_composites)
SEMinR strives to make specification of measurement items shorter and cleaner using multi_items() or single_item():
+ `multi_items()` creates a vector of multiple measurement items with similar names
+ `single_item()` describes a single measurement item
We can describe the manifest variables IMAG1, IMAG2, IMAG3, IMAG4 and IMAG5:
multi_items("IMAG", 1:5)
# which is equivalent to the R vector:
c("IMAG1", "IMAG2", "IMAG3", "IMAG4", "IMAG5")
If your items are not numbered sequentially, you can combine the item numbers using the c() function:
multi_items("IMAG", c(1, 3:5))
# which is equivalent to the R vector:
c("IMAG1", "IMAG3", "IMAG4", "IMAG5")
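Conceptually, multi_items() pastes a shared prefix onto each item number. The base R sketch below mirrors that idea with a hypothetical helper, `item_names()`, which is not a SEMinR function, and then checks the resulting names against the columns of a made-up data frame. This kind of check can catch a misnumbered item before estimation.

```r
# Hypothetical helper (not part of SEMinR): build item names from a prefix and numbers
item_names <- function(prefix, numbers) {
  paste0(prefix, numbers)
}

image_items <- item_names("IMAG", c(1, 3:5))
print(image_items)

# Made-up data frame standing in for your survey data
example_data <- data.frame(IMAG1 = 1:3, IMAG3 = 1:3, IMAG4 = 1:3)

# Items named in the model but absent from the data
missing_items <- setdiff(image_items, colnames(example_data))
print(missing_items)
```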
multi_items() is used in conjunction with composite() or reflective() to describe a composite or a common-factor construct, respectively.
We can describe a single manifest variable CUSCO:
single_item("CUSCO")
# which is equivalent to the R character string:
"CUSCO"
Note that single-item constructs can be defined as either composite mode A or reflective common-factor. Either way, a single-item construct is essentially a composite whose construct scores are fully determined by its one item.
Covariance-based SEM models generally constrain all item errors to be unrelated. However, researchers might sometimes wish to free up covariances between item errors for estimation.
# The following specifies that items PERQ1 and PERQ2 covary with each other, both covary with IMAG1
mobi_am <- associations(
item_errors("PERQ1", "PERQ2"),
item_errors(c("PERQ1", "PERQ2"), "IMAG1")
)
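In lavaan model syntax, a covariance between two variables is written with the `~~` operator. As a rough illustration of what freeing these error covariances means, the base R sketch below writes one such line for each pair specified above. The helper `error_covariances()` is hypothetical, and whether SEMinR generates its lavaan syntax in exactly this form is an assumption; inspect cbsem_model$lavaan_syntax to see what is actually produced.

```r
# Hypothetical helper (not part of SEMinR): pair every item in `left` with every
# item in `right` and write each pair as a lavaan covariance line
error_covariances <- function(left, right) {
  pairs <- expand.grid(left = left, right = right, stringsAsFactors = FALSE)
  paste(pairs$left, "~~", pairs$right)
}

syntax_lines <- c(
  error_covariances("PERQ1", "PERQ2"),
  error_covariances(c("PERQ1", "PERQ2"), "IMAG1")
)
cat(syntax_lines, sep = "\n")
```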
Creating interaction terms by hand can be time-consuming and error-prone. SEMinR provides high-level functions for simply creating interactions between constructs.
Interaction terms are described in the measurement model function constructs() using the following methods:
+ `product_indicator` describes a single interaction composite generated by the scaled product-indicator method described by Henseler and Chin (2010).
+ `two_stage` describes a single-item interaction composite that uses a product of the IV and moderator construct scores. For PLS-PM, the first stage uses PLS-PM as described by Henseler and Chin (2010), whereas for CBSEM the first stage uses a CFA and extracts ten Berge factor scores.
+ `orthogonal` describes a single interaction composite generated by the orthogonalization method of Henseler and Chin (2010). It is more typically used with composites, to help interpret multicollinearity between product terms.
For these methods, the standard deviation of the interaction term is adjusted as noted below.
For example, we can describe the following interactions between Image and Expectation constructs:
# By default, interaction terms are computed using two stage procedures
interaction_term(iv = "Image", moderator = "Expectation")
# You can also explicitly specify how to create the interaction term
interaction_term(iv = "Image", moderator = "Expectation", method = two_stage)
interaction_term(iv = "Image", moderator = "Expectation", method = product_indicator)
interaction_term(iv = "Image", moderator = "Expectation", method = orthogonal)
Note that these functions themselves return functions (closures) that are not resolved until processed in the estimate_pls() or estimate_cbsem() functions for SEM estimation. Note also that PLS models must adjust the standard deviation of the interaction term because: “In general, the product of two standardized variables does not equal the standardized product of these variables” (Henseler and Chin 2010). SEMinR makes this adjustment automatically, which improves the accuracy of model estimates.
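The quoted statement can be checked numerically. The base R sketch below uses made-up scores and only illustrates why some adjustment is needed; it does not reproduce the adjustment SEMinR applies.

```r
# Made-up scores for two constructs
x <- c(1, 2, 3, 4, 5)
z <- c(2, 1, 4, 3, 6)

# Standardize each variable (mean 0, standard deviation 1)
x_std <- as.numeric(scale(x))
z_std <- as.numeric(scale(z))

# Product of the standardized variables
product_of_std <- x_std * z_std

# Standardized product of the raw variables
std_of_product <- as.numeric(scale(x * z))

# The first is not itself standardized; the second is by construction
print(c(mean = mean(product_of_std), sd = sd(product_of_std)))
print(c(mean = mean(std_of_product), sd = sd(std_of_product)))
```

The product of the standardized scores has neither zero mean nor unit standard deviation, so treating it as a standardized variable would distort the estimates.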
Important Note: SEMinR syntax uses an asterisk “*” as a naming convention for the interaction construct. Thus, the “Image” + “Expectation” interaction is called “Image*Expectation” in the structural model below. Please refrain from using an asterisk "*" in the naming of non-interaction constructs.
SEMinR makes structural model specification human-readable and explicit using these functions:
+ `relationships()` gathers all the structural relationships between all constructs
+ `paths()` specifies relationships between sets of antecedents and outcomes
relationships() compiles the structural model source-target list from the user-specified structural path descriptions passed as its parameters.
For example, we can describe a structural model for the mobi data:
mobi_sm <- relationships(
  paths(from = "Image", to = c("Expectation", "Satisfaction", "Loyalty")),
  paths(from = "Expectation", to = c("Quality", "Value", "Satisfaction")),
  paths(from = "Quality", to = c("Value", "Satisfaction")),
  paths(from = "Value", to = c("Satisfaction")),
  paths(from = "Satisfaction", to = c("Complaints", "Loyalty")),
  paths(from = "Complaints", to = "Loyalty")
)
Note that neither a dataset nor a measurement model is specified in the structural model stage, so we can reuse the structural model object mobi_sm across different datasets and measurement models.
paths() describes single or multiple structural paths between sets of constructs.
For example, we can define paths from a single antecedent construct to a single outcome construct:
# "Image" -> "Expectation"
paths(from = "Image", to = "Expectation")
Or paths from a single antecedent to multiple outcomes:
# "Image" -> "Expectation"
# "Image" -> "Satisfaction"
paths(from = "Image", to = c("Expectation", "Satisfaction"))
Or paths from multiple antecedents to a single outcome:
# "Image" -> "Satisfaction"
# "Expectation" -> "Satisfaction"
paths(from = c("Image", "Expectation"), to = "Satisfaction")
Or paths from multiple antecedents to a common set of outcomes:
# "Expectation" -> "Value"
# "Expectation" -> "Satisfaction"
# "Quality" -> "Value"
# "Quality" -> "Satisfaction"
paths(from = c("Expectation", "Quality"), to = c("Value", "Satisfaction"))
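The expansion of multiple antecedents and outcomes is a Cartesian product: every antecedent is paired with every outcome. The base R sketch below shows that pairing with expand.grid(). It only illustrates the idea and is not SEMinR's internal representation of paths, and the row order it prints is not meaningful.

```r
antecedents <- c("Expectation", "Quality")
outcomes <- c("Value", "Satisfaction")

# Every antecedent is paired with every outcome
path_table <- expand.grid(from = antecedents, to = outcomes,
                          stringsAsFactors = FALSE)
print(path_table)

# Two antecedents and two outcomes give four paths
print(nrow(path_table))
```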
Even the most complicated structural models become quick and easy to specify and modify.
SEMinR can estimate a CFA or a full SEM model described by the measurement and structural models above:
+ `estimate_pls()` estimates the parameters of a PLS-SEM model
+ `estimate_cfa()` estimates the parameters of a CFA model using the Lavaan package
+ `estimate_cbsem()` estimates the parameters of a CBSEM model using the Lavaan package
The above functions take some combination of the following parameters:
+ `data`: the dataset containing the measurement model items specified in constructs()
+ `measurement_model`: the measurement model described by the constructs() function
+ `structural_model` (PLS-PM and CBSEM only): the structural model described by the relationships() function
+ `inner_weights` (PLS-PM only): the weighting scheme for path estimation, either `path_weighting` for path weighting (default) or `path_factorial` for factor weighting (Lohmöller 1989)
For example, we can estimate a simple SEM model adapted from the structural and measurement model with interactions described thus far:
# define measurement model
mobi_mm <- constructs(
  composite("Image", multi_items("IMAG", 1:5)),
  composite("Expectation", multi_items("CUEX", 1:3)),
  composite("Value", multi_items("PERV", 1:2)),
  composite("Satisfaction", multi_items("CUSA", 1:3)),
  interaction_term(iv = "Image", moderator = "Expectation"),
  interaction_term(iv = "Image", moderator = "Value")
)
# define structural model
# note: an interaction construct is named by its main constructs joined by a '*'
mobi_sm <- relationships(
  paths(to = "Satisfaction",
        from = c("Image", "Expectation", "Value",
                 "Image*Expectation", "Image*Value"))
)
mobi_pls <- estimate_pls(
  data = mobi,
  measurement_model = mobi_mm,
  structural_model = mobi_sm,
  inner_weights = path_weighting
)
#> Generating the seminr model
#> All 250 observations are valid.
mobi_cfa <- estimate_cfa(
data = mobi,
measurement_model = as.reflective(mobi_mm)
)
#> Generating the seminr model for CFA
mobi_cbsem <- estimate_cbsem(
data = mobi,
measurement_model = as.reflective(mobi_mm),
structural_model = mobi_sm
)
#> Generating the seminr model for CBSEM
Dijkstra and Henseler (2015) offer an adjustment to generate consistent weight and path estimates of common factors estimated using PLS-PM. When estimating PLS-PM models using estimate_pls(), SEMinR automatically adjusts to produce consistent estimates of coefficients for common factors defined using reflective().
Note: SEMinR also uses PLSc on PLS models with interactions involving reflective constructs. PLS models with interactions can be estimated as PLS consistent, but are subject to some bias as per Becker et al. (2018). It is not uncommon for bootstrapping PLSc models to result in errors due to the calculation of the adjustment.
SEMinR can conduct high performance bootstrapping.
bootstrap_model() bootstraps a SEMinR model previously estimated using estimate_pls().
This function takes the following parameters:
+ `seminr_model`: a SEM model provided by estimate_pls()
+ `nboot`: the number of bootstrap subsamples to generate
+ `cores`: if your computer supports multi-core processing, the number of cores to utilize for parallel processing (default is NULL, wherein SEMinR will automatically detect and utilize all available cores)
For example, we can bootstrap the model described above:
# use 1000 bootstraps and utilize 2 parallel cores
boot_mobi_pls <- bootstrap_model(seminr_model = mobi_pls,
nboot = 1000,
cores = 2)
#> Bootstrapping model using seminr...
#> SEMinR Model successfully bootstrapped
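If you prefer to choose the number of cores yourself rather than rely on automatic detection, the parallel package that ships with R can report how many cores are visible. The sketch below is one possible approach; leaving one core free is a personal convention assumed here, not a SEMinR default.

```r
library(parallel)  # ships with base R

# Number of logical cores R can see; may be NA on some platforms
available_cores <- detectCores()

# A conservative choice (an assumption, not a SEMinR default): leave one core free
cores_to_use <- if (is.na(available_cores)) 1 else max(1, available_cores - 1)
print(cores_to_use)
```

The resulting value can then be passed as the cores argument of bootstrap_model().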
bootstrap_model() returns an object of class boot_seminr_model which contains the following accessible objects:
+ `boot_seminr_model$boot_paths` an array of the nboot estimated bootstrap sample path coefficient matrices
+ `boot_seminr_model$boot_loadings` an array of the nboot estimated bootstrap sample item loadings matrices
+ `boot_seminr_model$boot_weights` an array of the nboot estimated bootstrap sample item weights matrices
+ `boot_seminr_model$boot_HTMT` an array of the nboot estimated bootstrap sample model HTMT matrices
+ `boot_seminr_model$paths_descriptives` a matrix of the bootstrap path coefficients and standard deviations
+ `boot_seminr_model$loadings_descriptives` a matrix of the bootstrap item loadings and standard deviations
+ `boot_seminr_model$weights_descriptives` a matrix of the bootstrap item weights and standard deviations
+ `boot_seminr_model$HTMT_descriptives` a matrix of the bootstrap model HTMT and standard deviations
Notably, bootstrapping can also be meaningfully applied to models containing interaction terms and readjusts the interaction term (Henseler and Chin 2010) for every sub-sample. This leads to slightly increased processing times, but provides accurate estimations.
There are multiple ways of reporting the estimated model. The estimate_pls() function returns an object of class seminr_model. This can be passed directly to the base R function summary(), which can be used in two primary ways:
summary(seminr_model) reports \(R^{2}\), adjusted \(R^{2}\), path coefficients for the structural model, and the construct reliability metrics \(\rho_{C}\), also known as composite reliability (Dillon and Goldstein 1987), AVE (Fornell and Larcker 1981), and \(\rho_{A}\) (Dijkstra and Henseler 2015).
summary(mobi_pls)
#>
#> Results from package seminr (1.1.0)
#>
#> Path Coefficients:
#>                   Satisfaction
#> R^2                      0.614
#> AdjR^2                   0.606
#> Image                    0.470
#> Expectation              0.132
#> Value                    0.320
#> Image*Expectation       -0.140
#> Image*Value              0.023
#>
#> Reliability:
#>                    rhoC   AVE rhoA
#> Image             0.818 0.478    1
#> Expectation       0.733 0.481    1
#> Value             0.918 0.848    1
#> Image*Expectation 0.833 0.291    1
#> Image*Value       0.918 0.574    1
#> Satisfaction      0.871 0.693    1
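For intuition about the reliability columns, the sketch below computes AVE and composite reliability from a set of standardized loadings using their textbook formulas: AVE is the mean squared loading, and \(\rho_{C} = (\sum\lambda)^{2} / ((\sum\lambda)^{2} + \sum(1-\lambda^{2}))\). The loadings are made up for illustration, and this is not SEMinR's internal code.

```r
# Made-up standardized loadings for a three-item construct
loadings <- c(0.7, 0.8, 0.9)

# Average variance extracted: mean of the squared loadings
ave <- mean(loadings^2)

# Composite reliability (rhoC): squared sum of loadings over
# squared sum of loadings plus summed error variances
sum_sq <- sum(loadings)^2
error_variance <- sum(1 - loadings^2)
rho_c <- sum_sq / (sum_sq + error_variance)

print(round(c(AVE = ave, rhoC = rho_c), 3))
```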
model_summary <- summary(seminr_model) returns an object of class summary.seminr_model which contains the following accessible objects (these might vary depending on whether the model is CBSEM or PLS):
+ `model_summary$descriptives` reports the descriptive statistics and correlations for both items and constructs
+ `model_summary$paths` reports the matrix of path coefficients, \(R^{2}\), and adjusted \(R^{2}\)
+ `model_summary$reliability` reports composite reliability (\(\rho_{C}\)), average variance extracted (AVE), and \(\rho_{A}\)
+ `model_summary$loadings` reports the estimated loadings of the measurement model
+ `model_summary$weights` reports the estimated weights of the measurement model
+ `model_summary$construct_scores` reports the construct scores of composites
+ `model_summary$vif_items` reports the Variance Inflation Factor (VIF) for the measurement model
+ `model_summary$vif_antecedents` reports the Variance Inflation Factor (VIF) for the structural model
+ `model_summary$fSquare` reports the effect sizes (\(f^{2}\)) for the structural model
+ `model_summary$htmt` reports the HTMT for the structural model
+ `model_summary$iterations` (PLS only) reports the number of iterations to converge on a stable model
+ `model_summary$cross_loadings` (PLS only) reports all possible loadings between constructs and items
Please note that common-factor scores are indeterminate, and therefore construct scores for reflective common factors are extracted using a ten Berge procedure.
As with the estimated model, there are multiple ways of reporting the bootstrapping of a PLS model. The bootstrap_model() function returns an object of class boot_seminr_model. This can be passed directly to the base R function summary(), which can be used in two primary ways:
summary(boot_seminr_model) reports t-values and p-values for the structural paths.
Get information about bootstrapped PLS models using the summary() function on the bootstrapped model object.
summary(boot_mobi_pls)
boot_model_summary <- summary(boot_seminr_model) returns an object of class summary.boot_seminr_model which contains the following accessible objects:
+ `boot_model_summary$nboot` reports the number of bootstraps performed
+ `boot_model_summary$bootstrapped_paths` reports a matrix of direct paths and their standard deviations, t-values, and confidence intervals
+ `boot_model_summary$bootstrapped_weights` reports a matrix of measurement model weights and their standard deviations, t-values, and confidence intervals
+ `boot_model_summary$bootstrapped_loadings` reports a matrix of measurement model loadings and their standard deviations, t-values, and confidence intervals
+ `boot_model_summary$bootstrapped_HTMT` reports a matrix of HTMT values and their standard deviations, t-values, and confidence intervals
The summary(boot_seminr_model) function will return t-values and confidence intervals for direct structural paths in PLS models. However, the confidence_interval() function can be used to evaluate the confidence intervals for specific paths, direct and mediated (Zhao et al., 2010), in a boot_seminr_model object returned by the bootstrap_model() function.
This function takes the following parameters:
+ `boot_seminr_model`: a bootstrapped SEMinR model returned by bootstrap_model()
+ `from`: the antecedent construct for the structural path
+ `to`: the outcome construct for the structural path
+ `through`: the mediator construct, if the path is mediated (default is NULL)
+ `alpha`: the required level of alpha (default is 0.05)
It returns a specific confidence interval using the percentile method as per Henseler et al. (2014).
mobi_mm <- constructs(
composite("Image", multi_items("IMAG", 1:5)),
composite("Expectation", multi_items("CUEX", 1:3)),
composite("Quality", multi_items("PERQ", 1:7)),
composite("Value", multi_items("PERV", 1:2)),
composite("Satisfaction", multi_items("CUSA", 1:3)),
composite("Complaints", single_item("CUSCO")),
composite("Loyalty", multi_items("CUSL", 1:3))
)
# Creating structural model
mobi_sm <- relationships(
paths(from = "Image", to = c("Expectation", "Satisfaction", "Loyalty")),
paths(from = "Expectation", to = c("Quality", "Value", "Satisfaction")),
paths(from = "Quality", to = c("Value", "Satisfaction")),
paths(from = "Value", to = c("Satisfaction")),
paths(from = "Satisfaction", to = c("Complaints", "Loyalty")),
paths(from = "Complaints", to = "Loyalty")
)
# Estimating the model
mobi_pls <- estimate_pls(data = mobi,
measurement_model = mobi_mm,
structural_model = mobi_sm)
#> Generating the seminr model
#> All 250 observations are valid.
# Load data, assemble model, and bootstrap
boot_seminr_model <- bootstrap_model(seminr_model = mobi_pls,
nboot = 50, cores = 2, seed = NULL)
#> Bootstrapping model using seminr...
#> SEMinR Model successfully bootstrapped
# Calculate the 95% confidence interval (alpha = 0.05) for mediated path Image -> Expectation -> Satisfaction
confidence_interval(boot_seminr_model = boot_seminr_model,
                    from = "Image",
                    through = "Expectation",
                    to = "Satisfaction",
                    alpha = 0.05)
#> 2.5% 97.5%
#> -0.004778419 0.078552765
# Calculate the 90% confidence interval (alpha = 0.10) for direct path Image -> Satisfaction
confidence_interval(boot_seminr_model = boot_seminr_model,
                    from = "Image",
                    to = "Satisfaction",
                    alpha = 0.10)
#> 5% 95%
#> 0.08670856 0.26515338
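The percentile method itself is simple to sketch in base R: take the alpha/2 and 1 - alpha/2 quantiles of the bootstrap distribution. The example below uses simulated numbers in place of real bootstrap output, and it assumes that a mediated effect is formed as the product of its component path coefficients within each bootstrap sample. That product rule is an assumption about the internals; the text above only states that the percentile method is used.

```r
set.seed(123)

# Made-up bootstrap estimates for two paths (stand-ins for real bootstrap output)
n_boot <- 1000
image_to_expectation <- rnorm(n_boot, mean = 0.50, sd = 0.05)
expectation_to_satisfaction <- rnorm(n_boot, mean = 0.07, sd = 0.04)

# Mediated effect in each bootstrap sample: product of the two path coefficients
indirect_effect <- image_to_expectation * expectation_to_satisfaction

# Percentile interval: cut alpha/2 from each tail of the bootstrap distribution
alpha <- 0.05
ci <- quantile(indirect_effect, probs = c(alpha / 2, 1 - alpha / 2))
print(ci)
```

With alpha = 0.05 the bounds are labeled 2.5% and 97.5%, matching the labels in the output shown above.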
model_summary <- summary(seminr_model) returns an object of class summary.seminr_model, which contains the following four descriptive statistics matrices:
+ `model_summary$descriptives$statistics$items` reports the descriptive statistics for items
+ `model_summary$descriptives$correlations$items` reports the correlation matrix for items
+ `model_summary$descriptives$statistics$constructs` reports the descriptive statistics for constructs
+ `model_summary$descriptives$correlations$constructs` reports the correlation matrix for constructs
model_summary <- summary(mobi_pls)
model_summary$descriptives$statistics$items
model_summary$descriptives$correlations$items
model_summary$descriptives$statistics$constructs
model_summary$descriptives$correlations$constructs