This package allows users to derive flexible cutoffs for the evaluation of absolute model fit in Covariance-Based Structural Equation Modeling (CBSEM).
For CBSEM, a multitude of fit indices have been proposed, roughly classified as Goodness-of-Fit (GoF), such as CFI (Comparative Fit Index), or Badness-of-Fit (BoF), including SRMR (Standardized Root Mean Square Residual). Despite its initial appeal, all fit indices are distorted by characteristics of the model (e.g., size of the model) and sample characteristics (e.g., sample size) to some degree. Hence, their ability to determine whether a model has a truly “good” fit or not is often masked by an unwanted sensitivity to model and sample characteristics.
Consequentially, widely acknowledged recommendations and guidelines (e.g., CFI >= .95 indicates “good” fit) struggle with these distortions because their “fixed” cutoffs remain constant, irrespective of the model and sample characteristics investigated. For example, CFI tends to be sensitive to sample size. A rather small sample (e.g., N = 200) will almost always produce smaller CFIs than a rather large sample (e.g., N = 1,000). Hence, fixed cutoffs for CFI, such as .95 will likely reject more correct models for small sample sizes and likely accept more incorrect models for larger sample sizes.
Flexible cutoffs aim to overcome this deficit by providing customized cutoff values for a given model with specific model and sample characteristics. For the above example, a flexible cutoff will decrease (to e.g., .94 for N = 200) or increase (to e.g., .98 for N = 1,000) depending on the model and its characteristics.
Flexible cutoffs are derived from simulated distributions of correctly specified Confirmatory Factor Analysis (CFA) models for a wide range of latent variables (or factors), indicators per latent variable, sample sizes, factor loadings, and normal as well as non-normal data. Flexible cutoffs can be understood as the empirical quantile of a given index for a predefined uncertainty. If an uncertainty of 5 percent (or .05) is accepted a-priori, the 5 percent quantile of the simulated distribution for correctly specified CFA models, with the given model and sample characteristics, will determine the flexible cutoff. Depending on the nature of the underlying fit index, the appropriate lower (GoF) or upper (BoF) width of the respective confidence interval, as defined by the quantile, is used to derive the flexible cutoff.
Thereby, flexible cutoffs are also flexible in the recommendations they provide. If users are more uncertain about their model, they can adjust the uncertainty to 10 percent (or .10). When being more certain about the underlying model, the uncertainty can be adjusted to .1, or even .001. Note that this uncertainty is inverse to the understanding of a p-value. A researcher admits how uncertain s/he is about a given model and thus .10 indicates very conservative cutoffs, while .001 determines very lenient cutoffs.
Details on the procedure and methods can be found in the paper “Flexible Cutoff Values for Fit Indices in the Evaluation of Structural Equation Models” (Niemand and Mai, 2018).
From 2018 to 2021, we provided a website-based tool on flexiblecutoffs.org that allowed users to input certain parameters and obtain the respective flexible cutoffs from pre-run simulations. Given the limitations of this approach (parameters to be simulated are finite) and user feedback, we decided to deprecate the tool on the website and instead offer this package FCO. We believe that this provides multiple advantages for users, particularly:
Empirical: Using lavaan to define and estimate the users’ own model. Hence, parameters such as the number of latent variables and sample size are not required to be specified, but are taken from the lavaan model.
More flexibility: A myriad of new functions and arguments are available to assist the user and provide more options to individualize the flexible cutoffs to a given model. For example, relative cutoffs can be derived for two models to be compared (e.g., nested models). Users can choose from different types of settings for the underlying population model (including empirical relationships). Finally, discriminant validity can be tested by comparing fit indices for merged and constrained models (Rönkkö & Cho, 2020). We also include helper functions to obtain population models, constrain models and guessing the type of index (GoF, BoF).
Speed: Given the two previous advantages, deriving flexible cutoffs still requires essential simulations. In order to overcome this, the simulation functions are written using parallel processing (via mclapply on Linux or Mac or parLapply on Windows) to speed up the code. With a decent number of replications (default: 500) and a modern CPU, flexible cutoffs can be obtained in a few minutes or even seconds.
Transparency: With switching to an R package, the code of all functions becomes transparent, allowing users to better understand the derivations as well as provide feedback and troubleshooting regarding issues (and there will be many, certainly).
As a side note of transforming the previous tool into a package, flexible cutoffs obtained from tool and package are not identical, but certainly very close to each other. This disadvantage stems from the differential use of seeds in the pre-run simulations and the present package-based simulations as well as the implementation of the new features. If you detect substantial differences between flexible cutoffs, please contact us.
FCO is available on CRAN and can be installed as usual via
install.packages
. Then, the package can be loaded, for
example with (required packages, if installed, will be loaded
automatically):
library(FCO)
In an initial example, a normal user has two objects, a dataset
x and a (theoretical) model mod to be
estimated. For illustrative purposes, we use the data from Babakus and
Boller (1992), which is available as a correlation matrix of 22 items
(Q1-Q22) for the SERVQUAL scale consisting of five factors (here F1 to
F5). We simulated 502 observations from the original correlation matrix
via MASS::mvrnorm
and provide it as a data.frame. This
dataset is included in the package:
data(bb1992)
head(bb1992, 3)
#> Q1 Q2 Q3 Q4 Q5 Q6 Q7
#> 1 -1.0206565 0.2276041 -1.2568654 0.8136901 -1.6019692 0.7552052 -0.4227379
#> 2 1.5938691 -0.0276403 -1.2359269 0.4527496 0.4198624 1.2915004 0.3614339
#> 3 0.6432085 0.5146161 0.9700383 0.5768854 0.1788848 -0.2423614 0.7111541
#> Q8 Q9 Q10 Q11 Q12 Q13 Q14
#> 1 -1.2791570 0.13263442 -0.4502009 -0.3691879 -0.5386581 -1.238408 0.02816952
#> 2 0.4577506 -0.05877489 0.6064040 0.9672691 1.1028063 1.784519 -0.47735634
#> 3 -0.5452735 -0.54137900 -0.8755678 0.3516510 -1.9334731 -0.498252 -0.33502753
#> Q15 Q16 Q17 Q18 Q19 Q20
#> 1 -0.36047328 0.3736429 -0.3124173 0.3765030 -0.956071795 1.50488335
#> 2 0.17534870 0.2421734 1.7328168 0.3081753 0.004130691 0.62142902
#> 3 -0.07986953 -1.4054425 -0.7272241 -0.7679232 0.009754038 0.07108398
#> Q21 Q22
#> 1 -0.2591684 -0.4093789
#> 2 0.9135299 1.1154500
#> 3 -0.6949648 1.7242517
The second object required is a model to be fitted to the data. A simple lavaan syntax for an empirical five-factor solution looks like this:
<- "
mod F1 =~ Q5 + Q7 + Q8
F2 =~ Q2 + Q4
F3 =~ Q10 + Q11 + Q12 + Q13 + Q18 + Q19 + Q20 + Q21 + Q22
F4 =~ Q1 + Q17
F5 =~ Q6 + Q14 + Q15 + Q16
"
Both objects, the dataset bb1992
, and the model
mod
can then be used to estimate the CFA via
lavaan. For simplicity reasons, we focus on two fit
indices, CFI and SRMR in this example
(Niemand & Mai, 2018, p. 1170).
<- cfa(mod, data = bb1992)
res fitmeasures(res, fit.measures = c("CFI", "SRMR"))
#> cfi srmr
#> 0.960 0.038
Note that both values would suffice fixed cutoffs such as CFI ≥ .95 and SRMR ≤ .09, hence concluding that the model fits the data well. However, given the empirical setting of the example study, it would be beneficial to investigate the uncertainty of this finding. This is where flexible cutoffs come into play.
To derive the flexible cutoffs, one only needs the two objects
discussed before. All other arguments are set internally via standard
options (see below). In this case, we however only use 10 replications,
to speed up the derivation. The function gen_fit
generates
a list fits.single
of fit indicators for the underlying
data and model, as well as lists for the other arguments. The function
flex_co
then takes this list and estimates the intended
flexible cutoffs for the selected fit indices. Since the default setting
for the preset uncertainty is .05, we can also omit this argument. For
now, this code is sufficient:
<- gen_fit(mod1 = mod, x = bb1992, rep = 10)
fits.single #Please use for flexible cutoffs as desribed below:
#fits.single <- gen_fit(mod1 = mod, x = bb1992, rep = 100)
flex_co(fits = fits.single, index = c("CFI", "SRMR"))
#> Warning in flex_co(fits = fits.single, index = c("CFI", "SRMR")): The number of
#> replications is lower than the recommended minimum of 500. Consider with care.
#> $cutoff
#> CFI SRMR
#> 0.97826871 0.03659316
#>
#> $index
#> [1] "CFI" "SRMR"
#>
#> $alpha
#> [1] 0.05
#>
#> $gof
#> CFI SRMR
#> TRUE FALSE
#>
#> $replications
#> [1] 10
#>
#> $`number of non-converging models`
#> [1] 0
#>
#> $`share of non-converging models`
#> [1] 0
In this case, the second function returns a warning that the number
of replications is below the recommended minimum. For non-illustrative
purposes, we recommend higher values (e.g., 500 or 1,000). The function
flex_co
also returns objects cutoff
,
index
, alpha
, gof
,
replications
, number of non-converging models
and share of non-converging models
. Cutoff
provides the flexible cutoff values for the selected indices.
Index
returns the name of the indices selected and alpha
returns the preset uncertainty. Gof
returns the type of fit
index (GoF or BoF, logically, GoF = FALSE indicates BoF). This can be
either stated manually in the function with the argument
gof
or is guessed by the helper function
index_guess
, which works for all established fit
indices.
Note that the cutoff values for CFI and SRMR are higher (CFI = .971 for 100 replications) and lower (SRMR = .036 for 100 replications) than the respective empirical values (CFI = .960, SRMR = .038). That is, given the uncertainty of .05, as well as the present model and data, the cutoffs are close to the empirical values, but the model is marginally rejected. We can however simulate what happens if we assume less (.001) or more (.10) uncertainty.
flex_co(fits = fits.single,
index = c("CFI", "SRMR"),
alpha.lev = .001)
#> Warning in flex_co(fits = fits.single, index = c("CFI", "SRMR"), alpha.lev =
#> 0.001): The number of replications is lower than the recommended minimum of
#> 500. Consider with care.
#> $cutoff
#> CFI SRMR
#> 0.97826871 0.03659316
#>
#> $index
#> [1] "CFI" "SRMR"
#>
#> $alpha
#> [1] 0.001
#>
#> $gof
#> CFI SRMR
#> TRUE FALSE
#>
#> $replications
#> [1] 10
#>
#> $`number of non-converging models`
#> [1] 0
#>
#> $`share of non-converging models`
#> [1] 0
flex_co(fits = fits.single,
index = c("CFI", "SRMR"),
alpha.lev = .10)
#> Warning in flex_co(fits = fits.single, index = c("CFI", "SRMR"), alpha.lev =
#> 0.1): The number of replications is lower than the recommended minimum of 500.
#> Consider with care.
#> $cutoff
#> CFI SRMR
#> 0.97980796 0.03616053
#>
#> $index
#> [1] "CFI" "SRMR"
#>
#> $alpha
#> [1] 0.1
#>
#> $gof
#> CFI SRMR
#> TRUE FALSE
#>
#> $replications
#> [1] 10
#>
#> $`number of non-converging models`
#> [1] 0
#>
#> $`share of non-converging models`
#> [1] 0
We do not need to re-simulate the flexible cutoffs, as the function
flex_co
takes the simulated values from the generated fit
indices and we only need to change the uncertainty via the
alpha.lev
argument. As expected, if one assumes less
uncertainty—or more certainty, for example because the model has been
tested before—the flexible cutoffs tend to be closer to the empirical
values for more replications. Likewise, when assuming more uncertainty
(e.g., a novel model), the cutoffs more clearly reject the model.
In principle, this is the core usage originally intended for flexible cutoffs. We however included a substantial number of other features that may be interesting for users. The following chapter gives an overview of the possible features.
The basic idea of flexible cutoffs is that these cutoffs come from a sui-generis (having it’s own shape) distribution of fit indices for a correct, unbiased model. Essentially, this means that the population model is correctly specified with no errors. For example, all observed variables load on the latent variables they are supposed to load on, all correlations are estimated freely, and the data is not excessively skewed. This assumption makes flexible cutoffs potentially more objective since no subjective modification to the model (misspecification) or data (skewness, kurtosis) is needed. However, this also implies that the population model is specified not simply on the basis of the own empirical model. In an anecdotal manner, one can compare this to the introduction of the meter. Assuming that one’s own measure is exactly 1 meter is likely error prone. One needs a validated meter bar. The meter bar in this case is the population model.
The function pop_mod
specifies this population model. We
offer three types of population models, the NM (Niemand & Mai, 2018)
option, the HB (Hu & Bentler, 1999) option, and the EM (empirical)
option. Function gen_fit
internally calls function
pop_mod
, but the argument type
is also
available in the latter function for flexibility. We can compare these
models by:
pop_mod(mod, x = bb1992, type = "NM")$pop.mod
#> [1] "F1=~0.7*Q5+0.7*Q7+0.7*Q8 \nF2=~0.7*Q2+0.7*Q4 \nF3=~0.7*Q10+0.7*Q11+0.7*Q12+0.7*Q13+0.7*Q18+0.7*Q19+0.7*Q20+0.7*Q21+0.7*Q22 \nF4=~0.7*Q1+0.7*Q17 \nF5=~0.7*Q6+0.7*Q14+0.7*Q15+0.7*Q16 \nF1~~1*F1 \nF2~~0.3*F1 \nF3~~0.3*F1 \nF4~~0.3*F1 \nF5~~0.3*F1 \nF2~~1*F2 \nF3~~0.3*F2 \nF4~~0.3*F2 \nF5~~0.3*F2 \nF3~~1*F3 \nF4~~0.3*F3 \nF5~~0.3*F3 \nF4~~1*F4 \nF5~~0.3*F4 \nF5~~1*F5 \n \n"
pop_mod(mod, x = bb1992, type = "HB")$pop.mod
#> [1] "F1=~0.7*Q5+0.75*Q7+0.8*Q8 \nF2=~0.75*Q2+0.75*Q4 \nF3=~0.7*Q10+0.7*Q11+0.7*Q12+0.7*Q13+0.75*Q18+0.8*Q19+0.8*Q20+0.8*Q21+0.8*Q22 \nF4=~0.75*Q1+0.75*Q17 \nF5=~0.7*Q6+0.75*Q14+0.75*Q15+0.8*Q16 \nF1~~1*F1 \nF2~~0.5*F1 \nF3~~0.4*F1 \nF4~~0.3*F1 \nF5~~0.5*F1 \nF2~~1*F2 \nF3~~0.4*F2 \nF4~~0.5*F2 \nF5~~0.4*F2 \nF3~~1*F3 \nF4~~0.3*F3 \nF5~~0.5*F3 \nF4~~1*F4 \nF5~~0.4*F4 \nF5~~1*F5 \n \n"
pop_mod(mod, x = bb1992, type = "EM")$pop.mod
#> [1] "F1=~0.807*Q5+0.637*Q7+0.876*Q8 \nF2=~0.747*Q2+0.842*Q4 \nF3=~0.519*Q10+0.568*Q11+0.597*Q12+0.697*Q13+0.603*Q18+0.557*Q19+0.534*Q20+0.578*Q21+0.552*Q22 \nF4=~0.659*Q1+0.772*Q17 \nF5=~0.727*Q6+0.719*Q14+0.826*Q15+0.752*Q16 \nF1~~1*F1 \nF2~~0.36*F1 \nF3~~0.71*F1 \nF4~~0.641*F1 \nF5~~0.807*F1 \nF2~~1*F2 \nF3~~0.349*F2 \nF4~~0.486*F2 \nF5~~0.492*F2 \nF3~~1*F3 \nF4~~0.625*F3 \nF5~~0.8*F3 \nF4~~1*F4 \nF5~~0.831*F4 \nF5~~1*F5 \n \n"
When the type is “NM”, all loadings (default: .7) and correlations
(default: .3) are assumed to be equal (\n denotes a line break for the
lavaan syntax). When the type is “HB”, the loadings
vary by .05 around .75, depending on the number of indicators and the
correlations are either .5, .4, or .3, also depending on the number of
latent variables. Finally, when the type is “EM”, the function runs a
CFA and determines the empirical loadings and correlations. Since one
cannot assume the own empirical model to be correct, we advise users to
not use the “EM” type for model validation. This type might be useful
for other features (see further applications). Hence, the
default is set to “NM”. Since the by far most selected
standardized factor loading (afl
) in our tool was .7, we
set the default value to .7. Other options between 0 and 1 are possible.
The average correlation between latent variables (aco
) is
set to a default of .3, but this can be changed likewise. To increase
flexibility, the argument standardized
(default: TRUE) can
also be called, allowing for the specification of standardized (all
loadings < 1, all covariances are correlations) and unstandardized
(loadings > 1, covariances, not correlations) parameters. The
function returns a warning when the empirical model suspects
standardized or unstandardized loadings and this is in conflict with the
standardized
argument. See below:
pop_mod(mod, x = bb1992, type = "NM", afl = .9)$pop.mod
#> [1] "F1=~0.9*Q5+0.9*Q7+0.9*Q8 \nF2=~0.9*Q2+0.9*Q4 \nF3=~0.9*Q10+0.9*Q11+0.9*Q12+0.9*Q13+0.9*Q18+0.9*Q19+0.9*Q20+0.9*Q21+0.9*Q22 \nF4=~0.9*Q1+0.9*Q17 \nF5=~0.9*Q6+0.9*Q14+0.9*Q15+0.9*Q16 \nF1~~1*F1 \nF2~~0.3*F1 \nF3~~0.3*F1 \nF4~~0.3*F1 \nF5~~0.3*F1 \nF2~~1*F2 \nF3~~0.3*F2 \nF4~~0.3*F2 \nF5~~0.3*F2 \nF3~~1*F3 \nF4~~0.3*F3 \nF5~~0.3*F3 \nF4~~1*F4 \nF5~~0.3*F4 \nF5~~1*F5 \n \n"
pop_mod(mod, x = bb1992, type = "NM", aco = .5)$pop.mod
#> [1] "F1=~0.7*Q5+0.7*Q7+0.7*Q8 \nF2=~0.7*Q2+0.7*Q4 \nF3=~0.7*Q10+0.7*Q11+0.7*Q12+0.7*Q13+0.7*Q18+0.7*Q19+0.7*Q20+0.7*Q21+0.7*Q22 \nF4=~0.7*Q1+0.7*Q17 \nF5=~0.7*Q6+0.7*Q14+0.7*Q15+0.7*Q16 \nF1~~1*F1 \nF2~~0.5*F1 \nF3~~0.5*F1 \nF4~~0.5*F1 \nF5~~0.5*F1 \nF2~~1*F2 \nF3~~0.5*F2 \nF4~~0.5*F2 \nF5~~0.5*F2 \nF3~~1*F3 \nF4~~0.5*F3 \nF5~~0.5*F3 \nF4~~1*F4 \nF5~~0.5*F4 \nF5~~1*F5 \n \n"
pop_mod(mod, x = bb1992, type = "EM", standardized = FALSE)$pop.mod
#> Warning in pop_mod(mod, x = bb1992, type = "EM", standardized = FALSE): All
#> loadings are < 1. Consider revision of standardized.
#> [1] "F1=~0.807*Q5+0.637*Q7+0.876*Q8 \nF2=~0.747*Q2+0.842*Q4 \nF3=~0.519*Q10+0.568*Q11+0.597*Q12+0.697*Q13+0.603*Q18+0.557*Q19+0.534*Q20+0.578*Q21+0.552*Q22 \nF4=~0.659*Q1+0.772*Q17 \nF5=~0.727*Q6+0.719*Q14+0.826*Q15+0.752*Q16 \nF1~~1*F1 \nF2~~0.36*F1 \nF3~~0.71*F1 \nF4~~0.641*F1 \nF5~~0.807*F1 \nF2~~1*F2 \nF3~~0.349*F2 \nF4~~0.486*F2 \nF5~~0.492*F2 \nF3~~1*F3 \nF4~~0.625*F3 \nF5~~0.8*F3 \nF4~~1*F4 \nF5~~0.831*F4 \nF5~~1*F5 \n"
As an alternative to provide a model and data, it is possible to specify a population model and the sample size of your data instead. This might be useful for simulation approaches where the population model is provided. Hence, for example, the generated population model can be used:
<- pop_mod(mod, x = bb1992)$pop.mod
pop.mod <- gen_fit(pop.mod1 = pop.mod, n = 502, rep = 10) fits.altern
In order to follow this principle of objectivity, the data
x
is essentially not that important for deriving flexible
cutoffs, unless specified differently. That is, x
determines the sample size (N) for the simulations and the multivariate
non-normality of the data, if assume.mvn = FALSE
(default:
TRUE). Both are only relevant for the gen_fit
function.
Internally, the function gen_fit
calls the
simulateData
function from lavaan, which
itself does not require data, but takes a population model from function
pop_mod
. Sample size is specified via
sample.nobs
by nrow(x)
. Arguments skewness and
kurtosis are derived from semTools::mardiaKurtosis(x)
and
semTools::mardiaSkew(x)
in package
semTools, if assume.mvn = FALSE
.
Since there seems to be no unified agreement on what “no”, “low”,
“moderate”, or “high” kurtosis / skewness constitutes (Niemand &
Mai, 2018), we omitted the once verbally differentiated levels available
in the tool. As mentioned before, x
is also needed for the
empirical population model type “EM” in pop_mod
.
For flexible cutoffs, the type of a fit index (GoF or BoF) plays an
essential role, as the lower (GoF) or upper (BoF) width of a confidence
interval is required. Since empirically guessing the type may be
misleading (e.g., a very bad model may produce a distribution of SRMR
that is not different from a distribution for a CFI in a better fitting
model), we implemented a helper function index_guess
that
simply guesses the fit index by name (upper or lowercase
considered):
index_guess("cfi")
#> [1] "GoF"
index_guess("CFI")
#> [1] "GoF"
index_guess("srmr")
#> [1] "BoF"
index_guess("SRMR")
#> [1] "BoF"
index_guess("mickey_mouse")
#> [1] "not a fit index"
This function is applied in the flex_co
function. If one
specifies established fit indices, such as CFI or SRMR, the
gof
argument is not required. However, this feature does
not override the gof
argument. For example:
flex_co(
fits = fits.single,
index = c("CFI", "SRMR"),
gof = c(TRUE, FALSE)
)#> Warning in flex_co(fits = fits.single, index = c("CFI", "SRMR"), gof = c(TRUE,
#> : The number of replications is lower than the recommended minimum of 500.
#> Consider with care.
#> $cutoff
#> CFI SRMR
#> 0.97826871 0.03659316
#>
#> $index
#> [1] "CFI" "SRMR"
#>
#> $alpha
#> [1] 0.05
#>
#> $gof
#> CFI SRMR
#> TRUE FALSE
#>
#> $replications
#> [1] 10
#>
#> $`number of non-converging models`
#> [1] 0
#>
#> $`share of non-converging models`
#> [1] 0
flex_co(
fits = fits.single,
index = c("CFI", "SRMR"),
gof = c(FALSE, TRUE)
)#> Warning in flex_co(fits = fits.single, index = c("CFI", "SRMR"), gof = c(FALSE,
#> : The number of replications is lower than the recommended minimum of 500.
#> Consider with care.
#> $cutoff
#> CFI SRMR
#> 1.00000000 0.03096027
#>
#> $index
#> [1] "CFI" "SRMR"
#>
#> $alpha
#> [1] 0.05
#>
#> $gof
#> CFI SRMR
#> FALSE TRUE
#>
#> $replications
#> [1] 10
#>
#> $`number of non-converging models`
#> [1] 0
#>
#> $`share of non-converging models`
#> [1] 0
The first version (the gof
argument can also be omitted
as the guess is correct) gives the correct flexible cutoffs, the second
version does not, as CFI is not a BoF and SRMR is not a GoF. Hence,
users should be careful with specifying this argument. For novel fit
indices or alternative uses, it may however be beneficial to maintain
this argument.
Since simulations may need some time, we implemented multi-core
support. Depending on the system the package runs on, function
gen_fit
uses mclapply
(on Linux or Mac) and
parLapply
(on Windows) from package
parallel. The default is set to TRUE and two cores.
system.time(gen_fit(mod1 = mod, x = bb1992, rep = 10))
#> user system elapsed
#> 0.028 0.007 0.515
In the fortunate situation that a user has more cores, more cores can
be set by the argument cores. However, note that the function returns an
error if the number of available cores of the system is lower than the
number of cores provided by the cores
argument (e.g.,
cores = 4
). Of course, multi-core support can be switched
off by setting the argument multi.core = FALSE
.
system.time(gen_fit(
mod1 = mod,
x = bb1992,
rep = 10,
multi.core = FALSE
))#> user system elapsed
#> 0.810 0.072 0.894
One quintessential issue we notice in many papers, reviews, and PhD courses is that non-experts do not know which fit index or fit indicator to choose. To summarize, this is the major question one of our papers investigates (Mai et al., 2021) and the main message is that one should follow a tailor-fit approach. Depending on three settings, a) sample size, b) research purpose (novel or established model), and c) focus (confirming a factorial structure, i.e., CFA or investigating a theoretical, structural model), different fit indicators are recommended.
The differentiation between novel or established model might need some explanation. Fit indicators often yield different Type I and II errors. When an established model that has been empirically investigated many times before (e.g., Theory of Planned Behavior-models) is tested, it makes sense to put more weight on Type I error. However, when the model has never been tested before, it makes sense to put more weight on the Type II error. Since fixed cutoffs show worse hit rates for higher Type II error weights (1:3, 1:4), they may be poorly performing for novel models.
We built the function recommend
to incorporate this
tailor-fit approach in a user-friendly way. It requires the simulated
fit indices and the arguments for purpose
and
focus
. Sample size is determined automatically. Results are
rounded to 3 digits, but can be changed to 1 to 5 digits if needed, for
example by digits = 5.
Since the most obvious application is to conduct a CFA on a novel model, the standard arguments of purpose and focus are set to this application. Hence, when we use no further arguments, we get the following recommendation:
recommend(fits.single)
#> Warning in recommend(fits.single): The number of replications is lower than the
#> recommended minimum of 500. Consider with care.
#> $recommended
#> type fit.values
#> SRMR BoF 0.038
#>
#> $cutoffs
#> SRMR
#> cutoff 0.001 0.037
#> cutoff 0.01 0.037
#> cutoff 0.05 0.037
#> cutoff 0.1 0.036
#>
#> $decisions
#> SRMR
#> cutoff 0.001 rejected
#> cutoff 0.01 rejected
#> cutoff 0.05 rejected
#> cutoff 0.1 rejected
#>
#> $replications
#> [1] 10
#>
#> $comment
#> [1] "Recommendations based on flexible cutoffs and Mai et al. (2021)"
SRMR is recommended due to the purpose, focus, and sample size (n =
502) in line with the recommendations by Mai et al. (2021). Hence, the
lone fit indicator we need to report is SRMR. The function also returns
the type of the fit indicator, which is guessed from
index_guess
and the actual value of the SRMR in the
model.
Additionally, the function provides a sensitivity analysis for different values of uncertainty, .001 (.1 percent), .01 (1 percent), .05 (5 percent) and .10 (10 percent) and makes a decision given the cutoff. For completeness, replications, the number of non-converging models, and their share are also provided.
The result found here demonstrates the consequences of uncertainty for the decision. When one is very conservative and assumes a high Type I error (.10), the cutoff (.036 for 100 replications) is lower than the actual value (.038) and hence the present model should be rejected. When one is very lenient and feels safe with the model (.001), the cutoff (.040 for 100 replications) is higher than the actual value and hence the model can be confirmed.
Let us assume for a moment that Babakus and Boller had not been as exploratory as they were and simply looked for confirmation of an established measurement model. So, we change the purpose argument to established.
recommend(fits.single, purpose = "established")
#> $recommended
#> type fit.values
#> CFI GoF 0.96
#>
#> $decisions
#> [1] "confirmed"
#>
#> $comment
#> [1] "Recommendations based on fixed cutoffs and Mai et al. (2021)"
Now the recommendation changes to CFI with a fixed cutoff. Consequently, no uncertainty is provided (as they are fixed) and the recommendation is to confirm the model because the actual value (.960) is above the fixed cutoff of .95. This demonstrates two things. First, the recommend function also recommends fixed cutoffs when it is acceptable to do so (see Mai et al. 2021). Second, it shows what happens when assuming established models (and hence a low importance of Type II error): Type I errors become more important. Compare this with the lenient uncertainty before (.001, i.e., .040 for 100 replications), from the first recommendation, where SRMR also confirms the model. In other words, we feigned to be very certain by assuming the model to be an established model (ignoring the doubts) and hence got a very determined answer.
Finally, for exploratory investigations, one can also override the
recommendations programmed into the function by using the
override
argument. This however requires users to provide
one or more indices with the argument index
(otherwise, an
error is returned).
recommend(fits.single,
override = TRUE,
index = c("CFI", "SRMR"))
#> Warning in recommend(fits.single, override = TRUE, index = c("CFI", "SRMR")):
#> The number of replications is lower than the recommended minimum of 500.
#> Consider with care.
#> $recommended
#> type fit.values
#> CFI GoF 0.960
#> SRMR BoF 0.038
#>
#> $cutoffs
#> CFI SRMR
#> cutoff 0.001 0.978 0.037
#> cutoff 0.01 0.978 0.037
#> cutoff 0.05 0.978 0.037
#> cutoff 0.1 0.980 0.036
#>
#> $decisions
#> CFI SRMR
#> cutoff 0.001 rejected rejected
#> cutoff 0.01 rejected rejected
#> cutoff 0.05 rejected rejected
#> cutoff 0.1 rejected rejected
#>
#> $replications
#> [1] 10
#>
#> $comment
#> [1] "Override mode"
In the example, we provide CFI and SRMR and get the appropriate recommendations for them. Unsurprisingly, the recommendations do not change much (SRMR is confirmed for levels of .001 and .01 uncertainty for 100 replications, otherwise the model is rejected). Please note that purpose and focus are without function in this case, as the recommendations by Mai et al. (2021) are overridden.
A popular feature in CBSEM is to compare nested
models, for example testing some type of invariance between
groups, comparing alternative theoretical models or discriminant
validity testing (see next chapter for the latter). So far, we allow
users to incorporate a second model by specifying the mod2
argument in function gen_fit
. In this case, the resulting
fits from function are not vectors in a list of the length of
replications, but a matrix with two rows. Function flex_co
then produces a slightly different output displaying the flexible
cutoffs for both models as well as the difference between the two
models.
Let us assume that the first two factors F1 and F2 might be
orthogonal (independent). Hence, we constrain the factor correlation
between both to zero (F1 ~~ 0 * F2
).
subset(parameterestimates(res, standardized = T), lhs == "F1" &
== "F2")
rhs #> lhs op rhs est se z pvalue ci.lower ci.upper std.lv std.all std.nox
#> 46 F1 ~~ F2 0.217 0.038 5.736 0 0.143 0.291 0.36 0.36 0.36
<- "
mod.con F1 =~ Q5 + Q7 + Q8
F2 =~ Q2 + Q4
F3 =~ Q10 + Q11 + Q12 + Q13 + Q18 + Q19 + Q20 + Q21 + Q22
F4 =~ Q1 + Q17
F5 =~ Q6 + Q14 + Q15 + Q16
F1 ~~ 0 * F2
"
<- gen_fit(
fits.con mod1 = mod,
mod2 = mod.con,
x = bb1992,
rep = 10
)flex_co(fits = fits.con,
index = c("CFI", "SRMR"),
alpha.lev = .05)
#> Warning in flex_co(fits = fits.con, index = c("CFI", "SRMR"), alpha.lev =
#> 0.05): The number of replications is lower than the recommended minimum of 500.
#> Consider with care.
#> $cutoff
#> CFI SRMR
#> Model 1 0.9808747 0.03280321
#> Model 2 0.9640482 0.05264764
#>
#> $difference
#> CFI SRMR
#> 0.01682647 -0.01984443
#>
#> $index
#> [1] "CFI" "SRMR"
#>
#> $alpha
#> [1] 0.05
#>
#> $gof
#> CFI SRMR
#> TRUE FALSE
#>
#> $replications
#> [1] 10
#>
#> $`number of non-converging models`
#> [1] 0
#>
#> $`share of non-converging models`
#> [1] 0
fitmeasures(res, fit.measures = c("cfi", "srmr")) - fitmeasures(cfa(model = mod.con, data = bb1992), fit.measures = c("cfi", "srmr"))
#> cfi srmr
#> 0.011 -0.037
Since the correlation between both factors was only .36, it is likely that the models show a comparable fit. Indeed, the flexible cutoffs are slightly worsened in the constrained model 2 (CFI = .956, SRMR = .056) compared to model 1 (CFI = .969, SRMR = .035), yielding a small difference (delta CFI = .012, delta SRMR = -.022). Consider that this difference is larger than the present difference of CFI between the models (.011) and smaller than present difference of SRMR between the models (-.037). That is, CFI was - as expected - less sensitive to the constraint than SRMR. Put simply, flexible cutoffs suggest that the change in fit by CFI is acceptable (flexible cutoff: .012 > present difference: .011), but the change in fit by SRMR is not (flexible cutoff: -.020 > present difference: -.037). We should reject the alternative model of orthogonal factors F1 and F2. Fixed cutoffs (CFI ≥ .95, SRMR ≤ .09) would still have accepted the alternative model. Please note that guidelines for relative fit comparisons in a contingent way - accounting for model size and sample size - are rare to find (e.g., Meade et al. 2008 for measurement invariance). Future investigations of the performance of flexible cutoffs for relative fit comparisons might be interesting.
As a technical side note, the flexible cutoffs here are slightly
different from the ones described in chapter “Basic usage” since the
type
argument is automatically switched to “EM”
(type = "EM"
) when two models are provided. When single
cutoffs would be generated with this setting, the same flexible cutoffs
would be found.
<- gen_fit(
fits.proof mod1 = mod,
x = bb1992,
rep = 10,
type = "EM"
)flex_co(fits = fits.proof,
index = c("CFI", "SRMR"),
alpha.lev = .05)
#> Warning in flex_co(fits = fits.proof, index = c("CFI", "SRMR"), alpha.lev =
#> 0.05): The number of replications is lower than the recommended minimum of 500.
#> Consider with care.
#> $cutoff
#> CFI SRMR
#> 0.98087471 0.03280321
#>
#> $index
#> [1] "CFI" "SRMR"
#>
#> $alpha
#> [1] 0.05
#>
#> $gof
#> CFI SRMR
#> TRUE FALSE
#>
#> $replications
#> [1] 10
#>
#> $`number of non-converging models`
#> [1] 0
#>
#> $`share of non-converging models`
#> [1] 0
A special application of flexible cutoffs could be its ability to test discriminant validity. As Rönkkö & Cho (2020) highlighted, CBSEM could be useful to compare CFAs where a second alternative model merges, hereafter “merging”, two factors subject to an issue in discriminant validity (given a substantially high correlation between the factors). Alternatively, it is possible to compare an unconstrained model with a constrained model, where the correlation between the factors is set to a cutoff value (e.g., .9). We refer to this second principle as “constraining”. Instead of testing the chi-square difference between the original and the alternative model, flexible cutoffs for fit indicators might work equally well or even better, given the issues with the chi-square statistic (e.g., Niemand & Mai 2018).
Consider that in the example data, factors F4 and F5 are highly
correlated (.831), possibly indicating an issue in discriminant
validity. To investigate this, we can use the discriminant
validity-related arguments of the gen_fit
function, termed
dv
, dv.factors
, merge.mod
, and
dv.cutoff
. Argument dv
simply tells the
function to expect the application of discriminant validity testing.
Argument dv.factors
provides the factors that should be
investigated, in this case F4 and F5. If this argument is missing, it is
assumed that the first and second factor of the model should be
investigated (F1 & F2). Hence, one should provide the names of the
factors if that is not the case. Argument merge.mod
is
needed if merging should be applied. This argument
calls an internal function that takes the original model, changes all
indicators of the second factor (F5) specified under
dv.factors
to be indicators of the first factor (F4), and
omits the second factor from estimation. Please note that it does not
matter which factor is first or second, as both are merged anyway.
Finally, dv.cutoff
is required when one wants to apply
constraining. The value for this cutoff can be between
0 and 1, but a warning is given if it is lower than .8, as this is
generally accepted as the lower border of a discriminant validity issue
(Rönkkö & Cho 2020, p. 30). Some guessing work is implemented here
to make the arguments more convenient. For example, if one forgot the
argument dv
but sets merge.mod = TRUE
, merging
is still assumed. To make sure the user knows which approach is used, a
message indicating the discriminant validity testing approach is
returned.
subset(parameterestimates(res, standardized = TRUE), lhs == "F4" &
== "F5")
rhs #> lhs op rhs est se z pvalue ci.lower ci.upper std.lv std.all std.nox
#> 55 F4 ~~ F5 0.398 0.042 9.417 0 0.316 0.481 0.831 0.831 0.831
<- gen_fit(
fits.dv.con mod1 = mod,
x = bb1992,
rep = 10,
dv = TRUE,
dv.factors = c("F4", "F5"),
dv.cutoff = .9
)#> Constraining is selected as the discriminant validity testing option given the provided arguments.
<- gen_fit(
fits.dv.merge mod1 = mod,
x = bb1992,
rep = 10,
dv = TRUE,
dv.factors = c("F4", "F5"),
merge.mod = TRUE
)#> Merging is selected as the discriminant validity testing option given the provided arguments.
flex_co(fits = fits.dv.con,
index = "CFI",
alpha.lev = .05)
#> Warning in flex_co(fits = fits.dv.con, index = "CFI", alpha.lev = 0.05): The
#> number of replications is lower than the recommended minimum of 500. Consider
#> with care.
#> $cutoff
#> CFI
#> Model 1 0.9808747
#> Model 2 0.9805963
#>
#> $difference
#> [1] 0.0002783622
#>
#> $index
#> [1] "CFI"
#>
#> $alpha
#> [1] 0.05
#>
#> $gof
#> CFI
#> TRUE
#>
#> $replications
#> [1] 10
#>
#> $`number of non-converging models`
#> [1] 0
#>
#> $`share of non-converging models`
#> [1] 0
flex_co(fits = fits.dv.merge,
index = "CFI",
alpha.lev = .05)
#> Warning in flex_co(fits = fits.dv.merge, index = "CFI", alpha.lev = 0.05): The
#> number of replications is lower than the recommended minimum of 500. Consider
#> with care.
#> $cutoff
#> CFI
#> Model 1 0.9808747
#> Model 2 0.9428451
#>
#> $difference
#> [1] 0.03802959
#>
#> $index
#> [1] "CFI"
#>
#> $alpha
#> [1] 0.05
#>
#> $gof
#> CFI
#> TRUE
#>
#> $replications
#> [1] 10
#>
#> $`number of non-converging models`
#> [1] 0
#>
#> $`share of non-converging models`
#> [1] 0
<- "
mod.dv.con F1 =~ Q5 + Q7 + Q8
F2 =~ Q2 + Q4
F3 =~ Q10 + Q11 + Q12 + Q13 + Q18 + Q19 + Q20 + Q21 + Q22
F4 =~ Q1 + Q17
F5 =~ Q6 + Q14 + Q15 + Q16
F4 ~~ .9 * F5
"
fitmeasures(
cfa(
model = mod.dv.con,
data = bb1992,
auto.fix.first = FALSE,
std.lv = TRUE
),fit.measures = "cfi"
)#> cfi
#> 0.959
<- "
mod.dv.merge F1 =~ Q5 + Q7 + Q8
F2 =~ Q2 + Q4
F3 =~ Q10 + Q11 + Q12 + Q13 + Q18 + Q19 + Q20 + Q21 + Q22
F4 =~ Q1 + Q17 + Q6 + Q14 + Q15 + Q16
"
fitmeasures(
cfa(
model = mod.dv.merge,
data = bb1992
),fit.measures = "cfi"
)#> cfi
#> 0.952
In this example, we asked for discriminant validity testing
(dv = TRUE
) and ran constraining first, constraining the
correlation between the provided factors
(dv.factors = c("F4", "F5")
) to the cutoff of .9
(dv.cutoff = .9
). Second, we ran merging, this time telling
the function to merge both selected factors
(merge.mod = TRUE
).
Similar to the relative fit comparison case, the function returns
tables for each replication, which then can be handled by the
flex_co
function. There are multiple options for how to
assess discriminant validity subject to further investigations. As a
simple test, we apply the following logic here
(essentially a variant of a chi-squared difference test only with CFI):
If two factors are clearly discriminant, their correlation is low,
yielding a worse fit for the constrained or
merged model compared to the original
(unconstrained, not merged) model. This implies that the a GoF index,
such as CFI, is smaller than the cutoff of the correct model (e.g., .75
< .95). In contrast, a BoF index, such as SRMR, would indicate
discriminant validity if it is larger than the respective cutoff of the
correct model (e.g., .15 > .05). In this example, the CFI for
constraining is .959, while cutoffs for the correct model (model 1) are
.969. In a nutshell, the CFI of the constrained model is out of the
range of simulated CFIs for correct models (in this case, outside of 95%
of all CFIs simulated). For merging, the same picture is found. CFI is
.952, which is well below the cutoff of .969. Multiple other options are
plausible, but we leave the point here.
Since the code to obtain these values is rather extensive and hence
to make user’s work easier, we provide a recommendation function
recommend_dv
that incorporates the previously described
considerations. As with the recommend
function for single
fit indicators, it requires a result from gen_fit
with
appropriate arguments and, optionally, names of the fit indices (CFI is
used if not provided). Results are rounded to 3 digits, but can be
changed to 1 to 5 digits if needed, for example by
digits = 5.
recommend_dv(fits.dv.con)
#> Warning in recommend_dv(fits.dv.con): The number of replications is lower than
#> the recommended minimum of 500. Consider with care.
#> $cutoffs
#> CFI model alpha
#> 1 0.981 original 0.001
#> 2 0.981 constrained 0.001
#> 3 0.981 original 0.010
#> 4 0.981 constrained 0.010
#> 5 0.981 original 0.050
#> 6 0.981 constrained 0.050
#> 7 0.982 original 0.100
#> 8 0.982 constrained 0.100
#>
#> $fit.values
#> CFI model
#> 1 0.960 original
#> 2 0.959 constrained
#>
#> $differences
#> CFI
#> cutoff 0.001 0.000
#> cutoff 0.01 0.000
#> cutoff 0.05 0.000
#> cutoff 0.1 0.000
#> fit 0.001
#>
#> $decisions
#> CFI
#> cutoff 0.001 confirmed
#> cutoff 0.01 confirmed
#> cutoff 0.05 confirmed
#> cutoff 0.1 confirmed
#>
#> $replications
#> [1] 10
#>
#> $comment
#> [1] "Approach for discriminant validity testing: constraining. Discriminant validity is confirmed if the fit value from the constrained/merged model is smaller (GoF) / larger (BoF) than the respective cutoff of original model."
recommend_dv(fits.dv.merge)
#> Warning in recommend_dv(fits.dv.merge): The number of replications is lower
#> than the recommended minimum of 500. Consider with care.
#> $cutoffs
#> CFI model alpha
#> 1 0.981 original 0.001
#> 2 0.943 merged 0.001
#> 3 0.981 original 0.010
#> 4 0.943 merged 0.010
#> 5 0.981 original 0.050
#> 6 0.943 merged 0.050
#> 7 0.982 original 0.100
#> 8 0.944 merged 0.100
#>
#> $fit.values
#> CFI model
#> 1 0.960 original
#> 2 0.952 merged
#>
#> $differences
#> CFI
#> cutoff 0.001 0.038
#> cutoff 0.01 0.038
#> cutoff 0.05 0.038
#> cutoff 0.1 0.038
#> fit 0.008
#>
#> $decisions
#> CFI
#> cutoff 0.001 confirmed
#> cutoff 0.01 confirmed
#> cutoff 0.05 confirmed
#> cutoff 0.1 confirmed
#>
#> $replications
#> [1] 10
#>
#> $comment
#> [1] "Approach for discriminant validity testing: merging. Discriminant validity is confirmed if the fit value from the constrained/merged model is smaller (GoF) / larger (BoF) than the respective cutoff of original model."
The function returns the cutoffs for different levels of alpha and
the two models, where the first model is the original model and the
second model is termed constrained or merged based on the approach
selected in the simulation. Further, the actual fit values are
automatically provided. Hence, the constrained or merged model does not
needed to be defined explicitly. Differences and decisions based on the
different alpha values are provided as well. Finally, the number of
replications as well as a comment highlighting the approach and an
interpretation of the decision basis (the simple test) are given. Please
note that the decisions are identical to the ones made by Rönkkö &
Cho’s tool in semTools::discriminantValidity
.
Babakus, E., & Boller, G. W. (1992). An empirical assessment of the SERVQUAL scale. Journal of Business Research, 24(3), 253–268. https://doi.org/10.1016/0148-2963(92)90022-4
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1–55. https://doi.org/10.1080/10705519909540118
Mai, R., Niemand, T., & Kraus, S. (2021). A Tailor-Fit Model Evaluation Strategy for Better Decisions about Structural Equation Models. Technological Forecasting & Social Change, 173(December) 121142. https://doi.org/10.1016/j.techfore.2021.121142
Meade, A. W., Johnson, E. C., & Braddy, P. W. (2008). Power and sensitivity of alternative fit indices in tests of measurement invariance. Journal of Applied Psychology, 93(3), 568-592. https://doi.org/10.1037/0021-9010.93.3.568
Niemand, T., & Mai, R. (2018). Flexible cutoff values for fit indices in the evaluation of structural equation models. Journal of the Academy of Marketing Science, 46(6), 1148—1172. https://doi.org/10.1007/s11747-018-0602-9
Rönkkö, M., & Cho, E. (2020). An updated guideline for assessing
discriminant validity. Organizational Research Methods. https://doi.org/10.1177/1094428120968614
Comment: V12042022