STRUCTURAL EQUATION MODELING-A MULTIDISCIPLINARY JOURNAL

Does Cluster-Robust Estimation Provide Within-Study Effects? A Comparison of Individual Participant Data Methods in MASEM
Groot LJ, Kan KJ and Jak S
Researchers conducting meta-analytical structural equation modeling (MASEM) with individual participant data can choose from several methods, including cluster-robust estimation, two-level SEM, multivariate meta-analysis of path coefficients, and One-Stage MASEM (OSMASEM). While two-level SEM and OSMASEM model within- and between-study effects separately, cluster-robust estimation combines them, estimating an overall path coefficient. Despite its popularity, cluster-robust estimation often yields results that differ from other methods. Simulations using factor models and real-world comparisons using path models show that it may not accurately reflect within-study estimates and can produce biased standard errors. This study compares IPD MASEM methods using simulated data, varying intraclass correlations, parameter equality across levels, number of studies, and missing data. Results reveal that cluster-robust estimation frequently misrepresents within-study estimates, produces biased standard errors, and tends to incorrectly reject model fit, highlighting the need for careful method selection in IPD MASEM applications.
Testing for Within × Within and Between × Within Moderation using Random Intercept Cross-Lagged Panel Models
Speyer LG, Ushakova A, Blakemore SJ, Murray AL and Kievit R
Random-Intercept Cross-Lagged Panel Models allow for the decomposition of measurements into between- and within-person components and have hence become popular for testing developmental hypotheses. Here, we describe how developmental researchers can implement, test and interpret interaction effects in such models using an empirical example from developmental psychopathology research. We illustrate the analysis of Within × Within and Between × Within interactions utilising data from the United Kingdom-based Millennium Cohort Study within a Bayesian Structural Equation Modelling framework. We provide annotated Mplus code, allowing users to isolate, estimate and interpret the complexities of within-person and between-person dynamics as they unfold over time.
Products of Variables in Structural Equation Models
Boker S, von Oertzen T, Pritikin JN, Hunter MD, Brick T, Brandmaier A and Neale M
A general method is introduced in which variables that are products of other variables in the context of a structural equation model (SEM) can be decomposed into the sources of variance due to the multiplicands. The result is a new category of SEM which we call a Products of Variables Model (PoV). Some useful and practical features of PoV models include estimation of interactions between latent variables, latent variable moderators, manifest moderators with missing values, and manifest or latent squared terms. Expected means and covariances are analytically derived for a simple product of two variables and it is shown that the method reproduces previously published results for this special case. It is shown algebraically that using centered multiplicands results in an unidentified model, but if the multiplicands have non-zero means, the result is identified. The method has been implemented in OpenMx and Ωnyx and is applied in five extensive simulations.
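The simple two-variable case mentioned above can be checked numerically. The following is a minimal sketch, not the authors' derivation or their OpenMx implementation: it assumes the two multiplicands are bivariate normal and uses the standard textbook expressions for the mean and variance of their product. The function name `product_moments` and all numeric values are illustrative.

```python
import numpy as np

def product_moments(mu_x, mu_y, var_x, var_y, cov_xy):
    """Mean and variance of X*Y, assuming (X, Y) is bivariate normal."""
    mean = mu_x * mu_y + cov_xy
    var = (mu_x**2 * var_y + mu_y**2 * var_x
           + 2 * mu_x * mu_y * cov_xy
           + var_x * var_y + cov_xy**2)
    return mean, var

# Illustrative (assumed) population values with non-zero means.
mu = np.array([1.0, 2.0])
sigma = np.array([[1.0, 0.3],
                  [0.3, 1.5]])

expected_mean, expected_var = product_moments(
    mu[0], mu[1], sigma[0, 0], sigma[1, 1], sigma[0, 1]
)

# Monte Carlo check of the analytic moments.
rng = np.random.default_rng(0)
draws = rng.multivariate_normal(mu, sigma, size=500_000)
prod = draws[:, 0] * draws[:, 1]

print("analytic :", expected_mean, expected_var)
print("simulated:", prod.mean(), prod.var())
```

With non-zero means, the product's variance depends on the means of the multiplicands as well as on their variances and covariance; setting both means to zero removes the first three terms.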
The impact of omitting confounders in parallel process latent growth curve mediation models: Three sensitivity analysis approaches
Liu X, Zhang Z, Valentino K and Wang L
Parallel process latent growth curve mediation models (PP-LGCMMs) are frequently used to longitudinally investigate the mediation effects of treatment on the level and change of outcome through the level and change of mediator. An important but often violated assumption in empirical PP-LGCMM analysis is the absence of omitted confounders of the relationships among treatment, mediator, and outcome. In this study, we analytically examined how omitting pretreatment confounders impacts the inference of mediation from the PP-LGCMM. Using the analytical results, we developed three sensitivity analysis approaches for the PP-LGCMM, including the frequentist, Bayesian, and Monte Carlo approaches. The three approaches help investigate different questions regarding the robustness of mediation results from the PP-LGCMM, and handle the uncertainty in the sensitivity parameters differently. Applications of the three sensitivity analyses are illustrated using a real-data example. A user-friendly Shiny web application is developed to conduct the sensitivity analyses.
Dynamic Structural Equation Models with Missing Data: Data Requirements on N and T
Fang Y and Wang L
Dynamic structural equation modeling (DSEM) is a useful technique for analyzing intensive longitudinal data. A challenge of applying DSEM is the missing data problem. The impact of missing data on DSEM, especially on widely applied DSEM such as the two-level vector autoregressive (VAR) cross-lagged models, however, is understudied. To fill the research gap, we evaluated how well the fixed effects and variance parameters in two-level bivariate VAR models are recovered under different missingness percentages, sample sizes, numbers of time points, and degrees of heterogeneity in missingness distributions through two simulation studies. To facilitate the use of DSEM under customized data and model scenarios (different from those in our simulations), we provided illustrative examples of how to conduct Monte Carlo simulations in Mplus to determine whether a data configuration is sufficient to obtain accurate and precise results from a specific DSEM.
Fitting Multilevel Vector Autoregressive Models in Stan, JAGS, and Mplus
Li Y, Wood J, Ji L, Chow SM and Oravecz Z
The influx of intensive longitudinal data creates a pressing need for complex modeling tools that help enrich our understanding of how individuals change over time. Multilevel vector autoregressive (mlVAR) models allow for simultaneous evaluations of reciprocal linkages between dynamic processes and individual differences, and have gained increased recognition in recent years. High-dimensional and other complex variations of mlVAR models, though often computationally intractable in the frequentist framework, can be readily handled using Markov chain Monte Carlo techniques in a Bayesian framework. However, researchers in social science fields may be unfamiliar with ways to capitalize on recent developments in Bayesian software programs. In this paper, we provide step-by-step illustrations and comparisons of options to fit Bayesian mlVAR models using Stan, JAGS and Mplus, supplemented with a Monte Carlo simulation study. An empirical example is used to demonstrate the utility of mlVAR models in studying intra- and inter-individual variations in affective dynamics.
Are the Signs of Factor Loadings Arbitrary in Confirmatory Factor Analysis? Problems and Solutions
Tang D, Boker SM and Tong X
The replication crisis in social and behavioral sciences has raised concerns about the reliability and validity of empirical studies. While research in the literature has explored contributing factors to this crisis, the issues related to analytical tools have received less attention. This study focuses on a widely used analytical tool - confirmatory factor analysis (CFA) - and investigates one issue that is typically overlooked in practice: accurately estimating factor-loading signs. Incorrect loading signs can distort the relationship between observed variables and latent factors, leading to unreliable or invalid results in subsequent analyses. Our study aims to investigate and address the estimation problem of factor-loading signs in CFA models. Based on an empirical demonstration and Monte Carlo simulation studies, we found current methods have drawbacks in estimating loading signs. To address this problem, three solutions are proposed and proven to work effectively. The applications of these solutions are discussed and elaborated.
Estimating latent baseline-by-treatment interactions in statistical mediation analysis
Gonzalez O, Millechek JR and Georgeson AR
Statistical mediation analysis is used to uncover intermediate variables, known as mediators [M], that explain how a treatment [X] changes an outcome [Y]. Often, researchers examine whether baseline levels of M and Y moderate the effect of X on posttest M or Y. However, there is limited guidance on how to estimate baseline-by-treatment interaction (BTI) effects when M and Y are latent variables, which entails the estimation of latent interaction effects. In this paper, we discuss two general approaches for estimating latent BTI effects in mediation analysis: using structural models or scoring latent variables prior to estimating observed BTIs and correcting for unreliability. We present simulation results describing bias, power, type 1 error rates, and interval coverage of the latent BTIs and mediated effects estimated using these approaches. These methods are also illustrated with an applied example. R and Mplus syntax are provided to facilitate the implementation of these approaches.
A Growth of Hierarchical Autoregression Model for Capturing Individual Differences in Changes of Dynamic Characteristics of Psychological Processes
Li Y, Williams L, Muth C, Heshmati S, Chow SM and Oravecz Z
Several methodological innovations have been advanced in the past decades that combine growth curve models (GCMs) with models of autoregressive (AR) processes. However, most of these approaches do not effectively capitalize on known (e.g., study design-related) information to structure the growth curves into meaningful between- and within-phase changes, while simultaneously accommodating interindividual differences in these intraindividual changes. We propose a Bayesian growth of hierarchical autoregression (GoHiAR) model, which combines AR and GCM to evaluate phase-to-phase changes in multifaceted dynamic characteristics (e.g., baseline, variability, and inertia) as well as individual differences in these changes. This approach allows for drawing conclusions in a way that the proposed data generating mechanisms are in line with the theoretical insights about psychological change and dynamics. Our Bayesian implementation of the GoHiAR model allows for all parameters to be estimated simultaneously. First, we evaluated GoHiAR's overall estimation accuracy and sampling efficiency, effects of model misspecifications, and sensitivity to effect sizes via a simulation study. Results showed reasonable performance. Then, we applied GoHiAR to an ecological momentary assessment (EMA) study that comprised data from pre-, during, and following an intervention, and investigated changes in the dynamic characteristics of individuals' psychological well-being (specifically in meaning of life) within and across phases.
Deriving Expected Values of Model Parameters when Using Sum Scores in Simulation Research
Georgeson AR
There is increasing interest in using factor scores in structural equation models and there have been numerous methodological papers on the topic. Nevertheless, sum scores, which are computed by adding up item responses, continue to be ubiquitous in practice. It is therefore important to compare simulation results involving factor scores to those of sum scores so that applied researchers can understand the advantages. Yet, researchers do not often compare sum scores and factor scores in terms of bias, a common simulation outcome. A reason for this is that sum scores are on a different scale and it is unclear how to compare sum scores to other types of scores. The purpose of this paper is to provide guidance for methodological researchers who study scoring on how to compute bias for sum scores by obtaining the expected values of their model parameters under a sum score model.
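To illustrate what an expected value under a sum score model can look like, here is a minimal sketch. It is not the paper's procedure: it assumes a one-factor model with uncorrelated unique factors, an outcome that depends linearly on the factor with slope `beta`, and hypothetical loadings and unique variances. Under those assumptions the moments of the sum score follow from basic covariance algebra.

```python
import numpy as np

def sum_score_moments(loadings, uniquenesses, factor_var=1.0,
                      factor_mean=0.0, intercepts=None):
    """Mean, variance, and covariance with the factor for S = sum of items.

    Assumes items y_j = nu_j + lambda_j * eta + e_j with mutually
    uncorrelated unique factors e_j that are also uncorrelated with eta.
    """
    lam = np.asarray(loadings, dtype=float)
    theta = np.asarray(uniquenesses, dtype=float)
    nu = np.zeros_like(lam) if intercepts is None else np.asarray(intercepts, dtype=float)
    mean = nu.sum() + lam.sum() * factor_mean
    var = lam.sum() ** 2 * factor_var + theta.sum()
    cov_with_factor = lam.sum() * factor_var
    return mean, var, cov_with_factor

def expected_slope_on_sum_score(beta, loadings, uniquenesses, factor_var=1.0):
    """Expected slope when regressing Y = beta * eta + error on S."""
    _, var_s, cov_s_eta = sum_score_moments(loadings, uniquenesses, factor_var)
    return beta * cov_s_eta / var_s

# Hypothetical population values, chosen only for illustration.
loadings = [0.7, 0.8, 0.6, 0.9]
uniquenesses = [0.51, 0.36, 0.64, 0.19]

mean_s, var_s, cov_s_eta = sum_score_moments(loadings, uniquenesses)
slope = expected_slope_on_sum_score(0.5, loadings, uniquenesses)
print(mean_s, var_s, cov_s_eta, slope)
```

In a simulation, bias for the sum score condition would then be computed against `slope` rather than against the latent slope `beta`, since the two are on different scales.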
Unsupervised Model Construction in Continuous-Time
Park JJ, Fisher ZF, Hunter MD, Shenk C, Russell M, Molenaar PCM and Chow SM
Many of the advancements reconciling individual- and group-level results have occurred in the context of a discrete-time modeling framework. Discrete-time models are intuitive and offer relatively simple interpretations for the resulting dynamic structures; however, they do not possess the flexibility of models fitted in the continuous-time framework. We introduce ct-gimme, a continuous-time extension of the group iterative multiple model estimation (GIMME; Gates & Molenaar, 2012) procedure which enables researchers to fit complex, high dimensional dynamic networks in continuous-time. Our results indicate that ct-gimme outperforms model fitting in continuous-time by pooling information across multiple subjects. Likewise, ct-gimme outperforms group-level model fitting in the presence of within-sample heterogeneity. We conclude with an empirical illustration and highlight limitations of the approach relating to identification of meaningful starting values.
Measurement Model Misspecification in Dynamic Structural Equation Models: Power, Reliability, and Other Considerations
Oh H, Hunter MD and Chow SM
Dynamic Structural Equation Models (DSEMs) integrate multilevel modeling, time series analysis, and structural equation modeling within a Bayesian estimation framework, offering a versatile tool for analyzing intensive longitudinal data (ILD). However, the impact of measurement structure misspecification in DSEMs, especially under varying reliability conditions and model complexities, remains underexplored. Our Monte Carlo simulation revealed that omitting measurement errors when present led to severe biases in dynamic parameters regardless of reliability conditions, though power remained high. Increasing the number of participants and time points ameliorated but did not eliminate all biases. Single-indicator DSEMs with a measurement structure using composite scores showed similar performance to multiple-indicator DSEMs. Empirical applications showed discrepancies in dynamic parameters based on the number of indicators and measurement structures used. Leveraging these findings, we provide design recommendations, functions for extending reliability indices from single-indicator to multiple-indicator models, and guidelines for power evaluations under different reliability conditions.
Two-Step Multilevel Latent Class Analysis in the Presence of Measurement Non-Equivalence
Lyrvall J, Kuha J and Oser J
We consider estimation of two-level latent class models for clustered data, when the measurement model for the observed measurement items includes non-equivalence of measurement with respect to some observed covariates. The parameters of interest are coefficients in structural models for the latent classes given covariates. We propose a two-step method of estimation. This extends previously proposed methods of two-step estimation for models without non-equivalence of measurement by specifying the model used in the first step in such a way that it correctly accounts for non-equivalence. The properties of these two-step estimators are examined using simulation studies and an applied example.
Regression-Equivalent Effect Sizes for Latent Growth Modeling and Associated Null Hypothesis Significance Tests
Feingold A
The effect of an independent variable on random slopes in growth modeling with latent variables is conventionally used to examine predictors of change over the course of a study. This tutorial demonstrates that the same effect of a covariate on growth can be obtained by using final status centering for parameterization and regressing the random intercepts (or the intercept factor scores) on both the independent variable and a baseline covariate--the framework used to study change with classical regression analysis. Examples are provided that illustrate the application of an intercept-focused approach to obtain effect sizes--the unstandardized regression coefficient, the standardized regression coefficient, squared semi-partial correlation, and Cohen's d--that estimate the same parameters as respective effect sizes from a classical regression analysis. Moreover, statistical power to detect the effect of the predictor on growth was greater when using random intercepts than the conventionally used random slopes.
Optimal Instrument Selection using Bayesian Model Averaging for Model Implied Instrumental Variable Two Stage Least Squares Estimators
Henry T, Fisher Z and Bollen K
Model-Implied Instrumental Variable Two-Stage Least Squares (MIIV-2SLS) is a limited-information, equation-by-equation, non-iterative estimator for latent variable models. Associated with this estimator are equation-specific tests of model misspecification. One issue with equation-specific tests is that they lack specificity, in that they indicate that some instruments are problematic without revealing which specific ones. Instruments that are poor predictors of their target variables ("weak instruments") are a second potential problem. We propose a novel extension that provides instrument-specific tests of misspecification and detects weak instruments. We term this the Model-Implied Instrumental Variable Two-Stage Bayesian Model Averaging (MIIV-2SBMA) estimator. We evaluate the performance of MIIV-2SBMA against MIIV-2SLS in a simulation study and show that it has comparable performance in terms of parameter estimation. Additionally, our instrument-specific overidentification tests developed within the MIIV-2SBMA framework show increased power to detect specific problematic and weak instruments. Finally, we demonstrate MIIV-2SBMA using an empirical example.
Utilizing Moderated Non-linear Factor Analysis Models for Integrative Data Analysis: A Tutorial
Kush JM, Masyn KE, Amin-Esmaeili M, Susukida R, Wilcox HC and Musci RJ
Integrative data analysis (IDA) is an analytic tool that allows researchers to combine raw data across multiple, independent studies, providing improved measurement of latent constructs as compared to single-study analysis or meta-analyses. This is often achieved through implementation of moderated nonlinear factor analysis (MNLFA), an advanced modeling approach that allows for covariate moderation of item and factor parameters. The current paper provides an overview of this modeling technique, highlighting distinct advantages most apt for IDA. We further illustrate the complex model-building process involved in MNLFA by providing a tutorial using empirical data from five separate prevention trials. The code and data used for analyses are also provided.
Bias-Adjusted Three-Step Multilevel Latent Class Modeling with Covariates
Lyrvall J, Bakk Z, Oser J and Di Mari R
We present a bias-adjusted three-step estimation approach for multilevel latent class models (LC) with covariates. The proposed approach involves (1) fitting a single-level measurement model while ignoring the multilevel structure, (2) assigning units to latent classes, and (3) fitting the multilevel model with the covariates while controlling for measurement error introduced in the second step. Simulation studies and an empirical example show that the three-step method is a legitimate modeling option compared to the existing one-step and two-step methods.
Teacher's Corner: An R Shiny App for Sensitivity Analysis for Latent Growth Curve Mediation
Kruger ES, Tofighi D, Hsiao YY, MacKinnon DP, Lee Van Horn M and Witkiewitz K
Mechanisms of behavior change are the processes through which interventions are hypothesized to cause changes in outcomes. Latent growth curve mediation models (LGCMM) are recommended for investigating the mechanisms of behavior change because LGCMM models establish temporal precedence of change from the mediator to the outcome variable. The Correlated Augmented Mediation Sensitivity Analyses (CAMSA) App implements sensitivity analysis for LGCMM models to evaluate if a mediating path (mechanism) is robust to potential confounding variables. The CAMSA approach is described and applied to simulated data, and data from a research study exploring a mechanism of change in the treatment of substance use disorder.
Using SymPy (Symbolic Python) for understanding Structural Equation Modeling
Steele JS and Grimm KJ
Structural Equation Modeling (SEM) continues to grow in popularity with numerous articles, books, courses, and workshops available to help researchers become proficient with SEM quickly. However, few resources are available to help users gain a deep understanding of the analytic steps involved in SEM, with even fewer providing reproducible syntax for those learning the technique. This work builds on the original work by Ferron and Hess (2007) to provide computer syntax, written in Python, for the specification, estimation, and numerical optimization steps necessary for SEM. The goal is to provide readers with many of the numerical and analytic details of SEM that may not be regularly taught in workshops and courses. This work extends the original demonstration by Ferron and Hess to incorporate the reticular action model notation for specification as well as the estimation of variable means. All of the code listed is provided in the appendix.
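As a flavor of the symbolic specification step, the sketch below derives a model-implied covariance matrix in reticular action model (RAM) notation with SymPy. It is not the article's appendix code: the one-factor, three-indicator model, the variable ordering, and all symbol names are assumptions made for illustration. It uses the standard RAM expression Sigma = F (I - A)^{-1} S (I - A)^{-T} F^T.

```python
import sympy as sp

# Free parameters of an assumed one-factor model with three indicators;
# the first loading is fixed to 1 to set the scale of the factor.
l2, l3, phi, t1, t2, t3 = sp.symbols(
    "lambda2 lambda3 phi theta1 theta2 theta3", positive=True
)

# Variables are ordered [y1, y2, y3, eta].
# A holds directed paths: entry (row, col) is the path from col to row.
A = sp.zeros(4, 4)
A[0, 3] = 1
A[1, 3] = l2
A[2, 3] = l3

# S holds variances and covariances (two-headed arrows).
S = sp.diag(t1, t2, t3, phi)

# F filters out the latent variable, leaving the observed ones.
F = sp.Matrix([[1, 0, 0, 0],
               [0, 1, 0, 0],
               [0, 0, 1, 0]])

B = (sp.eye(4) - A).inv()
Sigma = (F * B * S * B.T * F.T).applyfunc(sp.expand)

sp.pprint(Sigma)
```

The symbolic result can then be compared element by element with a sample covariance matrix, or passed to `sp.lambdify` to build a numeric fit function for optimization.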
Effects of Mixing Weights and Predictor Distributions on Regression Mixture Models
Sherlock P, DiStefano C and Habing B
Accommodating Continuous Time Metrics within the Discrete-time Latent Change Score Model Using Definition Variables
Serang S, Whiteman SD and Reese AH
Longitudinal models typically represent change as a function of a single time metric. However, the COVID-19 pandemic prompted researchers to consider whether changes are a function of phases of the pandemic while simultaneously accommodating age. This paper proposes an extension of the discrete-time latent change score modeling framework to model wave-to-wave changes while accounting for time more precisely by including continuous time metrics via regressing out initial age and using definition variables instead of bins. The approach is motivated by and applied to data involving adolescent sibling influence in expectations about marijuana. A simulation study shows how our approach compares to models that use wave without regressing out initial age or using definition variables.