MULTIVARIATE BEHAVIORAL RESEARCH

Neural Network Analysis of Psychological Data: A Step-by-Step Guide
Tong L and Zhang Z
Artificial neural networks (ANNs) have attracted increasing attention in the field of psychology. With the availability of software programs, wide application of ANNs has become possible. However, without a firm understanding of ANN basics, issues can easily arise. This article presents a step-by-step guide for implementing a feed-forward neural network (FNN) on a psychological data set to illustrate the critical steps in building, estimating, and interpreting a neural network model. We start with a concrete example of a basic 3-layer FNN, illustrating the core concepts, the matrix representation, and the whole optimization process. By adjusting parameters and changing the model structure, we examine their effects on model performance. Then, we introduce accessible methods for interpreting model results and making inferences. Through this guide, we hope to help researchers avoid common problems in applying neural network models and machine learning methods in general.
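A minimal sketch of the kind of 3-layer feed-forward network the tutorial describes, written in NumPy. The architecture, synthetic data, and hyperparameters are illustrative assumptions, not the authors' code or data.

```python
# Minimal 3-layer FNN (input -> hidden -> output) trained by full-batch gradient
# descent on synthetic binary data. Illustrative assumptions throughout.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))                          # 200 cases, 4 predictors
y = (X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(float).reshape(-1, 1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_hidden, lr = 8, 0.5
W1 = rng.normal(scale=0.1, size=(4, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, 1)); b2 = np.zeros(1)

for epoch in range(2000):
    # forward pass
    H = sigmoid(X @ W1 + b1)           # hidden activations
    p = sigmoid(H @ W2 + b2)           # predicted probabilities
    # backward pass for cross-entropy loss
    d_out = (p - y) / len(y)
    dW2 = H.T @ d_out; db2 = d_out.sum(0)
    d_hid = (d_out @ W2.T) * H * (1 - H)
    dW1 = X.T @ d_hid; db1 = d_hid.sum(0)
    # gradient-descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print("training accuracy:", float(((p > 0.5) == y).mean()))
```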
Novel Full-Bayesian and Hybrid-Bayesian Approaches for Modeling Intraindividual Variability
Fang Y and Wang L
Intraindividual variability (IIV) characterizes the amplitude and temporal dependency of short-term fluctuations of a variable and is often used to predict outcomes in psychological studies. However, how to properly model IIV is understudied. In particular, intraindividual standard deviation (or variance), which quantifies the amplitude of fluctuation of a variable around its mean level, can be challenging to model directly in popular latent variable frameworks, such as dynamic structural equation modeling (DSEM). In this study, we introduced three novel modeling methods, including two two-step hybrid-Bayesian methods using DSEM and a one-step full Bayesian method, to model IIV as predictors. We conducted a simulation study to evaluate the performance of the three methods and compared their performance to that of the conventional regression approach under various data conditions. Simulation results showed that the hybrid-Bayesian approach with multiple draws (HBM) and the one-step full Bayesian (FB) approach performed well in recovering the parameters when sufficient sample size and time points were available. The data requirements of FB were lower than those of HBM. However, the conventional approach and the hybrid-Bayesian approach with a single draw failed to recover parameters, even with large samples. We provided a simulated data example with code online to illustrate the use of the methods.
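For orientation, a sketch of the conventional two-step approach that the abstract says can fail: compute each person's intraindividual SD from their time series, then use it as an observed predictor. Data, variable names, and effect sizes are made up; the Bayesian alternatives from the article are not shown.

```python
# Conventional two-step IIV approach (illustrative): step 1 computes each person's
# intraindividual SD (iSD); step 2 regresses an outcome on the observed iSD.
# Because the observed iSD ignores sampling error, the slope tends to be attenuated.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n_persons, n_times = 200, 14
true_isd = rng.uniform(0.5, 2.0, n_persons)                # person-specific fluctuation amplitude
series = rng.normal(0, true_isd[:, None], (n_persons, n_times))
outcome = 0.5 * true_isd + rng.normal(scale=0.5, size=n_persons)

isd_hat = series.std(axis=1, ddof=1)                       # step 1: observed iSD per person
fit = sm.OLS(outcome, sm.add_constant(isd_hat)).fit()      # step 2: regress outcome on iSD
print(fit.params)   # slope is typically somewhat below the generating value of 0.5
```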
The Effects of Data Preprocessing Choices on Behavioral RCT Outcomes: A Multiverse Analysis
Veltri GA
Seemingly routine data-preprocessing choices can exert outsized influence on the conclusions drawn from randomized controlled trials (RCTs), particularly in behavioral science where data are noisy, skewed and replete with outliers. We demonstrate this influence with two fully specified multiverse analyses on simulated RCT data. Each analysis spans 180 analytical pathways, produced by crossing 36 preprocessing pipelines that vary outlier handling, missing-data imputation and scale transformation, with five common model specifications. In Simulation A, which uses linear regression families, preprocessing decisions explain 76.9% of the total variance in estimated treatment effects, whereas model choice explains only 7.5%. In Simulation B, which replaces the linear models with advanced algorithms (generalized additive models, random forests, gradient boosting), the dominance of preprocessing is even clearer: 99.8% of the variance is attributable to data handling and just 0.1% to model specification. The ranges of mean effects show the same pattern (4.34 vs. 1.43 in Simulation A; 15.30 vs. 0.56 in Simulation B). Particular pipelines, most notably those that standardize or log-transform variables, shrink effect estimates by more than 90% relative to the raw-data baseline, while pipelines that leave the original scale intact can inflate effects by an order of magnitude. Because preprocessing choices can overshadow even large shifts in statistical methodology, we call for meticulous reporting of these steps and for routine sensitivity or multiverse analyses that make their impact transparent. Such practices are essential for improving the robustness and replicability of behavioral-science RCTs.
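A much smaller multiverse than the 180-pathway analyses in the article, sketched under illustrative assumptions: a few preprocessing pipelines are crossed with a few model specifications on simulated RCT data, and the variance in the estimated treatment effect is decomposed into preprocessing versus model choice.

```python
# Toy multiverse analysis: cross preprocessing pipelines with model specifications,
# collect treatment-effect estimates, and ask how much variance in those estimates
# is due to preprocessing vs. model choice. All settings are illustrative.
import itertools
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 400
treat = rng.integers(0, 2, n)
age = rng.normal(40, 10, n)
y = np.exp(rng.normal(1.0 + 0.2 * treat + 0.01 * age, 0.8))    # skewed outcome with a true effect
df0 = pd.DataFrame({"y": y, "treat": treat, "age": age})

def preprocess(df, outliers, transform):
    df = df.copy()
    if outliers == "winsorize":
        lo, hi = df["y"].quantile([0.05, 0.95])
        df["y"] = df["y"].clip(lo, hi)
    if transform == "log":
        df["y"] = np.log(df["y"])
    elif transform == "zscore":
        df["y"] = (df["y"] - df["y"].mean()) / df["y"].std()
    return df

pipelines = list(itertools.product(["none", "winsorize"], ["raw", "log", "zscore"]))
models = {"simple": "y ~ treat", "adjusted": "y ~ treat + age"}

rows = []
for (outliers, transform), (mname, formula) in itertools.product(pipelines, models.items()):
    d = preprocess(df0, outliers, transform)
    est = smf.ols(formula, data=d).fit().params["treat"]
    rows.append({"pipeline": f"{outliers}/{transform}", "model": mname, "estimate": est})
res = pd.DataFrame(rows)

# crude decomposition of variance in estimates across the multiverse
anova = smf.ols("estimate ~ C(pipeline) + C(model)", data=res).fit()
print(sm.stats.anova_lm(anova, typ=2))
```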
Bayesian Multilevel Compositional Data Analysis with the R Package
Le F, Dumuid D, Stanford TE and Wiley JF
Multilevel compositional data, such as data sampled over time that are non-negative and sum to a constant value, are common in various fields. However, there is currently no software specifically built to model compositional data in a multilevel framework. The package implements a collection of tools for modeling compositional data in a Bayesian multivariate, multilevel pipeline. The user-friendly setup only requires the data, model formula, and minimal specification of the analysis. This article outlines the statistical theory underlying the Bayesian compositional multilevel modeling approach and details the implementation of the functions available in the package, using an example dataset of compositional daily sleep-wake behaviors. This innovative method can be used to robustly answer scientific questions from the increasingly available multilevel compositional data from intensive, longitudinal studies.
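The R package itself is not reproduced here; the sketch below only shows the isometric log-ratio (ilr) "pivot" coordinates that are commonly used to express a composition (for example, a 24-hour sleep-wake day) in unconstrained real coordinates before multilevel modeling. The example numbers are hypothetical.

```python
# Pivot ilr coordinates of a strictly positive composition (illustrative example).
import numpy as np

def ilr_pivot(x):
    """Pivot ilr coordinates of a strictly positive composition x of length D."""
    x = np.asarray(x, dtype=float)
    D = len(x)
    z = np.empty(D - 1)
    for j in range(D - 1):
        rest = x[j + 1:]
        gm = np.exp(np.mean(np.log(rest)))            # geometric mean of the remaining parts
        z[j] = np.sqrt((D - j - 1) / (D - j)) * np.log(x[j] / gm)
    return z

# hypothetical day: sleep, sedentary time, light activity, moderate-to-vigorous activity (hours)
day = np.array([8.0, 9.5, 4.5, 2.0])
print(ilr_pivot(day / day.sum()))   # coordinates are invariant to the closure constant
```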
Sample Size Determination for Optimal and Sub-Optimal Designs in Simplified Parametric Test Norming
Innocenti F and Cassese A
Norms play a critical role in high-stakes individual assessments (e.g., diagnosing intellectual disabilities), where precision and stability are essential. To reduce fluctuations in norms due to sampling, normative studies must be based on sufficiently large and well-designed samples. This paper provides formulas, applicable to any sample composition, for determining the required sample size for normative studies under the simplified parametric norming framework. In addition to a sufficiently large sample size, precision can be further improved by sampling according to an optimal design, that is, a sample composition that minimizes sampling error in the norms. Optimal designs are derived here for 45 (multivariate) multiple linear regression models, assuming normality and homoscedasticity. These models vary in the degree of interaction among three norm-predictors: a continuous variable (e.g., age), a categorical variable (e.g., sex), and a variable (e.g., education) that may be treated as either continuous or categorical. To support practical implementation, three interactive Shiny apps are introduced, enabling users to determine the sample size for their normative studies. Their use is demonstrated through the hypothetical planning of a normative study for the Trail Making Test, accompanied by a review of the most common models for this neuropsychological test in current practice.
Multilevel Metamodels: Enhancing Inference, Interpretability, and Generalizability in Monte Carlo Simulation Studies
Gilbert JB and Miratrix LW
Metamodels, or the regression analysis of Monte Carlo simulation results, provide a powerful tool to summarize simulation findings. However, an underutilized approach is the multilevel metamodel (MLMM) that accounts for the dependent data structure that arises from fitting multiple models to the same simulated data set. In this study, we articulate the theoretical rationale for the MLMM and illustrate how it can improve the interpretability of simulation results, better account for complex simulation designs, and provide new insights into the generalizability of simulation findings.
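A toy illustration, under invented settings, of the dependence structure the abstract describes: several estimators are applied to the same simulated data sets, so performance scores are clustered within data set, and a random intercept per data set can absorb that dependence.

```python
# Toy multilevel metamodel (MLMM): performance scores from several methods applied to
# the same simulated data sets, analyzed with a random intercept for data set.
# Method names, effects, and sample sizes are all invented for illustration.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
rows = []
for dataset in range(300):
    n = rng.choice([50, 200, 500])
    dataset_shift = rng.normal(0, 0.05)                    # shared difficulty of this data set
    for method, method_bias in [("ols", 0.00), ("ridge", -0.02), ("lasso", -0.05)]:
        bias = method_bias + dataset_shift + rng.normal(0, 0.03) + 0.5 / np.sqrt(n)
        rows.append({"dataset": dataset, "n": n, "method": method, "bias": bias})
sim = pd.DataFrame(rows)

# fixed effects of method and sample size, random intercept for data set
mlmm = smf.mixedlm("bias ~ C(method) + np.log(n)", data=sim, groups=sim["dataset"]).fit()
print(mlmm.summary())
```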
Detecting Transition Points in the Slope-Intercept Relation in Linear Latent Growth Models
Lee D and Hancock GR
In a linear latent growth model parameterized by intercept (α) and slope (β) factors, those factors' relation is often of interest. The model typically captures this through their covariance parameter, which inherently assumes linearity in their relation. However, this assumption may not always hold. For instance, α and β might be unrelated below a certain threshold along the α-axis but show a meaningful relation above it. That is, even though individual growth trajectories may follow a linear pattern over time, the relation between α and β can be nonlinear, potentially featuring distinct segments separated by a transition point. To address such relations, we propose a semiparametric approach that combines Bayesian P-splines for flexible nonlinear modeling of the α-β relation along with a segmented regression-based transition point detection method. This two-stage analytic approach provides for a more nuanced understanding of the α-β relation, including estimation of a potential transition point where the α-β relation structure fundamentally changes. Simulation results and an empirical data illustration support this approach's effectiveness with single transition point scenarios, offering deeper insights into aspects of the growth process.
Targeted Maximum Likelihood Estimation for Causal Inference With Observational Data: The Example of Private Tutoring
Jindra C and Sachse KA
State-of-the-art causal inference methods for observational data promise to relax assumptions threatening valid causal inference. Targeted maximum likelihood estimation (TMLE), for example, is a template for constructing doubly robust, semiparametric, efficient substitution estimators, providing consistent estimates if the outcome or treatment model is correctly specified. Compared to standard approaches, it reduces the risk of misspecification bias by allowing (nonparametric) machine-learning techniques, including super learning, to estimate the relevant components of the data distribution. We briefly introduce TMLE and demonstrate its use by estimating the effects of private tutoring in mathematics during Year 7 on mathematics proficiency and grades using observational data from starting cohort 3 of the National Education Panel Study (N = 4,167). We contrast TMLE estimates to those from ordinary least squares, the parametric G-formula, and the augmented inverse-probability weighted estimator. Our findings reveal close agreement between methods for end-of-year grades. However, variations emerge when examining mathematics proficiency as the outcome, highlighting that substantive conclusions may depend on the analytical approach. The results underscore the significance of employing advanced causal inference methods, such as TMLE, when navigating the complexities of observational data and highlight the nuanced impact of methodological choices on the interpretation of study outcomes.
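A minimal sketch of the TMLE targeting step for the average treatment effect with a binary treatment and binary outcome. Plain logistic regressions stand in for the super learner, and the data are simulated; this is not the NEPS analysis reported in the article.

```python
# Minimal TMLE sketch (binary treatment A, binary outcome Y, covariates W).
# Illustrative simulation; logistic regressions replace super learning.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 2000
W = rng.normal(size=(n, 2))
A = rng.binomial(1, 1 / (1 + np.exp(-(0.4 * W[:, 0] - 0.3 * W[:, 1]))))
Y = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + 0.8 * A + 0.5 * W[:, 0] + 0.3 * W[:, 1]))))

def logit(p):
    return np.log(p / (1 - p))

def design(a, W):
    return np.column_stack([np.ones(len(W)), a, W])

# step 1: initial outcome model Q(A, W)
Qfit = sm.GLM(Y, design(A, W), family=sm.families.Binomial()).fit()
Q_A = Qfit.predict(design(A, W))
Q_1 = Qfit.predict(design(np.ones(n), W))
Q_0 = Qfit.predict(design(np.zeros(n), W))

# step 2: propensity model g(W) = P(A = 1 | W)
g = sm.GLM(A, sm.add_constant(W), family=sm.families.Binomial()).fit().predict(sm.add_constant(W))

# step 3: targeting step with the "clever covariate" H(A, W)
H = A / g - (1 - A) / (1 - g)
eps = sm.GLM(Y, H.reshape(-1, 1), family=sm.families.Binomial(), offset=logit(Q_A)).fit().params[0]

# step 4: updated predictions and plug-in ATE
Q1_star = 1 / (1 + np.exp(-(logit(Q_1) + eps / g)))
Q0_star = 1 / (1 + np.exp(-(logit(Q_0) - eps / (1 - g))))
print("TMLE ATE estimate:", (Q1_star - Q0_star).mean())
```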
Regression Discontinuity Analysis with Latent Variables
Morell M, Kwon M, Han Y, Sung Y, Liu Y and Yang JS
A regression discontinuity (RD) design is often employed to provide causal evidence when the randomization of the treatment assignment is infeasible. When variables of interest are latent constructs measured by observed indicators, the conventional RD analysis using observed variable scores does not allow researchers to examine heterogeneity in the estimated local average treatment effect (ATE) and to generalize the ATE to participants away from the cutoff. We propose a novel methodological augmentation to the conventional RD analysis, which assumes the availability of multiple indicator variables (i.e., raw item responses) that measure the latent construct underlying the running variable. By specifying an explicit measurement model based on those indicator variables, our latent RD framework allows 1) defining the local ATE conditional on the latent construct, 2) disentangling the heterogeneity of the local ATE, and 3) generalizing the local ATE to running variable scores away from the cutoff. In a proof-of-concept simulation, we illustrate that the proposed augmentation recovers parameters of interest well under practical test-length and sample-size conditions.
Demystifying Posterior Distributions: A Tutorial on Their Derivation
Du H, Liu F, Zhang Z and Enders C
Bayesian statistics have gained significant traction across various fields over the past few decades. Bayesian statistics textbooks often provide both code and the analytical forms of parameters for simple models. However, they often omit the process of deriving posterior distributions or limit it to basic univariate examples focused on the mean and variance. Additionally, these resources frequently assume a strong background in linear algebra and probability theory, which can present barriers for researchers without extensive mathematical training. This tutorial aims to fill that gap by offering a step-by-step guide to deriving posterior distributions. We aim to make concepts typically reserved for advanced statistics courses more accessible and practical. This tutorial will cover two models: the univariate normal model and the multilevel model. The concepts and properties demonstrated in the two examples can be generalized to other models and distributions.
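A compact worked example in the spirit of the tutorial: the posterior for a normal mean with known variance under a conjugate normal prior, derived by completing the square. This particular derivation is a standard textbook case chosen here for illustration.

```latex
% Posterior for a normal mean \mu with known variance \sigma^2 and a normal prior.
\begin{align*}
y_i \mid \mu &\sim N(\mu, \sigma^2), \quad i = 1, \dots, n,
  \qquad \mu \sim N(\mu_0, \tau_0^2),\\
p(\mu \mid y) &\propto
  \exp\!\Big(-\tfrac{1}{2\sigma^2}\sum_{i=1}^n (y_i - \mu)^2\Big)
  \exp\!\Big(-\tfrac{1}{2\tau_0^2}(\mu - \mu_0)^2\Big)\\
&\propto \exp\!\Big(-\tfrac{1}{2}\Big[\Big(\tfrac{n}{\sigma^2} + \tfrac{1}{\tau_0^2}\Big)\mu^2
  - 2\Big(\tfrac{n\bar{y}}{\sigma^2} + \tfrac{\mu_0}{\tau_0^2}\Big)\mu\Big]\Big),\\
\text{so } \mu \mid y &\sim N(\mu_n, \tau_n^2), \qquad
  \tau_n^2 = \Big(\tfrac{n}{\sigma^2} + \tfrac{1}{\tau_0^2}\Big)^{-1}, \qquad
  \mu_n = \tau_n^2\Big(\tfrac{n\bar{y}}{\sigma^2} + \tfrac{\mu_0}{\tau_0^2}\Big).
\end{align*}
```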
Integrated Trend and Lagged Modeling of Multi-Subject, Multilevel, and Short Time Series
Xiong X, Li Y, Hunter MD and Chow SM
Trends represent systematic intra-individual variations that occur over slower time scales and that, if unaccounted for, are known to yield biases in the estimation of momentary change patterns captured by time series models. The applicability of detrending methods has rarely been assessed in the context of multi-level longitudinal panel data, namely, nested data structures with relatively few measurements. This paper evaluated the efficacy of a series of two-stage detrending methods against a single-stage Bayesian approach in fitting multi-level nonlinear growth curve models with autoregressive residuals (ml-GAR) with random effects in both the growth and autoregressive processes. Monte Carlo simulation studies revealed that the single-stage Bayesian approach, in contrast to two-stage approaches, exhibited satisfactory properties with as few as five time points when the number of individuals was large (e.g., 500 individuals). It still outperformed alternative two-stage approaches when correlated random effects between the trend and autoregressive processes were misspecified as a diagonal random effect structure. Empirical results from the Early Childhood Longitudinal Study-Kindergarten Class (ECLS-K) data suggested substantial deviations in conclusions regarding children's reading ability using two-stage in comparison to single-stage approaches, thus highlighting the importance of simultaneous modeling of trends and intraindividual variability whenever feasible.
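A sketch of the kind of two-stage detrending workflow the article evaluates: stage 1 removes a person-specific trend, stage 2 estimates a lag-1 autoregressive coefficient from the detrended residuals. The data are simulated and the linear trend form is an illustrative assumption; the single-stage Bayesian ml-GAR model is not shown.

```python
# Two-stage detrending sketch: (1) remove a within-person linear trend by OLS,
# (2) estimate the AR(1) coefficient from the detrended residuals.
import numpy as np

rng = np.random.default_rng(11)
n_persons, n_times = 100, 8
t = np.arange(n_times)

ar_hats = []
for i in range(n_persons):
    intercept = rng.normal(50, 10)
    slope = rng.normal(2, 0.5)
    resid = np.zeros(n_times)
    for j in range(1, n_times):                 # AR(1) residual process with phi = 0.4
        resid[j] = 0.4 * resid[j - 1] + rng.normal(0, 1)
    y = intercept + slope * t + resid

    trend_coef = np.polyfit(t, y, deg=1)        # stage 1: detrend within person
    detrended = y - np.polyval(trend_coef, t)

    phi = np.polyfit(detrended[:-1], detrended[1:], deg=1)[0]   # stage 2: lag-1 regression
    ar_hats.append(phi)

print("mean AR(1) estimate:", np.mean(ar_hats), "(generating value 0.4; note the bias with short series)")
```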
Detecting Model Misfit in Structural Equation Modeling with Machine Learning: A Proof of Concept
Partsch MV and Goretzko D
Despite the popularity of structural equation modeling in psychological research, accurately evaluating the fit of these models to data is still challenging. Using fixed fit index cutoffs is error-prone due to the fit indices' dependence on various features of the model and data ("nuisance parameters"). Nonetheless, applied researchers mostly rely on fixed fit index cutoffs, neglecting the risk of falsely accepting (or rejecting) their model. With the goal of developing a broadly applicable method that is almost independent of nuisance parameters, we introduce a machine learning (ML)-based approach to evaluate the fit of multi-factorial measurement models. We trained an ML model based on 173 model and data features that we extracted from 1,323,866 simulated data sets and models fitted by means of confirmatory factor analysis. We evaluated the performance of the ML model based on 1,659,386 independent test observations. The ML model performed very well in detecting model (mis-)fit in most conditions, thereby outperforming commonly used fixed fit index cutoffs across the board. Only minor misspecifications, such as a single neglected residual correlation, proved to be challenging to detect. This proof-of-concept study shows that ML is very promising in the context of model fit evaluation.
Correlated Residuals in Lagged-Effects Models: What They (Do Not) Represent in the Case of a Continuous-Time Process
Kuiper RM and Hamaker EL
The appeal of lagged-effects models, like the first-order vector autoregressive (VAR(1)) model, is the interpretation of the lagged coefficients in terms of predictive (and possibly causal) relationships between variables over time. While the focus in VAR(1) applications has traditionally been on the strength and sign of the lagged relationships, there has been a growing interest in the residual relationships (i.e., the correlations between the innovations) as well. In this article, we investigate what residual correlations can and cannot signal, for both the discrete-time (DT) and continuous-time (CT) VAR(1) model, when inspecting a CT process. We show that one should not take on a DT perspective when investigating a CT process: correlated (i.e., non-zero) DT residuals can flag omitted common causes and effects at shorter intervals (which is well known), but, when the underlying process is in CT, also effects at longer intervals. Furthermore, when inspecting a CT process, uncorrelated (i.e., zero) DT residuals do not imply that the variables have no effect on each other at other intervals, nor do they preclude the risk of omitted common causes. Additionally, we show that residual correlations in a CT model signal omitted causes for one or more of the observed variables. This may bias the estimation of lagged relationships, implying that the estimated predictive lagged relationships do not equal the underlying causal lagged relationships. Unfortunately, the CT residual correlations do not reflect the magnitude of the distortion.
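A numerical sketch of the DT implications of a CT first-order process: given a CT drift matrix and diffusion covariance, the DT lagged-effects matrix for interval dt is the matrix exponential of the drift times dt, and the DT innovation covariance follows from the integrated diffusion. The parameter values are illustrative, not from the article; the point is only that the DT innovation correlation depends on the interval.

```python
# CT-to-DT discretization of a first-order process (illustrative drift and diffusion).
import numpy as np
from scipy.linalg import expm

A = np.array([[-0.6, 0.2],
              [0.0, -0.4]])         # CT drift matrix (stable)
Q = np.array([[1.0, 0.3],
              [0.3, 1.0]])          # CT diffusion covariance

def discretize(A, Q, dt):
    """DT autoregressive matrix and innovation covariance for interval dt."""
    Phi = expm(A * dt)
    p = A.shape[0]
    A_sum = np.kron(A, np.eye(p)) + np.kron(np.eye(p), A)       # Kronecker sum
    vec_sigma = np.linalg.solve(A_sum, (expm(A_sum * dt) - np.eye(p * p)) @ Q.reshape(-1))
    return Phi, vec_sigma.reshape(p, p)

for dt in (0.5, 1.0, 2.0):
    Phi, Sigma = discretize(A, Q, dt)
    corr = Sigma[0, 1] / np.sqrt(Sigma[0, 0] * Sigma[1, 1])
    print(f"dt = {dt}: lagged matrix =\n{Phi.round(3)}\n  innovation correlation = {corr:.3f}")
```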
Dynamic Fit Index Cutoffs for Time Series Network Models
Liu S, Crawford CM, Fisher ZF and Gates KM
In this study, we extend the dynamic fit index (DFI) developed by McNeish and Wolf to the context of time series analysis. DFI is a simulation-based method for deriving fit index cutoff values tailored to the specific model and data characteristics. Through simulations, we show that DFI cutoffs for detecting an omitted path in time series network models tend to be closer to exact fit than the popular benchmark values developed by Hu and Bentler. Moreover, cutoff values vary with the number of variables, network density, number of time points, and form of misspecification. Notably, using 10% as the upper limit of Type I and Type II error rates, the original DFI approach fails to identify cutoffs for detecting an omitted path when effect size and/or sample size is small. To address this problem, we propose two alternatives that allow for the derivation of cutoffs using more lenient criteria. The first alternative extends the original DFI approach by removing the upper limit on Type I and Type II error rates, whereas the second aims at maximizing classification quality as measured by the Matthews correlation coefficient. We demonstrate the utility of these approaches using simulation and empirical data and discuss their implications in practice.
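A generic sketch of choosing a fit-index cutoff by maximizing the Matthews correlation coefficient, in the spirit of the second alternative described above. In a real DFI analysis the two distributions would come from fitting the analysis model to data simulated from the fitted model and from a misspecified variant; here they are stand-in normal draws for illustration only.

```python
# Pick the fit-index cutoff that maximizes the Matthews correlation coefficient (MCC)
# for separating correctly specified from misspecified models (stand-in distributions).
import numpy as np
from sklearn.metrics import matthews_corrcoef

rng = np.random.default_rng(5)
fit_correct = rng.normal(0.03, 0.010, 500)    # e.g., SRMR values when the model is correct
fit_misspec = rng.normal(0.06, 0.015, 500)    # SRMR values under an omitted path

values = np.concatenate([fit_correct, fit_misspec])
truth = np.concatenate([np.zeros(500), np.ones(500)])   # 1 = misspecified

best_cut, best_mcc = None, -1.0
for cut in np.linspace(values.min(), values.max(), 200):
    flagged = (values > cut).astype(int)
    mcc = matthews_corrcoef(truth, flagged)
    if mcc > best_mcc:
        best_cut, best_mcc = cut, mcc
print(f"cutoff = {best_cut:.3f}, MCC = {best_mcc:.2f}")
```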
Analyzing Count Data in Single Case Experimental Designs with Generalized Linear Mixed Models: Does Serial Dependency Matter?
Li H and Luo W
Single-case experimental designs (SCEDs) involve repeated measurements of a small number of cases under different experimental conditions, offering valuable insights into treatment effects. However, challenges arise in the analysis of SCEDs when autocorrelation is present in the data. Recently, generalized linear mixed models (GLMMs) have emerged as a promising statistical approach for SCEDs with count outcomes. While prior research has demonstrated the effectiveness of GLMMs, these studies have typically assumed error independence, an assumption that may be violated in SCEDs due to serial dependency. This study aims to evaluate two possible solutions for autocorrelated SCED count data: 1) to assess the robustness of previously introduced GLMMs such as Poisson, negative binomial, and observation-level random effects models under various levels of autocorrelation, and 2) to evaluate the performance of a new GLMM and a linear mixed model (LMM), both of which incorporate an autoregressive error structure. Through a Monte Carlo simulation study, we have examined bias, coverage rates, and Type I error rates of treatment effect estimators, providing recommendations for handling autocorrelation in the analysis of SCED count data. A demonstration with real SCED count data is provided. The implications, limitations, and future research directions are also discussed.
Standardized Estimates of Second-Order Latent Growth Models: A Comparison of Alternative Latent-Standardization Methods
Wang Y, Wen Z, Hau KT and Jin T
Second-order latent growth models (LGMs) have garnered considerable attention and are increasingly utilized in longitudinal data analyses of latent constructs composed of multiple items. The growth parameter estimates in these models are intrinsically linked to the model identification methods. Latent-standardization (identification) methods, in which the latent variable is standardized at a reference time point (e.g., η1), yield theoretically unique and interpretable growth parameters. Traditional latent-standardization methods indirectly standardize η1 through the first-order component of the second-order LGM by constraining item intercepts and/or loadings. Such methods require a two-step modeling procedure and do not truly standardize η1. This article proposes a one-stage method that indirectly standardizes η1 through the second-order component of the model by constraining the mean and variance of the level factor. This new single-step modeling method ensures that η1 is truly standardized, with a mean of 0 and a variance of 1. Theoretical, simulated, and empirical comparisons are conducted across different latent-standardization methods, demonstrating the accuracy and implementation simplicity of the proposed one-stage method.
A Two-Step Estimator for Growth Mixture Models with Covariates in the Presence of Direct Effects
Liu Y, Bakk Z, McCormick EM and de Rooij M
Growth mixture models (GMMs) are popular approaches for modeling unobserved population heterogeneity over time. GMMs can be extended with covariates, predicting latent class (LC) membership, the within-class growth trajectories, or both. However, current estimators are sensitive to misspecifications in complex models. We propose extending to GMMs the two-step estimator for LC models, which provides robust estimation against model misspecifications (namely, ignored or overfitted direct effects) in simpler LC models. We conducted several simulation studies, comparing the performance of the proposed two-step estimator to the commonly used one- and three-step estimators. Three different population models were considered: covariates predicting only the LC membership (I), adding direct effects on the latent intercept (II), or on both growth factors (III). Results show that when predicting LC membership alone, all three estimators are unbiased when the measurement model is strong, with weak measurement model results being more nuanced. Alternatively, when including covariate effects on the growth factors, the two-step and three-step estimators are consistently robust to misspecification, yielding unbiased estimates across simulation conditions but tending to underestimate the standard errors, whereas the one-step estimator is the most sensitive to misspecification.
Regularized Cross-Sectional Network Modeling with Missing Data: A Comparison of Methods
Falk CF and Starr J
Many applications of network modeling involve cross-sectional data on psychological variables (e.g., symptoms of psychological disorders), and analyses are often conducted using a regularized Gaussian graphical model (GGM) estimated with a lasso penalty, also known as the graphical lasso or glasso. Appropriate methodology for handling missing data when using glasso is underdeveloped, precluding the use of planned missing data designs to reduce participant fatigue. In this research, we compare three approaches to handling missing data with glasso. The first resembles a two-stage estimation approach, borrowed from the covariance structure modeling literature, whereby a saturated covariance matrix among the items is estimated prior to applying glasso. The second and third approaches use glasso and the expectation-maximization (EM) algorithm in a single stage and use either EBIC or cross-validation for tuning parameter selection. We compared these approaches in a simulation study with a variety of sample sizes, proportions of missing data, and degrees of network saturation. An example with data from the Patient Reported Outcomes Measurement Information System is also provided. The EM algorithm with cross-validation performed best, but all methods appeared to be viable strategies with larger samples and less missing data.
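A sketch of the first (two-stage) approach described above: estimate a saturated covariance matrix under missing data with a simple EM algorithm for the multivariate normal, then pass it to the graphical lasso. Dimensions, the missingness rate, and the tuning parameter are illustrative; EBIC or cross-validated tuning is not shown.

```python
# Two-stage approach sketch: EM covariance estimation under missing data, then glasso.
import numpy as np
from sklearn.covariance import graphical_lasso

rng = np.random.default_rng(8)
n, p = 300, 6
true_cov = 0.3 * np.ones((p, p)) + 0.7 * np.eye(p)
X = rng.multivariate_normal(np.zeros(p), true_cov, size=n)
X[rng.random((n, p)) < 0.15] = np.nan                 # 15% of values missing completely at random

def em_mvn(X, n_iter=200):
    """EM estimates of the mean and covariance of a multivariate normal with missing values."""
    n, p = X.shape
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0))
    for _ in range(n_iter):
        sx, sxx = np.zeros(p), np.zeros((p, p))
        for row in X:
            obs = ~np.isnan(row)
            mis = ~obs
            x = row.copy()
            cond_cov = np.zeros((p, p))
            if mis.any():
                S_oo = sigma[np.ix_(obs, obs)]
                S_mo = sigma[np.ix_(mis, obs)]
                x[mis] = mu[mis] + S_mo @ np.linalg.solve(S_oo, row[obs] - mu[obs])
                cond_cov[np.ix_(mis, mis)] = sigma[np.ix_(mis, mis)] - S_mo @ np.linalg.solve(S_oo, S_mo.T)
            sx += x
            sxx += np.outer(x, x) + cond_cov
        mu = sx / n
        sigma = sxx / n - np.outer(mu, mu)
    return mu, sigma

_, sigma_em = em_mvn(X)                               # stage 1: saturated covariance via EM
cov_, prec_ = graphical_lasso(sigma_em, alpha=0.05)   # stage 2: graphical lasso
partial_corr = -prec_ / np.sqrt(np.outer(np.diag(prec_), np.diag(prec_)))
np.fill_diagonal(partial_corr, 0.0)
print(partial_corr.round(2))                          # estimated network (partial correlations)
```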
Residual Structural Equation Modeling with Nonnormal Distribution
Tseng MC
This study primarily investigates the impact of ignoring nonnormal distributions in residual structural equation modeling (RSEM) on the estimation of parameters in the second residual structure. The results of the simulation studies demonstrate that when the RSEM model follows a nonnormal distribution, it is crucial to test for and estimate that nonnormal distribution while constructing mixture RI-AR or mixture RI-CLPM models. This approach guarantees unbiased estimation of the autoregressive and cross-lagged parameters in the second residual structure. If, during the construction of an empirical model, the nonnormal distribution of mixture RI-AR or mixture RI-CLPM models is not taken into account, or if a normal distribution is assumed directly for the analysis, the resulting estimates of the autoregressive and cross-lagged parameters will be biased, leading to erroneous inferences.
The Impact of Temporal Expectation on Unconscious Inhibitory Processing: A Computational Analysis Using Hierarchical Drift Diffusion Modeling
Wang Y, Cao J, Chen W, Tang Z, Liu T, Mu Z, Liu P and Wang Y
Numerous studies have shown that motor inhibition can be triggered automatically when the cognitive system encounters interfering stimuli, even by a suspicious stimulus presented in the absence of perceptual awareness (e.g., the negative compatibility effect). This study investigated the effect of temporal expectation, a top-down active preparation for future events, on unconscious inhibitory processing both in a local expectation context on a trial-by-trial basis (Experiment 1) and in a global expectation context on a block-wise basis (Experiment 2). Modeling of the behavioral data using a drift-diffusion model showed that temporal expectation can accelerate evidence accumulation and increase response caution, regardless of context. Importantly, the acceleration is smaller when the target is consistent with the suspicious response tendency induced by the subliminal prime than when it is inconsistent with it, and this difference is significantly correlated with the behavioral RTs (i.e., the compatibility effect). The results provide evidence for a framework in which temporal expectation enhances inhibitory control of unconscious processes. The likely mechanism is that temporal expectation enhances both the activations afforded by subliminal stimuli and the strength of cognitive monitoring, so that the cognitive system suppresses these suspicious activations more strongly, preventing them from escaping and interfering with subsequent processing.
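A minimal simulation of a drift-diffusion process (not the hierarchical Bayesian estimation used in the study): evidence accumulates with drift and noise until it reaches one of two boundaries. Comparing two drift rates illustrates how faster evidence accumulation, such as might arise under temporal expectation, yields faster responses. Parameter values are illustrative.

```python
# Simple Euler simulation of a two-boundary drift-diffusion model.
import numpy as np

rng = np.random.default_rng(2)

def simulate_ddm(v, a=1.0, z=0.5, dt=0.001, t0=0.3, n_trials=500):
    """Return response times and choices (1 = upper boundary) for drift rate v."""
    rts, choices = [], []
    for _ in range(n_trials):
        x, t = z * a, 0.0                      # start between the boundaries
        while 0.0 < x < a:
            x += v * dt + rng.normal(0.0, np.sqrt(dt))   # drift plus diffusion noise (sd = 1)
            t += dt
        rts.append(t + t0)                     # add non-decision time
        choices.append(1 if x >= a else 0)
    return np.array(rts), np.array(choices)

for v in (1.0, 2.0):
    rts, choices = simulate_ddm(v)
    print(f"drift {v}: mean RT = {rts.mean():.3f} s, upper-boundary proportion = {choices.mean():.2f}")
```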
On the Ratio Between Point-Polyserial and Polyserial Correlations for Non-Normal Bivariate Distributions
Barbiero A
It is a well-known fact that, for the bivariate normal distribution, the ratio between the point-polyserial correlation (the linear correlation after one of the two variables is discretized into categories with given probabilities) and the polyserial correlation (the linear correlation between the two normal components) remains constant as long as those probabilities are kept fixed. If we move away from the bivariate normal distribution, by considering non-normal margins and/or non-normal dependence structures, then the constancy of this ratio may be lost. In this work, the magnitude of the departure from the constancy condition is assessed for several combinations of margins (normal, uniform, exponential, Weibull) and copulas (Gauss, Frank, Gumbel, Clayton), also varying the distribution of the discretized variable. The results indicate that for many settings we are far from the condition of constancy, especially when highly asymmetrical marginal distributions are combined with copulas that allow for tail-dependence. In such cases, the linear correlation may even increase instead of decreasing, contrary to the usual expectation. This implies that most existing simulation techniques or statistical models for mixed-type data, which assume a linear relationship between point-polyserial and polyserial correlations, should be used very prudently and possibly reappraised.
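A Monte Carlo sketch of the quantity studied above: the ratio between the point-polyserial correlation (after discretizing one variable with fixed probabilities) and the correlation of the underlying continuous pair. A Gaussian copula links the variables; the margins are either both normal or both exponential. The settings only hint at the broader copula/margin combinations examined in the paper.

```python
# Ratio of point-polyserial to polyserial correlation under a Gaussian copula,
# with normal vs. exponential margins (illustrative settings).
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
n = 200_000
cut_probs = [0.25, 0.5, 0.75]                     # fixed discretization probabilities

def ratio(rho, margin):
    z = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n)   # Gaussian copula
    u = stats.norm.cdf(z)
    if margin == "normal":
        x, y = stats.norm.ppf(u[:, 0]), stats.norm.ppf(u[:, 1])
    else:                                         # exponential margins
        x, y = stats.expon.ppf(u[:, 0]), stats.expon.ppf(u[:, 1])
    d = np.searchsorted(np.quantile(y, cut_probs), y)   # discretize y with fixed probabilities
    polyserial = np.corrcoef(x, y)[0, 1]                 # correlation of the continuous pair
    point_polyserial = np.corrcoef(x, d)[0, 1]            # correlation after discretization
    return point_polyserial / polyserial

for margin in ("normal", "exponential"):
    print(margin, [round(ratio(r, margin), 3) for r in (0.2, 0.5, 0.8)])
# with normal margins the ratio is essentially constant in rho;
# with non-normal margins it can drift, as the article reports
```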