The Target Study: A Conceptual Model and Framework for Measuring Disparity
We present a conceptual model to measure disparity, the target study, in which social groups may be similarly situated (i.e., balanced) on allowable covariates. Our model, based on a sampling design, does not intervene to assign social group membership or alter allowable covariates. To address non-random sample selection, we extend our model to generalize or transport disparity or to assess disparity after an intervention on eligibility-related variables that eliminates forms of collider stratification. To avoid bias from differential timing of enrollment, we aggregate time-specific study results by balancing calendar time of enrollment across social groups. To provide a framework for emulating our model, we discuss study designs, data structures, and G-computation and weighting estimators. We compare our sampling-based model to prominent decomposition-based models used in healthcare and algorithmic fairness. We provide R code for all estimators and apply our methods to measure health system disparities in hypertension control using electronic medical records.
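To make the G-computation idea concrete, the following base-R sketch simulates data under hypothetical variable names (group for the social group, x for an allowable covariate, y for a binary outcome such as hypertension control) and contrasts the crude disparity with the disparity after balancing the allowable covariate across groups; it illustrates the estimation logic only, not the authors' full framework.

```r
# Simulated data; all variable names and coefficients are hypothetical.
set.seed(1)
n <- 5000
group <- rbinom(n, 1, 0.4)                                   # social group indicator
x     <- rnorm(n, mean = 0.5 * group)                        # allowable covariate, imbalanced across groups
y     <- rbinom(n, 1, plogis(-0.5 + 0.8 * x - 0.4 * group))  # outcome (e.g., hypertension control)
d <- data.frame(group, x, y)

# G-computation: model the outcome, then average group-specific predictions
# over a common (pooled) distribution of the allowable covariate.
fit <- glm(y ~ group * x, family = binomial, data = d)
p1 <- predict(fit, newdata = transform(d, group = 1), type = "response")
p0 <- predict(fit, newdata = transform(d, group = 0), type = "response")

disparity_crude    <- mean(d$y[d$group == 1]) - mean(d$y[d$group == 0])
disparity_balanced <- mean(p1) - mean(p0)                    # disparity with x balanced
c(crude = disparity_crude, balanced = disparity_balanced)
```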
Maximizing Utility or Avoiding Losses? Uncovering Decision Rule-Heterogeneity in Sociological Research with an Application to Neighbourhood Choice
Empirical studies of individual behaviour often assume, implicitly or explicitly, a single type of decision rule. Other studies do not specify behavioural assumptions at all. We advance sociological research by introducing (random) regret minimization, which is related to loss aversion, into the sociological literature and by testing it against (random) utility maximization, the most prominent decision rule in sociological research on individual behaviour. In an application to neighbourhood choice in a sample from four European cities, we combine stated choice experiment data with discrete choice modelling techniques and find a considerable degree of decision-rule heterogeneity, with a strong prevalence of regret minimization and hence loss aversion. We also provide indicative evidence that decision rules can affect expected neighbourhood demand at the macro level. Our approach allows heterogeneity in decision rules, that is, the degree of regret/loss aversion, to be identified at the level of individual choice attributes, such as the share of foreigners, when comparing neighbourhoods, and it can improve how sociological research links theory to empirical work on decision-making.
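The contrast between the two decision rules can be made explicit in a few lines. The sketch below, in base R with made-up attribute values and taste parameters, computes logit choice probabilities for one choice set under a linear utility rule and under a Chorus-type random regret rule, in which regret accrues whenever a rival neighbourhood outperforms the considered one on an attribute; it illustrates the general modelling logic, not the authors' estimation setup.

```r
# Hypothetical taste parameters and a three-alternative choice set.
beta <- c(share_foreigners = -1.2, distance = -0.8)
X <- rbind(a = c(0.10, 2), b = c(0.30, 1), c = c(0.20, 3))
colnames(X) <- names(beta)

# Random utility maximization: linear systematic utility, logit probabilities.
V <- X %*% beta
p_rum <- exp(V) / sum(exp(V))

# Random regret minimization: attribute-level regret relative to each rival.
regret <- sapply(rownames(X), function(i) {
  rivals <- setdiff(rownames(X), i)
  sum(sapply(rivals, function(j) sum(log1p(exp(beta * (X[j, ] - X[i, ]))))))
})
p_rrm <- exp(-regret) / sum(exp(-regret))

cbind(p_rum = as.vector(p_rum), p_rrm = p_rrm)
```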
Theoretical foundations and limits of word embeddings: What types of meaning can they capture?
Measuring meaning is a central problem in cultural sociology and word embeddings may offer powerful new tools to do so. But like any tool, they build on and exert theoretical assumptions. In this paper I theorize the ways in which word embeddings model three core premises of a structural linguistic theory of meaning: that meaning is coherent, relational, and may be analyzed as a static system. In certain ways, word embeddings are vulnerable to the enduring critiques of these premises. In other ways, word embeddings offer novel solutions to these critiques. More broadly, formalizing the study of meaning with word embeddings offers theoretical opportunities to clarify core concepts and debates in cultural sociology, such as the coherence of meaning. Just as network analysis specified the once vague notion of social relations, formalizing meaning with embeddings can push us to specify and reimagine meaning itself.
The Design and Optimality of Survey Counts: A Unified Framework Via the Fisher Information Maximizer
Grouped and right-censored (GRC) counts have been used in a wide range of attitudinal and behavioural surveys, yet they cannot be readily analyzed or assessed by conventional statistical models. This study develops a unified regression framework for the design and optimality of GRC counts in surveys. To process infinitely many grouping schemes for the optimum design, we propose a new two-stage algorithm, the Fisher Information Maximizer (FIM), which uses estimates from generalized linear models to find a globally optimal grouping scheme among all possible schemes with a given number of groups. After we define, decompose, and calculate different types of regressor-specific design errors, our analyses of both simulation and empirical examples suggest that: 1) the optimum design of GRC counts is able to reduce the grouping error to zero, 2) the performance of modified Poisson estimators using GRC counts can be comparable to that of Poisson regression, and 3) the optimum design is usually able to achieve the same estimation efficiency with a smaller sample size.
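As a small illustration of why grouped and right-censored counts remain analyzable, the sketch below (base R, with a hypothetical grouping scheme) collapses simulated Poisson counts into GRC cells and recovers the underlying rate by maximum likelihood from the grouped data alone; it is not the FIM algorithm itself.

```r
# Simulate counts and collapse them into grouped/right-censored cells.
set.seed(2)
y    <- rpois(1000, lambda = 2.4)
cuts <- c(0, 1, 2, 4)                            # hypothetical cells: 0, 1, 2, 3-4, 5+
grc  <- cut(y, breaks = c(-Inf, cuts, Inf), labels = FALSE)

# Poisson cell probabilities implied by a candidate rate.
cell_prob <- function(lambda) {
  lower <- c(-Inf, cuts); upper <- c(cuts, Inf)
  ppois(upper, lambda) - ppois(lower, lambda)
}

# Maximum likelihood for the rate using only the grouped cells.
negll <- function(log_lambda) -sum(log(cell_prob(exp(log_lambda))[grc]))
exp(optimize(negll, c(-3, 3))$minimum)           # close to the true rate of 2.4
```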
The gap-closing estimand: A causal approach to study interventions that close disparities across social categories
Disparities across race, gender, and class are important targets of descriptive research. But rather than only describe disparities, research would ideally inform interventions to close those gaps. The gap-closing estimand quantifies how much a gap (e.g. incomes by race) would close if we intervened to equalize a treatment (e.g. access to college). Drawing on causal decomposition analyses, this type of research question yields several benefits. First, gap-closing estimands place categories like race in a causal framework without making them play the role of the treatment (which is philosophically fraught for non-manipulable variables). Second, gap-closing estimands empower researchers to study disparities using new statistical and machine learning estimators designed for causal effects. Third, gap-closing estimands can directly inform policy: if we sampled from the population and actually changed treatment assignments, how much could we close gaps in outcomes? I provide open-source software (the R package gapclosing) to support these methods.
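The following base-R sketch illustrates the logic of a gap-closing estimand with simulated data and generic variable names (category, treatment, outcome): fit an outcome model, predict for everyone under a counterfactual treatment assignment, and compare the factual and counterfactual gaps across categories. The gapclosing package mentioned above automates estimation, including machine-learning estimators; this sketch only shows the underlying idea.

```r
# Simulated data; the category is a social category, not a treatment.
set.seed(3)
n <- 4000
category  <- rbinom(n, 1, 0.5)
treatment <- rbinom(n, 1, plogis(-0.5 + 1 * category))        # unequal access to treatment
outcome   <- 10 + 2 * treatment + 1.5 * category + rnorm(n)
d <- data.frame(category, treatment, outcome)

# Outcome model and counterfactual predictions with treatment set to 1 for everyone.
fit <- lm(outcome ~ treatment * category, data = d)
yhat_treated <- predict(fit, newdata = transform(d, treatment = 1))

gap_factual        <- diff(tapply(d$outcome, d$category, mean))
gap_counterfactual <- diff(tapply(yhat_treated, d$category, mean))
c(factual = unname(gap_factual), counterfactual = unname(gap_counterfactual))
```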
The Potential for Using a Shortened Version of the Everyday Discrimination Scale in Population Research with Young Adults: A Construct Validation Investigation
Discrimination is associated with numerous psychological health outcomes over the life course. The nine-item Everyday Discrimination Scale (EDS) is one of the most widely used measures of discrimination; however, this nine-item measure may not be feasible in large-scale population health surveys, where a shortened discrimination measure would be advantageous. The current study examined the construct validity of a combined two-item discrimination measure adapted from the EDS by Add Health (n = 14,839) as compared to the full nine-item EDS and a two-item EDS scale (parallel to the adapted combined measure) used in the National Survey of American Life (NSAL; n = 1,111) and the National Latino and Asian American Study (NLAAS; n = 1,055). Results identified convergence among the EDS scales, with high item-total correlations, convergent validity, and criterion validity for psychological outcomes, thus providing evidence for the construct validity of the two-item combined scale. Taken together, the findings support using this reduced scale in studies where the full EDS scale is not available.
Marginal and Conditional Confounding Using Logits
This article presents two ways of quantifying confounding using logistic response models for binary outcomes. Drawing on the distinction between marginal and conditional odds ratios in statistics, we define two corresponding measures of confounding (marginal and conditional) that can be recovered from a simple standardization approach. We investigate when marginal and conditional confounding may differ, outline why the method by Karlson, Holm, and Breen recovers conditional confounding under a "no interaction" assumption, and suggest that researchers may measure marginal confounding by using inverse probability weighting. We provide two empirical examples that illustrate our standardization approach.
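A base-R sketch of the main ingredients, with simulated data and hypothetical variable names: the conditional odds ratio comes from the adjusted logistic model, the marginal odds ratio is obtained by standardizing predictions over the covariate distribution, and inverse probability weighting offers an alternative route to the marginal quantity. This illustrates the distinction the article exploits, not its specific confounding measures.

```r
set.seed(4)
n <- 10000
z <- rnorm(n)                                   # confounder
x <- rbinom(n, 1, plogis(0.8 * z))              # binary exposure
y <- rbinom(n, 1, plogis(-1 + x + z))           # binary outcome
d <- data.frame(x, y, z)

# Conditional odds ratio from the adjusted model.
fit <- glm(y ~ x + z, family = binomial, data = d)
or_conditional <- exp(coef(fit)["x"])

# Marginal odds ratio via standardization over the covariate distribution.
p1 <- mean(predict(fit, transform(d, x = 1), type = "response"))
p0 <- mean(predict(fit, transform(d, x = 0), type = "response"))
or_marginal_std <- (p1 / (1 - p1)) / (p0 / (1 - p0))

# Marginal odds ratio via inverse probability weighting.
ps <- fitted(glm(x ~ z, family = binomial, data = d))
w  <- ifelse(d$x == 1, 1 / ps, 1 / (1 - ps))
or_marginal_ipw <- exp(coef(glm(y ~ x, family = binomial, data = d, weights = w))["x"])

c(conditional = unname(or_conditional),
  marginal_std = or_marginal_std,
  marginal_ipw = unname(or_marginal_ipw))
```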
Applying Responsive Survey Design to Small-Scale Surveys: Campus Surveys of Sexual Misconduct
Responsive survey design is a technique aimed at improving the efficiency or quality of surveys by using incoming data from the field to make design changes. The technique was pioneered on large national surveys, but the tools can also be applied on the smaller-scale surveys most commonly used by sociologists. We demonstrate responsive survey design in a small-scale, list-based sample survey of students on the topic of sexual misconduct. We investigate the impact of individual incentive levels and a two-phase responsive design with changes to mode of contact as approaches for limiting the potential of nonresponse bias in data from such surveys. Our analyses demonstrate that a two-phase design introducing telephone and face-to-face reminders to complete the survey can produce stronger change in response rates and characteristics of those who respond than higher incentive levels. These findings offer tools for sociologists designing smaller-scale surveys of special populations or sensitive topics.
Machine Learning as a Model for Cultural Learning: Teaching an Algorithm What it Means to be Fat
Public culture is a powerful source of cognitive socialization; for example, media language is full of meanings about body weight. Yet it remains unclear how individuals process meanings in public culture. We suggest that schema learning is a core mechanism by which public culture becomes personal culture. We propose that a burgeoning approach in computational text analysis, neural word embeddings, can be interpreted as a formal model for cultural learning. Embeddings allow us to empirically model schema learning and activation from natural language data. We illustrate our approach by extracting four lower-order schemas from news articles: the gender, moral, health, and class meanings of body weight. Using these lower-order schemas, we quantify how words about body weight "fill in the blanks" about gender, morality, health, and class. Our findings reinforce ongoing concerns that machine-learning models (e.g., of natural language) can encode and reproduce harmful human biases.
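One way to see the "fill in the blanks" operation is to project body-weight words onto a semantic dimension built from averaged vector differences. The toy sketch below uses a fabricated three-dimensional embedding matrix purely for illustration; real applications would use pretrained or corpus-trained embeddings with hundreds of dimensions.

```r
# Fabricated toy embeddings (3 dimensions) for illustration only.
emb <- rbind(
  man   = c( 0.9,  0.1, 0.2),
  woman = c(-0.9,  0.1, 0.2),
  he    = c( 0.8,  0.0, 0.1),
  she   = c(-0.8,  0.0, 0.1),
  thin  = c(-0.3,  0.5, 0.4),
  obese = c( 0.2, -0.6, 0.3)
)

# Gender dimension: average of male-minus-female difference vectors, normalized.
gender_dim <- colMeans(rbind(emb["man", ] - emb["woman", ],
                             emb["he", ]  - emb["she", ]))
gender_dim <- gender_dim / sqrt(sum(gender_dim^2))

# Cosine projection of body-weight words onto the gender dimension.
project <- function(w) sum(emb[w, ] * gender_dim) / sqrt(sum(emb[w, ]^2))
sapply(c("thin", "obese"), project)
```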
The Future Strikes Back: Using Future Treatments to Detect and Reduce Hidden Bias
Conventional advice discourages controlling for postoutcome variables in regression analysis. By contrast, we show that controlling for commonly available postoutcome (i.e., future) values of the treatment variable can help detect, reduce, and even remove omitted variable bias (unobserved confounding). The premise is that the same unobserved confounder that affects treatment also affects the future value of the treatment. Future treatments thus proxy for the unmeasured confounder, and researchers can exploit these proxy measures productively. We establish several new results: Regarding a commonly assumed data-generating process involving future treatments, we (1) introduce a simple new approach and show that it strictly reduces bias, (2) elaborate on existing approaches and show that they can increase bias, (3) assess the relative merits of alternative approaches, and (4) analyze true state dependence and selection as key challenges. (5) Importantly, we also introduce a new nonparametric test that uses future treatments to detect hidden bias even when future-treatment estimation fails to reduce bias. We illustrate these results empirically with an analysis of the effect of parental income on children's educational attainment.
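The proxy logic is easy to reproduce in a small simulation. Under a hypothetical data-generating process in which an unobserved confounder drives both the current and the future value of the treatment, conditioning on the future treatment absorbs part of the confounding; this illustrates the premise, not the article's full set of estimators or its nonparametric test.

```r
set.seed(5)
n <- 100000
u        <- rnorm(n)                            # unobserved confounder
d_now    <- 0.7 * u + rnorm(n)                  # current treatment
d_future <- 0.7 * u + rnorm(n)                  # future value of the same treatment
y        <- 1.0 * d_now + 1.0 * u + rnorm(n)    # true effect of d_now is 1

coef(lm(y ~ d_now))["d_now"]                    # biased upward because u is omitted
coef(lm(y ~ d_now + d_future))["d_now"]         # bias reduced by the future-treatment proxy
```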
The Age-Period-Cohort-Interaction Model for Describing and Investigating Inter-cohort Deviations and Intra-cohort Life-course Dynamics
Social scientists have frequently sought to understand the distinct effects of age, period, and cohort, but disaggregation of the three dimensions is difficult because cohort = period - age. We argue that this technical difficulty reflects a disconnect between how the cohort effect is conceptualized and how it is modeled in the traditional age-period-cohort framework. We propose a new method, called the age-period-cohort-interaction (APC-I) model, that is qualitatively different from previous methods in that it represents Ryder's (1965) theoretical account of the conditions under which cohort differentiation may arise. The APC-I model does not require problematic statistical assumptions, and its interpretation is straightforward. It quantifies inter-cohort deviations from the age and period main effects and also permits hypothesis testing about intra-cohort life-course dynamics. We demonstrate how this new model can be used to examine age, period, and cohort patterns in women's labor force participation.
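A rough base-R approximation of the idea (not the authors' exact parameterization): fit age and period main effects plus their full interaction, then summarize inter-cohort deviations by averaging the interaction-cell departures along each cohort diagonal, where cohort = period - age. Data and effect sizes below are simulated.

```r
set.seed(6)
d <- expand.grid(age = 1:6, period = 1:6)
d <- d[rep(seq_len(nrow(d)), each = 50), ]
d$cohort <- d$period - d$age
d$y <- 0.3 * d$age + 0.2 * d$period +
       0.5 * (d$cohort == 2) + rnorm(nrow(d))    # one cohort deviates from the main effects

fit_full <- lm(y ~ factor(age) * factor(period), data = d)
fit_main <- lm(y ~ factor(age) + factor(period), data = d)

# Departure of each age-by-period cell from the main-effects surface,
# averaged along cohort diagonals.
cells <- expand.grid(age = 1:6, period = 1:6)
cells$deviation <- predict(fit_full, cells) - predict(fit_main, cells)
cells$cohort <- cells$period - cells$age
round(tapply(cells$deviation, cells$cohort, mean), 2)
```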
Meta-Analysis in Sociological Research: Power and Heterogeneity
Meta-analysis is a statistical method for combining quantitative findings from previous studies. It has been increasingly used to obtain more credible results in a wide range of scientific fields. Combining the results of relevant studies allows researchers to leverage study similarities while modeling potential sources of between-study heterogeneity. This paper reviews the core methodologies of meta-analysis that we consider most relevant to sociological research. After developing the foundations of the fixed-effects and random-effects models of meta-analysis, the paper illustrates the utility of the method with regression coefficients reported from two sets of social science studies. We explain the various steps of the process, including constructing the meta-sample from primary studies, estimating the fixed- and random-effects models, analyzing the sources of heterogeneity across studies, and assessing publication bias. We conclude with a discussion of steps that could be taken to strengthen the development of meta-analysis in sociological research, which would ultimately increase the credibility of sociological inquiry through a cumulative knowledge-building process.
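The core estimators are compact enough to write out directly. The base-R sketch below applies inverse-variance (fixed-effects) weighting and the DerSimonian-Laird random-effects estimator to a set of made-up regression coefficients and standard errors; dedicated packages add diagnostics, moderator analysis, and publication-bias tools.

```r
# Made-up study coefficients and standard errors.
b  <- c(0.21, 0.35, 0.10, 0.28, 0.17)
se <- c(0.08, 0.12, 0.05, 0.10, 0.07)

# Fixed-effects (inverse-variance) estimate.
w_fe  <- 1 / se^2
b_fe  <- sum(w_fe * b) / sum(w_fe)
se_fe <- sqrt(1 / sum(w_fe))

# Between-study heterogeneity (DerSimonian-Laird) and random-effects estimate.
Q    <- sum(w_fe * (b - b_fe)^2)
df   <- length(b) - 1
tau2 <- max(0, (Q - df) / (sum(w_fe) - sum(w_fe^2) / sum(w_fe)))
w_re <- 1 / (se^2 + tau2)
b_re <- sum(w_re * b) / sum(w_re)
I2   <- max(0, (Q - df) / Q)                    # share of variation due to heterogeneity

round(c(fixed = b_fe, random = b_re, tau2 = tau2, I2 = I2), 3)
```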
BIC extensions for order-constrained model selection
The Schwarz or Bayesian information criterion (BIC) is one of the most widely used tools for model comparison in social science research. The BIC, however, is not suitable for evaluating models with order constraints on the parameters of interest. This paper explores two extensions of the BIC for evaluating order-constrained models: one in which a truncated unit information prior is used under the order-constrained model, and another in which a truncated local unit information prior is used. The first prior is centered around the maximum likelihood estimate, and the latter prior is centered around a null value. Several analyses show that the order-constrained BIC based on the local unit information prior works better as an Occam's razor for evaluating order-constrained models and results in lower error probabilities. The methodology based on the local unit information prior is implemented in the R package 'BFpack', which allows researchers to easily apply the method for order-constrained model selection. The usefulness of the methodology is illustrated using data from the European Values Study.
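A hedged sketch of how an order-constrained comparison might be run with the 'BFpack' package named in the abstract; the function and argument names reflect my reading of the package interface and should be checked against its documentation, and the data are simulated.

```r
# install.packages("BFpack")
library(BFpack)

set.seed(7)
d <- data.frame(x1 = rnorm(200), x2 = rnorm(200), x3 = rnorm(200))
d$y <- 0.5 * d$x1 + 0.3 * d$x2 + 0.1 * d$x3 + rnorm(200)

fit <- lm(y ~ x1 + x2 + x3, data = d)
# Evaluate the order-constrained hypothesis b(x1) > b(x2) > b(x3) > 0
# against its complement (assumed hypothesis syntax).
BF(fit, hypothesis = "x1 > x2 > x3 > 0")
```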
Opening the Blackbox of Treatment Interference: Tracing Treatment Diffusion through Network Analysis
Causal inference under treatment interference is a challenging but important problem. Past studies usually make strong assumptions on the structure of treatment interference in order to estimate causal treatment effects while accounting for the effect of treatment interference. In this article, we view treatment diffusion as a concrete form of treatment interference that is prevalent in social settings and also as an outcome of central interest. Specifically, we analyze data from a smoking prevention intervention conducted with 4,094 students in six middle schools in China. We measure treatment interference by tracing how the distributed intervention brochures are shared by students, which provides information to construct the so-called treatment diffusion networks. Besides providing descriptive analyses, we use exponential random graph models to model the treatment diffusion networks in order to reveal covariates and network processes that significantly correlate with treatment diffusion. We show that the findings provide an empirical basis to evaluate previous assumptions on the structure of treatment interference, are informative for imputing treatment diffusion data that is crucial for making causal inference under treatment interference, and shed light on how to improve designs of future interventions that aim to optimize treatment diffusion.
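The sketch below shows the general shape of such an analysis with the 'network' and 'ergm' packages, using a simulated directed sharing network and hypothetical node attributes; the terms chosen here (density, classroom homophily, a sender effect of sex) are illustrative and not the authors' specification.

```r
library(network)
library(ergm)

set.seed(8)
n <- 50
adj <- matrix(rbinom(n * n, 1, 0.04), n, n); diag(adj) <- 0
net <- network(adj, directed = TRUE)             # who shared a brochure with whom
net %v% "male"      <- rbinom(n, 1, 0.5)         # hypothetical node attributes
net %v% "classroom" <- sample(1:4, n, replace = TRUE)

# Density, same-classroom homophily, and a sender effect of sex.
fit <- ergm(net ~ edges + nodematch("classroom") + nodeocov("male"))
summary(fit)
```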
What's to Like? Facebook as a Tool for Survey Data Collection
In this paper, we explore the use of Facebook targeted advertisements for the collection of survey data. We illustrate the potential of survey sampling and recruitment on Facebook through the example of building a large employee-employer linked dataset as part of The Shift Project. We describe the workflow process of targeting, creating, and purchasing survey recruitment advertisements on Facebook. We address concerns about sample selectivity and apply post-stratification weighting techniques to adjust for differences between our sample and that of "gold-standard" data sources. We then compare univariate and multivariate relationships in the Shift data against the Current Population Survey and the National Longitudinal Survey of Youth-1997. Finally, we provide an example of the utility of the firm-level nature of the data by showing how firm-level gender composition is related to wages. We conclude by discussing some important remaining limitations of the Facebook approach and by highlighting some of its unique strengths, including the ability to collect data rapidly in response to research opportunities, rich and flexible sample-targeting capabilities, and low cost, and we suggest broader applications of this technique.
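Post-stratification weighting of the kind described above reduces to a simple ratio of population to sample shares within adjustment cells. The base-R sketch below uses made-up cells and shares; in practice the population shares would come from a benchmark such as the Current Population Survey.

```r
# Made-up adjustment cells: sample counts and benchmark population shares.
sample_counts <- c(young_female = 300, young_male = 150,
                   old_female   = 350, old_male   = 200)
pop_shares    <- c(young_female = 0.20, young_male = 0.24,
                   old_female   = 0.27, old_male   = 0.29)

sample_shares <- sample_counts / sum(sample_counts)
cell_weights  <- pop_shares / sample_shares      # weight given to every respondent in a cell
round(cell_weights, 2)

# Weighted estimates (e.g., weighted.mean(y, w)) can then be benchmarked
# against the gold-standard sources.
```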
Agent-Based Models for Assessing Complex Statistical Models: An Example Evaluating Selection and Social Influence Estimates from SIENA
Although agent-based models (ABMs) have been increasingly accepted in the social sciences as a valid tool to formalize theory, propose mechanisms able to recreate regularities, and guide empirical research, we are not aware of any research using ABMs to assess the robustness of our statistical methods. We argue that ABMs can be extremely helpful for assessing models when the phenomena under study are complex. As an example, we create an ABM to evaluate the estimation of selection and influence effects by SIENA, a stochastic actor-oriented model proposed by Tom A. B. Snijders and colleagues. It is a prominent network analysis method that has gained popularity during the last 10 years and has been applied to estimate selection and influence for a broad range of behaviors and traits, such as substance use, delinquency, violence, health, and educational attainment. However, we know little about the conditions under which this method is reliable or the particular biases it might have. The results of our analysis show that selection and influence are estimated by SIENA asymmetrically and that, with very simple assumptions, we can generate data where selection estimates are highly sensitive to misspecification, suggesting caution when interpreting SIENA analyses.
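A toy agent-based sketch (not the authors' model) of the two processes SIENA is meant to disentangle: agents occasionally rewire ties toward behaviorally similar others (selection) and occasionally shift their behavior toward the mean of their current network partners (influence). Output from a generator like this, where the true strengths are known, is what allows the estimator to be stress-tested.

```r
set.seed(9)
n <- 60
behavior <- rnorm(n)
ties <- matrix(rbinom(n * n, 1, 0.05), n, n); diag(ties) <- 0

for (step in 1:200) {
  i <- sample(n, 1)
  if (runif(1) < 0.5) {
    # Selection: add a tie to the most similar other, drop the least similar partner.
    sim <- -abs(behavior - behavior[i]); sim[i] <- -Inf
    ties[i, which.max(sim)] <- 1
    current <- which(ties[i, ] == 1)
    if (length(current) > 1) ties[i, current[which.min(sim[current])]] <- 0
  } else {
    # Influence: move behavior toward the mean of current network partners.
    nbrs <- which(ties[i, ] == 1)
    if (length(nbrs) > 0) behavior[i] <- 0.8 * behavior[i] + 0.2 * mean(behavior[nbrs])
  }
}

# Behavioral similarity of tied pairs after the dynamics have run.
cor(behavior[row(ties)[ties == 1]], behavior[col(ties)[ties == 1]])
```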
Evaluating the Cumulative Impact of Childhood Misfortune: A Structural Equation Modeling Approach
Most studies of the early origins of adult health rely on summing dichotomously measured negative exposures to measure childhood misfortune (CM), neglect, adversity, or trauma. There are several limitations to this approach, including that it assumes each exposure carries the same level of risk for a particular outcome. Further, it often leads researchers to dichotomize continuous measures for the sake of creating an additive variable from similar indicators. We propose an alternative approach within the structural equation modeling (SEM) framework that allows differential weighting of the negative exposures and can incorporate dichotomous and continuous observed variables as well as latent variables. Using Health and Retirement Study data, our analyses compare the traditional approach (i.e., adding indicators) with alternative models and assess their prognostic validity for adult depressive symptoms. Results reveal that parameter estimates using the conventional model likely underestimate the effects of CM on adult health outcomes. Additionally, while the conventional approach inhibits testing for mediation, our model enables tests of mediation for both individual CM variables and the cumulative variable. Further, we test whether cumulative CM is moderated by the accumulation of protective factors, which facilitates theoretical advances in life course and social inequality research. The approach presented here is one way to examine the cumulative effects of early exposures while attending to diversity in the types of exposures experienced. Within the SEM framework, this versatile approach could be used to model the accumulation of risk or reward in many other areas of sociology and the social sciences beyond health.
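A hedged sketch of the latent-variable alternative using the 'lavaan' package, with hypothetical indicator names and continuous simulated indicators (the article also accommodates dichotomous indicators): childhood misfortune is a latent factor whose loadings weight the exposures differently, and it predicts adult depressive symptoms.

```r
library(lavaan)

set.seed(10)
n <- 1000
cm_true <- rnorm(n)                               # latent childhood misfortune
d <- data.frame(
  parent_loss    = 0.6 * cm_true + rnorm(n),
  low_ses        = 0.8 * cm_true + rnorm(n),
  abuse          = 0.5 * cm_true + rnorm(n),
  parent_illness = 0.4 * cm_true + rnorm(n)
)
d$depress <- 0.5 * cm_true + rnorm(n)

model <- '
  cm =~ parent_loss + low_ses + abuse + parent_illness   # differentially weighted exposures
  depress ~ cm                                           # effect on adult depressive symptoms
'
fit <- sem(model, data = d)
summary(fit, standardized = TRUE)
```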
How Accurate Are Self-reports of Voluntary Association Memberships?
Questions on voluntary association memberships have been used extensively in social scientific research for decades. Researchers generally assume that these respondent self-reports are accurate, but their measurement properties have never been assessed. Respondent characteristics are known to influence the accuracy of other self-report variables such as self-reported health, voting, or test scores. In this article, we investigate whether measurement error occurs in self-reports of voluntary association memberships. We use the 2004 General Social Survey (GSS) questions on voluntary associations, which include a novel resource: the actual organization names listed by respondents. We find that this widely used voluntary association classification scheme contains substantial measurement error overall, especially within certain categories. Using multilevel logistic regression, we predict the accuracy of responses, which are nested within respondents and interviewers. We find that certain respondent characteristics, including some used in research on voluntary associations, influence respondent accuracy. Such measurement error will affect the statistics and conclusions drawn from data on voluntary associations.
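The multilevel logistic specification described above can be sketched with the 'lme4' package; the variable names and the simulated data below are hypothetical, with each reported membership coded as accurate or not and random intercepts for respondents and interviewers.

```r
library(lme4)

set.seed(11)
n_resp <- 300; n_int <- 30
d <- data.frame(
  respondent  = rep(seq_len(n_resp), each = 5),
  interviewer = rep(sample(seq_len(n_int), n_resp, replace = TRUE), each = 5),
  education   = rep(rnorm(n_resp), each = 5)
)
resp_re <- rnorm(n_resp, sd = 0.5)[d$respondent]
int_re  <- rnorm(n_int,  sd = 0.3)[d$interviewer]
d$accurate <- rbinom(nrow(d), 1, plogis(0.8 + 0.3 * d$education + resp_re + int_re))

fit <- glmer(accurate ~ education + (1 | respondent) + (1 | interviewer),
             family = binomial, data = d)
summary(fit)
```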
The Turnout Gap in Surveys: Explanations and Solutions
Postelection surveys regularly overestimate voter turnout by 10 points or more. This article provides the first comprehensive documentation of the turnout gap in three major ongoing surveys (the General Social Survey, Current Population Survey, and American National Election Studies), evaluates explanations for it, interprets its significance, and suggests means to continue evaluating and improving survey measurements of turnout. Accuracy was greater in face-to-face than telephone interviews, consistent with the notion that the former mode engages more respondent effort with less social desirability bias. Accuracy was greater when respondents were asked about the most recent election, consistent with the hypothesis that forgetting creates errors. Question wordings designed to minimize source confusion and social desirability bias improved accuracy. Rates of reported turnout were lower with proxy reports than with self-reports, which may suggest greater accuracy of proxy reports. People who do not vote are less likely to participate in surveys than voters are.
Using universal kriging to improve neighborhood physical disorder measurement
Ordinary kriging, a spatial interpolation technique, is commonly used in the social sciences to estimate neighborhood attributes such as physical disorder. Universal kriging, developed and used in the physical sciences, extends ordinary kriging by supplementing the spatial model with additional covariates. We measured physical disorder on 1,826 sampled block faces across four US cities (New York, Philadelphia, Detroit, and San Jose) using Google Street View imagery. We then compared leave-one-out cross-validation accuracy between universal and ordinary kriging and used random subsamples of our observed data to explore whether universal kriging could provide equal measurement accuracy with less spatially dense samples. Universal kriging did not always improve accuracy. However, a measure of housing vacancy did improve estimation accuracy in Philadelphia and Detroit (7.9% and 6.8% lower root mean squared error, respectively) and allowed for equivalent estimation accuracy with half the sampled points in Philadelphia. Universal kriging may improve neighborhood measurement.
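The contrast between the two kriging variants can be sketched with the 'sp' and 'gstat' packages on simulated point data: ordinary kriging models a constant mean plus spatial dependence, while universal kriging adds a covariate (here a hypothetical vacancy measure) to the trend, so the prediction locations must carry that covariate. Variable names, variogram settings, and data are illustrative only.

```r
library(sp)
library(gstat)

set.seed(12)
n <- 300
d <- data.frame(x = runif(n, 0, 1000), y = runif(n, 0, 1000), vacancy = runif(n))
d$disorder <- 2 + 1.5 * d$vacancy + rnorm(n, sd = 0.5)
coordinates(d) <- ~ x + y
train <- d[1:250, ]; test <- d[251:300, ]

# Ordinary kriging: constant mean plus a fitted variogram.
v_ok <- fit.variogram(variogram(disorder ~ 1, train), vgm(0.5, "Exp", 300, 0.1))
ok   <- krige(disorder ~ 1, train, test, model = v_ok)

# Universal kriging: the vacancy covariate enters the trend.
v_uk <- fit.variogram(variogram(disorder ~ vacancy, train), vgm(0.5, "Exp", 300, 0.1))
uk   <- krige(disorder ~ vacancy, train, test, model = v_uk)

sqrt(mean((test$disorder - ok$var1.pred)^2))   # ordinary-kriging RMSE
sqrt(mean((test$disorder - uk$var1.pred)^2))   # universal-kriging RMSE
```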
Optimizing Count Responses in Surveys: A Machine-learning Approach
Count responses with grouping and right censoring have long been used in surveys to study a variety of behaviors, statuses, and attitudes. Yet grouping or right-censoring decisions for count responses still rely on arbitrary choices made by researchers. We develop a new method for evaluating grouping and right-censoring decisions for count responses from a (semisupervised) machine-learning perspective. This article uses Poisson multinomial mixture models to conceptualize the data-generating process of count responses with grouping and right censoring and demonstrates the link between grouping-scheme choices and asymptotic distributions of the Poisson mixture. To search for the optimal grouping scheme maximizing objective functions of the Fisher information (matrix), an innovative three-step M algorithm is then proposed to process infinitely many grouping schemes based on Bayesian A-, D-, and E-optimalities. A new R package is developed to implement this algorithm and evaluate grouping schemes of count responses. Results show that an optimal grouping scheme not only leads to a more efficient sampling design but also outperforms a nonoptimal one even if the latter has more groups.
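A small numerical sketch of the optimality idea: for a given Poisson rate, compute how much Fisher information about the rate survives a grouping scheme and compare candidate schemes (with a single rate parameter the A-, D-, and E-criteria coincide, so one scalar suffices). The rate, cutpoints, and schemes below are hypothetical, and this is not the article's three-step algorithm.

```r
# Poisson cell probabilities implied by a grouping scheme (cutpoints).
cell_probs <- function(lambda, cuts) {
  lower <- c(-Inf, cuts); upper <- c(cuts, Inf)
  ppois(upper, lambda) - ppois(lower, lambda)
}

# Fisher information about lambda retained by the grouped data
# (numerical derivative of the cell probabilities).
fisher_info <- function(lambda, cuts, eps = 1e-5) {
  p  <- cell_probs(lambda, cuts)
  dp <- (cell_probs(lambda + eps, cuts) - cell_probs(lambda - eps, cuts)) / (2 * eps)
  sum(dp^2 / p)
}

lambda <- 2.4
fisher_info(lambda, cuts = c(0, 1, 2, 4))      # scheme A: cells 0, 1, 2, 3-4, 5+
fisher_info(lambda, cuts = c(0, 5))            # scheme B: cells 0, 1-5, 6+
1 / lambda                                     # information from the exact, ungrouped count
```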
