Psychological Methods

The recently proposed class of item response tree models provides a flexible framework for modeling multiple response processes. This feature is particularly attractive for understanding how response styles may affect answers to attitudinal questions. Facilitating the disassociation of response styles and attitudinal traits, item response tree models can provide powerful process tests of how different response formats may affect the measurement of substantive traits. In an empirical study, 3 response formats were used to measure the 2-dimensional Personal Need for Structure traits...
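As a rough illustration of how an item response tree separates response processes, the sketch below recodes 5-point ratings into three binary pseudo-items (midpoint, direction, extremity). This decomposition is one commonly used for 5-point scales and is an assumption here; the trees used for the 3 response formats in the study are not given in this abstract, and the names `PSEUDO_ITEMS` and `recode` are hypothetical.

```python
# Assumed tree for a 5-point rating scale (not taken from the study):
# each rating maps to (midpoint, direction, extremity) pseudo-items.
# None marks a node that is not reached for that rating.
PSEUDO_ITEMS = {
    1: (0, 0, 1),        # not midpoint, disagree side, extreme
    2: (0, 0, 0),        # not midpoint, disagree side, not extreme
    3: (1, None, None),  # midpoint chosen; later nodes never reached
    4: (0, 1, 0),        # not midpoint, agree side, not extreme
    5: (0, 1, 1),        # not midpoint, agree side, extreme
}


def recode(responses):
    """Map raw 1-5 ratings to (midpoint, direction, extremity) tuples."""
    return [PSEUDO_ITEMS[r] for r in responses]
```

Each pseudo-item could then be modeled with its own latent trait, which is what would allow response styles (midpoint and extremity tendencies) to be separated from the attitudinal trait carried by the direction node.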

Multiple item scales have long been used to measure latent constructs on individual-level data. This is appropriate when an otherwise unobserved construct is indirectly measured by combining observable correlated characteristics that are thought to measure slightly different dimensions of that construct. Network data, which consist of observations on the relationships between a set of actors, however, are typically drawn from single-relation measurements. While this approach is sufficient for learning about discrete relations (communication, coauthorship, etc...

Multiple imputation has enjoyed widespread use in social science applications, yet the application of imputation-based inference to structural equation modeling has received virtually no attention in the literature. Thus, this study has 2 overarching goals: evaluate the application of Meng and Rubin's (1992) pooling procedure for the likelihood ratio statistic to the SEM test of model fit, and explore the possibility of using this test statistic to define imputation-based versions of common fit indices such as the TLI, CFI, and RMSEA...
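For orientation, the sketch below computes the three fit indices from chi-square statistics using their standard complete-data definitions. It does not implement the Meng and Rubin pooling step; it only assumes that a pooled statistic could be supplied in place of `chi2_model` and `chi2_base`. The function name, argument names, and the N - 1 scaling in the RMSEA (some software uses N) are assumptions.

```python
import math


def fit_indices(chi2_model, df_model, chi2_base, df_base, n):
    """Return RMSEA, CFI, and TLI from model and baseline chi-squares."""
    # Noncentrality estimates, floored at zero.
    excess_model = max(chi2_model - df_model, 0.0)
    excess_base = max(chi2_base - df_base, 0.0)

    # RMSEA: misfit per degree of freedom, scaled by sample size.
    rmsea = math.sqrt(excess_model / (df_model * (n - 1)))

    # CFI: relative reduction in noncentrality versus the baseline model.
    denom = max(excess_model, excess_base)
    cfi = 1.0 if denom == 0 else 1.0 - excess_model / denom

    # TLI: compares chi-square/df ratios of baseline and fitted model.
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    tli = (ratio_base - ratio_model) / (ratio_base - 1.0)

    return {"rmsea": rmsea, "cfi": cfi, "tli": tli}
```

The inputs are purely illustrative; for example, `fit_indices(30.0, 20, 500.0, 30, 201)` uses made-up values, not results from the study.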

The most widely used statistical model for conducting moderation analysis is the moderated multiple regression (MMR) model. In MMR modeling, missing data could pose a challenge, mainly because the interaction term is a product of two or more variables and thus is a nonlinear function of the involved variables. In this study, we consider a simple MMR model, where the effect of the focal predictor X on the outcome Y is moderated by a moderator U. The primary interest is to find ways of estimating and testing the moderation effect in the presence of missing data in X...
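The simple MMR model described above can be written as follows. The coefficient symbols and the subscript i for individuals are assumed notation, not taken from the article.

```latex
Y_i = \beta_0 + \beta_1 X_i + \beta_2 U_i + \beta_3 X_i U_i + \varepsilon_i
```

Under this notation, the effect of X on Y at a given value of U is \beta_1 + \beta_3 U, so the moderation effect is carried by \beta_3. Because the product X_i U_i is missing whenever X_i is missing, the nonlinearity mentioned above enters through this term.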

Psychologists are increasingly positing theories of behavior that suggest psychological constructs are curvilinearly related to outcomes. However, results from empirical tests for such curvilinear relations have been mixed. We propose that correctly identifying the response process underlying responses to measures is important for the accuracy of these tests. Indeed, past research has indicated that item responses to many self-report measures follow an ideal point response process, wherein respondents agree only with items that reflect their own standing on the measured variable, as opposed to a dominance process, wherein stronger agreement, regardless of item content, is always indicative of higher standing on the construct...

The study examined the performance of maximum likelihood (ML) and multiple imputation (MI) procedures for missing data in longitudinal research when fitting latent growth models. A Monte Carlo simulation study was conducted with conditions of small sample size, intermittent missing data, and nonnormality. The results indicated that ML tended to display slightly smaller degrees of bias than MI across missing completely at random (MCAR) and missing at random (MAR) conditions. Although specification of prior information in the MI imputation-posterior (I-P) phase influenced the performance of MI, especially with nonnormal small samples and missing not at random (MNAR), the impact of this tight specification was not dramatic...

In psychology, the use of intensive longitudinal data has steeply increased during the past decade. As a result, studying temporal dependencies in such data with autoregressive modeling is becoming common practice. However, standard autoregressive models are often suboptimal as they assume that parameters are time-invariant. This is problematic if changing dynamics (e.g., changes in the temporal dependency of a process) govern the time series. Often a change in the process, such as emotional well-being during therapy, is the very reason why it is interesting and important to study psychological dynamics...
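A small simulation can illustrate why a time-invariant autoregressive parameter is a problem when dynamics change. The sketch below is not the method proposed in the article; it simulates a first-order autoregressive series whose coefficient shifts halfway through (values 0.2 and 0.7 are arbitrary assumptions) and compares a single pooled lag-1 estimate with estimates from each half. All function names are hypothetical.

```python
import random


def simulate_ar1(phis, seed=0):
    """Simulate y_t = phi_t * y_{t-1} + e_t with standard normal noise."""
    rng = random.Random(seed)
    y, prev = [], 0.0
    for phi in phis:
        prev = phi * prev + rng.gauss(0.0, 1.0)
        y.append(prev)
    return y


def lag1_coefficient(y):
    """Least-squares slope of y_t on y_{t-1} after mean-centering."""
    mean = sum(y) / len(y)
    num = sum((y[t] - mean) * (y[t - 1] - mean) for t in range(1, len(y)))
    den = sum((y[t - 1] - mean) ** 2 for t in range(1, len(y)))
    return num / den


# Temporal dependency changes from 0.2 to 0.7 at the midpoint.
phis = [0.2] * 2000 + [0.7] * 2000
series = simulate_ar1(phis)

first_half = lag1_coefficient(series[:2000])
second_half = lag1_coefficient(series[2000:])
pooled = lag1_coefficient(series)  # one time-invariant estimate
```

The pooled estimate falls between the two regime-specific values and describes neither period well, which is the limitation of the time-invariance assumption noted above.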

This article evaluates the impact of partial or total covariate inclusion or exclusion on the class enumeration performance of growth mixture models (GMMs). Study 1 examines the effect of including an inactive covariate when the population model is specified without covariates. Study 2 examines the case in which the population model is specified with 2 covariates influencing only the class membership. Study 3 examines a population model including 2 covariates influencing the class membership and the growth factors...

Technology and collaboration enable dramatic increases in the size of psychological and psychiatric data collections, but finding structure in these large data sets with many collected variables is challenging. Decision tree ensembles such as random forests (Strobl, Malley, & Tutz, 2009) are a useful tool for finding structure, but are difficult to interpret with multiple outcome variables which are often of interest in psychology. To find and interpret structure in data sets with multiple outcomes and many predictors (possibly exceeding the sample size), we introduce a multivariate extension to a decision tree ensemble method called gradient boosted regression trees (Friedman, 2001)...

Structural equation model (SEM) trees, a combination of SEMs and decision trees, have been proposed as a data-analytic tool for theory-guided exploration of empirical data. With respect to a hypothesized model of multivariate outcomes, such trees recursively find subgroups with similar patterns of observed data. SEM trees allow for the automatic selection of variables that predict differences across individuals in specific theoretical models, for instance, differences in latent factor profiles or developmental trajectories...

The growth of social media and user-created content on online sites provides unique opportunities to study models of human declarative memory. By framing the task of choosing a hashtag for a tweet and tagging a post on Stack Overflow as a declarative memory retrieval problem, 2 cognitively plausible declarative memory models were applied to millions of posts and tweets and evaluated on how accurately they predict a user's chosen tags. An ACT-R based Bayesian model and a random permutation vector-based model were tested on the large data sets...

Studying communities impacted by traumatic events is often costly, requires swift action to enter the field when disaster strikes, and may be invasive for some traumatized respondents. Typically, individuals are studied after the traumatic event with no baseline data against which to compare their postdisaster responses. Given these challenges, we used longitudinal Twitter data across 3 case studies to examine the impact of violence near or on college campuses in the communities of Isla Vista, CA, Flagstaff, AZ, and Roseburg, OR, compared with control communities, between 2014 and 2015...

This article aims to introduce the reader to essential tools that can be used to obtain insights and build predictive models using large data sets. Recent user proliferation in the digital environment has led to the emergence of large samples containing a wealth of traces of human behaviors, communication, and social interactions. Such samples offer the opportunity to greatly improve our understanding of individuals, groups, and societies, but their analysis presents unique methodological challenges. In this tutorial, we discuss potential sources of such data and explain how to efficiently store them...

The introduction to this special issue on psychological research involving big data summarizes the highlights of 10 articles that address a number of important and inspiring perspectives, issues, and applications. Four common themes that emerge in the articles with respect to psychological research conducted in the area of big data are mentioned, including: (a) The benefits of collaboration across disciplines, such as those in the social sciences, applied statistics, and computer science. Doing so assists in grounding big data research in sound theory and practice, as well as in affording effective data retrieval and analysis...

Language data available through social media provide opportunities to study people at an unprecedented scale. However, little guidance is available to psychologists who want to enter this area of research. Drawing on tools and techniques developed in natural language processing, we first introduce psychologists to social media language research, identifying descriptive and predictive analyses that language data allow. Second, we describe how raw language data can be accessed and quantified for inclusion in subsequent analyses, exploring personality as expressed on Facebook to illustrate...

Statistical learning theory (SLT) is the statistical formulation of machine learning theory, a body of analytic methods common in "big data" problems. Regression-based SLT algorithms seek to maximize predictive accuracy for some outcome, given a large pool of potential predictors, without overfitting the sample. Research goals in psychology may sometimes call for high dimensional regression. One example is criterion-keyed scale construction, where a scale with maximal predictive validity must be built from a large item pool...

The term big data encompasses a wide range of approaches to collecting and analyzing data in ways that were not possible before the era of modern personal computing. One approach to big data of great potential to psychologists is web scraping, which involves the automated collection of information from webpages. Although web scraping can create massive datasets with tens of thousands of variables, it can also be used to create modestly sized, more manageable datasets with tens of variables but hundreds of thousands of cases, well within the skillset of most psychologists to analyze, in a matter of hours...
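To make the idea of automated collection concrete, here is a minimal sketch that extracts the rows of an HTML table using only Python's standard library. The page content in `PAGE` is invented for illustration, no network request is made, and the class and function names are hypothetical; it is not the tooling described in the article.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of table cells, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._in_cell:
            self._row[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Invented example page; a real scraper would first download the HTML.
PAGE = """<table>
<tr><th>id</th><th>rating</th></tr>
<tr><td>1</td><td>4.5</td></tr>
<tr><td>2</td><td>3.0</td></tr>
</table>"""


def scrape_rows(html):
    """Parse an HTML string and return its table rows as lists of strings."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return parser.rows
```

Calling `scrape_rows(PAGE)` yields one list per table row, which could then be written to a file with one case per line.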

For nearly a century, detecting the genetic contributions to cognitive and behavioral phenomena has been a core interest for psychological research. Recently, this interest has been reinvigorated by the availability of genotyping technologies (e.g., microarrays) that provide new genetic data, such as single nucleotide polymorphisms (SNPs). These SNPs, which represent pairs of nucleotide letters (e.g., AA, AG, or GG) found at specific positions on human chromosomes, are best considered as categorical variables, but this coding scheme can complicate the multivariate analysis of their relationships with behavioral measurements, because most multivariate techniques developed for the analysis of relationships between sets of variables are designed for quantitative variables...