Item-Level Missing Data
Investigators: Bill Chen and Dr. Victoria Savalei
Psychologists often use scales composed of multiple items to measure underlying constructs such as well-being, depression, and personality traits. Missing data frequently occur at the item level. If variables in the dataset can account for the missingness, the data are missing at random (MAR). Modern missing data approaches can handle MAR data effectively, but existing analytical approaches cannot accommodate item-level missing data. A very common practice in psychology is to average all available items to produce scale means when some items are missing. This approach, called available-case maximum likelihood (ACML), may produce biased parameter estimates in addition to incorrect standard errors. Another approach is scale-level full information maximum likelihood (SL-FIML), which treats the whole scale as missing if even one item is missing. SL-FIML is inefficient and prone to bias. A new analytical approach, the two-stage maximum likelihood (TSML) approach, was recently developed as an alternative by Savalei and Rhemtulla (2017). The original work showed that TSML outperformed ACML and SL-FIML in structural equation models with parcels. Our current study examines the performance of ACML, SL-FIML, and TSML in the context of bivariate regression.
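As a minimal illustration of the two scoring strategies described for the missing data project, the sketch below contrasts ACML-style averaging of available items with the scale-level treatment used as input to SL-FIML. The data are invented, and only the formation of scale scores is shown, not the full estimation methods:

```python
import numpy as np

# Hypothetical responses of 4 people to a 5-item scale; np.nan = missing item.
items = np.array([
    [4.0, 3.0, np.nan, 5.0, 4.0],
    [2.0, 2.0, 3.0, np.nan, np.nan],
    [5.0, 4.0, 4.0, 4.0, 5.0],
    [np.nan, 1.0, 2.0, 2.0, 1.0],
])

# ACML-style scoring: average whatever items each respondent answered.
acml_scores = np.nanmean(items, axis=1)

# SL-FIML-style input: the scale score is treated as missing
# whenever any item is missing (NaN propagates through the mean).
sl_scores = items.mean(axis=1)

print(acml_scores)  # a score for every respondent
print(sl_scores)    # a score only for the one complete case
```

Every respondent receives an ACML score, but only the single complete case contributes a scale score under the scale-level approach, which is one source of its inefficiency.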
Measurement Invariance
Investigators: Jordan Brace and Dr. Victoria Savalei
Measurement invariance is the property of a psychometric instrument indicating that it performs identically across different populations. When measurement invariance does not hold, comparing scores of individuals from different populations (e.g., in cross-cultural research, or when using a threshold for diagnosis) is inappropriate. Our research on this topic focuses on evaluating methods for testing measurement invariance, particularly when data are non-normal, as well as on meaningfully quantifying the impact of an absence of measurement invariance in applications of psychometric instruments.
Format of Psychological Scales
Investigators: Cathy Xijuan Zhang and Dr. Victoria Savalei
Undergraduate Helpers: Bernice Liang, Yu Luo, Ramsha Noor and Kurtis Stewart
The traditional Likert-format scale usually contains both positively worded (PW) items and reverse-worded (RW) items. The main rationale for including RW items is to control for acquiescence bias, the tendency for respondents to endorse an item regardless of its content (Ray, 1983). However, many researchers have questioned the benefit of including RW items in scales (e.g., Sonderen, Sanderman, & Coyne, 2013). First, if the tendency to engage in acquiescence bias is an individual difference variable, it will always contaminate the covariance structure of the data. Second, some RW items may cause confusion and lead to errors due to carelessness among some respondents. Finally, RW items may also create method effects that represent a consistent behavioral trait, such as fear of negative evaluation (e.g., DiStefano & Motl, 2006), and these method effects may lower the validity and reliability of the scales (e.g., Rodebaugh et al., 2011; Roszkowski & Soven, 2010). Unlike scales in the Likert format, scales in the Expanded format do not contain RW items and thus avoid the problems associated with them. In the Expanded format, a full sentence replaces each of the response options in the Likert format. For instance, an item from the Rosenberg Self-Esteem Scale (RSES) that reads "On the whole, I am satisfied with myself" and that has four response options (i.e., Strongly disagree, Somewhat disagree, Somewhat agree, and Strongly agree) would be written in the Expanded format as follows:
- On the whole, I am very satisfied with myself.
- On the whole, I am satisfied with myself.
- On the whole, I am disappointed with myself.
- On the whole, I am very disappointed with myself.
The Expanded format also avoids the misreading errors that RW items can cause (e.g., "I am not happy" misread as "I am happy").
Bootstrap Fit Indices
Investigators: Cathy Xijuan Zhang and Dr. Victoria Savalei
Bootstrapping approximate fit indices in structural equation modeling (SEM) is of great importance because most fit indices do not have tractable analytic distributions. The model-based bootstrap, which has been proposed to obtain the distribution of the model chi-square statistic under the null hypothesis (Bollen & Stine, 1992), is not theoretically appropriate for obtaining confidence intervals for fit indices because it assumes the null hypothesis is exactly true. On the other hand, the naive bootstrap is not expected to work well for fit indices that are based on the chi-square statistic, such as the RMSEA and the CFI, because the sample noncentrality is a biased estimate of the population noncentrality. We argue that a recently proposed bootstrap approach due to Yuan, Hayashi, and Yanagihara (YHY; 2007) is ideal for bootstrapping fit indices, such as the RMSEA and the CFI, that are based on the chi-square. This method transforms the data so that the parent population has a population noncentrality parameter equal to the estimated noncentrality in the original sample. Our lab is investigating the performance of the YHY bootstrap and the naive bootstrap for four indices: RMSEA, CFI, GFI, and SRMR. We are finding that for the RMSEA and the CFI, confidence intervals (CIs) under the YHY bootstrap have relatively good coverage rates across all conditions, whereas CIs under the naive bootstrap have very low coverage rates when the fitted model has large degrees of freedom.
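The mechanics of the naive (percentile) bootstrap mentioned above can be sketched generically: resample whole cases with replacement, recompute the statistic of interest, and take percentiles of the resulting distribution. The data and stand-in statistic below are invented for illustration; this does not implement the YHY transformation or an SEM fit index:

```python
import numpy as np

rng = np.random.default_rng(12345)
n = 200
data = rng.normal(size=(n, 3))  # hypothetical raw data: n cases, 3 variables

def fit_statistic(x):
    # Stand-in for a model-based fit index: the largest absolute
    # off-diagonal correlation among the observed variables.
    c = np.corrcoef(x, rowvar=False)
    return np.max(np.abs(c[np.triu_indices(3, k=1)]))

# Naive bootstrap: resample cases with replacement and recompute the statistic.
boot = np.array([
    fit_statistic(data[rng.integers(0, n, size=n)])
    for _ in range(1000)
])
ci_lower, ci_upper = np.percentile(boot, [2.5, 97.5])
print(f"95% naive bootstrap CI: [{ci_lower:.3f}, {ci_upper:.3f}]")
```

The YHY approach differs in that, before resampling, it transforms the raw data so that the implied population noncentrality equals the estimate from the original sample; that transformation requires a fitted SEM and is omitted here.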
Impact of Reverse Wording
Investigators: Cathy Xijuan Zhang, Ramsha Noor and Victoria Savalei
Undergraduate Helpers: Kurtis Stewart
Reverse wording is frequently employed across a variety of psychological scales to reduce or eliminate acquiescence bias, but there is rising concern about its harmful effects, one being its potential to contaminate the covariance structure of the scale. As a result, conclusions obtained via traditional covariance analyses may be distorted. Our lab examines the impact of reverse wording on the factor structure of the abbreviated 18-item Need for Cognition (NFC) scale using confirmatory and exploratory factor analysis. Data are fit to four previously developed models: a unidimensional single-factor model, a two-factor model distinguishing items of positive polarity from those of negative polarity, and two two-factor models, each with one all-encompassing factor and one method factor. The NFC scale is modified to form three revised versions, ranging from no reverse-worded items to all reverse-worded items. The original scale and the revised versions are each fit to the four models, with the aim of gaining further insight not only into the dimensionality of the scale but also into the effect of reverse wording on its factor structure. Our current results show that the degree and type of reverse wording differentially impact the factor structure and functioning of the NFC scale.