
Statistical Analysis of a Psychometric Scale

This was my final project for a multivariate statistics course I took in grad school. We had a lot of latitude with this assignment, as the only requirement was that we apply one of the methods we'd learned in class to analyze a dataset of our choosing. I decided to work with a dataset that contained survey responses to a conspiracism inventory, and approach my analysis through the lens of individual differences.

 

I'm drawn to these issues because I feel their prevalence and impact are underestimated. Conspiracism particularly feels inescapable these days, whether it be on Twitter or at the dinner table. And while I'm always tempted to dismiss absurd (and usually racist) conspiracy theories as contrarian and politically motivated, I wanted to use this project as an opportunity to build cognitive empathy for those who are susceptible to this kind of thinking.


To understand how people think, we need to be able to measure how they think. And we can't do this if our measurement tools aren't tuned right. So much of research involves reducing a diverse group of people to a single average. This is necessary if we want our findings to be easily interpretable, but can come at the cost of true understanding. Every part of the individual that we ignore decreases the precision of our model, the quality of our knowledge, and our ability to make informed decisions.


When done right, splitting populations by group can reduce noise and make our predictions about individuals better. 

 

But we can't consider every potential factor at play — even if we could, our models would be overfit. So how do we strike a balance? Consideration of group identity is a good place to start. The purpose of this analysis was to see if the structure of a psychometric scale depends on gender. This write-up mostly covers my process and findings, but the full paper can be found here.

Background

For this project, I analyzed the Generic Conspiracist Beliefs Scale (GCBS), a scale measuring respondents' level of agreement with 15 statements across five domains of conspiracist beliefs. This framework was developed by Brotherton, French, & Pickering in 2013.


The five-factor model of conspiracist beliefs as defined by Brotherton, French, & Pickering.

 

While the researchers checked several types of reliability and validity after developing this framework, they didn't assess its degree of measurement invariance, meaning whether the structure of the framework stays the same across groups. A scale with stronger measurement invariance is usually considered better, since it measures the same thing, the same way, no matter who takes it.

Methods and Results

Methods 

I used a statistical method called factor analysis to identify and address gender-based differences in this scale. A factor analysis is a way of using observed variables to find latent variables. Observed variables are things we can directly measure, like how many texts you sent today. Latent variables are things we can't directly measure, but that influence observed variables. The personality trait extraversion is a good example of a latent variable: your level of extraversion might affect how many texts you sent today.

 

A factor analysis finds latent variables by calculating how the observed variables in a dataset covary, or move together. Observed variables that strongly covary are assumed to belong to the same latent variable. The underlying math optimizes the way observed variables are clustered, maximizing covariance within each latent variable. In this process, each observed variable is given a factor loading, which tells us how strongly it's influenced by each latent variable. 
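The idea above can be sketched in a few lines of code. My analyses were done in R, but here is a minimal Python illustration, using simulated data rather than the GCBS responses: two made-up latent factors drive six made-up items, and extracting loadings from the items' correlation matrix recovers that structure.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulate two latent factors driving six observed items
# (illustrative data only, not the GCBS responses)
f1 = rng.normal(size=n)  # latent factor 1
f2 = rng.normal(size=n)  # latent factor 2
noise = rng.normal(scale=0.5, size=(n, 6))
items = np.column_stack([f1, f1, f1, 0.6 * f2, 0.6 * f2, 0.6 * f2]) + noise

# Correlation matrix: how the observed items covary
R = np.corrcoef(items, rowvar=False)

# Principal-axis-style extraction: eigendecomposition of R,
# keeping the two largest components
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Factor loadings: each eigenvector scaled by the square root
# of its eigenvalue
loadings = eigvecs[:, :2] * np.sqrt(eigvals[:2])
print(np.round(loadings, 2))
```

Items 1–3 end up loading strongly on one factor and items 4–6 on the other, because that's how they covary; the researcher's job is then to name what each cluster has in common.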


Importantly, it's up to the researcher to make sense of these factors. The factor analysis clusters like items, but a person has to manually extract the central theme from each of these clusters.

Analysis

I found a large dataset on the Open Source Psychometric Project with GCBS item responses and demographic information for each participant. I used confirmatory factor analyses to test the original model defined by Brotherton, French, & Pickering. While this model fit well, three of the five factors were strongly correlated, suggesting that it may split items into more factors than needed.


I evaluated the semantic and thematic qualities of these statements and developed a four-factor model, which I tested against the original. The revised model decreased the correlation between factors, offered a simpler solution that made more sense, and didn't generate additional error.


R code for confirmatory factor analyses specifying the original and revised model of conspiracist beliefs.

 

Next I assessed the measurement invariance of both the original and revised models with multi-group confirmatory factor analyses. This revealed no measurement invariance, suggesting substantial gender-based differences in the structure of conspiracist beliefs.

 

To find better factor structures that account for these individual differences, I conducted two exploratory factor analyses, one on men and one on women. 


R code testing the revised model for increasing degrees of measurement invariance. Significant chi-squared tests provide evidence for lack of metric, scalar, and strict invariance. Practically, this means all parameter estimates likely vary by gender, from item means to factor loadings.
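The logic of those invariance tests is a chi-squared difference test between nested models: if constraining parameters to be equal across groups significantly worsens fit, that level of invariance fails. Here's a minimal Python sketch of that comparison; the fit statistics are made-up placeholders, not values from my paper, and the helper implements the closed-form chi-squared survival function (valid for even degrees of freedom only).

```python
import math

def chi2_sf_even_df(x, df):
    """Chi-squared survival function P(X > x) for even df,
    via the closed form exp(-x/2) * sum_{k<df/2} (x/2)^k / k!."""
    assert df % 2 == 0 and df > 0
    half = x / 2.0
    return math.exp(-half) * sum(half**k / math.factorial(k)
                                 for k in range(df // 2))

# Hypothetical fit statistics for two nested invariance models
# (illustrative numbers, not the values from the actual analysis)
chisq_configural, df_configural = 410.2, 142  # structure free per group
chisq_metric,     df_metric     = 452.9, 152  # loadings constrained equal

# Chi-squared difference (likelihood-ratio) test
delta_chisq = chisq_metric - chisq_configural
delta_df = df_metric - df_configural
p = chi2_sf_even_df(delta_chisq, delta_df)

print(f"delta chi2 = {delta_chisq:.1f}, delta df = {delta_df}, p = {p:.6f}")
if p < .05:
    print("Equality constraints significantly worsen fit: metric invariance fails")
```

In practice an R package like lavaan runs this same comparison at each step (configural, metric, scalar, strict), which is what the screenshot above shows.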

Results

The exploratory factor analysis revealed two notable differences in these structures: domain size and domain theme. For men, the largest domain of conspiracist beliefs is related to harm (Covert Harm), while for women the largest domain is related to deceit (Nefarious Public Relations). 


Potential structures of conspiracist beliefs for men (left) and women (right), as suggested by an exploratory factor analysis. These structures need to be validated with a confirmatory factor analysis of a new dataset.


While both genders have a domain related to harm, the scope of this harm differs. For men, the harm domain (Covert Harm) contains a variety of beliefs related to a generalized sense of harm. For women, the harm domain (Harmful Science and Technology) contains only specific beliefs related to science and technology. These structures vary because men and women differed in their evaluation of two ambiguously-worded statements. 


For statements that describe both harmful and deceitful situations, men appear to be more attentive to the harmful elements and women to the deceitful elements. This effect is demonstrated well by the statement, “The government permits or perpetrates acts of terrorism on its own soil, disguising its involvement.” This is a double-barreled statement that makes salient both harm (“terrorism”) and deceit (“disguising”). For men, this statement was clustered with statements that were explicitly related to harm, but for women it was clustered with statements that were explicitly related to deceit. 

Discussion

There are probably lots of reasons for this difference in structure. One that was particularly interesting to me had to do with threat perception. Compared to women, men score higher on measures of reactive aggression and have stronger emotional responses to ambiguous social stimuli (Im et al., 2018; Newhoff et al., 2015). This could explain why the two ambiguously-worded statements were perceived by men as primarily harmful and by women as primarily deceitful.

 

More than anything, this project was an exercise in measurement refinement. Scale design is an iterative process of development and validation. If we want to build robust frameworks that accurately and ethically measure people, we need to identify and address their limitations. This analysis only took a couple hours, but provided important insights about how different types of people see the world. With a few cups of coffee and some really cool stats, we can find and fix shortcomings to make our research more precise and useful.

References

Im, S., Jin, G., Jeong, J., Yeom, J., Jekal, J., Lee, S. I., Cho, J. A., Lee, S., Lee, Y., Kim, D. H., Bae, M., Heo, J., Moon, C., & Lee, C. H. (2018). Gender differences in aggression-related responses on EEG and ECG. Experimental Neurobiology, 27(6), 526–538.

Newhoff, M., Treiman, D. M., Smith, K. A., & Steinmetz, P. N. (2015). Gender differences in human single neuron responses to male emotional faces. Frontiers in Human Neuroscience.

© 2025 by Sam Light