Before you start…

Before starting this exercise, you should have completed all the Absolute Beginners’, Part 1 worksheets. If not, take a look at those exercises before continuing. Each section below also indicates which of the earlier worksheets are relevant.

Getting the data into R

Relevant worksheet: Intro to RStudio

You’ll need to complete the Psych:EL exercise to get the CSV files containing your data, and data for your group. Open an RStudio project for this analysis, within that create a script file for this analysis, and upload the CSV files to the project.

Exploring your data

Load

Relevant worksheet: Exploring data

Load the tidyverse package, and load your data.

# Policital psychology
# Load tidyverse
library(tidyverse)
# Load data
brex <- read_csv("brexit.csv")

Note: Everyone’s CSV file has a different name. For example, yours might be called Wills.csv. In the example above, you’ll need to replace Wills.csv with the name of your personal CSV file.

Inspect

Look at the data by clicking on it in the Environment tab in RStudio. Each row is one person’s rating for one question. Here’s what each of the columns in the data set contain:

Column Description Values
UnRowID Unique Row ID: you can ignore this.
SRN Your Student Reference Number
cond How would you vote in a Brexit referendum? “leave”, “remain”
type Which questionnaire is this a response for? “auth” = Authoritarian Personality Scale, “dom” = Social Dominance Orientation questionnaire
qu This number uniquely identifies the question that was asked 1 - 21
rating The rating given in response to this question 1 - 7, higher numbers = more authoritarian / higher social dominance orientation

Calculating your own personality scores

Relevant worksheets: Group Differences

How highly did you score on each of the two questionnaires (Authoritarian Personality Scale, Social Dominance Orientation questionniare)?

To look at this, you have to calculate your average (mean) rating for each questionnaire. To do this, use the group_by and summarise commands you learned in the Group Differences worksheet.

# Group data by 'type', display mean of 'rating'
brex %>% group_by(type) %>% summarise(mean(rating))
# A tibble: 2 × 2
  type  `mean(rating)`
  <chr>          <dbl>
1 auth            3.95
2 dom             3.25

As before, you can safely ignore the “ungrouping” message that you receive.

NOTE: Your output should look similar to that shown above, but the numbers will be different.

Exploring everyone’s data

Load data

Relevant worksheet: Exploring data

Load the data.

# Load everyone's data into 'brex.all'
brex.all <- read_csv("brexit-all.csv")

Inspect

Look at the data by clicking on it in the Environment tab in RStudio. You’ll see it has the same columns as the the other data file, it just has a lot more rows (because it contains a lot of participants).

Do ‘leavers’ score higher on an Authoritarian Personality scale?

Relevant worksheet: Group Differences

It’ll take a few steps to answer this question:

1. Filter

First, we first need to filter the data so it only contains the responses to the Authoritarian Personality questionnaire (and not the Social Dominance Orientation questionnaire). We do this using the filter command we covered in the Group Differences worksheet:

# Filter 'auth' questionnaire into 'auth' data frame
auth <- brex.all %>% filter(type == "auth")

2. Summarise

Next, we summarize this large amount of data, so we have just one overall score on the questionnaire for each person. We use the summarise command to do this, which we learned in the Exploring Data worksheet. We also need to use the group_by command (covered in Group Differences), so that we get one score for each person (using the column SRN). We also add the column cond to the group_by command, so that our summarized data still contains information about which condition each person was in (i.e. whether they voted leave or remain):

# Group 'auth' by 'cond' and 'SRN'; calculate mean of 'rating'; place results in 'auth.sum'
auth.sum <- auth %>% group_by(cond, SRN) %>% summarise(rating = mean(rating))

If you look at auth.sum by clicking on it in the Environment tab in RStudio, you’ll see we now have just one row for each person.

3. Plot

Now we can look at the distribution of scores on this personality scale for leavers and remainers. A density plot is a good choice for this (you learned to produce these in the Group Differences worksheet). Here, we’re going to make a density plot of the data in column rating of the auth.sum data frame. We use the cond column to set the colour of the density plot, leaving us with two distributions – one for leavers and one for remainers :

# Display density plot of 'rating', by 'cond'
auth.sum %>% ggplot(aes(rating, colour = factor(cond))) + geom_density(aes(y = ..scaled..)) 

Your plot should look something like this, although the coloured lines will be somewhat different. In this example plot, the leavers seem to score a bit higher overall than the remainers, although the two sets of scores also overlap a lot. What happens in your plot?

4. Effect size

As we covered in the Group Differences worksheet, an effect size is the difference between the mean scores of the two groups, divided by the standard deviation (a measure of how much the scores vary around those means). Here, we’re going to calculate the effect size for this personality difference in leavers and remainers. We use the cohen.d command to do this, as covered in the Group Differences worksheet.

# Load package that calculates effect sizes
library(effsize)
# Calculate Cohen's d for effect of 'cond' on 'rating'
cohen.d(auth.sum$rating ~ auth.sum$cond)

Cohen's d

d estimate: 0.4810782 (small)
95 percent confidence interval:
    lower     upper 
0.1731142 0.7890421 

If the example above, our effect size (Cohen’s d) is around 0.48, which is normally described as being between a small and a medium effect (see Group Differences for the meaning of small in this context, it does not imply “unimportant”).

Enter your Cohen’s d value into PsycEL. Your number will likely be somewhat different to the one in the above example.

5. Evidence

Relevant worksheet: Evidence

How good is the evidence that this is a real result, and not just some kind of fluke we can put down to chance? The best way to answer this question is to calculate a Bayes Factor, as was covered in the Evidence worksheet. In this case, the data frame is auth.sum, with the column rating containing the questionnaire scores, and the column cond containing information about the condition (i.e. whether each person voted leave or remain):

# Load BayesFactor package
library(BayesFactor, quietly = TRUE)
# Calculate Bayesian t-test for effect of 'cond' on 'rating'.
ttestBF(formula = rating ~ cond, data = data.frame(auth.sum))
Bayes factor analysis
--------------
[1] Alt., r=0.707 : 14.07864 ±0%

Against denominator:
  Null, mu1-mu2 = 0 
---
Bayes factor type: BFindepSample, JZS

In this example, The Bayes Factor is about 14, meaning it’s about 14 times as likely that there is a difference, than there isn’t.

Enter your exact Bayes Factor into PsycEL. (The Bayes Factor for your data will likely be a bit different to the one above).

Do ‘leavers’ score higher on a Social Dominance Orientation scale?

In this final part of the exercise, the task is to produce a density plot, calculate the effect size, and calculate the Bayes Factor, for the other questionnaire … the Social Dominance questionnaire. Go through the steps above, changing the code so you’re doing the same analyses on this other questionnaire. To get you started, here’s Step 1:

# Filter 'dom' questionnaire into 'dom' data frame
dom <- brex.all %>% filter(type == "dom")

Enter your effect size, and Bayes Factor, for the Social Dominance Orientation scale, into PsycEL.


This material is distributed under a Creative Commons licence. CC-BY-SA 4.0.