This is an advanced worksheet, which assumes you have completed the Absolute Beginners’ Guide to R course, the Research Methods in Practice (Quantitative section) course, and the Intermediate Guide to R course.
This worksheet describes a partial analysis pipeline for an experiment taken from a PhD thesis. The experiment explored self-esteem before and after a mental imagery intervention. Self-esteem was measured using the State Self-Esteem Scale (SSES; Heatherton & Polivy, 1991), a 20-item survey used to measure short-lived (state) changes in self-esteem. Participants completed one of two mental imagery conditions (self or other), or a control condition.
This was a 2 (time) x 3 (condition) mixed design.
The initial preprocessing steps for this data are described in the Data preprocessing for scales worksheet. Complete that worksheet first, then add the code in this worksheet to the end of
We start with some final bits of preprocessing.
sses_raw <- bind_rows(sses_pre_raw, sses_post_raw)
sses_raw <- sses_raw %>% mutate(total = sses_raw %>% select(q1:q20) %>% rowSums())
bind_rows() simply joins data frames together, in the order the arguments are specified, making a new data frame. We do this because we want to do at least one analysis on all of the data.
mutate() in the second line uses rowSums() to add up the values in the columns q1:q20 for each row. Because our data has one row per participant at each time point, this calculates the SSES score for each participant at each time point. You’ve seen something similar when we calculated subscale scores in the Data preprocessing for scales worksheet. The SSES contains some reverse-coded items, but these were already reversed in this particular dataset, so we didn’t need to do that step.
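As a minimal sketch of what rowSums() is doing here, the following uses a small made-up data frame with three items rather than the real twenty. The names toy, q1, q2 and q3, and all of the values, are illustrative assumptions. Base R is used so that the example runs without loading any packages.

```r
# Toy data (hypothetical): two participants, three items
toy <- data.frame(subj = c(1, 2),
                  q1 = c(1, 4),
                  q2 = c(2, 5),
                  q3 = c(3, 5))

# Add up the item columns within each row, giving one total per row
toy$total <- rowSums(toy[, c("q1", "q2", "q3")])
toy$total
```

Each total is the sum of that row's item scores, which is the same calculation that mutate() and rowSums() perform on q1:q20 in the real data.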
We can get a feel for our data by using the familiar group_by() and summarise() commands to calculate means and standard deviations by group within time.
sses_raw %>% group_by(time, condition) %>% summarise(mean = mean(total), sd = sd(total))
`summarise()` regrouping output by 'time' (override with `.groups` argument)
# A tibble: 6 x 4
# Groups:   time [2]
  time  condition  mean    sd
  <fct> <fct>     <dbl> <dbl>
1 pre   control    31.1  9.34
2 pre   other      27.8  6.79
3 pre   self       33.8  9.27
4 post  control    29.3  9.19
5 post  other      28.6  7.44
6 post  self       37.7  7.58
From these results, we can see that, compared to the pre-intervention scores, self-esteem rose slightly in the other condition, rose even more in the self condition and dropped slightly in the control condition.
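One way to make these changes explicit is to subtract the pre-intervention means from the post-intervention means. The sketch below does this using the group means copied by hand from the summary output above. The variable names pre, post and change are illustrative assumptions, and this is a descriptive check rather than part of the analysis pipeline.

```r
# Group means copied from the summary output above
pre  <- c(control = 31.1, other = 27.8, self = 33.8)
post <- c(control = 29.3, other = 28.6, self = 37.7)

# Positive values mean self-esteem was higher after the intervention
change <- round(post - pre, 1)
change
```

A negative value for control and positive values for other and self correspond to the pattern described above.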
To check the reliability of our SSES measurements, we’ll calculate Cronbach’s alpha for the pre-intervention SSES data. Cronbach’s alpha was introduced in the Analysing scales worksheet.
library(psy)
sses_pre <- sses_raw %>% filter(time == 'pre')
sses_pre %>% select(q1:q20) %>% cronbach()
$sample.size
[1] 73

$number.of.items
[1] 20

$alpha
[1] 0.6862132
Line 1 loads the psy package, which provides the cronbach() function. Line 2 creates a data frame containing only the rows with pre-intervention scores; all columns, including condition and the SSES items, are kept. We only use the pre-intervention scores because, assuming the interventions were successful, including the post-intervention scores would reduce alpha. The last line calculates Cronbach’s alpha.
The value of alpha (0.69) could be a cause for concern, as it is below the 0.7-0.8 convention of acceptable reliability, and well below the alpha of 0.92 reported by the authors of the SSES (Heatherton & Polivy, 1991).
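To see where a value of alpha comes from, the sketch below computes it by hand using the standard formula: the number of items k, divided by k - 1, multiplied by one minus the ratio of the summed item variances to the variance of the total scores. The data frame items and its values are made up for illustration, and the code is not taken from the psy package, although it should give the same value as cronbach() on complete data.

```r
# Hypothetical responses: 5 participants (rows) by 3 items (columns)
items <- data.frame(q1 = c(1, 2, 3, 4, 5),
                    q2 = c(2, 2, 3, 5, 5),
                    q3 = c(1, 3, 3, 4, 4))

k <- ncol(items)                   # number of items
item_vars <- sapply(items, var)    # variance of each item
total_var <- var(rowSums(items))   # variance of the total scores

# Cronbach's alpha: k / (k - 1) * (1 - sum of item variances / total variance)
alpha <- (k / (k - 1)) * (1 - sum(item_vars) / total_var)
alpha
```

Alpha is high when the items vary together, because the variance of the totals is then much larger than the sum of the individual item variances.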
We’ll do an additional analysis to see how the scale performed in each group.
sses_pre %>% filter(condition == 'control') %>% select(q1:q20) %>% cronbach()
$sample.size
[1] 34

$number.of.items
[1] 20

$alpha
[1] 0.7419521
sses_pre %>% filter(condition == 'self') %>% select(q1:q20) %>% cronbach()
$sample.size
[1] 18

$number.of.items
[1] 20

$alpha
[1] 0.6823747
sses_pre %>% filter(condition == 'other') %>% select(q1:q20) %>% cronbach()
$sample.size
[1] 21

$number.of.items
[1] 20

$alpha
[1] 0.4775607
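The first command filters the pre-intervention data to include only the control condition, and then calculates Cronbach’s alpha. The remaining commands do the same for the other two conditions. Alpha was within the conventional range in the control condition (0.74), but lower in the self (0.68) and other (0.48) conditions.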
Putting aside the reliability of the measurements for now, we would like to check that there were no major self-esteem differences between conditions before our intervention. This should be the case if we successfully randomised participants to conditions. If there are baseline differences, we would need to account for these when comparing them to post-intervention scores. We can use a between-subjects ANOVA to compare the baseline SSES scores in our three conditions. This is similar to the ANOVA described in the Within-subject differences worksheet, but for a between-subjects factor.
library(BayesFactor, quietly = TRUE)
sses_pre <- sses_raw %>% filter(time == 'pre')
anovaBF(formula = total ~ condition, data = data.frame(sses_pre))
Bayes factor analysis
--------------
[1] condition : 0.6820205 ±0.03%

Against denominator:
  Intercept only
---
Bayes factor type: BFlinearModel, JZS
In line 1, we load the BayesFactor package. Line 2 assigns just the pre-intervention data to sses_pre. Line 3 runs the between-subjects ANOVA. To run a Bayesian ANOVA using a random factor, we would need more than one observation per participant for condition. As we have only one observation for each participant, we don’t use + subj in the formula, or whichRandom = 'subj'.
The Bayes factor of 0.68 is greater than the conventional threshold of 0.33, below which we would be satisfied that there were no differences between the conditions. We therefore cannot rule out baseline differences. This could be due to the slightly lower mean in the other condition. We’ll address this issue at the end of the worksheet.
sses_raw %>% filter(time == 'pre') %>% group_by(condition) %>% summarise(mean = mean(total), sd = sd(total))
`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 3 x 3
  condition  mean    sd
  <fct>     <dbl> <dbl>
1 control    31.1  9.34
2 other      27.8  6.79
3 self       33.8  9.27
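The conventions used to read the Bayes factor for the baseline comparison above (less than 0.33 as evidence for no difference, greater than 3 as evidence for a difference) can be written as a small helper. This is only a sketch: interpret_bf() is a hypothetical function written for this illustration, not part of the BayesFactor package, and the cut-offs are conventions rather than strict rules.

```r
# Hypothetical helper: label a Bayes factor using the 3 and 0.33 conventions
interpret_bf <- function(bf) {
  if (bf > 3) {
    "evidence for a difference"
  } else if (bf < 0.33) {
    "evidence for no difference"
  } else {
    "inconclusive"
  }
}

# The Bayes factor from the baseline comparison above
interpret_bf(0.68)
```

A value between the two cut-offs, such as the 0.68 found at baseline, means the data do not clearly favour either conclusion.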
The main question we’d like to answer is whether our two imagery interventions (visualising a negative mental image of oneself, or someone else) had effects on self-esteem which differed from our control condition. We can test this using a factorial ANOVA to compare SSES scores before and after the three interventions. In this design, condition is a between-subjects variable, and time (pre and post intervention) is a repeated measure.
bf <- anovaBF(formula = total ~ time*condition + subj,
              data = data.frame(sses_raw), whichRandom = 'subj')
bf
Bayes factor analysis
--------------
[1] condition + subj                         : 4.052594  ±6.26%
[2] time + subj                              : 0.1904245 ±0.97%
[3] condition + time + subj                  : 0.712687  ±2.78%
[4] condition + time + condition:time + subj : 7.96878   ±12.84%

Against denominator:
  total ~ subj
---
Bayes factor type: BFlinearModel, JZS
bf[4] / bf[3]
Bayes factor analysis
--------------
[1] condition + time + condition:time + subj : 11.18132 ±13.14%

Against denominator:
  total ~ condition + time + subj
---
Bayes factor type: BFlinearModel, JZS
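The first two lines run a Bayesian factorial ANOVA, with subj as a random factor, and store the result in bf. Line 3 prints the results, which provides us with the Bayes Factors for the main effects of condition (bf[1]) and time (bf[2]). Line 4 calculates the Bayes Factor for the interaction by dividing the model that includes the interaction (bf[4]) by the model that contains only the two main effects (bf[3]).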
The Bayes Factor for time is less than 0.33, which tells us that there was no overall change in self-esteem after the intervention relative to baseline. The Bayes Factor for condition is greater than 3, indicating that there were differences in self-esteem between the three conditions. The final Bayes Factor tells us that it’s about 11 times more likely that there’s an interaction between condition and time than that there isn’t.
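The Bayes Factor for the interaction is a ratio of two of the Bayes Factors in the printed output, both of which were calculated against the same total ~ subj model. The sketch below reproduces that arithmetic using the values copied by hand from the output above. The variable names are illustrative assumptions, and in practice you would divide the BayesFactor objects directly, as in the worksheet.

```r
# Bayes factors copied from the printed output above (both against total ~ subj)
bf_main_effects <- 0.712687   # condition + time + subj
bf_full         <- 7.96878    # condition + time + condition:time + subj

# Dividing the two gives the evidence for adding the interaction term
bf_interaction <- bf_full / bf_main_effects
round(bf_interaction, 2)
```

Because both models share the same denominator, it cancels out in the division, leaving a direct comparison of the model with the interaction against the model without it.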
Heatherton, T. F., & Polivy, J. (1991). Development and validation of a scale for measuring state self-esteem. Journal of Personality and Social Psychology, 60(6), 895.
This material is distributed under a Creative Commons licence. CC-BY-SA 4.0.