Contents

Introduction

In the Absolute Beginners’ Guide to R and in the revision worksheet we used a Bayesian between-subjects t-test to look at the strength of evidence for the presence (or absence) of a difference between groups. For example, in the revision worksheet, we looked at whether the way you experienced a word (silently read, or spoken out loud) affected how good your memory of it was. We found a Bayes Factor close to 1, indicating there was neither good evidence for, nor good evidence against, there being a difference (an indecisive result). Bayes Factors greater than 3 are generally considered to be evidence for a difference worth listening to, while Bayes Factors less than one-third (0.33) are generally considered to be evidence for the absence of a difference that is worth listening to.

In this worksheet, we’ll cover how to calculate a Bayes Factor for a simple within-subjects experiment. As we covered in the statistical power worksheet, a within-subjects experiment is one where all the conditions are done by all the participants. Within-subjects designs are often a really good idea, because they are more efficient. In other words, they tend to allow us to give confident answers to questions while testing fewer people than would be needed in a between-subjects experiment.

The technique we’re going to use is called ANOVA, short for ANalysis Of VAriance. It’s probably the most widely used analysis technique in psychology. Its name has a long history that isn’t really worth getting into here, particularly as it comes from psychology’s history of using p-values as measures of evidence. As we’ve covered previously, p-values have no interpretation that is simultaneously accurate, simple, and useful. In this worksheet, we focus on a form of ANOVA that calculates Bayes Factors, which are more easily interpretable. If you ever need to calculate a traditional ANOVA, take a look at the traditional ANOVA worksheet.

Getting started

In the preprocessing worksheet, you loaded data from a git repository into an R project. Go back to that project now. This is very important: do not stay in your current project. Change your Rstudio project using the drop-down list on the top right of Rstudio, and select the rminr-data project you used for the preprocessing worksheet. If you do not do this, the rest of this worksheet will not work.

Now that you’re in the rminr-data project, create a new script called anova1.R, put all the commands you use in this worksheet into that file, and save regularly.

Clearing the environment

You will notice that your Environment still contains all the data frames from the previous worksheet. You can leave them there if you like, but some people prefer to start with a “clean slate”. If that’s you, you can remove data frames with the rm command (as in remove). For example, to remove the dat data frame, type:

rm(dat)

The rm command can also be used with the ls() command. The ls() command gives you a list of the names of the things in your environment. The command is called ls because it lists them. We can use the ls() and rm() commands together to remove everything from our Environment, like this:

rm(list=ls())

Notice that your Environment tab in Rstudio is now empty. This command only clears the Environment, it leaves all the files in your project as they were (you can still see them in the Files panel).
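If you want to see this in action with a throwaway object (the name dat here is just an invented example), try:

```r
dat <- data.frame(x = 1:3)  # create a throwaway data frame
ls()                        # the listing now includes "dat"
rm(list = ls())             # remove everything in the Environment
ls()                        # character(0) - nothing left
```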

Loading data

Now load this new data set:

library(tidyverse)
words <- read_csv("wordnaming2.csv")

Now click on the words data frame in the Environment tab, and take a look. You’ll see that it’s a data frame with 10 columns and 37,800 rows.

Description of data set

We’ll be using this data set over the next few worksheets, so it’s important to take some time here to understand the experiment it comes from.

The data frame contains simulated data for a word-naming experiment. In the experiment, participants just had to name (i.e. speak out loud) each written word as it appeared on the screen. However, on every trial, there was also a picture. Sometimes the picture matched the word (e.g. the word ‘car’ and a picture of a car). This is called a congruent trial. Other times, the picture and word were different (e.g. the word ‘boat’ and a picture of a carrot). This is called an incongruent trial. Participants were instructed to ignore the picture, but were they able to do so? If they can’t help processing the picture, they should be slower to name the word on incongruent trials than on congruent trials. There is also a third type of trial, the neutral trial, on which just the word appears, with no picture.

In this experiment, the word naming task is being used to study the effects of meditation practice. Some psychologists have suggested that learning how to meditate can improve people’s ability to ignore irrelevant information. If so, we should be able to improve people’s performance on this word naming task by training them how to meditate first.

So, in this experiment, 140 participants were randomly allocated to a meditation-training condition, in which they received a week’s training on how to meditate, before doing the word naming experiment. Another 140 were assigned to a no-training condition. These people just did the word naming experiment, with no meditation training. Finally, another 140 people were assigned to a relaxation-training condition, in which they received a week’s training, but on some relaxation techniques, instead of meditation.

Within each group of participants, half were male and half were female. Each participant completed 90 trials of the word naming task. They did 30 trials, took a short break, then did another 30 trials, took another short break, and did the final 30 trials. This gives us 37,800 word naming trials in total. On each trial, the participant either named the word correctly, or they didn’t. Either way, they took a certain amount of time (measured in milliseconds) to name the word.

Here’s a summary of what each of the columns in the data set contains:

Column | Description                                                            | Values
------ | ---------------------------------------------------------------------- | ------------------------------
subj   | Unique anonymous participant number                                    | 1-420
sex    | Biological sex of the participant                                      | “male”, “female”
medit  | Whether the participant has had meditation, relaxation, or no training | “meditate”, “relax”, “control”
block  | Block of trials                                                        | 1-3
trial  | Trial number (per participant)                                         | 1-90
word   | The word presented on this trial                                       | e.g. “cup”
pic    | The name of the picture presented                                      | e.g. “cake”
congru | Experimental condition: congruent, incongruent, or neutral trial       | ‘cong’, ‘incong’, ‘neutral’
acc    | Accuracy of naming response                                            | 1 = correct, 0 = incorrect
rt     | Reaction time in milliseconds                                          |

DISCLAIMER: This data set is loosely based on the design of experiments that have really been done, and is realistic in terms of its size and complexity. However, it’s important to remember that this is generated data, not data actually collected from participants. So, you should not take the analyses you do of this simulated data as evidence for the effectiveness of meditation in improving attentional control. As is often the case, real data in this area is less clear, and more controversial.

Preprocessing

As covered in the preprocessing worksheet, the first thing we need to do with realistic data sets like this is to preprocess them, so the data is in a format that we can analyse. The two steps we need to do for our first analysis are filtering and summarising.

Filtering

In our first analyses, we’re going to take a look at the participants who had no pre-training in meditation or relaxation (i.e. the control group), so we get a sense of what performance on this task ‘normally’ looks like. So:

WRITE A COMMAND: Use a filter command to extract just the control condition, and put it into a new data frame called wordctrl. If you need a reminder of how the filter command works, take a look at the preprocessing worksheet.

EXPECTED OUTPUT: There will be a new data frame in your Environment window that has 12,600 rows and 10 columns and is called wordctrl.

Also, for now, we’re not going to look at the neutral trials, just the congruent and incongruent ones. So:

WRITE A COMMAND: Filter out the neutral trials from wordctrl and put the remaining data into wordctrlCI. You may find that the command != is the most efficient way to do this. For example, if you were trying to exclude the relaxation condition, you could use filter(medit != "relax"). The command != means ‘not equal to’.

EXPECTED OUTPUT: There will be a new data frame in your Environment window that has 8,400 rows and 10 columns and is called wordctrlCI.
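If you’d like to check your understanding of filter before writing these commands, here’s a sketch using a tiny invented data frame (the column names mirror the real data, but the rows are made up):

```r
library(tidyverse)

# Invented data - same column names as wordnaming2.csv, made-up rows
toy <- tibble(medit  = c("control", "relax", "meditate", "control"),
              congru = c("cong", "incong", "neutral", "neutral"))

toy %>% filter(medit == "control")   # keeps the 2 control rows
toy %>% filter(congru != "neutral")  # keeps the 2 non-neutral rows
```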

Summarising

The data frame is still very long, because it contains data for every trial. What we want to know, for each participant, is whether they are quicker to name the words on the congruent trials (where the picture and word match) than on the incongruent trials (where the picture and word are different). So, we need to summarise each person’s data down to just two numbers - their mean reaction time on congruent trials, and their mean reaction time on incongruent trials. So:

WRITE A COMMAND to group the data by participant (subj) and experimental condition (congru), using the group_by command, get the mean reaction times using the summarise command, and put this summarised data into a new data frame called wordctrlCIsum. If you need a reminder of how to group and summarise data, take a look at the preprocessing worksheet.

Column names with summarise: There is one new thing to learn here, relative to the previous worksheet. The column containing the RT data in your new data frame needs to be called rt, because we use this name later on. You can set the column name within the summarise command. For example, summarise(acc = mean(acc)) takes the mean of the column acc and puts the answer into a column called acc in your new data frame. If you just used summarise(mean(acc)), the resulting column would be called mean(acc), which would be a pain to use in later commands.

`summarise()` has grouped output by 'subj'. You can override using the `.groups` argument.

EXPECTED OUTPUT: There will be a new data frame in your Environment window called wordctrlCIsum. It will have 280 rows - two reaction times for each of the 140 participants - one for the congruent trials, and one for the incongruent trials. You’ll notice that the incongruent reaction times tend to be larger, as expected.
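To see how group_by, summarise and the column naming fit together, here is a sketch with a tiny invented data set (two participants, two trials per condition - not the worksheet data):

```r
library(tidyverse)

# Invented data: 2 participants x 2 conditions x 2 trials each
toy <- tibble(subj   = rep(c(1, 2), each = 4),
              congru = rep(c("cong", "cong", "incong", "incong"), 2),
              rt     = c(400, 420, 500, 520, 450, 470, 550, 570))

toy %>%
  group_by(subj, congru) %>%
  summarise(rt = mean(rt))   # one mean RT per participant per condition
```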

Density plots

Now, as you did in the Absolute Beginners’ Guide to R:

WRITE A COMMAND to create a density plot of this summarised data (wordctrlCIsum), showing the distribution of reaction times in the two conditions. Click on the link above if you need to revise how to do this.

If you get it right, it should look like this:

We can see a few different things in this plot. First, there’s quite a lot of variation between people in how fast they react, with mean reaction time for incongruent trials anywhere from about 200 milliseconds to 800 milliseconds. Second, reaction times seem to vary less on congruent trials (averages around 400-600 milliseconds). Third, the peak of the incongruent distribution is slightly to the right of the congruent distribution, suggesting that, overall, incongruent trials might take a bit longer to react to than congruent trials.

But…looking at the data this way misses the most important thing about the design of this experiment, which is that it is within subjects. This is not two different groups of people, one who see only congruent trials, and the other who only see incongruent trials. Every participant sees both types. So, there’s a better way to look at this data, particularly when (as here) people differ a lot in their average reaction time.

What we can do is calculate a difference score for each participant. In other words, how much longer each participant takes to respond to incongruent trials, than congruent ones.

More preprocessing

In order to do this, we have to learn some new types of data preprocessing - pivoting and mutating.

Pivoting

Pivoting is where you take a long data file (lots of rows, few columns) and make it wider. Or where you take a wide data file (lots of columns, few rows) and make it longer. In this case, we need to make wordctrlCIsum wider, for reasons that will become clearer in the next step.

To make a data frame wider, we can use the pivot_wider command from the tidyverse package. So, when we do this:

ctrl <- wordctrlCIsum %>% pivot_wider(names_from = congru, values_from = rt)

and look at ctrl in the Environment window, we can see that it now has fewer rows (140). It has the same number of columns, but the column names are different. Specifically, we now have one row for each participant and we have one column for their congruent reaction time, and one column for their incongruent reaction time.

Explanation of command: The pivot_wider command has two components: names_from and values_from. The first one, names_from, tells R which column you want to use to as the column names in your new, wider data frame. In this case, we want to take them from the congru column of wordctrlCIsum. The values_from part tells R which column contains the data you want to put in these new columns. In this case, it’s the rt column of wordctrlCIsum.
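Here’s pivot_wider at work on a tiny invented data frame, so you can see the before and after without touching the worksheet data:

```r
library(tidyverse)

# Invented long-format data: one row per participant per condition
long <- tibble(subj   = c(1, 1, 2, 2),
               congru = c("cong", "incong", "cong", "incong"),
               rt     = c(410, 510, 460, 560))

long %>% pivot_wider(names_from = congru, values_from = rt)
# Now 2 rows (one per participant), with columns: subj, cong, incong
```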

Mutating

Mutation in this context means to take some things and change them to make something new. In this case, we want to mutate the congruent and incongruent reaction times into a difference score, i.e. incongruent RT minus congruent RT. We do this using the mutate command:

ctrldiff <- ctrl %>% mutate(diff = incong - cong)

Look at ctrldiff in the Environment window - you will see that it has a fourth column, diff, and that each row contains the incongruent reaction time minus the congruent reaction time. For most participants, the difference is positive, meaning that incongruent trials are slower. However, some participants are the other way around (e.g. participant 146).

Explanation of command: It’s the mutate command that’s doing all the work here. We give it a name for the new column we want to create, in this case diff. We then tell it how to calculate that new column. In this case, by subtracting the congruent reaction time from the incongruent reaction time, incong - cong.
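A minimal sketch of mutate, using a small invented wide-format data frame like the one pivot_wider produces:

```r
library(tidyverse)

# Invented wide-format data: one row per participant
toy <- tibble(subj = c(1, 2), cong = c(410, 460), incong = c(510, 540))

toy %>% mutate(diff = incong - cong)   # adds a diff column: 100 and 80
```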

Density plots of differences

Now we have the difference scores, we can take a look at their distribution. We do this using a density plot, as we have on several previous occasions:

ctrldiff %>% ggplot(aes(diff)) +
    geom_density(aes(y = after_stat(scaled))) +
    geom_vline(xintercept = 0, colour = 'red')

The only slightly unusual thing about this command is the geom_vline part, which we use to draw a vertical line on the plot that intercepts (hits) the x axis at zero (xintercept = 0). We used something like this before in Putting R to work.

In this case, the vertical line allows us to quickly see that most of the distribution is to the right of zero. In other words, for most people, incongruent trials take longer to respond to than congruent trials. On average, it takes around 100 ms longer to react to incongruent trials, although the range is anything from around -200 to +300 milliseconds.

It is these differences, rather than the absolute reaction times for incongruent or congruent trials, that are critical to this within-subjects experiment. In the next section, we use Bayesian analysis to work out how good the evidence is for the claim that incongruent trials are on average slower than congruent trials (i.e. that the difference scores are on average positive).

Bayesian ANOVA (within-subjects)

We can use the command anovaBF to look at the evidence for this within-subjects effect of congruency. This command takes data in the format of our wordctrlCIsum data frame, so we’re almost ready to go. But, before we can do this analysis, we have to tell R which columns of this data frame are factors.

Factors

The word factor is a jargon term for “column that contains information about the experimental design”. To understand the meaning of this term, take a look at the wordctrlCIsum data frame in your environment. In wordctrlCIsum, the column congru is a factor because it tells us whether the reaction time comes from a congruent trial (cong) or an incongruent trial (incong). The column subj is also a factor, because it tells us which participant the reaction time comes from. The rt column is not a factor, because it contains the data (reaction times) that we’re analysing.

We tell R this in the following way:

wordctrlCIsum$congru <- factor(wordctrlCIsum$congru)
wordctrlCIsum$subj <- factor(wordctrlCIsum$subj)

Explanation of command - The first command means “turn the congru column of the wordctrlCIsum data frame into a factor”. More specifically:

wordctrlCIsum$congru - This means the congru column of the wordctrlCIsum data frame

factor(wordctrlCIsum$congru) - The factor command tells R that the column wordctrlCIsum$congru is a factor.

<- This means put the thing on the right of <- into the thing on the left. In other words, once factor has labelled wordctrlCIsum$congru as a factor, record that change back into wordctrlCIsum$congru (rather than, for example, sending the data to a different column).

The second command works in the same way as the first.
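You can see what factor does with a few lines of R, using a small invented vector:

```r
x <- c("cong", "incong", "cong")   # an invented character vector
class(x)      # "character" - just text, as far as R is concerned
x <- factor(x)
class(x)      # "factor" - R now treats it as design information
levels(x)     # "cong" "incong" - the levels R has found
```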

Evidence

Now we’ve given R this information about factors, we’re ready to calculate a Bayes Factor for the difference between congruent and incongruent trials. To do this, first we load the BayesFactor package as before:

library(BayesFactor)

and then we use the command:

anovaBF(formula = rt ~ congru + subj,
        data = data.frame(wordctrlCIsum),
        whichRandom = "subj")
Bayes factor analysis
--------------
[1] congru + subj : 42922660 ±0.95%

Against denominator:
  rt ~ subj 
---
Bayes factor type: BFlinearModel, JZS

Explanation of command

Much of this anovaBF command is the same as for a Bayesian t-test. For example, formula = rt ~ congru means we want to know the effect of the congruency condition (congru) on the dependent measure of reaction time (rt).

The first part that’s new is that the formula contains + subj. Unlike for ttestBF, we always need to include this for anovaBF, so it knows which data came from which participant. What we include is the name of the column that contains the subject IDs – in this case, subj.

The other part that’s new is whichRandom = "subj". The command whichRandom means “tell me which factors are random factors”. A random factor is one where, if you ran the experiment again, the levels of the factor would be different. So, if you ran this experiment again, you’d almost certainly want to test different people, so the levels of the subj factor would be different if you re-ran the experiment. This makes it a random factor. In contrast, if you wanted to re-run this experiment in the same way as before, you would still have congruent and incongruent trials, so the levels of the congru factor aren’t random. They’re generally described as fixed. The command anovaBF assumes factors are fixed unless you tell it they are random.

The remaining part, data = data.frame(wordctrlCIsum), just tells R where to find the data.

Explanation of output

When you run this command, R shows a ‘progress bar’ going from 0% to 100% to let you know it’s working on the answer. This analysis can take several seconds to run, so you may end up looking at that progress bar for a while.

The second part of the output gives the Bayes Factor. The key figure here is 42922660, which is the Bayes Factor. Note that your number might be slightly different. This is what the ±0.95% part of the above output indicates: the number 42922660 is the correct answer, to within about plus or minus 1%. It’s like saying “this person is 2 metres tall, give or take a couple of centimetres”. The command anovaBF can take a while to run, so it prioritises speed over accuracy to some extent. If it were important to get the Bayes Factor precisely correct, we could let R work on the problem for much longer (e.g. overnight). However, the extra time is usually not worth it, and we don’t cover how to increase the accuracy of the output in this worksheet.

The Bayes Factor on this occasion is around 40 million, meaning there is really very strong evidence in support of an effect of congruency. Pretty much however unlikely we thought it was beforehand that an incongruent picture would slow down word naming, we should now be pretty confident that it does.

The rest of the output is not crucial to understand right now, but if you are curious take a look at the more on Bayes factors worksheet.

Bayesian ANOVA with more than two levels

In the above Bayes Factor calculation, we looked at just the congruent and incongruent trials of the experiment. However, this experiment had a third level on the condition factor – there were also neutral trials where no picture was shown. These data are in the wordctrl data frame that you made. In this section, we’re going to do an analysis using all three conditions.

Unlike ttestBF, anovaBF can cope with a factor that has more than two levels. This is true for both between-subject and within-subject factors, although we’ll focus on within-subject factors in this worksheet.

Summarising

WRITE A COMMAND: Create a subject-level summary of mean reaction times, as you did before. However, this time, you’ll need all three conditions in that summary, so use your wordctrl data frame, rather than your wordctrlCI data frame. Put your summary into a data frame called wordctrlsum.

`summarise()` has grouped output by 'subj'. You can override using the `.groups` argument.

EXPECTED OUTPUT: There will be a new data frame wordctrlsum in your Environment, with 420 rows and 3 columns.

Graphing

WRITE A COMMAND that produces a density plot that contains all three conditions. You can do this the same way as before, just use wordctrlsum instead of wordctrlCIsum.

EXPECTED OUTPUT: This is what the graph should look like:

The main thing to notice about this density plot is that the reaction times for neutral trials are intermediate between the congruent and incongruent trials, both in terms of their mean, and in terms of their variance.

It’s still the case that it is the difference between conditions that is key here, because this is a within-subjects experiment. However, there are three differences we could work out here (incongruent - neutral, neutral - congruent, incongruent - congruent), so we’re going to leave further graphing until a bit later.

Evidence

The command anovaBF allows us to calculate a Bayes Factor across all three conditions, all we need to do is give it the appropriate subject-level summary (wordctrlsum in this case). However, don’t forget that, before anovaBF can work, it needs to know which columns of wordctrlsum are factors (you will need to set subj and congru as factors). So:

WRITE SOME COMMANDS to perform a Bayesian ANOVA that includes all three levels of the congruency condition.

EXPECTED OUTPUT: The correct output is as follows (your Bayes Factor may be slightly different):

Bayes factor analysis
--------------
[1] congru + subj : 1.769968e+14 ±0.9%

Against denominator:
  rt ~ subj 
---
Bayes factor type: BFlinearModel, JZS

The Bayes Factor here is approximately 1.8e+14, which is read as \(1.8 \times 10^{14}\) (click here if you need a reminder about scientific notation). So, the Bayes Factor is overwhelmingly in support of the conclusion that there is a difference between these three conditions.
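You can check this reading of the notation in R itself:

```r
1.8e+14 == 1.8 * 10^14               # TRUE: 'e+14' means 'times 10 to the power 14'
format(1.8e+14, scientific = FALSE)  # prints the number written out in full
```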

Pairwise comparisons

It’s important to be clear what the above Bayes Factor tells us. It gives us evidence that the three conditions (congruent, incongruent, neutral) are not all the same. This is worth knowing, but often we want evidence for more specific hypotheses. For example, does the presentation of a congruent picture speed responding? If so, there should be evidence that congruent trials are faster than neutral trials. The Bayes Factor we have just calculated does not tell us that – it just tells us the evidence level for the claim that the three conditions are different.

Calculating the evidence for a particular pair in an experiment is called doing a pairwise comparison. You’ve in fact already done this – you removed the neutral trials from the data set and calculated a Bayes Factor for the congruent vs. incongruent comparison. You can use the same techniques to look at any pair.

Exercise

The goal in this exercise is to look at the reaction time difference between congruent and neutral trials in the control condition of this experiment.

Specifically, the task is to write a new R script that contains all the necessary commands, and only the necessary commands, to do all of the following, in this order:

  1. Start a new script and call it anova1ex.R. Start your script with the command rm(list=ls()). This command clears your Environment and so ensures your script is not relying on commands you have previously run.

  2. Load the packages you need (tidyverse and BayesFactor).

  3. Load the data file wordnaming2.csv (using read_csv).

  4. Preprocess the data to include only the control condition, and exclude the incongruent trials. Use the filter command to do this.

  5. Generate a subject-level summary of these filtered data (use the group_by and summarise commands).

  6. Show a density plot of the RT differences for congruent versus neutral trials. You’ll need to use pivot_wider and mutate to get the data in the right format, and then use ggplot to produce the graph.

  7. Calculate a Bayes Factor for this difference. Use anovaBF to do this, and don’t forget to use the factor command to tell R which columns of your data frame are factors.

Other requirements: include comments in your script, as in previous worksheets, so each step is easy to follow. For example:

## Clear the environment
rm(list = ls())

Expected output, submitting your work

Your output should look like the example below. Note that your Bayes Factor may be slightly different to the one reported here, for the reasons we covered earlier.

Once it does, copy and paste your R script for this exercise into PsycEL. REMEMBER: This should be all the code you need to complete this exercise, and only the code you need to complete this exercise (not any other code from this worksheet). Even with comments, it shouldn’t be longer than around 25 lines of code. Your code has to run, produce the correct answer, and not include irrelevant code, to pass this exercise.

Parsed with column specification:
cols(
  subj = col_double(),
  sex = col_character(),
  medit = col_character(),
  block = col_double(),
  trial = col_double(),
  word = col_character(),
  pic = col_character(),
  congru = col_character(),
  acc = col_double(),
  rt = col_double()
)
`summarise()` has grouped output by 'subj'. You can override using the `.groups` argument.