Return to the project that you used for the understanding interactions worksheet, which you named **rminr-data**. Open a new blank script and save it as **anova3.R**. Put all commands in that script, and save the script regularly.

This worksheet makes use of the preprocessed `MCwordsCIsum` data frame we made in the understanding interactions worksheet. Start by running the following commands to make sure you have this data frame in your *environment*:

```
library(tidyverse)
library(BayesFactor)
words <- read_csv("wordnaming2.csv")
MCwords <- words %>% filter(medit != "relax")
MCwordsCI <- MCwords %>% filter(congru != "neutral")
MCwordsCIsum <- MCwordsCI %>%
  group_by(subj, medit, congru) %>%
  summarise(rt = mean(rt))
```

In this worksheet, we’ll use the `anovaBF` command to analyse the data from an experiment with a *factorial* design. In the understanding interactions worksheet, we said that a *factorial* experiment design is one where there are at least two factors (e.g. training type, trial type) and we have data for all combinations of those factors (e.g. meditate-congruent, meditate-incongruent, control-congruent, control-incongruent). When you have a factorial design with two factors, there are three questions you can ask:

- Is there a *main effect* of the first factor (e.g. training type)?

- Is there a *main effect* of the second factor (e.g. trial type)?

- Is there an *interaction* between these two factors?

The command `anovaBF` allows us to answer all three of these questions in one go.

Take a look at the data frame `MCwordsCIsum`. It gives the mean congruent reaction time, and mean incongruent reaction time, for each participant. As before, you have to tell R which of the columns of this data frame are *factors*. The columns `medit` and `congru` are factors - they are the two factors of the design. Recall that the participant ID is also a factor. So, you need to set three of the four columns of `MCwordsCIsum` as factors, like this:

```
MCwordsCIsum$subj <- factor(MCwordsCIsum$subj)
MCwordsCIsum$medit <- factor(MCwordsCIsum$medit)
MCwordsCIsum$congru <- factor(MCwordsCIsum$congru)
```

Once you have told R which columns are *factors*, you’re ready to answer these three questions:

- Is there a *main effect* of the first factor (e.g. training type)?

- Is there a *main effect* of the second factor (e.g. trial type)?

- Is there an *interaction* between these two factors?

This is a big calculation for R, so run the following commands, and then read the explanation while you’re waiting for the results. It could take up to a minute to get the answer.

```
bf <- anovaBF(formula = rt ~ medit*congru + subj,
              data = data.frame(MCwordsCIsum), whichRandom = "subj")
```

We’ll look at each part of this command but, rather than looking at each in turn, we’ll look at the crucial part first, which is this:

`formula = rt ~ medit*congru + subj`

This tells `anovaBF` what type of ANOVA we want. It says that the dependent variable is the reaction time column `rt`, and that the independent variables are the training type `medit` and the trial type `congru`. The `*` means “do a factorial ANOVA”. In other words, it means work out a Bayes Factor for the main effect of each factor, and for the interaction. The final part, `+ subj`, is something we always include in `anovaBF`, and just says that the participant IDs are to be found in the `subj` column.

You also have to tell `anovaBF` which factors are random factors. In most cases we’ll deal with in this guide, only the participant ID is a random factor. Hence:

`whichRandom = "subj"`

`bf <-` - As the calculation takes a while to run, it makes sense to save the results of the calculation, so we only ever have to do it once. This command writes the output of `anovaBF` to an *object* called `bf`. Until now, most objects you’ve seen have been data frames. But there are other types of object in R, including ones, like this one, that contain the results of calculations.

`anovaBF` - This is the same command as we’ve used for a while now. It’s quite flexible!

`data = data.frame(MCwordsCIsum)` - As in previous examples of `anovaBF`, this just says where to find the data (i.e. in the `MCwordsCIsum` data frame).

Once the calculation has finished, take a look at the results. You do this by typing in the name of the object you stored the results in:

`bf`

```
Bayes factor analysis
--------------
[1] medit + subj                         : 345.3945   ±0.89%
[2] congru + subj                        : 865.3041   ±5.97%
[3] medit + congru + subj                : 306142.9   ±2.29%
[4] medit + congru + medit:congru + subj : 1851438813 ±3.37%

Against denominator:
  rt ~ subj
---
Bayes factor type: BFlinearModel, JZS
```

The exact figures in your output may be slightly different to those shown.

The key parts of this output are the Bayes Factors, which are the numbers immediately after the colons. A couple of things to notice here. First, there are four Bayes factors, when we might have expected to see three. We’ll come back to that later. Second, there are things like `±2.66%` after the Bayes Factors. This means that R has *estimated* the Bayes Factors, and is able to tell you how accurate that estimate is. So, for example, `10 ± 10%` means the Bayes Factor is somewhere between 9 and 11.

We’ll look at each of these Bayes factors in turn. Notice they are numbered, e.g. `[1]` - this will become useful later. Also notice that they all include `+ subj`. This just reminds us that we’ve told `anovaBF` which data comes from which participant, so we’ll ignore the `+ subj` in our descriptions from here on.

`[1] medit`

This is the **main effect** of training type, `medit`. More specifically, it is a test of the hypothesis that `medit` affects `rt`.

Recall that a Bayes Factor is a comparison of the evidence for two hypotheses. So, in a Bayesian t-test, for example, a BF of 3 means there is three times as much evidence in favour of a difference between groups (the experimental hypothesis) as in favour of there *not* being a difference (the null hypothesis).

The Bayes Factor for this hypothesis is around 340, which is really strong evidence that there is a main effect of meditation training. But which direction is the main effect in? Does meditation training make you faster overall, or slower overall?

We can find this out by using the `group_by` and `summarise` commands we have used several times before:

`MCwordsCIsum %>% group_by(medit) %>% summarise(mean(rt))`

```
# A tibble: 2 x 2
  medit    `mean(rt)`
  <fct>         <dbl>
1 control        520.
2 meditate       488.
```

From this, we can see that people in the meditation condition are about 30 ms faster, on average, than those in the control condition. Meditation training seems to have led to a reduction in overall reaction times in this task.

`[2] congru`

This is the **main effect** of congruence. More specifically, it is the hypothesis that `congru` affects `rt`. This is compared against the hypothesis that there is no effect of congruence.

The Bayes Factor is about 860, strong evidence for a main effect of congruence. As before, we can find out which direction this effect is in:

`MCwordsCIsum %>% group_by(congru) %>% summarise(mean(rt))`

```
# A tibble: 2 x 2
  congru `mean(rt)`
  <fct>       <dbl>
1 cong         490.
2 incong       518.
```

As expected, incongruent trials are slower than congruent trials, on average.

`anovaBF` does not directly give us a Bayes Factor for the interaction of the two main effects. Instead, it gives us two Bayes Factors that we can use to work out the interaction BF. These are:

`[3] medit + congru`

This is the hypothesis that there is a main effect of both factors. There is no assumption that the two main effects are of the same size. This ‘main effects’ hypothesis is compared against the null hypothesis, i.e. that neither `medit` nor `congru` affects `rt`.

The BF for this hypothesis is large (about 300,000). We’d expect this, given that there was substantial evidence for both `congru` alone and `medit` alone.

`[4] medit + congru + medit:congru`

This is the Bayes Factor for the hypothesis that there are main effects for both factors (`medit + congru`) *and* that the two factors interact (`+ medit:congru`). This is again compared against the null hypothesis that neither `medit` nor `congru` has any effect.

The BF for this hypothesis is also large (about \(1.8 \times 10^{9}\)). We’d expect this, because there was substantial evidence for the ‘main effects’ hypothesis `medit + congru`.

Remember that a Bayes Factor is always a comparison of two types of evidence. So far, we’ve always compared some experimental hypothesis with the null hypothesis. But we can also compare the evidence for two different experimental hypotheses. The hypothesis that there is an interaction is the hypothesis that there is something more going on than just the combination of main effects - for example, that the congruency effect is smaller after meditation. So, to get a Bayes Factor for the interaction, we compare the evidence for hypothesis `[4]` with the evidence for hypothesis `[3]`. To do this, we divide the Bayes Factor for `[4]` by the Bayes Factor for `[3]`.

In R, we can use the `bf` object to do this:

`bf[4] / bf[3]`

```
Bayes factor analysis
--------------
[1] medit + congru + medit:congru + subj : 6047.63 ±4.08%

Against denominator:
  rt ~ medit + congru + subj
---
Bayes factor type: BFlinearModel, JZS
```

This gives us a Bayes Factor for the interaction close to 6000. So, there is strong evidence for the interaction, too.

As covered in the within-subject differences worksheet, one of the strengths of `anovaBF` is that it’s not limited to factors with two levels. In our meditation experiment, there were three between-subjects training conditions (meditate, relaxation, none), and three within-subjects trial types (congruent, incongruent, neutral). **The goal of this exercise is to write an R script to analyse the full experiment**.

Below are further instructions. In some cases, there are also examples of what your output should look like at each stage if you’ve got it right (recall that your BF may be slightly different to those shown). Once your script is working correctly, **copy and paste it into PsycEL**.

Start a new R script in your current project (**rminr-data**), call it **anova3ex.R** and begin it with `rm(list=ls())` to make sure you’ve cleared everything from your environment. Then, write R commands to do the following (see below). Only include the commands that are needed to do this, and use meaningful names for your variables.

- Load packages and data.

- Preprocess that data and place it in a data frame called `wordsum`. You may see this message: `` `summarise()` regrouping output by 'subj', 'medit' (override with `.groups` argument) ``

- Tell R which columns of your preprocessed data are factors.

- Calculate a factorial Bayesian ANOVA for this full 3 (meditate, relax, control) x 3 (congruent, incongruent, neutral) data set, and store it in an object (e.g. `bfall`).

- Show the results of that calculation:

```
Bayes factor analysis
--------------
[1] medit + subj                         : 830.3331     ±0.96%
[2] congru + subj                        : 2.769511e+16 ±0.64%
[3] medit + congru + subj                : 2.347575e+19 ±0.89%
[4] medit + congru + medit:congru + subj : 1.217232e+27 ±1.09%

Against denominator:
  rt ~ subj
---
Bayes factor type: BFlinearModel, JZS
```

- Calculate the BF for the interaction.

```
Bayes factor analysis
--------------
[1] medit + congru + medit:congru + subj : 51850606 ±1.41%

Against denominator:
  rt ~ medit + congru + subj
---
Bayes factor type: BFlinearModel, JZS
```

- Report the mean RTs for the main effect of each condition.

`` `summarise()` ungrouping output (override with `.groups` argument) ``

```
# A tibble: 3 x 2
  medit    `mean(rt)`
  <fct>         <dbl>
1 control        520.
2 meditate       488.
3 relax          517.
```

`` `summarise()` ungrouping output (override with `.groups` argument) ``

```
# A tibble: 3 x 2
  congru  `mean(rt)`
  <fct>        <dbl>
1 cong          491.
2 incong        526.
3 neutral       508.
```

- Calculate the mean RT for each of the nine conditions of the experiment, and place these data in a new data frame so you can graph them:

`` `summarise()` regrouping output by 'medit' (override with `.groups` argument) ``

```
# A tibble: 9 x 3
# Groups:   medit [3]
  medit    congru     rt
  <fct>    <fct>   <dbl>
1 control  cong     492.
2 control  incong   548.
3 control  neutral  519.
4 meditate cong     489.
5 meditate incong   487.
6 meditate neutral  489.
7 relax    cong     492.
8 relax    incong   543.
9 relax    neutral  517.
```

- Graph

Use a line graph, similar to the ones you produced earlier, to show these nine means. Note that `ggplot` orders conditions alphabetically, so by default your x-axis would come out in the order `cong, incong, neutral`. It makes more sense to use the order `cong, neutral, incong`, as the neutral trials (those without pictures) are, in some sense, in between the congruent trials (helpful pictures) and the incongruent trials (unhelpful pictures). You can reorder the points on the x-axis of a graph using the `scale_x_discrete` command. In this case, you can set the correct order by adding this to your graph commands:

`scale_x_discrete(limits=c("cong", "neutral", "incong"))`
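As an illustrative sketch only (the name `ninemeans` is a placeholder for whatever you called your data frame of nine condition means), the full set of graph commands might look something like this:

```r
# Sketch, not a required answer: 'ninemeans' is a placeholder for the
# data frame of nine condition means made above. One line per training
# group, with the x-axis reordered so neutral sits between cong and
# incong.
ninemeans %>%
  ggplot(aes(x = congru, y = rt, group = medit, colour = medit)) +
  geom_line() +
  geom_point() +
  scale_x_discrete(limits = c("cong", "neutral", "incong"))
```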

Although that’s the end of the exercise, in a full analysis you would probably want to go further and look at particular pairs of conditions within the experiment. You do this as before, by using the `filter` command to select the data you want to analyse. For more details, see the within-subject differences worksheet.

The graph you produced in the above exercise is quite common in psychology experiments, but it’s not particularly informative because it gives only the average response for each condition. It gives no sense of how much people vary from one another.

One common approach to this problem is to add *error bars*, or *confidence intervals*, to a line plot or bar plot. These are ways of showing variability around the mean with a little “I”-shaped mark. Although we don’t cover them in this course, you’ll see them in a lot of journal articles.

Unfortunately, very many of those articles use them *wrongly* for within-subjects factors, because they use the variability of each level of the factor to calculate the confidence intervals. This is wrong because it is the variability of the *differences* between conditions that are relevant in a within-subjects design, not the variability of the individual conditions. The former will often be smaller than the latter. Indeed, this difference is one of the main reasons to favour within-subjects designs, see here for more details.

A better way of showing variability in a within-subjects factor is to use a density plot of the differences, as we have in these worksheets. Where you also have a between-subjects factor, you can overlay those difference-density plots, as in this figure. You can alternatively use boxplots or violin plots, which some people find more attractive and/or easier to interpret.

In cases where your within-subjects factor has more than two levels, this can get more tricky but generally the solution is to pick those pairs of levels that are important for your hypothesis, and do difference-density plots of those.
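For instance, with the two-level data frame used earlier in this worksheet, an overlaid difference-density plot could be sketched as follows (`pivot_wider` comes with the tidyverse; the object name `diffs` is just an example):

```r
# Compute each participant's congruency effect (incongruent minus
# congruent), then plot the density of those differences, with one
# curve per training group.
diffs <- MCwordsCIsum %>%
  pivot_wider(names_from = congru, values_from = rt) %>%
  mutate(effect = incong - cong)
diffs %>%
  ggplot(aes(x = effect, colour = medit)) +
  geom_density()
```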

When it comes to reporting the results of a Bayesian ANOVA, you just give the Bayes Factor in the appropriate part of your text. For example:

There was a main effect of training type, \(BF = 814\), a main effect of congruency, \(BF = 2.8 \times 10^{16}\), and an interaction between these two factors, \(BF = 5.4 \times 10^{7}\).

It’s important to remember that Bayes Factors without means are basically meaningless. So, you need to show, for example, a graph of the means for each condition (as above) for the reader to make sense of your analysis.

There are a number of different ways to do Bayesian calculations, and these can lead to somewhat different results. So, it’s really important to also say exactly what calculation you did. In this case you would say:

We performed a factorial Bayesian ANOVA with one within-subjects factor (congruency) and one between-subject factor (training type), using the BayesFactor package (Morey & Rouder, 2018) in R (R Core Team, 2019).

It’s also important to include those references in your Reference section. R will tell you the reference for a package if you type e.g. `citation("BayesFactor")`. The reference for R itself is found by typing `citation()`. Note that R is what is doing your calculations, while *RStudio* is a web page that makes it easier to use R. *RStudio* does not have any effect on the output you get. So, you don’t normally cite *RStudio* in your write-ups.

A final important thing to realise about ANOVA is that it does not care about the order of the levels in your factors. For example, the full data set you have been analysing includes block number. Participants do 30 trials, then take a break, then do another 30 trials, and so on. So each response is either in block 1, 2, or 3. If people were getting tired, you might see reaction times rise from block 1 to 2 and again from block 2 to 3. You might also find Bayesian evidence for a main effect of block (e.g. BF = 30).

It’s important to realise that this ANOVA Bayes Factor tells you only that the three blocks differ, not that 3 is greater than 2 and 2 is greater than 1. There are two ways of answering that sort of question. First, you could do two pairwise comparisons (block 1 vs. 2, and block 2 vs. 3). Alternatively, you could use a different analysis method that takes account of the fact that block is an *ordinal* factor (i.e. that it has a specific order). Regression is often a good choice in these cases.
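As a purely hypothetical sketch (the data frame `blocksum` and its columns are placeholders, not something made in this worksheet), a Bayesian regression on block might look like this:

```r
# Hypothetical sketch: treat block as a number, so the model can pick
# up an ordered 1 -> 2 -> 3 trend, rather than as an unordered factor.
# 'blocksum', with columns subj, block and rt, is a placeholder name.
blocksum$block <- as.numeric(blocksum$block)
bfblock <- lmBF(rt ~ block + subj, data = data.frame(blocksum),
                whichRandom = "subj")
bfblock
```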

This material is distributed under a Creative Commons licence. CC-BY-SA 4.0.