Before starting this exercise, you should have completed **all** the Absolute Beginners’ workshop exercises. If not, take a look at those exercises before continuing. Each section below also indicates which of the earlier worksheets are relevant.

**Relevant worksheet:** Intro to RStudio

In this exercise, you’ll be analysing some data that has already been collected. To get this data into R, follow these steps:

1. Set up an RStudio project for this analysis, and create a script file within that project.

   **Plymouth University students**: Create/open your project named `psyc415`; within that, create a script file called `lineup.R`. Enter all commands into that script and run them from there.

2. Upload this CSV file into your RStudio project folder. Here’s a reminder of how to upload CSV files.

3. Load the *tidyverse* package, and then load your data into R.

```
library(tidyverse)
lup <- read_csv("lineup.csv")
```

Look at the data by clicking on it in the *Environment* tab in RStudio.

Each row is one participant in this simulated police line-up experiment. Each participant views a video of a simulated crime, then has to pick the criminal from one of four photographs of different people. The criminal in the video does not appear in any of those four photos, but the participants have not yet been told that. After they make their decision, some participants are told they picked the correct person; the rest are not told anything. Each participant then goes on to answer a series of questions (Q2-Q9, below).

*Will being told they made the right choice change peoples’ answers to these questions?*

Column | Description | Values |
---|---|---|
Sub | Subject number | a number |
Cond | Did the subject receive feedback on their decision? | “Feedback”, “No Feedback” |
Q1 | The photograph chosen by the participant | A, B, C, or D |
Q2 | “Would you be willing to testify in court?” | “Testify”, “Not Testify” |
Q3 | “How was your view of the scene?” | 0 - 100, higher numbers = better view |
Q4 | “How long did you see the thief’s face? (in seconds)” | a number |
Q5 | “When you chose the photograph, how confident were you?” | 0 - 100, higher numbers = more confident |
Q6 | “Did the thief shove the victim?” | Yes, No |
Q7 | “How confident were you in your answer?” (about the shove) | 0 - 100, higher numbers = more confident |
Q8 | “Do you think the thief may be violent?” | Yes, No |
Q9 | “How confident were you in your answer?” (about the thief’s violence) | 0 - 100, higher numbers = more confident |

**Relevant worksheet:** Relationships, Evidence

Will witnesses be more likely to testify in court if they are told they are right? We can look at this question with the data set you just loaded. Looking at the `lup` data frame, the column `Cond` tells us whether each participant was given feedback or not. The `Q2` column tells us whether they said they would be willing to testify in court or not. Both of these variables have unordered (“nominal”) data, so the appropriate form of analysis here is a contingency table. As we covered in the *Relationships* worksheet, we produce a contingency table using the `table` command:

```
cont <- table(lup$Cond, lup$Q2)
cont
```

```
              Not Testify Testify
  Feedback             43      45
  No Feedback          66      17
```

Often, it’s easier to see what’s going on in a contingency table if we draw a mosaic plot:

`mosaicplot(cont)`

It looks like, with feedback, people are about 50:50 on whether they would testify. Without feedback, a large majority would not testify.
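To put numbers on that impression, the counts can be converted to row proportions. The following is a minimal sketch that rebuilds the table by hand from the counts printed above so that it runs on its own; if you already have `cont` from the `table` command, `prop.table(cont, margin = 1)` should give the same result.

```r
# Counts copied from the contingency table printed above
cont <- matrix(c(43, 45,
                 66, 17),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("Feedback", "No Feedback"),
                               c("Not Testify", "Testify")))

# margin = 1 divides each cell by its row total,
# so each row (condition) sums to 1
props <- prop.table(cont, margin = 1)
round(props, 2)
```

Each row then shows the proportion of participants in that condition giving each answer, which is the comparison the mosaic plot displays visually.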

Is this a real effect, or could it just be down to chance? As we covered in the *Relationships* worksheet, the best way to look at this is with a Bayesian test. We use the `cont` contingency table we generated above:

```
library(BayesFactor, quietly = TRUE)
contingencyTableBF(cont, sampleType = "indepMulti", fixedMargin = "rows" )
```

```
Bayes factor analysis
--------------
[1] Non-indep. (a=1) : 1201.46 ±0%
Against denominator:
Null, independence, a = 1
---
Bayes factor type: BFcontingencyTable, independent multinomial
```

We’ve set `fixedMargin = "rows"` because the rows of the contingency table represent the groups created by the experimenter (*Feedback* vs. *No Feedback*).

The Bayes Factor here is about 1200, so it’s over a thousand times more likely that there is a real difference than that there isn’t.

Now do the same analyses as above, but on question 6, “Did the thief shove the victim?”. To do this, change the command `cont <- table(lup$Cond, lup$Q2)` so that you get a contingency table for question 6. You can then re-run the commands above to get the answers.

**Enter the Bayes Factor for question 6 into PsycEL.**

Using the convention that there is a difference if BF > 3, there isn’t a difference if BF < 0.33, and if it’s between 0.33 and 3, we’re unsure, select **difference, no difference, or unsure**, on PsycEL.
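The convention can also be written as a small function. This is only an illustrative sketch: `interpret_bf` is a hypothetical helper name, not something provided by the BayesFactor package, and the thresholds are simply the ones stated above.

```r
# Hypothetical helper (not part of any package): applies the
# worksheet's convention to a single Bayes Factor
interpret_bf <- function(bf) {
  if (bf > 3) {
    "difference"
  } else if (bf < 0.33) {
    "no difference"
  } else {
    "unsure"
  }
}

interpret_bf(1201.46)  # the Bayes Factor reported above for question 2
interpret_bf(1)        # a Bayes Factor of 1 favours neither side
```

You would pass in the Bayes Factor you obtained for question 6 in the same way.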

**Relevant worksheet:** Group Differences, Evidence

Did participants think their view was better if they were told they made the correct decision? In this case, we have one ordered variable (`Q3`, their rating of their view on a 0-100 scale), and one unordered variable (`Cond`, whether they got feedback or not).

We start by looking to see how the mean scores on Question 3 differ for those who were and weren’t given feedback. As we saw in the *Group Differences* worksheet, we use the `group_by`, `summarise`, and `mean` commands to do this:

`lup %>% group_by(Cond) %>% summarise(mean(Q3))`

```
# A tibble: 2 x 2
  Cond        `mean(Q3)`
  <chr>            <dbl>
1 Feedback          45.6
2 No Feedback       41.2
```

As before, you can safely ignore the “ungrouping” message that you receive.

It looks like there’s a small difference, with the ratings of their view slightly higher in the feedback condition – but how does this between-group difference compare to the within-group variability? As we covered in the *Group Differences* worksheet, this is most easily looked at with a scaled density plot:

`lup %>% ggplot(aes(Q3, colour=factor(Cond))) + geom_density(aes(y=..scaled..)) `

This graph tells a somewhat different story to the means. The two groups almost completely overlap, with the main difference being that the No Feedback participants mostly give scores close to 50, while the Feedback participants give a broader range of scores.

At this point, the most pressing question is probably whether the difference observed in the mean scores is likely to be real, or whether it’s more likely down to chance. As we saw in the *Evidence* worksheet, the best way to look at this is with a Bayesian t-test:

`ttestBF(formula = Q3 ~ Cond, data = data.frame(lup))`

```
Bayes factor analysis
--------------
[1] Alt., r=0.707 : 0.3230296 ±0%
Against denominator:
Null, mu1-mu2 = 0
---
Bayes factor type: BFindepSample, JZS
```

The Bayes Factor in this case is about 1/3, meaning it’s about three times as likely there *isn’t* a difference as there is.
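The "three times as likely" figure comes from taking the reciprocal of the Bayes Factor. A minimal sketch of that arithmetic, using the value printed in the output above (the variable names `bf10` and `bf01` are just illustrative labels):

```r
bf10 <- 0.3230296   # Bayes Factor from the output above: difference vs. no difference
bf01 <- 1 / bf10    # the same evidence expressed as no difference vs. difference
round(bf01, 1)      # about 3.1
```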

Did participants who were told they were right think they saw the thief’s face for longer? This was addressed by Question 4 (column `Q4` in data frame `lup`). By changing `Q3` to `Q4` in the commands above, you can answer this question.

**Enter the mean viewing time for each condition, and the Bayes Factor for the difference, into PsycEL**.

Using the convention that there is a difference if BF > 3, there isn’t a difference if BF < 0.33, and if it’s between 0.33 and 3, we’re unsure, select **difference, no difference, or unsure**, on PsycEL.

This material is distributed under a Creative Commons licence. CC-BY-SA 4.0.