Before starting this exercise, you should have completed
**all** the Absolute Beginners’
workshop exercises. If not, take a look at those exercises before
continuing. Each section below also indicates which of the earlier
worksheets are relevant.

**Relevant worksheet:** Intro to RStudio

In this exercise, you’ll be analysing some data that has already been collected. To get this data into R, follow these steps:

1. Set up an RStudio project for this analysis, and create a script file within that project.

   **Plymouth University students**: Create/open your project named `psyc415`; within that, create a script file called `lineup.R`. Enter all commands into that script and run them from there.

2. Upload this CSV file into your RStudio project folder. Here’s a reminder of how to upload CSV files.

3. Load the *tidyverse* package, and then load your data into R.

```
# Police lineup experiment
# Load tidyverse
library(tidyverse)
# Load data into 'lup'
lup <- read_csv("lineup.csv")
```

Look at the data by clicking on it in the *Environment* tab in
RStudio.

Each row is one participant in this simulated police line-up experiment. Each participant views a video of a simulated crime, then has to pick the criminal from one of four photographs of different people. The criminal in the video does not appear in any of those four photos, but the participants have not yet been told that. After they make their decision, some participants are told they picked the correct person; the rest are not told anything. Each participant then goes on to answer a series of questions (Q2-Q9, below).

*Will being told they made the right choice change people’s
answers to these questions?*

Column | Description | Values |
---|---|---|
Sub | Subject number | a number |
Cond | Did the subject receive feedback on their decision? | “Feedback”, “No Feedback” |
Q1 | The photograph chosen by the participant | A, B, C, or D |
Q2 | “Would you be willing to testify in court?” | “Testify”, “Not Testify” |
Q3 | “How was your view of the scene?” | 0 - 100, higher numbers = better view |
Q4 | “How long did you see the thief’s face? (in seconds)” | a number |
Q5 | “When you chose the photograph, how confident were you?” | 0 - 100, higher numbers = more confident |
Q6 | “Did the thief shove the victim?” | Yes, No |
Q7 | “How confident were you in your answer?” (about the shove) | 0 - 100, higher numbers = more confident |
Q8 | “Do you think the thief may be violent?” | Yes, No |
Q9 | “How confident were you in your answer?” (about the thief’s violence) | 0 - 100, higher numbers = more confident |

**Relevant worksheet:** Relationships, Evidence

Will witnesses be more likely to testify in court if they are told
they are right? We can look at this question with the data set you just
loaded. Looking at the `lup` data frame, the column `Cond` tells us
whether each participant was given feedback or not. The `Q2` column
tells us whether they said they would be willing to testify in court or
not. Both of these variables have unordered (“nominal”) data, so the
appropriate form of analysis here is a contingency table. As we covered
in the *Relationships* worksheet, we produce a contingency table using
the `table` command:

```
# Create contingency table of 'Cond' by 'Q2'
cont <- table(lup$Cond, lup$Q2)
# Display contingency table
cont
```

```
              Not Testify Testify
  Feedback             43      45
  No Feedback          66      17
```

Often, it’s easier to see what’s going on in a contingency table if we draw a mosaic plot:

```
# Display mosaic plot
mosaicplot(cont)
```

It looks like, with feedback, people are about 50:50 on whether they would testify. Without feedback, a large majority would not testify.
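The "about 50:50" and "large majority" readings can be checked by converting the counts into row proportions. The sketch below rebuilds the table from the counts printed above so that it runs on its own; the names `counts` and `props` are chosen here for illustration and are not part of the worksheet. If you already have `cont` in your session, `prop.table(cont, margin = 1)` should give the same proportions.

```r
# Rebuild the contingency table from the counts printed above
# ('counts' and 'props' are illustrative names, not from the worksheet)
counts <- matrix(
  c(43, 45,
    66, 17),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Feedback", "No Feedback"),
                  c("Not Testify", "Testify"))
)
# Convert counts to proportions within each row (margin = 1 means "by row")
props <- prop.table(counts, margin = 1)
# Display proportions to two decimal places
round(props, 2)
```

Each row of `props` sums to 1, so the two conditions can be compared directly even though they contain different numbers of participants.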

Is this a real effect, or could it just be down to chance? As we
covered in the *Relationships* worksheet, the best way to look at
this is with a Bayesian test. We use the `cont` contingency
table we generated above:

```
# Load BayesFactor package
library(BayesFactor, quietly = TRUE)
# Calculate Bayes Factor for the contingency table 'cont'
contingencyTableBF(cont, sampleType = "indepMulti", fixedMargin = "rows")
```

```
Bayes factor analysis
--------------
[1] Non-indep. (a=1) : 1201.46 ±0%
Against denominator:
Null, independence, a = 1
---
Bayes factor type: BFcontingencyTable, independent multinomial
```

We’ve set `fixedMargin = "rows"` because the rows of the
contingency table represent the groups created by the experimenter
(*Feedback* vs. *No Feedback*).

The Bayes Factor here is about 1200, so the data are over a thousand times more likely if there is a real difference than if there isn’t.

Now do the same analyses as above, but on question 6, “Did the thief
shove the victim?”. To do this you change the command
`cont <- table(lup$Cond, lup$Q2)` so that you get a
contingency table for question 6. You can then re-run the commands above
to get the answers.

**Enter the Bayes Factor for question 6 into
PsycEL.**

Using the convention that there is a difference if BF > 3, there
isn’t a difference if BF < 0.33, and if it’s between 0.33 and 3,
we’re unsure, select **difference, no difference, or
unsure**, on PsycEL.
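The convention above can be written out as a small function, which may help if you find the thresholds hard to keep in your head. This is only a sketch: `interpret_bf` is a hypothetical helper made up for this illustration, not a function from BayesFactor or any other package.

```r
# Hypothetical helper: classify a Bayes Factor using the convention above
interpret_bf <- function(bf) {
  if (bf > 3) {
    "difference"
  } else if (bf < 0.33) {
    "no difference"
  } else {
    "unsure"
  }
}
# The Bayes Factor for Q2, reported earlier
interpret_bf(1201.46)
# A Bayes Factor of exactly 1 favours neither hypothesis
interpret_bf(1)
```

You would still read the Bayes Factor itself from the output of the analysis; the function just applies the three cut-offs to whatever number you give it.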

**Relevant worksheet:** Group Differences, Evidence

Did participants think their view was better if they were told they
made the correct decision? In this case, we have one ordered variable
(`Q3`, their rating of their view on a 0-100 scale), and one
unordered variable (`Cond` - whether they got feedback or
not).

We start by looking to see how the mean scores on Question 3 differ
for those who were and weren’t given feedback. As we saw in the
*Group Differences* worksheet, we use the `group_by`, `summarise`,
and `mean` commands to do this:

```
# Group by 'Cond', take mean for Q3.
lup %>% group_by(Cond) %>% summarise(mean(Q3))
```

```
# A tibble: 2 × 2
  Cond        `mean(Q3)`
  <chr>            <dbl>
1 Feedback          45.6
2 No Feedback       41.2
```

As before, you can safely ignore the “ungrouping” message that you receive.

It looks like there’s a small difference, with the ratings of their
view slightly higher in the feedback condition – but how does this
between-group difference compare to the within-group variability? As we
covered in the *Group Differences* worksheet, this is most easily
looked at with a scaled density plot:

```
# Display density plot of 'Q3', by 'Cond'
# (after_stat(scaled) is the current form of the older ..scaled.. notation)
lup %>% ggplot(aes(Q3, colour = factor(Cond))) + geom_density(aes(y = after_stat(scaled)))
```

This graph tells a somewhat different story to the means. The two groups almost completely overlap, with the main difference being that the No Feedback participants mostly give scores close to 50, while the Feedback participants give a broader range of scores.
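A difference in spread like this can also be put into numbers by reporting the standard deviation alongside the mean for each group. The sketch below uses a small made-up data frame, `toy`, so that it runs on its own; the scores in it are invented for illustration and are not taken from the line-up data. With the real data, you would start the pipeline from `lup` instead.

```r
library(dplyr)

# Made-up example data (NOT the real line-up data):
# both groups have the same mean, but one is far more spread out
toy <- tibble(
  Cond = rep(c("Feedback", "No Feedback"), each = 4),
  Q3   = c(20, 40, 60, 80, 45, 50, 50, 55)
)
# Group by 'Cond'; take the mean and standard deviation of 'Q3'
spread <- toy %>%
  group_by(Cond) %>%
  summarise(mean_q3 = mean(Q3), sd_q3 = sd(Q3), .groups = "drop")
# Display the summary
spread
```

A larger standard deviation corresponds to a wider curve on the density plot, so two groups can have similar means while still looking quite different.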

At this point, the most pressing question is probably whether the
difference observed in the mean scores is likely to be real, or whether
it’s more likely down to chance. As we saw in the *Evidence*
worksheet, the best way to look at this is with a Bayesian t-test:

```
# Calculate Bayesian t-test for effect of 'Cond' on 'Q3'
ttestBF(formula = Q3 ~ Cond, data = data.frame(lup))
```

```
Bayes factor analysis
--------------
[1] Alt., r=0.707 : 0.3230296 ±0.04%
Against denominator:
Null, mu1-mu2 = 0
---
Bayes factor type: BFindepSample, JZS
```

The Bayes Factor in this case is about 1/3, meaning the data are about
three times as likely if there *isn’t* a difference as if there is.
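The "about three times" figure comes from inverting the Bayes Factor: a value below 1 is evidence for the null, and one divided by that value says how strong the evidence is. A quick check of the arithmetic, using the number printed in the output above (the variable names are just for illustration):

```r
# Bayes Factor for a difference, copied from the output above
bf_difference <- 0.3230296
# Invert it to express the evidence in favour of NO difference
bf_no_difference <- 1 / bf_difference
# Display to two decimal places
round(bf_no_difference, 2)
```

The result is a little above 3, which is why the Bayes Factor is described as "about 1/3".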

Did participants who were told they were right think they saw the
thief’s face for longer? This was addressed by Question 4 (column
`Q4` in data frame `lup`). By changing `Q3` to `Q4` in the commands
above, you can answer this question.

**Enter the mean viewing time for each condition, and the Bayes
Factor for the difference, into PsycEL**.

Using the convention that there is a difference if BF > 3, there
isn’t a difference if BF < 0.33, and if it’s between 0.33 and 3,
we’re unsure, select **difference, no difference, or
unsure**, on PsycEL.

This material is distributed under a Creative Commons licence. CC-BY-SA 4.0.