Before you start…

Before starting this exercise, you should have completed all the Absolute Beginners’ workshop exercises. If not, take a look at those exercises before continuing. Each section below also indicates which of the earlier worksheets are relevant.

Getting the data into R

Relevant worksheet: Intro to RStudio

In this exercise, you’ll be analysing some data that has already been collected. To get this data into R, follow these steps:

  1. Set up an RStudio project for this analysis, and create a script file within that project. Plymouth University students: Create/open your project named psyc415; within that create a script file called lineup.R. Enter all commands into that script and run them from there.

  2. Upload this CSV file into your RStudio project folder. Here’s a reminder of how to upload CSV files.

  3. Load the tidyverse package, and then load your data into R.

library(tidyverse)
lup <- read_csv("lineup.csv")

Inspect

Look at the data by clicking on it in the Environment tab in RStudio.

Each row is one participant in this simulated police line-up experiment. Each participant views a video of a simulated crime, then has to pick the criminal from one of four photographs of different people. The criminal in the video does not appear in any of those four photos, but the participants have not yet been told that. After they make their decision, some participants are told they picked the correct person; the rest are not told anything. Each participant then goes on to answer a series of questions (Q2-Q9, below).

Will being told they made the right choice change peoples’ answers to these questions?

Column Description Values
Sub Subject number a number
Cond Did the subject receive feedback on their decision? “Feedback”, “No Feedback”
Q1 The photograph chosen by the participant A, B, C, or D
Q2 “Would you be willing to testify in court?” “Testify”, “Not Tesitfy”
Q3 “How was your view of the scene?” 0 - 100, higher numbers = better view
Q4 “How long did you see the thief’s face? (in seconds)” a number
Q5 “When you chose the photograph, how confident were you?” 0 - 100, higher numbers = more confident
Q6 “Did the thief shove the victim?” Yes, No
Q7 “How confident were you in your answer?” (about the shove) 0 - 100, higher numbers = more confident
Q8 “Do you think the thief may be violent?” Yes, No
Q9 “How confident were you in your answer?” (about the thief’s violence) 0 - 100, higher numbers = more confident

Testifying in court

Relevant worksheet: Relationships, Evidence

Will witnesses be more likely to testify in court if they are told they are right? We can look at this question with the data set you just loaded. Looking at the lup data frame, the column Cond tells us whether each participant was given feedback or not. The Q2 column tells us whether they said they would be willing to testify in court or not. Both of these variables have unordered (“nominal”) data, so the appropriate form of analysis here is a contingency table. As we covered in the Relationships worksheet, we produce a contingency table using the table command:

cont <- table(lup$Cond, lup$Q2)
cont
             
              Not Testify Testify
  Feedback             43      45
  No Feedback          66      17

Often, it’s easier to see what’s going on in a contingency table if we draw a mosaic plot:

mosaicplot(cont)

It looks like, with feedback, people are about 50:50 on whether they would testify. Without feedback, a large majority would not testify.

Is this a real effect, or could it just be down to chance? As we covered in the Relationships worksheet, the best way to look at this is with a Bayesian test. We use the cont contingency table we generated above:

library(BayesFactor, quietly = TRUE)
contingencyTableBF(cont, sampleType = "indepMulti", fixedMargin = "rows" )
Bayes factor analysis
--------------
[1] Non-indep. (a=1) : 1201.46 ±0%

Against denominator:
  Null, independence, a = 1 
---
Bayes factor type: BFcontingencyTable, independent multinomial

We’ve set fixedMargin = "rows" because the rows of the contingency table represent the groups created by the experimenter (Feedback vs. No Feedback).

The Bayes Factor here is about 1200, so it’s over a thousand times more likely there is a real difference, than there isn’t.

Did the thief shove the victim?

Now do the same analyses as above, but on question 6, “Did the thief shove the victim?”. To do this you change the command cont <- table(lup$Cond, lup$Q2) so that you get a contingency table for question 6. You can then re-run the commands above to get the answers.

Enter the Bayes Factor for question 6 into PsycEL.

Using the convention that there is a difference if BF > 3, there isn’t a difference if BF < 0.33, and if it’s between 0.33 and 3, we’re unsure, select difference, no difference, or unsure, on PsycEL.

How was your view of the scene?

Relevant worksheet: Group Differences, Evidence

Did participants think their view was better if they were told they made the correct decision? In this case, we have one ordered variable (Q3, their rating of their view on a 1-100 scale), and one unordered variable (Cond - whether they got feedback or not).

We start by looking to see how the mean scores on Question 3 differ for those who were and weren’t given feedback. As we saw in the Group Differences worksheet, we use the group_by, summarise, and mean commands to do this:

lup %>% group_by(Cond) %>% summarise(mean(Q3))
# A tibble: 2 x 2
  Cond        `mean(Q3)`
  <chr>            <dbl>
1 Feedback          45.6
2 No Feedback       41.2

As before, you can safely ignore the “ungrouping” message that you receive.

It looks like there’s a small difference, with the ratings of their view slightly higher in the feedback condition – but how does this between-group difference compare to the within-group variability? As we covered in the Group Differences worksheet, this most easily looked at with a scaled density plot:

lup %>% ggplot(aes(Q3, colour=factor(Cond))) + geom_density(aes(y=..scaled..)) 

This graph tells a somewhat different story to the means. The two groups almost completely overlap, with the main difference being that the No Feedback participants mostly give scores close to 50, while the Feedback participants give a broader range of scores.

At this point, the most pressing question is probably whether the difference observed in the mean scores is likely to be real, or whether it’s more likely down to chance. As we saw in the Evidence worksheet, the best way to look at this is with a Bayesian t-test:

ttestBF(formula = Q3 ~ Cond, data = data.frame(lup))
Bayes factor analysis
--------------
[1] Alt., r=0.707 : 0.3230296 ±0%

Against denominator:
  Null, mu1-mu2 = 0 
---
Bayes factor type: BFindepSample, JZS

The Bayes Factor in this case is about 1/3, meaning it’s about three times as likely there isn’t a difference as there is.

How long they thought they saw the thief’s face

Did participants who were told they were right think they saw the thief’s face for longer? This was addressed by Question 4 (column Q4 in data frame lup). By changing Q3 to Q4 in the commands above, you can answer this question.

Enter the mean viewing time for each condition, and the Bayes Factor for the difference, into PsycEL.

Using the convention that there is a difference if BF > 3, there isn’t a difference if BF < 0.33, and if it’s between 0.33 and 3, we’re unsure, select difference, no difference, or unsure, into PsycEL.


This material is distributed under a Creative Commons licence. CC-BY-SA 4.0.