Before you start…

Before starting this exercise, you should have completed all the Absolute Beginners’ workshop exercises. If not, take a look at those exercises before continuing. Each section below also indicates which of the earlier worksheets are relevant.

Getting the data into R

Relevant worksheet: Intro to RStudio

You’ll need to complete the Psych:EL exercise to get the CSV file containing your data. You’ll also need to get an older person to complete the same Psych:EL exercise to get a second CSV file containing their data.

Once you have downloaded your CSV files, open a project on RStudio Server for this analysis, create a script file, and upload both CSV files to your project.

Plymouth University students: Create/open your project named psyc415; within that create a script file called risk-rat.R. Enter all commands into that script and run them from there.

Exploring your data


Relevant worksheet: Exploring data

Load the tidyverse package, then load your own data and that of your older participant.

# Risk-taking
# Load tidyverse
library(tidyverse)
# Load my data into 'risk.me'
risk.me <- read_csv("riskrat.csv")
# Load older participant's data into 'risk.other'
risk.other <- read_csv("riskrat-other.csv")

Note: Everyone’s CSV files have different names. For example, yours might be called 10435678you.csv and 10435678other.csv. In the example above, you’ll need to replace riskrat.csv and riskrat-other.csv with the names of your personal CSV files.


Look at the data by clicking on it in the Environment tab in RStudio. Each row is one person’s rating for one question. Here’s what each of the columns in the data set contains:

Column | Description | Values
SRN | Your Student Reference Number |
who | Is this data from you, or from the other (older) person you tested? | “you”, “other”
group | Which sort of risk-taking behaviour is this question about? | “ethical”, “financial”, “health”, “social”, “recreation”
qu | This number uniquely identifies the question that was asked | 1 - 26, e.g. qu. 19 is “Taking a skydiving class”
rating | The rating given in response to this question | 1 - 7, higher numbers = more likely to engage in the risky behaviour described in the question.
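
As an alternative to clicking in the Environment tab, the same columns can be inspected from the console. The sketch below is only an illustration: toy.me is a small made-up data frame standing in for your own data (its SRN, questions and ratings are invented), while glimpse and count are standard tidyverse commands.

```r
library(tidyverse)

# Made-up stand-in for one person's data; every value here is invented
toy.me <- tibble(
  SRN    = 10435678,
  who    = "you",
  group  = c("ethical", "ethical", "financial", "health", "social", "recreation"),
  qu     = 1:6,
  rating = c(6, 7, 4, 5, 3, 4)
)

# List each column with its type and first few values
glimpse(toy.me)

# Count how many questions fall into each type of risk-taking
group.counts <- toy.me %>% count(group)
group.counts
```

With your real data, you would replace toy.me with the name of the data frame you loaded from your CSV file.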

Calculating your risk-taking scores

Relevant worksheets: Group Differences

How highly did you score on each of the types of risk-taking behaviour (e.g. ethical, financial, …)?

To look at this, we take the average (mean) rating you made for each type of behaviour. To do this, we use the group_by and summarise commands you learned in the Group Differences worksheet.

# Group by 'group', calculate mean of 'rating'
risk.me %>% group_by(group) %>% summarise(mean(rating))
# A tibble: 5 × 2
  group      `mean(rating)`
  <chr>               <dbl>
1 ethical              6.5 
2 financial            3.8 
3 health               5.2 
4 recreation           4.17
5 social               2.83

As before, you can safely ignore the “ungrouping” message that you receive.

NOTE: Your output should look similar to that shown above, but the numbers will be different.

Which types of risk-taking behaviours did you score highest on? And lowest on?
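
One way to answer this at a glance is to sort the means from highest to lowest. The following is a minimal sketch using made-up ratings (toy.me and the column name mean.rating are hypothetical names chosen for this example); arrange and desc are standard tidyverse commands.

```r
library(tidyverse)

# Made-up ratings standing in for your own data; values are invented
toy.me <- tibble(
  group  = c("ethical", "ethical", "financial", "financial", "social", "social"),
  rating = c(6, 7, 4, 3, 2, 3)
)

# Mean rating per type, stored in a column called 'mean.rating',
# then sorted so the highest-scoring type comes first
toy.means <- toy.me %>%
  group_by(group) %>%
  summarise(mean.rating = mean(rating)) %>%
  arrange(desc(mean.rating))
toy.means
```

The first row of the sorted output is the type of risk-taking with the highest mean rating, and the last row is the lowest.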

Comparing your risk-taking behaviour to that of an older adult

Relevant worksheets: Group Differences

People tend to be less likely to take risks as they get older. Is this the case for you and the older adult you tested? In order to answer this question, we first have to put your data, and that of your older adult, together in one data frame.

Combining data frames

We can use the bind_rows command to combine two data frames, like this:

# Combine 'risk.me' and 'risk.other' into one data frame
risk <- bind_rows(risk.me, risk.other)
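
To see what bind_rows does, it can help to try it on two tiny data frames first. This is a sketch with made-up data (toy.me, toy.other and toy.both are hypothetical names, and the ratings are invented): the combined data frame should have as many rows as the two originals put together.

```r
library(tidyverse)

# Two tiny made-up data frames with the same columns
toy.me    <- tibble(who = "you",   qu = 1:3, rating = c(5, 4, 6))
toy.other <- tibble(who = "other", qu = 1:3, rating = c(3, 4, 2))

# Stack the rows of the second data frame underneath the first
toy.both <- bind_rows(toy.me, toy.other)

# 3 rows + 3 rows should give 6 rows in total
nrow(toy.both)

# Check that both people are present in the combined data
toy.both %>% count(who)
```

A quick check like this on your real data is a simple way to confirm that nothing was lost when combining.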

Comparing two individuals on an overall score

Now we can compare you and your older adult on your overall mean risk-taking score. We do this by grouping by the who variable in the risk data frame. If you get this right, your output will look a bit like this, although the exact numbers will be different:

# Group 'risk' by 'who', calculate mean of 'rating'
risk %>% group_by(who) %>% summarise(mean(rating))
# A tibble: 2 × 2
  who   `mean(rating)`
  <chr>          <dbl>
1 other           3.54
2 you             4.35

Who scores higher on risk taking – you, or your older adult?

Enter the exact risk-taking score for you and your older adult into your lab book.
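
Tibble output is rounded to three significant figures when printed. If you would like to see more decimal places, one option is to extract the column of means as plain numbers with pull. This is a sketch on made-up data (toy.both, toy.scores and mean.rating are hypothetical names; the ratings are invented).

```r
library(tidyverse)

# Made-up combined data for two people; values are invented
toy.both <- tibble(
  who    = c("you", "you", "you", "other", "other", "other"),
  rating = c(5, 4, 4, 3, 4, 4)
)

# Mean rating for each person, stored in a column called 'mean.rating'
toy.scores <- toy.both %>%
  group_by(who) %>%
  summarise(mean.rating = mean(rating))
toy.scores

# pull() extracts the column as plain numbers, which print with more digits
toy.scores %>% pull(mean.rating)
```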

Exploring everyone’s data

This part of the exercise can only be completed once a sufficient number of people have completed the risk-taking questionnaire on Psych:EL. When this happens, you will be able to download everyone’s data from Psych:EL as a CSV file. Download that file, and upload it to your RStudio project (the project you created at the beginning of this exercise).

Load data

Relevant worksheet: Exploring data

Load the tidyverse package, and load everyone’s data.

# Load data into 'risk.all'
risk.all <- read_csv("riskrat-all.csv")


Look at the data by clicking on it in the Environment tab in RStudio. You’ll see it has the same columns as the other data files; it just has a lot more rows (because it contains a lot of participants).

How do you compare to your peers?

Relevant worksheet: Group Differences


Let’s start by looking at the range of scores your peers got on this questionnaire. The first thing we’ll need to do is filter the data so it only contains your classmates, not the older adults. This is because older adults tend to score lower on risk taking than younger adults, and so it’s best to compare your score to people who are closer to your own age. We do this using the filter command you learned in the Group Differences worksheet. Here, we want to keep all ratings where the column who says you, because these are the ratings for when your peers are answering the questionnaire themselves. We can filter like this:

# Filter the 'you' data into 'risk.young'
risk.young <- risk.all %>% filter(who == "you")
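
To check that a filter has kept the rows you intended, you can count the values of who before and after. The sketch below uses a tiny made-up data frame (toy.all and toy.young are hypothetical names; the SRNs and ratings are invented).

```r
library(tidyverse)

# Made-up data for two students and the older adults they tested
toy.all <- tibble(
  SRN    = c(1, 1, 2, 2),
  who    = c("you", "other", "you", "other"),
  rating = c(5, 3, 6, 2)
)

# Before filtering: both "you" and "other" rows are present
toy.all %>% count(who)

# Keep only the rows where 'who' is "you"
toy.young <- toy.all %>% filter(who == "you")

# After filtering: only "you" rows remain
toy.young %>% count(who)
```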

Density plot

Now we can look at the range of scores given by your peers. A density plot, which you learned to produce in the Group Differences worksheet, is a good choice for this. Here, we’re going to make a density plot of the data in column rating of the risk.young data frame:

# Display density plot of 'rating'
risk.young %>% ggplot(aes(rating)) + geom_density(aes(y = ..scaled..), adjust = 2)

Note: You may have noticed the addition of adjust = 2 in the above command, which we didn’t use in the Group Differences worksheet. The adjust argument changes how smooth the density plot looks, with higher numbers making for smoother plots. Try changing the value to see what effect it has on your plot.
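
As a sketch of what to expect, the example below draws the same made-up ratings with a low and a high value of adjust (toy.young, plot.wiggly and plot.smooth are hypothetical names; the ratings are invented). It writes the y-axis as after_stat(scaled), which is the spelling recommended by newer versions of ggplot2 (3.4.0 onwards) in place of ..scaled..; if your version is older than 3.3.0, keep using ..scaled.. instead.

```r
library(tidyverse)

# Made-up ratings; values are invented for illustration
toy.young <- tibble(rating = c(2, 3, 4, 4, 5, 5, 5, 6, 6, 7))

# adjust = 0.5 gives a wigglier curve that follows the data closely
plot.wiggly <- toy.young %>% ggplot(aes(rating)) +
  geom_density(aes(y = after_stat(scaled)), adjust = 0.5)

# adjust = 4 gives a much smoother curve
plot.smooth <- toy.young %>% ggplot(aes(rating)) +
  geom_density(aes(y = after_stat(scaled)), adjust = 4)

# Display each plot in turn
plot.wiggly
plot.smooth
```

In both plots the tallest point of the curve is at 1, because the scaled density is the density divided by its maximum.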

In this particular plot, a rating of around 5 is the most common, with higher and lower ratings becoming increasingly less likely. But where does your score fit on this distribution? You’ve already calculated your overall score, so you can make this comparison manually, but we can also draw a line on this density plot representing your score, which is more immediately interpretable.

To do this, we use the command geom_vline (vline being short for “vertical line”) to draw a line on the plot to show your score. Replace the number 4.35 in the command below with your score:

# Plot as above, with vertical line added
risk.young %>% ggplot(aes(rating)) + geom_density(aes(y = ..scaled..), adjust = 2) +
  geom_vline(xintercept = 4.35)

In the above example, the individual’s score is close to the centre of the distribution. Are you towards the bottom, towards the top, or near the middle?
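
Rather than typing your score by hand, you could store it in a variable and pass that to geom_vline. This is a minimal sketch on made-up data (toy.me, toy.young, toy.score and plot.line are hypothetical names; the ratings are invented). It uses after_stat(scaled), the newer ggplot2 spelling of ..scaled..

```r
library(tidyverse)

# Made-up ratings for one person and for a group of peers
toy.me    <- tibble(rating = c(5, 4, 4, 5))
toy.young <- tibble(rating = c(2, 3, 4, 4, 5, 5, 5, 6, 6, 7))

# Calculate the overall mean rating and store it as a single number
toy.score <- toy.me %>% summarise(mean(rating)) %>% pull()

# Density plot of the peers, with a vertical line at the stored score
plot.line <- toy.young %>% ggplot(aes(rating)) +
  geom_density(aes(y = after_stat(scaled)), adjust = 2) +
  geom_vline(xintercept = toy.score)
plot.line
```

This avoids copying errors, and the line moves automatically if the data change.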

Finally, we’ll make this plot a bit prettier by adding some colour. Here, I’ve used some fairly ugly colours; for your plot, use a lightblue fill and a red line:

# Plot as above, with a green fill and yellow line
risk.young %>% ggplot(aes(rating)) + 
  geom_density(aes(y = ..scaled..), adjust = 2, fill = "green") +
  geom_vline(xintercept = 4.35, colour = 'yellow')

Use RStudio to export your light blue and red graph as an Image, and upload it to your lab book.

This material is distributed under a Creative Commons licence. CC-BY-SA 4.0.