Before you start…

Before starting this exercise, you should have completed all the Absolute Beginners’ workshop exercises. If not, take a look at those exercises before continuing. Each section below also indicates which of the earlier worksheets are relevant.

Getting the data into R

Relevant worksheet: Intro to RStudio

You’ll need to complete the Psych:EL exercise to get the CSV file containing your data. You’ll also need to get an older person to complete the same Psych:EL exercise to get a second CSV file containing their data.

Plymouth University students: Create/open your project named psyc415; within that create a script file called risk-rat.R. Enter all commands into that script and run them from there.

Relevant worksheet: Exploring data

library(tidyverse)
risk.me <- read_csv("riskrat.csv")
risk.other <- read_csv("riskrat-other.csv")

Note: Everyone’s CSV files have different names. For example, yours might be called 10435678you.csv and 10435678other.csv. In the example above, you’ll need to replace riskrat.csv and riskrat-other.csv with the names of your personal CSV files.

Inspect

Look at the data by clicking on it in the Environment tab in RStudio. Each row is one person’s rating for one question. Here’s what each of the columns in the data set contains:

Column | Description | Values
who | Is this data from you, or from the other (older) person you tested? | “you”, “other”
group | Which sort of risk-taking behaviour is this question about? | “ethical”, “financial”, “health”, “social”, “recreation”
qu | This number uniquely identifies the question that was asked | 1 - 26, e.g. qu. 19 is “Taking a skydiving class”
rating | The rating given in response to this question | 1 - 7, higher numbers = more likely to engage in the risky behaviour described in the question
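
If you would rather check the columns from the console than by clicking, the sketch below shows one way. It builds a tiny made-up data frame called example (a hypothetical name, with invented values, not real data) that has the same four columns, and then inspects it. With your own data you would use risk.me or risk.other instead.

```r
library(tidyverse)

# Made-up ratings with the same columns as the real CSV files (illustration only)
example <- tibble(
  who    = c("you", "you", "you", "you"),
  group  = c("ethical", "financial", "health", "social"),
  qu     = c(1, 2, 3, 4),
  rating = c(6, 3, 5, 2)
)

# List each column with its type and first few values
glimpse(example)

# Count how many ratings there are for each type of behaviour
example %>% count(group)
```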

Relevant worksheets: Group Differences

How highly did you score on each of the types of risk-taking behaviour (e.g. ethical, financial, …)?

To look at this, we take the average (mean) rating you made for each type of behaviour. To do this, we use the group_by and summarise commands you learned in the Group Differences worksheet.

risk.me %>% group_by(group) %>% summarise(mean(rating))
# A tibble: 5 x 2
group      mean(rating)
<chr>               <dbl>
1 ethical              6.5
2 financial            3.8
3 health               5.2
4 recreation           4.17
5 social               2.83

As before, you can safely ignore the “ungrouping” message that you receive.

NOTE: Your output should look similar to that shown above, but the numbers will be different.

Which types of risk-taking behaviours did you score highest on? And lowest on?
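
One way to read off the highest and lowest types is to sort the summary, rather than scanning it by eye. The sketch below assumes a made-up data frame called example (invented values); the column name mean.rating and the use of arrange are my additions, not something the earlier worksheets necessarily covered. With your own data you would start from risk.me.

```r
library(tidyverse)

# Made-up ratings (illustration only); use risk.me for your own data
example <- tibble(
  group  = c("ethical", "ethical", "financial", "financial", "social", "social"),
  rating = c(6, 7, 3, 5, 2, 3)
)

# Give the summary column a name, then sort from highest to lowest mean
by.group <- example %>%
  group_by(group) %>%
  summarise(mean.rating = mean(rating)) %>%
  arrange(desc(mean.rating))

by.group
```

The first row of by.group is then the type you scored highest on, and the last row is the lowest.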

Relevant worksheets: Group Differences

People tend to be less likely to take risks as they get older. Is this the case for you and the older adult you tested? In order to answer this question, we first have to put your data, and that of your older adult, together in one data frame.

Combining data frames

We can use the bind_rows command to combine two data frames, like this:

risk <- bind_rows(risk.me, risk.other)
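
To see what bind_rows does, it can help to try it on something small. The sketch below uses two made-up data frames (me.example and other.example are hypothetical names with invented values) and checks that the combined data frame has all the rows from both.

```r
library(tidyverse)

# Two made-up data frames with the same columns (illustration only)
me.example <- tibble(
  who = "you", group = c("health", "social"), qu = c(1, 2), rating = c(5, 3)
)
other.example <- tibble(
  who = "other", group = c("health", "social"), qu = c(1, 2), rating = c(2, 4)
)

# Stack the rows of the second data frame underneath the first
both.example <- bind_rows(me.example, other.example)

# The combined data frame should have as many rows as the two inputs together
nrow(both.example) == nrow(me.example) + nrow(other.example)

# The who column should now contain both people
both.example %>% count(who)
```

You can run the same two checks on your own risk data frame to confirm that nothing went missing.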

Comparing two individuals on an overall score

Now we can compare you and your older adult on your overall mean risk-taking score. We do this by grouping by the who variable in the risk data frame. If you get this right, your output will look a bit like this, although the exact numbers will be different:

risk %>% group_by(who) %>% summarise(mean(rating))
# A tibble: 2 x 2
who   mean(rating)
<chr>          <dbl>
1 other           3.54
2 you             4.35

Who scores higher on risk taking – you, or your older adult?

Exploring everyone’s data

This part of the exercise can only be completed once a sufficient number of people have completed the risk-taking questionnaire on Psych:EL. When this happens, you will be able to download everyone’s data from Psych:EL as a CSV file. Download that file, and copy it into your RStudio project (the project you created at the beginning of this exercise).

Relevant worksheet: Exploring data

library(tidyverse)
risk.all <- read_csv("riskrat-all.csv")

Inspect

Look at the data by clicking on it in the Environment tab in RStudio. You’ll see it has the same columns as the other data files; it just has a lot more rows (because it contains a lot of participants).

How do you compare to your peers?

Relevant worksheet: Group Differences

Filtering

Let’s start by looking at the range of scores your peers got on this questionnaire. The first thing we’ll need to do is filter the data so it only contains your classmates, not the older adults. This is because older adults tend to score lower on risk taking than younger adults, and so it’s best to compare your score to people who are closer to your own age. We do this using the filter command you learned in the Group Differences worksheet. Here, we want to keep all ratings where the column who says you, because these are the ratings for when your peers are answering the questionnaire themselves. We can filter like this:

risk.young <- risk.all %>% filter(who == "you")
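
A quick way to confirm that a filter did what you intended is to count the values left in the column you filtered on. The sketch below uses made-up data (all.example and young.example are hypothetical names); with the real data you would run the final line on risk.young.

```r
library(tidyverse)

# Made-up data with both kinds of respondent (illustration only)
all.example <- tibble(
  who    = c("you", "other", "you", "other", "you"),
  rating = c(5, 2, 4, 3, 6)
)

# Keep only the rows where who is exactly "you"
young.example <- all.example %>% filter(who == "you")

# After filtering, "you" should be the only value left in the who column
young.example %>% count(who)
```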

Density plot

Now we can look at the range of scores given by your peers. A density plot, which you learned to produce in the Group Differences worksheet, is a good choice for this. Here, we’re going to make a density plot of the data in the rating column of the risk.young data frame:

risk.young %>% ggplot(aes(rating)) + geom_density(aes(y=..scaled..), adjust = 2)

Note: You may have noticed the addition of adjust = 2 in the above command, which we didn’t use in the Group Differences worksheet. The adjust argument changes how smooth the density plot looks, with higher numbers making for smoother plots. Try changing the value to see what effect it has on your plot.
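
If you are curious why higher values give smoother plots, the sketch below looks at R's built-in density function, which also has an adjust argument. As far as I know, geom_density hands its adjust value on to this function, but treat that as an assumption rather than something this worksheet relies on. The ratings here are made up for illustration.

```r
# Made-up ratings (illustration only)
ratings <- c(1, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 6, 7)

# The bandwidth is the width of the smoothing window used to draw the curve
bw.default  <- density(ratings)$bw
bw.smoother <- density(ratings, adjust = 2)$bw

# Compare the two bandwidths: adjust multiplies the default bandwidth
bw.smoother / bw.default
```

A wider smoothing window averages over more neighbouring ratings, which is what irons out the bumps in the curve.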

In this particular plot, a rating of around 5 is the most common, with higher and lower ratings becoming progressively less common. But where does your score fit on this distribution? You’ve already calculated your overall score, so you can make this comparison manually, but we can also draw a line on this density plot representing your score, which is easier to interpret at a glance.

To do this, we use the command geom_vline (vline being short for “vertical line”) to draw a line on the plot to show your score. Replace the number 4.35 in the command below with your score:

risk.young %>% ggplot(aes(rating)) + geom_density(aes(y=..scaled..), adjust = 2) +
geom_vline(xintercept = 4.35)

In the above example, the individual’s score is close to the centre of the distribution. Are you towards the bottom, towards the top, or near the middle?
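
Rather than typing your score into geom_vline by hand, you could calculate it and store it first. The sketch below is one way to do that, using made-up data (me.example, young.example and my.score are hypothetical names); with your own data you would use risk.me and risk.young. The $ notation for picking out a column may be new to you.

```r
library(tidyverse)

# Made-up ratings (illustration only); use risk.me and risk.young for your own data
me.example    <- tibble(rating = c(4, 5, 3, 6))
young.example <- tibble(rating = c(1, 2, 3, 3, 4, 4, 5, 5, 5, 6, 6, 7))

# Calculate the overall mean rating and store it as a single number
my.score <- mean(me.example$rating)

# Use the stored score to position the vertical line
young.example %>% ggplot(aes(rating)) +
  geom_density(aes(y = ..scaled..), adjust = 2) +
  geom_vline(xintercept = my.score)
```

This avoids copying the number across by hand, so the line stays correct if your data changes.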

Finally, we’ll make this plot a bit prettier by adding some colour. Here, I’ve used some fairly ugly colours; for your plot, use a lightblue fill and a red line:

risk.young %>% ggplot(aes(rating)) +
geom_density(aes(y=..scaled..), adjust = 2, fill = "green") +
geom_vline(xintercept = 4.35, colour = 'yellow')

Use RStudio to export your light blue and red graph as an image, and upload it to your lab book.