Before starting this exercise, you should have completed all the Absolute Beginners’ workshop exercises. If not, take a look at those exercises before continuing. Each section below also indicates which of the earlier worksheets are relevant.
In order to complete this worksheet, you’ll need to have downloaded your CSV file from the PsycEL exercise. See the instructions on PsycEL for how to do this.
Plymouth University students: Create/open your project named psyc412; within that, create a script file called memories.R. Enter all commands into that script and run them from there.
Finally, load the tidyverse package, and load your data.
library(tidyverse)
mems <- read_csv("memories-single.csv")
Your CSV file may have a different name to the example above. If so, you will need to change memories-single.csv to the name of your file.
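Once the file has loaded, it can be worth a quick check that it contains what the rest of this worksheet expects. The sketch below is only an illustration: it writes a tiny made-up CSV to a temporary file so that it runs on its own, and the names example_file and mems_example are hypothetical. It assumes, as the later sections do, that the periods are coded 1 to 5 in a column called period. Your real file will probably contain other columns as well.

```r
library(tidyverse)

# Write a tiny made-up CSV so this example runs on its own;
# in the worksheet you would read your own PsycEL file instead.
example_file <- tempfile(fileext = ".csv")
write_csv(tibble(period = c(1, 2, 2, 3, 5)), example_file)

mems_example <- read_csv(example_file)

# Is the column we need present, and does it hold only the codes 1 to 5?
has_period <- "period" %in% names(mems_example)
codes_ok <- all(mems_example$period %in% 1:5)
print(has_period)
print(codes_ok)
```

If either check prints FALSE for your own data, look again at the file name and at the PsycEL download instructions before going further.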
Relevant worksheet: Exploring data.
Are memories from all time periods about equally common? Or are recent memories more common than remote ones? Or perhaps some other pattern? A histogram can help us to answer this question by visualising our data. You covered how to make a histogram in the Exploring Data worksheet. In this case, our data of interest are in the period column of the mems data frame, so the command we use is:
mems %>% ggplot(aes(period)) + geom_histogram(binwidth=.5)
Your histogram will look something like the above, but the heights of the bars will likely be somewhat different.
binwidth has been set to .5 here to make a gap between each bar in the histogram. Try changing binwidth to 1 to see what effect it has on your plot.
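If you would like to check the heights of the bars against the raw numbers, one option is to count the rows for each period. This is a minimal sketch using made-up data; mems_example and period_counts are hypothetical names, and you would use your own mems data frame in place of the example.

```r
library(tidyverse)

# Made-up data standing in for the real mems data frame
mems_example <- tibble(period = c(1, 1, 2, 3, 3, 3, 4, 5, 5))

# One row per period, with the column n giving the number of memories
period_counts <- mems_example %>% count(period)
print(period_counts)
```

Each value of n should match the height of the corresponding bar in the histogram.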
Not bad…but it could be better. In particular, having the time periods labelled as numbers doesn’t make for a very readable graph; it would be better if we used more meaningful labels. We can use the scale_x_continuous command of ggplot to add our own labels to a histogram:
mems %>%
  ggplot(aes(period)) +
  geom_histogram(binwidth=.5) +
  scale_x_continuous(limits = c(0.75,5.25), breaks = 1:5,
                     labels = c("Fred", "Wilma", "Barney", "Betty", "Pebbles"))
Explanation: The scale_x_continuous command takes three arguments here.
breaks = 1:5 tells R to mark each of the periods 1, 2, 3, 4 and 5 on the x axis, one for each bar.
labels gives the label for each of those bars, in order.
limits tells R what to use as the minimum and maximum values for the x axis. It is important to include this as well as setting breaks because otherwise R will ignore zero-height bars. We set the range from 0.75 to 5.25 (rather than from 1 to 5) because we need to leave room for the width of the bars: each bar is centred on its period and is .5 wide (the value we gave binwidth), so it extends 0.25 either side.
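To see why limits matters, it can help to compare two plots of a data set in which one period has no memories at all. The sketch below uses made-up data with nothing in period 5; the names mems_example, p_no_limits and p_limits are hypothetical, and the labels are left out to keep the comparison short.

```r
library(tidyverse)

# Made-up data in which nobody reported a memory from period 5
mems_example <- tibble(period = c(1, 1, 2, 2, 2, 3, 4, 4))

# Without limits, the x axis is fitted to the periods that have data
p_no_limits <- mems_example %>%
  ggplot(aes(period)) +
  geom_histogram(binwidth = .5) +
  scale_x_continuous(breaks = 1:5)

# With limits, period 5 keeps its place on the axis, with no bar above it
p_limits <- mems_example %>%
  ggplot(aes(period)) +
  geom_histogram(binwidth = .5) +
  scale_x_continuous(limits = c(0.75, 5.25), breaks = 1:5)

print(p_no_limits)
print(p_limits)
```

In the first plot the axis should end shortly after period 4; in the second, period 5 should appear as an empty slot, which makes it clear that no memories were reported for it.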
Export your histogram, using the Export icon on RStudio’s Plots window, and selecting “Save as image…”. Give it a meaningful file name (e.g. “memories-hist”) and click ‘Save’.
Download your histogram from RStudio server - see these instructions for a reminder of how to do this.
Upload your histogram to PsycEL (see the PsycEL activity for instructions on how to do this).
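As an alternative to the Export icon, a plot can also be saved from your script using ggplot's ggsave command. This is a sketch rather than part of the required steps: the data are made up, hist_plot and mems_example are hypothetical names, and the width and height are arbitrary choices. The file is written to your current working directory, so on RStudio server you would still need to download it afterwards.

```r
library(tidyverse)

# Made-up data standing in for the real mems data frame
mems_example <- tibble(period = c(1, 2, 2, 3, 4, 5, 5))

# Store the plot in a variable so it can be passed to ggsave
hist_plot <- mems_example %>%
  ggplot(aes(period)) +
  geom_histogram(binwidth = .5)

# Write the plot to a PNG file in the current working directory;
# width and height are in inches
ggsave("memories-hist.png", plot = hist_plot, width = 6, height = 4)
```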
This material is distributed under a Creative Commons licence. CC-BY-SA 4.0.