Before you start…

Before starting this exercise, you should have completed all the Absolute Beginners’, Part 1 workshop exercises. If not, take a look at those exercises before continuing. Each section below also indicates which of the earlier worksheets are relevant.

Getting the data into RStudio Server

Relevant worksheets: Using RStudio projects, Exploring data.

In order to complete this worksheet, you’ll need to have downloaded your CSV file from the PsycEL exercise. See the instructions on PsycEL for how to do this.

Once you have downloaded your CSV file, set up a new project on RStudio Server for this analysis, and upload your CSV to your project.

Finally, load the tidyverse package, and load your data.

library(tidyverse)
mems <- read_csv("memories-single.csv")

Your CSV file may have a different name to the example above. If so, you will need to change memories-single.csv to the name of your file.
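If you are not sure what your file is called, one way to check is to ask R which CSV files are in your project folder. This is only a sketch: it assumes your CSV has been uploaded to the project's main folder, and csv_files is just a name chosen for this example.

```r
# List every file in the current project folder whose name ends in .csv
csv_files <- list.files(pattern = "\\.csv$")

# Show the names, so you can copy the right one into read_csv()
print(csv_files)
```

If nothing is listed, the upload probably went to a different folder, or has not happened yet.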

Making a histogram

Relevant worksheet: Exploring data.

Are memories from all time periods about equally common? Or are recent memories more common than remote ones? Or perhaps some other pattern? A histogram can help us to answer this question by visualising our data. You covered how to make a histogram in the Exploring Data worksheet. In this case, our data of interest are in the period column of the mems data frame, so the command we use is:

mems %>% ggplot(aes(period)) + geom_histogram(binwidth=.5) 

Explanation

  • Your histogram will look something like the above, but the heights of the bars will likely be somewhat different.

  • The binwidth has been set to .5 here to make a gap between each bar in the histogram. Try changing binwidth to 1 to see what effect it has on your plot.
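To see the effect of binwidth without using your own data, here is a small sketch with a made-up data frame. The toy data frame and its values are invented purely for illustration; they stand in for mems and are not real memory data.

```r
library(tidyverse)

# Made-up period codes (1 to 5), standing in for the period column of mems
toy <- tibble(period = c(1, 1, 2, 3, 3, 3, 5, 5))

# binwidth = .5: each bar is half a unit wide, so there are gaps between periods
narrow <- toy %>% ggplot(aes(period)) + geom_histogram(binwidth = .5)

# binwidth = 1: each bar is a full unit wide, so neighbouring bars touch
wide <- toy %>% ggplot(aes(period)) + geom_histogram(binwidth = 1)

print(narrow)
print(wide)
```

The counts are the same in both plots; only the width of the bars changes.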

Improving the histogram

Not bad…but it could be better. In particular, having the time periods labelled as numbers doesn’t make for a very readable graph; it would be better if we used more meaningful labels. We can use the scale_x_continuous command of ggplot to add our own labels to a histogram:

mems %>% ggplot(aes(period)) + 
  geom_histogram(binwidth=.5) + 
  scale_x_continuous(limits = c(0.75,5.25), breaks = 1:5, labels = c("Fred", "Wilma", "Barney", "Betty", "Pebbles"))

Explanation: The command scale_x_continuous takes three arguments here: breaks, labels and limits.

breaks = 1:5 tells R to put a tick mark on the x axis at each of the periods 1, 2, 3, 4 and 5, which is where the bars sit.

labels gives the label for each of those tick marks, in order, so each bar gets its own label.

limits tells R what to use as the minimum and maximum values for the x axis. It is important to include this as well as setting breaks, because otherwise R fits the axis to the data you have, so a period with no memories at either end of the range (a zero-height bar) would be left off the plot. We set the range from 0.75 to 5.25 (rather than from 1 to 5) because we need to leave room for the width of the bars: with binwidth set to .5, each bar extends .25 either side of its period number.
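The role of limits is easiest to see with data where one period is empty. The sketch below uses a made-up data frame with no memories in period 5; the name toy and its values are invented for illustration only.

```r
library(tidyverse)

# Made-up data with no memories at all in period 5
toy <- tibble(period = c(1, 1, 2, 3, 3, 4))

# Without limits, the x axis stops just after period 4, so period 5 is missing
no_limits <- toy %>% ggplot(aes(period)) +
  geom_histogram(binwidth = .5) +
  scale_x_continuous(breaks = 1:5)

# With limits, the x axis runs on to 5.25, leaving a visible empty space at period 5
with_limits <- toy %>% ggplot(aes(period)) +
  geom_histogram(binwidth = .5) +
  scale_x_continuous(limits = c(0.75, 5.25), breaks = 1:5)

print(no_limits)
print(with_limits)
```

In the second plot, the empty space above the tick mark for period 5 shows the reader that this period was measured but had no memories in it.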

Exercise

  1. Modify the command above to add more meaningful labels to your histogram. If you get it right, it’ll look something like this (without the words “example plot”, of course):

  2. Export your histogram, using the Export icon on RStudio’s Plots window, and selecting “Save as image…”. Give it a meaningful file name (e.g. “memories-hist”) and click ‘Save’.

  3. Download your histogram from RStudio Server; see these instructions for a reminder of how to do this.

  4. Upload your histogram to PsycEL (see the PsycEL activity for instructions on how to do this).


This material is distributed under a Creative Commons licence. CC-BY-SA 4.0.