Using RStudio projects
Andy Wills, Paul Sharpe, Stuart Spicer
Before you start…
Before starting this exercise, you should have had a brief introduction to getting and using RStudio (see Introduction to RStudio). You should also have completed the workshop exercises for Exploring Data, Group Differences, and Evidence. If not, take a look at these earlier worksheets before continuing.
Log in to RStudio server.
- Creating a new project
- Writing an R script
- Running an R script
- Exercise: Analysis using an R script
- Downloading files from RStudio Online
Creating a new project
RStudio uses projects to help you keep your work organized, and to make sure you have a reproducible record of your analyses. Reproducible analysis is essential to good, open science.
We’re going to use a project to organize the analyses of data you have collected. An RStudio project is a way to organise data and analyses that belong together. You should start a new project for each worksheet in this introduction to R, and for each study you conduct throughout your degree. As you will be working with numerous data sets and analyses, developing this habit now ensures that they remain organised.
Here’s how to create a new project:
- At the top right of RStudio, you will see a little blue cube, with the text “Project: (none)”. Click on this.
- Now click “New Directory”
- Now click “New Project”
- Next, type in a name for the project that makes sense to you in the “Directory name” box. I’ve typed psyc411, a module code. Then click “Create project”. NB: You must not use most punctuation, including spaces, in project names (hyphens are OK). For example, both “Fish & Chips” and “Fish&Chips” will cause non-obvious errors later on. Use one of these instead:
- Now, create an R script. An R script is a place to keep your analysis commands safely stored. You create an R script by clicking on the white plus sign on a green background (see below), and then clicking on “R Script”.
If everything worked well, your screen should now look like this:
Notice that projects in RStudio look slightly different to how you’ve been using RStudio up until now. The two main changes are:
There is now a new type of tab on the top left. This is a Script tab, which we’re going to use in a moment.
The Files tab (bottom right) is nearly empty. This tab will now only show the files in your project. This makes it easier to keep stuff organized.
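The same information shown in the Files tab can also be seen from R itself. The lines below are a minimal sketch, assuming you are working inside an open project, in which case RStudio uses the project folder as R's working directory. The names project_dir and project_files are illustrative only.

```r
## Show the folder R is currently working in.
## Inside an RStudio project this should be the project folder.
project_dir <- getwd()
print(project_dir)

## List the files in that folder; this should match the Files tab.
project_files <- list.files(project_dir)
print(project_files)
```

In a brand new project the list will be short, which matches the nearly empty Files tab.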
Writing an R script
We’re now going to write an R script to organize the analysis of the data you collected during this module. Up to this point, you have been typing commands directly into the R console. However, it’s almost always better to keep your commands in an R script. Scripts make it easy to re-run one or more commands, without having to copy and paste them into the console. As you will see from the following worksheets, you tend to build your analyses step-by-step, so keeping all of the commands which belong together in a script is the best way of organising your work.
You should be familiar with this command by now: library(tidyverse). The slight difference this time is that you’re going to type (or paste) it as the first line of your R script (top left window), not into the Console.
You should notice that the name Untitled1 on the tab has now gone red. This is to remind you that your script has changed since the last time you saved it. So, click on the “Save” icon (the little floppy disk) and save your R script with some kind of meaningful name. The .R at the end of the file name indicates that it is an R script.
Next, you’re going to need to load your own data. The Entering Data by Hand worksheet explains how to do this, so take a look at that now. Once you’ve read it and done it, add this command to the next line of your script:
p411data <- read_csv("psyc411data.csv")
If you gave your CSV file a different name, change the name inside the quote marks accordingly.
Save your script again (click the Save icon), so you don’t lose anything. Do this each time you add something important to your script.
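If the read_csv line later fails with a message that the file does not exist, the likely cause is that the CSV file is not in the project folder, or that its name is spelt differently. The following is a hedged sketch of a quick check using base R's file.exists(). The names csv_name and csv_found are illustrative and not part of the worksheet.

```r
## Name of the CSV file; change this if you called yours something else.
csv_name <- "psyc411data.csv"

## TRUE if the file is in the current working folder, FALSE otherwise.
csv_found <- file.exists(csv_name)

## Report the result, along with the folder that was searched.
if (csv_found) {
  message("Found ", csv_name, " in ", getwd())
} else {
  message("Could not find ", csv_name, " in ", getwd())
}
```

If the file is reported as missing, check the Files tab to see whether it has been uploaded to the project, and compare the spelling.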
Running an R script
Put your cursor on line 1 of your script and press CTRL+ENTER (i.e. press the key marked ‘Ctrl’ and the RETURN or ENTER key together). Line 1 is automatically copied to your Console window and run. The cursor moves to line 2. Press CTRL+ENTER again to run the second line.
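Running one line at a time is useful while you are building a script. To run every line of a saved script in one go, base R provides the source() function. The sketch below is an illustration rather than part of the worksheet: it writes a hypothetical two-line script to a temporary file so that the example is self-contained. In practice you would give source() the file name of your own script instead.

```r
## Hypothetical demonstration script, written to a temporary file.
demo_script <- tempfile(fileext = ".R")
writeLines(c("demo_x <- 1 + 1",
             "demo_y <- demo_x * 10"),
           demo_script)

## Run every line of the script in order.
## echo = TRUE prints each command as it runs, much like the Console does.
source(demo_script, echo = TRUE)

## The objects created by the script are now available.
print(demo_y)
```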
Exercise: Analysis using an R script
You’ve written and run a working R script — it doesn’t do much yet! In order to analyse the data from your experiment, you need to use the commands you’ve learned up until now. The things you’ll need to do are:
Produce an appropriately labelled density plot of your dependent variable, with one line for each of your between-subject groups.
Calculate your effect size.
Perform a between-subjects t-test.
Perform a Bayesian t-test.
Write a script to do these analyses on your data.
Here’s what such a script looks like for the gender pay gap analyses. The lines that begin ## are comments. They are ignored by R but they help human readers work out what is going on.
## Load packages
library(tidyverse)
library(effsize)
library(BayesFactor)

## Load data
cpsdata <- read_csv("cps2.csv")

## Produce density plot
cpsdata %>% ggplot(aes(income, colour=factor(sex))) +
    geom_density(aes(y=..scaled..)) +
    xlab("Income in US Dollars") +
    ylab("Density")

## Calculate effect size
cohen.d(cpsdata$income ~ cpsdata$sex)

## Perform t-test
t.test(cpsdata$income ~ cpsdata$sex)

## Perform Bayesian t-test
ttestBF(formula = income ~ sex, data = data.frame(cpsdata))
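When adapting the script to your own data, the part that changes most is the formula, which has the pattern dependent variable ~ grouping variable. As a small sketch of that pattern, the lines below run a between-subjects t-test on sleep, a data set that ships with R, so no CSV file is needed. Here extra is the measurement and group is the grouping column. Passing the data frame through the data argument is an alternative to the cpsdata$ style used in the pay gap script; the name sleep_test is illustrative.

```r
## 'sleep' is a small data set built into R.
## extra = the measured outcome, group = which of two groups the row belongs to.
head(sleep)

## Between-subjects t-test: outcome on the left of ~, grouping on the right.
sleep_test <- t.test(extra ~ group, data = sleep)
print(sleep_test)
```

For your own analysis, replace sleep with the name of your data frame, and extra and group with your own column names.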
Downloading files from RStudio Online
In order to use files from RStudio Online in other applications (e.g. Microsoft Office, LibreOffice), you’ll need to download them. Here’s how:
Click on the ‘Files’ tab in RStudio Online, and click the tick box to select your file (it will have whatever filename you just gave it).
Click on “More” (next to the little blue gear wheel), and then “Export…”.
Click “Download”. The file will now be in the Downloads folder of your computer.
This material is distributed under a Creative Commons licence. CC-BY-SA 4.0.