Research Methods in R
2023 Edition.
Research Methods in R is a set of guides on how to use R as your central research methods tool. They are written by various authors, and curated by Andy Wills. The target audience is psychology undergraduate students. Research Methods in R is Creative Commons, so you are free to reuse these materials and adapt them as you wish, as long as you attribute them to their authors, and as long as your modifications have a Creative Commons licence. They come with absolutely no warranty of any kind.
Note to teachers: These materials have been tested against R version 4.2.1 (released 23rd June 2022), and the most recent versions of packages that were available on MRAN on the 1st June 2022.
List of guides
Introductory guides
Start with ONE of these three options:
 Long, easygoing introduction
 Somewhat shorter introduction
 Even shorter introduction
Intermediate guides
Next, go through BOTH of these guides.
Advanced guides
General resources

Why R? Discussion of the advantages of R over other software packages.

Pedagogy Discussion of philosophy of teaching and learning underlying these materials (mainly aimed at teachers).

Who is using R? Partial list of psychology degree programmes around the world that use R.

Other resources. A list of other Creative Commons resources about using R.

Calculating your module mark. How to calculate a final module mark from your component marks, using R.

Dealing with common errors. List of commonly-encountered errors and how to solve them.
Absolute Beginners’ Guide to R
A series of worksheets on using R for data analysis in psychology. No previous knowledge of R, or of psychology, is assumed.
Part 1

Introduction to RStudio. A basic introduction to the software.

Exploring data. Means, medians, and histograms.

More on tibbles. Deeper explanation of ‘tibbles’ in R.

Means and medians. Some slides on the difference between a mean and a median.


Group differences. Means and standard deviations, by group. Filtering data. Effect size.

Evidence. Introduction to p values. Traditional between-subjects t-test. Bayesian between-subjects t-test.

More on t-tests. Further information on traditional t-tests, and confidence intervals.

More on Bayes Factors. A more detailed discussion of Bayes Factors.


Analyzing your project data. Analysing your own data.

Entering data by hand. Entering data into a spreadsheet. Saving data into your RStudio project.
Part 2

Inter-rater reliability. Percentage agreement. Cohen’s kappa.
 More on Cohen’s kappa. A discussion of some potentially surprising outputs from a Cohen’s kappa calculation.

Relationships. Frequency and contingency tables. Mosaic plots. Traditional chi-square test. Bayesian test.

More on relationships. Extension material on chi-square calculations, including issues surrounding ordered variables (e.g. age), the interpretation of large contingency tables, and a further explanation of the output of the Bayesian chi-square test.

Sample characteristics. How to calculate summary information about your sample, such as number of participants or gender balance, from your data file.


Relationships, part 2. Density plots. Scatter plots. Correlation coefficient. Bayesian and traditional tests.

More on relationships, part 2. Spearman’s correlation, Kendall’s tau, one-tailed tests, confidence intervals, plus a deeper look at the output of the Bayesian correlation test.

Making reports with R. How to insert an RStudio graph into your word processor document (e.g. Word). Links to RMarkdown as an alternative.

Putting R to work
These are mainly further practice in the skills learned in Absolute Beginners’. Where the exercises contain completely new skills, these are shown in bold. Where the exercises extend a skill you’ve already been taught, these are shown in italics. The exercises become somewhat more difficult as you go down the list.
If you are a current undergraduate student at Plymouth University, you should complete the accompanying Psych:EL (Psychology: Experiential Learning) activity first, in order to generate your own set of data. If you’re not, you can download sample data files here.

Autobiographical memory. Entering data by hand, histograms.

Face recognition. Means, filtering data, and a bar graph.

Spatial navigation. More on bar graphs.

Response compatibility. Means, filtering data, standard deviations, and density plots.

Visual illusions. Filtering data, means, violin plot, Bayesian t-test.

Facial attractiveness. Means, standard deviations, interquartile range, and density plots.

Police lineup. Contingency table, mosaic plot, Bayesian contingency test, means, density plot, Bayesian t-test.

Risk taking. Means, combining data frames, filtering data, and density plots.

Animal Welfare. Percentage agreement, Cohen’s kappa, contingency tables, bar charts.

Creativity and the environment. Preprocessing, means, density plots, effect size, Bayesian t-test.

Political psychology. Means, filtering data, summarizing data, density plots, effect size, Bayesian t-test, traditional t-test.
A Very Brief Guide to R
The Absolute Beginners’ Guide to R and Putting R to Work provide, between them, about 20 hours of introductory material. For those in a hurry, the Very Brief Guide to R covers the most critical material from those two courses in about four hours.

Using RStudio: Brief introduction to the software

Exploring data: Loading data, calculating means

Group differences: Grouping, density plots, filtering.

Evidence, part 1: Bayesian and traditional t-tests

Evidence, part 2: Bayes and traditional correlation, scatterplot
Research Methods in Practice (Quantitative section)
These are intermediate-level materials. They are maintained by Ben Whalley on a separate site, but have been designed to fit in here in this sequence of materials. Only the quantitative section of Ben’s site contains information concerning the usage of R.
 Research Methods in Practice: Data handling, fitting lines (scatterplot with best fit line), converting Likert scales from text to numbers, reverse scoring scale items, multiple regression.
Intermediate Guide to R
These are intermediate-level materials. They provide analysis methods for conducting realistic, high-quality studies in psychology. They are aimed at a second-year undergraduate audience.

Revision: A quick recap of key information covered in earlier courses.

Statistical power: How to calculate the statistical power of experiments.
 More on statistical power: A deeper discussion on statistical power, including: (1) relation between statistical power and the replication crisis, (2) better standards for statistical power, (3) how to improve effect size, (4) estimating effect size from previous work.

Data preprocessing: Getting data from lab-based (OpenSesame) experiments into a format closer to something you can actually analyse, in five steps: loading, selecting, filtering, summarising, and combining. Also covers combining data frames, renaming columns.
 More on preprocessing: A slightly more advanced worksheet, covering adding columns to a data frame, and subsetting strings.

Within-subject differences: Data preprocessing (pivoting and mutating). One-factor within-subject Bayesian ANOVA. Pairwise comparisons, multiple comparisons.
 More on Bayes Factors. A more detailed discussion of Bayes Factors.

Understanding interactions: Learn what an interaction is, and learn how to do line plots at the same time.

Factorial differences: Two-factor Bayesian ANOVA (one within, one between), plus advice on: pairwise comparisons, better graphs, reporting Bayesian ANOVA, and ordinal (i.e. ordered) independent variables.
Going further with R
These are slightly more advanced materials, aimed at a final-year undergraduate psychology audience.

Data management
 Data management: Anonymity and privacy, good and bad file types, creating and sharing a private GitHub repository, adding a repository to RStudio, adding files to GitHub using RStudio, modifying and updating files, git log as your logbook, branching, recovering an earlier version of a file.

Preprocessing

Data preprocessing for experiments: Deduplicating data, excluding participants, log transform.

Data preprocessing for scales: Handling missing data, calculating scale scores, tidying survey data.


Descriptive statistics

Better tables: correlation matrix, custom table of descriptive statistics.

Better graphs: publication-quality graphs showing both central tendency and variability (or uncertainty) of your data, including: line plots, distribution plots (density, violin, half-violin), box plots, and confidence intervals. Suggested plots for one- and two-factor designs, within-subject, between-subject, or mixed designs, and with ordered and unordered variables. Discussion of common bad plots to avoid (bar plots; confusions over confidence intervals). Pairs plot for correlational designs.

Analysing scales: Cronbach’s alpha.


Bayesian inferential statistics

Estimate sample size with Bayes Factors: An introduction and manual to Bayesian Power Calculations.

One-sample Bayesian t-test: Comparing a single-group sample of data against a population mean.

More on Bayesian ANOVA: More on two-factor Bayesian ANOVA.

More on regression: Multiple regression with more than two predictors, hierarchical regression, evidence for individual predictors.


Traditional inferential statistics

Traditional ANOVA: p-value-based approach to ANOVA.

Traditional nonparametric tests: Mann-Whitney U, Kruskal-Wallis H.

Case studies
These are full preprocessing and analysis pipelines, mainly based on final-year undergraduate psychology projects.

The effects of negative mental imagery on self-esteem: preprocessing, Cronbach’s alpha, Bayesian ANOVA.

The Perruchet Effect: Downloading from OSF, deduplicating data, excluding participants, line graphs, baseline correction of neuroscience data, functions, loops, merging data frames, list of participant numbers, log transforms, recoding data, Bayesian linear regression for within-subjects designs.

Children’s language development: preprocessing, Bayesian t-test, tables of descriptive statistics, correlations, half-violin plot, Wilcoxon test.
R for Pros
These worksheets go beyond what is taught in previous sections of RMINR. They are aimed at high-achieving undergraduates, as well as postgraduate students and professional researchers. They assume familiarity with material up to and including Going Further with R.
 Bayesian ANOVA for Pros: doing two-factor within-subjects Bayesian ANOVA better.
Source code
These teaching materials were generated using a combination of Markdown and RMarkdown. The full source code is available on GitHub.
Licence
This material is distributed under a Creative Commons licence. CC-BY-SA 4.0.
Parts of this material have been adapted from these other Creative Commons materials:
 May, J. (2018). Getting Results with R.
 Whalley, B. (2018). Just Enough R.
 Wills, A. (2015). R for Experimental Psychologists.
Acknowledgements
Thanks to the following people for their feedback and advice on these materials:
Jackie Andrade, Eleanor Andrade May, Martyn Atkins, Patric Bach, Alison Bacon, Dale Barr, Nadège Bault, Chris Berry, Allegra Cattani, Laura Charlton, Lisa DeBruine, Charlotte Edmunds, Emily Filewood, Giorgio Ganis, Phil Gee, Michaela Gummerum, Yaniv Hanoch, Cathryn Harries, Jessica Hart, Sophie Homer, Courtney Hooton, Angus Inkster, Jasmin Jones, Peter Jones, Laith Kahn, Gokcek Kul, Helen Lloyd, Chris Longmore, Jon May, Anthony Mee, Chris Mitchell, Millie Monks, Karol Nedza, Alyson Norman, Charlie Reynolds, Matt Roser, Paul Sharpe, Alastair Smith, Julian Stander, Sylvia Terbeck, Michael Verde, Clare Walsh, Ben Whalley.