As Psychology students, many of us know by now that statistics is a fundamental part of psychological research. We have all had our moments grappling with the intricacies of ANOVA and the many functions of SPSS, which is commonly the first data analysis software introduced to us in our foundational psychology statistics module. However, the main drawback of SPSS is that it is not free.
Thankfully, there are many readily available open-source data analysis programmes. For today, we will be introducing the basics of R, one of the more well-known open-source programmes.
What is R?
R is a free software environment for statistical computing and programming. It is the brainchild of Ross Ihaka and Robert Gentleman, conceived at the Department of Statistics at the University of Auckland.
What can R bring to the table that SPSS doesn’t already offer?
- R is completely free of charge.
- R is constantly being updated by members of the community with new features and bug fixes.
- R is incredibly versatile, with functions that range from simple tabulation of summary statistics to the creation of extensive graphical diagrams.
- R is a programming language. When you learn how to use R, besides gaining valuable data analysis skills, you are picking up programming skills as well.
For those who have not tried their hand at programming, this may seem incredibly daunting, but do not fear! There are plenty of existing R scripts readily available online for you to use for your data analysis.
Also, RStudio, a free interface for working with R, makes programming in R much more manageable, for example by suggesting function names and flagging possible mistakes as you type.
If you are ever unsure of methods of analysis using R, there are plenty of free online resource guides to refer to. Here are some examples:
And even if you can’t find exactly what you are looking for within the guides, generally, suggestions can be found with a simple Google search.
Enough said, how do I install R/RStudio on my computer/laptop?
You can download R here: https://cran.r-project.org/
You can download RStudio here: https://www.rstudio.com/products/rstudio/download/#download
Alternatively, RStudio can also be accessed through your web browser for online use here:
(Note: Certain functions on RStudio Cloud may differ from the ones I will be mentioning below.)
Great, I installed everything.
So where do I begin?
Let me begin by introducing the four RStudio windows to you. (Note: This view might look slightly different on a non-Apple laptop/computer but the basic elements are the same.)
1. Source
The “Source” panel is where you create R scripts, which are files that contain your code. You can type your code here, but nothing is evaluated until you click the “Run” button.
Let’s start with something simple: 5+5
Then we click on “Run”.
The output, “10”, appears in the “Console”. (R prints it as “[1] 10”; the “[1]” simply marks the first element of the result.)
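If you would like to go one step further, here is a minimal sketch of a few more lines you could type into the “Source” panel and run. The object name `total` is made up for illustration; it is not part of R.

```r
# Basic arithmetic: R evaluates the line and prints the result
5 + 5

# Store a result in an object with the assignment arrow "<-"
# (`total` is a hypothetical name chosen for this example)
total <- 5 + 5

# Use the stored object in a later calculation
total * 2
```

Running the first line prints “[1] 10” and the last line prints “[1] 20”. The assignment in the middle prints nothing; it only stores the value.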
Note: If you close RStudio without saving, you may lose what is in your “Source” panel. To prevent this, save the untitled file under a file name before working on your code.
2. Console
In the “Console” panel, the code that you type is evaluated immediately when you press the Enter key. However, the code in the “Console” is cleared whenever you close the window, regardless of whether you saved a file at the beginning. It is advisable to write most of your code in the “Source” panel instead of the “Console” panel to avoid losing your progress.
3. Environment/ History
The “Environment” tab displays the data objects that you have defined in your R session, such as variables and data frames. For instance, in a previous R session, my “Environment” tab displayed the different variables that I defined and their values:
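To get a feel for how objects end up in the “Environment” tab, you could try a rough sketch like the one below. All of the names and values (`participant_age`, `condition`, `survey`) are invented for illustration and are not from my own session.

```r
# Each assignment creates an object that then shows up in the "Environment" tab
participant_age <- 21                          # a single number
condition <- c("control", "treatment")         # a character vector
survey <- data.frame(id = 1:3,                 # a small data frame
                     score = c(12, 15, 9))

# ls() lists the names of the objects currently defined in your session
ls()
```

After running these lines, the three objects should be listed in the “Environment” tab along with a preview of their values.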
The “History” tab keeps track of the commands that you recently typed. To help with visualization, it will look something like this:
4. Files/ Plots/ Packages/ Help
The “Files” tab shows you a directory of the files that you have on your hard drive. One useful function of this tab is that it allows you to set a folder that you would like to save files to. To do this, navigate to the folder and click on “More” and select “Set As Working Directory” from the dropdown.
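The same thing can be done in code with `getwd()` and `setwd()`. This is only a sketch: it uses R's temporary folder (`tempdir()`) as a stand-in so that it runs on any computer, and you would replace `project_dir` with the path to your own folder.

```r
# Show the folder R is currently working in
getwd()

# Stand-in folder for this example; replace with your own path
project_dir <- tempdir()

# setwd() changes the working directory and returns the previous one
old_dir <- setwd(project_dir)
getwd()

# Switch back to where we started
setwd(old_dir)
```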
The “Plots” tab shows you the plots that you have created. You can choose to export a plot as a PDF or as an image file (for example, .jpeg), or simply copy the plot and paste it where you desire.
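Plots can also be created and saved entirely in code. Below is a minimal sketch that draws a histogram and writes it to a PDF file; the scores and the file name `scores_hist.pdf` are made up, and the file is saved in R's temporary folder so the example runs anywhere.

```r
# Made-up scores for illustration
scores <- c(2, 4, 4, 5, 7, 9, 12)

# Draw the histogram; in RStudio it appears in the "Plots" tab
hist(scores, main = "Distribution of scores", xlab = "Score")

# Save the same plot as a PDF: open a PDF device, draw, then close the device
out_file <- file.path(tempdir(), "scores_hist.pdf")
pdf(out_file)
hist(scores, main = "Distribution of scores", xlab = "Score")
dev.off()
```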
The “Packages” tab shows you the R packages that are currently installed on your computer. R packages are collections of functions that are ready for use once downloaded. For instance, installing the “psych” R package allows you to calculate kurtosis. To ensure your desired package is loaded for use in your R session, make sure that the box next to it has been checked.
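Installing and loading a package can also be done in code, which is handy when you want your script to work on another computer. The sketch below assumes you want the “psych” package mentioned above; the `scores` values are made up, and the installation line is commented out because it needs an internet connection and only has to be run once.

```r
# Run once to download the package from CRAN (needs an internet connection)
# install.packages("psych")

# Check whether the package is installed before trying to use it
has_psych <- requireNamespace("psych", quietly = TRUE)

if (has_psych) {
  # library() loads the package; same effect as ticking its box in the tab
  library(psych)

  scores <- c(2, 4, 4, 5, 7, 9, 12)  # made-up data
  print(kurtosi(scores))             # kurtosis function from the psych package
} else {
  message("The psych package is not installed yet.")
}
```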
The “Help” tab allows you to look up different R functions. For instance, if I want to learn more about the “psych” R package, I can type “psych” into the search bar.
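You can also ask for help directly from code. As a small sketch, the lines below use two built-in functions, `mean()` and `sd()`, purely as examples:

```r
# Open the documentation page for mean(); typing ?mean does the same thing
help("mean")

# Print a quick reminder of the arguments a function accepts
args(sd)
```

In RStudio, the first line should open the documentation in the “Help” tab.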
How do I import my existing dataset into R?
Go to the “Environment” tab and click on “Import Dataset”. From the dropdown, you can select which file format to import. R supports a variety of data files, from Excel files to CSV files.
When you select an option, this screen will pop up:
You can use “Browse” to search for the file with the dataset you desire. You can also name the dataset for easy reference in R.
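If you prefer typing to clicking, a CSV file can also be read in with the built-in `read.csv()` function. The sketch below first writes a tiny made-up dataset to a temporary file so that it runs on any computer; with your own data you would skip that step and point `csv_path` at your file. The names `csv_path` and `my_data` are just examples. (Excel files need an add-on package such as “readxl”, which is not shown here.)

```r
# Create a small example CSV file in R's temporary folder (made-up data)
csv_path <- file.path(tempdir(), "example_scores.csv")
write.csv(data.frame(id = 1:3, score = c(12, 15, 9)),
          csv_path, row.names = FALSE)

# Read the CSV file into a data frame called my_data
my_data <- read.csv(csv_path)

# Take a quick look at what was imported
head(my_data)   # first few rows
str(my_data)    # column names and types
```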
Any final tips for now?
- R is case-sensitive, so be particular about uppercase and lowercase letters when typing your code.
- Another useful tool is the commenting function in R. You can type “#” into the “Console” or “Source” panel and everything in the same line after the “#” will be ignored by R. This allows you to take notes and keep track of your thoughts while doing data analysis.
- If ever in doubt, rely on Google. The wonderful thing about open-source programmes is that since they are so easily accessed, there will probably be people in the community who have put out information on the functions that you require.
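The first two tips can be seen in a short sketch. The object names `score` and `Score` are made up for illustration.

```r
# This whole line is a comment, so R ignores it

score <- 10   # everything after the "#" on this line is ignored too
Score <- 20   # capital S: R treats this as a completely different object

score == Score     # FALSE, because the two objects hold different values

mean(c(1, 2, 3))   # works and gives 2
# MEAN(c(1, 2, 3)) would fail: R has no function called MEAN
```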
Have fun exploring the new world of data analysis through R and stay curious!