Using R
This guide describes several ways to use R on the CQUni HPC system.
R is a popular programming language which is commonly used for statistical computing and graphics. R can be used in different stages of research such as data visualisation, cleaning and analysis.
To run R programs on CQUniversity’s High Performance Computing system, we can use a graphical integrated development environment (IDE), such as RStudio, or run R directly from the command line. We can also submit R jobs to the HPC scheduler to run many R programs non-interactively.
To use the R software on the HPC, you will need the following:
- Access to the HPC system (Contact HPC support if you need an account created).
- A connection to the HPC system. See Connecting to the Marie Curie Cluster for information on how to do this.
- If you plan on running RStudio on the HPC system, you will require a graphical interface to the HPC system. See Graphical Connection to the HPC System for information on how to do this.
If you already have an HPC account and are using a “graphical” connection, you should be able to start RStudio by issuing the following command (inside a terminal session):
rstudio
Note: if you have any issues starting RStudio, try the following command instead:
rstudio --disable-gpu
What versions of R are available on the HPC system?
The version of R you wish to use is most likely not the default version available when you first log on to CQUniversity’s High Performance Computing facility.
Our HPC system has a large number of versions of R available that can be loaded as part of a software module.
You can use the command module avail to list all of the HPC software that is available to load.
It is important to load the R software module you wish to use each time you open a new HPC session or command prompt/terminal, and to include it in your HPC submission scripts.
A subset of the R software modules that are available to load includes:
R
R-2.14.2
R-2.15.2
R-3.0.2
R-3.1.1
R-3.2.3
R-3.5.3
R-patched-30-01-2013
R-2.12.2
R-2.15.1
R-3.0.1
R-3.0.2-test
R-3.2.2
R-3.4.3
R-4.0.0
R/4.0.0-foss-2020a
R/4.0.4-foss-2020b
R/4.1.2-foss-2021b
R/4.0.3-foss-2020b
R/4.0.5-foss-2020b
R/4.2.0-foss-2021b
R-bundle-Bioconductor/3.11-foss-2020a-R-4.0.0
R-bundle-Bioconductor/3.12-foss-2020b-R-4.0.3
rstudio
rstudio-0.98.1049
rstudio-0.98.1103
rstudio-1.1.383
rstudio-1.1.463
rstudio-1.2.1335
rstudio-1.4.1106
rstudio-2022.07
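As a rough illustration of the steps described above, the following sketch lists the R modules, loads one, and confirms the result. The module name is an assumption copied from the list above, so substitute the version you actually need. The check for the module command is there only so that the snippet fails gracefully on a machine that does not provide software modules.

```shell
# Assumed module name, copied from the list above; substitute the version you need.
R_MODULE="R/4.2.0-foss-2021b"

# 'module' only exists on systems that provide software modules, so check first.
if command -v module >/dev/null 2>&1; then
    module avail R 2>&1      # list modules whose names match "R" (output is often sent to stderr)
    module load "$R_MODULE"  # load the chosen version into this session
    module list 2>&1         # confirm which modules are now loaded
    R --version | head -n 1  # show the version of R now on the PATH
else
    echo "The 'module' command was not found; run this on the HPC system." >&2
fi
```

The module load line is the one to repeat in every new terminal session and in every submission script.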
Running and editing R code via a graphical interactive development environment (IDE)
- Connect to the CQUni HPC system using a graphical connection; see Graphical Connection to the HPC System for instructions. If you are just editing code or submitting R jobs via the HPC scheduler, you can use the login nodes “marie” or “curie”, but if you plan on running intensive jobs, you should start an interactive session (which will place you on one of the many compute nodes).
- To launch the R IDE RStudio, you will need to do the following:
- Launch the ‘GNOME Terminal’ located on your desktop
- Ensure the R version you wish to use is loaded. If you need to load a different version of R, have a look at the HPC software page.
You can check which version of RStudio is currently loaded using:
$ which rstudio
/apps/software/rstudio/1.2.1335/bin/rstudio
Transferring files to and from the HPC system
Before you can run your R scripts, you will most likely need to upload your R programs and data from your computer to the HPC system. Instructions on how to do this can be found here.
Once you have uploaded your programs and data, you can then run the R code directly on the HPC system using the instructions provided above.
You can use the same process to download your results, and anything else you need, back to your local computer.
Sample Coding on RStudio (interactively solving R jobs)
To test that our code is working in RStudio, we can perform a basic mathematical operation such as:
> 1 + 100
That should give us the output:
[1] 101
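The same calculation can also be run without RStudio, directly from a terminal, using the Rscript command that ships with R. This is a minimal sketch that assumes an R module has already been loaded in the current session; the variable name R_EXPR is purely illustrative.

```shell
# The expression to evaluate; the same one typed into RStudio above.
R_EXPR='1 + 100'

# Rscript is only on the PATH once an R module has been loaded.
if command -v Rscript >/dev/null 2>&1; then
    Rscript -e "$R_EXPR"     # prints: [1] 101
else
    echo "Rscript was not found; load an R module first." >&2
fi
```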
Solving R jobs non-interactively
One of the benefits of using the HPC system is that you can submit anywhere from one to many jobs to the HPC scheduler. Using the HPC scheduler, you can request more resources (such as CPUs), which can dramatically reduce execution time.
To solve an R job non-interactively, you will need to create an R HPC scheduler script. Instructions on how to do this, along with some examples, can be found at R Sample Scripts.
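To give a feel for what such a script involves, here is a hedged sketch only; the R Sample Scripts page remains the authoritative reference. It assumes a PBS-style scheduler (the qsub command and #PBS directives), which this guide does not confirm. The file names hello.R and r_job.pbs, the job name, the resource values, and the module version are all hypothetical and chosen for illustration.

```shell
# A tiny R program to run as the job (hypothetical file name).
cat > hello.R <<'EOF'
result <- 1 + 100
print(result)
EOF

# A submission script. The #PBS directives assume a PBS-style scheduler;
# the resource values and module name are illustrative only.
cat > r_job.pbs <<'EOF'
#!/bin/bash
#PBS -N r_hello
#PBS -l select=1:ncpus=1:mem=1gb
#PBS -l walltime=00:05:00

# Start in the directory the job was submitted from.
cd "$PBS_O_WORKDIR"

# Load the R module inside the script, just as in an interactive session.
module load R/4.2.0-foss-2021b

Rscript hello.R
EOF

# Submit only where the scheduler client is installed.
if command -v qsub >/dev/null 2>&1; then
    qsub r_job.pbs
else
    echo "qsub was not found; r_job.pbs was written but not submitted." >&2
fi
```

Loading the module inside the submission script matters because a scheduled job starts in a fresh session that does not inherit modules loaded in your terminal.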