Introduction to R

R Tutorial

  Copyright (c) 2013-2019 Aurelien Ginolhac, UL HPC Team  <hpc-sysadmins@uni.lu>

Through this tutorial you will learn how to use R from your local machine or from one of the UL HPC platform clusters. Then, we will see how to organize and group data. Finally we will illustrate how R can benefit from multicore and cluster parallelization.

Warning: this tutorial does not focus on learning the R language itself, but aims at giving you useful start-up tips.
If you are also looking for a good tutorial on R's data structures, take a look at Hadley Wickham's page. Another book is available for free online: R for Data Science by Garrett Grolemund & Hadley Wickham.

Table of contents

  1. Pre-requisites
  2. Beginner session – dataSaurus
  3. Comparing methods for aggregating data – diamonds
    1. Speed comparison
    2. using dplyr
    3. using base
    4. using data.table
    5. benchmarking
  4. Parallel computing using HPC
    1. mandatory packages
    2. toy example
    3. t-SNE example
    4. reading data
  5. Useful links


Pre-requisites

Ensure you are able to connect to the UL HPC clusters

you MUST work on a computing node

# /!\ FOR ALL YOUR COMPILING BUSINESS, ENSURE YOU WORK ON A COMPUTING NODE
(access-iris)$> si -n 2 -t 2:00:00        # 2h interactive reservation
# OR (long version)
(access-iris)$> srun -p interactive -n 2 -t 2:00:00 --pty bash

Optional: On your local machine

First of all, let’s install R. You will find releases for various distributions available at CRAN Archive. Once installed, to use the R interactive session interface, simply open a terminal and type:

jdoe@localhost:~$ R

You may also find it handy to use the RStudio graphical IDE. RStudio embeds an R shell where you can call R functions as in the interactive session interface, so you can use either the plain R interactive shell or RStudio's embedded shell.

Installing R Packages

To install packages, use the install.packages() function, e.g.

install.packages("tidyverse")

This will install the core packages of the tidyverse, including ggplot2 and dplyr.

Note: on the first run, R might ask you various questions during the installation, e.g. selecting a CRAN mirror to use for downloading packages. Select a mirror close to your location. For the other questions, the default values are fine.
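If you want to skip the interactive mirror question, you can pass a repository explicitly; the cloud mirror below is only an example:

# specifying a repository avoids the interactive mirror selection
install.packages("tidyverse", repos = "https://cloud.r-project.org")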

Now, load a package with a library() call:

library(ggplot2)

A call to the sessionInfo() function will display the ggplot2 version, as the package is now attached to the current session.

Beginner session – dataSaurus

In R or RStudio, run install.packages("tidyverse") (this takes some time).

instructions: in this tutorial

Comparing methods for aggregating data – diamonds

Let’s say we are working with the full diamonds dataset (supplied by the ggplot2 package) and we want to compute the average price for a given diamond cut.

This dataset is not small:

dim(diamonds)
## [1] 53940    10

Before computing the average price per cut, we should explore the dataset. We won’t use ggplot’s geom_point() function as there are too many dots. However, binning the data is fast and still provides an idea of where the majority of dots lie, with a chosen loss of precision (the number of bins, set to 30 by default).

From previous plots, we know the relationship between carat and price is linear in log-space. All those computations can be done by ggplot2.

Of note: you may need to install the package hexbin

ggplot(diamonds) + 
  # bin data into 30 sets
  geom_hex(aes(x = carat, y = price), bins = 30) +
  # split per cut
  facet_wrap(~ cut) +
  # log transform both axes
  scale_x_log10() +
  scale_y_log10() +
  # add logsticks to bottom and left
  annotation_logticks(side = "bl") +
  # viridis is so much better than the default blue gradient
  scale_fill_viridis_c() +
  theme_minimal(14)

#ggsave(file = "diamonds_plot.pdf", last_plot(), width = 8, height = 4)

Speed comparison of different tools for aggregating data

We could write a for loop to aggregate the data per cut and manually compute the average price, but in R explicit loops are generally a bad idea (they are fine if you avoid growing vectors). It is better to concentrate on the action rather than the programming, and let the looping be done efficiently internally.
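For contrast only, a pre-allocated loop version of the same aggregation could look like the sketch below (the variable names are ours):

library(ggplot2)  # for the diamonds dataset
cuts <- levels(diamonds$cut)
avg_price <- numeric(length(cuts))   # pre-allocate instead of growing a vector
for (i in seq_along(cuts)) {
  avg_price[i] <- mean(diamonds$price[diamonds$cut == cuts[i]])
}
data.frame(cut = cuts, avg_price = avg_price)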

Thus, instead of looping over the dataset ourselves, we will use functions from the dplyr package, part of the tidyverse idiom.

dplyr from the tidyverse

First of all, we chain the commands using the pipe %>%. That avoids nested parentheses and keeps the natural reading from left to right.
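As a quick illustration, the pipe simply forwards its left-hand side as the first argument of the next call, so the two lines below are equivalent:

library(ggplot2)  # for the diamonds dataset
library(dplyr)    # provides %>% (re-exported from magrittr)
head(diamonds, 3)
diamonds %>% head(3)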

The first step takes the dataset, the second groups it by the column we want to aggregate on (cut), and the third calls summarise() to compute the average price for each group.

library(dplyr)
diamonds %>%
  group_by(cut) %>%
  summarise(avg_price = mean(price))
## # A tibble: 5 x 2
##   cut       avg_price
##   <ord>         <dbl>
## 1 Fair          4359.
## 2 Good          3929.
## 3 Very Good     3982.
## 4 Premium       4584.
## 5 Ideal         3458.

Note: summarise() from the dplyr package is similar to aggregate() from the base package; dplyr functions simply provide a more consistent naming convention together with better performance.

aggregate from base

aggregate(price ~ cut,
          FUN = mean,
          data = diamonds)
##         cut    price
## 1      Fair 4358.758
## 2      Good 3928.864
## 3 Very Good 3981.760
## 4   Premium 4584.258
## 5     Ideal 3457.542

lapply from base

In the previous example we used aggregate() for the aggregation; we could also have used lapply() (but in a more convoluted way):

as.data.frame(cbind(cut = as.character(unique(diamonds$cut)),
                    avg_price = lapply(unique(diamonds$cut),
                                       function(x) mean(diamonds$price[which(diamonds$cut == x)]))))
##         cut avg_price
## 1     Ideal  3457.542
## 2   Premium  4584.258
## 3      Good  3928.864
## 4 Very Good   3981.76
## 5      Fair  4358.758

data.table

data.table is a dependency-free package that is extremely fast. Although its syntax is harder to learn than dplyr’s, if speed is your concern it is your go-to package. See this long thread on Stack Overflow for a comparison of both tools.

# install.packages("data.table")
suppressPackageStartupMessages(library(data.table))
DT <- data.table(diamonds)
DT[, mean(price), by = cut] 
##          cut       V1
## 1:     Ideal 3457.542
## 2:   Premium 4584.258
## 3:      Good 3928.864
## 4: Very Good 3981.760
## 5:      Fair 4358.758

benchmarks

So, we want to know which of these versions is the most efficient; for that purpose, the bench package is handy.

install.packages("bench")

The mark() function runs each of several expressions a given number of times and records their performance (time and resources used).

library(bench)
set.seed(123)
m <- bench::mark(LAPPLY    = as.data.frame(cbind(cut = as.character(unique(diamonds$cut)),
                                                 price = lapply(unique(diamonds$cut), function(x) mean(diamonds$price[which(diamonds$cut == x)])))),
                 AGGREGATE = aggregate(price ~ cut, FUN = mean, data = diamonds),
                 DPLYR     = group_by(diamonds, cut) %>% summarise(price = mean(price)),
                 DATATABLE = DT[, list(price = mean(price)), by = cut],
                 iterations = 100, check = FALSE)
m
## # A tibble: 4 x 6
##   expression      min   median `itr/sec` mem_alloc `gc/sec`
##   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
## 1 LAPPLY       9.65ms  11.03ms      78.8    8.04MB    23.6 
## 2 AGGREGATE   39.42ms  47.82ms      21.2   11.85MB    16.0 
## 3 DPLYR        2.01ms   2.26ms     365.   211.22KB     3.68
## 4 DATATABLE    3.26ms   3.87ms     227.     1.24MB     9.46

  • make the comparison easier to read using relative values (1 is the fastest)
summary(m, relative = TRUE)
## # A tibble: 4 x 6
##   expression   min median `itr/sec` mem_alloc `gc/sec`
##   <bch:expr> <dbl>  <dbl>     <dbl>     <dbl>    <dbl>
## 1 LAPPLY      4.79   4.88      3.71     39.0      6.39
## 2 AGGREGATE  19.6   21.2       1        57.5      4.35
## 3 DPLYR       1      1        17.2       1        1   
## 4 DATATABLE   1.62   1.71     10.7       6.02     2.57
  • plotting the benchmark

You can use ggplot2 via the autoplot() function

autoplot(m)

  • plot the memory allocation
m %>%
  unnest() %>%
  filter(gc == "none") %>%
  mutate(expression = as.character(expression)) %>%
  ggplot(aes(x = mem_alloc, y = time, color = expression)) +
  geom_point() +
  theme_minimal(14)

Parallel computing using HPC

R is already available on the iris cluster as a module. The current production version is 3.4.4, but you can get the up-to-date 3.6.0 by fetching the development version of the module list Valentin Plugaru prepared. It will soon be released as the default production environment. The first step is the reservation of a resource. Connect to your favorite cluster frontend (here: iris). We assume you have already configured your .ssh/config.

iris

jdoe@localhost:~$ ssh iris-cluster

Once connected to the user frontend, book 4 cores for half an hour:

[jdoe@access2 ~]$ srun -p interactive  --time=0:30:0 -c 4 --pty bash

When the job is running and you are connected, load the R module (version 3.4.4):

[jdoe@access2 ~]$ module load lang/R

Now you should be able to invoke R and see something like this:

[jdoe@iris-081 ~]$  R

R version 3.4.4 (2018-03-15) -- "Someone to Lean On"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
[...]
Type 'q()' to quit R.
>

The sessionInfo() function gives information about the R version, loaded libraries, etc.

> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /mnt/irisgpfs/apps/resif/data/devel/default/software/numlib/OpenBLAS/0.2.20-GCC-6.4.0-2.28/lib/libopenblas_haswellp-r0.2.20.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.4.4

mandatory packages

The core package we are going to use is future by Henrik Bengtsson.

The main idea is to run expressions asynchronously. Future expressions are run according to a plan defined by the user.
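As a minimal sketch of the idea (using only functions from future itself), an expression can be sent to a background R session and its value collected later:

library(future)
plan(multisession)          # evaluate futures in separate background R sessions
f <- future(Sys.getpid())   # this expression is evaluated asynchronously
value(f)                    # blocks until the result is available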

For the last section, you need to install the following packages: future, furrr and Rtsne.

On iris, in an interactive session, load the modules to get R version 3.6.0, which is not yet in production:

module load swenv/default-env/devel
module load lang/R/3.6.0-foss-2019a-bare

You will get a warning because this environment hasn’t been fully tested yet.

After entering R using the R command, install the required packages. Don’t forget to use multi-threading (the Ncpus argument) for the compilations.

install.packages(c("future", "furrr", "purrr", "dplyr", "dbscan", "tictoc", "Rtsne"), repos = "https://cran.rstudio.com", Ncpus = 4)

Quit by typing quit() and answer n to not save the workspace image (actually, this should be your default).

The furrr package by David Vaughan is a nice wrapper around future and purrr, the functional programming idiom of the tidyverse.

toy example

Run a dummy loop over three values of 2, sleeping for that many seconds each time:

  • first sequentially
library(furrr)
## Loading required package: future
plan(sequential)
tictoc::tic()
nothingness <- future_map(c(2, 2, 2), ~Sys.sleep(.x), .progress = TRUE)
tictoc::toc()
## 6.044 sec elapsed
  • second in parallel
plan(multiprocess)
tictoc::tic()
nothingness <- future_map(c(2, 2, 2), ~Sys.sleep(.x), .progress = TRUE)
##  Progress: ──────────────────────────────────────────────── 100%
tictoc::toc()
## 2.334 sec elapsed

Fetch environment variables in R, e.g. the number of CPUs allocated by Slurm:

as.integer(Sys.getenv("SLURM_CPUS_PER_TASK"))
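Outside a Slurm job this variable is not set and the call above returns NA; a small hedged fallback sketch:

# fall back to 1 core when SLURM_CPUS_PER_TASK is not defined (e.g. on your laptop)
cpus <- as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", unset = "1"))
cpus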

t-SNE example

Now, let’s try a real example, using some data supplied as an R object.

Create both the R script and the bash launcher in a folder (likely on your scratch partition).

R script

library(dplyr)
library(purrr)
library(furrr)
library(Rtsne)

# check the path of pkgs
.libPaths()
# see how many cores are going to be used
availableCores()
# load data
tSNEdata <- readRDS("/scratch/users/aginolhac/renv/tSNEdata.rds")
# create a partial function with filled arguments 
short_tsne <- purrr::partial(Rtsne, X = tSNEdata, pca_center = TRUE, 
                            theta = 0.5, pca_scale = TRUE, verbose = TRUE, max_iter = 300)

plan(multiprocess)

tictoc::tic(msg = "tsne")
tsne_future <- tibble(perplexities = seq(10, 110, by = 5)) %>%
                 # quietly captures outputs
                 mutate(model = future_map(perplexities, ~quietly(short_tsne)(perplexity = .x), .progress = FALSE))
tictoc::toc()


tictoc::tic("finding clusters")
res_tib <- mutate(tsne_future,
                  # unlist and extract the 2D matrix 
                  Y = map(model, pluck, "result", "Y"),
                  # convert to a dataframe
                  Y_df = map(Y, as.data.frame),
                  # for clustering, parallelise since expensive step
                  cluster = future_map(Y_df, dbscan::hdbscan, minPts = 200),
                  # extract from hdbscan object only the cluster info
                  c = map(cluster, pluck, 1),
                  # iterate though the 2D coordinates and cluster info to merge them
                  tsne = map2(Y_df, c, ~ bind_cols(.x, tibble(c = .y))),
                  # extract the output of tsne runs if needed to be parsed
                  output = map_chr(model, pluck, "output"))
tictoc::toc()

saveRDS(res_tib, "tsne_future.rds")

Save it as a file named tsne.R.

launcher for 10 minutes on one full node (28 cores)

#!/bin/bash -l
#SBATCH -J tsne_hpc
#SBATCH -N 1
#SBATCH -c 28
#SBATCH --ntasks-per-node=1
#SBATCH --time=0-00:10:00
#SBATCH -p batch
#SBATCH --qos=qos-batch

echo "== Starting run at $(date)"
echo "== Job ID: ${SLURM_JOBID}"
echo "== Node list: ${SLURM_NODELIST}"
echo "== Submit dir. : ${SLURM_SUBMIT_DIR}"

# use version 3.6.0 and load the GNU toolchain
module load swenv/default-env/devel
module load lang/R/3.6.0-foss-2019a-bare

# prevent sub-spawn, 28 cores -> 28 processes
export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=1

Rscript tsne.R > job_${SLURM_JOB_NAME}.out 2>&1

Save it as a file named launcher.sh. It requests a full node of 28 cores.

Run the job

On the iris frontend, in the folder where both tsne.R and launcher.sh are located, type the Slurm command:

sbatch launcher.sh

You can monitor the queue with:

squeue -l -u $USER

When your passive job has started, you can connect to the assigned node using the sjoin command (by Valentin Plugaru) on the frontend; use tab-completion to get the correct job and node ids.

Once logged in, you can check that the job is indeed running on the 28 cores, as the htop output below shows,

and that they are duly full processes.

furrr allows you to simply change your map() functions to future_map() to run them in parallel; future takes charge of all the spawning and fetching of the processes.
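Schematically, the only change is the function name; the slow_task function below is a hypothetical placeholder:

library(purrr)
library(furrr)
plan(multisession)
slow_task <- function(x) { Sys.sleep(1); x^2 }   # hypothetical stand-in for real work
res_seq <- map(1:4, slow_task)          # sequential: about 4 seconds
res_par <- future_map(1:4, slow_task)   # parallel: same interface, about 1 second on 4+ workers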

Back to the t-SNE job: it should give the following timing, looking at the job_tsne_hpc.out file:

tsne: 16.103 sec elapsed
finding clusters: 2.395 sec elapsed
trying the sequential version
  • replace the future_map() calls with map()

sed 's/future_map/map/' tsne.R > tsne_nofuture.R

  • adapt launcher script

sed 's/tsne.R/tsne_nofuture.R/' launcher.sh > launcher_nofuture.sh

  • submit the job

sbatch launcher_nofuture.sh

  • new timing, same job_tsne_hpc.out file:
tsne: 214.913 sec elapsed
finding clusters: 28.043 sec elapsed

Conclusion

See below the benefit of using more cores on the elapsed time for computationally intensive tasks such as t-SNE (code in furrr_benchmark.Rmd).

Animation

Just for the fun of it, using the gganimate package, we can visualise how the t-SNE embedding evolves with increasing perplexities.

If of interest, the code is as follows (installation required for gganimate and gifski):

Beware: the animation takes ~4 minutes to complete.

library(gganimate)  # attaches ggplot2 as well
library(dplyr)
library(tidyr)      # for unnest()
library(readr)      # for read_rds()

res <- read_rds("tsne_future.rds")

res %>%
  rename(cdbl = c) %>%
  filter(perplexities %in% seq(10, 110, 20)) %>% 
  unnest(tsne) %>%
  select(-output) %>%
  group_by(perplexities) %>%
  mutate(n = row_number()) -> tib_light

tictoc::tic("tweening")
tib_light %>%
  ggplot(aes(x = V1, y = V2, group = n, colour = factor(c))) +
  geom_point(data = function(x) filter(x, c == 0), colour = "grey50", alpha = 0.2) +
  geom_point(data = function(x) filter(x, c != 0), alpha = 0.6) +
  transition_states(perplexities, transition_length = 5, state_length = 1) +
  ease_aes('cubic-in-out') +
  labs(title = "perplexities: {closest_state}") +
  labs(colour = "cluster") +
  coord_equal() +
  theme_minimal(14) +
  theme(legend.position = "bottom") -> tib_anim
tictoc::toc()
tictoc::tic("animate")
tib_gif <- animate(tib_anim, nframes = 200, fps = 10)
tictoc::toc()
tib_gif
anim_save("tsne_200.gif", tib_gif)

Reading data

For the t-SNE example, we cheated a bit by loading a compressed R object I prepared in advance (.rds). It was convenient because loading was fast and the columns and numbers were correctly typed.
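For reference, this is how such an object is typically written and read back; the tSNEdata object and the path are illustrative:

# write the object once (assuming tSNEdata already exists in the session)...
saveRDS(tSNEdata, "tSNEdata.rds")
# ...and read it back later, with types and dimensions preserved
tSNEdata <- readRDS("tSNEdata.rds")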

In real life though, most datasets come as text files with some delimiter. Now we can load the equivalent of the t-SNE data as a .csv file and benchmark the main tools for this step.

  • read.csv() is the base function. It turns out to be slow for large files.
  • fread() is the fantastic and fast import function from data.table.
  • read_csv() is the tidyverse way of reading a csv. Faster than base, slower than data.table.
  • vroom() is from a recent package that could become the next standard for reading files in the tidyverse dialect. Check the benchmarks for a detailed comparison. It uses the ALTREP framework introduced in R 3.5.0, which allows delaying the actual reading of the data, making it super fast. Materialization of the data happens only when you need the actual columns and rows.

Here we will benchmark reading only numbers since it is a matrix of doubles.

In your session on iris, in the R console,

  • first install required packages
install.packages(c("vroom", "ggplot2", "cowplot", "data.table", "readr", "bench", "ggbeeswarm"), Ncpus = 4)

paste the following code:

library(dplyr)
library(ggplot2)
library(readr)

# must set the thread count by hand since parallel::detectCores() would detect 28 cores
# and the oversubscription overhead would hurt performance
Sys.setenv(VROOM_THREADS = as.integer(Sys.getenv("SLURM_CPUS_PER_TASK")))
# must be 4
as.integer(Sys.getenv("VROOM_THREADS"))

tSNEdata_location <- "tSNEdata.csv"
bench::mark(base = read.csv(tSNEdata_location),
            readr = readr::read_csv(tSNEdata_location, col_types = cols(), progress = FALSE),
            data.table = data.table::fread(tSNEdata_location),
            dat_vroom = vroom::vroom(tSNEdata_location, col_types = cols(), progress = FALSE),
            dat_vroom_altrep = vroom::vroom(tSNEdata_location, altrep_opts = TRUE, col_types = cols(), progress = FALSE),
            check = FALSE, iterations = 50) -> m
# show tables, absolute and relative
m
summary(m, relative = TRUE)
# create a plot of both speed and memory allocations, align the vertical lines
p <- cowplot::plot_grid(autoplot(m),
                        m %>%
                          tidyr::unnest() %>%
                          filter(gc == "none") %>%
                          mutate(expression = as.character(expression)) %>%
                          ggplot(aes(x = mem_alloc, y = time, color = expression)) +
                          geom_point() +
                          theme_minimal(14), ncol = 1, align = "v")
# save as pdf
ggsave("csv_benchmarks.pdf", p, width = 8)
# A tibble: 5 x 13
  expression           min  median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc
  <bch:expr>       <bch:t> <bch:t>     <dbl> <bch:byt>    <dbl> <int> <dbl>
1 base             205.7ms 210.1ms      4.70        0B    2.35     34    17
2 readr            171.7ms 172.9ms      5.77        0B    0.641    45     5
3 data.table        80.9ms  86.3ms     11.6         0B    1.01     46     4
4 dat_vroom         61.8ms  65.3ms     14.3         0B    1.25     46     4
5 dat_vroom_altrep  42.7ms  47.5ms     21.2         0B    1.84     46     4
# … with 5 more variables: total_time <bch:tm>, result <list>, memory <list>,
#   time <list>, gc <list>

# A tibble: 5 x 13
  expression         min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc
  <bch:expr>       <dbl>  <dbl>     <dbl>     <dbl>    <dbl> <int> <dbl>
1 base              4.82   4.43      1          NaN     3.67    34    17
2 readr             4.03   3.64      1.23       NaN     1       45     5
3 data.table        1.90   1.82      2.46       NaN     1.57    46     4
4 dat_vroom         1.45   1.38      3.05       NaN     1.94    46     4
5 dat_vroom_altrep  1      1         4.51       NaN     2.88    46     4
# … with 5 more variables: total_time <bch:tm>, result <list>, memory <list>,
#   time <list>, gc <list>

By default the ALTREP framework in vroom is used only for character columns. vroom is multi-threaded and will use the 4 cores we have (on iris we need to set this up manually) and performs similarly to data.table, which was designed to be efficient for numeric data. By activating the option for all data types (altrep_opts = TRUE), vroom becomes even faster than data.table, but it means that accessing the data later (materialization) will be slower. However, for characters, vroom is faster even after materialization; see the benchmarks done by the author Jim Hester (also maintainer of readr).
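A hedged sketch of what this lazy behaviour looks like in practice (file path as in the benchmark above):

# lazy (ALTREP) read: values are indexed but not fully parsed yet
d <- vroom::vroom("tSNEdata.csv", altrep_opts = TRUE)
# touching the values forces materialization, which is when the parsing cost is paid
mean(d[[1]])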

Enclosed R packages environment

This package is still under development but is the promising replacement for packrat. Here are some instructions extracted from the overview vignette.

  • installation (run install.packages("remotes") first if remotes is missing)
remotes::install_github("rstudio/renv")
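
A minimal sketch of the typical renv workflow described in the vignette, run from within your project:

renv::init()       # create a project-local library and an renv.lock lockfile
# ...install or update packages as usual with install.packages()...
renv::snapshot()   # record the exact package versions in renv.lock
renv::restore()    # later (e.g. on the cluster), reinstall from the lockfile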