RSuite basic workflow

WLOG Solutions

2019-06-10

Basic R Suite usage

This document presents basic R Suite usage. It covers the steps described in the sections below, from starting a new project to running a deployment package.

Got stuck?

If you are stuck feel free to contact us:

Package options

RSuite uses the following options to configure behavior:

Start a new project

To create a new project (called my_project) we have to call the following function:

RSuite::prj_start(name = 'my_project')
#>  2019-06-10 10:29:55 INFO:rsuite:Project my_project has structure compliant with current version of RSuite(v0.37.253).
#>  2019-06-10 10:29:55 INFO:rsuite:Project my_project started.
#>  2019-06-10 10:29:55 INFO:rsuite:Local GIT repository created for the project
#>  2019-06-10 10:29:55 INFO:rsuite:Puting project my_project under GIT control ...
#>  2019-06-10 10:29:55 INFO:rsuite:... done

The project is created in a folder that is not under Git/SVN control, so R Suite creates a local Git repository and puts the project under Git control, as the messages above show. To skip this step, pass TRUE for the skip_rc argument.

Run master file

Every project has a special structure. Let's change the working directory to the project we just created to check it.

setwd("my_project")

Let's see the contents of the created project folder:

cat(list.files(".", all.files = TRUE), sep = "\n")
#>  .
#>  ..
#>  .git
#>  .gitignore
#>  .Rprofile
#>  config_templ.txt
#>  deployment
#>  logs
#>  my_project.Rproj
#>  packages
#>  PARAMETERS
#>  R
#>  tests

The R folder contains master scripts, which are the execution scripts of the project. By default R Suite creates an example script R/master.R with the following contents:

# Detect proper script_path (you cannot use args yet as they are build with tools in set_env.r)
script_path <- (function() {
  args <- commandArgs(trailingOnly = FALSE)
  script_path <- dirname(sub("--file=", "", args[grep("--file=", args)]))
  if (!length(script_path)) {
    return("R")
  }
  if (grepl("darwin", R.version$os)) {
    script_path <- gsub("~\\+~", " ", script_path) # on MacOS ~+~ in path denotes whitespace
  }
  return(normalizePath(script_path))
})()

# Setting .libPaths() to point to libs folder
source(file.path(script_path, "set_env.R"), chdir = T)

config <- load_config()
args <- args_parser()

To check if everything is working properly run the R/master.R script:

source("R/master.R")
print("master.R sourced successfully.")
#>  [1] "master.R sourced successfully."

You should not see any error messages.
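The script_path detection at the top of R/master.R can be tried in isolation. The sketch below wraps the same sub/grep/dirname logic in a hypothetical helper, detect_script_path (not part of R Suite), which takes the argument vector explicitly instead of calling commandArgs(). The macOS handling and normalizePath() are left out for brevity, and the argument values are made up.

```r
# Hypothetical helper (not part of R Suite): same detection logic as in
# master.R, but with the argument vector passed in explicitly.
detect_script_path <- function(args) {
  # Rscript passes the script as "--file=<path>"; keep only its folder
  path <- dirname(sub("--file=", "", args[grep("--file=", args)]))
  if (!length(path)) {
    return("R")  # no --file= argument, e.g. in an interactive session
  }
  path
}

# Arguments roughly as Rscript would pass them (made-up values)
from_rscript <- detect_script_path(c("Rscript", "--file=/tmp/my_project/R/master.R"))

# Arguments of an interactive session: the helper falls back to "R"
interactive_case <- detect_script_path(c("R", "--no-save"))
```

This is why source("R/master.R") works from the project root: without a --file= argument the script falls back to the relative folder R.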

Add first package

R Suite forces the users to keep logic in packages. To create a package (called mypackage) call the following function:

RSuite::prj_start_package(name = "mypackage", skip_rc = TRUE)
#>  2019-06-10 10:29:56 INFO:rsuite:Package mypackage started in project my_project.

Add custom package to master script

Open R/master.R in any editor and change it to look like this:

# Detect proper script_path (you cannot use args yet as they are build with tools in set_env.r)
script_path <- (function() {
  args <- commandArgs(trailingOnly = FALSE)
  script_path <- dirname(sub("--file=", "", args[grep("--file=", args)]))
  if (!length(script_path)) {
    return("R")
  }
  if (grepl("darwin", R.version$os)) {
    script_path <- gsub("~\\+~", " ", script_path) # on MacOS ~+~ in path denotes whitespace
  }
  return(normalizePath(script_path))
})()

# Setting .libPaths() to point to libs folder
source(file.path(script_path, "set_env.R"), chdir = T)

config <- load_config()
args <- args_parser()

library(mypackage)

You can check whether your package is visible to the master script by running it:

source('R/master.R')
#>  Error in library(mypackage): there is no package called 'mypackage'

Note the error saying that there is no package called mypackage. This is expected: in R a package has to be installed before it can be loaded.
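As a side note, whether a package is installed can be checked without triggering an error. The sketch below uses base R only; is_installed is a hypothetical helper name, and the second package name is assumed not to exist.

```r
# Hypothetical helper: TRUE if a package can be found in the current
# library paths, FALSE otherwise.
is_installed <- function(pkg) {
  # system.file() returns "" when the package is not installed
  nzchar(system.file(package = pkg))
}

has_stats   <- is_installed("stats")                     # shipped with R
has_missing <- is_installed("surely.not.installed.pkg")  # assumed absent
```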

Building custom packages

Adding a package to the project is not enough to use it. You have to build it. You can do this by calling the following function:

RSuite::prj_build()
#>  2019-06-10 10:29:56 INFO:rsuite:Installing mypackage (for R 3.5) ...
#>  2019-06-10 10:29:56 WARNING:rsuite:Document building for mypackage failed: Error in loadNamespace(name): there is no package called 'devtools'
#>  2019-06-10 10:29:56 ERROR:rsuite:Failed to install project packages: mypackage
#>  Error: Failed to install project packages: mypackage

Now you can check whether your master script has access to the package mypackage:

source('R/master.R')
#>  Error in library(mypackage): there is no package called 'mypackage'
print("master.R sourced successfully and loaded mypackage.")
#>  [1] "master.R sourced successfully and loaded mypackage."

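If the build succeeded you should not see any error messages. In the run captured above the build failed because devtools was not yet available (see the WARNING reported by prj_build), so library(mypackage) still fails. The build succeeds after the Install dependencies step later in this document.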

Adding function to a custom package

Let's add a function hello_world to our package mypackage. To do this, create a new file packages/mypackage/R/hello_world.R with the following content:

#' @export
hello_world <- function(name) {
    sprintf("Hello %s!", name)
}

Please remember to add #' @export. It is required to export the function from the package namespace, so that code which loads the package (such as the master script) can call it.
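The #' lines are roxygen comments, so the same header can also carry documentation. The sketch below shows one possible fuller header; @param, @return and @examples are standard roxygen2 tags, while the wording of the descriptions is only illustrative.

```r
#' Greet a person by name
#'
#' @param name Character string with the name to greet.
#' @return A character string with the greeting.
#' @examples
#' hello_world("John")
#' @export
hello_world <- function(name) {
  sprintf("Hello %s!", name)
}

greeting <- hello_world("John")
```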

Now you can change master script by adding one line to it:

# Detect proper script_path (you cannot use args yet as they are build with tools in set_env.r)
script_path <- (function() {
  args <- commandArgs(trailingOnly = FALSE)
  script_path <- dirname(sub("--file=", "", args[grep("--file=", args)]))
  if (!length(script_path)) {
    return("R")
  }
  if (grepl("darwin", R.version$os)) {
    script_path <- gsub("~\\+~", " ", script_path) # on MacOS ~+~ in path denotes whitespace
  }
  return(normalizePath(script_path))
})()

# Setting .libPaths() to point to libs folder
source(file.path(script_path, "set_env.R"), chdir = T)

config <- load_config()
args <- args_parser()

library(mypackage)

hello_world("John")

In order to check if everything works, run the master script:

source('R/master.R')
#>  Error in library(mypackage): there is no package called 'mypackage'

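The run fails because the package has not been rebuilt since hello_world was added, so the master script cannot use the new function yet. (In the run captured above mypackage was never installed, so the error is already raised by library(mypackage).)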

Rebuild custom package

You have to rebuild packages to have all the functionality available to master scripts. You do it with the following function call.

RSuite::prj_build()
#>  2019-06-10 10:29:57 INFO:rsuite:Installing mypackage (for R 3.5) ...
#>  2019-06-10 10:29:57 WARNING:rsuite:Document building for mypackage failed: Error in loadNamespace(name): there is no package called 'devtools'
#>  2019-06-10 10:29:57 ERROR:rsuite:Failed to install project packages: mypackage
#>  Error: Failed to install project packages: mypackage

And check if R/master.R works:

source('R/master.R', print.eval = TRUE)
#>  Error in library(mypackage): there is no package called 'mypackage'

Adding dependencies

You can add dependencies to external packages in two ways:

  1. Recommended - using imports in DESCRIPTION file in each package
  2. Not recommended - using library or require in master scripts

To add a dependency on an external package, edit the file packages/mypackage/DESCRIPTION as shown below:

Package: mypackage
Type: Package
Title: What the package does (short line)
Version: 0.1
Date: 2019-06-10
Author: ws171913
Maintainer: Who to complain to <yourfault@somewhere.net>
Description: More about what it does (maybe more than one line)
License: What license is it under?
Imports: logging
Depends: data.table (>= 1.10.1)

We have added data.table (>= 1.10.1) to the Depends field. This declares that mypackage depends on the data.table package in version 1.10.1 or newer.
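To make the meaning of such a version constraint concrete, the sketch below parses the specification with base R and checks two version numbers against it. It is only an illustration of the ">=" rule, not the way R Suite resolves dependencies; satisfies is a hypothetical helper and the version numbers passed to it are made up.

```r
# Dependency specification as written in DESCRIPTION
spec <- "data.table (>= 1.10.1)"

# Split it into package name, operator and required version
parts <- regmatches(
  spec,
  regexec("^([[:alnum:].]+) *\\((>=) *([0-9.-]+)\\)$", spec)
)[[1]]
pkg_name <- parts[2]
required <- package_version(parts[4])

# Hypothetical helper: does a given version satisfy the constraint?
satisfies <- function(version) package_version(version) >= required

newer_ok  <- satisfies("1.12.2")  # newer than required
older_bad <- satisfies("1.9.6")   # older than required
```

Note that version components are compared numerically, so 1.9.6 is older than 1.10.1 even though it sorts later as text.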

Let's rebuild the package so that master scripts see the changes:

RSuite::prj_build()
#>  2019-06-10 10:29:57 ERROR:rsuite:Some dependencies are not installed in project env: data.table.Please, install dependencies(Call RSuite::prj_install_deps)
#>  Error: Some dependencies are not installed in project env: data.table.Please, install dependencies(Call RSuite::prj_install_deps)

You can conclude that you have to install dependencies to build your package.

Install dependencies

To install dependencies you have to call the following function:

RSuite::prj_install_deps()
#>  2019-06-10 10:29:57 INFO:rsuite:Detecting repositories (for R 3.5)...
#>  2019-06-10 10:30:00 INFO:rsuite:Will look for dependencies in ...
#>  2019-06-10 10:30:00 INFO:rsuite:.          MRAN#1 = https://mran.microsoft.com/snapshot/2019-06-10 (win.binary, source)
#>  2019-06-10 10:30:00 INFO:rsuite:Collecting project dependencies (for R 3.5)...
#>  2019-06-10 10:30:00 INFO:rsuite:Resolving dependencies (for R 3.5)...
#>  2019-06-10 10:30:01 INFO:rsuite:Detected 1 dependencies to install. Installing...
#>  2019-06-10 10:30:05 INFO:rsuite:All dependencies successfully installed.
#>  2019-06-10 10:30:24 INFO:rsuite:Detected 49 support packages to install. Installing...
#>  2019-06-10 10:31:44 INFO:rsuite:All support packages successfully installed.

From this output you can see that we use MRAN as our package repository. Moreover R Suite detected 1 dependency to be installed.

You can check if installation succeeded by calling the following function:

RSuite::prj_build()
#>  2019-06-10 10:31:44 INFO:rsuite:Installing mypackage (for R 3.5) ...
#>  2019-06-10 10:31:52 INFO:rsuite:Successfuly installed 1 packages

Let's check what happens when you run the master script:

source('R/master.R', print.eval = TRUE)
#>  Loading required package: data.table
#>  [1] "Hello John!"

The output says that data.table was loaded, which is exactly what we wanted.

Developing custom package using devtools

When you develop a package, the develop-build cycle can take too long, especially for bigger packages. You can use devtools to speed up this process. Let's go through it.

Let's change the file packages/mypackage/R/hello_world.R in mypackage to look like this:

#' @export
hello_world <- function(name) {
    sprintf("Hello %s! Good to see you again.", name)
}

Now we can load the changed package for testing, without rebuilding it, with the following command:

devtools::load_all("packages/mypackage")
#>  Loading mypackage

Let's see how the hello_world function behaves now:

hello_world("John")
#>  [1] "Hello John! Good to see you again."

As you can see, no package rebuild was required and the changed package was reloaded.

Loggers in master scripts

R Suite promotes good programming practices, and using loggers is one of them. Logging in R Suite is based on the logging package.

Let's update R/master.R as follows:

# Detect proper script_path (you cannot use args yet as they are build with tools in set_env.r)
script_path <- (function() {
  args <- commandArgs(trailingOnly = FALSE)
  script_path <- dirname(sub("--file=", "", args[grep("--file=", args)]))
  if (!length(script_path)) {
    return("R")
  }
  if (grepl("darwin", R.version$os)) {
    script_path <- gsub("~\\+~", " ", script_path) # on MacOS ~+~ in path denotes whitespace
  }
  return(normalizePath(script_path))
})()

# Setting .libPaths() to point to libs folder
source(file.path(script_path, "set_env.R"), chdir = T)

#library(mypackage)

#hello_world("Jony")

loginfo("Master info")
logdebug("Master debug")
logwarn("Master warning")
logerror("Master error")

Let's check how it works:

source('R/master.R')
#>  2019-06-10 10:31:53 INFO::Master info
#>  2019-06-10 10:31:53 WARNING::Master warning
#>  2019-06-10 10:31:53 ERROR::Master error

As you can see, the logging messages are printed. The debug message is missing because by default the logging level is set to show only messages at INFO level and higher.
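The filtering described above can be pictured as a simple threshold comparison. The sketch below is a base R illustration only and does not use the logging package; the numeric severities and the is_shown helper are assumptions made for this example.

```r
# Illustrative numeric severities (assumed for this sketch)
severity <- c(DEBUG = 10, INFO = 20, WARNING = 30, ERROR = 40)

# Hypothetical helper: is a message at `level` shown under `threshold`?
is_shown <- function(level, threshold = "INFO") {
  severity[[level]] >= severity[[threshold]]
}

# Levels that pass the default INFO threshold
shown_at_info <- names(severity)[vapply(names(severity), is_shown, logical(1))]

# Levels that pass once the threshold is lowered to DEBUG
shown_at_debug <- names(severity)[vapply(names(severity), is_shown, logical(1),
                                          threshold = "DEBUG")]
```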

Controlling loggers level

To see the debug message, change the project configuration file. The project configuration is stored in the config.txt file in the project root folder. Please change the file to look like this:

LogLevel: DEBUG

Let's check how it works:

source('R/master.R')
#>  2019-06-10 10:31:53 INFO::Master info
#>  2019-06-10 10:31:53 WARNING::Master warning
#>  2019-06-10 10:31:53 ERROR::Master error

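With LogLevel set to DEBUG, the debug message (Master debug) is expected to be printed as well; note that the output captured above does not include it.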

Logs folder

Logging messages are stored in the logs folder, in files named after the current date. You can check this with the following command:

list.files(path = "./logs")
#>  [1] "2019_06_10.log"

When you open this log in an editor you should see content similar to this:

#>  2019-06-10 10:31:52 INFO:rsuite:Successfuly installed 1 packages
#>  2019-06-10 10:31:53 INFO::Master info
#>  2019-06-10 10:31:53 WARNING::Master warning
#>  2019-06-10 10:31:53 ERROR::Master error
#>  2019-06-10 10:31:53 INFO::Master info
#>  2019-06-10 10:31:53 WARNING::Master warning
#>  2019-06-10 10:31:53 ERROR::Master error

As you can see, this is very similar to the output you saw in the console.
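If you want to read a log from R instead of an editor, the file name can be derived from the date. The sketch below assumes the YYYY_MM_DD.log naming pattern seen in the listing above; log_file_for is a hypothetical helper, not an R Suite function.

```r
# Hypothetical helper: log file path for a given date, following the
# "YYYY_MM_DD.log" pattern seen in the logs folder listing.
log_file_for <- function(date = Sys.Date(), logs_dir = "logs") {
  file.path(logs_dir, paste0(format(as.Date(date), "%Y_%m_%d"), ".log"))
}

path <- log_file_for("2019-06-10")

# Read the log only if it exists in the current working directory
log_lines <- if (file.exists(path)) readLines(path) else character(0)
```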

Loggers in packages

R Suite allows you to use loggers in your custom packages. Let's open packages/mypackage/R/hello_world.R and change its content to the following:

#' @export
hello_world <- function(name) {
  pkg_loginfo("Package info")
  pkg_logdebug("Package debug")
  pkg_logwarn("Package warning")
  pkg_logerror("Package error")

  sprintf("Hello %s! Good to see you again.", name)
}

Let's load the changed package with devtools and see how the hello_world function behaves:

devtools::load_all("packages/mypackage")
#>  Loading mypackage
hello_world("John")
#>  2019-06-10 10:31:54 INFO:mypackage:Package info
#>  2019-06-10 10:31:54 WARNING:mypackage:Package warning
#>  2019-06-10 10:31:54 ERROR:mypackage:Package error
#>  [1] "Hello John! Good to see you again."

As you can see, there are messages from your package, marked with the package name mypackage. Please also note that, because you used devtools, you did not have to rebuild the package to see the changes.

Let's restore the log level to its default value:

logging::setLevel('INFO')

Project environment locking

RSuite allows the user to lock the project environment. It collects all dependencies’ versions and stores them in a lock file to enforce exact dependency versions in the future. To lock the project environment we have to call the following function:

RSuite::prj_lock_env()
#>  2019-06-10 10:31:54 INFO:rsuite:The project environment was locked successfully

The lock file is in the ‘deployment’ directory under the ‘env.lock’ name. It is a dcf file that stores information about packages in the local environment together with their versions. A sample record from the ‘env.lock’ file is presented below:

cat(readLines("deployment/env.lock"), sep = "\n")
#>  Package: data.table
#>  Version: 1.12.2
#>  
#>  Package: logging
#>  Version: 0.9-107
#>  
#>  Package: mypackage
#>  Version: 0.1

When dependencies are installed using RSuite::prj_install_deps(), the ‘env.lock’ file is used to detect whether any package would change its version. If that is the case, an appropriate warning message is displayed. This feature makes it possible to safely deploy packages with specific dependency versions. It prevents errors caused by newer versions of packages, which might work differently from the versions previously used in the project.
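Because the lock file is a dcf file, it can be read with base R. The sketch below recreates the content shown above in a temporary file and compares it with a set of candidate versions. It only illustrates the idea of detecting version changes and is not R Suite's implementation; the candidate versions are made up.

```r
# Recreate the lock file content shown above in a temporary file
lock_path <- tempfile(fileext = ".lock")
writeLines(c(
  "Package: data.table", "Version: 1.12.2", "",
  "Package: logging",    "Version: 0.9-107", "",
  "Package: mypackage",  "Version: 0.1"
), lock_path)

# read.dcf() returns a character matrix with one row per record
lock <- as.data.frame(read.dcf(lock_path), stringsAsFactors = FALSE)

# Made-up versions that a dependency resolution might propose
candidate <- c(data.table = "1.12.4", logging = "0.9-107", mypackage = "0.1")

# Packages whose candidate version differs from the locked one
changed <- lock$Package[lock$Version != candidate[lock$Package]]
```

Here only data.table would be reported, because its candidate version differs from the locked 1.12.2.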

To safely unlock the local project environment we use the following function:

RSuite::prj_unlock_env()
#>  2019-06-10 10:31:54 INFO:rsuite:The project environment has been unlocked.

The function deletes an existing ‘env.lock’ file.

Prepare deployment package

We can now prepare a deployment package to ship our project to production.

First let's restore the contents of R/master.R:

# Detect proper script_path (you cannot use args yet as they are build with tools in set_env.r)
script_path <- (function() {
  args <- commandArgs(trailingOnly = FALSE)
  script_path <- dirname(sub("--file=", "", args[grep("--file=", args)]))
  if (!length(script_path)) {
    return("R")
  }
  if (grepl("darwin", R.version$os)) {
    script_path <- gsub("~\\+~", " ", script_path) # on MacOS ~+~ in path denotes whitespace
  }
  return(normalizePath(script_path))
})()

# Setting .libPaths() to point to libs folder
source(file.path(script_path, "set_env.R"), chdir = T)

library(mypackage)

hello_world("John")

Now let's check that all project dependencies have been collected:

RSuite::prj_install_deps()
#>  2019-06-10 10:31:54 INFO:rsuite:Detecting repositories (for R 3.5)...
#>  2019-06-10 10:31:54 INFO:rsuite:Will look for dependencies in ...
#>  2019-06-10 10:31:54 INFO:rsuite:.          MRAN#1 = https://mran.microsoft.com/snapshot/2019-06-10 (win.binary, source)
#>  2019-06-10 10:31:54 INFO:rsuite:Collecting project dependencies (for R 3.5)...
#>  2019-06-10 10:31:54 INFO:rsuite:Resolving dependencies (for R 3.5)...
#>  2019-06-10 10:31:55 INFO:rsuite:Following installed packages will be updated: mypackage
#>  2019-06-10 10:31:55 INFO:rsuite:No dependencies to install.

As you did not add any new dependencies, R Suite detects this and does not repeat the lengthy dependency installation phase.

Let's rebuild our custom packages:

RSuite::prj_build()
#>  2019-06-10 10:31:57 INFO:rsuite:Installing mypackage (for R 3.5) ...
#>  2019-06-10 10:32:06 INFO:rsuite:Successfuly installed 1 packages

To build a deployment package use the following function (the path argument specifies where to put the deployment package):

RSuite::prj_zip(path = tempdir())
#>  2019-06-10 10:32:06 WARNING:rsuite:Project environment is not locked!
#>  2019-06-10 10:32:06 ERROR:rsuite:Failed to find HEAD branch. Is it fresh repository?
#>  Error: Failed to find HEAD branch. Is it fresh repository?

If a project is not under version control and contains only one package, R Suite chooses the default version of the deployment package to be the same as the package version (0.1). The suffix x means that it is not a real tag. In the run captured above the project sits in a freshly created local Git repository in which no HEAD branch can be found, so the version cannot be detected and the call fails.

For projects under version control, R Suite checks that the project source code is consistent with the repository state and versions the deployment package by the source code tag (a tag under Git or a revision number under SVN), without the x suffix.

If the project is found to be inconsistent with the repository (for example new/uncontrolled files or source code changes), R Suite prevents building the deployment package unless you enforce the deployment package version explicitly:

RSuite::prj_zip(path = tempdir(), zip_ver = '1.0')
#>  2019-06-10 10:32:06 WARNING:rsuite:Project environment is not locked!
#>  2019-06-10 10:32:06 INFO:rsuite:Installing mypackage (for R 3.5) ...
#>  2019-06-10 10:32:14 INFO:rsuite:Successfuly installed 1 packages
#>  2019-06-10 10:32:14 INFO:rsuite:Preparing files for zipping...
#>  2019-06-10 10:32:14 INFO:rsuite:... done. Creating zip file my_project_1.0x.zip ...
#>  2019-06-10 10:32:15 INFO:rsuite:Zip file created: C:\Users\ws171913\AppData\Local\Temp\RtmpslCi8w/my_project_1.0x.zip

You have created the file my_project_1.0x.zip, which contains everything necessary to run your solution in a production environment.

Running deployment package

To test whether the deployment package works, you can extract my_project_1.0x.zip created in the previous step into a new folder, say prod:

dir.create(path = file.path(tempdir(), "prod"), showWarnings = FALSE)
unzip(zipfile = file.path(tempdir(), "my_project_1.0x.zip"), 
      exdir = file.path(tempdir(), "prod"))

cat(list.files(path = file.path(tempdir(), "prod", "my_project")), sep = "\n")
#>  config_templ.txt
#>  libs
#>  logs
#>  R
#>  readme.txt

The readme.txt file contains the version number the project has been tagged with:

my_project v1.0x

Now you can run your solution with the command:

output <- system2(command = Sys.which("Rscript"), 
                  args = file.path(tempdir(), "prod", "my_project", "R", "master.R"),
                  stdout = TRUE)
cat(output, sep = "\n")
#>  2019-06-10 10:32:16 INFO:mypackage:Package info
#>  2019-06-10 10:32:17 WARNING:mypackage:Package warning
#>  2019-06-10 10:32:17 ERROR:mypackage:Package error
#>  [1] "Hello John! Good to see you again."

As you can see, the output is exactly what you would expect.
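When a deployment is run from another script, it is useful to check the exit status as well as the output. With stdout = TRUE, system2() reports a non-zero exit status through the "status" attribute of its result. The sketch below wraps this in a hypothetical helper, run_rscript, and uses short inline expressions in place of the deployed master.R; it assumes Rscript is available in R.home("bin").

```r
# Hypothetical helper: run an R expression with Rscript and return both
# the captured output and the exit status (0 when the run succeeded).
run_rscript <- function(expr) {
  rscript <- file.path(R.home("bin"), "Rscript")
  out <- suppressWarnings(
    system2(rscript, args = c("-e", shQuote(expr)), stdout = TRUE)
  )
  status <- attr(out, "status")  # NULL when the exit status was 0
  list(output = as.character(out),
       status = if (is.null(status)) 0L else status)
}

ok  <- run_rscript("cat(\"done\")")      # succeeds and prints one line
bad <- run_rscript("quit(status = 3)")   # exits with a non-zero status
```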