The goal of checkpoint is to solve the problem of package reproducibility in R. Specifically, checkpoint solves the problems that occur when you don’t have the correct versions of R packages. Because packages are updated on CRAN all the time, it can be difficult to recreate an environment where all your packages are consistent with some earlier state.
To solve this, checkpoint allows you to install packages from a specific snapshot date. In other words, checkpoint makes it possible to install package versions from a specific date in the past, as if you had a CRAN time machine.
With the checkpoint package, you can easily:
Using checkpoint is simple:
The checkpoint package has only a single function, checkpoint(), where you specify the snapshot date.
For example, checkpoint("2015-01-15") instructs R to install and use only package versions that existed on January 15, 2015.
To write R code for reproducibility, begin your master R script by loading the package with library(checkpoint) and then calling checkpoint() with your snapshot date.
Choose a snapshot date that includes the package versions you need for your script (or today’s date, to get the latest versions). Any package version published since September 17, 2014 is available for use.
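The date rule above can be checked before any packages are installed. The following is a minimal sketch, not part of the checkpoint API: `validate_snapshot_date()` and the `run_checkpoint` switch are hypothetical names introduced for illustration. The only facts taken from the text are the date format used in the examples and the September 17, 2014 lower bound.

```r
# Earliest snapshot date stated in the text above.
first_snapshot <- as.Date("2014-09-17")

# Hypothetical helper (not part of checkpoint): check a snapshot date.
validate_snapshot_date <- function(date) {
  parsed <- as.Date(date, format = "%Y-%m-%d")
  if (is.na(parsed)) {
    stop("Snapshot date must be in YYYY-MM-DD format: ", date)
  }
  if (parsed < first_snapshot) {
    stop("No snapshot exists before ", format(first_snapshot))
  }
  if (parsed > Sys.Date()) {
    stop("Snapshot date lies in the future: ", date)
  }
  format(parsed, "%Y-%m-%d")
}

snapshot_date <- validate_snapshot_date("2015-01-15")

# Hypothetical switch: set to TRUE to actually install packages.
run_checkpoint <- FALSE
if (run_checkpoint && requireNamespace("checkpoint", quietly = TRUE)) {
  library(checkpoint)
  checkpoint(snapshot_date)
}
```

Keeping the check separate from the checkpoint() call means a mistyped date fails immediately, before any download starts.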
Sharing your R analysis reproducibly can be as easy as emailing a single R script. Begin your script with the following commands:
Load the checkpoint package using library(checkpoint)
Call checkpoint() with your checkpoint date, e.g. checkpoint("2014-10-01")
Then send this script to your collaborators. When they run it on their machine, checkpoint performs the same steps of installing the necessary packages and creating the checkpoint snapshot folder, so the script produces the same results.
When you create a checkpoint, the checkpoint() function performs the following:
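Creates a snapshot folder for installed packages, located at ~/.checkpoint
Scans your project for the packages it uses, by searching for calls to library() and require() in your code.
Installs these packages from the MRAN snapshot into the snapshot folder using install.packages()
Points your CRAN mirror at the MRAN snapshot by setting options(repos)
This means the remainder of your script will run with the packages from a specific date.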
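To illustrate the scanning step, here is a simplified sketch of how calls to library() and require() could be found in code. It is not checkpoint's actual implementation: `find_packages()` is a hypothetical helper, and the regular expression only handles plain calls with a literal package name.

```r
# Hypothetical helper: extract package names from library()/require() calls.
find_packages <- function(code_lines) {
  # Match library( or require(, an optional quote, then a package name.
  pattern <- "(?:library|require)\\(\\s*[\"']?([A-Za-z][A-Za-z0-9.]*)[\"']?"
  hits <- unlist(regmatches(code_lines, gregexpr(pattern, code_lines, perl = TRUE)))
  # Keep only the captured package name from each match.
  unique(sub(pattern, "\\1", hits, perl = TRUE))
}

code <- c("library(MASS)", "library(foreach)", "ok <- require('stats')")
packages <- find_packages(code)
print(packages)
```

A real scanner would also need to handle cases such as `pkg::fun()` references and code chunks inside .Rmd files, which this sketch ignores.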
How checkpoint finds historic package versions
To achieve reproducibility, we create a complete snapshot of CRAN once a day on the “Managed R archived network” (MRAN) server. At midnight (UTC), MRAN mirrors all of CRAN and saves a snapshot. (MRAN has been storing daily snapshots since September 17, 2014.) This lets you “go back in time” by installing packages exactly as they were on a chosen snapshot date.
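As a rough sketch of what pointing R at a daily snapshot might look like, the code below builds a repository URL for a date and sets it temporarily. The URL layout (`https://mran.microsoft.com/snapshot/<date>`) is an assumption, not something stated above, and `snapshot_repo()` is a hypothetical helper.

```r
# Assumed URL layout for daily snapshots; not stated in the text above.
snapshot_repo <- function(date, base_url = "https://mran.microsoft.com/snapshot") {
  paste0(base_url, "/", format(as.Date(date), "%Y-%m-%d"))
}

repo <- snapshot_repo("2014-10-01")

# Temporarily point the CRAN mirror at the snapshot, then restore it.
old_options <- options(repos = c(CRAN = repo))
print(getOption("repos"))
options(old_options)
```

Nothing is downloaded here. The example only shows that a snapshot date maps to a repository setting that install.packages() would then use.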
Together, the checkpoint package and the MRAN server act as a CRAN time machine. The checkpoint() function installs the packages to a local library exactly as they were at the specified point in time. Only those packages are available to your session, thereby avoiding any package updates that came later and may have altered your results. In this way, anyone using checkpoint() can ensure the reproducibility of your scripts or projects at any time.
To revert to your default CRAN mirror and access globally-installed packages, simply restart your R session. You can also use the experimental function unCheckpoint(), which resets your .libPaths().
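If restarting the session is inconvenient, the same state can in principle be recorded and restored by hand. This is a sketch using only base R, under the assumption that the library path and the repos option are the only settings that need resetting. It is not a description of what unCheckpoint() does internally.

```r
# Record the session state before a checkpoint is created.
old_lib_paths <- .libPaths()
old_repos <- getOption("repos")

# ... a call such as checkpoint("2014-10-01") would change both settings here ...

# Manual reset: restore the recorded values.
.libPaths(old_lib_paths)
options(repos = old_repos)
```

Packages already loaded from the checkpoint library stay loaded after such a reset, so restarting R remains the more reliable option.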
# Create temporary project and set working directory
example_project <- paste0("~/checkpoint_example_project_", Sys.Date())
dir.create(example_project, recursive = TRUE)
oldwd <- setwd(example_project)
# Write dummy code file to project
cat("library(MASS)", "library(foreach)",
    sep = "\n",
    file = "checkpoint_example_code.R")
# Create a checkpoint by specifying a snapshot date
library(checkpoint)
checkpoint("2014-10-01")
# Check that CRAN mirror is set to MRAN snapshot
getOption("repos")
# Check that library path is set to ~/.checkpoint
.libPaths()
# Check which packages are installed in checkpoint library
installed.packages()
# Clean up: leave the project directory before deleting it
setwd(oldwd)
unlink(example_project, recursive = TRUE)

To install checkpoint directly from CRAN, use:
install.packages("checkpoint")
To install checkpoint directly from GitHub, use the devtools package. In your R session, run:
install.packages("devtools")
devtools::install_github("RevolutionAnalytics/checkpoint")
library("checkpoint")

Although checkpoint will scan for dependencies in .Rmd files if knitr is installed, it does not automatically install the knitr or rmarkdown packages.
To build your .Rmd files, you will have to add a script in your project that explicitly loads all the packages required to build your .Rmd files.
A line such as library(rmarkdown) may be sufficient. This should automatically resolve the dependencies on the packages knitr, yaml and htmltools.
To build your rmarkdown file, use a call to rmarkdown::render(). For example, to build a file called example.Rmd, use:
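A minimal sketch of such a call, assuming example.Rmd sits in the current working directory. The guard around render() is an addition for illustration, so the snippet does nothing harmful when rmarkdown or the file is missing.

```r
# File name taken from the text above; assumed to be in the working directory.
rmd_file <- "example.Rmd"

if (requireNamespace("rmarkdown", quietly = TRUE) && file.exists(rmd_file)) {
  # render() returns the path of the generated output file (invisibly).
  output_file <- rmarkdown::render(rmd_file)
  message("Rendered: ", output_file)
} else {
  message("Skipping render: rmarkdown or ", rmd_file, " is not available")
}
```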
https://github.com/RevolutionAnalytics/checkpoint/wiki
Post an issue on the Issue tracker at https://github.com/RevolutionAnalytics/checkpoint/issues
https://github.com/RevolutionAnalytics/checkpoint-server
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.