cpp11 internals

The development repository for cpp11 is https://github.com/r-lib/cpp11.

Code organization

cpp11 is a header-only library, so all source code exposed to users lives in inst/include. R code used to register functions and for cpp11::cpp_source() is in R/. Tests for only the code in R/ are in tests/testthat/. The rest of the code is in a separate cpp11test/ package included in the source tree. Inside cpp11test/src, the files that start with test- are C++ tests using the Catch support in testthat. In addition, there are some regular R tests in cpp11test/tests/testthat/.

Running tests

All of the tests in both packages can be run by using devtools::test() from the cpp11 root directory. However, it is sometimes more convenient to set your working directory to the cpp11test root to run only those tests. To calculate code coverage you need to call cpp11_coverage(). For example, to generate a local coverage report you could use the following:

covr::report(cpp11_coverage())

Naming conventions

Vector classes

All of the basic r_vector classes are class templates. The base template is defined in cpp11/r_vector.hpp. The template parameter is the type of value the particular R vector stores, e.g. double for cpp11::doubles. This differs from Rcpp, whose first template parameter is the R vector type, e.g. REALSXP.

The file first has the class declarations, then the function definitions further down. Specializations for the various types are in separate files, e.g. cpp11/doubles.hpp and cpp11/integers.hpp.
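As a rough illustration of this layout, the following self-contained sketch mimics its shape without using any R headers. The names toy_vector, toy_sexptype, toy_doubles and toy_integers are hypothetical stand-ins, not cpp11 or Rcpp code. The only detail taken from the text above is that the stored value type (e.g. double) is the template parameter, and that per-type pieces are supplied separately from the base template.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for R's vector type codes (REALSXP, INTSXP, ...).
enum class toy_sexptype { real, integer };

// Base template, parameterized on the stored value type, in the spirit of
// the base template in cpp11/r_vector.hpp.
template <typename T>
class toy_vector {
 public:
  explicit toy_vector(std::size_t n) : data_(n) {}

  std::size_t size() const { return data_.size(); }
  T& operator[](std::size_t i) { return data_[i]; }
  const T& operator[](std::size_t i) const { return data_[i]; }

  // Declared here, defined once per value type below. This plays the role
  // of the per-type files such as cpp11/doubles.hpp.
  static toy_sexptype type();

 private:
  std::vector<T> data_;
};

// Per-type specializations of the member declared above.
template <>
inline toy_sexptype toy_vector<double>::type() {
  return toy_sexptype::real;
}

template <>
inline toy_sexptype toy_vector<int>::type() {
  return toy_sexptype::integer;
}

// User-facing names are aliases for instantiations of the base template.
using toy_doubles = toy_vector<double>;
using toy_integers = toy_vector<int>;
```

With this shape, a user writes toy_doubles x(3); and never has to spell out a type code, which is the ergonomic difference from a template keyed on REALSXP.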

Coercion functions

There are two different coercion functions:

as_sexp() takes a C++ object and coerces it to a SEXP object, so it can be used in R.

as_cpp<>() is a template function that takes a SEXP and creates a C++ object from it.

The various methods for both functions are defined in cpp11/as.hpp.

This is definitely the most complex part of the cpp11 code, with extensive use of template metaprogramming. In particular, the "substitution failure is not an error" (SFINAE) technique is used to control overloading of the functions. If we could use C++20, a lot of this code would be made simpler with Concepts, but alas.

The most common C++ types are included in the test suite and should work without issues. As more exotic types are used in real projects, additional issues may arise.
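For readers new to SFINAE, here is a minimal sketch of the general technique in plain C++11. It is not taken from cpp11/as.hpp; the function toy_kind and the traits make_void and has_value_type are hypothetical names invented for this example. Each overload is only viable when its std::enable_if condition holds, so the compiler silently discards the others instead of reporting an error.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <vector>

// Selected only when T is an integral type (int, long, bool, ...).
template <typename T>
typename std::enable_if<std::is_integral<T>::value, std::string>::type
toy_kind(const T&) {
  return "integer";
}

// Selected only when T is a floating point type (float, double, ...).
template <typename T>
typename std::enable_if<std::is_floating_point<T>::value, std::string>::type
toy_kind(const T&) {
  return "double";
}

// C++11 helper that maps any well-formed list of types to void.
template <typename... Ts>
struct make_void {
  typedef void type;
};

// Trait: false by default, true when T has a nested value_type. If
// T::value_type does not exist, substitution into the partial
// specialization fails and the primary template is used instead.
template <typename T, typename = void>
struct has_value_type : std::false_type {};

template <typename T>
struct has_value_type<T, typename make_void<typename T::value_type>::type>
    : std::true_type {};

// Selected only for types with a value_type, e.g. standard containers.
template <typename T>
typename std::enable_if<has_value_type<T>::value, std::string>::type
toy_kind(const T&) {
  return "container";
}
```

Calling toy_kind(1) picks the first overload, toy_kind(2.5) the second, and toy_kind(std::vector<int>()) the third. With C++20 Concepts each condition could be written as a requires clause on a single declaration.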

Some useful links on SFINAE

Protection

Protect list

cpp11 uses an idea proposed by Luke Tierney: the objects cpp11 is protecting are held in a doubly linked list, and only the head of that list is preserved.

Each node in the list uses the head (CAR) part to point to the previous node, and the CDR part to point to the next node. The TAG is used to point to the object being protected. The head and tail of the list have R_NilValue as their CAR and CDR pointers respectively.

Calling protect_sexp() with a regular R object will add a new node to the list and return a protect token corresponding to the node added. Calling release_protect() on this returned token will release the protection by unlinking the node from the linked list.

With this scheme, inserting or releasing an object takes O(1) time, versus O(N) or worse with R_PreserveObject() / R_ReleaseObject().

These functions are defined in cpp11/protect.hpp.
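The list manipulation described above can be modeled in plain C++ without any R headers. This is only a sketch of the data structure: toy_node and toy_protect_list are hypothetical names, the car, cdr and tag fields imitate the CAR, CDR and TAG slots of an R cons cell, and nullptr stands in for R_NilValue. The sketch also assumes sentinel head and tail nodes and insertion directly after the head, which may differ in detail from the real implementation.

```cpp
#include <cassert>
#include <cstddef>

// Toy model of a cons cell used as a list node.
struct toy_node {
  toy_node* car;    // previous node (CAR in the real list)
  toy_node* cdr;    // next node (CDR in the real list)
  const void* tag;  // the protected object (TAG in the real list)
};

class toy_protect_list {
 public:
  // The head has no previous node and the tail has no next node.
  toy_protect_list()
      : head_{nullptr, &tail_, nullptr}, tail_{&head_, nullptr, nullptr} {}

  toy_protect_list(const toy_protect_list&) = delete;
  toy_protect_list& operator=(const toy_protect_list&) = delete;

  ~toy_protect_list() {
    toy_node* n = head_.cdr;
    while (n != &tail_) {
      toy_node* next = n->cdr;
      delete n;
      n = next;
    }
  }

  // Analogue of protect_sexp(): link a new node right after the head and
  // return it as the token. No traversal is needed, so this is O(1).
  toy_node* protect(const void* obj) {
    toy_node* next = head_.cdr;
    toy_node* node = new toy_node{&head_, next, obj};
    head_.cdr = node;
    next->car = node;
    return node;
  }

  // Analogue of release_protect(): unlink the node named by the token.
  // The token already points at both neighbors, so this is also O(1).
  void release(toy_node* token) {
    toy_node* before = token->car;
    toy_node* after = token->cdr;
    before->cdr = after;
    after->car = before;
    delete token;
  }

  // Number of protected objects (walks the list, for inspection only).
  std::size_t size() const {
    std::size_t n = 0;
    for (const toy_node* p = head_.cdr; p != &tail_; p = p->cdr) ++n;
    return n;
  }

 private:
  toy_node head_;
  toy_node tail_;
};
```

The key design point is that the token is the node itself, so releasing never has to search for the object, which is where a scan-based scheme spends its O(N) time.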

Unwind Protect

In R 3.5+ cpp11 uses R_UnwindProtect() to protect (most) calls to the R API that could fail. These are usually those that allocate memory, though in truth most R API functions could error along some paths. If an error happens under R_UnwindProtect(), cpp11 will throw a C++ exception. This exception is caught by the try/catch block defined in the BEGIN_CPP11 macro in cpp11/declarations.hpp. The exception will cause any C++ destructors to run, freeing any resources held by C++ objects. After the try/catch block exits, the R error unwinding is continued by R_ContinueUnwind() and a normal R error results.
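The control flow can be imitated in plain C++ to show why the exception matters. Everything in this sketch is a hypothetical stand-in: toy_r_api_call models an R API call that fails (reported through a return value, since a real long jump cannot be shown safely here), toy_unwind_protect models the conversion of that failure into a C++ exception, and toy_wrapper models the generated wrapper where the try/catch block lives. No R functions are called.

```cpp
#include <cassert>
#include <string>

// Stand-in for the exception thrown when an R error is detected.
struct toy_unwind_exception {
  std::string message;
};

// RAII resource whose destructor records that it ran.
class toy_resource {
 public:
  explicit toy_resource(bool& released) : released_(released) {
    released_ = false;
  }
  ~toy_resource() { released_ = true; }

 private:
  bool& released_;
};

// Stand-in for an R API call. Returns false where real R would long jump.
inline bool toy_r_api_call(bool should_fail) { return !should_fail; }

// Stand-in for the protected call: turn the R-level failure into a C++
// exception so that normal stack unwinding runs destructors.
inline void toy_unwind_protect(bool should_fail) {
  if (!toy_r_api_call(should_fail)) {
    throw toy_unwind_exception{"R error"};
  }
}

// User code that holds a resource while calling into the toy R API.
inline void toy_user_function(bool should_fail, bool& released) {
  toy_resource guard(released);
  toy_unwind_protect(should_fail);
}

// Stand-in for the generated wrapper. Returns the message of the error
// that would be handed back to R, or an empty string on success.
inline std::string toy_wrapper(bool should_fail, bool& released) {
  try {
    toy_user_function(should_fail, released);
  } catch (const toy_unwind_exception& e) {
    // Every destructor between the throw and this catch has already run.
    // Real code would now resume the R error with R_ContinueUnwind().
    return e.message;
  }
  return std::string();
}
```

In the failing case the resource is released before the error leaves toy_wrapper. A raw long jump over toy_user_function would have skipped the destructor of guard entirely.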

In R versions prior to 3.5, R_UnwindProtect() is not available. Unfortunately, the options to emulate it are not ideal.

  1. Using R_TopLevelExec() works to avoid the C long jump, but because the code is always run in a top level context, any errors or messages thrown cannot be caught by tryCatch() or similar techniques.
  2. Using R_TryCatch() is not possible prior to R 3.4, where it is not available, and it also has a serious bug in R 3.4 (fixed in R 3.5).
  3. Calling the R level tryCatch() function with an expression that runs a C function, which in turn runs the C++ code, would be an option, but implementing this is convoluted and it would impact performance, perhaps severely.
  4. Making cpp11::unwind_protect() a no-op for these versions. This means any resources held by C++ objects would leak, including cpp11::r_vector / cpp11::sexp objects.

None of these options is perfect. Here are some pros and cons for each.

  1. Causes behavior changes and test failures, so it was ruled out.
  2. Was also ruled out since we want to support back to R 3.3.
  3. Was ruled out partially because the implementation would be somewhat tricky and more because performance would suffer greatly.
  4. Is what we now do in cpp11.

We somewhat mitigate the scope of R object leaks by calling release_existing_protections() at the start of any wrapped cpp11 function. This function releases any existing protected nodes, so any R object leaks should only last until the next cpp11 call.