Yet Another Log4j Inspired Logging Framework

Stefan Fleck

2020-03-20

1 Introduction

lgr is a logging framework for R inspired by Apache Log4j and Python logging. It follows an object oriented design implemented through R6 classes. This enables lgr to have a larger set of features than other logging packages for R, and makes it flexible and easy to extend.

If you are not sure whether lgr is the right package for you, take a look at the examples to see what lgr can do for you.

1.1 Quickstart

lgr comes with a so-called root logger that is ready to go after you install it. If you are a package developer you can (and should) set up a new Logger for your package, but more on that later. For many use cases the root Logger will suffice.
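A minimal example using the root Logger (the messages are arbitrary placeholders):

library(lgr)

lgr$info("an informational message")
lgr$warn("something is not quite right")
lgr$error("a serious error")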

1.1.2 Logging to plaintext files

Usually you don’t (only) want to log to the console; more often than not you also want to log to files. The output of Loggers is managed by Appenders. The root Logger is preconfigured with a console Appender (which is why we see output in the console). Let’s add a file Appender.
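A sketch of how this can look; the Appender name "file" and the tempfile() path are just placeholders:

tf <- tempfile(fileext = ".log")
lgr$add_appender(AppenderFile$new(file = tf), name = "file")

lgr$info("cars has %s rows", nrow(cars))  # goes to the console and to the file
readLines(tf)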

The various Appenders available in lgr are R6 classes. To instantiate an object of that class (i.e. create a new appender) you have to use the $new() function as in the example above. Whenever you see something in lgr that IsNamedLikeThis, you can be sure that it is such an R6 class.

If you look at the output, you will see that the timestamp format of the file Appender is slightly different from the timestamp format of the console Appender. Formatting is handled by Layouts, and each Appender has exactly one.
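For example, assuming the "file" Appender added above and the set_timestamp_fmt() setter of its LayoutFormat, we can change its timestamp format like this:

lgr$appenders$file$layout$set_timestamp_fmt("%Y-%m-%d %H:%M:%S")
lgr$info("The file Appender now uses a different timestamp format")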

1.1.3 Logging to JSON files

If you log to files, you should not (only) log plain text. If you want to analyse your logs later, it’s much better to log to a structured format like JSON.
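A sketch, again using the root Logger; the file name, message and custom fields are placeholders:

tf_json <- tempfile(fileext = ".jsonl")
lgr$add_appender(AppenderJson$new(file = tf_json), name = "json")

lgr$info("JSON naturally supports custom fields", dataset = "cars", rows = nrow(cars))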

JSON is still somewhat human readable
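For example, we can print the raw contents of the logfile created above:

cat(readLines(tf_json), sep = "\n")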

and easy for machines to parse
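For example, read_json_lines() turns the logfile back into a data.frame:

read_json_lines(tf_json)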

Many Appenders provide a $show() method and a $data active binding for convenience, so you do not have to call readLines() & co. manually.
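Using the "json" Appender added above:

lgr$appenders$json$show()   # pretty-prints the last log entries
lgr$appenders$json$data     # returns the log as a data.frame

# clean up the Appenders added in this quickstart
lgr$remove_appender("file")
lgr$remove_appender("json")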

Please note that under the hood, AppenderJson is just an AppenderFile with a LayoutJson. The only difference is that AppenderJson provides the $data binding (and a few other conveniences) while AppenderFile does not.

1.1.5 What Else

If the examples above have piqued your interest, the rest of this vignette will provide more details on the workings of lgr. Discussing all Appenders and configuration options is beyond the scope of this vignette, please refer to the function reference for that.

2 Usage

2.1 Structure of the logging system

If you want custom logging configurations, you have to understand the structure of the logging process: a Logger receives a logging request, creates a LogEvent that holds all the collected information, and dispatches the event to its Appenders; each Appender writes the event to a destination (console, file, database, ...), using a Layout to format it.

2.1.1 On R6 classes

The elements described above are R6 classes. R6 is an object orientation system for R that is used by many popular packages such as shiny, dplyr, plumber, roxygen2, and testthat but often behind the scenes and not as exposed as in lgr.

You recognize R6 classes in this package because they are named following the UpperCamelCase naming convention. While there is only one kind of Logger and one kind of LogEvent, there are several subclasses of Appenders and Layouts.

An introduction to R6 classes is beyond the scope of this document, but you can find the official documentation at https://r6.r-lib.org, and there are also talks about R6 available on YouTube. In short, R6 classes store data (fields) together with functions (methods) and have to be instantiated with <classname>$new(). So if you want to create a new AppenderFile, you do this by calling AppenderFile$new(file = tempfile()).

Please note that Loggers should never be instantiated directly with $new(), but always with get_logger().
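For example (the Logger name "mypackage" is just a placeholder):

lg <- get_logger("mypackage")  # creates the Logger if it does not exist yet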

2.2 Log levels

lgr supports the standard log4j Log Levels outlined below. The Log Level of an event represents its severity. The named log levels are really just nicknames for integer values, and you can use the character or integer representations interchangeably. You can also use arbitrary integer values (greater than 0), but you are encouraged to stick to the ones below.

Level Name Description
0 off Tells a Logger or Appender to suspend all logging
100 fatal Critical error that leads to program abort. Should always indicate a stop() or similar
200 error A severe error that does not trigger program abort
300 warn A potentially harmful situation, like warning()
400 info An informational message on the progress of the application
500 debug Finer grained informational messages that are mostly useful for debugging
600 trace An even finer grained message than debug (more info)
NA all Tells a Logger or Appender to process all log events

off and all are valid thresholds for Appenders and Loggers, but not valid levels for LogEvents; e.g. lgr$set_threshold(NA) makes sense, but lgr$log("all", "an example message") does not.

The list of named log levels is stored as a global option (getOption("lgr.log_levels")) and you can use add_log_levels() and remove_log_levels() to define your own named levels if you want to. There are only predefined logging methods (lgr$fatal(), etc.) for the standard log levels, so you have to use lgr$log(level, message) to create a LogEvent with a custom log level.
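A sketch, assuming that add_log_levels() accepts a named integer vector of new levels (the level name "notice" and the value 350 are arbitrary):

add_log_levels(c(notice = 350))
lgr$log("notice", "an event with a custom log level")
remove_log_levels("notice")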

2.3 Logging with the Root Logger

lgr comes with a pre-configured root Logger. It is called root because you can set up a tree of Loggers that descend from it, but for basic use you will not have to worry about that.

2.4 LogEvents: The atomic unit of logging

LogEvents are objects that store all information collected by the Logger. They are passed on to Appenders that output them, but Appenders usually don’t utilize all the information present in a log event. The last event produced by a Logger is stored in its last_event field.
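For example:

lgr$info("a simple message")
lgr$last_event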

LogEvents can contain not only the standard fields (level, timestamp, logger, caller and msg), but also an arbitrary number of extra values. These extra values are passed as named arguments to the logging function (as opposed to the parameters for sprintf(), which are passed as unnamed arguments). It is up to the Appender whether to process them further or not.
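For example (the custom field names are arbitrary):

# unnamed arguments are passed on to sprintf(), named arguments become
# custom fields of the LogEvent
lgr$info("processing dataset '%s'", "cars", rows = nrow(cars))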

You should consider using custom fields liberally, together with output formats that support them (such as JSON), rather than producing elaborately formatted but hard-to-parse log messages. If you ever want to analyse your log data, you will thank me for this tip.

2.5 Thresholds & Filters: controlling output detail

To control the level of detail of the log output, you can set thresholds for Loggers and Appenders. A Logger with a threshold of warn will only create LogEvents of the priorities warn, error and fatal and dispatch them to its Appenders.

A threshold of a Logger or Appender is the minimum log level a LogEvent must have so that that Logger/Appender processes it.
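For example:

lgr$set_threshold("warn")
lgr$info("this event is suppressed")
lgr$warn("this event is logged")
lgr$set_threshold("info")  # restore the previous threshold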

If you require more complex logic to decide whether a LogEvent should be created/processed you can also assign filters to Loggers/Appenders. Filters are just functions that have exactly one argument, event (the LogEvent to be filtered), and return TRUE or FALSE. They will be applied after the threshold is checked. Alternatively there is also a formal R6 class for Filters (?EventFilter) that you can use, but this is usually not necessary.

Examples:
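A sketch, assuming a Logger's add_filter()/remove_filter() methods; the Logger name and the filter logic are placeholders:

lg <- get_logger("filtered")

# a Filter is just a function with a single argument `event` that returns
# TRUE (process the event) or FALSE (discard it)
lg$add_filter(
  function(event) grepl("important", event$msg),
  name = "important_only"
)

lg$info("an important message")   # passes the Filter
lg$info("some random chatter")    # discarded by the Filter

lg$remove_filter("important_only")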

2.6 Appenders: Managing log destinations

The root logger only logs to the console by default. If you want to redirect the output to a file you can just add a file appender to lgr.
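For example:

lgr$add_appender(AppenderFile$new(file = tempfile()), name = "file")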

2.7 Inheritance: Hierarchical Loggers

Logger hierarchies are a powerful concept to organize logging for different parts of a larger system. This is mainly relevant for package developers. It is good practice to have a separate Logger for each package. Since it is not common in R to build complex systems of hierarchically organised packages, hierarchies will usually be pretty flat (i.e. most Loggers will only inherit from the root logger).

Each newly created Logger is a child of a parent Logger, derived from its name. So lg <- get_logger("foo/bar") creates the Logger with the qualified name foo/bar, whose parent is the Logger foo, whose parent is (implicitly) the root Logger. If the Logger foo does not exist in that scenario, it is created automatically. This behaviour might sound strange at first, but it mimics the tried and tested behaviour of Python's logging module. This way logging is decoupled from the business logic, and your program will not abort if you forgot to initialize some Logger up the hierarchy for whatever reason.
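For example (assuming the $name and $parent fields of Loggers):

lg <- get_logger("foo/bar")
lg$name          # "foo/bar"
lg$parent$name   # "foo", created automatically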

A logger dispatches the LogEvents it creates not only to its own Appenders, but also to the Appenders of all its ancestral Loggers (ignoring the threshold and Filters of the ancestral Loggers, but not of the Appenders). When you define Loggers for your package, you should not configure them (with custom Appenders or thresholds); that should be left to the user of the package.

If all this sounds confusing to you, take a look at the examples and ?logger_tree. The common use cases are pretty easy to understand and illustrate the how and why pretty well.

Example hierarchy for the package fancymodel that provides a model along with a plumber API and a shiny web-interface to the package.

root [info] -> 1 appenders
└─fancymodel
  ├─plumber
  └─#shiny
    ├─server [trace] -> 2 appenders
    └─ui -> 1 appenders

2.7.1 Log flow

The graph below outlines the flow of LogEvents through the logging system. This is an important reference if you want to work with Filters and Logger hierarchies.

2.8 Logging with LoggerGlue

glue is a very nicely designed package for string interpolation. It makes composing log messages more flexible and comfortable, at the price of an additional dependency and slightly lower performance than sprintf() (which is used by normal Loggers). To take advantage of glue, simply create a LoggerGlue with get_logger_glue() instead of get_logger().
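A minimal sketch (the Logger name is just a placeholder):

lg <- get_logger_glue("glue/example")

lg$info("glue interpolates R expressions like {toupper('this')} directly into the message")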

Glue lets you define temporary variables inside the glue() call. As with the normal Logger, named arguments get turned into custom fields.

You can suppress this behaviour by making the named argument start with a ".".
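A sketch, using the lg LoggerGlue from above (the field names are arbitrary):

# named arguments can be used inside the glue() call and also become
# custom fields of the LogEvent ...
lg$info("{x} squared is {x_squared}", x = 3, x_squared = 3^2)

# ... unless their names start with a "."
lg$info("{.greeting}, world!", .greeting = "hello")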

3 Configuration

3.1 With setters

There are several different ways to configure Loggers. The most straightforward one is to use setters to specify the Logger's properties.

lgr's setter methods return the Logger itself, so calls can be chained with $. This works similarly to magrittr (%>%) pipes.
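For example:

lg <- get_logger("test")$
  set_threshold("debug")$
  set_propagate(FALSE)$
  set_appenders(list(cons = AppenderConsole$new()))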

3.2 With a list object
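Loggers can also be configured with a single list via their $config() method. A sketch, assuming the list structure below (please refer to the lgr documentation for the exact structure):

lg <- get_logger("test")

lg$config(list(
  threshold = "info",
  propagate = FALSE,
  appenders = list(cons = AppenderConsole$new(threshold = "debug"))
))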

3.3 With YAML or JSON

You can use YAML and JSON config files with lgr.

You can also pass in YAML/JSON directly as a character string (or a character vector with one element per line).
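A sketch, assuming that Logger$config() accepts YAML text and the configuration schema shown below (please consult the lgr documentation for the exact schema):

lg <- get_logger("test")

lg$config("
  Logger:
    threshold: info
    propagate: false
    appenders:
      AppenderConsole:
        threshold: debug
")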

4 Examples

4.1 Logging to the console

lgr comes with a simple but powerful formatting syntax for LogEvents. Please refer to ?format.LogEvent for the full list of available placeholders.
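A sketch, assuming the root Logger's console Appender is named "console", that its Layout is a LayoutFormat with a set_fmt() setter, and that %L, %t, %m and %f are the placeholders for level, timestamp, message and custom fields:

lgr$appenders$console$layout$set_fmt("%L (%t) %m %f")
lgr$info("a reformatted message", field = "value")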

If this is not enough for you, you can use LayoutGlue, based on the awesome glue package. The syntax is a bit more verbose, and LayoutGlue is a bit less performant than LayoutFormat, but the possibilities are endless.

All fields of the LogEvent object are exposed through LayoutGlue, so please refer to ?LogEvent for a list of all available fields.
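A sketch, assuming an Appender's set_layout() setter, LayoutGlue's fmt argument, and lgr's label_levels() helper for converting the integer level to its name (glue must be installed):

lgr$appenders$console$set_layout(LayoutGlue$new(
  fmt = "{toupper(lgr::label_levels(level))} [{timestamp}] {caller}: {msg}"
))
lgr$info("a message formatted by LayoutGlue")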

4.2 Logging to JSON files

JavaScript Object Notation (JSON) is an open-standard file format that uses human-readable text to transmit data objects consisting of attribute-value pairs and array data types (Wikipedia). JSON is the recommended text-based logging format when logging to files [1], as it is human- as well as machine-readable. You should only log to a different format if you have very good reasons for it. The easiest way to log to JSON files is with AppenderJson [2].

JSON is easy to parse and analyse with R. lgr provides the function read_json_lines() that can be used to read JSON log files, but you can also use AppenderJson’s $data binding for an even more convenient method to read the logfile.
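A sketch of the kind of calls that could produce the output shown below; the Logger name "test" and the custom fields are reconstructed from that output:

tf <- tempfile()

lg <- get_logger("test")$
  set_propagate(FALSE)$
  set_appenders(list(json = AppenderJson$new(file = tf)))

lg$info("JSON naturally", field = "custom")
lg$info("supports custom", numbers = 1:3)
lg$info("log fields", use = "JSON")

read_json_lines(tf)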

#>   level           timestamp logger caller             msg  field numbers  use
#> 1   400 2020-03-20 11:56:34   test   eval JSON naturally  custom    NULL <NA>
#> 2   400 2020-03-20 11:56:34   test   eval supports custom   <NA> 1, 2, 3 <NA>
#> 3   400 2020-03-20 11:56:34   test   eval      log fields   <NA>    NULL JSON

JSON is also human readable, though this vignette does not transport that fact very well because of the lack of horizontal space.

#> {"level":400,"timestamp":"2020-03-20 11:56:34","logger":"test","caller":"eval","msg":"JSON naturally ","field":"custom"}
#> {"level":400,"timestamp":"2020-03-20 11:56:34","logger":"test","caller":"eval","msg":"supports custom","numbers":[1,2,3]}
#> {"level":400,"timestamp":"2020-03-20 11:56:34","logger":"test","caller":"eval","msg":"log fields","use":"JSON"}

4.3 Logging to rotating files

lgr can also log to rotating files. The following example logs to a file that is backed up and reset (rotated) once it reaches a size of 10 kb. Only the last 5 backups of the logfile are kept.

# install.packages("rotor")
tf <- tempfile(fileext = ".log")

lg <- get_logger("test")$
  set_propagate(FALSE)$
  set_appenders(list(rotating = AppenderFileRotating$new(
    file = tf, 
    size = "10 kb",
    max_backups = 5))
  )

for (i in 1:100) lg$info(paste(LETTERS, sep = "-"))

# display info on the backups of tf
lg$appenders$rotating$backups
#>                                    path            name sfx ext  size isdir
#> 1 /tmp/RtmpbLfsAC/file326200d3bc2.1.log file326200d3bc2   1 log 10608 FALSE
#> 2 /tmp/RtmpbLfsAC/file326200d3bc2.2.log file326200d3bc2   2 log 10608 FALSE
#> 3 /tmp/RtmpbLfsAC/file326200d3bc2.3.log file326200d3bc2   3 log 10608 FALSE
#> 4 /tmp/RtmpbLfsAC/file326200d3bc2.4.log file326200d3bc2   4 log 10608 FALSE
#> 5 /tmp/RtmpbLfsAC/file326200d3bc2.5.log file326200d3bc2   5 log 10608 FALSE
#>   mode               mtime               ctime               atime   uid   gid
#> 1  644 2020-03-20 11:56:34 2020-03-20 11:56:34 2020-03-20 11:56:34 11861 11861
#> 2  644 2020-03-20 11:56:34 2020-03-20 11:56:34 2020-03-20 11:56:34 11861 11861
#> 3  644 2020-03-20 11:56:34 2020-03-20 11:56:34 2020-03-20 11:56:34 11861 11861
#> 4  644 2020-03-20 11:56:34 2020-03-20 11:56:34 2020-03-20 11:56:34 11861 11861
#> 5  644 2020-03-20 11:56:34 2020-03-20 11:56:34 2020-03-20 11:56:34 11861 11861
#>   uname grname index
#> 1 fleck  fleck     1
#> 2 fleck  fleck     2
#> 3 fleck  fleck     3
#> 4 fleck  fleck     4
#> 5 fleck  fleck     5

# manually delete all backups
invisible(lg$appenders$rotating$prune(0))
lg$appenders$rotating$backups
#>  [1] path   name   sfx    ext    size   isdir  mode   mtime  ctime  atime 
#> [11] uid    gid    uname  grname index 
#> <0 rows> (or 0-length row.names)

#cleanup
unlink(tf)

4.4 Logger hierarchies

The most common use case for creating a new Logger, rather than just using the root Logger, is when you create a package that should contain logging. This way you can have separate Appenders (e.g. logfiles) and thresholds for each package.

The print() method for Loggers gives a nice overview of the newly created Logger:
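A sketch of a setup matching the description below (the Logger name and the logfile path are placeholders):

lg <- get_logger("mypackage")
lg$add_appender(AppenderFile$new(file = tempfile()))

print(lg)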

This tells us that lg logs all events of at least level info. It does have a single (unnamed) Appender that logs to a temporary file, and dispatches all LogEvents it creates to the Appenders of the root Logger (ignoring the threshold and filters of the root Logger, but not of its Appenders).

We can use lg$fatal(), lg$info(), etc.. to log messages with this Logger:
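(The messages below are placeholders.)

lg$info("an informational message from mypackage")
lg$fatal("a critical error")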

If we do not want lg to dispatch to the root Logger, we can set propagate to FALSE.
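This is done with the set_propagate() setter:

lg$set_propagate(FALSE)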

When we take a look at the Logger again, we now see that it does not inherit any Appenders anymore

Consequently, lg no longer outputs log messages to the console.

4.5 Buffered logging

The main purpose of AppenderBuffer is to retain LogEvents in memory and write them to destinations at a later point in time, e.g. when the Buffer is full and needs to be flushed. For example, if you log to a remote database you can postpone this costly operation until after your analysis is finished.

By setting a filter as a custom $should_flush() method for an AppenderBuffer, you can define more complex conditions that trigger flushing. For example, the following setup outputs the last 5 LogEvents that happened before an error occurred.
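A sketch of such a setup; for simplicity it uses AppenderBuffer's flush_threshold instead of a custom $should_flush() function, and assumes the constructor arguments buffer_size, flush_threshold and appenders:

lg <- get_logger("buffer")$
  set_threshold(NA)$
  set_propagate(FALSE)$
  set_appenders(list(buffer = AppenderBuffer$new(
    threshold = NA,
    buffer_size = 5,
    flush_threshold = "error",
    appenders = list(console = AppenderConsole$new(threshold = NA))
  )))

for (nr in 1:10) lg$debug("uneventful step %s", nr)
lg$error("something went wrong")  # flushes the buffered events to the console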

4.6 Logging to databases

Logging to databases is simple, though a few aspects can be tricky to configure depending on the backend used. For performance reasons, database inserts are buffered by default. This works exactly as described above for AppenderBuffer. If you want to write each LogEvent directly to the database, just set the buffer size to 0.
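A sketch, assuming the AppenderDbi Appender (depending on your lgr version it lives in lgr itself or in the companion lgrExtra package) with conn, table and buffer_size arguments, and an in-memory SQLite database via DBI/RSQLite:

con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

lg <- get_logger("db")$
  set_propagate(FALSE)$
  set_appenders(list(db = AppenderDbi$new(
    conn = con,
    table = "log",
    buffer_size = 0  # write each event immediately instead of buffering
  )))

lg$info("logging to a database")
DBI::dbReadTable(con, "log")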

4.7 Inject values into LogEvents

By abusing Filters, lgr can modify LogEvents as they are processed. One example of when this is useful is assigning a grouping identifier to a series of log calls.

with_log_value() provides a convenient wrapper to inject values into log calls.
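A sketch, reconstructed from the output shown further below (the field name dataset_id and its value are taken from that output):

with_log_value(
  list(dataset_id = "dataset1"),
  {
    lgr$info("cleaning data")
    lgr$info("processing data")
    lgr$info("outputing data")
  }
)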

An alternative way to achieve the same is to use one of the preconfigured Filters that come with lgr. This approach is more comfortable for use within functions.
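A sketch, assuming the FilterInject filter and a Logger's add_filter()/remove_filter() methods:

analyse <- function() {
  lgr$add_filter(FilterInject$new(dataset_id = "dataset1"), name = "inject")
  on.exit(lgr$remove_filter("inject"))

  lgr$info("cleaning data")
  lgr$info("processing data")
  lgr$info("outputing data")
}

analyse()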

The result is the same in both cases:

#> INFO  [11:56:34.934] cleaning data {dataset_id: dataset1}
#> INFO  [11:56:34.935] processing data {dataset_id: dataset1}
#> INFO  [11:56:34.936] outputing data {dataset_id: dataset1}

You can use with_log_level() and FilterForceLevel in a similar fashion to modify the log level of events conveniently.

4.8 Temporarily disable logging

Temporarily disabling logging for portions of code is straightforward with lgr:
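For example, using without_logging():

without_logging({
  lgr$info("this message is not logged")
  lgr$fatal("neither is this one")
})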

4.9 Adding a custom logger to a package

If you are a package author, it is good practice to define a separate Logger for your package. This gives users the ability to easily enable/disable logging on a per-package basis. The Logger must be initialized in the package's .onLoad hook. You can do this by adding the following code to any .R file inside the R/ directory of your package:
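A sketch of such an .onLoad hook, similar to what lgr::use_logger() generates ("mypackage" is a placeholder for your package name):

.onLoad <- function(...) {
  assign(
    "lg",
    lgr::get_logger(name = "mypackage"),
    envir = parent.env(environment())
  )
}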

You can also just use lgr::use_logger() to generate the appropriate code for your package automatically.

After you set this up you can use lg$fatal(), lg$info(), etc… inside your package. You do not have to define any appenders, since all log events will get redirected to the root Logger (see Inheritance).

5 References

Python Logging

Eric Stenbock: The True Story of A Vampire


  [1] Technically, the logger does not produce standard JSON files but JSON lines.

  [2] AppenderJson is just an AppenderFile with LayoutJson as default Layout and a few extra features.