
incadata 
The goal of incadata is to provide basic functionality to handle data from INCA and the Regional cancer centers in Sweden.
Installation
You can install the released version of incadata from CRAN with:
install.packages("incadata")
And the development version from BitBucket with:
# install.packages("remotes")
remotes::install_bitbucket("cancercentrum/incadata")
Standardised data sets
The function as.incadata standardizes data from INCA and Rockan:
- All date formats used by Rockan are recognized as dates and coerced to such (for example: 1985-05-04, "", 19850504, 19850500, 19850000 and 8513).
- Boolean values are numeric vectors in INCA: c(0, 1, 0, 1, 0, 0), but are coerced to character when exported: c(NA, "True", NA, "True", NA, NA). The package recognizes this peculiarity and coerces such values to Boolean.
- Personal identity numbers are recognized even if they end with X et cetera (used in Rockan).
- Standard numerical codes from Rockan are decoded (using the decoder package).
- Column names are always coerced to lower case, since these are generally easier to work with.
- Data frames are coerced to tibbles.
- An id column is always added to data frames so that an identification variable is always at hand (regardless of whether the data has none or one of PERSNR, PNR or PAT_ID).
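To make the list above concrete, here is a minimal base R sketch of a few of these coercions. It is not the package's implementation: the helper names (parse_rockan_date, to_boolean, standardise_names) and the example column names are hypothetical, the short format 8513 is left out, and two choices are assumptions made only for illustration, namely that an unknown month or day (00) is replaced by 01 and that the row number is used as id when no identifier column exists.

```r
# Hypothetical helpers, written in base R for illustration only.
# They are NOT part of incadata and do not reproduce as.incadata().

# Parse "1985-05-04", "19850504", "19850500", "19850000" and "".
# Assumption: an unknown month or day ("00") is replaced by "01".
parse_rockan_date <- function(x) {
  x <- gsub("-", "", as.character(x), fixed = TRUE)
  x[!grepl("^[0-9]{8}$", x)] <- NA_character_  # "" and other formats become NA
  year  <- substr(x, 1, 4)
  month <- substr(x, 5, 6)
  day   <- substr(x, 7, 8)
  month[!is.na(month) & month == "00"] <- "01"
  day[!is.na(day) & day == "00"] <- "01"
  as.Date(paste(year, month, day, sep = "-"), format = "%Y-%m-%d")
}

# Numeric 0/1 (as in INCA) or NA/"True" (as exported) to logical.
to_boolean <- function(x) {
  if (is.numeric(x)) {
    return(x == 1)
  }
  !is.na(x) & x == "True"
}

# Lower-case column names and add an id column.
# Assumption: fall back to the row number if no identifier column exists.
standardise_names <- function(df) {
  names(df) <- tolower(names(df))
  id_source <- intersect(c("persnr", "pnr", "pat_id"), names(df))
  df$id <- if (length(id_source) > 0) df[[id_source[1]]] else seq_len(nrow(df))
  df
}

# Made-up example data with hypothetical column names.
raw <- data.frame(
  PAT_ID  = c(11, 12, 13),
  DIAGDAT = c("1985-05-04", "19850500", ""),
  TREATED = c(NA, "True", NA),
  stringsAsFactors = FALSE
)

clean <- standardise_names(raw)
clean$diagdat <- parse_rockan_date(clean$diagdat)
clean$treated <- to_boolean(clean$treated)
str(clean)
```

In practice a single call to as.incadata is meant to replace this kind of hand-written code; the sketch only shows what sort of work that call saves.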
Register documentation
The package also provides functionality for easier access to and archiving of register documentation (see vignette ‘incadoc’ and the function documents).
Additional functionality
The package also lets you:
- cache data between work sessions to speed up the data loading and munging process
- use a single data reading/munging function regardless of whether you work on INCA or locally
- interactively engage in the coercing process of variable formats. This is handy for example if a variable is almost a date but has some additional entries that are not recognized as such.
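
The caching idea in the first point can be sketched in base R as follows. This is an illustration under assumptions, not incadata's own caching mechanism: the helper cached_read, its arguments and the cache file location are all hypothetical.

```r
# Hypothetical helper, not part of incadata: run an expensive
# reading/munging step once and reuse the saved result afterwards.
cached_read <- function(cache_file, read_fun, refresh = FALSE) {
  if (!refresh && file.exists(cache_file)) {
    return(readRDS(cache_file))  # fast path: reuse the earlier result
  }
  data <- read_fun()             # slow path: read and munge from scratch
  saveRDS(data, cache_file)
  data
}

# Usage with a stand-in for a slow data-loading function.
cache_file <- file.path(tempdir(), "register_cache.rds")
slow_read  <- function() data.frame(id = 1:3, value = c(2.5, 3.1, 4.8))

first  <- cached_read(cache_file, slow_read, refresh = TRUE)  # reads and writes the cache
second <- cached_read(cache_file, slow_read)                  # read back from the cache
```

The example writes to tempdir(), which is removed when the R session ends; a cache meant to survive between work sessions would need a persistent directory.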