When developing R packages, we should try to avoid depending directly on “heavy packages”. The “heaviness” of a package is the number of additional packages it brings in. If your package directly depends on a heavy package, there are several consequences. For one, the number of namespaces loaded after `library(your-pkg)` will be huge (you can see the loaded namespaces with `sessionInfo()`). Your package will be “heavy” as well, and it may take a long time to load.
In the DESCRIPTION file of your package, the packages it directly depends on are listed in the “Depends” or “Imports” fields. To get rid of heavy packages that are not often used in your package, it is better to move them to the “Suggests” field and load them only when they are needed.
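The “load only when needed” pattern can be sketched as follows. This is a minimal illustration, not code from any particular package: the helper name `use_suggested()` is hypothetical, and it assumes the heavy package has been moved to “Suggests”. It relies only on base R's `requireNamespace()` and `getExportedValue()`.

```r
# Hypothetical helper: call a function from a package listed in "Suggests".
# requireNamespace() loads the namespace on demand and returns FALSE
# (instead of throwing an error) when the package is not installed.
use_suggested <- function(pkg, fun, ...) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
        stop("Package '", pkg, "' is needed for this function. ",
             "Please install it first.", call. = FALSE)
    }
    # The function is looked up in the namespace only at call time, so the
    # heavy package is not loaded when your own package is attached.
    f <- getExportedValue(pkg, fun)
    f(...)
}

# Usage with a base package, so that the example runs anywhere:
use_suggested("stats", "median", c(1, 5, 9))
```

With this pattern, users who never call the function never pay the cost of loading the heavy package, and those who do get a clear message if it is missing.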
The pkgndep package checks the heaviness of the packages that your package depends on. For each package listed in the “Depends”, “Imports” and “Suggests” fields of the DESCRIPTION file, it opens a new R session, loads the package and counts the number of namespaces that are loaded. The summary of the dependencies is visualized as a customized heatmap.
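The idea of counting namespaces in a fresh session can be sketched with base R alone. This is not pkgndep's actual implementation, only an illustration under assumptions: the helper names `count_namespaces()` and `read_dependencies()` are hypothetical, and the demo DESCRIPTION file is made up for the example.

```r
# Hypothetical helper: number of namespaces loaded in a fresh R session
# after attaching a single package. Returns NA if the package cannot be loaded.
count_namespaces <- function(pkg) {
    # Only accept plain package names, because the name is pasted into
    # R code that is run in the child process.
    stopifnot(grepl("^[A-Za-z][A-Za-z0-9.]*$", pkg))
    code <- sprintf(
        "suppressPackageStartupMessages(library(%s)); cat(length(loadedNamespaces()))",
        pkg
    )
    rscript <- file.path(R.home("bin"), "Rscript")
    out <- suppressWarnings(
        system2(rscript, c("--vanilla", "-e", shQuote(code)),
                stdout = TRUE, stderr = FALSE)
    )
    n <- suppressWarnings(as.integer(out[length(out)]))
    if (length(n) == 1) n else NA_integer_
}

# Hypothetical helper: package names from the dependency fields
# of a DESCRIPTION file.
read_dependencies <- function(description_file,
                              fields = c("Depends", "Imports", "Suggests")) {
    dcf <- read.dcf(description_file, fields = fields)
    pkgs <- unlist(strsplit(dcf[!is.na(dcf)], ","))
    pkgs <- trimws(sub("\\(.*\\)", "", pkgs))  # drop version requirements
    setdiff(pkgs[nzchar(pkgs)], "R")           # "R (>= x.y)" is not a package
}

# A made-up DESCRIPTION that only lists base packages:
desc <- tempfile()
writeLines(c("Package: demo",
             "Depends: R (>= 3.6.0), methods",
             "Imports: stats, utils",
             "Suggests: tools"), desc)
deps <- read_dependencies(desc)
sapply(deps, count_namespaces)
```

Running each package in its own child process matters here: namespaces that are already loaded in the current session would otherwise hide the true cost of each dependency.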
As an example, I am developing a package, cola, which depends on a lot of other packages. Its dependency heatmap looks like this (a full-size version of the figure is linked from the original page):
In the heatmap, rows are the packages listed in the “Depends”, “Imports” and “Suggests” fields, and columns are the namespaces that are loaded when each package alone is loaded into a new R session. The barplots on the right show the number of namespaces imported by each package and the time needed to load that package alone into R.
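A loading time of this kind could be measured along the following lines. This is a rough sketch under assumptions, not how pkgndep itself takes its timings: the helper names `time_fresh_session()` and `time_loading()` are hypothetical, and the startup time of an empty session is subtracted as a baseline.

```r
# Hypothetical helper: wall-clock seconds needed by a fresh R session
# to run `code` (R code given as a single string).
time_fresh_session <- function(code) {
    rscript <- file.path(R.home("bin"), "Rscript")
    timing <- system.time(
        system2(rscript, c("--vanilla", "-e", shQuote(code)),
                stdout = FALSE, stderr = FALSE)
    )
    timing[["elapsed"]]
}

# Hypothetical helper: extra seconds spent on loading one package,
# relative to starting a session that loads nothing.
time_loading <- function(pkg) {
    stopifnot(grepl("^[A-Za-z][A-Za-z0-9.]*$", pkg))
    baseline <- time_fresh_session("invisible(NULL)")
    loaded <- time_fresh_session(
        sprintf("suppressPackageStartupMessages(library(%s))", pkg)
    )
    loaded - baseline
}

time_loading("stats")
```

For very light packages the difference is dominated by noise and can even come out slightly negative, so repeated measurements would be needed for anything more than a rough comparison.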
We can see that if all the packages are put in the “Imports” field, 166 namespaces will be loaded after `library(cola)`. Some of the heavy packages, such as WGCNA and clusterProfiler, are not used very frequently in cola; moving them to the “Suggests” field and loading them only when they are needed helps to speed up loading cola. The number of namespaces loaded after `library(cola)` is now reduced to only 25.
To use this package, pass `pkgndep()` either a package name or the path of a package.
Executable examples:
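A minimal sketch of the call behind the log that follows, assuming both pkgndep and ComplexHeatmap are installed. The guard and the variable name `x` are illustrative additions, and the exact output depends on the installed versions.

```r
# Run the check only when both packages are available, so that the
# snippet does not fail on a machine without them.
if (requireNamespace("pkgndep", quietly = TRUE) &&
    requireNamespace("ComplexHeatmap", quietly = TRUE)) {
    x <- pkgndep::pkgndep("ComplexHeatmap")
} else {
    message("pkgndep and/or ComplexHeatmap is not installed; skipping the check.")
    x <- NULL
}
```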
## ========== checking ComplexHeatmap ==========
## Loading methods to a new R session... 7 namespaces loaded.
## Loading grid to a new R session... 8 namespaces loaded.
## Loading graphics to a new R session... 7 namespaces loaded.
## Loading stats to a new R session... 7 namespaces loaded.
## Loading grDevices to a new R session... 7 namespaces loaded.
## Loading circlize to a new R session... 12 namespaces loaded.
## Loading GetoptLong to a new R session... 10 namespaces loaded.
## Loading colorspace to a new R session... 8 namespaces loaded.
## Loading clue to a new R session... 9 namespaces loaded.
## Loading RColorBrewer to a new R session... 8 namespaces loaded.
## Loading GlobalOptions to a new R session... 8 namespaces loaded.
## Loading parallel to a new R session... 8 namespaces loaded.
## Loading png to a new R session... 8 namespaces loaded.
## Loading testthat to a new R session... 11 namespaces loaded.
## Loading knitr to a new R session... 10 namespaces loaded.
## Loading markdown to a new R session... 8 namespaces loaded.
## Loading dendsort to a new R session... 8 namespaces loaded.
## Loading Cairo to a new R session... 8 namespaces loaded.
## Loading jpeg to a new R session... 8 namespaces loaded.
## Loading tiff to a new R session... 8 namespaces loaded.
## Loading fastcluster to a new R session... 8 namespaces loaded.
## Loading dendextend to a new R session... 33 namespaces loaded.
## Loading grImport to a new R session... 10 namespaces loaded.
## Loading grImport2 to a new R session... 13 namespaces loaded.
## Loading glue to a new R session... 8 namespaces loaded.
## Loading GenomicRanges to a new R session... 19 namespaces loaded.
## Loading gridtext to a new R session... 12 namespaces loaded.
## Loading pheatmap to a new R session... 17 namespaces loaded.
## ComplexHeatmap version 2.5.2
## 14 namespaces loaded if only load packages in Depends and Imports
## 56 namespaces loaded after loading all packages in Depends, Imports and Suggests
I ran pkgndep on all packages that are installed on my computer. The table of the numbers of loaded namespaces, as well as the dependency heatmaps, is available at https://jokergoo.github.io/pkgndep/stat/.
For a quick look, the 10 packages that load the most namespaces are:
Package | # Namespaces (Depends and Imports) | # Namespaces (also loading Suggests) | Heatmap |
---|---|---|---|
ReportingTools | 125 | 131 | view |
TCGAbiolinks | 118 | 209 | view |
epik | 116 | 116 | view |
minfiData | 109 | 109 | view |
minfiDataEPIC | 109 | 109 | view |
ggbio | 108 | 119 | view |
FlowSorted.Blood.450k | 108 | 108 | view |
IlluminaHumanMethylation450kanno.ilmn12.hg19 | 108 | 108 | view |
IlluminaHumanMethylation450kmanifest | 108 | 108 | view |
IlluminaHumanMethylationEPICanno.ilm10b2.hg19 | 108 | 108 | view |
And the 10 packages that load the most namespaces when the packages in “Suggests” are also loaded are:
Package | # Namespaces (Depends and Imports) | # Namespaces (also loading Suggests) | Heatmap |
---|---|---|---|
TCGAbiolinks | 118 | 209 | view |
cola | 25 | 174 | view |
broom | 29 | 171 | view |
GSEABase | 29 | 135 | view |
sesame | 73 | 134 | view |
ReportingTools | 125 | 131 | view |
GenomicRanges | 17 | 128 | view |
ensembldb | 57 | 126 | view |
AER | 36 | 126 | view |
BiocGenerics | 8 | 125 | view |