ACSN description

Paul Deveau

2016-09-01

ACSN Enrichment

Description

ACSNMineR is a freely available R package.

ACSN stands for Atlas of Cancer Signaling Networks; it shows gene interactions in pathways relevant to cancer.

This package is designed for easy analysis of gene maps (either imported by the user from gmt files or taken from the built-in ACSN maps). It tests which pathways are statistically enriched or depleted in a user-supplied gene list and provides graphical representations of the results.

This readme contains:

1. This description

2. Usage section

2.1. Pathway analysis

2.1.1. Import gmt files

2.1.2. Perform analysis

2.2. Data visualization

2.2.1. Heatmaps

2.2.2. Barplots

Usage

Pathway analysis


Import gmt files

Gmt files can be imported with the format_from_gmt function. Let’s use the example data shipped with the package:

# Retrieve the path of the example gmt file
file <- system.file("extdata", "cellcycle_short.gmt", package = "ACSNMineR")
# Then import it
gmt <- ACSNMineR::format_from_gmt(file)
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
E2F1 19 ATM ATR CHEK2 CREBBP TFDP1 E2F1 EP300 HDAC1
MAPK 9 ATM GSK3B MAX MDM2 CDKN1A TP53 BCL2 MYC
MOMP_REGULATION 13 CDK1 GSK3B PPP1CA PPP1CB PPP1CC TP53 BBC3 BCL2
HEDGEHOG 9 CREBBP GSK3B HDAC1 CCNB1 CCNB2 CCNB3 CCNC BCL2
P21CIP 3 AKT1 PCNA CDKN1A
G2_CC_PHASE 4 CDK1 CDC25C WEE1 CCNB1
CYCLIND 9 AKT1 CDC37 CDK4 CDK6 GSK3B HSP90AA1 CCND1 CCND2
E2F4 8 TFDP2 E2F4 E2F5 HDAC1 SIN3B SUV39H1 RBL1 RBL2
CYCLINE 7 CDK2 CDKN3 CCNE1 CCNE2 RBL1 RBL2 CDKN1B
E2F6 13 EHMT2 TFDP1 TFDP2 E2F6 E2F8 EED EHMT1 EPC1
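To make the file format concrete, here is a minimal base R sketch that parses a small GMT-style file by hand. It assumes the usual GMT layout (module name, a description field, then gene symbols, all tab-separated). The two modules are copied from the output above; the "na" description and the variable names are illustrative only, and this is not how format_from_gmt is implemented.

```r
# Two modules from the output above, written in an assumed GMT layout:
# name <TAB> description <TAB> gene1 <TAB> gene2 ...
gmt_lines <- c(
  "P21CIP\tna\tAKT1\tPCNA\tCDKN1A",
  "G2_CC_PHASE\tna\tCDK1\tCDC25C\tWEE1\tCCNB1"
)
tmp <- tempfile(fileext = ".gmt")
writeLines(gmt_lines, tmp)

# Split every line on tabs; modules can have different numbers of genes
fields <- strsplit(readLines(tmp), "\t", fixed = TRUE)

# Drop the name and description fields to keep only the gene symbols
modules <- lapply(fields, function(x) x[-(1:2)])
names(modules) <- vapply(fields, `[`, character(1), 1)

# Number of genes per module, comparable to column V2 in the table above
module_sizes <- lengths(modules)
print(module_sizes)
```

A named list of character vectors is convenient for set operations such as intersect(), whereas the table returned by the package keeps one module per row.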

ACSN maps are built-in and can easily be accessed through ACSNMineR::ACSN_maps:

# Name of available maps:
names(ACSNMineR::ACSN_maps)
Apoptosis
CellCycle
DNA_repair
EMT_motility
Survival
Master
# And accessing them:
ACSNMineR::ACSN_maps$CellCycle
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
APC 15 ANAPC1 ANAPC2 CDC27 ANAPC4 ANAPC5 CDC16 ANAPC7 CDC23
APOPTOSIS_ENTRY 10 AKT1 ATM ATR CHEK1 CHEK2 E2F1 MDM2 NBN
CDC25 2 CDC25C CHEK2
CYCLINA 6 CDK2 CCNA1 CCNA2 RBL1 RBL2 CDKN1B
CYCLINB 7 ATM CDK1 MDM2 PKMYT1 CCNB1 CCNB2 CCNB3
CYCLINC 2 CDK3 CCNC

The gene set that was used for tests is the following:

ACSNMineR::genes_test
ATM
ATR
CHEK2
CREBBP
TFDP1
E2F1
EP300
HDAC1
KAT2B
GTF2H1
GTF2H2
GTF2H2B

Perform analysis

Gene set enrichment for a single set can be performed by calling:

Example <- ACSNMineR::enrichment(ACSNMineR::genes_test,
    min_module_size = 10, 
    threshold = 0.05,
    maps = list(cellcycle = ACSNMineR::ACSN_maps$CellCycle))
module module_size nb_genes_in_module genes_in_module universe_size nb_genes_in_universe p.value p.value.corrected test
APOPTOSIS_ENTRY 10 4 ATM ATR CHEK2 E2F1 227 12 0.0023144 0.011572 greater
E2F1 19 12 ATM ATR CHEK2 CREBBP TFDP1 E2F1 EP300 HDAC1 KAT2B GTF2H1 GTF2H2 GTF2H2B 227 12 0.0000000 0.000000 greater

Where:

  • genes_test is a character vector of gene names to test

  • min_module_size is the minimal size of a module to be taken into account

  • threshold is the maximal p-value that will be displayed in the results (all modules with p-values higher than threshold will be removed)

  • maps is a list of maps (here, the cell cycle map from ACSN) in the format produced by the format_from_gmt() function of the package
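The columns of the result can be illustrated with a short base R sketch. It assumes a one-sided Fisher's exact test (the "greater" value in the test column suggests a one-sided alternative) and a Bonferroni correction over five tested modules (an inference from the factor of 5 between p.value and p.value.corrected in the APOPTOSIS_ENTRY row, not something stated in the text). The contingency table layout is also an assumption, so the raw p-value computed here is not expected to match the one reported by the package.

```r
# Counts taken from the APOPTOSIS_ENTRY row of the example output
universe_size <- 227  # universe_size
module_size   <- 10   # module_size
list_size     <- 12   # nb_genes_in_universe
overlap       <- 4    # nb_genes_in_module

# Assumed 2x2 layout: rows = gene in list (yes/no), columns = gene in module (yes/no)
contingency <- matrix(
  c(overlap,             module_size - overlap,
    list_size - overlap, universe_size - module_size - (list_size - overlap)),
  nrow = 2,
  dimnames = list(in_list = c("yes", "no"), in_module = c("yes", "no"))
)

# One-sided test for enrichment (more overlap than expected by chance)
fisher_p <- fisher.test(contingency, alternative = "greater")$p.value

# The same tail probability written as a hypergeometric test
hyper_p <- phyper(overlap - 1, module_size, universe_size - module_size,
                  list_size, lower.tail = FALSE)

# Bonferroni correction of the p-value reported in the table,
# assuming five modules were tested
corrected <- p.adjust(0.0023144, method = "bonferroni", n = 5)
print(corrected)  # 0.011572, the reported p.value.corrected

# A module is reported only if its p-value is at or below the threshold
keep <- corrected <= 0.05
```

Swapping alternative = "greater" for "less" in this sketch would test for depletion instead of enrichment.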

Gene set enrichment for multiple sets/cohorts can be performed by calling:

Example <- ACSNMineR::multisample_enrichment(
    Genes_by_sample = list(set1 = ACSNMineR::genes_test[-1],
                           set2 = ACSNMineR::genes_test[-2]),
    maps = ACSNMineR::ACSN_maps$CellCycle,
    min_module_size = 10,
    cohort_threshold = FALSE)
print(Example[[1]])
module module_size nb_genes_in_module genes_in_module universe_size nb_genes_in_universe p.value p.value.corrected test
E2F1 19 11 ATR CHEK2 CREBBP TFDP1 E2F1 EP300 HDAC1 KAT2B GTF2H1 GTF2H2 GTF2H2B 227 11 0 1e-07 greater
print(Example[[2]])
module module_size nb_genes_in_module genes_in_module universe_size nb_genes_in_universe p.value p.value.corrected test
E2F1 19 11 ATM CHEK2 CREBBP TFDP1 E2F1 EP300 HDAC1 KAT2B GTF2H1 GTF2H2 GTF2H2B 227 11 0 1e-07 greater

Where:

  • Genes_by_sample is a list of character vectors to test

  • min_module_size is the minimal size of a module to be taken into account

  • maps is a list of maps (here, the cell cycle map from ACSN) in the format produced by the format_from_gmt() function of the package
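As a rough picture of what a multi-sample analysis involves, the base R sketch below counts the overlap between each gene set and each module. The gene list and the two modules (CDC25 and CYCLINB) are copied from the outputs shown earlier; the variable names are illustrative, and this is only the counting step, not the package's implementation. Both modules have fewer than 10 genes, so they would be filtered out by min_module_size = 10 in the call above.

```r
# Test gene list shown earlier in this document
genes_test <- c("ATM", "ATR", "CHEK2", "CREBBP", "TFDP1", "E2F1",
                "EP300", "HDAC1", "KAT2B", "GTF2H1", "GTF2H2", "GTF2H2B")

# Same construction as in the call above: drop one gene per set
genes_by_sample <- list(set1 = genes_test[-1],   # without ATM
                        set2 = genes_test[-2])   # without ATR

# Two small modules copied from the CellCycle map output
modules <- list(
  CDC25   = c("CDC25C", "CHEK2"),
  CYCLINB = c("ATM", "CDK1", "MDM2", "PKMYT1", "CCNB1", "CCNB2", "CCNB3")
)

# Rows are modules, columns are samples
overlap_counts <- sapply(genes_by_sample, function(genes) {
  sapply(modules, function(module) length(intersect(genes, module)))
})
print(overlap_counts)
```

Removing ATM from set1 drops its overlap with CYCLINB to zero, while set2 still shares one gene with that module.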

Data visualization


Results from the enrichment analysis functions can be turned into plots with the represent_enrichment function. Two plot types are available: heatmap and barplot.

Heatmaps


Heatmaps of p-values for a single sample or for multiple samples can be generated with the represent_enrichment function.

# theme() and element_text() come from ggplot2
library(ggplot2)

ACSNMineR::represent_enrichment(enrichment = list(
    SampleA = ACSNMineR::enrichment_test[1:10,],
    SampleB = ACSNMineR::enrichment_test[3:10,]),
    plot = "heatmap",
    scale = "reverselog",
    low = "steelblue", high = "white",
    na.value = "grey") +
    theme(axis.text = element_text(size = 6, angle = 45),
          legend.text = element_text(size = 6),
          legend.title = element_text(size = 8))

Where:

  • enrichment is the result from the enrichment or multisample_enrichment function

  • scale can be set to identity, log, or reverselog (as in the example above) and will affect the gradient of colors

  • low: the color for the low (significant) p-values

  • high: color for the high (less significant) p-values

  • na.value is the color used for tiles with NA values
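The effect of these arguments can be sketched in base R. The example assumes that a reversed log scale amounts to ranking p-values by -log10(p), so that small p-values sit at the "low" color end; the module names and p-values are hypothetical, and the color mapping is a simplified stand-in for what the plotting function does internally.

```r
# Hypothetical p-values for four modules; ModuleD has no result
p_values <- c(ModuleA = 1e-6, ModuleB = 0.003, ModuleC = 0.04, ModuleD = NA)

# Assumed reverse log transform: smaller p-value gives a larger score
score <- -log10(p_values)

# 100-step gradient from high = "white" (least significant)
# to low = "steelblue" (most significant)
gradient <- grDevices::colorRampPalette(c("white", "steelblue"))(100)

# Rescale scores to 1..100 and pick the matching gradient step
rng <- range(score, na.rm = TRUE)
idx <- 1 + round((score - rng[1]) / (rng[2] - rng[1]) * 99)
tile_colours <- gradient[idx]

# Tiles without a p-value get the na.value color
tile_colours[is.na(p_values)] <- "grey"
names(tile_colours) <- names(p_values)
print(tile_colours)
```

On a log scale the difference between 1e-6 and 0.003 remains visible, whereas on an identity scale both would be nearly indistinguishable from zero.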

Barplots


A barplot can be generated with the following:

ACSNMineR::represent_enrichment(enrichment = list(
    SampleA = ACSNMineR::enrichment_test[1:10,], 
    SampleB = ACSNMineR::enrichment_test[3:10,]),
    plot = "bar", 
    scale = "reverselog")

Where:

  • enrichment is the result from the enrichment or multisample_enrichment function

  • scale can be set to identity, log, or reverselog (as in the example above) and will affect the gradient of colors

  • nrow is the number of rows that should be used to plot all barplots (default is 1)
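One way to picture the named list passed as enrichment is as a single long table with one row per sample and module, which is the shape most plotting code needs. The base R sketch below builds such a table from two small hypothetical result tables; the corrected p-values are taken from the outputs shown earlier, while the assignment to SampleA and SampleB and the score column are illustrative assumptions rather than the package's internals.

```r
# Hypothetical per-sample results with only the columns needed for plotting
results <- list(
  SampleA = data.frame(module = c("E2F1", "APOPTOSIS_ENTRY"),
                       p.value.corrected = c(1e-07, 0.011572)),
  SampleB = data.frame(module = "E2F1",
                       p.value.corrected = 1e-07)
)

# Stack the tables and record which sample each row came from
long <- do.call(rbind, Map(function(df, sample) cbind(sample = sample, df),
                           results, names(results)))
rownames(long) <- NULL

# Assumed bar height on a reversed log scale
long$score <- -log10(long$p.value.corrected)
print(long)

# Split back by sample to draw one panel of bars per sample
per_sample <- split(long, long$sample)
```

Each element of per_sample then corresponds to one barplot panel, and nrow controls how those panels are arranged.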