ACSN Enrichment

Description

ACSNMineR is an R package, freely available.

ACSN stands for Atlas of Cancer Signaling Networks, and shows gene interactions in pathways relevant to cancer.

This package is designed for easy analysis of gene maps, either imported by the user from gmt files or taken from the built-in ACSN maps. It tests whether pathways are statistically enriched or depleted in a user-supplied gene list, and provides graphical representations of the results.

This readme contains:

1. This description

2. Usage section

2.1. Pathway analysis

2.1.1. Import gmt files

2.1.2. Perform analysis

2.2. Data visualization

2.2.1. Heatmaps

2.2.2. Barplots

Usage

Pathway analysis

Import gmt files

Gmt files can be imported with the format_from_gmt function. The example below uses a gmt file shipped with the package:

# Retrieve path of the example gmt
file<-system.file("extdata", "cellcycle_short.gmt", package = "ACSNMineR")
# Then import it
gmt<-ACSNMineR::format_from_gmt(file)

V1	V2	V3	V4	V5	V6	V7	V8	V9	V10
E2F1	19	ATM	ATR	CHEK2	CREBBP	TFDP1	E2F1	EP300	HDAC1
MAPK	9	ATM	GSK3B	MAX	MDM2	CDKN1A	TP53	BCL2	MYC
MOMP_REGULATION	13	CDK1	GSK3B	PPP1CA	PPP1CB	PPP1CC	TP53	BBC3	BCL2
HEDGEHOG	9	CREBBP	GSK3B	HDAC1	CCNB1	CCNB2	CCNB3	CCNC	BCL2
P21CIP	3	AKT1	PCNA	CDKN1A
G2_CC_PHASE	4	CDK1	CDC25C	WEE1	CCNB1
CYCLIND	9	AKT1	CDC37	CDK4	CDK6	GSK3B	HSP90AA1	CCND1	CCND2
E2F4	8	TFDP2	E2F4	E2F5	HDAC1	SIN3B	SUV39H1	RBL1	RBL2
CYCLINE	7	CDK2	CDKN3	CCNE1	CCNE2	RBL1	RBL2	CDKN1B
E2F6	13	EHMT2	TFDP1	TFDP2	E2F6	E2F8	EED	EHMT1	EPC1
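For readers unfamiliar with the format, a gmt file is tab-delimited with one gene set per line: a name, a description field, then the genes. The base R sketch below writes a two-line gmt file and parses it into name, size and genes, which mirrors the columns shown above. The helper read_gmt_sketch is hypothetical and is not the package's implementation; the description value "na" is an assumption, and the two gene sets are the P21CIP and G2_CC_PHASE rows from the output above.

```r
# Hypothetical helper: parse a gmt file (name, description, genes...) in base R.
# It illustrates the file format only; it is not ACSNMineR::format_from_gmt.
read_gmt_sketch <- function(path) {
  lines <- readLines(path)
  lines <- lines[nzchar(lines)]                  # drop empty lines
  fields <- strsplit(lines, "\t", fixed = TRUE)
  modules <- lapply(fields, function(f) {
    genes <- f[-(1:2)]                           # genes start at the third field
    genes <- genes[nzchar(genes)]                # ignore empty trailing fields
    list(module = f[1], size = length(genes), genes = genes)
  })
  names(modules) <- vapply(modules, function(m) m$module, character(1))
  modules
}

# Two gene sets copied from the output above; "na" as description is assumed.
tmp <- tempfile(fileext = ".gmt")
writeLines(c("P21CIP\tna\tAKT1\tPCNA\tCDKN1A",
             "G2_CC_PHASE\tna\tCDK1\tCDC25C\tWEE1\tCCNB1"), tmp)

parsed <- read_gmt_sketch(tmp)
parsed$P21CIP$size         # number of genes in P21CIP
parsed$G2_CC_PHASE$genes   # character vector of gene names
```

The size computed here is simply the number of gene fields on the line, which matches the V2 column for these two rows.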

ACSN maps are built-in and can easily be accessed through ACSNMineR::ACSN_maps:

# Name of available maps:
names(ACSNMineR::ACSN_maps)

Apoptosis

CellCycle

DNA_repair

EMT_motility

Survival

Master

#And accessing them:
ACSNMineR::ACSN_maps$CellCycle

V1	V2	V3	V4	V5	V6	V7	V8	V9	V10
APC	15	ANAPC1	ANAPC2	CDC27	ANAPC4	ANAPC5	CDC16	ANAPC7	CDC23
APOPTOSIS_ENTRY	10	AKT1	ATM	ATR	CHEK1	CHEK2	E2F1	MDM2	NBN
CDC25	2	CDC25C	CHEK2
CYCLINA	6	CDK2	CCNA1	CCNA2	RBL1	RBL2	CDKN1B
CYCLINB	7	ATM	CDK1	MDM2	PKMYT1	CCNB1	CCNB2	CCNB3
CYCLINC	2	CDK3	CCNC

The gene set that was used for tests is the following:

ACSNMineR::genes_test

ATM

ATR

CHEK2

CREBBP

TFDP1

E2F1

EP300

HDAC1

KAT2B

GTF2H1

GTF2H2

GTF2H2B
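To make the link between this gene set and a map concrete, the base R sketch below counts the genes shared with one module. It uses only the eight APOPTOSIS_ENTRY genes visible in the CellCycle table above (the module has 10 genes, and the other two are not shown there), so the variable apoptosis_entry_shown is an illustrative stand-in, not the full module.

```r
# Genes of the test set, as listed above.
genes_test <- c("ATM", "ATR", "CHEK2", "CREBBP", "TFDP1", "E2F1",
                "EP300", "HDAC1", "KAT2B", "GTF2H1", "GTF2H2", "GTF2H2B")

# The eight APOPTOSIS_ENTRY genes visible in the CellCycle table above.
# The module has 10 genes; the remaining two are not shown in this document.
apoptosis_entry_shown <- c("AKT1", "ATM", "ATR", "CHEK1", "CHEK2",
                           "E2F1", "MDM2", "NBN")

# Genes shared between the test set and the (partial) module.
overlap <- intersect(genes_test, apoptosis_entry_shown)
overlap          # "ATM" "ATR" "CHEK2" "E2F1"
length(overlap)  # 4
```

This overlap is the kind of count that an enrichment test then compares against what would be expected by chance.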

Perform analysis

Gene set enrichment for a single set can be performed by calling:

Example<-ACSNMineR::enrichment(ACSNMineR::genes_test,
    min_module_size = 10, 
    threshold = 0.05,
    maps = list(cellcycle = ACSNMineR::ACSN_maps$CellCycle))

module	module_size	nb_genes_in_module	genes_in_module	universe_size	nb_genes_in_universe	p.value	p.value.corrected	test
APOPTOSIS_ENTRY	10	4	ATM ATR CHEK2 E2F1	227	12	0.0023144	0.011572	greater
E2F1	19	12	ATM ATR CHEK2 CREBBP TFDP1 E2F1 EP300 HDAC1 KAT2B GTF2H1 GTF2H2 GTF2H2B	227	12	0.0000000	0.000000	greater

Where:

genes_test is the character vector of gene names to test
min_module_size is the minimal size of a module to be taken into account
threshold is the maximal p-value that will be displayed in the results (all modules with p-values higher than threshold will be removed)
maps is a list of maps in the format produced by the format_from_gmt() function of the package; here it holds the cell cycle map from ACSN
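As background on what an over-representation test with test = "greater" looks like, here is a generic base R sketch using the counts of the APOPTOSIS_ENTRY row above. It is an assumption that a standard 2 x 2 Fisher exact test is a fair illustration; the package may build its test differently, so the value computed here is not expected to match the p.value column. The threshold comparison at the end mirrors the role of the threshold argument.

```r
# Counts taken from the APOPTOSIS_ENTRY row of the result above.
universe_size <- 227   # universe_size
list_size     <- 12    # nb_genes_in_universe
module_size   <- 10    # module_size
overlap       <- 4     # nb_genes_in_module

# 2 x 2 contingency table, filled column by column:
# rows = in list / not in list, columns = in module / not in module.
tab <- matrix(c(overlap,
                module_size - overlap,
                list_size - overlap,
                universe_size - list_size - module_size + overlap),
              nrow = 2,
              dimnames = list(c("in_list", "not_in_list"),
                              c("in_module", "not_in_module")))

# One-sided Fisher exact test: is the module over-represented in the list?
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

# Equivalent hypergeometric tail: P(X >= overlap) when drawing list_size genes.
p_hyper <- phyper(overlap - 1, module_size, universe_size - module_size,
                  list_size, lower.tail = FALSE)

# A module is reported only if its p-value does not exceed the threshold.
threshold <- 0.05
keep_module <- p_fisher <= threshold
```

The two p-values agree because the one-sided Fisher test on a 2 x 2 table is the upper tail of the hypergeometric distribution.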

Gene set enrichment for multiple sets/cohorts can be performed by calling:

Example<-ACSNMineR::multisample_enrichment(Genes_by_sample = list(set1 = ACSNMineR::genes_test[-1],
                                                              set2 = ACSNMineR::genes_test[-2]),
    maps = ACSNMineR::ACSN_maps$CellCycle,
    min_module_size = 10,
    cohort_threshold = FALSE)

print(Example[[1]])

module	module_size	nb_genes_in_module	genes_in_module	universe_size	nb_genes_in_universe	p.value	p.value.corrected	test
E2F1	19	11	ATR CHEK2 CREBBP TFDP1 E2F1 EP300 HDAC1 KAT2B GTF2H1 GTF2H2 GTF2H2B	227	11	0	1e-07	greater

print(Example[[2]])

module	module_size	nb_genes_in_module	genes_in_module	universe_size	nb_genes_in_universe	p.value	p.value.corrected	test
E2F1	19	11	ATM CHEK2 CREBBP TFDP1 E2F1 EP300 HDAC1 KAT2B GTF2H1 GTF2H2 GTF2H2B	227	11	0	1e-07	greater

Where:

Genes_by_sample is a list of character vectors to test
min_module_size is the minimal size of a module to be taken into account
maps is a list of maps in the format produced by the format_from_gmt() function of the package; here we take the cell cycle map from ACSN
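The input and output shapes used above (a named list of gene vectors in, one result per sample out) can be sketched in base R as follows. The per-sample step here is a plain overlap with the eight APOPTOSIS_ENTRY genes visible earlier in this document, chosen only for illustration; it is not what multisample_enrichment computes.

```r
genes_test <- c("ATM", "ATR", "CHEK2", "CREBBP", "TFDP1", "E2F1",
                "EP300", "HDAC1", "KAT2B", "GTF2H1", "GTF2H2", "GTF2H2B")

# Same construction as in the call above: each set drops one gene.
genes_by_sample <- list(set1 = genes_test[-1],   # without ATM
                        set2 = genes_test[-2])   # without ATR

# Illustrative stand-in for a module (eight of the ten APOPTOSIS_ENTRY genes).
module_shown <- c("AKT1", "ATM", "ATR", "CHEK1", "CHEK2",
                  "E2F1", "MDM2", "NBN")

# Apply the same step to every sample; the result keeps the sample names.
per_sample <- lapply(genes_by_sample,
                     function(genes) intersect(genes, module_shown))

names(per_sample)   # "set1" "set2"
per_sample[[1]]     # same element as per_sample$set1
```

Because the result is a named list, each sample can be retrieved either by position, as in print(Example[[1]]), or by name.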

Data visualization

Results from the enrichment analysis functions can be turned into images with the represent_enrichment function. Two plot types are available: heatmap and barplot.

Heatmaps

Heatmaps representing p-values for a single sample or for multiple samples can easily be generated with the represent_enrichment function.

# theme() and element_text() come from ggplot2
library(ggplot2)

ACSNMineR::represent_enrichment(enrichment = list(
    SampleA = ACSNMineR::enrichment_test[1:10,], 
    SampleB = ACSNMineR::enrichment_test[3:10,]),
    plot = "heatmap", 
    scale = "reverselog",
    low = "steelblue" , high ="white",
    na.value = "grey")+theme(axis.text = element_text(size = 6,angle = 45),
                             legend.text = element_text(size = 6),
                             legend.title = element_text(size = 8))

Where:

enrichment is the result from the enrichment or multisample_enrichment function
scale can be set to identity, log or, as in the call above, reverselog, and will affect the gradient of colors
low: the color for the low (significant) p-values
high: the color for the high (less significant) p-values
na.value is the color used for tiles whose value is "NA"
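The base R sketch below shows why "NA" tiles arise when several samples are compared, and what a reverse-log style scale does to p-values. The two result tables are hypothetical (module names come from this document, p-values are invented), and reading "reverselog" as a negative logarithm, here in base 10, is an assumption rather than something stated above.

```r
# Hypothetical results for two samples; p-values are invented for illustration.
sample_a <- data.frame(module = c("APOPTOSIS_ENTRY", "E2F1", "CYCLINB"),
                       p.value = c(0.002, 0.00001, 0.04),
                       stringsAsFactors = FALSE)
sample_b <- data.frame(module = c("E2F1", "CYCLINB"),
                       p.value = c(0.0001, 0.01),
                       stringsAsFactors = FALSE)
results <- list(SampleA = sample_a, SampleB = sample_b)

# All modules seen in at least one sample.
modules <- sort(unique(unlist(lapply(results, function(r) r$module))))

# Module x sample matrix; a module missing from a sample becomes NA.
p_matrix <- sapply(results, function(r) r$p.value[match(modules, r$module)])
rownames(p_matrix) <- modules

# Assumed reverse-log transform: smaller p-values give larger values.
neg_log_p <- -log10(p_matrix)
neg_log_p
```

In this toy matrix APOPTOSIS_ENTRY has no value for SampleB, which is the situation where a heatmap would draw the tile in the na.value color.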

Barplots

A barplot can be produced with the following call:

ACSNMineR::represent_enrichment(enrichment = list(
    SampleA = ACSNMineR::enrichment_test[1:10,], 
    SampleB = ACSNMineR::enrichment_test[3:10,]),
    plot = "bar", 
    scale = "reverselog")

Where:

enrichment is the result from the enrichment or multisample_enrichment function
scale can be set to identity, log or, as in the call above, reverselog, and will affect the gradient of colors
nrow is the number of rows that should be used to plot all barplots (default is 1)

ACSN description

Paul Deveau

2016-09-01
