Here, we developed a software package (SMDIC) to automated identification of Somatic Mutation-Driven Immune Cell. The operation modes including: i) inferring the relative abundance matrix of tumor-infiltrating immune cells and integrating it with a particular gene mutation status, ii) detecting differential immune cells with respect to the gene mutation status and converting the abundance matrix of significant differential immune cell into two binary matrices (one for up-regulated and one for down-regulated cells), iii) identifying somatic mutation-driven immune cells by comparing the gene mutation status with each immune cell in the binary matrices across all samples, and iv) visualization of immune cell abundance of samples in different mutation status. Its capabilities enable SMDIC to identify somatic mutation-specific immune cell response. SMDIC may contribute to find neoantigens to facilitate the development of tailored immunotherapy for patients with cancer.
This vignette illustrates how to easily use the SMDIC package. With the use of functions in this packages, users could identify the immune cells driven by somatic mutations in tumor microenvironment.
This package provides the exp2cell
function use gene expression profile to quantify cell abundance matrix.
This package provides the maf2matrix
function use mutation annotation file (MAF) format data to build a binary mutations matrix.
This package provides the mutcorcell
function to identify the immune cells driven by somatic mutations in tumor microenvironment .
This package provides the plotwaterfall
function to plot the waterfall for mutation genes which drive immune cells.
This package provides the plotinteractions
function to plot the co-occurrence and mutual exclusivity plots for mutation genes which drive immune cells.
This package provides the heatmapcell
function to draw clustered heatmaps for the cells driven by a somatic mutation.
This package provides the survcell
function to draw Kaplan–Meier curves for survival in the above-median and below-median groups for cell risk score.
We can use function GetExampleData
to return example data and environment variables.
library(SMDIC)
library(GSVA)
exp.example<-GetExampleData("exp.example") # obtain example expression data
cellmatrix.example<-exp2cell(exp.example,method="ssGSEA") #Cell abundance matrix,method must be one of "xCell","ssGSEA" and "CIBERSORT".
#> Warning in .local(expr, gset.idx.list, ...): 659 genes with constant
#> expression values throuhgout the samples.
#view first six rows and six colmns of cell infiltration score matrix.
cellmatrix.example[1:6,1:6]
#> TCGA.A7.A13G.01 TCGA.A7.A26E.01 TCGA.A1.A0SQ.01
#> aDC -0.1721896 -0.2365829 -0.09769543
#> B cells -0.1785598 -0.2060056 -0.21060641
#> CD8 T cells 0.2603276 0.2939651 0.32180621
#> Cytotoxic cells -0.1710623 -0.1744474 -0.13125948
#> DC -0.2110966 -0.2248506 -0.16642146
#> Eosinophils 0.1456574 0.1297275 0.16402532
#> TCGA.GI.A2C9.11 TCGA.E9.A1NE.01 TCGA.A2.A0SX.01
#> aDC -0.11102476 0.30829474 0.255985640
#> B cells -0.15950503 -0.06773615 -0.007057109
#> CD8 T cells 0.34343045 0.42561320 0.337348145
#> Cytotoxic cells 0.04400304 0.29537554 0.125879324
#> DC 0.16974204 -0.01435208 0.109405174
#> Eosinophils 0.12760198 0.13450722 0.068610081
We extract the non-silent somatic mutations (nonsense mutation, missense mutation, frame-shif indels, splice site, nonstop mutation, translation start site, inframe indels) in protein-coding regions, and built a binary mutations matrix, in which 1 represents any mutation occurs in a particular gene in a particular sample, otherwise the element is 0. The genes with a given mutation frequency greater than a threshold value one precent as the default value) are retained for the following analysis.
maf <- system.file("extdata","example.maf.gz",package = "SMDIC") #get path of the mutation annotation file.
mutmatrix.example<-maf2matrix(maf)
head(mutmatrix.example)[,1:6]
#> TCGA.E9.A22G.01 TCGA.OL.A6VO.01 TCGA.AR.A1AP.01 TCGA.BH.A0AW.01
#> TP53 1 1 1 1
#> TP53BP2 0 0 0 0
#> TP53BP1 0 0 0 0
#> PCDH10 0 0 0 0
#> CDH1 0 0 0 0
#> CDH10 0 0 0 0
#> TCGA.E2.A1LG.01 TCGA.A2.A04U.01
#> TP53 1 1
#> TP53BP2 0 0
#> TP53BP1 0 0
#> PCDH10 1 0
#> CDH1 0 0
#> CDH10 0 0
Function mutcorcell
detects the differential immune cells with respect to a particular gene mutation status and construction of two binary matrices based on the abundance matrix of significant differential immune cell (one for up-regulated and one for down-regulated), identification of somatic mutation-driven immune cells by comparing the gene mutation status with each immune cell in the binary matrices.
#prepare data for following analysis.
cellmatrix<-GetExampleData("cellmatrix") # obtain example result from real rasult: cell abundance matrix from real data.
mutmatrix<-GetExampleData("mutmatrix")# select mutmatrix example result from real result: a binary mutations matrix
#mutcell<-mutcorcell(cell = cellmatrix,mutmatrix = mutmatrix,fisher.adjust = TRUE) ## perform the function `mutcorcell`.
result mutcell include three matrices:mut_cell is a binary numerical matrix which shows the immune cells driven by somatic mutation; mut_cell_p is a numerical matrix which shows pvalue of the immune cells driven by somatic mutation; mut_cell_fdr is a numerical matrix which shows fdr of the immune cells driven by somatic mutation; mut_cell_cellresponses is a character matrix which shows the cell responses of the immune cells driven by somatic mutation.
mutcell<-GetExampleData("mutcell") #get the result of the `mutcorcell` function
#view first ten rows and six colmns of mutcell matrix.
mutcell$mut_cell[1:6,1:6]
#> aDC Adipocytes Astrocytes B-cells CD4+ memory T-cells
#> TP53 0 0 1 0 0
#> CDH1 0 1 0 0 0
#> NBPF14 0 0 0 0 0
#> OBSCN 0 0 0 0 0
#> NF1 1 0 0 1 0
#> ANKRD30A 0 0 0 0 0
#> CD4+ T-cells
#> TP53 0
#> CDH1 0
#> NBPF14 0
#> OBSCN 0
#> NF1 0
#> ANKRD30A 0
#mutcell$mut_cell_p
#mutcell$mut_cell_fdr
#mutcell$mut_cell_cellresponses
dim(mutcell$mut_cell)
#> [1] 10 42
Function mutcellsummary
produces result summaries of the results of function mutcorcell
function.
summary<-mutcellsummary(mutcell =mutcell,mutmatrix = mutmatrix,cellmatrix = cellmatrix)# The summary have four columns.The first column are gene names,the second column are the cells driven by the gene,the third column are the number of cells driven by the gene,the fourth column are mutation rates of gene.
head(summary)
#> gene
#> 1 TP53
#> 2 CDH1
#> 3 NBPF14
#> 4 OBSCN
#> 5 NF1
#> 6 KCNA4
#> cells
#> 1 Astrocytes,CD8+ naive T-cells,CD8+ Tem,DC,Keratinocytes,Macrophages,Macrophages M1,Melanocytes,NK cells,pDC,Pericytes,Plasma cells,pro B-cells,Sebocytes,Tgd cells,Th1 cells,Th2 cells,Tregs
#> 2 Adipocytes,CD4+ Tcm,Chondrocytes,CMP,Endothelial cells,Fibroblasts,HSC,ly Endothelial cells,Megakaryocytes,Mesangial cells,mv Endothelial cells,NKT
#> 3 CD8+ Tem,DC,iDC,Keratinocytes,pro B-cells,Sebocytes
#> 4 Epithelial cells,Keratinocytes,pro B-cells,Sebocytes,Th1 cells
#> 5 aDC,B-cells,Memory B-cells,pDC,Plasma cells
#> 6 CD8+ T-cells,CD8+ Tcm,Class-switched memory B-cells,NK cells,pDC
#> count mut rate
#> 1 18 0.34394251
#> 2 12 0.12936345
#> 3 6 0.01334702
#> 4 5 0.02977413
#> 5 5 0.03696099
#> 6 5 0.01129363
Function gene2cellsummary
produces result summaries of the immune cells driven by a somatic mutation.
gene2cellsummary(gene="TP53",method="xCell",mutcell = mutcell) #a matrix shows the short name, full name, pvalue, fdr, sigtype of the cells driven by a somatic mutation
#> gene cell name full name
#> Astrocytes TP53 Astrocytes Astrocytes
#> CD8+ naive T-cells TP53 CD8+ naive T-cells CD8+ naive T-cells
#> CD8+ Tem TP53 CD8+ Tem CD8+ effector memory T-cells
#> DC TP53 DC Dendritic cells
#> Keratinocytes TP53 Keratinocytes Keratinocytes
#> Macrophages TP53 Macrophages Macrophages
#> Macrophages M1 TP53 Macrophages M1 Macrophages M1
#> Melanocytes TP53 Melanocytes Melanocytes
#> NK cells TP53 NK cells NK cells
#> pDC TP53 pDC Plasmacytoid dendritic cells
#> Pericytes TP53 Pericytes Pericytes
#> Plasma cells TP53 Plasma cells Plasma cells
#> pro B-cells TP53 pro B-cells pro B-cells
#> Sebocytes TP53 Sebocytes Sebocytes
#> Tgd cells TP53 Tgd cells Gamma delta T-cells
#> Th1 cells TP53 Th1 cells Type 1 T-helper cells
#> Th2 cells TP53 Th2 cells Type 2 T-helper cells
#> Tregs TP53 Tregs Regulatory T-cells
#> pvalue fdr cell responses
#> Astrocytes 1.010916e-07 1.162554e-06 up
#> CD8+ naive T-cells 3.992545e-03 1.311836e-02 up
#> CD8+ Tem 1.748251e-02 4.467753e-02 up
#> DC 1.604871e-02 4.385580e-02 up
#> Keratinocytes 2.331616e-07 1.981328e-06 up
#> Macrophages 1.820606e-04 1.196398e-03 up
#> Macrophages M1 6.327255e-09 9.701790e-08 up
#> Melanocytes 5.155187e-03 1.580924e-02 up
#> NK cells 5.359277e-04 2.465267e-03 up
#> pDC 2.076020e-03 8.121323e-03 up
#> Pericytes 3.445410e-04 1.981111e-03 up
#> Plasma cells 1.620758e-02 4.385580e-02 up
#> pro B-cells 2.637040e-03 9.331064e-03 up
#> Sebocytes 1.510706e-09 3.474624e-08 up
#> Tgd cells 2.118606e-03 8.121323e-03 up
#> Th1 cells 2.584341e-07 1.981328e-06 up
#> Th2 cells 8.859960e-13 4.075581e-11 up
#> Tregs 4.775752e-04 2.440940e-03 up
Function heatmapcell
provides visualization of the relative abundance of identified immune cells in the gene mutation status using heat maps
library(pheatmap)
heatmapcell(gene = "TP53",mutcell = mutcell,cellmatrix = cellmatrix,mutmatrix = mutmatrix)
Function plotwaterfal
and plotinteractions
provides visualization of the mutation genes correlated with immune cells using waterfall plot and mutually exclusive or co-occurring plot.
#file<-"dir" #dir must be an absolute path or the name relatived to the current working directory.
#plotwaterfall(maffile = file,mutcell.summary = summary,cellnumcuoff =4)
#plotCoocMutex(maffile = file,mutcell.summary = summary,cellnumcuoff =4)
Function survcell
draws Kaplan–Meier curves for survival in the above-median and below-median groups for cell risk score. The cell risk score is calaulated by the weighted mean of cells driven by a gene mutation, where the weight of cells is estimated by the “Univariate” or “Multivariate” cox.
mutcell<-GetExampleData("mutcell") # The result of `mutcorcell` function.
cellmatrix<-GetExampleData("cellmatrix") # Cell abundance matrix
surv<-GetExampleData("surv") # The survival data
survcell(gene ="TP53",mutcell=mutcell,cellmatrix=cellmatrix,surv=surv)