BinQuasi

Emily Goren

June 18, 2018

Introduction

This package provides code to call peaks in ChIP-seq data with biological replicates using the BinQuasi algorithm of Goren, Liu, Wang, and Wang (2018) doi.org/10.1093/bioinformatics/bty227.

Data Preprocessing

BinQuasi accepts sorted and indexed BAM files (note that it does not perform genome alignment of raw reads). If your BAM files are not indexed and sorted, we recommend using samtools.

Peak Calling

Once installed, BinQuasi calls peaks with the function “BQ().” Below is code to run BinQuasi with all default settings, where the sorted and indexed BAM files are stored in the directory specified by “fpath” under the file names “C1.bam”, " C2.bam" and “I1.bam”, “I2.bam” for ChIP and input files, respectively.

library(BinQuasi)
fpath <- paste0(system.file(package = 'BinQuasi'), '/extdata/')
results <- BQ(fpath, 
              ChIP.files = c('C1.bam', 'C2.bam'), 
              control.files = c('I1.bam', 'I2.bam'))
#> Fragment length not provided. Estimating fragment length using cross correlation... please wait...
#> Bin size not provided. Estimating bin size... please wait...
#> Using bin size of 50 bp
#> Using estimated fragment length for C1.bam equal to 100 bp
#> Using estimated fragment length for C2.bam equal to 100 bp
#> Using estimated fragment length for I1.bam equal to 100 bp
#> Using estimated fragment length for I2.bam equal to 100 bp
#> [1] "Analyzing Window # 2"
#> [1] "Analyzing Window # 10"
#> [1] "Analyzing Window # 100"
#> [1] "Analyzing Window # 500"
#> [1] "Analyzing Window # 1000"
#> [1] "Analyzing Window # 2500"
#> [1] "Analyzing Window # 5000"
#> [1] "Analyzing Window # 10000"
#> [1] "Analyzing Window # 15000"
#> [1] "Analyzing Window # 2"
#> [1] "Analyzing Window # 10"
#> [1] "Analyzing Window # 100"
#> [1] "Analyzing Window # 500"
#> [1] "Analyzing Window # 1000"
#> [1] "Analyzing Window # 2500"
#> [1] "Analyzing Window # 5000"
#> [1] "Analyzing Window # 10000"
#> [1] "Analyzing Window # 15000"
#> [1] "Spline scaling factor: 1.7634298697216"
head(results$peaks)
#>   start   end width  chr        P.val        Q.val
#> 1 18051 18200   150 chr4 1.344579e-08 2.834062e-08
#> 2 21951 22100   150 chr4 8.874951e-07 1.216727e-06
#> 3 25401 25550   150 chr4 8.999441e-09 1.961417e-08
#> 4 29851 29950   100 chr4 7.552531e-07 1.052402e-06
#> 5 39551 39650   100 chr4 3.514902e-06 4.268095e-06
#> 6 53001 53100   100 chr4 2.743101e-07 4.020062e-07

See the package documentation for information on changing the default settings.

?BQ

Exporting Results

The code below saves the called peaks in BED format in the file “BinQuasiPeaks.bed”.

# Sort peaks by p-value
opeaks <- results$peaks[order(results$peaks$P.val),]
# Name the peaks by rank
opeaks$name <- paste0('BQ_Peak_', 1:nrow(opeaks))
# Save as .bed file, setting the scores to be -log10(p-value)
bedout <- data.frame(chrom = opeaks$chr,
                     chromStart = opeaks$start,
                     chromEnd = opeaks$end,
                     name = opeaks$name,
                     score = -log10(opeaks$P.val),
                     strand = c(rep(".",  nrow(opeaks))))
head(bedout)
#>   chrom chromStart chromEnd      name    score strand
#> 1  chr4     241451   241650 BQ_Peak_1 16.15782      .
#> 2  chr4     685551   685750 BQ_Peak_2 15.64186      .
#> 3  chr4     697051   697200 BQ_Peak_3 14.69503      .
#> 4  chr4     439301   439500 BQ_Peak_4 13.90893      .
#> 5  chr4     322051   322250 BQ_Peak_5 13.50279      .
#> 6  chr4     650851   651050 BQ_Peak_6 12.78873      .
write.table(bedout, file="BinQuasiPeaks.bed", quote = FALSE, sep = "\t", row.names = FALSE, col.names = FALSE)