[bio-tips] genomeCoverageBed introduction

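After sequencing and alignment you end up with a BAM/SAM file, and the next step is usually to visualize the coverage as a histogram-style track. Both the WIG and BedGraph formats work for this, so what we need is a tool that converts a BAM file into a BedGraph or WIG file. genomeCoverageBed does exactly that. Note that its `-ibam` option reads BAM, so a SAM file has to be converted to BAM first.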

Before converting, prepare two things.

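1. The BAM file must be sorted by position with SAMtools (the help text for `-ibam` requires it). The input is `mapped.bam`, and the sorted output file is `sort.mapped.bam`. With samtools 1.x the output file is given with `-o`. The legacy 0.1.x form was `samtools sort mapped.bam sort.mapped`, where the second argument is an output prefix to which `.bam` is appended.

samtools sort -o sort.mapped.bam mapped.bam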

2. A chromInfo file is needed.

The chromInfo file is a tab-separated list that records the length of each chromosome. It is available from the UCSC Genome Browser.

#chrom    size    
chr1    197195432
chr2    181748087
chr3    159599783
chr4    155630120
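
If the reference FASTA is at hand, the same two columns can also be taken from its `samtools faidx` index, whose first two fields are the sequence name and its length. A minimal sketch, where the function name `fai_to_chrominfo` and the file name `genome.fa` are only illustrative assumptions:

```shell
# Print "name<TAB>length" for every sequence listed in a FASTA index (.fai).
# The .fai file is assumed to come from `samtools faidx genome.fa`.
fai_to_chrominfo() {
    cut -f1,2 "$1"
}

# Hypothetical usage:
#   samtools faidx genome.fa
#   fai_to_chrominfo genome.fa.fai > genome.chromInfo
```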

 

Next, let's hit it. 

1. BAM2Bedgraph

genomeCoverageBed -bg -ibam sort.mapped.bam -g genome.chromInfo >genomewide.bedgraph

2. BAM2Wig

genomeCoverageBed -d -strand + -ibam sort.mapped.bam -g genome.chromInfo >genomewide.wig

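Here `-strand +` restricts the calculation to forward-strand coverage. WIG is a 1-based coordinate format, so we use the `-d` option, which reports the depth at each position with 1-based coordinates. Note that the output is plain three-column text (chromosome, position, depth) rather than a complete WIG file, so it still has to be converted before it can be loaded as a WIG track.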
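
A `variableStep` WIG file needs a `variableStep chrom=...` declaration line before each chromosome's `position value` lines, which the three-column `-d` output does not contain. A minimal awk sketch that adds them (the function name `depth_to_wig` is hypothetical, and the input is assumed to be grouped by chromosome; zero-depth positions are dropped to keep the file small):

```shell
# Convert "chrom<TAB>pos<TAB>depth" lines (1-based positions) on stdin
# into variableStep WIG on stdout.
depth_to_wig() {
    awk -F'\t' '
        # Start a new declaration line whenever the chromosome changes.
        $1 != chrom { chrom = $1; print "variableStep chrom=" chrom }
        # Emit "position value" for every covered base.
        $3 != 0     { print $2, $3 }
    '
}

# Hypothetical usage:
#   genomeCoverageBed -d -strand + -ibam sort.mapped.bam -g genome.chromInfo | depth_to_wig > genomewide.wig
```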

====

The full help text of the genomeCoverageBed tool is attached below. The algorithm itself is simple; I once managed to write a Perl version of it.
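
To illustrate how simple the idea is, here is a minimal awk sketch of coverage counting. It is not the Perl version mentioned above and not necessarily how bedtools implements it. It assumes BED input (0-based, half-open intervals) grouped by chromosome, marks +1 at every interval start and -1 at every end, and then sweeps along the chromosome to emit runs of constant non-zero depth in BedGraph form, similar to `-bg`. The function name `bed_to_bedgraph` is hypothetical, and the position-by-position sweep is meant for small inputs rather than speed.

```shell
# Read BED intervals (chrom<TAB>start<TAB>end) on stdin, grouped by chromosome,
# and print BedGraph lines (chrom<TAB>start<TAB>end<TAB>depth) on stdout.
bed_to_bedgraph() {
    awk -F'\t' -v OFS='\t' '
        # Emit BedGraph runs for the chromosome collected so far, then reset.
        function emit_runs(    pos, depth, run_start) {
            depth = 0
            for (pos = 0; pos <= max_end; pos++) {
                if (!(pos in delta) || delta[pos] == 0) continue
                if (depth > 0) print chrom, run_start, pos, depth
                depth += delta[pos]
                run_start = pos
            }
            split("", delta)   # portable way to clear an awk array
            max_end = 0
        }
        $1 != chrom { if (chrom != "") emit_runs(); chrom = $1 }
        {
            delta[$2 + 0]++                          # coverage rises at the start
            delta[$3 + 0]--                          # and falls at the end
            if ($3 + 0 > max_end) max_end = $3 + 0
        }
        END { if (chrom != "") emit_runs() }
    '
}
```

For two overlapping intervals `chr1 0 10` and `chr1 5 15`, this reports depth 1 on 0-5, depth 2 on 5-10, and depth 1 on 10-15.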


 Usage: bedtools genomecov [OPTIONS] -i <bed/gff/vcf> -g <genome>

Options: 
    -ibam        The input file is in BAM format.
            Note: BAM _must_ be sorted by position

    -d        Report the depth at each genome position (with one-based coordinates).
            Default behavior is to report a histogram.

    -dz        Report the depth at each genome position (with zero-based coordinates).
            Reports only non-zero positions.
            Default behavior is to report a histogram.

    -bg        Report depth in BedGraph format. For details, see:
            genome.ucsc.edu/goldenPath/help/bedgraph.html

    -bga        Report depth in BedGraph format, as above (-bg).
            However with this option, regions with zero 
            coverage are also reported. This allows one to
            quickly extract all regions of a genome with 0 
            coverage by applying: "grep -w 0$" to the output.

    -split        Treat "split" BAM or BED12 entries as distinct BED intervals.
            when computing coverage.
            For BAM files, this uses the CIGAR "N" and "D" operations 
            to infer the blocks for computing coverage.
            For BED12 files, this uses the BlockCount, BlockStarts, and BlockEnds
            fields (i.e., columns 10,11,12).

    -strand        Calculate coverage of intervals from a specific strand.
            With BED files, requires at least 6 columns (strand is column 6). 
            - (STRING): can be + or -

    -5        Calculate coverage of 5" positions (instead of entire interval).

    -3        Calculate coverage of 3" positions (instead of entire interval).

    -max        Combine all positions with a depth >= max into
            a single bin in the histogram. Irrelevant
            for -d and -bedGraph
            - (INTEGER)

    -scale        Scale the coverage by a constant factor.
            Each coverage value is multiplied by this factor before being reported.
            Useful for normalizing coverage by, e.g., reads per million (RPM).
            - Default is 1.0; i.e., unscaled.
            - (FLOAT)

    -trackline    Adds a UCSC/Genome-Browser track line definition in the first line of the output.
            - See here for more details about track line definition:
                  http://genome.ucsc.edu/goldenPath/help/bedgraph.html
            - NOTE: When adding a trackline definition, the output BedGraph can be easily
                  uploaded to the Genome Browser as a custom track,
                  BUT CAN NOT be converted into a BigWig file (w/o removing the first line).

    -trackopts    Writes additional track line definition parameters in the first line.
            - Example:
               -trackopts 'name="My Track" visibility=2 color=255,30,30'
               Note the use of single-quotes if you have spaces in your parameters.
            - (TEXT)
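
As a worked example for `-scale`: to normalize to reads per million (RPM), the factor is 1,000,000 divided by the number of mapped reads. A minimal sketch, in which the function name `rpm_scale` and the output file name are hypothetical, and the read count is assumed to come from something like `samtools view -c -F 4 sort.mapped.bam` (which counts mapped alignment records):

```shell
# Print the -scale factor that converts raw depth to reads per million (RPM).
# $1 is the number of mapped reads.
rpm_scale() {
    awk -v n="$1" 'BEGIN { printf "%g\n", 1000000 / n }'
}

# Hypothetical usage:
#   n=$(samtools view -c -F 4 sort.mapped.bam)
#   genomeCoverageBed -bg -scale "$(rpm_scale "$n")" -ibam sort.mapped.bam -g genome.chromInfo > genomewide.rpm.bedgraph
```

With 2,000,000 mapped reads the factor is 0.5, so a raw depth of 10 is reported as 5.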

 

 

posted @ 2012-09-24 11:24  Puriney