10、RNA-seq for DE analysis training (Mapping to assign reads to genes)

1、Goal of mapping

1)We want to assign reads to the genes they were derived from.

2)The result of the mapping will be used to construct a summary of the counts: the count table.
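
As a rough illustration of what "count table" means here, the sketch below tallies reads per gene and per sample. The sample names, gene names, and the `build_count_table` helper are hypothetical; in practice the assignments come from the mapper plus a counting tool, not from hand-written lists.

```python
from collections import Counter

# Hypothetical read-to-gene assignments (one entry per read).
# None marks a read that could not be assigned to any gene.
assignments = {
    "sample_A": ["geneX", "geneX", "geneY", None, "geneZ"],
    "sample_B": ["geneY", "geneY", "geneZ", "geneZ", "geneZ"],
}


def build_count_table(assignments):
    """Return {gene: {sample: count}} from per-sample gene assignments."""
    # Count assigned reads per gene within each sample.
    per_sample = {
        sample: Counter(g for g in genes if g is not None)
        for sample, genes in assignments.items()
    }
    # Union of all genes seen in any sample, so every row has every column.
    all_genes = sorted(set().union(*per_sample.values()))
    return {
        gene: {sample: per_sample[sample][gene] for sample in per_sample}
        for gene in all_genes
    }


count_table = build_count_table(assignments)
for gene, row in count_table.items():
    print(gene, row["sample_A"], row["sample_B"])
```

Each row is a gene and each column a sample; a gene that is absent from a sample gets a count of 0 rather than being dropped.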

2、Different scenarios in RNA-seq

1)Reference genome sequence available

2)No reference genome sequence available

  De novo assembly of the reads   (Trinity: transcriptome construction)

  Map the reads to the assembly   (RSEM mapper)

  Extract count table   (note: no removal of polyA is required. Computationally expensive!)

3、Reads mapped to reference genome


1)The reference is a single haploid sequence, while the sample carries a mixture of alleles; this leads to mismatches.


2)Reads contain sequencing errors


3)Reads are derived from mRNA, while the genome is DNA: reads that span exon-exon junctions do not align contiguously to the genome.
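
To make point 3 concrete: in the SAM format, an alignment's CIGAR string uses the `N` operation for a region of the reference that the read skips, which is how spliced aligners typically report an intron. The sketch below parses a CIGAR string and checks for such a gap. The helper names and the example CIGAR strings are made up for illustration.

```python
import re

# One CIGAR operation: a length followed by an operation letter (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def is_spliced(cigar):
    """True if the alignment skips a reference region (an N operation)."""
    return any(op == "N" for _, op in parse_cigar(cigar))


def reference_span(cigar):
    """Number of reference bases covered, including skipped regions."""
    # M, D, N, = and X are the operations that consume reference positions.
    return sum(n for n, op in parse_cigar(cigar) if op in "MDN=X")


# Hypothetical alignments: one contiguous read, one crossing a 1500 bp intron.
print(is_spliced("100M"), reference_span("100M"))
print(is_spliced("40M1500N60M"), reference_span("40M1500N60M"))
```

A 100-base read that crosses a junction can therefore occupy far more than 100 bases of the genome, which is why RNA-seq reads need a splice-aware mapper.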

4、Visualize SAM or BAM

The outcome of the alignment is a file in SAM or BAM format, which you can visualize in Galaxy or with a stand-alone viewer such as GenomeView or IGV.

Galaxy  https://www.galaxyproject.org/  web-based

GenomeView      stand-alone

IGV          stand-alone
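
Before opening a viewer, it can help to know what one alignment record looks like. SAM is tab-separated text with 11 mandatory fields per record; BAM is its compressed binary counterpart and needs a dedicated library or tool to read. The sketch below parses a single SAM line with plain Python. The `SamRecord` type, the `parse_sam_line` helper, and the example record are hypothetical.

```python
from typing import NamedTuple


class SamRecord(NamedTuple):
    """The 11 mandatory SAM fields, in file order."""
    qname: str   # read name
    flag: int    # bitwise flag (mapped, strand, pairing, ...)
    rname: str   # reference sequence name
    pos: int     # 1-based leftmost mapping position
    mapq: int    # mapping quality
    cigar: str   # alignment description
    rnext: str   # reference name of the mate
    pnext: int   # position of the mate
    tlen: int    # observed template length
    seq: str     # read sequence
    qual: str    # base qualities


def parse_sam_line(line):
    """Parse one alignment line (not a header line starting with '@')."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM record needs at least 11 fields")
    qname, flag, rname, pos, mapq, cigar, rnext, pnext, tlen, seq, qual = fields[:11]
    return SamRecord(qname, int(flag), rname, int(pos), int(mapq),
                     cigar, rnext, int(pnext), int(tlen), seq, qual)


# Made-up record: a 10-base read mapped to chr1 at position 100.
example = "read1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII"
record = parse_sam_line(example)
print(record.rname, record.pos, record.cigar)
```

Any optional tags after the eleventh field are ignored in this sketch.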

5、Mapping QC

RSeQC  http://rseqc.sourceforge.net/         After checking the mapping visually, compute further metrics with RSeQC

BAMQC (Qualimap)   http://qualimap.bioinfo.cipf.es/       mainly useful for DNA-seq
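
One of the simplest mapping QC metrics is the fraction of reads that mapped at all. The sketch below computes it from SAM FLAG values, using the bit meanings defined in the SAM specification. It is not how RSeQC or Qualimap compute their reports; the `mapping_rate` helper and the example flags are assumptions for illustration.

```python
# FLAG bits from the SAM specification.
UNMAPPED = 0x4         # the read itself is unmapped
SECONDARY = 0x100      # secondary alignment of a multi-mapping read
SUPPLEMENTARY = 0x800  # supplementary (chimeric) alignment


def mapping_rate(flags):
    """Fraction of primary records that are mapped.

    Secondary and supplementary records are skipped so that a read with
    several reported alignments is counted only once.
    """
    primary = [f for f in flags if not f & (SECONDARY | SUPPLEMENTARY)]
    if not primary:
        return 0.0
    mapped = sum(1 for f in primary if not f & UNMAPPED)
    return mapped / len(primary)


# Hypothetical FLAG column: forward hit, reverse hit, unmapped read,
# a secondary alignment (ignored), and another forward hit.
flags = [0, 16, 4, 256, 0]
print(f"{mapping_rate(flags):.2f}")
```

For paired-end data each mate has its own record, so this counts mates rather than fragments.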

Exercise:  http://wiki.bits.vib.be/index.php/RNA-Seq_analysis_for_differential_expression#Mapping_processed_data


