Counting the reads mapped by TopHat
samtools flagstat /SRA111111/SRR111222/accepted_hits.bam
78406056 + 0 in total (QC-passed reads + QC-failed reads) (1)
0 + 0 duplicates
78406056 + 0 mapped (100.00%:-nan%) (2)
78406056 + 0 paired in sequencing (3)
39915264 + 0 read1 (4)
38490792 + 0 read2 (5)
68310778 + 0 properly paired (87.12%:-nan%) (6)
73600312 + 0 with itself and mate mapped (7)
4805744 + 0 singletons (6.13%:-nan%) (8)
1208374 + 0 with mate mapped to a different chr (9)
115100 + 0 with mate mapped to a different chr (mapQ>=5) (10)
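(2) = (7) + (8): mapped = with itself and mate mapped + singletons (78406056 = 73600312 + 4805744)
(3) = (4) + (5): paired in sequencing = read1 + read2 (78406056 = 39915264 + 38490792)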
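The two identities can be checked mechanically. The following is a minimal sketch, assuming the older flagstat layout shown above, where the fourth whitespace-separated field names the category. The function name check_flagstat is made up for illustration, and the numbers are the ones from the run above, pasted in as text so that samtools itself is not needed.

```shell
# flagstat output copied from the run above, without the (1)..(10) labels.
flagstat_output='78406056 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
78406056 + 0 mapped (100.00%:-nan%)
78406056 + 0 paired in sequencing
39915264 + 0 read1
38490792 + 0 read2
68310778 + 0 properly paired (87.12%:-nan%)
73600312 + 0 with itself and mate mapped
4805744 + 0 singletons (6.13%:-nan%)
1208374 + 0 with mate mapped to a different chr
115100 + 0 with mate mapped to a different chr (mapQ>=5)'

# Hypothetical helper: reads flagstat text on stdin and tests both identities.
check_flagstat() {
  awk '
    $4 == "mapped"                    { mapped = $1 }   # line (2)
    $4 == "paired"                    { paired = $1 }   # line (3)
    $4 == "read1"                     { r1 = $1 }       # line (4)
    $4 == "read2"                     { r2 = $1 }       # line (5)
    $4 == "with" && $5 == "itself"    { both = $1 }     # line (7)
    $4 == "singletons"                { single = $1 }   # line (8)
    END {
      printf "mapped = both + singletons: %s\n", ((mapped == both + single) ? "OK" : "MISMATCH")
      printf "paired = read1 + read2: %s\n", ((paired == r1 + r2) ? "OK" : "MISMATCH")
    }'
}

result=$(printf '%s\n' "$flagstat_output" | check_flagstat)
echo "$result"
```

On a real file the same function could be fed with samtools flagstat accepted_hits.bam | check_flagstat. Newer samtools releases print extra lines and a slightly different percentage format, so the field positions may need adjusting there.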
Usage: samtools flagstat <in.bam>
$ samtools flagstat example.bam
11945742 + 0 in total (QC-passed reads + QC-failed reads)  # total number of reads
0 + 0 duplicates
7536364 + 0 mapped (63.09%:-nan%)  # overall mapping rate of the reads
11945742 + 0 paired in sequencing  # number of reads that belong to a pair
5972871 + 0 read1  # number of read1 reads
5972871 + 0 read2  # number of read2 reads
6412042 + 0 properly paired (53.68%:-nan%)  # properly paired reads: both mates map to the same reference sequence and the distance between them is within the configured threshold
6899708 + 0 with itself and mate mapped  # paired reads where both mates map to the reference
636656 + 0 singletons (5.33%:-nan%)  # reads that map while their mate does not; added to the previous line, this gives the total number of mapped reads
469868 + 0 with mate mapped to a different chr  # paired reads whose two mates map to different reference sequences
243047 + 0 with mate mapped to a different chr (mapQ>=5)  # same as the previous line, restricted to reads with mapping quality >= 5
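The percentages in the example.bam output can be reproduced by dividing each count by the total on the first line. This is a small sketch using the numbers from that output. The helper name pct is an assumption for illustration, and it relies on the total and the paired-in-sequencing count being equal here, so the choice of denominator does not matter for this example.

```shell
# Counts taken from the example.bam output above.
total=11945742

# Hypothetical helper: print count/total as a percentage with two decimals.
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
pct() {
  LC_ALL=C awk -v n="$1" -v d="$total" 'BEGIN { printf "%.2f", 100 * n / d }'
}

mapped_pct=$(pct 7536364)     # mapped
proper_pct=$(pct 6412042)     # properly paired
single_pct=$(pct 636656)      # singletons

echo "mapped: ${mapped_pct}%  properly paired: ${proper_pct}%  singletons: ${single_pct}%"
```

The results match the 63.09%, 53.68% and 5.33% printed by flagstat in the example.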
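flagstat counts alignment records, so a read that TopHat reports at several locations is counted more than once. To count distinct read names instead:
samtools view ./accepted_hits.bam | cut -f1 | sort | uniq | wc -l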
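To see why the number of distinct names can differ from the number of records, here is a toy sketch that needs no BAM file. The four records and the names readA, readB and readC are invented for illustration. Only the first column (the read name) matters, as in real samtools view output.

```shell
# Toy stand-in for `samtools view` output: one alignment record per line,
# tab-separated, read name in column 1. readA is reported at two locations.
toy_records=$(printf '%s\t%s\t%s\n' \
  readA 0 chr1 \
  readA 256 chr2 \
  readB 0 chr1 \
  readC 16 chr3)

# Number of alignment records (what a per-record count sees).
n_records=$(printf '%s\n' "$toy_records" | wc -l | tr -d ' ')

# Number of distinct read names (same pipeline as above; sort -u = sort | uniq).
n_names=$(printf '%s\n' "$toy_records" | cut -f1 | sort -u | wc -l | tr -d ' ')

echo "records: $n_records, unique read names: $n_names"
```

This prints "records: 4, unique read names: 3". With paired-end data, the two mates of a pair normally share one read name, so the unique-name count is closer to a count of fragments than of individual reads.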
REF:
https://www.biostars.org/p/84396/
https://www.biostars.org/p/12475/
http://seqanswers.com/forums/showthread.php?t=16500
http://sourceforge.net/p/samtools/mailman/message/31201762/
http://xushengwang.blogspot.com/2010/09/interpreting-samtools-flagstat-output.html
http://genomespot.blogspot.com/2014/09/data-analysis-step-3-align-paired-end.html
http://seqanswers.com/forums/showthread.php?t=19844