samtools 实现对bam文件统计测序深度 和 比对率

 

001、利用samtools计算每一个位点的测序深度

[b20223040323@admin1 test]$ ls                             ## 测试bam文件
SRR21814498.sorted.markdup.bam  SRR21814498.sorted.markdup.bam.bai
[b20223040323@admin1 test]$ samtools depth -aa SRR21814498.sorted.markdup.bam > all_depth.txt   ## 利用samtools统计参考基因组所有位点的测序深度
[b20223040323@admin1 test]$ ls                             ## 结果文件
all_depth.txt  SRR21814498.sorted.markdup.bam  SRR21814498.sorted.markdup.bam.bai
[b20223040323@admin1 test]$ head all_depth.txt             ## 查看前10行
NC_003070.9     1       0
NC_003070.9     2       0
NC_003070.9     3       0
NC_003070.9     4       0
NC_003070.9     5       0
NC_003070.9     6       0
NC_003070.9     7       0
NC_003070.9     8       0
NC_003070.9     9       0
NC_003070.9     10      0

 

 

002、统计比对到参考基因组的位点的平均测序深度

[b20223040323@admin1 test]$ ls
all_depth.txt  SRR21814498.sorted.markdup.bam  SRR21814498.sorted.markdup.bam.bai
[b20223040323@admin1 test]$ head all_depth.txt           ## 所有位点的测序深度信息
NC_003070.9     1       0
NC_003070.9     2       0
NC_003070.9     3       0
NC_003070.9     4       0
NC_003070.9     5       0
NC_003070.9     6       0
NC_003070.9     7       0
NC_003070.9     8       0
NC_003070.9     9       0
NC_003070.9     10      0
[b20223040323@admin1 test]$ awk -F "\t" 'BEGIN{a = 0; b = 0} {if($3 > 0) {a++; b += $3}} END {print b/a}' all_depth.txt       ## 统计测到的位点的平均的测序深度
9.88724

 

 

003、统计基因组所有位点,包括未测到的位点的平均测序深度

[b20223040323@admin1 test]$ ls
all_depth.txt  SRR21814498.sorted.markdup.bam  SRR21814498.sorted.markdup.bam.bai
[b20223040323@admin1 test]$ head all_depth.txt
NC_003070.9     1       0
NC_003070.9     2       0
NC_003070.9     3       0
NC_003070.9     4       0
NC_003070.9     5       0
NC_003070.9     6       0
NC_003070.9     7       0
NC_003070.9     8       0
NC_003070.9     9       0
NC_003070.9     10      0
[b20223040323@admin1 test]$ awk -F "\t" 'BEGIN{a = 0} {if($3 > 0) {a += $3}} END {print a/NR}' all_depth.txt    ## 统计参考基因组所有位点平均测序深度
8.84949

 

 

004、统计覆盖度

[b20223040323@admin1 test]$ ls
all_depth.txt  SRR21814498.sorted.markdup.bam  SRR21814498.sorted.markdup.bam.bai
[b20223040323@admin1 test]$ head all_depth.txt
NC_003070.9     1       0
NC_003070.9     2       0
NC_003070.9     3       0
NC_003070.9     4       0
NC_003070.9     5       0
NC_003070.9     6       0
NC_003070.9     7       0
NC_003070.9     8       0
NC_003070.9     9       0
NC_003070.9     10      0
[b20223040323@admin1 test]$ awk -F "\t" 'BEGIN{a = 0} {if($3 > 0) {a++}} END {print a/NR}' all_depth.txt       ## 统计测序覆盖度
0.895042

 

 

参考:https://www.jianshu.com/p/60b0fda0a591

 

posted @ 2023-02-11 22:44  小鲨鱼2018  阅读(1865)  评论(0编辑  收藏  举报