SNP鉴定的标准流程有三个步骤:

一、mapping

   使用BWA MEM将每个样本的测序数据比对到组装的参考序列。建立索引和比对:

   $ bwa index ref.fa

   $ bwa mem –t 40 ref.fa read1.fq read2.fq >aln-pe.sam

二、转换

   $ samtools view -bS aln-pe.sam > aln-pe.bam

   $ samtools sort -@ 40 aln-6.bam > aln-6.sorted.bam

                            $ samtools sort –o aln-pe-sorted.bam aln-pe.bam

   $ samtools index aln-pe.sorted.bam

三、SNP calling

   $ samtools mpileup -uf Trinity-longest.fasta aln-6.sorted.bam | bcftools call -Ou -mv > var.raw.bcf

  $ bcftools view var.raw.bcf | bcftools filter -e 'QUAL<30 || DP<20' > val.filter.vcf

  $ bcftools view val.filter.vcf | bcftools filter -i 'TYPE="snp"' > val.snp.vcf

                                        $ bcftools mpileup –Ou –f ref.fa aln-pe.sorted.bam | bcftools call –Ou –mv | bcftools filter -s LowQual -e ‘%QUAL<20 || DP<10’ > var.flt.vcf

https://wikis.utexas.edu/display/bioiteam/variant+calling+using+samtools

 The basic Command line

Suppose we have reference sequences in ref.fa, indexed by samtools faidx, and position sorted alignment files aln1.bam and aln2.bam, the following command lines call SNPs and short INDELs:

  1. samtools mpileup -uf ref.fa aln1.bam aln2.bam | bcftools view -bvcg - > var.raw.bcf  
  2. bcftools view var.raw.bcf | vcfutils.pl varFilter -D100 > var.flt.vcf  

where the -D option sets the maximum read depth to call a SNP. SAMtools acquires sample information from the SM tag in the @RG header lines. One alignment file can contain multiple samples; reads from one sample can also be distributed in different alignment files. SAMtools will regroup the reads anyway. In addition, if no @RG lines are present, each alignment file is taken as one sample.

Since r865, it is possible to generate the consensus sequence with

  1. samtools mpileup -uf ref.fa aln.bam | bcftools view -cg - | vcfutils.pl vcf2fq > cns.fq