幸福的小恬恬

Insanity is doing the same thing over and over again and expecting different results.疯狂就是重复的做一件事,但期待不同的结果。---爱因斯坦

生信分析常用脚本(二)--SOAPdenovo

1.SOAPDenovo配置文件示例

软件下载安装和使用:http://soap.genomics.org.cn/soapdenovo.html

asm.cfg

#maximal read length
max_rd_len=100 [LIB] avg_ins=450

#if sequence needs to be reversed
reverse_seq=0

#in which part(s) the reads are used asm_flags=3

#use only first 100 bps of each read
rd_len_cutoff=100

#in which order the reads are used while scaffolding
rank=1

# cutoff of pair number for a reliable connection (at least 3 for short insert size)
pair_num_cutoff=3 #minimum aligned length to contigs for a reliable read location (at least 32 for short insert size)
map_len
=32
#a pair of fastq file, read 1 file should always be followed by read 2 file
q1=../../data/newBGIseq500_1.fq.gz q2=../../data/newBGIseq500_2.fq.gz

运行脚本:

run.sh

/home/stu2/Software/Assemblathon1_pipeline/SOAPdenovo-63mer_v2.0 all -s  asm.cfg -K 35 -p 16 -R -o asm 1>cout.log 2>cerr.log
./SOAPdenovo-63mer_v2.0 pregraph -K 63 -s asm.cfg -o asm -p 40 1>pregraph.log 2>pregraph.err
./SOAPdenovo-63mer_v2.0 contig -s asm.cfg -g asm -M 2 -e 1 -p 40 -R -D 2  1>contig.log 2>contig.err
./SOAPdenovo-63mer_v2.0 map -s asm.cfg -g asm -k 31 -p 40 1>map.log 2>map.err
./SOAPdenovo-63mer_v2.0 scaff -g asm -p 40  -F 1>scaff.log 2>scaff.err

 02. VCFtools的使用

软件下载和使用:

https://vcftools.github.io/documentation.html

https://vcftools.github.io/man_latest.html

# get Qual
./vcftools --gzvcf chr17.vcf.gz --site-quality --out Qual
# get interval
./vcftools --gzvcf chr17.vcf.gz --chr chr17 --from-bp 7661779 --to-bp 7687550 --remove-indels --out TP53 --recode

03.变异位点信息统计

#!/usr/bin/perl
use strict;

my $file = shift;

open(In,"gzip -dc $file|") or die ("can't open the file!\n");

my @type;
my @array = ( );
while(<In>){
   chomp;
   next if (/^##/);
   if (/^#/){
     my @line = split;
     push @type, $line[9];
     push @type, $line[10];
     push @type, $line[11];
     next;
   }
   my @line = split;
   if ($line[1] >=7668402 && $line[1] <=7687550 && $line[2] ne "."){
       my @type1 = split(/\/|:/, $line[9]);
       my @type2 = split(/\/|:/, $line[10]);
       my @type3 = split(/\/|:/, $line[11]);
       #print "$type1[0]\t$type1[1]\n";
       if ($type1[0]==$type1[1]){
          $array[0][0] ++;
       }else{
          $array[0][1] ++;
       }
       if ($type2[0] == $type2[1]){
           $array[1][0] ++;
       }else{
           $array[1][1] ++;
       }
       if($type3[0] == $type3[1]){
           $array[2][0] ++;
       }else{
           $array[2][1] ++;
        }
   }
}
close IN;

print "Sample\tHomozygous\tHeterozygote\n";
print "$type[0]: $array[0][0]\t$array[0][1]\n";
print "$type[1]: $array[1][0]\t$array[1][1]\n";
print "$type[2]: $array[2][0]\t$array[2][1]\n";

04. KmerFreq使用脚本

工具包地址:ftp://public.genomics.org.cn/BGI/SOAPdenovo2

kmer.sh

./kmerfreq -k 17 -l lib.list -t 16 >17mer.freq 2>17mer.log

lib.list

1 reads1.fq.gz
2 reads2.fq.gz

 

 

友情参考链接:http://20xue.com/3997.htmlhttps://www.cnblogs.com/azrael-cc/

posted on 2018-11-02 21:48  可闲可填  阅读(1614)  评论(0编辑  收藏  举报

导航