生信分析常用脚本(二)--SOAPdenovo
1.SOAPDenovo配置文件示例
软件下载安装和使用:http://soap.genomics.org.cn/soapdenovo.html
asm.cfg
#maximal read length
max_rd_len=100
[LIB]
avg_ins=450
#if sequence needs to be reversed
reverse_seq=0
#in which part(s) the reads are used
asm_flags=3
#use only first 100 bps of each read
rd_len_cutoff=100
#in which order the reads are used while scaffolding
rank=1
# cutoff of pair number for a reliable connection (at least 3 for short insert size)
pair_num_cutoff=3
#minimum aligned length to contigs for a reliable read location (at least 32 for short insert size)
map_len=32
#a pair of fastq file, read 1 file should always be followed by read 2 file
q1=../../data/newBGIseq500_1.fq.gz
q2=../../data/newBGIseq500_2.fq.gz
运行脚本:
run.sh
/home/stu2/Software/Assemblathon1_pipeline/SOAPdenovo-63mer_v2.0 all -s asm.cfg -K 35 -p 16 -R -o asm 1>cout.log 2>cerr.log
./SOAPdenovo-63mer_v2.0 pregraph -K 63 -s asm.cfg -o asm -p 40 1>pregraph.log 2>pregraph.err
./SOAPdenovo-63mer_v2.0 contig -s asm.cfg -g asm -M 2 -e 1 -p 40 -R -D 2 1>contig.log 2>contig.err
./SOAPdenovo-63mer_v2.0 map -s asm.cfg -g asm -k 31 -p 40 1>map.log 2>map.err
./SOAPdenovo-63mer_v2.0 scaff -g asm -p 40 -F 1>scaff.log 2>scaff.err
02. VCFtools的使用
软件下载和使用:
https://vcftools.github.io/documentation.html
https://vcftools.github.io/man_latest.html
# get Qual ./vcftools --gzvcf chr17.vcf.gz --site-quality --out Qual # get interval ./vcftools --gzvcf chr17.vcf.gz --chr chr17 --from-bp 7661779 --to-bp 7687550 --remove-indels --out TP53 --recode
03.变异位点信息统计
#!/usr/bin/perl use strict; my $file = shift; open(In,"gzip -dc $file|") or die ("can't open the file!\n"); my @type; my @array = ( ); while(<In>){ chomp; next if (/^##/); if (/^#/){ my @line = split; push @type, $line[9]; push @type, $line[10]; push @type, $line[11]; next; } my @line = split; if ($line[1] >=7668402 && $line[1] <=7687550 && $line[2] ne "."){ my @type1 = split(/\/|:/, $line[9]); my @type2 = split(/\/|:/, $line[10]); my @type3 = split(/\/|:/, $line[11]); #print "$type1[0]\t$type1[1]\n"; if ($type1[0]==$type1[1]){ $array[0][0] ++; }else{ $array[0][1] ++; } if ($type2[0] == $type2[1]){ $array[1][0] ++; }else{ $array[1][1] ++; } if($type3[0] == $type3[1]){ $array[2][0] ++; }else{ $array[2][1] ++; } } } close IN; print "Sample\tHomozygous\tHeterozygote\n"; print "$type[0]: $array[0][0]\t$array[0][1]\n"; print "$type[1]: $array[1][0]\t$array[1][1]\n"; print "$type[2]: $array[2][0]\t$array[2][1]\n";
04. KmerFreq使用脚本
工具包地址:ftp://public.genomics.org.cn/BGI/SOAPdenovo2
kmer.sh
./kmerfreq -k 17 -l lib.list -t 16 >17mer.freq 2>17mer.log
lib.list
1 reads1.fq.gz 2 reads2.fq.gz
友情参考链接:http://20xue.com/3997.html,https://www.cnblogs.com/azrael-cc/
越努力,越幸运,越是烦躁,越要静心,it's be OK!