The GenomicsDBImport module in GATK

 

Official documentation: https://gatk.broadinstitute.org/hc/en-us/articles/5358869876891-GenomicsDBImport

 

001. Basic usage: building the database for variant calling

gatk --java-options "-Xmx4g -Xms4g" GenomicsDBImport -V SRR21814509.g.vcf -V SRR21814514.g.vcf --genomicsdb-workspace-path my_database --tmp-dir /public/home/b20223040323/tmp -L NC_003070.9

  

002. The -L parameter can specify multiple chromosomes

gatk --java-options "-Xmx4g -Xms4g" GenomicsDBImport -V SRR21814509.g.vcf -V SRR21814514.g.vcf --genomicsdb-workspace-path my_database --tmp-dir /public/home/b20223040323/tmp -L chr.list

Format of chr.list (one chromosome name per line):

NC_003070.9
NC_003071.7
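
When the workspace should cover every sequence in the reference, the list can be derived from the FASTA headers rather than typed by hand. A minimal sketch, assuming the contig name is the first whitespace-delimited token of each header line; the function name make_chr_list is hypothetical and not part of GATK:

```shell
# Hypothetical helper: write one contig name per line, taken from the FASTA headers.
# Usage: make_chr_list reference.fna chr.list
make_chr_list() {
    # Keep header lines, drop the leading ">", keep only the first token.
    grep '^>' "$1" | sed 's/^>//' | awk '{print $1}' > "$2"
}
```

For example, `make_chr_list GCF_000001735.4_TAIR10.1_genomic.fna chr.list`. Any contigs that should not be imported would still need to be deleted from the resulting file.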

 

003. Listing the g.vcf files in a sample map file

gatk --java-options "-Xmx4g -Xms4g" GenomicsDBImport --sample-name-map cohort.sample_map --genomicsdb-workspace-path my_database --tmp-dir /public/home/b20223040323/tmp -L NC_003070.9

 

Format of cohort.sample_map (sample name and g.vcf path, separated by a tab):
SRR21814509     SRR21814509.g.vcf
SRR21814514     SRR21814514.g.vcf
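
With many samples, the map can be generated instead of written by hand. A minimal sketch, assuming each file is named after its sample (for example SRR21814509.g.vcf); the function name make_sample_map is hypothetical:

```shell
# Hypothetical helper: build a tab-separated sample map from the g.vcf files in a directory.
# Usage: make_sample_map gvcf_dir cohort.sample_map
make_sample_map() {
    : > "$2"                                  # start with an empty map file
    for f in "$1"/*.g.vcf; do
        [ -e "$f" ] || continue               # skip if the glob matched nothing
        name=$(basename "$f" .g.vcf)          # assumption: file name = sample name
        printf '%s\t%s\n' "$name" "$f" >> "$2"
    done
}
```

For example, `make_sample_map . cohort.sample_map` writes one line per g.vcf file found in the current directory.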

 

004. Adding samples to update the my_database database

gatk GenomicsDBImport -V SRR21814498.g.vcf --genomicsdb-update-workspace-path my_database --tmp-dir /public/home/b20223040323/tmp
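
The GATK documentation recommends backing up an existing workspace before adding samples, because a failed update can leave it in an inconsistent state. A minimal sketch of such a backup step; the function name backup_workspace and the .bak suffix are assumptions:

```shell
# Hypothetical helper: copy a workspace directory to <name>.bak before updating it.
# Usage: backup_workspace my_database
backup_workspace() {
    if [ ! -d "$1" ]; then
        echo "workspace not found: $1" >&2
        return 1
    fi
    if [ -e "$1.bak" ]; then
        echo "backup already exists: $1.bak" >&2
        return 1
    fi
    cp -r "$1" "$1.bak"
}
```

It could be chained in front of the update, for example `backup_workspace my_database && gatk GenomicsDBImport ...`, so that the import only starts once the copy has succeeded.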

 

005. Variant calling (joint genotyping) on a single chromosome from the database

gatk --java-options "-Xmx60G -XX:+UseParallelGC -XX:ParallelGCThreads=20" GenotypeGVCFs -R /public/home/b20223040323/arabidopsis/fasta/GCF_000001735.4_TAIR10.1_genomic.fna -L NC_003070.9  -V gendb://my_database -O test.vcf
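
If the workspace was built with several chromosomes, as in 002, the same command can be repeated once per chromosome. A minimal sketch that only prints the commands so they can be reviewed before running; the function name print_genotype_cmds and the per-chromosome output name are assumptions, and the --java-options shown above are left out for brevity:

```shell
# Hypothetical helper: print one GenotypeGVCFs command per contig listed in a file.
# Usage: print_genotype_cmds chr.list reference.fna my_database
print_genotype_cmds() {
    while IFS= read -r contig; do
        [ -n "$contig" ] || continue          # skip blank lines
        echo "gatk GenotypeGVCFs -R $2 -L $contig -V gendb://$3 -O $contig.vcf"
    done < "$1"
}
```

The printed lines can be inspected and then piped to `sh` or submitted as separate jobs.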

 

006. Variant calling on a specified range of a single chromosome

gatk --java-options "-Xmx60G -XX:+UseParallelGC -XX:ParallelGCThreads=20" GenotypeGVCFs -R /public/home/b20223040323/arabidopsis/fasta/GCF_000001735.4_TAIR10.1_genomic.fna -L NC_003070.9:1-1000  -V gendb://my_database -O test.vcf
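
Intervals passed to -L have the form contig:start-end, with 1-based, inclusive coordinates. A minimal sketch that splits a chromosome into consecutive windows of that form; the function name make_windows is hypothetical, and the chromosome length must be supplied by the caller:

```shell
# Hypothetical helper: print 1-based, inclusive intervals covering a contig.
# Usage: make_windows contig contig_length window_size
make_windows() {
    start=1
    while [ "$start" -le "$2" ]; do
        end=$((start + $3 - 1))
        [ "$end" -le "$2" ] || end=$2         # clip the last window to the contig end
        echo "$1:$start-$end"
        start=$((end + 1))
    done
}
```

Each printed interval can be used as a -L value in the command above.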

 

Reference:

01. https://gatk.broadinstitute.org/hc/en-us/articles/360047216891-GenomicsDBImport

 

posted @ 2022-12-02 10:09  小鲨鱼2018