TCGA_DNA_seq Analysis
DNA-Seq Alignment Command Line Parameters
STEP 1: CONVERTING BAMS TO FASTQS WITH BIOBAMBAM - BIOBAMBAM2 2.0.54
bamtofastq \
    collate=1 \
    exclude=QCFAIL,SECONDARY,SUPPLEMENTARY \
    filename=<input.bam> \
    gz=1 \
    inputformat=bam \
    level=5 \
    outputdir=<output_path> \
    outputperreadgroup=1 \
    outputperreadgroupsuffixF=_1.fq.gz \
    outputperreadgroupsuffixF2=_2.fq.gz \
    outputperreadgroupsuffixO=_o1.fq.gz \
    outputperreadgroupsuffixO2=_o2.fq.gz \
    outputperreadgroupsuffixS=_s.fq.gz \
    tryoq=1
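Because outputperreadgroup=1 writes a separate set of files per read group, the alignment step has to match each _1.fq.gz file with its _2.fq.gz mate. The following is a minimal sketch of that pairing, assuming the files sit in a single directory and are named <read_group>_1.fq.gz and <read_group>_2.fq.gz. The helper name list_fastq_pairs is hypothetical and not part of biobambam2.

```shell
# Hypothetical helper: print "read_group<TAB>mate1<TAB>mate2" for every
# read group whose two paired-end FASTQ files are both present.
# Assumes files are named <read_group>_1.fq.gz / <read_group>_2.fq.gz.
list_fastq_pairs() {
    dir=$1
    for fq1 in "$dir"/*_1.fq.gz; do
        [ -e "$fq1" ] || continue            # glob matched nothing
        fq2="${fq1%_1.fq.gz}_2.fq.gz"        # derive the mate's file name
        rg=$(basename "$fq1" _1.fq.gz)       # read group = name minus suffix
        if [ -e "$fq2" ]; then
            printf '%s\t%s\t%s\n' "$rg" "$fq1" "$fq2"
        else
            echo "warning: no mate file for $fq1" >&2
        fi
    done
}
```

Orphan (_o1, _o2) and single-end (_s) files are not matched by the *_1.fq.gz pattern, so they are left out of the pair list.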
STEP 2: BWA ALIGNMENT - BWA 0.7.15 - SAMTOOLS 1.3.1
If mean read length is greater than or equal to 70bp:
bwa mem \
    -t 8 \
    -T 0 \
    -R <read_group> \
    <reference> \
    <fastq_1.fq.gz> \
    <fastq_2.fq.gz> |
    samtools view -Shb -o <output.bam> -
If mean read length is less than 70bp:
bwa aln -t 8 <reference> <fastq_1.fq.gz> > <sai_1.sai> &&
bwa aln -t 8 <reference> <fastq_2.fq.gz> > <sai_2.sai> &&
bwa sampe -r <read_group> <reference> <sai_1.sai> <sai_2.sai> <fastq_1.fq.gz> <fastq_2.fq.gz> |
    samtools view -Shb -o <output.bam> -
If the quality scores are encoded as Illumina 1.3 or 1.5, use bwa aln with the "-I" flag (uppercase i), which tells BWA that the input uses the Illumina 1.3+ quality encoding.
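The 70 bp rule in this step can be checked before alignment. The sketch below computes the mean read length of a gzip-compressed FASTQ and reports which BWA algorithm the rule selects. The function names mean_read_length and choose_aligner are hypothetical, and the sketch assumes standard four-line FASTQ records.

```shell
# Mean read length of a gzip-compressed FASTQ. Assumes 4-line records,
# so the sequence is line 2 of every record.
mean_read_length() {
    gzip -dc "$1" | awk '
        NR % 4 == 2 { total += length($0); n++ }
        END { if (n > 0) printf "%.1f\n", total / n; else print 0 }'
}

# Print "mem" for a mean length of 70 bp or more, "aln" otherwise.
choose_aligner() {
    awk -v len="$1" 'BEGIN { if (len >= 70) print "mem"; else print "aln" }'
}
```

For example, choose_aligner "$(mean_read_length <fastq_1.fq.gz>)" prints the algorithm name to branch on.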
STEP 3: BAM SORT - PICARD 2.6.0
java -jar picard.jar SortSam \
    CREATE_INDEX=true \
    INPUT=<input.bam> \
    OUTPUT=<output.bam> \
    SORT_ORDER=coordinate \
    VALIDATION_STRINGENCY=STRICT
STEP 4: BAM MERGE - PICARD 2.6.0
java -jar picard.jar MergeSamFiles \
    ASSUME_SORTED=false \
    CREATE_INDEX=true \
    [INPUT=<input.bam>] \
    MERGE_SEQUENCE_DICTIONARIES=false \
    OUTPUT=<output_path> \
    SORT_ORDER=coordinate \
    USE_THREADING=true \
    VALIDATION_STRINGENCY=STRICT
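The bracketed INPUT argument is read here as "repeat once per input BAM", which is how Picard accepts multiple inputs. The sketch below expands a list of BAM paths into repeated INPUT= arguments using only POSIX shell features. The function name merge_bams and the MERGE_RUNNER variable are hypothetical; MERGE_RUNNER exists only so the command can be printed instead of run.

```shell
# Hypothetical wrapper: merge_bams <output.bam> <input.bam>...
# Rewrites each input path as INPUT=<path> and passes them all to
# MergeSamFiles with the parameters listed in this step.
merge_bams() {
    merged=$1
    shift
    n=$#
    # Rotate through the positional parameters, appending INPUT=<path>
    # for each original argument and dropping the original.
    while [ "$n" -gt 0 ]; do
        set -- "$@" "INPUT=$1"
        shift
        n=$((n - 1))
    done
    # MERGE_RUNNER is an assumption of this sketch: set it to "echo"
    # for a dry run; it defaults to the Picard invocation.
    ${MERGE_RUNNER:-java -jar picard.jar} MergeSamFiles \
        ASSUME_SORTED=false \
        CREATE_INDEX=true \
        "$@" \
        MERGE_SEQUENCE_DICTIONARIES=false \
        OUTPUT="$merged" \
        SORT_ORDER=coordinate \
        USE_THREADING=true \
        VALIDATION_STRINGENCY=STRICT
}
```

Setting MERGE_RUNNER=echo before calling merge_bams merged.bam rg1.bam rg2.bam prints the assembled argument list without invoking Java.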
STEP 5: MARK DUPLICATES - PICARD 2.6.0
java -jar picard.jar MarkDuplicates \
    CREATE_INDEX=true \
    INPUT=<input.bam> \
    VALIDATION_STRINGENCY=STRICT
DNA-Seq Co-Cleaning Command Line Parameters
STEP 1: REALIGNERTARGETCREATOR
java -jar GenomeAnalysisTK.jar \
    -T RealignerTargetCreator \
    -R <reference> \
    -known <known_indels.vcf> \
    [ -I <input.bam> ] \
    -o <realign_target.intervals>
STEP 2: INDELREALIGNER
java -jar GenomeAnalysisTK.jar \
    -T IndelRealigner \
    -R <reference> \
    -known <known_indels.vcf> \
    -targetIntervals <realign_target.intervals> \
    --noOriginalAlignmentTags \
    [ -I <input.bam> ] \
    -nWayOut <output.map>
STEP 3: BASERECALIBRATOR
java -jar GenomeAnalysisTK.jar \
    -T BaseRecalibrator \
    -R <reference> \
    -I <input.bam> \
    -knownSites <dbsnp.vcf> \
    -o <bqsr.grp>
STEP 4: PRINTREADS
java -jar GenomeAnalysisTK.jar \
    -T PrintReads \
    -R <reference> \
    -I <input.bam> \
    --BQSR <bqsr.grp> \
    -o <output.bam>
Variant Call Command-Line Parameters
MUSE
MuSEv1.0rc_submission_c039ffa
Step 1: MuSE call
MuSE call \
    -f <reference> \
    -r <region> \
    <tumor.bam> \
    <normal.bam> \
    -O <intermediate_muse_call.txt>
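The source does not say how each <region> is chosen. As one possible sketch, assuming one region per contig in the chr:start-end form, the regions can be generated from the reference's FASTA index (.fai), whose first two tab-separated columns are contig name and length. The function name regions_from_fai is hypothetical, and whole-contig regions are an assumption of this example rather than the pipeline's documented chunking.

```shell
# Hypothetical helper: print one "contig:1-length" region per line
# from a FASTA index (.fai) file.
regions_from_fai() {
    awk -F '\t' '{ printf "%s:1-%s\n", $1, $2 }' "$1"
}
```

Each printed line can then be supplied as the -r value for a separate MuSE call invocation.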
Step 2: MuSE sump
MuSE sump \
    -I <intermediate_muse_call.txt> \
    -E \
    -D <dbsnp_known_snp_sites.vcf> \
    -O <muse_variants.vcf>
Note: -E is used for WXS data and -G can be used for WGS data.
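The note above maps directly onto a small selector. This sketch prints the MuSE sump flag for a given data type; the function name muse_sump_flag and the WXS/WGS labels as input strings are assumptions of the example.

```shell
# Hypothetical helper: print the MuSE sump mode flag for a data type.
# printf is used instead of echo because some shells treat "-E" as an
# option to echo.
muse_sump_flag() {
    case "$1" in
        WXS) printf '%s\n' "-E" ;;
        WGS) printf '%s\n' "-G" ;;
        *)   echo "unknown data type: $1" >&2; return 1 ;;
    esac
}
```

An unrecognized data type returns a non-zero status, so a calling script can stop before running MuSE sump with the wrong mode.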
MUTECT2
GATK nightly-2016-02-25-gf39d340
java -jar GenomeAnalysisTK.jar \
    -T MuTect2 \
    -R <reference> \
    -L <region> \
    -I:tumor <tumor.bam> \
    -I:normal <normal.bam> \
    --normal_panel <pon.vcf> \
    --cosmic <cosmic.vcf> \
    --dbsnp <dbsnp.vcf> \
    --contamination_fraction_to_filter 0.02 \
    -o <mutect_variants.vcf> \
    --output_mode EMIT_VARIANTS_ONLY \
    --disable_auto_index_creation_and_locking_when_reading_rods
SOMATICSNIPER
Somatic-sniper v1.0.5.0
bam-somaticsniper \
    -q 0 \
    -Q 15 \
    -s 0.01 \
    -T 0.85 \
    -N 2 \
    -r 0.001 \
    -n NORMAL \
    -t TUMOR \
    -F vcf \
    -f ref.fa \
    <tumor.bam> \
    <normal.bam> \
    <somaticsniper_variants.vcf>
VARSCAN
Step 1: Mpileup; Samtools 1.1
samtools mpileup \
    -f <reference> \
    -q 1 \
    -B \
    <normal.bam> \
    <tumor.bam> > <intermediate_mpileup.pileup>
Step 2: Varscan Somatic; Varscan.v2.3.9
java -jar VarScan.jar somatic \
    <intermediate_mpileup.pileup> \
    <output_path> \
    --mpileup 1 \
    --min-coverage 8 \
    --min-coverage-normal 8 \
    --min-coverage-tumor 6 \
    --min-var-freq 0.10 \
    --min-freq-for-hom 0.75 \
    --normal-purity 1.0 \
    --tumor-purity 1.00 \
    --p-value 0.99 \
    --somatic-p-value 0.05 \
    --strand-filter 0 \
    --output-vcf
Step 3: Varscan ProcessSomatic; Varscan.v2.3.9
java -jar VarScan.jar processSomatic \
    <intermediate_varscan_somatic.vcf> \
    --min-tumor-freq 0.10 \
    --max-normal-freq 0.05 \
    --p-value 0.07