外显子测序数据拷贝数变异分析(带更正)
Step-1: Run varscan2 on normal-tumor-mpileup and make copy-number call.
java -jar $varscan copynumber normal-tumor-mpileup $outputFile --mpileup 1 --p-value 0.005 --min-coverage 30 --min-map-qual 20 --min-base-qual 20 --data-ratio Calculate_From_Data
Step-2: Use CBS binary segmentation in R to find segmented region. Use mergeSegment.pl (provided by varscan) to merge them.
library(DNAcopy)
P1 = read.table ("filename.copynumber", header=TRUE)
CNA.object <- CNA( genomdat = P1[,7], chrom = P1[,1], maploc = P1 [,2], data.type = 'logratio')
CNA.object.smoothed <- smooth.CNA(CNA.object)
CNA.object.smoothed.seg <- segment(CNA.object.smoothed, verbose=0, min.width=2)
seg.pvalue <- segments.p(CNA.object.smoothed.seg, ngrid=100, tol=1e-6, alpha=0.05, search.range=100, nperm=1000)
write.table (seg.pvalue, file="cbs_P1_pvalue", row.names=F, col.names=F, quote=F, sep="\t")
# This will not print the p-value.
#segmented = CNA.object.smoothed.seg$output
#write.table (segmented[,2:6], file="cbs_P1", row.names=F, col.names=F, quote=F, sep="\t")
- Step-2.1: Input the output file to mergeSegments.pl [downloaded from varscan2]. Note: mergeSegment.pl needs an additional column-2, where "chrom" needs to inserted.
cat cbs_P1_pvalue | awk 'BEGIN {OFS="\t"} { print $1,"chrom",$2,$3,$4,$5,$6,$7,$8,$9,$10 }' > formattedInputFile
perl mergeSegment.pl formattedInputFile --ref-arm-sizes $refArmSize --output-basename filteredOutput
Step-3: Now, let's try to find recurrent CNV among population, for which i use varscan and GISTIC.
- Step-3.1: Make marker position
cat filename.copynumber | grep -v -e chrom -e chrM -e chrX -e chrY | awk 'BEGIN {OFS="\t"} { print "ID'",$1,$2 }' | sed 's/chr//g' > markerPosition
- Step-3.2: Make segment file
$name=filteredOutput
cat $name | awk 'BEGIN {OFS="\t"} { print "ID",$2,$3,$4,$5,$6 }' | grep -v -e chrM -e chrX -e chrY | sed 's/chr//g' > segmentedFile
Step-4: Run GISTIC
markersfile=markerPosition
segfile=segmentedFile
gp_gistic2_from_seg -b $basedir -seg $segfile -mk $markersfile -refgene $refgenefile -genegistic 1 -smallmem 1 -broad 1 -brlen 0.5 -conf 0.90 -armpeel 1 -savegene 1 -gcm extreme