Downsampling Bam file | 平衡测序深度

 

目前对peak的数据处理上,发现测序深度对peak的数量有很大影响,即使做了normalization也没办法,所以这里希望从原始的bam文件开始做downsampling。

参考一:Downsample BAM file to specific amount of reads

input_dir=/home/da528/ATAC_Analysis/cut_run/zhixin_analysis/processed_bam/Exp1/
output_dir=/home/da528/ATAC_Analysis/cut_run/zhixin_analysis/downsampling_bam/Exp1

cd $input_dir

for bam in `ls *.bam`
do
echo $bam
reads=5000000
fraction=$(samtools idxstats $bam | cut -f3 | awk -v ct=$reads 'BEGIN {total=0} {total += $1} END {print ct/total}')
samtools view -b -s ${fraction} $bam > $output_dir/$bam
done

  

以上脚本的不完善的地方就是当测序read低于5000000,它还是执行downsampling,这不是我们想看到的。

 

参考二:Easy bam downsampling

input_dir=/home/da528/ATAC_Analysis/cut_run/zhixin_analysis/processed_bam/Exp1/
output_dir=/home/da528/ATAC_Analysis/cut_run/zhixin_analysis/downsampling_bam2/Exp1

cd $input_dir

for bam in `ls *.bam`
do
echo $bam
reads=5000000
# fraction=$(samtools idxstats $bam | cut -f3 | awk -v ct=$reads 'BEGIN {total=0} {total += $1} END {print ct/total}')
frac=$( samtools idxstats $bam | cut -f3 | awk -v ct=$reads 'BEGIN {total=0} {total += $1} END {frac=ct/total; if (frac > 1) {print 0.99} else {print frac}}' )
echo $frac
samtools view -b -s ${frac} $bam > $output_dir/$bam
done

  

samtools view: Incorrect sampling argument "1"

  

posted @ 2022-12-04 01:41  Life·Intelligence  阅读(257)  评论(0编辑  收藏  举报
TOP