Speed comparison of fastq-dump and fasterq-dump for converting SRA files to FASTQ format
001. Direct comparison with default parameters
[b20223040323@admin1 test02]$ ls
SRR3156163.sra  SRR3156164.sra
[b20223040323@admin1 test02]$ md5sum *        ## the two .sra files are identical
9e819f5e4499b54fd657163d82d07da9  SRR3156163.sra
9e819f5e4499b54fd657163d82d07da9  SRR3156164.sra
[b20223040323@admin1 test02]$ time fastq-dump --split-3 SRR3156163.sra        ## run fastq-dump and record the time
Read 51332776 spots for SRR3156163.sra
Written 51332776 spots for SRR3156163.sra

real    4m52.519s
user    4m35.183s
sys     0m19.418s
[b20223040323@admin1 test02]$ time fasterq-dump --split-3 SRR3156164.sra      ## run fasterq-dump and record the time
spots read      : 51,332,776
reads read      : 102,665,552
reads written   : 102,665,552

real    1m53.699s       ## fasterq-dump finishes about 2.6x sooner in wall-clock time
user    6m20.624s
sys     1m4.905s
[b20223040323@admin1 test02]$ ls
SRR3156163_1.fastq  SRR3156163_2.fastq  SRR3156163.sra  SRR3156164_1.fastq  SRR3156164_2.fastq  SRR3156164.sra
[b20223040323@admin1 test02]$ ll -h
total 68G
-rw-rw-r-- 1 b20223040323 b20223040323  14G Oct  6 16:49 SRR3156163_1.fastq
-rw-rw-r-- 1 b20223040323 b20223040323  14G Oct  6 16:49 SRR3156163_2.fastq
-rw-rw-r-- 1 b20223040323 b20223040323 6.6G Oct  6 16:43 SRR3156163.sra
-rw-rw-r-- 1 b20223040323 b20223040323  14G Oct  6 16:51 SRR3156164_1.fastq
-rw-rw-r-- 1 b20223040323 b20223040323  14G Oct  6 16:51 SRR3156164_2.fastq
-rw-rw-r-- 1 b20223040323 b20223040323 6.6G Oct  6 16:43 SRR3156164.sra
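The wall-clock gap between the two runs can be reduced to a single ratio. The snippet below is a minimal sketch and was not part of the original test: the helper names `to_seconds` and `speedup` are made up for illustration, and it assumes the `XmY.YYYs` format that bash's `time` prints and that `awk` is available.

```shell
# Convert a bash `time` value such as 4m52.519s into seconds.
# (to_seconds and speedup are illustrative helper names, not part of sra-tools.)
to_seconds() {
    local t=${1%s}          # drop the trailing "s"
    local min=${t%%m*}      # text before the "m"
    local sec=${t#*m}       # text after the "m"
    awk -v m="$min" -v s="$sec" 'BEGIN { printf "%.3f\n", m * 60 + s }'
}

# Ratio of two wall-clock times: how many times faster the second run was.
speedup() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", a / b }'
}

# fastq-dump vs. fasterq-dump, "real" times taken from the runs above
speedup 4m52.519s 1m53.699s   # prints 2.57
```

Only the `real` values are compared here. The `user` time of fasterq-dump is actually higher than that of fastq-dump, because the work is spread over several threads.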
002. fasterq-dump multithreading speed test
[b20223040323@admin1 test02]$ ls
SRR3156163.sra  SRR3156164.sra
[b20223040323@admin1 test02]$ md5sum *
9e819f5e4499b54fd657163d82d07da9  SRR3156163.sra
9e819f5e4499b54fd657163d82d07da9  SRR3156164.sra
[b20223040323@admin1 test02]$ time fasterq-dump -e 8 --split-3 SRR3156164.sra        ## 8 threads
spots read      : 51,332,776
reads read      : 102,665,552
reads written   : 102,665,552

real    0m55.326s
user    4m7.720s
sys     0m56.322s
[b20223040323@admin1 test02]$ time fasterq-dump -e 30 --split-3 SRR3156163.sra       ## 30 threads
spots read      : 51,332,776
reads read      : 102,665,552
reads written   : 102,665,552

real    0m33.775s       ## about 39% less wall-clock time than with 8 threads (roughly 1.6x faster)
user    5m1.410s
sys     1m5.557s
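To try more thread counts than the two tested above, the runs can be wrapped in a loop. This is a rough sketch under several assumptions: `fasterq-dump` is on the `PATH`, the function name `bench_threads`, the `DUMP` override variable, and the `out_eN` directory names are all hypothetical, and timing uses `date +%s`, so results are rounded to whole seconds. Only the `-e`, `--split-3` and `-O` flags already used in this post appear.

```shell
# bench_threads FILE N1 [N2 ...]
# Run fasterq-dump on FILE once per thread count and print the elapsed seconds.
# DUMP is a hypothetical override for the command name (defaults to fasterq-dump).
bench_threads() {
    local sra=$1 n start end
    shift
    for n in "$@"; do
        start=$(date +%s)
        # Each run writes to its own directory so earlier output is never in the way.
        "${DUMP:-fasterq-dump}" -e "$n" --split-3 "$sra" -O "out_e${n}" \
            >/dev/null 2>&1 || return 1
        end=$(date +%s)
        printf '%s threads\t%s s\n' "$n" "$((end - start))"
    done
}

# Example (not run here): bench_threads SRR3156163.sra 8 16 30
```

Repeated runs on the same file may be helped by the operating system's file cache, so the first run can look slower than it otherwise would. Each run also writes a full set of FASTQ files, about 28G per run for this dataset.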
003. Other parameters
[b20223040323@admin1 test02]$ ls
SRR3156163.sra  SRR3156164.sra
[b20223040323@admin1 test02]$ time fasterq-dump -e 48 -p --split-3 SRR3156163.sra -O result      ## -p shows progress, -O sets the output directory, -e sets the number of threads
join   :|-------------------------------------------------- 100%
concat :|-------------------------------------------------- 100%
spots read      : 51,332,776
reads read      : 102,665,552
reads written   : 102,665,552

real    0m31.166s
user    4m37.099s
sys     1m7.211s
[b20223040323@admin1 test02]$ time fasterq-dump -e 48 -p --split-3 SRR3156164.sra -O result
join   :|-------------------------------------------------- 100%
concat :|-------------------------------------------------- 100%
spots read      : 51,332,776
reads read      : 102,665,552
reads written   : 102,665,552

real    0m34.429s
user    5m11.478s
sys     1m18.067s
[b20223040323@admin1 test02]$ ls
result  SRR3156163.sra  SRR3156164.sra
[b20223040323@admin1 test02]$ tree -h        ## show the layout of the results
.
├── [4.0K]  result
│   ├── [ 14G]  SRR3156163_1.fastq
│   ├── [ 14G]  SRR3156163_2.fastq
│   ├── [ 14G]  SRR3156164_1.fastq
│   └── [ 14G]  SRR3156164_2.fastq
├── [6.6G]  SRR3156163.sra
└── [6.6G]  SRR3156164.sra

1 directory, 6 files
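In every run above, `reads written` equals `reads read`. When converting many files, that check can be scripted. The function below is a small sketch, not something from the original post: `check_summary` is a made-up name, and it assumes the summary lines have the `label : number` shape shown in the sessions above.

```shell
# check_summary: read a fasterq-dump summary on stdin and compare the
# "reads read" and "reads written" counts after removing spaces and commas.
# Prints "OK <count>" and returns 0 on a match, otherwise returns 1.
check_summary() {
    awk -F':' '
        /^reads read/    { gsub(/[ ,]/, "", $2); read_n = $2 }
        /^reads written/ { gsub(/[ ,]/, "", $2); written_n = $2 }
        END {
            if (read_n != "" && read_n == written_n) {
                print "OK " written_n
                exit 0
            }
            print "MISMATCH read=" read_n " written=" written_n
            exit 1
        }'
}

# Check the summary from the runs above, pasted in as text:
printf 'spots read      : 51,332,776\nreads read      : 102,665,552\nreads written   : 102,665,552\n' | check_summary
```

With a real run, the summary would need to be piped in, for example `fasterq-dump -e 48 --split-3 SRR3156163.sra -O result 2>&1 | check_summary`. The `2>&1` is there on the assumption that the summary may be printed to standard error; drop it if your version prints to standard output.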
References:
01. https://www.omicsclass.com/article/1917
02. https://www.jianshu.com/p/e9f6e16e2c8a