Merqury安装及使用
一、安装
conda create -n merqury -c bioconda -c conda-forge merqury
conda activate merqury
Rscript $MERQURY/plot/plot_spectra_cn.R --help
merqury.sh
./merqury.sh <read-db.meryl> [<mat.meryl> <pat.meryl>] <asm1.fasta> [asm2.fasta] <out>
Usage: merqury.sh <read-db.meryl> [<mat.meryl> <pat.meryl>] <asm1.fasta> [asm2.fasta] <out>
<read-db.meryl> : k-mer counts of the read set
<mat.meryl> : k-mer counts of the maternal haplotype (ex. mat.only.meryl or mat.hapmer.meryl)
<pat.meryl> : k-mer counts of the paternal haplotype (ex. pat.only.meryl or pat.hapmer.meryl)
<asm1.fasta> : Assembly fasta file (ex. pri.fasta, hap1.fasta or maternal.fasta)
[asm2.fasta] : Additional fasta file (ex. alt.fasta, hap2.fasta or paternal.fasta)
*asm1.meryl and asm2.meryl will be generated. Avoid using the same names as the hap-mer dbs
<out> : Output prefix
二、Example
run Merqury using the prebuilt meryl dbs on a. thaliana F1 hybrid
### Download assemblies ###
wget https://gembox.cbcb.umd.edu/triobinning/athal_COL.fasta
wget https://gembox.cbcb.umd.edu/triobinning/athal_CVI.fasta
### Download prebuilt meryl dbs ###
# read.meryl of the F1 hybrid between COL-0 and CVI-0
wget https://obj.umiacs.umd.edu/marbl_publications/merqury/athal/a_thal.k18.meryl.tar.gz
# hap-mers for COL-0 haplotype
wget https://obj.umiacs.umd.edu/marbl_publications/merqury/athal/a_thal.col0.hapmer.meryl.tar.gz
# hap-mers for CVI-0 haplotype
wget https://obj.umiacs.umd.edu/marbl_publications/merqury/athal/a_thal.cvi0.hapmer.meryl.tar.gz
# Untar
for gz in *.tar.gz
do
tar -zxf $gz
done
###Run merqury
merqury.sh F1.k18.meryl col0.hapmer.meryl cvi0.hapmer.meryl athal_COL.fasta athal_CVI.fasta test
三、QV分析流程
STEP00 Prepare meryl dbs
1、得到合适的k值(当不确定使用多大的kmer时,使用best_k.sh with genome_size in num.of basea。以人类基因组为例,best kmer = 21)
sh $MERQURY/best_k.sh <genome_size> [tolerable_collision_rate=0.001]
2、批量化构建k-mer dbs(以illumina WGS reads为例)
for each read$i.fastq.gz
do
# 1. Build meryl dbs
meryl k=$k count output read$i.meryl read$i.fastq.gz
done
# 2. Merge
meryl union-sum output $genome.meryl read*.meryl
STEP01 Run merqury
# 此处应用适用于没有父母本hap-mers的情况(更多使用案例见官网)
merqury.sh read-db.meryl asm1.fasta asm2.fasta out_prefix