ENCODE Blacklist: the genome's "blacklist"
http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/hg38-human/hg38.blacklist.bed.gz
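Blacklist: as the name suggests, these are problematic regions! How exactly are they defined, and what information do they contain? Let's go through it in detail:
The paper The ENCODE Blacklist: Identification of Problematic Regions of the Genome (https://www.nature.com/articles/s41598-019-45839-z) defines the blacklist regions of the genome: regions that are anomalous, or that show high signal in next-generation sequencing experiments regardless of which experiment is run. Excluding these regions provides a quality safeguard for downstream analysis of functional genomics data.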
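The paper provides a comparison between a blacklist region and a normal region:
In the blacklist region the signal is extremely high, reaching roughly 6400× the background.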
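To make "excluding these regions" concrete, here is a minimal sketch of dropping peaks that overlap any blacklist interval. The function names and the coordinates are made up for illustration, and the intervals are assumed to be BED-style (0-based, half-open); this is not the method used in the paper.

```python
from collections import defaultdict


def overlaps(a_start, a_end, b_start, b_end):
    """Half-open intervals overlap when each one starts before the other ends."""
    return a_start < b_end and b_start < a_end


def remove_blacklisted(peaks, blacklist):
    """Return the peaks that do not overlap any blacklist interval.

    Both arguments are lists of (chrom, start, end) tuples.
    """
    # Group blacklist intervals by chromosome so each peak is only
    # compared against intervals on its own chromosome.
    by_chrom = defaultdict(list)
    for chrom, start, end in blacklist:
        by_chrom[chrom].append((start, end))

    kept = []
    for chrom, start, end in peaks:
        hit = any(overlaps(start, end, b_start, b_end)
                  for b_start, b_end in by_chrom[chrom])
        if not hit:
            kept.append((chrom, start, end))
    return kept


# Hypothetical coordinates, for illustration only.
blacklist = [("chr1", 1000, 2000), ("chr2", 500, 800)]
peaks = [
    ("chr1", 100, 300),    # far from any blacklist interval
    ("chr1", 1900, 2100),  # overlaps chr1:1000-2000
    ("chr2", 800, 900),    # only touches the end of chr2:500-800
    ("chr3", 10, 20),      # chromosome has no blacklist entries
]
kept = remove_blacklisted(peaks, blacklist)
print(kept)
```

The linear scan per chromosome is fine for a few hundred blacklist intervals. For real peak files, the same filtering is commonly done on the command line with `bedtools intersect -v -a peaks.bed -b blacklist.bed`.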
Blacklist regions are currently available for ce10, ce11, dm3, dm6, hg19, hg38, and mm10, and can be downloaded from the following sites: https://github.com/Boyle-Lab/Blacklist/; https://www.encodeproject.org/annotations/ENCSR636HFF/
- HUMAN (hg38/GRCh38): http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/hg38-human/hg38.blacklist.bed.gz ENCODE portal link: https://www.encodeproject.org/annotations/ENCSR636HFF/ (Select GRCh38)
- HUMAN (hg19/GRCh37): ENCODE portal link: https://www.encodeproject.org/annotations/ENCSR636HFF/ (Select hg19) UCSC Genome browser track: http://genome.ucsc.edu/cgi-bin/hgFileUi?db=hg19&g=wgEncodeMapability README on how this track was generated: http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/hg19-human/hg19-blacklist-README.pdf
- MOUSE (mm10): http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/mm10-mouse/mm10.blacklist.bed.gz ENCODE portal link: https://www.encodeproject.org/annotations/ENCSR636HFF/ (Select mm10)
- MOUSE (mm9): http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/mm9-mouse/mm9-blacklist.bed.gz
- WORM (ce10): http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/ce10-C.elegans/ce10-blacklist.bed.gz
- FLY (dm3): http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/dm3-D.melanogaster/dm3-blacklist.bed.gz
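Once one of the `.bed.gz` files above has been downloaded, it can be read with nothing but the Python standard library. The sketch below assumes a tab-separated BED layout (chrom, start, end, plus an optional fourth column with a label) and the helper names `parse_blacklist` and `load_blacklist` are hypothetical. The demo data is made up and gzip-compressed in memory, so nothing is fetched from the network.

```python
import gzip
import io


def parse_blacklist(handle):
    """Parse BED lines from a text handle into (chrom, start, end, label) tuples."""
    regions = []
    for line in handle:
        line = line.strip()
        # Skip blank lines and the header/comment lines BED files may carry.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        # Assumption: a fourth column, if present, holds a label for the region.
        label = fields[3] if len(fields) > 3 else ""
        regions.append((chrom, start, end, label))
    return regions


def load_blacklist(path):
    """Open a gzip-compressed BED file from disk and parse it."""
    with gzip.open(path, "rt") as handle:
        return parse_blacklist(handle)


# Made-up demo content, compressed in memory to mimic a .bed.gz file.
demo_text = "chr1\t100\t200\tHigh Signal Region\nchr2\t50\t75\n"
demo_gz = gzip.compress(demo_text.encode())
with gzip.open(io.BytesIO(demo_gz), "rt") as handle:
    demo_regions = parse_blacklist(handle)
print(demo_regions)
```

With a real download, `load_blacklist("hg38.blacklist.bed.gz")` would be the entry point, assuming the file has been saved under that name in the working directory.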