Chinese Population Cohort Studies
Summary of Chinese population cohort studies
NyuWa Genome (2020)
- Cohort size: 2,999 individuals
- Data type: site-level annotation data (hg38; the cohort includes diabetes patients). The data are not open for download, but can be queried online by region, site, or gene.
- Resource: NyuWa Genome Resource (ibp.ac.cn)
- Article: NyuWa Genome resource: A deep whole-genome sequencing-based variation profile and reference panel for the Chinese population - ScienceDirect
- Ref: [Progress in the Institute of Biophysics "NyuWa" genome resource research, Chinese Academy of Sciences (cas.cn)](https://www.cas.cn/syky/202111/t20211117_4814405.shtml). According to this release, the work was published online in Cell Reports; the team analyzed deep whole-genome sequencing data (26.2X) from 2,999 Chinese individuals, and the resource (http://bigdata.ibp.ac.cn/NyuWa/) provides a genetic variant map for the Chinese population and a reference-panel genotype imputation service.
- Sample entries:
Variant ID | dbSNP | Region | Gene ID | Exonic function | Consequence | Allele Count | Allele Number | Allele Frequency |
---|---|---|---|---|---|---|---|---|
22-42125620-GGGGTGGGGAA-G | - | downstream | CYP2D6;CYP2D7;LOC101929829;NDUFA6-DT | - | - | 14 | 5926 | 2.3625e-3 |
22-42125620-G-GGGGTGGGGAA | - | downstream | CYP2D6;CYP2D7;LOC101929829;NDUFA6-DT | - | - | 62 | 5926 | 0.0105 |
22-42125620-GGGGTGGGGAAGGGTGGGGAA-G | rs751918998 | downstream | CYP2D6;CYP2D7;LOC101929829;NDUFA6-DT | - | - | 3 | 5926 | 5.0624e-4 |
22-42125624-T-TGGGGAAGGGA | - | downstream | CYP2D6;CYP2D7;LOC101929829;NDUFA6-DT | - | - | 5 | 5322 | 9.3950e-4 |
22-42125744-G-A | rs143368153 | downstream | CYP2D6;CYP2D7;LOC101929829;NDUFA6-DT | - | - | 20 | 5966 | 3.3523e-3 |
22-42125895-C-G | - | downstream | CYP2D6;CYP2D7;LOC101929829;NDUFA6-DT | - | - | 2 | 5970 | 3.3501e-4 |
22-42125899-A-G | - | downstream | CYP2D6;CYP2D7;LOC101929829;NDUFA6-DT | - | - | 1 | 5970 | 1.6750e-4 |
22-42125913-T-C | rs532182046 | downstream | CYP2D6;CYP2D7;LOC101929829;NDUFA6-DT | - | - | 7 | 5970 | 1.1725e-3 |
22-42125914-G-A | rs550576546 | downstream | CYP2D6;CYP2D7;LOC101929829;NDUFA6-DT | - | - | 7 | 5970 | 1.1725e-3 |
22-42125915-T-G | rs562602203 | downstream | CYP2D6;CYP2D7;LOC101929829;NDUFA6-DT | - | - | 7 | 5970 | 1.1725e-3 |
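The columns above are related in a simple way: the Variant ID encodes chromosome, position, reference allele, and alternate allele, and Allele Frequency is Allele Count divided by Allele Number. The following is a minimal sketch of that relationship, assuming rows have been copied out of the online query by hand. The helper names (`parse_variant_id`, `allele_frequency`, `variant_type`) are hypothetical and are not part of any NyuWa tool.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant_id(variant_id: str) -> Variant:
    """Split an ID such as '22-42125744-G-A' into its four fields."""
    chrom, pos, ref, alt = variant_id.split("-")
    return Variant(chrom, int(pos), ref, alt)


def allele_frequency(allele_count: int, allele_number: int) -> float:
    """Allele Frequency = Allele Count / Allele Number."""
    if allele_number <= 0:
        raise ValueError("allele_number must be positive")
    return allele_count / allele_number


def variant_type(variant: Variant) -> str:
    """Classify a variant by comparing REF and ALT lengths."""
    if len(variant.ref) == len(variant.alt):
        return "SNV" if len(variant.ref) == 1 else "MNV"
    return "deletion" if len(variant.ref) > len(variant.alt) else "insertion"


# Rows taken from the table above: (Variant ID, Allele Count, Allele Number).
rows = [
    ("22-42125620-GGGGTGGGGAA-G", 14, 5926),
    ("22-42125620-G-GGGGTGGGGAA", 62, 5926),
    ("22-42125744-G-A", 20, 5966),
]

for variant_id, ac, an in rows:
    variant = parse_variant_id(variant_id)
    print(variant_id, variant_type(variant), f"{allele_frequency(ac, an):.4e}")
```

Recomputing the frequency this way is a quick sanity check when copying values between resources, since the Allele Number differs from site to site.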
Westlake University (Han Chinese) population research cohort, WBBC (2017-2019)
- Cohort size: 14,726
- Versions:
  - v20210403 (hg19 and hg38; 4,489 individuals)
  - v20211129 (hg19 and hg38; 4,480 individuals; includes Parkinson's disease patients)
- Population distribution: North, Central, South, Lingnan
- Content: VCF format, with allele frequencies and related information for each regional population
- The VCF is annotated with rsIDs from dbSNP151, and the following INFO fields:
  - AC: Allele count in called genotypes in WBBC
  - AF: Allele frequency in called genotypes in WBBC
  - AN: Total number of alleles in called genotypes in WBBC
  - NS: Total number of samples in called genotypes in WBBC
  - North_AF: Allele frequency in North Han Chinese
  - North_AN: Total number of alleles in North Han Chinese
  - Central_AF: Allele frequency in Central Han Chinese
  - Central_AN: Total number of alleles in Central Han Chinese
  - South_AF: Allele frequency in South Han Chinese
  - South_AN: Total number of alleles in South Han Chinese
  - Lingnan_AF: Allele frequency in Lingnan Han Chinese
  - Lingnan_AN: Total number of alleles in Lingnan Han Chinese
  - RR: The number of homozygote of reference allele
  - RA: The number of heterozygote of reference/alternative alleles
  - AA: The number of homozygote of alternative allele (non-ref)
  - DP: Raw read depth
  - VQSLOD: Variant Recalibration Score from GATK
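The INFO fields listed above can be read from the eighth column of a VCF record with plain string handling. The following is a minimal sketch, assuming biallelic sites on autosomes; the example INFO string is invented for illustration and is not taken from the WBBC release, and the function names are hypothetical.

```python
from typing import Dict, Union

REGIONS = ("North", "Central", "South", "Lingnan")


def parse_info(info: str) -> Dict[str, Union[str, bool]]:
    """Turn 'AC=5;AF=0.0005;DB' into {'AC': '5', 'AF': '0.0005', 'DB': True}."""
    fields: Dict[str, Union[str, bool]] = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        # Keys without '=' are flags in the VCF format.
        fields[key] = value if sep else True
    return fields


def regional_frequencies(info: str) -> Dict[str, float]:
    """Collect the per-region *_AF values that are present in the record."""
    fields = parse_info(info)
    return {
        region: float(fields[f"{region}_AF"])
        for region in REGIONS
        if f"{region}_AF" in fields
    }


def alt_count_from_genotypes(info: str) -> int:
    """Alternate allele count implied by genotype counts: RA + 2 * AA."""
    fields = parse_info(info)
    return int(fields["RA"]) + 2 * int(fields["AA"])


# Invented example values, chosen only to be internally consistent.
example_info = (
    "AC=5;AF=0.0005;AN=10000;NS=5000;"
    "North_AF=0.0008;North_AN=2500;Central_AF=0.0004;Central_AN=2500;"
    "South_AF=0.0004;South_AN=2500;Lingnan_AF=0.0004;Lingnan_AN=2500;"
    "RR=4996;RA=3;AA=1"
)

print(regional_frequencies(example_info))
print(alt_count_from_genotypes(example_info) == int(parse_info(example_info)["AC"]))
```

For multi-allelic records the AC and AF values would be comma-separated, so this sketch would need to split them before converting to numbers.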
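ChinaMAP (China Metabolic Analytics Project), metabolic research (2020)
- Cohort size: 10,588 individuals. The data are not open for download; after registering and logging in, they can be queried online by region, site, or gene.
- Data: mBiobank
- Article: The ChinaMAP analytics of deep whole genome sequences in 10,588 individuals | Cell Research (nature.com); includes some pharmacogenomics analysis
- Population distribution: seven subpopulations
- Usage restrictions: see the usage restrictions page (mbiobank.com)
- Sample entries: after registering, a search for the CYP2D6 gene returns results such as the following (partial):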
Chr | Position | dbSNP | Ref | Alt | Qual | Alt freq | Count | Gene | Transcript | 1KGP_AF | 1KGP_EAS_AF | 1KGP_AMR_AF | 1KGP_AFR_AF | 1KGP_EUR_AF | 1KGP_SAS_AF |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
chr22 | 42125857 | - | GTCC | G | 697.41 | 0.000301568 | 3/9948 | CYP2D6 | NM_000106.5:c.714_716delGGA | - | - | - | - | - | - |
chr22 | 42125870 | - | G | T | 474.84 | 0.000200924 | 2/9954 | CYP2D6 | NM_000106.5:c.*704C>A | - | - | - | - | - | - |
chr22 | 42125895 | - | C | G | 8928.49 | 0.00250451 | 25/9982 | CYP2D6 | NM_000106.5:c.*679G>C | - | - | - | - | - | - |
chr22 | 42125913 | rs532182046 | T | C | 1220.33 | 0.000601443 | 6/9976 | CYP2D6 | NM_000106.5:c.*661A>G | 0.00139776 | 0.003 | 0.0029 | 0.0008 | 0.0 | 0.001 |
chr22 | 42125914 | rs550576546 | G | A | 1220.33 | 0.000601443 | 6/9976 | CYP2D6 | NM_000106.5:c.*660C>T | 0.00119808 | 0.003 | 0.0029 | 0.0 | 0.0 | 0.001 |
chr22 | 42125915 | rs562602203 | T | G | 1220.29 | 0.000601443 | 6/9976 | CYP2D6 | NM_000106.5:c.*659A>C | 0.00119808 | 0.003 | 0.0029 | 0.0 | 0.0 | 0.001 |
chr22 | 42125946 | rs111265642 | A | G | 3886.75 | 0.0027103 | 27/9962 | CYP2D6 | NM_000106.5:c.*628T>C | - | - | - | - | - | - |
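Several CYP2D6 sites in the ChinaMAP results also appear in the NyuWa sample entries, so the two resources can be compared site by site once both are reduced to a common key. The following is a minimal sketch, assuming both result sets use the same reference build and that exact matching is only attempted for SNVs (indels may be represented differently between resources). The counts are copied from the two tables in this document; the function names are hypothetical.

```python
from typing import Dict, Tuple

Key = Tuple[str, int, str, str]


def normalize_key(chrom: str, pos: int, ref: str, alt: str) -> Key:
    """Build a (chrom, pos, ref, alt) key, dropping any 'chr' prefix."""
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]
    return (chrom, int(pos), ref.upper(), alt.upper())


def key_from_nyuwa_id(variant_id: str) -> Key:
    """Convert a NyuWa-style ID such as '22-42125913-T-C' to a key."""
    chrom, pos, ref, alt = variant_id.split("-")
    return normalize_key(chrom, int(pos), ref, alt)


def parse_count(count: str) -> Tuple[int, int]:
    """Split a ChinaMAP-style Count such as '6/9976' into (AC, AN)."""
    allele_count, allele_number = count.split("/")
    return int(allele_count), int(allele_number)


# NyuWa rows: Variant ID -> (Allele Count, Allele Number).
nyuwa: Dict[Key, Tuple[int, int]] = {
    key_from_nyuwa_id("22-42125895-C-G"): (2, 5970),
    key_from_nyuwa_id("22-42125913-T-C"): (7, 5970),
}

# ChinaMAP rows: (Chr, Position, Ref, Alt) -> Count.
chinamap: Dict[Key, Tuple[int, int]] = {
    normalize_key("chr22", 42125895, "C", "G"): parse_count("25/9982"),
    normalize_key("chr22", 42125913, "T", "C"): parse_count("6/9976"),
}

shared = sorted(set(nyuwa) & set(chinamap))
for key in shared:
    nyuwa_ac, nyuwa_an = nyuwa[key]
    cmap_ac, cmap_an = chinamap[key]
    print(key, f"NyuWa={nyuwa_ac / nyuwa_an:.3e}", f"ChinaMAP={cmap_ac / cmap_an:.3e}")
```

Keeping the raw counts rather than only the frequencies makes it possible to judge how much of a difference between cohorts could be due to small allele counts.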
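HuaBiao project (2021)
- Cohort size: 5,000 individuals (whole-exome sequencing; site-level data can be queried online directly)
- HuaBiao database: "HUABIAO" whole-exome public database (biosino.org)
- Population distribution: North China (Zhengzhou), East China (Taizhou), South China (Nanning)
- Article: The HuaBiao project: whole-exome sequencing of 5000 Han Chinese individuals - ScienceDirect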
China Kadoorie Biobank (prospective study of chronic diseases in China)
- Size: more than 512,000 adult participants (2004, 2008, 2013, 2020)
- Data:
- Welcome — China Kadoorie Biobank (CKB) (ckbiobank.org)
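- Access requires registration and an application (the process is cumbersome and approval is not guaranteed); the data are not very open.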
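YanHuang Project (2008)
- Article: The diploid genome sequence of an Asian individual | Nature (nature.com)
- Cohort size: 100 individuals
- Sites are included in dbSNP build 130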