Literature reading | Population genomics and haplotype analysis in spelt and bread wheat identifies a gene regulating glume color
Abrouk, M., Athiyannan, N., Müller, T. et al. Population genomics and haplotype analysis in spelt and bread wheat identifies a gene regulating glume color. Commun Biol 4, 375 (2021).
Introduction
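Glume color is an important phenotypic trait for identifying wheat varieties. Red glumes are very common in Central European spelt, the main wheat subspecies in Europe before the 20th century. The authors used genotyping-by-sequencing (GBS) to genotype a global spelt panel of 267 accessions.
The most widely grown wheats today are tetraploid durum wheat (also called pasta wheat, Triticum turgidum ssp. durum) and hexaploid bread wheat (T. aestivum ssp. aestivum). The recently published 10+ Wheat Genomes Project addressed an earlier limitation: only two reference genomes had been available, the 10.5 Gb durum wheat Svevo and the 16 Gb bread wheat cultivar Chinese Spring.
Spelt (T. aestivum ssp. spelta) is a subspecies of common wheat, grown mainly in Central Europe and northern Spain as a high-value niche product. Spelt was a major crop in Europe from the Bronze Age until the early 20th century. In parts of Central Europe, especially southern Germany and Switzerland, spelt replaced emmer as the dominant wheat as early as the early Iron Age (around 750 BC). In 1930, spelt still accounted for ~40% of the wheat-growing area in Central Europe.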
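Although it is a domesticated wheat, spelt retains some features that resemble non-domesticated grass species, including a brittle rachis and glumes that tightly enclose the grain. These traits help wild plants disperse and protect their seeds, but they are a disadvantage for mechanical harvesting and processing. This is one of the main reasons why free-threshing bread wheat replaced spelt in the 20th century. The wheat subspecies aestivum and spelta cross freely, so breeders continue to transfer agronomically important genes from spelt into the bread wheat gene pool.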
Fig. 1: Spike morphology and spikelet disarticulation in spelt.
a Representative spikes of spelt accessions collected in Central Europe, Asia, and the Iberian Peninsula. Scale bar = 5 cm. b Spikelet disarticulation of spelt. Spikelet disarticulation in plants with brittle rachis can be classified as barrel-type (upper rachis segment pressed against the lower spikelet) and wedge type (rachis segment pointing down). Central European spelt shows a barrel-type disarticulation, whereas disarticulation in Asian and Iberian spelt is of the wedge type. Scale bar = 2 cm. Arrows point to the rachis segment after spikelet disarticulation.
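Hexaploid wheat arose after two independent hybridization events. The first occurred several hundred thousand years ago; the second is thought to have taken place about 10,000 years ago in fields of cultivated wheat [1]. Although spelt and bread wheat are closely related, the population structure and agricultural history of spelt remain unclear and have not been studied in depth at the whole-genome level. Earlier studies proposed that bread wheat and spelt share a single, common origin in the Fertile Crescent. During the migration towards Europe, a free-threshing hexaploid wheat may have hybridized with hulled tetraploid emmer wheat, giving rise to spelt [2][3][4]. Under this model of tetraploid introgression, bread wheat and European spelt would differ genetically on the A and B subgenomes, while their D subgenomes would remain highly similar [5].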
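A conspicuous feature of many European spelt accessions is the red glume. Genetic studies have shown that glume color in hexaploid bread wheat is controlled by three loci on the homoeologous group 1 chromosomes: Rg-A1 (1AS), Rg-B1 (1BS), and Rg-D1 (1DS). The three dominant alleles Rg-A1b, Rg-B1b, and Rg-D1b confer red glumes, and Rg-B1b (also referred to as Rg1) is the most common in hexaploid wheat. In addition, the black glume allele Rg-A1c and the smoky-gray allele Rg-D1c have been identified. Homoeologous alleles conferring dark glumes have also been described in various diploid wild wheat relatives of the genus Aegilops.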
Spelt population structure and history
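Most of the European spelt accessions in this study were collected in the 1930s. The authors used GBS to genotype 267 spelt and 75 bread wheat accessions and retained 55,638 biallelic SNPs after filtering at a missing rate of ≤20%. Most variants (60%) were rare, with MAF < 5%. SNPs were distributed across the whole genome, with higher density in gene-rich and telomeric regions. Of the 55,638 variants, 19,379, 22,803 and 12,023 were located on the A, B and D subgenomes, respectively, and 1,433 could not be assigned to a position. Variants within or near genes made up 37% of the total: 7,465 were gene proximal (±2 kb of coding sequence (CDS)), 8,677 were in exons, and 4,293 were in introns.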
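As a quick consistency check, the counts quoted above can be added up in a few lines of Python. The variable names are only illustrative; the numbers are the ones given in the text. The anchored total also matches the 54,205 anchored SNPs mentioned in the methods excerpt for the phylogeny.

```python
# SNP counts quoted in the text above.
subgenome_counts = {"A": 19379, "B": 22803, "D": 12023}
unanchored = 1433
genic_counts = {"gene_proximal": 7465, "exon": 8677, "intron": 4293}

anchored = sum(subgenome_counts.values())       # SNPs placed on A, B or D
total = anchored + unanchored                   # all filtered biallelic SNPs
genic = sum(genic_counts.values())              # SNPs in or near genes
genic_percent = round(100 * genic / total, 1)   # share of genic SNPs in percent

print(anchored, total, genic, genic_percent)  # 54205 55638 20435 36.7
```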
CallSNP: Reads were mapped against the Chinese Spring IWGSC RefSeq v1.0 with bwa v0.7.15. SNPs were called using TASSEL v5.2.31 with a read length of 64 bp. SNPs were filtered using vcftools v0.1.14 using the following criteria: (1) biallelic SNP, (2) SNPs with >20% missing data were discarded, (3) accessions with >20% missing information were discarded.
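The three filtering criteria can be sketched in plain Python on a toy genotype table. This is only an illustration, not the authors' pipeline, which used vcftools. The data layout, the function names, and the order in which the two missing-data filters are applied are all assumptions.

```python
# Toy sketch of the three filters; not the vcftools commands used in the paper.
# Genotypes are alternate-allele counts per accession; None marks a missing call.

def filter_snps(snps, max_missing=0.20):
    """Keep biallelic SNPs whose missing-call fraction is <= max_missing.

    snps: list of dicts with keys 'id', 'alt' (list of alternate alleles)
          and 'calls' (one genotype per accession, None if missing).
    """
    kept = []
    for snp in snps:
        if len(snp["alt"]) != 1:          # criterion 1: biallelic SNPs only
            continue
        missing = sum(call is None for call in snp["calls"]) / len(snp["calls"])
        if missing > max_missing:         # criterion 2: drop SNPs with >20% missing
            continue
        kept.append(snp)
    return kept


def filter_accessions(snps, names, max_missing=0.20):
    """Criterion 3: keep accessions whose missing fraction over snps is <= max_missing."""
    kept = []
    for i, name in enumerate(names):
        missing = sum(snp["calls"][i] is None for snp in snps) / len(snps)
        if missing <= max_missing:
            kept.append(name)
    return kept


# Hypothetical toy data, five accessions and four SNPs.
names = ["acc1", "acc2", "acc3", "acc4", "acc5"]
snps = [
    {"id": "snp1", "alt": ["T"], "calls": [0, 1, 2, 0, None]},        # 20% missing
    {"id": "snp2", "alt": ["T", "G"], "calls": [0, 1, 2, 0, 1]},      # multiallelic
    {"id": "snp3", "alt": ["C"], "calls": [None, None, 2, 0, None]},  # 60% missing
    {"id": "snp4", "alt": ["A"], "calls": [0, 0, 2, 1, None]},        # 20% missing
]

kept_snps = filter_snps(snps)
kept_names = filter_accessions(kept_snps, names)
print([s["id"] for s in kept_snps], kept_names)
```

In this toy example, snp2 and snp3 are removed by the SNP-level filters, and acc5 is then removed because it has no calls at the remaining SNPs.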
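PCA, phylogenetic and admixture analyses showed that spelt falls into three gene pools, corresponding to accessions collected in Asia, Central Europe, and the Iberian Peninsula. In the PCA, Asian spelt clustered with bread wheat (even though the bread wheat accessions were collected in Central Europe), whereas Central European and Iberian spelt formed two distinct groups. The artificial spelt-x-wheat crosses, shown in black in the figure, support the validity of this grouping. Two spelt accessions collected in the United States also fell in the wheat-spelt hybrid zone, indicating a hybrid origin. Four accessions collected in Africa clustered with the European accessions, suggesting that they were probably brought over from Europe.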
Population genomic analyses: PCA were performed using the python library scikit-learn v0.17.1. Phylogenetic trees were computed using the SNPhylo software package with the 54,205 anchored SNPs. Low-quality SNPs based on allele frequency (MAF ≤ 0.05) and missing data (≥20%) were removed using the SNPRelate package. Multiple sequence alignments were done using MUSCLE and phylogenetic trees were constructed by running DNAML from the PHYLIP package. One hundred bootstraps using the phangorn package were done and trees were visualized and annotated with FigTree v1.4.3. Structure analyses were performed with the ADMIXTURE software. Data management and quality control operations were performed using vcftools v0.1.14 and PLINK v1.9. We explored K values from 2 to 10 and determined the lowest cross-validated error rate.
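The maximum-likelihood tree (Fig. 2b) also supports three gene pools. Population structure inference with ADMIXTURE gave the lowest cross-validation error at K = 7. The three main gene pools revealed by PCA and phylogenetic analysis are apparent at K = 3. Compared with K = 3, the Central European spelt accessions are further split into four distinct populations at K = 7, reflecting differences among spring spelt and the northern and southern accessions from Germany and Switzerland.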
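Choosing K by the lowest cross-validation error can be sketched as follows. With cross-validation enabled, ADMIXTURE prints one line of the form `CV error (K=3): 0.585` per run, and the snippet parses the collected lines. The function name and the numbers in the example log are invented for illustration and are not values from the paper.

```python
import re


def best_k(log_text):
    """Return (K, cv_error) for the run with the smallest cross-validation error."""
    pattern = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")
    errors = {int(k): float(err) for k, err in pattern.findall(log_text)}
    if not errors:
        raise ValueError("no 'CV error' lines found")
    k = min(errors, key=errors.get)
    return k, errors[k]


# Made-up numbers for illustration only; they are not results from the study.
example_log = """\
CV error (K=2): 0.610
CV error (K=3): 0.585
CV error (K=4): 0.590
"""

print(best_k(example_log))  # (3, 0.585)
```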
Fig. 2: Whole-genome analysis of spelt and bread wheat
a Principal component analysis (PCA) across the 339 spelt and bread wheat accessions. Samples are colored according to subspecies and geographical origin. b Maximum-likelihood tree constructed with SNPhylo. The colors used to label the accessions are identical to a and the branch size is indicated below the tree. c ADMIXTURE ancestry coefficients (K = 3 and K = 7) for the 339 accessions of spelt and bread wheat. Stacked bars represent accessions and colors represent ancestry components. Accessions are ordered according to subspecies and geographic origin.
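For each of the A, B and D subgenomes, the patterns were similar to those of the whole-genome analysis. PCA of the A and B subgenomes likewise separated the three groups clearly, whereas the first two principal components of the D subgenome explained less variance (9.3%) than those of the A (17.28%) and B (17.23%) subgenomes. These results agree with the known shared origin of the D subgenome in bread wheat and spelt. By repeating the PCA on 3,500 randomly drawn subgenome-specific SNPs, the authors confirmed that this difference is not caused by the smaller number of markers on the D subgenome.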
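The subsampling control can be sketched in plain Python: draw the same number of SNPs from each subgenome, then rerun the PCA on each equal-sized subset. The sketch covers only the sampling step. It assumes that chromosome names end in the subgenome letter (for example `1A`, `3B`, `7D`) and that the same number of SNPs is drawn per subgenome; the function name and toy data are hypothetical.

```python
import random
from collections import defaultdict


def subsample_per_subgenome(snp_chroms, n_per_subgenome, seed=0):
    """Draw the same number of SNP indices from each subgenome.

    snp_chroms: chromosome name per SNP, e.g. '1A', '3B', '7D'. The last
                character is taken as the subgenome (an assumption of this sketch).
    Returns a dict mapping subgenome letter to a sorted list of SNP indices.
    """
    by_subgenome = defaultdict(list)
    for index, chrom in enumerate(snp_chroms):
        subgenome = chrom[-1]
        if subgenome in ("A", "B", "D"):   # unanchored SNPs are skipped
            by_subgenome[subgenome].append(index)

    rng = random.Random(seed)              # fixed seed for reproducible draws
    sampled = {}
    for subgenome in sorted(by_subgenome):
        indices = by_subgenome[subgenome]
        if len(indices) < n_per_subgenome:
            raise ValueError(f"subgenome {subgenome} has only {len(indices)} SNPs")
        sampled[subgenome] = sorted(rng.sample(indices, n_per_subgenome))
    return sampled


# Toy example: 6 A, 8 B and 4 D SNPs plus 2 unanchored ones; draw 3 per subgenome.
chroms = ["1A"] * 6 + ["1B"] * 8 + ["1D"] * 4 + ["Un"] * 2
picked = subsample_per_subgenome(chroms, 3, seed=1)
print({sub: len(idx) for sub, idx in picked.items()})  # {'A': 3, 'B': 3, 'D': 3}
```

With equal marker numbers per subgenome, any remaining difference in explained variance cannot be attributed to marker count alone.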
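Given their geographic proximity, it is surprising that Iberian and Central European spelt show such clear morphological and genetic differentiation. As a first step towards understanding the possible origin of Iberian spelt, the authors used a coalescent-based approach to simulate and test three evolutionary scenarios:
(i) a single introduction of Iberian and Central European spelt into Europe during a common westward migration from the Fertile Crescent;
(ii) independent origins of Iberian and Central European spelt, each introduced into Europe separately;
(iii) an admixture model that assumes a common origin but allows recent gene flow from Asian into Iberian spelt.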
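In all of these models, the ancestral population was assumed to be either the Asian spelt population or an unsampled "ghost" population, giving six models in total (Supplementary Fig. 6). Because of recurrent gene flow from wild and cultivated tetraploid wheat (AABB genome) into the cultivated hexaploid gene pool, only D-subgenome variants were considered. The model with the highest likelihood indicated that Iberian and Central European spelt are independent (AIC = 15,536.17), suggesting that the two groups may have independent migration histories. The Iberian and Central European populations were estimated to have diverged from their ancestral populations 1218 (CI = 1018–1280) and 1023 (CI = 1014–1072) generations ago, respectively.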
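For reference, the Akaike information criterion (AIC) quoted for the model comparison has the standard definition below. Here k is the number of free parameters of a demographic model and \hat{L} is its maximized likelihood; among the candidate models, the one with the lowest AIC is preferred. This is the textbook formula, not one taken from the paper, and the notes give no parameter counts or likelihoods for the individual models.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
```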
Supplementary Figure 6. The six demographic models that were compared in this study. Subscripts indicate population names (AS: Asia, CE: Central Europe, IB: Iberia, ANC: ancestral); N indicates population sizes, T indicates times. Detailed parameters are given in Supplementary Table 1. The terms ‘westMig’, ‘indepWestMig’ and ‘westMigAdmix’ refer to the evolutionary scenarios (i), (ii), and (iii) described in the main text, respectively.
Resolving haplotype variation at the red glume Rg-B1 locus
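Glume color phenotypes were extracted from historical data in the Swiss national gene bank (https://www.bdn.ch/). A genome-wide association study (GWAS) revealed a peak on the short arm of chromosome 1B (Fig. 3b, Supplementary Fig. 7), corresponding to a confidence interval of ~2 Mb spanning physical positions 2.24–4.17 Mb. The corresponding region in Chinese Spring RefSeq v1.0 contains 33 annotated high-confidence genes. Among them, a single copy of a predicted R2R3-MYB-like transcription factor (TraesCS1B02G005200) was identified as the most promising candidate gene.
For the GWAS, SNPs were filtered at MAF > 0.05, and SNP associations were tested in PLINK using a logistic regression model assuming additive genetic effects.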
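The MAF filter and the additive coding can be illustrated with a short sketch. Under an additive model, each genotype enters the regression as the number of copies of one allele (0, 1 or 2), and the minor allele frequency follows directly from those counts. The function names and toy genotypes are hypothetical, and the actual association test in the study was run in PLINK, not with this code.

```python
def minor_allele_frequency(genotypes):
    """MAF from additive genotype codes (0, 1, 2 = copies of the alternate allele).

    Missing calls (None) are ignored.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))   # two alleles per accession
    return min(alt_freq, 1 - alt_freq)


def passes_maf(genotypes, threshold=0.05):
    """True if the SNP would be kept under a MAF > threshold filter."""
    return minor_allele_frequency(genotypes) > threshold


rare_snp = [0] * 19 + [1]           # alternate allele frequency 1/40 = 0.025
common_snp = [0, 0, 1, 2, None]     # alternate allele frequency 3/8 = 0.375

print(passes_maf(rare_snp), passes_maf(common_snp))  # False True
```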
a Images of a white glume spelt accession (right) and a red glume spelt accession (left). b Manhattan plot showing a significant association for glume color at the Rg-B1 locus on chromosome arm 1BS. The physical confidence interval spans positions 2.24–4.17 Mb in the RefSeq v1.0 assembly of Chinese Spring. The region contains a MYB transcription factor gene (TraesCS1B02G005200). c Copy number and allele variation for the candidate MYB transcription factor gene in ten high-quality wheat assemblies. Shown are the first 10 megabases of chromosome arm 1BS in the ten different wheat assemblies (left). Arrows indicate MYB-like transcription factor genes that are paralogous to TraesCS1B02G005200. Colors refer to one of five different allele groups (G1–G5) shown on the right. Group 3 is associated with red glumes in bread wheat and spelt. d Haplotype-specific PCR marker for the group 3 alleles. The red arrow points to the 334 bp amplicon specific for the group 3 Rg-B1 alleles. Scale bar = 1 cm. e Phylogenetic analysis of MYB transcription factors regulating the flavonoid biosynthesis pathway. SG = subgroup based on conserved amino-acid motifs. The specific flavonoids that are linked to each subgroup are indicated.
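To assess allelic variation across wheat cultivars, the authors performed a haplotype analysis of the Rg-B1 interval in ten high-quality wheat genome assemblies, which revealed extensive allelic and copy number variation of the MYB transcription factor gene. In total, 26 paralogs were identified across the ten genomes, ranging from zero in the Swiss bread wheat line ArinaLrFor to seven in spelt accession PI 190962. All 26 paralogs lie within the first 10 Mb of the short arm of chromosome 1B. This extensive structural variation represented 13 alleles that share >93% sequence identity with TraesCS1B02G005200. Based on gene structure and sequence similarity, the authors assigned the 13 alleles to five groups.
To validate the candidate alleles, the authors developed group-specific markers based on specific nucleotide polymorphisms. In particular, a PCR-based co-dominant marker for Rg-B1b_h1 was developed from a unique 47 bp InDel in the first intron; it distinguishes the group 3 alleles from all other groups.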
Characterization of the Rg-B1 group 3 alleles
Group 3 Rg-B1 alleles upregulate flavonoid biosynthesis genes
Transient expression of Rg-B1 in tobacco
Fig. 4: Transient expression of Rg-B1 in Nicotiana benthamiana
a Transcript levels of endogenous flavonoid biosynthetic genes in N. benthamiana infiltrated with the 35S:GFP vector control and the Rg-B1-overexpressing constructs 35S:Rg-B1b_h1:GFP and 35S:Rg-B1b_h3:GFP. Error bars represent standard errors of three biological replicates. b Spectrofluorometric profile of agroinfiltrated leaves pre and post DPBA staining. The region between the two vertical dotted lines from 550 to 650 nm coincides with the previously reported peak for the flavonol quercetin. The peak from ~665 to 685 nm is autofluorescence from chlorophyll/chloroplasts.
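In summary, the haplotype analysis, association and genetic mapping, and infiltration experiments indicate that glume color at the Rg-B1 locus is controlled by a specific group of R2R3-MYB transcription factor variants.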
Additional references
[1] Marcussen, T. et al. Ancient hybridizations among the ancestral genomes of bread wheat. Science 345, 1250092 (2014).
[2] Dvorak, J. et al. The origin of spelt and free-threshing hexaploid wheat. J. Hered. 103, 426–441 (2012).
[3] Liu, Y. G. & Tsunewaki, K. Restriction-fragment-length-polymorphism (RFLP) analysis in wheat. II. Linkage maps of the RFLP sites in common wheat. Jpn. J. Genet. 66, 617–633 (1991).
[4] Blatter, R. H. E., Jacomet, S. & Schlumbaum, A. Spelt-specific alleles in HMW glutenin genes from modern and historical European spelt (Triticum spelta L.). Theor. Appl. Genet. 104, 329–337 (2002).
[5] Blatter, R. H. E., Jacomet, S. & Schlumbaum, A. About the origin of European spelt (Triticum spelta L.): allelic differentiation of the HMW Glutenin B1-1 and A1-2 subunit genes. Theor. Appl. Genet. 108, 360–367 (2004).