Soybean (Glycine max) Haplotype Map (GmHapMap): a universal resource for soybean translational and functional genomics

Citation

Torkamaneh, D., Laroche, J., Valliyodan, B., O’Donoughue, L., Cober, E., Rajcan, I., Vilela Abdelnoor, R., Sreedasyam, A., Schmutz, J., Nguyen, H.T., Belzile, F. (2021). Soybean (Glycine max) Haplotype Map (GmHapMap): a universal resource for soybean translational and functional genomics. Plant Biotechnology Journal, [online] 19(2), 324-334. http://dx.doi.org/10.1111/pbi.13466

Plain language summary

A haplotype map describes patterns of genetic variation within a population of individuals. A soybean haplotype map (GmHapMap) describes genetic variation among different soybean varieties. This study describes the development of a worldwide GmHapMap from 1007 soybean varieties. Almost 15 million genetic variants were found in this collection of soybeans. A set of between 600 and 800 varieties was enough to capture almost all of the worldwide diversity of cultivated soybean. Haplotypes defined at individual genes can distinguish versions of a gene that have different effects. Such haplotypes sometimes also reveal variation as extreme as a complete loss of function of the gene. The soybean haplotype map can be used worldwide for the study of gene function and for soybean breeding.

Abstract

Here, we describe a worldwide haplotype map for soybean (GmHapMap) constructed using whole-genome sequence data for 1007 Glycine max accessions and yielding 14.9 million variants as well as 4.3 M tag single-nucleotide polymorphisms (SNPs). When sampling random subsets of these accessions, the number of variants and tag SNPs plateaued beyond approximately 800 and 600 accessions, respectively. This suggests extensive coverage of diversity within the cultivated soybean. GmHapMap variants were imputed onto 21 618 previously genotyped accessions with up to 96% success for common alleles. A local association analysis was performed with the imputed data using markers located in a 1-Mb region known to contribute to seed oil content and enabled us to identify a candidate causal SNP residing in the NPC1 gene. We determined gene-centric haplotypes (407 867 GCHs) for the 55 589 genes and showed that such haplotypes can help to identify alleles that differ in the resulting phenotype. Finally, we predicted 18 031 putative loss-of-function (LOF) mutations in 10 662 genes and illustrated how such a resource can be used to explore gene function. The GmHapMap provides a unique worldwide resource for applied soybean genomics and breeding.
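The subsampling analysis mentioned in the abstract (counting variants in random subsets of accessions until the count plateaus) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `variant_saturation`, the toy accession and variant identifiers, and the choice to draw nested subsets from a single shuffled order (rather than independent replicates) are all assumptions made for illustration.

```python
import random


def variant_saturation(genotypes, subset_sizes, seed=0):
    """Count distinct variants observed in nested random subsets of accessions.

    genotypes: dict mapping accession ID -> set of variant IDs it carries
               (hypothetical input format, not the GmHapMap file format).
    subset_sizes: iterable of subset sizes to evaluate.
    Returns a list of (subset_size, number_of_distinct_variants) tuples.
    """
    rng = random.Random(seed)
    order = sorted(genotypes)      # fixed starting order for reproducibility
    rng.shuffle(order)             # one random ordering of the accessions
    curve = []
    for n in sorted(subset_sizes):
        # Union of the variant sets of the first n accessions in the ordering.
        seen = set().union(*(genotypes[a] for a in order[:n]))
        curve.append((n, len(seen)))
    return curve


# Toy data (invented): four accessions sharing three variants.
toy = {
    "acc1": {"v1", "v2"},
    "acc2": {"v2", "v3"},
    "acc3": {"v1", "v3"},
    "acc4": {"v1", "v2", "v3"},
}
curve = variant_saturation(toy, [1, 2, 3, 4])
for size, n_variants in curve:
    print(size, n_variants)
```

In this toy example any two accessions already carry all three variants, so the curve flattens from a subset size of two onward. With real data one would typically average the counts over many random orderings before judging where the plateau begins.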

Publication date

2021-02-01

Author profiles