Genomic prediction accuracy of seven breeding selection traits improved by QTL identification in flax

Citation

Lan, S., Zheng, C., Hauck, K., McCausland, M., Duguid, S.D., Booker, H.M., Cloutier, S., You, F.M. (2020). Genomic prediction accuracy of seven breeding selection traits improved by QTL identification in flax. International Journal of Molecular Sciences, [online] 21(5), http://dx.doi.org/10.3390/ijms21051577

Plain language summary

Genomic selection is a key step in plant breeding and crop improvement. It uses molecular markers spanning all chromosomes to predict how well plants will express desired agricultural traits. Using quantitative trait loci (QTL) as markers increases prediction accuracy, saving time and money. Using three types of statistical models, we identified three potential QTL sets for seven traits in flax. This study evaluated how well different combinations of these QTL sets predicted each trait, and found that predictions based on a combination of the single-trait QTL detected by two of the model types were most accurate. Adding extra markers, such as genome-wide SNPs or QTL for other traits, reduced prediction accuracy. To maximize prediction accuracy and minimize the number of QTL markers, further studies on detecting and removing redundant or false-positive QTL in genomic selection are required.

Abstract

Molecular markers are one of the major factors affecting genomic prediction accuracy and the cost of genomic selection (GS). Previous studies have indicated that the use of quantitative trait loci (QTL) as markers in GS significantly increases prediction accuracy compared with genome-wide random single nucleotide polymorphism (SNP) markers. To optimize the selection of QTL markers in GS, a set of 260 lines from bi-parental populations with 17,277 genome-wide SNPs was used to evaluate the prediction accuracy for seed yield (YLD), days to maturity (DTM), iodine value (IOD), protein (PRO), oil (OIL), linoleic acid (LIO), and linolenic acid (LIN) contents. These seven traits were phenotyped over four years at two locations. Identification of quantitative trait nucleotides (QTNs) for the seven traits was performed using three types of statistical models for genome-wide association study: two SNP-based single-locus (SS), seven SNP-based multi-locus (SM), and one haplotype-block-based multi-locus (BM) models. The identified QTNs were then grouped into QTL based on haplotype blocks. For all seven traits, 133, 355, and 1,208 unique QTL were identified by SS, SM, and BM, respectively. A total of 1,420 unique QTL were obtained by SS+SM+BM, ranging from 254 (OIL, LIO) to 361 (YLD) for individual traits, whereas a total of 427 unique QTL were obtained by SS+SM, ranging from 56 (YLD) to 128 (LIO). SS models alone did not identify sufficient QTL for GS. The highest prediction accuracies were obtained using single-trait QTL identified by SS+SM+BM for OIL (0.929 ± 0.016), PRO (0.893 ± 0.023), YLD (0.892 ± 0.030), and DTM (0.730 ± 0.062), and by SS+SM for LIN (0.837 ± 0.053), LIO (0.835 ± 0.049), and IOD (0.835 ± 0.041). In terms of the number of QTL markers and prediction accuracy, SS+SM outperformed other models or combinations thereof. The use of all SNPs or QTL of all seven traits significantly reduced the prediction accuracy of traits. The results further validated that QTL outperformed high-density genome-wide random markers, and demonstrated that the combined use of single- and multi-locus models can effectively identify a comprehensive set of QTL that improve prediction accuracy. However, further studies on the detection and removal of redundant or false-positive QTL are warranted to maximize prediction accuracy and minimize the number of QTL markers in GS.
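
The QTL counts reported in the abstract imply overlap among the model types: 133 + 355 + 1,208 = 1,696, yet SS+SM+BM yields only 1,420 unique QTL, and 133 + 355 = 488 against 427 unique QTL for SS+SM. The Python sketch below illustrates how such combined sets could be counted as a set union. It assumes each QTL carries a unique label (for example, a haplotype block identifier); the labels and the helper name combine_qtl_sets are hypothetical and do not come from the study.

```python
# Hypothetical QTL labels for illustration only; none come from the study.
ss_qtl = {"QTL-A", "QTL-B"}                    # found by single-locus models (SS)
sm_qtl = {"QTL-B", "QTL-C", "QTL-D"}           # found by SNP-based multi-locus models (SM)
bm_qtl = {"QTL-D", "QTL-E", "QTL-F", "QTL-G"}  # found by the block-based model (BM)


def combine_qtl_sets(*qtl_sets):
    """Return the union of QTL sets, so a QTL found by several model types counts once."""
    combined = set()
    for qtl_set in qtl_sets:
        combined |= qtl_set
    return combined


ss_sm = combine_qtl_sets(ss_qtl, sm_qtl)
ss_sm_bm = combine_qtl_sets(ss_qtl, sm_qtl, bm_qtl)

# The simple sum counts shared QTL more than once; the union does not.
simple_sum = len(ss_qtl) + len(sm_qtl) + len(bm_qtl)

print("Unique QTL in SS+SM:", len(ss_sm))
print("Unique QTL in SS+SM+BM:", len(ss_sm_bm))
print("Sum of per-model counts:", simple_sum)
```

In this toy example, the union is smaller than the simple sum because QTL-B and QTL-D are each detected by two model types, which mirrors the gap between 1,696 and 1,420 in the reported totals.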

Publication date

2020-03-01