Low-Depth Genotyping-by-Sequencing (GBS) in a Bovine Population: Strategies to Maximize the Selection of High Quality Genotypes and Accuracy of Imputation

Citation

Bissonnette, N., J.-S. Brouard, B. Boyle, and E. Ibeagha-Awemu. 2017. Low-Depth Genotyping-by-Sequencing (GBS) in a Bovine Population: Strategies to Maximize the Selection of High Quality Genotypes and Accuracy of Imputation. Plant & Animal Genome Conference, San Diego, 2017/01/14 - 2017/01/18.

Plain language summary

Paratuberculosis, an incurable disease of dairy cattle and other ruminants, is also implicated in several human and animal diseases. Genetics can provide a long-term benefit by improving natural resistance to the pathogen. Analysis tools are available to the industry, but most of them were designed to improve livestock production. To refine our capacity to read the genomes of sick animals and disease-resistant animals, a new approach was tested. This genetic analysis tool, genotyping-by-sequencing (GBS), historically applied to crop improvement, was tested in cattle using two methods. The restrictive method was expected to increase the precision of the genetic information. Both methods, the conventional and the restrictive, were tested using a pilot set of 48 animals. We also developed a bioinformatics tool to validate the genetic information obtained from the two methods. We found that the information produced by the conventional GBS method was accurate (>97%), whereas the restrictive method failed (<50%). We also tested two programs used to recover missing information and can now recommend the FIMPUTE program. The strategies presented here provide a framework for such analyses and can generate, at low cost, the large number of high-quality molecular markers required to perform animal genetic studies.

Abstract

Genotyping-by-sequencing (GBS) has emerged as a powerful and cost-effective approach for discovering and genotyping genetic variations. To achieve high levels of complexity reduction, an alternative GBS protocol including selective primers during the PCR amplification step has been proposed. In the present study, we compared this modified protocol to the conventional two-enzyme GBS protocol, using a small group of cows (n=48). Using 48-plex GBS libraries, we detected a total of 123,666 variants with the GBS selective-primer approach and 272,103 variants with the conventional GBS approach. Validating these data with genotypes obtained from mass spectrometry and Illumina’s bovine SNP50 array, we found that the genotypes produced by the conventional GBS method were accurate, whereas the selective-primer method failed to call heterozygotes with confidence. Our results indicate that high accuracy in genotype calling (>97%) can be obtained using low read-depth thresholds (3 to 5 reads) provided that markers are simultaneously filtered for genotype quality scores. We also show that factors such as the minimum call rate and the minor allele frequency positively influence the accuracy of imputation of missing GBS data. The highest accuracies (around 85%) of imputed GBS markers were obtained with the FIMPUTE program when GBS genotypes and SNP50 array genotypes were combined. The strategies presented here provide a framework for the analysis of GBS data and could be used to generate, at low cost, the large number of high-quality molecular markers required to perform genome-wide association studies in animal populations.
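
The filtering and validation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Call` record, the function names `filter_calls` and `concordance`, the genotype-quality cutoff of 20, and the toy data are all assumptions made for illustration. Only the low read-depth threshold (3 to 5 reads) and the idea of comparing GBS calls against array or mass spectrometry genotypes come from the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Call:
    """One GBS genotype call (hypothetical record layout)."""
    genotype: str  # e.g. "0/0", "0/1", "1/1"
    depth: int     # number of reads supporting the call
    gq: int        # genotype quality score


# A call is identified by (animal_id, marker_id); both are illustrative names.
Key = Tuple[str, str]


def filter_calls(calls: Dict[Key, Call],
                 min_depth: int = 3,
                 min_gq: int = 20) -> Dict[Key, str]:
    """Keep calls that pass BOTH the depth and the quality threshold.

    min_depth=3 is the low end of the 3-to-5-read range in the abstract;
    min_gq=20 is an assumed value, not one reported by the authors.
    """
    return {
        key: call.genotype
        for key, call in calls.items()
        if call.depth >= min_depth and call.gq >= min_gq
    }


def concordance(kept: Dict[Key, str],
                reference: Dict[Key, str]) -> Optional[float]:
    """Fraction of retained calls that match the reference genotypes.

    Only keys present in both dictionaries are compared; returns None
    when there is no overlap.
    """
    shared = [key for key in kept if key in reference]
    if not shared:
        return None
    matches = sum(1 for key in shared if kept[key] == reference[key])
    return matches / len(shared)


if __name__ == "__main__":
    # Toy data, invented for the example.
    gbs_calls = {
        ("cow1", "snpA"): Call("0/1", depth=5, gq=40),
        ("cow1", "snpB"): Call("1/1", depth=2, gq=35),  # fails depth
        ("cow2", "snpA"): Call("0/0", depth=4, gq=10),  # fails quality
        ("cow2", "snpB"): Call("0/1", depth=3, gq=30),
    }
    array_genotypes = {
        ("cow1", "snpA"): "0/1",
        ("cow1", "snpB"): "1/1",
        ("cow2", "snpB"): "1/1",
    }
    kept = filter_calls(gbs_calls)
    print(len(kept), concordance(kept, array_genotypes))
```

Applying the two thresholds together, rather than depth alone, mirrors the abstract's point that low depth cutoffs are acceptable only when genotype quality is filtered at the same time. The same `concordance` function could in principle score imputed genotypes against held-out reference genotypes, although the abstract does not specify how imputation accuracy was computed.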