Is the GATK joint variant discovery approach appropriate for calling variants in RNA-Seq experiments?
Brouard and Bissonnette. 2018. Is the GATK joint variant discovery approach appropriate for calling variants in RNA-Seq experiments? Plant and Animal Genome Conference XXVI, Jan 13th-17th 2018, San Diego, USA.
The Genome Analysis Toolkit (GATK) is a popular set of programs for discovering and genotyping variants from high-throughput sequencing data. Versions 3.0 and above of GATK offer the possibility of calling DNA variants on cohorts of samples using the HaplotypeCaller algorithm in GVCF mode. With this approach, variants are first called individually on each sample using the -ERC GVCF option, producing one gVCF file per sample that lists genotype likelihoods and annotations for each site in the genome. In a second step, variants are called through a joint genotyping analysis of the gVCF files from all samples. This strategy is more flexible and less computationally demanding than the traditional joint discovery workflow. Although the current GATK recommendation for RNA sequencing (RNA-Seq) is to perform variant calling on individual samples, using a GVCF workflow in RNA-Seq could provide substantial advantages. That workflow has not been validated, however. Following the GATK best practices for variant calling on RNA-Seq data, we compared the per-sample and the joint genotyping approaches using paired samples from 56 cows genotyped with RNA-Seq data derived from whole primary macrophage transcriptomes, genotyping-by-sequencing (GBS) data, and Bovine SNP50 BeadChip data. Our results indicate that the per-sample and the joint genotyping approaches perform similarly in terms of sensitivity (>90%) and precision (>70%). Our results also indicate that genotypes with high accuracy (>98%) can be obtained from RNA-Seq data. In addition, we found that a sizeable proportion of the discrepancies between the GBS variant calls and the RNA-Seq variant calls would be best explained by RNA editing events. This study suggests that joint genotyping is a suitable variant-calling method when conducting RNA-Seq experiments.
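As an illustration of how metrics such as sensitivity, precision, and genotype accuracy might be computed when comparing RNA-Seq calls with a reference call set (for example GBS or array genotypes), here is a minimal Python sketch. The abstract does not give the exact definitions used, so those below are assumptions: sensitivity is the fraction of reference variant sites recovered, precision is the fraction of RNA-Seq variant sites present in the reference set, and genotype accuracy is the fraction of shared sites with identical genotypes. The function name `compare_calls` and the toy genotypes are hypothetical and are not data from the study.

```python
from typing import Dict, Tuple

Site = Tuple[str, int]  # (chromosome, 1-based position)


def compare_calls(truth: Dict[Site, str], test: Dict[Site, str]) -> Dict[str, float]:
    """Compare a test call set against a reference ("truth") call set.

    Both inputs map a variant site to a genotype string such as "0/1".
    Definitions are assumptions made for this sketch, not the study's own.
    """
    truth_sites = set(truth)
    test_sites = set(test)
    shared = truth_sites & test_sites

    tp = len(shared)                    # sites called in both sets
    fn = len(truth_sites - test_sites)  # reference sites missed by the test set
    fp = len(test_sites - truth_sites)  # test sites absent from the reference

    # Genotype concordance is evaluated only at sites present in both sets.
    concordant = sum(1 for site in shared if truth[site] == test[site])

    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "precision": tp / (tp + fp) if (tp + fp) else nan,
        "genotype_accuracy": concordant / tp if tp else nan,
    }


# Hypothetical toy data, for illustration only.
truth_calls = {
    ("chr1", 100): "0/1",
    ("chr1", 200): "1/1",
    ("chr2", 50): "0/1",
    ("chr2", 75): "0/1",
}
rnaseq_calls = {
    ("chr1", 100): "0/1",
    ("chr1", 200): "0/1",  # same site, discordant genotype
    ("chr2", 50): "0/1",
    ("chr3", 10): "1/1",   # site not in the reference set
}

metrics = compare_calls(truth_calls, rnaseq_calls)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a real comparison, both call sets would normally first be restricted to regions covered by both assays, since a site lying outside the expressed transcriptome cannot be recovered from RNA-Seq data.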