A Novel and Efficient Approach for Phasing Highly Heterozygous Plant Genomes
Citation
Zhu S, Wang L, You FM, Rodriguez JC, Deal KR, Chen L, Li J, Chakraborty S, Balan B, Jiang CZ, Brown PJ, Leslie CA, Aradhya M, Dandekar AM, Kluepfel DA, Dvorak J, Luo M-C (2017) A novel and efficient approach for phasing highly heterozygous plant genomes. Proc 26th Plant & Animal Genome Conference, San Diego, CA, January 13-17, P0193 (poster)
Abstract
Assembling highly heterozygous plant genomes from short sequence reads is challenging because the different haplotypes are difficult to recover. Standard assembly protocols tend to collapse homozygous regions and report heterozygous regions as alternative contigs; these alternative contigs are hard to resolve, leading to fragmented assemblies larger than the expected genome size. We devised a novel method that overcomes genome heterozygosity by assembling the two haploid genomes of an interspecific hybrid. Here we report the de novo assembly of the two haploid genomes of the interspecific hybrid MS1-56 (Juglans regia cv. Serr × Juglans microcarpa). We used a combination of BioNano genome (BNG) mapping, PacBio single-molecule real-time (SMRT) sequencing, and Illumina sequencing, along with standard and custom-designed assembly protocols, to achieve complete assembly of the two haploid genomes (J. regia and J. microcarpa) that make up the genome of hybrid MS1-56. By coupling SMRT sequencing and BNG mapping, we generated a highly contiguous 1.07 Gb assembly with a contig N50 of 8.0 Mb and a scaffold N50 of 34.8 Mb. We also constructed BNG maps for both parental species of MS1-56 and successfully partitioned the two haplotypes from the sequence assembly of MS1-56: 529 Mb for J. regia ‘Serr’ and 538 Mb for J. microcarpa. We then applied the genetic map of J. regia cv. Chandler to each assembled genome, which anchored 532 Mb of scaffolds in J. regia ‘Serr’ and 524 Mb of scaffolds in J. microcarpa onto 16 chromosomes in each genome; 12 chromosomes in J. regia ‘Serr’ and 14 in J. microcarpa were resolved into single scaffolds. After gap closing, the proportion of N’s dropped to 0.76% in J. regia ‘Serr’ and 0.82% in J. microcarpa. Characterization of the repetitive portion of the two genomes revealed over 350,000 transposable elements in both genomes.
In addition, approximately 31,000 and 29,000 evidence-supported genes were predicted in the J. regia ‘Serr’ and J. microcarpa genomes, respectively. To date, this work presents the most contiguous and complete genome assembly of a highly heterozygous plant species. Notably, high-quality haplotype genomes for both parental species were generated by sequencing a single hybrid offspring.
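For readers less familiar with the contig N50 and scaffold N50 values reported above, the following is a minimal sketch of how an N50 statistic is conventionally computed: the lengths are sorted from longest to shortest, and N50 is the length of the sequence at which the running total first reaches half of the total assembly size. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the poster or from the authors' pipeline.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the summed length of all sequences.
    """
    ordered = sorted(lengths, reverse=True)  # longest first
    total = sum(ordered)
    if total <= 0:
        raise ValueError("need at least one sequence of positive length")

    running = 0
    for length in ordered:
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: the running sum always reaches the total")


if __name__ == "__main__":
    # Toy lengths in arbitrary units (hypothetical, not data from the assembly).
    toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
    print(n50(toy_lengths))
```

Under this definition, a larger N50 means that more of the assembly sits in long sequences, which is why the statistic is commonly used as a summary of contiguity.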