SACCHARIS v2: Streamlining prediction of carbohydrate-active enzyme specificities within large datasets

Citation

TBD

Plain language summary

The structural and chemical diversity of carbohydrates and their linkages requires a large variety of carbohydrate-active enzymes (CAZymes) to form, dismantle, and metabolize these complex molecules. The software SACCHARIS (Sequence Analysis and Clustering of CarboHydrate Active enzymes for Rapid Informed prediction of Specificity) has been developed as a rapid and easy-to-use bioinformatics pipeline to predict CAZyme function in new datasets. SACCHARIS v2 adds a new (optional) GUI and semi-automated prediction and figure generation. It was designed with high-throughput sequencing methods in mind and is geared toward complex datasets, to reveal the total CAZyme content of an organism or community. Here, we outline the development and use of SACCHARIS v2 to discover and annotate CAZymes, and provide insight into complex carbohydrate metabolism in individual organisms and communities.

Abstract

Carbohydrates are chemically and structurally diverse, composed of a wide array of monosaccharides, stereochemical linkages, substituent groups, and intermolecular associations with other biological molecules. A large repertoire of carbohydrate-active enzymes (CAZymes) and enzymatic activities is required to form, dismantle, and metabolize these complex molecules. The software SACCHARIS (Sequence Analysis and Clustering of CarboHydrate Active enzymes for Rapid Informed prediction of Specificity) provides a rapid, easy-to-use pipeline for the prediction of potential CAZyme function in new datasets. We have updated SACCHARIS to: (i) simplify its installation by rewriting it in Python and packaging it for Conda; (ii) enhance its usability through a new (optional) interactive GUI; and (iii) enable semi-automated annotation of phylogenetic tree output via a new R package or the commonly used webserver iTOL. Significantly, SACCHARIS v2 has been developed with high-throughput omics in mind, with pipeline automation geared toward complex (meta)genome and (meta)transcriptome datasets to reveal the total CAZyme content (“CAZome”) of an organism or community. Here, we outline the development and use of SACCHARIS v2 to discover and annotate CAZymes, and provide insight into complex carbohydrate metabolism in individual organisms and communities.

Publication date

2024-03-01

Author profiles