
Abstract

Pectinolytic enzymes are a diverse group of enzymes that break down pectin, a complex and abundant plant cell-wall polysaccharide. In nature, pectinolytic enzymes play an essential role in allowing bacteria and fungi to depolymerize and utilize pectin. In addition, pectinases have been widely applied in various industries, such as the food, wine, textile, and paper and pulp industries. Because of their important biological function and increasing industrial potential, the discovery of novel pectinolytic enzymes has received global interest. However, traditional enzyme characterization relies heavily on biochemical experiments, which are time-consuming, laborious and expensive. To accelerate the identification of novel pectinolytic enzymes, an automatic approach is needed. We developed a machine learning (ML) approach for predicting pectinases in the industrial workhorse fungus Aspergillus niger. The prediction integrated a diverse range of features, including evolutionary profile, gene expression, transcriptional regulation and biochemical characteristics. Results on both the training and the independent testing datasets showed that our method achieved over 90 % accuracy and recalled over 60 % of pectinolytic genes. Application of the ML model to the A. niger genome led to the identification of 83 pectinases, covering both previously described pectinases and novel pectinases that do not belong to any known pectinolytic enzyme family. Our study demonstrates the tremendous potential of ML in the discovery of new industrial enzymes through integrating heterogeneous (post-) genomics data.
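The abstract reports accuracy and recall separately, which matters because pectinolytic genes are a small minority of the genome. The following minimal Python sketch shows how the two metrics are computed from confusion-matrix counts, assuming a binary pectinase/non-pectinase labelling. The function name and all counts are hypothetical and chosen only for illustration; they are not results from the study.

```python
def accuracy_and_recall(tp, fp, tn, fn):
    """Return (accuracy, recall) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    # Accuracy: fraction of all genes labelled correctly.
    accuracy = (tp + tn) / total
    # Recall: fraction of true pectinolytic genes that were found.
    recall = tp / (tp + fn)
    return accuracy, recall


# Hypothetical, invented counts: 50 positive genes among 1000 in total.
acc, rec = accuracy_and_recall(tp=31, fp=20, tn=930, fn=19)
print(f"accuracy={acc:.3f} recall={rec:.3f}")
```

With a rare positive class, the many true negatives dominate accuracy, so a classifier can score high accuracy while still missing a sizeable share of the positives. Recall is reported alongside accuracy for that reason.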

  • This is an open-access article distributed under the terms of the Creative Commons Attribution License.
/content/journal/mgen/10.1099/mgen.0.000674
2021-12-07
2024-04-23
