Conference details

Authors: Spetale, Flavio E.; Murillo, Javier; Cacchiarelli, Paolo; Bulacio, Pilar; Angelone, Laura; Tapia, Elizabeth.

Abstract: Gene Ontology (GO) is a structured repository of concepts (GO-terms) comprising three sub-ontologies: biological process (BP), molecular function (MF), and cellular component (CC). Although gene products should ideally be annotated in all three sub-ontologies, in-silico annotation methods usually cover individual sub-ontologies. Existing cross-ontology annotation methods rely on cross-ontology association rules or on interaction networks. However, their applicability is limited: association rules can only be used with GO transitive relationships, and interaction networks need huge amounts of curated data, which are available only for model organisms.
In this contribution, we present a method for automatic cross-ontology GO annotation, with no restriction on ontology relationships and no requirement of interaction data. The method relies on the factor graph modeling of already linked sub-ontologies. The use of factor graphs makes the resulting cross-ontology annotations highly interpretable, thus enabling easy expert analysis. A proof of concept of the cross-ontology GO annotation method on BP and MF considering Glycine max (soybean) proteins is presented. As expected, properly modeling the logic of the highly diverse relationships involved in the combined sub-ontologies (is-a, part-of, regulates, and capable-of) improves annotation performance with respect to reference annotations on individual sub-ontologies.
The cross-ontology annotation of soybean protein sequences was first tackled with FGGA-GO classifiers, hierarchical ensembles of binary SVM classifiers that rely on factor graph models to overcome inconsistencies among flat SVM predictions of individual GO sub-ontologies (BP and MF).
FGGA-GO classifiers arise as a natural extension of the FGGA and FGGA-CC+ classifiers, originally developed for the automated and consistent GO-MF and GO-CC annotation of protein coding genes, respectively (Spetale et al., 2016; Spetale et al., 2018). FGGA-GO classifiers were evaluated on protein sequences from a non-model organism, Glycine max, using a 5-fold cross-validation approach. Table 1 shows the average hierarchical precision (HP), recall (HR) and F-score (HF) achieved by FGGA-GO classifiers using the Physicochemical + characterization method in the BP, MF and BP-MF sub-ontologies. The table shows that cross-ontology (BP-MF) prediction achieves only slightly better results than the prediction of individual sub-ontologies (BP and MF), but greatly helps the understanding of the underlying biological behavior.
A first insight into the benefits of consistent GO cross-ontology prediction can be appreciated in Figure 1, where the resulting GO-terms are consistent both within each sub-ontology and across the sub-ontologies. This consistency is achieved through the message-passing algorithm and the factor-graph-based mathematical model, which use the relationships within and between sub-ontologies. We should note that the cross-ontology predictions obtained reflect what is expected: molecular functions involved in biological processes do not occur in isolation but appear simultaneously.
On the other hand, predictions on non-model organisms are very different from those on model organisms because of the lack of proper reference genomes. In addition, although recent advances in technology offer opportunities to work on non-model organisms, much work still relies on model organisms. To change this, researchers have to define and restrict their assumptions, such as the specific biological questions, in order to choose the best approach.
FGGA-GO can efficiently annotate proteins and greatly benefits studies focusing on non-model organisms by means of an interpretable model.
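
The hierarchical precision (HP), recall (HR) and F-score (HF) mentioned above can be illustrated with a minimal sketch. It assumes a commonly used set-based definition, in which predicted and reference term sets are first extended with all their ancestors and then compared. The abstract does not state which variant or averaging scheme the authors used, so the function names, the toy term identifiers and the `parents` mapping below are hypothetical and serve only as an illustration for a single protein.

```python
from typing import Dict, Iterable, List, Set, Tuple


def with_ancestors(terms: Iterable[str], parents: Dict[str, List[str]]) -> Set[str]:
    """Return the given terms together with all their ancestors in the graph."""
    closed: Set[str] = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in closed:
            closed.add(term)
            stack.extend(parents.get(term, []))
    return closed


def hierarchical_scores(
    predicted: Iterable[str],
    reference: Iterable[str],
    parents: Dict[str, List[str]],
) -> Tuple[float, float, float]:
    """Compute (HP, HR, HF) for one protein under the assumed set-based definition."""
    pred_set = with_ancestors(predicted, parents)
    ref_set = with_ancestors(reference, parents)
    overlap = len(pred_set & ref_set)
    hp = overlap / len(pred_set) if pred_set else 0.0
    hr = overlap / len(ref_set) if ref_set else 0.0
    hf = 2 * hp * hr / (hp + hr) if (hp + hr) > 0 else 0.0
    return hp, hr, hf


# Toy hierarchy (hypothetical identifiers, not real GO terms):
# "a" is the root, "b" and "c" are children of "a", "d" is a child of "b".
toy_parents = {"b": ["a"], "c": ["a"], "d": ["b"]}

hp, hr, hf = hierarchical_scores(predicted={"d"}, reference={"c"}, parents=toy_parents)
print(f"HP={hp:.3f} HR={hr:.3f} HF={hf:.3f}")
```

In this toy case the prediction shares only the root with the reference, so the scores stay low but non-zero. Whether the root term is counted, and whether scores are averaged per protein or pooled across proteins, are design choices that change the reported values.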

Meeting type: Conference.

Title: Consistent Cross-Ontology Prediction of GO protein in Glycine max.

Scientific meeting: V International Society for Computational Biology Latin America.

Location: Viña del Mar.

Organizing institution: ISCB-LA, SOIBIO and EMBnet.

Published: No

Meeting month: 11