Authors: Spetale, Flavio; Tapia, Elizabeth; Murillo, Javier; Angelone, Laura; Valentini, Giorgio; Bulacio, Pilar.
Title: Reliable Electronic GO Annotations with True Path Rule.
Abstract: Gene annotation is an important problem in bioinformatics research. Possible gene functions and the relationships between them can be described by the Gene Ontology (GO). GO provides a controlled vocabulary of terms across three branches: Cellular Component (CC), Molecular Function (MF) and Biological Process (BP). Gene annotation aims to associate biological data with GO concepts, here called GO terms. Gene annotation can be performed experimentally, using the EXP GO evidence code (Inferred from Experiment) to tag the supporting biological evidence. Alternatively, to narrow down candidate gene annotations for further experimental work, gene annotation can be performed electronically using the IEA GO evidence code (Inferred from Electronic Annotation). Current IEA annotations are mostly obtained from BLAST similarity searches. In many cases, however, e.g., for non-model organisms, BLAST similarity scores may be too weak. To overcome this problem, we consider the design of machine learning methods for reliable IEA gene annotations. Without loss of generality, we focus on the prediction of GO BP classes. For this purpose, IEA gene annotation prediction is modeled as a hierarchical multilabel classification problem. Within this framework, we consider the True Path Rule (TPR) method for predicting BP class nodes. Briefly, TPR carries out two steps. First, predictions are made at each node of the BP ontology graph using a set of binary classifiers. Second, the BP ontology graph is scanned bottom-up, and a consensus prediction is made for each node taking into account its earlier binary prediction and the evidence from its child nodes. As a result of this propagation strategy, a fine balance between precision and recall of gene annotations is obtained. It should be noted, however, that TPR predictions may suffer from a starting problem, i.e., predictions may differ significantly depending on the selection of the starting node at each level of the ontology graph.
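The bottom-up consensus step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the consensus formula, so the code below assumes one commonly used TPR formulation: a node's consensus score is the average of its own classifier score and the consensus scores of those children predicted positive. The function name `tpr_bottom_up`, the `threshold` parameter, and the toy node labels are hypothetical and are not taken from the work itself.

```python
from typing import Dict, List


def tpr_bottom_up(
    children: Dict[str, List[str]],
    scores: Dict[str, float],
    threshold: float = 0.5,
) -> Dict[str, float]:
    """Combine per-node classifier scores bottom-up over an ontology DAG.

    Assumed rule (not stated in the abstract): the consensus of a node is
    the mean of its own score and the consensus scores of its children
    whose consensus exceeds `threshold`.
    """
    consensus: Dict[str, float] = {}

    def visit(node: str) -> float:
        # Memoize so that nodes with several parents are computed once.
        if node in consensus:
            return consensus[node]
        child_scores = [visit(child) for child in children.get(node, [])]
        positive = [s for s in child_scores if s > threshold]
        consensus[node] = (scores[node] + sum(positive)) / (1 + len(positive))
        return consensus[node]

    for node in scores:
        visit(node)
    return consensus


# Toy hierarchy with hypothetical labels: root -> {a, b}, a -> {c}.
toy_children = {"root": ["a", "b"], "a": ["c"]}
toy_scores = {"root": 0.4, "a": 0.6, "b": 0.2, "c": 0.9}
toy_consensus = tpr_bottom_up(toy_children, toy_scores)
```

In this toy case the confident leaf `c` raises the score of its parent `a`, which in turn raises the score of `root`, while the weak prediction at `b` contributes nothing. This sketch covers only the bottom-up pass and does not address the starting problem mentioned in the abstract.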
Meeting type: Conference.
Type of work: Abstract.
Production: Reliable Electronic GO Annotations with True Path Rule.
Scientific meeting: 4to. Congreso Argentino de Bioinformática y Biología Computacional (4CAB2C) y 4ta. Conferencia Internacional de la Sociedad Iberoamericana de Bioinformática (SoIBio).
Meeting place: Rosario.
Organizing Institution: CIFASIS-Conicet-UNR, Asociación Argentina de Bioinformática y Biología Computacional (A2B2C) - Sociedad Iberoamericana de Bioinformática (SoIBio).
Published: Yes
Publication place: Rosario
Meeting month: 10