Conceptual framework and pilot study to benchmark phylogenomic databases based on reference gene trees.

Details

Resource 1 Download: BIB_3EF9A7BDC38C.P001.pdf (579.58 KB)
State: Public
Version: author
Serval ID
serval:BIB_3EF9A7BDC38C
Type
Article: article from journal or magazine.
Collection
Publications
Institution
Title
Conceptual framework and pilot study to benchmark phylogenomic databases based on reference gene trees.
Journal
Briefings in Bioinformatics
Author(s)
Boeckmann B., Robinson-Rechavi M., Xenarios I., Dessimoz C.
ISSN
1477-4054 (Electronic)
ISSN-L
1467-5463
Publication state
Published
Issued date
2011
Peer-reviewed
Yes
Volume
12
Number
5
Pages
423-435
Language
English
Abstract
Phylogenomic databases provide orthology predictions for species with fully sequenced genomes. Although the goal seems well-defined, the content of these databases differs greatly. Seven ortholog databases (Ensembl Compara, eggNOG, HOGENOM, InParanoid, OMA, OrthoDB, Panther) were compared on the basis of reference trees. For three well-conserved protein families, we observed a generally high specificity of orthology assignments for these databases. We show that differences in the completeness of predicted gene relationships and in the phylogenetic information are, for the great majority, not due to the methods used, but to differences in the underlying database concepts. According to our metrics, none of the databases provides a fully correct and comprehensive protein classification. Our results provide a framework for meaningful and systematic comparisons of phylogenomic databases. In the future, a sustainable set of 'Gold standard' phylogenetic trees could provide a robust method for phylogenomic databases to assess their current quality status, measure changes following new database releases and diagnose improvements subsequent to an upgrade of the analysis procedure.
Keywords
conceptual comparison, phylogenomic databases, quality assessment, reference gene trees
PubMed
Web of Science
Open Access
Yes
Create date
15/08/2011 10:20
Last modification date
20/08/2019 14:35