GOM
DI   ISRO

Estimating phylogenies under maximum likelihood: A very large-scale neighborhood approach

GOM - Technical Report


Authors

D. Catanzaro, R. Pesenti, M. C. Milinkovitch

Abstract

A basic problem in evolutionary genetics is the estimation of phylogenies among DNA or protein sequences. This problem is known to be NP-hard under several optimality criteria used for evaluating the quality of phylogenies. Consequently, one can reasonably search for optimal phylogenies only for datasets of small sizes such that the ever increasing number of molecular data accumulating in public databases increases the need for new heuristic algorithms for large phylogeny estimation. We introduce here Very Large-Scale Neighborhood (VLSN) techniques for the phylogeny estimation problem. VLSN techniques belong to the class of local search algorithms, and are characterized by a neighborhood size that grows exponentially with the input data. The underlying basic concept of VLSN techniques is that a greater neighborhood improves the quality of the locally optimal solutions found. These techniques provide efficient (typically polynomial) means to search for the best local optimum inside a neighborhood of a given solution. Here, we adapt these techniques to estimate phylogenies of large datasets of nucleotide sequences under the maximum likelihood criterion. We show that the use of the VLSN techniques speeds up convergence to topological local optima, and increases the overall performances of stochastic-based search algorithms.

Full text

Full text available in PDF.

Technical Reports 2007
Publications
Home