Probabilities on cladograms: Introduction to the alpha model.
This thesis introduces the alpha model, a one-parameter family of probability models on cladograms (binary leaf-labeled trees) that interpolates continuously between the Yule, Uniform and Comb distributions. The single parameter alpha varies from 0 to 1, with alpha = 0 giving the Yule model, alpha = 1/2 the Uniform model and alpha = 1 the Comb. For each fixed alpha, the alpha model is a sequence $(P_n)_{n \in \mathbb{N}}$, with $P_n$ a probability on cladograms with $n$ leaves. This sequence is sampling consistent, roughly meaning that choosing a random tree from $P_n$ and deleting $k$ random leaves gives a random tree from $P_{n-k}$. It is also Markovian self-similar. The only other known family with these properties is the beta model of Aldous. An explicit formula is given for the probability of a given tree shape under the alpha model. The expected values of Sackin's and Colless' indices are found ($\sim \frac{\Gamma(3-\alpha)}{\alpha(1+\alpha)}\, n^{1+\alpha}$ for alpha > 0), as well as their asymptotic covariance. The expected depth of a random leaf is $\sim \frac{\Gamma(3-\alpha)}{\alpha(1+\alpha)}\, n^{\alpha}$ for alpha ≠ 0. The number of cherries on a random alpha tree is shown to be asymptotically normal with known mean and variance.

The alpha and beta models are used to analyze the shape of a large number of phylogenetic trees from the databases TreeBASE and TreeFam. Some algorithms for ranked trees are presented. Encodings of cladograms as strings and perfect matchings are also given. The mixing times for Markov chains on tree shapes and cladograms are also bounded.
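
To make the statistics mentioned above concrete, here is a minimal Python sketch that grows a random tree leaf by leaf and computes Sackin's index and the number of cherries. The abstract does not spell out the growth rule, so the sketch assumes the usual sequential construction of the alpha model: with k leaves present, each of the k leaf edges gets weight 1 - alpha, each of the k - 1 internal edges (counting an edge above the root) gets weight alpha, and the new leaf is attached to an edge chosen in proportion to its weight. The names `grow_alpha_tree`, `sackin` and `cherries` are illustrative, and depth is counted here as the number of edges between a leaf and the root.

```python
import random


def grow_alpha_tree(n, alpha, seed=None):
    """Grow a tree with n leaves under the assumed alpha insertion rule.

    Returns (parent, leaves): parent[v] is the parent of node v (None for
    the root) and leaves is the list of leaf node ids.
    """
    rng = random.Random(seed)
    parent = [None]   # start from a single leaf, node 0
    leaves = [0]
    internal = []
    for k in range(1, n):  # k = current number of leaves
        # Total edge weight is k*(1 - alpha) + (k - 1)*alpha = k - alpha.
        # With a single leaf there is only one edge, so the choice is forced.
        p_leaf = 1.0 if k == 1 else k * (1 - alpha) / (k - alpha)
        if rng.random() < p_leaf:
            v = rng.choice(leaves)     # split the edge above a leaf
        else:
            v = rng.choice(internal)   # split the edge above an internal node
        u = len(parent)                # new internal node
        w = u + 1                      # new leaf
        parent.append(parent[v])       # u takes over v's old parent
        parent.append(u)               # w hangs below u
        parent[v] = u
        internal.append(u)
        leaves.append(w)
    return parent, leaves


def depth(parent, v):
    """Number of edges from node v up to the root."""
    d = 0
    while parent[v] is not None:
        v = parent[v]
        d += 1
    return d


def sackin(parent, leaves):
    """Sackin's index: the sum of the depths of all leaves."""
    return sum(depth(parent, v) for v in leaves)


def cherries(parent, leaves):
    """Number of internal nodes whose two children are both leaves."""
    leaf_children = {}
    for v in leaves:
        p = parent[v]
        leaf_children[p] = leaf_children.get(p, 0) + 1
    return sum(1 for count in leaf_children.values() if count == 2)


if __name__ == "__main__":
    for a in (0.0, 0.5, 1.0):
        parent, leaves = grow_alpha_tree(200, a, seed=1)
        print(a, sackin(parent, leaves), cherries(parent, leaves))
```

Under these assumptions, alpha = 1 never splits a leaf edge once two leaves exist, so every tree produced is a comb with exactly one cherry, and its Sackin's index is (n - 1)(n + 2)/2 under the edge-count depth convention. Averaging `sackin` over many runs for 0 < alpha < 1 gives a rough empirical check of the n^(1+alpha) growth stated in the abstract.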