A collection of links to Phylogeny Programs

The main packages:


General-purpose packages

Parsimony programs

Distance matrix methods

Computation of distances

Maximum likelihood and related methods

Quartets methods

Invariants (or Evolutionary Parsimony) methods

Interactive tree manipulation

Biogeographic analysis and host-parasite comparison

Bootstrapping and other measures of support

Compatibility analysis

Tree alignment

Comparative method analysis

Tree plotting/drawing

Sequence management/job submission

Teaching about phylogenies


PHYLIP is the package described in this Web site (so, strictly speaking, it's not available only "elsewhere"). It is available free, from our Web site, in C source code, or as executables for pre-386 DOS, 386/486/Pentium DOS, 386/486/Pentium Windows, Macintosh, or PowerMac. It includes programs to carry out parsimony, distance matrix methods, maximum likelihood, and other methods on a variety of types of data, including DNA and RNA sequences, protein sequences, restriction sites, 0/1 discrete characters data, gene frequencies, continuous characters and distance matrices. It is the most widely-distributed phylogeny package, with over 3,000 registered users, some of them satisfied.


David Swofford of the Laboratory of Molecular Systematics, National Museum of Natural History, Smithsonian Institution, Washington, D.C. has written PAUP (which originally meant Phylogenetic Analysis Using Parsimony). Version 3.0 was available for Macintoshes. It is currently not available, but a new version, to be called PAUP*, will be released by Sinauer Associates, of Sunderland, Massachusetts, no earlier than June, 1996. It will have Macintosh, DOS, and Unix versions. It will include parsimony, distance matrix, invariants, and maximum likelihood methods. PAUP 3.0 was probably the most sophisticated parsimony program, with many options and close compatibility with MacClade (for which see below). The new program, PAUP*, will become much broader with the inclusion of more methods. The price will be $100 US for the Macintosh or DOS/Windows versions. Their ISBNs will be 0-87893-802-8 and -803-6. Orders can be placed with Sinauer at orders@sinauer.com, once it is released.


If you have a Macintosh computer and any interest in discrete-state parsimony methods (including DNA and protein parsimony), you should definitely get MacClade. It was written by Wayne Maddison and David Maddison of the University of Arizona. All distribution is by Sinauer Associates, Sunderland Massachusetts 01375, USA. Their phone number is: (413) 665 3722, FAX: (413) 665 7292. A disk with program, help file, and example data files, plus book (which has about 100 pages of intro to phylogenetic theory, and 250 pages of program instructions), is $75 U.S. ($40 for the book alone). Site licenses also available. An earlier and less capable Version 2 (which for example cannot read nucleic acid sequences and has fewer features for discrete characters) is also available by anonymous ftp from the EMBL, Indiana and Houston molecular biology software servers. Their addresses are given below under the descriptions of TreeAlign and ClustalV. MacClade 2.1 will be found among their Mac software, as a squeezed and then binhexed file. MacClade enables you to use the mouse-window interface to specify and rearrange phylogenies by hand, and watch the number of character steps and the distribution of states of a given character on the tree change as you do so. MacClade is positively addictive and will give you a much better feel for the tree and your data. It's the closest thing to a phylogeny video game that I have seen. It has been influential in spurring the inclusion of interaction and graphics into other phylogeny programs. (I have tried to supply this functionality in PHYLIP by incorporating the programs MOVE, DOLMOVE, and DNAMOVE, which act somewhat like MacClade). MacClade does not have a sophisticated search algorithm to find best trees: it largely relies on you to do it by hand (which is surprisingly effective), with only a local rearrangement algorithm available to improve on that tree.
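
To illustrate the "number of character steps" that changes as a tree is rearranged, here is a minimal Python sketch of Fitch's counting procedure for a single unordered character on a rooted binary tree. The nested-tuple tree encoding, the function name, and the example states are assumptions made for illustration; this is not MacClade's own code.

```python
def fitch_steps(tree, states):
    """Return (state_set, steps) for one unordered character.

    tree   -- a tip name (str) or a pair (left, right) of subtrees
    states -- dict mapping each tip name to its observed state
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = fitch_steps(tree[0], states)
    right_set, right_steps = fitch_steps(tree[1], states)
    common = left_set & right_set
    if common:
        # The two subtrees can agree on a state: no extra step here.
        return common, left_steps + right_steps
    # No shared state: one change is required at this node.
    return left_set | right_set, left_steps + right_steps + 1


# Hypothetical character scored on four taxa.
states = {"A": "G", "B": "G", "C": "T", "D": "T"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
steps_1 = fitch_steps(tree_1, states)[1]
steps_2 = fitch_steps(tree_2, states)[1]
```

Moving taxa between the two arrangements changes the step count for this character, which is the kind of feedback an interactive program can display after each rearrangement.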


J. S. Farris has produced Hennig86, a fast parsimony program including branch-and-bound search for most parsimonious trees and interactive tree rearrangement. Although complete benchmarks have not been published, it is said to be faster than Swofford's PAUP; both are a great many times faster than the parsimony programs in PHYLIP. The program is distributed in executable object code only and costs $50, plus $5 mailing costs ($10 outside of the U.S.). The user's name should be stated, as copies are personalized as a copy-protection measure. It is distributed by Arnold Kluge, Amphibians and Reptiles, Museum of Zoology, University of Michigan, Ann Arbor, Michigan 48109-1079, U.S.A. (akluge@umich.edu) and by Diana Lipscomb at George Washington University (BIODL@gwuvm.gwu.edu). It runs on PC-compatible microcomputers with at least 512K of RAM and needs no math coprocessor or graphics monitor. It can handle up to 180 taxa and 999 characters.


Mark Siddall, of the Virginia Institute of Marine Sciences (mes@vims.edu) has released Random Cladistics, version 4.0, a set of programs that can carry out bootstrapping, jackknifing, a variety of kinds of permutation tests, and search for "islands" of trees, using Hennig86 to analyze the data. It can also mark ranges of sites for inclusion or exclusion, compare trees from the analyses, and do many other operations. To use it you must have a copy of Hennig86 (for whose distribution see above). Random Cladistics will carry out the appropriate transformations of your data and will call Hennig86 and have it analyze them, and then it will summarize the results. Recent additions to Random Cladistics include ARNIE, which looks for the length incongruence difference of Farris et al. (1994) to test whether your dataset partitions are meaningful; WARDLEY, which uses the method of Wheeler and Nixon to create Sankoff-characters (step matrices) for TV:TS ratios and gap costs in tree estimation; TEASE, which strips uninformative characters from your matrix; and PACK, which combines characters with the same state distributions into one weighted character. They are available as DOS executables in a self-extracting archive by anonymous ftp from zoo.toronto.edu or by World Wide Web from the Willi Hennig Society software page http://www.vims.edu/~mes/hennig/software.html.
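
The two matrix-tidying steps described for TEASE and PACK can be sketched in Python as follows. This is an illustration under stated assumptions, not the Random Cladistics code: a character is treated as parsimony-informative when at least two states each occur in at least two taxa, "the same state distribution" is taken to mean identical columns, and the function names are hypothetical.

```python
from collections import Counter


def is_informative(column):
    """True if at least two states each occur in two or more taxa."""
    counts = Counter(column)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def strip_uninformative(columns):
    """Keep only the parsimony-informative characters."""
    return [c for c in columns if is_informative(c)]


def pack(columns):
    """Merge identical characters into (character, weight) pairs."""
    weights = Counter(tuple(c) for c in columns)
    return list(weights.items())


# Hypothetical 0/1 characters, one string per character, four taxa each.
columns = ["0011", "0011", "0001", "0000", "0101"]
kept = strip_uninformative(columns)
packed = pack(kept)
```

Here the constant character and the character with a single derived taxon are dropped, and the two identical characters are merged into one character of weight 2.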


Torsten Eriksson at Stockholm University (Torsten.Eriksson@botan.su.se) has written a program, AutoDecay, which generates Decay Indices from an existing PAUP treefile. It is intended to simplify the task of creating reverse constraint trees in PAUP and subsequent generation of Bremer support values. (Bremer, K. 1994. Cladistics 10: 295-304). AutoDecay 3.0 (a C program compiled for the Macintosh) and relevant instructions as well as an older (HyperCard stack) version (AutoDecay 2.7) can be obtained by anonymous ftp from 128.97.40.127 in directory pub/autodecay.
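
As a small worked example of the arithmetic involved: the decay index (Bremer support) of a clade is the length of the shortest tree that lacks that clade minus the length of the shortest tree overall. The Python sketch below assumes the constrained searches have already been run; the tree lengths and clade labels are hypothetical and the function is not part of AutoDecay.

```python
def decay_indices(shortest_length, constrained_lengths):
    """Decay index per clade: extra steps needed to lose that clade."""
    return {clade: length - shortest_length
            for clade, length in constrained_lengths.items()}


# Hypothetical tree lengths, for illustration only.
shortest = 120
without_clade = {"A+B": 123, "C+D": 121, "A+B+C": 127}
support = decay_indices(shortest, without_clade)
```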


Doug Eernisse of the California State University, Fullerton (DEernisse@fullerton.edu) has constructed DNA Stacks, a Macintosh HyperCard stack that can carry out a variety of analyses on DNA sequences. It has an alignment editor, and can carry out various kinds of translation, and codon bias analysis. It can write out data sets in PAUP, Hennig86, and PHYLIP formats. In its "Support Index Blocks..." menu item it is able to prepare jobs for PAUP to enable Decay Index (Support Index) analysis as well. It is available by World Wide Web from http://biology.fullerton.edu/people/faculty/doug-eernisse


James Lyons-Weiler of the University of Nevada, Reno (weiler@ers.unr.edu) has released RASA, software for the Mac that will perform "Relative Apparent Synapomorphy Analysis", a test for the presence of phylogenetic signal in any type of discrete character data matrix (morphological or molecular). This is two programs, RASA itself and RASA Plot, which (respectively) carry out the test and plot the results. The test will be described in a forthcoming paper: Lyons-Weiler, J., G.A. Hoelzer, and R.J. Tausch. 1996. Relative Apparent Synapomorphy Analysis (RASA) I: the statistical measurement of phylogenetic signal. Molecular Biology and Evolution, in press. The programs are available by World Wide Web at http://loco.biology.unr.edu/archives/rasa/rasa.html as a binhexed self-extracting archive, and by anonymous ftp at loco.biology.unr.edu in directory pub/rasa


J. S. Farris has recently released RA (Rapid nucleotide Analysis). It features rapid bootstrapping. It is available from Arnold Kluge, Amphibians and Reptiles, Museum of Zoology, University of Michigan, Ann Arbor, Michigan 48109-1079, U.S.A. (akluge@umich.edu) and Diana Lipscomb at George Washington University (BIODL@gwuvm.gwu.edu) who may be contacted for details. The cost is said to be about $30 US.


ClaDOS, an interactive program which allows rearrangement of trees and their evaluation, mapping of characters into them, and more, is available for DOS systems from Kevin Nixon, L. H. Bailey Hortorium, Cornell University, 467 Mann Library, Ithaca, New York 14853. Rumor has it that the cost is in the vicinity of $55 US.


MEGA (Molecular Evolutionary Genetic Analysis) has been released by Sudhir Kumar, Koichiro Tamura, and Masatoshi Nei of the Institute of Molecular Evolutionary Genetics, 328 Mueller Lab, Pennsylvania State University, University Park, Pennsylvania 16802, U.S.A. It is an executable program for DOS machines, and is menu-driven with context-sensitive help. It also runs under Windows in a DOS Window. It analyzes data from DNA, RNA and protein sequences, and distance matrices produced from other kinds of data as well. It includes the Neighbor-Joining distance matrix method, a branch and bound parsimony method, and bootstrapping. It also plots trees on many kinds of printers. The program costs $15 (for the documentation). Inquiries can also be made by mail to Joyce White at the above address or by electronic mail to imeg@psuvm.psu.edu.


TREECON is a software package developed by Yves Van de Peer for the construction and drawing of phylogenetic trees based on distance data. Several equations are included to convert dissimilarity into evolutionary distance and several methods (such as neighbor-joining) are included for inferring the tree topology. It also includes bootstrap analysis. The DOS version of the program is available for free and runs on 80386 (and higher) computers. It was described in CABIOS 9: 177-182 (1993). A Windows version, announced in CABIOS 10: 569-570 (1994), is also available; a fee of $75 is asked for it. A demonstration version of the Windows version and more information about TREECON can be found at URL http://www.uia.ac.be/u/yvdp/index.html or you can contact the author at the Department of Biochemistry, University of Antwerp (UIA), Universiteitsplein 1, B-2610 Antwerpen, Belgium. His e-mail address is yvdp@uia.ua.ac.be.


Jun Adachi and Masami Hasegawa have written a package MOLPHY 2.2, carrying out maximum likelihood inference of phylogenies for either nucleotide sequences or protein sequences. Their protein sequence maximum likelihood program, ProtML, is a successor to the one they made available to me for distribution on a nonsupported basis in PHYLIP, and is much improved over that. It is the best protein maximum likelihood program available. The package is distributed free in C source code, with documentation, by ftp from sunmh.ism.ac.jp. A PowerMac executable version is available from the University of Oxford Biology Department Web server at http://evolve.zps.ox.ac.uk/PhySoft/PhySoft.html.


Gary Olsen, of the Department of Microbiology, University of Illinois, has developed a speeded-up replacement for my program DNAML coded in C, called fastDNAml. It achieves a number of economies and also is organized so that it can be run on parallel processors -- he and his co-workers have constructed trees of very large size on a high-speed parallel processor. The program can be compiled using the "p4" portable parallel processing toolkit. It can also be run in ordinary serial mode on workstations where it is faster than DNAML. The C program is available by anonymous ftp from the Ribosomal Database Project at info.mcs.anl.gov in directory pub/RDP/programs/fastDNAml. The C program and PowerMac executables are also available by anonymous ftp from the Indiana University Biology ftp server at ftp.bio.indiana.edu in directory molbio/evolve and also from the University of Oxford Biology Web server at http://evolve.zps.ox.ac.uk/PhySoft/PhySoft.html.


Ziheng Yang of the Department of Integrative Biology, University of California, Berkeley (yang@msw4.biol.berkeley.edu) has released PAML 1.0, a program for the maximum likelihood analysis of nucleotide or protein sequences (including Hidden Markov Model analysis like the features we have in DNAML). It is available as C source code for Unix systems, and is free by anonymous ftp from the molecular biology software servers. It will be found on ftp.bio.indiana.edu, for example, in directory molbio/evolve, including a 486 executable.


Mike Charleston (mcharles@udcf.gla.ac.uk) of the Division of Environmental and Evolutionary Biology of the University of Glasgow has developed Spectrum, a Macintosh program for finding bipartition spectra from phylogenetic molecular and distance data, according to the method of Hendy et al. (1994), for moderately sized data sets (up to 18 taxa). The program also implements a branch-and-bound search for the "closest tree" - that is, the tree whose expected spectrum is closest to the spectrum derived from the observed data. It is free, and is now available from its WWW site in the Glasgow Taxonomy web pages: http://taxonomy.zoology.gla.ac.uk/mike/spectrum/spectrum.html.


Pablo Goloboff, of INSUE - Fundacion e Instituto Miguel Lillo in Tucuman, Argentina, has written PEWEE and NONA, to carry out weighted parsimony analyses. NONA searches for most parsimonious trees according to character weights defined by the user a priori. PEWEE calculates weights of the characters by a method introduced by Goloboff, a noniterative version of J. S. Farris's "successive weighting". It was described in Goloboff's paper in Cladistics 9: 83-91, 1993. The programs run on DOS with versions available for both 386-486-Pentium machines and earlier 16-bit machines. The programs cost $50 US. To order them write to James M. Carpenter, Department of Entomology, American Museum of Natural History, Central Park West at 79th Street, New York, NY 10024. Send the money and the name in which the copies are to be registered.


Yasuo Ina of the National Institute of Genetics, Mishima, Japan (yina@ddbj.nig.ac.jp) has developed ODEN, a package of programs for doing distance matrix analyses on nucleotide or protein sequences. It is described in CABIOS 10: 11-12 (1994). It is available free by anonymous ftp from directory pub/oden on bioslave.uio.no as C source code for Unix systems.


A. Luettke and R. Fuchs have written MacT, a package of programs for Macintoshes that compute distances and compute Neighbor-Joining phylogenies for them. The programs work on 4 through 26 sequences, and source code in Microsoft QuickBasic is provided as well as compiled executables. The package is free and is available on the molecular biology software servers. On ftp.bio.indiana.edu it will be found in directory molbio/mac. The programs are described in CABIOS 8: 591-594, 1992.


Andrey A. Zharkikh, Andrey Rzhetsky, and co-workers in the Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia, have produced VOSTORG, a package of programs for alignment (both manual and automatic) and inferring phylogenies by distance methods and parsimony for molecular sequences. It runs on IBM PC-compatibles and includes some rather fancy graphics. Most of the authors are currently in the U.S. (Andrey Zharkikh's e-mail address is zharkikh@hgc6.sph.uth.tmc.edu at the University of Texas Health Sciences Center in Houston). A version of the program is available free by World Wide Web from http://hgc6.sph.uth.tmc.edu:8080/vostorg.dir/index.html or by anonymous ftp from hgc6.sph.uth.tmc.edu in directory pub/zharkikh/vostorg. The programs are described in a paper by Zharkikh et al. in Gene 101: 251-254 (1991).


Walter Fitch (wfitch@uci.edu), of the Department of Ecology and Evolutionary Biology, of the University of California at Irvine, has available by anonymous ftp at daedalus.bio.uci.edu in directory pub/evoprog about 20 programs which carry out various kinds of phylogeny estimation and related tasks. They are available in source code in FORTRAN 77, (except for a few which are in C) and also as Sun SPARC executables and as PCDOS executables. They include:

There are also many programs that convert sequences among various formats, generate all possible trees, shuffle sequences, align sequences, and do various other functions.


Nicholas Galtier of the University of Lyon (galtier@biomserv.univ-lyon1.fr) has written PHYLO_WIN, a "graphic interface" for molecular phylogenetic inference. It performs neighbor-joining, parsimony and maximum likelihood methods and bootstrap with any of them. Many distances can be used including Jukes & Cantor, Kimura, Tajima & Nei, Galtier & Gouy (1995), LogDet for nucleotide sequences, Poisson correction for protein sequences, Ka and Ks for codon sequences. Species and sites to include in the analysis are selected by mouse. Reconstructed trees can be drawn, edited, printed, stored, and evaluated according to numerous criteria. Taxonomic species groups and sets of conserved regions can be defined by mouse in both tools and stored into sequence files, thus avoiding multiple data files. It is entirely mouse-driven. Most usual sequence file formats are read: CLUSTAL, FASTA, PHYLIP, MASE. It runs under X windows on many Unix workstations including Sun (SunOS and Solaris), Silicon Graphics, IBM, DEC Alpha, and HP. It can be obtained by anonymous ftp to biom3.univ-lyon1.fr in directory pub/mol_phylogeny.
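
For readers unfamiliar with the first two distances named above, here is a minimal Python sketch of the standard Jukes-Cantor and Kimura two-parameter formulas for a pair of aligned DNA sequences. It is not PHYLO_WIN's code. The function names are hypothetical, and the handling of gaps and ambiguity codes (sites containing anything other than A, C, G, or T are simply skipped) is an assumption.

```python
import math

PURINES = {"A", "G"}
BASES = set("ACGT")


def _compared_sites(seq1, seq2):
    """Pairs of bases at sites where both sequences have A, C, G or T."""
    return [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
            if a in BASES and b in BASES]


def jukes_cantor(seq1, seq2):
    """Jukes-Cantor distance: d = -(3/4) ln(1 - 4p/3)."""
    pairs = _compared_sites(seq1, seq2)
    p = sum(a != b for a, b in pairs) / len(pairs)
    return -0.75 * math.log(1 - 4 * p / 3)


def kimura_2p(seq1, seq2):
    """Kimura two-parameter distance from transition and transversion rates."""
    pairs = _compared_sites(seq1, seq2)
    n = len(pairs)
    # Transition: the bases differ but are both purines or both pyrimidines.
    ts = sum(a != b and (a in PURINES) == (b in PURINES) for a, b in pairs) / n
    # Transversion: one base is a purine, the other a pyrimidine.
    tv = sum((a in PURINES) != (b in PURINES) for a, b in pairs) / n
    return -0.5 * math.log((1 - 2 * ts - tv) * math.sqrt(1 - 2 * tv))


d_jc = jukes_cantor("AAAAAAAAAA", "AAAAAAAAAG")
d_k2 = kimura_2p("AAAAAAAAAA", "AAAAAAAAAG")
```

Both formulas are undefined when the sequences are so different that the argument of the logarithm is not positive; a real program has to handle that case, which this sketch does not.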


Walter Fitch (wfitch@uci.edu) of the Department of Ecology and Evolutionary Biology of the University of California at Irvine, has written WTDPARS, a package of programs for weighted parsimony analysis according to the methods he has introduced in the papers by P. L. Williams and W. M. Fitch, in pages 453-470 of the Nobel Symposium on the Hierarchy of Life, edited by B. Fernholm, K. Bremer, and H. Jornval, Elsevier, 1989, and the paper by P. L. Williams and W. M. Fitch, in Advances in Enzymology, volume 183, pages 615-625, 1990. The package is available as source code in FORTRAN 77 and as Sun SPARC executables and as PC executables, by anonymous ftp from daedalus.bio.uci.edu in directory pub/evoprog.


Rino Zandee (zandee@rulsfb.leidenuniv.nl), of the Institute of Evolutionary and Ecological Science, Van der Klaauw Laboratory, Leiden University, has written CAFCA, the Collection of APL Functions for Comparative Analysis. It carries out a search for the most parsimonious tree with discrete-character data (either two-state or multistate), using a search for cliques of component compatibility (monothetic subsets) to propose the candidates for most parsimonious trees. The program is written as functions in the APL language, but Macintosh and PowerMac executables are distributed. The program is free and is available from the CAFCA Web Site http://wwwbio.leidenuniv.nl/~zandee/cafca.html.


Korbinian Strimmer (strimmer@zi.biologie.uni-muenchen.de) and Arndt von Haeseler of the Zoological Institute of the University of Munich have developed PUZZLE, to infer phylogenies by "quartet puzzling", a method that applies maximum likelihood tree reconstruction to all possible quartets of taxa and subsequently tries to combine most of the four-taxa maximum likelihood trees to construct an overall tree. Usually there are several possible solutions. A consensus tree generated from the quartet puzzling trees shows nodes that are well supported. More details about the algorithm and on the phylogenetic accuracy are in press (K. Strimmer and A. von Haeseler, 1996). PUZZLE supports all popular models of sequence evolution of nucleotides and proteins. The program is written in ANSI C and is compatible with PHYLIP. Precompiled executables are provided for MacOS and MS-DOS. For UNIX and VMS, system-specific files for automated compilation are provided. PUZZLE is currently at version 2.3. It is available for free by anonymous ftp from either the European Bioinformatics Institute (at ftp.ebi.ac.uk in pub/software/mac/puzzle, pub/software/dos/puzzle, pub/software/unix/puzzle, or pub/software/vms/puzzle), or from the authors' anonymous ftp server at fx.zi.biologie.uni-muenchen.de in directory pub/puzzle. The authors should be contacted about other means of distribution for those who cannot access ftp.
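
To give a feel for what "all possible quartets" means, the Python sketch below enumerates the quartets of a set of taxa and the three unrooted topologies each quartet can have. It covers only this enumeration; the maximum likelihood evaluation of each topology and the puzzling step that combines quartets are not shown, and the function names are hypothetical rather than taken from PUZZLE.

```python
from itertools import combinations
from math import comb


def quartets(taxa):
    """All unordered sets of four taxa."""
    return list(combinations(taxa, 4))


def quartet_topologies(a, b, c, d):
    """The three unrooted trees for four taxa, as pairs of sister pairs."""
    return [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]


taxa = ["A", "B", "C", "D", "E"]
all_quartets = quartets(taxa)
first_topologies = quartet_topologies(*all_quartets[0])
```

The number of quartets grows as n(n-1)(n-2)(n-3)/24, so 5 taxa give 5 quartets while 10 taxa already give 210.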


Kay Nieselt-Struwe (kns@phy.auckland.ac.nz) of the Department of Physics of the University of Auckland, New Zealand has released version 1.0 of STATGEOM. It carries out computation of the statistical geometry in distance and in sequence space of a set of aligned DNA/RNA, amino acid or binary sequences. The user can decide to either compute the overall tree-likeness of the whole set, or a certain subset, or, given a tree of the sequences, compute the reliability of certain edges in the tree. Postscript files of the graphs of the statistical geometry are automatically generated. A sequence reformatting utility allows various sequence formats to be read in. STATGEOM is written in ANSI C; source code with documentation and a Sun SPARC executable are available by anonymous ftp at cage.mpibpc.gwdg.de (or 134.76.209.64) in directory pub/kniesel. The method of statistical geometry was originally published in: Eigen, M., Winkler-Oswatitsch, R. and Dress, A. (1988) Statistical geometry in sequence space: a method of comparative sequence analysis. Proc. Natl. Acad. Sci. USA 85: 5913-5917.


Rainer Wetzel and Daniel Huson of the University of Bielefeld (huson@mathematik.uni-bielefeld.de) have developed a Macintosh program SplitsTree for carrying out the split decomposition method of A. Bandelt and A. Dress (Molecular Phylogenetics 1: 242-252 (1992)). Huson has also begun to release SplitsTree2 which does split decomposition, and also Bandelt and Dress's p-splits, and spectral analysis (Hendy, Penny, Szekely, Steel & Erdos). It can process sequence or restriction site data, and can do bootstrapping. It also contains an implementation of Cooper, Penny and Steel's method for dating divergences and also their molecular clock test. SplitsTree2 is currently available as a Mac program or in a Unix version for a number of different machines (Sun, SGI, DEC and HP). These use the program ghostview to draw the computed graph. The Mac version draws the graph in its own window, and the picture can be copied and pasted or printed in the usual way. There is no Windows version at present. It is available by ftp from ftp.uni-bielefeld.de in directory pub/math/splits. SplitsTree2 is software under development. There is no manual for it at present. At the developmental stage, no registration fee needs to be paid for SplitsTree2. This will change for the final version. Registered users of SplitsTree1 will be able to use their registration number for SplitsTree2.


James Lake distributes Evomony, a program for using the "evolutionary parsimony" (invariants) method for inferring phylogenies from DNA or RNA sequences. It runs on 286 or higher DOS systems with at least 500k bytes of memory. A Macintosh version was also contemplated. I do not know what the current distribution arrangements are. Lake's address is Department of Biology, University of California, Los Angeles, California 90024.


Pierre Rioux and Tim Littlejohn of the Informatics Division of the Organelle Genome Megasequencing Program at the Universite de Montreal have made available PARBOOT, a program that takes bootstrap sampled data sets and splits them up, submitting each to a different computer, so as to run bootstrapping quickly on networks of computers. It is available free as C source code by ftp from megasun.bch.umontreal.ca in directory pub/parboot. It requires a networked system of computers with PHYLIP, a "perl" interpreter, and appropriate accounts and permissions.
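
The idea of generating bootstrap data sets and dividing them among machines can be sketched in Python as below. This is only an illustration of the general scheme, not PARBOOT itself: the function names, the dictionary-of-strings alignment, and the round-robin assignment of replicates to hosts are all assumptions, and the step that actually submits each batch to a remote computer is omitted.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement (same columns for every taxon)."""
    n_sites = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in picks)
            for name, seq in alignment.items()}


def split_into_batches(items, n_hosts):
    """Deal items out round-robin, one batch per host."""
    return [items[i::n_hosts] for i in range(n_hosts)]


# Hypothetical toy alignment; a real run would read a PHYLIP data file.
alignment = {"taxon1": "ACGTACGT", "taxon2": "ACGTTCGT", "taxon3": "ACGAACGA"}
rng = random.Random(1996)
replicates = [bootstrap_replicate(alignment, rng) for _ in range(10)]
batches = split_into_batches(replicates, 3)
```

Each batch could then be analyzed independently and the resulting trees gathered afterwards to compute a consensus.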


Andrey Zharkikh (zharkikh@hgc6.sph.uth.tmc.edu) of the Genetics Centers at the University of Texas Health Sciences Center in Houston has programs for bootstrapping of nucleotide sequences, including his innovative complete-and-partial bootstrap method for getting less biased P values. They are available free by World Wide Web at (for the bootstrap) http://hgc6.sph.uth.tmc.edu:8080/bootstrap.dir or (for the complete-and-partial bootstrap) http://hgc6.sph.uth.tmc.edu:8080/CP-bootstrap.dir, or by anonymous ftp at hgc6.sph.uth.tmc.edu/pub/zharkikh/bootstrap or hgc6.sph.uth.tmc.edu/pub/zharkikh/bootstrap/double-bootstrap. The programs njbootjc, njbootk2, and njbootli implement methods based on Jukes-Cantor, Kimura, and Li distances, respectively.


David Penny (Department of Botany and Zoology, Massey University, Palmerston North, New Zealand) has been offering for free distribution several DOS programs, one a fast parsimony program, TurboTree. There are also two others, Hadtree which computes expected frequencies of all possible distributions of nucleotides among species, and Great Deluge, an approximate search for the most parsimonious tree by a quasi-random method. He tells me that funding exigencies are such that he may soon have to start charging for these. His electronic mail address is dpenny@massey.ac.nz.


David Swofford, of the Laboratory of Molecular Systematics of the Smithsonian Institution, Washington, D.C., has written Freqpars. It implements parsimony analysis based on gene frequencies. The method was described by D. L. Swofford and S. H. Berlocher in a paper in Systematic Zoology 36: 293-325, 1987. The program is available in FORTRAN 77 source code and also as a DOS executable, which requires a math coprocessor to run. The search for most parsimonious trees under Swofford and Berlocher's criterion is not very extensive, Swofford notes, because the individual tree evaluations are computationally difficult. The source code and executable, with documentation, are available by anonymous ftp from onyx.si.edu in directory freqpars.


Jotun Hein, (Institute of Genetics and Ecology, University of Aarhus, 8000 Aarhus C, Denmark) has produced TreeAlign, a multiple sequence alignment program that builds trees as it aligns DNA or protein sequences. It uses a combination of distance matrix and approximate parsimony methods. TreeAlign uses too much memory for it to run on PC's (DOS or Mac systems) but is really designed for a workstation or mainframe. It is available by anonymous ftp at the Indiana, Houston, and EMBL molecular biology software distribution sites. Their network addresses are respectively: ftp.bio.indiana.edu, ftp.bchs.uh.edu, and ftp.ebi.ac.uk. In the Indiana archive one must enter directory molbio/align, in the Houston archive it is in directory pub/gene-server in the subdirectories unix and vms. If you are on Internet and use molecular data it is important that you learn to use anonymous ftp and become familiar with these ftp servers.


Another multisequence alignment program that estimates trees as it aligns multiple sequences is ClustalW. Currently it is distributed as C source code, and in Macintosh and DOS executables by its author, Desmond Higgins. He is at the European Bioinformatics Institute in Cambridge, England (Des_Higgins@ebi.ac.uk). ClustalW successfully compiles and runs on many different workstations. DOS, Mac, and PowerMac executables are also available. It is a complete rewrite and upgrade of the Clustal and ClustalV packages; the first was described by Higgins and Sharp (1989). New features include the ability to detect and read different input formats (NBRF/PIR, Fasta, EMBL/Swissprot); align old alignments; produce phylogenetic trees after alignment (Neighbor Joining trees with a bootstrap option); write different alignment formats (Clustal, NBRF/PIR, GCG, PHYLIP); full command line interface. It is described in the paper by J. D. Thompson, D. G. Higgins, and T. J. Gibson in Nucleic Acids Research 22: 4673-4680, 1994. The program is available by anonymous ftp at the Indiana, Houston, and EMBL molecular biology distribution sites. Their network addresses are respectively: ftp.bio.indiana.edu, ftp.bchs.uh.edu, and ftp.ebi.ac.uk. In the Indiana archive one must enter directory molbio/align, in the Houston archive it is in directory pub/gene-server in all of the four directories dos, Mac, unix, and vms, and in the EBI archive it is in directory pub/software in all four directories unix, vms, mac, and PC. If you are on Internet and use molecular data it is important that you learn to use anonymous ftp and become familiar with one or more of these ftp servers.


Ward Wheeler and David Gladstein (wheeler@amnh.org) have written MALIGN, a parsimony-based alignment program for molecular sequences. It implements the original suggestion by Sankoff, Morel, and Cedergren (1973) that alignment and phylogenies could be done at the same time by finding the tree that minimizes the total alignment score along the tree. Jotun Hein's program TreeAlign (mentioned above) is another, more approximate but probably faster, attempt to implement the Sankoff-Morel-Cedergren suggestion. MALIGN is the only attempt at a non-approximate implementation of the original method. MALIGN is available free of charge by World Wide Web from the Hennig Society's software page, as a DOS executable or as C source (with a Makefile) for Unix workstations. You can also get the files by anonymous ftp from the American Museum of Natural History's anonymous ftp site, ftp.amnh.org, in directory pub/molecular.


Rod Page of the Division of Environmental and Evolutionary Biology of the University of Glasgow has written COMPONENT, a program for Windows systems for comparing cladograms for use in phylogeny and biogeography studies. It has many tree comparison and consensus methods, and far more features for biogeographic studies (such as comparing species and area cladograms) than any other package. It runs on PCDOS 286 or 386 systems under Windows 3.0 or higher. Its cost is 40 pounds U.K., and it can be ordered from Lisa Sharp at the Department of Botany, Natural History Museum, London (lfs@nhm.ic.ac.uk). Rod's e-mail address is dpage@udcf.gla.ac.uk. There is a review of the program in Cladistics 9: 351-353 (1993). COMPONENT has a World Wide Web site: http://taxonomy.zoology.gla.ac.uk/rod/cpw.html which includes an order form.


Rod Page (dpage@udcf.gla.ac.uk), of the Division of Environmental and Evolutionary Biology of the University of Glasgow has written TREEMAP, a free, experimental program for comparing host and parasite phylogenies. It allows you to interactively compare host and parasite trees, build reconstructions of the history of the association, and perform some simple randomisation tests of hypotheses of cospeciation. The program is available as an executable for Macintoshes or an executable for Windows PCs (the two versions are essentially identical). They can be downloaded from its WWW site: http://taxonomy.zoology.gla.ac.uk/rod/treemap.html The site also has an online manual, or you can download the documentation as a Postscript file. For a description of the method used by TreeMap, see Page, R.D.M. 1994. Parallel Phylogenies: Reconstructing the history of host-parasite assemblages. Cladistics 10: 155-173.


Andrew Purvis and Andrew Rambaut of the Department of Zoology, University of Oxford, England, have written CAIC (Comparative Analysis of Independent Contrasts). It is a Macintosh program that carries out the contrasts method (like my CONTRAST) but with some modifications by others to cope with lack of resolution of the phylogeny. It will run on any Macintosh, and is available free from CAIC's Web page http://evolve.zps.ox.ac.uk/CAIC/CAIC.html or by anonymous ftp from directory packages/CAIC at evolve.zps.ox.ac.uk. It is described in the paper by A. Purvis and A. Rambaut (1995) Comparative analysis by independent contrasts (CAIC): an Apple Macintosh application for analysing comparative data. Computer Applications in the Biosciences (CABIOS) 11: 247-251.
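
For a fully resolved tree, the basic contrasts calculation can be sketched in Python as follows. The sketch implements only the standard independent-contrasts recursion for one continuous character; it does not include the modifications CAIC uses for unresolved phylogenies, and the tuple encoding of the tree and the function name are assumptions made for illustration.

```python
import math


def independent_contrasts(node, out):
    """Append standardized contrasts to `out`; return (value, branch_length).

    A tip is (value, branch_length); an internal node is
    (left, right, branch_length). All branch lengths below the root
    must be positive.
    """
    if len(node) == 2:
        return node
    left, right, length = node
    x1, v1 = independent_contrasts(left, out)
    x2, v2 = independent_contrasts(right, out)
    # Contrast between the two daughters, scaled by its expected spread.
    out.append((x1 - x2) / math.sqrt(v1 + v2))
    # Weighted average of the daughters becomes the value at this node.
    value = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
    # Lengthen the branch below this node to reflect estimation error.
    return value, length + v1 * v2 / (v1 + v2)


# Hypothetical three-species tree: ((1.0, 3.0) with unit branches, 6.0).
tree = (((1.0, 1.0), (3.0, 1.0), 1.0), (6.0, 2.0), 0.0)
contrasts = []
root_value, _ = independent_contrasts(tree, contrasts)
```

A tree with n tips yields n-1 contrasts, which can then be used in ordinary regression or correlation through the origin.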


Emilia P. Martins (emartins@work.uoregon.edu), of the University of Oregon, has released version 1.0 of COMPARE, a package of programs for comparative methods analysis. COMPARE includes various programs for conducting statistical analyses of comparative data in a phylogenetic context. At the moment, it includes programs to conduct contrasts and spatial autocorrelation analyses, to generate random data, trees and/or branch lengths, and to do various other small things. New programs will be added as they are ready. COMPARE currently runs on Windows, DOS, and Unix systems. It will eventually run on Macintosh computers as well. The C source code is also provided and can be compiled on almost anything else. You can download executables, source code and/or documentation for free by World Wide Web or by anonymous ftp to work.uoregon.edu (in pub/COMPARE).


Hang-Kwang Luh, John Gittleman, and Mark Kot of the University of Tennessee at Knoxville have made available PA, a package of Macintosh programs that implement the phylogenetic autocorrelation comparative method introduced by Gittleman and Kot (Systematic Zoology, 1990). It is free and is available by anonymous ftp from ftp.math.utk.edu in directory pub/luh.


Joaquin Dopazo (dopazo@samba.cnb.uam.es) at the Centro Nacional de Biotecnologia in Madrid, Spain, has written a program ABLE (Analysis of Branch Length Errors) which implements the method described by Adell and Dopazo in J. Mol. Evol. 38:305-309 (1994). This is a parametric bootstrap test of constancy in evolutionary rates. The idea of the test is to simulate a large number of data sets under the model of rate constancy and then to examine the distribution of the branch lengths. After a tree is reconstructed without the constraint of rate constancy, it can be checked whether the observed branch length values fall within the expected distribution. The program is intended for use with the PHYLIP programs FITCH and KITSCH. It is available as a DOS executable over World Wide Web at http://www.cnb.uam.es/www/SOFTWARE/XIMO/WWW1.html or by anonymous ftp at ftp.cnb.uam.es in directory software/molevol. It can also be obtained by sending a 3 1/2" diskette to Dopazo at the Centro Nacional de Biotecnologia, CSIC, Universidad Autonoma, 28049 Cantoblanco, Madrid, SPAIN.


Kent Fiala, now of SAS Institute, has written a compatibility (clique) program, based on an earlier program written by Kent and George Estabrook. Christopher Meacham has put the latest version of CLINCH (6.2), with Kent's permission, as a self-extracting DOS archive available free on Jim Beach's TAXACOM fileserver, muse.bio.cornell.edu. CLINCH 6.2 and associated files can be found by anonymous ftp in /pub/software/clinch as clinch62.exe, which is a self-extracting archive. Documentation, sample input and output, and FORTRAN source code are included. PC-CLINCH is probably the most sophisticated compatibility analysis program. The Taxacom server, by the way, also has other material related to botanical systematics, including flora information.


Christopher Meacham (Museum Informatics Project, University of California, Berkeley, California 94720, U.S.A.) produces COMPROB, a Pascal program to compute probabilities that characters would be compatible at random, thus telling us which clique is "most surprising". He can be contacted as meacham@violet.berkeley.edu about receiving a copy. The program is free.
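
For readers unfamiliar with compatibility analysis, the building block is the pairwise test between characters. The Python sketch below is not taken from CLINCH or COMPROB and does not attempt COMPROB's probability calculation. It only shows the usual test for two-state characters with no assumed ancestral state: two such characters are compatible unless all four combinations of states occur among the species. The function names and the data layout are assumptions made for this example, and missing data are not handled.

```python
def pairwise_compatible(char_a, char_b):
    """Return True if two binary (0/1) characters are compatible.

    Each argument lists one character's state in every species, in the
    same species order. The characters conflict only when all four
    state combinations (0,0), (0,1), (1,0) and (1,1) are observed.
    """
    if len(char_a) != len(char_b):
        raise ValueError("characters must be scored for the same species")
    return len(set(zip(char_a, char_b))) < 4


def compatibility_matrix(characters):
    """Return a square True/False matrix of pairwise compatibilities."""
    return [[pairwise_compatible(a, b) for b in characters] for a in characters]


# Three characters scored in four species (hypothetical data).
characters = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 1, 0, 1],
]
for row in compatibility_matrix(characters):
    print(row)
```

A clique program then looks for the largest sets of characters that are all mutually compatible in such a matrix.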


The program MARKOV computes a distance measure between pairs of nucleotide sequences. It also constructs phylogenies from these and summarizes the 4x4 substitution matrices between the pairs of species. It uses a more general model of substitution than used in PHYLIP, the Stationary Markov Model described in the paper by Saccone et al. in Methods in Enzymology volume 183, pages 570-583, 1990. Bootstrapping is used to analyze the statistical error of the results. Output files from CLUSTAL and PILEUP, as well as some other formats, can be used for input, and analysis can be confined to certain codon positions in coding sequences. The program is written in FORTRAN and runs on VMS and Unix systems. It was produced by Dr. Graziano Pesole and Professor Cecilia Saccone at the University of Bari, Italy, and is available (for free?) from Dr. Cecilia Lanave at CSMME-CNR, Dipartimento di Biochimica e Biologia Molecolare, Universita` di Bari, via Orabona 4, 70126 Bari, Italy. Her phone number is 39-80-243305, her fax number is 39-80-243317, and her e-mail address is lanave@vaxba0.ba.it or mvx36@ibacsata.it


J. S. Armstrong, A. J. Gibbs, R. Peakall and G. Weiller (johna@rsbs-central.anu.edu.au), of Gibbs's group at the Research School of Biological Sciences of the Australian National University, Canberra, have produced RAPDistance, a package for DOS or Windows systems for computing distance matrices for RAPD analyses. It has a comprehensive range of options for creating data files, editing them and using application programs to analyse them. RAPDistance is available free by anonymous ftp from directory pub/RAPDistance at life.anu.edu.au, or on the World Wide Web at http://life.anu.edu.au/molecular/software/rapd.html.


P. R. Reeves and colleagues at Sydney University, Australia, have produced MULTICOMP, a program for computing various distances from sequence data. It is described in a paper by Reeves et al. in CABIOS 10: 281-284 (1994). I do not know what computer systems it runs on. Reeves may be contacted at reeves@angis.su.oz.au for distribution information.


Ken Rice of the Department of Organismal and Evolutionary Biology of Harvard University has produced RSVP (restriction site variability program) which calculates several measures of genetic variability based on restriction map data. It also produces Jukes-Cantor corrected distance matrices with standard errors from collections of restriction maps. C source code for Version 2.08 of RSVP is available free by anonymous ftp from: green.harvard.edu/~rice or you can get it on WWW from: http://oeb.harvard.edu/~rice. It runs under Unix.


Microsat, by Eric Minch (minch@lotka.stanford.edu), is a program for calculating distances from microsatellite data. It uses the methods developed by David Goldstein et al., and presented in their papers of 1995 in Proc. Natl. Acad. Sci. USA 92: 6720-6727 and Genetics 139: 463-471. The distance is based on the mean microsatellite array size, implementing the "Delta mu" distance that they defined, which corrects for within-population variability and provides a distance that is independent of population size. It is available for free from a page in Luca Cavalli-Sforza's lab web site at http://lotka.stanford.edu/research/distance.html. The program is written in ANSI C. Source code is distributed, and so are executables for DEC Alpha (under OSF/1), Macintosh, Sun SPARC, DOS, and NeXT.
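
To give an idea of the kind of calculation involved, here is a minimal Python sketch of a squared "Delta mu" distance: for each locus, take the difference between the two populations' mean allele sizes (in repeat units), square it, and average over loci. This is an illustration of the idea only, not Microsat's code, and it does not reproduce the program's input format or options; the dictionary layout and the function name are assumptions made for this example.

```python
def delta_mu_squared(pop_a, pop_b):
    """Average over loci of the squared difference in mean allele size.

    pop_a and pop_b map a locus name to the list of allele sizes
    (repeat counts) sampled from that population. Only loci scored in
    both populations are used. The layout is hypothetical.
    """
    loci = sorted(set(pop_a) & set(pop_b))
    if not loci:
        raise ValueError("the two populations share no loci")

    total = 0.0
    for locus in loci:
        mean_a = sum(pop_a[locus]) / len(pop_a[locus])
        mean_b = sum(pop_b[locus]) / len(pop_b[locus])
        total += (mean_a - mean_b) ** 2
    return total / len(loci)


# Two loci scored in two populations (hypothetical data).
population_1 = {"locus1": [10, 12], "locus2": [20, 20]}
population_2 = {"locus1": [14, 16], "locus2": [21, 23]}
print(delta_mu_squared(population_1, population_2))  # prints 10.0
```

Computing this for every pair of populations gives a distance matrix that can be passed to a distance matrix program.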


Georg Weiller, of the Bioinformatics Laboratory, Australian National University, Canberra, Australia (weiller@rsbs-central.anu.edu.au) has produced DIPLOMO (DIstance PLOt MOnitor). It compares different distance measures with each other by displaying them as a scatter plot. It then helps one instantly identify all individual comparisons within the plot. Individual taxa can be excluded or included in the plots. DIPLOMO enables you to see whether different taxa have different mutational characteristics (such as having relatively more transitions in some lineages), and whether different distance measures correlate. The program takes as input a file with several different distance matrices. This file is in a simple format which can readily be produced by editing distance matrices produced by other packages. A program to compute the distance matrices is currently under development. Although DIPLOMO is intended to be ported to multiple platforms, the current version (1.03) runs on DOS on PC-compatibles. DIPLOMO is free; it can be obtained by anonymous ftp from life.anu.edu.au in /pub/molecular_biology/software/diplomo, or by World Wide Web from http://life.anu.edu.au/~weiller/gfw.html. Floppy disk distribution is also possible.


J. S. Farris and Mary Mickevich earlier released a package of phylogeny programs, PHYSYS, which, at about $5,000, was extremely expensive (in my opinion, which is certainly a biased one). I am not sure whether, from whom, or under what conditions it is still available.


Fujitsu Ltd. ("a $21 billion global leader in advanced computer, telecommunications, and electronic devices") sells for $28,000 US a Fujitsu S family workstation complete with a program, SINCAIDEN, which allows "experimental researchers, even those unfamiliar with such analyses, [to] easily create phylogenetic trees in their own laboratories." The program also allows searches of the major nucleic acid sequence and protein databases (the ad I saw does not make it clear whether these databases are provided with the workstation). The methods available are UPGMA, neighbor-joining, Farris's (Distance Wagner) and the modified Farris distance matrix methods. The workstation is SPARC compatible and runs SunOS. The SYNCAIDEN program was developed by the group at the National Institute of Genetics, Japan under Dr. Takashi Gojobori. Fujitsu Ltd. may be contacted at 21-8, Nishi-Shinbashi 3-chome, Minato-ku, Tokyo 105, Japan (phone 81-3-3437-5111 ext. 2831, fax 81-3-5472-4354), or in the U.S. at Fujitsu America Inc., 3055 Orchard Drive, San Jose, California 95134-2017 (phone 1-408-432-1300 ext. 5168, fax 1-408-434-1045).


MUST, a package of sequence management programs, is distributed on a shareware basis by Herve Philippe, Laboratoire de Biologie Cellulaire (URA CNRS 1134 D), Batiment 444, Universite de Paris-Sud, 91405 Orsay cedex, France. His e-mail address is: adoutte@frciti51 on Bitnet/EARN. His phone and fax numbers are respectively 33.1.69.41.64.81 and 33.1.69.41.21.30. There is a $100 registration fee if you do not send diskettes. MUST runs on DOS systems using DOS version 3 or later. It is intended as complementary to existing phylogeny and alignment programs and can produce output files in the formats of PHYLIP, PAUP, Hennig86, and CLUSTAL. It contains a variety of sequence input, editing, checking, and storage functions, as well as a sequence editor and a phylogeny plotter. It also allows further analyses of the results from these phylogeny programs.


Steve Smith, formerly of the Harvard Genome Laboratory, has written an X-Windows interactive sequence editor, GDE (Genetic Data Environment), which allows the user to edit sequences and align them by hand, and to select subsets of sites and sequences and call a variety of analysis programs including ClustalV and many of the PHYLIP 3.5 programs. The GDE 2.0 system will run on many workstations that have the X windowing system. It also includes the TreeTool tree-plotting program (see below). GDE 2.0 is free and is available for anonymous ftp transfer at the molecular biology software servers, such as ftp.bio.indiana.edu in directory molbio/unix/GDE, or at megasun.bch.umontreal.ca in directory pub/gde. At the latter location there are also Linux binaries, and at both there are Sun binaries.


Mike Maciukenas, at the Department of Microbiology of the University of Illinois, has written a wonderful X-windows based interactive tree-plotting program called TreeTool. It takes as input a PHYLIP tree file, with branch lengths if they are provided, displays the tree in either rooted or unrooted form on any X-windows screen, and allows the user to modify the form of the tree and the placement of nodes and labels. When the tree is in final form the user can have it written to a Postscript file and/or printed to a Postscript-compatible printer. TreeTool is free as a C program for X windows using the Xview library. However, Xview seems to be available mostly on Sun workstations. The home archive for treetool is the Ribosomal Database Project at rdp.life.uiuc.edu. You can ftp the current treetool version 2.0.2 with source from there. It will also be found in directory molbio/unix/treetool at iubio.bio.indiana.edu. It is also included in the GDE 2.0 sequence analysis environment mentioned above.


Rod Page of the University of Glasgow, Scotland (dpage@udcf.gla.ac.uk), has written TreeView, a program for displaying trees on Apple Macs and Windows PCs. It can draw rooted and unrooted trees, display bootstrap values, and supports the native font and graphics file formats of both Macs and PCs. The program reads NEXUS, PHYLIP, and Hennig86 style tree files (including files produced by fastDNAml and CLUSTALW), and can save trees in the same formats so that it can convert trees among these formats. The Mac and Windows versions have almost identical interfaces. They can support the standard TrueType and Postscript fonts available on Macs and PCs, and they support the standard PICT and Windows Metafile formats for output, allowing tree pictures to be copied into other applications, as well as being saved in files. There are print preview and drag-and-drop facilities. Currently (version 1.2) TreeView can read up to 100 trees with up to 200 taxa. The program is free, and can be obtained by World Wide Web from http://taxonomy.zoology.gla.ac.uk/rod/treeview.html. It comes in 68K Mac, PowerMac, and Windows versions. If you download the program and find it of use, please email Rod so that he can keep you informed of bug fixes and improvements.


Manolo Gouy of the University of Lyon, France, has produced NJplot, which displays phylogenies (input in the standard form) on Macintosh screens and saves them in PICT files. It displays branch lengths and bootstrap information (if present) and allows the user to swap branches and change the position of the root. It is available free and can be retrieved by anonymous ftp from molecular biology software servers such as the European Bioinformatics Institute's server, ftp.ebi.ac.uk, where it is in directory pub/software/mac.


Steven Brewer of Western Michigan University (Steven.Brewer@wmich.edu) has developed Phylogenetic Investigator, a teaching program that allows students to connect together organisms from a data set provided to them, to make phylogenies and examine them. The program is written as a Supercard 2.0 stack for Macintosh and PowerMacintosh systems; it is also available as a 680x0 Macintosh or PowerMac standalone executable. Version 1.6 is the last freeware version; for the moment it is available for free by World Wide Web from http://141.218.91.93/docs/PIGuide/piguide.html. However, that server was expected to be taken off-line after March, 1996. A later version of Phylogenetic Investigator will appear on the 1996 yearly CDROM from the BioQUEST consortium, a nonprofit publisher of interesting biological teaching software. It would therefore not be free, though the cost of the CDROM is about $100 and for that you get many different teaching programs.


Arnold G. Kluge (akluge@umich.edu), of the Department of Biology of the University of Michigan, has written Systack, a teaching program designed to teach the principles of synapomorphy/homology analysis in the context of chordate phylogeny. It implements a hierarchical filing system in the form of a phylogeny, with character information available on chordates and with users able to add new characters. It is a Hypercard stack for Macintosh computers, and is available free for noncommercial use. A Web site is available at http://www.ummz.lsa.umich.edu/herps/systack.html to download it.


Last Updated: Tuesday, 20 April, 2004 8:37 PM