Performance of single and concatenated sets of mitochondrial genes at inferring metazoan relationships relative to full mitogenome data
Havird, Justin C
Santos, Scott R
MetadataShow full item record
Mitochondrial (mt) genes are some of the most popular and widely-utilized genetic loci in phylogenetic studies of metazoan taxa. However, their linked nature has raised questions on whether using the entire mitogenome for phylogenetics is overkill (at best) or pseudoreplication (at worst). Moreover, no studies have addressed the comparative phylogenetic utility of mitochondrial genes across individual lineages within the entire Metazoa. To comment on the phylogenetic utility of individual mt genes as well as concatenated subsets of genes, we analyzed mitogenomic data from 1865 metazoan taxa in 372 separate lineages spanning genera to subphyla. Specifically, phylogenies inferred from these datasets were statistically compared to ones generated from all 13 mt protein-coding (PC) genes (i.e., the "supergene" set) to determine which single genes performed "best" at, and the minimum number of genes required to, recover the "supergene" topology. Surprisingly, the popular marker COX1 performed poorest, while ND5, ND4, and ND2 were most likely to reproduce the "supergene" topology. Averaged across all lineages, the longest ∼2 mt PC genes were sufficient to recreate the "supergene" topology, although this average increased to ∼5 genes for datasets with 40 or more taxa. Furthermore, concatenation of the three "best" performing mt PC genes outperformed that of the three longest mt PC genes (i.e, ND5, COX1, and ND4). Taken together, while not all mt PC genes are equally interchangeable in phylogenetic studies of the metazoans, some subset can serve as a proxy for the 13 mt PC genes. However, the exact number and identity of these genes is specific to the lineage in question and cannot be applied indiscriminately across the Metazoa.