Identifying and characterizing alternative molecular markers for the symbiotic and free-living dinoflagellate genus Symbiodinium.
Dinoflagellates in the genus Symbiodinium are best known as endosymbionts of corals and other invertebrates, as well as of protist hosts, but they also exist free-living in coastal environments. Despite their importance in marine ecosystems, fewer than 10 loci have been used to explore phylogenetic relationships in this group, and only the multi-copy nuclear ribosomal Internal Transcribed Spacer (ITS) regions 1 and 2 have been used to characterize fine-scale genetic diversity within the nine clades (A-I) that comprise the genus. Here, we describe a three-step molecular approach focused on 1) identifying new candidate genes for phylogenetic analysis of Symbiodinium spp., 2) characterizing the phylogenetic relationships of these candidate genes using DNA samples spanning eight Symbiodinium clades (A-H), and 3) conducting in-depth phylogenetic analyses of candidate genes displaying genetic divergence equal to or higher than that within the ITS-2 of Symbiodinium clade C. To this end, we used bioinformatics tools and reciprocal comparisons to identify homologous genes from 55,551 cDNA sequences representing two Symbiodinium and six additional dinoflagellate EST libraries. Of the 84 candidate genes identified, 7 Symbiodinium genes (elf2, coI, coIII, cob, calmodulin, rad24, and actin) were characterized by sequencing 23 DNA samples spanning eight Symbiodinium clades (A-H). Four genes displaying higher rates of genetic divergence than ITS-2 within clade C were selected for in-depth phylogenetic analyses, which revealed that calmodulin has limited taxonomic utility but that coI, rad24, and actin behave predictably with respect to Symbiodinium lineage C and are potential candidates as new markers for this group. The approach for targeting candidate genes described here can serve as a model for future studies aimed at identifying and testing new phylogenetically informative genes for taxa for which transcriptomic and genomic data are available.
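The third step, retaining only genes whose divergence equals or exceeds that of ITS-2, can be illustrated with a minimal sketch. It assumes that divergence is summarized as the mean uncorrected pairwise distance (p-distance) over pre-aligned sequences; the abstract does not state which distance measure was used, so this metric is an assumption. The function names (p_distance, mean_pairwise_distance, select_candidates) are hypothetical and introduced here only for illustration.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def p_distance(seq_a, seq_b):
    """Uncorrected proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in VALID_BASES and b in VALID_BASES:
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differing / compared


def mean_pairwise_distance(aligned_seqs):
    """Mean p-distance over all unordered pairs of aligned sequences."""
    pairs = list(combinations(aligned_seqs, 2))
    if not pairs:
        raise ValueError("at least two sequences are required")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def select_candidates(gene_alignments, reference_alignment):
    """Return genes whose mean divergence is >= that of the reference marker.

    gene_alignments: dict mapping gene name -> list of aligned sequences.
    reference_alignment: aligned sequences of the reference marker
    (for example, ITS-2 sequences from a single clade).
    """
    threshold = mean_pairwise_distance(reference_alignment)
    return [
        gene
        for gene, seqs in gene_alignments.items()
        if mean_pairwise_distance(seqs) >= threshold
    ]
```

With real data, the reference alignment would hold ITS-2 sequences from clade C and each entry of gene_alignments would hold the clade C sequences of one candidate gene. A model-corrected distance could replace p_distance without changing the selection logic.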