Abstract
Myrteae is the largest and most diverse tribe within Myrtaceae and represents the majority of the family's diversity in the Neotropics. Members of Myrteae are ecologically important in tropical biomes as food sources for many animal species. Because of these roles, interest in this group has been growing. In this study, we report the sequencing and de novo assembly of the complete chloroplast (cp) genomes of six Myrteae species: Eugenia brasiliensis, E. pyriformis, E. nitida, Myrcianthes pungens, Plinia edulis and Psidium cattleianum. We characterized genome structure and gene content, and identified simple sequence repeats (SSRs) to detect variation within Neotropical Myrteae. The six newly sequenced plastomes exhibit a typical quadripartite structure, with gene content and organization highly conserved among Myrtaceae species. Some differences in genome length, protein-coding genes and non-coding regions were found. In addition, IR boundaries show structural changes among species. Increased sequence diversity was observed in some intergenic regions, suggesting their suitability for investigating intra- and interspecific genetic diversity in population studies. These data also improve taxon sampling for further phylogenetic investigations of Myrtaceae evolution.
Keywords cpDNA; genomic resource; populational genetics; plastid; conservation
Myrtaceae encompasses over 6000 species of shrubs and trees, classified in 144 genera and subdivided into 17 tribes (Wilson et al., 2005; WCSP, 2019). This angiosperm family has a predominantly Southern Hemisphere distribution and is assumed to be of Gondwanan origin, being an important component of the forests of Southeast Asia, Australia, and South America (Wilson et al., 2005; Thornhill et al., 2015). In the Neotropical region, most of Myrtaceae is represented by the tribe Myrteae, which comprises over 50 genera and 2500 species, accounting for half of the diversity of the family (Wilson et al., 2005; WCSP, 2019). Myrteae species play an important ecological role in Neotropical environments as foraging resources for animals, especially for a variety of bee species (Fidalgo and Kleinert, 2009). In addition, some studies have focused on specific classes of compounds produced by Myrtaceae, such as terpenes, which have commercial uses (Guzman et al., 2014). Other studies have demonstrated antifungal, antioxidant, anti-inflammatory, gastroprotective, and other bioactivities of Myrteae species from Brazil (Salvador et al., 2011; Souza-Moreira et al., 2019). Given these many roles, there is growing interest in this group as a model for evolutionary, ecological and applied studies.
For this study, leaves of Eugenia brasiliensis, Eugenia pyriformis, Eugenia nitida, Myrcianthes pungens, Plinia edulis and Psidium cattleianum were collected from trees in a private in vivo collection in Gravataí, RS, Brazil (latitude (S): 29°51’52"; longitude (W): 50°53’53"). The leaves were used to isolate chloroplasts by the modified high salt method, followed by cpDNA extraction with the CTAB method (Vieira et al., 2014). DNA quality was evaluated by electrophoresis in a 1% agarose gel, and DNA quantity was determined using a NanoDrop spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA). For each species, one 150-nt paired-end genomic library was generated using an Illumina HiSeq2000 platform (Macrogen).
Bases with quality below Q30 and adapter contamination were trimmed using Trim Galore! 0.4.2, with a minimum read length of 50 bases. Paired-end reads were filtered by mapping against 28 Myrtaceae plastomes (Table S1) using BWA (Li and Durbin, 2009) with two mismatches allowed. Mapped reads were used for de novo assembly with ABySS (Jackman et al., 2017). The plastome scaffolds were oriented with MUMmer (Delcher et al., 2003) using Eugenia uniflora (NC_026450.1), Plinia trunciflora (NC_034801.1) or Psidium guajava (NC_033355.1) as the reference genome for species of the same or the closest genus. Genes were annotated using GeSeq (Tillich et al., 2017) by BLAST searches with 80% similarity. Circular plastome maps were drawn using the OGDRAW web toolkit (Lohse et al., 2013). IRa and IRb boundaries were analyzed using IRscope (Amiryousefi et al., 2018). For each species, local mVISTA (Frazer et al., 2004) was used to align the plastome pairwise with its respective reference.
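The read-filtering thresholds stated above (Q30 and a 50-base minimum) can be illustrated with a short Python sketch. This is a deliberately simplified illustration and not Trim Galore's actual algorithm: it assumes Phred+33 quality encoding, trims only from the 3' end, and ignores adapter removal. All function names are hypothetical.

```python
# Simplified illustration only: this is NOT Trim Galore's actual algorithm.
# Assumptions (not from the article): Phred+33 encoded quality strings and
# trimming restricted to the 3' end of each read; adapters are ignored.

MIN_QUALITY = 30  # Q30 threshold stated in the text
MIN_LENGTH = 50   # minimum read length stated in the text


def phred_scores(quality_string, offset=33):
    """Convert a FASTQ quality string into a list of Phred scores."""
    return [ord(char) - offset for char in quality_string]


def trim_3prime(sequence, quality_string, min_quality=MIN_QUALITY):
    """Drop bases from the 3' end until a base with quality >= min_quality is met."""
    scores = phred_scores(quality_string)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return sequence[:end], quality_string[:end]


def keep_read(sequence, quality_string, min_length=MIN_LENGTH):
    """Return the trimmed read, or None if it is shorter than min_length."""
    trimmed_seq, trimmed_qual = trim_3prime(sequence, quality_string)
    if len(trimmed_seq) < min_length:
        return None
    return trimmed_seq, trimmed_qual


# Hypothetical 60-base read whose last 8 bases have low quality ('#' = Q2).
read = keep_read("ACGT" * 15, "I" * 52 + "#" * 8)
print(len(read[0]))  # 52
```

Reads that fall below 50 bases after trimming return `None` and would be discarded before mapping.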
An overall genome comparison was performed with BLAST Ring Image Generator (BRIG) (Alikhan et al., 2011). Krait v0.11.4 (Du et al., 2018) was used to search and annotate perfect SSRs using the genomes and their annotated GFF3 files. The minimum repeat numbers were 8, 4, 3, 3, 3, 3 for mono-, di-, tri-, tetra-, penta- and hexanucleotide SSRs, respectively.
DNA sequencing libraries were produced for each species, comprising 34.1-48.7 M raw Illumina paired-end reads (summarized in Table S2). The percentage of reads removed by trimming ranged from 1.17 to 1.45%. The number of filtered reads was 1.1-2.2 M. The reads were de novo assembled into scaffolds that completely covered each plastome, without any gaps. The number of assembled scaffolds ranged from four to eight. The minimum coverage ranged from 13 to 46 reads and the maximum coverage from 1,508 to 3,620 reads. The complete sequences were deposited in GenBank under accession numbers MN095407 to MN095411 and MN095413 (Table 1).
The complete plastomes of the six Myrteae species have a narrow size range, from 157,683 bp in E. nitida to 159,631 bp in P. edulis, similar to the sizes of other Myrtaceae plastomes (Eguiluz et al., 2017a,b; Machado et al., 2017). Figures S1, S2, S3, S4, S5, S6 present the genome maps for each species. Four well-defined regions are present in all newly assembled genomes. Inverted repeat (IR) regions ranged from ~26.3 to 26.4 kbp and had the smallest size variation, up to 78 bp. Small single-copy (SSC) regions span ~18.2-18.5 kbp, while large single-copy (LSC) regions span ~86.4-88.2 kbp (Table 1). Protein-coding sequences comprise ~50% of the genome, rRNAs and tRNAs comprise ~7%, and non-coding regions, such as introns, pseudogenes and intergenic spacers, correspond to ~43% (Table 1). Genome structure analysis showed a high degree of synteny among the evaluated species (Figure 1).
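As a quick consistency check on the region sizes reported above, a quadripartite plastome has a total length of LSC + SSC + 2 × IR. The sketch below applies this to the approximate ranges given in the text. The 200 bp tolerance is an assumption introduced here to allow for rounding of each region to 0.1 kbp; it is not a value from the article.

```python
# Region sizes in bp, taken from the approximate kbp ranges given in the text.
LSC_RANGE = (86_400, 88_200)
SSC_RANGE = (18_200, 18_500)
IR_RANGE = (26_300, 26_400)

# Reported complete plastome sizes (E. nitida and P. edulis).
REPORTED_RANGE = (157_683, 159_631)

# Assumption: each region value is rounded to 0.1 kbp, so the sum over
# LSC + SSC + two IR copies can be off by up to 4 * 50 bp.
ROUNDING_TOLERANCE = 200


def plastome_length(lsc, ssc, ir):
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


lower = plastome_length(LSC_RANGE[0], SSC_RANGE[0], IR_RANGE[0])
upper = plastome_length(LSC_RANGE[1], SSC_RANGE[1], IR_RANGE[1])

consistent = (
    lower - ROUNDING_TOLERANCE <= REPORTED_RANGE[0]
    and REPORTED_RANGE[1] <= upper + ROUNDING_TOLERANCE
)
print(lower, upper, consistent)  # 157200 159500 True
```

The bounds computed from the rounded region sizes (157.2 to 159.5 kbp) bracket the reported genome sizes to within the rounding tolerance.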
Genomes contained 129 genes in total, corresponding to 78 single-copy protein-coding genes, 30 transfer RNA (tRNA) genes, four ribosomal RNA (rRNA) genes and one pseudogene (ycf1) (Figure 1, Table 1). In general, genomic features such as size, structure, and gene abundance are similar to those of previously described Myrtaceae species (Eguiluz et al., 2017a,b).
Despite the similarity in genomic features, the mVISTA comparison against each respective reference genome showed that some regions display lower similarity (Figures S7, S8, S9). Non-coding regions, particularly intergenic spacers, were less conserved and therefore more variable, such as psbI-trnS, trnT-psbD, trnS-psbZ-trnG, accD-psaI, and ndhF-rpl32 in Eugenia and Myrcianthes; trnS-trnR, atpF-atpH, trnT-trnL, and rpl32-trnL in Plinia; and most intergenic regions of the LSC in Psidium. Regarding protein-coding genes, we observed decreased conservation in accD and ccsA in Eugenia species and M. pungens, and in rpoC2 in Plinia and Psidium. The protein-coding genes matK, ndhF and ycf1 showed higher nucleotide diversity (4.6 to 6.1%) in all analyzed species (Figures S7, S8, S9, S10). This diversity corroborates previous studies based on plastid genes and non-coding regions with contrasting substitution rates (Thornhill et al., 2015; Machado et al., 2017).
Some structural changes in the IRa and IRb boundaries were found among the evaluated species (Figure 2). At the IRb-LSC boundary, the rps19 gene was located on the left side. Except in M. pungens, the IRb-LSC boundary was embedded in rps19, with three bp in Eugenia (Figure 2A), ~30 bp in Plinia (Figure 2B), and 31 bp in Psidium (Figure 2C) contained in the IRb. The IRb-SSC boundaries were embedded in the ycf1 pseudogene, with an overlap ranging from one to eight bp in Eugenia/Myrcianthes, one and two bp in Plinia, and only one bp in Psidium species.
The ndhF gene was located on the right side of the IRb-SSC boundary, at a distance of 10 bp in E. uniflora, 36 to 121 bp in other Eugenia, 72 bp in M. pungens, 109 bp and 120 bp in Plinia, and 111 bp and 124 bp in Psidium. The SSC-IRa boundary was embedded in ycf1, with 1047 to 1080 bp in Eugenia/Myrcianthes, 1080 and 1011 bp in Plinia, and 1071 and 1079 bp in Psidium located in the IRa region. The trnH-GUG gene was located on the right side of the IRa-LSC boundary, at a distance ranging from 11 to 52 bp in Eugenia/Myrcianthes, from 3 to 10 bp in Plinia and from 10 to 14 bp in Psidium. The contraction and expansion of IR regions are measurable events of plastome evolution. These results indicate genus-specific conservation of IR boundaries, which can be considered one of the reasons for genome size variation among species. In this work, we present, for the first time, a characterization of IR boundaries from species of the same genus in Myrteae. Because they compared different genera, previous studies could not report significant variation in the IRb-SSC border within Myrteae species (Eguiluz et al., 2017a,b; Machado et al., 2017).
All plastomes presented a similar number of SSRs. In total, over 315 SSRs were identified for each species (Table 2). The mononucleotide SSRs of A/T were the most frequent, varying in number from 85/98 in E. brasiliensis/E. pyriformis to 93/103 in P. edulis/P. cattleianum (Table S3). This AT richness has been demonstrated in previous studies and reflects the lower GC content of these plastid genomes (Eguiluz et al., 2017a,b). The second most common were the trinucleotide SSRs, ranging in number from 61 in E. nitida to 71 in P. edulis. In addition, the number of SSRs located in different regions was similar: in intergenic regions, ranging from 171 in E. nitida to 185 in P. edulis; in genes, ranging from 96 in E. brasiliensis, E. pyriformis, M. pungens, and P. cattleianum to 98 in E. nitida and P. edulis; and in introns, ranging from 43 in E. brasiliensis to 47 in E. nitida, P. edulis, and P. cattleianum (Table 2). No hexanucleotide SSRs were found in the genomes. All identified SSRs are listed in Table S4. These SSR results provide more information on molecular markers that could be used to evaluate intra- and interspecific diversity.
This work provides reference genomes for six Neotropical Myrtaceae species, increasing the genetic information available for the Myrteae tribe and allowing improved taxon sampling in further investigations into Myrtaceae evolution.
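The perfect-SSR criteria used in this study (minimum repeat numbers of 8, 4, 3, 3, 3, 3 for mono- to hexanucleotide motifs) can be illustrated with a minimal Python scanner. This is a sketch under stated assumptions, not Krait's implementation: the function names are hypothetical, positions are 0-based, only the forward strand is scanned, and motifs that are themselves repeats of a shorter unit (for example "ATAT") are skipped so that each repeat is reported once under its shortest motif.

```python
# Minimal perfect-SSR scanner; an illustrative sketch, not Krait's algorithm.
# Thresholds follow the minimum repeat numbers stated in the text.
MIN_REPEATS = {1: 8, 2: 4, 3: 3, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not a repetition of a shorter unit (e.g. 'ATAT')."""
    for size in range(1, len(motif)):
        if len(motif) % size == 0 and motif[:size] * (len(motif) // size) == motif:
            return False
    return True


def find_perfect_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples with 0-based starts."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        pos = 0
        while pos + motif_len * min_rep <= len(sequence):
            motif = sequence[pos:pos + motif_len]
            # Skip ambiguous bases and motifs built from a shorter unit.
            if not set(motif) <= set("ACGT") or not is_primitive(motif):
                pos += 1
                continue
            repeats = 1
            while sequence.startswith(motif, pos + repeats * motif_len):
                repeats += 1
            if repeats >= min_rep:
                hits.append((pos, motif, repeats))
                pos += repeats * motif_len  # jump past the reported repeat
            else:
                pos += 1
    return sorted(hits)


# Hypothetical test sequence containing (A)9, (AT)5 and (TTC)3.
example = "GC" + "A" * 9 + "CGC" + "AT" * 5 + "GG" + "TTC" * 3 + "G"
print(find_perfect_ssrs(example))
# [(2, 'A', 9), (14, 'AT', 5), (26, 'TTC', 3)]
```

Counting the returned tuples by motif length, or intersecting their positions with annotated gene and intron coordinates, would give per-class and per-region tallies of the kind summarized in Table 2.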
Figure 1 - Complete gene map of six Myrteae plastomes. Gene annotations are in black. The plastomes are in red (P. edulis), purple (E. brasiliensis), orange (E. nitida), blue (E. pyriformis), yellow (M. pungens), green (P. cattleianum). LSC: large single-copy region; SSC: small single-copy region; IR: inverted repeat. The numbers near P. edulis (red circle) represent the nucleotide positions (in kbp).
Figure 2 - Comparison of border positions of LSC, SSC and IR among (A) Eugenia uniflora, (B) Psidium guajava and (C) Plinia trunciflora and related new species. JSA/JSB, junction of SSC-IRa/IRb; JLA/JLB, junction of LSC-IRa/IRb. Boxes above or below the main line indicate the predicted genes; ycf1 pseudogenes are at JSB and their lengths are displayed in the corresponding regions. The figure is not to scale and shows relative changes at or near the IR-SC borders.
Acknowledgments
This study was carried out with financial and fellowship support from the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES, Brasil - Finance code 001) and Fundação de Amparo à Pesquisa do Rio Grande do Sul (FAPERGS) with PRONEX grant number 16/2551-0000491-9.
References
- Alikhan NF, Petty NK, Ben Zakour NL and Beatson SA (2011) BLAST Ring Image Generator (BRIG): simple prokaryote genome comparisons. BMC Genomics 12:402.
- Amiryousefi A, Hyvönen J and Poczai P (2018) IRscope: an online program to visualize the junction sites of chloroplast genomes. Bioinformatics 34:3030-3031.
- Delcher AL, Salzberg SL and Phillippy AM (2003) Using MUMmer to identify similar regions in large sequence sets. Curr Protoc Bioinform 10:10.3.
- Du L, Zhang C, Liu Q, Zhang X and Yue B (2018) Krait: An ultrafast tool for genome-wide survey of microsatellites and primer design. Bioinformatics 34:681-683.
- Eguiluz M, Rodrigues NF, Guzman F, Yuyama P and Margis R (2017a) The chloroplast genome sequence from Eugenia uniflora, a Myrtaceae from Neotropics. Plant Syst Evol 303:1199-1212.
- Eguiluz M, Yuyama PM, Guzman F, Rodrigues NF and Margis R (2017b) Complete sequence and comparative analysis of the chloroplast genome of Plinia trunciflora. Genet Mol Biol 40:871-876.
- Fidalgo ADO and Kleinert ADMP (2009) Reproductive biology of six Brazilian Myrtaceae: Is there a syndrome associated with buzz-pollination? New Zeal J Bot 47:355-365.
- Frazer KA, Pachter L, Poliakov A, Rubin EM and Dubchak I (2004) VISTA: Computational tools for comparative genomics. Nucleic Acids Res 32:W273-W279.
- Guzman F, Kulcheski FR, Turchetto-Zolet AC and Margis R (2014) De novo assembly of Eugenia uniflora L. transcriptome and identification of genes from the terpenoid biosynthesis pathway. Plant Sci 229:238-246.
- Jackman SD, Vandervalk BP, Mohamadi H, Chu J, Yeo S, Hammond SA, Jahesh G, Khan H, Coombe L, Warren RL et al. (2017) ABySS 2.0: Resource-efficient assembly of large genomes using a Bloom filter. Genome Res 27:768-777.
- Li H and Durbin R (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754-1760.
- Lohse M, Drechsel O, Kahlau S and Bock R (2013) OrganellarGenomeDRAW - a suite of tools for generating physical maps of plastid and mitochondrial genomes and visualizing expression data sets. Nucleic Acids Res 41:W575-W581.
- Machado LO, Vieira LD, Stefenon VM, Oliveira Pedrosa F, Souza EM, Guerra MP and Nodari RO (2017) Phylogenomic relationship of feijoa (Acca sellowiana (O.Berg) Burret) with other Myrtaceae based on complete chloroplast genome sequences. Genetica 145:163-174.
- Salvador MJ, de Lourenço CC, Andreazza NL, Pascoal AC and Stefanello ME (2011) Antioxidant capacity and phenolic content of four Myrtaceae plants of the South of Brazil. Nat Prod Commun 6:977-982.
- Souza-Moreira TM, Severi JA, Rodrigues ER, de Paula MI, Freitas JA, Vilegas W and Pietro RCLR (2019) Flavonoids from Plinia cauliflora (Mart.) Kausel (Myrtaceae) with antifungal activity. Nat Prod Res 33:2579-2582.
- Thornhill AH, Ho SYW, Külheim C and Crisp MD (2015) Interpreting the modern distribution of Myrtaceae using a dated molecular phylogeny. Mol Phylogenet Evol 93:29-43.
- Tillich M, Lehwark P, Pellizzer T, Ulbricht-Jones ES, Fischer A, Bock R and Greiner S (2017) GeSeq - versatile and accurate annotation of organelle genomes. Nucleic Acids Res 45:W6-W11.
- Vieira LN, Faoro H, Fraga HP, Rogalski M, de Souza EM, de Oliveira Pedrosa F, Nodari RO and Guerra MP (2014) An improved protocol for intact chloroplasts and cpDNA isolation in conifers. PLoS One 9:e84792.
- WCSP (2019) World checklist of selected plant families, http://wcsp.science.kew.org/
- Wilson PG, O’Brien MM, Heslewood MM and Quinn CJ (2005) Relationships within Myrtaceae sensu lato based on a matK phylogeny. Plant Syst Evol 251:3-19.
Supplementary material
The following online material is available for this article:
Figure S1 - Gene map of Eugenia brasiliensis chloroplast genome.
Figure S2 - Gene map of Eugenia nitida chloroplast genome.
Figure S3 - Gene map of Eugenia pyriformis chloroplast genome.
Figure S4 - Gene map of Myrcianthes pungens chloroplast genome.
Figure S5 - Gene map of Plinia edulis chloroplast genome.
Figure S6 - Gene map of Psidium cattleianum chloroplast genome.
Figure S7 - Sequence identity plot comparing plastomes of Myrcianthes and Eugenia species.
Figure S8 - Sequence identity plot comparing plastomes of Plinia species.
Figure S9 - Sequence identity plot comparing plastomes of Psidium species.
Figure S10 - Nucleotide alignment of accD, ccsA, rpoC2, matK, ndhF and ycf1.
Table S1 - List of 28 Myrtaceae chloroplast genomes.
Table S2 - Summary of libraries and assemblies from the chloroplast genomes.
Table S3 - List of simple sequence repeats of six Myrteae plastomes.
Table S4 - List of simple sequence repeats with the respective position in plastome.
Publication Dates
Publication in this collection: 08 May 2020
Date of issue: 2020
History
Received: 02 Oct 2019
Accepted: 29 Jan 2020