RESEARCH ARTICLE
Transcriptome Analysis of ESTs from a Chaetognath Reveals a Deep-Branching Clade of Retrovirus-Like Retrotransposons
Roxane M Barthélémy, Jean-Paul Casanova, Eric Faure*
Article Information
Identifiers and Pagination:
Year: 2008
Volume: 2
First Page: 44
Last Page: 60
Publisher Id: TOVJ-2-44
DOI: 10.2174/1874357900802010044
Article History:
Received Date: 28/3/2008
Revision Received Date: 8/4/2008
Acceptance Date: 9/4/2008
Electronic publication date: 7/5/2008
Collection year: 2008
open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.5/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Chaetognaths constitute a small marine phylum exhibiting several characteristics that are highly unusual in animal genomes, including two classes of both rRNA and ribosomal protein genes. Because the presence of retrovirus-like elements has never been documented in this phylum, a published expressed sequence tag (EST) collection of the chaetognath Spadella cephaloptera was analysed. Twelve sequences representing transcript sections of the reverse transcriptase domain of active retrotransposons were isolated from ~11,000 ESTs. Five of them originate from Gypsy retrovirus-like elements, whereas the others are transcripts from a Bel-Pao LTR-retrotransposon, a Penelope-like element and LINE retrotransposons. Moreover, part of a putative integrase has also been found. Phylogenetic analyses suggest a deep-branching clade of retrovirus-like elements, which is in agreement with the probable Cambrian origin of the phylum. Moreover, retrotransposons have not been found in telomeric-like transcripts, which are probably constituted by both vertebrate and arthropod canonical repeats.
INTRODUCTION
Chaetognaths are a small marine phylum whose members live in various habitats, and most of them are among the most abundant planktonic organisms [1]. Their body consists of three parts, the head, trunk and tail, separated by septa [2]. These animals are protandric hermaphrodites; the ovaries lie in the trunk on both sides of the gut, while the testes are in the tail (reviewed in [3]). Their phylogenetic position remains enigmatic, although recent molecular analyses suggest a protostome affinity [4-7]. Casanova et al. [8] showed that chaetognaths can be considered as a model animal.
The chaetognath genomes exhibit several molecular singularities, including paralogs of both ribosomal RNA genes and ribosomal protein genes [9-11] and conservation of extremely divergent paralogous sequences, suggesting a low rate of gene conversion [12]. Moreover, in situ hybridizations have shown that each type of both 18S and 28S rRNA paralogs is important for specific cellular functions [13-15]. The causes of these features are unknown, although an alloploid event has been suggested [12]. As it is well known that mobile genetic elements (MGEs, also called transposable elements) can strongly impact genome evolution, knowledge of these elements, about which little is known in chaetognaths [16], could be a fruitful contribution.
MGEs are ubiquitous in a wide range of living organisms; however, only in multicellular eukaryotes do they make up a large fraction of the genome, as is evident from C-values. These elements, which can transpose from one location to another within the genome, are known to be one of the causes of large-scale genome reorganization [17]. Although regarded as selfish DNA with a negative impact on the host, MGEs have been shown to contribute significantly to gene evolution [18]. These elements are now regarded as one of the principal forces driving the evolution of eukaryotic genomes [19,20]. Because of the great number of known MGEs (several thousand), and as new types of mobile repeats are discovered at a rapid rate, a unified classification system for eukaryotic transposable elements has recently been proposed, designed on the basis of the transposition mechanism, sequence similarities and structural relationships [21]. MGEs are divided into two classes. Class I retrotransposons replicate via an RNA intermediate; the key enzyme of this mechanism is the reverse transcriptase (RT), and each complete replication cycle produces one new copy. Class II transposons, which are outside the scope of this study, move as a DNA segment by a classical "cut-and-paste" mechanism.
Retrotransposons have been divided into five orders on the basis of their mechanistic features, organization and reverse transcriptase (RT) phylogeny: LTR-retrotransposons, DIRS-like elements, Penelope-like elements, LINEs and SINEs. The LTR-retrotransposons are retrovirus-like elements containing long terminal repeats (LTRs) and ORFs for at least gag, a structural protein for virus-like particles, and pol. Pol encodes an aspartic proteinase (PR), reverse transcriptase, RNase H, and a DDE integrase (IN). LTR retrotransposons also contain specific signals for packaging, dimerization, reverse transcription and integration. The two main superfamilies, Gypsy and Copia, differ in the relative order of the integrase and RT domains. According to Wicker et al. [21], the other members are retroviruses, endogenous retroviruses (ERVs) and the Bel-Pao superfamily, which contains MGEs that are structurally similar to Gypsy or Copia elements but differ from them in RT phylogenies. Evolutionarily, LTR retrotransposons are closely related to retroviruses. Retroviruses have a viral lifestyle through acquisition of an envelope protein in addition to various regulatory proteins. Retroviruses are distributed widely among vertebrates and may also occur in some invertebrates; for example, some members of the Gypsy family found in dipteran insects are able to infect new individuals [22]. A retrovirus can also be transformed into an LTR retrotransposon-like element through inactivation or deletion of the domains that enable extracellular mobility, after which it can only be inherited by vertical transmission through the germ line. This is the case for the endogenous retroviruses, even if some of them can still be transmitted horizontally.
Penelope-like elements (PLEs) represent a new order of retroelements identified in more than 80 species including unicellular animals, fungi and plants [23]. These elements code for a protein that represents a fusion between a reverse transcriptase and a GIY-YIG endonuclease. They encode an RT that is more closely related to telomerase than to the RT from LTR retrotransposons or LINEs; moreover, members of this order also have LTR-like sequences that can be in a direct or an inverse orientation. The other principal retrotransposon orders are the LINEs and the SINEs. The LINEs lack LTRs, can reach several kilobases in length, and are found in all eukaryotic kingdoms. Autonomous LINEs encode at least an RT and a nuclease in their pol ORF for transposition, and they often display a poly(A) tail at their 3' end. The SINEs are non-autonomous elements: as they do not contain an RT gene, they rely on the activity of RT proteins encoded by LINEs to retrotranspose [24]. They originated from accidental retrotransposition of various polymerase III transcripts and possess an internal polymerase III promoter, allowing them to be expressed. SINEs belong to the retrosequences, a group containing all the sequences that have arisen by reverse transcription of ribosomal, messenger and small stable RNAs [25,26].
A previous study using degenerate primers gave positive results during screening of Sagitta sp. for LINE-like and Gypsy-like reverse transcriptases, and also for Mariner-like transposases; however, only two LINE elements have been sequenced [16]. Because retrovirus-like sequences have never been documented in this phylum, a published expressed sequence tag (EST) collection [6] of the chaetognath Spadella cephaloptera was analysed.
MATERIAL AND METHODS
EST Sequence Identifications
The chaetognath EST collection used has been annotated by Marlétaz et al. [6]. Since then, a great number of new sequences have been deposited in databases. As a high level of sequence diversity can be found even within the same retrotransposon subfamily, a new analysis was necessary. Since amino acid sequences are more useful for detecting homology over long evolutionary periods, the EST sequences were translated in all six reading frames and compared to the sequences in the NCBI nr and Swissprot protein databases. Sequences that did not match were further compared against the GenBank and dbEST nucleotide databases (Blastn). Among 11,254 sequences, thirteen showed similarity (e-value < 10^-5) to previously described retroelement sequences.
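The six-frame translation step described above can be sketched as follows. This is a minimal illustration that assumes the standard genetic code and plain DNA strings; the function names are hypothetical and it is not the pipeline actually used for the study, which relied on Blast searches against public databases.

```python
# Minimal six-frame translation sketch (standard genetic code assumed).
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate a DNA string from its first base; unknown codons give 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def six_frames(seq):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to peptides."""
    forward = seq.upper()
    reverse = reverse_complement(forward)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(forward[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames
```

Each of the six peptides would then be used as a protein query, and only hits below the chosen e-value cutoff would be retained.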
Blast and Phylogenetic Analyses
For each amino acid sequence deduced from a chaetognath EST and homologous to a retrotransposon gene, homologs were searched for automatically among the full-length proteins of the NCBI NR protein database. At this step, the Figenix platform was used to detect homologs automatically on the basis of robust phylogenetic reconstruction [27]. When the number of homologs detected automatically was lower than 20, BLAST-based datasets were constructed using BLASTp queries against the NCBI NR protein database. The Figenix platform was also used for these phylogenetic reconstructions. The robustness of the trees was tested by bootstrap analyses with 1000 resamplings. As the number of homologous sequences was too low (< 8) for some analyses, only the alignments, obtained with the Clustal W program [28], are given in those cases.
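As an illustration of what a bootstrap resampling involves, the sketch below draws alignment columns with replacement to build pseudo-replicate alignments, which is the generic principle of the phylogenetic bootstrap. It is an assumption-laden toy: the function names are hypothetical, the internal procedure of the Figenix platform is not described here, and the tree reconstruction that would follow each replicate is omitted.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample the columns of an alignment with replacement.

    `alignment` maps sequence names to aligned strings of equal length.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all aligned sequences must have the same length")
    # Draw as many column indices as there are columns, with replacement.
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(alignment[name][c] for c in columns)
        for name in names
    }


def bootstrap_replicates(alignment, n_replicates=1000, seed=0):
    """Return a list of pseudo-replicate alignments (1000 by default)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]
```

A tree would be inferred from each replicate, and the support for a clade is the fraction of replicate trees in which that clade appears.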
RESULTS
Blast and Phylogenetic Analyses
The cDNA library has been made from mRNAs isolated from various embryonic stages of Spadella cephaloptera (from 0 to 48 hours after hatching) [6]. The 5'-ends of 11,254 clones from this library have been sequenced and, after annotation analyses, homology relations have been assigned to 2396 clones corresponding to the transcripts of 792 different genes. In agreement with the annotation of Marlétaz et al. [6], our re-analysis suggests that thirteen ESTs represent transcript sections of active retroelements (Table 1). Three EST sequences are strictly identical and one sequence is internal to another EST. Although alignments with published retrotransposons suggest that some EST sequences could partially overlap, it has been impossible to assemble these ESTs into contigs because of significant nucleotide differences in the overlapping regions, suggesting that these ESTs belong to different MGEs, although these are phylogenetically very close.
Table 1. Chaetognath ESTs Containing Transcript Sections of Retrotransposon Domains
Type of Retrotransposon: Order and Superfamily for LTR-Retrotransposon | EST Acc. n° | Total Length - [Length of the Microsatellite Region] - (Length of the poly(A) Tail) | Closest Homologous Complete Retrotransposon: Name of the Element (if Known), Taxon, Species, Acc. n° | Protein Domain Matching | Other Characteristic(s) of the ESTs |
---|---|---|---|---|---|
LTR-retrotransposon: Retrovirus-like, Gypsy | CR953949 | 641-(0) | Tv1, insect Drosophila virilis (AF056940) | RT-RNase H | One frameshift |
LTR-retrotransposon: Retrovirus-like, Gypsy | CR953634, CR953418, CR953554 | 494-(23), 494-(23), 494-(23) | Tv1, insect Drosophila virilis (AF056940) | RT | The three sequences are strictly identical |
LTR-retrotransposon: Retrovirus-like, Gypsy | CR952896 | 360-(0) | Tv1, insect Drosophila virilis (AF056940) | RT | |
LTR-retrotransposon: Bel-Pao | CR950076, CR950075 | 784-(0), 682-(0) | Kamikaze, insect Bombyx mori [30] | RT | Shortest sequence internal to the longest |
Penelope-like element (Ple) | CR950254 | [53]-471-(26) | Xena, fish Takifugu rubripes (AAK58879) | RT | (CAA)n microsatellite |
Penelope-like element (Ple) | CR950255 | 706-[303]-(0) | Xena, fish Takifugu rubripes (AAK58879) | RT | (CAA)n microsatellite |
Penelope-like element (Ple) | CR941783 | 714-(30) | Xena, fish Takifugu rubripes (AAK58879) | RT | Complementary sequence |
LINE | CR950197 | 574-(0) | CR1, testudine, Platemys spixii (AB005891) | RT | |
LINE | CR950101 | 669-(0) | CR1-1, fish, Danio rerio (AB211149) | RT | |
Putative integrase of an unknown element | CR950196 | 678-(30) | Urochordate Oikopleura dioica (AAS21408) | IN | Doubtful, homology with only one putative integrase |
Several characteristics of each EST and of the closest homologous complete retrotransposon are reported. Abbreviations: (Acc. n°) accession number, (RT) reverse transcriptase, (IN) integrase. The position of the number corresponding to the microsatellite sequence length reflects its position in the EST, i.e. in the 5' or in the 3' region.
Table 2. Analysis of the Chaetognath ESTs Containing Telomeric Motifs
Motif / Number of Repeats | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|
TTAGGG "Metazoan" | 36 | 10 | - | 2 | - | - | 1 |
TTAGG "Arthropoda" | 24 | 1 | 1 | - | - | - | - |
TTAGG/TTAGGG or TTAGGG/TTAGG | 3 | - | - | - | - | - | - |
TTAGGG/TTAGGG/TTAGG | 1 | - | - | - | - | - | - |
TTAGG/TTAGG/TTAGGG | 1 | - | - | - | - | - | - |
The accession numbers of these ESTs are: CR940792, CR941840, CR944786, CR945111, CR945171, CR946582, CR946615, CR946633, CR946651, CR946666, CR947615, CR948211, CR949136, CR949498, CR949663, CR949907 and CR950054. A dash (-) denotes 0.
Generally, in retrotransposons the gag and pol genes are intact. Some are nevertheless interrupted by in-frame stop codons, and some non-autonomous LTR-retrotransposons can still transpose via trans-activation by autonomous partners. Only three of the chaetognath ESTs (CR950075, CR950076 and CR950197) do not contain stop codons; moreover, two ESTs have a microsatellite sequence linked to a part of the pol gene (Table 1). In addition, in the chaetognath EST collection, sequence homologies have been found only against Pol domains; indeed, it is well known that the pol gene is the most conserved among retrotransposons, and pol-based trees have been constructed to elucidate the phylogenetic relationships between these elements since Xiong and Eickbush [29]. Our analyses also show that twelve of the thirteen ESTs encode part of the RT domain, generally its COOH-terminal region, whereas one EST sequence exhibits some homology with a putative integrase (Table 1 and Fig. 1).
Using the Figenix platform, six phylogenetic trees have been obtained (Fig. 1). Three analyses reveal that the deduced sequences of five ESTs belong to the Gypsy/Ty3 superfamily (Fig. 1A, B and C). These retrovirus-like elements have been found in animals, fungi and plants [21], and sequences belonging to these three taxa are present in the three phylogenies. Moreover, several clades of retrovirus-like elements have been found for the three taxa. Although the bootstrap values are very low, all these analyses suggest that the partial chaetognath sequences group with homologous regions of animal Gypsy/Ty3 elements. In two analyses, the chaetognath RT sequences group with fungal sequences or with fungal and plant sequences, but this is not statistically supported (Fig. 1A and C), whereas in the other analysis, the chaetognath sequence is the sister group of a clade containing sequences from fish (Teleostei), echinoderms, platyhelminths and insects (Fig. 1B). In this analysis, chaetognath sequences seem to constitute a deep-branching clade of one of the animal retrovirus-like clades. In addition, another phylogenetic analysis suggests that the predicted sequence of two similar ESTs belongs to the Bel-Pao superfamily (Fig. 1D) [30]; the chaetognath sequence appears to be the sister group of deuterostomian and platyhelminth sequences. In all the phylogenetic analyses using LTR-retrotransposon sequences, one or more bacterial sequences are included. They belong to the bacterial endosymbiont Wolbachia and exhibit a high percentage of homology with the orthologs of their host (Drosophila ananassae); the bacterial sequences of the deduced pol gene of the Gypsy element and of the Bel-Pao element share, respectively, 703 and 1265 strictly contiguous identical amino acids with their insect orthologs. Similarly, in another phylogenetic analysis, the polymerase sequence of a Pineapple bacilliform virus (Retro-transcribing viruses, Caulimoviridae) [31] groups with RT sequences from plants (Fig. 1A). As has already been shown for several other genes, this supports the hypothesis of horizontal gene transfers [32].
The last two phylogenetic analyses suggest that two chaetognath ESTs could encode part of the RT of a LINE element (Fig. 1E and F). One predicted sequence groups with LINE elements from a turtle and an annelid, whereas the other groups with nematode and echinoderm sequences; however, in these two analyses the observed groupings are not statistically supported. Moreover, these phylogenetic analyses suggest that the chaetognath genome bears at least two types of LINE elements.
For the other chaetognath EST sequences, the number of homologous sequences was too low for phylogenetic analyses to be performed, and the predicted amino acid sequences have been aligned with the homologous sequences identified by Blast analyses (Fig. 2). The deduced sequences from three ESTs exhibit homologies with parts of the RT domain of some of the Penelope-like elements (Fig. 2A, B and C), whereas one EST shares partial homology with the COOH-terminal region of a putative integrase found in a urochordate (Fig. 2D).
Close Relationships Between Retrotransposons and Microsatellites
Analysis of chaetognath ESTs reveals that two sequences contain part of a retrotransposable element closely associated with the same microsatellite repeat motif, (CAA)n (Table 1). This motif has already been found in two of the four known microsatellite loci of another chaetognath, Sagitta setosa (DQ463218 and DQ463220) [33]. However, in this last species, no MGEs have been found in the flanking sequences; the first locus is at the end of the ribosomal protein L8 gene, and the second is in an intron of the Midasin gene (or pseudogene). Interestingly, several microsatellite families exhibit MGEs in their flanking sequences in both plants and animals [34-36], suggesting a possible close relationship between these two types of repetitive sequences.
Search for Retrotransposons and Reverse Transcriptase Genes in Telomeric Regions
Telomeres are regions of repetitive DNA at the ends of most linear chromosomes, which protect the chromosome ends from destruction [37]. Telomerase, a ribonucleoprotein composed of a reverse transcriptase (RT) and an RNA encoded by a different gene, synthesizes the telomeric DNA repeats [38]. A relationship between telomerases and retrotransposon RTs, which are encoded by their template RNA, was surmised from shared amino acid motifs [38,39] and from the observation that telomeres are elongated by non-LTR retrotransposons in some insects [40]. Indeed, in almost all studied organisms telomere repeats are very short simple sequences; the exceptions are dipterans, which evolved chromosome ends with complex arrays of long satellite repeats (in both Chironomus and Anopheles gambiae) [41,42] or of non-LTR retrotransposons (members of the Drosophila genus) that are elongated by telomerase-independent mechanisms (for review see [43,44]). Drosophila melanogaster telomeres are composed of two non-LTR retrotransposons, HeT-A and TART, with a few copies of Tahre, an element that appears to combine parts of both HeT-A and TART; homologs of these last two elements have been found in D. virilis [45]. Moreover, three of the chaetognath EST sequences exhibit homology with the RT domain of the Penelope-like elements (Table 1 and Fig. 2A-C). This order of retrotransposons has been found in animals, fungi and plants; in some taxa, they can be associated with telomeric repeats, as shown by Gladyshev and Arkhipova [46]. According to these authors, they may descend from the missing link between early eukaryotic retroelements and present-day telomerases.
As a large number of ESTs have been found in telomeric regions in both animals and plants (for example [47,48]), consensus motifs of various taxa including animals, protozoa, plants and fungi [49] have been searched for in the chaetognath EST collection. ESTs containing telomeric-like sequences have been searched for using all the known telomeric sequences. Only ESTs containing at least three telomeric motifs in a sequence of 153 nucleotides have been selected. Using this non-stringent criterion, seventeen ESTs have been found and all but two contain both TTAGGG and TTAGG motifs (Table 2). (TTAGGG)n could be the ancestral telomere repeat motif of Metazoa. It has been conserved since the metazoan radiation in most animal phylogenetic lineages and, to our knowledge, has been replaced by other motifs in only two major lineages, Arthropoda [(TTAGG)n] and Nematoda [(TTAGGC)n] [50]. In the chaetognath EST collection, the mean nucleotide distance between two motifs is 15 nt, and 19 instances of motifs in tandem (two or more repeats) have been found. The nematode TTAGGC motif has very rarely been found and never in tandem, suggesting that it is probably a TTAGG motif with a C as flanking base at the 3' end, or a variant of the metazoan consensus.
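The selection criterion described above (at least three telomeric motifs within 153 nucleotides) can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, only the TTAGGG and TTAGG motifs are searched, only the given strand is scanned, and a TTAGG followed by G is always counted as TTAGGG, none of which is specified in the text.

```python
import re

# Greedy pattern: TTAGGG when a third G follows, otherwise TTAGG.
MOTIF_RE = re.compile(r"TTAGGG?")


def find_motifs(seq):
    """Return (start, end, motif) for each non-overlapping motif hit."""
    return [(m.start(), m.end(), m.group())
            for m in MOTIF_RE.finditer(seq.upper())]


def has_motif_cluster(seq, min_motifs=3, window=153):
    """True if `min_motifs` consecutive hits fit within `window` nucleotides."""
    hits = find_motifs(seq)
    for i in range(len(hits) - min_motifs + 1):
        # Span from the start of the first hit to the end of the last one.
        span = hits[i + min_motifs - 1][1] - hits[i][0]
        if span <= window:
            return True
    return False


def select_ests(ests, min_motifs=3, window=153):
    """Return the names of sequences in `ests` (name -> sequence) that pass."""
    return [name for name, seq in ests.items()
            if has_motif_cluster(seq, min_motifs, window)]
```

Motif-to-motif distances and tandem runs such as those summarized in Table 2 could be derived from the coordinates returned by `find_motifs`.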
Moreover, in addition to their role in protecting the ends of chromosomes, telomeres also influence the expression of adjacent genes, a process called the telomere-position effect. The pattern of expression of telomeric transgenes demonstrates that subtelomeric regions are epigenetically reprogrammed [51]. However, when the seventeen sequences were translated in all six reading frames and compared to the sequences in protein databases, the result was negative, indicating that these regions are not located close to genes. A recent study suggests that in Drosophila an existing non-LTR retrotransposon was recruited to perform the cellular function of telomere maintenance [52]. However, in our analysis, the telomere flanking regions do not show sequence homology with any retrotransposon, not even with Penelope-like elements, suggesting the presence of only consensus telomere repeats in chaetognaths.
DISCUSSION
The present analysis of a chaetognath EST library identified thirteen retrotransposon sequences. Retrotransposons comprise 0.11% of the total ESTs and 0.54% of the ESTs assigned to protein gene transcripts. In other animals, the corresponding values are generally higher, e.g., 1.0% in venom glands of a fish [53], from 4 to 14% depending on the stage in a platyhelminth [54], and 14% in mouse oocytes [55]. However, the chaetognath value is similar to the 0.12% reported from a survey of ESTs from many plant species [56], although higher values can be found [57]. These transcriptional differences could be due to strict control of retrotransposon expression in higher plants, in which LTR-retrotransposons alone can comprise 50-90% of the genome [58], and probably also in juvenile chaetognaths. However, EST collections can exhibit several biases; indeed, ESTs are short single-pass sequence reads derived from cDNA clones selected randomly from cDNA libraries, and in contrast to genomic sequences, they are generally of low quality, poorly annotated, and highly redundant. Common EST features include a ~2% sequence error rate with a high frequency of insertions and deletions, redundancy of sequences derived from highly expressed genes, and low representation of genes expressed at low levels. Moreover, ESTs may derive from unspliced immature mRNAs, alternative splicing and polyadenylation sites, cloning artifacts (chimerisms), and mitochondrial transcripts [59]. In addition, it should be kept in mind that only transcription and ultimate integration in tissues giving rise to gametes is heritable, and the chaetognath EST collection was derived from whole animals.
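The two proportions quoted above follow directly from the counts given in the Results (13 retrotransposon ESTs, 11,254 ESTs in total, 2396 ESTs assigned to genes). A short check, with hypothetical variable names, is sketched below; the first ratio evaluates to roughly 0.116%, which the text reports as 0.11%.

```python
def percent(part, total):
    """Return `part` as a percentage of `total`."""
    return 100.0 * part / total


RETROTRANSPOSON_ESTS = 13   # ESTs matching retroelements
TOTAL_ESTS = 11254          # sequenced clones in the library
ASSIGNED_ESTS = 2396        # clones assigned to known gene transcripts

share_of_total = percent(RETROTRANSPOSON_ESTS, TOTAL_ESTS)
share_of_assigned = percent(RETROTRANSPOSON_ESTS, ASSIGNED_ESTS)

print(f"share of all ESTs:      {share_of_total:.3f}%")
print(f"share of assigned ESTs: {share_of_assigned:.3f}%")
```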
Despite the low number of ESTs encoding retrotransposons found in the chaetognath library studied, members of three of the four orders of autonomous retrotransposons have been found [21]; the missing order is the one containing the DIRS-like elements. The highest number of transcripts has been found for the Gypsy retrovirus-like superfamily, which could constitute a deep-branching clade according to phylogenetic analyses. The origin of chaetognaths remains obscure, but fossil evidence suggests that this phylum was widespread and diverse in the earliest Cambrian [60], and the difficulty of establishing the phylogenetic position of this taxon is probably partly due to its divergence at an early stage from the primitive ancestor of the Bilateria. Moreover, as is well known [61], present retrovirus-like phylogenies suggest the polyphyly of these elements.
In addition, one deduced amino acid sequence of two similar ESTs exhibits sequence homology with the RT of Bel-Pao LTR-retrotransposons from deuterostomes and platyhelminths. Moreover, phylogenies suggest the presence of active LINEs in the chaetognath genome; however, the level of homology between these sequences and the closest published sequences is very low. Phylogenetic analyses of LINE-like RT sequences suggested that sequences of the chaetognath Sagitta sp. group with those of Lophotrochozoa (i.e., Nemertea, Mollusca, Gastrotricha, Annelida, Echiura and Rotifera) [16], whereas in our phylogenies, the chaetognath LINE elements group with vertebrate and insect sequences.
Alignments of other predicted EST sequences with the closest orthologs obtained using Blast analyses show that the chaetognath genome also contains active elements which exhibit sequence homology with the RT of Penelope-like elements from deuterostomes, platyhelminths and insects [39,62,63]. However, all our analyses show that the chaetognath retrotransposon sequences are not phylogenetically informative; this is principally due to the short length of the regions analyzed. Moreover, in both plants and animals numerous cases of horizontal transmission have been published [64-66]; however, the basal positions of chaetognath sequences in most of our phylogenetic trees seem to exclude horizontal transfers.
Association between microsatellites and MGEs has been reported in a variety of organisms including plants and animals [34-36]. These MGEs could be DNA transposons, but many of them are retroelements. This results in microsatellites that have similar or nearly identical flanking regions sharing great homology with MGEs, suggesting that these elements can be involved in the genesis and genomic spread of microsatellites in organisms as diverse as animals [34,35,67-69] and plants [70]. This intimate association between microsatellite repeats and retrotransposons has also been put to good use to develop a method named REMAP for genotyping and fingerprinting analyses [71]. The microsatellite-MGE association could be a molecular symbiosis between two types of genomic sequences. Indeed, the two main molecular pathways of this mutual aid are transposition and recombination. MGEs generally cannot invade some chromosomal regions, due, e.g., to the absence of insertion sites or the presence of euchromatin. Moreover, MGEs can be strongly negatively regulated by various mechanisms; they can even encode their own negative trans-regulator [72, 73]. Recombination-related events due to microsatellites, such as unequal crossing over, allow the expansion of MGEs present in the microsatellite flanking regions, even when they are inactive. Conversely, microsatellites flanking MGE sequences could be multiplied and dispersed in the genome during transposition processes. Transposition of additional regions including functional genes is known both for DNA transposons, for which complex transposons probably evolve by transposition of homologous insertion sequences to nearby sites within a DNA molecule [74], and for retrotransposons, which can mediate sequence transduction [75]. Moreover, the 3' end region of several non-LTR retrotransposons can be implicated as a major source for the formation of adenine-rich microsatellites [67]. The potential molecular symbiosis between MGEs and microsatellites shows that the behavior and the evolution of repetitive sequences can only be understood within a larger genomic context.
Our search for ESTs containing telomeric-like sequences reveals the presence of two types of telomeric motifs. The vertebrate motif (TTAGGG) is dominant; it constitutes an ancestral motif of telomeres in bilaterian animals and possibly also in the superclade including animals, fungi and amoebozoans [50,76]. More surprisingly, the probably ancestral Arthropoda motif (TTAGG)n [76] has also been found in ~30% of the cases, suggesting that it is not a type of degenerate TTAGGG repeat. Moreover, no sequence of retrotransposons or RT genes has been found in the telomeric regions.
The ability of various factors to stimulate MGE activity was first proposed by McClintock [77], and one of us (EF) has been involved in this field of research for a long time. With regard to LTR retrotransposons and retroviruses, chemical and physical agents have been shown to induce transcription and transposition [78-92]. Non-LTR retrotransposons also respond to these stresses [93]. Moreover, expression from retrotransposon promoters is wound-inducible [94,95] and retrotransposons can be activated during viral, bacterial, fungal or parasitic attacks [96-98]. As a retrotransposition burst can be an indicator of the genomic stress response, activation of MGEs will be investigated in chaetognaths, because this taxon seems very resistant. Indeed, chaetognaths do acquire eukaryotic and bacterial parasites, but not very frequently [99,100]. There is no host-specific parasite known in chaetognaths, which is remarkable for such an old group [60]. Moreover, chaetognaths exhibit a great antibacterial activity [101] and, in the laboratory, beheaded chaetognaths, which can survive some 30 days after decapitation and are able to mature spermatozoa and to mate with normal specimens, have never exhibited bacterial infestation [102].
CONCLUSION
Despite the low number of retrotransposon ESTs found in a juvenile chaetognath collection and the short length of the sequences analyzed, this study suggests that chaetognath retrovirus-like retrotransposons could constitute deep-branching clades. The origin of these elements could correspond to the origin of the phylum; indeed, fossils of chaetognath grasping spines support the hypothesis that this animal taxon was present in Cambrian times or even earlier. Moreover, studies on chaetognath retrotransposons could inform future research in the exciting domain of the evolutionary origin of retroviruses. Lastly, owing to the role of LTR-retrotransposons in genome structure, evolution and function, entire elements will be cloned and characterized, starting from the retrotransposon fragments shown in this study. Moreover, activation of these elements during normal development and in situations of stress, including pathogen attack, will be sought.
REFERENCES
[1] | Feigenbaum DL, Maris RC. Feeding in chaetognatha Oceanogr Mar Biol Ann Rev 1984; 22: 343-92. |
[2] | Casanova J-P. Chaetognatha. South Atlantic Zooplankton. Leiden: Backhuys Publishers 1999; pp. 1353-74. |
[3] | Lewbart AL. Invertebrate Medicine In: New York : Wiley, John & Sons 2004. |
[4] | Shimotori T, Goto T. Developmental fates of the first four blastomeres of the chaetognath Paraspadella gotoi: relationship to protostomes Dev Growth Differ 2001; 43(4): 371-82. |
[5] | Faure E, Casanova J-P. Comparison of chaetognath mitochondrial genomes and phylogenetical implications Mitochondrion 2006; 6(5): 258-62. |
[6] | Marletaz F, Martin E, Perez Y, et al. Chaetognath phylogenomics: a protostome with deuterostome-like development Curr Biol 2006; 16(15): R577-578. |
[7] | Matus DQ, Copley RR, Dunn CW, et al. Broad taxon and gene sampling indicate that chaetognaths are protostomes Curr Biol 2006; 16(15): R575-576. |
[8] | Casanova J-P, Duvert M, Perez Y. Phylogenetic interest of the chaetognath model Mésogée 2001; 59: 27-31. |
[9] | Telford MJ, Holland PWH. Evolution of 28S ribosomal DNA in Chaetognaths: duplicate genes and molecular phylogeny J Mol Evol 1997; 44(2): 135-44. |
[10] | Papillon D, Perez Y, Caubit X, Le Parco Y. Systematics of Chaetognatha under the light of molecular data, using duplicated ribosomal 18S DNA sequences Mol Phyl Evol 2006; 38(3): 621-34. |
[11] | Barthélémy R-M, Chenuil A, Brancart S, Casanova J-P, Faure E. Translational machinery of the chaetognath Spadella cephaloptera: A transcriptomic approach to the analysis of cytosolic ribosomal protein genes and their expression. BMC Evol Biol [serial on the Internet]. Aug 2007 Available from: http://www.biomedcentral.com/1471-2148/7/146 [cited 2007 August 28]; 7: 146 [about 45 screens] |
[12] | Barthélémy R-M, Péténian F, Vannier J, Casanova J-P, Faure E. Evolutionary history of the chaetognaths inferred from actin and 18S-28S rRNA paralogous genes Int J Zool Res 2006; 2(4): 284-300. |
[13] | Barthélémy R-M, Grino M, Pontarotti P, Casanova J-P, Faure E. The differential expression of ribosomal 18S RNA paralog genes from the chaetognath Spadella cephaloptera Cell Mol Biol Lett 2007; 12(4): 573-83. |
[14] | Barthélémy R-M, Casanova J-P, Grino M, Faure E. Selective expression of two types of 28S rRNA paralogous genes in the chaetognath Spadella cephaloptera Cell Mol Biol (Noisy-le-grand) 2007; 53(Suppl:OL): 989-93. |
[15] | Barthélémy R-M, Grino M, Pontarotti P, Casanova J-P, Faure E. A possible relationship between the phylogenetic branch lengths and the chaetognath rRNA paralog gene functionalities: ubiquitous, tissue-specific or pseudogenes In: Pontarotti P, Ed. Evolutionary biology from concept to application. New York: Springer 2008. in press |
[16] | Arkhipova I, Meselson M. Transposable elements in sexual and ancient asexual taxa Proc Natl Acad Sci USA 2000; 97(26): 14473-7. |
[17] | Kidwell MG, Lisch D. Transposable elements as sources of variation in animals and plants Proc Natl Acad Sci USA 1997; 94(15): 7704-11. |
[18] | Kazazian HH Jr. Mobile elements: drivers of genome evolution Science 2004; 303(5664): 1626-32. |
[19] | Labrador M, Corces VG. Interactions between transposable elements and the host genome In: Craig NL, Craigie R, Gellert M, Lambowitz AM, Eds. Mobile DNA II. Washington DC: American Society for Microbiology Press 2002; pp. 1008-23. |
[20] | Levy A, Sela N, Ast G. TranspoGene and microTranspoGene: transposed elements influence on the transcriptome of seven vertebrates and invertebrates Nucleic Acids Res 2008; 36: D47-52. |
[21] | Wicker T, Sabot F, Hua-Van A, et al. A unified classification system for eukaryotic transposable elements Nature Rev Genet 2007; 8(12): 973-82. |
[22] | Bucheton A. The relationship between the flamenco gene and gypsy in Drosophila: how to tame a retrovirus Trends Genet 1995; 11(9): 349-53. |
[23] | Arkhipova IR. Distribution and phylogeny of Penelope-like elements in eukaryotes Syst Biol 2006; 55(6): 875-85. |
[24] | Dewannieux M, Esnault C, Heidmann T. LINE-mediated retrotransposition of marked Alu sequences Nat Genet 2003; 35(1): 41-8. |
[25] | Brosius J. Genomes were forged by massive bombardments with retroelements and retrosequences Genetica 1999; 107(1-3): 209-38. |
[26] | Ding W, Lin L, Chen B, Dai J. L1 elements, processed pseudogenes and retrogenes in mammalian genomes IUBMB Life 2006; 58(12): 677-85. |
[27] | Gouret P, Vitiello V, Balandraud N, Gilles A, Pontarotti P, Danchin EGJ. FIGENIX: intelligent automation of genomic annotation: expertise integration in a new software platform. BMC Bioinformatics [serial on the Internet]. Aug 2005 Available from: http://www.biomedcentral.com/1471-2105/6/198 [cited 2005 August 5]; 6: 198 [about 25 screens] |
[28] | Larkin MA, Blackshields G, Brown NP, et al. Clustal W and Clustal X version 2.0 Bioinformatics 2007; 23(21): 2947-8. |
[29] | Xiong Y, Eickbush TH. Origin and evolution of retroelements based upon their reverse transcriptase sequences EMBO J 1990; 9(10): 3353-62. |
[30] | Abe H, Ohbayashi F, Sugasaki T, et al. Two novel Pao-like retrotransposons (Kamikaze and Yamato) from the silkworm species Bombyx mori and B. mandarina: common structural features of Pao-like elements Mol Genet Genomics 2001; 265(2): 375-85. |
[31] | Thomson KG, Dietzgen RG, Thomas JE, Teakle DS. Detection of pineapple bacilliform virus using the polymerase chain reaction Ann Appl Biol 1996; 129(1): 57-69. |
[32] | Salzberg SL, Hotopp JC, Delcher AL, et al. Serendipitous discovery of Wolbachia genomes in multiple Drosophila species. Genome Biol [serial on the Internet]. Feb 2005 Available from: http://genomebiology.com/2005/6/3/R23 [cited 2005 February 22]; 6(3): R23 [about 18 screens] |
[33] | Peijnenburg KT, Fauvelot C, Breeuwer JA, Menken SB. Spatial and temporal genetic structure of the planktonic Sagitta setosa (Chaetognatha) in European seas as revealed by mitochondrial and nuclear DNA markers Mol Ecol 2006; 15(11): 3319-38. |
[34] | Meglécz E, Petenian F, Danchin E, D'Acier AC, Rasplus JY, Faure E. High similarity between flanking regions of different microsatellites detected within each of two species of Lepidoptera: Parnassius apollo and Euphydryas aurinia Mol Ecol 2004; 13(6): 1693-700. |
[35] | Meglécz E, Anderson SJ, Bourguet D, et al. Microsatellite flanking region similarities among different loci within insect species Insect Mol Biol 2007; 16(2): 175-85. |
[36] | Zhang DX. Lepidopteran microsatellite DNA: redundant but promising TREE 2004; 19(10): 507-9. |
[37] | Nakamura TM, Cech TR. Reversing time: origin of telomerase Cell 1998; 92(5): 587-90. |
[38] | Lingner J, Hughes TR, Shevchenko A, Mann M, Lundblad V, Cech TR. Reverse transcriptase motifs in the catalytic subunit of telomerase Science 1997; 276(5312): 561-7. |
[39] | Arkhipova IR, Pyatkov KI, Meselson M, Evgen'ev MB. Retroelements containing introns in diverse invertebrate taxa Nat Genet 2003; 33(2): 123-4. |
[40] | Pardue ML, Danilevskaya ON, Traverse KL, Lowenhaupt K. Evolutionary links between telomeres and transposable elements Genetica 1997; 100(1-3): 73-84. |
[41] | Rosén M, Edström J. DNA structures common for chironomid telomeres terminating with complex repeats Insect Mol Biol 2000; 9(3): 341-7. |
[42] | Walter MF, Bozorgnia L, Maheshwari A, Biessmann H. The rate of terminal nucleotide loss from a telomere of the mosquito Anopheles gambiae Insect Mol Biol 2001; 10(1): 105-10. |
[43] | Biessmann H, Mason JM. Telomerase-independent mechanisms of telomere maintenance Cell Mol Life Sci 2003; 60(11): 2325-33. |
[44] | Pardue ML, DeBaryshe PG. Retrotransposons provide an evolutionarily robust non-telomerase mechanism to maintain telomeres Annu Rev Genet 2003; 37: 485-511. |
[45] | Casacuberta E, Marín FA, Pardue ML. Intracellular targeting of telomeric retrotransposon Gag proteins of distantly related Drosophila species Proc Natl Acad Sci USA 2007; 104(20): 8391-6. |
[46] | Gladyshev EA, Arkhipova IR. Telomere-associated endonuclease-deficient Penelope-like retroelements in diverse eukaryotes Proc Natl Acad Sci USA 2007; 104(22): 9352-7. |
[47] | Ning Y, Xu JF, Li Y, et al. Telomere length and the expression of natural telomeric genes in human fibroblasts Hum Mol Genet 2003; 12(11): 1329-36. |
[48] | Anderson LK, Lai A, Stack SM, Rizzon C, Gaut BS. Uneven distribution of expressed sequence tag loci on maize pachytene chromosomes Genome Res 2006; 16(1): 115-22. |
[49] | Podlevsky JD, Bley CJ, Omana RV, Qi X, Chen J. The Telomerase Database Nucleic Acids Res 2008; 36: D339-43. |
[50] | Traut W, Szczepanowski M, Vítková M, Opitz C, Marec F, Zrzavý J. The telomere repeat motif of basal Metazoa Chromosome Res 2007; 15(3): 371-82. |
[51] | Gao Q, Reynolds GE, Innes L, et al. Telomeric transgenes are silenced in adult mouse tissues and embryo fibroblasts but are expressed in embryonic stem cells Stem Cells 2007; 25(12): 3085-92. |
[52] | Villasante A, Abad JP, Planelló R, Méndez-Lago M, Celniker SE, de Pablos B. Drosophila telomeric retrotransposons derived from an ancestral element that was recruited to replace telomerase Genome Res 2007; 17(12): 1909-18. |
[53] | Magalhães GS, Junqueira-de-Azevedo IL, Lopes-Ferreira M, Lorenzini DM, Ho PL, Moura-da-Silva AM. Transcriptome analysis of expressed sequence tags from the venom glands of the fish Thalassophryne nattereri Biochimie 2006; 88(6): 693-9. |
[54] | DeMarco R, Kowaltowski AT, Machado AA, et al. Saci-1, -2, and -3 and Perere, four novel retrotransposons with high transcriptional activities from the human parasite Schistosoma mansoni J Virol 2004; 78(6): 2967-78. |
[55] | Evsikov AV, Graber JH, Brockman JM, et al. Cracking the egg: molecular dynamics and evolutionary aspects of the transition from the fully grown oocyte to embryo Genes Dev 2006; 20(19): 2713-27. |
[56] | Vicient CM, Jääskeläinen MJ, Kalendar R, Schulman AH. Active retrotransposons are a common feature of grass genomes Plant Physiol 2001; 125(3): 1283-92. |
[57] | Rotter D, Bharti AK, Li HM, et al. Analysis of EST sequences suggests recent origin of allotetraploid colonial and creeping bentgrasses Mol Genet Genomics 2007; 278(2): 197-209. |
[58] | Du C, Swigonová Z, Messing J. Retrotranspositions in orthologous regions of closely related grass species. BMC Evol Biol [serial on the Internet]. Aug 2006 Available from: http://www.biomedcentral.com/1471-2148/6/62 [cited 2006 August 16]; 6: 62 [about 15 screens] |
[59] | Gissi C, Pesole G. Transcript mapping and genome annotation of ascidian mtDNA using EST data Genome Res 2003; 13(9): 2203-12. |
[60] | Vannier J, Steiner M, Renvoisé E, Xu S-X, Casanova J-P. Predator arrow worms in the early Cambrian food webs Proc R Soc B 2007; 274(1610): 627-33. |
[61] | Volff JN, Körting C, Altschmied J, et al. Jule from the fish Xiphophorus is the first complete vertebrate Ty3/Gypsy retrotransposon from the Mag family Mol Biol Evol 2001; 18(2): 101-11. |
[62] | Dalle Nogare DE, Clark MS, Elgar G, Frame IG, Poulter RT. Xena, a full-length basal retroelement from tetraodontid fish Mol Biol Evol 2002; 19(3): 247-55. |
[63] | Pyatkov KI, Shostak NG, Zelentsova ES, et al. Penelope retroelements from Drosophila virilis are active after transformation of Drosophila melanogaster Proc Natl Acad Sci USA 2002; 99(25): 16150-5. |
[64] | Jordan IK, Matyunina LV, McDonald JF. Evidence for the recent horizontal transfer of long terminal repeat retrotransposon Proc Natl Acad Sci USA 1999; 96(22): 12621-5. |
[65] | Terzian C, Ferraz C, Demaille J, Bucheton A. Evolution of the Gypsy endogenous retrovirus in the Drosophila melanogaster subgroup Mol Biol Evol 2000; 17(6): 908-14. |
[66] | Eickbush TH, Malik HS, et al. Origins and evolution of retrotransposons In: Craig NL, Craigie R, Gellert M, Lambowitz AM, Eds. Mobile DNA II. Washington DC: American Society for Microbiology Press 2002; pp. 1111-44. |
[67] | Arcot SS, Wang Z, Weber JL, Deininger PL, Batzer MA. Alu repeats: a source for the genesis of primate microsatellites Genomics 1995; 29(1): 136-44. |
[68] | Nadir E, Margalit H, Gallily T, Ben-Sasson SA. Microsatellite spreading in the human genome: evolutionary mechanisms and structural implications Proc Natl Acad Sci USA 1996; 93(13): 6470-5. |
[69] | Wilder J, Hollocher H. Mobile elements and the genesis of microsatellites in dipterans Mol Biol Evol 2001; 18(3): 384-92. |
[70] | Ramsay L, Macaulay M, Cardle L, et al. Intimate association of microsatellite repeats with retrotransposons and other dispersed repetitive elements in barley Plant J 1999; 17(4): 415-25. |
[71] | Kalendar R, Schulman AH. IRAP and REMAP for retrotransposon-based genotyping and fingerprinting Nat Protoc 2006; 1(5): 2478-84. |
[72] | Faure E. A sequence of the U5 region of Drosophila 1731 retrotransposon long terminal repeat (LTR) trans-represses the LTR-directed transcription Biochemistry (Mosc) 1999; 64(6): 678-92. |
[73] | Faure E. Strategy of coevolution between an intracellular parasite (retrotransposon) and his host (fruit fly). A sequence of the U5 region of Drosophila 1731 retrotransposon long terminal repeat (LTR) trans-represses the LTR-directed transcription Mésogée 2001; 59: 7-15. |
[74] | Kidwell MG. Transposable elements In: Gregory TR, Ed. The Evolution of the Genome. San Diego: Elsevier 2005; pp. 165-221. |
[75] | Xing J, Wang H, Belancio VP, Cordaux R, Deininger PL, Batzer MA. Emergence of primate genes by retrotransposon-mediated sequence transduction Proc Natl Acad Sci USA 2006; 103(47): 17608-13. |
[76] | Vítková M, Král J, Traut W, Zrzavý J, Marec F. The evolutionary origin of insect telomeric repeats, (TTAGG)n Chromosome Res 2005; 13(2): 145-56. |
[77] | McClintock B. The significance of responses of the genome to challenge Science 1984; 226(4676): 792-801. |
[78] | Faure E, Best-Belpomme M, Champion S. Upregulation of the Drosophila 1731 retrotransposon long-terminal repeat by UV-B irradiation requires a short sequence in the U3 region Arch Biochem Biophys 1996; 326(2): 219-26. |
[79] | Faure E, Best-Belpomme M, Champion S. UVB irradiation upregulation of the Drosophila 1731 retrotransposon LTR requires the same short sequence of U3 region in a human epithelial cell line as in Drosophila cells Photochem Photobiol 1996; 64(5): 807-13. |
[80] | Faure E, Best-Belpomme M, Champion S. X-irradiation activates the Drosophila 1731 retrotransposon LTR and stimulates secretion of an extracellular factor that induces the 1731-LTR transcription in nonirradiated cells J Biochem 1996; 120(2): 313-9. |
[81] | Faure E, Emanoil-Ravier R, Champion S. UVB irradiation-induced transcription from the long terminal repeat of intracisternal A particles and UVB-induced secretion of an extracellular factor that induces transcription of the intracisternal A particles in unirradiated cells J Photochem Photobiol B 1996; 36(1): 61-6. |
[82] | Faure E, Emanoil-Ravier R, Champion S. Induction of transcription from the long terminal repeat of the intracysternal particles type A (IAP) by X-irradiation Arch Physiol Biochem 1997; 105(2): 183-9. |
[83] | Wessler SR. Turned on by stress. Plant retrotransposons Curr Biol 1996; 6(9): 959-61. |
[84] | Kalendar R, Tanskanen J, Immonen S, Nevo E, Schulman AH. Genome evolution of wild barley (Hordeum spontaneum) by BARE-1 retrotransposon dynamics in response to sharp microclimate divergence Proc Natl Acad Sci USA 2000; 97(12): 6603-7. |
[85] | Shim S, Lee SK, Han JK. A novel family of retrotransposons in Xenopus with a developmentally regulated expression Genesis 2000; 26(3): 198-207. |
[86] | Ramallo E, Kalendar R, Schulman AH, Martínez-Izquierdo JA. Reme1, a Copia retrotransposon in melon, is transcriptionally induced by UV light Plant Mol Biol 2008; 66(1-2): 137-50. |
[87] | Faure E, Lecine P, Lipcey C, Champion S, Imbert J. Cell-to-cell contact activates the long terminal repeat of human immunodeficiency virus 1 through its kappaB motif Eur J Biochem 1997; 244(2): 568-74. |
[88] | Faure E, Cavard C, Zider A, Guillet JP, Resbeut M, Champion S. X irradiation-induced transcription from the HIV type 1 long terminal repeat AIDS Res Hum Retroviruses 1995; 11(1): 41-3. |
[89] | Faure E, Lecine P, Imbert J, Champion S. Activation of the HIV type 1 long terminal repeat by X-irradiation involves two main Rel/NF-kappa B DNA-binding complexes AIDS Res Hum Retroviruses 1996; 12(16): 1519-27. |
[90] | Faure E, Lecine P, Imbert J, Champion S. Activation of the transcription from the human immunodeficiency virus type 1 (HIV-1) long terminal repeat by autologous and heterologous cell-to-cell contact Cell Mol Biol (Noisy-le-grand) 1996; 42(6): 811-23. |
[91] | Faure E, Rameil P, Lecine P, et al. Secretion of extracellular factor(s) induced by X-irradiation activates the HIV type 1 long terminal repeat through its kappaB motif AIDS Res Hum Retroviruses 1998; 14(4): 353-65. |
[92] | Kumar S, Orsini MJ, Lee JC, McDonnell PC, Debouck C, Young PR. Activation of the HIV-1 long terminal repeat by cytokines and environmental stress requires an active CSBP/p38 MAP kinase J Biol Chem 1996; 271(48): 30864-9. |
[93] | Morales JF, Snow ET, Murnane JP. Environmental factors affecting transcription of the human L1 retrotransposon. II. Stressors Mutagenesis 2003; 18(2): 151-8. |
[94] | Mhiri C, Morel JB, Vernhettes S, Casacuberta JM, Lucas H, Grandbastien MA. The promoter of the tobacco Tnt1 retrotransposon is induced by wounding and by abiotic stress Plant Mol Biol 1997; 33(2): 257-66. |
[95] | Takeda S, Sugimoto K, Otsuki H, Hirochika H. Transcriptional activation of the tobacco retrotransposon Tto1 by wounding and methyl jasmonate Plant Mol Biol 1998; 36(3): 365-76. |
[96] | Grandbastien MA. Activation of plant retrotransposons under stress conditions Trends Plant Sci 1998; 3(5): 181-7. |
[97] | Capy P, Gasperi G, Biémont C, Bazin C. Stress and transposable elements: co-evolution or useful parasites? Heredity 2000; 85(Pt2): 101-6. |
[98] | Nellåker C, Yao Y, Jones-Brando L, Mallet F, Yolken RH, Karlsson H. Transactivation of elements in the human endogenous retrovirus W family by viral infection. Retrovirology [serial on the Internet]. Jul 2006 Available from: http://www.retrovirology.com/content/3/1/44 [cited 2006 July 6]; 3: 44 [about 16 screens] |
[99] | Øresland V, Bray RA. Parasites and headless chaetognaths in the Indian Ocean Mar Biol 2005; 147(3): 725-34. |
[100] | Daponte MC, Gil de Pertierra AA, Palmieri MA, Ostrowski de Nuñez M. Monthly occurrence of parasites of the chaetognath Sagitta friderici off Mar del Plata, Argentina J Plankt Res, in press 2008. |
[101] | Nair S, Simidu U. Distribution and significance of heterotrophic marine bacteria with antibacterial activity Appl Environ Microbiol 1987; 53(12): 2957-62. |
[102] | Duvert M, Perez Y, Casanova J-P. Wound healing and survival of beheaded chaetognaths J Mar Biol Assoc UK 2000; 80(5): 891-8. |