Complete Sequencing of the Mitochondrial Genome of Opisthorchis felineus, Causative Agent of Opisthorchiasis.

Opisthorchis felineus, a hepatic trematode, is the causative agent of opisthorchiasis, a dangerous disease of both humans and animals. Opisthorchiasis is widespread in Russia, especially in Western Siberia. The purpose of the present study was to determine the complete mitochondrial DNA sequence of this flatworm. Two parallel methods were employed: (1) capillary electrophoresis to sequence mitochondrial genome fragments obtained through specific PCR amplification, and (2) high throughput sequencing of the DNA sample. Both methods made it possible to determine the complete nucleotide sequence of the O. felineus mitochondrial genome. The genome is a ring molecule 14,277 nt in length that contains 36 genes encoding 2 rRNAs, 22 tRNAs, and 12 proteins: 3 subunits of cytochrome c oxidase, 7 subunits of NADH dehydrogenase, apocytochrome b, and subunit 6 of ATP synthetase. Like many other flatworms, O. felineus is characterized by the absence of the ATP synthetase subunit 8 gene. Nineteen out of the 22 tRNAs have a typical "clover leaf" structure. The tRNA-Ser(AGC) and tRNA-Cys genes lack DHU-loops, while tRNA-Ser(UCA) has 2 alternative structures: one with a DHU-loop and one without it. Analysis of the results obtained from the high throughput sequencing revealed 45 single-nucleotide polymorphisms within the mitochondrial genome. The results obtained in this study may be used in the development of molecular diagnostic methods for opisthorchiasis. This study shows that high throughput sequencing is a fast and effective method for decoding the mitochondrial genomes of animals.

Although O. felineus has been studied for over a century, the lack of knowledge about its specific identifying characteristics means that many questions about its prevalence and evolution remain unanswered. Previous molecular analyses of these flukes have not provided molecular markers specific enough for the purposes of present-day studies [3,4,5], but the complete decoding of this trematode's mitochondrial genome may enable specific and effective molecular markers to be created, which would have far-ranging applications in research.
The mitochondrial DNA (mtDNA) of most animal species has some unique features, such as its maternal pattern of inheritance, the absence of recombination, and its higher replication rate, which distinguish it from nuclear DNA [6] and make it a potentially unequalled tool for identification in phylogenetic and phylogeographic studies.
The number of sequenced mitochondrial genomes continues to increase, and they are now widely used for selecting genetic markers characterized by a high evolution rate and for creating high-resolution phylogenetic trees in which both the complete sequences and the individual gene sequences can be used as markers.
Two of the methods available for the complete sequencing and structural analysis of the mitochondrial genome of O. felineus are the subject of the present study.

BIOMATERIAL SOURCE AND DNA RECOVERY
The O. felineus samples were recovered from an infected cat from the Ust-Tula settlement (Novosibirsk region, Russia). The morphological features allowed specialists from the Parasitology and Ichthyology Laboratory (Institute of Systematics and Ecology of Animals, Russian Academy of Sciences) to determine the species. For both sequencing procedures, DNA was recovered from the pooled samples using the phenol-chloroform method [7].

DECODING OF THE O. FELINEUS MTDNA SEQUENCE USING CAPILLARY ELECTROPHORESIS, AFTER F. SANGER
The conserved sequences characteristic of trematode genomes were identified by comparing the mitochondrial genomes of the trematodes Fasciola hepatica (AF216697), Paragonimus westermani (AF216698), and Schistosoma mansoni (AF216698) using the MEME/MAST programs (http://meme.sdsc.edu/). Universal primers were selected on the basis of those sequences, as well as on the basis of published sequences of Clonorchis sinensis (DQ116944, AY264851) and O. viverrini (DQ882172, DQ119551). These primers were used to create a set of amplicons, approximately 1,000 bp long, whose sequences were then used to synthesize new primers. The remaining overlapping fragments of the mitogenome were then amplified. Most amplicons were sequenced directly; some amplicons were cloned, and at least three clones of each were then sequenced. The mtDNA sequencing was performed using the Applied Biosystems ABI PRISM 3100 Avant Genetic Analyzer in the DNA Sequencing Institute, Siberian Branch of the Russian Academy of Sciences. The complete sequence of O. felineus mtDNA can be found in GenBank (NC_011127).

DECODING OF THE O. FELINEUS MTDNA SEQUENCE USING THE HIGH THROUGHPUT SEQUENCING METHOD
In order to determine the O. felineus mtDNA sequence using the high throughput sequencing method, we employed the technology developed by the 454 Life Sciences company with the GS FLX genome analyzer. Having obtained a library of random DNA fragments, we carried out the clonal amplification of the DNA molecules bound to microparticles in water-in-oil emulsions, followed by sequencing with the GS FLX genome analyzer using a reagent kit and following the protocols established by Roche. One run of the device (12 hours) allowed us to determine 100 million nt; the average read length was about 220 nt. The set of overlapping sequences obtained using the GS FLX genome analyzer was then assembled into contigs using the GS De Novo Assembler software package (Roche Diagnostics, Roche Applied Science). Finally, the complete nucleotide sequence of one contig was determined to be the mitochondrial genome, 14,277 nt in length. The average read depth of the mtDNA was 30-fold.

BIOINFORMATIC ANALYSIS
The analysis of both the individual sequences and the assembled genome was performed with the Vector NTI 7 program (Informax Inc.). Similar sequences were searched for in the GenBank biological sequence databases (http://www.ncbi.nlm.nih.gov/blast). The flatworm mitochondrial genetic code was used to translate protein-coding sequences [8]. Most tRNAs were detected by the tRNAscan-SE program [9], while the remaining tRNAs were identified manually from their secondary structures and by comparison with other flatworms. In order to identify potential single nucleotide polymorphisms (SNPs), the sequences determined during the course of sequencing were aligned against the "consensus" sequence of O. felineus mtDNA using the GS Reference Mapper program (Roche). A SNP was called at those positions where at least three individual reads differed from the "consensus" sequence. All positions where the complete mtDNA sequences determined by capillary electrophoresis and by high throughput sequencing were not consistent were referred to as SNPs as well.
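The SNP criterion described above can be sketched in a few lines of Python. This is only an illustration, not the GS Reference Mapper algorithm: the function name `call_snps`, the gap-free alignment of reads given as (start, sequence) pairs, and the requirement that the three reads carry the same alternative base are all assumptions made for the example.

```python
from collections import Counter, defaultdict

MIN_SUPPORTING_READS = 3  # threshold stated in the text


def call_snps(consensus, aligned_reads, min_reads=MIN_SUPPORTING_READS):
    """Return {position: {alternative_base: read_count}} for candidate SNPs.

    aligned_reads is an iterable of (start, sequence) pairs with 0-based
    start positions. Reads are assumed to align without gaps (a
    simplification made for this sketch).
    """
    alt_counts = defaultdict(Counter)
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq.upper()):
            pos = start + offset
            if pos < len(consensus) and base != consensus[pos]:
                alt_counts[pos][base] += 1

    snps = {}
    for pos, counts in alt_counts.items():
        # Assumption: the same alternative base must be seen in >= min_reads reads.
        supported = {b: n for b, n in counts.items() if n >= min_reads}
        if supported:
            snps[pos] = supported
    return snps


# Toy data (hypothetical, not from the study).
toy_consensus = "ACGTACGT"
toy_reads = [
    (0, "ACGTACGT"),
    (0, "ACTTACGT"),
    (0, "ACTTACGT"),
    (0, "ACTTACGT"),
    (2, "GTACGA"),  # a single mismatch at position 7, below the threshold
]
toy_snps = call_snps(toy_consensus, toy_reads)
```

With this toy input only position 2 is reported, because the single mismatching read at position 7 does not reach the three-read threshold.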

METHODS OF MTDNA SEQUENCING
Due to their relatively short lengths, animal mitochondrial genomes were among the first objects of genomic investigation [10], and to date, hundreds of mtDNA sequences are known. The standard method for decoding the mitochondrial genome involves the recovery of mitochondria from the cells and the preparation of a mitochondrial DNA sample maximally purified of genomic DNA. The subsequent Sanger sequencing involves breaking the genome into randomly chosen fragments, cloning them into a plasmid vector (a library of random fragments), and sequencing the resulting clones using capillary electrophoresis. Since the fragments overlap, the sequences produced may be combined into a complete mtDNA sequence. In the present study, for the specific recovery of mitochondrial sequences, we used data on the mtDNA structure of closely related helminths, which allowed us to identify the conserved sites of the genome and to amplify the O. felineus mtDNA fragments lying between them by PCR. The sequences of the fragments obtained were determined by capillary electrophoresis and were combined into a complete mtDNA sequence 14,277 nt in length.
A new method, which makes it possible to determine genome sequences de novo, is the high throughput sequencing method [11] developed by the 454 Life Sciences company and implemented in the GS FLX genome analyzer. This method involves the fragmentation of DNA into pieces of 300-800 nt, the amplification of individual DNA fragments bound to microparticles in microdroplets formed in water-in-oil emulsions, the loading of the microparticles carrying immobilized amplified fragments into the microwells of a glass plate, parallel high throughput sequencing, and the registration of the results obtained from each of the several hundred thousand wells on the plate. The average read length is approximately 200 nt, and one run of the device can yield up to 100 million nt of sequence. The large volume of sequence obtained using this method allowed us to dispense with the specific recovery of mitochondrial genome fragments and to use a sample of "total" O. felineus genomic DNA for the sequencing. Although the share of mtDNA sequences was less than 1% of the whole sequencing volume, it was enough to read the mtDNA with 30-fold coverage, which provided a complete "assembly" of the mitochondrial genome sequence after only one run of the GS FLX genome analyzer.
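The figures quoted in this section can be checked against each other with simple arithmetic. The sketch below is a consistency check only, using the rounded values given in the text (14,277 nt genome, 30-fold depth, 100 million nt per run, reads of about 200 nt); the variable and function names are illustrative.

```python
GENOME_LENGTH_NT = 14_277       # length of the O. felineus mtDNA
MEAN_DEPTH = 30                 # average number of times each nucleotide was read
RUN_YIELD_NT = 100_000_000      # approximate yield of one GS FLX run
MEAN_READ_LENGTH_NT = 200       # approximate average read length


def mt_sequence_volume(genome_length, depth):
    """Total number of sequenced nucleotides that belong to the mtDNA."""
    return genome_length * depth


mt_nt = mt_sequence_volume(GENOME_LENGTH_NT, MEAN_DEPTH)
mt_share = mt_nt / RUN_YIELD_NT             # fraction of the whole run
mt_reads = mt_nt / MEAN_READ_LENGTH_NT      # approximate number of mtDNA reads

print(f"mtDNA nucleotides sequenced: {mt_nt}")
print(f"share of the run: {mt_share:.2%}")
print(f"approximate number of mtDNA reads: {mt_reads:.0f}")
```

Under these rounded inputs, about 428,000 nt of the run (roughly 0.4%) derive from the mitochondrial genome, which agrees with the statement that mtDNA accounted for less than 1% of the sequencing volume.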

MAJOR CHARACTERISTICS OF THE O. FELINEUS MITOCHONDRIAL GENOME
The O. felineus mitochondrial genome is a ring molecule, 14,277 nt in length. It is the shortest among the currently known mitochondrial genomes of trematodes [12]. Analysis of the genome sequence confirmed the presence of typical mitochondrial genes: 12 protein-coding genes (ATP-synthetase subunit 8 is absent), 22 tRNA-coding genes, and 2 rRNA-coding genes (Table 1).
As with other flatworms, all genes are transcribed from one strand (Fig. 1). The gene order of the O. felineus mitochondrial genome is similar to that of F. hepatica [13]; the nd4L and nd4 genes overlap by 40 bp in different reading frames.
All known flatworm mitochondrial genomes, except for the P. westermani genome, are A/T-rich. The O. felineus mitochondrial genome contains 60% A+T; moreover, the coding strand is rich in thymine (43%) compared to adenine (17%), guanine (28%), and cytosine (12%). The nucleotide composition varies between different parts of the O. felineus genome, especially in the third position of codons of protein-coding genes, where the cytosine content is only 8%. Codons ending in T and G are more frequent than those ending in A and C. The most frequently occurring codons are TTT, GTT, and TTG. TTT codons account for almost 10% of the total number, while all codons composed only of A and C account for only 2% (Table 2). As with other trematode mitochondrial genomes, the start codons are ATG and GTG, while the stop codon is TAG. The TGA codon codes for tryptophan, while TAA is not used at all. Truncated stop codons were not found in the O. felineus mitochondrial genome (Table 1).
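Statistics of this kind (overall base composition, composition of the third codon position, and codon frequencies) can be computed as in the following Python sketch. It is an illustration under stated assumptions, not the procedure used in the study: the function names are hypothetical, the input is assumed to be an in-frame coding sequence on the coding strand, and the toy sequence is invented.

```python
from collections import Counter


def base_composition(seq):
    """Fraction of A, C, G and T in seq; other symbols are ignored."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return {b: 0.0 for b in "ACGT"}
    return {b: counts[b] / total for b in "ACGT"}


def codon_usage(cds):
    """Count codons in an in-frame coding sequence (trailing bases dropped)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def third_position_composition(cds):
    """Base composition restricted to the third position of each codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return base_composition(cds[2:usable:3])


# Invented toy sequence: ATG TTT GTT TTG TAG
toy_cds = "ATGTTTGTTTTGTAG"
toy_composition = base_composition(toy_cds)
toy_codons = codon_usage(toy_cds)
toy_third = third_position_composition(toy_cds)
at_content = toy_composition["A"] + toy_composition["T"]
```

Applied to the concatenated protein-coding genes of a genome, `codon_usage` gives the kind of per-codon counts summarized in Table 2, and `third_position_composition` gives the third-position cytosine content discussed above.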
The length of the tRNA genes in the O. felineus mitochondrial genome ranges from 59 to 72 nucleotides. Most tRNA genes are grouped in clusters of up to five genes. Nineteen out of the 22 tRNA genes are characterized by the typical "clover leaf" structure. As in all trematodes, tRNA-Ser(AGN) lacks the DHU-loop. The tRNA-Cys, as in some schistosomes, does not have the DHU-loop [14]. The tRNA-Ser(UCN) gene can have two alternative structures: one with the DHU-loop and one without it (Fig. 2).
In addition to short intervals between consecutive genes, flatworm genomes often have long non-coding regions, which are believed to be sequences necessary for the initiation of mtDNA replication and transcription. As in the F. hepatica genome, the O. felineus non-coding region located between the tRNA-Glu and cox3 genes is divided into 2 parts. An open reading frame, 402 bp in length, was detected in the O. felineus non-coding region. A search for similar sequences within the database of biological sequences using both the nucleotide and the amino-acid sequence did not yield any results. Quite long open reading frames unrelated to known proteins were also found in the non-coding regions of the mtDNA of other flatworm species: F. hepatica, the cestode Hymenolepis diminuta [15], and the monogenean Microcotyle sebastis [16]. These reading frames likely code for functional proteins; however, this hypothesis needs to be investigated further in future studies. The mtDNA non-coding region may be used to develop a molecular method for the specific identification of O. felineus. The homology levels between the O. felineus mtDNA sequence and those of two related trematodes whose mitochondrial genome sequences are known, C. sinensis (FJ381664) and F. hepatica (AF216697), amount to 78% and 64%, respectively. However, the sequences of the non-coding regions located between the tRNA-Glu and cox3 genes in these three species do not have significant homology either among themselves or with other sequences contained in GenBank.
(Table 2 note: The corresponding amino acid and its frequency of occurrence in the mtDNA genes are indicated for each codon. Differences from the standard genetic code are underlined.)

SINGLE NUCLEOTIDE POLYMORPHISMS IN THE O. FELINEUS MTDNA
In the course of the mtDNA sequencing using the high throughput sequencing method, each nucleotide in the genome was "read" an average of 30 times during the sequencing of clonally amplified individual fragments of the O. felineus mtDNA molecules. Comparison of the individual reads with the consensus sequence permitted the identification of single nucleotide polymorphisms (SNPs), which are present in different mtDNA molecules in one organism. Since both Sanger sequencing and high throughput sequencing were performed with DNA recovered from several O. felineus specimens, comparing the corresponding mtDNA sequences makes it possible to estimate the frequency of occurrence of the haplotypes for each SNP.
Data on the 45 detected SNPs are presented in Table 3. Most SNPs in both animal and human mtDNA [17] involve T:C and A:G substitutions (the latter corresponding to T:C on the lower strand), which do not cause an amino-acid substitution in the protein products of the corresponding genes. It should be noted that some SNPs appeared to be specific to the mtDNA sequence decoded by one of the two technologies and were not found (or were only rarely found) in the other sequences. The difference in allele frequency is likely to be the result of errors specific to the PCR-based methods for the amplification and sequencing of a heterogeneous amplicon mixture, while the ratio of SNP alleles obtained by sequencing individual fragments should be considerably more precise.
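In standard terminology, T:C and A:G substitutions are transitions (pyrimidine to pyrimidine or purine to purine), and an A:G change on one strand is a T:C change on the complementary strand. The short Python sketch below illustrates both points; the function names are hypothetical and the code is not part of the original analysis.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def substitution_class(ref, alt):
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("ref and alt must differ")
    pair = {ref, alt}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def on_complementary_strand(ref, alt):
    """Express the same substitution as it appears on the opposite strand."""
    return COMPLEMENT[ref.upper()], COMPLEMENT[alt.upper()]


# An A:G substitution on one strand is a T:C substitution on the other.
example = on_complementary_strand("A", "G")
```

Classifying each entry of a SNP table in this way makes it straightforward to count how many of the detected polymorphisms are transitions.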
In the future, the data on specific SNPs and their frequency of occurrence in mtDNA may be used as molecular markers in studies of the natural populations of O. felineus, as well as in the analysis of the pathogenic pathways of this trematode in human populations.

CONCLUSION
This study presents the results of the complete sequencing of the mtDNA of the flatworm O. felineus obtained using two methods. The first method involved the amplification of the mtDNA and its sequencing using capillary electrophoresis. The second, parallel high throughput sequencing of a sample of the animal's genomic DNA, was performed without any preliminary enrichment for mtDNA sequences. This enables the complete de novo sequencing of the mitochondrial genome. The high throughput sequencing method using the GS FLX genome analyzer may be used for the rapid decoding of animal mitochondrial genomes and for the identification of polymorphisms. The newly generated data on the nucleotide sequence of the O. felineus mitochondrial genome may be utilized in the development of specific molecular diagnostic methods for opisthorchiasis.

The work was supported by the program "Genomics, Proteomics, and Bioinformatics" of the Institute of Cytology and Genetics, Russian Academy of Sciences, and by the Federal Agency for Science and Innovations (project no. 02.552.11.7045) of the Bioengineering Center, Russian Academy of Sciences. We would like to thank N.I. Yurlova and K.P. Fedorov, members of the Institute of Systematics and Ecology of Animals, Russian Academy of Sciences, for assistance in the identification of O. felineus.