
  • Philippine Journal of Science 141 (1): 25-34, June 2012. ISSN 0031-7683. Date Received: 19 Apr 2011

    Key Words: 5' untranslated region, bioinformatics tools, hepatitis C virus subtypes, non-structural 5B region

    *Corresponding author: [email protected]


    Michael O. Baclig1*, Juliet Gopez-Cervantes2, and Filipinas F. Natividad1

    1Research and Biotechnology Division, 2Liver Disease and Transplant Center, St. Luke's Medical Center,

    279 E. Rodriguez Sr., Blvd., Quezon City

    Bioinformatics Tools for Identifying Hepatitis C Virus Subtypes

    With the development of freeware bioinformatics software and the availability of web-based software, it is now possible to use various bioinformatics tools to identify subtypes of viruses such as hepatitis C virus (HCV). This study aimed to demonstrate the role of bioinformatics tools in identifying HCV subtypes and to compare the accuracy of HCV-1 subtyping by 5’UTR PCR-RFLP analysis and DNA sequencing. From a clinical viewpoint, accurate genotype and subtype identification of HCV is important because it can guide the choice of therapy for a particular patient. From 2005 to 2008, we had a total of 30 HCV genotype 1 (HCV-1) positive samples. HCV-1 subtypes were identified by an in-house PCR-RFLP analysis and through direct nucleic acid sequencing using nested primers specific to the 5’UTR and non-structural 5B (NS5B) region. Bioinformatics tools play an important role in identifying HCV-1 subtypes by predicting the size of the amplicon; determining the specific restriction enzyme to cut a given nucleic acid sequence; viewing and editing the electropherogram; aligning nucleotide sequences with prototypes; searching for identical sequences; and understanding the evolution and relationships of various subtypes. The HCV nucleotide sequences reported in this study have been deposited in GenBank. Overall, this information can be utilized to generate molecular diagnostic tests in the future.

    INTRODUCTION

    There are a number of different methods for HCV genotyping and subtyping. The most frequently used typing methods are line probe assay (LiPA) and sequencing of the 5’UTR. The Versant HCV genotype assay (LiPA) manufactured by Innogenetics has been developed based on hybridization of 5’UTR amplification products with genotype specific probes. On the other hand, the TruGene HCV 5’NCR genotyping kit (Bayer Healthcare, CA) is based on semi-automated sequencing (Verbeeck et al. 2008; Chevaliez et al. 2009). However, it has been shown that genotyping methods using the 5’UTR, including LiPA, may not discriminate subtypes 1a from 1b in 5% to 10% of cases. Thus, other investigators have used different regions of the HCV genome using RFLP analysis or sequencing of the 5’UTR and NS5B for genotyping and subtyping (Zein 2000; Chen and Weck 2002; Zheng et al. 2003; Martro et al. 2008; Qiu et al. 2009; Mora et al. 2010).

    PCR-RFLP analysis of the nested RT-PCR amplified 5’UTR is generally used for the identification of HCV genotypes in the Philippines (Maramag et al. 2006). It has been suggested that as the virus continues to evolve and more HCV-infected individuals are tested, new subtypes such as HCV-1c will emerge (Ross et al. 2008; Verma and Chakravarti 2008; Utama et al. 2010). To date, only three confirmed HCV-1 subtypes (1a, 1b, and 1c) and 10 provisionally assigned subtypes (1d to 1m) have been described (Bracho et al. 2006; Bracho et al. 2008; Martro et al. 2011). Thus, it is likely that typing methods, including PCR-RFLP analysis, will have to be modified to accommodate the rapidly growing body of data on HCV sequence heterogeneity (Davidson et al. 1995; Buoroa et al. 1999; Lee et al. 2010). In addition, there is little doubt that HCV typing methods require careful redesign.

    The importance of identifying HCV genotypes and subtypes using bioinformatics tools transcends mere academic interest because it provides clinicians and scientists with invaluable information about HCV genomics, which can be used for epidemiologic studies. Furthermore, molecular characterization of HCV subtypes is likely to facilitate the development of an effective vaccine. From a clinical point of view, current therapeutic decisions for chronically infected HCV patients are made on the basis of genotyping and subtyping. Thus, accurate identification of subtypes will enable clinicians to choose appropriately among new antiviral compounds, which are likely to show distinct activities against isolates belonging to different subtypes of HCV (Chandra et al. 2007; Chevaliez et al. 2009; Koletzki et al. 2010; Panduro et al. 2010; Pickett et al. 2011).

    Bioinformatics tools have been developed to generate, store, analyze, and visualize biological data. The challenge is to choose user-friendly tools that would give clear and meaningful biological information, without being overwhelmed by the complexity of the data. In this paper, we demonstrate how bioinformatics tools can be used to identify HCV-1 subtypes and we highlight selected freeware bioinformatics software and web-based software. We also report the accuracy of HCV-1 subtyping by 5’UTR PCR-RFLP analysis compared to direct DNA sequencing. Additionally, we compare the HCV-1 subtypes by phylogenetic analysis of the 5’UTR and non-structural 5B (NS5B) region.

    MATERIALS AND METHODS

    Isolation of HCV RNA

    Viral ribonucleic acid (RNA) was obtained from the St. Luke’s BioBank. These RNA samples were extracted from the blood of patients who tested positive for hepatitis C.

    cDNA synthesis

    cDNA synthesis was carried out from HCV RNA extract using the SuperScript III reverse transcriptase.

    Nested PCR amplification of the 5’UTR

    The 5’UTR amplification was carried out with 1.0 pmol of each primer, 0.5X Phusion HF Buffer, 0.10 mM dNTPs, 0.01 U of Phusion DNA polymerase, and RNA template, made up to a final volume of 50 µL. Nested PCR was carried out using a programmable thermocycler (G-Storm) at 94 °C for 1 minute, followed by 25 cycles of 94 °C for 25 seconds, 50 °C for 40 seconds, and 72 °C for 1 minute, and then 72 °C for 5 minutes. First round PCR was done using the outer sense primer (5’-CTGTGAGGAACTACTGTCTT-3’) and outer antisense primer (5’-ATACTCGAGGTGCACGGTCTACGAGACCT-3’). One µL of the first round PCR product was reamplified with internal primers (5’-TTCACGCAGAAAGCGTCTAG-3’ and 5’-CACTCTCGAGCACCCTATCAGGCAGT-3’) for another 25 cycles under the same conditions (Chan et al. 1992). Negative and positive controls were included in the nucleic acid extraction, reverse transcription, and amplification steps for quality control, to exclude false positive results due to cross contamination. The amplicons were analyzed on 2% agarose gels, stained with ethidium bromide, and visualized under a UV transilluminator (Gel Doc). A 25-bp ladder (Invitrogen) was used as molecular weight marker.

    Nested PCR amplification of the NS5B region

    NS5B nested PCR amplification was carried out with the following thermal profile: initial denaturation at 95 °C for 1 minute followed by 40 cycles at 95 °C for 20 seconds, 56 °C for 30 seconds, 72 °C for 1 minute, and 72 °C for 10 minutes as previously described (Baclig et al. 2010).

    Simulation and testing of the PCR assay

    Amplify 3 version 3.1.4 for Mac OS (http://engels.genetics.wisc.edu/amplify) and AmplifX 1.5.4 (http://ifrjr.nord.univ-mrs.fr/AmplifX) were used to predict the size of the amplicon.
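The idea behind predicting an amplicon size in silico can be illustrated with a minimal Python sketch. It does not reproduce how Amplify 3 or AmplifX work internally: it assumes exact primer matches, and the template is a synthetic sequence built around the internal primers given in the methods so that the expected product is 251 bp. The function names and the template are hypothetical.

```python
# Minimal in-silico PCR sketch (exact matches only; synthetic template).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def predict_amplicon_size(template, forward, reverse):
    """Return the product length for exact primer matches, or None."""
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the top strand as its reverse complement.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return end + len(site) - start


# Internal (second round) primers quoted in the methods.
INNER_FORWARD = "TTCACGCAGAAAGCGTCTAG"
INNER_REVERSE = "CACTCTCGAGCACCCTATCAGGCAGT"

# Hypothetical template: flanks and filler are placeholders, not HCV sequence.
filler_length = 251 - len(INNER_FORWARD) - len(INNER_REVERSE)
toy_template = (
    "G" * 30
    + INNER_FORWARD
    + "A" * filler_length
    + reverse_complement(INNER_REVERSE)
    + "G" * 30
)

size = predict_amplicon_size(toy_template, INNER_FORWARD, INNER_REVERSE)
print(f"Predicted amplicon size: {size} bp")
```

Real primer-simulation tools also score partial matches and primer dimers, which this sketch deliberately leaves out.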

    Restriction fragment length polymorphism (RFLP) of the 5’UTR

    NEBcutter version 2.0 (http://tools.neb.com/NEBcutter2) was used to map the enzyme that cuts the 251-bp amplicon. The 251-bp amplicon of the 5’UTR was digested with BstUI for 16 hours at 60 °C. HCV-1 subtypes were identified from the resulting DNA fragments, which were visualized using polyacrylamide gel electrophoresis (PAGE) and ethidium bromide staining.


    Visualization of the electropherogram and sequence alignment

    ChromasPro version 1.5 was used to visualize and edit the electropherogram (http://www.technelysium.com.au/ChromasPro.html). Sequence alignment and analysis of the 5’UTR and NS5B region were carried out using the ClustalW2 multiple sequence alignment (http://www.ebi.ac.uk/Tools/msa/clustalw2) and Molecular Evolutionary Genetics Analysis (MEGA) version 5 software.

    DNA purification and sequencing

    A portion of the nested PCR products was purified using the GFX PCR DNA and Gel Band Purification kit (GE Healthcare) prior to direct nucleotide sequencing. PCR products in the 5’UTR and NS5B region were sequenced using the Big Dye Terminator Sequencing Ready Reaction Kit and an Applied Biosystems 3730xl Automated Sequencer (Macrogen, Korea). The sequence data were aligned with the consensus sequences of confirmed subtypes using ClustalW2 and MEGA5 multiple sequence alignment software. For identification, the DNA sequences were then compared with sequences from the National Center for Biotechnology Information (NCBI) using the basic local alignment search tool (BLAST) program. The GenBank/EMBL/DDBJ accession numbers of HCV sequences used in the analysis were M62321 (for 1a), D90208 (for 1b), D14853 (for 1c), AF238485 (for 2a), D10988 (for 2b), D17763 (for 3a), Y11604 (for 4a), Y13184 (for 5a), and Y12083 (for 6a).
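Comparing a query with subtype reference sequences can be sketched in Python as a percent-identity calculation over pre-aligned fragments. This is only an illustration of the principle, not a substitute for ClustalW2 or BLAST: the sequences below are invented 20-nt strings, not the GenBank records listed above, and the function names are hypothetical.

```python
def percent_identity(a, b):
    """Percent of identical positions between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def closest_reference(query, references):
    """Return the name of the reference with the highest identity."""
    return max(references, key=lambda name: percent_identity(query, references[name]))


# Invented toy fragments standing in for aligned subtype references.
toy_references = {
    "1a": "GCGACACTCCACCATAGATC",
    "1b": "GCGACACTCCGCCATGAATC",
}
toy_query = "GCGACACTCCGCCATGAATT"

for name, ref in toy_references.items():
    print(name, percent_identity(toy_query, ref))
print("closest:", closest_reference(toy_query, toy_references))
```

Identity over a short fragment can tie or mislead when subtypes differ at only a few positions, which is one reason the study also relies on phylogenetic analysis.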

    Phylogenetic analysis and basic local alignment search

    Phylogenetic trees of the 5’UTR and NS5B regions were constructed with MEGA5 software using the neighbor-joining (NJ) and maximum parsimony (MP) methods. The sequence data were then aligned and tested for homology with existing sequences in GenBank. The robustness of the tree was evaluated by 1000 bootstrap replicates, and a group was considered significant when its bootstrap value was greater than 70%.

    Nucleotide sequence accession numbers

    The nucleic acid sequences reported in this study have been deposited at NCBI through GenBank. They can be retrieved under GenBank accession numbers GQ844690 to GQ844700 and GQ866987 to GQ867012.

    RESULTS AND DISCUSSION

    Many relatively rapid and simple typing methods for identifying the genotypes and subtypes of HCV have been described. These methods are based on the amplification of subgenomic regions of the virus from clinical specimens by reverse transcription-polymerase chain reaction (RT-PCR) followed by digestion with restriction enzymes, amplification with type-specific primers, or hybridization with type-specific probes. However, the role of bioinformatics tools in identifying HCV subtypes remains poorly described. In our institution, PCR-RFLP analysis of the nested RT-PCR amplified 5’UTR is generally used for the identification of HCV genotypes and subtypes. In this study, the simulation and testing of the PCR product were performed using Amplify 3 and AmplifX. The PCR product size was predicted to be 251 bp (Figure 1). This is consistent with the actual size of the amplicon following agarose gel electrophoresis and ethidium bromide staining (Figure 2). These results demonstrate that the size of the amplicon can be determined theoretically using bioinformatics tools such as Amplify 3 and AmplifX. Engels (1993) has previously shown that Amplify 3 may be used to design PCR experiments and predict the size of the amplicon. In addition, Amplify 3 is a freeware Macintosh program which can also be used as a tool for designing primers.

    NEBcutter is an on-line DNA sequence tool used to find large, non-overlapping, open reading frames and sites for all restriction enzymes from New England BioLabs (Vincze et al. 2003). The NEBcutter DNA restriction mapper was used in this study to determine the specific restriction enzyme that differentiates HCV-1a and 1b viruses. On this basis, BstUI was identified as the restriction enzyme. Of the 30 HCV-1 samples, 16 (53%) were identified as 1a, 13 (43%) as 1b, and 1 (3%) as mixed subtype (1a/1b) by PCR-RFLP analysis of the 5’UTR. HCV-1 subtypes can be identified from the electropherotypes produced when the amplified PCR products are digested with BstUI: HCV-1a viruses yield DNA fragments of 209 bp and 42 bp, whereas HCV-1b viruses yield fragments of 179 bp, 42 bp, and 30 bp. Subtypes 1a and 1b differ by a single nucleotide (A/G) at position -99 in the 5’UTR. The presence of a G residue at position -99 produces a sequence in nested RT-PCR products that is recognized by the enzyme BstUI (Martro et al. 2008).
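The fragment patterns described above can be reproduced with a short Python sketch of a restriction digest. BstUI recognizes CGCG and cuts between the two CG dinucleotides, leaving blunt ends. Everything else here is an assumption made for illustration: the 251-nt templates are synthetic (poly-A filler with sites placed to give the reported fragment sizes), the order of the fragments along the amplicon is assumed, and the helper names are hypothetical.

```python
def digest(seq, site="CGCG", cut_offset=2):
    """Return fragment lengths after cutting at every occurrence of `site`.

    cut_offset=2 models a blunt cut in the middle of CGCG (CG^CG).
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


def make_template(site_starts, length=251):
    """Build a synthetic poly-A template with CGCG placed at given indices."""
    bases = ["A"] * length
    for start in site_starts:
        bases[start:start + 4] = list("CGCG")
    return "".join(bases)


# Toy HCV-1b amplicon: two sites, cut after positions 179 and 209.
toy_1b = make_template([177, 207])
# Toy HCV-1a amplicon: a G-to-A change destroys the first site only,
# mimicking the effect of the A/G difference on BstUI recognition.
toy_1a = toy_1b[:180] + "A" + toy_1b[181:]

print("1a fragments:", digest(toy_1a))
print("1b fragments:", digest(toy_1b))
```

A single base change that removes one CGCG site is enough to merge the 179-bp and 30-bp fragments into one 209-bp fragment, which is the basis of the subtype call on the gel.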

    The electropherogram was viewed and edited using ChromasPro (Figure 3). Multiple sequence alignment was performed using ClustalW2. In this study, 5’UTR sequencing classified 15 samples (50%) as subtype 1a and the remaining 15 (50%) as subtype 1b. Measured against 5’UTR nucleotide sequencing, PCR-RFLP analysis correctly identified subtype 1a in 15/15 (100%) and subtype 1b in 13/15 (87%) of the samples. The predictive value of the 5’UTR PCR-RFLP analysis for subtype 1a was 15/16 (94%). For subtype 1b, the predictive value of 5’UTR PCR-RFLP analysis was 13/13 (100%). One sample (#743) that was typed as mixed subtype (1a/1b) by PCR-RFLP analysis was confirmed as 1b by 5’UTR sequencing. This result could be explained by an A/G polymorphism that may exist at nucleotide position -99. Overall, the 5’UTR PCR-RFLP analysis accurately subtyped 28 of 30 (93%) samples, missing 2 subtype 1b viruses. Thus, the 5’UTR-based PCR-RFLP typing method could not accurately discriminate 1a from 1b in 2 of 30 (7%) cases.
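The proportions reported above follow from a simple cross-tabulation of PCR-RFLP calls against 5’UTR sequencing. The Python sketch below reconstructs that table from the totals given in the text (16 RFLP 1a calls, 13 RFLP 1b calls, one mixed call, and 15 samples of each subtype by sequencing); the individual cells are an inference from those totals, and the function names are hypothetical.

```python
# Keys are (subtype by sequencing, PCR-RFLP call); values are sample counts.
# Cells are reconstructed from the totals reported in the text.
counts = {
    ("1a", "1a"): 15,
    ("1b", "1b"): 13,
    ("1b", "1a"): 1,
    ("1b", "mixed"): 1,
}


def agreement_with_sequencing(table, subtype):
    """Correct RFLP calls among samples that sequencing assigned to subtype."""
    correct = table.get((subtype, subtype), 0)
    total = sum(n for (seq, _), n in table.items() if seq == subtype)
    return correct, total


def predictive_value(table, subtype):
    """Correct RFLP calls among all samples that RFLP called as subtype."""
    correct = table.get((subtype, subtype), 0)
    called = sum(n for (_, rflp), n in table.items() if rflp == subtype)
    return correct, called


def overall_accuracy(table):
    """Samples where the RFLP call matches sequencing, over all samples."""
    correct = sum(n for (seq, rflp), n in table.items() if seq == rflp)
    return correct, sum(table.values())


def as_percent(pair):
    """Round a (numerator, denominator) pair to a whole percentage."""
    numerator, denominator = pair
    return round(100 * numerator / denominator)


for subtype in ("1a", "1b"):
    print(subtype, agreement_with_sequencing(counts, subtype), predictive_value(counts, subtype))
print("overall", overall_accuracy(counts))
```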

    Although the 5’UTR PCR-RFLP analysis provides an easy and rapid method for screening HCV-1 samples, and is a widely used tool in HCV genotyping and subtyping, sequence and phylogenetic analysis is the gold standard for determining HCV genotypes and subtypes. It is noteworthy that one sample (#743) was identified as a mixed subtype (1a/1b) by PCR-RFLP analysis. It was shown to be HCV-1b by 5’UTR sequence and phylogenetic analysis through bioinformatics tools such as ChromasPro, ClustalW2, MEGA, and BLAST search. Thus, bioinformatics can resolve ambiguous identities. The standard and most definitive method for subtype determination is direct sequencing, which has a lower reagent cost but requires more time than commercial assay kits. In this study, sequence alignment and analysis were done using ChromasPro, ClustalW2, and MEGA5. The ChromasPro software includes most of the functionality of Chromas, such as assembly of overlapping sequences into a consensus and automatic display of ambiguities for editing. ClustalW2, on the other hand, is a multiple sequence alignment program for DNA or proteins. It calculates the best match for the selected sequences and lines them up so that the identities can be seen. ClustalW2 is currently maintained at the Conway Institute, UCD Dublin. MEGA5 is user-friendly software for building sequence alignments that also provides statistical analyses of DNA or protein sequence data (Larkin et al. 2007; Goujon et al. 2010; Tamura et al. 2011).

    Figure 1. Using the software Amplify 3 and AmplifX, the simulated 5’UTR PCR product is 251-bp.

    The GenBank database is an annotated collection of all publicly available DNA sequences produced at NCBI as part of an international collaboration with the European Molecular Biology Laboratory (EMBL) and the DNA Data Bank of Japan (DDBJ). GenBank receives nucleotide sequences produced in many laboratories throughout the world from more than 100,000 distinct organisms including viruses. Each database has its own set of submission and retrieval tools, but the three major public DNA databases exchange information daily so that all three databases should contain the same set of DNA sequences (Mizrachi 2002; Teufel et al. 2006). In order to identify HCV-1 subtypes, the DNA sequences were compared with reference sequences from NCBI using the BLAST program (http://www.ncbi.nlm.nih.gov/BLAST).

    Figure 2. Representative gels of digested PCR products. [Gel image not reproduced. Lanes: digested, undigested, and marker (M) for HCV-1a and HCV-1b. Labeled bands: 251 bp (undigested); 209 bp and 42 bp (HCV-1a, digested); 179 bp, 42 bp, and 30 bp (HCV-1b, digested).]

    Figure 3. Representative electropherogram and nucleotide sequence of partial 5’UTR.

    The HCV sequences reported in this study may provide the essential tools for studies on molecular virology, pathogenesis of hepatitis C, drug design, and vaccine development. Because of the diversity of HCV variants, when designing a vaccine, we must consider specific genotypes and subtypes in various geographical regions in order to achieve broad protection.

    Phylogenetic trees are calculated using statistical models to infer evolutionary relationships between organisms. Several methods have been described for phylogenetic analysis such as neighbor-joining (NJ), minimum evolution (ME), maximum likelihood (ML), maximum parsimony (MP), and Bayesian approaches (Procter et al. 2010). In this study, phylogenetic analysis was carried out using MEGA5. This software has evolved to include the creation and exploration of sequence alignments, the estimation of sequence divergence, and the construction and visualization of phylogenetic trees (Tamura et al. 2007; Tamura et al. 2011). MEGA includes various tests for examining the reliability of the tree, such as the bootstrap test and the standard error test. In the bootstrap test, the same number of sites is randomly sampled with replacement from the original sequences, and a tree is constructed from the resampled data. This process is repeated, and the reliability of a sequence cluster is evaluated by the relative frequency with which it appears in the bootstrap replications. In the standard error test, the branch lengths of the tree are re-estimated by using the ordinary least-squares method, and the standard errors of these estimates are computed (Kumar et al. 1994; Kumar et al. 2001). Phylogenetic analysis of the partial 5’UTR showed that not all of the subtype 1a viruses clustered together and not all of the subtype 1b viruses grouped together (Figure 4A). This is probably because only a partial sequence was subjected to phylogenetic analysis. It has been suggested that the discriminatory power of the phylogenetic tree analysis depends on the length of the fragment analyzed. Thus, full-length analysis of the 5’UTR is recommended to verify these findings. Additionally, subtyping based on the use of variable genomic regions such as the core, E1, and NS5B is recommended to confirm these results (Figure 5).
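The bootstrap resampling step described above can be sketched in Python. This is a deliberately simplified illustration, not MEGA5's procedure: instead of rebuilding an NJ or MP tree for each replicate, it counts how often two sequences remain closer to each other than to any other sequence by uncorrected p-distance. The toy alignment and all function names are hypothetical.

```python
import random


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def bootstrap_columns(alignment, rng):
    """Resample alignment columns with replacement, keeping the same length."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


def pair_support(alignment, a, b, replicates=1000, seed=0):
    """Fraction of replicates in which a and b are mutually closest.

    This pairwise criterion stands in for "the cluster appears in the
    tree built from the resampled data"; it is an assumption made to
    keep the sketch short.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        rep = bootstrap_columns(alignment, rng)
        d_ab = p_distance(rep[a], rep[b])
        others = [name for name in rep if name not in (a, b)]
        if all(
            d_ab < p_distance(rep[a], rep[o]) and d_ab < p_distance(rep[b], rep[o])
            for o in others
        ):
            hits += 1
    return hits / replicates


# Invented toy alignment: s1 and s2 differ at one site, s3 and s4 are distant.
toy_alignment = {
    "s1": "A" * 20,
    "s2": "A" * 19 + "C",
    "s3": "G" * 20,
    "s4": "G" * 19 + "T",
}

support = pair_support(toy_alignment, "s1", "s2")
print(f"bootstrap support for (s1, s2): {support:.2f}")
```

Under the 70% threshold used in this study, a grouping would be reported only if its support across the replicates exceeded 0.70.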
    Phylogenetic analysis of the NS5B region showed that all of the subtype 1b viruses grouped together and all of the subtype 1a viruses clustered together, supported by a bootstrap value of 99% (Figure 4B). One way to validate the results is to run the data in different software packages and evaluate their robustness using various methods such as maximum likelihood, maximum parsimony, and Bayesian inference for comparison. Other software that can be used to construct phylogenetic trees includes the phylogeny inference package (PHYLIP) and phylogenetic analysis using parsimony (PAUP) (Felsenstein 1989; Swofford 2003; Kumar et al. 2008).

    Figure 4A. Phylogenetic analysis of partial 5’UTR sequences of 30 HCV-1 samples. HCV reference sequences from GenBank/EMBL/DDBJ were included. The evolutionary history was inferred using neighbor-joining (NJ) method. The numbers at the nodes represent the percent bootstrap support for 1000 replicates. Only values over 70% are shown. Bar at the base of the tree shows the genetic divergence. Phylogenetic analysis was conducted in MEGA5. [Tree image not reproduced. Tips are the study samples, labeled by sample number and subtype (1a or 1b), together with the reference sequences M62321 (1a), D90208 (1b), D14853 (1c), AF238485 (2a), D10988 (2b), D17763 (3a), Y11604 (4a), Y13184 (5a), and Y12083 (6a); one bootstrap value (98) is shown; scale bar: 0.005.]

    Recombination is a cause of genetic diversity and has important implications for the pathogenesis, diagnosis, and treatment of HCV infection. Natural intergenotypic (2b/1b) and intratypic (1b/1a) recombinants have been reported in some families of RNA viruses, including HCV (Colina et al. 2004; Kageyama et al. 2006; Lee et al. 2010; Mes et al. 2010). In this study, phylogenetic tree analysis did not show any evidence for recombinant forms. This result was not surprising because HCV recombination occurs rarely. However, further analysis of the full-length genome is needed to confirm that no recombination event has occurred.

    In the practical sense, bioinformatics tools can be used to design primers in order to amplify a target DNA. They can also be used to identify genotypes and subtypes for epidemiologic studies, as well as for molecular modeling of drug targets. In addition, they can be applied to identifying biomarkers in various disease processes, as well as to genome-wide association studies and gene expression profiling.

    Figure 4B. Phylogenetic analysis of partial NS5B sequences of 30 HCV-1 samples. HCV reference sequences from GenBank/EMBL/DDBJ were included. The evolutionary history was inferred using maximum parsimony (MP) method. The numbers at the nodes represent the percent bootstrap support for 1000 replicates. Only values over 70% are shown. Bar at the base of the tree shows the genetic divergence. Phylogenetic analysis was conducted in MEGA5. [Tree image not reproduced. Tips are the study samples, labeled by sample number and subtype (1a or 1b), together with the reference sequences M62321 (1a), D90208 (1b), D14853 (1c), AF238485 (2a), D10988 (2b), D17763 (3a), Y11604 (4a), Y13184 (5a), and Y12083 (6a); two bootstrap values (99 and 99) are shown; scale bar: 5.]

    CONCLUSION

    Bioinformatics tools play an important role in identifying HCV subtypes by predicting the size of the amplicon; determining the specific restriction enzyme to cut a given DNA sequence; viewing and editing the electropherogram; aligning nucleotide sequences with prototypes; searching for identical sequences; and understanding the evolution and relationships of nucleic acid sequences. Overall, this information can be utilized to generate molecular diagnostic tests in the future.

    ACKNOWLEDGEMENTS

    This work was supported by a research grant from St. Luke’s Medical Center through the Research and Biotechnology Division.

    REFERENCES

    BACLIG MO, CHAN VF, RAMOS JDA, GOPEZ-CERVANTES J, NATIVIDAD FF. 2010. Correlation of the 5’untranslated region (5’UTR) and non-structural 5B (NS5B) nucleotide sequences in hepatitis C virus subtyping. Int J Mol Epidemiol Genet 1:236-244.

    BRACHO M, CARILLO-CRUZ F, ORTEGA E, MOYA A, GONZALES-CANDELAS F. 2006. A new subtype of hepatitis C virus genotype 1: Complete genome and phylogenetic relationships of an Equatorial Guinea isolate. J Gen Virol 87:1697-1702.

    BRACHO MA, SALUDES V, MARTRO E, BARGALLO A, GONZALEZ-CANDELAS F, AUSINA V. 2008. Complete genome of a European hepatitis C virus subtype 1g isolate: Phylogenetic and genetic analyses. Virol J 5:72.

    BUOROA S, PIZZIGHELLAB S, BOSCHETTOA R, PELLIZZARIA L, CUSANA M, BONAGUROA R, MENGOLIA C, CAUDAIC C, PADULAC M, EGISTO P, VALENSINC P, PALUA G. 1999. Typing of hepatitis C virus by a new method based on restriction fragment length polymorphism. Intervirol 42:1-8.

    Figure 5. The HCV genome consists of a single open reading frame and two untranslated regions. It encodes a polyprotein of approximately 3011 amino acids. Adapted from Lindenbach and Rice 2005 (Nature 436:933-938).

    CHAN S, MCOMISH F, HOLMES E, DOW B, PEUTHERER J, FOLLETT E, YAP P, SIMMONDS P. 1992. Analysis of a new hepatitis C virus type and its phylogenetic relationship to existing variants. J Gen Virol 73:1131-1141.

    CHANDRA M, THIPPAVUZZULA R, RAMACHANDRA RAO VV, HABIB AM, HABIBULLAH CM, NARASU L, PRAMEELA Y, KHAJA MN. 2007. Genotyping of hepatitis C virus in infected patients from South India. Infect Genet Evol 7:724-730.

    CHEN Z, WECK K. 2002. Hepatitis C virus genotyping: Interrogation of the 5’untranslated region cannot accurately distinguish genotypes 1a and 1b. J Clin Microbiol 40:3127-3134.

    CHEVALIEZ S, BOUVIER-ALIAS M, BRILLET R, PAWLOTSKY JM. 2009. Hepatitis C virus genotype 1 subtype identification in new HCV drug development and future clinical practice. PLoS ONE 4:1-9.

    COLINA R, CASANE D, VASQUEZ S, GARCIA-AGUIRRE L, CHUNGA A, ROMERO H, KHAN B, CRISTINA J. 2004. Evidence of intratypic recombination in natural populations of hepatitis C virus. J Gen Virol 85:31-37.

    DAVIDSON F, SIMMONDS P, FERGUSON J, JARVIS L, DOW B, FOLLET E, SEED C, KRUSIUS T, LIN C, MEDGYESI G, KIYOKAWA H, OLIM G, DURAISAMY G, CUYPERS T, SAEED A, TEO D, CONRADIE J, KEW M, LIN M, NUCHAPRAYOON C, NDIMBIE O, YAP P. 1995. Survey of major genotypes and subtypes of hepatitis C virus using RFLP of sequences amplified from the 5’non-coding region. J Gen Virol 76:1197-1204.

    ENGELS W. 1993. Contributing software to the internet: The amplify program. TIBS 18.

    FELSENSTEIN J. 1989. PHYLIP. Phylogeny inference package (version 3.2). Cladistics 5:164-166.


    GOUJON M, MCWILLIAM H, LI W, VALENTIN F, SQUIZZATO S, PAERN J, LOPEZ R. 2010. A new bioinformatics analysis tools framework at EMBL-EBI. Nucleic Acids Res 38:S695-699.

    KAGEYAMA S, AGDAMAG D, ALESNA E, LEANO P, HEREDIA A, TAC-AN A, JEREZA L, TANIMOTO T, YAMAMURA J, ICHIMURA H. 2006. A natural intergenotypic (2b/1b) recombinant of hepatitis C virus in the Philippines. J Med Virol 78:1423-1428.

    KOLETZKI D, DUMONT S, VERMEIREN H, FEVERY B, DE SMET P, STUYVER LJ. 2010. Development and evaluation of an automated hepatitis C virus NS5B sequence-based subtyping assay. Clin Chem Lab Med 48:1095-1102.

    KUMAR S, TAMURA K, NEI M. 1994. MEGA: Molecular evolutionary genetics analysis software for microcomputers. Comput Appl Biosci 10:189-191.

    KUMAR S, TAMURA K, JAKOBSEN IB, NEI M. 2001. MEGA2: Molecular evolutionary genetics analysis software. Bioinformatics 17:1244-1245.

    KUMAR S, NEI M, DUDLEY J, TAMURA K. 2008. MEGA: A biologist-centric software for evolutionary analysis of DNA and protein sequences. Brief Bioinform 9:299-306.

    LARKIN MA, BLACKSHIELDS G, BROWN NP, CHENNA R, MCGETTIGAN PA, MCWILLIAM H, VALENTIN F, WALLACE IM, WILM A, LOPEZ R, THOMPSON JD, GIBSON TJ, HIGGINS DG. 2007. ClustalW and ClustalX version 2. Bioinformatics 23: 2947-2948.

    LINDENBACH B, RICE C. 2005. Unraveling hepatitis C virus replication from genome to function. Nature 436:933-938.

    LEE Y, LIN H, CHEN Y, LEE C, WANG S, CHANG J, CHEN T, LIU H, CHEN Y. 2010. Molecular epidemiology of HCV genotypes among injection drug users in Taiwan: Full-length sequences of two new subtype 6w strains and a recombinant form 2b6w. J Med Virol 82:57-68.

    MARAMAG F, RIVERA M, PREDICALA R, BACLIG M, MATIAS R, CERVANTES J. 2006. Hepatitis C genotypes among Filipinos. Phil J Gastroenterol 2:30-32.

    MARTRO E, GONZALES V, BUCKTON A, SALUDES V, FERNANDEZ G, MATAS L, PLANAS R, AUSINA V. 2008. Evaluation of a new assay for hepatitis C virus genotyping targeting both 5’NC and NS5B genomic regions in comparison with reverse hybridization and sequencing methods. J Clin Microbiol 46:192-197.

    MARTRO E, VALERO A, JORDANA-LLUCH E, SALUDES V, PLANAS R, GONZALEZ-CANDELAS F, AUSINA V, BRACHO MA. 2011. Hepatitis C virus sequences from different patients confirm the existence and transmissibility of subtype 2q, a rare subtype circulating in the metropolitan area of Barcelona, Spain. J Med Virol 83:820-826.

    MES TH, VAN DOORNUM GJ. 2010. Recombination in hepatitis C virus genotype 1 evaluated by phylogenetic and population-genetic methods. J Gen Virol 92:279-286.

    MIZRACHI I. 2002. GenBank: The nucleotide sequence database. The NCBI handbook 1-15.

    MORA MV, ROMANO CM, GOMES-GOUVEA MS, GUTIERREZ MF, CARRILHO FJ, PINHO JR. 2010. Molecular characterization, distribution, and dynamics of hepatitis C virus genotypes in blood donors in Colombia. J Med Virol 82:1889-1898.

    PANDURO A, ROMAN S, KHAN A, TANAKA Y, KURBANOV F, LOPEZ E, CAMPOLLO O, NAZARA Z, MIZOKAMI M. 2010. Molecular epidemiology of hepatitis C virus genotypes in West Mexico. Virus Res 151:19-25.

    PICKETT B, STRIKER R, LEFKOWITZ E. 2011. Evidence for separation of HCV subtype 1a into two distinct clades. J Viral Hepat 18:608-618.

    PROCTER JB, THOMPSON J, LETUNIC I, CREEVEY C, JOSSINET F, BARTON GJ. 2010. Visualization of multiple alignments, phylogenies and gene family evolution. Nature Methods 7:S16-25.

    QIU P, CAI XY, DING W, ZHANG Q, NORRIS ED, GREENE JR. 2009. HCV genotyping using statistical classification approach. J Biomed Sci 16:62.

    ROSS RS, VERBEECK J, VIAZOV S, LEMEY P, VAN RANST M, ROGGENDORF M. 2008. Evidence for a complex mosaic genome pattern in a full-length hepatitis C virus sequence. Evolutionary Bioinformatics 4:249-254.

    SWOFFORD D. 2003. PAUP. Phylogenetic analysis using parsimony (version 4). Sunderland, Massachusetts: Sinauer Associates.

    TAMURA K, DUDLEY J, NEI M, KUMAR S. 2007. MEGA4: Molecular evolutionary genetics analysis (MEGA) software version 4.0. Mol Biol Evol 24:1596-1599.

    TAMURA K, PETERSON D, PETERSON N, STECHER G, NEI M, KUMAR S. 2011. MEGA5: Molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance and maximum parsimony methods. Mol Biol Evol 28:2731-2739.

    TEUFEL A, KRUPP M, WEINMANN A, GALLE PR. 2006. Current bioinformatics tools in genomic biomedical research. Int J Mol Med 17:967-973.

    UTAMA A, TANIA NP, DHENNI R, GANI RA, HASAN I, SANITYOSO A, LELOSUTAN S, MARTAMALA R, LESMANA LA, SULAIMAN A, TAI S. 2010. Genotype diversity of hepatitis C virus in HCV associated liver disease patients in Indonesia. Liver Int 30:1152-1160.

    VERBEECK J, STANLEY M, SHIEH J, CELIS L, HUYCK E, WOLLANTS E, MORIMOTO J, FARRIOR A, SABLON E, JANKOWSKI-HENNIG M, SCHAPER C, JOHNSON P, VAN RANST M, VAN BRUSSEL M. 2008. Evaluation of Versant HCV genotype assay (LiPA) 2.0. J Clin Microbiol 46:1901-1906.

    VERMA V, CHAKRAVARTI A. 2008. Comparison of 5’ noncoding-core with 5’ noncoding regions of HCV by RT-PCR: Importance and clinical implications. Curr Microbiol 57:206-211.

    VINCZE T, POSFAI J, ROBERTS RJ. 2003. NEBcutter: A program to cleave DNA with restriction enzymes. Nucleic Acids Res 31:3688-3691.

    ZEIN N. 2000. Clinical significance of hepatitis C virus genotypes. Clin Microbiol Rev 13:223-235.

    ZHENG X, PANG M, CHAN A, ROBERTO A, WARNER D, YEN-LIEBERMAN B. 2003. Direct comparison of hepatitis C virus genotypes tested by INNO-LiPA HCV II and TruGene HCV genotyping methods. J Clin Virol 28:214-216.