On line (DNA and amino acid) Sequence Information Lecture 9
TRANSCRIPT
![Page 1: On line (DNA and amino acid) Sequence Information Lecture 9](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649ddd5503460f94ad5a4c/html5/thumbnails/1.jpg)
On line (DNA and amino acid) Sequence Information
Lecture 9
![Page 2: On line (DNA and amino acid) Sequence Information Lecture 9](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649ddd5503460f94ad5a4c/html5/thumbnails/2.jpg)
Introduction
• Annotation of genes
• Basic bioinformatics databases
• NCBI home page
• Query and return results
• DNA sequence results page
• Protein sequence results page
![Page 3: On line (DNA and amino acid) Sequence Information Lecture 9](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649ddd5503460f94ad5a4c/html5/thumbnails/3.jpg)
Bioinformatics Databases
• Biological data generated by various labs is submitted to and stored in specific databases.
• The data consist of nucleotide (DNA and mRNA/cDNA) and protein sequences.
• The main “primary” nucleotide sequence databases are:
– United States: GenBank (NCBI)
– Europe: Nucleotide Sequence Database (EMBL)
– Japan: DNA Data Bank of Japan (DDBJ)
• These databases also contain related sequences, such as:
– Expressed sequence tags (ESTs): short (up to about 800 bp) sequences derived from mRNA, which can be used to see which genes are expressed…
![Page 4: On line (DNA and amino acid) Sequence Information Lecture 9](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649ddd5503460f94ad5a4c/html5/thumbnails/4.jpg)
Protein Databases
• The main protein database is:
• UniProt (Universal Protein Resource)
• The UniProt Knowledgebase (UniProtKB) contains data from:
– Swiss-Prot (most up-to-date information)
– TrEMBL (translations of coding sequences)
– PIR database
• Both the nucleotide and protein databases contain much more detail than the sequences alone; this detail is referred to as annotation.
![Page 5: On line (DNA and amino acid) Sequence Information Lecture 9](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649ddd5503460f94ad5a4c/html5/thumbnails/5.jpg)
Annotation of sequences
• Once the gene sequences have been determined, the data must be annotated (Klug 2010):
– Identify regulatory regions
– Identify other sequences of interest: exons/introns, coding sequences (CDS), polyA signal
– Protein annotation includes the corresponding mRNA sequences
– Other organisms in which the DNA/amino acid sequence is found
– Journals/references showing where the data came from
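To make the idea of annotation concrete, the sketch below reads a small feature table written in the GenBank flat-file style and lists the type and location of each feature. The record is invented for illustration (it is not a real database entry), and the parser is a simplified assumption that only handles plain `start..end` locations, not joined or complemented ones.

```python
import re

# Hypothetical feature table in GenBank flat-file style (not a real record).
SAMPLE_FEATURES = """\
     source          1..1200
     gene            1..1200
     CDS             120..980
     polyA_signal    1150..1155
"""

# One feature per line: indentation, feature key, then "start..end".
FEATURE_LINE = re.compile(r"^\s+(\S+)\s+(\d+)\.\.(\d+)\s*$")


def parse_features(text):
    """Return a list of (feature_key, start, end) tuples, 1-based inclusive."""
    features = []
    for line in text.splitlines():
        match = FEATURE_LINE.match(line)
        if match:
            key, start, end = match.groups()
            features.append((key, int(start), int(end)))
    return features


def features_of_type(features, key):
    """Return the (start, end) locations of every feature with the given key."""
    return [(start, end) for k, start, end in features if k == key]


if __name__ == "__main__":
    for key, start, end in parse_features(SAMPLE_FEATURES):
        print(f"{key:<14}{start:>6}{end:>6}  length={end - start + 1}")
```

Real records carry many more qualifiers (gene names, protein translations, literature references), so a dedicated parser would normally be used instead of a hand-written one.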
![Page 6: On line (DNA and amino acid) Sequence Information Lecture 9](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649ddd5503460f94ad5a4c/html5/thumbnails/6.jpg)
Bioinformatics Database
• Bioinformatics databases contain information on various types of biological data.
• To facilitate finding information, there are a number of specific search engines:
– NCBI has Entrez
– EMBL has SRS
• Consider the following query:
– What are the DNA and amino acid sequences for the following gene: human BTEB?
– More detail on the terms can be found by looking at a sample record: http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord
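The same kind of query can also be scripted through NCBI's E-utilities, the programmatic interface to Entrez. The sketch below only builds the request URLs for the `esearch` and `efetch` endpoints; the query string, the helper names and the choice of field tags are assumptions made for illustration, and no network request is sent unless `fetch_text` is called explicitly.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

# Assumed Entrez query for the human BTEB gene (field tags chosen for illustration).
QUERY = "BTEB[Gene Name] AND Homo sapiens[Organism]"


def build_esearch_url(db, term, retmax=5):
    """URL that searches an Entrez database and returns matching record IDs."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def build_efetch_url(db, ids, rettype="fasta"):
    """URL that downloads the records with the given IDs (FASTA text by default)."""
    params = {"db": db, "id": ",".join(ids), "rettype": rettype, "retmode": "text"}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


def fetch_text(url):
    """Send the request and return the response body (requires network access)."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # DNA and amino acid sequences live in separate Entrez databases.
    print(build_esearch_url("nucleotide", QUERY))
    print(build_esearch_url("protein", QUERY))
```

The IDs returned by a search would then be passed to `build_efetch_url`, with `rettype="gb"` if the annotated GenBank record is wanted rather than the bare FASTA sequence.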
![Page 7: On line (DNA and amino acid) Sequence Information Lecture 9](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649ddd5503460f94ad5a4c/html5/thumbnails/7.jpg)
NCBI Entrez search page
![Page 9: On line (DNA and amino acid) Sequence Information Lecture 9](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649ddd5503460f94ad5a4c/html5/thumbnails/9.jpg)
Coding section of gene
The exon-intron structure is also available in graphic form.
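The coding section links the DNA record to the protein record: the CDS coordinates pick out the part of the nucleotide sequence that is translated. A minimal sketch, assuming an invented sequence and invented coordinates, a CDS with no introns, and the standard genetic code:

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical gene: lower case is untranslated, upper case is the CDS.
GENE = "ccgagATGGCCAAGTGCTAAtttaaa"


def extract_cds(dna, start, end):
    """Slice out a CDS given 1-based inclusive coordinates (GenBank style)."""
    return dna[start - 1:end].upper()


def translate(cds):
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")  # X marks an unknown codon
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    cds = extract_cds(GENE, 6, 20)
    print(cds)             # ATGGCCAAGTGCTAA
    print(translate(cds))  # MAKC
```

For a gene with introns, the exons would first have to be joined in order before translation.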
![Page 11: On line (DNA and amino acid) Sequence Information Lecture 9](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649ddd5503460f94ad5a4c/html5/thumbnails/11.jpg)
Other databases
• The nucleotide (GenBank and EMBL) and protein (UniProt) databases contain the “raw data” and are referred to as primary databases.
• More specific databases derive their data from these and are referred to as secondary databases; examples include protein family and sequence similarity databases such as PROSITE and PRINTS.
• There are also databases that contain information about specific organisms, such as E. coli; these can be found using the Genomes OnLine Database (GOLD).
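As an illustration of how a secondary database builds on primary sequence data, PROSITE describes many protein families by short sequence patterns. The sketch below converts a pattern written in PROSITE syntax into a regular expression and scans a protein sequence with it. The pattern is modelled on C2H2 zinc finger motifs and the protein is invented; treat both as assumptions rather than quoted database entries. The converter covers only the basic syntax (`x`, `[...]`, `{...}`, repeat counts and the `<`/`>` anchors).

```python
import re


def prosite_to_regex(pattern):
    """Convert a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Optional anchors: "<" pins the N-terminus, ">" the C-terminus.
        prefix = "^" if element.startswith("<") else ""
        suffix = "$" if element.endswith(">") else ""
        element = element.lstrip("<").rstrip(">")
        # Split e.g. "x(2,4)" into the residue part "x" and the repeat counts.
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        residue, low, high = match.groups()
        if residue == "x":
            residue = "."                         # any amino acid
        elif residue.startswith("{"):
            residue = "[^" + residue[1:-1] + "]"  # any amino acid except these
        repeat = ""
        if low:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + residue + repeat + suffix)
    return "".join(parts)


# Illustrative pattern in PROSITE syntax (assumption, see text above).
EXAMPLE_PATTERN = "C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H"

# Hypothetical protein built so that it contains one match.
EXAMPLE_PROTEIN = "MK" + "C" + "AA" + "C" + "AAA" + "L" + "A" * 8 + "H" + "AAA" + "H" + "GG"

if __name__ == "__main__":
    regex = prosite_to_regex(EXAMPLE_PATTERN)
    print(regex)
    hit = re.search(regex, EXAMPLE_PROTEIN)
    print(hit.group(0) if hit else "no match")
```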
![Page 12: On line (DNA and amino acid) Sequence Information Lecture 9](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649ddd5503460f94ad5a4c/html5/thumbnails/12.jpg)
Other databases
• There are databases for specific types of sequences, such as those associated with promoters and other regulatory elements.
• Others include structural databases such as the Protein Data Bank (PDB).
• Online Mendelian Inheritance in Man (OMIM) contains information on human genes and genetic disorders.
![Page 13: On line (DNA and amino acid) Sequence Information Lecture 9](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649ddd5503460f94ad5a4c/html5/thumbnails/13.jpg)
Bioinformatics Search Engines
• The Entrez (NCBI) search engine retrieves information from NCBI databases and can be used to obtain other information, including publications (PubMed), 3D protein structures, Online Mendelian Inheritance in Man (OMIM)… A tutorial can be found at:
– Entrez: Making use of its power:
• The EMBL uses the ExPASy site, which utilises the open source application Sequence Retrieval System (SRS). A tutorial can be found at:
– SRS tutorial: quick tour
![Page 14: On line (DNA and amino acid) Sequence Information Lecture 9](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649ddd5503460f94ad5a4c/html5/thumbnails/14.jpg)
Other important information sources
• PubMed: literature research: journal articles/conference proceedings/books etc.
– Search under many fields: keyword, author…
– Returns: journal articles/abstracts
– Two types: general/review
• NCBI account: set up an NCBI account to manage previous searches…
• BTEB PubMed search found at:
– http://www.ncbi.nlm.nih.gov/pubmed?term=BTEB&cmd=DetailsSearch
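The PubMed link above is a base address followed by two query parameters, `term` and `cmd`. A short sketch of how such a link can be assembled for any search term (the helper name is hypothetical, and the parameter names are taken from the link itself):

```python
from urllib.parse import urlencode

PUBMED_BASE = "http://www.ncbi.nlm.nih.gov/pubmed"


def pubmed_search_url(term, cmd="DetailsSearch"):
    """Build a PubMed search link; urlencode escapes spaces and symbols."""
    return f"{PUBMED_BASE}?{urlencode({'term': term, 'cmd': cmd})}"


if __name__ == "__main__":
    print(pubmed_search_url("BTEB"))
    # http://www.ncbi.nlm.nih.gov/pubmed?term=BTEB&cmd=DetailsSearch
```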
![Page 15: On line (DNA and amino acid) Sequence Information Lecture 9](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649ddd5503460f94ad5a4c/html5/thumbnails/15.jpg)
BTEB PubMed search result
![Page 16: On line (DNA and amino acid) Sequence Information Lecture 9](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649ddd5503460f94ad5a4c/html5/thumbnails/16.jpg)