gene and protein

Upload: junior

Post on 10-Apr-2018


  • 8/8/2019 Gene and Protein

    1/37

    Gene, Proteins, and Genetic Code

  • 8/8/2019 Gene and Protein

    2/37

    Protein Synthesis in a Cell

  • 8/8/2019 Gene and Protein

    3/37

    Protein and Amino Acids

  • 8/8/2019 Gene and Protein

    4/37

    Protein

  • 8/8/2019 Gene and Protein

    5/37

    Protein

    GOT (E. coli)

  • 8/8/2019 Gene and Protein

    6/37

    A protein sequence

    >gi|7228451|dbj|BAA92411.1| EST AU055734(S20025) corresponds to a region

    MCSYIRYDTPKLFTHVTKTPPKNQVSNSINDVGSRRATDRSVASCSSEKSVGTMSVKNASSISFEDIEKSISNWKIPKVN

    IKEIYHVDTDIHKVLTLNLQTSGYELELGSENISVTYRVYYKAMTTLAPCAKHYTPKGLTTLLQTNPNNRCTTPKTLKWD

    EITLPEKWVLSQAVEPKSMDQSEVESLIETPDGDVEITFASKQKAFLQSRPSVSLDSRPRTKPQNVVYATYEDNSDEPSI

    SDFDINVIELDVGFVIAIEEDEFEIDKDLLKKELRLQKNRPKMKRYFERVDEPFRLKIRELWHKEMREQRKNIFFFDWYE

    SSQVRHFEEFFKGKNMMKKEQKSEAEDLTVIKKVSTEWETTSGNKSSSSQSVSPMFVPTIDPNIKLGKQKAFGPAISEEL

    VSELALKLNNLKVNKNINEISDNEKYDMVNKIFKPSTLTSTTRNYYPRPTYADLQFEEMPQIQNMTYYNGKEIVEWNLDG

    FTEYQIFTLCHQMIMYANACIANGNKEREAANMIVIGFSGQLKGWWNNYLNETQRQEILCAVKRDDQGRPLPDRDGNGNP

    TELKEGFHMEEKDEPIQEDDQVVGTIQKYTKQKWYAEVMYRFIDGSYFQHITLIDSGADVNCIREDEILDQLVQTKREQV

    VNSIYLHDNSFPKSMDLPDQKITEKRAKLQDIPHHEERLLDYREKKSRDGQDKLPMEVEQSMATNKNTKILLRAWLLST

    A protein sequence may contain from a few hundred to several thousand amino acids.

  • 8/8/2019 Gene and Protein

    7/37

    Protein synthesis

  • 8/8/2019 Gene and Protein

    8/37

    Genetic code

    ... ATT CAC AGT GGA ...
    ...  I   H   S   G  ...
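    The codon-to-amino-acid mapping on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard genetic code and a DNA coding strand read 5' to 3'; the names CODON_TABLE and translate are made up for this example.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order
# for the first, second and third base ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string codon by codon, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


print(translate("ATTCACAGTGGA"))  # IHSG
```

    The example input is the twelve-base fragment shown on the slide, which yields the same four residues (I, H, S, G).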

  • 8/8/2019 Gene and Protein

    9/37

    Notes on translation

    Three reading frames per strand

    The third base of a codon is often not important

    Read in the 5' -> 3' direction

    Start and stop codons

    Open Reading Frame (ORF)

    Each gene is an ORF, but not all ORFs are genes.
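    A rough sketch of how ORFs can be located by scanning the three reading frames of one strand is given below. It assumes ATG as the only start codon and TAA, TAG, TGA as stop codons; find_orfs and min_codons are hypothetical names chosen for this example, and the reverse strand (three more frames) is not scanned.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (frame, start, end) for each ATG...stop ORF on the given strand.

    start is the index of the A in ATG; end is one past the stop codon.
    min_codons counts every codon of the ORF, including start and stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG after the previous stop opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None
    return orfs


print(find_orfs("CCATGAAATGGTGACC"))  # [(2, 2, 14)]
```

    Raising min_codons filters out short ORFs that are unlikely to be genes, at the cost of missing genuinely short genes.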

  • 8/8/2019 Gene and Protein

    10/37

    The Central Dogma of Molecular Biology

    DNA --transcription--> RNA --translation--> Protein

    DNA --replication--> DNA

    genotype --> phenotype

  • 8/8/2019 Gene and Protein

    11/37

    Exception: retroviruses

    RNA --reverse transcription--> DNA --transcription--> RNA --translation--> Protein

    DNA --replication--> DNA

    genotype --> phenotype

  • 8/8/2019 Gene and Protein

    12/37

    [Diagram: DNA (Genotype) -> Protein -> Phenotype; labeled "Biology"]

  • 8/8/2019 Gene and Protein

    13/37

    Genes

    One gene encodes one protein (or sometimes an RNA).

    Like a program, a gene starts with a start codon (e.g. ATG); then each group of three bases codes for one amino acid. A stop codon (e.g. TGA) then signals the end of the gene.

    Genes are dense in prokaryotes and sparse in eukaryotes.

    In the middle of a eukaryotic gene there are introns, which are spliced out (as junk) after transcription. The parts that are kept are called exons. Telling them apart is the task of gene finding.
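    To make the intron/exon idea concrete, here is a small sketch that joins exons once their coordinates are known. The toy sequences and the splice helper are invented for illustration; finding the coordinates in the first place is the hard part.

```python
def splice(pre_mrna, exons):
    """Join exon segments given as (start, end) half-open index pairs."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Toy gene: two exons separated by one intron (all sequences are made up).
exon1 = "ATGGCC"
intron = "GTAAGT" + "TTTTTTT" + "CAG"
exon2 = "GAATGA"
gene = exon1 + intron + exon2

exon_coords = [
    (0, len(exon1)),
    (len(exon1) + len(intron), len(gene)),
]
mature = splice(gene, exon_coords)
print(mature)  # ATGGCCGAATGA
```

    The spliced product starts with ATG and ends with TGA, so it reads as one uninterrupted ORF even though the genomic sequence does not.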

  • 8/8/2019 Gene and Protein

    14/37

    Gene-related diseases

    Hemophilia: gene on the X chromosome.

    Sickle-cell anemia: a single-nucleotide mutation in the first exon of the beta-globin gene (it removes a restriction enzyme cutting site). 1 in 12 African Americans are carriers; homozygotes have the disease.

    BRCA1 gene (chr. 17q): responsible for inherited breast cancer (10% of breast cancer cases).

    Fragile X syndrome (intellectual disability): 1 in 1250 males, 1 in 2500 females (dominant, but females have a partially expressed good copy of the gene). FMR-1 gene: more than 200 trinucleotide repeats cause the disease.

    P53 gene: chr. 17p, encodes a tumor suppressor protein.

  • 8/8/2019 Gene and Protein

    15/37

    Genetic Test

    Example: http://www.myriad.com/index.php

    Pros and cons:

    Can possibly help avoid the disease or diagnose it early.

    Can make you unhappier.

    Can help insurance companies discriminate against carriers of defective genes.

  • 8/8/2019 Gene and Protein

    18/37

    Gene Prediction and Annotation

    Prokaryotes

    1. Start/stop codon (ORF)

    2. Promoters

    3. Content

    4. Sequence similarity

  • 8/8/2019 Gene and Protein

    20/37

    Start Codon

    May miss short genes.

    It is not known which start codon to use.

    ORFs in different reading frames may overlap.

  • 8/8/2019 Gene and Protein

    21/37

    Promoters

    5'-XXXXPPPPPPXXXXXXXXXPPPPPPXXXXGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG XXXX-3'
           -35            -10       Gene to be transcribed

    -35:                T    T    G    A    C    A
                        69%  79%  61%  56%  54%  54%

    -10 (Pribnow box):  T    A    T    A    A    T
                        77%  76%  60%  61%  56%  82%

    In prokaryotes, the promoter consists of two short sequences at positions -10 and -35 upstream of the gene, that is, before the gene in the direction of transcription. The sequence at -10 is called the Pribnow box and usually consists of the six nucleotides TATAAT. The Pribnow box is absolutely essential to start transcription in prokaryotes. The other sequence, at -35, usually consists of the six nucleotides TTGACA. Its presence allows a very high transcription rate.

    These rules are only approximately correct.

  • 8/8/2019 Gene and Protein

    22/37

    Scoring a 6-mer as a Pribnow box

    Computers deal with exact formulae, not English descriptions.

    We need a score function to measure the likelihood that a 6-mer is a Pribnow box.

  • 8/8/2019 Gene and Protein

    23/37

    An example function for evaluating Pribnow box fitness

    log()
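    One common way to turn the -10 consensus percentages into a score is a sum of log-odds terms, one per position. The sketch below is an illustration only and may differ from the function intended on this slide. It uses the consensus frequencies listed on the promoter slide, and it makes two assumptions that are not in the slides: a uniform background of 0.25 per base, and an equal split of the remaining probability among the three non-consensus bases. The name pribnow_score is hypothetical.

```python
import math

# Consensus base and its observed frequency at each of the six -10 positions.
PRIBNOW = [("T", 0.77), ("A", 0.76), ("T", 0.60),
           ("A", 0.61), ("A", 0.56), ("T", 0.82)]
BACKGROUND = 0.25  # assumption: uniform base composition outside promoters


def pribnow_score(six_mer):
    """Log-odds score of a 6-mer against the -10 consensus profile."""
    if len(six_mer) != 6:
        raise ValueError("expected a 6-mer")
    score = 0.0
    for base, (consensus, freq) in zip(six_mer.upper(), PRIBNOW):
        # Assumption: non-consensus bases share the leftover probability equally.
        p = freq if base == consensus else (1.0 - freq) / 3.0
        score += math.log(p / BACKGROUND)
    return score


for candidate in ("TATAAT", "TATACT", "GCGCGC"):
    print(candidate, round(pribnow_score(candidate), 2))
```

    A positive score means the 6-mer looks more like a Pribnow box than like background sequence; a threshold on the score would still have to be chosen from known promoters.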

  • 8/8/2019 Gene and Protein

    24/37

    Content I: codon bias

    A codon XYZ occurs with different frequencies in coding regions and non-coding regions.

    Different amino acids have different frequencies.

    Different codons for the same amino acid have different frequencies.

    In non-coding regions, the frequency of XYZ is approximately p(X)*p(Y)*p(Z).
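    The codon-bias idea can be sketched as a log-likelihood ratio: for each codon in a window, compare its frequency in known coding regions with the non-coding estimate p(X)*p(Y)*p(Z). Everything numeric below is a made-up toy value; in practice the coding frequencies would come from a codon usage table for the organism. The function name and the floor value for unseen codons are assumptions of this example.

```python
import math


def codon_log_odds(window, coding_codon_freq, base_freq, floor=1e-4):
    """Sum of log(p_coding(XYZ) / (p(X) * p(Y) * p(Z))) over codons in window."""
    window = window.upper()
    usable = len(window) - len(window) % 3
    score = 0.0
    for i in range(0, usable, 3):
        codon = window[i:i + 3]
        x, y, z = codon
        p_noncoding = base_freq[x] * base_freq[y] * base_freq[z]
        # Assumption: codons missing from the table get a small floor
        # frequency so that log(0) never occurs.
        p_coding = coding_codon_freq.get(codon, floor)
        score += math.log(p_coding / p_noncoding)
    return score


# Toy inputs (illustrative only).
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
toy_coding = {"GAA": 0.04, "GAG": 0.02}

print(codon_log_odds("GAAGAA", toy_coding, uniform) > 0)  # True
```

    Sliding this score along each reading frame gives a profile in which coding stretches tend to score higher than non-coding ones.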

  • 8/8/2019 Gene and Protein

    25/37

    http://www.kazusa.or.jp/codon/

  • 8/8/2019 Gene and Protein

    28/37

    Content II - Hidden Markov Model (HMM)
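    As a very reduced illustration of the HMM approach, the sketch below labels each base as non-coding (N) or coding (C) with the Viterbi algorithm. It is a toy two-state model, not the model used by any particular gene finder: all probabilities are invented for the example, and real gene-finding HMMs use many more states (for instance, one per codon position).

```python
import math

STATES = ("N", "C")  # non-coding, coding
# All probabilities below are made-up illustrative values, not trained ones.
START = {"N": 0.5, "C": 0.5}
TRANS = {"N": {"N": 0.9, "C": 0.1}, "C": {"N": 0.1, "C": 0.9}}
EMIT = {
    "N": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
    "C": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
}


def viterbi(seq):
    """Most probable state path for seq under the toy model, in log space."""
    seq = seq.upper()
    if not seq:
        return ""
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    back = []  # back[t][s] = best predecessor of state s at position t + 1
    for base in seq[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda r: score[r] + math.log(TRANS[r][s]))
            pointers[s] = prev
            new_score[s] = (score[prev] + math.log(TRANS[prev][s])
                            + math.log(EMIT[s][base]))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))


print(viterbi("A" * 10 + "GC" * 5))
```

    With these toy parameters an AT-rich stretch is labeled N and a GC-rich stretch is labeled C; the transition probabilities discourage switching states on the evidence of a single base.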

  • 8/8/2019 Gene and Protein

    29/37

    Eukaryotes

    The basic idea is similar to that for prokaryotes.

    Difference:

  • 8/8/2019 Gene and Protein

    30/37

    DNA-specific transcription factors

    These are the basis of gene-regulatory networks.

    Another hot area in bioinformatics.

  • 8/8/2019 Gene and Protein

    31/37

    Splicing

    Consensus sequences have been identified as necessary but not sufficient for splicing. In vertebrates, these sequences are (the slash identifies the exon-intron or intron-exon junction):

    C(or A)AG/GTA(or G)AGT    "donor" splice site

    T(or C)nNC(or T)AG/G    "acceptor" splice site

    A third sequence, which in yeast is TACTAAC, is necessary within the intron sequence.

    These rules are only approximately correct.
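    The two consensus patterns can be written as regular expressions to flag candidate junctions. This is a sketch under assumptions: the acceptor pattern is read as a run of n pyrimidines (T or C), then any base, then C or T, then AG/G, and n is taken to be at least 8, a value the slide does not give. Exact matching is also much stricter than real splice sites, which only resemble the consensus. The function names and the toy sequence are invented for the example.

```python
import re

# Donor: C(or A)AG / GTA(or G)AGT. The junction is right after the AG.
DONOR = re.compile(r"[CA]AG(?=GT[AG]AGT)")
# Acceptor: pyrimidine run, any base, C or T, AG / G.
# Assumption: the run is at least 8 bases long.
ACCEPTOR = re.compile(r"[TC]{8,}[ACGT][CT]AG(?=G)")


def donor_junctions(dna):
    """Indices of the first intron base for each exact donor consensus match."""
    return [m.end() for m in DONOR.finditer(dna.upper())]


def acceptor_junctions(dna):
    """Indices of the first exon base for each exact acceptor consensus match."""
    return [m.end() for m in ACCEPTOR.finditer(dna.upper())]


# Toy sequence: exon, intron, exon (made up for illustration).
exon1 = "ATGGCCAAG"
intron = "GTAAGT" + "ACGA" + "TTTTCTTTTC" + "ACAG"
exon2 = "GAATGA"
seq = exon1 + intron + exon2

print(donor_junctions(seq), acceptor_junctions(seq))
```

    Pairing each donor index with a later acceptor index gives candidate introns; a scoring scheme like the one used for promoters would be needed to rank them, since many exact matches occur by chance in long sequences.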

  • 8/8/2019 Gene and Protein

    37/37

    Gene Prediction Software

    Try GENSCAN at http://genes.mit.edu/GENSCAN.html using the sequence at

    http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=3253144

    Did GENSCAN work well?