![Page 1: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/1.jpg)
Introduction to Genomics and Proteomics - Historical Perspective and the Future
Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C)
UNIVERSITY OF TORONTO
(Course 1505S/Jan. 9, 2001 #1)
![Page 2: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/2.jpg)
Organization of the Lecture
Historical Background
The Human Genome Project
Critical Technologies:
• Massive, automated sequencing
• DNA and RNA analysis
• Mass spectrometry
• DNA and protein microarrays
• Bioinformatics
• Single nucleotide polymorphisms
Applications:
• Diagnostics
• Therapeutics
• Pharmacogenetics
Ethics
Patents
![Page 3: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/3.jpg)
Historical Milestones
| Year | Milestone |
|---|---|
| 1866 | Mendel’s discovery of genes |
| 1871 | Discovery of nucleic acids |
| 1951 | First protein sequence (insulin) |
| 1953 | Double helix structure of DNA |
| 1960s | Elucidation of the genetic code |
| 1977 | Advent of DNA sequencing |
| 1975-79 | First cloning of human genes |
| 1986 | Fully automated DNA sequencing |
| 1995 | First whole genome (Haemophilus influenzae) |
| 1999 | First human chromosome (Chr #22) |
| 2000 | Drosophila / Arabidopsis genomes |
| 2001 | Human and mouse genomes |
![Page 4: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/4.jpg)
Terminology
DNA Genomics
mRNA Transcriptomics
Protein Proteomics
Metabolites Metabolomics
Functional genomics, proteomics, etc.
![Page 5: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/5.jpg)
History
On June 26, 2000, at the White House, it was announced that the Human Genome Project was essentially completed by:
• Celera Genomics (private company)
• The National Human Genome Research Institute and its international partners (publicly funded)
The work has yet to be published, but Celera scientists submitted a paper to “Science” on December 6, 2000.
![Page 7: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/7.jpg)
Diagnostics / Prognostics
• Does my DNA predispose me to a specific disease?
• Do I want to know? (Ethics)
• Genetic mutations → disease (cancer, diabetes, Alzheimer’s, heart disease)
• Whole genome scans for identification of mutations/polymorphisms?
AACC2000-#2 - 1
![Page 8: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/8.jpg)
Pharmacogenetics and Pharmacogenomics
Goal is to associate human sequence polymorphisms with:
• Drug metabolism
• Adverse effects
• Therapeutic efficacy
* Decrease drug development cost
* Optimize selection of clinical trial participants
* Increase patient benefit
![Page 9: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/9.jpg)
Critical Protein Technologies
Protein
• Make pure form (recombinant)
• Activity
• Reagents (antibodies)
• Identification (sequencing)
• Identify post-translational modification (glycation, phosphorylation, etc.)
• Protein-protein interactions (physiological function)
• Gene protein knockout / transgene
![Page 10: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/10.jpg)
Models of Human Disease
• Identify natural human knockouts
• Develop mice with every gene (or gene combination) being knocked out (this project is now underway!)
![Page 11: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/11.jpg)
Expressed Sequence Tags (ESTs)
• Cloned cDNAs from various tissues (cDNA libraries)
• Can search through by BLAST analysis
• Can purchase them, fully sequence and characterize them
Great help for new gene identification.
![Page 12: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/12.jpg)
Gene Patents
• Gene fragments
• Whole genes without function
• Whole genes with function
• Whole genes with function and utility (enablement)
![Page 13: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/13.jpg)
Where Do We Stand Today? (July 2000)
Public Consortium: 85% of genome is done
* 24% finished form
* 22% near finished
* 38% draft
* rest is being done
Celera: Claims to have more than 99% of genome now!
Incyte: They may have all the genes!
![Page 14: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/14.jpg)
Where Does the Individual Researcher Stand?
• At the end of the day, each gene must be looked at in great detail:
- structure
- function
- physiology/pathways
- pathophysiology
- connection to disease
- tools
• Individual researchers can make the big discoveries on a very specific gene or a very specific gene family
• Great time for individual researchers
![Page 15: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/15.jpg)
The Future of Genome Projects
Human
Mouse (just started)
Rat
Zebrafish
Dog
Other primates
The Era of Comparative Genomics (you can learn a lot about humans by studying yeast, Drosophila, mouse, etc.)
![Page 16: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/16.jpg)
The Impact of the Human Genome Project in Medicine
• You can’t make a car if you are missing parts
• Once all genes are known, we will start understanding their function → PATHWAYS
• We will then be able to correlate disease states to certain genes (Pathobiology)
DISEASE → GENE(S)
GENE(S) → DISEASE
• We will then find ways for rational treatments (designer drugs), prevention, diagnosis...
![Page 17: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/17.jpg)
Gene Manipulation (Ethics??)
Gene modulation (regulation)
Gene repair
Gene excision
Gene replacement/transplantation
Gene improvement
![Page 18: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/18.jpg)
Celera’s Whole Genome Shotgun Strategy
• Does not use BAC clones; cuts whole DNA into millions of pieces, which are sequenced
• Computer assembles pieces together
• Achieves high accuracy with 6x coverage
• Lots of relatively short gaps
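The coverage figure and the remark about short gaps can be illustrated with the idealized Poisson model of random shotgun sampling (the Lander-Waterman style of argument). This is a minimal sketch, assuming reads land uniformly at random; the function names and the 500-base read length are illustrative assumptions, not Celera's actual parameters.

```python
import math


def fraction_uncovered(coverage: float) -> float:
    """Expected fraction of bases hit by no read at all.

    With reads placed uniformly at random, the number of reads covering a
    given base is approximately Poisson with mean `coverage`, so the chance
    that a base is missed is exp(-coverage).
    """
    return math.exp(-coverage)


def reads_needed(genome_size: int, read_length: int, coverage: float) -> int:
    """Reads required so that reads * read_length / genome_size >= coverage."""
    return math.ceil(coverage * genome_size / read_length)


if __name__ == "__main__":
    genome_size = 3_000_000_000  # 3 x 10^9 bp, the genome size quoted in this lecture
    read_length = 500            # assumed read length, for illustration only
    for c in (1, 4, 6, 8):
        print(f"{c}x coverage: {fraction_uncovered(c):.4%} of bases expected uncovered")
    print("reads needed for 6x:", reads_needed(genome_size, read_length, 6.0))
```

Under this model a small fraction of bases is still expected to be missed at 6x, which is consistent with a draft that has many short gaps rather than a few large ones. Real genomes contain repeats and cloning biases that the model ignores.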
![Page 19: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/19.jpg)
Strategy to Sequence Human Genome
Construct a human genomic library in an appropriate vector (BAC)
Assemble overlapping BAC clones in order to obtain full coverage of the distance (restriction map)
[Diagram: DNA covered by overlapping BAC clones]
Start sequencing each BAC until you finish the job
![Page 20: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/20.jpg)
How are these BACs Sequenced? Shotgun Sequencing
BAC clone is broken down to small pieces which have overlapping ends
Small pieces are sequenced and a computer assembles the pieces based on the overlapping sequence information
Construct contigs (contiguous areas of sequence)
→ Larger contigs
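The assembly step described above, in which a computer joins pieces by their overlapping ends, can be sketched with a toy greedy algorithm. This is only an illustration under simplifying assumptions (error-free reads, one strand, no repeats); the function names are hypothetical and real assemblers are far more elaborate.

```python
from __future__ import annotations


def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` equal to a prefix of `b` (0 if shorter than min_len)."""
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of fragments with the longest overlap.

    Returns the contigs that remain when no overlap of at least `min_len`
    is left; more than one contig means gaps remain.
    """
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs
        # Join the two fragments, writing the shared overlap only once.
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    pieces = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]
    print(greedy_assemble(pieces))
```

Fragments that share no sufficiently long overlap are left as separate contigs, which is the toy equivalent of a gap in the assembly.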
![Page 21: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/21.jpg)
Other Important Genomic Technologies
• Recombinant DNA (cloning)
• PCR
• Pulsed Field Gel Electrophoresis (PFGE)
• Chromosome microdissection
• Somatic hybrid cell lines (mapping) [rodent x human]
• Radiation hybrid cell lines [rodent x human]
• DNA sequencing
![Page 22: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/22.jpg)
Annotation
What is annotation?
Make sense out of a linear sequence: identify genes, intron/exon boundaries, regulatory sequences, predict protein structure, identify motifs, predict function, etc.
Annotation will likely go on for a few years.
Major annotation tool: BIOINFORMATICS (hardware & software)
![Page 23: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/23.jpg)
Celera Genomics
• The publicly funded project started around 1990 with a goal to produce a highly accurate sequence by 2005
• Celera started in 1998 and within 2 years sequenced more DNA than the publicly funded consortium!
Why?
• No bureaucracy
• Facility (300 sequencers x 24h/day)
• Powerful supercomputer
• Lots of money
• More efficient sequencing approach (no BACs necessary)
• Use of data from the publicly funded project
![Page 24: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/24.jpg)
Cloning Vectors
• Replicable units of DNA which can carry exogenously inserted DNA; size of insert varies with vector type:
• plasmid 5-10 kb
• phage 20 kb
• cosmid 45 kb
• PAC/BAC (P1- or bacterial artificial chromosome) 100 - 200 kb
• YAC (yeast artificial chromosome) 1,000 kb
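The insert sizes listed above determine how many clones a genomic library needs. The sketch below is a rough illustration, assuming the 3 x 10^9 bp genome size quoted in this lecture, midpoints where the slide gives a range, and clones drawn uniformly at random; the names are illustrative. The second function uses the standard Clarke-Carbon estimate N = ln(1 - P) / ln(1 - f).

```python
import math

GENOME_SIZE_BP = 3_000_000_000  # 3 x 10^9 bp, as quoted in this lecture

# Insert sizes in base pairs; midpoints are an assumption where a range is given.
INSERT_SIZE_BP = {
    "plasmid": 7_500,
    "phage": 20_000,
    "cosmid": 45_000,
    "PAC/BAC": 150_000,
    "YAC": 1_000_000,
}


def minimum_clones(genome_size: int, insert_size: int) -> int:
    """Clones needed to tile the genome exactly once with no overlap."""
    return math.ceil(genome_size / insert_size)


def library_size(genome_size: int, insert_size: int, probability: float = 0.99) -> int:
    """Clarke-Carbon estimate: random clones needed so that any given
    sequence is present in the library with the requested probability."""
    fraction = insert_size / genome_size
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


if __name__ == "__main__":
    for vector, size in INSERT_SIZE_BP.items():
        print(vector, minimum_clones(GENOME_SIZE_BP, size), library_size(GENOME_SIZE_BP, size))
```

The comparison shows why large-insert vectors such as BACs were attractive for mapping a genome of this size: the number of clones to order and track falls in proportion to the insert size.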
![Page 25: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/25.jpg)
Human Genome
• 3 x 10^9 base pairs
• Approximately 100,000 genes
• < 10% of DNA encodes for genes; the rest represents introns/repetitive elements
• Importance of non-coding sequences currently not understood
![Page 26: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/26.jpg)
Quality of Sequencing
• Clones are sequenced several times over to verify the sequence:
4x: rough draft, 1 error per 100 bases
8-11x: finished draft, 1 error per 10,000 bases
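To put the two error rates in perspective, the arithmetic below scales them to a whole genome. It is a back-of-the-envelope sketch that assumes the 3 x 10^9 bp genome size quoted in this lecture and errors spread evenly; the names are illustrative.

```python
GENOME_SIZE_BP = 3_000_000_000  # 3 x 10^9 bp, as quoted in this lecture


def expected_errors(genome_size: int, bases_per_error: int) -> int:
    """Expected number of wrong bases at a rate of one error per `bases_per_error` bases."""
    return genome_size // bases_per_error


if __name__ == "__main__":
    print("rough draft (1 per 100):", expected_errors(GENOME_SIZE_BP, 100))
    print("finished (1 per 10,000):", expected_errors(GENOME_SIZE_BP, 10_000))
```

At the rough-draft rate the whole genome would carry tens of millions of erroneous bases, which is why the finishing passes matter for applications such as mutation detection.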
![Page 27: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/27.jpg)
The Next Race
• It will not be who has the sequence
• It will be how you can use the sequence to arrive at products
* DIAGNOSTICS
* THERAPEUTICS
![Page 28: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/28.jpg)
Genomics and Drug Discovery
Genomic technologies are involved in all aspects of the drug discovery process, from target validation through to the marketed drug, including:
• Molecular target identification
• Drug target characterization and validation
• Lead discovery
• Lead optimization
• Clinical candidate to marketed drug
![Page 29: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/29.jpg)
Key Corporate Players in Proteomics
| Company | Location | Approach |
|---|---|---|
| Celera | Rockville, MD | Databases |
| Incyte Pharmaceuticals | Palo Alto, CA | Databases |
| GeneBio | Geneva, Switzerland | Databases |
| Proteome Inc. | Beverly, MA | Databases |
| PE Biosystems | Framingham, MA | Instrumentation |
| Ciphergen Biosystems | Palo Alto, CA | Protein arrays |
| Oxford GlycoSciences | Oxford, UK | 2D gel/MS* |
| Protana | Odense, Denmark | 2D gel/MS |
| Genomic Solutions | Ann Arbor, MI | 2D gel/MS |
| Large Scale Proteomics Corp. | Rockville, MD | 2D gel/MS |

* 2D gel electrophoresis and mass spectrometry
![Page 30: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/30.jpg)
Pharmacogenetics and Pharmacogenomics in Drug Discovery

| Aspect of Drug Development | Approach |
|---|---|
| Drug-drug interactions | Examine polymorphism in metabolic enzymes |
| Efficacy | Differentiate responders from nonresponders |
| Side Effects | Examine variation in gene or genes involved in mediating the effects (may be mechanism related or unrelated) |
| Toxicity | Gene expression profiling in cells treated with compound. Look for toxicity signatures. |
![Page 31: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/31.jpg)
The Biography of the Year 2000 (Francis Collins and J. Craig Venter)
![Page 32: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/32.jpg)
Creating an Array of Contiguous BAC Clones
![Page 33: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/33.jpg)
The ….omics
![Page 34: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/34.jpg)
![Page 40: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/40.jpg)
Predicting the Future
What is going to happen now that the human and other genomes are completed?
How quickly will the next steps happen?
What are the potential difficulties?
Are we expecting too much?
(Course 1505S - Jan. 15/01 - #6)
![Page 41: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/41.jpg)
Grand Plan
Find all the genes
Translate genes to proteins
“Compute” function by similarity search and comparison to known proteins
“Compute” structure
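The second step of the plan, translating genes to proteins, is mechanical once a coding sequence is known. The sketch below assumes the standard genetic code and a coding sequence that already starts in frame, with introns removed; the names `CODON_TABLE` and `translate` are illustrative.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding DNA sequence, codon by codon, up to the first stop codon.

    Assumes the reading frame starts at position 0 and that the sequence
    contains only A, C, G and T (any other character raises KeyError).
    """
    cds = cds.upper()
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[pos:pos + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"))
```

The hard part of the plan is not this lookup but deciding which stretch of genomic DNA is the coding sequence in the first place.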
![Page 42: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/42.jpg)
Difficulties
• Gene prediction programs are unreliable
• Function inference by just similarity search may be fallacious
• Computation of structure is still unreliable
Our databases may get contaminated with “wrong” information.
![Page 43: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/43.jpg)
Gene Prediction
• Programs were designed based on knowledge of already cloned genes (ORFs; splice sites; start/stop codons, etc.)
• These programs provide excellent clues for gene presence but they never or rarely predict the complete gene structure
• The computer prediction must be taken as a “starting point” to experimentally clone a gene
How many genes in the genome? Estimates range from 27,462 to 312,278!
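One of the signals mentioned in the first bullet, the open reading frame (ORF) bounded by start and stop codons, can be scanned for in a few lines. This is a deliberately naive sketch with hypothetical names: it looks only at the forward strand and knows nothing about introns, splice sites or promoters, which is part of why such output can only be a starting point for experimental cloning.

```python
from __future__ import annotations

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 30) -> list[tuple[int, int, int]]:
    """Return (frame, start, end) for forward-strand ORFs of the form ATG ... stop.

    `start` and `end` are 0-based and end-exclusive; the stop codon is
    included in the span. `min_codons` counts codons from the ATG up to,
    but not including, the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first in-frame ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((frame, start, pos + 3))
                start = None  # look for the next ATG in this frame
    return orfs


if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCC" * 4 + "TAA" + "GG"
    print(find_orfs(toy, min_codons=5))
```

Changing `min_codons` changes how many candidates are reported, a small-scale reminder of why gene-count estimates depend heavily on the prediction criteria used.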
![Page 44: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/44.jpg)
What is a Gene?
• Heritable unit corresponding to a phenotype?
• DNA that encodes a protein?
• DNA that encodes RNA?
• What if RNA is not translated?
• What if a “gene” is not expressed?
![Page 45: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/45.jpg)
Prediction of Function
What is function? This is not a simple term.
Function may be:
• a biological process (e.g. serine protease activity)
• a molecular event (e.g. proteolysis of a specific substrate)
• a cellular structure (e.g. membrane; chromatin; mitochondrion; etc.)
• relevance to a whole process (e.g. cell cycle)
• relevance to the whole organism (e.g. ovulation)
* Some scientists have now initiated projects to “compute” function of whole organisms.
![Page 46: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/46.jpg)
Pattern Recognition
• Looks for motifs that may have functional relevance (family signatures):
* Membrane anchoring
* Catalytic site
* Nucleotide binding
* Nuclear localization signal
* Hormone response element
* Calcium binding, etc.
• Protein family resources (being created now)
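Motif searching of the kind listed above is often expressed as pattern matching over the protein sequence. The sketch below uses two widely cited consensus patterns written as regular expressions: the P-loop nucleotide-binding motif [AG]-x(4)-G-K-[ST] and the N-glycosylation site N-{P}-[ST]-{P}. The dictionary, the function name and the toy sequence are illustrative assumptions, not part of any particular database's interface.

```python
from __future__ import annotations

import re

# Consensus patterns translated to regular expressions (illustrative selection).
MOTIFS = {
    "P-loop (nucleotide binding)": re.compile(r"[AG].{4}GK[ST]"),
    "N-glycosylation site": re.compile(r"N[^P][ST][^P]"),
}


def scan_motifs(protein: str) -> list[tuple[str, int, str]]:
    """Return (motif name, 0-based start, matched text) for each non-overlapping hit."""
    protein = protein.upper()
    hits = []
    for name, pattern in MOTIFS.items():
        for match in pattern.finditer(protein):
            hits.append((name, match.start(), match.group()))
    return hits


if __name__ == "__main__":
    toy_protein = "MKLGAPGSGKSTLLNASAW"  # made-up sequence for demonstration
    for hit in scan_motifs(toy_protein):
        print(hit)
```

Short patterns like these match by chance fairly often, so a hit is a hint about possible function and not evidence of it.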
![Page 47: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/47.jpg)
Homology
• What is “homology”?
Definition: Two proteins are homologous if they are related by divergence from a common ancestor.
[Diagram: ancestor A gives rise, by divergent evolution, to B, C and D, which are homologous]
![Page 48: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/48.jpg)
Analogy
• What is “analogy”?
Definition: Two proteins are “analogous” if they acquired common structural and functional features via convergent evolution from unrelated ancestors.
[Diagram: unrelated ancestors A and C give rise, by convergent evolution, to B and D, which are analogous (similar structure and/or function)]
![Page 49: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/49.jpg)
Serine Proteases (Convergent Evolution)
Trypsin-like (many homologous members) and subtilisin-like (many homologous members) families: analogous proteins
Trypsin and subtilisin share groups of catalytic residues with almost identical spatial geometries but they have no other sequence or structural similarities.
![Page 50: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/50.jpg)
Human Kallikrein Gene Family (Divergent Evolution)
15 homologous genes on human chromosome 19q13.4
Divergence in tissue expression and substrate specificity
![Page 51: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/51.jpg)
Orthologs: Proteins that usually perform the same function in different species (e.g. DNA polymerase; glucose 6-phosphate dehydrogenase; retinoblastoma gene; p53, etc.).
Paralogs: Proteins that perform different but related functions within one organism [usually formed by gene duplication and divergent evolution] (e.g. the 15 kallikrein genes mentioned above).
![Page 52: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/52.jpg)
Functional Annotation - Difficulties
• Who knows whether the best matches in a database query are really orthologs or paralogs?
• Modules: Building blocks of proteins. Finding a “module” in a protein does not mean that a “function” can be assigned, since these modules do not always perform the same function
Aphorism: The properties of a system can be explained by, but not deduced from, those of its components
![Page 53: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/53.jpg)
Structure Prediction
• How proteins fold in 3D space
• We still cannot reliably “compute” structures of > 100 amino acid proteins (ab initio methods)
• Experiment and computation: crystallography, NMR
![Page 54: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/54.jpg)
Future
• Lots of rigorous work needs to be done
• Holistic view
-- regulation of gene expression
-- metabolic pathways
-- signaling cascades
Remember: Proteins do not work in isolation but within integrated networks.
![Page 55: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/55.jpg)
The Importance of Accurate Functional Annotation
• Function in whole organisms is complex and interrelated
• Need for close collaboration between:
- software developers
- annotators
- experimentalists
• Holistic approaches needed for optimal knowledge-based inference and innovation (drugs, diagnostics, etc.)
![Page 56: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/56.jpg)
How Protein Structure is Elucidated
![Page 57: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/57.jpg)
Protein Annotation
![Page 58: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/58.jpg)
Protein Annotation
![Page 59: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/59.jpg)
PLANT GENOMES

| Group | Species | Scientific name | Genome size (base pairs) |
|---|---|---|---|
| Brassicas | Thale cress | Arabidopsis thaliana | 1.0 x 10^8 |
| Brassicas | Oilseed rape/canola | Brassica napus | 1.2 x 10^9 |
| Cereals | Rice | Oryza sativa | 4.2 x 10^8 |
| Cereals | Barley | Hordeum vulgare | 4.8 x 10^9 |
| Cereals | Wheat | Triticum aestivum | 1.6 x 10^10 |
| Cereals | Maize/corn | Zea mays | 2.5 x 10^9 |
| Legumes | Garden pea | Pisum sativum | 4.1 x 10^9 |
| Legumes | Soya bean | Glycine max | 1.1 x 10^9 |
| Solanaceae | Potato | Solanum tuberosum | 1.8 x 10^9 |
| Solanaceae | Tomato | Lycopersicon esculentum | 1.0 x 10^9 |
| (for comparison) | Human | Homo sapiens | 3.2 x 10^9 |
![Page 60: Introduction to Genomics and Proteomics - Historical Perspective and the Future Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C) UNIVERSITY OF TORONTO](https://reader036.vdocuments.site/reader036/viewer/2022070408/56649e4e5503460f94b45139/html5/thumbnails/60.jpg)
The New Human Kallikrein Gene Locus (19q13.4, 300 kb)
[Figure: map of the locus starting from the centromere, with gene lengths and intergenic distances given in kb. Gene labels as listed under the earlier names: KLK1, KLK15, PSA, KLK2, KLK-L1, KLK-L2, Zyme, HSCCE, KLK-L6, KLK-L4, KLK-L5, TLSP, NES1, KLK-L3, Neuropsin, with SIGLEC-9 flanking the locus. Gene labels under the new nomenclature: KLK1, KLK15, KLK3, KLK2, KLK4, KLK5, KLK6, KLK7, KLK8, KLK9, KLK10, KLK11, KLK12, KLK13, KLK14.]
Revised November 29, 2000