Bio 5488 Genomics Mon, Wed 10:00-11:30 am, Rm 823 McDonnell Fri., 10:00-11:30 am, 12:00-1:30 pm, Rm 109 CCB

Upload: pammy98

Post on 11-May-2015

TRANSCRIPT

1. Bio 5488 Genomics: Mon, Wed 10:00-11:30 am, Rm 823 McDonnell; Fri, 10:00-11:30 am, 12:00-1:30 pm, Rm 109 CCB

2. Course Web Site

  • http://www.genetics.wustl.edu/bio5488/home.html
  • Linux Primer
  • Lecture notes (if available)
  • Schedule (changes)
  • Weekly Assignments and Answers
  • Weekly Readings

3. Computer Labs

  • 1.5 hours each Friday (10:00-11:30, 12:00-1:30)
  • Room 109 in the CCB, 700 S. Euclid
    • Access to the bldg. through card key entrance
    • Access to the room through card key on the door
  • Weekly computer assignments

4. Grading

  • 4 credits
    • midterm
    • final
    • weekly assignments
  • 3 credits
    • midterm
    • final

5. Today's Outline

  • Introduction to Genomics
  • Being Quantitative
  • A case study on the differences between humans and chimps

6. Part I, Introduction to Genomics

What's a genome?

  • Book of Life - NY Times (2001)
  • Code of Life - PBS (2002)
  • Map of Life - Science World (1999)
  • Blueprint of Humanity - BBC (2000)
  • Instruction Set - Bob Waterston (2003)
  • Evolution's Notebook - Eric Lander (2002)

7. Some History On Reading Genomes

  • 1951, Fred Sanger: Amino Acid Sequence of Insulin
  • 1960s, Crick, Brenner, Yanofsky, Nirenberg, Matthaei: The Genetic Code
  • 1972, Margaret Dayhoff: First Protein Database
  • 1977, Maxam/Gilbert and Fred Sanger: DNA sequencing
  • 1977, Fred Sanger: Complete sequence of phage ΦX174
  • 1979, Walter Goad: First implementation of GenBank
  • 1989, NHGRI: Human Genome Project initiated
  • 1990, Altschul/Gish/Miller/Myers/Lipman: BLAST

8. 169 Genomes Done! 788 more started!

  • 1995, Haemophilus influenzae
  • 1996, Methanococcus jannaschii
  • 1997, Saccharomyces cerevisiae
  • 1997, Escherichia coli
  • 1998, Caenorhabditis elegans
  • 2000, Drosophila melanogaster
  • 2000, Arabidopsis thaliana
  • 2001, Homo sapiens
  • 2002, Schizosaccharomyces pombe
  • 2002, Oryza sativa
  • 2002, Plasmodium falciparum
  • 2002, Mus musculus

http://wit.integratedgenomics.com/GOLD/ (as of January 2004): 131 Bacterial, 17 Archaeal, 21 Eukaryotic

9. Genomics and the return of cool biology

  • Host-Pathogen Interactions
    • Anopheles, Plasmodium, Homo sapiens
  • Horizontal Transfer
    • Yersinia pestis (plague)
  • Alternative Energy Sources
    • Chlorobium tepidum (green sulfur bacterium)
  • Extremophiles
    • Halobacterium, Deinococcus radiodurans, Pyrococcus furiosus

[Figure: six image panels labeled A to F]

10. Thinking Genomics

| Genetics | Genomics |
|---|---|
| Genes | Gene Networks |
| Mutations | Allele Frequencies |
| Mutant Hunts | Systematic Reverse Genetics |
| Structure/Function | Comparative Genomics |

11. Part II, Thinking Quantitatively

12. From Mendel To Matrices. Biology Is A Quantitative Science!!!

  • Mendel's Laws
  • Chargaff's Rules

Gregor Mendel (1822-1884), Erwin Chargaff (1905-2002)

13. Thinking Quantitatively

  • Spaces
  • Signal to Noise Ratio
  • Distributions
  • Parametric Versus Nonparametric Data
  • The P value
  • Statistics and Probability
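Of the items above, the P value is probably the easiest to make concrete. The sketch below is only an illustration under an assumed setup that is not in the slides: ten tosses of a fair coin, where the result is the number of heads. It computes the chance of exactly seven heads and the chance of seven or more; the second quantity is a one-sided P value. The function names are hypothetical.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(k: int, n: int, p: float) -> float:
    """Probability of k or more successes (a one-sided P value)."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))


# Assumed experiment (not from the slides): 10 tosses of a fair coin.
n, p = 10, 0.5
exactly_seven = binomial_pmf(7, n, p)  # 120 of the 1024 equally likely outcomes
seven_or_more = upper_tail(7, n, p)    # (120 + 45 + 10 + 1) of 1024 outcomes

print(f"P(exactly 7)  = {exactly_seven:.4f}")
print(f"P(7 or more)  = {seven_or_more:.4f}")
```

The tail probability is always at least as large as the probability of the single observed result, which is why the two questions give different answers.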

14. Spaces

15. Signal to Noise

[Figure: mass spectrum, relative abundance versus m/z]

16. Distributions (Histograms)

[Figure: two histograms of number of occurrences versus result, one Normal (Gaussian) and one Poisson]

17. Parametric Versus Nonparametric Data

  • Parametric Distributions
    • Can be described by a standard equation
    • Examples: Normal, Poisson, Binomial
  • Nonparametric Distributions
    • Irregular or highly skewed distributions
    • Usually analyzed by Chi-Square or by simulation (bootstrap)
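As a rough illustration of analysis by simulation, here is a minimal sketch of a percentile bootstrap: the observed sample is resampled with replacement many times, and the spread of the recomputed statistic gives an interval without assuming any standard equation for the distribution. The data values, the choice of the median, and the function name `bootstrap_ci` are all assumptions made for this example, not material from the lecture.

```python
import random
import statistics


def bootstrap_ci(data, stat=statistics.median, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(data)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(data)
    # Recompute the statistic on n_resamples samples drawn with replacement.
    estimates = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))
    lo = estimates[round((alpha / 2) * n_resamples)]
    hi = estimates[round((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


# Hypothetical, highly skewed measurements (made up for illustration).
sample = [1, 1, 2, 2, 2, 3, 3, 4, 5, 8, 13, 40]
low, high = bootstrap_ci(sample)
print(f"median = {statistics.median(sample)}, 95% bootstrap CI = ({low}, {high})")
```

The median is used here because a single extreme value (the 40) would dominate the mean; any other statistic can be passed through the `stat` argument.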

18. The P value (for discrete variables)

  • What is the chance of getting exactly seven?
  • What is the chance of getting seven or better?

[Figure: histograms of number of occurrences versus result (0 to 10), one per question]

19. There Are Three Kinds of Lies: Lies, Damned Lies, and Statistics

  • Comparing averages
  • Comparing variation

- Benjamin Disraeli (1804-1881)

Statistics: drawing inferences from the results of an experiment

20. The most important questions of life are really only problems of probability.

- Pierre Simon, Marquis de Laplace (1749-1827)

Probability: computing the chance of a particular outcome of an experiment

[Figure: Venn diagram of an event E and its complement E^c within the sample space S]

P(E) = E/S
P(E^c) = 1 - P(E)

21. Part III, What makes organisms different from each other?

  • Differences between individuals?
  • Differences between species?

22. Humans and Chimps

  • Homo sapiens: 99.9% identical
  • Homo sapiens and Pan troglodytes: 99.0% identical

23. Chimps Are Resistant To Many Human Diseases

Comparison of disease susceptibility between chimps and humans

| Condition | Human | Chimp |
|---|---|---|
| HIV progression to AIDS | common | very rare |
| Influenza A symptoms | moderate/severe | mild |
| Hepatitis B/C complications | moderate/severe | mild |
| Plasmodium falciparum malaria | susceptible | resistant |
| Menopause | universal | rare |
| E. coli K99 gastroenteritis | resistant | sensitive |
| Alzheimer's disease pathology | complete | incomplete |
| Epithelial cancers | common | rare |

Source: Olson, M.V. et al. White paper advocating the complete sequencing of the common chimpanzee, Pan troglodytes (2002)

24. Edward Tyson

The Anatomy of a Pygmie Compared with that of a Monkey, an Ape, and a Man (1699)

Conclusion: Human and chimp brains bear a surprising resemblance.

"one would be apt to think, that since there is so great a disparity between the Soul of a Man, and a Brute, the Organ [in which] 'tis placed should be very different too."

25. Sir Richard Owen

Archencephala, Ruling Brain

"Not only do the cerebral hemispheres overlap the olfactory lobes and cerebellum, but they extend in advance of the one and further back than the other."

1862, The Medical Times and Gazette, London

The Fold of Humanity

26. Owen's classification stands like a Corinthian portico in cow dung.

- Thomas Huxley (1825-1895)

"The difference in mind between man and the higher animals, great as it is, certainly is one of degree and not of kind."

- Charles Darwin (1809-1882), The Descent of Man, 1871

27. The 20th Century Comparison of Humans and Chimps

  • Antibody cross-reactivity (early 1900s)
  • Comparison of protein sequences, Morris Goodman (1960s)
  • Interspecies hybridizations and melting curves (1970s)
  • Comparison of DNA sequences (1980s)

How can genomics help?

28. Complexity, Genome Size and the C-value Paradox

  • www.genomesize.com

29. Complexity and Gene Number

30. Gene Finding Is Still An Art

  • de novo, in silico
  • Comparative genomics
  • ESTs and cDNAs

Figure 1. Nucleotide Comparison of Celera, Ensembl, and Refseq Transcripts. Hogenesch et al. (2001) Cell 106, 413-416.

[Figure: overlap of transcript sets from Celera (39,114), Ensembl (29,691), and Refseq (11,015); region counts as extracted: 15,231, 6,552, 9,315, 9,300, 364, 604, 747]

31. What about human/chimp specific genes?

Probably not going to explain the differences. Consider the human and the mouse:

  • Only 14 out of 731 genes on mouse chromosome 16 have no human homolog (Celera)
  • Many genes specific to the mouse are olfactory receptors, also some differences in immunity and reproduction

32. What about organism specific variants?

http://sayer.lab.nig.ac.jp/~silver/

C-C chemokine receptor (nucleotides 1 to 60)

Human_1 ATGGATTATCAAGTGTCAAGTCCAATCTATGACATCAATTATTATACATCGGAGCCCTGC
Human_2 ATGGATTATCAAGTGTCAAGTCCAATCTATGACATCAATTATTATACATCGGAGCCCTGC
Human_3 ATGGATTATCAAGTGTCAAGTCCAATCTATGACATCAATTATTATACATCGGAGCCCTGC
Human_4 ATGGATTATCAAGTGTCAAGTCCAATCTATGACATCAATTATTATACATCGGAGCCCTGC
Chimp_1 ATGGATTATCAAGTGTCAAGTCCAATCTATGACATCGATTATTATACATCGGAGCCCTGC
Chimp_2 ATGGATTATCAAGTGTCAAGTCCAATCTATGACATCGATTATTATACATCGGAGCCCTGC
Chimp_3 ATGGATTATCAAGTGTCAAGTCCAATCTATGACATCGATTATTATACATCGGAGCCCTGC
Goril_1 ATGGATTATCAAGTGTCAAGTCCAACCTATGACATCGATTATTATACATCGGAGCCCTGC
Goril_2 ATGGATTATCAAGTGTCAAGTCCAACCTATGACATCGATTATTATACATCGGAGCCCTGC
Goril_3 ATGGATTATCAAGTGTCAAGTCCAACCTATGACATCGATTATTATACATCGGAGCCCTGC
        ************************* ********** ***********************

The highlighted column is position 37: A in all human sequences, G in all chimp and gorilla sequences.

33. FOXP2, The Human Speech Gene?

Enard et al. (2002) Nature 418, 869-872

  • Mapped in families with inherited speech defects
  • Forkhead transcription factor

FOXP2 Nucleotide Substitutions: T303N, N325S

34. Sialic Acid Biology: an example of database mining

Chou et al. (1998) Proc. Natl. Acad. Sci. USA 95, 11751-11756

  • Apes have lots of Neu5Gc, humans very little
  • Neu5Gc is located on the surface of epithelial cells
  • Neu5Gc is present in very low levels in the brain even in animals that have lots of Neu5Gc

A 92 bp deletion in the CMP-Neu5Ac hydroxylase is specific to the human lineage.

[Figure: hydroxylase gene diagrams for human, chimp, gorilla, and mouse, each labeled with its ATG start]

35. Differences in Gene Expression in the Brain?

Enard et al. (2002) Science 296, 340-343.

  • Microarrays
  • 2D gels

36. The Modern Family?

  • James Balog, Sally and Isabella
  • Two Boys
  • Tarzan, Jane, Boy, and Cheetah