intro to bioinformatics esti yeger-lotem oleg rokhlenko lecture i: introduction & text based...
Post on 20-Dec-2015
![Page 1: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/1.jpg)
Intro to BioInformatics
Esti Yeger-Lotem, Oleg Rokhlenko
Lecture I: Introduction & Text Based Search
prepared with some help from friends...
Metsada Pasmanik-Chor, Hanah Margalit, Ron Pinter, Gadi Schuster and numerous web resources.
![Page 2: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/2.jpg)
Course requirements:
1. Attend all lectures.
2. Submit all written assignments.
• There will be about 6 assignments.
• Each assignment is to be done and submitted in pairs (except the first).
• The pairs are ideally composed of a person from computer science and a person from life science.
3. A final project or a take-home exam, submitted in pairs.
• Critically review a topic.
• Propose and implement new approaches using tools taught in class.
• Will compose about 50% of the course grade.
4. The course web site: http://webcourse.technion.ac.il/234523
![Page 3: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/3.jpg)
Course outline:
• General information: Introduction to bioinformatics.
• Database search: NCBI - ENTREZ, PubMed, OMIM.
• Nucleotides: Pairwise sequence alignment (BLAST, FASTA).
• Proteins: Pairwise and multiple sequence alignment (BLASTP, PSI-BLAST, FASTA, CLUSTALW).
• Protein structure: secondary and tertiary structure.
• Protein families: motifs, domains, clustering.
• Phylogeny: Tree reconstruction methods.
• The Human Genome Project.
• Gene expression analysis: DNA microarrays (chips), clustering tools.
![Page 4: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/4.jpg)
LITERATURE:
Edited by S.I. Letovsky, 1999.
Please refer to class notes, and to the list of references on our web site.
![Page 5: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/5.jpg)
A Few Basic Concepts of Molecular Biology:
• Genetic material - DNA & RNA.
• DNA as a sequence of bases (A,C,T,G).
• Watson-Crick complementation.
• Proteins.
• The central dogma of molecular biology.
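Watson-Crick complementation can be illustrated with a few lines of Python. This is only a minimal sketch, assuming a DNA strand is given as a plain string over A, C, G, T; the function name `reverse_complement` is chosen here for illustration and does not come from the lecture.

```python
# Watson-Crick pairing: A pairs with T, and C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(dna: str) -> str:
    """Return the opposite strand, read in its own 5'-to-3' direction."""
    # The two strands run antiparallel, so we reverse before complementing.
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


print(reverse_complement("ATGC"))  # GCAT
```

Applying the function twice gives back the original strand, which is a quick sanity check on the pairing table.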
![Page 6: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/6.jpg)
Central Dogma
Gene (DNA) → Transcription → mRNA → Translation → Protein
Cells express different subsets of the genes in different tissues and under different conditions.
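The two steps of the central dogma can be sketched in Python as follows. This is a simplified illustration, not a full model: it assumes the input is the coding strand of a gene, uses the standard genetic code, ignores introns and regulation, and the helper names `transcribe` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G order.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Transcription: the mRNA matches the coding strand with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translation: read codons from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTGGTAA")
print(mrna)             # AUGGCCUGGUAA
print(translate(mrna))  # MAW
```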
![Page 7: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/7.jpg)
Central Paradigm of Molecular Biology
DNA → RNA → Protein → Symptoms (Phenotype)
![Page 8: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/8.jpg)
Central Paradigm of Bioinformatics
Genetic Information
![Page 9: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/9.jpg)
Central Paradigm of Bioinformatics
Molecular Structure
Genetic Information
![Page 10: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/10.jpg)
Central Paradigm of Bioinformatics
Molecular Structure
Genetic Information
Biochemical Function
![Page 11: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/11.jpg)
Central Paradigm of Bioinformatics
Molecular Structure
Genetic Information
Biochemical Function
Symptoms
![Page 12: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/12.jpg)
![Page 13: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/13.jpg)
• Exponential growth of biological information: growth of sequences, structures, and literature.
• Efficient storage and management tools are most important.
![Page 14: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/14.jpg)
Biological Revolution Necessitates Bioinformatics
• New biotechnologies (automatic sequencing, DNA chips, protein identification, mass spectrometry, etc.) produce large quantities of biological data.
• It is impossible to analyze data by manual inspection.
• Bioinformatics: Development of algorithms that enable the analysis of the data (from experiments or from databases).
Data produced by biologists and stored in databases → Bioinformatics: Algorithms and Tools → New information for biological and medical use
![Page 15: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/15.jpg)
Three Specific Examples:
• Molecular evolution and the TREE OF LIFE (a classical, basic science problem, since Darwin’s 1859 “Origin of Species”).
• The Human Genome Project (HGP):
- Write down all of human DNA on a single CD (“completed” 2001).
- Identify all genes, their locations and function (far from completion).
• DNA Chips and personalized medicine (leading edge, future technologies).
![Page 16: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/16.jpg)
TREE OF LIFE: Searching Protein Sequence Databases - How far can we see back?
Origin of the universe?
Formation of the solar system
First self-replicating systems
Prokaryotes/eukaryotes
Plants/animals
Invertebrates/vertebrates
Mammalian radiation
![Page 17: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/17.jpg)
Microarrays (“DNA Chips”)
New technological breakthrough:
– Measure, in one experiment, the RNA expression levels of thousands of genes.
![Page 18: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/18.jpg)
![Page 19: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/19.jpg)
A Big Goal
“The greatest challenge, however, is analytical. … Deeper biological insight is likely to emerge from examining datasets with scores of samples.”
Eric Lander, “Array of hope”, Nat. Genet. 1999.
BIOINFORMATICS: Provide methodologies for elucidating biological knowledge from biological data.
![Page 20: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/20.jpg)
What is BIOINFORMATICS ?
A field of science in which Biology, Computer Science and Information Technology merge into a single discipline.
Goal: To enable the discovery of new biological insights and create a global perspective for biologists.
![Page 21: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/21.jpg)
Disciplines:
• Development of new algorithms and statistics to assess relationships among members of large data sets.
• Analysis and interpretation of various types of data.
• Development and implementation of tools to efficiently access and manage different types of information.
![Page 22: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/22.jpg)
Why use BIOINFORMATICS ?
• An explosive growth in the amount of biological information necessitates the use of computers for cataloging and retrieval.
• A more global perspective in experimental design (from “one scientist = one gene/protein/disease” paradigm to whole organism consideration).
• Data mining - functional/structural information is important for studying the molecular basis of diseases (and evolutionary patterns).
![Page 23: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/23.jpg)
![Page 24: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/24.jpg)
![Page 25: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/25.jpg)
![Page 26: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/26.jpg)
Why is it Hard to Elucidate from Sequence?
• Genetic information is redundant
- Genetic code
- Accepted amino acid replacements
- Intron-Exon variation
- Strain variation
• Structural information is redundant
- Conformational changes
- Different structures may result in similar functions
- Different sequences result in the same structure
• Single genes have multiple functions.
- May act as a metabolic enzyme and as a regulator.
• Genes are 1-dimensional but function depends on 3-dimensional structure.
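The redundancy of the genetic code can be made concrete by grouping the 64 DNA codons by the amino acid they encode. The sketch below assumes the standard genetic code; the variable name `codons_for` is illustrative only.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code in the conventional T, C, A, G order; "*" is a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (or stop) to every codon that encodes it.
codons_for = defaultdict(list)
for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS):
    codons_for[amino_acid].append("".join(codon))

print(len(codons_for["L"]))  # 6
print(codons_for["M"])       # ['ATG']
print(codons_for["W"])       # ['TGG']
```

Because leucine (L) has six codons while methionine (M) and tryptophan (W) have one each, many different DNA sequences encode the same protein, which is one reason sequence alone can be hard to interpret.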
![Page 27: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/27.jpg)
![Page 28: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/28.jpg)
- A model organism for the plant kingdom (Arabidopsis thaliana).
- Haemophilus influenzae (2 Mb).
- First eukaryote genome (Saccharomyces cerevisiae, 12 Mb).
- First multicellular eukaryote (Caenorhabditis elegans, 100 Mb).
- A model organism for the animal kingdom (Drosophila melanogaster).
![Page 30: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/30.jpg)
![Page 32: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/32.jpg)
Similarity searching
NCBI
![Page 33: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/33.jpg)
ENTREZ
A search and retrieval system for information integration.
![Page 34: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/34.jpg)
![Page 35: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/35.jpg)
![Page 36: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/36.jpg)
![Page 37: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/37.jpg)
![Page 38: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/38.jpg)
PUBMED
• The largest, most used and best known of NLM databases (90% of all searches are done in MEDLINE), > 9 million searches per month.
• > 40 databases online, > 20 million records.
• Links to full-text articles as well as links to other third party sites such as libraries and sequencing centers.
• PubMed provides access and links to the integrated molecular biology databases maintained by NCBI.
![Page 39: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/39.jpg)
Searching PubMed
TEXT SEARCHING:
MEDLINE Indexing:
MeSH (Medical Subject Headings): Use a term to limit retrieval (human, animal, male, female, age group, organism, etc.).
Publication Type: Review, clinical trial, letter, journal article, etc.
Search Terms By: Author name, title word, text word, journal title, publication date, phrase, or any combination of these.
• Search words are automatically combined with AND, but Boolean operators (AND, OR, NOT, in UPPER CASE) are welcome.
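A fielded Boolean query of this kind can also be assembled programmatically. The sketch below is an illustration only: the helper names `pubmed_term` and `esearch_url` are hypothetical, the field tags `[Author]`, `[Title]` and `[dp]` (publication date) follow PubMed's search-field syntax, and the endpoint is NCBI's E-utilities `esearch` service, which is not mentioned in the lecture. The code builds the URL but does not contact the server.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (an assumption of this sketch).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_term(*clauses: str, operator: str = "AND") -> str:
    """Join fielded clauses with an UPPER CASE Boolean operator."""
    return f" {operator} ".join(f"({clause})" for clause in clauses)


def esearch_url(term: str, retmax: int = 20) -> str:
    """Build (but do not send) an esearch request URL for PubMed."""
    query = urlencode({"db": "pubmed", "term": term, "retmax": retmax})
    return f"{ESEARCH}?{query}"


term = pubmed_term("Lander ES[Author]", "microarray[Title]", "1999[dp]")
print(term)  # (Lander ES[Author]) AND (microarray[Title]) AND (1999[dp])
print(esearch_url(term))
```

Pasting the printed term into the PubMed search box should have the same effect as the URL; writing the operator explicitly makes it clear how the clauses are combined.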
![Page 40: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/40.jpg)
![Page 41: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/41.jpg)
GenBank Growth (plot; axes: bp, sequences)
![Page 42: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/42.jpg)
NCBI bioinformatics tools -1-
![Page 43: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/43.jpg)
NCBI bioinformatics tools -2-
![Page 44: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/44.jpg)
NCBI bioinformatics tools -3-
![Page 45: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/45.jpg)
http://www.ncbi.nlm.nih.gov/Education/index.htm
![Page 46: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/46.jpg)
OTHER TEXT BASED SEARCHES:
• SRS (Sequence Retrieval System) at EBI, England. http://srs.ebi.ac.uk/
• STAG at DDBJ, Japan. http://stag.genome.ad.jp/
• Expasy at SIB (Swiss Institute of Bioinformatics), Switzerland. http://ca.expasy.org/ExpasyHunt/
![Page 47: Intro to BioInformatics Esti Yeger-Lotem Oleg Rokhlenko Lecture I: Introduction & Text Based Search prepared with some help from friends... Metsada Pasmanik-Chor,](https://reader033.vdocuments.site/reader033/viewer/2022052701/56649d4e5503460f94a2d7af/html5/thumbnails/47.jpg)
International collaboration of NCBI, DDBJ, EMBL