EECS 730 Introduction to Bioinformatics
Function
Luke Huan
Electrical Engineering and Computer Science
http://people.eecs.ku.edu/~jhuan/
23/4/19 EECS 730 2
Overview
Gene ontology
Challenges
What is gene ontology
Construct gene ontology
Text mining, natural language processing and information extraction: An Introduction
Summary
Ontology
<philosophy> A systematic account of Existence.
<artificial intelligence> (From philosophy) An explicit formal specification of how to represent the objects, concepts and other entities that are assumed to exist in some area of interest and the relationships that hold among them.
<information science> The hierarchical structuring of knowledge about things by subcategorising them according to their essential (or at least relevant and/or cognitive) qualities.
This is an extension of the previous senses of "ontology" (above) which has become common in discussions about the difficulty of maintaining subject indices.
The philosophy of indexing everything in existence?
Aristotle’s (384-322 BC) Ontology
Substance: plants, animals, ...
Quality
Quantity
Relation
Where
When
Position
Having
Action
Passion
Ontology and -informatics
In information sciences, ontology is better defined as: “a domain of knowledge, represented by facts and their logical connections, that can be understood by a computer”.
(J. Bard, BioEssays, 2003)
“Ontologies provide controlled, consistent vocabularies to describe concepts and relationships, thereby enabling knowledge sharing”
(Gruber, 1993)
Information Exchange in Bio-sciences
Basic challenges: Definition, definition, definition
What is a name? What is a function?
Cell
Image from http://microscopy.fsu.edu
What’s in a name?
The same name can be used to describe different concepts
What’s in a name?
Glucose synthesis
Glucose biosynthesis
Glucose formation
Glucose anabolism
Gluconeogenesis
All refer to the process of making glucose from simpler components
What’s in a name?
The same name can be used to describe different concepts
A concept can be described using different names
Comparison is difficult – in particular across species or across databases
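A controlled vocabulary addresses the many-names problem by mapping every variant onto one agreed term. The following is a minimal sketch of that idea, not the Gene Ontology's own tooling: the `SYNONYMS` table and both function names are hypothetical, and the table holds only the glucose names listed on the earlier slide.

```python
from typing import Optional

# Hypothetical synonym table built from the slide's glucose example.
# Every variant name points at one canonical term.
SYNONYMS = {
    "glucose synthesis": "gluconeogenesis",
    "glucose biosynthesis": "gluconeogenesis",
    "glucose formation": "gluconeogenesis",
    "glucose anabolism": "gluconeogenesis",
    "gluconeogenesis": "gluconeogenesis",
}


def canonical_term(name: str) -> Optional[str]:
    """Return the canonical term for a free-text name, or None if unknown."""
    key = " ".join(name.lower().split())  # normalise case and whitespace
    return SYNONYMS.get(key)


def same_concept(name_a: str, name_b: str) -> bool:
    """True only when both names resolve to the same known canonical term."""
    term_a = canonical_term(name_a)
    term_b = canonical_term(name_b)
    return term_a is not None and term_a == term_b
```

With such a table, two databases that use different names for the same process can be compared on the canonical term rather than on the raw strings.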
What is Function? The Hammer Example
| Function (what) | Process (why) |
| --- | --- |
| Drive nail (into wood) | Carpentry |
| Drive stake (into soil) | Gardening |
| Smash roach | Pest Control |
| Clown’s juggling object | Entertainment |
Information Explosion
Entering the Genome Sequencing Era
| Eukaryotic Genome Sequences | Year | Genome Size (Mb) | # Genes |
| --- | --- | --- | --- |
| Yeast (S. cerevisiae) | 1996 | 12 | 6,000 |
| Worm (C. elegans) | 1998 | 97 | 19,100 |
| Fly (D. melanogaster) | 2000 | 120 | 13,600 |
| Plant (A. thaliana) | 2001 | 125 | 25,500 |
| Human (H. sapiens, 1st Draft) | 2001 | ~3000 | ~35,000 |
What is the Gene Ontology?
A Common Language for Annotation of Genes from Yeast, Flies and Mice
…and Plants and Worms
…and Humans
…and anything else!
http://www.geneontology.org/
What is the Gene Ontology?
Gene annotation system
Controlled vocabulary that can be applied to all organisms (organism independent)
Used to describe gene products (proteins and RNA) in any organism
The 3 Gene Ontologies
Molecular Function = elemental activity/task
the tasks performed by individual gene products; examples are carbohydrate binding and ATPase activity
Biological Process = biological goal or objective
broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions
Cellular Component = location or complex
subcellular structures, locations, and macromolecular complexes; examples include nucleus, telomere, and RNA polymerase II holoenzyme
Cellular Component: where a gene product acts
Enzyme complexes in the component ontology refer to places, not activities.
Molecular Function
insulin binding
insulin receptor activity
Molecular Function: activities or “jobs” of a gene product
glucose-6-phosphate isomerase activity
Molecular Function
A gene product may have several functions; a function term refers to a single reaction or activity, not a gene product.
Sets of functions make up a biological process.
Biological Process: a commonly recognized series of events
cell division
transcription
Metabolism: degradation or synthesis of biomolecules
Development: how a group of cells becomes a tissue
social behavior
Ontology applications
Can be used to:
Formalise the representation of biological knowledge
Standardise database submissions
Provide unified access to information through ontology-based querying of databases, both human and computational
Improve management and integration of data within databases
Facilitate data mining
Gene Ontology Structure
Ontologies can be represented as directed acyclic graphs (DAGs), where the nodes are connected by edges
Nodes = terms in biology
Edges = relationships between the terms: is-a, part-of
Parent-Child Relationships
Chromosome
Cytoplasmic chromosome
Mitochondrial chromosome
Plastid chromosome
Nuclear chromosome
A child is a subset or instance of a parent’s elements
Parent-Child Relationships
[Diagram: a DAG over the terms cell, membrane, chloroplast, mitochondrial membrane and chloroplast membrane, with edges labelled is-a and part-of]
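The DAG structure can be sketched with a plain edge list. This is an illustrative sketch only: the five term names come from the slide, but the exact edge layout below is an assumption reconstructed from those labels, and `EDGES`, `build_parents` and `ancestors` are hypothetical names. The point it shows is that a term may have more than one parent, which a tree cannot express.

```python
from collections import defaultdict

# (child, relationship, parent) triples.
# ASSUMPTION: this edge layout is inferred from the slide's node labels.
EDGES = [
    ("membrane", "part-of", "cell"),
    ("chloroplast", "part-of", "cell"),
    ("mitochondrial membrane", "is-a", "membrane"),
    ("chloroplast membrane", "is-a", "membrane"),
    ("chloroplast membrane", "part-of", "chloroplast"),
]


def build_parents(edges):
    """Index the edge list as child -> [(relationship, parent), ...]."""
    parents = defaultdict(list)
    for child, relationship, parent in edges:
        parents[child].append((relationship, parent))
    return parents


def ancestors(term, parents):
    """Collect every term reachable by following parent edges upward."""
    seen = set()
    stack = [term]
    while stack:
        node = stack.pop()
        for _relationship, parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


PARENTS = build_parents(EDGES)
```

Here `ancestors("chloroplast membrane", PARENTS)` reaches cell along two different paths, once through membrane and once through chloroplast.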
Annotation in GO
A gene product is usually a protein but can be a functional RNA
An annotation is a piece of information associated with a gene product
A GO annotation is a Gene Ontology term associated with a gene product
Terms, Definitions, IDs
Term: MAPKKK cascade (mating sensu Saccharomyces)
Goid: GO:0007244
Definition: OBSOLETE. MAPKKK cascade involved in transduction of mating pheromone signal, as described in Saccharomyces.
Evidence code: how annotation is done
Definition_reference: PMID:9561267
Annotation Example
Gene Product: nek2
GO Term: centrosome (GO:0005813)
Reference: PMID: 11956323
Evidence Code: IDA (Inferred from Direct Assay)
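An annotation of this kind is naturally a small record with four parts: gene product, GO term, reference and evidence code. The sketch below uses a hypothetical `GOAnnotation` class and `describe` helper (neither is a Gene Ontology API); the field values are the ones shown on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GOAnnotation:
    """One GO term associated with one gene product (hypothetical record type)."""
    gene_product: str
    go_id: str
    go_term: str
    evidence_code: str
    reference: str


# The nek2 example from the slide.
nek2_annotation = GOAnnotation(
    gene_product="nek2",
    go_id="GO:0005813",
    go_term="centrosome",
    evidence_code="IDA",
    reference="PMID:11956323",
)


def describe(annotation: GOAnnotation) -> str:
    """Render an annotation as a single readable line."""
    return (
        f"{annotation.gene_product} -> {annotation.go_term} "
        f"({annotation.go_id}) [{annotation.evidence_code}, {annotation.reference}]"
    )
```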
GO Annotation
Evidence Code
Indicate the type of evidence in the cited source that supports the association between the gene product and the GO term
http://www.geneontology.org/GO.evidence.html
Types of evidence codes
Experimental codes - IDA, IMP, IGI, IPI, IEP
Computational codes - ISS, IEA, RCA, IGC
Author statement - TAS, NAS
Other codes - IC, ND
Two types of annotation
Manual Annotation
Electronic Annotation
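The grouping above can be turned into a small lookup, which is handy when filtering annotations by how well supported they are. This is a sketch under stated assumptions: the group names and code lists are copied from the slide, the function names are hypothetical, and treating IEA (Inferred from Electronic Annotation) as the marker of electronic annotation goes beyond what the slide itself states.

```python
# Evidence code groups exactly as listed on the slide.
EVIDENCE_GROUPS = {
    "experimental": {"IDA", "IMP", "IGI", "IPI", "IEP"},
    "computational": {"ISS", "IEA", "RCA", "IGC"},
    "author statement": {"TAS", "NAS"},
    "other": {"IC", "ND"},
}


def evidence_group(code: str) -> str:
    """Return the slide's group name for an evidence code."""
    normalised = code.strip().upper()
    for group, codes in EVIDENCE_GROUPS.items():
        if normalised in codes:
            return group
    raise ValueError(f"unknown evidence code: {code!r}")


def is_electronic(code: str) -> bool:
    """ASSUMPTION: only IEA marks an annotation made without manual review."""
    return code.strip().upper() == "IEA"
```

A typical use would be to keep only annotations whose `evidence_group` is "experimental" when high-confidence data is needed.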
Beyond GO – Open Biomedical Ontologies
Orthogonal to existing ontologies to facilitate combinatorial approaches
Share unique identifier space
Include definitions
Gene Ontology and Text Mining
Derive ontology from text data
More general goal: understand text data automatically
Finding GO terms
In this study, we report the isolation and molecular characterization of the B. napus PERK1 cDNA, that is predicted to encode a novel receptor-like kinase. We have shown that like other plant RLKs, the kinase domain of PERK1 has serine/threonine kinase activity. In addition, the location of a PERK1-GFP fusion protein to the plasma membrane supports the prediction that PERK1 is an integral membrane protein…these kinases have been implicated in early stages of wound response…
Process: response to wounding GO:0009611
Function: protein serine/threonine kinase activity GO:0004674
Component: integral to plasma membrane GO:0005887
…for B. napus PERK1 protein (Q9ARH1)
PubMed ID: 12374299
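The simplest automated version of this curation step is a dictionary lookup: scan the text for trigger phrases and report the GO terms they suggest. The sketch below is deliberately naive and is not how curators or production text-mining systems work. The `LEXICON` of trigger phrases is hypothetical, chosen by hand to match the passage; only the GO IDs and term names come from the slide.

```python
# Hypothetical trigger phrases mapped to (aspect, GO ID, term name).
# The GO IDs and term names are the ones given on the slide.
LEXICON = {
    "serine/threonine kinase activity": (
        "Function", "GO:0004674", "protein serine/threonine kinase activity"),
    "plasma membrane": (
        "Component", "GO:0005887", "integral to plasma membrane"),
    "wound response": (
        "Process", "GO:0009611", "response to wounding"),
}


def find_go_terms(text: str):
    """Return the lexicon entries whose trigger phrase occurs in the text."""
    lowered = " ".join(text.lower().split())  # normalise case and whitespace
    return [entry for phrase, entry in LEXICON.items() if phrase in lowered]


# Short excerpts of the passage quoted on the slide.
EXCERPT = (
    "the kinase domain of PERK1 has serine/threonine kinase activity ... "
    "fusion protein to the plasma membrane ... "
    "these kinases have been implicated in early stages of wound response"
)
```

Exact phrase matching misses paraphrases and ignores negation and context, which is why the following slides turn to natural language processing.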
Mining Text Data
Data Mining / Knowledge Discovery
Structured Data: HomeLoan (Loanee: Frank Rizzo, Lender: MWF, Agency: Lake View, Amount: $200,000, Term: 15 years)
Multimedia
Free Text: Frank Rizzo bought his home from LakeView Real Estate in 1992. He paid $200,000 under a 15-year loan from MW Financial.
Hypertext: <a href>Frank Rizzo</a> bought <a href>this home</a> from <a href>LakeView Real Estate</a> in <b>1992</b>. <p>...Loans($200K,[map],...)
(Taken from ChengXiang Zhai, CS 397cxz, UIUC, CS – Fall 2003)
Bag-of-Tokens Approaches
Documents → Feature Extraction → Token Sets
Document: Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal. Now we are engaged in a great civil war, testing whether that nation, or …
Token set: nation: 5, civil: 1, war: 2, men: 2, died: 4, people: 5, Liberty: 1, God: 1, …
Loses all order-specific information! Severely limits context!
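A bag-of-tokens representation can be sketched in a few lines with the standard library. The function name `bag_of_tokens` and the tokenisation rule (lowercase runs of letters and apostrophes) are assumptions made for illustration; real systems usually add stop-word removal and stemming.

```python
import re
from collections import Counter


def bag_of_tokens(text: str) -> Counter:
    """Count lowercase word tokens, discarding order and punctuation."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


# Two sentences with opposite meanings produce the same bag.
bag_a = bag_of_tokens("A dog is chasing a boy")
bag_b = bag_of_tokens("A boy is chasing a dog")
```

Since `bag_a == bag_b`, the representation cannot tell who is chasing whom, which is the loss of order-specific information noted above.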
Natural Language Processing
Sentence: A dog is chasing a boy on the playground
Lexical analysis (part-of-speech tagging): Det Noun Aux Verb Det Noun Prep Det Noun
Syntactic analysis (parsing): Noun Phrase, Complex Verb, Noun Phrase, Noun Phrase; Prep Phrase, Verb Phrase; Verb Phrase; Sentence
Semantic analysis: Dog(d1). Boy(b1). Playground(p1). Chasing(d1,b1,p1).
Inference: Scared(x) if Chasing(_,x,_). Combined with the facts above, this gives Scared(b1).
Pragmatic analysis (speech act): A person saying this may be reminding another person to get the dog back…
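The inference step can be illustrated by applying the single rule Scared(x) if Chasing(_,x,_) to the facts from semantic analysis. This is a toy sketch, not a general logic engine: facts are encoded as tuples, and the names `FACTS` and `infer_scared` are hypothetical.

```python
# Facts from semantic analysis, encoded as (predicate, arg1, arg2, ...).
FACTS = {
    ("Dog", "d1"),
    ("Boy", "b1"),
    ("Playground", "p1"),
    ("Chasing", "d1", "b1", "p1"),
}


def infer_scared(facts):
    """Apply the rule Scared(x) if Chasing(_, x, _) once and return all facts."""
    derived = set(facts)
    for fact in facts:
        if fact[0] == "Chasing" and len(fact) == 4:
            chased = fact[2]  # the second argument is the one being chased
            derived.add(("Scared", chased))
    return derived


ALL_FACTS = infer_scared(FACTS)
```

After the rule fires, `ALL_FACTS` contains `("Scared", "b1")` alongside the four original facts.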
General NLP: Too Difficult!
Word-level ambiguity
“design” can be a noun or a verb (ambiguous POS)
“root” has multiple meanings (ambiguous sense)
Syntactic ambiguity
“natural language processing” (modification)
“A man saw a boy with a telescope.” (PP attachment)
Anaphora resolution
“John persuaded Bill to buy a TV for himself.” (himself = John or Bill?)
Presupposition
“He has quit smoking.” implies that he smoked before.
Humans rely on context to interpret (when possible). This context may extend beyond a given document!
Reference for GO
Gene ontology teaching resources: http://www.geneontology.org/GO.teaching.resources.shtml
References for Text Mining
1. C. D. Manning and H. Schütze, “Foundations of Statistical Natural Language Processing”, MIT Press, 1999.
2. S. Russell and P. Norvig, “Artificial Intelligence: A Modern Approach”, Prentice Hall, 1995.
3. S. Chakrabarti, “Mining the Web: Statistical Analysis of Hypertext and Semi-Structured Data”, Morgan Kaufmann, 2002.
4. G. Miller, R. Beckwith, C. Fellbaum, D. Gross, K. Miller, and R. Tengi. Five papers on WordNet. Princeton University, August 1993.
5. C. Zhai, Introduction to NLP, Lecture Notes for CS 397cxz, UIUC, Fall 2003.
6. M. Hearst, Untangling Text Data Mining, ACL’99, invited paper. http://www.sims.berkeley.edu/~hearst/papers/acl99/acl99-tdm.html
7. R. Sproat, Introduction to Computational Linguistics, LING 306, UIUC, Fall 2003.
8. A Road Map to Text Mining and Web Mining, University of Texas resource page. http://www.cs.utexas.edu/users/pebronia/text-mining/
9. Computational Linguistics and Text Mining Group, IBM Research, http://www.research.ibm.com/dssgrp/
Acknowledgement
Some slides are taken from http://www.tulane.edu/~wiser/cells/.