
Sequential Learning

What is Sequential Learning?

Topics from class:
- Classification learning: learn x → y
  - Linear (naïve Bayes, logistic regression, ...)
  - Nonlinear (neural nets, trees, ...)
- Not quite classification learning:
  - Regression (y is a number)
  - Clustering, EM, graphical models: there is no y, so build a distributional model of the instances
  - Collaborative filtering / matrix factoring: many linked regression problems
- Learning for sequences: learn (x1, x2, ..., xk) → (y1, ..., yk), a special case of structured prediction

A sequence learning task: named entity recognition (NER)

October 14, 2002, 4:00 a.m. PT

For years, Microsoft Corporation CEO Bill Gates railed against the economic philosophy of open-source software with Orwellian fervor, denouncing its communal licensing as a "cancer" that stifled technological innovation.

Today, Microsoft claims to "love" the open-source concept, by which software code is made public to encourage improvement and development by outside programmers. Gates himself says Microsoft will gladly disclose its crown jewels--the coveted code behind the Windows operating system--to select customers.

"We can be open source. We love the concept of shared source," said Bill Veghte, a Microsoft VP. "That's a super-important shift for us in terms of code access.

Richard Stallman, founder of the Free Software Foundation, countered saying...

NER

[The same news story, shown again with its person, company, and jobTitle spans highlighted.]

Named entity recognition (NER) is one part of information extraction (IE)


[The same annotated story, with the highlighted spans collected into a table of extracted entities:]

NAME          TITLE     ORGANIZATION
Bill Gates    CEO       Microsoft
Bill Veghte   VP        Microsoft
Richard St    founder   Free Soft..


IE Example: Job Openings from the Web

foodscience.com-Job2
JobTitle: Ice Cream Guru
Employer: foodscience.com
JobCategory: Travel/Hospitality
JobFunction: Food Services
JobLocation: Upper Midwest
Contact Phone: 800-488-2611
DateExtracted: January 8, 2001
Source: www.foodscience.com/jobs_midwest.html
OtherCompanyJobs: foodscience.com-Job16

WB's first application area is extracting job openings from the web: start at the company home page, automatically follow links to find the job-openings section, segment the job openings from each other, and extract the job title, location, description, and all of the fields above. (Notice that capitalization, position on the page, and bold face are just as much indicators of this being a job title as the words themselves. Notice also that part of the IE task is figuring out which links to follow, and often the data for one relation comes from multiple different pages.) A rough sketch of such a record as a data structure appears below.

IE Example: A Job Search Site
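Looking back at the job-opening record above, here is a rough sketch of the extraction target expressed as a structured record. This is only an illustration: the class and field names are made up for this sketch, not taken from any real extraction system.

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class JobOpening:
    """One extracted job posting; the fields mirror the slots shown in the example above."""
    job_id: str
    job_title: str
    employer: str
    job_category: Optional[str] = None
    job_function: Optional[str] = None
    job_location: Optional[str] = None
    contact_phone: Optional[str] = None
    date_extracted: Optional[str] = None
    source_url: Optional[str] = None
    other_company_jobs: List[str] = field(default_factory=list)

# The record from the slide, re-expressed as a structured object.
example = JobOpening(
    job_id="foodscience.com-Job2",
    job_title="Ice Cream Guru",
    employer="foodscience.com",
    job_category="Travel/Hospitality",
    job_function="Food Services",
    job_location="Upper Midwest",
    contact_phone="800-488-2611",
    date_extracted="January 8, 2001",
    source_url="www.foodscience.com/jobs_midwest.html",
    other_company_jobs=["foodscience.com-Job16"],
)
print(example.job_title, "@", example.employer)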

How can we do NER?

[The same news story as above.]


NER

[The same story again, with the person, company, and jobTitle spans highlighted.]


Most common approach: NER by classifying tokens

Given a sentence:

Yesterday Pedro Domingos flew to New York.

1) Break the sentence into tokens, and classify each token with a label indicating what sort of entity it is part of:

Yesterday / background
Pedro / person name
Domingos / person name
flew / background
to / background
New / location name
York / location name

2) Identify names based on the entity labels:

Person name: Pedro Domingos
Location name: New York

3) To learn an NER system, use YFCL and whatever features you want (a code sketch follows below). For example, features for the token "Domingos" might be:

Feature           Value
isCapitalized     yes
numLetters        8
suffix2           -os
word-1-to-right   flew
word-2-to-right   to

NER by classifying tokens

Another common labeling scheme is BIO (begin, inside, outside): e.g., beginPerson, insidePerson, beginLocation, insideLocation, outside.
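A minimal end-to-end sketch of steps 1-3 under the BIO scheme, assuming scikit-learn's logistic regression as the "favorite classifier learner". The feature set, the single toy training sentence, and the exact tag names are illustrative, not taken from these slides.

from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def token_features(tokens, i):
    """Simple per-token features, in the spirit of the feature table above."""
    w = tokens[i]
    return {
        "isCapitalized": w[0].isupper(),
        "numLetters": len(w),
        "suffix2": w[-2:],
        "word-1-to-right": tokens[i + 1] if i + 1 < len(tokens) else "<END>",
        "word-2-to-right": tokens[i + 2] if i + 2 < len(tokens) else "<END>",
        "word-1-to-left": tokens[i - 1] if i > 0 else "<START>",
    }

def bio_to_spans(tokens, tags):
    """Step 2: collapse B-x / I-x / O tags back into (entity type, text) spans."""
    spans, cur_type, cur_toks = [], None, []
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if cur_type:
                spans.append((cur_type, " ".join(cur_toks)))
            cur_type, cur_toks = tag[2:], [tok]
        elif tag.startswith("I-") and cur_type == tag[2:]:
            cur_toks.append(tok)
        else:  # "O", or an I- tag that does not continue the current entity
            if cur_type:
                spans.append((cur_type, " ".join(cur_toks)))
            cur_type, cur_toks = None, []
    if cur_type:
        spans.append((cur_type, " ".join(cur_toks)))
    return spans

# Toy training data: one labeled sentence (illustrative only).
train_tokens = "Yesterday Pedro Domingos flew to New York".split()
train_tags = ["O", "B-person", "I-person", "O", "O", "B-location", "I-location"]

X = [token_features(train_tokens, i) for i in range(len(train_tokens))]
clf = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(X, train_tags)  # step 3: train "your favorite classifier" on per-token features

# Steps 1-2 at prediction time: classify each token, then read off the spans.
pred = clf.predict(X)
print(bio_to_spans(train_tokens, list(pred)))
# e.g. [('person', 'Pedro Domingos'), ('location', 'New York')] if the classifier fits the toy data

A real system would train on many labeled sentences, but the structure is the point: each token becomes one ordinary classification example, and the predicted labels are then stitched back into entity spans.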

BIO also leads to strong dependencies between nearby labels (e.g., inside follows begin).

A problem: similar labels tend to cluster together in text. How can you model these dependencies? A hidden Markov model (HMM): the naïve Bayes of sequences.

Other nice problems for HMMs

Parsing addresses:

4089 Whispering Pines Nobel Drive San Diego CA 92122
(House number = 4089, Building = Whispering Pines, Road = Nobel Drive, City = San Diego, State = CA, Zip = 92122)

Parsing citations:

P.P. Wangikar, T.P. Graycar, D.A. Estell, D.S. Clark, J.S. Dordick (1993) Protein and Solvent Engineering of Subtilisin BPN' in Nearly Anhydrous Organic Media. J. Amer. Chem. Soc. 115, 12231-12237.
(fields: Author, Year, Title, Journal, Volume)

Other nice problems for HMMs

Sentence segmentation; finding words (to index) in Asian languages.

Morphology: finding the components of a single word, e.g. uygarlaştıramadıklarımızdanmışsınızcasına, or "(behaving) as if you are among those whom we could not civilize".

Document analysis: finding tables in plain-text documents.

Video segmentation: splitting videos into naturally meaningful sections.

Converting text to speech (TTS), and converting speech to text (ASR).

Other nice problems for HMMs

Modeling biological sequences: e.g., segmenting DNA into genes (transcribed into proteins or not), promoters, TF binding sites; identifying variants of a single gene.

[Figure: the lac operon, showing the lac operator, lacZ gene, and CAP binding site together with the CAP protein, lac repressor protein, and RNA polymerase, and their expresses / promotes / binds-to / competes / inhibits relationships.]

What Is an HMM?

HMMs: History
- Markov chains: Andrey Markov (1906); random walks and Brownian motion.
- Used in Shannon's work on information theory (1948).
- Baum-Welch learning algorithm: late 60s, early 70s. Used mainly for speech in the 60s-70s.
- Late 80s and 90s: David Haussler (a major player in learning theory in the 80s) began to use HMMs for modeling biological sequences.
- Mid-late 1990s: Dayne Freitag / Andrew McCallum. Freitag's thesis with Tom Mitchell on IE from the Web using logic programs, grammar induction, etc.; McCallum: multinomial Naïve Bayes for text; with McCallum, IE using HMMs on CORA.


What is an HMM?

Generative process:
- Choose a start state S1 using Pr(S1)
- For i = 1 ... n:
  - Emit a symbol xi using Pr(x|Si)
  - Transition from Si to Sj using Pr(Sj|Si)
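A minimal sketch of this generative process in code, for a hypothetical two-state HMM over the symbols A and C. All of the numbers below are made up for illustration and are not taken from these slides.

import random

# Hypothetical two-state HMM over the symbols A and C (numbers made up for illustration).
start_prob = {"S1": 0.7, "S2": 0.3}              # Pr(S1)
trans_prob = {"S1": {"S1": 0.8, "S2": 0.2},      # Pr(next state | current state)
              "S2": {"S1": 0.4, "S2": 0.6}}
emit_prob = {"S1": {"A": 0.9, "C": 0.1},         # Pr(symbol | state)
             "S2": {"A": 0.3, "C": 0.7}}

def sample(dist):
    """Draw one key from a {outcome: probability} dict."""
    outcomes = list(dist)
    return random.choices(outcomes, weights=[dist[o] for o in outcomes])[0]

def generate(n):
    """Run the generative process: choose a start state, then emit a symbol and transition, n times."""
    states, symbols = [], []
    state = sample(start_prob)
    for _ in range(n):
        states.append(state)
        symbols.append(sample(emit_prob[state]))   # emit x_i using Pr(x | S_i)
        state = sample(trans_prob[state])          # move to the next state using Pr(S' | S)
    return states, symbols

states, symbols = generate(10)
print("hidden state sequence:   ", states)   # normally unobserved
print("observed symbol sequence:", symbols)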

Usually the token sequence x1 x2 x3 ... is observed and the state sequence S1 S2 S3 ... is not (hidden).

An HMM is a special case of a Bayes net.

[Figure: a small example HMM with four states S1-S4, transition probabilities (0.9, 0.5, 0.5, 0.8, 0.2, 0.1) on its edges, and a per-state emission distribution over the symbols A and C (0.6/0.4, 0.3/0.7, 0.5/0.5, 0.9/0.1).]

In previous models, Pr(ai) depended only on the symbols appearing within some distance before it, but not on the position of the symbol, i.e. not on i. To model drifting/evolving sequences we need something more powerful; hidden Markov models provide one such option. Here states do not correspond to substrings, hence the name "hidden". There are two kinds of probabilities: transition probabilities, as before, but emission probabilities too.

Calculating Pr(seq) is not easy, since each symbol can potentially be generated by more than one state.
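The standard way around this is the forward algorithm, which sums over all hidden state sequences by dynamic programming. Here is a minimal sketch, reusing the hypothetical two-state A/C parameters from the sampling example above (redefined so the snippet stands alone).

# Hypothetical parameters, same as the sampling sketch above.
start_prob = {"S1": 0.7, "S2": 0.3}
trans_prob = {"S1": {"S1": 0.8, "S2": 0.2}, "S2": {"S1": 0.4, "S2": 0.6}}
emit_prob = {"S1": {"A": 0.9, "C": 0.1}, "S2": {"A": 0.3, "C": 0.7}}

def forward_prob(symbols):
    """Pr(symbols) = sum over all hidden state sequences, computed by the forward recursion."""
    # alpha[s] holds Pr(x_1 .. x_t, S_t = s) after processing t symbols.
    alpha = {s: start_prob[s] * emit_prob[s][symbols[0]] for s in start_prob}
    for x in symbols[1:]:
        alpha = {s2: sum(alpha[s1] * trans_prob[s1][s2] for s1 in alpha) * emit_prob[s2][x]
                 for s2 in start_prob}
    return sum(alpha.values())

print(forward_prob(["A", "C", "C", "A"]))   # probability of observing the sequence A C C A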
