![Page 1: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/1.jpg)
Relation Extraction
What is relation extraction?
Many slides adapted from Dan Jurafsky
![Page 2: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/2.jpg)
Extracting relations from text
• Company report: “International Business Machines Corporation (IBM or the company) was incorporated in the State of New York on June 16, 1911, as the Computing-Tabulating-Recording Co. (C-T-R)…”
• Extracted Complex Relation: Company-Founding
  • Company: IBM
  • Location: New York
  • Date: June 16, 1911
  • Original-Name: Computing-Tabulating-Recording Co.
• But we will focus on the simpler task of extracting relation triples
  • Founding-year(IBM, 1911)
  • Founding-location(IBM, New York)
![Page 3: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/3.jpg)
Why Relation Extraction?
• Create new structured knowledge bases, useful for any app
• Augment current knowledge bases
  • Adding words to WordNet thesaurus, facts to FreeBase or DBPedia
• Support question answering
  • The granddaughter of which actor starred in the movie “E.T.”?
    (acted-in ?x “E.T.”)(is-a ?y actor)(granddaughter-of ?x ?y)
• But which relations should we extract?
![Page 4: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/4.jpg)
Automatic Content Extraction (ACE)
17 relations from 2008 “Relation Extraction Task”
• ARTIFACT: User-Owner-Inventor-Manufacturer
• GENERAL AFFILIATION: Citizen-Resident-Ethnicity-Religion, Org-Location-Origin
• ORG AFFILIATION: Founder, Employment, Membership, Ownership, Student-Alum, Investor, Sports-Affiliation
• PART-WHOLE: Geographical, Subsidiary
• PERSON-SOCIAL: Business, Family, Lasting Personal
• PHYSICAL: Located, Near
![Page 5: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/5.jpg)
Automatic Content Extraction (ACE)
• Physical-Located (PER-GPE): He was in Tennessee
• Part-Whole-Subsidiary (ORG-ORG): XYZ, the parent company of ABC
• Person-Social-Family (PER-PER): John’s wife Yoko
• Org-AFF-Founder (PER-ORG): Steve Jobs, co-founder of Apple…
![Page 6: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/6.jpg)
UMLS: Unified Medical Language System
• 134 entity types, 54 relations
  • Injury disrupts Physiological Function
  • Bodily Location location-of Biologic Function
  • Anatomical Structure part-of Organism
  • Pharmacologic Substance causes Pathological Function
  • Pharmacologic Substance treats Pathologic Function
![Page 7: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/7.jpg)
Databases of Wikipedia Relations
Relations extracted from Infobox:
• Stanford state California
• Stanford motto “Die Luft der Freiheit weht”
• …
[Figure: Wikipedia Infobox]
![Page 8: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/8.jpg)
How to build relation extractors
1. Hand-written patterns
2. Supervised machine learning
3. Semi-supervised and unsupervised
  • Bootstrapping (using seeds)
  • Distant supervision
  • Unsupervised learning from the web
![Page 10: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/10.jpg)
Relation Extraction
Using patterns to extract relations
![Page 11: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/11.jpg)
Extracting Richer Relations Using Rules
• Intuition: relations often hold between specific entities
  • located-in (ORGANIZATION, LOCATION)
  • founded (PERSON, ORGANIZATION)
  • cures (DRUG, DISEASE)
• Start with Named Entity tags to help extract relation!
![Page 12: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/12.jpg)
Named Entities aren’t quite enough.
Which relations hold between 2 entities?
• Drug, Disease: Cure? Prevent? Cause?
![Page 13: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/13.jpg)
What relations hold between 2 entities?
• PERSON, ORGANIZATION: Founder? Investor? Member? Employee? President?
![Page 14: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/14.jpg)
Extracting Richer Relations Using Rules and Named Entities
Who holds what office in what organization?
PERSON, POSITION of ORG
  • George Marshall, Secretary of State of the United States
PERSON (named|appointed|chose|etc.) PERSON Prep? POSITION
  • Truman appointed Marshall Secretary of State
PERSON [be]? (named|appointed|etc.) Prep? ORG POSITION
  • George Marshall was named US Secretary of State
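The first rule above, PERSON, POSITION of ORG, could be written as a regular expression over text whose entities are already tagged. This is only a minimal sketch: the inline bracket markup (`[PER ...]`, `[POS ...]`, `[ORG ...]`) and the function name `extract_office_holders` are assumptions made for illustration, not a format defined in these slides.

```python
import re

# Hypothetical inline NE markup: [PER ...], [POS ...], [ORG ...].
# Rule being encoded: PERSON, POSITION of ORG
PERSON_POSITION_OF_ORG = re.compile(
    r"\[PER (?P<person>[^\]]+)\]\s*,\s*"
    r"\[POS (?P<position>[^\]]+)\]\s+of\s+"
    r"\[ORG (?P<org>[^\]]+)\]"
)

def extract_office_holders(tagged_text):
    """Return (person, position, org) triples matched by the rule."""
    return [
        (m.group("person"), m.group("position"), m.group("org"))
        for m in PERSON_POSITION_OF_ORG.finditer(tagged_text)
    ]

tagged = "[PER George Marshall], [POS Secretary of State] of [ORG the United States]"
print(extract_office_holders(tagged))
```

Each additional rule on the slide would need its own expression, which is one reason hand-built patterns tend to be precise but low in recall.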
![Page 15: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/15.jpg)
Hand-built patterns for relations
• Plus:
  • Human patterns tend to be high-precision
  • Can be tailored to specific domains
• Minus:
  • Human patterns are often low-recall
  • A lot of work to think of all possible patterns!
  • Don’t want to have to do this for every relation!
  • We’d like better accuracy
![Page 17: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/17.jpg)
Relation Extraction
Supervised relation extraction
![Page 18: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/18.jpg)
Supervised machine learning for relations
• Choose a set of relations we’d like to extract
• Choose a set of relevant named entities
• Find and label data
  • Choose a representative corpus
  • Label the named entities in the corpus
  • Hand-label the relations between these entities
  • Break into training, development, and test
• Train a classifier on the training set
![Page 19: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/19.jpg)
How to do classification in supervised relation extraction
1. Find all pairs of named entities (usually in same sentence)
2. Decide if 2 entities are related
3. If yes, classify the relation
• Why the extra step?
  • Faster classification training by eliminating most pairs
  • Can use distinct feature-sets appropriate for each task.
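The three steps can be sketched as a small pipeline. The two classifiers are passed in as plain functions; the toy versions below (`toy_is_related`, `toy_relation_type`) are hypothetical stand-ins for trained models, and the sketch treats pairs as unordered, which a real system would probably not do.

```python
from itertools import combinations

def candidate_pairs(entities_by_sentence):
    """Step 1: yield every pair of entity mentions found in the same sentence."""
    for sent_id, mentions in enumerate(entities_by_sentence):
        for m1, m2 in combinations(mentions, 2):
            yield sent_id, m1, m2

def classify_pairs(pairs, is_related, relation_type):
    """Steps 2 and 3: drop unrelated pairs, then label the remaining ones."""
    results = []
    for _sent_id, m1, m2 in pairs:
        if not is_related(m1, m2):                        # step 2: binary decision
            continue
        results.append((m1, m2, relation_type(m1, m2)))   # step 3: relation label
    return results

# Toy stand-ins for trained classifiers (assumptions for illustration only).
def toy_is_related(m1, m2):
    return {m1, m2} == {"American Airlines", "AMR"}

def toy_relation_type(m1, m2):
    return "SUBSIDIARY"

sentences = [["American Airlines", "AMR", "Tim Wagner"]]
pairs = list(candidate_pairs(sentences))
print(len(pairs))
print(classify_pairs(pairs, toy_is_related, toy_relation_type))
```

Because the binary filter runs first, the relation classifier only ever sees the small fraction of pairs judged to be related.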
![Page 20: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/20.jpg)
Relation Extraction
Classify the relation between two entities in a sentence
American Airlines, a unit of AMR, immediately matched the move, spokesman Tim Wagner said.
Candidate labels: SUBSIDIARY, FAMILY, EMPLOYMENT, NIL, FOUNDER, CITIZEN, INVENTOR, …
![Page 21: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/21.jpg)
Word Features for Relation Extraction
Example: American Airlines [Mention 1], a unit of AMR, immediately matched the move, spokesman Tim Wagner [Mention 2] said
• Headwords of M1 and M2, and combination
  • Airlines, Wagner, Airlines-Wagner
• Bag of words and bigrams in M1 and M2
  • {American, Airlines, Tim, Wagner, American Airlines, Tim Wagner}
• Words or bigrams in particular positions left and right of M1/M2
  • M2: -1 spokesman
  • M2: +1 said
• Bag of words or bigrams between the two entities
  • {a, AMR, of, immediately, matched, move, spokesman, the, unit}
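A rough sketch of how these word features might be computed for the example sentence. The function name `word_features`, the (start, end) token-span representation, and the rule "head word = last token of the mention" are simplifying assumptions, not something the slides specify.

```python
def word_features(tokens, m1, m2):
    """m1, m2 are (start, end) token spans, end exclusive, with m1 before m2."""
    m1_toks, m2_toks = tokens[m1[0]:m1[1]], tokens[m2[0]:m2[1]]
    head1, head2 = m1_toks[-1], m2_toks[-1]   # assumption: head = last token

    def bigrams(toks):
        return [" ".join(pair) for pair in zip(toks, toks[1:])]

    return {
        "head_m1": head1,
        "head_m2": head2,
        "head_pair": f"{head1}-{head2}",
        # Words and bigrams inside the two mentions.
        "bag_mentions": set(m1_toks + m2_toks + bigrams(m1_toks) + bigrams(m2_toks)),
        # Words immediately left and right of M2.
        "m2_minus_1": tokens[m2[0] - 1] if m2[0] > 0 else None,
        "m2_plus_1": tokens[m2[1]] if m2[1] < len(tokens) else None,
        # Words between the mentions, punctuation dropped.
        "bag_between": {t for t in tokens[m1[1]:m2[0]] if t.isalnum()},
    }

tokens = ["American", "Airlines", ",", "a", "unit", "of", "AMR", ",",
          "immediately", "matched", "the", "move", ",", "spokesman",
          "Tim", "Wagner", "said"]
feats = word_features(tokens, (0, 2), (14, 16))
for name, value in feats.items():
    print(name, value)
```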
![Page 22: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/22.jpg)
Named Entity Type and Mention Level Features for Relation Extraction
Example: American Airlines [Mention 1], a unit of AMR, immediately matched the move, spokesman Tim Wagner [Mention 2] said
• Named-entity types
  • M1: ORG
  • M2: PERSON
• Concatenation of the two named-entity types
  • ORG-PERSON
• Entity Level of M1 and M2 (NAME, NOMINAL, PRONOUN)
  • M1: NAME [it or he would be PRONOUN]
  • M2: NAME [the company would be NOMINAL]
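These type and level features are simple to assemble once entity types are known. In the sketch below the entity types are given as input, and `mention_level` is a crude capitalization heuristic introduced only for illustration; it is an assumption, not the way annotated mention levels are actually assigned.

```python
PRONOUNS = {"he", "she", "it", "they", "him", "her", "them"}

def mention_level(mention):
    """Crude heuristic for NAME / NOMINAL / PRONOUN (illustrative assumption)."""
    words = mention.split()
    if len(words) == 1 and words[0].lower() in PRONOUNS:
        return "PRONOUN"
    if all(w[0].isupper() for w in words):
        return "NAME"
    return "NOMINAL"

def entity_features(m1, type1, m2, type2):
    """Named-entity type features plus mention-level features for a pair."""
    return {
        "type_m1": type1,
        "type_m2": type2,
        "type_pair": f"{type1}-{type2}",
        "level_m1": mention_level(m1),
        "level_m2": mention_level(m2),
    }

print(entity_features("American Airlines", "ORG", "Tim Wagner", "PERSON"))
print(mention_level("it"), mention_level("the company"))
```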
![Page 23: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/23.jpg)
Parse Features for Relation Extraction
Example: American Airlines [Mention 1], a unit of AMR, immediately matched the move, spokesman Tim Wagner [Mention 2] said
• Base syntactic chunk sequence from one to the other
  • NP NP PP VP NP NP
• Constituent path through the tree from one to the other
  • NP ↑ NP ↑ S ↑ S ↓ NP
• Dependency path
  • Airlines matched Wagner said
![Page 24: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/24.jpg)
Gazetteer and trigger word features for relation extraction
• Trigger list for family: kinship terms
  • parent, wife, husband, grandparent, etc. [from WordNet]
• Gazetteer:
  • Lists of useful geo or geopolitical words
  • Country name list
  • Other sub-entities
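Trigger lists and gazetteers reduce to set-membership features. The sketch below uses the kinship terms listed above plus a tiny made-up country list; both the `COUNTRY_GAZETTEER` contents and the function name are illustrative assumptions.

```python
# Kinship triggers taken from the list above; a real list would be much longer.
KINSHIP_TRIGGERS = {"parent", "wife", "husband", "grandparent"}
# Tiny illustrative gazetteer (an assumption, not a real resource).
COUNTRY_GAZETTEER = {"United States", "Japan", "France"}

def trigger_gazetteer_features(between_tokens, m1, m2):
    """Binary features: kinship trigger between mentions, mentions in gazetteer."""
    lowered = {t.lower() for t in between_tokens}
    return {
        "has_kinship_trigger": bool(lowered & KINSHIP_TRIGGERS),
        "m1_in_country_list": m1 in COUNTRY_GAZETTEER,
        "m2_in_country_list": m2 in COUNTRY_GAZETTEER,
    }

# "John's wife Yoko": the tokens between the two mentions are "'s" and "wife".
print(trigger_gazetteer_features(["'s", "wife"], "John", "Yoko"))
```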
![Page 25: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/25.jpg)
American Airlines, a unit of AMR, immediately matched the move, spokesman Tim Wagner said.
![Page 26: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/26.jpg)
Classifiers for supervised methods
• Now you can use any classifier you like
  • MaxEnt
  • Naïve Bayes
  • SVM
  • ...
• Train it on the training set, tune on the dev set, test on the test set
![Page 27: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/27.jpg)
Evaluation of Supervised Relation Extraction
• Compute P/R/F1 for each relation

$$P = \frac{\text{\# of correctly extracted relations}}{\text{Total \# of extracted relations}}$$

$$R = \frac{\text{\# of correctly extracted relations}}{\text{Total \# of gold relations}}$$

$$F_1 = \frac{2PR}{P + R}$$
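The three measures follow directly from set overlap between system output and gold annotations. A minimal sketch, assuming extracted and gold relations can be compared for equality; the identifiers `r1` to `r4`, `x`, and `y` are placeholder data.

```python
def precision_recall_f1(extracted, gold):
    """Compute P, R, and F1 for one relation type from two collections."""
    extracted, gold = set(extracted), set(gold)
    correct = len(extracted & gold)
    p = correct / len(extracted) if extracted else 0.0
    r = correct / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return p, r, f1

gold = {"r1", "r2", "r3", "r4"}            # placeholder gold relations
extracted = {"r1", "r2", "x", "y"}         # placeholder system output
print(precision_recall_f1(extracted, gold))
```

Here two of the four extracted items are correct and two of the four gold items are found, so all three values come out to 0.5.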
![Page 28: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/28.jpg)
Summary: Supervised Relation Extraction
+ Can get high accuracies with enough hand-labeled training data, if the test data is similar enough to the training data
- Labeling a large training set is expensive
- Supervised models are brittle and don’t generalize well to different genres
![Page 30: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/30.jpg)
Relation Extraction
Semi-supervised and unsupervised relation extraction
![Page 31: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/31.jpg)
Seed-based or bootstrapping approaches to relation extraction
• No training set? Maybe you have:
  • A few seed tuples or
  • A few high-precision patterns
• Can you use those seeds to do something useful?
  • Bootstrapping: use the seeds to directly learn to populate a relation
![Page 32: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/32.jpg)
Relation Bootstrapping (Hearst 1992)
• Gather a set of seed pairs that have relation R
• Iterate:
  1. Find sentences with these pairs
  2. Look at the context between or around the pair and generalize the context to create patterns
  3. Use the patterns to grep for more pairs
![Page 33: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/33.jpg)
Bootstrapping
• <Mark Twain, Elmira> Seed tuple
• Grep (google) for the environments of the seed tuple
  • “Mark Twain is buried in Elmira, NY.” → X is buried in Y
  • “The grave of Mark Twain is in Elmira” → The grave of X is in Y
  • “Elmira is Mark Twain’s final resting place” → Y is X’s final resting place.
• Use those patterns to grep for new tuples
• Iterate
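The loop above can be sketched on a toy corpus. Everything here is a simplification made for illustration: patterns are whole sentences with the seed entities replaced by `<X>` and `<Y>` placeholders, entities are assumed to be runs of capitalized words, and the corpus sentences and function names are invented for the example. A real system would also score patterns to limit drift.

```python
import re

ENTITY = r"[A-Z]\w+(?: [A-Z]\w+)*"   # assumption: entity = capitalized words

def induce_patterns(corpus, tuples):
    """Replace known (x, y) pairs in matching sentences with placeholders."""
    patterns = set()
    for x, y in tuples:
        for sent in corpus:
            if x in sent and y in sent:
                patterns.add(sent.replace(x, "<X>").replace(y, "<Y>"))
    return patterns

def pattern_to_regex(pattern):
    """Escape the literal text and turn placeholders into capture groups."""
    out = []
    for part in re.split(r"(<X>|<Y>)", pattern):
        if part == "<X>":
            out.append(f"(?P<x>{ENTITY})")
        elif part == "<Y>":
            out.append(f"(?P<y>{ENTITY})")
        else:
            out.append(re.escape(part))
    return re.compile("".join(out))

def bootstrap(corpus, seeds, iterations=2):
    tuples, patterns = set(seeds), set()
    for _ in range(iterations):
        patterns |= induce_patterns(corpus, tuples)   # tuples -> patterns
        for pat in patterns:                          # patterns -> new tuples
            rx = pattern_to_regex(pat)
            for sent in corpus:
                m = rx.fullmatch(sent)
                if m:
                    tuples.add((m.group("x"), m.group("y")))
    return tuples, patterns

corpus = [
    "Mark Twain is buried in Elmira.",
    "Emily Dickinson is buried in Amherst.",
    "Amherst is Emily Dickinson's final resting place.",
    "Concord is Henry David Thoreau's final resting place.",
]
tuples, patterns = bootstrap(corpus, {("Mark Twain", "Elmira")})
print(sorted(tuples))
print(sorted(patterns))
```

The first iteration learns "<X> is buried in <Y>." from the seed and finds a second pair; the second iteration uses that new pair to learn another pattern, which in turn finds a third pair.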
![Page 34: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/34.jpg)
Dipre: Extract <author, book> pairs
• Start with 5 seeds:
  • Isaac Asimov: The Robots of Dawn
  • David Brin: Startide Rising
  • James Gleick: Chaos: Making a New Science
  • Charles Dickens: Great Expectations
  • William Shakespeare: The Comedy of Errors
• Find Instances:
  • The Comedy of Errors, by William Shakespeare, was
  • The Comedy of Errors, by William Shakespeare, is
  • The Comedy of Errors, one of William Shakespeare's earliest attempts
  • The Comedy of Errors, one of William Shakespeare's most
• Extract patterns (group by middle, take longest common prefix/suffix)
  • ?x , by ?y ,
  • ?x , one of ?y 's
• Now iterate, finding new seeds that match the pattern
Brin, Sergei. 1998. Extracting Patterns and Relations from the World Wide Web.
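The pattern-extraction step (group by middle, take the longest common prefix/suffix) can be sketched as follows. This covers only that grouping step as described on the slide; the function names and the (prefix, middle, suffix) representation are assumptions for illustration and not a reproduction of Brin's full pattern definition.

```python
import os
from collections import defaultdict

def split_occurrence(text, book, author):
    """Split text into (prefix, middle, suffix) around 'book ... author'."""
    b = text.index(book)
    a = text.index(author, b + len(book))
    return text[:b], text[b + len(book):a], text[a + len(author):]

def dipre_patterns(occurrences):
    """Group occurrences by middle string; keep only the shared context."""
    groups = defaultdict(list)
    for prefix, middle, suffix in occurrences:
        groups[middle].append((prefix, suffix))
    patterns = []
    for middle, contexts in groups.items():
        # Longest common *suffix* of the prefixes: reverse, commonprefix, reverse.
        reversed_common = os.path.commonprefix([p[::-1] for p, _ in contexts])
        common_prefix = reversed_common[::-1]
        # Longest common prefix of the suffixes (character-wise).
        common_suffix = os.path.commonprefix([s for _, s in contexts])
        patterns.append(f"{common_prefix}?x{middle}?y{common_suffix}")
    return patterns

instances = [
    "The Comedy of Errors, by William Shakespeare, was",
    "The Comedy of Errors, by William Shakespeare, is",
    "The Comedy of Errors, one of William Shakespeare's earliest attempts",
    "The Comedy of Errors, one of William Shakespeare's most",
]
occurrences = [
    split_occurrence(s, "The Comedy of Errors", "William Shakespeare")
    for s in instances
]
result = dipre_patterns(occurrences)
print(result)
```

Up to whitespace, the two patterns printed correspond to the two patterns shown on the slide.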
![Page 35: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/35.jpg)
Distant Supervision
• Combine bootstrapping with supervised learning
  • Instead of 5 seeds,
  • Use a large database to get huge # of seed examples
  • Create lots of features from all these examples
  • Combine in a supervised classifier
Snow, Jurafsky, Ng. 2005. Learning syntactic patterns for automatic hypernym discovery. NIPS 17.
Fei Wu and Daniel S. Weld. 2007. Autonomously Semantifying Wikipedia. CIKM 2007.
Mintz, Bills, Snow, Jurafsky. 2009. Distant supervision for relation extraction without labeled data. ACL09.
![Page 36: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/36.jpg)
Distantly supervised learning of relation extraction patterns
1. For each relation
  • Born-In
2. For each tuple in big database
  • <Edwin Hubble, Marshfield>
  • <Albert Einstein, Ulm>
3. Find sentences in large corpus with both entities
  • Hubble was born in Marshfield
  • Einstein, born (1879), Ulm
  • Hubble’s birthplace in Marshfield
4. Extract frequent features (parse, words, etc.)
  • PER was born in LOC
  • PER, born (XXXX), LOC
  • PER’s birthplace in LOC
5. Train supervised classifier using thousands of features
  • P(born-in | f1, f2, f3, …, f70000)
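Steps 1 to 4 can be sketched with the Born-In example. This is a minimal illustration under several assumptions: the only feature is the text between the two mentions, a person is matched by surname, four-digit years are generalized to XXXX, and the function names are invented. Step 5 (training a classifier on the resulting features) is left out.

```python
import re
from collections import Counter

def context_feature(sentence, person, location):
    """Text between the two mentions, with the entity slots abstracted."""
    alias = person.split()[-1]            # assumption: surname stands in for the person
    i, j = sentence.find(alias), sentence.find(location)
    if i < 0 or j < 0 or j < i + len(alias):
        return None                       # both entities must occur, person first
    middle = sentence[i + len(alias):j]
    middle = re.sub(r"\d{4}", "XXXX", middle)   # generalize years
    return f"PER{middle}LOC"

def collect_features(relation_db, corpus):
    """Steps 1-4: per relation, count features from sentences with both entities."""
    counts = {}
    for relation, tuples in relation_db.items():          # step 1
        feature_counts = Counter()
        for person, location in tuples:                   # step 2
            for sent in corpus:                           # step 3
                feat = context_feature(sent, person, location)
                if feat is not None:                      # step 4
                    feature_counts[feat] += 1
        counts[relation] = feature_counts
    return counts

relation_db = {"Born-In": [("Edwin Hubble", "Marshfield"), ("Albert Einstein", "Ulm")]}
corpus = [
    "Hubble was born in Marshfield",
    "Einstein, born (1879), Ulm",
    "Hubble's birthplace in Marshfield",
]
features = collect_features(relation_db, corpus)
print(features["Born-In"])
```

The counted features would then serve as inputs to an ordinary supervised classifier, with the database relation acting as the label.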
![Page 37: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/37.jpg)
Distant supervision paradigm
• Like unsupervised classification:
  • Uses very large amounts of unlabeled data
  • Not sensitive to genre issues in training corpus
• Like supervised classification:
  • Uses a classifier with lots of features
  • Supervised by detailed hand-created knowledge
  • Doesn’t require iteratively expanding patterns
![Page 38: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/38.jpg)
Unsupervised relation extraction
• Open Information Extraction:
  • extract relations from the web with no training data, no list of relations
1. Use parsed data to train a “trustworthy tuple” classifier
2. Single-pass extract all relations between NPs, keep if trustworthy
3. Assessor ranks relations based on text redundancy
  • (Google, acquired, Android) to compete with Apple.
  • (Tesla, invented, coil transformer)
M. Banko, M. Cafarella, S. Soderland, M. Broadhead, and O. Etzioni. 2007. Open information extraction from the web. IJCAI.
![Page 39: l17 rel extraction - ecology lab · • Words or bigrams in particular positions left and right of M1/M2 M2: -1 spokesman M2: +1 said • Bag of words or bigrams between the two entities](https://reader034.vdocuments.site/reader034/viewer/2022052105/60410e7cb6fad801cd63428c/html5/thumbnails/39.jpg)
Evaluation of Semi-supervised and Unsupervised Relation Extraction
• Since it extracts totally new relations from the web
  • There is no gold set of correct instances of relations!
  • Can’t compute precision (don’t know which ones are correct)
  • Can’t compute recall (don’t know which ones were missed)
• Instead, we can approximate precision (only)
  • Draw a random sample of relations from output, check precision manually
• Can also compute precision at different levels of recall.
  • Precision for top 1000 new relations, top 10,000 new relations, top 100,000
  • In each case taking a random sample of that set
• But no way to evaluate recall

$$\hat{P} = \frac{\text{\# of correctly extracted relations in the sample}}{\text{Total \# of extracted relations in the sample}}$$
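The sampling estimate can be sketched in a few lines. The `is_correct` callback stands in for a human judgment, and the function names, the fixed random seed, and the toy ranked list are all assumptions made for illustration.

```python
import random

def estimate_precision(extractions, is_correct, sample_size, seed=0):
    """Estimate precision from a random sample that is judged by hand."""
    rng = random.Random(seed)
    sample = rng.sample(extractions, min(sample_size, len(extractions)))
    if not sample:
        raise ValueError("cannot estimate precision from an empty sample")
    return sum(1 for e in sample if is_correct(e)) / len(sample)

def precision_at_k(ranked, is_correct, ks, sample_size, seed=0):
    """Estimated precision for the top-k extractions, for each k in ks."""
    return {k: estimate_precision(ranked[:k], is_correct, sample_size, seed)
            for k in ks}

# Toy ranked output: (relation id, manual judgment). Higher-ranked items are
# all correct here and lower-ranked ones are not, purely for illustration.
ranked = [(i, i < 50) for i in range(100)]
judge = lambda item: item[1]          # stand-in for a human annotator
print(precision_at_k(ranked, judge, ks=[50, 100], sample_size=100))
```

Nothing comparable exists for recall, since the set of relations the system missed is unknown.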