Learning Probabilistic Relational Models

Nir Friedman, Hebrew University, nir@cs.huji.ac.il
Lise Getoor, Stanford University, getoor@cs.stanford.edu
Daphne Koller, Stanford University, koller@cs.stanford.edu
Avi Pfeffer, Stanford University, avi@cs.stanford.edu

Learning from Relational Data

• Data sources
  – relational and object-oriented databases
  – frame-based knowledge bases
  – World Wide Web
• Traditional approaches
  – work well with flat representations
  – fixed-length attribute-value vectors
  – assume IID samples
• Problem:
  – must fix attributes in advance, so can represent only some limited set of structures
  – IID assumption may not hold

Our Approach

• Probabilistic Relational Models (PRMs)
  – rich representation language that models
    • relational dependencies
    • probabilistic dependencies
• Learning PRMs from data stored in relational databases
  – parameter estimation
  – model selection

Outline

• Motivation
• Probabilistic relational models
  – Probabilistic Logic Programming [Poole, 1993]; [Ngo & Haddawy, 1994]
  – Probabilistic object-oriented knowledge [Koller & Pfeffer, 1997; 1998]; [Koller, Levy & Pfeffer, 1997]
• Learning PRMs
• Experimental results
• Conclusions

Probabilistic Relational Models

• Combine advantages of predicate logic & BNs:
  – natural domain modeling: objects, properties, relations;
  – generalization over a variety of situations;
  – compact, natural probability models.
• Integrate uncertainty with relational model:
  – properties of domain entities can depend on properties of related entities;
  – uncertainty over relational structure of domain.

Relational Schema

• Describes the types of objects and relations in the database
• Classes and their attributes:
  – Student: Intelligence, Performance
  – Registration: Grade, Satisfaction
  – Course: Difficulty, Rating
  – Professor: Popularity, Teaching-Ability, Stress-Level
• Relationships: Teach, In, Take

[Figure: schema diagram with callouts labeling Classes, Relationships and Attributes.]

Example instance I

• Professor: Prof. Gump (Popularity high, Teaching Ability medium, Stress-Level low)
• Course: Phil142 (Difficulty low, Rating high)
• Course: Phil101 (Difficulty low, Rating high)
• Registration: three objects, each shown as Reg #5639 (Grade A, Satisfaction 3)
• Student: John Doe (Intelligence high, Performance average)
• Student: Jane Doe (Intelligence high, Performance average)

What’s Uncertain?

• Relations
• Attribute Values
• Objects

[Figure: the example instance I (Prof. Gump; courses Phil142 and Phil101; three registrations shown as Reg #5639; students John Doe and Jane Doe), extended with Student Judy Dunn (Intelligence high, Performance high) and Student John Deer (Intelligence ???, Performance ???), and annotated with the three kinds of uncertainty above.]

Attribute Uncertainty

• Fixed skeleton
  – set of objects in each class
  – relations between them
• Uncertainty
  – over assignments of values to attributes

[Figure: the example skeleton with most attribute values unknown: Prof. Gump (Popularity ???, Teaching Ability ???, Stress-Level ???); courses Phil142 and Phil101 (Difficulty ???, Rating ???); Student Jane Doe (Intelligence ???, Performance ???); three registrations shown as Reg #5639, two with Grade A and Satisfaction 3, one with Grade ??? and Satisfaction ???.]

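PRM: Dependencies

[Figure: dependency arrows over the schema classes Student (Intelligence, Performance), Reg (Grade, Satisfaction), Course (Difficulty, Rating) and Professor (Popularity, Teaching-Ability, Stress-Level).]

Local probability model for Reg.Grade:

$$P(\text{Reg.Grade} \mid \text{Reg.In.Difficulty},\ \text{Reg.Taker.Intell})$$

D, I    A     B     C
h, h    0.5   0.4   0.1
h, l    0.1   0.5   0.4
l, h    0.8   0.1   0.1
l, l    0.3   0.6   0.1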

PRM: Dependencies (cont.)

[Figure: the example instance (Prof. Gump; courses Phil142 and Phil101; students John Doe and Jane Doe, both Intelligence high, Performance average) with an added Student John Deer (Intelligence low, Performance average). Two registrations have an unknown grade (Grade ?, Satisfaction 3). The CPT for Reg.Grade given D and I, with rows (h,h), (h,l), (l,h), (l,l) and columns A, B, C, is drawn twice, once next to each registration with an unknown grade.]

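PRM: aggregate dependencies

[Figure: schema classes Student (Intelligence, Performance), Reg (Grade, Satisfaction), Course (Difficulty, Rating) and Professor (Popularity, Teaching-Ability, Stress-Level).]

• Example: Student Jane Doe (Intelligence high, Performance average) has three registrations:
  – Reg #5077: Grade C, Satisfaction 2
  – Reg #5054: Grade C, Satisfaction 1
  – Reg #5639: Grade A, Satisfaction 3
• Problem: need CPTs of varying sizes
• CPT conditioned on the aggregate avg (rows: average grade; columns: l, m, h):

avg    l     m     h
A      0.1   0.2   0.7
B      0.2   0.4   0.4
C      0.6   0.3   0.1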

PRM: aggregate dependencies

[Figure: the schema classes Student, Reg, Course and Professor with aggregate dependencies, two labeled avg and one labeled count.]

• Aggregates: sum, min, max, avg, mode, count
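A minimal sketch of why aggregation helps: however many registrations a student has, the aggregate is a single value, so one fixed-size CPT suffices. The grade-point mapping (A=4, B=3, C=2), the rounding rule for avg and the function names are assumptions made for illustration; the slides do not define how letter grades are averaged.

```python
from collections import Counter

# Hypothetical grade-point mapping; not given on the slides.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2}
POINTS_TO_GRADE = {points: grade for grade, points in GRADE_POINTS.items()}


def avg_grade(grades):
    """Average a list of letter grades and snap to the nearest letter grade."""
    mean = sum(GRADE_POINTS[g] for g in grades) / len(grades)
    nearest = min(POINTS_TO_GRADE, key=lambda p: abs(p - mean))
    return POINTS_TO_GRADE[nearest]


def mode_grade(grades):
    """Most frequent grade in the list."""
    return Counter(grades).most_common(1)[0][0]


# Jane Doe's three registrations from the slide: grades C, C and A.
jane = ["C", "C", "A"]
print(avg_grade(jane), mode_grade(jane), len(jane))
```

Each aggregate maps a variable-size set of parent values to one value from a fixed domain, which is what lets a single CPT serve students with any number of registrations.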

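PRM: Summary

• A PRM specifies
  – a probabilistic dependency structure S
    • a set of parents for each attribute X.A
  – a set of local probability models $\theta$
• Given a skeleton structure $\sigma$, a PRM specifies a probability distribution over instances I:
  – over attribute values of all objects in $\sigma$

$$P(I \mid \sigma, S, \theta) = \prod_{X} \ \prod_{x \in \sigma(X)} \ \prod_{A \in \mathcal{A}(X)} P\big(I_{x.A} \mid I_{\mathrm{parents}(x.A)}\big)$$

The three products range over classes X, objects x of class X, and attributes A of X; $I_{x.A}$ is the value of attribute A in object x.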
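As a hedged illustration of the product form, the sketch below scores a tiny instance with one factor per attribute of each object. It covers only three attributes (Course.Difficulty, Student.Intelligence, Reg.Grade). The grade CPT values are read off the example CPT on the "PRM: Dependencies" slide and should be treated as illustrative; the uniform priors, the registration links and all names are assumptions, not part of the slides.

```python
import math

# Class-level CPT for Reg.Grade given (Course.Difficulty, Student.Intelligence).
# Illustrative values taken from the example CPT on the slides.
GRADE_CPT = {
    ("h", "h"): {"A": 0.5, "B": 0.4, "C": 0.1},
    ("h", "l"): {"A": 0.1, "B": 0.5, "C": 0.4},
    ("l", "h"): {"A": 0.8, "B": 0.1, "C": 0.1},
    ("l", "l"): {"A": 0.3, "B": 0.6, "C": 0.1},
}
# Hypothetical uniform priors for the two parentless attributes.
DIFFICULTY_PRIOR = {"h": 0.5, "l": 0.5}
INTELLIGENCE_PRIOR = {"h": 0.5, "l": 0.5}


def log_prob(courses, students, regs):
    """Log-probability of an instance: one factor per attribute of each object."""
    total = sum(math.log(DIFFICULTY_PRIOR[d]) for d in courses.values())
    total += sum(math.log(INTELLIGENCE_PRIOR[i]) for i in students.values())
    for course, student, grade in regs:
        # The same class-level CPT is reused for every Reg object.
        parents = (courses[course], students[student])
        total += math.log(GRADE_CPT[parents][grade])
    return total


courses = {"Phil101": "l", "Phil142": "l"}
students = {"Jane Doe": "h", "John Doe": "h"}
# Hypothetical skeleton: which student registered for which course.
regs = [("Phil101", "Jane Doe", "A"), ("Phil142", "John Doe", "A")]
print(log_prob(courses, students, regs))
```

The parameters live at the class level, while the factors are instantiated once per object in the skeleton.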

Learning PRMs

[Figure: a relational schema and a database (instance I, with Course, Student and Reg tables) are the inputs; the output is a PRM over Course, Student and Reg.]

• Parameter estimation
• Structure selection

Parameter estimation in PRMs

• Assume known dependency structure S
• Goal: estimate PRM parameters $\theta$
  – entries in local probability models, $\theta_{x.A \mid \mathrm{parents}(x.A)}$
• A parameterization $\theta$ is good if it is likely to generate the observed data, instance I.
• MLE Principle: choose $\theta$ so as to maximize l

$$l(\theta : I, \sigma, S) = \log P(I \mid \sigma, S, \theta)$$

• Crucial property: decomposition
  – separate terms for different X.A
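The decomposition property can be spelled out as follows. This is a sketch that assumes the product form of the PRM distribution given on the summary slide; the notation $\sigma(X)$ for the objects of class X in the skeleton and $\mathcal{A}(X)$ for the attributes of X is an assumption.

```latex
\begin{aligned}
l(\theta : I, \sigma, S)
  &= \log P(I \mid \sigma, S, \theta) \\
  &= \sum_{X} \sum_{A \in \mathcal{A}(X)}
     \left[ \sum_{x \in \sigma(X)}
       \log P\big(I_{x.A} \mid I_{\mathrm{parents}(x.A)}\big) \right]
\end{aligned}
```

Each bracketed term involves only the parameters of one attribute X.A, so each local probability model can be estimated separately.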

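ML parameter estimation

[Figure: schema fragment with Student (Intelligence, Performance), Reg (Grade, Satisfaction) and Course (Difficulty, Rating), annotated with the local model $P(\text{Reg.Grade} \mid \text{Reg.In.Difficulty},\ \text{Reg.Taker.Intell})$.]

$$\theta^{*}_{R.G=A \,\mid\, C.D=l,\ S.I=h} = \frac{N(R.G=A,\ C.D=l,\ S.I=h)}{N(C.D=l,\ S.I=h)}$$

• DB technology is well suited to the computation of sufficient statistics:

[Figure: the Course, Reg and Student tables are combined and counted over (C.Diff, R.Grade, S.Int); the resulting Count column holds the sufficient statistics.]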
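The following sketch shows one way a database query could produce these counts, using Python's built-in `sqlite3` module. The table layout, column names and rows are hypothetical; only the idea (a join over Course, Reg and Student followed by a grouped count) comes from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables and rows, for illustration only.
conn.executescript("""
    CREATE TABLE course  (id TEXT PRIMARY KEY, difficulty TEXT);
    CREATE TABLE student (id TEXT PRIMARY KEY, intelligence TEXT);
    CREATE TABLE reg     (id INTEGER PRIMARY KEY, course TEXT, student TEXT, grade TEXT);
    INSERT INTO course  VALUES ('Phil101', 'l'), ('Phil142', 'l');
    INSERT INTO student VALUES ('Jane', 'h'), ('John', 'h');
    INSERT INTO reg VALUES (1, 'Phil101', 'Jane', 'A'), (2, 'Phil142', 'Jane', 'A'),
                           (3, 'Phil101', 'John', 'B'), (4, 'Phil142', 'John', 'A');
""")

# Sufficient statistics: a join plus GROUP BY yields N(C.D, S.I, R.G).
rows = conn.execute("""
    SELECT c.difficulty, s.intelligence, r.grade, COUNT(*)
    FROM reg r
    JOIN course c ON r.course = c.id
    JOIN student s ON r.student = s.id
    GROUP BY c.difficulty, s.intelligence, r.grade
""").fetchall()

joint = {(d, i, g): n for d, i, g, n in rows}
parent_totals = {}
for (d, i, _), n in joint.items():
    parent_totals[(d, i)] = parent_totals.get((d, i), 0) + n

# Maximum-likelihood estimate: theta*[g | d, i] = N(d, i, g) / N(d, i).
theta = {(g, d, i): n / parent_totals[(d, i)] for (d, i, g), n in joint.items()}
print(theta)
```

The counting is done inside the database, so only the aggregated counts need to be brought into memory.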

Model Selection

• Idea:
  – define scoring function
  – do local search over legal structures
• Key Components:
  – scoring models
  – legal models
  – searching model space

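Scoring Models

• Bayesian approach:

$$\mathrm{Score}(S : I) = \log P(S \mid I) = \log\big[\, \underbrace{P(I \mid S)}_{\text{marginal likelihood}} \ \underbrace{P(S)}_{\text{prior}} \,\big] + C$$

  where C does not depend on S.

• closed form solution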
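The slide does not say which parameter prior gives the closed form. One standard choice, assumed here, is an independent symmetric Dirichlet prior for each row of each CPT, for which the marginal likelihood is a ratio of Gamma functions. The function names, the hyperparameter `alpha` and the counts below are hypothetical.

```python
import math


def log_marginal_likelihood(counts, alpha=1.0):
    """Log marginal likelihood of one CPT under symmetric Dirichlet priors.

    counts maps each parent configuration to a list of child-value counts.
    alpha is an assumed Dirichlet hyperparameter per child value.
    """
    total = 0.0
    for child_counts in counts.values():
        k = len(child_counts)
        n = sum(child_counts)
        total += math.lgamma(k * alpha) - math.lgamma(k * alpha + n)
        total += sum(math.lgamma(alpha + c) - math.lgamma(alpha) for c in child_counts)
    return total


def score(family_counts, log_structure_prior=0.0):
    """log P(I | S) + log P(S), with the likelihood summed over attribute families."""
    return log_structure_prior + sum(
        log_marginal_likelihood(c) for c in family_counts.values()
    )


# Hypothetical counts of Reg.Grade values (A, B, C) per (difficulty, intelligence).
family_counts = {"Reg.Grade": {("l", "h"): [3, 1, 0], ("h", "l"): [0, 1, 2]}}
print(score(family_counts))
```

Because the score is a sum over attribute families, a local change to the parents of one attribute only requires recomputing that attribute's term.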

Legal Models

• Dependency ordering over attributes: $y.b \prec x.a$ if X.A depends on Y.B

[Figure: Paper (Accepted) and Researcher (Reputation), linked by the author-of relation.]

• PRM defines a coherent probability model over skeleton $\sigma$ if $\prec$ is acyclic

Guaranteeing Acyclicity

How do we guarantee that a PRM is acyclic for every skeleton?

• From the PRM dependency structure S, build the dependency graph: an edge Y.B → X.A if X.A depends directly on Y.B
• Attribute stratification: dependency graph acyclic $\Rightarrow$ $\prec$ acyclic for any $\sigma$

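Limitation of stratification

[Figure: three Person objects, each with M-chromosome, P-chromosome and Blood-type, linked by Father and Mother relations. The class-level dependency graph over Person.M-chrom, Person.P-chrom and Person.B-type is marked "???".]

A person's chromosomes depend on the chromosomes of the person's parents, so the class-level dependency graph contains a cycle and stratification rejects the model, even though no actual family tree induces a cyclic dependency.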

Guaranteed acyclic relations

[Figure: three Person objects, each with M-chromosome, P-chromosome and Blood-type, linked by Father and Mother relations.]

• Prior knowledge: the Father-of relation is acyclic
  – dependence of Person.A on Person.Father.B cannot induce cycles

Guaranteeing acyclicity

• With guaranteed acyclic relations, some cycles in the dependency graph are guaranteed to be safe.
• We color the edges in the dependency graph:
  – yellow: within single object (X.B → X.A)
  – green: via g.a. relation (Y.B → X.A)
  – red: via other relations (Y.B → X.A)
• A cycle is safe if
  – it has a green edge
  – it has no red edge

[Figure: colored dependency graph over Person.M-chrom, Person.P-chrom and Person.B-type.]
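The safety condition stated on this slide can be checked mechanically. The sketch below is one possible implementation, not necessarily the authors' procedure: it takes the condition literally (every cycle must contain a green edge and no red edge). The edge list for the genetics example assumes that both Father and Mother are guaranteed acyclic relations; the node names are abbreviations.

```python
def reachable(edges, start):
    """Nodes reachable from start by a path of at least one (src, dst) edge."""
    adj = {}
    for src, dst in edges:
        adj.setdefault(src, []).append(dst)
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def all_cycles_safe(colored_edges):
    """colored_edges: list of (src, dst, color), color in {"yellow", "green", "red"}."""
    edges = [(s, d) for s, d, _ in colored_edges]
    non_green = [(s, d) for s, d, c in colored_edges if c != "green"]
    for src, dst, color in colored_edges:
        # A red edge lies on a cycle iff its source is reachable from its target.
        if color == "red" and (src == dst or src in reachable(edges, dst)):
            return False
    # A cycle with no green edge would survive the removal of all green edges.
    return all(node not in reachable(non_green, node) for node, _ in non_green)


# Genetics example; assumes Father and Mother are both guaranteed acyclic.
genetics = [
    ("M-chrom", "B-type", "yellow"), ("P-chrom", "B-type", "yellow"),
    ("M-chrom", "M-chrom", "green"), ("P-chrom", "M-chrom", "green"),
    ("M-chrom", "P-chrom", "green"), ("P-chrom", "P-chrom", "green"),
]
print(all_cycles_safe(genetics))
```

Both checks are plain reachability queries on the class-level graph, so they do not depend on the size of the database.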

Searching Model Space

[Figure: local search over dependency structures on the Student, Course and Reg classes; each candidate move is scored, e.g. Add C.A → C.B and Delete S.I → S.P.]

Phase 0: consider only dependencies within a class

Phased structure search

[Figure: scored candidate moves on the Student, Course and Reg classes: Add C.A → R.B and Add S.I → R.C.]

Phase 1: consider dependencies from “neighboring” classes, via schema relations

Phased structure search

[Figure: scored candidate moves on the Student, Course and Reg classes: Add C.A → S.P and Add S.I → C.B.]

Phase 2: consider dependencies from “further” classes, via relation chains
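A small sketch of how the phases could restrict the candidate moves. It assumes that phase k allows parents from any class reachable through at most k schema relations, so that later phases keep the earlier candidates; the slides do not state this. The schema adjacency and function names are hypothetical.

```python
from collections import deque

# Hypothetical schema graph: classes linked by the relations Take, In and Teach.
SCHEMA = {
    "Student": ["Reg"],
    "Reg": ["Student", "Course"],
    "Course": ["Reg", "Professor"],
    "Professor": ["Course"],
}


def classes_within(schema, start, max_hops):
    """Breadth-first search: hop distance of each class within max_hops of start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cls = queue.popleft()
        if dist[cls] == max_hops:
            continue  # do not expand beyond the current phase
        for nxt in schema[cls]:
            if nxt not in dist:
                dist[nxt] = dist[cls] + 1
                queue.append(nxt)
    return dist


def candidate_parent_classes(schema, child_class, phase):
    """Classes whose attributes may be proposed as parents in the given phase."""
    return sorted(classes_within(schema, child_class, phase))


for phase in range(3):
    print(phase, candidate_parent_classes(SCHEMA, "Student", phase))
```

Growing the candidate set phase by phase keeps the early search space small and defers long relation chains until shorter ones have been tried.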

Experimental Results: Movie Domain (real data)

• 11,000 movies, 7,000 actors
• Schema:
  – Actor: Gender
  – Appears: Role-type
  – Movie: Process, Decade, Genre
• Source: http://www-db.stanford.edu/movies/doc.html

Genetics domain (synthetic data)

• Person: M-chromosome, P-chromosome, Blood-type
• Relations between Person objects: Father, Mother
• Blood-Test: Contaminated, Result

Experimental Results

[Plot: Score (y-axis, -32000 to -18000) against Dataset Size (x-axis, 200 to 800); series: Median Likelihood and Gold Standard.]

Future directions

• Learning in complex real-world domains
  – drug treatment regimes
  – collaborative filtering
• Missing data
• Learning with structural uncertainty
• Discovery
  – hidden variables
  – causal structure
  – class hierarchy

Conclusions

• PRMs natural extension of BNs:
  – well-founded (probabilistic) semantics
  – compact representation of complex models
• Powerful learning techniques
  – builds on BN learning techniques
  – can learn directly from relational data
• Parameter estimation
  – efficient, effective exploitation of DB technology
• Structure identification
  – builds on well understood theory
  – major issues:
    • guaranteeing coherence
    • search heuristics
