
Composite Likelihood Data Augmentation for Within-Network Statistical Relational Learning

Joseph J. Pfeiffer III¹, Jennifer Neville¹, Paul Bennett²

¹Purdue University, ²Microsoft Research

ICDM 2014, Shenzhen

Relational Machine Learning

A running example: detecting fraud among stock brokers.

• Items: stock brokers (more generally: social network users, papers, movies, products, etc.)

• Label: fraudulent (more generally: buys a product, sales rank, etc.)

• Relationships: phone calls (more generally: friendships, emails, citations, co-purchases, etc.)

• Attributes: other observable item traits

Within-Network Relational Machine Learning

• Learn the model parameters $\Theta_Y$ from the observed data.

• Infer (predict) the remaining labels using the learned parameters $\Theta_Y$ and all available data.

Will show: learning and inference approximations prevent semi-supervised relational algorithms from converging.

Will develop/demonstrate: methods that model parameter uncertainty overcome these approximation errors.

Outline

• Introduction
• Background: Semi-Supervised Relational Machine Learning
• Analysis of Relational EM Convergence in Real-World Networks
• Relational Stochastic EM and Relational Data Augmentation
• Experiments
• Conclusions

Background: Semi-Supervised Relational Learning

[Figure: a partially labeled network; each item $v_i$ carries a label $y_i$ and attributes $\mathbf{x}_i$, and its Markov blanket $MB(v_i)$ is the set of neighboring labels $y_j$ reachable through the edges.]

• Many conditional forms to choose from (e.g., generative, logistic).

• Approximate learning, via the pseudolikelihood ($G^L$) over the labeled subgraph:
$$\hat{\Theta}_Y = \operatorname*{argmax}_{\Theta_Y} \sum_{v_i \in V^L} \log P_Y\big(y_i \mid \mathbf{y}_{MB^L(v_i)}, \mathbf{x}_i, \Theta_Y\big)$$

• Approximate inference, via collective inference over the joint $P(\mathbf{Y} \mid \mathbf{X}, \mathbf{E}, \Theta_Y)$ using the local conditionals $P_Y(y_i \mid \mathbf{y}_{MB(v_i)}, \mathbf{x}_i, \Theta_Y)$.

Semi-Supervised Relational EM (Xiang & Neville, 2009)

M-step: maximize the composite likelihood ($G^O$), weighting imputed labelings $\tilde{\mathbf{Y}}$ by the current predictions $P_Y(\mathbf{Y}^U)$ over the unlabeled items:
$$\hat{\Theta}_Y = \operatorname*{argmax}_{\Theta_Y} \sum_{\mathbf{Y}^U \in \mathcal{Y}^U} P_Y(\mathbf{Y}^U) \sum_{v_i \in V^L} \log P_Y\big(y_i \mid \tilde{\mathbf{y}}_{MB(v_i)}, \mathbf{x}_i, \Theta_Y\big)$$

E-step (for all unlabeled instances): compute $P_Y(y_i \mid \mathbf{y}_{MB(v_i)}, \mathbf{x}_i, \Theta_Y)$ via collective inference over $P(\mathbf{Y} \mid \mathbf{X}, \mathbf{E}, \Theta_Y)$.

Both learning and inference require approximations. How do these approximations affect semi-supervised relational learning?
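To make the fixed-point cycle concrete, here is a minimal sketch for a binary relational naive Bayes over neighbor labels. The Bernoulli neighbor model, the dense adjacency-matrix encoding, and every function name (`rnb_posterior`, `m_step`, `relational_em`) are illustrative assumptions, not the authors' implementation:

```python
# Minimal sketch of fixed-point relational EM for a binary relational naive
# Bayes over neighbor labels; an illustration, not the paper's implementation.
# A: dense 0/1 adjacency; y_obs: 0/1 labels (valid where labeled); labeled: bool mask.
import numpy as np

def rnb_posterior(A, y, prior, q):
    """P(y_i = 1 | neighbor labels) under naive Bayes with Bernoulli neighbor
    conditionals q[c] = P(neighbor = 1 | y_i = c); y may hold soft labels."""
    n_pos = A @ y                          # (expected) count of positive neighbors
    n_all = A.sum(axis=1)                  # degree of each node
    log1 = np.log(prior) + n_pos * np.log(q[1]) + (n_all - n_pos) * np.log(1 - q[1])
    log0 = np.log(1 - prior) + n_pos * np.log(q[0]) + (n_all - n_pos) * np.log(1 - q[0])
    return 1.0 / (1.0 + np.exp(np.clip(log0 - log1, -30, 30)))

def m_step(A, y, y_obs, labeled):
    """Composite-likelihood M-step: maximum-likelihood estimates over the
    labeled nodes, treating the currently imputed labels y as observed."""
    prior = np.clip(y_obs[labeled].mean(), 1e-3, 1 - 1e-3)
    q = np.empty(2)
    for c in (0, 1):
        rows = np.where(labeled & (y_obs == c))[0]
        q[c] = np.clip((A[rows] @ y).sum() / max(A[rows].sum(), 1.0), 1e-3, 1 - 1e-3)
    return prior, q

def relational_em(A, y_obs, labeled, iters=20, sweeps=10):
    """Alternate the M-step with a fixed-point E-step that replaces each
    unlabeled label by its expectation under the current parameters."""
    y = np.where(labeled, y_obs, y_obs[labeled].mean()).astype(float)
    for _ in range(iters):
        prior, q = m_step(A, y, y_obs, labeled)
        for _ in range(sweeps):            # approximate collective inference
            y[~labeled] = rnb_posterior(A, y, prior, q)[~labeled]
    return y, prior, q
```

Because the E-step commits to fixed-point expectations and the M-step then treats them as observed, errors can feed back on themselves; that feedback is exactly what the next section analyzes.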

Analysis of Relational EM Convergence in Real-World Networks

Impact of Approximations in Semi-Supervised Relational EM

Relational EM stacks both approximations: the composite likelihood approximation in the M-step and the collective inference approximation in the E-step (run for all unlabeled instances), as above. Two failure modes result:

• Over-propagation error: neighboring unlabeled items reinforce each other's predicted labels even when those labels are not predictive (in the toy network, both neighbors of an uncertain node are yellow, yet yellow neighbors are not predictive). For example, an uncertain node may have $P(\text{Purple} \mid \text{Neighbors}) \approx 0.61$; once a neighbor is committed to purple, that jumps to $P(\text{Purple} \mid \text{Purple Neighbor}) \approx 0.96$, amplifying moderate uncertainty into near-certainty.

• Over-correction: the M-step then re-estimates the parameters from these over-propagated labels, compounding the error in the next round of inference.
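The amplification is easy to reproduce with the `rnb_posterior` sketch above; the two-node graph and parameter values below are invented to mimic the slide's 0.61 to 0.96 jump:

```python
# Toy over-propagation check; graph and parameters are made up for illustration.
import numpy as np

A = np.array([[0.0, 1.0], [1.0, 0.0]])      # two mutually adjacent nodes
prior = 0.5                                  # even class prior
q = np.array([0.05, 0.95])                   # purple nodes mostly have purple neighbors

soft = np.array([0.0, 0.6])                  # neighbor believed ~60% purple
hard = np.array([0.0, 1.0])                  # neighbor committed to purple
print(rnb_posterior(A, soft, prior, q)[0])   # ~0.64: moderate belief
print(rnb_posterior(A, hard, prior, q)[0])   # ~0.95: near-certainty
```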

Impact of Approximations in Semi-Supervised Relational Learning

• Does over-propagation during prediction happen in real-world, sparsely labeled networks? Yes.

[Figure: trajectories of (total positives, total negatives) predicted across iterations, five trials each, for Relational Naive Bayes (left) and Relational Logistic Regression (right); each panel marks the class prior.]

• Does over-correction happen during parameter estimation in semi-supervised relational learning on real-world, sparsely labeled networks? Yes.

[Figure: learned-parameter trajectories. Relational Naive Bayes (left): $P(+ \mid y)$ against $P(- \mid y)$, with curves for $P(\cdot \mid -)$ and $P(\cdot \mid +)$. Relational Logistic Regression (right): weights $w[1]$ against $w[0]$, tracing $\mathbf{w}$.]


Relational Stochastic EM and Relational Data Augmentation

The two methods differ in which quantities are treated stochastically rather than as fixed points:

                           Parameters: fixed point      Parameters: stochastic
Predictions: fixed point   Relational EM                (none)
Predictions: stochastic    Relational Stochastic EM     Relational Data Augmentation

Relational Stochastic EM

• A stochastic approximation to the relational EM algorithm: sample from the joint conditional distribution of the labels, then maximize the composite likelihood.

• Contrasts with Relational EM, which uses expectations of the unlabeled items.

• Average over the per-iteration parameters to reduce variance (Celeux et al., 2001).

• Final inference is still performed using a single, fixed-point set of parameters.

Alternate between:

• Gibbs sampling the labels:
$$\mathbf{Y}^U_t \sim P^t_Y\big(\mathbf{Y}^U \mid \mathbf{Y}^L, \mathbf{X}, \mathbf{E}, \tilde{\Theta}^{t-1}_Y\big)$$

• Maximizing the parameters:
$$\tilde{\Theta}^t_Y = \operatorname*{argmax}_{\Theta_Y} \sum_{v_i \in V^L} \log P_Y\big(y_i \mid \tilde{\mathbf{y}}^t_{MB(v_i)}, \mathbf{x}_i, \Theta_Y\big)$$

Aggregate the parameters:
$$\bar{\Theta}_Y = \frac{1}{T} \sum_t \tilde{\Theta}^t_Y$$

Final inference:
$$P_Y\big(\mathbf{Y}^U \mid \mathbf{Y}^L, \mathbf{X}, \mathbf{E}, \bar{\Theta}_Y\big)$$
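As a rough sketch of this loop, the function below adapts the earlier relational EM code: the E-step draws hard labels by Gibbs sampling instead of taking expectations, and the post-burn-in parameters are averaged. It reuses the hypothetical `rnb_posterior` and `m_step` helpers; the burn-in and sweep counts are arbitrary choices:

```python
# Sketch of Relational Stochastic EM, reusing the hypothetical helpers above.
import numpy as np

def relational_sem(A, y_obs, labeled, T=50, burn_in=10, sweeps=5, seed=0):
    rng = np.random.default_rng(seed)
    y = np.where(labeled, y_obs, y_obs[labeled].mean()).astype(float)
    priors, qs = [], []
    for t in range(T):
        prior, q = m_step(A, y, y_obs, labeled)     # maximize composite likelihood
        for _ in range(sweeps):                     # stochastic E-step: sample labels
            p1 = rnb_posterior(A, y, prior, q)
            y[~labeled] = (rng.random(len(y)) < p1).astype(float)[~labeled]
        if t >= burn_in:                            # keep post-burn-in parameters
            priors.append(prior)
            qs.append(q)
    prior_bar = float(np.mean(priors))              # averaged parameters (bar-Theta)
    q_bar = np.mean(qs, axis=0)
    for _ in range(20):                             # final fixed-point inference
        y[~labeled] = rnb_posterior(A, y, prior_bar, q_bar)[~labeled]
    return y, prior_bar, q_bar
```

Note that the final prediction still runs collective inference with a single averaged parameter vector; only data augmentation, next, carries parameter uncertainty into the predictions.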

Relational Data Augmentation

• Data augmentation is a Bayesian viewpoint of EM: the parameters are random variables, and the method computes the posterior predictive distribution (Tanner & Wong, 1987).

• We develop a version for relational semi-supervised learning.

• Final inference is over a distribution of parameter values.

• Requires prior distributions over the parameters and corresponding sampling methods: RNB uses a Beta (conjugate) prior; RLR uses a Normal prior with a Metropolis-Hastings sampler.

Alternate between:

• Gibbs sampling the labels:
$$\mathbf{Y}^U_t \sim P^t_Y\big(\mathbf{Y}^U \mid \mathbf{Y}^L, \mathbf{X}, \mathbf{E}, \Theta^{t-1}_Y\big)$$

• Sampling the parameters:
$$\Theta^t_Y \sim P^t\big(\Theta_Y \mid \mathbf{Y}^U_t, \mathbf{Y}^L, \mathbf{X}, \mathbf{E}\big)$$

Final parameters:
$$\bar{\Theta}_Y = \frac{1}{T} \sum_t \Theta^t_Y$$

Final inference (posterior predictive over the unlabeled labels):
$$\bar{\mathbf{Y}}^U = \frac{1}{T} \sum_t \mathbf{Y}^U_t$$
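A minimal sketch of this loop for the naive Bayes conditional, where the Beta prior makes the parameter draw conjugate; `rnb_posterior` is the hypothetical helper from the earlier sketch, and the Beta(1, 1) priors and burn-in settings are illustrative choices:

```python
# Sketch of relational data augmentation with Beta (conjugate) priors on the
# relational naive Bayes parameters; an illustration, not the paper's code.
import numpy as np

def relational_da(A, y_obs, labeled, T=200, burn_in=50, sweeps=5, seed=0):
    rng = np.random.default_rng(seed)
    y = np.where(labeled, y_obs, y_obs[labeled].mean()).astype(float)
    y_sum, kept = np.zeros(len(y)), 0
    n_lab, n_pos_lab = int(labeled.sum()), float(y_obs[labeled].sum())
    for t in range(T):
        # Sample parameters from their Beta posteriors given the current labels.
        prior = rng.beta(1 + n_pos_lab, 1 + n_lab - n_pos_lab)
        q = np.empty(2)
        for c in (0, 1):
            rows = np.where(labeled & (y_obs == c))[0]
            pos, tot = (A[rows] @ y).sum(), A[rows].sum()
            q[c] = rng.beta(1 + pos, 1 + tot - pos)
        # Gibbs-sample the unlabeled labels given the sampled parameters.
        for _ in range(sweeps):
            p1 = rnb_posterior(A, y, prior, q)
            y[~labeled] = (rng.random(len(y)) < p1).astype(float)[~labeled]
        if t >= burn_in:        # average retained label samples: this estimates
            y_sum += y          # the posterior predictive distribution
            kept += 1
    return y_sum / kept
```

Averaging the sampled label configurations, rather than re-running inference with a single point estimate, is what lets the method propagate parameter uncertainty through to the final predictions.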


Experiments: Setup

• Compare on four real-world networks.

• Competing models: Independent; Relational (with and without collective inference); Relational EM; Relational SEM; Relational DA.

• Conditional models: Relational Naive Bayes (RNB); Relational Logistic Regression (RLR).

• Error measures: zero-one loss (0-1 Loss); mean absolute error (MAE).

Dataset     Vertices   Edges     Attributes
Facebook    5,906      73,374    2
IMDB        11,280     426,167   28
DVD         16,118     75,596    28
Music       56,891     272,544   26
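For reference, the two error measures applied to predicted positive-class probabilities can be computed as below (a trivial sketch; the names are assumptions):

```python
import numpy as np

def zero_one_loss(p1, y_true):
    """Fraction of evaluated nodes whose thresholded prediction is wrong."""
    return float(np.mean((p1 >= 0.5).astype(int) != y_true))

def mean_absolute_error(p1, y_true):
    """Mean |P(y_i = 1) - y_i| over the evaluated nodes."""
    return float(np.mean(np.abs(p1 - y_true)))
```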

Experiments: DVD

[Figure: 0-1 loss vs. percentage of the graph labeled (0.1 to 0.9), for Naive Bayes (left) and Logistic Regression (right), comparing Relational EM against Relational DA.]

Relational EM’s instability in sparsely labeled domains causes poor performance

Experiments: Facebook

[Figure: MAE vs. percentage of the graph labeled (0.1 to 0.9), for Naive Bayes (left) and Logistic Regression (right), comparing Relational EM, Relational SEM, and Relational DA.]

Relational Data Augmentation can outperform Relational Stochastic EM

(In paper) We cast collective inference as a nonlinear dynamical system to analyze the convergence of Relational SEM.

(Finding) Like Relational EM, Relational SEM can sometimes learn parameters that result in an unstable system.

Conclusions

• Findings:
- Fixed-point approximation error leads relational EM to not converge: over-propagation and over-correction.
- (In paper) Fixed-point EM methods can result in unstable systems during inference.

• Developed:
- Relational Stochastic EM, which has lower variance in its parameter estimates.
- Relational Data Augmentation, which computes a posterior predictive distribution for the unlabeled instances; by modeling the uncertainty over the parameter estimates, it can outperform Relational Stochastic EM.
- Both work well in conjunction with the composite likelihood approximation, and both significantly outperformed a variety of competitors across a range of testing scenarios.


Thank you!

Joseph Pfeiffer: jpfeiffer@purdue.edu
Jennifer Neville: neville@cs.purdue.edu
Paul Bennett: paul.n.bennett@microsoft.com
