Group Norm for Learning Latent Structural SVMs




Group Norm for Learning Latent Structural SVMs

Daozheng Chen (UMD, College Park), Dhruv Batra (TTI-Chicago), Bill Freeman (MIT), Micah K. Johnson (GelSight, Inc.)

Overview

• Data with complete annotation is rarely available.

• Latent variable models capture the interaction between:
  o observed data (e.g., gradient histogram image features)
  o latent or hidden variables not observed in the training data (e.g., location of object parts).

• Parameter estimation involves a difficult non-convex optimization problem (EM, CCCP, self-paced learning).

• Our goal:
  • Estimate model parameters
  • Learn the complexity of the latent variable space.

• Our approach: a group norm (ℓ1/ℓ2) for regularization to estimate the parameters of a latent-variable model.

Latent Structural SVM

Prediction Rule:  f_w(x) = argmax_{(y, h) ∈ Y × H} w · Φ(x, y, h)

where Y is the label space, H is the latent space, and Φ(x, y, h) is the joint feature vector.
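The prediction rule can be sketched in code: jointly maximize the score w · Φ(x, y, h) over labels and latent states by exhaustive search. The feature map `phi`, the tiny label/latent spaces, and all dimensions below are illustrative stand-ins, not the poster's actual model.

```python
import numpy as np

def phi(x, y, h, n_h=3, n_labels=2):
    """Toy joint feature vector: x copied into the block for the pair (y, h)."""
    d = len(x)
    vec = np.zeros(n_labels * n_h * d)
    start = (y * n_h + h) * d
    vec[start:start + d] = x
    return vec

def predict(w, x, labels=(0, 1), latent_states=(0, 1, 2)):
    """f_w(x) = argmax over (y, h) of w . phi(x, y, h)."""
    best = max((np.dot(w, phi(x, y, h)), y, h)
               for y in labels for h in latent_states)
    return best[1], best[2]  # (predicted label, maximizing latent state)
```

For example, a weight vector that is nonzero only in the block for (y=1, h=2) makes that pair win the argmax.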

Inducing Group Norm

w is partitioned into P groups w_1, ..., w_P; each group corresponds to the parameters of one latent variable state. The induced group norm is

||w||_{1,2} = Σ_{p=1}^{P} ||w_p||_2
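A minimal numeric sketch of this ℓ1/ℓ2 group norm (the vector and group partition are made up for illustration):

```python
import numpy as np

def group_norm(w, groups):
    """||w||_{1,2}: sum over groups p of the l2 norm of w_p."""
    return sum(np.linalg.norm(w[idx]) for idx in groups)

w = np.array([3.0, 4.0, 0.0, 0.0, 1.0, 0.0])
groups = [np.array([0, 1]), np.array([2, 3]), np.array([4, 5])]
# per-group l2 norms: 5.0, 0.0, 1.0 -> group norm 6.0
print(group_norm(w, groups))
```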

Alternating Coordinate and Subgradient Descent

The learning objective is nonconvex. Rewrite it: minimize an upper bound of the objective instead. This upper bound is convex if the latent variables {h_i} are fixed, so we alternate between fixing {h_i} (a coordinate step) and minimizing the resulting convex bound over w by subgradient descent.
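The alternation can be sketched as follows. For brevity this toy uses a plain ℓ2 regularizer in the inner convex step rather than the group norm, a 0/1 label loss, and a made-up feature map and dataset; it is a sketch of the scheme, not the poster's implementation.

```python
import numpy as np

def phi(x, y, h, n_h=2, n_labels=2):
    """Toy joint feature vector: x copied into the block for (y, h)."""
    d = len(x)
    vec = np.zeros(n_labels * n_h * d)
    start = (y * n_h + h) * d
    vec[start:start + d] = x
    return vec

def predict(w, x, labels=(0, 1), latent_states=(0, 1)):
    return max((np.dot(w, phi(x, y, h)), y, h)
               for y in labels for h in latent_states)[1]

def train(data, dim=8, labels=(0, 1), latent_states=(0, 1),
          lam=0.1, outer=5, inner=50, lr=0.01):
    w = np.zeros(dim)
    for _ in range(outer):
        # Coordinate step (handles the nonconvex part):
        # impute each h_i given the current w.
        latents = [max(latent_states, key=lambda h: np.dot(w, phi(x, y, h)))
                   for x, y in data]
        # Subgradient descent on the now-convex upper bound.
        for _ in range(inner):
            g = lam * w
            for (x, y), h in zip(data, latents):
                # loss-augmented inference with 0/1 label loss
                s, yb, hb = max((np.dot(w, phi(x, yp, hp)) + float(yp != y), yp, hp)
                                for yp in labels for hp in latent_states)
                if s > np.dot(w, phi(x, y, h)):  # margin violated
                    g += phi(x, yb, hb) - phi(x, y, h)
            w -= lr * g
    return w
```

On a toy two-point dataset with disjoint features the learned w separates the classes.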

Digit recognition experiment (following the setup of Kumar et al., NIPS '10)
• MNIST data: binary classification on four difficult digit pairs: (1,7), (2,7), (3,8), (8,9)
• Training data: 5,851 - 6,742 examples; testing data: 974 - 1,135 examples
• Rotate digit images with angles from -60° to 60°
• PCA to form a 10-dimensional feature vector

Experiment

• Significantly higher accuracy than random sampling.

• 66% faster than the full model with no loss in accuracy!

Key Contribution (Digit Recognition)

[Figure: ℓ2 norm of the parameter vectors for different rotation angles (-60° to 60°, in 12° steps) over the 4 digit pairs. Only a few angles are selected, much fewer than the full set of 22; the remaining angles are not selected.]

[Pipeline: Images → Rotation (latent variable) → Feature vector]

• At the group level, the ℓ1/ℓ2 norm behaves like an ℓ1 norm and induces group sparsity, which is why we use it for regularization.
• Within each group, it behaves like an ℓ2 norm and does not promote sparsity.
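To illustrate the group-sparsity effect (an illustration added here, not from the poster): the proximal operator of the ℓ1/ℓ2 norm soft-thresholds each group's ℓ2 norm, so entire groups are zeroed at once while surviving groups keep their within-group direction.

```python
import numpy as np

def prox_group_norm(w, groups, thresh):
    """Block soft-thresholding: shrink each group's l2 norm by `thresh`,
    zeroing any group whose norm falls below the threshold."""
    out = np.zeros_like(w)
    for idx in groups:
        nrm = np.linalg.norm(w[idx])
        if nrm > thresh:
            out[idx] = (1.0 - thresh / nrm) * w[idx]
    return out

w = np.array([3.0, 4.0, 0.1, 0.2])
groups = [np.array([0, 1]), np.array([2, 3])]
# group [3, 4] has norm 5 -> shrunk to 0.8 * [3, 4] = [2.4, 3.2];
# group [0.1, 0.2] has norm ~0.22 < 1 -> zeroed entirely
print(prox_group_norm(w, groups, thresh=1.0))
```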

Learning objective:

min_w  λ Σ_p ||w_p||_2 + Σ_i [ max_{y,h} ( w · Φ(x_i, y, h) + Δ(y_i, y) ) - max_h w · Φ(x_i, y_i, h) ]

Subgradient: each nonzero group w_p contributes w_p / ||w_p||_2 to the subgradient of the regularizer.
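A sketch of the subgradient computation for the group-norm regularizer (group boundaries again made up): each nonzero group w_p contributes w_p / ||w_p||_2; at a zero group any vector with ℓ2 norm at most 1 is a valid subgradient, and we pick 0.

```python
import numpy as np

def group_norm_subgradient(w, groups):
    """A subgradient of sum_p ||w_p||_2 at w."""
    g = np.zeros_like(w)
    for idx in groups:
        nrm = np.linalg.norm(w[idx])
        if nrm > 0:
            g[idx] = w[idx] / nrm  # gradient of ||w_p||_2 where it is smooth
        # for a zero group, 0 is a valid subgradient choice
    return g

w = np.array([3.0, 4.0, 0.0, 0.0])
groups = [np.array([0, 1]), np.array([2, 3])]
print(group_norm_subgradient(w, groups))
```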

[Figure: Felzenszwalb et al.'s car model learned on the PASCAL VOC 2007 data. Each row is a component of the model (Component #1, Component #2); columns show root filters, part filters, and part displacements.]