
Fast Edge Detection Using Structured Forests

by Piotr Dollár, C. Lawrence Zitnick (PAMI 2015)

Vojtěch Cvrček

Reading Group Presentation

June 6, 2018

Problem Definition

given a set of training images and corresponding segmentation masks

predict an edge map for a novel input image


Edge structure

(figure: examples of local edge structure in image patches)

Random Forest - Single Tree

a single decision tree is a mapping f_t : X → Y

each internal node j routes the input left or right via a binary split function h_θj : X → {L, R}

two families of split functions are used:

h1_θ1j(x) = [x(k) < τ], with θ1j = (k, τ)

h2_θ2j(x) = [x(k1) − x(k2) < τ], with θ2j = (k1, k2, τ)

h_θj(x) = δ · h1_θ1j(x) + (1 − δ) · h2_θ2j(x), δ ∈ {0, 1}

(tree diagram: an input x ∈ X descends through the split nodes to a leaf holding one of the outputs y1, ..., y7)
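A minimal sketch of the two split-function families in Python; the function names are illustrative, not from the paper's reference implementation:

```python
# h1 thresholds a single feature; h2 thresholds a feature difference.

def h1(x, k, tau):
    """h1(x) = [x(k) < tau], with theta1 = (k, tau)."""
    return x[k] < tau  # True -> send the sample left

def h2(x, k1, k2, tau):
    """h2(x) = [x(k1) - x(k2) < tau], with theta2 = (k1, k2, tau)."""
    return x[k1] - x[k2] < tau

def split(x, theta, delta):
    """h = delta*h1 + (1 - delta)*h2 with delta in {0, 1}: pick one family."""
    return h1(x, *theta) if delta == 1 else h2(x, *theta)
```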

Training

each tree is trained independently in a recursive manner

for a given node j and training set S_j ⊂ X × Y, randomly sample parameters θ_j from the parameter space

select the θ_j resulting in a 'good' split of the data

(figure: a bad split vs. a good split of the training samples)

Training - Information Gain Criterion

information gain criterion:

I_j = I(S_j, S_j^L, S_j^R),  (1)

where S_j^L = {(x, y) ∈ S_j | h(x, θ_j) = 0} and S_j^R = S_j \ S_j^L are the two splits

the split parameters are chosen as θ_j = argmax_θ I_j(S_j, θ)

for multiclass classification (Y ⊂ Z) the standard definition of information gain is:

I_j = H(S_j) − Σ_{k ∈ {L,R}} (|S_j^k| / |S_j|) H(S_j^k),  (2)

where H(S) is either the Shannon entropy H(S) = −Σ_y p_y log(p_y) or alternatively the Gini impurity H(S) = Σ_y p_y (1 − p_y)
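A minimal sketch of Eq. (2), assuming class labels are small non-negative integers; the helper names are illustrative:

```python
import numpy as np

def entropy(labels):
    """Shannon entropy H(S) = -sum_y p_y log(p_y)."""
    p = np.bincount(labels) / len(labels)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def gini(labels):
    """Gini impurity H(S) = sum_y p_y (1 - p_y)."""
    p = np.bincount(labels) / len(labels)
    return np.sum(p * (1.0 - p))

def information_gain(labels, goes_left, impurity=entropy):
    """I_j = H(S_j) - sum_{k in {L,R}} |S_j^k|/|S_j| * H(S_j^k)."""
    left, right = labels[goes_left], labels[~goes_left]
    if len(left) == 0 or len(right) == 0:
        return 0.0  # degenerate split carries no information
    w_l, w_r = len(left) / len(labels), len(right) / len(labels)
    return impurity(labels) - w_l * impurity(left) - w_r * impurity(right)
```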

Training

training stops when a maximum depth is reached or when the information gain or the training set size falls below a fixed threshold

a single output y ∈ Y is assigned to each leaf node based on a problem-specific ensemble model

Testing

combining the results from multiple trees depends on a problem-specific ensemble model:

classification → majority voting

regression → averaging
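A small illustrative sketch of the two standard rules (names are made up):

```python
import numpy as np

def combine_classification(tree_outputs):
    """Majority vote over per-tree class labels."""
    values, counts = np.unique(tree_outputs, return_counts=True)
    return values[np.argmax(counts)]

def combine_regression(tree_outputs):
    """Average of per-tree real-valued predictions."""
    return np.mean(tree_outputs, axis=0)
```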

Structured Forests

structured output space Y (e.g. segmentation masks)

using structured outputs in random forests presents the following challenges:

computing splits for structured output spaces of high dimension/complexity is time consuming

it is unclear how to define information gain

solution: map (somehow) the structured output space Y into a multiclass space C = {1, ..., k}

Y →? C, after which the multiclass information gain I_j applies

Structured Edges Clustering

How to efficiently compute splits? Can the task be transformed into a multiclass problem?

Y →? C = {1, ..., k}

start with an intermediate mapping:

Π : Y → Z  (3)

z = Π(y) is a long binary vector that encodes, for every pair of pixels in y, whether the two pixels belong to the same segment or to different ones
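A sketch of Π for a 16 × 16 mask of segment ids; `pi_mapping` is a made-up name:

```python
import numpy as np
from itertools import combinations

def pi_mapping(y):
    """z has one binary entry per pixel pair: 1 if same segment, else 0."""
    flat = y.ravel()
    pairs = combinations(range(flat.size), 2)  # all (256 choose 2) pairs
    return np.array([flat[a] == flat[b] for a, b in pairs], dtype=np.uint8)
```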

Structured Edges Clustering

the dimension of the vectors z ∈ Z is reduced by PCA to m = 5, and clustering (k-means) splits Z into k = 2 clusters

Y —pairs→ Z —PCA, k-means→ C

the multiclass information gain I_j now applies

Structured Edges Clustering

since the elements of Y are of size 16 × 16, the dimension of Z is (256 choose 2) = 32 640

too expensive → randomly sample m = 256 dimensions of Z

Y —sampled pairs→ Z —PCA, k-means→ C
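Putting the pieces together, a sketch of the discretization Y → Z → C with the slides' settings (256 sampled pixel pairs, PCA, k-means with k = 2), using scikit-learn; the exact pair-sampling scheme here is an assumption, not the paper's procedure:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def masks_to_class_labels(masks, n_pairs=256, n_components=5, k=2, seed=0):
    """masks: (n, 16, 16) segment-id arrays -> cluster ids in C = {0, ..., k-1}."""
    rng = np.random.default_rng(seed)
    flat = masks.reshape(len(masks), -1)         # (n, 256) flattened masks
    a = rng.integers(0, flat.shape[1], n_pairs)  # sampled pair endpoints
    b = rng.integers(0, flat.shape[1], n_pairs)
    z = (flat[:, a] == flat[:, b]).astype(float) # sampled dimensions of Z
    z = PCA(n_components=n_components).fit_transform(z)
    return KMeans(n_clusters=k, n_init=10).fit_predict(z)
```

At each tree node this discretization is recomputed on that node's training set, after which the multiclass information gain of Eq. (2) applies.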

Ensemble model

during training, we need to assign a single prediction to a leaf node

during testing, we need to combine multiple predictions into one

to select a single output from a set y_1, ..., y_k ∈ Y:

compute z_i = Π(y_i)

select y_{k*} such that k* = argmin_k Σ_{i,j} (z_{k,j} − z_{i,j})²  (the medoid of the set)

a domain-specific ensemble model (for the edge map): y′_{k*} = E[y′_i]
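A minimal sketch of the medoid selection over the encodings z_i (illustrative only):

```python
import numpy as np

def medoid_index(zs):
    """zs: array of shape (k, d); returns argmin_k sum_{i,j} (z_{k,j} - z_{i,j})^2."""
    # Pairwise squared distances, summed over all other encodings and dims.
    d2 = ((zs[:, None, :] - zs[None, :, :]) ** 2).sum(axis=(1, 2))
    return int(np.argmin(d2))
```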

Structured Forest Training - Overview

input:

image patch 32 × 32, sampled into 7228 features

corresponding segmentation mask 16 × 16

randomly selected features per split

segmentation masks → clusters → split information gain

medoid → segmentation mask (leaf prediction)

averaging → soft edge map

Structured Forest Testing

(figures: the trained forest is applied to overlapping patches of a test image and the per-patch predictions are averaged into a soft edge map)

Efficiency

Why is the method so fast?

a single decision tree → lots of pixel information (a whole 16 × 16 output per evaluation)

lots of pixel information → a small random forest suffices → fast evaluation

Multiscale Detection

the multiscale version runs the detector on the original, double, and half resolution of the input image

the resulting three edge maps are resized to the original resolution and averaged

slower (×5)

improved edge quality
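A sketch of the multiscale scheme; `detect_edges` stands in for a hypothetical single-scale structured-forest detector returning an edge map in [0, 1], and OpenCV is used only for resizing:

```python
import numpy as np
import cv2  # OpenCV

def detect_edges_multiscale(image, detect_edges):
    h, w = image.shape[:2]
    maps = []
    for scale in (0.5, 1.0, 2.0):  # half, original, double resolution
        scaled = cv2.resize(image, None, fx=scale, fy=scale)
        edges = detect_edges(scaled)
        maps.append(cv2.resize(edges, (w, h)))  # back to original resolution
    return np.mean(maps, axis=0)                # average the three edge maps
```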

Edge Sharpening

individual predictions are noisy and do not perfectly align with each other or with the underlying image data

sharpening takes an individual prediction y and produces a new mask that better aligns with the image patch x:

compute the mean color of each segment s: µ_s = E[x(j) | y(j) = s]

reassign pixel j to segment s* = argmin_s ‖µ_s − x(j)‖ if its color x(j) is closer to that segment's mean and the label s* occurs in the 4-connected vicinity of j

sharpening can be repeated multiple times; Dollár & Zitnick claim that in practice two steps suffice
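A sketch of one sharpening step on a single patch, a simplified reading of the procedure above rather than the paper's optimized implementation; x is an (H, W, 3) color patch, y an (H, W) mask of segment ids:

```python
import numpy as np

def sharpen_once(x, y):
    labels = np.unique(y)
    means = {s: x[y == s].mean(axis=0) for s in labels}  # mu_s per segment
    out = y.copy()
    H, W = y.shape
    for i in range(H):
        for j in range(W):
            # candidate labels: current pixel plus its 4-connected neighbours
            cand = {y[i, j]}
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < H and 0 <= nj < W:
                    cand.add(y[ni, nj])
            # reassign to the candidate segment with the closest mean color
            out[i, j] = min(cand, key=lambda s: np.linalg.norm(means[s] - x[i, j]))
    return out
```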

Parameter Sweeps

(figures: parameter sweep plots)

Results - SE (Structured Edges)

(results figure)

Results - HED (comparison with Holistically-Nested Edge Detection)

(results figure)

Conclusion

real-time performance

structured learning

a suitable preprocessing step for methods that require speed