![Page 1: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/1.jpg)
Detecting Cartoons: A Case Study in Automatic Video-Genre Classification
Tzvetanka Ianeva
Arjen de Vries
Hein Röhrig
![Page 2: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/2.jpg)
Outline
• Goal: remove cartoons from search results in TREC-2002 video track
• Our Approach: extract Image Descriptors & SVM Machine Learning
• Related work
• Novel Descriptors from Granulometry
• SVM Learning
• Experimental Results
![Page 3: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/3.jpg)
TREC-2002 video track
• TREC: workshops for large-scale evaluation of information retrieval technology
• CWI participation: Probabilistic Multimedia Retrieval Model
• The model does not sufficiently distinguish “cartoons”
![Page 4: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/4.jpg)
Example of undesirable ‘cartoon’
Query
Best matches returned
![Page 5: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/5.jpg)
Related work
• M. Roach et al. Motion based classification of cartoons (2001)
• B.T. Truong et al. Automatic genre identification for content-based video categorization (2000)
• J.R. Smith et al. Searching for images and videos on the world wide web
• N.C. Rowe et al. Automatic caption localization for photographs on www pages
• V. Athitsos et al. [ASF] Distinguishing photographs and graphics on the www
![Page 6: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/6.jpg)
Cartoons
• What is a Cartoon?
– Cartoons do not contain any photographic material
– Photos: taken with a photographic camera
• Appears easy to find cartoons
– Few, simple, strong colors, patches of uniform colors, strong black edges, text
![Page 7: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/7.jpg)
Quiz: Cartoon or Photo?
![Page 8: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/8.jpg)
![Page 9: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/9.jpg)
Examples not so Typical
![Page 10: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/10.jpg)
![Page 11: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/11.jpg)
Photos like cartoons
![Page 12: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/12.jpg)
![Page 13: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/13.jpg)
“Cartoons” like photos
![Page 14: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/14.jpg)
![Page 15: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/15.jpg)
Artificial photos
![Page 16: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/16.jpg)
![Page 17: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/17.jpg)
Small cues
![Page 18: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/18.jpg)
Overlapping Frames
![Page 19: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/19.jpg)
Mixed
![Page 20: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/20.jpg)
Shadow & Sparkle
![Page 21: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/21.jpg)
Image Descriptors
• greater correlation
• normalized
• Example: avg. sat., thresh. brightness
[Figure: an input image (240x352x3) is mapped to a vector of image descriptors with components 1, 2, …, 148; the two example vectors begin 0.6231, 0.9266, … and 0.2880, 0.4125, …]
![Page 22: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/22.jpg)
Overview of all our image descriptors

| Image descriptor | Dimension |
|---|---|
| average saturation | 1 |
| threshold brightness | 1 |
| color histogram | 45 |
| edge-direction histogram | 40 |
| compression ratio | 1 |
| multi-scale pattern spectrum | 60 |
![Page 23: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/23.jpg)
Brightness and Saturation
• HSV color model
• Cartoons are brighter => use % of pixels with Value > 0.4
• Cartoons have strong colors => use average Saturation
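The two descriptors on this slide can be sketched as follows. This is a minimal sketch, assuming RGB input scaled to [0, 1] and the usual HSV definitions (V = max component, S = (max - min) / max); the function names are hypothetical and not taken from the authors' code.

```python
import numpy as np

def saturation_and_value(rgb):
    """Return the HSV saturation and value channels of an RGB image in [0, 1]."""
    rgb = np.asarray(rgb, dtype=float)
    v = rgb.max(axis=-1)                  # HSV value = largest RGB component
    spread = v - rgb.min(axis=-1)
    # Saturation = (max - min) / max, defined as 0 for black pixels.
    s = np.divide(spread, v, out=np.zeros_like(v), where=v > 0)
    return s, v

def average_saturation(rgb):
    """Mean HSV saturation over all pixels."""
    s, _ = saturation_and_value(rgb)
    return float(s.mean())

def threshold_brightness(rgb, threshold=0.4):
    """Fraction of pixels whose HSV value exceeds the threshold."""
    _, v = saturation_and_value(rgb)
    return float((v > threshold).mean())

# Two tiny synthetic images: pure red, and dark grey.
red = np.zeros((4, 4, 3)); red[..., 0] = 1.0
grey = np.full((4, 4, 3), 0.2)
print(average_saturation(red), threshold_brightness(red))
print(average_saturation(grey), threshold_brightness(grey))
```

Both values already lie in [0, 1], which fits the "normalized" requirement stated for the descriptors.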
![Page 24: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/24.jpg)
Saturation in cartoon and photo images
[Figure: two example images, each shown in RGB next to its HSV saturation (S) channel; average saturation 0.2880 and 0.6231]
![Page 25: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/25.jpg)
Brightness in cartoon and photo images
[Figure: two example images, each shown in RGB next to its HSV value (V) channel; threshold brightness 0.9266 and 0.4125]
![Page 26: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/26.jpg)
Histograms
• Image $I : X \times Y \to \mathbb{R}^c$
• Filter $F : I \mapsto I'$
• Bins $B_k$: a partition of $\mathbb{R}^c$
• $h_k = \#\{(x,y) : I'(x,y) \in B_k\}$
• E.g. brightness metric: $I$ grayscale, $c = 1$, $B_1 = [0, 0.4]$, $B_2 = (0.4, 1]$, return $h_2$
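The filter-then-bin recipe above might look like this for the single-channel case. A sketch only: `histogram_descriptor` is a hypothetical name, and `np.histogram` uses half-open bins, so a pixel at exactly 0.4 lands in the upper bin rather than the lower one.

```python
import numpy as np

def histogram_descriptor(image, filt, bin_edges):
    """Count filtered pixel values per bin: h_k = #{(x, y) : I'(x, y) in B_k}."""
    filtered = filt(np.asarray(image, dtype=float))      # I' = F(I)
    counts, _ = np.histogram(filtered.ravel(), bins=bin_edges)
    return counts

gray = np.array([[0.1, 0.3, 0.5],
                 [0.7, 0.9, 0.2]])
identity = lambda img: img                # trivial filter F
h = histogram_descriptor(gray, identity, [0.0, 0.4, 1.0])
h2 = h[1]                                 # number of pixels in the upper bin
print(h, h2)
```

Dividing `h2` by the number of pixels would give the percentage form used for the brightness descriptor.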
![Page 27: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/27.jpg)
Color Histogram
• More general than brightness & saturation
• Again HSV color space
• Partition HSV into 3x3x5 = 45 bins
• Cartoons have fewer colors => color histogram descriptor
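A 45-bin histogram of this kind could be computed as below. The slide does not say which HSV axis gets 5 bins, so the (H, S, V) = (3, 3, 5) assignment is an assumption, as is the input being already converted to HSV with all channels in [0, 1].

```python
import numpy as np

def hsv_color_histogram(hsv, bins=(3, 3, 5)):
    """Normalized joint histogram over (H, S, V), all assumed to lie in [0, 1]."""
    pixels = np.asarray(hsv, dtype=float).reshape(-1, 3)
    counts, _ = np.histogramdd(pixels, bins=bins, range=[(0, 1)] * 3)
    return counts.ravel() / len(pixels)

rng = np.random.default_rng(0)
photo_like = rng.random((32, 32, 3))                    # colors spread over HSV space
cartoon_like = np.tile([0.05, 0.9, 0.9], (32, 32, 1))   # one uniform color patch

print(np.count_nonzero(hsv_color_histogram(photo_like)))
print(np.count_nonzero(hsv_color_histogram(cartoon_like)))
```

An image with few colors concentrates its mass in few bins, which is the property the descriptor is meant to expose.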
![Page 28: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/28.jpg)
Color histogram for [image] in the 45-bin HSV space
![Page 29: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/29.jpg)
Color histogram for [image] in the 45-bin HSV space
![Page 30: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/30.jpg)
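Edge detection
• Cartoons have strong black edges => approximate the total derivative of the intensity $I(x,y)$:
$\nabla I_{x,y} = \left( \frac{\partial I_{x,y}}{\partial x}, \frac{\partial I_{x,y}}{\partial y} \right)$
• Approximate the direction $\theta$ and the magnitude $|\nabla|$, then take the histogram of $(\theta, |\nabla|)$
– 5 intervals for $|\nabla|$: $0 \dots \sqrt{20}$
– 8 intervals for $\theta$: $0 \dots 2\pi$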
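The 8 x 5 = 40 bin edge histogram can be sketched as follows. The slide does not name the derivative filter, so central differences via `np.gradient` are an assumption here, and magnitudes above the upper limit are simply clipped into the last bin.

```python
import numpy as np

def edge_histogram(gray, n_mag=5, n_ang=8, max_mag=np.sqrt(20)):
    """Normalized joint histogram of gradient direction and magnitude (8 x 5 = 40 bins)."""
    gray = np.asarray(gray, dtype=float)
    gy, gx = np.gradient(gray)                     # central differences (assumed filter)
    magnitude = np.minimum(np.hypot(gx, gy), max_mag)
    angle = np.mod(np.arctan2(gy, gx), 2 * np.pi)  # direction mapped to [0, 2*pi]
    counts, _, _ = np.histogram2d(
        angle.ravel(), magnitude.ravel(),
        bins=(n_ang, n_mag),
        range=[(0, 2 * np.pi), (0, max_mag)],
    )
    return counts.ravel() / gray.size

step = np.zeros((8, 8)); step[:, 4:] = 1.0         # a single vertical edge
h = edge_histogram(step)
print(h.shape, h.sum())
```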
![Page 31: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/31.jpg)
Edge angles & edge magnitudes
![Page 32: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/32.jpg)
Edge histogram
![Page 33: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/33.jpg)
Compressibility
• Cartoons: simpler composition
• Detect complexity by measuring compression ratio
• Theory: “Kolmogorov complexity”
• Our application: use lossless PNG compression
• Lossy JPEG not useful
[Figure: two example images with compression ratios 0.13548 and 0.23365]
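The idea of using compressed size as a complexity measure can be illustrated with the standard library. Note the substitution: this sketch uses raw DEFLATE through `zlib` as a stand-in for PNG (which adds per-row filtering on top of DEFLATE), so its ratios will not match the PNG figures on the slide.

```python
import zlib
import numpy as np

def compression_ratio(image):
    """Compressed size divided by raw size for an 8-bit image (smaller = simpler)."""
    raw = np.ascontiguousarray(image, dtype=np.uint8).tobytes()
    return len(zlib.compress(raw, 9)) / len(raw)

rng = np.random.default_rng(0)
flat = np.full((64, 64, 3), 200, dtype=np.uint8)                 # one uniform patch
noisy = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)   # no structure at all

print(compression_ratio(flat), compression_ratio(noisy))
```

A lossless codec is the natural choice here because a lossy one discards exactly the detail whose presence is being measured.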
![Page 34: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/34.jpg)
Granulometries
• Idea: measure size distribution of objects
• How? openings by structuring element of growing scale
• Normalized size distribution
• Derivative = pattern spectrum
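The four steps above can be sketched in NumPy. To keep the example short it uses a flat square structuring element of side 2r + 1, which is a simplification and not the disk or parabola discussed in the talk; all function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def _window_reduce(image, size, reduce, pad_value):
    """Apply min or max over every size x size neighbourhood."""
    pad = size // 2
    padded = np.pad(image, pad, mode="constant", constant_values=pad_value)
    return reduce(sliding_window_view(padded, (size, size)), axis=(2, 3))

def opening(image, radius):
    """Flat opening: erosion followed by dilation with a (2r+1) x (2r+1) square."""
    size = 2 * radius + 1
    eroded = _window_reduce(image, size, np.min, np.inf)
    return _window_reduce(eroded, size, np.max, -np.inf)

def pattern_spectrum(image, max_radius):
    """Discrete derivative of the normalized size distribution."""
    image = np.asarray(image, dtype=float)
    total = image.sum()
    # Fraction of image "mass" removed by the opening at each scale.
    removed = [1.0 - opening(image, r).sum() / total for r in range(1, max_radius + 1)]
    return np.diff(np.concatenate(([0.0], removed)))

blob = np.zeros((15, 15)); blob[5:10, 5:10] = 1.0    # one 5 x 5 square
spectrum = pattern_spectrum(blob, 3)
print(spectrum)
```

For the single 5 x 5 square, the 3 x 3 and 5 x 5 openings leave the image unchanged and the 7 x 7 opening removes it entirely, so all of the spectrum's mass sits at the third scale.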
![Page 35: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/35.jpg)
Openings
• Opening = erosion then dilation with same SE: $\gamma_B(f) = \delta_{\hat{B}}[\varepsilon_B(f)]$
![Page 36: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/36.jpg)
Structuring Elements
• Non-flat parabola better(?) than flat disk
• Parabola: efficient computation, symmetry
$[\varepsilon_B(f)](x,y) = \min_{(x',y') \in B} \{ f(x + x', y + y') - B(x', y') \}$
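An erosion with a non-flat, parabolic structuring element might be sketched as below. The parabola $B(x', y') = -(x'^2 + y'^2)/\text{scale}$, its truncation to a finite window, and the edge padding are all assumptions made for illustration; the slide does not give these details.

```python
import numpy as np

def parabolic_erosion(image, scale, radius):
    """Erosion min { f(x+dx, y+dy) - B(dx, dy) } with B = -(dx^2 + dy^2) / scale."""
    image = np.asarray(image, dtype=float)
    padded = np.pad(image, radius, mode="edge")
    h, w = image.shape
    result = np.full((h, w), np.inf)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = padded[radius + dy : radius + dy + h,
                             radius + dx : radius + dx + w]
            penalty = (dx * dx + dy * dy) / scale      # equals -B(dx, dy)
            result = np.minimum(result, shifted + penalty)
    return result

rng = np.random.default_rng(0)
f = rng.random((10, 10))
eroded = parabolic_erosion(f, scale=2.0, radius=3)
```

Because the parabola is zero at the origin, the eroded image never exceeds the original, and a constant image is left unchanged.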
![Page 37: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/37.jpg)
Small-scale pattern spectrum descriptors
SE: disk with radius $r_i = i$, $i = 1, \dots, 20$
![Page 38: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/38.jpg)
SVM Learning
• Simplest case: linear separator
• SVM finds hyperplane with largest margin
• Closest points = Support Vectors
![Page 39: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/39.jpg)
SVM Learning: nonseparable
• Noisy data: no separating hyperplane at all!
• Solution: penalty C for points inside the margin
• C-SVM machines
![Page 40: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/40.jpg)
SVM = quadratic programming
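SVM task:
$$\min_{w,\,b,\,\xi}\ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{l}\xi_i \quad \text{subject to}\quad y_i(\langle w, x_i\rangle + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0,\ \ i = 1,\dots,l$$
Equivalent dual problem:
$$\max_{\alpha}\ \sum_{i=1}^{l}\alpha_i - \frac{1}{2}\sum_{i,j=1}^{l}\alpha_i\alpha_j y_i y_j \langle x_i, x_j\rangle \quad \text{subject to}\quad 0 \le \alpha_i \le C,\ \ i = 1,\dots,l,\ \ \sum_{i=1}^{l}\alpha_i y_i = 0$$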
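To make the SVM task on this slide concrete, the soft-margin objective can be evaluated for a fixed hyperplane. This sketch does not solve the quadratic program; it only computes the objective, taking each slack as the smallest value that satisfies the constraint. The data points are made up.

```python
import numpy as np

def soft_margin_objective(w, b, X, y, C):
    """Value of (1/2)||w||^2 + C * sum(xi_i) with the smallest feasible slacks xi_i."""
    margins = y * (X @ w + b)
    slack = np.maximum(0.0, 1.0 - margins)     # xi_i = max(0, 1 - y_i(<w, x_i> + b))
    return 0.5 * float(w @ w) + C * float(slack.sum())

X = np.array([[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]])
y = np.array([1.0, -1.0, -1.0])                # the third point is on the wrong side
w, b = np.array([1.0, 0.0]), 0.0

print(soft_margin_objective(w, b, X, y, C=1.0))
print(soft_margin_objective(w, b, X, y, C=10.0))
```

Raising C makes the misplaced third point more expensive, which is how the penalty trades margin width against training errors.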
![Page 41: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/41.jpg)
SVM with kernels
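SVM task:
$$\min_{w,\,b,\,\xi}\ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{l}\xi_i \quad \text{subject to}\quad y_i(\langle w, \Phi(x_i)\rangle + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0,\ \ i = 1,\dots,l$$
Equivalent dual problem:
$$\max_{\alpha}\ \sum_{i=1}^{l}\alpha_i - \frac{1}{2}\sum_{i,j=1}^{l}\alpha_i\alpha_j y_i y_j\, k(x_i, x_j) \quad \text{subject to}\quad 0 \le \alpha_i \le C,\ \ i = 1,\dots,l,\ \ \sum_{i=1}^{l}\alpha_i y_i = 0$$
where $\Phi : \mathbb{R}^n \to F$ and $k(x, \hat{x}) = \langle \Phi(x), \Phi(\hat{x}) \rangle$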
![Page 42: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/42.jpg)
SVM kernels
RBF kernels: $k(x, \hat{x}) = \exp\left( -\frac{\|x - \hat{x}\|^2}{2\sigma^2} \right)$
Polynomial kernels: $k(x, \hat{x}) = (\langle x, \hat{x} \rangle + 1)^q$
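The two kernel families can be written down directly. A sketch, assuming the RBF kernel is normalized by $2\sigma^2$ (the slide's formula is partly garbled, so this normalization is an assumption) and that inputs are plain feature vectors.

```python
import numpy as np

def rbf_kernel(x, x_hat, sigma2):
    """k(x, x^) = exp(-||x - x^||^2 / (2 * sigma^2)); sigma2 is sigma squared."""
    diff = np.asarray(x, dtype=float) - np.asarray(x_hat, dtype=float)
    return float(np.exp(-(diff @ diff) / (2.0 * sigma2)))

def polynomial_kernel(x, x_hat, q):
    """k(x, x^) = (<x, x^> + 1)^q."""
    return float((np.dot(x, x_hat) + 1.0) ** q)

print(rbf_kernel([0.0, 0.0], [1.0, 0.0], sigma2=0.5))
print(polynomial_kernel([1.0, 0.0], [1.0, 0.0], q=2))
```

A smaller σ² makes the RBF kernel more local: similarity decays faster with distance between descriptor vectors.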
![Page 43: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/43.jpg)
SVM with kernels: decision function
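SVM task:
$$\min_{w,\,b,\,\xi}\ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{l}\xi_i \quad \text{subject to}\quad y_i(\langle w, \Phi(x_i)\rangle + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0,\ \ i = 1,\dots,l$$
Equivalent dual problem:
$$\max_{\alpha}\ \sum_{i=1}^{l}\alpha_i - \frac{1}{2}\sum_{i,j=1}^{l}\alpha_i\alpha_j y_i y_j\, k(x_i, x_j) \quad \text{subject to}\quad 0 \le \alpha_i \le C,\ \ i = 1,\dots,l,\ \ \sum_{i=1}^{l}\alpha_i y_i = 0$$
Decision function:
$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{l} \alpha_i y_i\, k(x_i, x) + b \right)$$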
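Evaluating the decision function needs only the support vectors, their multipliers and labels, the offset, and a kernel. The sketch below uses hand-picked multipliers, not trained ones, and breaks the tie at a score of exactly 0 in favour of +1; both choices are illustrative assumptions.

```python
import numpy as np

def decision_function(x, support_vectors, alphas, labels, b, kernel):
    """f(x) = sgn(sum_i alpha_i * y_i * k(x_i, x) + b)."""
    score = sum(a * y * kernel(sv, x)
                for a, y, sv in zip(alphas, labels, support_vectors)) + b
    return 1 if score >= 0 else -1

def rbf(u, v, sigma2=0.07):
    """RBF kernel with an assumed 2 * sigma^2 normalization."""
    d = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    return float(np.exp(-(d @ d) / (2.0 * sigma2)))

# Hand-picked (not trained) multipliers, purely for illustration.
svs = [np.array([0.0, 0.0]), np.array([1.0, 1.0])]
alphas, labels, b = [1.0, 1.0], [+1, -1], 0.0

print(decision_function([0.1, 0.0], svs, alphas, labels, b, rbf))
print(decision_function([0.9, 1.0], svs, alphas, labels, b, rbf))
```

Only points with non-zero multipliers contribute, which is why the support vectors are all that must be stored after training.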
![Page 44: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/44.jpg)
Experimental Data
• Key frames from TREC 2002 Video Track
• 13,026 photographic images
• 1,620 cartoons
• Manually classified
• Experiments 1-3: train on (random) 3,908 photos and 486 cartoons
![Page 45: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/45.jpg)
Experiment 1: individual performance
| Descriptor | Error photos | Error cartoons | Total error | σ² |
|---|---|---|---|---|
| average saturation | 0.0027 | 0.9541 | 0.108 | σ² = 0.1 |
| threshold brightness | 0 | 1 | 0.1106 | 0.05 < σ² < 0.5 |
| color histogram | 0.0095 | 0.754 | 0.0919 | σ² = 0.07 |
| edge histogram | 0 | 1 | 0.1106 | 0.05 < σ² < 0.5 |
| compression ratio | 0 | 1 | 0.1106 | 0.05 < σ² < 0.5 |
| pattern spectrum | 0.0002 | 0.9497 | 0.1052 | σ² = 0.07 |
$E_t = E_p \frac{|p|}{|p| + |c|} + E_c \frac{|c|}{|p| + |c|}$
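The weighted total error can be checked numerically. The test-set sizes used below are an assumption: they are the images left over after removing the 3,908 photos and 486 cartoons used for training from the 13,026 photos and 1,620 cartoons.

```python
def total_error(err_photos, err_cartoons, n_photos, n_cartoons):
    """E_t = E_p * |p| / (|p| + |c|) + E_c * |c| / (|p| + |c|)."""
    n = n_photos + n_cartoons
    return err_photos * n_photos / n + err_cartoons * n_cartoons / n

# Assumed test-set sizes: everything not used for training.
n_photos = 13026 - 3908      # 9118
n_cartoons = 1620 - 486      # 1134

always_photo = total_error(0.0, 1.0, n_photos, n_cartoons)
print(round(always_photo, 4))
```

Under this assumption a classifier that labels every frame as a photo scores about 0.1106, which matches the total error reported for the descriptors whose cartoon error is 1.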
![Page 46: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/46.jpg)
Experiment 2: “convergence” of SVM learning
[Figure: error (axis from 0.1020 to 0.1120) plotted against 1/2, 1/4, …, 1/18; label: σ² (pattern spectrum)]
![Page 47: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/47.jpg)
Experiment 3: combined performance
| Descriptors | Error photos | Error cartoons | Total error |
|---|---|---|---|
| all - average saturation | 0.0068 | 0.6914 | 0.0825 |
| all - threshold brightness | 0.0111 | 0.657 | 0.0825 |
| all - color histogram | 0.0068 | 0.7734 | 0.0916 |
| all - edge histogram | 0.009 | 0.672 | 0.0823 |
| all - compression ratio | 0.0098 | 0.6684 | 0.0826 |
| all - pattern spectrum | 0.011 | 0.7046 | 0.0884 |
| all | 0.0111 | 0.6437 | 0.0811 |

σ² = 0.06
![Page 48: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/48.jpg)
Experiment 4: web-image classifier on our data
[Figure: error (axis from 0.0 to 0.5) plotted against training set (100 to 600); curves: we, [ASF]]
Test set: random 1,000 photos and 1,000 cartoons
![Page 49: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/49.jpg)
Experiment 5: Performance on web images
[Figure: error (axis from 0 to 0.1); series: we, [ASF], + dimension and file type features]
Comparison with 14,039 photographic and 9,512 graphical images harvested from the WWW; train on (random) 4,239 photographic and 2,826 graphical images
![Page 50: Detecting Cartoons a Case Study in Automatic Video-Genre Classification Tzvetanka Ianeva Arjen de Vries Hein Röhrig](https://reader036.vdocuments.site/reader036/viewer/2022062308/56649cf35503460f949c1159/html5/thumbnails/50.jpg)
Conclusions
• Hard task: good classifier
• Use dynamics/spatio-temporal relations?
• Semantic Gap?
• Combine classifiers?
• Granulometry not enough