mobile visual search & mpeg-7 standard
Mobile Visual Search + MPEG-7 standard
NGUYEN ANH TUAN, 2015/05/27
Today
• Introduction: Opportunities and Definitions
• Mobile Visual Search
Ø System Architectures
Ø Compact Descriptor for Visual Search
• International Standards: MPEG-7
• Applications
Ø EMOD
Ø BilVideo-7
• Summary
Introduction
• Opportunities and Definitions
What is Mobile Visual Search?
• Content-based Image Retrieval
• Input: Multimedia (images)
• Output: related media
• Targets: mobile devices (and servers)
Retrieval results
Opportunities
Credit: http://www.emarketer.com/Article/Smartphone-Users-Worldwide-Will-Total-175-Billion-2014/1010536
The number of mobile phone users will be 4.55 billion in 2017 (70% of the world population).
Mobile Visual Search System
• Overview of System Architectures
• Compact Descriptors for Visual Search (CDVS)
Overview of System Architectures
• Mobile devices and visual search servers
Ø Search without servers (on-device search) is usually expensive at large scale.
Ø There are 3 patterns for systems that combine mobile devices and visual search servers.
• Images are compressed and described by feature vectors (descriptors).
How does the search work? [1]
[1] Girod B. et al., “Mobile Visual Search: Architectures, Technologies, and the Emerging MPEG Standard”, IEEE MultiMedia, vol. 18, no. 3, pp. 86-94, 2011
Features: SIFT, SURF, Fisher, DNN features, …
Matching: ANN search, binary matching, …
• Image database
• Feature (descriptor) database
• Compressed descriptor database
Vocabulary tree [1]
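The matching step named on this slide (ANN search, binary matching) can be illustrated with a brute-force sketch. It assumes descriptors have already been binarized into Python integers; the function names, the 8-bit toy descriptors and the 0.8 ratio threshold are illustrative assumptions, not taken from [1]. A real system would replace the exhaustive scan with an approximate nearest-neighbour index.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def match_descriptors(query, database, ratio=0.8):
    """Brute-force nearest-neighbour matching with a ratio test.

    Returns (query_index, database_index) pairs whose best match is
    clearly closer than the second-best one.
    """
    matches = []
    for qi, q in enumerate(query):
        # Rank database entries by Hamming distance to this query descriptor.
        ranked = sorted(range(len(database)), key=lambda di: hamming(q, database[di]))
        if len(ranked) < 2:
            continue  # the ratio test needs two candidates
        best, second = ranked[0], ranked[1]
        d1 = hamming(q, database[best])
        d2 = hamming(q, database[second])
        if d1 < ratio * d2:  # reject ambiguous matches
            matches.append((qi, best))
    return matches


# Toy 8-bit descriptors (hypothetical values).
db = [0b10110010, 0b01001101, 0b11110000]
qs = [0b10110011, 0b00001111]
print(match_descriptors(qs, db))  # [(0, 0), (1, 1)]
```

The ratio test discards a query descriptor when its two closest database entries are almost equally far away, which is a common way to suppress false matches before geometric verification.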
Pattern 1: Server-centralized model [1]
Send JPEG Images
Pattern 2: Hybrid model [1]
Send features only
Pattern 3: Hybrid model with cache [1]
Need local cache
Synchronization between local cache and remote DB
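A minimal sketch of the Pattern 3 client flow: look in the local cache first, fall back to the server, and synchronize the cache with the remote DB. The `Server` and `CachedClient` classes are hypothetical stand-ins, and exact dictionary lookup on a descriptor tuple replaces real similarity search purely to keep the example short.

```python
class Server:
    """Stand-in for a remote visual search server (hypothetical)."""

    def __init__(self, index):
        self.index = dict(index)  # descriptor -> result label
        self.requests = 0         # counts simulated network round trips

    def search(self, descriptor):
        self.requests += 1
        return self.index.get(descriptor)

    def snapshot(self, descriptors):
        """Return the current results for the given descriptors in one round trip."""
        self.requests += 1
        return {d: self.index[d] for d in descriptors if d in self.index}


class CachedClient:
    """Mobile client: answer from the local cache, otherwise ask the server."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def query(self, descriptor):
        if descriptor in self.cache:       # cache hit: no network traffic
            return self.cache[descriptor]
        result = self.server.search(descriptor)
        if result is not None:             # remember successful lookups
            self.cache[descriptor] = result
        return result

    def sync(self):
        """Refresh cached entries from the server and drop removed ones."""
        self.cache = self.server.snapshot(list(self.cache))


server = Server({(1, 0, 1, 1): "poster A"})
client = CachedClient(server)
print(client.query((1, 0, 1, 1)), server.requests)  # poster A 1
print(client.query((1, 0, 1, 1)), server.requests)  # poster A 1 (served from cache)
```

The `sync` method is where the extra complexity of Pattern 3 shows up: without it, the cache can keep returning results that the remote DB has since changed or removed.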
Comparison of 3 patterns
Terms | Pattern 1: Server-centralized model [1] | Pattern 2: Hybrid model w/o cache [1] | Pattern 3: Hybrid model w/ cache [1]
Transmission data format | Image (JPEG, PNG, …) | Descriptors (feature vectors) | Descriptors (feature vectors)
Load on server/network | Heavy | Small | Smallest
Load on mobile | Small | Heavy | Heavy
System architecture | Simple | Comparatively simple | Complex (because of synchronization)
MPEG-7
• Overview
• Compact Descriptors for Visual Search
MPEG standard family [3]
[3] http://en.wikipedia.org/wiki/Moving_Picture_Experts_Group
MPEG-7 standard [4]
• This International Standard is subdivided into 13 parts:
Ø Part 1 – Systems
Ø Parts 2-5 – General descriptions
Ø Part 6 – Reference software
Ø Part 7 – Conformance testing
Ø Part 8 – Extraction and use of MPEG-7 descriptions
Ø Parts 9-11 – Profiles, levels and schemas
Ø Part 12 – Query format
Ø Part 13 – Compact descriptors for visual search: specifies an image description tool for visual search applications
[4] http://mpeg.chiariglione.org/standards/mpeg-7/systems
Timeline of CDVS development
2001
• MPEG-7 was published.
2010/01
• 91st MPEG Meeting
• First contribution to MPEG CDVS
2012/02
• 99th MPEG Meeting
• First Test Model (TM 1.0)
2013/10
• 106th MPEG Meeting
• CDVS entered Committee Draft (CD)
2014/04
• 108th MPEG Meeting
• Draft of International Standard (DIS)
2014/10
• 110th MPEG Meeting
• Final DIS (FDIS)
• TM 12.0
• Some key components
Ø Core experiments (CE): CE1 to CE8, used to investigate proposals.
Ø Test Models (TM): reference software models for development purposes.
[2] Duan Y. et al., “Compact Descriptors for Visual Search”, IEEE MultiMedia, pp. 30-40, 2014
[5] Duan Y. et al., “Overview of the MPEG CDVS standard”, Proc. IEEE DCC, pp. 323-332, 2015
Ø In TM 11.0, mAP increases to 0.85 at a scale of 100 images.
Ø Memory usage is reduced from several MB to several KB.
Ø Search time: 2.0 ms/query
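The mAP (mean average precision) figure quoted above can be made concrete with a small sketch. This uses one common convention, where each query's average precision is divided by its total number of relevant images; the exact evaluation protocol of the CDVS test models is not stated here, and the function names and toy rankings are assumptions.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked result list.

    ranked:   retrieved image IDs, best first
    relevant: collection of IDs that are correct for this query
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    # Relevant images that were never retrieved contribute zero.
    return total / len(relevant)


def mean_average_precision(results):
    """results: list of (ranked_list, relevant_ids) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in results) / len(results)


# Two toy queries (hypothetical IDs).
queries = [
    (["a", "x", "b"], {"a", "b"}),  # AP = (1/1 + 2/3) / 2
    (["x", "c"], {"c"}),            # AP = 1/2
]
print(round(mean_average_precision(queries), 4))  # 0.6667
```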
Compact Descriptor for Visual Search (CDVS) [2,5]
Vocabulary tree [2]
For better memory usage and to fit the storage requirements.
For better retrieval accuracy, location information is also important.
Interest point detection & feature selection [2,5]
Vocabulary tree-based indexing [1]
Vocabulary tree (VT) by hierarchical k-means
Inverted index by VT
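The two figure captions above can be sketched in code: a vocabulary tree built by hierarchical k-means, and an inverted index keyed by the leaf ("visual word") each descriptor falls into. This is a toy illustration under several assumptions: 2-D descriptors, branching factor 2, depth 2, deterministic initial centers, and plain vote counting in place of the TF-IDF style scoring typically used with vocabulary trees. All function names are hypothetical.

```python
from collections import defaultdict


def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(p, centers):
    return min(range(len(centers)), key=lambda i: dist2(p, centers[i]))


def kmeans(points, k, iters=10):
    """Lloyd's k-means; initial centers are evenly spaced picks from the sorted points."""
    pts = sorted(points)
    centers = pts[:: max(1, len(pts) // k)][:k]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


def build_tree(points, k=2, depth=2):
    """Recursively cluster the points; a leaf is represented by None."""
    if depth == 0 or len(set(points)) < k:
        return None
    centers = kmeans(points, k)
    groups = [[] for _ in centers]
    for p in points:
        groups[nearest(p, centers)].append(p)
    return {"centers": centers,
            "children": [build_tree(g, k, depth - 1) for g in groups]}


def quantize(tree, p):
    """Descend the tree; the path of branch indices is the visual word."""
    path = ()
    while tree is not None:
        i = nearest(p, tree["centers"])
        path += (i,)
        tree = tree["children"][i]
    return path


def build_inverted_index(tree, images):
    index = defaultdict(set)  # visual word -> IDs of images containing it
    for image_id, descriptors in images.items():
        for d in descriptors:
            index[quantize(tree, d)].add(image_id)
    return index


def search(tree, index, descriptors):
    votes = defaultdict(int)
    for d in descriptors:
        for image_id in index.get(quantize(tree, d), ()):
            votes[image_id] += 1
    return sorted(votes, key=lambda i: (-votes[i], i))


# Toy database: three images with 2-D descriptors (hypothetical values).
images = {
    "A": [(0, 0), (0, 1), (10, 0)],
    "B": [(3, 0), (3, 1), (13, 0)],
    "C": [(10, 1), (13, 1)],
}
all_descriptors = [d for ds in images.values() for d in ds]
tree = build_tree(all_descriptors, k=2, depth=2)
index = build_inverted_index(tree, images)
print(search(tree, index, [(0.2, 0.1), (9.8, 0.3)]))  # ['A', 'C']
```

Quantizing a descriptor costs only k distance computations per level, and the inverted index restricts scoring to images that share at least one visual word with the query.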
Image Retrieval [2,5]
How good the descriptors are: Pairwise matching [1,2]
Clean image (in database)
Real-world image (taken by mobile devices)
Geometric verification: location information is important
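To show why keypoint locations matter, here is a deliberately simplified geometric verification sketch. It assumes the two images differ only by a translation and tries every putative match as a hypothesis; practical systems usually fit an affine or homography model with RANSAC instead. The function name, tolerance and coordinates are illustrative assumptions.

```python
def verify_translation(matches, tol=2.0):
    """Find the translation supported by the most matches.

    matches: list of ((xq, yq), (xd, yd)) keypoint location pairs
             (query image location, database image location).
    Returns (best_translation, inlier_indices).
    """
    best_t, best_inliers = None, []
    for q, d in matches:
        # Hypothesis: the translation implied by this single match.
        t = (d[0] - q[0], d[1] - q[1])
        inliers = [
            i for i, (q2, d2) in enumerate(matches)
            if abs(q2[0] + t[0] - d2[0]) <= tol
            and abs(q2[1] + t[1] - d2[1]) <= tol
        ]
        if len(inliers) > len(best_inliers):
            best_t, best_inliers = t, inliers
    return best_t, best_inliers


# Three consistent matches and one outlier (hypothetical coordinates).
putative = [
    ((10, 10), (15, 12)),
    ((20, 30), (25, 32)),
    ((40, 5), (45, 8)),
    ((7, 7), (90, 60)),   # descriptor matched, but the location disagrees
]
print(verify_translation(putative))  # ((5, 2), [0, 1, 2])
```

A pair of images is typically accepted as a match only when the number of inliers exceeds a threshold, which filters out descriptor matches that look similar but sit in geometrically inconsistent places.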
Pairwise Matching [2,5]
MPEG-7 applications
• EMOD
• BilVideo-7
EMOD [6]: A hybrid model with cache
[6] Li D. et al., “EMOD: An Efficient On-device Mobile Visual Search System”, Proc. ACM Multimedia, 2014
The visual dictionary size is about 8 MB
BilVideo-7 [7]: A hybrid model
BilVideo-7’s client-server architecture
Indexing
[7] Bastan M. et al., “BilVideo-7: an MPEG-7-compatible video indexing and retrieval system”, IEEE MultiMedia, 2010