"challenges in object detection on embedded devices," a presentation from ceva
TRANSCRIPT
Copyright © 2014 CEVA Inc.
Adar Paz
May 29, 2014
Challenges in Object Detection on Embedded Devices
CEVA by Numbers
• > 220 licensees & 330 licensing agreements
• > 40% worldwide handset market share in 2013 (Strategy Analytics, December 2013)
• 5 billion CEVA-powered devices shipped worldwide to date
• #1 in licensable computer vision and imaging processors
• #1 DSP licensor, dominant market share (> 3x that of any other DSP IP vendor)
• #1 DSP architecture in handsets: more than 900m in 2013
• #1 DSP core in audio: more than 3 billion devices shipped to date
Feature Detection Use in Computer Vision
Feature types: Corner, Blob, Edge
Computer Vision Understanding Content
Base Technologies
• Feature Extraction & Description
• Feature Matching and Tracking
Applications
• Gesture Control
• Augmented Reality
• Object Detection / Recognition
• Depth Map
• Image Stitching
• Face Detection and Recognition
• Motion Detection
• Emotion Detection
• Age/Gender Detection
• Segmentation
• Irregular Behavior Detection
• Forward Collision Warning (FCW)
• Lane Departure Warning (LDW)
• Pedestrian Detection (PD)
• Traffic Sign Recognition (TSR)
• Driver Fatigue Warning
• Surround View Monitor
• Always-On
Computer Vision Understanding Content (diagram repeated per market)
• Mobile
• Wearables
• Surveillance
• Automotive
Why Use a Programmable CV Processor
Development Time Line (figure)
HW Solution: CV algorithm design, RTL design, Production, Chip ready, Distribution
Programmable Solution: RTL design, Production, Chip ready, Distribution; programmable CV algorithm design and continued algorithms development
Dedicated CV Processor or Mobile GPU
(Chart: CEVA-MM3101 performance gain, result per cycle, axis 0 to 30; series MM3101 vs. Mobile GPU A, MM3101 vs. Mobile GPU B, MM3101 vs. Mobile GPU C)
Typical set of representative CV & imaging algorithms: sobel, corr3x3, corrsep5x5, corrsep11x11, Histogram, HSV2RGB, RGB2HSV, Max3x3, Min, Median3x3, CornerHarris, Average
MM3101 shows an average 13x performance boost over leading mobile GPUs
Power factor: > 50x more efficient
Typical Feature Treatment Flow
• Sobel, Gaussian, Build image pyramids, Integral sum, Gamma normalization
• HOG, BRIEF, SIFT, SURF, FREAK, LBP, HAAR
• SVM, Decision tree
• Harris, FAST, SIFT, SURF
Flow Chart: HOG Descriptor
Input image -> Bilinear Scaling -> Scaled image (Scale 1 ... Scale 9) -> Gamma Normalization -> Gradient Calculation -> Descriptor Calculation
• HOG algorithm is based on the Dalal & Triggs paper (2005)
• Common use is object detection, especially pedestrian detection
• Reference code: OpenCV 2.4.3
HOG: Bilinear Scaling
Bilinear Interpolation
Step 1: Vertical Interpolation
Step 2: Horizontal Interpolation
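The two interpolation steps can be sketched in plain Python. This is only an illustration of separable bilinear scaling, not the CEVA implementation: the function name `bilinear_resize` and the corner-aligned coordinate mapping are assumptions made for the example (OpenCV, the reference code named earlier, maps pixel centers differently).

```python
def bilinear_resize(img, out_h, out_w):
    """Resize a 2-D list of numbers with separable bilinear interpolation."""
    in_h, in_w = len(img), len(img[0])

    def taps(out_n, in_n):
        # For each output index: (lower source index, upper source index,
        # weight of the upper sample). Corner-aligned mapping is assumed.
        result = []
        for i in range(out_n):
            pos = i * (in_n - 1) / (out_n - 1) if out_n > 1 else 0.0
            lo = min(int(pos), in_n - 1)
            hi = min(lo + 1, in_n - 1)
            result.append((lo, hi, pos - lo))
        return result

    # Step 1: vertical interpolation (blend pairs of source rows).
    rows = []
    for lo, hi, w in taps(out_h, in_h):
        rows.append([(1 - w) * a + w * b for a, b in zip(img[lo], img[hi])])

    # Step 2: horizontal interpolation (blend pairs of columns in each row).
    col_taps = taps(out_w, in_w)
    return [[(1 - w) * r[lo] + w * r[hi] for lo, hi, w in col_taps] for r in rows]
```

Splitting the work into a vertical and a horizontal pass means each pass is a two-tap filter along one axis, which is the shape of work that maps well onto wide vector loads.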
HOG: Bilinear Scaling: Implementation
1. Load 2 x 8 pixels in a single cycle
2. 16 filter operations in a single cycle
3. Transposed store (4x4) in a single cycle
4. Perform the load and filter again (steps 1 and 2)
5. Transposed store in a single cycle (step 3)
(Diagram: memory and vector registers vA, vB, vC, vD at each stage, with transpose)
HOG: Gamma Normalization
• Gamma function P(γ)
• Implemented using a 'Look Up Table' (LUT): parallel access to local memory in a single cycle
• Loading of multiple gamma values in a single cycle
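A lookup-table version of gamma normalization might look like the sketch below. The helper names `build_gamma_lut` and `apply_lut`, the 8-bit pixel range, and the choice of gamma = 0.5 (square-root compression) are assumptions for illustration; the slide does not state the gamma value used.

```python
def build_gamma_lut(gamma, levels=256):
    """Precompute round(top * (v / top) ** gamma) for every pixel value v."""
    top = levels - 1
    return [round(top * (v / top) ** gamma) for v in range(levels)]


def apply_lut(img, lut):
    """Replace each pixel by its table entry; no power function per pixel."""
    return [[lut[p] for p in row] for row in img]


# Assumed example: square-root gamma compression on 8-bit pixels.
SQRT_LUT = build_gamma_lut(0.5)
```

Because every pixel becomes one table read, the cost per pixel no longer depends on the gamma function itself, and a processor that can issue several table reads at once can normalize several pixels per cycle.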
ORB: Feature Extraction
• ORB: Oriented FAST and Rotated BRIEF
• An efficient alternative to SIFT
• Pyramid is used for scale invariance
• Features are detected using FAST9, Harris, and non-max suppression
• Descriptors are based on BRIEF with normalized orientation
Input Image -> Pyramid -> FAST9 -> Harris -> Non-Max Suppress -> Oriented BRIEF -> Descriptors list
ORB: FAST9 Implementation
• Continuous arc of 9 or more pixels
• All much brighter than (p + Th)
or
• All much darker than (p - Th)
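The criterion above can be written as a short scalar check. This is a minimal sketch, assuming the 16 pixels on the circle around the candidate have already been gathered into a list `ring` in circular order; the function name `is_fast9_corner` is hypothetical.

```python
def is_fast9_corner(center, ring, th):
    """True if 9 or more contiguous ring pixels are all brighter than
    center + th, or all darker than center - th."""
    assert len(ring) == 16
    for sign in (+1, -1):
        # sign = +1 tests "brighter than p + Th"; sign = -1 tests "darker than p - Th".
        flags = [sign * (v - center) > th for v in ring]
        run = 0
        # Walk the ring twice so arcs that wrap past index 15 are counted.
        for f in flags + flags:
            run = run + 1 if f else 0
            if run >= 9:
                return True
    return False
```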
ORB: FAST9 Implementation (2)
• Early exit is used to detect potential positions
• Long memory access of 32 bytes is used to quickly load consecutive pixels
• Vector compare is used to compare the center of the corner to the borders
• Building a binary (bit) map with positions that need to be calculated
• Calculation of multiple positions in parallel
• Using different two-dimensional loads
• Vector predicates are used to selectively calculate only the locations that pass the threshold
• Using multi-way parallel lookup table access to decide on consecutive locations
ORB: FAST9 Implementation (3)
SIMD; efficient random access to memory
Input Image -> Fast First Pass -> Candidate list -> Full FAST9 Second Pass -> Output feature list
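The two-pass structure can be sketched in scalar Python as follows. The first pass here uses a common four-pixel pre-test, which is an assumption standing in for the vectorized early exit described in the slides: any arc of 9 contiguous ring pixels must cover at least 2 of the 4 compass pixels, so positions failing that check cannot be corners. The names `CIRCLE`, `first_pass`, and `second_pass` are hypothetical.

```python
# Radius-3 circle offsets (dx, dy) in circular order, starting at the top.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def first_pass(img, th):
    """Cheap test on the 4 compass pixels; returns candidate (x, y) positions."""
    h, w = len(img), len(img[0])
    candidates = []
    for y in range(3, h - 3):
        for x in range(3, w - 3):
            p = img[y][x]
            compass = [img[y + dy][x + dx] for dx, dy in CIRCLE[::4]]
            brighter = sum(v > p + th for v in compass)
            darker = sum(v < p - th for v in compass)
            # Necessary condition only: false positives are removed later.
            if brighter >= 2 or darker >= 2:
                candidates.append((x, y))
    return candidates


def second_pass(img, candidates, th):
    """Full 16-pixel segment test, run only on the candidate list."""
    corners = []
    for x, y in candidates:
        p = img[y][x]
        ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
        for flags in ([v > p + th for v in ring], [v < p - th for v in ring]):
            run = 0
            for f in flags + flags:  # doubled so wrapping arcs are counted
                run = run + 1 if f else 0
                if run >= 9:
                    break
            if run >= 9:
                corners.append((x, y))
                break
    return corners
```

The first pass touches consecutive pixels and suits wide loads and vector compares; the second pass touches scattered positions, which is where efficient random access to memory matters.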
BRIEF: Descriptor
• Oriented BRIEF uses the normalized orientation and calculates a 256-bit-wide descriptor
• The descriptor is calculated by comparing 256 predefined pairs of pixels in the surroundings of the feature center
• Each pair comparison contributes a single bit to the descriptor
• Orientation is normalized by rotating the image (or the pair coordinates, in our implementation) according to the moment of the feature center
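The descriptor construction can be illustrated with the sketch below, which rotates the pair coordinates rather than the image. Several things are assumptions for the example: the function names, the square patch used for the moment (ORB itself uses a circular patch), the random pair pattern from `make_pairs` (real implementations use a fixed, trained pattern), and nearest-pixel rounding after rotation. The caller must leave an image margin of about 1.5 times the pair radius around the feature.

```python
import math
import random


def patch_orientation(img, cx, cy, radius):
    """Angle of the intensity centroid, atan2(m01, m10), over a square patch."""
    m10 = m01 = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            v = img[cy + dy][cx + dx]
            m10 += dx * v
            m01 += dy * v
    return math.atan2(m01, m10)


def make_pairs(n, radius, seed=0):
    """Hypothetical pair pattern: n random point pairs inside the patch."""
    rng = random.Random(seed)

    def point():
        return (rng.randint(-radius, radius), rng.randint(-radius, radius))

    return [(point(), point()) for _ in range(n)]


def rotated_brief(img, cx, cy, pairs, angle):
    """One bit per pair: 1 if the first rotated sample is darker than the second."""
    c, s = math.cos(angle), math.sin(angle)

    def sample(pt):
        x, y = pt
        rx = round(c * x - s * y)  # rotate the coordinate, not the image
        ry = round(s * x + c * y)
        return img[cy + ry][cx + rx]

    desc = 0
    for i, (a, b) in enumerate(pairs):
        if sample(a) < sample(b):
            desc |= 1 << i
    return desc
```

With 256 pairs the result fits in 256 bits, so two descriptors can be compared with an XOR followed by a bit count.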
BRIEF DSP Implementation
Calculate patch orientation (patch moment)
Utilizing LUT capability, read the pixel addresses
Load the pixel pairs with random access to memory
Use SIMD capability to efficiently calculate the descriptor
Orientation
SURF: Speeded Up Robust Features
• Inspired by the SIFT descriptor, but much faster
• Main modules
• Detect feature locations according to the pixels' response to the feature
• Calculate the feature descriptor
Integral Sum Image -> Feature Response -> Find Local Maximum -> Choose Best Features -> Feature Descriptor
SURF: Integral Sum Image
• Two-pass approach
• Horizontal and vertical data access (full bandwidth)
• Load and transpose data in a single instruction
(Diagram: vector memory and vector register, with add (+) and transpose (T) steps, shown for each pass)
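A scalar sketch of the two-pass integral sum is shown below: one pass accumulates along rows, the other down columns. The function name `integral_image` is an assumption, and the code makes no attempt to model the load-and-transpose instruction.

```python
def integral_image(img):
    """Return S with S[y][x] = sum of img over rows 0..y and columns 0..x."""
    # Pass 1: running sum along each row (horizontal access).
    rows = []
    for row in img:
        acc, out = 0, []
        for v in row:
            acc += v
            out.append(acc)
        rows.append(out)
    # Pass 2: running sum down each column (vertical access).
    for y in range(1, len(rows)):
        rows[y] = [a + b for a, b in zip(rows[y], rows[y - 1])]
    return rows
```

The row pass carries a dependency from pixel to pixel, while the column pass is independent across columns. Transposing between passes lets both be computed as the parallel kind, which is presumably why the slide highlights a combined load and transpose.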
SURF: Feature Response
Flexible & Powerful Filter Instruction
• Calculate feature response using box filter
• Small box: A-B+D-C
• Large box: E-F+H-G
• Total response: E-F+H-G - 3x(A-B+D-C)
• Can be executed in a single cycle
• Two operations in a single cycle
Res = (E-F)+(H-G)
Res += (A-B)*(-3)+(D-C)*(-3)
(Diagram: box filter with weights 1, -2, 1; small box with corners A B / C D inside large box with corners E F / G H)
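The box-filter arithmetic can be checked with a small sketch. It assumes a zero-padded integral image so that a box sum needs only its four corner entries, with A and B the top-left and top-right corners and C and D the bottom-left and bottom-right, matching the (A-B)+(D-C) form on the slide. The function names and the (x0, y0, x1, y1) box convention with exclusive upper bounds are hypothetical.

```python
def padded_integral(img):
    """Integral image with an extra zero row and column at the top and left."""
    h, w = len(img), len(img[0])
    s = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        acc = 0
        for x in range(w):
            acc += img[y][x]
            s[y + 1][x + 1] = s[y][x + 1] + acc
    return s


def box_sum(s, x0, y0, x1, y1):
    """Sum of pixels with x0 <= x < x1 and y0 <= y < y1, from four corners."""
    a, b = s[y0][x0], s[y0][x1]   # top-left, top-right
    c, d = s[y1][x0], s[y1][x1]   # bottom-left, bottom-right
    return (a - b) + (d - c)


def response(s, large, small):
    """Large box minus three times the small box, as two accumulate steps."""
    res = box_sum(s, *large)        # Res  = (E-F)+(H-G)
    res += -3 * box_sum(s, *small)  # Res += (A-B)*(-3)+(D-C)*(-3)
    return res
```

When the small box covers the middle third of the large box, the net weights are 1, -2, 1, so a flat image gives a response of zero.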
SURF: Feature Descriptor
• Handles multiple features in parallel
• Memory access of several different features in a single instruction
• Calculates integral sum
• Vertical and horizontal memory access
• Calculates gradient using box filter: result = (A-B) + (D-C)
(Diagram: parallel load of Feature Data 0, Feature Data 1, ... Feature Data 7)
Feature Extraction Summary Table
Columns: Algorithm | Memory Access (2-Dimensional Access, LUT Support, Parallel Memory Access) | Execution (Dedicated Instructions, Vectorized Conditional Flow)
Rows: HOG - Bilinear; HOG - Gamma; FAST9 - Detector; BRIEF - Descriptor; SURF - Int Sum; SURF - Feature Response; SURF - Descriptor
Conclusion: Critical Ingredients for an Efficient CV Processor
CV Processor
• Efficient filter processing
• Good conditional code execution
• Fast & flexible random access to memory
• Good bit and byte data manipulation
• Wide memory bandwidth
• Large internal memory
CEVA-MM3101 Imaging & Computer Vision IP platform includes all the above features (and more...)
CEVA-CV library already includes various feature extraction algorithms
Enables a shorter development cycle, efficiency, and algorithm flexibility
Resources for Further Investigation
1. ORB: an efficient alternative to SIFT or SURF. Rublee, E., Rabaud, V., Konolige, K., Bradski, G. Computer Vision (ICCV), 2011 IEEE
2. Histograms of oriented gradients for human detection. Dalal, N., Triggs, B. Computer Vision and Pattern Recognition, 2005 (CVPR 2005)
3. SURF: Speeded Up Robust Features. Herbert Bay, Tinne Tuytelaars, Luc Van Gool. Computer Vision - ECCV 2006
4. BRIEF: Binary Robust Independent Elementary Features. Michael Calonder, Vincent Lepetit, Christoph Strecha, Pascal Fua. Computer Vision - ECCV 2010
5. Distinctive Image Features from Scale-Invariant Keypoints. David G. Lowe. International Journal of Computer Vision, Volume 60, Issue 2, pp. 91-110
You are all invited to the CEVA demo table. Thank You
Copyright copy 2014 CEVA Inc 2
CEVA by Numbers
gt 220 licensees amp 330 licensing agreements
gt40 Worldwide handset market
share in 2013 (Strategy
Analytics December 2013)
5 Billion CEVA-powered devices -
shipped worldwide to date
1 in licensable computer vision
and imaging Processors
1 DSP licensor dominant
market share (gt3X of any
other DSP IP vendor)
1 DSP architecture in handsets
ndash more than 900m in 2013
1 DSP core in audio ndash more than
3 billion devices shipped to
date
Copyright copy 2014 CEVA Inc 3
Feature Detection Use in Computer Vision
Corner Blob
Edge
Copyright copy 2014 CEVA Inc 4
Gesture
Control
Base Technologies
bull Feature Extraction
amp Description
bull Feature Matching
and Tracking
Applications
Augmented
Reality
Object Detection
Recognition
Depth Map
Image Stitching
Face Detection and
Recognition
Motion Detection
Emotion Detection
AgeGender
Detection
Segmentation
Irregular Behavior
Detection
Forward Collision
Warning (FCW)
Lane Departure
Warning (LDW)
Pedestrian
Detection (PD)
Traffic Sign Rec
(TSR)
Driver Fatigue
Warning
Surround View
Monitor
Always-On
Computer Vision Understanding
Content
Copyright copy 2014 CEVA Inc 5
Mobile
Gesture
Control
Applications
Augmented
Reality
Object Detection
Recognition
Depth Map
Image Stitching
Face Detection and
Recognition
Motion Detection
Emotion Detection
AgeGender
Detection
Segmentation
Irregular Behavior
Detection
Forward Collision
Warning (FCW)
Lane Departure
Warning (LDW)
Pedestrian
Detection (PD)
Traffic Sign Rec
(TSR)
Driver Fatigue
Warning
Surround View
Monitor
Always-On
Base Technologies
bull Feature Extraction
amp Description
bull Feature Matching
and Tracking
Computer Vision Understanding
Content
Copyright copy 2014 CEVA Inc 6
Wearables
Gesture
Control
Base Technologies
bull Feature Extraction
amp Description
bull Feature Matching
and Tracking
Applications
Augmented
Reality
Object Detection
Recognition
Depth Map
Image Stitching
Face Detection and
Recognition
Motion Detection
Emotion Detection
AgeGender
Detection
Segmentation
Irregular Behavior
Detection
Forward Collision
Warning (FCW)
Lane Departure
Warning (LDW)
Pedestrian
Detection (PD)
Traffic Sign Rec
(TSR)
Driver Fatigue
Warning
Surround View
Monitor
Always-On
Computer Vision Understanding
Content
Copyright copy 2014 CEVA Inc 7
Surveillance
Gesture
Control
Base Technologies
bull Feature Extraction
amp Description
bull Feature Matching
and Tracking
Applications
Augmented
Reality
Object Detection
Recognition
Depth Map
Image Stitching
Face Detection and
Recognition
Motion Detection
Emotion Detection
Segmentation
Irregular Behavior
Detection
Forward Collision
Warning (FCW)
Lane Departure
Warning (LDW)
Pedestrian
Detection (PD)
Traffic Sign Rec
(TSR)
Driver Fatigue
Warning
Surround View
Monitor
Always-On
AgeGender
Detection
Computer Vision Understanding
Content
Copyright copy 2014 CEVA Inc 8
Automotive
Gesture
Control
Base Technologies
bull Feature Extraction
amp Description
bull Feature Matching
and Tracking
Applications
Augmented
Reality
Object Detection
Recognition
Depth Map
Image Stitching
Face Detection and
Recognition
Motion Detection
Emotion Detection
AgeGender
Detection
Segmentation
Irregular Behavior
Detection
Forward Collision
Warning (FCW)
Lane Departure
Warning (LDW)
Pedestrian
Detection (PD)
Traffic Sign Rec
(TSR)
Driver Fatigue
Warning
Surround View
Monitor
Always-On
Computer Vision Understanding Content
Copyright copy 2014 CEVA Inc 9
RTL
design Production Chip
ready Distribution
Why Use a Programmable CV Processor
Development
Time Line
RTL
design
CV
algorithm
design
Production
HW
So
lution
Programmable CV algorithm design Continued algorithms
development
Pro
gra
mm
able
So
lution
Chip
ready Distribution
Copyright copy 2014 CEVA Inc 10
Dedicated CV Processor or Mobile GPU C
EV
A-M
M3
10
1 P
erf
orm
an
ce
gain
(re
su
lt p
er
cycle
)
0
5
10
15
20
25
30
MM3101 Vs Mobile GPU A MM3101 Vs Mobile GPU B MM3101 Vs Mobile GPU C
sobel
corr3x3
corrsep5x5
corrsep11x11
Histogram
HSV2RGB
RGB2HSV
Max3x3
Min
Median3x3
CornerHarris
Average
Typical set of
representative
CV amp imaging
algorithms
MM3101 shows average 13x performance boost over leading Mobile GPUs
Power factor gt50x more efficient
Copyright copy 2014 CEVA Inc 11
Typical Feature Treatment Flow
Sobel
Gaussian
Build image
pyramids
Integral sum
Gamma
normalization
HOG
BRIEF
SIFT
SURF
FREAK
LBP
HAAR
SVM
Decision
tree
Harris
FAST
SIFT
SURF
Copyright copy 2014 CEVA Inc 12
Flow Chart mdash HOG Descriptor
Input image
Scaled image
Scale 1
Scaled image
Gamma
Normalization
Gradient
Calculation
Descriptor
Calculation
Bilinear
Scaling
HOG algorithm is based on Dalal amp Triggs paper (2005)
Common use is object detection especially pedestrian detection
Reference Code ndash OpenCV 243
Scale 9
Copyright copy 2014 CEVA Inc 13
HOG mdash Bilinear Scaling
Bilinear Interpolation
X
X
X
Step 1 Vertical Interpolation
Step 2 Horizontal Interpolation
Copyright copy 2014 CEVA Inc 14
1 Load 2 X 8 pixels in a single cycle
2 16 filter operations in a single cycle
3 Transposed Store (4x4) in single cycle
4 Perform the load and filter again (12)
5 Transposed Store in single cycle (3)
HOG mdash Bilinear Scaling mdash Implementation
Memory
vA
vB
vC
vD
Vector Registers
Memory
vA
vB
vC
vD
Vector Registers
Memory
vA
vB
vC
vD
Vector Registers
transpose
Copyright copy 2014 CEVA Inc 15
bull Gamma Function P(γ)
bull Implemented using lsquoLook Up Tablersquo (LUT) ndash parallel access to local
memory in single cycle
bull Loading of multiple gamma values in a single cycle
HOG mdash Gamma Normalization
Copyright copy 2014 CEVA Inc 16
bull ORB mdash Oriented FAST and Rotated BRIEF
bull An efficient alternative to SIFT
bull Pyramid is used for scale-invariance
bull Features are detected using FAST9 Harris and non-max-suppress
bull Descriptors are based on BRIEF with normalized orientation
ORB mdash Feature Extraction
Input
Image Fast9 Harris
Non-Max-
Suppress
Oriented
BRIEF Descriptors
list Pyramid
Copyright copy 2014 CEVA Inc 17
ORB mdash FAST9 Implementation
bull Continuous arc of 9 or more pixels
bull All much brighter then (p+Th)
or
bull All much darker then (p-Th)
Copyright copy 2014 CEVA Inc 18
bull Early exit is used to detect potential positions
bull Long memory access of 32 bytes using
bull quickly load consecutive pixels
bull Vector compare is used to compare the center of the corner to
the borders
bull Building a binary (bit) map with positions that need to be calculated
bull Calculation of multiple positions in parallel
bull Using different two dimensional loads
bull Vector predicates are used selectively calculate only the locations
that pass the threshold
bull Using multi-way parallel lookup table access to decide on
consecutive locations
ORB mdash FAST9 Implementation (2)
Copyright copy 2014 CEVA Inc 19
ORB mdash FAST9 Implementation (3)
SIMD Efficient Random access to memory
Fast First Pass
Candidate list
Input Image
Fast First Pass
Candidate list
Full FAST9 Second Pass
Output feature list
Input Image
Copyright copy 2014 CEVA Inc 20
bull Oriented brief uses the normalized orientation and calculates a 256 bit wide descriptor
bull The descriptor is calculated by comparison of pre-defined 256 pairs of pixels in the surrounding of the feature center
bull Each pair comparison donates a single bit in the descriptor
bull Orientation is normalized by rotating the image (or pairs coordinates in our implementation) according to the moment of the feature center
BRIEF mdash Descriptor
Copyright copy 2014 CEVA Inc 21
BRIEF DSP Implementation
Calculate patch orientation (Patch Moment)
Utilizing LUT capability read the pixels address
Load with random access to memory the pixel couples
Use SIMD capability to efficiently calculate descriptor
Orientation
Copyright copy 2014 CEVA Inc 22
bull Inspired by the SIFT descriptor much faster
bull Main modules
bull Detect features location according to pixels response to feature
bull Calculate feature descriptor
SURF mdash Speeded Up Robust Features
Integral Sum Image
Feature Response
Find Local Maximum
Choose Best Features
Feature Descriptor
Copyright copy 2014 CEVA Inc 23
+ +
T T
vector memory vector register vector memory vector register
bull Two pass approach
bull Horizontal and vertical data access (Full Bandwidth)
bull Load and transpose data in a single instruction
SURF mdash Integral Sum Image
Copyright copy 2014 CEVA Inc 24
bull Calculate feature response using Box Filter
bull Small box A-B+D-C
bull Large box E-F+H-G
bull Total response E-F+H-G - 3x(A-B+D-C)
bull Can be executed in a single cycle
bull Two operations in a single cycle
Res = (E-F)+(H-G)
Res += (A-B)(-3)+(D-C)(-3)
SURF mdash Feature Response
Flexible amp Powerful Filter Instruction
-2
1
1A B
C D
E F
G H
Copyright copy 2014 CEVA Inc 25
bull Handles multiple features in parallel
bull Memory access of several different features in a single instruction
bull Calculates Integral sum
Vertical and horizontal memory access
bull Calculates gradient using box filter result = (A-B) + (D-C)
SURF mdash Feature Descriptor
Feature Data 0
Feature Data 1
Feature Data 7
parallel load
Copyright copy 2014 CEVA Inc 26
Feature Extraction Summary Table
Algorithm Memory Access Execution
2-Dimensional
Access
LUT Support Parallel
Memory Access
Dedicated
Instructions
Vectorized
Conditional Flow
HOG ndash Bilinear
HOG - Gamma
FAST9 - Detector
BRIEF - Descriptor
SURF ndash Int Sum
SURF ndash Feature
Response
SURF - Descriptor
Copyright copy 2014 CEVA Inc 27
Conclusion mdash
Critical Ingredients for Efficient CV Processor
CEVA-MM3101 Imaging amp Computer Vision IP platform Includes all above features (and morehellip)
CEVA-CV library already includes various feature extraction algos
Enables shorter development cycle efficiency and algorithm flexibility
CV Processor
Efficient filter processing
Good conditional code
execution
Fast amp flexible random access to
memory
Good bit and byte data
manipulation
Wide memory bandwidth
Large internal memory
Copyright copy 2014 CEVA Inc 28
1 ORB an efficient alternative to SIFT or SURF
Rublee E Rabaud V Konolige K Bradski G - Computer Vision (ICCV) 2011 IEEE
2 Histograms of oriented gradients for human detection
Dalal N Triggs B - Computer Vision and Pattern Recognition 2005 CVPR 2005
3 SURF Speeded Up Robust Features
Herbert Bay Tinne Tuytelaars Luc Van Gool - Computer Vision ndash ECCV 2006
4 BRIEF Binary Robust Independent Elementary Features
Michael Calonder Vincent Lepetit Christoph Strecha Pascal Fua - Computer Vision ndash ECCV 2010
5 Distinctive Image Features from Scale-Invariant Keypoints
David G Lowe - International Journal of Computer Vision Volume 60 Issue 2 pp 91-110
Resource for further Investigation
119936119952119958 are all invited to the CEVA demo table Thank You
Copyright copy 2014 CEVA Inc 3
Feature Detection Use in Computer Vision
Corner Blob
Edge
Copyright copy 2014 CEVA Inc 4
Gesture
Control
Base Technologies
bull Feature Extraction
amp Description
bull Feature Matching
and Tracking
Applications
Augmented
Reality
Object Detection
Recognition
Depth Map
Image Stitching
Face Detection and
Recognition
Motion Detection
Emotion Detection
AgeGender
Detection
Segmentation
Irregular Behavior
Detection
Forward Collision
Warning (FCW)
Lane Departure
Warning (LDW)
Pedestrian
Detection (PD)
Traffic Sign Rec
(TSR)
Driver Fatigue
Warning
Surround View
Monitor
Always-On
Computer Vision Understanding
Content
Copyright copy 2014 CEVA Inc 5
Mobile
Gesture
Control
Applications
Augmented
Reality
Object Detection
Recognition
Depth Map
Image Stitching
Face Detection and
Recognition
Motion Detection
Emotion Detection
AgeGender
Detection
Segmentation
Irregular Behavior
Detection
Forward Collision
Warning (FCW)
Lane Departure
Warning (LDW)
Pedestrian
Detection (PD)
Traffic Sign Rec
(TSR)
Driver Fatigue
Warning
Surround View
Monitor
Always-On
Base Technologies
bull Feature Extraction
amp Description
bull Feature Matching
and Tracking
Computer Vision Understanding
Content
Copyright copy 2014 CEVA Inc 6
Wearables
Gesture
Control
Base Technologies
bull Feature Extraction
amp Description
bull Feature Matching
and Tracking
Applications
Augmented
Reality
Object Detection
Recognition
Depth Map
Image Stitching
Face Detection and
Recognition
Motion Detection
Emotion Detection
AgeGender
Detection
Segmentation
Irregular Behavior
Detection
Forward Collision
Warning (FCW)
Lane Departure
Warning (LDW)
Pedestrian
Detection (PD)
Traffic Sign Rec
(TSR)
Driver Fatigue
Warning
Surround View
Monitor
Always-On
Computer Vision Understanding
Content
Copyright copy 2014 CEVA Inc 7
Surveillance
Gesture
Control
Base Technologies
bull Feature Extraction
amp Description
bull Feature Matching
and Tracking
Applications
Augmented
Reality
Object Detection
Recognition
Depth Map
Image Stitching
Face Detection and
Recognition
Motion Detection
Emotion Detection
Segmentation
Irregular Behavior
Detection
Forward Collision
Warning (FCW)
Lane Departure
Warning (LDW)
Pedestrian
Detection (PD)
Traffic Sign Rec
(TSR)
Driver Fatigue
Warning
Surround View
Monitor
Always-On
AgeGender
Detection
Computer Vision Understanding
Content
Copyright copy 2014 CEVA Inc 8
Automotive
Gesture
Control
Base Technologies
bull Feature Extraction
amp Description
bull Feature Matching
and Tracking
Applications
Augmented
Reality
Object Detection
Recognition
Depth Map
Image Stitching
Face Detection and
Recognition
Motion Detection
Emotion Detection
AgeGender
Detection
Segmentation
Irregular Behavior
Detection
Forward Collision
Warning (FCW)
Lane Departure
Warning (LDW)
Pedestrian
Detection (PD)
Traffic Sign Rec
(TSR)
Driver Fatigue
Warning
Surround View
Monitor
Always-On
Computer Vision Understanding Content
Copyright copy 2014 CEVA Inc 9
RTL
design Production Chip
ready Distribution
Why Use a Programmable CV Processor
Development
Time Line
RTL
design
CV
algorithm
design
Production
HW
So
lution
Programmable CV algorithm design Continued algorithms
development
Pro
gra
mm
able
So
lution
Chip
ready Distribution
Copyright copy 2014 CEVA Inc 10
Dedicated CV Processor or Mobile GPU C
EV
A-M
M3
10
1 P
erf
orm
an
ce
gain
(re
su
lt p
er
cycle
)
0
5
10
15
20
25
30
MM3101 Vs Mobile GPU A MM3101 Vs Mobile GPU B MM3101 Vs Mobile GPU C
sobel
corr3x3
corrsep5x5
corrsep11x11
Histogram
HSV2RGB
RGB2HSV
Max3x3
Min
Median3x3
CornerHarris
Average
Typical set of
representative
CV amp imaging
algorithms
MM3101 shows average 13x performance boost over leading Mobile GPUs
Power factor gt50x more efficient
Copyright copy 2014 CEVA Inc 11
Typical Feature Treatment Flow
Sobel
Gaussian
Build image
pyramids
Integral sum
Gamma
normalization
HOG
BRIEF
SIFT
SURF
FREAK
LBP
HAAR
SVM
Decision
tree
Harris
FAST
SIFT
SURF
Copyright copy 2014 CEVA Inc 12
Flow Chart mdash HOG Descriptor
Input image
Scaled image
Scale 1
Scaled image
Gamma
Normalization
Gradient
Calculation
Descriptor
Calculation
Bilinear
Scaling
HOG algorithm is based on Dalal amp Triggs paper (2005)
Common use is object detection especially pedestrian detection
Reference Code ndash OpenCV 243
Scale 9
Copyright copy 2014 CEVA Inc 13
HOG mdash Bilinear Scaling
Bilinear Interpolation
X
X
X
Step 1 Vertical Interpolation
Step 2 Horizontal Interpolation
Copyright copy 2014 CEVA Inc 14
1 Load 2 X 8 pixels in a single cycle
2 16 filter operations in a single cycle
3 Transposed Store (4x4) in single cycle
4 Perform the load and filter again (12)
5 Transposed Store in single cycle (3)
HOG mdash Bilinear Scaling mdash Implementation
Memory
vA
vB
vC
vD
Vector Registers
Memory
vA
vB
vC
vD
Vector Registers
Memory
vA
vB
vC
vD
Vector Registers
transpose
Copyright copy 2014 CEVA Inc 15
bull Gamma Function P(γ)
bull Implemented using lsquoLook Up Tablersquo (LUT) ndash parallel access to local
memory in single cycle
bull Loading of multiple gamma values in a single cycle
HOG mdash Gamma Normalization
Copyright copy 2014 CEVA Inc 16
bull ORB mdash Oriented FAST and Rotated BRIEF
bull An efficient alternative to SIFT
bull Pyramid is used for scale-invariance
bull Features are detected using FAST9 Harris and non-max-suppress
bull Descriptors are based on BRIEF with normalized orientation
ORB mdash Feature Extraction
Input
Image Fast9 Harris
Non-Max-
Suppress
Oriented
BRIEF Descriptors
list Pyramid
Copyright copy 2014 CEVA Inc 17
ORB mdash FAST9 Implementation
bull Continuous arc of 9 or more pixels
bull All much brighter then (p+Th)
or
bull All much darker then (p-Th)
Copyright copy 2014 CEVA Inc 18
bull Early exit is used to detect potential positions
bull Long memory access of 32 bytes using
bull quickly load consecutive pixels
bull Vector compare is used to compare the center of the corner to
the borders
bull Building a binary (bit) map with positions that need to be calculated
bull Calculation of multiple positions in parallel
bull Using different two dimensional loads
bull Vector predicates are used selectively calculate only the locations
that pass the threshold
bull Using multi-way parallel lookup table access to decide on
consecutive locations
ORB mdash FAST9 Implementation (2)
Copyright copy 2014 CEVA Inc 19
ORB mdash FAST9 Implementation (3)
SIMD Efficient Random access to memory
Fast First Pass
Candidate list
Input Image
Fast First Pass
Candidate list
Full FAST9 Second Pass
Output feature list
Input Image
Copyright copy 2014 CEVA Inc 20
bull Oriented brief uses the normalized orientation and calculates a 256 bit wide descriptor
bull The descriptor is calculated by comparison of pre-defined 256 pairs of pixels in the surrounding of the feature center
bull Each pair comparison donates a single bit in the descriptor
bull Orientation is normalized by rotating the image (or pairs coordinates in our implementation) according to the moment of the feature center
BRIEF mdash Descriptor
Copyright copy 2014 CEVA Inc 21
BRIEF DSP Implementation
Calculate patch orientation (Patch Moment)
Utilizing LUT capability read the pixels address
Load with random access to memory the pixel couples
Use SIMD capability to efficiently calculate descriptor
Orientation
Copyright copy 2014 CEVA Inc 22
bull Inspired by the SIFT descriptor much faster
bull Main modules
bull Detect features location according to pixels response to feature
bull Calculate feature descriptor
SURF mdash Speeded Up Robust Features
Integral Sum Image
Feature Response
Find Local Maximum
Choose Best Features
Feature Descriptor
Copyright copy 2014 CEVA Inc 23
+ +
T T
vector memory vector register vector memory vector register
bull Two pass approach
bull Horizontal and vertical data access (Full Bandwidth)
bull Load and transpose data in a single instruction
SURF mdash Integral Sum Image
Copyright copy 2014 CEVA Inc 24
bull Calculate feature response using Box Filter
bull Small box A-B+D-C
bull Large box E-F+H-G
bull Total response E-F+H-G - 3x(A-B+D-C)
bull Can be executed in a single cycle
bull Two operations in a single cycle
Res = (E-F)+(H-G)
Res += (A-B)(-3)+(D-C)(-3)
SURF mdash Feature Response
Flexible amp Powerful Filter Instruction
-2
1
1A B
C D
E F
G H
Copyright copy 2014 CEVA Inc 25
bull Handles multiple features in parallel
bull Memory access of several different features in a single instruction
bull Calculates Integral sum
Vertical and horizontal memory access
bull Calculates gradient using box filter result = (A-B) + (D-C)
SURF mdash Feature Descriptor
Feature Data 0
Feature Data 1
Feature Data 7
parallel load
Copyright copy 2014 CEVA Inc 26
Feature Extraction Summary Table
Algorithm Memory Access Execution
2-Dimensional
Access
LUT Support Parallel
Memory Access
Dedicated
Instructions
Vectorized
Conditional Flow
HOG ndash Bilinear
HOG - Gamma
FAST9 - Detector
BRIEF - Descriptor
SURF ndash Int Sum
SURF ndash Feature
Response
SURF - Descriptor
Copyright copy 2014 CEVA Inc 27
Conclusion mdash
Critical Ingredients for Efficient CV Processor
CEVA-MM3101 Imaging amp Computer Vision IP platform Includes all above features (and morehellip)
CEVA-CV library already includes various feature extraction algos
Enables shorter development cycle efficiency and algorithm flexibility
CV Processor
Efficient filter processing
Good conditional code
execution
Fast amp flexible random access to
memory
Good bit and byte data
manipulation
Wide memory bandwidth
Large internal memory
Copyright copy 2014 CEVA Inc 28
1 ORB an efficient alternative to SIFT or SURF
Rublee E Rabaud V Konolige K Bradski G - Computer Vision (ICCV) 2011 IEEE
2 Histograms of oriented gradients for human detection
Dalal N Triggs B - Computer Vision and Pattern Recognition 2005 CVPR 2005
3 SURF Speeded Up Robust Features
Herbert Bay Tinne Tuytelaars Luc Van Gool - Computer Vision ndash ECCV 2006
4 BRIEF Binary Robust Independent Elementary Features
Michael Calonder Vincent Lepetit Christoph Strecha Pascal Fua - Computer Vision ndash ECCV 2010
5 Distinctive Image Features from Scale-Invariant Keypoints
David G Lowe - International Journal of Computer Vision Volume 60 Issue 2 pp 91-110
Resource for further Investigation
119936119952119958 are all invited to the CEVA demo table Thank You
Surveillance
[Figure: Computer Vision / Understanding Content application map, with the same Base Technologies and Applications entries as the preceding market slides]
Automotive
[Figure: Computer Vision / Understanding Content application map, with the same Base Technologies and Applications entries as the preceding market slides]
Why Use a Programmable CV Processor
[Figure: development timeline with two rows.
HW Solution: CV algorithm design, RTL design, Production, Chip ready, Distribution.
Programmable Solution: RTL design, Production, Chip ready, Distribution, alongside Programmable CV algorithm design and Continued algorithms development.]
Dedicated CV Processor or Mobile GPU
[Figure: bar chart. Y axis: CEVA-MM3101 performance gain (result per cycle), 0 to 30. Series: MM3101 vs. Mobile GPU A, MM3101 vs. Mobile GPU B, MM3101 vs. Mobile GPU C. X axis, a typical set of representative CV & imaging algorithms: sobel, corr3x3, corrsep5x5, corrsep11x11, Histogram, HSV2RGB, RGB2HSV, Max3x3, Min, Median3x3, CornerHarris, Average.]
• MM3101 shows an average 13x performance boost over leading mobile GPUs
• Power factor: >50x more efficient
Typical Feature Treatment Flow
[Figure: flow diagram with four groups of blocks]
Sobel, Gaussian, Build image pyramids, Integral sum, Gamma normalization
HOG, BRIEF, SIFT, SURF, FREAK, LBP, HAAR
SVM, Decision tree
Harris, FAST, SIFT, SURF
Flow Chart - HOG Descriptor
[Figure: Input image, Bilinear Scaling (Scale 1 to Scale 9), Scaled image, Gamma Normalization, Gradient Calculation, Descriptor Calculation]
• The HOG algorithm is based on the Dalal & Triggs paper (2005)
• Its common use is object detection, especially pedestrian detection
• Reference code: OpenCV 2.4.3
HOG - Bilinear Scaling
Bilinear Interpolation
Step 1: Vertical interpolation
Step 2: Horizontal interpolation
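The two steps above can be sketched in plain Python as a separable resize. This is only an illustration, not the CEVA implementation: the function names are hypothetical, and the coordinate mapping (first and last samples map onto each other) is an assumption that differs from OpenCV's pixel-center convention. The horizontal step is written as a second vertical pass over a transposed intermediate, which is one way to keep both passes row-oriented.

```python
def lerp_rows(img, out_h):
    """Vertical pass: blend the two source rows that bracket each output row."""
    in_h = len(img)
    out = []
    for y in range(out_h):
        # Map the output row back to a fractional source row (assumed mapping).
        pos = y * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(pos)
        y1 = min(y0 + 1, in_h - 1)
        w = pos - y0
        out.append([(1 - w) * a + w * b for a, b in zip(img[y0], img[y1])])
    return out


def transpose(img):
    """Swap rows and columns."""
    return [list(col) for col in zip(*img)]


def bilinear_scale(img, out_h, out_w):
    """Step 1: vertical interpolation. Step 2: horizontal interpolation,
    done as a vertical pass over the transposed intermediate image."""
    tmp = lerp_rows(img, out_h)
    return transpose(lerp_rows(transpose(tmp), out_w))


if __name__ == "__main__":
    for row in bilinear_scale([[0, 10], [20, 30]], 3, 3):
        print(row)
```

Because each output row depends on only two input rows, every row can be processed as one wide vector operation, which is the property the slides rely on.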
HOG - Bilinear Scaling - Implementation
1. Load 2 x 8 pixels in a single cycle
2. 16 filter operations in a single cycle
3. Transposed store (4x4) in a single cycle
4. Perform the load and filter again (steps 1, 2)
5. Transposed store in a single cycle (step 3)
[Figure, three panels: Memory and Vector Registers vA, vB, vC, vD, with a transpose]
HOG - Gamma Normalization
• Gamma function P(γ)
• Implemented using a 'Look Up Table' (LUT): parallel access to local memory in a single cycle
• Loading of multiple gamma values in a single cycle
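A minimal sketch of gamma normalization by table lookup, assuming 8-bit pixels. The function names are hypothetical, and gamma = 0.5 (square-root compression) is chosen only as an example value. The table is built once, after which every pixel costs a single independent lookup.

```python
def build_gamma_lut(gamma, levels=256):
    """Precompute round(top * (p / top) ** gamma) for every pixel value p."""
    top = levels - 1
    return [round(top * (p / top) ** gamma) for p in range(levels)]


def apply_gamma(img, lut):
    """Per-pixel table lookup; each lookup is independent of the others."""
    return [[lut[p] for p in row] for row in img]


# Example table: square-root compression (an assumed choice of gamma).
sqrt_lut = build_gamma_lut(0.5)

if __name__ == "__main__":
    print(apply_gamma([[0, 16, 64, 255]], sqrt_lut))
```

Since the lookups do not depend on each other, hardware that can serve several table reads at once can process several pixels per cycle.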
ORB - Feature Extraction
• ORB - Oriented FAST and Rotated BRIEF
• An efficient alternative to SIFT
• A pyramid is used for scale invariance
• Features are detected using FAST9, Harris, and non-max suppression
• Descriptors are based on BRIEF with normalized orientation
[Figure: pipeline blocks Input Image, Pyramid, FAST9, Harris, Non-Max-Suppress, Oriented BRIEF, Descriptors list]
ORB - FAST9 Implementation
• Continuous arc of 9 or more pixels
• All much brighter than (p+Th)
or
• All much darker than (p-Th)
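The criterion above can be sketched as a scalar segment test on the 16-pixel ring of radius 3 around a candidate pixel p. This is a reference-style illustration, not the vectorized DSP code; the names `RING`, `longest_circular_run`, and `is_fast9_corner` are hypothetical.

```python
# Offsets (dx, dy) of the 16 ring pixels, in order around the circle.
RING = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
        (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def longest_circular_run(flags):
    """Length of the longest run of True values on a circular list."""
    if all(flags):
        return len(flags)
    best = run = 0
    for f in flags + flags:  # doubling the list handles wrap-around
        run = run + 1 if f else 0
        best = max(best, run)
    return best


def is_fast9_corner(img, x, y, th):
    """True if 9 or more contiguous ring pixels are all brighter than p + th
    or all darker than p - th. Assumes (x, y) is at least 3 pixels from the border."""
    p = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in RING]
    brighter = [v > p + th for v in ring]
    darker = [v < p - th for v in ring]
    return longest_circular_run(brighter) >= 9 or longest_circular_run(darker) >= 9
```

For example, a single bright pixel on a dark background passes the test, because all 16 ring pixels are darker than p - th.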
ORB - FAST9 Implementation (2)
• Early exit is used to detect potential positions
• Long memory accesses of 32 bytes are used to quickly load consecutive pixels
• Vector compare is used to compare the center of the corner to the borders
• A binary (bit) map is built with the positions that need to be calculated
• Calculation of multiple positions in parallel
• Uses different two-dimensional loads
• Vector predicates are used to selectively calculate only the locations that pass the threshold
• Multi-way parallel lookup-table access is used to decide on consecutive locations
ORB - FAST9 Implementation (3)
[Figure: Input Image, Fast First Pass, Candidate list, Full FAST9 Second Pass, Output feature list; labels: SIMD, Efficient random access to memory]
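The two-pass flow (fast first pass, candidate list, full second pass) might look like the sketch below. The specific early-exit rule is an assumption, not taken from the slides: any arc of 9 ring pixels must contain at least two of the four compass pixels, so a position with fewer than two compass pixels on the same side of the threshold can be rejected cheaply. All names are hypothetical.

```python
RING = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
        (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]
COMPASS = (0, 4, 8, 12)  # ring indices of the top, right, bottom, left pixels


def first_pass(img, th):
    """Cheap early-exit test that builds the candidate list (assumed rule)."""
    h, w = len(img), len(img[0])
    candidates = []
    for y in range(3, h - 3):
        for x in range(3, w - 3):
            p = img[y][x]
            vals = [img[y + RING[i][1]][x + RING[i][0]] for i in COMPASS]
            if (sum(v > p + th for v in vals) >= 2
                    or sum(v < p - th for v in vals) >= 2):
                candidates.append((x, y))
    return candidates


def full_test(img, x, y, th):
    """Full FAST9 segment test on one position."""
    p = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in RING]
    for flags in ([v > p + th for v in ring], [v < p - th for v in ring]):
        run = best = 0
        for f in flags + flags:  # doubled list handles wrap-around
            run = run + 1 if f else 0
            best = max(best, run)
        if best >= 9:
            return True
    return False


def fast9(img, th):
    """Second pass: run the full test only on first-pass candidates."""
    return [(x, y) for x, y in first_pass(img, th) if full_test(img, x, y, th)]
```

The first pass never rejects a true corner, so the output matches running the full test everywhere, while most positions cost only four comparisons.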
BRIEF - Descriptor
• Oriented BRIEF uses the normalized orientation and calculates a 256-bit-wide descriptor
• The descriptor is calculated by comparing 256 predefined pairs of pixels in the surroundings of the feature center
• Each pair comparison contributes a single bit to the descriptor
• Orientation is normalized by rotating the image (or the pair coordinates, in our implementation) according to the moment of the feature center
BRIEF DSP Implementation
• Calculate the patch orientation (patch moment)
• Using the LUT capability, read the pixel addresses
• Load the pixel pairs with random access to memory
• Use the SIMD capability to efficiently calculate the descriptor
[Figure label: Orientation]
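The steps above can be illustrated with a scalar sketch: estimate the patch orientation from first-order intensity moments, rotate the pair coordinates by that angle, then pack one comparison bit per pair. The pair pattern here is a hypothetical random one generated from a fixed seed (ORB itself uses a specific trained pattern), the moments are taken over a square patch for simplicity, and all function names are assumptions.

```python
import math
import random


def patch_orientation(img, cx, cy, r):
    """Angle of the intensity centroid, from first-order moments m10 and m01."""
    m10 = m01 = 0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            v = img[cy + dy][cx + dx]
            m10 += dx * v
            m01 += dy * v
    return math.atan2(m01, m10)


def make_pairs(n=256, r=6, seed=0):
    """Hypothetical fixed test pattern: n point pairs inside [-r, r] x [-r, r]."""
    rng = random.Random(seed)

    def point():
        return (rng.randint(-r, r), rng.randint(-r, r))

    return [(point(), point()) for _ in range(n)]


def rotate(pt, angle):
    """Rotate an integer offset by angle and round back to the pixel grid."""
    c, s = math.cos(angle), math.sin(angle)
    return (round(c * pt[0] - s * pt[1]), round(s * pt[0] + c * pt[1]))


def oriented_brief(img, cx, cy, pairs, r):
    """One bit per pair: 1 if the first rotated sample is darker than the second.
    Needs a margin of about 1.5 * r pixels around (cx, cy)."""
    angle = patch_orientation(img, cx, cy, r)
    bits = 0
    for i, (a, b) in enumerate(pairs):
        (ax, ay), (bx, by) = rotate(a, angle), rotate(b, angle)
        if img[cy + ay][cx + ax] < img[cy + by][cx + bx]:
            bits |= 1 << i
    return bits
```

Two descriptors are then compared by Hamming distance, for example `bin(d1 ^ d2).count("1")`. Rotating the pair coordinates rather than the image keeps the per-feature work to 512 pixel reads.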
SURF - Speeded Up Robust Features
• Inspired by the SIFT descriptor, but much faster
• Main modules
• Detect feature locations according to the pixels' feature response
• Calculate the feature descriptor
[Figure: Integral Sum Image, Feature Response, Find Local Maximum, Choose Best Features, Feature Descriptor]
SURF - Integral Sum Image
• Two-pass approach
• Horizontal and vertical data access (full bandwidth)
• Load and transpose data in a single instruction
[Figure: vector memory and vector register, with transpose (T) and add (+), shown for both passes]
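A minimal sketch of the two-pass approach, assuming the first pass accumulates along rows and the second along columns (the slides do not state the order). The function name is hypothetical.

```python
def integral_image(img):
    """Two passes: running sums along each row, then down each column."""
    h = len(img)
    # Pass 1 (horizontal): prefix sums within every row.
    rows = []
    for r in img:
        acc, out = 0, []
        for v in r:
            acc += v
            out.append(acc)
        rows.append(out)
    # Pass 2 (vertical): add the accumulated row above to each row.
    for y in range(1, h):
        rows[y] = [a + b for a, b in zip(rows[y], rows[y - 1])]
    return rows


if __name__ == "__main__":
    print(integral_image([[1, 2], [3, 4]]))
```

The vertical pass works on whole rows at a time, while the horizontal pass is inherently serial within a row. That asymmetry is presumably why a load-with-transpose helps: after transposing, the horizontal pass can also be expressed as whole-row additions.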
SURF - Feature Response
Flexible & Powerful Filter Instruction
• Calculate the feature response using a box filter
• Small box: A-B+D-C
• Large box: E-F+H-G
• Total response: E-F+H-G - 3x(A-B+D-C)
• Can be executed in a single cycle
• Two operations in a single cycle:
Res = (E-F) + (H-G)
Res += (A-B)*(-3) + (D-C)*(-3)
[Figure: box filter with weights 1, -2, 1; small box corners A, B, C, D; large box corners E, F, G, H]
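The box-filter arithmetic above can be checked with a small sketch. It assumes a zero-padded integral image and that A, B, C, D (and likewise E, F, G, H) are the top-left, top-right, bottom-left, and bottom-right integral-image samples of a box, which is consistent with the A-B+D-C formula. The function names and the band geometry are hypothetical.

```python
def integral_padded(img):
    """Integral image with an extra zero row and column, so that
    S[y][x] is the sum of all pixels in rows < y and columns < x."""
    h, w = len(img), len(img[0])
    S = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        acc = 0
        for x in range(w):
            acc += img[y][x]
            S[y + 1][x + 1] = S[y][x + 1] + acc
    return S


def box_sum(S, x0, y0, x1, y1):
    """Sum over rows y0..y1-1 and columns x0..x1-1 from four corner reads."""
    A, B = S[y0][x0], S[y0][x1]  # top-left, top-right
    C, D = S[y1][x0], S[y1][x1]  # bottom-left, bottom-right
    return (A - B) + (D - C)


def response_1_m2_1(S, x, y, w, lobe):
    """Three stacked bands (each w wide, lobe high) weighted 1, -2, 1,
    computed as (large box) - 3 * (small middle box)."""
    large = box_sum(S, x, y, x + w, y + 3 * lobe)
    small = box_sum(S, x, y + lobe, x + w, y + 2 * lobe)
    return large - 3 * small
```

Subtracting three times the middle box from the full box leaves the outer bands at weight 1 and the middle band at weight -2, so the whole response needs eight integral-image reads regardless of filter size.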
Copyright copy 2014 CEVA Inc 7
Surveillance
Gesture
Control
Base Technologies
bull Feature Extraction
amp Description
bull Feature Matching
and Tracking
Applications
Augmented
Reality
Object Detection
Recognition
Depth Map
Image Stitching
Face Detection and
Recognition
Motion Detection
Emotion Detection
Segmentation
Irregular Behavior
Detection
Forward Collision
Warning (FCW)
Lane Departure
Warning (LDW)
Pedestrian
Detection (PD)
Traffic Sign Rec
(TSR)
Driver Fatigue
Warning
Surround View
Monitor
Always-On
AgeGender
Detection
Computer Vision Understanding
Content
Copyright copy 2014 CEVA Inc 8
Automotive
Gesture
Control
Base Technologies
bull Feature Extraction
amp Description
bull Feature Matching
and Tracking
Applications
Augmented
Reality
Object Detection
Recognition
Depth Map
Image Stitching
Face Detection and
Recognition
Motion Detection
Emotion Detection
AgeGender
Detection
Segmentation
Irregular Behavior
Detection
Forward Collision
Warning (FCW)
Lane Departure
Warning (LDW)
Pedestrian
Detection (PD)
Traffic Sign Rec
(TSR)
Driver Fatigue
Warning
Surround View
Monitor
Always-On
Computer Vision Understanding Content
Copyright copy 2014 CEVA Inc 9
RTL
design Production Chip
ready Distribution
Why Use a Programmable CV Processor
Development
Time Line
RTL
design
CV
algorithm
design
Production
HW
So
lution
Programmable CV algorithm design Continued algorithms
development
Pro
gra
mm
able
So
lution
Chip
ready Distribution
Copyright copy 2014 CEVA Inc 10
Dedicated CV Processor or Mobile GPU C
EV
A-M
M3
10
1 P
erf
orm
an
ce
gain
(re
su
lt p
er
cycle
)
0
5
10
15
20
25
30
MM3101 Vs Mobile GPU A MM3101 Vs Mobile GPU B MM3101 Vs Mobile GPU C
sobel
corr3x3
corrsep5x5
corrsep11x11
Histogram
HSV2RGB
RGB2HSV
Max3x3
Min
Median3x3
CornerHarris
Average
Typical set of
representative
CV amp imaging
algorithms
MM3101 shows average 13x performance boost over leading Mobile GPUs
Power factor gt50x more efficient
Copyright copy 2014 CEVA Inc 11
Typical Feature Treatment Flow
Sobel
Gaussian
Build image
pyramids
Integral sum
Gamma
normalization
HOG
BRIEF
SIFT
SURF
FREAK
LBP
HAAR
SVM
Decision
tree
Harris
FAST
SIFT
SURF
Copyright copy 2014 CEVA Inc 12
Flow Chart mdash HOG Descriptor
Input image
Scaled image
Scale 1
Scaled image
Gamma
Normalization
Gradient
Calculation
Descriptor
Calculation
Bilinear
Scaling
HOG algorithm is based on Dalal amp Triggs paper (2005)
Common use is object detection especially pedestrian detection
Reference Code ndash OpenCV 243
Scale 9
Copyright copy 2014 CEVA Inc 13
HOG mdash Bilinear Scaling
Bilinear Interpolation
X
X
X
Step 1 Vertical Interpolation
Step 2 Horizontal Interpolation
Copyright copy 2014 CEVA Inc 14
1 Load 2 X 8 pixels in a single cycle
2 16 filter operations in a single cycle
3 Transposed Store (4x4) in single cycle
4 Perform the load and filter again (12)
5 Transposed Store in single cycle (3)
HOG mdash Bilinear Scaling mdash Implementation
Memory
vA
vB
vC
vD
Vector Registers
Memory
vA
vB
vC
vD
Vector Registers
Memory
vA
vB
vC
vD
Vector Registers
transpose
Copyright copy 2014 CEVA Inc 15
HOG: Gamma Normalization
• Gamma function P(γ)
• Implemented using a 'Look Up Table' (LUT): parallel access to local memory in a single cycle
• Loading of multiple gamma values in a single cycle

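A look-up table replaces the per-pixel power function with a single indexed read. The sketch below assumes 8-bit pixels and a gamma of 0.5 (square-root compression) purely for illustration; `build_gamma_lut` and `apply_lut` are hypothetical names, and the slides do not state the gamma value or the table size used.

```python
def build_gamma_lut(gamma, levels=256):
    """Precompute round(top * (v / top) ** gamma) for every pixel value v."""
    top = levels - 1
    return [round(top * (v / top) ** gamma) for v in range(levels)]


def apply_lut(img, lut):
    """Gamma-normalize an image (list of rows) with one table read per pixel."""
    return [[lut[p] for p in row] for row in img]
```

On a vector processor the inner list comprehension corresponds to several table reads issued in parallel; in Python it only shows that no power function is evaluated per pixel.
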
ORB: Feature Extraction
• ORB: Oriented FAST and Rotated BRIEF
• An efficient alternative to SIFT
• Pyramid is used for scale invariance
• Features are detected using FAST9, Harris and non-maximum suppression
• Descriptors are based on BRIEF with normalized orientation
[Figure: Input Image -> Pyramid -> FAST9 -> Harris -> Non-Max-Suppress -> Oriented BRIEF -> Descriptors list]

ORB: FAST9 Implementation
• Continuous arc of 9 or more pixels
• All much brighter than (p+Th)
or
• All much darker than (p-Th)

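The arc criterion can be written down directly. The sketch below checks the 16 pixels on a radius-3 circle around the candidate and looks for a run of at least 9 consecutive pixels that are all brighter than p+Th or all darker than p-Th. It is a scalar reference for the test itself, not the vectorized implementation; `CIRCLE`, `longest_circular_run` and `is_fast9_corner` are names chosen here.

```python
# Offsets (dx, dy) of the 16 pixels on the radius-3 circle, in clockwise order.
CIRCLE = [
    (0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
    (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3),
]


def longest_circular_run(flags):
    """Length of the longest run of True values on a circular sequence."""
    if all(flags):
        return len(flags)
    best = run = 0
    for flag in flags + flags:   # doubling the list handles wrap-around
        run = run + 1 if flag else 0
        best = max(best, run)
    return best


def is_fast9_corner(img, x, y, th):
    """True if (x, y) passes the FAST9 segment test with threshold th.

    The caller must keep (x, y) at least 3 pixels away from the image border.
    """
    p = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    brighter = [v > p + th for v in ring]
    darker = [v < p - th for v in ring]
    return (longest_circular_run(brighter) >= 9
            or longest_circular_run(darker) >= 9)
```
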
ORB: FAST9 Implementation (2)
• Early exit is used to detect potential positions
• Long memory access of 32 bytes is used to quickly load consecutive pixels
• Vector compare is used to compare the center of the corner to the borders
• Building a binary (bit) map with positions that need to be calculated
• Calculation of multiple positions in parallel
• Using different two-dimensional loads
• Vector predicates are used to selectively calculate only the locations that pass the threshold
• Using multi-way parallel lookup table access to decide on consecutive locations

ORB: FAST9 Implementation (3)
[Figure: Input Image -> Fast First Pass -> Candidate list -> Full FAST9 Second Pass -> Output feature list. Annotations: SIMD; efficient random access to memory]

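The two-pass structure can be sketched as a cheap rejection test that fills a candidate bitmap, followed by the expensive test only where a bit is set. The slides do not say which early-exit test is used, so the one below is an assumption: any arc of 9 pixels on the 16-pixel circle covers at least two of the four compass points, hence a corner needs at least two compass pixels brighter than p+Th or at least two darker than p-Th. The names `COMPASS`, `first_pass` and `second_pass` are hypothetical, and the full test is passed in as a callable rather than fixed.

```python
# Offsets (dx, dy) of the four compass points on the radius-3 circle.
COMPASS = [(0, -3), (3, 0), (0, 3), (-3, 0)]


def first_pass(img, th):
    """Cheap necessary condition; returns a bitmap (rows of 0/1) of candidates."""
    h, w = len(img), len(img[0])
    bitmap = [[0] * w for _ in range(h)]
    for y in range(3, h - 3):
        for x in range(3, w - 3):
            p = img[y][x]
            vals = [img[y + dy][x + dx] for dx, dy in COMPASS]
            bright = sum(v > p + th for v in vals)
            dark = sum(v < p - th for v in vals)
            if bright >= 2 or dark >= 2:
                bitmap[y][x] = 1
    return bitmap


def second_pass(img, bitmap, th, full_test):
    """Run the expensive full_test(img, x, y, th) only where the bitmap is set."""
    return [
        (x, y)
        for y, row in enumerate(bitmap)
        for x, bit in enumerate(row)
        if bit and full_test(img, x, y, th)
    ]
```

The first pass touches only regularly spaced pixels and so maps well to wide loads and vector compares. The second pass visits scattered positions, which is where efficient random access to memory matters.
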
BRIEF: Descriptor
• Oriented BRIEF uses the normalized orientation and calculates a 256-bit-wide descriptor
• The descriptor is calculated by comparing 256 pre-defined pairs of pixels in the surroundings of the feature center
• Each pair comparison contributes a single bit to the descriptor
• Orientation is normalized by rotating the image (or the pair coordinates, in our implementation) according to the moment of the feature center

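A descriptor built from pair comparisons can be sketched as follows. The sampling pattern here is generated pseudo-randomly with a fixed seed only to make the example self-contained; real implementations ship a fixed, pre-defined table. `make_pairs`, `brief_descriptor` and `hamming` are illustrative names, and the Hamming distance is included because it is the usual way to compare binary descriptors.

```python
import random


def make_pairs(n_bits=256, radius=15, seed=0):
    """Hypothetical sampling pattern: n_bits pairs of (dx, dy) offsets."""
    rng = random.Random(seed)

    def point():
        return (rng.randint(-radius, radius), rng.randint(-radius, radius))

    return [(point(), point()) for _ in range(n_bits)]


def brief_descriptor(img, cx, cy, pairs):
    """Pack one bit per pair: 1 if the first pixel is darker than the second."""
    desc = 0
    for i, ((ax, ay), (bx, by)) in enumerate(pairs):
        if img[cy + ay][cx + ax] < img[cy + by][cx + bx]:
            desc |= 1 << i
    return desc


def hamming(d1, d2):
    """Number of differing bits between two descriptors."""
    return bin(d1 ^ d2).count("1")
```

Each descriptor is returned as a Python integer with up to 256 significant bits; on a DSP the same bits would be packed into vector registers.
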
BRIEF: DSP Implementation
• Calculate patch orientation (patch moment)
• Utilizing the LUT capability, read the pixel addresses
• Load the pixel pairs with random access to memory
• Use the SIMD capability to efficiently calculate the descriptor
[Figure: patch with its orientation marked]

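The orientation step can be sketched with first-order image moments, followed by rotating the pair coordinates instead of the image. This is an illustrative sketch under assumptions: a square patch is used for simplicity, rotated offsets are rounded to the nearest pixel, and `patch_orientation` and `rotate_pairs` are names invented here. A table-driven implementation would precompute the rotated offsets for a set of quantized angles.

```python
import math


def patch_orientation(img, cx, cy, radius):
    """Angle of the intensity centroid: atan2(m01, m10) over a square patch."""
    m10 = m01 = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            v = img[cy + dy][cx + dx]
            m10 += dx * v   # first-order moment along x
            m01 += dy * v   # first-order moment along y
    return math.atan2(m01, m10)


def rotate_pairs(pairs, angle):
    """Rotate every (dx, dy) offset of every pair by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)

    def rot(pt):
        x, y = pt
        return (round(c * x - s * y), round(s * x + c * y))

    return [(rot(a), rot(b)) for a, b in pairs]
```

Rotating 256 coordinate pairs is far cheaper than resampling the image patch, which matches the choice described above of rotating the pair coordinates.
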
SURF: Speeded Up Robust Features
• Inspired by the SIFT descriptor, but much faster
• Main modules
• Detect feature locations according to the pixels' response to the feature filter
• Calculate feature descriptor
[Figure: Integral Sum Image -> Feature Response -> Find Local Maximum -> Choose Best Features -> Feature Descriptor]

SURF: Integral Sum Image
• Two-pass approach
• Horizontal and vertical data access (full bandwidth)
• Load and transpose data in a single instruction
[Figure: two panels, each showing vector memory and a vector register with an accumulate (+) and a transpose (T)]

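The two-pass approach amounts to a running sum along every row followed by a running sum down every column. A minimal Python sketch, with `integral_image` as an illustrative name and no padding row or column:

```python
def integral_image(img):
    """Return ii where ii[y][x] is the sum of img[0..y][0..x] (inclusive)."""
    h, w = len(img), len(img[0])
    # Pass 1: running sum along each row (horizontal access).
    ii = []
    for row in img:
        acc, out = 0, []
        for v in row:
            acc += v
            out.append(acc)
        ii.append(out)
    # Pass 2: running sum down each column (vertical access).
    for y in range(1, h):
        for x in range(w):
            ii[y][x] += ii[y - 1][x]
    return ii
```

The second pass walks memory column by column, which is the access pattern that a combined load-and-transpose instruction is meant to make as cheap as the row pass.
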
SURF: Feature Response
• Calculate feature response using box filter
• Small box: A-B+D-C
• Large box: E-F+H-G
• Total response: E-F+H-G - 3x(A-B+D-C)
• Can be executed in a single cycle
• Two operations in a single cycle:
  Res = (E-F)+(H-G)
  Res += (A-B)*(-3)+(D-C)*(-3)
• Flexible & powerful filter instruction
[Figure: box filter diagram labelled with weights -2, 1, 1; small box corners A, B, C, D; large box corners E, F, G, H]

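The corner arithmetic can be checked with a short sketch. It assumes an integral image padded with a leading zero row and column, so that `ii[y][x]` is the sum of all pixels above and to the left of (x, y). It also assumes the large box spans three side-by-side lobes with the small box as the middle lobe, which gives effective lobe weights of 1, -2, 1. That layout is a reading of the diagram labels, not something the slides state, and `box_sum` and `response_xx` are illustrative names.

```python
def box_sum(ii, x0, y0, x1, y1):
    """Sum of pixels in the half-open box [x0, x1) x [y0, y1).

    ii is a padded integral image: ii[y][x] = sum of img[0:y][0:x].
    """
    a, b = ii[y0][x0], ii[y0][x1]   # top-left, top-right corners
    c, d = ii[y1][x0], ii[y1][x1]   # bottom-left, bottom-right corners
    return (a - b) + (d - c)


def response_xx(ii, x, y, lobe, height):
    """Response of three horizontal lobes of width `lobe`, top-left at (x, y)."""
    large = box_sum(ii, x, y, x + 3 * lobe, y + height)          # E-F+H-G
    small = box_sum(ii, x + lobe, y, x + 2 * lobe, y + height)   # A-B+D-C
    res = large            # Res  = (E-F)+(H-G)
    res += -3 * small      # Res += (A-B)*(-3)+(D-C)*(-3)
    return res
```

Subtracting three times the middle lobe from the sum over all three lobes leaves weight 1 on the outer lobes and 1 - 3 = -2 on the middle one, using eight table reads regardless of the box size.
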
SURF: Feature Descriptor
• Handles multiple features in parallel
• Memory access of several different features in a single instruction
• Calculates integral sum: vertical and horizontal memory access
• Calculates gradient using box filter: result = (A-B) + (D-C)
[Figure: parallel load of Feature Data 0, Feature Data 1, ... Feature Data 7]

Feature Extraction Summary Table
Columns: Algorithm | Memory Access (2-Dimensional Access, LUT Support, Parallel Memory Access) | Execution (Dedicated Instructions, Vectorized Conditional Flow)
Rows:
• HOG - Bilinear
• HOG - Gamma
• FAST9 - Detector
• BRIEF - Descriptor
• SURF - Int Sum
• SURF - Feature Response
• SURF - Descriptor
[Table cell marks were not preserved in the transcript]

Conclusion: Critical Ingredients for Efficient CV Processor
CV Processor
• Efficient filter processing
• Good conditional code execution
• Fast & flexible random access to memory
• Good bit and byte data manipulation
• Wide memory bandwidth
• Large internal memory
CEVA-MM3101 Imaging & Computer Vision IP platform includes all above features (and more...)
CEVA-CV library already includes various feature extraction algorithms
Enables shorter development cycle, efficiency and algorithm flexibility

Resources for Further Investigation
1. ORB: an efficient alternative to SIFT or SURF. Rublee E., Rabaud V., Konolige K., Bradski G. Computer Vision (ICCV), 2011 IEEE.
2. Histograms of oriented gradients for human detection. Dalal N., Triggs B. Computer Vision and Pattern Recognition, 2005 (CVPR 2005).
3. SURF: Speeded Up Robust Features. Herbert Bay, Tinne Tuytelaars, Luc Van Gool. Computer Vision – ECCV 2006.
4. BRIEF: Binary Robust Independent Elementary Features. Michael Calonder, Vincent Lepetit, Christoph Strecha, Pascal Fua. Computer Vision – ECCV 2010.
5. Distinctive Image Features from Scale-Invariant Keypoints. David G. Lowe. International Journal of Computer Vision, Volume 60, Issue 2, pp. 91-110.

You are all invited to the CEVA demo table. Thank You
Copyright copy 2014 CEVA Inc 11
Typical Feature Treatment Flow
Sobel
Gaussian
Build image
pyramids
Integral sum
Gamma
normalization
HOG
BRIEF
SIFT
SURF
FREAK
LBP
HAAR
SVM
Decision
tree
Harris
FAST
SIFT
SURF
Copyright copy 2014 CEVA Inc 12
Flow Chart mdash HOG Descriptor
Input image
Scaled image
Scale 1
Scaled image
Gamma
Normalization
Gradient
Calculation
Descriptor
Calculation
Bilinear
Scaling
HOG algorithm is based on Dalal amp Triggs paper (2005)
Common use is object detection especially pedestrian detection
Reference Code ndash OpenCV 243
Scale 9
Copyright copy 2014 CEVA Inc 13
HOG mdash Bilinear Scaling
Bilinear Interpolation
X
X
X
Step 1 Vertical Interpolation
Step 2 Horizontal Interpolation
Copyright copy 2014 CEVA Inc 14
1 Load 2 X 8 pixels in a single cycle
2 16 filter operations in a single cycle
3 Transposed Store (4x4) in single cycle
4 Perform the load and filter again (12)
5 Transposed Store in single cycle (3)
HOG mdash Bilinear Scaling mdash Implementation
Memory
vA
vB
vC
vD
Vector Registers
Memory
vA
vB
vC
vD
Vector Registers
Memory
vA
vB
vC
vD
Vector Registers
transpose
Copyright copy 2014 CEVA Inc 15
bull Gamma Function P(γ)
bull Implemented using lsquoLook Up Tablersquo (LUT) ndash parallel access to local
memory in single cycle
bull Loading of multiple gamma values in a single cycle
HOG mdash Gamma Normalization
Copyright copy 2014 CEVA Inc 16
bull ORB mdash Oriented FAST and Rotated BRIEF
bull An efficient alternative to SIFT
bull Pyramid is used for scale-invariance
bull Features are detected using FAST9 Harris and non-max-suppress
bull Descriptors are based on BRIEF with normalized orientation
ORB mdash Feature Extraction
Input
Image Fast9 Harris
Non-Max-
Suppress
Oriented
BRIEF Descriptors
list Pyramid
Copyright copy 2014 CEVA Inc 17
ORB mdash FAST9 Implementation
bull Continuous arc of 9 or more pixels
bull All much brighter then (p+Th)
or
bull All much darker then (p-Th)
Copyright copy 2014 CEVA Inc 18
bull Early exit is used to detect potential positions
bull Long memory access of 32 bytes using
bull quickly load consecutive pixels
bull Vector compare is used to compare the center of the corner to
the borders
bull Building a binary (bit) map with positions that need to be calculated
bull Calculation of multiple positions in parallel
bull Using different two dimensional loads
bull Vector predicates are used selectively calculate only the locations
that pass the threshold
bull Using multi-way parallel lookup table access to decide on
consecutive locations
ORB mdash FAST9 Implementation (2)
Copyright copy 2014 CEVA Inc 19
ORB mdash FAST9 Implementation (3)
SIMD Efficient Random access to memory
Fast First Pass
Candidate list
Input Image
Fast First Pass
Candidate list
Full FAST9 Second Pass
Output feature list
Input Image
Copyright copy 2014 CEVA Inc 20
bull Oriented brief uses the normalized orientation and calculates a 256 bit wide descriptor
bull The descriptor is calculated by comparison of pre-defined 256 pairs of pixels in the surrounding of the feature center
bull Each pair comparison donates a single bit in the descriptor
bull Orientation is normalized by rotating the image (or pairs coordinates in our implementation) according to the moment of the feature center
BRIEF mdash Descriptor
Copyright copy 2014 CEVA Inc 21
BRIEF DSP Implementation
Calculate patch orientation (Patch Moment)
Utilizing LUT capability read the pixels address
Load with random access to memory the pixel couples
Use SIMD capability to efficiently calculate descriptor
Orientation
Copyright copy 2014 CEVA Inc 22
bull Inspired by the SIFT descriptor much faster
bull Main modules
bull Detect features location according to pixels response to feature
bull Calculate feature descriptor
SURF mdash Speeded Up Robust Features
Integral Sum Image
Feature Response
Find Local Maximum
Choose Best Features
Feature Descriptor
Copyright copy 2014 CEVA Inc 23
+ +
T T
vector memory vector register vector memory vector register
bull Two pass approach
bull Horizontal and vertical data access (Full Bandwidth)
bull Load and transpose data in a single instruction
SURF mdash Integral Sum Image
Copyright copy 2014 CEVA Inc 24
bull Calculate feature response using Box Filter
bull Small box A-B+D-C
bull Large box E-F+H-G
bull Total response E-F+H-G - 3x(A-B+D-C)
bull Can be executed in a single cycle
bull Two operations in a single cycle
Res = (E-F)+(H-G)
Res += (A-B)(-3)+(D-C)(-3)
SURF mdash Feature Response
Flexible amp Powerful Filter Instruction
-2
1
1A B
C D
E F
G H
Copyright copy 2014 CEVA Inc 25
SURF - Feature Descriptor
• Handles multiple features in parallel
• Memory access of several different features in a single instruction
• Calculates the integral sum, using vertical and horizontal memory access
• Calculates the gradient using a box filter: result = (A - B) + (D - C)
[Diagram: Feature Data 0, Feature Data 1, ... Feature Data 7, parallel load]
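As a rough scalar illustration of the gradient step, the sketch below computes Haar-like horizontal and vertical responses for a list of features from a zero-padded integral image. The function names, the half-size parameter `s`, and the use of a plain loop in place of the DSP's parallel load are assumptions for the example.

```python
def padded_integral(img):
    """Integral image with an extra zero row and column at the top-left."""
    h, w = len(img), len(img[0])
    S = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        acc = 0
        for x in range(w):
            acc += img[y][x]
            S[y + 1][x + 1] = S[y][x + 1] + acc
    return S

def box(S, x, y, w, h):
    """Sum of the w-by-h box whose top-left pixel is (x, y)."""
    return S[y][x] - S[y][x + w] + S[y + h][x + w] - S[y + h][x]

def gradients(S, features, s):
    """(dx, dy) per feature: right-minus-left and bottom-minus-top box sums."""
    out = []
    for x, y in features:  # the DSP would handle several features per instruction
        left = box(S, x - s, y - s, s, 2 * s)
        right = box(S, x, y - s, s, 2 * s)
        top = box(S, x - s, y - s, 2 * s, s)
        bottom = box(S, x - s, y, 2 * s, s)
        out.append((right - left, bottom - top))
    return out
```

Features must lie at least `s` pixels from the image border for the box corners to stay in range.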
Feature Extraction Summary Table
Columns: Algorithm | Memory Access: 2-Dimensional Access, LUT Support, Parallel Memory Access | Execution: Dedicated Instructions, Vectorized Conditional Flow
Rows: HOG - Bilinear, HOG - Gamma, FAST9 - Detector, BRIEF - Descriptor, SURF - Int Sum, SURF - Feature Response, SURF - Descriptor
Conclusion - Critical Ingredients for an Efficient CV Processor
CV Processor
• Efficient filter processing
• Good conditional code execution
• Fast & flexible random access to memory
• Good bit and byte data manipulation
• Wide memory bandwidth
• Large internal memory
CEVA-MM3101 Imaging & Computer Vision IP platform includes all of the above features (and more...)
CEVA-CV library already includes various feature extraction algorithms
Enables a shorter development cycle, efficiency, and algorithm flexibility
Resources for Further Investigation
1. ORB: An Efficient Alternative to SIFT or SURF. Rublee E., Rabaud V., Konolige K., Bradski G. - Computer Vision (ICCV), 2011 IEEE
2. Histograms of Oriented Gradients for Human Detection. Dalal N., Triggs B. - Computer Vision and Pattern Recognition, 2005 (CVPR 2005)
3. SURF: Speeded Up Robust Features. Herbert Bay, Tinne Tuytelaars, Luc Van Gool - Computer Vision - ECCV 2006
4. BRIEF: Binary Robust Independent Elementary Features. Michael Calonder, Vincent Lepetit, Christoph Strecha, Pascal Fua - Computer Vision - ECCV 2010
5. Distinctive Image Features from Scale-Invariant Keypoints. David G. Lowe - International Journal of Computer Vision, Volume 60, Issue 2, pp. 91-110
You are all invited to the CEVA demo table. Thank You
Flow Chart - HOG Descriptor
• HOG algorithm is based on the Dalal & Triggs paper (2005)
• Common use is object detection, especially pedestrian detection
• Reference code: OpenCV 2.4.3
[Flow chart: Input image -> Bilinear Scaling -> Scaled image (Scale 1 ... Scale 9) -> Gamma Normalization -> Gradient Calculation -> Descriptor Calculation]
HOG - Bilinear Scaling
Bilinear Interpolation
Step 1: Vertical interpolation
Step 2: Horizontal interpolation
[Diagram: interpolated sample positions marked X]
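The two steps can be sketched as a separable resize: interpolate along one axis, transpose, and interpolate along the other. The function names are hypothetical, and the coordinate mapping below (first and last samples aligned) is an assumption chosen for simplicity; it is not necessarily the convention used by the OpenCV reference code.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return a + (b - a) * t

def resize_rows(rows, new_len):
    """Linearly resample every row to new_len samples."""
    old_len = len(rows[0])
    scale = (old_len - 1) / (new_len - 1) if new_len > 1 else 0.0
    out = []
    for r in rows:
        line = []
        for i in range(new_len):
            pos = i * scale
            i0 = min(int(pos), old_len - 1)
            i1 = min(i0 + 1, old_len - 1)
            line.append(lerp(r[i0], r[i1], pos - i0))
        out.append(line)
    return out

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def bilinear_scale(img, new_w, new_h):
    # Step 1: vertical interpolation (columns are handled as rows after a transpose).
    cols = resize_rows(transpose(img), new_h)
    # Step 2: horizontal interpolation on the transposed-back result.
    return resize_rows(transpose(cols), new_w)
```

The explicit transposes play the role that the transposed store plays on the DSP: they let both steps read their data along the same direction.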
HOG - Bilinear Scaling - Implementation
1. Load 2 x 8 pixels in a single cycle
2. 16 filter operations in a single cycle
3. Transposed store (4x4) in a single cycle
4. Perform the load and filter again (steps 1, 2)
5. Transposed store in a single cycle (step 3)
[Diagram: three panels, each showing Memory and Vector Registers vA, vB, vC, vD, with a transpose step]
HOG - Gamma Normalization
• Gamma function P(γ)
• Implemented using a 'Look Up Table' (LUT): parallel access to local memory in a single cycle
• Loading of multiple gamma values in a single cycle
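A small sketch of LUT-based gamma normalization for 8-bit pixels. The function names are hypothetical, and the exponent 0.5 used in the usage note (square-root compression) is an assumption for the example, not a value taken from the slides.

```python
def build_gamma_lut(gamma, levels=256):
    """Precompute the gamma curve once for every possible pixel value."""
    top = levels - 1
    return [round(top * (v / top) ** gamma) for v in range(levels)]

def apply_lut(img, lut):
    """Replace each pixel by its table entry; one lookup per pixel."""
    return [[lut[p] for p in row] for row in img]
```

For example, `apply_lut(img, build_gamma_lut(0.5))` brightens dark regions while leaving 0 and 255 unchanged. On the DSP the per-pixel lookups would be issued several at a time by the parallel LUT access.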
ORB - Feature Extraction
• ORB: Oriented FAST and Rotated BRIEF
• An efficient alternative to SIFT
• A pyramid is used for scale invariance
• Features are detected using FAST9, Harris, and non-max suppression
• Descriptors are based on BRIEF with normalized orientation
[Flow diagram: Input Image -> FAST9 -> Harris -> Non-Max Suppress -> Oriented BRIEF -> Descriptors list, with a Pyramid stage]
ORB - FAST9 Implementation
• Continuous arc of 9 or more pixels on the 16-pixel circle around the candidate pixel p
• All brighter than (p + Th)
or
• All darker than (p - Th)
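The arc criterion can be written directly as a scalar test. This is a plain reference sketch, not the vectorized DSP version: `CIRCLE` lists the offsets of the standard 16-pixel circle of radius 3, and the function names are chosen for the example.

```python
# Offsets (dx, dy) of the 16-pixel circle of radius 3, in circular order.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def longest_run(flags):
    """Longest circular run of True values in a list of booleans."""
    best = run = 0
    for f in flags + flags:  # doubled list handles wrap-around
        run = run + 1 if f else 0
        best = max(best, run)
    return min(best, len(flags))

def is_fast9_corner(img, x, y, th):
    """True if 9+ contiguous circle pixels are all brighter or all darker."""
    p = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    brighter = [v > p + th for v in ring]
    darker = [v < p - th for v in ring]
    return longest_run(brighter) >= 9 or longest_run(darker) >= 9
```

The pixel (x, y) must be at least 3 pixels away from every image border.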
ORB - FAST9 Implementation (2)
• Early exit is used to detect potential positions
• Long memory accesses of 32 bytes quickly load consecutive pixels
• Vector compare is used to compare the center of the corner to the borders
• Building a binary (bit) map with positions that need to be calculated
• Calculation of multiple positions in parallel
• Using different two-dimensional loads
• Vector predicates are used to selectively calculate only the locations that pass the threshold
• Using multi-way parallel lookup table access to decide on consecutive locations
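A scalar sketch of the early-exit idea: a cheap first pass marks candidate positions in a bitmap, and only those positions get the full test. The specific quick test used here (at least 2 of the 4 compass pixels must pass, which any arc of 9 on a 16-pixel circle guarantees) is an assumption for the example; the slides do not say which early-exit test the DSP code uses.

```python
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]
COMPASS = (0, 4, 8, 12)  # top, right, bottom, left

def first_pass(img, th):
    """Cheap test on 4 pixels; returns a 0/1 bitmap of candidate positions."""
    h, w = len(img), len(img[0])
    bitmap = [[0] * w for _ in range(h)]
    for y in range(3, h - 3):
        for x in range(3, w - 3):
            p = img[y][x]
            vals = [img[y + CIRCLE[i][1]][x + CIRCLE[i][0]] for i in COMPASS]
            bright = sum(v > p + th for v in vals)
            dark = sum(v < p - th for v in vals)
            bitmap[y][x] = int(bright >= 2 or dark >= 2)
    return bitmap

def full_test(img, x, y, th):
    """Full FAST9 test: contiguous arc of 9+ brighter or darker pixels."""
    p = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    for sign in (1, -1):
        flags = [sign * (v - p) > th for v in ring]
        run = best = 0
        for f in flags + flags:  # doubled list handles wrap-around
            run = run + 1 if f else 0
            best = max(best, run)
        if best >= 9:
            return True
    return False

def detect(img, th):
    """Second pass runs only on positions flagged by the first pass."""
    bitmap = first_pass(img, th)
    candidates = [(x, y) for y, row in enumerate(bitmap)
                  for x, flag in enumerate(row) if flag]
    return [(x, y) for x, y in candidates if full_test(img, x, y, th)]
```

On typical images most positions fail the first pass, so the expensive arc test runs on a small fraction of pixels.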
Copyright copy 2014 CEVA Inc 17
ORB mdash FAST9 Implementation
bull Continuous arc of 9 or more pixels
bull All much brighter then (p+Th)
or
bull All much darker then (p-Th)
Copyright copy 2014 CEVA Inc 18
bull Early exit is used to detect potential positions
bull Long memory access of 32 bytes using
bull quickly load consecutive pixels
bull Vector compare is used to compare the center of the corner to
the borders
bull Building a binary (bit) map with positions that need to be calculated
bull Calculation of multiple positions in parallel
bull Using different two dimensional loads
bull Vector predicates are used selectively calculate only the locations
that pass the threshold
bull Using multi-way parallel lookup table access to decide on
consecutive locations
ORB mdash FAST9 Implementation (2)
Copyright copy 2014 CEVA Inc 19
ORB mdash FAST9 Implementation (3)
SIMD Efficient Random access to memory
Fast First Pass
Candidate list
Input Image
Fast First Pass
Candidate list
Full FAST9 Second Pass
Output feature list
Input Image
BRIEF: Descriptor
• Oriented BRIEF uses the normalized orientation and calculates a 256-bit-wide descriptor
• The descriptor is calculated by comparing 256 predefined pairs of pixels in the surroundings of the feature center
• Each pair comparison contributes a single bit to the descriptor
• Orientation is normalized by rotating the image (or, in our implementation, the pair coordinates) according to the moment of the patch around the feature center
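The bullets above can be sketched as follows: rotate each predefined pair by the patch angle, compare the two pixels, and pack one bit per pair. This is a minimal illustration, not the CEVA code. The sampling pattern here is pseudo-random with a fixed seed purely as a stand-in (the ORB paper uses a learned pattern), and `make_pairs`, `oriented_brief`, and `hamming` are assumed names.

```python
import math
import random


def make_pairs(n=256, r=8, seed=0):
    """Stand-in sampling pattern: n pairs (x1, y1, x2, y2) with offsets in [-r, r]."""
    rng = random.Random(seed)
    return [tuple(rng.randint(-r, r) for _ in range(4)) for _ in range(n)]


def oriented_brief(img, cx, cy, pairs, theta):
    """Pack one bit per pair into an integer, rotating pair coordinates by theta.

    The feature must sit at least ceil(r * sqrt(2)) pixels from every border,
    because rotation can push a corner offset further out than r.
    """
    c, s = math.cos(theta), math.sin(theta)
    desc = 0
    for i, (x1, y1, x2, y2) in enumerate(pairs):
        ax, ay = round(c * x1 - s * y1), round(s * x1 + c * y1)
        bx, by = round(c * x2 - s * y2), round(s * x2 + c * y2)
        if img[cy + ay][cx + ax] < img[cy + by][cx + bx]:
            desc |= 1 << i  # this pair contributes bit i
    return desc


def hamming(a, b):
    """Number of differing bits, the usual distance between binary descriptors."""
    return bin(a ^ b).count("1")
```

Rotating the coordinates instead of the image means the patch is never resampled; only the addresses of the compared pixels change.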
BRIEF DSP Implementation
• Calculate the patch orientation (patch moment)
• Use the LUT capability to read the pixel addresses
• Load the pixel pairs with random access to memory
• Use SIMD capability to efficiently calculate the descriptor
[Figure label: Orientation]
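The first two steps can be illustrated with an intensity-centroid angle and a table of pre-rotated pair coordinates. This is a sketch under assumptions: the 30 angle bins (12-degree steps) follow the description in the ORB paper rather than anything stated on the slide, a square patch is used for simplicity, and all function names are hypothetical.

```python
import math


def intensity_centroid_angle(img, cx, cy, r):
    """Patch orientation from first-order moments: atan2(m01, m10)."""
    m10 = m01 = 0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            v = img[cy + dy][cx + dx]
            m10 += dx * v
            m01 += dy * v
    return math.atan2(m01, m10)


def build_rotation_lut(pairs, n_bins=30):
    """Precompute the pair coordinates rotated to each quantized angle."""
    lut = []
    for b in range(n_bins):
        theta = 2 * math.pi * b / n_bins
        c, s = math.cos(theta), math.sin(theta)
        lut.append([(round(c * x1 - s * y1), round(s * x1 + c * y1),
                     round(c * x2 - s * y2), round(s * x2 + c * y2))
                    for x1, y1, x2, y2 in pairs])
    return lut


def angle_to_bin(theta, n_bins=30):
    """Map an angle in radians to the nearest table row."""
    return round(theta / (2 * math.pi) * n_bins) % n_bins
```

With the table in place, describing a feature reduces to one row lookup followed by pixel loads at the stored offsets, which is the access pattern the slide attributes to LUT and random-access memory support.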
SURF: Speeded Up Robust Features
• Inspired by the SIFT descriptor, but much faster
• Main modules
• Detect feature locations according to each pixel's feature response
• Calculate the feature descriptor
[Diagram: Integral Sum Image -> Feature Response -> Find Local Maximum -> Choose Best Features -> Feature Descriptor]
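The "Find Local Maximum" and "Choose Best Features" stages can be pictured with a simplified single-scale sketch. SURF itself searches for maxima across neighbouring scales as well; the version below only checks the 3x3 spatial neighbourhood, and `local_maxima` and `choose_best` are names introduced for illustration.

```python
import heapq


def local_maxima(resp, threshold):
    """Return (value, x, y) for pixels strictly greater than all 8 neighbours."""
    h, w = len(resp), len(resp[0])
    peaks = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            v = resp[y][x]
            if v <= threshold:
                continue  # too weak to be a feature
            neighbours = [resp[y + dy][x + dx]
                          for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                          if (dx, dy) != (0, 0)]
            if all(v > n for n in neighbours):
                peaks.append((v, x, y))
    return peaks


def choose_best(peaks, n):
    """Keep the n strongest responses, strongest first."""
    return heapq.nlargest(n, peaks)
```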
SURF: Integral Sum Image
• Two-pass approach
• Horizontal and vertical data access (full bandwidth)
• Load and transpose data in a single instruction
[Figure: two panels, each showing data moving from vector memory to a vector register, marked "+" and "T"]
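A two-pass integral image can be sketched as a row-wise running sum, a transpose, a second row-wise running sum, and a transpose back. The transposes here are explicit list operations standing in for the load-and-transpose instruction mentioned on the slide; the function names are assumptions.

```python
from itertools import accumulate


def row_prefix_sums(m):
    """Running sum along every row."""
    return [list(accumulate(row)) for row in m]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def integral_image(img):
    """ii[y][x] = sum of img over all rows <= y and columns <= x."""
    horizontal = row_prefix_sums(img)                    # pass 1: along rows
    vertical = row_prefix_sums(transpose(horizontal))    # pass 2: along columns
    return transpose(vertical)
```

For example, `integral_image([[1, 2], [3, 4]])` gives `[[1, 3], [4, 10]]`. Running both passes as row operations keeps every access contiguous, which is the point of transposing on load.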
SURF: Feature Response
Flexible & Powerful Filter Instruction
• Calculate the feature response using a box filter
• Small box: A-B+D-C
• Large box: E-F+H-G
• Total response: (E-F+H-G) - 3 x (A-B+D-C)
• Can be executed in a single cycle
• Two operations in a single cycle
Res = (E-F) + (H-G)
Res += (A-B) x (-3) + (D-C) x (-3)
[Figure: box filter with weights 1 and -2; small box with corners A, B, C, D; large box with corners E, F, G, H]
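The corner arithmetic can be checked with a small scalar sketch. It assumes A and B are the top-left and top-right corners and C and D the bottom-left and bottom-right corners of a box in a zero-padded integral image, which is the usual reading of A-B+D-C; the box coordinates and function names are illustrative, not taken from the slides.

```python
def padded_integral(img):
    """Integral image with an extra zero row and column at the top and left."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_acc = 0
        for x in range(w):
            row_acc += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_acc
    return ii


def box_sum(ii, x0, y0, x1, y1):
    """Sum of img rows y0..y1-1 and columns x0..x1-1 from four lookups: A - B + D - C."""
    a, b = ii[y0][x0], ii[y0][x1]
    c, d = ii[y1][x0], ii[y1][x1]
    return a - b + d - c


def feature_response(ii, large, small):
    """Large box weighted +1, small box weighted -3 (net -2 where they overlap)."""
    res = box_sum(ii, *large)
    res += -3 * box_sum(ii, *small)
    return res
```

With the small box covering the middle third of the large one, the net weights are 1, -2, 1, so a flat image yields a response of zero.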
SURF: Feature Descriptor
• Handles multiple features in parallel
• Memory access of several different features in a single instruction
• Calculates the integral sum
• Vertical and horizontal memory access
• Calculates the gradient using a box filter: result = (A-B) + (D-C)
[Figure: parallel load of Feature Data 0, Feature Data 1, ..., Feature Data 7]
Feature Extraction Summary Table
Columns: Algorithm; Memory Access (2-Dimensional Access, LUT Support, Parallel Memory Access); Execution (Dedicated Instructions, Vectorized Conditional Flow)
Rows: HOG - Bilinear; HOG - Gamma; FAST9 - Detector; BRIEF - Descriptor; SURF - Int Sum; SURF - Feature Response; SURF - Descriptor
[Per-row check marks are not preserved in the transcript]
Conclusion: Critical Ingredients for Efficient CV Processor
CV Processor
• Efficient filter processing
• Good conditional code execution
• Fast & flexible random access to memory
• Good bit and byte data manipulation
• Wide memory bandwidth
• Large internal memory
CEVA-MM3101 Imaging & Computer Vision IP platform includes all of the above features (and more…)
CEVA-CV library already includes various feature extraction algorithms
Enables a shorter development cycle, efficiency, and algorithm flexibility
Resources for Further Investigation
1. ORB: an efficient alternative to SIFT or SURF. Rublee E., Rabaud V., Konolige K., Bradski G. Computer Vision (ICCV), 2011 IEEE
2. Histograms of oriented gradients for human detection. Dalal N., Triggs B. Computer Vision and Pattern Recognition, CVPR 2005
3. SURF: Speeded Up Robust Features. Herbert Bay, Tinne Tuytelaars, Luc Van Gool. Computer Vision - ECCV 2006
4. BRIEF: Binary Robust Independent Elementary Features. Michael Calonder, Vincent Lepetit, Christoph Strecha, Pascal Fua. Computer Vision - ECCV 2010
5. Distinctive Image Features from Scale-Invariant Keypoints. David G. Lowe. International Journal of Computer Vision, Volume 60, Issue 2, pp. 91-110
You are all invited to the CEVA demo table. Thank You