"challenges in object detection on embedded devices," a presentation from ceva

Challenges in Object Detection on Embedded Devices
Adar Paz
May 29, 2014
Copyright © 2014, CEVA Inc.

Upload: embedded-vision-alliance

Post on 14-Aug-2015



CEVA by Numbers

• > 220 licensees & 330 licensing agreements
• > 40% worldwide handset market share in 2013 (Strategy Analytics, December 2013)
• 5 billion CEVA-powered devices shipped worldwide to date
• #1 in licensable computer vision and imaging processors
• #1 DSP licensor, with dominant market share (> 3x that of any other DSP IP vendor)
• #1 DSP architecture in handsets: more than 900M in 2013
• #1 DSP core in audio: more than 3 billion devices shipped to date

Feature Detection Use in Computer Vision

• Corner
• Blob
• Edge

Computer Vision Understanding Content

Base Technologies
• Feature Extraction & Description
• Feature Matching and Tracking

Applications
• Gesture Control
• Augmented Reality
• Object Detection / Recognition
• Depth Map
• Image Stitching
• Face Detection and Recognition
• Motion Detection
• Emotion Detection
• Age/Gender Detection
• Segmentation
• Irregular Behavior Detection
• Forward Collision Warning (FCW)
• Lane Departure Warning (LDW)
• Pedestrian Detection (PD)
• Traffic Sign Recognition (TSR)
• Driver Fatigue Warning
• Surround View Monitor
• Always-On

Computer Vision Understanding Content: Mobile

[Figure: the base technologies and applications map from the previous slide, shown for the mobile market]

Computer Vision Understanding Content: Wearables

[Figure: the same base technologies and applications map, shown for the wearables market]

Computer Vision Understanding Content: Surveillance

[Figure: the same base technologies and applications map, shown for the surveillance market]

Computer Vision Understanding Content: Automotive

[Figure: the same base technologies and applications map, shown for the automotive market]

Why Use a Programmable CV Processor?

[Figure: development timeline comparing two approaches]
• HW solution: CV algorithm design, RTL design, production, chip ready, distribution
• Programmable solution: RTL design, production, chip ready, distribution, plus CV algorithm design and continued algorithm development

Dedicated CV Processor or Mobile GPU?

[Figure: bar chart of CEVA-MM3101 performance gain (result per cycle), axis 0 to 30, in three groups: MM3101 vs. Mobile GPU A, MM3101 vs. Mobile GPU B, MM3101 vs. Mobile GPU C. Bars cover a typical set of representative CV & imaging algorithms: sobel, corr3x3, corrsep5x5, corrsep11x11, Histogram, HSV2RGB, RGB2HSV, Max3x3, Min, Median3x3, CornerHarris, Average]

• MM3101 shows an average 13x performance boost over leading mobile GPUs
• Power factor: > 50x more efficient

Typical Feature Treatment Flow

[Figure: flow diagram with the following groups of blocks]
• Sobel, Gaussian, build image pyramids, integral sum, gamma normalization
• HOG, BRIEF, SIFT, SURF, FREAK, LBP, HAAR
• SVM, decision tree
• Harris, FAST, SIFT, SURF

Flow Chart: HOG Descriptor

[Figure: input image -> bilinear scaling -> scaled images (scale 1 to scale 9) -> gamma normalization -> gradient calculation -> descriptor calculation]

• The HOG algorithm is based on the Dalal & Triggs paper (2005)
• Commonly used for object detection, especially pedestrian detection
• Reference code: OpenCV 2.4.3
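The gradient and descriptor stages of the flow can be illustrated with a single HOG cell. This is a minimal sketch, not the reference code: the function name `cell_histogram`, the 8x8 cell, and the nearest-bin voting are assumptions chosen for brevity (Dalal & Triggs describe a centered [-1, 0, 1] gradient filter and 9 unsigned orientation bins, with interpolated voting and block normalization that are omitted here).

```python
import math

def cell_histogram(img, x0, y0, cell=8, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations for one cell.

    Assumption: every pixel of the cell has a valid neighbour on all four sides.
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(y0, y0 + cell):
        for x in range(x0, x0 + cell):
            gx = img[y][x + 1] - img[y][x - 1]  # centered [-1, 0, 1] filter
            gy = img[y + 1][x] - img[y - 1][x]
            mag = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned, 0..180
            # Simplification: vote into the nearest lower bin only.
            hist[int(angle // bin_width) % bins] += mag
    return hist
```

A horizontal intensity ramp puts all of its gradient energy into the first bin, and a vertical ramp into the middle bin.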

HOG: Bilinear Scaling

Bilinear Interpolation
• Step 1: Vertical interpolation
• Step 2: Horizontal interpolation
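The two-step scheme can be sketched in plain Python. This is an illustration under stated assumptions, not the CEVA implementation: the name `bilinear_resize` and the corner-aligned coordinate mapping are choices made for the example, and other libraries may map output pixels to source positions differently.

```python
def bilinear_resize(img, out_h, out_w):
    """Resize a 2-D list of numbers with separable bilinear interpolation."""
    in_h, in_w = len(img), len(img[0])

    def taps(out_n, in_n):
        # For each output index: (lower source index, upper source index, upper weight).
        # Assumption: corner-aligned mapping (first and last samples coincide).
        result = []
        for o in range(out_n):
            pos = o * (in_n - 1) / (out_n - 1) if out_n > 1 else 0.0
            lo = min(int(pos), in_n - 1)
            hi = min(lo + 1, in_n - 1)
            result.append((lo, hi, pos - lo))
        return result

    # Step 1: vertical interpolation (blend pairs of source rows).
    rows = [
        [(1 - w) * img[lo][x] + w * img[hi][x] for x in range(in_w)]
        for lo, hi, w in taps(out_h, in_h)
    ]
    # Step 2: horizontal interpolation (blend pairs of columns in each new row).
    col_taps = taps(out_w, in_w)
    return [[(1 - w) * r[lo] + w * r[hi] for lo, hi, w in col_taps] for r in rows]
```

Splitting the work into a vertical and a horizontal pass means each pass touches only two source samples per output value, which is what makes the row-at-a-time vector processing on the next slide possible.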

HOG: Bilinear Scaling: Implementation

1. Load 2 x 8 pixels in a single cycle
2. 16 filter operations in a single cycle
3. Transposed store (4x4) in a single cycle
4. Perform the load and filter again (steps 1 and 2)
5. Transposed store in a single cycle (step 3)

[Figure: data movement between memory and vector registers vA, vB, vC, vD at each stage, including the transpose]

HOG: Gamma Normalization

• Gamma function P(γ)
• Implemented using a look-up table (LUT): parallel access to local memory in a single cycle
• Loading of multiple gamma values in a single cycle
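A look-up table turns the per-pixel power function into a single indexed read. The sketch below is a scalar illustration only; the names `build_gamma_lut` and `apply_lut`, the 8-bit pixel range, and the example value gamma = 0.5 are assumptions, and the parallel single-cycle access described on the slide is a hardware feature that plain Python does not model.

```python
def build_gamma_lut(gamma, levels=256):
    """Precompute the gamma-corrected output for every possible pixel value."""
    top = levels - 1
    return [round(top * (v / top) ** gamma) for v in range(levels)]

def apply_lut(pixels, lut):
    """Gamma-normalize a row of pixels with one table read per pixel."""
    return [lut[p] for p in pixels]

# Example: square-root compression of three 8-bit pixels (gamma is an example value).
lut = build_gamma_lut(0.5)
row = apply_lut([0, 64, 255], lut)
```

The table is built once per gamma value, so the cost of the power function is paid 256 times rather than once per pixel.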

ORB: Feature Extraction

• ORB: Oriented FAST and Rotated BRIEF
• An efficient alternative to SIFT
• A pyramid is used for scale invariance
• Features are detected using FAST9, Harris, and non-max suppression
• Descriptors are based on BRIEF with normalized orientation

[Figure: input image -> pyramid -> FAST9 -> Harris -> non-max suppression -> oriented BRIEF -> descriptors list]
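The non-max suppression stage can be sketched as a 3x3 local-maximum test on a corner score map. This is a minimal sketch; the function name `non_max_suppress`, the dense score-map input, and the 3x3 neighbourhood are assumptions, since the slide does not say how the stage is organized.

```python
def non_max_suppress(score, min_score=0):
    """Return (x, y) positions whose score exceeds min_score and all 8 neighbours."""
    h, w = len(score), len(score[0])
    keep = []
    for y in range(1, h - 1):          # border pixels are skipped
        for x in range(1, w - 1):
            s = score[y][x]
            if s <= min_score:
                continue
            neighbours = [score[y + dy][x + dx]
                          for dy in (-1, 0, 1) for dx in (-1, 0, 1) if dx or dy]
            if all(s > n for n in neighbours):
                keep.append((x, y))
    return keep
```

In an ORB-style pipeline the score would typically be the Harris response computed at the FAST9 candidates, so that only the strongest corner in each small neighbourhood reaches the descriptor stage.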

ORB: FAST9 Implementation

• Continuous arc of 9 or more pixels
• All much brighter than (p + Th)
or
• All much darker than (p - Th)
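The arc criterion can be written as a scalar test on the 16-pixel circle of radius 3 around the candidate pixel p. This is a reference-style sketch rather than the vectorized implementation; the names `CIRCLE`, `longest_arc`, and `is_fast9_corner` are assumptions for the example.

```python
# Offsets (dx, dy) of the 16-pixel circle of radius 3, in clockwise order.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def longest_arc(flags):
    """Length of the longest circular run of True values."""
    if all(flags):
        return len(flags)
    best = run = 0
    for f in flags + flags:  # doubling the list handles runs that wrap around
        run = run + 1 if f else 0
        best = max(best, run)
    return best

def is_fast9_corner(img, x, y, th):
    """True if 9 or more contiguous circle pixels are all brighter or all darker."""
    p = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    brighter = [v > p + th for v in ring]
    darker = [v < p - th for v in ring]
    return longest_arc(brighter) >= 9 or longest_arc(darker) >= 9
```

The caller is expected to keep (x, y) at least 3 pixels away from the image border.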

ORB: FAST9 Implementation (2)

• Early exit is used to detect potential positions
• Long memory accesses of 32 bytes are used to quickly load consecutive pixels
• Vector compare is used to compare the center of the corner to the borders
• A binary (bit) map is built with the positions that need to be calculated
• Multiple positions are calculated in parallel
• Different two-dimensional loads are used
• Vector predicates are used to selectively calculate only the locations that pass the threshold
• Multi-way parallel lookup table access is used to decide on consecutive locations

ORB: FAST9 Implementation (3)

[Figure: two-pass flow: input image -> fast first pass -> candidate list -> full FAST9 second pass -> output feature list]

• SIMD
• Efficient random access to memory
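The two-pass structure can be sketched as a cheap rejection test followed by the full arc test on the survivors. The first-pass rule used here is an assumption, since the slides do not spell out the early-exit condition: any 9-pixel arc on the 16-pixel circle covers at least 2 of the 4 compass pixels, so a pixel with fewer than 2 bright and fewer than 2 dark compass pixels cannot be a corner. All function names are hypothetical.

```python
# Offsets (dx, dy) of the 16-pixel circle of radius 3, in clockwise order.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def first_pass(img, th):
    """Cheap early-exit test (assumed rule): keep pixels that could still be corners."""
    h, w = len(img), len(img[0])
    candidates = []
    for y in range(3, h - 3):
        for x in range(3, w - 3):
            p = img[y][x]
            compass = (img[y - 3][x], img[y][x + 3], img[y + 3][x], img[y][x - 3])
            bright = sum(v > p + th for v in compass)
            dark = sum(v < p - th for v in compass)
            if bright >= 2 or dark >= 2:
                candidates.append((x, y))
    return candidates

def full_test(img, x, y, th):
    """Full FAST9 test: 9 or more contiguous circle pixels brighter or darker than p."""
    p = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    for sign in (1, -1):  # 1 checks the brighter arc, -1 the darker arc
        flags = [sign * (v - p) > th for v in ring]
        run = 0
        for f in flags + flags[:8]:  # 8 extra entries cover wrap-around arcs
            run = run + 1 if f else 0
            if run >= 9:
                return True
    return False

def detect(img, th):
    """Second pass: run the full test only on first-pass candidates."""
    return [(x, y) for x, y in first_pass(img, th) if full_test(img, x, y, th)]
```

On typical images most pixels fail the first pass, so the expensive 16-pixel test, with its scattered memory reads, runs on a short candidate list only.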

BRIEF: Descriptor

• Oriented BRIEF uses the normalized orientation and calculates a 256-bit-wide descriptor
• The descriptor is calculated by comparing 256 predefined pairs of pixels in the surroundings of the feature center
• Each pair comparison contributes a single bit to the descriptor
• Orientation is normalized by rotating the image (or the pair coordinates, in our implementation) according to the moment of the feature center
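The bit-per-comparison construction, with the pair coordinates rotated instead of the image, can be sketched as follows. The sampling pattern here is random and purely illustrative (real implementations ship a fixed, pre-trained pattern), smoothing of the patch is omitted, and the names `make_pairs`, `rotate`, `brief`, and `hamming` are assumptions.

```python
import math
import random

def make_pairs(n=256, radius=15, seed=0):
    """Hypothetical sampling pattern: n random point pairs inside a square patch."""
    rng = random.Random(seed)

    def point():
        return (rng.randint(-radius, radius), rng.randint(-radius, radius))

    return [(point(), point()) for _ in range(n)]

def rotate(pt, angle):
    """Rotate an (x, y) offset by angle (radians) and round to the pixel grid."""
    c, s = math.cos(angle), math.sin(angle)
    return (round(c * pt[0] - s * pt[1]), round(s * pt[0] + c * pt[1]))

def brief(img, cx, cy, pairs, angle=0.0):
    """Pack one comparison bit per pair into an integer descriptor."""
    desc = 0
    for i, (a, b) in enumerate(pairs):
        ax, ay = rotate(a, angle)  # rotate the pair coordinates, not the image
        bx, by = rotate(b, angle)
        if img[cy + ay][cx + ax] < img[cy + by][cx + bx]:
            desc |= 1 << i
    return desc

def hamming(d1, d2):
    """Number of differing bits between two descriptors."""
    return bin(d1 ^ d2).count("1")
```

Because rotated offsets can reach about 1.42 times the pattern radius, the feature center must sit at least that far from the image border. Descriptors are compared with `hamming`, which is why a binary descriptor is cheap to match.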

BRIEF: DSP Implementation

• Calculate the patch orientation (patch moment)
• Using the LUT capability, read the pixel addresses
• Load the pixel pairs with random access to memory
• Use the SIMD capability to efficiently calculate the descriptor

[Figure: patch orientation]
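The patch-moment step can be sketched with the intensity centroid used in the ORB paper, where the orientation is atan2(m01, m10) of the first-order moments around the feature center. The function name `patch_orientation` and the circular patch are assumptions for the example.

```python
import math

def patch_orientation(img, cx, cy, radius):
    """Angle (radians) from the patch center to its intensity centroid."""
    m10 = m01 = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dx * dx + dy * dy <= radius * radius:  # assumed circular patch
                v = img[cy + dy][cx + dx]
                m10 += dx * v  # first-order moment in x
                m01 += dy * v  # first-order moment in y
    return math.atan2(m01, m10)
```

The returned angle is what the descriptor stage would use to rotate its pixel-pair coordinates, for example by indexing a table of pre-rotated patterns.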

SURF: Speeded Up Robust Features

• Inspired by the SIFT descriptor, but much faster
• Main modules
• Detect feature locations according to the pixels' response to the feature
• Calculate the feature descriptor

[Figure: integral sum image -> feature response -> find local maximum -> choose best features -> feature descriptor]

SURF: Integral Sum Image

• Two-pass approach
• Horizontal and vertical data access (full bandwidth)
• Load and transpose data in a single instruction

[Figure: in each pass, data moves from vector memory to vector registers, with an add (+) and a transpose (T) step]
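The two-pass approach can be sketched as a running sum along rows followed by a running sum along columns, with the column pass expressed as transpose, row sums, transpose back. This mirrors the slide only loosely; the names `integral_image` and `box_sum` are assumptions.

```python
from itertools import accumulate

def integral_image(img):
    """ii[y][x] = sum of img over rows 0..y and columns 0..x (inclusive)."""
    # Pass 1: running sum along each row.
    rows = [list(accumulate(r)) for r in img]
    # Pass 2: running sum down each column, done on the transposed data.
    cols = [list(accumulate(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def box_sum(ii, x0, y0, x1, y1):
    """Sum of img over rows y0..y1 and columns x0..x1 (inclusive) from four reads."""
    total = ii[y1][x1]
    if y0 > 0:
        total -= ii[y0 - 1][x1]
    if x0 > 0:
        total -= ii[y1][x0 - 1]
    if x0 > 0 and y0 > 0:
        total += ii[y0 - 1][x0 - 1]
    return total
```

Once the integral image exists, the sum over any rectangle costs four reads regardless of its size, which is what the feature-response stage relies on.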

SURF: Feature Response

• Calculate the feature response using a box filter
• Small box: A-B+D-C
• Large box: E-F+H-G
• Total response: E-F+H-G - 3x(A-B+D-C)
• Can be executed in a single cycle
• Two operations in a single cycle:
Res = (E-F) + (H-G)
Res += (A-B)*(-3) + (D-C)*(-3)

Flexible & powerful filter instruction

[Figure: box filter with lobe weights 1, -2, 1; the small box has corners A, B (top) and C, D (bottom), the large box has corners E, F (top) and G, H (bottom)]
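The corner arithmetic can be checked with a small sketch: the large box minus three times the small (middle) box equals a 1, -2, 1 weighting of three side-by-side lobes. The lobe geometry, the zero-padded integral image, and the names `padded_integral`, `box`, and `response_1_m2_1` are assumptions for the example; corner letters follow the slide's top-left, top-right, bottom-left, bottom-right layout.

```python
def padded_integral(img):
    """Integral image with an extra zero row and column: ii[y][x] = sum of img[:y][:x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x]
    return ii

def box(ii, x0, y0, x1, y1):
    """Sum over rows y0..y1-1 and columns x0..x1-1 from four corner reads."""
    a, b = ii[y0][x0], ii[y0][x1]  # top-left, top-right
    c, d = ii[y1][x0], ii[y1][x1]  # bottom-left, bottom-right
    return a - b + d - c

def response_1_m2_1(ii, x, y, lobe_w, lobe_h):
    """Three lobes weighted +1, -2, +1, with the top-left corner at (x, y)."""
    large = box(ii, x, y, x + 3 * lobe_w, y + lobe_h)           # E-F+H-G
    small = box(ii, x + lobe_w, y, x + 2 * lobe_w, y + lobe_h)  # A-B+D-C
    return large - 3 * small
```

Eight integral-image reads and a handful of additions give the filter response for any lobe size, which is why the slide can fold the computation into two filter operations.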

SURF: Feature Descriptor

• Handles multiple features in parallel
• Memory access of several different features in a single instruction
• Calculates the integral sum: vertical and horizontal memory access
• Calculates the gradient using a box filter: result = (A-B) + (D-C)

[Figure: parallel load of Feature Data 0, Feature Data 1, ... Feature Data 7]

Feature Extraction Summary Table

[Table: one row per algorithm, columns grouped as follows]
• Memory access columns: 2-dimensional access, LUT support, parallel memory access
• Execution columns: dedicated instructions, vectorized conditional flow
• Rows: HOG - Bilinear; HOG - Gamma; FAST9 - Detector; BRIEF - Descriptor; SURF - Int Sum; SURF - Feature Response; SURF - Descriptor

Conclusion: Critical Ingredients for Efficient CV Processor

CV Processor
• Efficient filter processing
• Good conditional code execution
• Fast & flexible random access to memory
• Good bit and byte data manipulation
• Wide memory bandwidth
• Large internal memory

• The CEVA-MM3101 Imaging & Computer Vision IP platform includes all of the above features (and more...)
• The CEVA-CV library already includes various feature extraction algorithms
• Enables a shorter development cycle, efficiency, and algorithm flexibility

Resources for Further Investigation

1. ORB: An Efficient Alternative to SIFT or SURF. E. Rublee, V. Rabaud, K. Konolige, G. Bradski. Computer Vision (ICCV), 2011 IEEE.
2. Histograms of Oriented Gradients for Human Detection. N. Dalal, B. Triggs. Computer Vision and Pattern Recognition, CVPR 2005.
3. SURF: Speeded Up Robust Features. Herbert Bay, Tinne Tuytelaars, Luc Van Gool. Computer Vision - ECCV 2006.
4. BRIEF: Binary Robust Independent Elementary Features. Michael Calonder, Vincent Lepetit, Christoph Strecha, Pascal Fua. Computer Vision - ECCV 2010.
5. Distinctive Image Features from Scale-Invariant Keypoints. David G. Lowe. International Journal of Computer Vision, Volume 60, Issue 2, pp. 91-110.

You are all invited to the CEVA demo table. Thank you.

Copyright copy 2014 CEVA Inc 2

CEVA by Numbers

gt 220 licensees amp 330 licensing agreements

gt40 Worldwide handset market

share in 2013 (Strategy

Analytics December 2013)

5 Billion CEVA-powered devices -

shipped worldwide to date

1 in licensable computer vision

and imaging Processors

1 DSP licensor dominant

market share (gt3X of any

other DSP IP vendor)

1 DSP architecture in handsets

ndash more than 900m in 2013

1 DSP core in audio ndash more than

3 billion devices shipped to

date

Copyright copy 2014 CEVA Inc 3

Feature Detection Use in Computer Vision

Corner Blob

Edge

Copyright copy 2014 CEVA Inc 4

Gesture

Control

Base Technologies

bull Feature Extraction

amp Description

bull Feature Matching

and Tracking

Applications

Augmented

Reality

Object Detection

Recognition

Depth Map

Image Stitching

Face Detection and

Recognition

Motion Detection

Emotion Detection

AgeGender

Detection

Segmentation

Irregular Behavior

Detection

Forward Collision

Warning (FCW)

Lane Departure

Warning (LDW)

Pedestrian

Detection (PD)

Traffic Sign Rec

(TSR)

Driver Fatigue

Warning

Surround View

Monitor

Always-On

Computer Vision Understanding

Content

Copyright copy 2014 CEVA Inc 5

Mobile

Gesture

Control

Applications

Augmented

Reality

Object Detection

Recognition

Depth Map

Image Stitching

Face Detection and

Recognition

Motion Detection

Emotion Detection

AgeGender

Detection

Segmentation

Irregular Behavior

Detection

Forward Collision

Warning (FCW)

Lane Departure

Warning (LDW)

Pedestrian

Detection (PD)

Traffic Sign Rec

(TSR)

Driver Fatigue

Warning

Surround View

Monitor

Always-On

Base Technologies

bull Feature Extraction

amp Description

bull Feature Matching

and Tracking

Computer Vision Understanding

Content

Copyright copy 2014 CEVA Inc 6

Wearables

Gesture

Control

Base Technologies

bull Feature Extraction

amp Description

bull Feature Matching

and Tracking

Applications

Augmented

Reality

Object Detection

Recognition

Depth Map

Image Stitching

Face Detection and

Recognition

Motion Detection

Emotion Detection

AgeGender

Detection

Segmentation

Irregular Behavior

Detection

Forward Collision

Warning (FCW)

Lane Departure

Warning (LDW)

Pedestrian

Detection (PD)

Traffic Sign Rec

(TSR)

Driver Fatigue

Warning

Surround View

Monitor

Always-On

Computer Vision Understanding

Content

Copyright copy 2014 CEVA Inc 7

Surveillance

Gesture

Control

Base Technologies

bull Feature Extraction

amp Description

bull Feature Matching

and Tracking

Applications

Augmented

Reality

Object Detection

Recognition

Depth Map

Image Stitching

Face Detection and

Recognition

Motion Detection

Emotion Detection

Segmentation

Irregular Behavior

Detection

Forward Collision

Warning (FCW)

Lane Departure

Warning (LDW)

Pedestrian

Detection (PD)

Traffic Sign Rec

(TSR)

Driver Fatigue

Warning

Surround View

Monitor

Always-On

AgeGender

Detection

Computer Vision Understanding

Content

Copyright copy 2014 CEVA Inc 8

Automotive

Gesture

Control

Base Technologies

bull Feature Extraction

amp Description

bull Feature Matching

and Tracking

Applications

Augmented

Reality

Object Detection

Recognition

Depth Map

Image Stitching

Face Detection and

Recognition

Motion Detection

Emotion Detection

AgeGender

Detection

Segmentation

Irregular Behavior

Detection

Forward Collision

Warning (FCW)

Lane Departure

Warning (LDW)

Pedestrian

Detection (PD)

Traffic Sign Rec

(TSR)

Driver Fatigue

Warning

Surround View

Monitor

Always-On

Computer Vision Understanding Content

Copyright copy 2014 CEVA Inc 9

RTL

design Production Chip

ready Distribution

Why Use a Programmable CV Processor

Development

Time Line

RTL

design

CV

algorithm

design

Production

HW

So

lution

Programmable CV algorithm design Continued algorithms

development

Pro

gra

mm

able

So

lution

Chip

ready Distribution

Copyright copy 2014 CEVA Inc 10

Dedicated CV Processor or Mobile GPU C

EV

A-M

M3

10

1 P

erf

orm

an

ce

gain

(re

su

lt p

er

cycle

)

0

5

10

15

20

25

30

MM3101 Vs Mobile GPU A MM3101 Vs Mobile GPU B MM3101 Vs Mobile GPU C

sobel

corr3x3

corrsep5x5

corrsep11x11

Histogram

HSV2RGB

RGB2HSV

Max3x3

Min

Median3x3

CornerHarris

Average

Typical set of

representative

CV amp imaging

algorithms

MM3101 shows average 13x performance boost over leading Mobile GPUs

Power factor gt50x more efficient

Copyright copy 2014 CEVA Inc 11

Typical Feature Treatment Flow

Sobel

Gaussian

Build image

pyramids

Integral sum

Gamma

normalization

HOG

BRIEF

SIFT

SURF

FREAK

LBP

HAAR

SVM

Decision

tree

Harris

FAST

SIFT

SURF

Copyright copy 2014 CEVA Inc 12

Flow Chart mdash HOG Descriptor

Input image

Scaled image

Scale 1

Scaled image

Gamma

Normalization

Gradient

Calculation

Descriptor

Calculation

Bilinear

Scaling

HOG algorithm is based on Dalal amp Triggs paper (2005)

Common use is object detection especially pedestrian detection

Reference Code ndash OpenCV 243

Scale 9

Copyright copy 2014 CEVA Inc 13

HOG mdash Bilinear Scaling

Bilinear Interpolation

X

X

X

Step 1 Vertical Interpolation

Step 2 Horizontal Interpolation

Copyright copy 2014 CEVA Inc 14

1 Load 2 X 8 pixels in a single cycle

2 16 filter operations in a single cycle

3 Transposed Store (4x4) in single cycle

4 Perform the load and filter again (12)

5 Transposed Store in single cycle (3)

HOG mdash Bilinear Scaling mdash Implementation

Memory

vA

vB

vC

vD

Vector Registers

Memory

vA

vB

vC

vD

Vector Registers

Memory

vA

vB

vC

vD

Vector Registers

transpose

Copyright copy 2014 CEVA Inc 15

bull Gamma Function P(γ)

bull Implemented using lsquoLook Up Tablersquo (LUT) ndash parallel access to local

memory in single cycle

bull Loading of multiple gamma values in a single cycle

HOG mdash Gamma Normalization

Copyright copy 2014 CEVA Inc 16

bull ORB mdash Oriented FAST and Rotated BRIEF

bull An efficient alternative to SIFT

bull Pyramid is used for scale-invariance

bull Features are detected using FAST9 Harris and non-max-suppress

bull Descriptors are based on BRIEF with normalized orientation

ORB mdash Feature Extraction

Input

Image Fast9 Harris

Non-Max-

Suppress

Oriented

BRIEF Descriptors

list Pyramid

Copyright copy 2014 CEVA Inc 17

ORB mdash FAST9 Implementation

bull Continuous arc of 9 or more pixels

bull All much brighter then (p+Th)

or

bull All much darker then (p-Th)

Copyright copy 2014 CEVA Inc 18

bull Early exit is used to detect potential positions

bull Long memory access of 32 bytes using

bull quickly load consecutive pixels

bull Vector compare is used to compare the center of the corner to

the borders

bull Building a binary (bit) map with positions that need to be calculated

bull Calculation of multiple positions in parallel

bull Using different two dimensional loads

bull Vector predicates are used selectively calculate only the locations

that pass the threshold

bull Using multi-way parallel lookup table access to decide on

consecutive locations

ORB mdash FAST9 Implementation (2)

Copyright copy 2014 CEVA Inc 19

ORB mdash FAST9 Implementation (3)

SIMD Efficient Random access to memory

Fast First Pass

Candidate list

Input Image

Fast First Pass

Candidate list

Full FAST9 Second Pass

Output feature list

Input Image

Copyright copy 2014 CEVA Inc 20

bull Oriented brief uses the normalized orientation and calculates a 256 bit wide descriptor

bull The descriptor is calculated by comparison of pre-defined 256 pairs of pixels in the surrounding of the feature center

bull Each pair comparison donates a single bit in the descriptor

bull Orientation is normalized by rotating the image (or pairs coordinates in our implementation) according to the moment of the feature center

BRIEF mdash Descriptor

Copyright copy 2014 CEVA Inc 21

BRIEF DSP Implementation

Calculate patch orientation (Patch Moment)

Utilizing LUT capability read the pixels address

Load with random access to memory the pixel couples

Use SIMD capability to efficiently calculate descriptor

Orientation

Copyright copy 2014 CEVA Inc 22

bull Inspired by the SIFT descriptor much faster

bull Main modules

bull Detect features location according to pixels response to feature

bull Calculate feature descriptor

SURF mdash Speeded Up Robust Features

Integral Sum Image

Feature Response

Find Local Maximum

Choose Best Features

Feature Descriptor

Copyright copy 2014 CEVA Inc 23

+ +

T T

vector memory vector register vector memory vector register

bull Two pass approach

bull Horizontal and vertical data access (Full Bandwidth)

bull Load and transpose data in a single instruction

SURF mdash Integral Sum Image

Copyright copy 2014 CEVA Inc 24

bull Calculate feature response using Box Filter

bull Small box A-B+D-C

bull Large box E-F+H-G

bull Total response E-F+H-G - 3x(A-B+D-C)

bull Can be executed in a single cycle

bull Two operations in a single cycle

Res = (E-F)+(H-G)

Res += (A-B)(-3)+(D-C)(-3)

SURF mdash Feature Response

Flexible amp Powerful Filter Instruction

-2

1

1A B

C D

E F

G H

Copyright copy 2014 CEVA Inc 25

bull Handles multiple features in parallel

bull Memory access of several different features in a single instruction

bull Calculates Integral sum

Vertical and horizontal memory access

bull Calculates gradient using box filter result = (A-B) + (D-C)

SURF mdash Feature Descriptor

Feature Data 0

Feature Data 1

Feature Data 7

parallel load

Copyright copy 2014 CEVA Inc 26

Feature Extraction Summary Table

Algorithm Memory Access Execution

2-Dimensional

Access

LUT Support Parallel

Memory Access

Dedicated

Instructions

Vectorized

Conditional Flow

HOG ndash Bilinear

HOG - Gamma

FAST9 - Detector

BRIEF - Descriptor

SURF ndash Int Sum

SURF ndash Feature

Response

SURF - Descriptor

Copyright copy 2014 CEVA Inc 27

Conclusion mdash

Critical Ingredients for Efficient CV Processor

CEVA-MM3101 Imaging amp Computer Vision IP platform Includes all above features (and morehellip)

CEVA-CV library already includes various feature extraction algos

Enables shorter development cycle efficiency and algorithm flexibility

CV Processor

Efficient filter processing

Good conditional code

execution

Fast amp flexible random access to

memory

Good bit and byte data

manipulation

Wide memory bandwidth

Large internal memory

Copyright copy 2014 CEVA Inc 28

1 ORB an efficient alternative to SIFT or SURF

Rublee E Rabaud V Konolige K Bradski G - Computer Vision (ICCV) 2011 IEEE

2 Histograms of oriented gradients for human detection

Dalal N Triggs B - Computer Vision and Pattern Recognition 2005 CVPR 2005

3 SURF Speeded Up Robust Features

Herbert Bay Tinne Tuytelaars Luc Van Gool - Computer Vision ndash ECCV 2006

4 BRIEF Binary Robust Independent Elementary Features

Michael Calonder Vincent Lepetit Christoph Strecha Pascal Fua - Computer Vision ndash ECCV 2010

5 Distinctive Image Features from Scale-Invariant Keypoints

David G Lowe - International Journal of Computer Vision Volume 60 Issue 2 pp 91-110

Resource for further Investigation

119936119952119958 are all invited to the CEVA demo table Thank You

Copyright copy 2014 CEVA Inc 3

Feature Detection Use in Computer Vision

Corner Blob

Edge

Copyright copy 2014 CEVA Inc 4

Gesture

Control

Base Technologies

bull Feature Extraction

amp Description

bull Feature Matching

and Tracking

Applications

Augmented

Reality

Object Detection

Recognition

Depth Map

Image Stitching

Face Detection and

Recognition

Motion Detection

Emotion Detection

AgeGender

Detection

Segmentation

Irregular Behavior

Detection

Forward Collision

Warning (FCW)

Lane Departure

Warning (LDW)

Pedestrian

Detection (PD)

Traffic Sign Rec

(TSR)

Driver Fatigue

Warning

Surround View

Monitor

Always-On

Computer Vision Understanding

Content

Copyright copy 2014 CEVA Inc 5

Mobile

Gesture

Control

Applications

Augmented

Reality

Object Detection

Recognition

Depth Map

Image Stitching

Face Detection and

Recognition

Motion Detection

Emotion Detection

AgeGender

Detection

Segmentation

Irregular Behavior

Detection

Forward Collision

Warning (FCW)

Lane Departure

Warning (LDW)

Pedestrian

Detection (PD)

Traffic Sign Rec

(TSR)

Driver Fatigue

Warning

Surround View

Monitor

Always-On

Base Technologies

bull Feature Extraction

amp Description

bull Feature Matching

and Tracking

Computer Vision Understanding

Content

Copyright copy 2014 CEVA Inc 6

Wearables

Gesture

Control

Base Technologies

bull Feature Extraction

amp Description

bull Feature Matching

and Tracking

Applications

Augmented

Reality

Object Detection

Recognition

Depth Map

Image Stitching

Face Detection and

Recognition

Motion Detection

Emotion Detection

AgeGender

Detection

Segmentation

Irregular Behavior

Detection

Forward Collision

Warning (FCW)

Lane Departure

Warning (LDW)

Pedestrian

Detection (PD)

Traffic Sign Rec

(TSR)

Driver Fatigue

Warning

Surround View

Monitor

Always-On

Computer Vision Understanding

Content

Copyright copy 2014 CEVA Inc 7

Surveillance

Gesture

Control

Base Technologies

bull Feature Extraction

amp Description

bull Feature Matching

and Tracking

Applications

Augmented

Reality

Object Detection

Recognition

Depth Map

Image Stitching

Face Detection and

Recognition

Motion Detection

Emotion Detection

Segmentation

Irregular Behavior

Detection

Forward Collision

Warning (FCW)

Lane Departure

Warning (LDW)

Pedestrian

Detection (PD)

Traffic Sign Rec

(TSR)

Driver Fatigue

Warning

Surround View

Monitor

Always-On

AgeGender

Detection

Computer Vision Understanding

Content

Copyright copy 2014 CEVA Inc 8

Automotive

Gesture

Control

Base Technologies

bull Feature Extraction

amp Description

bull Feature Matching

and Tracking

Applications

Augmented

Reality

Object Detection

Recognition

Depth Map

Image Stitching

Face Detection and

Recognition

Motion Detection

Emotion Detection

AgeGender

Detection

Segmentation

Irregular Behavior

Detection

Forward Collision

Warning (FCW)

Lane Departure

Warning (LDW)

Pedestrian

Detection (PD)

Traffic Sign Rec

(TSR)

Driver Fatigue

Warning

Surround View

Monitor

Always-On

Computer Vision Understanding Content

Copyright copy 2014 CEVA Inc 9

RTL

design Production Chip

ready Distribution

Why Use a Programmable CV Processor

Development

Time Line

RTL

design

CV

algorithm

design

Production

HW

So

lution

Programmable CV algorithm design Continued algorithms

development

Pro

gra

mm

able

So

lution

Chip

ready Distribution

Copyright copy 2014 CEVA Inc 10

Dedicated CV Processor or Mobile GPU C

EV

A-M

M3

10

1 P

erf

orm

an

ce

gain

(re

su

lt p

er

cycle

)

0

5

10

15

20

25

30

MM3101 Vs Mobile GPU A MM3101 Vs Mobile GPU B MM3101 Vs Mobile GPU C

sobel

corr3x3

corrsep5x5

corrsep11x11

Histogram

HSV2RGB

RGB2HSV

Max3x3

Min

Median3x3

CornerHarris

Average

Typical set of

representative

CV amp imaging

algorithms

MM3101 shows average 13x performance boost over leading Mobile GPUs

Power factor gt50x more efficient

Copyright copy 2014 CEVA Inc 11

Typical Feature Treatment Flow

Sobel

Gaussian

Build image

pyramids

Integral sum

Gamma

normalization

HOG

BRIEF

SIFT

SURF

FREAK

LBP

HAAR

SVM

Decision

tree

Harris

FAST

SIFT

SURF

Copyright copy 2014 CEVA Inc 12

Flow Chart mdash HOG Descriptor

Input image

Scaled image

Scale 1

Scaled image

Gamma

Normalization

Gradient

Calculation

Descriptor

Calculation

Bilinear

Scaling

HOG algorithm is based on Dalal amp Triggs paper (2005)

Common use is object detection especially pedestrian detection

Reference Code ndash OpenCV 243

Scale 9

Copyright copy 2014 CEVA Inc 13

HOG mdash Bilinear Scaling

Bilinear Interpolation

X

X

X

Step 1 Vertical Interpolation

Step 2 Horizontal Interpolation

Copyright copy 2014 CEVA Inc 14

1 Load 2 X 8 pixels in a single cycle

2 16 filter operations in a single cycle

3 Transposed Store (4x4) in single cycle

4 Perform the load and filter again (12)

5 Transposed Store in single cycle (3)

HOG mdash Bilinear Scaling mdash Implementation

Memory

vA

vB

vC

vD

Vector Registers

Memory

vA

vB

vC

vD

Vector Registers

Memory

vA

vB

vC

vD

Vector Registers

transpose

Copyright copy 2014 CEVA Inc 15

bull Gamma Function P(γ)

bull Implemented using lsquoLook Up Tablersquo (LUT) ndash parallel access to local

memory in single cycle

bull Loading of multiple gamma values in a single cycle

HOG mdash Gamma Normalization

Copyright copy 2014 CEVA Inc 16

bull ORB mdash Oriented FAST and Rotated BRIEF

bull An efficient alternative to SIFT

bull Pyramid is used for scale-invariance

bull Features are detected using FAST9 Harris and non-max-suppress

bull Descriptors are based on BRIEF with normalized orientation

ORB mdash Feature Extraction

Input

Image Fast9 Harris

Non-Max-

Suppress

Oriented

BRIEF Descriptors

list Pyramid

Copyright copy 2014 CEVA Inc 17

ORB mdash FAST9 Implementation

bull Continuous arc of 9 or more pixels

bull All much brighter then (p+Th)

or

bull All much darker then (p-Th)

Copyright copy 2014 CEVA Inc 18

bull Early exit is used to detect potential positions

bull Long memory access of 32 bytes using

bull quickly load consecutive pixels

bull Vector compare is used to compare the center of the corner to

the borders

bull Building a binary (bit) map with positions that need to be calculated

bull Calculation of multiple positions in parallel

bull Using different two dimensional loads

bull Vector predicates are used selectively calculate only the locations

that pass the threshold

bull Using multi-way parallel lookup table access to decide on

consecutive locations

ORB mdash FAST9 Implementation (2)

Copyright copy 2014 CEVA Inc 19

ORB mdash FAST9 Implementation (3)

SIMD Efficient Random access to memory

Fast First Pass

Candidate list

Input Image

Fast First Pass

Candidate list

Full FAST9 Second Pass

Output feature list

Input Image

Copyright copy 2014 CEVA Inc 20

bull Oriented brief uses the normalized orientation and calculates a 256 bit wide descriptor

bull The descriptor is calculated by comparison of pre-defined 256 pairs of pixels in the surrounding of the feature center

bull Each pair comparison donates a single bit in the descriptor

bull Orientation is normalized by rotating the image (or pairs coordinates in our implementation) according to the moment of the feature center

BRIEF mdash Descriptor

Copyright copy 2014 CEVA Inc 21

BRIEF DSP Implementation

Calculate patch orientation (Patch Moment)

Utilizing LUT capability read the pixels address

Load with random access to memory the pixel couples

Use SIMD capability to efficiently calculate descriptor

Orientation

Copyright copy 2014 CEVA Inc 22

bull Inspired by the SIFT descriptor much faster

bull Main modules

bull Detect features location according to pixels response to feature

bull Calculate feature descriptor

SURF mdash Speeded Up Robust Features

Integral Sum Image

Feature Response

Find Local Maximum

Choose Best Features

Feature Descriptor

Copyright copy 2014 CEVA Inc 23

+ +

T T

vector memory vector register vector memory vector register

bull Two pass approach

bull Horizontal and vertical data access (Full Bandwidth)

bull Load and transpose data in a single instruction

SURF mdash Integral Sum Image

Copyright copy 2014 CEVA Inc 24

bull Calculate feature response using Box Filter

bull Small box A-B+D-C

bull Large box E-F+H-G

bull Total response E-F+H-G - 3x(A-B+D-C)

bull Can be executed in a single cycle

bull Two operations in a single cycle

Res = (E-F)+(H-G)

Res += (A-B)(-3)+(D-C)(-3)

SURF mdash Feature Response

Flexible amp Powerful Filter Instruction

-2

1

1A B

C D

E F

G H

Copyright copy 2014 CEVA Inc 25

bull Handles multiple features in parallel

bull Memory access of several different features in a single instruction

bull Calculates Integral sum

Vertical and horizontal memory access

bull Calculates gradient using box filter result = (A-B) + (D-C)

SURF mdash Feature Descriptor

Feature Data 0

Feature Data 1

Feature Data 7

parallel load

Feature Extraction Summary Table

Column groups:

• Memory Access: 2-Dimensional Access, LUT Support, Parallel Memory Access

• Execution: Dedicated Instructions, Vectorized Conditional Flow

Algorithms (rows):

• HOG - Bilinear

• HOG - Gamma

• FAST9 - Detector

• BRIEF - Descriptor

• SURF - Int Sum

• SURF - Feature Response

• SURF - Descriptor

Conclusion - Critical Ingredients for an Efficient CV Processor

CV Processor:

• Efficient filter processing

• Good conditional code execution

• Fast & flexible random access to memory

• Good bit and byte data manipulation

• Wide memory bandwidth

• Large internal memory

CEVA-MM3101 Imaging & Computer Vision IP platform includes all of the above features (and more…)

CEVA-CV library already includes various feature extraction algorithms

Enables a shorter development cycle, efficiency, and algorithm flexibility

Resources for Further Investigation

1. ORB: an efficient alternative to SIFT or SURF. Rublee E., Rabaud V., Konolige K., Bradski G. Computer Vision (ICCV), 2011, IEEE.

2. Histograms of oriented gradients for human detection. Dalal N., Triggs B. Computer Vision and Pattern Recognition (CVPR), 2005.

3. SURF: Speeded Up Robust Features. Herbert Bay, Tinne Tuytelaars, Luc Van Gool. Computer Vision – ECCV 2006.

4. BRIEF: Binary Robust Independent Elementary Features. Michael Calonder, Vincent Lepetit, Christoph Strecha, Pascal Fua. Computer Vision – ECCV 2010.

5. Distinctive Image Features from Scale-Invariant Keypoints. David G. Lowe. International Journal of Computer Vision, Volume 60, Issue 2, pp. 91-110.

You are all invited to the CEVA demo table. Thank you.

Computer Vision: Understanding Content

Surveillance

Base Technologies
• Feature Extraction & Description
• Feature Matching and Tracking

Applications
• Gesture Control
• Augmented Reality
• Object Detection/Recognition
• Depth Map
• Image Stitching
• Face Detection and Recognition
• Motion Detection
• Emotion Detection
• Age/Gender Detection
• Segmentation
• Irregular Behavior Detection
• Forward Collision Warning (FCW)
• Lane Departure Warning (LDW)
• Pedestrian Detection (PD)
• Traffic Sign Recognition (TSR)
• Driver Fatigue Warning
• Surround View Monitor
• Always-On

Computer Vision: Understanding Content

Automotive

Base Technologies
• Feature Extraction & Description
• Feature Matching and Tracking

Applications
• Gesture Control
• Augmented Reality
• Object Detection/Recognition
• Depth Map
• Image Stitching
• Face Detection and Recognition
• Motion Detection
• Emotion Detection
• Age/Gender Detection
• Segmentation
• Irregular Behavior Detection
• Forward Collision Warning (FCW)
• Lane Departure Warning (LDW)
• Pedestrian Detection (PD)
• Traffic Sign Recognition (TSR)
• Driver Fatigue Warning
• Surround View Monitor
• Always-On

Why Use a Programmable CV Processor?

[Figure: development time line comparing two approaches]
• HW Solution, stages shown: CV algorithm design, RTL design, Chip ready, Production, Distribution
• Programmable Solution, stages shown: RTL design, Chip ready, Production, Distribution, with Programmable CV algorithm design and Continued algorithms development

Dedicated CV Processor or Mobile GPU?

[Chart: CEVA-MM3101 performance gain (result per cycle), axis 0 to 30; series: MM3101 vs. Mobile GPU A, MM3101 vs. Mobile GPU B, MM3101 vs. Mobile GPU C]
Typical set of representative CV & imaging algorithms: sobel, corr3x3, corrsep5x5, corrsep11x11, Histogram, HSV2RGB, RGB2HSV, Max3x3, Min, Median3x3, CornerHarris, Average

• MM3101 shows an average 13x performance boost over leading mobile GPUs
• Power factor: >50x more efficient

Typical Feature Treatment Flow

[Flow diagram; box contents as extracted, one group per line]
• Sobel, Gaussian, Build image pyramids, Integral sum, Gamma normalization
• HOG, BRIEF, SIFT, SURF, FREAK, LBP, HAAR
• SVM, Decision tree
• Harris, FAST, SIFT, SURF

Flow Chart - HOG Descriptor

[Flow chart: Input image -> Bilinear Scaling -> Scaled image (Scale 1 ... Scale 9) -> Gamma Normalization -> Gradient Calculation -> Descriptor Calculation]

• HOG algorithm is based on the Dalal & Triggs paper (2005)
• Common use is object detection, especially pedestrian detection
• Reference code: OpenCV 2.4.3

HOG - Bilinear Scaling

Bilinear Interpolation
• Step 1: Vertical Interpolation
• Step 2: Horizontal Interpolation
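
The two-step scheme above can be sketched in plain Python. This is only an illustration, not the CEVA or OpenCV code: the function names are hypothetical, and the edge-aligned coordinate mapping is an assumption chosen for simplicity (OpenCV's resize uses a different pixel-center convention).

```python
def lerp_rows(img, out_h):
    """Step 1: vertical interpolation. Resample the rows of img to out_h rows."""
    in_h = len(img)
    out = []
    for y in range(out_h):
        # Assumed mapping: first and last output rows align with the source edges.
        src = y * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(src)
        y1 = min(y0 + 1, in_h - 1)
        w = src - y0  # fractional weight of the lower row
        out.append([(1 - w) * a + w * b for a, b in zip(img[y0], img[y1])])
    return out


def transpose(img):
    """Swap rows and columns."""
    return [list(col) for col in zip(*img)]


def bilinear_resize(img, out_h, out_w):
    """Bilinear scaling as two separable 1-D passes."""
    vertical = lerp_rows(img, out_h)  # Step 1: vertical interpolation
    # Step 2: horizontal interpolation, done as a vertical pass on the transpose.
    return transpose(lerp_rows(transpose(vertical), out_w))


small = [[0, 10],
         [20, 30]]
print(bilinear_resize(small, 3, 3))
```

Running the horizontal pass as a vertical pass on transposed data mirrors the reason a transposed store is useful on a vector machine: both passes then read contiguous rows.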

HOG - Bilinear Scaling - Implementation

1. Load 2 x 8 pixels in a single cycle
2. 16 filter operations in a single cycle
3. Transposed store (4x4) in a single cycle
4. Perform the load and filter again (steps 1-2)
5. Transposed store in a single cycle (step 3)

[Figure: memory and vector registers vA, vB, vC, vD shown at three stages, the last with a transpose]

HOG - Gamma Normalization

• Gamma function P(γ)
• Implemented using a 'Look Up Table' (LUT): parallel access to local memory in a single cycle
• Loading of multiple gamma values in a single cycle
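
A minimal sketch of LUT-based gamma normalization for 8-bit pixels follows. The exponent 0.5 (square-root compression) and the names `GAMMA`, `GAMMA_LUT`, and `gamma_normalize` are assumptions for illustration; the slides do not state the exponent used. On a vector DSP the per-pixel table reads would be issued in parallel, which plain Python only imitates with a loop.

```python
GAMMA = 0.5  # assumed exponent (square-root compression), not taken from the slides

# Build the table once: one entry for each possible 8-bit pixel value.
GAMMA_LUT = [round(255 * (v / 255) ** GAMMA) for v in range(256)]


def gamma_normalize(pixels):
    """Replace each pixel by its table entry; no power function at run time."""
    return [GAMMA_LUT[p] for p in pixels]


print(gamma_normalize([0, 64, 255]))
```

The table costs 256 bytes of local memory and turns a per-pixel power computation into a single indexed load.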

ORB - Feature Extraction

• ORB: Oriented FAST and Rotated BRIEF
• An efficient alternative to SIFT
• Pyramid is used for scale invariance
• Features are detected using FAST9, Harris, and non-max suppression
• Descriptors are based on BRIEF with normalized orientation

[Flow: Input Image -> Pyramid -> FAST9 -> Harris -> Non-Max Suppress -> Oriented BRIEF -> Descriptors list]

ORB - FAST9 Implementation

• Continuous arc of 9 or more pixels
• All much brighter than (p + Th)
or
• All much darker than (p - Th)
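
The segment test above can be written out as a scalar reference. This is a sketch under assumptions: it uses the usual 16-pixel ring of radius 3 around the candidate, and the helper names are hypothetical. It makes no attempt at the vectorized form described in the slides.

```python
# Offsets (dx, dy) of the 16-pixel ring of radius 3, in ring order.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def longest_run(flags):
    """Length of the longest circular run of True values."""
    if all(flags):
        return len(flags)
    best = run = 0
    for f in flags + flags:  # doubling the list handles wrap-around runs
        run = run + 1 if f else 0
        best = max(best, run)
    return best


def is_fast9_corner(img, x, y, th, arc=9):
    """True if at least `arc` contiguous ring pixels are all brighter than
    p + th or all darker than p - th, where p is the center pixel."""
    p = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    brighter = [v > p + th for v in ring]
    darker = [v < p - th for v in ring]
    return longest_run(brighter) >= arc or longest_run(darker) >= arc


# A 7x7 patch with 9 contiguous bright ring pixels around the center (3, 3).
patch = [[50] * 7 for _ in range(7)]
for dx, dy in CIRCLE[:9]:
    patch[3 + dy][3 + dx] = 200
print(is_fast9_corner(patch, 3, 3, th=20))
```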

ORB - FAST9 Implementation (2)

• Early exit is used to detect potential positions
• Long memory access of 32 bytes is used to quickly load consecutive pixels
• Vector compare is used to compare the center of the corner to the borders
• Building a binary (bit) map with positions that need to be calculated
• Calculation of multiple positions in parallel
• Using different two-dimensional loads
• Vector predicates are used to selectively calculate only the locations that pass the threshold
• Using multi-way parallel lookup table access to decide on consecutive locations

ORB - FAST9 Implementation (3)

[Figure: Input Image -> Fast First Pass -> Candidate list -> Full FAST9 Second Pass -> Output feature list]
Figure labels: SIMD; efficient random access to memory
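
The two-pass structure can be sketched as a cheap first pass that builds a candidate list, followed by the full test on candidates only. The first-pass rule used here is an assumption, not taken from the slides: it checks the four compass pixels of the ring, relying on the fact that any arc of 9 contiguous ring pixels contains at least two of them. All function names are hypothetical.

```python
# Offsets (dx, dy) of the 16-pixel ring of radius 3, in ring order.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]
COMPASS = [0, 4, 8, 12]  # ring indices of the top, right, bottom, left pixels


def first_pass(img, th):
    """Early-exit test: keep only positions that could still be corners."""
    h, w = len(img), len(img[0])
    candidates = []
    for y in range(3, h - 3):
        for x in range(3, w - 3):
            p = img[y][x]
            vals = [img[y + CIRCLE[i][1]][x + CIRCLE[i][0]] for i in COMPASS]
            brighter = sum(v > p + th for v in vals)
            darker = sum(v < p - th for v in vals)
            if brighter >= 2 or darker >= 2:
                candidates.append((x, y))
    return candidates


def full_fast9(img, x, y, th):
    """Full segment test: 9 contiguous ring pixels all brighter or all darker."""
    p = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    for sign in (1, -1):  # +1 tests "brighter", -1 tests "darker"
        flags = [sign * (v - p) > th for v in ring]
        run = 0
        for f in flags + flags[:8]:  # 8 extra entries cover wrap-around arcs
            run = run + 1 if f else 0
            if run >= 9:
                return True
    return False


def detect(img, th):
    """Second pass: run the full test only on first-pass candidates."""
    return [(x, y) for (x, y) in first_pass(img, th) if full_fast9(img, x, y, th)]


image = [[50] * 9 for _ in range(9)]
image[4][4] = 200  # an isolated bright pixel
print(detect(image, th=20))
```

Most pixels fail the first pass, so the expensive ring test runs on a short list; that list is what makes random access to memory matter in the second pass.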

BRIEF - Descriptor

• Oriented BRIEF uses the normalized orientation and calculates a 256-bit-wide descriptor
• The descriptor is calculated by comparing 256 predefined pairs of pixels in the surroundings of the feature center
• Each pair comparison contributes a single bit to the descriptor
• Orientation is normalized by rotating the image (or the pair coordinates, in our implementation) according to the moment of the feature center
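
A minimal sketch of the pair-comparison descriptor follows. The sampling pattern here is drawn at random with a fixed seed purely for illustration; real implementations use a fixed, predefined pattern and usually smooth the patch first, both of which are omitted. The names `make_pairs`, `brief_descriptor`, and `hamming` are hypothetical.

```python
import random


def make_pairs(n=256, radius=15, seed=0):
    """Illustrative stand-in for the predefined pattern: n random point pairs."""
    rng = random.Random(seed)

    def point():
        return (rng.randint(-radius, radius), rng.randint(-radius, radius))

    return [(point(), point()) for _ in range(n)]


def brief_descriptor(img, cx, cy, pairs):
    """Pack one bit per pair: 1 if the first pixel is darker than the second."""
    desc = 0
    for i, ((x1, y1), (x2, y2)) in enumerate(pairs):
        if img[cy + y1][cx + x1] < img[cy + y2][cx + x2]:
            desc |= 1 << i
    return desc


def hamming(a, b):
    """Number of differing bits; the usual distance between binary descriptors."""
    return bin(a ^ b).count("1")


# Horizontal ramp image: intensity equals the x coordinate.
ramp = [[x for x in range(31)] for _ in range(31)]
pairs = make_pairs()
d = brief_descriptor(ramp, 15, 15, pairs)
print(d.bit_length() <= 256, hamming(d, d))
```

Because the descriptor is a bit string, matching reduces to XOR plus a population count, which suits bit and byte manipulation instructions.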

BRIEF DSP Implementation

• Calculate patch orientation (patch moment)
• Utilizing LUT capability, read the pixel addresses
• Load the pixel pairs with random access to memory
• Use SIMD capability to efficiently calculate the descriptor

[Figure label: Orientation]
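
The first two steps can be sketched as follows, under stated assumptions: orientation is taken from the first-order image moments of a square patch (the ORB paper uses a circular patch), and the pair coordinates are rotated instead of the image. Function names are hypothetical, and the lookup table of precomputed rotated patterns is replaced by direct computation.

```python
import math


def patch_orientation(img, cx, cy, radius):
    """Angle of the intensity centroid relative to the patch center."""
    m10 = m01 = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            v = img[cy + dy][cx + dx]
            m10 += dx * v  # first-order moment in x
            m01 += dy * v  # first-order moment in y
    return math.atan2(m01, m10)


def rotate_pairs(pairs, theta):
    """Rotate every sampling point by theta and round to the pixel grid."""
    c, s = math.cos(theta), math.sin(theta)

    def rot(p):
        x, y = p
        return (round(c * x - s * y), round(s * x + c * y))

    return [(rot(a), rot(b)) for a, b in pairs]


# Intensity grows with y, so the centroid lies straight "down" the patch.
img = [[y for _ in range(11)] for y in range(11)]
theta = patch_orientation(img, 5, 5, 3)
print(theta, rotate_pairs([((1, 0), (0, 1))], theta))
```

Rotating a few hundred coordinates is far cheaper than rotating the patch, and the rotated coordinates translate directly into the pixel addresses that the random-access loads consume.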

SURF - Speeded Up Robust Features

• Inspired by the SIFT descriptor, but much faster
• Main modules
  • Detect feature locations according to the pixels' response to the feature
  • Calculate feature descriptor

[Flow: Integral Sum Image -> Feature Response -> Find Local Maximum -> Choose Best Features -> Feature Descriptor]

SURF - Integral Sum Image

• Two-pass approach
• Horizontal and vertical data access (full bandwidth)
• Load and transpose data in a single instruction

[Figure: two stages, each showing vector memory -> vector register with an add (+) and a transpose (T)]
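
The two-pass approach can be sketched in plain Python: a running sum along rows, then the same running sum along columns, with the column pass expressed as a row pass on transposed data. This is an illustration only; the function names are hypothetical and nothing here models the single-instruction load and transpose.

```python
from itertools import accumulate


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def integral_image(img):
    """ii[y][x] = sum of img over rows 0..y and columns 0..x (inclusive)."""
    # Pass 1: prefix sum along each row (horizontal access).
    rows = [list(accumulate(r)) for r in img]
    # Pass 2: transpose, run the same row-wise prefix sum, transpose back
    # (this is the vertical pass).
    cols = [list(accumulate(r)) for r in transpose(rows)]
    return transpose(cols)


print(integral_image([[1, 2, 3],
                      [4, 5, 6]]))
```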

SURF - Feature Response

Flexible & Powerful Filter Instruction

• Calculate feature response using box filter
  • Small box: A - B + D - C
  • Large box: E - F + H - G
  • Total response: (E - F + H - G) - 3 x (A - B + D - C)
• Can be executed in a single cycle
  • Two operations in a single cycle

Res  = (E - F) + (H - G)
Res += (A - B) * (-3) + (D - C) * (-3)

[Figure: box filter with weights 1, -2, 1; small box corners A, B, C, D; large box corners E, F, G, H]
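
The box-filter arithmetic can be checked with a small sketch. It assumes a zero-padded integral image and reads the corners as A/E top-left, B/F top-right, C/G bottom-left, D/H bottom-right; the lobe geometry (three equal lobes side by side) and all function names are assumptions for illustration, not the exact SURF filter sizes.

```python
from itertools import accumulate


def integral_image(img):
    """Zero-padded integral image: ii[y][x] = sum of img[0:y][0:x]."""
    w = len(img[0])
    ii = [[0] * (w + 1)]
    for row in img:
        prefix = [0] + list(accumulate(row))
        ii.append([a + b for a, b in zip(ii[-1], prefix)])
    return ii


def box_sum(ii, x0, y0, x1, y1):
    """Sum over columns x0..x1-1 and rows y0..y1-1 from four corner reads."""
    a, b = ii[y0][x0], ii[y0][x1]  # top-left, top-right
    c, d = ii[y1][x0], ii[y1][x1]  # bottom-left, bottom-right
    return (a - b) + (d - c)


def three_lobe_response(ii, x, y, lobe_w, h):
    """Weights 1, -2, 1 over three adjacent lobes whose top-left is (x, y)."""
    large = box_sum(ii, x, y, x + 3 * lobe_w, y + h)           # E - F + H - G
    small = box_sum(ii, x + lobe_w, y, x + 2 * lobe_w, y + h)  # A - B + D - C
    return large - 3 * small


ii = integral_image([[1, 5, 1]])
print(three_lobe_response(ii, 0, 0, lobe_w=1, h=1))
```

Subtracting three times the middle box from the large box leaves weight 1 on the outer lobes and 1 - 3 = -2 on the middle lobe, matching the weights in the figure.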

SURF - Feature Descriptor

• Handles multiple features in parallel
  • Memory access of several different features in a single instruction
• Calculates integral sum
  • Vertical and horizontal memory access
• Calculates gradient using box filter: result = (A - B) + (D - C)

[Figure: parallel load of Feature Data 0, Feature Data 1, ... Feature Data 7]

Feature Extraction Summary Table

Columns
• Memory Access: 2-Dimensional Access, LUT Support, Parallel Memory Access
• Execution: Dedicated Instructions, Vectorized Conditional Flow

Rows (Algorithm)
• HOG - Bilinear
• HOG - Gamma
• FAST9 - Detector
• BRIEF - Descriptor
• SURF - Int Sum
• SURF - Feature Response
• SURF - Descriptor

[Table: the per-algorithm marks are not preserved in this transcript]

Conclusion - Critical Ingredients for Efficient CV Processor

CV Processor
• Efficient filter processing
• Good conditional code execution
• Fast & flexible random access to memory
• Good bit and byte data manipulation
• Wide memory bandwidth
• Large internal memory

• CEVA-MM3101 Imaging & Computer Vision IP platform includes all of the above features (and more...)
• CEVA-CV library already includes various feature extraction algorithms
• Enables a shorter development cycle, efficiency, and algorithm flexibility

Resources for Further Investigation

1. ORB: an efficient alternative to SIFT or SURF. Rublee, E., Rabaud, V., Konolige, K., Bradski, G. Computer Vision (ICCV), 2011 IEEE.
2. Histograms of oriented gradients for human detection. Dalal, N., Triggs, B. Computer Vision and Pattern Recognition, 2005 (CVPR 2005).
3. SURF: Speeded Up Robust Features. Herbert Bay, Tinne Tuytelaars, Luc Van Gool. Computer Vision - ECCV 2006.
4. BRIEF: Binary Robust Independent Elementary Features. Michael Calonder, Vincent Lepetit, Christoph Strecha, Pascal Fua. Computer Vision - ECCV 2010.
5. Distinctive Image Features from Scale-Invariant Keypoints. David G. Lowe. International Journal of Computer Vision, Volume 60, Issue 2, pp. 91-110.

You are all invited to the CEVA demo table. Thank You!

Copyright © 2014 CEVA Inc. 11

Typical Feature Treatment Flow

[Flow diagram; blocks in transcript order: Sobel, Gaussian, build image pyramids, integral sum, gamma normalization | HOG, BRIEF, SIFT, SURF, FREAK, LBP, HAAR | SVM, decision tree | Harris, FAST, SIFT, SURF]

Flow Chart - HOG Descriptor

[Flow chart: Input image -> Bilinear Scaling -> Scaled images (Scale 1 ... Scale 9) -> Gamma Normalization -> Gradient Calculation -> Descriptor Calculation]

The HOG algorithm is based on the Dalal & Triggs paper (2005)
Its common use is object detection, especially pedestrian detection
Reference code: OpenCV 2.4.3
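The gradient and descriptor stages of the flow above can be sketched for a single cell as follows. This is a minimal illustration, not the OpenCV or CEVA code: the function name `cell_histogram`, the 8 × 8 cell, the 9 unsigned orientation bins, and the hard (non-interpolated) bin assignment are all assumptions chosen for brevity.

```python
import math

def cell_histogram(img, top, left, size=8, bins=9):
    """Orientation histogram of one HOG cell (hypothetical helper).

    img is a 2-D list of intensities. Gradients use centered differences,
    clamped at the image border. Each pixel votes with its gradient magnitude
    into one unsigned-orientation bin (0..180 degrees, hard assignment).
    """
    h, w = len(img), len(img[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(top, top + size):
        for x in range(left, left + size):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # "% bins" guards against angle rounding up to exactly 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist
```

A full descriptor would additionally interpolate votes between neighboring bins and normalize histograms over blocks of cells; both are omitted here.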

HOG - Bilinear Scaling

Bilinear Interpolation
Step 1: Vertical Interpolation
Step 2: Horizontal Interpolation

HOG - Bilinear Scaling - Implementation

1. Load 2 × 8 pixels in a single cycle
2. Perform 16 filter operations in a single cycle
3. Transposed store (4 × 4) in a single cycle
4. Perform the load and filter again (steps 1 and 2)
5. Transposed store in a single cycle (step 3)

[Diagram, three panels: memory and vector registers vA, vB, vC, vD, with a transpose between registers and memory]
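The two-step scheme (vertical interpolation, then horizontal interpolation) can be sketched in scalar Python. This is only an illustration of the separable structure, not the vectorized DSP code: the name `bilinear_resize` and the endpoint-aligned sample mapping are assumptions, and OpenCV's own resize uses a different (pixel-center) mapping.

```python
def bilinear_resize(img, out_h, out_w):
    """Resize a 2-D list of numbers with separable bilinear interpolation."""
    in_h, in_w = len(img), len(img[0])

    def taps(out_len, in_len):
        # For each output index: (lower input index, upper input index,
        # weight of the upper sample). Endpoints of both axes are aligned.
        result = []
        for o in range(out_len):
            pos = o * (in_len - 1) / (out_len - 1) if out_len > 1 else 0.0
            lo = min(int(pos), in_len - 1)
            hi = min(lo + 1, in_len - 1)
            result.append((lo, hi, pos - lo))
        return result

    # Step 1: vertical interpolation (blend two input rows per output row).
    rows = []
    for lo, hi, w in taps(out_h, in_h):
        rows.append([(1 - w) * a + w * b for a, b in zip(img[lo], img[hi])])

    # Step 2: horizontal interpolation (blend two columns per output column).
    col_taps = taps(out_w, in_w)
    return [[(1 - w) * r[lo] + w * r[hi] for lo, hi, w in col_taps]
            for r in rows]
```

Splitting the filter into a vertical and a horizontal pass is what makes the transposed store useful: after the transpose, the second pass can again read its operands as consecutive elements.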

HOG - Gamma Normalization

• Gamma function P(γ)
• Implemented using a look-up table (LUT): parallel access to local memory in a single cycle
• Multiple gamma values are loaded in a single cycle
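A look-up-table gamma step can be sketched as below. The helper names `build_gamma_lut` and `apply_lut`, the 8-bit pixel range, and the default exponent of 0.5 (square-root compression) are assumptions for illustration; the slide does not state the exponent used.

```python
def build_gamma_lut(gamma=0.5):
    """256-entry table mapping an 8-bit pixel to its gamma-corrected value."""
    return [round(255 * (v / 255) ** gamma) for v in range(256)]

def apply_lut(img, lut):
    """Replace every pixel by its table entry (one lookup per pixel)."""
    return [[lut[p] for p in row] for row in img]
```

The table is built once, so the per-pixel cost is a single indexed read, which is the operation a multi-way parallel LUT access performs for several pixels at a time.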

ORB - Feature Extraction

• ORB: Oriented FAST and Rotated BRIEF
• An efficient alternative to SIFT
• A pyramid is used for scale invariance
• Features are detected using FAST9, Harris, and non-maximum suppression
• Descriptors are based on BRIEF with normalized orientation

[Pipeline: Input image -> Pyramid -> FAST9 -> Harris -> Non-Max-Suppress -> Oriented BRIEF -> Descriptors list]

ORB - FAST9 Implementation

• Continuous arc of 9 or more pixels on the circle around the center pixel p
• All brighter than p + Th
  or
• All darker than p - Th
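The corner criterion above can be written as a scalar reference check. The sketch assumes the usual 16-pixel circle of radius 3 around the candidate; the names `CIRCLE`, `longest_run`, and `is_fast9_corner` are hypothetical, and the caller must keep (x, y) at least 3 pixels away from the image border.

```python
# Offsets (dx, dy) of the 16 pixels on a radius-3 circle, in circular order.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def longest_run(flags):
    """Length of the longest run of True values in a circular list."""
    n = len(flags)
    if all(flags):
        return n
    best = run = 0
    for f in flags + flags:  # scanning the list twice handles wrap-around
        run = run + 1 if f else 0
        best = max(best, run)
    return min(best, n)

def is_fast9_corner(img, x, y, th):
    """True if 9 or more contiguous circle pixels are all brighter than
    p + th or all darker than p - th, where p = img[y][x]."""
    p = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    brighter = [v > p + th for v in ring]
    darker = [v < p - th for v in ring]
    return longest_run(brighter) >= 9 or longest_run(darker) >= 9
```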

ORB - FAST9 Implementation (2)

• Early exit is used to detect potential positions
  • Long memory accesses of 32 bytes quickly load consecutive pixels
  • Vector compare is used to compare the center of the corner to the borders
  • A binary (bit) map is built of the positions that need to be calculated
• Calculation of multiple positions in parallel
  • Using different two-dimensional loads
  • Vector predicates are used to selectively calculate only the locations that pass the threshold
  • Multi-way parallel lookup-table access is used to decide on consecutive locations

ORB - FAST9 Implementation (3)

SIMD; efficient random access to memory

[Diagram: Input image -> fast first pass -> candidate list -> full FAST9 second pass -> output feature list]
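The two-pass structure (a cheap early-exit pass that builds a candidate bitmap, followed by the full test on candidates only) can be sketched as below. The specific rejection rule is an assumption, not taken from the slides: any arc of 9 contiguous pixels on a 16-pixel circle covers at least 2 of the 4 compass pixels, so a position with fewer than 2 bright and fewer than 2 dark compass pixels cannot be a corner. All function names are hypothetical.

```python
# Offsets (dx, dy) of the 16 pixels on a radius-3 circle, in circular order.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]
COMPASS = (0, 4, 8, 12)  # top, right, bottom, left pixels of the circle

def full_test(img, x, y, th):
    """Full FAST9 test: 9 or more contiguous brighter or darker pixels."""
    p = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    for flags in ([v > p + th for v in ring], [v < p - th for v in ring]):
        run = 0
        for f in flags + flags:  # scanning twice handles wrap-around
            run = run + 1 if f else 0
            if run >= 9:
                return True
    return False

def first_pass(img, th):
    """Bitmap of positions the cheap compass test cannot rule out."""
    h, w = len(img), len(img[0])
    bitmap = [[False] * w for _ in range(h)]
    for y in range(3, h - 3):
        for x in range(3, w - 3):
            p = img[y][x]
            vals = [img[y + CIRCLE[i][1]][x + CIRCLE[i][0]] for i in COMPASS]
            n_bright = sum(v > p + th for v in vals)
            n_dark = sum(v < p - th for v in vals)
            bitmap[y][x] = n_bright >= 2 or n_dark >= 2
    return bitmap

def detect_fast9(img, th):
    """Second pass: run the full test only where the bitmap is set."""
    bitmap = first_pass(img, th)
    return [(x, y)
            for y, row in enumerate(bitmap)
            for x, candidate in enumerate(row)
            if candidate and full_test(img, x, y, th)]
```

On a vector processor the first pass maps to wide loads and vector compares over consecutive pixels, while the second pass touches only the scattered candidate positions.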

BRIEF - Descriptor

• Oriented BRIEF uses the normalized orientation and calculates a 256-bit-wide descriptor
• The descriptor is calculated by comparing 256 predefined pairs of pixels surrounding the feature center
• Each pair comparison contributes a single bit to the descriptor
• Orientation is normalized by rotating the image (or, in our implementation, the pair coordinates) according to the moment of the feature center

BRIEF DSP Implementation

Calculate the patch orientation (patch moment)
Use the LUT capability to read the pixel addresses
Load the pixel pairs with random access to memory
Use the SIMD capability to efficiently calculate the descriptor

[Figure: patch orientation]
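The four steps above can be sketched in scalar Python. Several details are assumptions made for illustration: the pair table `PAIRS` is a fixed pseudo-random pattern rather than the trained ORB pattern, the orientation uses a square patch of radius 8, and rotated coordinates are rounded to the nearest pixel. The feature center must lie at least 12 pixels from the image border so that rotated samples stay inside the image.

```python
import math
import random

PATCH_RADIUS = 8  # assumed patch size for both the moments and the pairs

# Hypothetical comparison pattern: 256 pairs (x1, y1, x2, y2) of offsets.
_rng = random.Random(0)
PAIRS = [tuple(_rng.randint(-PATCH_RADIUS, PATCH_RADIUS) for _ in range(4))
         for _ in range(256)]

def patch_orientation(img, cx, cy, r=PATCH_RADIUS):
    """Angle of the intensity centroid relative to the patch center."""
    m10 = m01 = 0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            v = img[cy + dy][cx + dx]
            m10 += dx * v
            m01 += dy * v
    return math.atan2(m01, m10)

def oriented_brief(img, cx, cy):
    """256-bit descriptor as a Python int; bit i holds comparison i."""
    theta = patch_orientation(img, cx, cy)
    c, s = math.cos(theta), math.sin(theta)

    def sample(dx, dy):
        # Rotate the pair coordinates instead of rotating the image.
        rx = round(c * dx - s * dy)
        ry = round(s * dx + c * dy)
        return img[cy + ry][cx + rx]

    desc = 0
    for bit, (x1, y1, x2, y2) in enumerate(PAIRS):
        if sample(x1, y1) < sample(x2, y2):
            desc |= 1 << bit
    return desc

def hamming(a, b):
    """Number of differing bits, the usual distance for binary descriptors."""
    return bin(a ^ b).count("1")
```

Descriptors built this way are typically matched by Hamming distance, which reduces to an XOR followed by a bit count.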

SURF - Speeded Up Robust Features

• Inspired by the SIFT descriptor, but much faster
• Main modules
  • Detect feature locations according to each pixel's feature response
  • Calculate the feature descriptor

[Pipeline: Integral Sum Image -> Feature Response -> Find Local Maximum -> Choose Best Features -> Feature Descriptor]

SURF - Integral Sum Image

• Two-pass approach
• Horizontal and vertical data access (full bandwidth)
• Load and transpose data in a single instruction

[Diagram: vector memory -> vector register with accumulate (+) and transpose (T), shown for each of the two passes]
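The two-pass approach can be sketched as a running sum along rows followed by a running sum down columns. The function name `integral_image` is hypothetical, and this scalar version only illustrates the data flow, not the load-and-transpose instruction.

```python
def integral_image(img):
    """Integral sum image: out[y][x] = sum of img[0..y][0..x] inclusive."""
    # Pass 1 (horizontal): running sum along each row.
    rows = []
    for row in img:
        acc, out = 0, []
        for v in row:
            acc += v
            out.append(acc)
        rows.append(out)
    # Pass 2 (vertical): running sum down each column.
    for y in range(1, len(rows)):
        rows[y] = [a + b for a, b in zip(rows[y], rows[y - 1])]
    return rows
```

Each pass has a serial dependency along one axis only, so the other axis can be processed in parallel lanes; transposing between passes lets both passes use the same access pattern.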

SURF - Feature Response

Flexible & Powerful Filter Instruction

• Calculate the feature response using a box filter
  • Small box: A - B + D - C
  • Large box: E - F + H - G
  • Total response: (E - F + H - G) - 3 × (A - B + D - C)
• Can be executed in a single cycle
• Two operations in a single cycle:
  Res = (E - F) + (H - G)
  Res += (A - B) × (-3) + (D - C) × (-3)

[Figure: box filter with weights 1, -2, 1; small box with corners A, B, C, D inside a large box with corners E, F, G, H]
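The box-filter arithmetic above can be checked with a small sketch on a zero-padded integral image. The corner naming follows the slide, assuming A/E are top-left, B/F top-right, C/G bottom-left, and D/H bottom-right. The function names and the simplified filter geometry (three stacked lobes of equal size, weights +1, -2, +1) are assumptions; the actual SURF filters add margins around the lobes.

```python
def padded_integral(img):
    """Integral image with an extra zero row and column at the top and left."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def box_sum(ii, top, left, bottom, right):
    """Sum of rows top..bottom-1 and columns left..right-1 from four reads."""
    a = ii[top][left]      # top-left corner
    b = ii[top][right]     # top-right corner
    c = ii[bottom][left]   # bottom-left corner
    d = ii[bottom][right]  # bottom-right corner
    return (a - b) + (d - c)

def dyy_response(ii, top, left, lobe_h, width):
    """Response of three stacked lobes weighted +1, -2, +1."""
    large = box_sum(ii, top, left, top + 3 * lobe_h, left + width)
    small = box_sum(ii, top + lobe_h, left, top + 2 * lobe_h, left + width)
    # Large box counts every lobe once; subtracting 3x the middle lobe
    # leaves it with a net weight of -2.
    return large - 3 * small
```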

SURF - Feature Descriptor

• Handles multiple features in parallel
  • Memory access of several different features in a single instruction
• Calculates the integral sum
  • Vertical and horizontal memory access
• Calculates the gradient using a box filter: result = (A - B) + (D - C)

[Figure: parallel load of Feature Data 0, Feature Data 1, ..., Feature Data 7]
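The gradient step can be sketched as Haar-wavelet responses computed from box sums on a zero-padded integral image, evaluated for a list of feature positions. The function names, the window half-size parameter, and the sequential loop over features are assumptions; on the DSP the per-feature reads would be issued together.

```python
def padded_integral(img):
    """Integral image with an extra zero row and column at the top and left."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def box_sum(ii, top, left, bottom, right):
    """(A - B) + (D - C) over the four corners of the box."""
    return (ii[top][left] - ii[top][right]) + (ii[bottom][right] - ii[bottom][left])

def haar_gradients(ii, cx, cy, half):
    """Haar responses (dx, dy) over the window of rows cy-half..cy+half-1
    and columns cx-half..cx+half-1."""
    top, bottom = cy - half, cy + half
    left, right = cx - half, cx + half
    dx = box_sum(ii, top, cx, bottom, right) - box_sum(ii, top, left, bottom, cx)
    dy = box_sum(ii, cy, left, bottom, right) - box_sum(ii, top, left, cy, right)
    return dx, dy

def gradients_for_features(ii, features, half=2):
    """Evaluate the Haar responses for several (x, y) feature positions."""
    return [haar_gradients(ii, x, y, half) for x, y in features]
```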

Feature Extraction Summary Table

Columns: Algorithm | Memory Access: 2-Dimensional Access, LUT Support, Parallel Memory Access | Execution: Dedicated Instructions, Vectorized Conditional Flow

Rows:
HOG - Bilinear
HOG - Gamma
FAST9 - Detector
BRIEF - Descriptor
SURF - Int Sum
SURF - Feature Response
SURF - Descriptor

Conclusion - Critical Ingredients for Efficient CV Processor

CV Processor
• Efficient filter processing
• Good conditional code execution
• Fast & flexible random access to memory
• Good bit and byte data manipulation
• Wide memory bandwidth
• Large internal memory

The CEVA-MM3101 Imaging & Computer Vision IP platform includes all of the above features (and more...)
The CEVA-CV library already includes various feature extraction algorithms
Enables a shorter development cycle, efficiency, and algorithm flexibility

Resources for Further Investigation

1. ORB: An Efficient Alternative to SIFT or SURF
   Rublee E., Rabaud V., Konolige K., Bradski G. - Computer Vision (ICCV), 2011 IEEE
2. Histograms of Oriented Gradients for Human Detection
   Dalal N., Triggs B. - Computer Vision and Pattern Recognition, 2005 (CVPR 2005)
3. SURF: Speeded Up Robust Features
   Herbert Bay, Tinne Tuytelaars, Luc Van Gool - Computer Vision - ECCV 2006
4. BRIEF: Binary Robust Independent Elementary Features
   Michael Calonder, Vincent Lepetit, Christoph Strecha, Pascal Fua - Computer Vision - ECCV 2010
5. Distinctive Image Features from Scale-Invariant Keypoints
   David G. Lowe - International Journal of Computer Vision, Volume 60, Issue 2, pp. 91-110

You are all invited to the CEVA demo table. Thank you.

Copyright copy 2014 CEVA Inc 16

bull ORB mdash Oriented FAST and Rotated BRIEF

bull An efficient alternative to SIFT

bull Pyramid is used for scale-invariance

bull Features are detected using FAST9 Harris and non-max-suppress

bull Descriptors are based on BRIEF with normalized orientation

ORB mdash Feature Extraction

Input

Image Fast9 Harris

Non-Max-

Suppress

Oriented

BRIEF Descriptors

list Pyramid

Copyright copy 2014 CEVA Inc 17

ORB mdash FAST9 Implementation

bull Continuous arc of 9 or more pixels

bull All much brighter then (p+Th)

or

bull All much darker then (p-Th)

ORB: FAST9 Implementation (2)

• Early exit is used to detect potential positions
  • Long memory accesses of 32 bytes quickly load consecutive pixels
  • Vector compare is used to compare the center of the corner to the borders
  • A binary (bit) map is built with the positions that need to be calculated
• Calculation of multiple positions in parallel
  • Using different two-dimensional loads
  • Vector predicates are used to selectively calculate only the locations that pass the threshold
  • Using multi-way parallel lookup table access to decide on consecutive locations
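The early-exit idea can be illustrated with a cheap first pass that builds the bit map of positions still worth a full test. The slide does not say which quick test is used, so the compass-point test below is an assumption, and `COMPASS` and `candidate_bitmap` are hypothetical names. It relies on the fact that any arc of 9 contiguous pixels on a 16-pixel ring covers at least 2 of the 4 compass points.

```python
# Compass points of the radius-3 circle: top, right, bottom, left.
COMPASS = [(0, -3), (3, 0), (0, 3), (-3, 0)]


def candidate_bitmap(img, th):
    """First pass: mark pixels that may still be FAST9 corners.

    A pixel with fewer than 2 brighter and fewer than 2 darker compass
    points cannot have a 9-pixel arc, so it is rejected without reading
    the full ring.
    """
    h, w = len(img), len(img[0])
    bitmap = [[0] * w for _ in range(h)]
    for y in range(3, h - 3):
        for x in range(3, w - 3):
            p = img[y][x]
            pts = [img[y + dy][x + dx] for dx, dy in COMPASS]
            brighter = sum(v > p + th for v in pts)
            darker = sum(v < p - th for v in pts)
            bitmap[y][x] = 1 if brighter >= 2 or darker >= 2 else 0
    return bitmap
```

Only positions whose bit is set need the full 16-pixel test, which is where the predicated, selective calculation described above would apply.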

ORB: FAST9 Implementation (3)

Two-pass flow: Input image → Fast first pass → Candidate list → Full FAST9 second pass → Output feature list

Processor capabilities used: SIMD, efficient random access to memory

BRIEF: Descriptor

• Oriented BRIEF uses the normalized orientation and calculates a 256-bit-wide descriptor
• The descriptor is calculated by comparing 256 predefined pairs of pixels in the neighborhood of the feature center
• Each pair comparison contributes a single bit to the descriptor
• Orientation is normalized by rotating the image (or, in our implementation, the pair coordinates) according to the moment of the patch around the feature center
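The bullets above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the CEVA code: the function names are hypothetical, the pair pattern is a seeded random stand-in (ORB itself uses a fixed, predefined pattern), the patch radius of 8 is arbitrary, and the feature is assumed to lie at least `r` pixels from the image border.

```python
import math
import random


def patch_orientation(img, cx, cy, r):
    """Angle of the intensity centroid, from first-order patch moments."""
    m10 = m01 = 0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            v = img[cy + dy][cx + dx]
            m10 += dx * v
            m01 += dy * v
    return math.atan2(m01, m10)


def make_pairs(n=256, r=8, seed=0):
    """Hypothetical test pattern: n random point pairs inside a disc of radius r."""
    rng = random.Random(seed)

    def pt():
        while True:
            x, y = rng.randint(-r, r), rng.randint(-r, r)
            if x * x + y * y <= r * r:
                return (x, y)

    return [(pt(), pt()) for _ in range(n)]


def oriented_brief(img, cx, cy, pairs, angle):
    """Rotate the pair coordinates by angle, then compare: one bit per pair."""
    c, s = math.cos(angle), math.sin(angle)

    def rot(p):
        return (round(p[0] * c - p[1] * s), round(p[0] * s + p[1] * c))

    desc = 0
    for i, (a, b) in enumerate(pairs):
        ax, ay = rot(a)
        bx, by = rot(b)
        if img[cy + ay][cx + ax] < img[cy + by][cx + bx]:
            desc |= 1 << i
    return desc
```

Rotating the pair coordinates instead of the image means the patch is read in place, at the cost of scattered (random-access) pixel loads.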

BRIEF: DSP Implementation

1. Calculate the patch orientation (patch moment)
2. Using the LUT capability, read the pixel addresses for that orientation
3. Load the pixel pairs with random access to memory
4. Use the SIMD capability to efficiently calculate the descriptor
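One way to read the LUT step is as a table of precomputed, already rotated pixel offsets, indexed by a quantized orientation. The sketch below assumes 30 orientation bins (12-degree steps, as described in the ORB paper); the slide does not state the table layout, and `N_BINS`, `build_lut` and `descriptor_from_lut` are hypothetical names. The image is assumed to be a flat row-major buffer with a known stride.

```python
import math

N_BINS = 30  # assumption: 12-degree orientation steps


def build_lut(pairs, stride):
    """For each orientation bin, precompute the memory offset of every
    rotated pair point relative to the feature center."""
    lut = []
    for k in range(N_BINS):
        a = 2 * math.pi * k / N_BINS
        c, s = math.cos(a), math.sin(a)
        row = []
        for (ax, ay), (bx, by) in pairs:
            rax, ray = round(ax * c - ay * s), round(ax * s + ay * c)
            rbx, rby = round(bx * c - by * s), round(bx * s + by * c)
            row.append((ray * stride + rax, rby * stride + rbx))
        lut.append(row)
    return lut


def descriptor_from_lut(pixels, center, lut, angle):
    """pixels is a flat row-major buffer; center is the feature's index in it."""
    k = round(angle / (2 * math.pi) * N_BINS) % N_BINS
    desc = 0
    for i, (off_a, off_b) in enumerate(lut[k]):
        # Two scattered loads per pair, then a compare that yields one bit.
        if pixels[center + off_a] < pixels[center + off_b]:
            desc |= 1 << i
    return desc
```

With the table built once, the per-feature work reduces to offset lookups, scattered loads and compares, which matches the LUT, random-access and SIMD steps listed above.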

SURF: Speeded Up Robust Features

• Inspired by the SIFT descriptor, but much faster
• Main modules
  • Detect feature locations according to the pixels' feature response
  • Calculate the feature descriptor

Pipeline: Integral sum image → Feature response → Find local maximum → Choose best features → Feature descriptor

SURF: Integral Sum Image

• Two-pass approach
• Horizontal and vertical data access (full bandwidth)
• Load and transpose data in a single instruction

[Figure: vector memory to vector register data flow with transpose (T) and add (+) steps, shown twice.]
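A two-pass integral image can be written as a horizontal prefix sum followed by a vertical one. The sketch below is a plain scalar reference with a hypothetical function name; it does not model the transposing loads, and the extra zero row and column are a convenience chosen here, not something the slide specifies.

```python
def integral_image(img):
    """Two-pass integral (summed-area) image.

    Pass 1 accumulates along rows, pass 2 along columns. The result has one
    extra zero row and column so box sums need no border checks.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    # Pass 1: horizontal prefix sums.
    for y in range(h):
        run = 0
        for x in range(w):
            run += img[y][x]
            ii[y + 1][x + 1] = run
    # Pass 2: vertical prefix sums over the pass-1 result.
    for x in range(1, w + 1):
        for y in range(1, h + 1):
            ii[y][x] += ii[y - 1][x]
    return ii
```

The second pass walks down columns, which is the access pattern a load-with-transpose instruction would turn back into row-wise vector work.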

SURF: Feature Response

• Calculate the feature response using a box filter
  • Small box: A - B + D - C
  • Large box: E - F + H - G
  • Total response: E - F + H - G - 3 × (A - B + D - C)
• Can be executed in a single cycle
  • Two operations in a single cycle

Res = (E - F) + (H - G)
Res += (A - B) × (-3) + (D - C) × (-3)

Flexible & Powerful Filter Instruction

[Figure: box filter with weights 1, -2, 1; A, B, C, D mark the corners of the small box and E, F, G, H the corners of the large box.]
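The response formula can be checked with a small scalar model. The sketch assumes three equal lobes side by side (weights 1, -2, 1), with the small box as the middle lobe; the slide does not give box sizes, and `box_sum` and `dxx_response` are hypothetical names. The integral image carries an extra zero row and column.

```python
def integral_image(img):
    """Zero-padded summed-area table of img (a list of rows)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x]
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of rows top..bottom-1 and columns left..right-1, using the
    four-corner pattern A - B + D - C."""
    a = ii[top][left]
    b = ii[top][right]
    c = ii[bottom][left]
    d = ii[bottom][right]
    return a - b + d - c


def dxx_response(ii, top, left, lobe_w, lobe_h):
    """Large box minus 3 times the middle lobe, giving net weights 1, -2, 1."""
    large = box_sum(ii, top, left, top + lobe_h, left + 3 * lobe_w)
    small = box_sum(ii, top, left + lobe_w, top + lobe_h, left + 2 * lobe_w)
    return large - 3 * small
```

Subtracting the middle lobe three times turns its weight from +1 (as part of the large box) into -2, so a flat region gives a response of zero.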

SURF: Feature Descriptor

• Handles multiple features in parallel
  • Memory access of several different features in a single instruction
• Calculates the integral sum
  • Vertical and horizontal memory access
• Calculates the gradient using the box filter: result = (A - B) + (D - C)

[Figure: Feature Data 0, Feature Data 1, ..., Feature Data 7 fetched with one parallel load.]

Feature Extraction Summary Table

Columns: Algorithm | Memory Access (2-Dimensional Access, LUT Support, Parallel Memory Access) | Execution (Dedicated Instructions, Vectorized Conditional Flow)

Algorithms compared:
• HOG - Bilinear
• HOG - Gamma
• FAST9 - Detector
• BRIEF - Descriptor
• SURF - Int. Sum
• SURF - Feature Response
• SURF - Descriptor

[Table cell marks not recoverable.]

Conclusion: Critical Ingredients for an Efficient CV Processor

• The CEVA-MM3101 Imaging & Computer Vision IP platform includes all of the above features (and more...)
• The CEVA-CV library already includes various feature extraction algorithms
• This enables a shorter development cycle, efficiency, and algorithm flexibility

CV Processor:
• Efficient filter processing
• Good conditional code execution
• Fast & flexible random access to memory
• Good bit and byte data manipulation
• Wide memory bandwidth
• Large internal memory

Resources for Further Investigation

1. ORB: an efficient alternative to SIFT or SURF. Rublee, E., Rabaud, V., Konolige, K., Bradski, G. Computer Vision (ICCV), 2011 IEEE.
2. Histograms of oriented gradients for human detection. Dalal, N., Triggs, B. Computer Vision and Pattern Recognition, 2005 (CVPR 2005).
3. SURF: Speeded Up Robust Features. Bay, H., Tuytelaars, T., Van Gool, L. Computer Vision - ECCV 2006.
4. BRIEF: Binary Robust Independent Elementary Features. Calonder, M., Lepetit, V., Strecha, C., Fua, P. Computer Vision - ECCV 2010.
5. Distinctive Image Features from Scale-Invariant Keypoints. Lowe, D. G. International Journal of Computer Vision, Volume 60, Issue 2, pp. 91-110.

You are all invited to the CEVA demo table. Thank you.

Copyright copy 2014 CEVA Inc 24

bull Calculate feature response using Box Filter

bull Small box A-B+D-C

bull Large box E-F+H-G

bull Total response E-F+H-G - 3x(A-B+D-C)

bull Can be executed in a single cycle

bull Two operations in a single cycle

Res = (E-F)+(H-G)

Res += (A-B)(-3)+(D-C)(-3)

SURF mdash Feature Response

Flexible amp Powerful Filter Instruction

-2

1

1A B

C D

E F

G H

Copyright © 2014, CEVA Inc. 25

SURF: Feature Descriptor

• Handles multiple features in parallel

• Memory access of several different features in a single instruction

• Calculates integral sum

Vertical and horizontal memory access

• Calculates gradient using box filter: result = (A-B) + (D-C)

[Figure: parallel load of Feature Data 0, Feature Data 1, Feature Data 7]
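A rough sketch of the two computations named on this slide, the integral sum and a box-filter gradient, is given below in plain Python. Several things here are assumptions for illustration: splitting the integral sum into a horizontal pass followed by a vertical pass, the function names, and forming the gradient as a right-half box minus a left-half box (a Haar-like difference, as in the SURF paper). The slide does not spell out how the hardware performs these steps.

```python
def integral_sum(img):
    """Summed-area table built with a horizontal pass, then a vertical pass."""
    h, w = len(img), len(img[0])
    # Horizontal pass: running sum along each row, padded with a leading zero.
    rows = [[0] * (w + 1) for _ in range(h)]
    for y in range(h):
        for x in range(w):
            rows[y][x + 1] = rows[y][x] + img[y][x]
    # Vertical pass: running sum down each column, padded with a zero top row.
    ii = [[0] * (w + 1)]
    for y in range(h):
        ii.append([ii[y][x] + rows[y][x] for x in range(w + 1)])
    return ii


def box(ii, top, left, bottom, right):
    """result = (A-B) + (D-C) over the four corners of the box."""
    return (ii[top][left] - ii[top][right]) + (ii[bottom][right] - ii[bottom][left])


def gradient_x(ii, top, left, size):
    """Haar-like horizontal gradient: right half-box minus left half-box."""
    half = size // 2
    left_box = box(ii, top, left, top + size, left + half)
    right_box = box(ii, top, left + half, top + size, left + size)
    return right_box - left_box
```

On a 4x4 ramp image whose pixel value equals its column index, `gradient_x(integral_sum(ramp), 0, 0, 4)` is positive because the right half is brighter than the left, and on a flat image it is zero. Each feature needs only a handful of corner reads, so loading the corners of several features in one instruction is where the parallel memory access pays off.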

Copyright © 2014, CEVA Inc. 26

Feature Extraction Summary Table

Column groups: Algorithm, Memory Access, Execution

Memory Access:

2-Dimensional Access

LUT Support

Parallel Memory Access

Execution:

Dedicated Instructions

Vectorized

Conditional Flow

Algorithm rows:

HOG - Bilinear

HOG - Gamma

FAST9 - Detector

BRIEF - Descriptor

SURF - Int Sum

SURF - Feature Response

SURF - Descriptor
