INTEGRATING COMPUTER VISION SENSOR INNOVATIONS INTO MOBILE DEVICES Eli Savransky Principal Architect - CTO Office Mobile BU NVIDIA corp.

Post on 21-Sep-2020

Page 1: Integrating Computer Vision Sensor Innovations into Mobile ...on-demand.gputechconf.com/gtc/2014/presentations/S... · UI / Smart TV / STB Gaming Automotive Social/Media E-commerce

INTEGRATING COMPUTER VISION SENSOR INNOVATIONS INTO MOBILE DEVICES

Eli Savransky Principal Architect - CTO Office Mobile BU NVIDIA corp.

Page 2:

Computer Vision in Mobile

Tegra K1

It’s time!

Page 3:

AGENDA

Use case categories

Underlying technology examples

Performance and power considerations

Software considerations and dilemmas

Page 4:

VISION FUNCTIONALITY TAXONOMY

3D Reconstruction

— User Facing: Body Modeling, Facial Modeling

— Scene Facing: Object Reconstruction, Scene Reconstruction

Tracking

— User Facing: Face, eye and hand gesture tracking; Body Tracking

— Scene Facing: Environmental Feature Tracking; Indoor/Outdoor Positional Tracking

Scale

— Small Scale

— Large Scale

Markets

— UI / Smart TV / STB

— Gaming

— Automotive

— Social/Media

— E-commerce

— Modeling/Architecture/DIY/3D printing

Page 5:

UNDERLYING TECHNOLOGY: DEPTH EXTRACTION

Obtain a depth map for many points on a 2D picture

Not necessarily for every pixel

From there, we can calculate:

— 3D geometry and model

— Body position and movement

— Face features and expression

Aggregating models is easy

— From different shots

— From different sources
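The step from a depth map to 3D geometry can be sketched as a pinhole back-projection. This is a minimal illustration, not the method used in the talk: the intrinsics `fx`, `fy`, `cx`, `cy` and the function names are assumptions, and `merge_clouds` only concatenates clouds that are assumed to be already expressed in a common coordinate frame (registering shots from different poses is a separate problem).

```python
from typing import List, Optional, Tuple

Point3D = Tuple[float, float, float]


def backproject(depth: List[List[Optional[float]]],
                fx: float, fy: float, cx: float, cy: float) -> List[Point3D]:
    """Turn a sparse depth map into 3D points in the camera frame.

    depth[v][u] is the depth Z at pixel (u, v), or None where no depth
    was recovered (the map need not cover every pixel).
    fx, fy, cx, cy are assumed pinhole intrinsics in pixels.
    """
    points: List[Point3D] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z is None:  # no depth sample at this pixel
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


def merge_clouds(*clouds: List[Point3D]) -> List[Point3D]:
    """Aggregate point clouds that already share one coordinate frame."""
    merged: List[Point3D] = []
    for cloud in clouds:
        merged.extend(cloud)
    return merged
```

Pixels without a depth sample are simply skipped, which matches the point that depth is not necessarily available for every pixel.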

Page 6:

3D SCANNING: THE TECHNOLOGIES

Different approaches:

— Structured light

    Project IR pattern

    Find the pattern symbols on the image

    Triangulate to find depth

— Stereo

    Capture two or more images

    Find corresponding points

    Triangulate to find depth

— Structure from Motion (SfM)

    Similar to Stereo but using same camera over time (instead of multiple cameras)

— Coded / multiple aperture

    Project different patterns and solve for depth

— Time of Flight

    Project pulse of light

    Capture returned phase
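Two of the depth relations above reduce to one-line formulas, sketched here under stated assumptions. For stereo (and structured light, which triangulates the same way) the sketch assumes a rectified pair, so depth is Z = f·B/d for focal length f in pixels, baseline B and disparity d. For time of flight it assumes a continuous-wave sensor that measures the phase shift of a modulated signal, giving d = c·φ/(4π·f_mod). The function and parameter names are illustrative only.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def depth_from_disparity(focal_px: float, baseline_m: float,
                         disparity_px: float) -> float:
    """Triangulated depth (metres) for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px


def depth_from_tof_phase(phase_rad: float, modulation_hz: float) -> float:
    """Distance (metres) from the measured phase shift of a modulated signal.

    The light travels to the target and back, hence the factor 4*pi
    instead of 2*pi. Distances beyond c / (2 * f_mod) wrap around.
    """
    if modulation_hz <= 0:
        raise ValueError("modulation frequency must be positive")
    return SPEED_OF_LIGHT_M_S * phase_rad / (4.0 * math.pi * modulation_hz)
```

Both functions show why accuracy falls with distance: disparity shrinks as 1/Z for stereo, and the phase wraps once the range exceeds c/(2·f_mod) for time of flight.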


Page 7:

UNDERLYING TECHNOLOGY: VISUAL ODOMETRY

Uses camera data to estimate the device's change in position over time

1. Uses either single, stereo, or omnidirectional cameras

2. Image correction for lens distortion

3. Feature detection

4. Construct optical flow field

5. Estimation of the camera motion from the optical flow

    a. Kalman filter or cost function minimization

6. Check for potential tracking errors and remove outliers

7. Periodic repopulation of points to maintain coverage across the image
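Steps 4 to 6 can be sketched in a heavily simplified form. The example below assumes features have already been detected and matched between two frames, and it estimates only the dominant image-plane shift rather than a full 6-DoF camera pose; a real pipeline would replace the averaging with a Kalman filter or cost function minimization. The function name and the `max_dev` threshold are illustrative assumptions.

```python
from statistics import median
from typing import List, Tuple

Point = Tuple[float, float]


def estimate_shift(prev_pts: List[Point], curr_pts: List[Point],
                   max_dev: float = 2.0) -> Tuple[float, float, List[int]]:
    """Estimate the dominant 2D shift between two sets of matched features.

    Returns (dx, dy, inlier_indices). A match is treated as an outlier
    when its flow vector deviates from the per-axis median flow by more
    than max_dev pixels.
    """
    if not prev_pts or len(prev_pts) != len(curr_pts):
        raise ValueError("need two non-empty point lists of equal length")

    # Step 4: sparse optical flow field, one vector per tracked feature.
    flow = [(c[0] - p[0], c[1] - p[1]) for p, c in zip(prev_pts, curr_pts)]

    # Step 6: reject tracking errors that disagree with the median flow.
    med_dx = median(f[0] for f in flow)
    med_dy = median(f[1] for f in flow)
    inliers = [i for i, (dx, dy) in enumerate(flow)
               if abs(dx - med_dx) <= max_dev and abs(dy - med_dy) <= max_dev]
    if not inliers:
        raise ValueError("no consistent flow vectors; tracking lost")

    # Step 5 (simplified): average the surviving flow vectors.
    mean_dx = sum(flow[i][0] for i in inliers) / len(inliers)
    mean_dy = sum(flow[i][1] for i in inliers) / len(inliers)
    return mean_dx, mean_dy, inliers
```

When the inlier list becomes short, that is the cue for step 7: detect new features so the tracked points keep covering the image.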

Images from Davide Scaramuzza

Page 8:

ARE WE THERE YET?

Performance

— Do the algorithms fit in the HW? Is the HW fast enough?

— Do they leave enough headroom for the actual application?

— Do the algorithms and the applications work together efficiently?

Power

— Does it fit the constraints of thermal, max current and battery life?

Cost

— New sensors, light sources, etc.

SW infrastructure

— Do the right APIs exist?

— Is the imaging pipeline flexible enough?

— Are there programming languages/environments to support this?
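The headroom question can be made concrete with simple frame-budget arithmetic. The sketch below is illustrative only: the stage names and timings in the usage note are made-up placeholders, not Tegra K1 measurements.

```python
from typing import Dict


def headroom_ms(fps: float, stage_ms: Dict[str, float]) -> float:
    """Time left per frame (ms) after the vision stages have run.

    A negative result means the pipeline cannot sustain the frame rate.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    frame_budget_ms = 1000.0 / fps
    return frame_budget_ms - sum(stage_ms.values())
```

For example, at 30 fps the budget is about 33.3 ms per frame, so hypothetical stages costing 10 ms and 5 ms would leave roughly 18.3 ms for the application itself.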

Page 9:

TEGRA K1: A MAJOR LEAP FORWARD FOR MOBILE & EMBEDDED APPLICATIONS

KEPLER GPU, 192 CORES

CUDA

12GB/S BANDWIDTH

VIDEO IMAGE COMPOSITOR (VIC)

DESIGNED FOR MOBILE DEVICES

TEGRA K1 block diagram:

— Quad Cortex-A15: 4x Cores (1+ GHz), NEON SIMD, 2 MB L2 (Shared), ARM TrustZone

— Shadow LP C-A15 CPU

— Kepler GeForce® GPU w/CUDA: OpenGL-ES nextgen, 192 Stream Processors

— 2D Graphics/Scaling

— HD Video Processor: 1080p24/30 Video Decode, 1080p24/30 Video Encode, H.264 | MPEG4 | VC1 | MPEG2 | VP8

— Image Processor: 25MP Sensor Support, ISP, 1080p60, Enhanced JPEG Engine

— Audio Processor

— ARM7

— Security Engine

— Display: Display x2, HDMI, eDP/LVDS

— Memory: DDR3 Ctlr 64b, 800+ MHz

— I/O: DAP x5 (I2S/TDM), PCIe* G2 x4 + x1, CSI x4 + x4, SATA2 x1, USB 2.0 x3, USB 3.0* x2, NOR Flash, UART x4, I2C x5, SPI x4, SDIO/MMC x4

— Package: 28 nm HPM, 23x23mm, 0.7mm pitch HS-FCBGA

Page 10:

KEPLER ARCHITECTURE

192 CUDA Cores, SM3.2

ISA compatible with GeForce, Quadro, Tesla

64 KB L1 Cache and Shared Memory

128 KB L2 Cache

128 KB Register File


Page 11:

SW CONSIDERATIONS

Need APIs and frameworks to develop SW

— Flexible and complete enough for experimentation

— Fast and stable enough for productization

— Portable for installed base

APIs and libraries

— Android Camera HAL v3

— OpenCV

— OpenVX

— StreamInput

— VisionWorks

— CUDA

Page 12:

ANDROID CAMERA HAL V3

Camera HAL v3 is a fundamentally new API

— Flexible primitives for building sophisticated use-cases

— Interface is clean and easily extensible

— Apps can have more control, and more responsibility

Enables sophisticated camera applications

Faster time to market and higher quality

— 1 request → 1 capture → 1 result metadata + N image buffers
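The request model on this slide (one request, one capture, one result metadata plus N image buffers) can be pictured with a toy data model. This is a conceptual sketch only: the class names, field names and `simulate_capture` are hypothetical and are not the Android HAL interface, which is a C API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CaptureRequest:
    """Hypothetical request: per-frame settings plus the target streams."""
    frame_number: int
    settings: Dict[str, object]
    output_streams: List[str]


@dataclass
class CaptureResult:
    """Hypothetical result: one metadata record and one buffer per stream."""
    frame_number: int
    metadata: Dict[str, object]
    buffers: Dict[str, bytes]


def simulate_capture(request: CaptureRequest) -> CaptureResult:
    """Pretend to run one capture for one request.

    The settings are echoed back as result metadata and every requested
    output stream receives exactly one (empty placeholder) buffer.
    """
    metadata = dict(request.settings)
    buffers = {stream: b"" for stream in request.output_streams}
    return CaptureResult(request.frame_number, metadata, buffers)
```

Because settings travel with each request, an application can change exposure or output targets frame by frame, which is where the extra control (and responsibility) comes from.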

Page 13:

OPENCV LIBRARY

Version 2.4.5

>900 functions (x the datatypes)

OpenCV4Tegra acceleration:

— CUDA, NEON, GLSL, TBB multithreading

OpenCV functional areas (from the diagram):

— Image processing: General Image Processing, Segmentation, Machine Learning, Detection, Image Pyramids, Transforms, Fitting

— Video, Stereo, and 3D: Camera Calibration, Features, Depth Maps, Optical Flow, Inpainting, Tracking

Page 14:

VISIONWORKS

Sobel

Convolve

Bilateral Filter

Integral Image

Integral Histogram

Corner Harris

Corner FAST

Image Pyramid

Optical Flow PyrLK

Optical Flow Farneback

Warp Perspective

Hough Lines

Fast NLM Denoising

Stereo Block Matching

IME (Iterative Motion Estimation)

HOG (Histogram of Oriented Gradients)

Soft Cascade Detector

Object Tracker

TLD Object Tracker

SLAM

Path Estimator

MedianFlow Estimator
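To give a feel for what one of the listed primitives computes, here is a plain-Python sketch of an integral image. It does not use VisionWorks or reproduce its API; the function names are illustrative, and a GPU implementation would be organized very differently.

```python
from typing import List


def integral_image(img: List[List[int]]) -> List[List[int]]:
    """Summed-area table with one extra row and column of zeros.

    ii[y][x] holds the sum of all pixels img[r][c] with r < y and c < x.
    """
    height = len(img)
    width = len(img[0]) if height else 0
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii: List[List[int]], x0: int, y0: int, x1: int, y1: int) -> int:
    """Sum of pixels in columns x0..x1-1 and rows y0..y1-1, in constant time."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]
```

Once the table is built, any rectangular sum costs four lookups, which is why detectors that evaluate many box features rely on it.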

Page 15:

IT IS HAPPENING!

Use cases emerging

Tegra K1 compute power in mobile devices

Software Infrastructure

Page 16:

THANKS