xilinx revision xfopencv - missing link electronics · xilinx revision xfopencv 26. june 2018...

29
Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott

Upload: others

Post on 10-Jun-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Xilinx reVISION xfOpenCV

26. June 2018Dr.-Ing. Karsten Trott

Page 2: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Page 2

Introduction to reVISION

Page 3: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Page 3

Frameworks

Libraries and Tools

Development Kits

DNN

CNNGoogLeNet

SSD

FCN …

Page 4: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Mandates: From Embedded Vision to Autonomous Systems

Xilinx Unique Application Advantages

Responsive

Reconfigurable

Connected

Optimized from Sensors to <8-bit Inference & Control

Reconfigurable for Latest Networks & Sensors

Any-to-Any Connectivity

Barrier to Broad Adoption:

Software Defined Programming, Libraries and Frameworks

Page 4

Page 5: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Page 5

What is SDSoC?

Page 6: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

HS/SW partitioning with SDSoC

• General-Purpose ports

• Accelerator Coherency Port

• High Performance ports

• Asynch FIFO Interfaces

Page 7: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Page 7

Under the Hood – Vivado High Level Synthesis

Scheduling Binding Control Extraction

int foo(short int c, short int a,

short int b, short int c)

{

int y = x*a + b + c;

return y;

}

Page 8: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Enabling Software Defined Development Flow

C/C++/OpenCL

Creation

Profiling to Identify

Bottlenecks

System Optimizing

Compiler

Computer Vision

Machine Learning

Scheduling of Pre-Optimized

Neural Network Layers

Optimized Accelerators

& Data Motion Network

.prototxt

& Trained

Weights

DNN

CNNGoogLeNet

SSD

FCN …

Page 9: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Page 9

reVISION 2017.4: Tools, Platforms and Demos

Page 10: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Page 10

reVISION wiki pages

Page 11: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Page 11

reVISION 2017.4 wiki page

Page 12: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Page 12

Xilinx xfOpenCV on github

Page 13: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Page 13

Introduction to xfOpenCV

Page 14: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Typical ARM Cortex-A53

OpenCV Needs Acceleration in Embedded

Typical Requirement > 30 FPS

Harris Corner 2.4 FPS

Stereo Depth Map 2.1 FPS

Dense Optical Flow 0.1 FPS

Source: Embedded Vision Alliance,

Embedded Vision Developer Survey, January 2017

Page 15: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

xfOpenCV Library

xfOpenCV directly infers pipelining functions from one to the next, avoiding

frame buffers and external memory

Page 15

Zynq Offers Superior Performance

Rectify

Rectify

Remap

Remap

ARM/eGPU Traditional

DDR

Rectify

Rectify

Remap

Remap

StereoBMStereo

Camera

DDR

Data Streaming Lowers Latency

Rectify

Rectify

Stereo

Camera

Remap

Remap

StereoBM

CV::

StereoLBM@1080p

Xilinx ZU9 Xilinx ZU5 Embedded GPU

Frames/s 700 296 28

Power (W) 4.8 3.3 7.9

Frames/s/watt 145.8 89.7 3.5

• eGPU = nVidia Tegra X1 using VisionWorks for StereoLBM and OpenCV4Tegra for OpticalFlow

• All benchmarks utilize as much resources as possible on GPU (~99%) and programmable logic (~70%)

Page 16: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Page 16

OpenCV Support with Automatic HW Acceleration

main(){

cv::imread(A);

cv::stereoRectify(A,B,C,D);

cv::stereoLBM(C,D,out);

cv::imshow(out);

}

stereoRectify

stereoLBM

300

300

1 2 3 4Cross-compile OpenCV

application to Zynq

(ARM A9/A53)

Profile and identify

bottleneck functions

Minimal changes to the

code and set functions to

hardware.

Compile using SDSoC

Run on a Zynq board

main(){

cv::imread(A);

xf::stereoRectify<line>(A,B,C,D);

xf::stereoLBM<win,n_disp>(C,D,out);

cv::imshow(out);

}

Page 17: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Page 17

xfOpenCV: 50+ Most Needed OpenCV Functions Basic Functionality Geometric Transforms Image Processing and

Filters

Feature Detection and

Classifiers

3D Reconstruction Motion Analysis and

Tracking

Absolute difference Scale/Resize Box Canny edge detection StereoLBM Mean Shift Tracking (MST)

Accumulate StereoRectify Gaussian Fast corner LK Dense Optical Flow

Accumulate squared Warp Affine Median SVM (binary)

Accumulate weighted Warp Perspective Sobel Harris corner

Arithmetic addition Remap Custom convolutionHistogram of Oriented

Gradients (HOG)

Arithmetic subtraction Equalize Histogram

Bitwise: AND, OR, XOR, NOT Dilate

Pixel-wise multiplication Erode

Channel combine Bilateral

Channel extract OTSU Thresholding

Color convert Thresholding

Convert bit depth Image pyramid

Table lookup Color Detection

Integral image

Gradient Magnitude

Histogram

Gradient Phase

Min/Max Location

Mean & Standard Deviation

Page 18: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Page 18

xfOpenCV: Fast and Simple

Standard OpenCV Bilateral filter example

using namespace cv;

Mat imgOutput,Mat imgInput,

// Step 1: Read in imagesimgInput = imread(argv[1], 0);

// Step 2: Call Bilateral functionbilateralFilter(imgInput, imgOutput, 3, sigma_color,sigma_space, cv::BORDER_REPLICATE);

// Step 3: Read out imageimwrite("output_ocv.png", imgOutput);

xfOpencv Bilateral filter example

using namespace xf;

Mat<XF_8UC1, HEIGHT, WIDTH, XF_NPPC1> imgInput;Mat<XF_8UC1, HEIGHT, WIDTH, XF_NPPC1> imgOutput;

// Step 1: Read in imagesimgInput = imread(argv[1], 0);

// Step 2: Call Bilateral functionbilateralFilter <3, XF_BORDER_REPLICATE,XF_8UC1, HEIGHT, WIDTH, XF_NPPC1> (imgInput, imgOutput, sigma_color, sigma_space);

// Step 3: read out imageimwrite("output_ocv.png", imgOutput);

Page 19: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Page 19

Custom CV Function / Library Creation Flow

stereoRectify

stereoLBM

CUSTOM_CV

300

300

300

1 2 3 4Cross-compile to Zynq

(ARM A9/A53)

Write custom CV

function in C, C++ or

OpenCL.

Optimize for hardware

using HLS

Assign functions to

hardware.

Compile using SDSoC

Run on a Zynq board

main(){

cv::imread(A);

xf::stereoRectify<line>(A,B,C,D);

xf::stereoLBM<win,n_disp>(C,D,E);

CUSTOM_CV(E,out);

cv::imshow(out);

}

CUSTOM_CV(E,out){

#pragma HLS PIPELINE

for(…){

#pragma HLS UNROLL

for(…){ …

}

}

}

Page 20: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Page 20

reVISION Examples

Page 21: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Computer Vision Design Example: 4K60 Dense Optical Flow main(){

imread(A);

imread(B);

denseOpticalFlowPyrltr(A,B,out)

imshow(out);}

ZCU102 EV Platform

MIPI

AXIPS

PL

Linux

Libraries

Application

Drivers

denseOpticalFlowPyrltrHDMI

Xilinx ZU9

Frames/s 60

Power (W) 4.8

Latency (ms) 16.7

Utlization 15%

• nVidia number using CUDA OpenCV

• Both Xilinx and nVidia benchmarks do not include the camera inputs and HDMI/DP

• LK dense optical flow, non-pyramidal, non-iterative, Window size 53x53

SDSoC

Generated

Platform

DMA

AXI-S

Page 22: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Computer Vision Design Example: Stereo Disparity Mapmain(){

imread(A);

imread(B);

stereoRectify(A,B,C,D);

stereoLBM(C,D,out);

imshow(out);}

}

ZCU102 EV Platform

USB3

AXIPS

PL

SDSoC

Generated

Platform

StereoLBM

DMA

AXI-S

StereoRectify

Linux

Libraries

Application

Drivers

Xilinx ZU9

Frames/s 140

Power (W) 4.8

Latency (ms) 7.1

Utilization 14%

• nVidia number using CUDA OpenCV

• SAD based stereo localBM

• Both Xilinx and nVidia benchmarks do not include the camera inputs and HDMI/DP outputs

HDMI

Page 23: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Page 24

Semantic Segmentation Demovia xfOpenCV cores and custom CNN

(reVISION 2.0 / SDSoC 2017.2)

f2d.elf -i 1920x1080@UYVY

Page 24: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Page 25

Object Detection with Deephi DPU (CNN)

Pink – cars

Blue – Pedestrians

Yellow – bicycles/

motorcycles

Additional meta data are

available for any object

Page 25: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Page 26

Summary

LUT FF DSP48 BRAM18K latency Clock Period frame rate

(cycles) (ns) (Hz)

LK Optical Flow File/Video IO @4K 7732 11978 42 178 8402866 2.853 41.71

LK Optical Flow Video IO @4K (2pix/clk) 15106 22706 82 258 4205768 2.985 79.65

Stereo Vision Video IO @1080 34321 41122 71 335 19194357 2.92 17.84

InitUndistoriRectifyMapInverse 3026 6685 36 3 2073688 2.833 170.22

StereoBM 28643 32087 20 73 19194357 2.92 17.84

remap 2652 2350 15 259 4259970 2.855 82.22

ZU9 device available resources 274080 548160 2520 912

Page 26: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Page 27

References

Page 27: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Xilinx Developer Forum 2018 (conference with 400 attendees) to see most recent news about Xilinx and his partners

– https://www.xilinx.com/video/events/xilinx-developer-forum-frankfurt-2018.html

Xilinx github website

– https://github.com/Xilinx

Xilinx User Guide (for FPGA and HLS beginners): Introduction to FPGA Design with Vivado High-Level Synthesis

http://www.xilinx.com/support/documentation/sw_manuals/ug998-vivado-intro-fpga-design-hls.pdf

Vivado HLS User Guide and Tutorial manuals

– https://www.xilinx.com/support/documentation/sw_manuals/xilinx2017_4/ug902-vivado-high-level-synthesis.pdf

– https://www.xilinx.com/support/documentation/sw_manuals/xilinx2017_4/ug871-vivado-high-level-synthesis-tutorial.pdf

The Zynq book

– http://www.zynqbook.com/

XAPP1170: Zynq-7000 All Programmable SoC Accelerator for Floating-Point Matrix Multiplication using Vivado HLS

– http://www.xilinx.com/support/documentation/application_notes/xapp1170-zynq-hls.pdf

XAPP1300: Demystifying the Lucas-Kanade Optical Flow Algorithm with Vivado HLS

– https://www.xilinx.com/support/documentation/application_notes/xapp1300-lucas-kanade-optical-flow.pdf

XCell paper: Using Vivado HLS to Design a Median Filter and Sorting Network for video

– www.xilinx.com/publications/archives/xcell/Xcell86.pdf

Xilinx reVISION stack:

– https://www.xilinx.com/products/design-tools/embedded-vision-zone.html

PYNQ board with BNN and PYTHON

– http://www.pynq.io/

– https://github.com/Xilinx/PYNQ

Page 28

Hyperlinks

Page 28: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Page 29

Page 29: Xilinx reVISION xfOpenCV - Missing Link Electronics · Xilinx reVISION xfOpenCV 26. June 2018 Dr.-Ing. Karsten Trott. Page 2 Introduction to reVISION. Page 3 ... HS/SW partitioning

Page 30

Follow Xilinx

facebook.com/XilinxInc

twitter.com/XilinxInc

youtube.com/XilinxInc

linkedin.com/company/xilinx

plus.google.com/+Xilinx/