khronos overview nov13


Upload: the-khronos-group-inc

Post on 08-May-2015


DESCRIPTION

Following our successful participation at SIGGRAPH Asia 2012 in Singapore, the Khronos Group is excited to demonstrate and educate about Khronos APIs at SIGGRAPH Asia 2013 in Hong Kong. This presentation is the Khronos Overview--the state of the art in open standards for visual computing, by Neil Trevett.

TRANSCRIPT

Page 1: Khronos Overview Nov13

© Copyright Khronos Group 2013 - Page 1

Khronos Overview

The State of the Art in Open Standards for Visual Computing

Neil Trevett

Khronos President

Vice President Mobile Content, NVIDIA

Page 2: Khronos Overview Nov13

Khronos Connects Software to Silicon

ROYALTY-FREE, OPEN STANDARD APIs for advanced hardware acceleration

Low-level silicon-to-software interfaces needed on every platform

Graphics, video, audio, compute, vision, sensor and camera processing

Defines the forward-looking roadmap for the silicon community

Shipping on billions of devices across multiple operating systems

Rigorous conformance tests for cross-vendor consistency

Khronos is OPEN for any company to join and participate

Acceleration APIs BY the Industry FOR the Industry

Page 3: Khronos Overview Nov13

Making a Difference – One API at a Time

Well over 1 BILLION people are using what the Khronos members have created together - Every Day…

Page 4: Khronos Overview Nov13

Khronos Standards

Visual Computing

- Object and Terrain Visualization

- Advanced scene construction

3D Asset Handling

- Advanced Authoring pipelines

- 3D Asset Transmission Format with streaming and compression

Acceleration in the Browser

- WebGL for 3D in browsers

- WebCL – Heterogeneous Computing for the web

Sensor Processing

- Mobile Vision Acceleration

- On-device Sensor Fusion

Camera Control API

OpenCL 2.0 Finalized!

glTF cooperation with MPEG for 3D Asset Compression!

OpenVX 1.0 Provisional Released!

WebGL and WebCL Momentum!

Over 100 companies defining royalty-free APIs to connect software to silicon

Page 5: Khronos Overview Nov13

OpenCL Milestones

• 24 month cadence for major OpenCL 2.0 update

- Slightly longer than 18 month cadence between versions of OpenCL 1.X

• Significant feedback from the developer community on Provisional Specification

- Many suggestions were incorporated into the final 2.0 specification

- Other feedback will be considered for future specification versions

Timeline:

- Dec08: OpenCL 1.0 released. Conformance tests released Dec08

- Jun10: OpenCL 1.1 Specification and conformance tests released

- Nov11: OpenCL 1.2 Specification and conformance tests released

- Jul13: OpenCL 2.0 Provisional Specification released for public review

- Nov13: OpenCL 2.0 Specification finalized and conformance tests released
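The cadence figures quoted on this slide can be checked from the milestone months. The short Python sketch below does that arithmetic; the slide gives only month and year, so the day of month (1) and the helper name `months_between` are assumptions made for the example.

```python
from datetime import date

# Final-specification milestones from the slide (day of month is an assumption).
releases = [
    ("OpenCL 1.0", date(2008, 12, 1)),
    ("OpenCL 1.1", date(2010, 6, 1)),
    ("OpenCL 1.2", date(2011, 11, 1)),
    ("OpenCL 2.0", date(2013, 11, 1)),
]

def months_between(earlier, later):
    """Whole calendar months from `earlier` to `later`."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)

# Gap in months between each consecutive pair of releases.
gaps = [
    (prev_name, cur_name, months_between(prev_date, cur_date))
    for (prev_name, prev_date), (cur_name, cur_date) in zip(releases, releases[1:])
]

for prev_name, cur_name, months in gaps:
    print(f"{prev_name} -> {cur_name}: {months} months")
```

The 1.X gaps come out at roughly a year and a half each, and the step to 2.0 at two years, which matches the cadence described in the bullets.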

Page 6: Khronos Overview Nov13

Key OpenCL 2.0 Features

• Shared Virtual Memory

- Host and device kernels can directly share complex, pointer-containing data structures such as trees and linked lists, providing significant programming flexibility and eliminating costly data transfers between host and devices

• Nested Parallelism

- Device kernels can enqueue kernels to the same device with no host interaction, enabling flexible work scheduling paradigms and avoiding the need to transfer execution control and data between the device and host, often significantly offloading host processor bottlenecks

• Generic Address Space

- Functions can be written without specifying a named address space for arguments, especially useful for those arguments that are declared to be a pointer to a type, eliminating the need for multiple functions to be written for each named address space used in an application

Page 7: Khronos Overview Nov13

Broad OpenCL Implementer Adoption

• Multiple conformant implementations shipping on desktop and mobile

- For CPUs and GPUs on multiple OS

• Android ICD extension released in latest extension specification

- OpenCL implementations can be discovered and loaded as a shared object

• Multiple implementations shipping in Android NDK

- ARM, Imagination, Vivante, Qualcomm, Samsung …

Page 8: Khronos Overview Nov13

OpenCL as Parallel Compute Foundation

• 100+ tool chains and languages leveraging OpenCL

- Heterogeneous solutions emerging for the most popular programming languages

[Diagram: languages and tool chains layered on top of OpenCL]

- OpenCL HLM: C++ syntax/compiler extensions

- WebCL: JavaScript binding to OpenCL for initiation of OpenCL C kernels

- River Trail: Language extensions to JavaScript

- C++ AMP: Shevlin Park, uses Clang and LLVM

- Harlan: High level language for GPU programming

- Compiler directives for Fortran, C and C++

- Aparapi: Java language extensions for parallelism

- PyOpenCL: Python wrapper around OpenCL

OpenCL provides vendor optimized, cross-platform, cross-vendor access to heterogeneous compute resources

Page 9: Khronos Overview Nov13

Widespread Developers Leveraging OpenCL

• Broad uptake of OpenCL in commercial applications

- For desktop and increasingly mobile apps

• Searching for “OpenCL” on SourceForge, GitHub, Google Code and Bitbucket finds over 2,000 projects

- x264

- Handbrake

- FFMPEG

- JPEG

- VLC

- OpenCV

- GIMP

- ImageMagick

- IrfanView

- Hadoop, Memcached

- Aparapi – A parallel API (for Java)

- Bolt – a Unified Heterogeneous Library

- Sumatra – next generation of compute enabled Java

- WinZip

- Crypto++

- Bullet physics library

- Etc. Etc.

Page 10: Khronos Overview Nov13

OpenCL Academic Traction

• OpenCL at over 100 Universities Worldwide

- Teaching multi-faceted programming courses

- Research with top-tier Universities globally

• Complete University Kits available

- Presentation w/instructor & speaker notes

- Example code, & sample application

• Growing textbook ecosystem

- US, Japan, Europe, China and India

• Number of papers referencing OpenCL on Google Scholar is growing rapidly

- Over 2000 papers in 2012

• Commercial OpenCL training courses

- http://www.accelereyes.com/services/training

- http://developer.amd.com/Resources/library/Pages/default.aspx

Page 11: Khronos Overview Nov13

Leveraging Proven Native APIs into HTML5

• Khronos and W3C liaison

- Leverage proven native API investments into the Web

- Fast API development and deployment

- Designed by the hardware community

- Familiar foundation reduces developer learning curve

[Diagram: Native APIs alongside their JavaScript / HTML counterparts]

Legend: Native APIs shipping or Khronos working group; JavaScript API shipping, acceleration being developed or work underway; Possible future JavaScript APIs or acceleration

Labels: Canvas; Path Rendering; Camera Control; WebVX? (Vision Processing); WebCAM(!) (Camera control and video processing); WebStream? (Sensor Fusion)

Page 12: Khronos Overview Nov13

Mobile Web is a Real Time Application

Buttery smooth touch interaction needs continuous 60Hz updates

Apple iPhone: 320x480, 153K Pixels, 163 DPI

Apple iPad: 1024x768, 786K Pixels, 132 DPI

Apple iPad Mini: 2048x1536, 3100K Pixels, 326 DPI

In 5 years the number of pixels to process on mobile screens has gone up by a factor of TWENTY

Need GPU Acceleration for everything Web!
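The pixel counts and the "factor of TWENTY" on this slide follow directly from the quoted resolutions. A small Python check using only those resolutions and the 60Hz update rate is shown below; the dictionary and variable names are illustrative. The exact count for 2048x1536 is 3,145,728, which the slide rounds to 3100K.

```python
# Screen resolutions quoted on the slide (width, height in pixels).
screens = {
    "iPhone": (320, 480),
    "iPad": (1024, 768),
    "iPad Mini": (2048, 1536),
}

# Total pixels per screen.
pixels = {name: w * h for name, (w, h) in screens.items()}

# Growth from the smallest to the largest screen listed.
growth = pixels["iPad Mini"] / pixels["iPhone"]

# Pixels that must be produced every second for continuous 60Hz updates.
REFRESH_HZ = 60
pixels_per_second = {name: count * REFRESH_HZ for name, count in pixels.items()}

for name, count in pixels.items():
    rate = pixels_per_second[name] / 1e6
    print(f"{name}: {count / 1000:.0f}K pixels, {rate:.1f}M pixels/s at {REFRESH_HZ}Hz")
print(f"Growth factor: {growth:.2f}x")
```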

Page 13: Khronos Overview Nov13

WebGL Availability in Browsers

- Microsoft – “where you have IE11, you have WebGL – turned on by default and working all the time”

- Microsoft - WebGL also enabled for Windows applications - web app framework and web view

- Apple - WebGL must be explicitly turned on in Mac Safari and is only exposed on iOS for iAds

- Chrome OS - WebGL is the only cross-platform API to program the GPU

- Google I/O announcement - Chrome on Android will soon launch with WebGL

Much WebGL content uses the three.js library: http://threejs.org/

Page 14: Khronos Overview Nov13

Microsoft PhotoSynth2

• Demonstrated at Build 2013

- http://channel9.msdn.com/Events/Build/2013/4-072 (1:50)

Page 15: Khronos Overview Nov13

Cross-OS Portability

[Diagram: application development options across mobile operating systems]

HTML/CSS (shown for each platform): HTML5 provides cross-platform portability. GPU accessibility through WebGL available soon on ~90% of mobile systems

SDK, Dalvik (Java), Objective C, C#, DirectX: Preferred development environments not designed for portability

C/C++: Native code is portable, but apps must cope with different available APIs and libraries

Page 16: Khronos Overview Nov13

OpenGL 3D API Family Tree

[Timeline diagram, 2002 to 2013]

Desktop 3D: OpenGL 1.3, OpenGL 1.5, OpenGL 2.0, OpenGL 2.1, OpenGL 3.0, OpenGL 3.1, OpenGL 3.2, OpenGL 3.3, OpenGL 4.0, OpenGL 4.1, OpenGL 4.2, OpenGL 4.3, OpenGL 4.4, GL-Next

Mobile 3D: OpenGL ES 1.0, OpenGL ES 1.1, OpenGL ES 2.0, OpenGL ES 3.0, ES-Next

Web: WebGL 1.0, WebGL 2.0

Content generations: OpenGL ES 1.1 Content (fixed function 3D pipeline); OpenGL ES 2.0 Content (programmable vertex and fragment shaders); OpenGL ES 3.0 Content

- ES3 is backward compatible so new features can be added incrementally

- OpenGL 4.4 is a superset of DX11

- WebGL 2.0 is in development now - will bring OpenGL ES 3.0 functionality to the Web

http://www.khronos.org/webgl/public-mailing-list/

http://www.khronos.org/registry/webgl/specs/latest/

http://www.khronos.org/webgl/wiki/Testing/Conformance

Page 17: Khronos Overview Nov13

OpenGL ES 3.0 Highlights

• Better looking, faster performing games and apps – at lower power

- Incorporates proven features from OpenGL 3.3 / 4.x

- 32-bit integers and floats in shader programs

- NPOT, 3D textures, depth textures, texture arrays

- Multiple Render Targets for deferred rendering, Occlusion Queries

- Instanced Rendering, Transform Feedback …

• Make life better for the programmer

- Tighter requirements for supported features to reduce implementation variability

• Backward compatible with OpenGL ES 2.0

- OpenGL ES 2.0 apps continue to run unmodified

• Standardized Texture Compression

- #1 developer request!

Page 18: Khronos Overview Nov13

3D Needs a Transmission Format!

• Compression and streaming of 3D assets becoming essential

- Mobile and connected devices need access to increasingly large asset databases

• 3D is the last media type to define a compressed format

- 3D is more complex – diverse asset types and use cases

• Needs to be royalty-free

- Avoid an ‘internet video codec war’ scenario

• Eventually enable hardware implementations of successful codecs

- High-performance and low power – but pragmatic adoption strategy is key

Media type and its widely adopted codec:

- Audio: MP3

- Video: H.264

- Images: JPEG

- 3D: ?!

An effective and widely adopted codec ignites previously unimagined opportunities for a media type

Page 19: Khronos Overview Nov13

glTF – OpenGL Transmission Format

• Binary file format for efficient transmission of 3D assets

- Reduce network bandwidth and minimize client processing overhead

• Run-time neutral - DOES NOT IMPLY OR MANDATE ANY RUN-TIME BEHAVIOR

- Can be used by any app or run-time – usually WebGL accelerated

• Scalable to handle compression and streaming

- Though baseline format does not include compression

• ‘Direct load efficiency’ for WebGL

- Little or NO processing to drop glTF data into WebGL client

• Carry conditioned data from any authoring format

- Prototyping and optimizing efficient handling of COLLADA assets

[Diagram: Authoring -> Playback]

A standards-based content pipeline for rich native and Web 3D applications

Page 20: Khronos Overview Nov13

COLLADA and glTF Open Source Ecosystem

[Diagram: open source tools connecting authoring formats to WebGL deployment]

Labels:

- Tool Interop

- OpenCOLLADA Importer/Exporter and COLLADA Conformance Tests

- COLLADA2GLTF Translator

- Other authoring formats

- Web-based Tools

- Three.js glTF Importer. Rest3D initiative

- Pervasive WebGL deployment

On GitHub:

https://github.com/KhronosGroup/glTF

https://github.com/KhronosGroup/OpenCOLLADA

https://github.com/KhronosGroup/COLLADA-CTS

Page 21: Khronos Overview Nov13

WebGL as Test-bed for 3D Asset Compression

• Integrating and benchmarking 3D geometry compression formats with glTF

- Baseline is GZIP

• Scalable Complexity 3D Mesh Compression codec MPEG-SC3DMC

- Royalty-free graphics compression technology from MPEG (MIT License)

- Open3DGC is an efficient JavaScript and C/C++ implementation

- Converter using Open3DGC to compress 3D Meshes, Skinning, Animations

- https://github.com/amd/rest3d/tree/master/server/o3dgc

• WebGL-loader is Google's lightweight compression for WebGL content

• OpenCTM uses LZMA compression

Page 22: Khronos Overview Nov13

Initial Compression Results

• Compression Efficiency

- Gzip (default level=6)

- OpenCTM (default settings)

- Open3DGC and Webgl-loader: positions on 14 bits, normals and texCoords on 10 bits

Open3DGC is 5x-9x more efficient than Gzip, 1.3x-2.4x more efficient than OpenCTM and 1.2x-1.5x more efficient than webgl-loader

[Bar chart: Size (MBytes), axis from 0 to 400, for three datasets: CAD (3748 models), 3D Scanned (78 models), MPEG dataset (1211 models). Series: Gzip, OpenCTM, Webgl-loader + Gzip, Open3DGC-ASCII + Gzip, Open3DGC-Binary]
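As a rough illustration of the settings named on this slide, the sketch below quantizes vertex positions to 14 bits and runs DEFLATE at level 6 (the algorithm and default level Gzip uses) on both the raw 32-bit floats and the quantized integers. It is not Open3DGC, OpenCTM or webgl-loader: the `quantize` and `dequantize` helpers, the synthetic helix data and the 16-bit packing are assumptions made for the example, and only Python's standard `zlib` and `struct` modules are used.

```python
import math
import struct
import zlib

def quantize(values, bits):
    """Map floats uniformly onto integers in [0, 2**bits - 1]."""
    lo, hi = min(values), max(values)
    scale = ((1 << bits) - 1) / (hi - lo) if hi > lo else 0.0
    return [round((v - lo) * scale) for v in values], lo, hi

def dequantize(q, lo, hi, bits):
    """Inverse of quantize, accurate to about half a quantization step."""
    step = (hi - lo) / ((1 << bits) - 1)
    return [lo + x * step for x in q]

# Synthetic positions: x, y, z of points along a helix (illustrative data only).
coords = []
for i in range(2000):
    t = i * 0.01
    coords += [math.cos(t), math.sin(t), 0.05 * t]

BITS = 14
raw = struct.pack(f"<{len(coords)}f", *coords)      # 32-bit floats
q, lo, hi = quantize(coords, BITS)
packed = struct.pack(f"<{len(q)}H", *q)             # 14-bit values in 16-bit words

deflate_raw = zlib.compress(raw, 6)                 # baseline, level 6
deflate_quantized = zlib.compress(packed, 6)

# Largest reconstruction error introduced by the 14-bit quantization.
max_error = max(abs(a - b) for a, b in zip(coords, dequantize(q, lo, hi, BITS)))

print("raw bytes:", len(raw))
print("deflate(raw):", len(deflate_raw))
print("deflate(quantized):", len(deflate_quantized))
print("max position error:", max_error)
```

Quantization is lossy, so the printed error is the price paid for the smaller payload; dedicated mesh codecs go further by also predicting each value from its neighbours before entropy coding.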

Page 23: Khronos Overview Nov13

OpenVX – Power Efficient Vision Processing

• Acceleration API for real-time vision

- Focus on mobile and embedded systems

• Diversity of efficient implementations

- From programmable processors, through GPUs to dedicated hardware pipelines

• Tightly specified API with conformance

- Portable, production-grade vision functions

• Complementary to OpenCV

- Which is great for prototyping

[Diagram: software stack. Labels: Application; OpenCV open source library; Other higher-level CV libraries; Open source sample implementation; Hardware vendor implementations]

Acceleration for power-efficient vision processing

Page 24: Khronos Overview Nov13

OpenVX Graphs

• Vision processing directed graphs for power and performance efficiency

- Each Node can be implemented in software or accelerated hardware

- Nodes may be fused by the implementation to eliminate memory transfers

- Tiling extension enables user nodes (extensions) to also run in local memory

• VXU Utility Library for access to single nodes

- Easy way to start using OpenVX

• EGLStreams can provide data and event interop with other APIs

- BUT use of other Khronos APIs is not mandated

[Diagram: Example Graph and Flow. Labels: Native Camera Control; four OpenVX Nodes; Heterogeneous Processing]
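The graph model can be sketched in a few lines of plain Python. This is not the OpenVX API: the `Node` and `Graph` classes and the toy operations below are hypothetical stand-ins, meant only to show how declaring the whole graph up front lets the implementation decide execution order itself (and, in a real implementation, where to fuse nodes or keep intermediate data local).

```python
class Node:
    """One processing step: `inputs` names the data it consumes, `output` what it produces."""
    def __init__(self, name, func, inputs, output):
        self.name, self.func, self.inputs, self.output = name, func, inputs, output

class Graph:
    def __init__(self):
        self.nodes = []

    def add(self, node):
        self.nodes.append(node)
        return self

    def schedule(self, available):
        """Order nodes so that each runs only after all of its inputs exist."""
        ready, pending, order = set(available), list(self.nodes), []
        while pending:
            runnable = [n for n in pending if all(i in ready for i in n.inputs)]
            if not runnable:
                raise ValueError("graph has a cycle or a missing input")
            for n in runnable:
                order.append(n)
                ready.add(n.output)
                pending.remove(n)
        return order

    def process(self, data):
        data = dict(data)
        for node in self.schedule(data):
            data[node.output] = node.func(*(data[i] for i in node.inputs))
        return data

# Toy "images" are flat lists of pixel values; nodes are added out of order on purpose.
graph = (
    Graph()
    .add(Node("threshold", lambda img: [1 if p > 128 else 0 for p in img], ["blurred"], "mask"))
    .add(Node("blur", lambda img: [(a + b) // 2 for a, b in zip(img, img[1:] + img[-1:])], ["camera"], "blurred"))
    .add(Node("count", lambda mask: sum(mask), ["mask"], "bright_pixels"))
)

order = [n.name for n in graph.schedule({"camera"})]
result = graph.process({"camera": [10, 200, 220, 30, 250, 250]})
print(order, result["bright_pixels"])
```

Because the application hands over the complete graph rather than calling one function at a time, the scheduling decision sits with the implementation, which is what makes node fusion and local-memory tiling possible.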

Page 25: Khronos Overview Nov13

OpenVX 1.0 Function Overview

• Core data structures

- Images and Image Pyramids

- Processing Graphs, Kernels, Parameters

• Image Processing

- Arithmetic, Logical, and statistical operations

- Multichannel Color and BitDepth Extraction and Conversion

- 2D Filtering and Morphological operations

- Image Resizing and Warping

• Core Computer Vision

- Pyramid computation

- Integral Image computation

• Feature Extraction and Tracking

- Histogram Computation and Equalization

- Canny Edge Detection

- Harris and FAST Corner detection

- Sparse Optical Flow
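Of the functions listed, integral image computation is compact enough to show in full. The sketch below is plain Python for illustration, not the OpenVX function itself; `integral_image` and `box_sum` are names chosen for this example.

```python
def integral_image(img):
    """Return S where S[y][x] is the sum of img over rows 0..y and columns 0..x."""
    height, width = len(img), len(img[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            out[y][x] = row_sum + (out[y - 1][x] if y > 0 else 0)
    return out

def box_sum(S, x0, y0, x1, y1):
    """Sum of the original pixels in the inclusive rectangle (x0, y0)-(x1, y1),
    using at most four lookups into the integral image S."""
    total = S[y1][x1]
    if x0 > 0:
        total -= S[y1][x0 - 1]
    if y0 > 0:
        total -= S[y0 - 1][x1]
    if x0 > 0 and y0 > 0:
        total += S[y0 - 1][x0 - 1]
    return total

image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
S = integral_image(image)
print(S)
print(box_sum(S, 1, 1, 2, 2))  # sum of the bottom-right 2x2 block
```

Once the integral image exists, the sum over any rectangle costs the same few lookups regardless of its size, which is why it is a common building block for box filters and feature detectors.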

Page 26: Khronos Overview Nov13

OpenVX Participants and Timeline

• Aiming for specification finalization by mid-2014

• Itseez is working group chair

• Qualcomm and TI are specification editors

Page 27: Khronos Overview Nov13

OpenVX and OpenCV are Complementary

Governance

- OpenCV: Open source, community driven, no formal specification

- OpenVX: Formal specification and conformance tests, implemented by hardware vendors

Scope

- OpenCV: Very wide, 1000s of functions of imaging and vision, multiple camera APIs/interfaces

- OpenVX: Tight focus on hardware accelerated functions for mobile vision, use external camera API

Conformance

- OpenCV: No conformance testing, every vendor implements a different subset

- OpenVX: Full conformance test suite / process, reliable acceleration platform

Use Case

- OpenCV: Rapid prototyping

- OpenVX: Production deployment

Efficiency

- OpenCV: Memory-based architecture, each operation reads and writes memory

- OpenVX: Graph-based execution, optimizable computation and data transfer

Portability

- OpenCV: APIs can vary depending on processor

- OpenVX: Hardware abstracted for portability

Page 28: Khronos Overview Nov13

OpenVX and OpenCL are Complementary

Use Case

- OpenCL: General heterogeneous programming

- OpenVX: Domain targeted - vision processing

Architecture

- OpenCL: Language-based – needs online compilation

- OpenVX: Library-based - no online compiler required

Target Hardware

- OpenCL: ‘Exposed’ architected memory model – can impact performance portability

- OpenVX: Abstracted node and memory model - diverse implementations can be optimized for power and performance

Precision

- OpenCL: Full IEEE floating point mandated

- OpenVX: Minimal floating point requirements – optimized for vision operators

Ease of Use

- OpenCL: Focus on general-purpose math libraries with no built-in vision functions

- OpenVX: Fully implemented vision operators and framework ‘out of the box’

Page 29: Khronos Overview Nov13

Typical Imaging Pipeline

• Pre- and Post-processing can be done on CPU, GPU, DSP…

• ISP controls camera via 3A algorithms

- Auto Exposure (AE), Auto White Balance (AWB), Auto Focus (AF)

• ISP may be a separate chip or within Application Processor

[Diagram: Lens -> Color Filter Array -> CMOS sensor -> (Bayer) -> Pre-processing -> Image Signal Processor (ISP) -> (RGB/YUV) -> Post-processing -> App, with 3A feedback from the ISP for lens, sensor and aperture control]

Need for advanced camera control API:

- to drive more flexible app camera control

- over more types of camera sensors

- with tighter integration with the rest of the system

Page 30: Khronos Overview Nov13

Khronos Camera API

• Catalyze camera functionality not available on any current platform

- Open API that aligns with future platform direction for easy adoption

- E.g. could be used to implement future versions of Android Camera HAL

• More detailed control per frame

- Focus, flash, format, Region of Interest (ROI) selection

• Global Timing & Synchronization

- E.g. Between cameras and MEMS sensors

• Application control over ISP processing (including 3A)

- Including multiple, re-entrant ISPs

• Control multiple sensors with synch and alignment

- Stereo pairs, Plenoptic arrays, TOF or structured light depth cameras

• Flexible processing/streaming

- Multiple output streams and streaming rows (not just frames)

- RAW, Bayer and YUV Processing

Page 31: Khronos Overview Nov13

Camera API Design Philosophy

• C-language API starting from proven designs

- e.g. FCAM, Android Camera HAL V3

• Design alignment with widely used hardware standards

- e.g. MIPI CSI

• Focus on mobile, power-limited devices

- But do not preclude other use cases such as automotive, surveillance, DSLR…

• Minimize overlap and maximize interoperability with other Khronos APIs

- But other Khronos APIs are not required

• Provide support for vendor-specific extensions

[Timeline diagram]

Dates shown: Apr13, Jul13, 4Q13, 1Q14, 2Q14, 3Q14

Milestones shown: Group charter approved; Provisional specification; First draft specification; Sample implementation and tests; Specification ratification

Page 32: Khronos Overview Nov13

‘Always On’ Camera and Sensor Processing

• Visual sensor revolution – driving need for significant vision acceleration

- Multi-sensors: Stereo pairs -> Plenoptic arrays -> Active depth cameras

• Devices should be always environmentally-aware – e.g. ‘wave to wake’

- BUT many sensor use cases consume too much power to actually run 24/7

• Smart use of sensors to trigger levels of processing capability

- ‘Scanners’ - very low power, always on, detect events in the environment

Processing levels:

- ARM 7: 1 MIP and accelerometers can detect someone in the vicinity

- DSP / Hardware: Low power activation of camera to detect someone in field of view

- GPU / Hardware: Maximum acceleration for processing full depth sensor capability
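The escalation idea on this slide can be sketched as a tiered pipeline in which each level wakes the next only when it detects something. Everything below is a made-up placeholder for illustration: the `Tier` class, the detector lambdas and especially the power figures are assumptions, since the slide gives no such numbers.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Tier:
    name: str
    power_mw: float                   # hypothetical power draw, not from the slide
    detect: Callable[[dict], bool]    # returns True to wake the next tier

def run_tiers(tiers: List[Tier], sample: dict) -> Tuple[List[str], float]:
    """Run tiers in order and stop at the first one that detects nothing.
    Returns the names of the tiers that ran and their combined power."""
    active, power = [], 0.0
    for tier in tiers:
        active.append(tier.name)
        power += tier.power_mw
        if not tier.detect(sample):
            break
    return active, power

tiers = [
    Tier("scanner (accelerometer)", 1.0, lambda s: s["motion"] > 0.2),
    Tier("DSP (low-power camera)", 50.0, lambda s: s["person_in_view"]),
    Tier("GPU (full depth processing)", 1000.0, lambda s: True),
]

idle = run_tiers(tiers, {"motion": 0.0, "person_in_view": False})
passing = run_tiers(tiers, {"motion": 0.9, "person_in_view": False})
engaged = run_tiers(tiers, {"motion": 0.9, "person_in_view": True})

for label, (names, power) in [("idle", idle), ("passing", passing), ("engaged", engaged)]:
    print(f"{label}: {len(names)} tier(s) active, {power} mW")
```

With this structure the expensive tiers only draw power when a cheaper tier has already seen a reason to wake them, which is the point of the ‘scanner’ level.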

Page 33: Khronos Overview Nov13

Sensor Industry Fragmentation …

Page 34: Khronos Overview Nov13

StreamInput - Sensor Fusion

• Defines access to high-quality fused sensor stream and context changes

- Implementers can optimize and innovate generation of the sensor stream

[Diagram: sensor software stack. Labels: Applications; OS Sensor APIs (e.g. Android SensorManager or iOS CoreMotion); Middleware (e.g. Augmented Reality engines, gaming engines); Sensor Hubs; Sensors]

- Low-level native API defines access to fused sensor data stream and context-awareness

- StreamInput implementations compete on sensor stream quality, reduced power consumption, environment triggering and context detection – enabling sensor subsystem vendors to increase ADDED VALUE

- Platforms can provide increased access to improved sensor data stream – driving faster, deeper sensor usage by applications

- Middleware engines need platform-portable access to native, low-level sensor data stream

- Mobile or embedded platforms without sensor fusion APIs can provide direct application access to StreamInput

Page 35: Khronos Overview Nov13

Khronos APIs for Augmented Reality

[Diagram: Khronos APIs working together in an Augmented Reality application]

Labels:

- Advanced Camera Control and stream generation

- Camera Control API

- MEMS Sensors

- Sensor Fusion

- Vision Processing

- 3D Rendering and Video Composition on GPU

- Audio Rendering

- Application on CPUs, GPUs and DSPs

- EGLStream - stream data between APIs

- Precision timestamps on all sensor samples

AR needs not just advanced sensor processing, vision acceleration, computation and rendering - but also for all these subsystems to work efficiently together

Page 36: Khronos Overview Nov13

Khronos DevU In Depth Sessions Today