TRANSCRIPT
The next-gen technologies driving immersion
Qualcomm Technologies, Inc., February 2017
Technology improvements continue to drive more immersive experiences, especially for VR and AR:
1. High Dynamic Range (HDR) will enhance the visual quality on all our screens
2. Scene-based audio is a new paradigm for 3D audio
3. Natural UIs like voice, gestures, and eye tracking are making interactions more intuitive

Immersion enhances our experiences
Immersive experiences
• Draw you in…
• Take you to another place…
• Keep you present in the moment…
The experiences worth having, remembering, and reliving
Immersion enhances everyday experiences
Experiences become more realistic, engaging, and satisfying
Spanning devices at home, work, and throughout life
• Life-like video conferencing
• Smooth, interactive, cognitive user interfaces
• Augmented reality experiences
• Virtual reality experiences
• Realistic gaming experiences
• Theater-quality movies and live sports
Achieving full immersion
By simultaneously focusing on three key pillars: visual quality, sound quality, and intuitive interactions
The next-gen technologies driving immersion
Achieving full immersion at low power to enable a comfortable, sleek form factor
• Visual quality: High dynamic range (HDR), with increased contrast, an expanded color gamut, and increased color depth
• Sound quality: Scene-based audio, delivering 3D and positional audio through higher order ambisonics
• Intuitive interactions: Natural user interfaces, with adaptive, multi-modal inputs like voice, gestures, and eye tracking
HDR for enhanced visual quality
Increased brightness and contrast, expanded color gamut, and increased color depth
[Comparison images: the same scene with HDR ON and HDR OFF]
HDR images and videos are visually stunning
Much more realistic and immersive
HDR will enhance the visual quality on all our screens
Bringing our experiences closer to full immersion
Visuals so vibrant that they are eventually indistinguishable from the real world
Achieving realistic HDR is challenging
Real-life brightness has a wide dynamic range that is hard to capture and replicate

Real life
• Sun: ~10^9 nits¹
• Sunlit scene: ~10^5 nits
• Starlight: ~10^-3 nits
• Dynamic range: ~10^12:1 (see the arithmetic sketched below)

Human vision
• The eye's dynamic range: ~10^4:1 (static), ~10^6:1 (dynamic)
• Eyes are sensitive to relative luminance

Camera and display technologies
• Camera sensors can't capture the full dynamic range
• Display panels can't replicate the full dynamic range

¹ The nit is the unit of luminance, also known as the candela per square meter (cd/m²). The candela is the unit of luminous intensity.
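To make these figures concrete, here is a small sketch of the dynamic-range arithmetic using the slide's approximate luminance values; the display comparison at the end reuses the ULTRA HD PREMIUM numbers quoted later in this presentation and is for illustration only.

```python
import math

# Approximate luminance values from the slide, in nits.
sun = 1e9           # the sun
starlight = 1e-3    # starlight

ratio = sun / starlight
print(f"real-life dynamic range ~ 10^{math.log10(ratio):.0f}:1")   # ~10^12:1
print(f"equivalent photographic stops ~ {math.log2(ratio):.0f}")   # ~40 stops

# ULTRA HD PREMIUM display options quoted later: 0.05-1,000 nits or 0.0005-540 nits.
for lo, hi in [(0.05, 1000.0), (0.0005, 540.0)]:
    print(f"display {lo} to {hi} nits ~ {math.log2(hi / lo):.0f} stops")
```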
12
Three technology vectors are essential for HDR
Making every pixel count
• Contrast and brightness: brighter whites and darker blacks, closer to the brightness of real life
• Color gamut: the subset of visible colors that can be accurately captured and reproduced
• Color depth: the number of gradations in color that can be captured and displayed (the arithmetic is sketched below)
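The gain from deeper color is easy to quantify: moving from 8-bit to 10-bit per channel multiplies the total color count by 64 (4 per channel, cubed). A minimal sketch:

```python
# Gradations per channel and total representable colors for 8-bit vs. 10-bit color.
for bits in (8, 10):
    levels = 2 ** bits      # gradations per channel
    colors = levels ** 3    # R x G x B combinations
    print(f"{bits}-bit: {levels} levels per channel, {colors:,} colors")
# 8-bit:  256 levels per channel,  ~16.8 million colors
# 10-bit: 1024 levels per channel, ~1.07 billion colors ("over a billion colors")
```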
HDR10 is the next step towards true-to-life visuals
A requirement for ULTRA HD PREMIUM certification

HDR10 content spec:
• Contrast and brightness: EOTF up to 10,000 nits (a sketch of the PQ curve follows below)
• Color gamut: BT.2020 support
• Color depth: 10-bit per channel, over a billion colors
• Codec: HEVC Main 10 profile

ULTRA HD PREMIUM display spec:
• Contrast and brightness: display from 0.05 to 1,000 nits or from 0.0005 to 540 nits
• Color gamut: minimum 90% DCI-P3 color reproduction

EOTF is the electro-optical transfer function.
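The EOTF that HDR10 specifies is the SMPTE ST 2084 perceptual quantizer (PQ), which maps a 10-bit code value to an absolute luminance of up to 10,000 nits. Below is a minimal sketch of that curve in Python, using the constants published in ST 2084; it is illustrative only and ignores the metadata handling and tone mapping a real decoder performs.

```python
# SMPTE ST 2084 (PQ) EOTF constants.
m1 = 2610 / 16384        # 0.1593017578125
m2 = 2523 / 4096 * 128   # 78.84375
c1 = 3424 / 4096         # 0.8359375
c2 = 2413 / 4096 * 32    # 18.8515625
c3 = 2392 / 4096 * 32    # 18.6875

def pq_eotf(e_prime: float) -> float:
    """Map a normalized non-linear signal value in [0, 1] to luminance in nits."""
    ep = e_prime ** (1 / m2)
    return 10000.0 * (max(ep - c1, 0.0) / (c2 - c3 * ep)) ** (1 / m1)

print(pq_eotf(1.0))   # peak code value -> 10,000 nits
print(pq_eotf(0.5))   # mid code value  -> roughly 92 nits
```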
The time is right for HDR10
Technologies and ecosystem are now aligning

Ecosystem drivers
• Device availability
• Software support
• Content creation and deployment

Technology advancements
• Multimedia technologies
• Display and camera technologies
• Power and thermal efficiency
The Qualcomm® Snapdragon™ 835 processor is ready for ULTRA HD PREMIUM certification
Enjoy vibrant HDR content on a variety of screens
HDR10 content, such as movies and TV shows (Netflix, Amazon, etc.), flows from the SoC to the display.
Qualcomm Snapdragon is a product of Qualcomm Technologies, Inc.
ULTRA HD PREMIUM certification is a device-level certification. Each Snapdragon device must be certified.
A history of multimedia technology leadership
• 2013: Snapdragon 800, first with 4K (H.264) capture and playback
• Snapdragon 805: first with 4K playback with HEVC (H.265)
• Snapdragon 810: first with 4K capture and playback with HEVC
• Snapdragon 820: first with 4K playback at 60 fps
• 2017: Snapdragon 835: first with HDR10, ULTRA HD PREMIUM-ready
A heterogeneous computing approach is needed for HDR
Efficient processing by running the appropriate task on the appropriate engine

Adreno 540 visual processing
• HEVC Main 10 video profile support with metadata processing
• Accurate color gamut and tone mapping (a generic global tone-mapping sketch follows the block diagram below)
• Efficient rendering of HDR effects for games with DX12 and Vulkan
• Precise blending of mixed-tone (HDR and SDR) layers
• Native 10-bit color BT.2020 support over HDMI, DP, and DSI displays

Qualcomm Spectra 180 ISP
• 14-bit processing pipeline to support the latest camera sensors
• Video and snapshot HDR processing with local tone mapping

Hexagon 682 DSP and Kryo 280 CPU
• Multicore CPU: camera, video, and graphics application processing
• DSP + HVX for accelerated multimedia post-processing
[Snapdragon 835 block diagram (not to scale): Snapdragon X16 LTE modem, Wi-Fi, Adreno 540 Graphics Processing Unit (GPU), Hexagon 682 DSP with HVX, Qualcomm Spectra 180, Kryo 280 CPU, Display Processing Unit (DPU), Video Processing Unit (VPU), Qualcomm All-Ways Aware, Qualcomm Aqstic, Qualcomm IZat Location, Qualcomm Haven]
Snapdragon, Qualcomm Adreno, Qualcomm Hexagon, Qualcomm All-Ways Aware, Qualcomm Spectra, Qualcomm Aqstic, Qualcomm Kryo, Qualcomm IZat, and Qualcomm Haven are products of Qualcomm Technologies, Inc.
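The exact tone-mapping algorithms inside the Adreno GPU and Spectra ISP are not described here; as a stand-in, the sketch below shows a classic global operator (the extended Reinhard curve) that compresses scene luminance spanning several orders of magnitude into the [0, 1] range of a display. The local tone mapping mentioned above additionally adapts the curve per region of the image.

```python
import numpy as np

def reinhard_extended(lum: np.ndarray, l_white: float) -> np.ndarray:
    """Extended Reinhard global operator: maps relative scene luminance to [0, 1].
    l_white is the smallest input luminance that maps to pure white."""
    return lum * (1.0 + lum / (l_white ** 2)) / (1.0 + lum)

# Relative scene luminances spanning four orders of magnitude.
scene = np.array([0.01, 0.1, 1.0, 10.0, 100.0])
print(reinhard_extended(scene, l_white=100.0))   # monotonically compressed into [0, 1]
```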
Scene-based audio for enhanced sound quality
3D audio and positional audio through Higher Order Ambisonics (HOA)
True-to-life sound is critical to immersive experiences
The sounds and visuals must match; our hearing perceives the depth, direction, and magnitude of sound sources
• Sound sources are all around us
• Sound waves merge and reflect
• There is a distinct sound pressure value at every point in the 3D scene
Scene-based audio captures the entire 3D audio scene
Higher Order Ambisonics (HOA) coefficients are the key
• Spherical harmonic-based transforms convert the 3D sound pressure field into a compact and comprehensive representation: the HOA coefficients (a first-order encoding sketch follows below)
• The HOA format is conducive to compression; spatial encoding compresses the HOA coefficients
• Once calculated, the HOA coefficients are decoupled from the capture and playback
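For intuition, here is a minimal first-order ambisonics (B-format) encoder in Python. The HOA coefficients described above generalize this to higher orders; the FuMa channel convention and the 48 kHz test tone used here are illustrative assumptions, not part of the presentation.

```python
import numpy as np

def encode_foa(signal: np.ndarray, azimuth: float, elevation: float) -> np.ndarray:
    """Encode a mono signal into first-order B-format (FuMa convention: W, X, Y, Z).
    Azimuth and elevation are in radians, measured from the listener's front."""
    w = signal / np.sqrt(2.0)                           # omnidirectional component
    x = signal * np.cos(azimuth) * np.cos(elevation)    # front-back
    y = signal * np.sin(azimuth) * np.cos(elevation)    # left-right
    z = signal * np.sin(elevation)                      # up-down
    return np.stack([w, x, y, z])

# A 440 Hz tone placed 90 degrees to the listener's left, on the horizon.
t = np.linspace(0, 1, 48000, endpoint=False)
b_format = encode_foa(np.sin(2 * np.pi * 440 * t), azimuth=np.pi / 2, elevation=0.0)
print(b_format.shape)   # (4, 48000): four coefficient channels describe the whole scene
```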
Object-based audio for the 3D audio scene
Faces issues with scaling and requires post-processing on capture
• Audio is associated with each object in the scene
• The audio of each object and its corresponding position needs to be determined through post-processing
• The complexity and bandwidth requirements increase with the number of objects in the scene
• As a result, typical usage is a combination of object- and channel-based audio
Channel-based audio for the 3D audio scene
A legacy format with a number of issues
• Mics are placed subjectively, in different positions depending on the audio engineer
• In post-processing, the sound mix is created subjectively and may bear no resemblance to the original audio scene
• A variety of formats need to be created, transmitted, and stored, such as 2.0, 5.1, 7.1.4, and 22.2
• Playback does not adjust for an incorrect speaker layout
Scene-based audio is a new paradigm for 3D audio
Providing key benefits and solving the major challenges of existing audio formats

Efficient
• Reduced bandwidth and file size
• Rendering complexity is independent of scene complexity
• A single format
• Scalable layering
• Power efficient: high quality per MIPS

High fidelity
• Higher order ambisonics
• The perfect representation of the 3D audio scene
• High resolution and an increased sweet spot

Comprehensive
• Simple, real-time capture
• Flexible rendering
• Seamless integration into audio workflows and applications
• Advanced effects for interactivity

MIPS = millions of instructions per second.
Simple real-time capture and flexible rendering
HOA coefficients are decoupled from the capture and playback

Simple real-time capture
• Spatially separated microphones are required
• Ideally, a spherical mic array with 32 mics for 4th-order HOA coefficients
• Spot mics can be added
• A smartphone with 3 mics offers 1st-order HOA coefficients
• Captures the entire 3D audio scene
• Generates a single, compact file
• Great for live content (sports, user-generated content, etc.) and post-production (movies, etc.)

Flexible rendering
• Audio is rendered at the playback location based on the number and location of the speakers
• Recreates the best possible reproduction of the original sound scene
• Supports any channel format: 2.0, 5.1, 7.1.4, 22.2, binaural, etc. (a simple decoder is sketched below)
• Uniform experience across devices and playback locations (theater, home, mobile devices, etc.)
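To illustrate how one B-format recording can feed any speaker layout, here is the simplest possible "projection" decoder for horizontal first-order material. Production decoders (max-rE weighting, dual-band designs, and true HOA decoders) are considerably more sophisticated; this sketch, including its noise test signal, is only an assumption-laden illustration of the decoupling idea.

```python
import numpy as np

def decode_foa_horizontal(b_format: np.ndarray, speaker_azimuths_deg) -> np.ndarray:
    """Naive projection decode of FuMa first-order (W, X, Y, Z) to horizontal speakers:
    each speaker feed samples the encoded sound field in its own direction."""
    w, x, y, _ = b_format
    feeds = []
    for az_deg in speaker_azimuths_deg:
        az = np.radians(az_deg)
        feeds.append(0.5 * (np.sqrt(2.0) * w + np.cos(az) * x + np.sin(az) * y))
    return np.stack(feeds)

# A noise source encoded at 90 degrees left (W = s/sqrt(2), X = 0, Y = s, Z = 0).
s = np.random.default_rng(0).standard_normal(48000)
b_format = np.stack([s / np.sqrt(2.0), np.zeros_like(s), s, np.zeros_like(s)])

# The same B-format buffer drives stereo, quad, or any other layout.
stereo = decode_foa_horizontal(b_format, [30, -30])
quad = decode_foa_horizontal(b_format, [45, 135, -135, -45])
print(stereo.shape, quad.shape)   # (2, 48000) (4, 48000)
```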
3D positional audio is essential for VR and AR
Accurate 3D surround sound based on your head's position relative to the various sound sources
• Sound arrives at each ear at the accurate time and with the correct intensity
• The HRTF (head-related transfer function) takes into account typical human facial and body characteristics, like the location, shape, and size of the ears, and is a function of frequency and three spatial variables (a toy binaural rendering sketch follows below)
• Sound at the ears needs to be adjusted appropriately and dynamically as your head and the sound sources move; this is the VR and AR experience
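In the time domain, applying an HRTF amounts to convolving the source with a pair of head-related impulse responses (HRIRs), one per ear, for the source's direction. The sketch below uses fabricated two-tap "HRIRs" purely to show the mechanics; real systems use measured HRIR sets, interpolate between directions, and update the filters as the head moves.

```python
import numpy as np

def binaural_render(mono: np.ndarray, hrir_left: np.ndarray, hrir_right: np.ndarray):
    """Render a mono source binaurally by convolving it with per-ear HRIRs."""
    return np.convolve(mono, hrir_left), np.convolve(mono, hrir_right)

# Toy HRIRs mimicking a source on the listener's left: the right ear hears it
# roughly 0.4 ms later (20 samples at 48 kHz) and quieter than the left ear.
hrir_l = np.zeros(64); hrir_l[0] = 1.0
hrir_r = np.zeros(64); hrir_r[20] = 0.6

mono = np.random.default_rng(1).standard_normal(48000)
left, right = binaural_render(mono, hrir_l, hrir_r)
print(left.shape, right.shape)   # each ear gets its own filtered signal
```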
Scene-based audio is an ideal solution for VR and AR
A natural fit for capturing and playing back 3D positional audio
Sounds so accurate that they are true to life

Capture
High fidelity
• Captures the entire 3D sound scene in high quality
• Video and audio captured on the same device
Real-time and simple
• Works on a variety of devices (action camera, smartphone, etc.)
• No post-production required, but easy to apply scene-based effects
• Great for live events like sports and user-generated content
• Compact file

Playback
Immersive
• High-fidelity 3D surround sound adjusts based on head pose
• 3-DOF and 6-DOF support
• A natural way to guide a user's attention
Efficient
• Accurate manipulation of the sound field
• HOA coefficients are computationally efficient to rotate, stretch, or compress the audio scene (a yaw-rotation sketch follows below)
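Head tracking is where this efficiency shows: rotating the whole sound field only requires a small matrix applied to the HOA coefficients, not re-rendering every source. A first-order sketch, assuming the FuMa channel order used in the earlier examples:

```python
import numpy as np

def rotate_foa_yaw(b_format: np.ndarray, yaw: float) -> np.ndarray:
    """Rotate a FuMa first-order sound field (W, X, Y, Z) about the vertical axis.
    For head tracking, rotate the scene by the negative of the head's yaw so that
    sources stay fixed in the world as the head turns. W and Z are unchanged."""
    w, x, y, z = b_format
    c, s = np.cos(yaw), np.sin(yaw)
    return np.stack([w, c * x - s * y, s * x + c * y, z])

# A source straight ahead (X = 1, Y = 0), rotated 90 degrees, ends up on the left (Y = 1).
front = np.array([[1 / np.sqrt(2)], [1.0], [0.0], [0.0]])
print(rotate_foa_yaw(front, np.pi / 2))
```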
Scene-based audio adoption is accelerating
The entire ecosystem needs to align

Advanced demonstrations
• End-to-end workflow solutions
• Broadcast (TV)
• VR
• Immersive audio

Standards adoption
• MPEG-H 3D Audio
• ATSC 3.0
• DVB is considering MPEG-H 3D Audio
• Device interoperability: DisplayPort, HDMI, etc.

Real deployments
• YouTube is using first-order ambisonics for spatial audio
• The 2018 Winter Olympics in South Korea is using MPEG-H 3D Audio
• Various mics are available for purchase
MPEG = Moving Picture Experts Group. ATSC = Advanced Television Systems Committee.
Learn more about our contribution to scene-based audio: https://www.qualcomm.com/scene-based-audio
Intuitive interactions
Adaptive, multi-modal user interfaces are the future
Speech recognition, eye tracking, and gesture recognition are becoming essential
Natural user interfaces for intuitive interactions
Adaptive, multimodal user interfaces:
• Speech recognition: uses natural language processing
• Motion and gesture recognition: uses computer vision, motion sensors, or touch
• Face recognition: uses computer vision to recognize facial expressions
• Eye tracking: uses computer vision to measure the point of gaze
• Personalized interfaces: learn and know user preferences based on machine learning
• Bringing life to objects: efficient user interfaces for IoT
Voice is a natural way to interact with devices
A hands-free interface is necessary in certain situations

Designed to be
• Intuitive
• Conversational
• Convenient
• Productive
• Personalized

Underlying technology
• Voice activation
• Noise filtering, suppression, and cancellation
• Speech recognition
• Natural language processing
• Voice recognition / biometrics
• Deep learning
Eye tracking naturally detects our point of interest
Providing valuable information for interacting with our devices

Natural user interface
• Gaze tracking and estimation to navigate within next-gen applications
• Fast and secure authentication through iris scan
• Applicable to VR HMDs, AR glasses, and smartphones

Improved visuals
• Gaze tracking and estimation will be an input to new visual and auditory rendering techniques
• Foveated rendering of graphics and video enables a more immersive visual user experience (a toy sketch follows below)
• Eye tracking, when combined with machine learning, will also personalize VR and AR experiences

Dynamic calibration
• Each human face has a different inter-pupillary distance (IPD)
• HMDs can also move around on the face during use
• CV techniques will be used to dynamically and accurately account for IPD

Requirements
• Tracking camera
• Eye tracker
• Gaze estimation
• Latency reduction
• System optimization
• Robust solution
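Foveated rendering spends full shading effort only where the eye is actually looking. The sketch below picks a coarse shading rate from the angular distance between a pixel and the tracked gaze point; the thresholds, the pixels-per-degree figure, and the three-level scheme are illustrative assumptions rather than anything specified in this presentation.

```python
import math

def shading_rate(pixel_xy, gaze_xy, pixels_per_degree: float = 40.0) -> int:
    """Return how many screen pixels share one shaded sample, based on eccentricity."""
    eccentricity_deg = math.dist(pixel_xy, gaze_xy) / pixels_per_degree
    if eccentricity_deg < 5.0:
        return 1    # full resolution in and near the fovea
    if eccentricity_deg < 15.0:
        return 2    # one shaded sample per 2x2 pixels
    return 4        # one shaded sample per 4x4 pixels in the periphery

print(shading_rate((960, 540), gaze_xy=(1000, 560)))   # near the gaze point -> 1
print(shading_rate((100, 100), gaze_xy=(1000, 560)))   # far periphery       -> 4
```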
Gesture recognition for natural hand interactions
Interact with the UI like you would in the real world

Benefits
• Intuitive interaction with a device without the need for accessories: grab, select, type, etc.
• A reconstructed hand with accurate movements increases the level of immersion for VR
• Increased productivity by using gestures where appropriate and having a predictive UI

Key technologies
• Wide field-of-view camera
• Computer vision
• Machine learning

Pipeline (a toy sketch follows below)
• Detect: identify hands
• Track: follow key points on hands and fingers as they move
• Recognize: understand the meaning of the hand and finger gestures, even when occluded
• Act: take appropriate action based on the current and predicted gesture
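To make the detect/track/recognize/act flow concrete, here is a toy end-to-end pass in Python. The hand landmarks are canned data and the "recognizer" is a single thumb-to-index distance check; an actual implementation would use a wide field-of-view camera, computer vision, and machine learning at each stage, as listed above.

```python
import math

def detect(frame):
    """Detect: return hand landmarks found in the frame (here: canned data)."""
    return frame["hands"]

def track(prev_hands, hands):
    """Track: naive nearest-wrist association between consecutive frames."""
    if not prev_hands:
        return hands
    return sorted(hands, key=lambda h: math.dist(h["wrist"], prev_hands[0]["wrist"]))

def recognize(hand):
    """Recognize: call it a 'pinch' when the thumb tip and index tip are close."""
    return "pinch" if math.dist(hand["thumb_tip"], hand["index_tip"]) < 0.03 else None

def act(gesture):
    """Act: drive the UI from the recognized (or predicted) gesture."""
    if gesture == "pinch":
        print("select the object under the hand")

# One frame of fabricated, normalized landmark coordinates.
frame = {"hands": [{"wrist": (0.50, 0.80), "thumb_tip": (0.41, 0.50), "index_tip": (0.42, 0.51)}]}
hands = track([], detect(frame))
act(recognize(hands[0]))
```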
QTI is uniquely positioned to support superior immersive experiences
Custom designed SoCs and investments in the core immersive technologies
QTI is uniquely positioned to support immersive experiences
Providing efficient, comprehensive solutions, within device constraints: development time, sleek form factor, power and thermal efficiency, and cost

Immersive experiences
• Visual quality: consistent, accurate color; HDR video, photos, and playback; high resolution and frame rate
• Sound quality: positional audio; noise removal; true-to-life audio processing
• Intuitive interactions: multimodal natural UIs; intelligent, contextual interactions; responsive and smooth UIs

Commercialization
• Via Snapdragon™ solutions: efficient heterogeneous computing architecture, custom-designed processing engines, and comprehensive solutions across tiers
• Via ecosystem enablement: Snapdragon development platforms, ecosystem collaboration, and app developer tools
Follow us on social media. For more information, visit us at www.qualcomm.com and www.qualcomm.com/blog
Nothing in these materials is an offer to sell any of the components or devices referenced herein.
©2016 Qualcomm Technologies, Inc. and/or its affiliated companies. All Rights Reserved.
Qualcomm, Snapdragon, Adreno and Hexagon are trademarks of Qualcomm Incorporated, registered in the United States and other countries. Qualcomm Spectra, Kryo, Qualcomm Haven, IZat, Qualcomm All-Ways Aware and Qualcomm Aqstic are trademarks of Qualcomm Incorporated. Other products and brand names may be trademarks or registered trademarks of their respective owners.
References in this presentation to “Qualcomm” may mean Qualcomm Incorporated, Qualcomm Technologies, Inc., and/or other subsidiaries or business units within the Qualcomm corporate structure, as applicable. Qualcomm Incorporated includes Qualcomm’s licensing business, QTL, and the vast majority of its patent portfolio. Qualcomm Technologies, Inc., a wholly-owned subsidiary of Qualcomm Incorporated, operates, along with its subsidiaries, substantially all of Qualcomm’s engineering, research and development functions, and substantially all of its product and services businesses, including its semiconductor business, QCT.
Thank you
Resources

• Websites
  ◦ Immersive experiences: https://www.qualcomm.com/Immersive
  ◦ Virtual reality: https://www.qualcomm.com/VR
  ◦ Augmented reality: https://www.qualcomm.com/AR
  ◦ Developers: https://developer.qualcomm.com
  ◦ Newsletter signup: http://www.qualcomm.com/mobile-computing-newsletter
• Presentations
  ◦ Immersive experiences: https://www.qualcomm.com/documents/immersive-experiences-presentation
  ◦ Virtual reality: https://www.qualcomm.com/documents/making-immersive-virtual-reality-possible-mobile
  ◦ Augmented reality: https://www.qualcomm.com/documents/mobile-future-augmented-reality
• Papers
  ◦ Virtual reality: https://www.qualcomm.com/documents/whitepaper-making-immersive-virtual-reality-possible-mobile
  ◦ Immersive experiences: https://www.qualcomm.com/documents/whitepaper-driving-new-era-immersive-experiences-qualcomm
• Videos
  ◦ Immersive experiences video: https://www.qualcomm.com/videos/immersive-experiences
  ◦ Immersive experiences webinar: https://www.qualcomm.com/videos/webinar-new-era-immersive-experiences-whats-next