nvgaze: anatomy-aware augmentation for low … · la yer (x, y) conv2 conv3 conv4 conv5 conv6 fc...
TRANSCRIPT
![Page 1: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/1.jpg)
Michael Stengel, Alexander Majercik
NVGAZE: ANATOMY-AWARE AUGMENTATION FORLOW-LATENCY, NEAR-EYE GAZE ESTIMATION
![Page 2: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/2.jpg)
2
AGENDA
Part I (Michael) 25 min
• Eye tracking for near-eye displays
• Synthetic dataset generation
• Network training and results
Part II (Alexander) 15 min
• Fast Network Inference using cuDNN
• Deep Learning Best Practice
![Page 3: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/3.jpg)
3
NVGAZE TEAM
Michael Stengel Alexander MajercikJoohwan Kim Shalini De Mello
Morgan McGuire David LuebkeSamuli Laine
Perception & LearningNew Experiences GroupNew Experiences GroupNew Experiences Group
New Experiences Group New Experiences Group VP of Graphics Research
![Page 4: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/4.jpg)
4
EYE TRACKING FORNEAR-EYE DISPLAYS
Michael Stengel
![Page 5: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/5.jpg)
5
EYE TRACKING IN VR/AR
Avatars Foveated Rendering
Dynamic Streaming
Attention Studies
Computational Displays Perception
User State Evaluation
[Eisko.com]
Health CareGaze Interaction
Periphery
[arpost.co] [Vedamurthy et al.][Sitzmann et al.]
[Patney et al.]
[Sun et al.]
[eyegaze.com]
[Padmanaban et al.]
![Page 6: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/6.jpg)
6
SUBTLE GAZE GUIDANCEEnlarging virtual spaces through redirected walking
[Sun et al., Siggraph‘18]
![Page 7: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/7.jpg)
7
FOVEATED RENDERING
Accelerating Real-time Computer Graphics
![Page 8: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/8.jpg)
8
ACCOMMODATION SIMULATION
Enhancing Depth Perception
![Page 9: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/9.jpg)
9
GAZE-AS-INPUT
![Page 10: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/10.jpg)
10
LABELEDREALITY
![Page 11: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/11.jpg)
11
EYE TRACKING IN VR/AR
• How do video-based eye tracking systems work?
WORKING PRINCIPLE
Pupil localizationDomain mapping using calibration parameters
3d gaze vector or2d point of regard
Eye
CameraDisplay
Lens
Face
(x,y)
Eye capture
![Page 12: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/12.jpg)
12
ON-AXIS VS OFF-AXIS GAZE TRACKING
Camera view off-axis Camera view on-axis
![Page 13: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/13.jpg)
13
Modded GearVR with integrated gaze tracking
ON-AXIS GAZE TRACKINGEye tracking prototype for Virtual Reality headsets
Components for on-axis eye tracking integration
Eye tracking cameras, dichroic mirrors,
infrared illumination, VR glasses frame
![Page 14: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/14.jpg)
14
ON-AXIS GAZE TRACKINGEye tracking prototype for VR headsets
![Page 15: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/15.jpg)
15
ON-AXIS EYE TRACKING CAMERA VIEW
![Page 16: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/16.jpg)
16
OFF-AXIS GAZE TRACKINGEye tracking prototype for VR headsets
Eye
Camera
Display
Lens
![Page 17: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/17.jpg)
17
OFF-AXIS GAZE TRACKINGEye tracking prototype for VR headsets
![Page 18: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/18.jpg)
18
EYE TRACKING IN VR/AR
• Changing illumination conditions (over-exposure and hard shadows)
• Occlusions from eyes lashes, skin, blink, glasses frame
• Varying eye appearance : flesh, mascara and other make-up
• Reflections
• Camera view and noise (blur, defocus, motion)
• drifting calibration (single-camera case) due to HMD or glasses motion
• End-to-end latency
• Capturing training data is expensive
CHALLENGES FOR MOBILE VIDEO-BASED EYE TRACKERS
→ Reaching low latency AND high robustness is hard !
![Page 19: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/19.jpg)
19
PROJECT GOALS
• Deep learning based gaze estimation
• Higher robustness than previous methods
• Target accuracy is < 2 degrees of angular error (over full field of view!)
• Fast inference ranging in a few milliseconds even on mobile GPU
• Compatibility to any captured input (on-axis, off-axis, near-eye, remote, etc.,dark pupil tracking only, glint-free tracking)
• Explore usage of synthetic data
• Can we learn increase calibration robustness ?
![Page 20: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/20.jpg)
20
RELATED RESEARCH• PupilNet [Fuhl et al., 2017]
• 2-pass CNN-based method running in 8 ms (CPU) performing pupil localization task
• 1st pass on low res image (96x72 pixels)
• 2nd pass on full-res image (VGA resolution)
• trained on 135k manually labeled real images
• Higher robustness than previous ‘hand-crafted’ pupil detectors
• Domain Randomization [Tremblay et al., Nvidia, 2018]
• Image and label generator for automotive setting
• Randomized objects force network to learn essential structure of cars independent of view and lighting condition
![Page 21: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/21.jpg)
21
NVGAZESYNTHETIC EYES DATASET
![Page 22: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/22.jpg)
22
GENERATING TRAINING DATA1: Eye Model
We adopted the eye model from Wood et al. 2015 * and modified it to more
accurately represent human eyes.
* Wood, E., Baltrušaitis, T., Zhang, X., Sugano, Y., Robinson, P., & Bulling, A.
“Rendering of eyes for eye-shape registration and gaze estimation”, ICCV 2015.
Optical
Axis
5 deg
![Page 23: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/23.jpg)
23
GENERATING TRAINING DATA2: Pupil Center Shift
Pupil center is off from iris center,
and it moves as pupil changes in size.
Average displacements:
8mm pupil: 0.1 mm nasal and 0.07 mm up
6mm pupil: 0.15 mm nasal and 0.08 mm up
4mm pupil: 0.2 mm nasal and 0.09 mm up
This is known to cause gaze tracking error of
up to 5 deg in pupil-glint tracking methods.
![Page 24: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/24.jpg)
24
GENERATING TRAINING DATA2: Scanned faces
![Page 25: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/25.jpg)
25
GENERATING TRAINING DATA2: Combining Eye and Head Models
• 10 scanned faces with photorealistic eye, adopted the eye model from Wood et al. 2015
• physical material properties for cornea, sclera and skin under infrared lighting conditions
![Page 26: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/26.jpg)
26
GENERATING TRAINING DATA2: Synthetic Model
![Page 27: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/27.jpg)
27
GENERATING TRAINING DATA3: Dataset
• 4M Synthetic HD eye images for animated eye (400K images per subject) are generated using
Blender on Multi-GPU cluster.
• Render engine used is Cycles as physically accurate path tracer.
![Page 28: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/28.jpg)
28
GENERATING TRAINING DATA3: Dataset
![Page 29: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/29.jpg)
29
ANATOMY-AWARE AUGMENTATION
![Page 30: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/30.jpg)
30
GENERATING TRAINING DATA4: Region Labels
• Region maps are generated out of images with self-illuminating material.
• Refractive effect of air-cornea layer is accounted for.
• Synthetic ground truth is available even if regions are occluded by skin (during blink).
Pupil Iris
Sclera
Skin
Sclera occluded
by skin
Glint
![Page 31: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/31.jpg)
31
ANATOMY-AWARE AUGMENTATION
Samples of real images for comparison
Original Synthetic Image Augmented Synthetic Image
Region-wise
• Contrast scaling
• Blur
• Intensity offset
Global
• Contrast scaling
• Gaussian noise
![Page 32: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/32.jpg)
32
NVGAZE NETWORK
![Page 33: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/33.jpg)
33
NVGAZE INFERENCE OVERVIEW
IR Camera
Gaze Vector
Input Image Convolutional Network
![Page 34: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/34.jpg)
34
NETWORK ARCHITECTURE
16Conv1
24
3654
81122
Fully
Connected
Layer
(x, y)
Conv2
Conv3
Conv4
Conv5Conv6 FC
Layer Resolution Num. Channels
Input 255 x 191 1
Conv1 127 x 95 16
Conv2 63 x 47 24
Conv3 31 x 23 36
Conv4 15 x 11 54
Conv5 7 x 5 81
Conv6 3 x 2 122
Fully convolutional network
In reference design, each layer has …
Stride of 2
No padding
3x3 Conv. kernel
Camera image
640x480
![Page 35: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/35.jpg)
35
NETWORK COMPLEXITY ANALYSIS
![Page 36: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/36.jpg)
36
TRAINING AND VALIDATION
• Trained on a 10 synthetic subjects + 3 real subjects. No fine-tuning.
• Ramp-up and ramp-down for 50 epochs at the beginning and end.
• Adam optimizer with MSE loss
Loss function
![Page 37: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/37.jpg)
37
NEURAL NETWORK PERFORMANCE
Accuracy / Near Eye Display
2.1 degrees of error in average across real subjects
Error is almost evenly distributed across the entire tested visual field
1.7 degrees best-case accuracy when trained for single subject
Accuracy / Remote Gaze Tracking
8.4 degrees average accuracy for remote gaze tracking (same accuracy as state of the
art by Park et al., 2018) but 100x faster
Latency for gaze estimation
<1 milliseconds for inference and data transfer between CPU and GPU space
cuDNN implementation running on TitanV or Jetson TX2
bottleneck is camera transfer @ 120 Hz
Gaze Estimation
![Page 38: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/38.jpg)
38
PUPIL LOCALIZATION
![Page 39: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/39.jpg)
39
NEURAL NETWORK PERFORMANCEPupil Location Estimation
![Page 40: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/40.jpg)
40
![Page 41: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/41.jpg)
41
Our network is more accurate, more robust
and requires less memory than others.
NEURAL NETWORK PERFORMANCEPupil Location Estimation
![Page 42: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/42.jpg)
42
OPTIMIZING FORFAST INFERENCE
Alexander Majercik
![Page 43: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/43.jpg)
43
PROJECT GOALS
• Deep learning based gaze estimation
• Higher robustness than previous methods
• Target accuracy is <2 degrees of angular error
• Fast inference ranging in a few milliseconds even on mobile GPU
• Compatibility to any captured input (on-axis, off-axis, near-eye, remote, etc.,dark pupil tracking only, glint-free tracking)
• Explore usage of synthetic data (large dataset >1,000.000 images)
• Can we learn increase calibration robustness ?
![Page 44: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/44.jpg)
44
PROJECT GOALS
• Deep learning based gaze estimation
• Higher robustness than previous methods
• Target accuracy is <2 degrees of angular error
• Fast inference ranging in a few milliseconds even on mobile GPU
• Compatibility to any captured input (on-axis, off-axis, near-eye, remote, etc.,dark pupil tracking only, glint-free tracking)
• Explore usage of synthetic data (large dataset >1,000.000 images)
• Can we learn increase calibration robustness ?
![Page 45: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/45.jpg)
45
NETWORK LATENCY REQUIREMENTS
Avatars Foveated Rendering
Dynamic Streaming
Attention Studies
Computational Displays Perception
User State Evaluation
[Eisko.com]
Health CareGaze Interaction
Periphery
[arpost.co] [Vedamurthy et al.][Sitzmann et al.]
[Patney et al.]
[Sun et al.]
[eyegaze.com]
![Page 46: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/46.jpg)
46
NETWORK LATENCY REQUIREMENTS
Esports Research at NVIDIA60 ms To Get it RightGaze-Contingent Rendering
and Human perception
Human Perception Esports
![Page 47: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/47.jpg)
47
NETWORK LATENCY REQUIREMENTS
BOTTOM LINE: Network should run in ~1ms!
Esports Research at NVIDIA60 ms To Get it RightGaze-Contingent Rendering
and Human perception
Human Perception Esports
![Page 48: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/48.jpg)
48
Fast inference is also training problem
![Page 49: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/49.jpg)
49
- 7 layer stacked convolutional network
- Input: 293x293 eye image, Output: pupil position in image space
24
52
80124
256512
36
NETWORK DESIGN FOR FAST INFERENCE
![Page 50: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/50.jpg)
50
NETWORK DESIGN FOR FAST INFERENCEKey Design Decisions
![Page 51: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/51.jpg)
51
- Convolutions and FC layers only
NETWORK DESIGN FOR FAST INFERENCEKey Design Decisions
![Page 52: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/52.jpg)
52
- Convolutions and FC layers only
- No max pooling
NETWORK DESIGN FOR FAST INFERENCEKey Design Decisions
![Page 53: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/53.jpg)
53
- Convolutions and FC layers only
- No max pooling
- ReLU activation
NETWORK DESIGN FOR FAST INFERENCEKey Design Decisions
![Page 54: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/54.jpg)
54
- Convolutions and FC layers only
- No max pooling
- ReLU activation
- Data-directed approach
NETWORK DESIGN FOR FAST INFERENCEKey Design Decisions
![Page 55: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/55.jpg)
55
NETWORK DESIGN FOR FAST INFERENCEData-directed approach
![Page 56: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/56.jpg)
56
Better Training -> Simpler Network -> Run Faster
![Page 57: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/57.jpg)
57
![Page 58: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/58.jpg)
58
CUDNN GPU
CPU OpenGLGPU
CPU
![Page 59: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/59.jpg)
59
CUDNN GPU
CPU OpenGLGPU
CPU
CUDNN GPU
CPU OpenGLGPU
CPU
![Page 60: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/60.jpg)
60
FAST INFERENCE WITH NVIDIA CUDNN
- GPU Programming Best Practices
Optimizing the pipeline
![Page 61: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/61.jpg)
61
FAST INFERENCE WITH NVIDIA CUDNN
- GPU Programming Best Practices:
- Minimize CPU-GPU copy
Optimizing the pipeline
![Page 62: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/62.jpg)
62
FAST INFERENCE WITH NVIDIA CUDNN
- GPU Programming Best Practices:
- Minimize CPU-GPU copy
- Minimize kernel launches (pack work into your kernels efficiently)
Optimizing the pipeline
![Page 63: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/63.jpg)
63
FAST INFERENCE WITH NVIDIA CUDNN
- GPU Programming Best Practices:
- Minimize CPU-GPU copy
- Minimize kernel launches (pack work into your kernels efficiently)
- To do both…combine the eye images into a single pass!
Optimizing the pipeline
![Page 64: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/64.jpg)
65
FAST INFERENCE WITH NVIDIA CUDNNMerging the input images
Convolution kernel
![Page 65: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/65.jpg)
66
FAST INFERENCE WITH NVIDIA CUDNNMerging the input images
![Page 66: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/66.jpg)
67
FAST INFERENCE WITH NVIDIA CUDNNMerging the input images
![Page 67: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/67.jpg)
68
FAST INFERENCE WITH NVIDIA CUDNNMerging the input images
![Page 68: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/68.jpg)
69
FAST INFERENCE WITH NVIDIA CUDNNMerging the input images
![Page 69: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/69.jpg)
70
CUDNN GPUCPU
OpenGLGPU
CPU
![Page 70: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/70.jpg)
72
FAST INFERENCE WITH NVIDIA CUDNNResults
Method Time (ms)
Single Image (Python based DL
framework)
Single Image (cuDNN)
Concatenated input (cuDNN)
![Page 71: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/71.jpg)
73
FAST INFERENCE WITH NVIDIA CUDNNResults
Method Time (ms)
Single Image (Python based DL
framework)
~6
Single Image (cuDNN)
Concatenated input (cuDNN)
![Page 72: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/72.jpg)
74
FAST INFERENCE WITH NVIDIA CUDNNResults
Method Time (ms)
Single Image (Python based DL
framework)
~6
Single Image (cuDNN) 0.748
Concatenated input (cuDNN)
![Page 73: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/73.jpg)
75
FAST INFERENCE WITH NVIDIA CUDNNResults
Method Time (ms)
Single Image (Python based DL
framework)
~6
Single Image (cuDNN) 0.748
Concatenated input (cuDNN) 1.022
![Page 74: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/74.jpg)
76
SUMMARY
- Network Latency Requirements
- Foveated rendering, human perception esports
- Network has to execute in ~1ms!
- Network Design for Fast Inference (During Training!)
- Simple network (stacked convolution, no max pooling, relu)
- Complexity is in the data!
- Fast Inference Using NVIDIA cuDNN
- Follow GPU best practices to optimize your pipeline around your well-designed network
![Page 75: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/75.jpg)
77
Try the NvGaze Demo:
VR Theater
SJCC Expo Hall 3, Concourse Level
Tuesday: 12:00pm - 7:00pm
Wednesday: 12:00pm - 7:00pm
Thursday: 11:00am - 2:00pm
![Page 76: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/76.jpg)
78
REFERENCES
NVGaze: An Anatomically-Informed Dataset for Low-Latency, Near-Eye Gaze Estimation [Kim’19]
Adaptive Image‐Space Sampling for Gaze‐Contingent Real‐time Rendering [Stengel’16]
Perception‐driven Accelerated Rendering [Weier’17]
Visualization and Analysis of Head Movement and Gaze Data for Immersive Video in Head-mounted Displays [Loewe’15]
Subtle gaze guidance for immersive environments [Grogorick ‘17]
Towards virtual reality infinite walking: dynamic saccadic redirection [ Sun ’18]
![Page 77: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/77.jpg)
79
Q&A
Michael Stengel Alexander Majercik
New Experiences Group
New Experiences Group
[email protected] out our demo in the Exhibitor Hall !
sites.google.com/nvidia.com/nvgazeDataset and model available at
![Page 78: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/78.jpg)
80
EYE TRACKING IN VR/AR
Avatars Foveated Rendering
Dynamic Streaming
Attention Studies
Computational Displays Perception
User State Evaluation
[Eisko.com]
Health CareGaze Interaction
Periphery
[arpost.co] [Vedamurthy et al.][Sitzmann et al.]
[Patney et al.]
[Sun et al.]
[eyegaze.com]
[Padmanaban et al.]
![Page 79: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/79.jpg)
81
ON-AXIS GAZE TRACKING GLASSESEye tracking prototype for Augmented Reality glasses
Gaze tracking glasses with vertical/horizontal waveguides
Vertical beam splitter Horizontal beam splitter Infared illumination units
![Page 80: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/80.jpg)
82
OFF-AXIS GAZE TRACKING3D Reconstruction Result
![Page 81: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/81.jpg)
83
GAZE CALIBRATION
• Sparse Pattern sampling (e.g. ring pattern), average over time
Calibration Method A – Using calibration network layer
• calibration sets layer weights
• 3d gaze direction directly estimated by network inference
Calibration Method B - Mapping 2d pupil center to 2d screen position
• calibration estimates polynomial mapping functions FL and FR
• localized pupil centers (network inference) are mapped using FL and FR
• derive 3d gaze vector from binocular 2d screen positions
Ring target pattern
![Page 82: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/82.jpg)
84
Retinal Cone Distribution[Goldstein,2007]
FOVEATED RENDERING
Accelerating Real-time Computer Graphics
![Page 83: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/83.jpg)
85
FOVEAL REGION
![Page 84: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/84.jpg)
86
APPLICATION EXAMPLE FOVEATED RENDERING
![Page 85: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/85.jpg)
87
![Page 86: NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW … · La yer (x, y) Conv2 Conv3 Conv4 Conv5 Conv6 FC Layer Resolution Num. Channels Input 255 x 191 1 Conv1 127 x 95 16 Conv2 63 x 47 24](https://reader033.vdocuments.site/reader033/viewer/2022041417/5e1cad2a49773428243f5742/html5/thumbnails/86.jpg)
88
ATTENTION ANALYSIS
Generating 3D Saliency Information
[Loewe and Stengel et al. ETVIS‘15]