Energy-Efficient Face Detection Using Andes RISC-V Processor
Presenter: Chien-Hao Chen
Advisor: Prof. Chen-Yi Lee
Date: 2018/03/12
Outline
• Introduction
– Motivation
– Face Detection Model
• Face Detector on Andes Processor
• Experiment Result
• Conclusion
• Reference
Motivation
• Cloud computing
– Image uploaded to the cloud → processing → result returned
• Edge computing
– Image computed directly on the device → processing → result returned
Face Detection Model: MTCNN, 2016 [1]
1. Resize the image and sample it with a sliding window
2. P-Net (Proposal): find candidate bounding boxes
3. R-Net (Refine): reject the wrong candidates from P-Net
4. O-Net (Output): starting from the R-Net results, find more accurate face regions
[Figure: P-Net → R-Net → O-Net cascade. Image from "Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks", IEEE Signal Processing Letters (SPL), vol. 23, no. 10, pp. 1499-1503, 2016]
Face Detection Model
• P-Net (Proposal):
– Fully convolutional, with 3 convolution layers and 1 max pooling layer
– Rough proposals
• R-Net (Refine):
– 3 convolution, 2 max pooling and 1 fully connected layer
– Rejects false proposals from P-Net
• O-Net (Output):
– 4 convolution, 3 max pooling and 1 fully connected layer
– More complicated model
→ Rejects false results from R-Net
→ Better face bounding box position
[Figure: network structures of P-Net, R-Net and O-Net. Image from [1]]
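The cascade idea described above (each stage only sees what the previous stage let through) can be sketched roughly as follows. This is a minimal illustration, not the MTCNN implementation: the box format, the `run_cascade` helper and the three toy scoring functions are all hypothetical stand-ins, and bounding box regression and NMS are left out.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # hypothetical format: (x, y, width, height)
Stage = Callable[[Box], float]    # returns a face score in [0, 1] for one box


def run_cascade(candidates: List[Box], stages: List[Tuple[Stage, float]]) -> List[Box]:
    """Pass candidates through each (scorer, threshold) stage in order."""
    survivors = candidates
    for scorer, threshold in stages:
        # Only boxes accepted by this stage reach the next, more expensive one.
        survivors = [box for box in survivors if scorer(box) > threshold]
        if not survivors:
            break
    return survivors


# Toy stand-ins for P-Net, R-Net and O-Net (assumptions, not trained networks).
def toy_pnet(box: Box) -> float:
    return 0.9 if box[2] >= 12 else 0.1


def toy_rnet(box: Box) -> float:
    return 0.8 if box[2] >= 24 else 0.2


def toy_onet(box: Box) -> float:
    return 0.95 if box[2] >= 48 else 0.3


boxes = [(0, 0, 10, 10), (5, 5, 20, 20), (8, 8, 30, 30), (1, 1, 60, 60)]
faces = run_cascade(boxes, [(toy_pnet, 0.6), (toy_rnet, 0.7), (toy_onet, 0.7)])
print(faces)
```

Because the cheap first stage rejects most candidates, the later and larger networks run on far fewer boxes, which is why the per-stage trigger counts matter for total run time.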
Outline
• Introduction
• Face Detector on Andes Processor
− Hardware environment
− Model Simplification and Acceleration
• Experiment Result
• Conclusion
• Reference
Hardware environment
Andes RISC-V:
− Processor: 60 MHz, 64-bit AndesCore
− FPGA: Xilinx Kintex-7 XC7K410T
− DRAM: 1 GB
− Flash: 64 MB
Model Simplification and Acceleration
Model simplification: depth-wise separable convolution [3]
Depth-wise MTCNN
• P-Net (Proposal):
– Fully convolutional, with 1 convolution layer: stride = 2 (channel: 10)
– 2 DW convolution layers: stride = 1 (channels: 16, 32)
• R-Net (Refine):
– 1 convolution layer: stride = 2
– 1 DW convolution layer: stride = 2
– 1 DW convolution layer: stride = 1
– 1 fully connected layer
• O-Net (Output):
– 1 convolution layer: stride = 2
– 2 DW convolution layers: stride = 2
– 2 convolution layers: stride = 1 (channels: 128, 128)
– 1 fully connected layer
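To see why replacing standard convolutions with depth-wise (DW) separable ones reduces computation, the multiply-accumulate (MAC) counts of the two layer types can be compared. The sketch below uses the usual cost model from the MobileNets paper [3]; the layer dimensions in the example (3x3 kernel, 10 input channels, 16 output channels, 3x3 output map) are illustrative assumptions, not values read from the slides.

```python
def standard_conv_macs(h_out: int, w_out: int, k: int, c_in: int, c_out: int) -> int:
    """MACs of a standard k x k convolution."""
    return h_out * w_out * k * k * c_in * c_out


def depthwise_separable_macs(h_out: int, w_out: int, k: int, c_in: int, c_out: int) -> int:
    """MACs of a k x k depth-wise convolution followed by a 1 x 1 point-wise convolution."""
    depthwise = h_out * w_out * k * k * c_in   # one k x k filter per input channel
    pointwise = h_out * w_out * c_in * c_out   # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 10 -> 16 channels, 3x3 output feature map.
std = standard_conv_macs(3, 3, 3, 10, 16)
dws = depthwise_separable_macs(3, 3, 3, 10, 16)
ratio = dws / std  # equals 1/c_out + 1/k^2 in this cost model

print(std, dws, round(ratio, 4))
```

In this cost model the saving grows with the number of output channels and the kernel size, since the ratio is 1/c_out + 1/k².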
Soft-max Approximation: Motivation
• Example: if the P-Net input size is 240 × 320, then output1 has size 115 × 155 × 2 and output2 has size 115 × 155 × 4, because

$$H_{out} = \frac{H_{in} - H_{filter} + Padding}{Stride} + 1 = \frac{240 - 12 + 0}{2} + 1 = 115$$

• Soft-max:

$$\sigma(x, y) = \left[\frac{e^{x}}{e^{x} + e^{y}},\ \frac{e^{y}}{e^{x} + e^{y}}\right]$$

→ 6 exponentials and 2 divisions
• For the output1 soft-max:
→ 115 × 155 × 6 ~ 107k exponentials
→ 115 × 155 × 2 ~ 35k divisions
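The arithmetic on this slide can be checked with a few lines of Python. The helper name `conv_output_size` is introduced here only for illustration; the filter size 12, padding 0 and stride 2 are the values used in the slide's formula.

```python
def conv_output_size(size_in: int, size_filter: int, padding: int, stride: int) -> int:
    """Output size along one dimension: (in - filter + padding) / stride + 1."""
    return (size_in - size_filter + padding) // stride + 1


h_out = conv_output_size(240, 12, 0, 2)
w_out = conv_output_size(320, 12, 0, 2)

positions = h_out * w_out      # one 2-class soft-max per output position
exp_count = positions * 6      # 6 exponentials per naive 2-class soft-max
div_count = positions * 2      # 2 divisions per naive 2-class soft-max

print(h_out, w_out, exp_count, div_count)
```

For a single 240 × 320 input this gives roughly 107k exponentials and about 35.7k divisions, which is the cost the soft-max approximation aims to remove.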
Soft-max approximation
• Soft-max output compared against the threshold P:

$$\sigma(x, y) = \left[\frac{e^{x}}{e^{x} + e^{y}},\ \frac{e^{y}}{e^{x} + e^{y}}\right], \qquad \frac{e^{x}}{e^{x} + e^{y}} > threshold(P)$$

$$\begin{aligned}
\frac{e^{x}}{e^{x} + e^{y}} &> P \\
e^{x} &> P e^{x} + P e^{y} \\
(1 - P) e^{x} &> P e^{y} \\
\ln(1 - P) + x &> \ln(P) + y \\
x &> \ln\left(\frac{P}{1 - P}\right) + y
\end{aligned}$$

• $\ln\left(\frac{P}{1 - P}\right)$ is a constant
• Example with P = 0.7:

$$\frac{e^{x}}{e^{x} + e^{y}} = 0.7 \quad\Leftrightarrow\quad x = \ln\left(\frac{0.7}{1 - 0.7}\right) + y$$
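The derivation says that thresholding the soft-max output at P is equivalent to comparing the two raw scores against a precomputed constant. A minimal sketch of that check is shown below; the function names and the sample score pairs are made up for illustration, and P = 0.7 is the value from the example.

```python
import math


def softmax_face_prob(x: float, y: float) -> float:
    """Reference: first output of a 2-class soft-max (shifted for numerical stability)."""
    m = max(x, y)
    ex, ey = math.exp(x - m), math.exp(y - m)
    return ex / (ex + ey)


def passes_threshold_fast(x: float, y: float, log_odds: float) -> bool:
    """Approximation-free shortcut: x > ln(P / (1 - P)) + y, no exp and no division."""
    return x - y > log_odds


P = 0.7
LOG_ODDS = math.log(P / (1.0 - P))  # computed once, about 0.847

# Hypothetical (x, y) score pairs.
samples = [(2.0, 0.5), (0.3, 0.1), (-1.0, -3.0), (0.0, 0.0), (1.0, 0.2)]
agree = all(
    (softmax_face_prob(x, y) > P) == passes_threshold_fast(x, y, LOG_ODDS)
    for x, y in samples
)
print(LOG_ODDS, agree)
```

The per-position work drops to one subtraction and one comparison, because the logarithm is evaluated only once per threshold.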
Experiment Result
• On the FDDB [4] database:
– P-Net, R-Net threshold = 0.6, 0.7; min-face = 25x25
• FPPI: False Positives Per Image

| Method | Accuracy @ FPPI 0.01 | Accuracy @ FPPI 0.1 | Accuracy @ FPPI 1.0 | Speedup @ Andes RISC-V Processor |
| --- | --- | --- | --- | --- |
| MTCNN | 84.95% | 92.40% | 94.66% | - |
| Ours | 82.59% | 88.15% | 90.68% | 106x |
Experiment Result
• On the FDDB database:
• FPPI: False Positives Per Image

| Method | Accuracy @ FPPI 1.0 | Speedup @ Andes RISC-V Processor |
| --- | --- | --- |
| MTCNN | 94.66% | - |
| Ours | 90.68% | 106x |

| Method | Accuracy @ FPPI 0.1 | Accuracy @ FPPI 0.01 | FPS (Titan X GPU), FPS (1080-Ti) |
| --- | --- | --- | --- |
| Brodmann17 | 89.25% | 81.88% | 200, 90 |
| DeepIR | 88.45% | 82.16% | <=1 |
| Xiaomi | 87.82% | 77.99% | 2? |
| Faceness | 86.04% | 79.67% | 1 |
| Hyperface | 85.63% | 80.68% | 0.33 |
| DP2MFD | 85.57% | 76.73% | <0.05 |
| Ours | 88.15% | 82.59% | 54 |
Experiment Result
• On the FDDB database:
• Performance without considering faces smaller than 48x48
• P-Net, R-Net threshold = 0.9, 0.85; min-face = 48x48

| Method | Accuracy @ FPPI 0.01 | Accuracy @ FPPI 0.1 |
| --- | --- | --- |
| Ours | 86.64% | 87.7% |

• P-Net, R-Net threshold = 0.6, 0.7; min-face = 48x48

| Method | Accuracy @ FPPI 0.01 | Accuracy @ FPPI 0.1 |
| --- | --- | --- |
| Ours | 90.53% | 93.81% |
Conclusion
• Proposed face detection model:
– Model size: 3.6x smaller
– Speedup @ Andes processor: 106x faster
– Accuracy @ FPPI 1.0: 90.68%
Reference
[1] Zhang, Kaipeng, et al. "Joint face detection and alignment using multitask cascaded convolutional networks." IEEE Signal Processing Letters 23.10 (2016): 1499-1503.
[2] Li, Haoxiang, et al. "A convolutional neural network cascade for face detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
[3] Howard, Andrew G., et al. "Mobilenets: Efficient convolutional neural networks for mobile vision applications." arXiv preprint arXiv:1704.04861 (2017).
[4] Jain, Vidit, and Erik Learned-Miller. Fddb: A benchmark for face detection in unconstrained settings. Vol. 2. No. 4. UMass Amherst Technical Report, 2010.
[5] Sun, Xudong, Pengcheng Wu, and Steven CH Hoi. "Face detection using deep learning: An improved faster rcnn approach." Neurocomputing 299 (2018): 42-50.
[6] Jiang, Huaizu, and Erik Learned-Miller. "Face detection with the faster R-CNN." 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017). IEEE, 2017.
[7] Yang, Shuo, et al. "Faceness-net: Face detection through deep facial part responses." IEEE transactions on pattern analysis and machine intelligence 40.8 (2018): 1845-1859.
[8] Ranjan, Rajeev, Vishal M. Patel, and Rama Chellappa. "Hyperface: A deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition." IEEE Transactions on Pattern Analysis and Machine Intelligence 41.1 (2019): 121-135.
[9] Ranjan, Rajeev, Vishal M. Patel, and Rama Chellappa. "A deep pyramid deformable part model for face detection." 2015 IEEE 7th International Conference on Biometrics Theory, Applications and Systems (BTAS). IEEE, 2015.
Thank you for listening!
Soft-max with NMS
Soft-max approximation
• $\frac{e^{x}}{e^{x} + e^{y}} > P \;\rightarrow\; x > \ln\left(\frac{P}{1 - P}\right) + y$
Soft-max approximation with NMS
• NMS: keep the box with the highest score, so two scores are compared:

$$\begin{aligned}
\frac{e^{x_1}}{e^{x_1} + e^{y_1}} &> \frac{e^{x_2}}{e^{x_2} + e^{y_2}} \\
e^{x_1}\left(e^{x_2} + e^{y_2}\right) &> e^{x_2}\left(e^{x_1} + e^{y_1}\right) \\
e^{x_1} \cdot e^{x_2} + e^{x_1} \cdot e^{y_2} &> e^{x_2} \cdot e^{x_1} + e^{x_2} \cdot e^{y_1} \\
e^{x_1 + y_2} &> e^{x_2 + y_1} \\
x_1 + y_2 &> x_2 + y_1 \\
x_1 - y_1 &> x_2 - y_2
\end{aligned}$$

• Speedup: 1.43x faster
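The last inequality means NMS can rank boxes by the score difference x − y instead of the soft-max probability, since both orderings agree. The sketch below checks this on a few made-up (x, y) pairs; `face_prob` and the sample values are illustrative assumptions.

```python
import math


def face_prob(x: float, y: float) -> float:
    """Reference soft-max score e^x / (e^x + e^y), written as a sigmoid of x - y."""
    return 1.0 / (1.0 + math.exp(y - x))


# Hypothetical raw network outputs (x, y) for four candidate boxes.
logits = [(2.0, 0.5), (3.0, 2.5), (1.0, -2.0), (0.2, 0.1)]

# Ranking used by a conventional NMS: sort by soft-max probability.
by_softmax = sorted(range(len(logits)), key=lambda i: face_prob(*logits[i]), reverse=True)

# Ranking from the derivation: sort by x - y, no exponential needed.
by_margin = sorted(range(len(logits)), key=lambda i: logits[i][0] - logits[i][1], reverse=True)

print(by_softmax, by_margin)
```

Only the ordering is needed to pick the highest-scoring box, so the probabilities themselves never have to be computed.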
Computational Complexity
Model operation complexity comparison

Original MTCNN

| Network | Input size | MAC number |
| --- | --- | --- |
| P-Net | 12x12 | 44.76K |
| P-Net* | 120x160 | 55x75x44.76K = 184.6M |
| R-Net | 24x24 | 1.531M |
| O-Net | 48x48 | 12.91M |

Ours

| Network | Input size | MAC number |
| --- | --- | --- |
| P-Net | 12x12 | 7.872K |
| P-Net* | 120x160 | 55x75x7.872K = 32.47M |
| R-Net | 24x24 | 319.3K |
| O-Net | 48x48 | 2.267M |

*: P-Net's input is taken to be a whole 120x160 image rather than a single 12x12 block.
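The P-Net* rows can be reproduced with a short calculation. The sketch assumes a 12x12 window moved with stride 2 and no padding, which is consistent with the 55x75 window count in the table and with the output size formula used earlier; the helper name is hypothetical.

```python
def sliding_positions(size_in: int, window: int = 12, stride: int = 2) -> int:
    """Number of window positions along one dimension (no padding assumed)."""
    return (size_in - window) // stride + 1


rows = sliding_positions(120)
cols = sliding_positions(160)

MTCNN_PNET_MACS = 44_760   # 44.76K MACs per 12x12 block (from the table)
OURS_PNET_MACS = 7_872     # 7.872K MACs per 12x12 block (from the table)

mtcnn_total = rows * cols * MTCNN_PNET_MACS
ours_total = rows * cols * OURS_PNET_MACS
reduction = mtcnn_total / ours_total

print(rows, cols, mtcnn_total, ours_total, round(reduction, 2))
```

Under these assumptions the simplified P-Net needs about 5.7 times fewer MACs than the original for the same 120x160 input.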
Quantization
Model size comparison

Original MTCNN

| Network | Data type | Model size (Byte) |
| --- | --- | --- |
| P-Net | float32 | 26.04K |
| R-Net | float32 | 398.5K |
| O-Net | float32 | 1.542M |
| Total | | 1.966M |

Ours

| Network | Data type | Model size (Byte) |
| --- | --- | --- |
| P-Net | int8 | 1.088K |
| R-Net | int8 | 137.4K |
| O-Net | int8 | 402.6K |
| Total | | 541.2K |
Quantization Result
• On the FDDB database:
• FPPI: False Positives Per Image

| Model (word length) | Accuracy @ FPPI 0.1 |
| --- | --- |
| Original MTCNN | 92.40% |
| Ours (float32) | 88.20% |
| Ours (int8) | 88.15% |
Quantization Method
• Weight quantization

$$\text{shift number} = 7 - \mathrm{ceil}\left(\log_2\left(\max\left(|\text{weight}_{min}|,\ |\text{weight}_{max}|\right)\right)\right)$$

shifted weights = round_down(old weights × 2^(shift number))
shifted weights[shifted weights > 126] = 127
shifted weights[shifted weights < −127] = −128
final weights = shifted weights ÷ 2^(shift number)
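The weight quantization steps above can be sketched in plain Python as follows. This is one possible reading of the slide, not the author's code: "round down" is taken to mean `math.floor`, the two clipping rules are written as a clamp to the int8 range [−128, 127] (equivalent for integer codes), and the weight values are hypothetical.

```python
import math
from typing import List


def shift_number(values: List[float]) -> int:
    """7 - ceil(log2(max(|min|, |max|))): left shift that fills the int8 range."""
    max_abs = max(abs(min(values)), abs(max(values)))
    return 7 - math.ceil(math.log2(max_abs))


def quantize_int8(values: List[float], shift: int) -> List[int]:
    """Scale by 2^shift, round down (floor, an assumption), clamp to [-128, 127]."""
    scale = 2 ** shift
    return [max(-128, min(127, math.floor(v * scale))) for v in values]


def dequantize(codes: List[int], shift: int) -> List[float]:
    """Map int8 codes back to real values: code / 2^shift."""
    return [c / 2 ** shift for c in codes]


weights = [-0.5, -0.1, 0.0, 0.3, 0.45]   # hypothetical layer weights
shift = shift_number(weights)
codes = quantize_int8(weights, shift)
final_weights = dequantize(codes, shift)

print(shift, codes, final_weights)
```

Because the scale is a power of two, the conversion between the stored int8 codes and real values is a bit shift rather than a multiplication.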
• Layer output quantization

$$\text{shift number} = 7 - \mathrm{ceil}\left(\log_2\left(\max\left(|\text{layer output}_{min}|,\ |\text{layer output}_{max}|\right)\right)\right)$$

while (shift_start):
  output = round_down(output × 2^(shift number))
  output[output > 126] = 127
  output[output < −127] = −128
  final output = output ÷ 2^(shift number)
  shift number += 1
Example
• output = [−4, −0.24, −0.20, …, 0.19, 0.23, 4]
• Shift number: 7 − log2(4) = 5
• Original quantization (shift 5) = [−4, −0.25, −0.1875, …, 0.1875, 0.21875, 3.96875]
• Corrected quantization (shift 6) = [−2, −0.234375, −0.203125, …, 0.1875, 0.234375, 1.984375]
→ More precise
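The example values can be reproduced with a short script. One assumption is needed: the listed numbers match rounding to the nearest integer (Python's `round`) rather than a strict floor, so that is what the sketch uses. Only the six values actually listed are included; the elided middle of the list is left out.

```python
from typing import List


def quantize_dequantize(values: List[float], shift: int) -> List[float]:
    """Scale by 2^shift, round to nearest, clamp to int8, scale back."""
    scale = 2 ** shift
    result = []
    for v in values:
        code = round(v * scale)             # assumption: nearest-integer rounding
        code = max(-128, min(127, code))    # saturate to the int8 range
        result.append(code / scale)
    return result


outputs = [-4, -0.24, -0.20, 0.19, 0.23, 4]

shift5 = quantize_dequantize(outputs, 5)   # shift from 7 - log2(4)
shift6 = quantize_dequantize(outputs, 6)   # one extra shift step

print(shift5)
print(shift6)
```

With shift 6 the two extreme values saturate to ±2, but the small values, which make up most of the list, land closer to their originals (for example −0.24 becomes −0.234375 instead of −0.25).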
Speed-up of each step
Experiment Result
• On the FDDB database:
• FPPI: False Positives Per Image

| Method | Accuracy @ FPPI 1.0 | Speedup @ Andes RISC-V Processor |
| --- | --- | --- |
| Ori-MTCNN | 94.66% | - |
| Ours | 90.68% | 106x |

| Method | Accuracy @ FPPI 0.1 | Accuracy @ FPPI 0.01 | FPS (Titan X GPU), FPS (1080-Ti) |
| --- | --- | --- | --- |
| Brodmann17 | 89.25% | 81.88% | 200, 90 |
| DeepIR | 88.45% | 82.16% | <=1 |
| Xiaomi | 87.82% | 77.99% | 2? |
| Faceness | 86.04% | 79.67% | 1 |
| Hyperface | 85.63% | 80.68% | 0.33 |
| DP2MFD | 85.57% | 76.73% | <0.05 |
| MTCNN | 92.40% | 84.95% | 51 |
| Ours | 88.15% | 82.59% | 54 |
| Step | Baseline | Sim#1 | Fast soft-max | DSP-Sim#1 | DSP-Sim#2 |
| --- | --- | --- | --- | --- | --- |
| Overall time (s) | 294.0129 | 99.81 | 53.69 | 3.88 | 2.78 |
| Overall speedup (vs. previous step) | - | 2.95 | 1.86 | 13.84 | 1.397 |
| FPS | 0.0034 | 0.01002 | 0.01863 | 0.25776 | 0.3601 |
| P-Net overall time | 97.25 | 77.2 | 31.2 | 1.54 | 1.18 |
| P-Net overall speedup | - | 1.26 | 2.47 | 20.30 | 1.30 |
| R-Net overall time | 59.08 | 6.158 | 6.028 | 0.989 | 0.628 |
| R-Net trigger times | 46 | 22 | 22 | 32 | 29 |
| R-Net normalized time (per trigger) | 1.28 | 0.28 | 0.274 | 0.0309 | 0.022 |
| R-Net normalized speedup | - | 4.59 | 1.02 | 8.87 | 1.43 |
| O-Net overall time | 132.19 | 15.034 | 15.004 | 1.35 | 0.96 |
| O-Net trigger times | 14 | 9 | 9 | 8 | 9 |
| O-Net normalized time (per trigger) | 9.44 | 1.67 | 1.67 | 0.17 | 0.107 |
| O-Net normalized speedup | - | 5.65 | 1.002 | 9.9 | 1.57 |
| Step | Baseline | Sim#1 | Fast soft-max | DSP-Sim#1 | DSP-Sim#2 |
| --- | --- | --- | --- | --- | --- |
| Overall time (s) | 294.0129 | 99.8111858 | 53.687959 | 3.879538 | 2.777296 |
| Overall speedup (vs. previous step) | - | 2.94569088 | 1.8590982 | 13.8388 | 1.396875954 |
| FPS | 0.003401210740668638 | 0.010018917134550104 | 0.018626150268485554 | 0.25776 | 0.360062449 |
| P-Net overall time | 97.248312473297119 | 77.170741379261017 | 31.195177435874939 | 1.536423 | 1.180413 |
| P-Net overall speedup | - | 1.26017077 | 2.4738036 | 20.3038 | 1.301597831 |
| R-Net overall time | 59.077883005142212 | 6.1582962274551392 | 6.0284666419029236 | 0.988531 | 0.627762 |
| R-Net trigger times | 46 | 22 | 22 | 32 | 29 |
| R-Net normalized time (per trigger) | 1.284302 | 0.27992256 | 0.2740212 | 0.03089 | 0.021646966 |
| R-Net normalized speedup | - | 4.58806178 | 1.0215361 | 8.87087 | 1.426989815 |
| O-Net overall time | 132.18732833862305 | 15.033685207366943 | 15.003592789173126 | 1.345193 | 0.961341 |
| O-Net trigger times | 14 | 9 | 9 | 8 | 9 |
| O-Net normalized time (per trigger) | 9.441952 | 1.67040947 | 1.6670659 | 0.16815 | 0.106815667 |
| O-Net normalized speedup | - | 5.65247753 | 1.0020057 | 9.91416 | 1.574207274 |
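The derived rows of this table follow from the overall times. The sketch below recomputes the step-by-step speedup, the FPS and the total speedup from the "Overall" row, assuming the times are seconds per frame (which is what FPS = 1 / time implies); the variable names are illustrative.

```python
# Overall time per frame for each optimization step, copied from the table.
overall_time_s = {
    "Baseline": 294.0129,
    "Sim#1": 99.8111858,
    "Fast soft-max": 53.687959,
    "DSP-Sim#1": 3.879538,
    "DSP-Sim#2": 2.777296,
}

steps = list(overall_time_s)

# Speedup of each step relative to the step before it.
step_speedup = {
    cur: overall_time_s[prev] / overall_time_s[cur]
    for prev, cur in zip(steps, steps[1:])
}

# Frames per second is the reciprocal of the time per frame.
fps = {name: 1.0 / t for name, t in overall_time_s.items()}

# Total speedup from the baseline to the final step.
total_speedup = overall_time_s["Baseline"] / overall_time_s["DSP-Sim#2"]

print(step_speedup)
print(fps)
print(round(total_speedup, 2))
```

The total comes out at about 105.9, which rounds to the 106x speedup reported in the conclusion.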