ECE 562 Computer Architecture and Design
Project: Improving Feature Extraction Using SIFT on GPU
Rodrigo Savage, Wo-Tak Wu
Overview
Application:
• Object tracking in real time
Challenges:
• Static Scene
• Moving objects
• Occluding
• Collision
• Disappearing
• Rotation
• Scaling
Divide and Conquer:
• Feature Extraction and Tracking
Focus on:
• Feature Extraction, used SIFT
• Improve an existing implementation with GPU
Scale Invariant Feature Transform (SIFT)
Input: image
Output: keypoints
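To make the output concrete, here is a minimal C++ sketch of what one keypoint might carry. The 128-element descriptor (4x4 spatial cells with 8 orientation bins each) and the normalize, clamp at 0.2, renormalize step follow Lowe's paper. The struct name, field names, and memory layout are assumptions made for illustration and are not taken from SiftGPU.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Lowe's descriptor: 4x4 spatial cells, 8 orientation bins each = 128 values.
constexpr std::size_t kCells = 4;
constexpr std::size_t kBins = 8;
constexpr std::size_t kDescriptorSize = kCells * kCells * kBins;

// Hypothetical keypoint record; the field names are illustrative only.
struct Keypoint {
    float x = 0.0f;            // column position in the image
    float y = 0.0f;            // row position in the image
    float scale = 0.0f;        // scale at which the keypoint was detected
    float orientation = 0.0f;  // dominant gradient direction, in radians
    std::array<float, kDescriptorSize> descriptor{};
};

// Assumed layout: cell-major, with the orientation bin varying fastest.
constexpr std::size_t descriptorIndex(std::size_t row, std::size_t col,
                                      std::size_t bin) {
    return (row * kCells + col) * kBins + bin;
}

// Normalize to unit length, clamp each value to 0.2, then renormalize.
void normalizeDescriptor(std::array<float, kDescriptorSize>& d) {
    auto scaleToUnit = [&d]() {
        float sumSq = 0.0f;
        for (float v : d) sumSq += v * v;
        if (sumSq == 0.0f) return;  // leave an all-zero descriptor untouched
        const float inv = 1.0f / std::sqrt(sumSq);
        for (float& v : d) v *= inv;
    };
    scaleToUnit();
    for (float& v : d) {
        if (v > 0.2f) v = 0.2f;
    }
    scaleToUnit();
}
```

The clamp limits the influence of a few very large gradient magnitudes, which is why the descriptor is normalized twice.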
GPU Implementation
• Selected the GPU implementation by Sinha et al. at UNC Chapel Hill
• Open-source SiftGPU available (latest V4.00, Sept. 2012)
• SIFT well suited to be implemented on GPU
  • Tens of thousands of threads handle subsets of data without communicating with each other
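The "no communication" point can be illustrated with a CPU-side C++ sketch in which each loop iteration stands in for one GPU thread that owns exactly one pixel. The difference of two blurred images, a step in Lowe's SIFT, is used as the per-pixel work. The function names and the block/thread loop structure are hypothetical and only mimic the CUDA index formula `blockIdx.x * blockDim.x + threadIdx.x`. This is not SiftGPU code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: the element a CUDA-style thread would own,
// mirroring blockIdx.x * blockDim.x + threadIdx.x.
std::size_t globalIndex(std::size_t block, std::size_t blockDim,
                        std::size_t thread) {
    return block * blockDim + thread;
}

// Per-pixel difference of two images. Each (block, thread) pair reads and
// writes only its own pixel, so no iteration depends on another.
// Assumes a and b have the same size and blockDim > 0.
std::vector<float> differenceOfImages(const std::vector<float>& a,
                                      const std::vector<float>& b,
                                      std::size_t blockDim) {
    const std::size_t n = a.size();
    std::vector<float> out(n);
    // Round up so a partially filled last block still covers the tail.
    const std::size_t blocks = (n + blockDim - 1) / blockDim;
    for (std::size_t block = 0; block < blocks; ++block) {
        for (std::size_t thread = 0; thread < blockDim; ++thread) {
            const std::size_t i = globalIndex(block, blockDim, thread);
            if (i < n) {  // guard: threads past the end do nothing
                out[i] = a[i] - b[i];
            }
        }
    }
    return out;
}
```

Because every index is written by exactly one iteration, the two loops could run in any order or fully in parallel, which is the property that makes this kind of work map well to a GPU.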
Attempts to Speed Up
• Tackled the 2 most time-consuming processing steps
  • Blurring images with Gaussian low-pass filter
    • Changed pixel data access pattern
    • Used different schemes of data partitioning
  • Keypoint descriptor (128-element vector) calculations
    • Optimized code in the kernel
      • Used usual optimization techniques
      • Changed GPU memory usage
    • Thread management
      • Experimented with kernel parameters
      • Maximized usage of available threads
Result: Reduced descriptor compute time from 73 to 22 ms (a 70% reduction, about 3.3x faster)
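For readers unfamiliar with the blurring step, the following CPU-side C++ sketch shows a Gaussian low-pass filter applied as two 1D passes (horizontal, then vertical). It assumes a separable filter with clamped borders, a common way to implement Gaussian blur. The slides do not say which access pattern or partitioning scheme was finally used, so this is not the authors' kernel. All function names are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Normalized 1D Gaussian kernel with 2 * radius + 1 taps.
std::vector<float> gaussianKernel(int radius, float sigma) {
    std::vector<float> k(static_cast<std::size_t>(2 * radius + 1));
    float sum = 0.0f;
    for (int i = -radius; i <= radius; ++i) {
        const float v = std::exp(-static_cast<float>(i * i) /
                                 (2.0f * sigma * sigma));
        k[static_cast<std::size_t>(i + radius)] = v;
        sum += v;
    }
    for (float& v : k) v /= sum;  // weights sum to 1
    return k;
}

// One 1D pass over a row-major image. (dx, dy) = (1, 0) filters along rows,
// (0, 1) filters along columns. Out-of-range samples are clamped to the edge.
std::vector<float> blurPass(const std::vector<float>& img, int w, int h,
                            const std::vector<float>& k, int dx, int dy) {
    const int radius = static_cast<int>(k.size()) / 2;
    std::vector<float> out(img.size());
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            float acc = 0.0f;
            for (int t = -radius; t <= radius; ++t) {
                const int sx = std::max(0, std::min(w - 1, x + t * dx));
                const int sy = std::max(0, std::min(h - 1, y + t * dy));
                acc += k[static_cast<std::size_t>(t + radius)] *
                       img[static_cast<std::size_t>(sy * w + sx)];
            }
            out[static_cast<std::size_t>(y * w + x)] = acc;
        }
    }
    return out;
}

// Separable blur: horizontal pass followed by vertical pass.
std::vector<float> gaussianBlur(const std::vector<float>& img, int w, int h,
                                int radius, float sigma) {
    const std::vector<float> k = gaussianKernel(radius, sigma);
    return blurPass(blurPass(img, w, h, k, 1, 0), w, h, k, 0, 1);
}
```

The two passes touch memory differently: the horizontal pass reads neighbours that are adjacent in a row-major image, while the vertical pass reads with a stride of one image row. Differences like this are the sort of thing that changing the pixel access pattern or the data partitioning is meant to address.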
Conclusion
• Existing implementation is already pretty good
• Hard to take full advantage of the architecture. Need to have a good understanding of
  • Memory architecture
  • Thread usage
• CUDA C/C++ compiler (nvcc) optimizes code in different ways. Need to experiment to gain performance
• Hard to debug code running on GPU
• Visual Profiler can provide valuable insights into code behavior
Backup Slides
References
• SiftGPU available at http://cs.unc.edu/~ccwu/siftgpu/
• D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, November 2004.
• Sudipta N. Sinha et al., “GPU-based Video Feature Tracking And Matching,” Technical Report TR 06-012, Department of Computer Science, UNC Chapel Hill, May 2006.
• NVIDIA GeForce GT 640M LE
  • CUDA Cores: 384
  • Total available graphics memory: 4095 MB
Test image with keypoints
Algorithm