Visual Feature Learning in Autonomous Driving

Chair of Robotics, Artificial Intelligence and Real-time Systems
MasterSeminar SS 2020: Visual Feature Learning in Autonomous Driving
M.Sc. Emeç Erçelik, [email protected], Office 03.07.057


Content
3D object detection and tracking
Feature engineering vs. learning
What is feature engineering?
* https://en.wikipedia.org/wiki/Sobel_operator
** Jain, Ashesh, et al. "Car that knows before you do: Anticipating maneuvers via learning temporal driving models." Proceedings of the IEEE International Conference on Computer Vision. 2015.
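The Sobel operator cited above is a classic example of a hand-engineered feature: a fixed 3x3 kernel, chosen by a human, that responds to intensity edges. A minimal pure-Python sketch (the image and helper function are illustrative, not from the slides):

```python
# Hand-engineered feature: the Sobel x-kernel approximates the horizontal
# image gradient. The kernel values are fixed by design, not learned.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

def filter3x3(image, kernel):
    """Slide the 3x3 kernel over a 2D list (valid mode, no padding)."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h - 2):
        row = []
        for x in range(w - 2):
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    acc += kernel[ky][kx] * image[y + ky][x + kx]
            row.append(acc)
        out.append(row)
    return out

# A vertical edge: dark left half, bright right half.
img = [[0, 0, 10, 10]] * 4
edges = filter3x3(img, SOBEL_X)
print(edges)  # every response is 40: a strong edge everywhere in the window
```

Every design decision here (kernel size, the weights, what counts as an "edge") was made by a person; that is exactly the expertise cost the following slides discuss.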
What is feature learning?
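Feature learning replaces the hand-chosen kernel with weights fitted to data. A toy sketch, assuming a made-up 1-D task: given patches labeled with their intensity difference, gradient descent recovers a derivative-style filter on its own.

```python
# Feature learning sketch: the filter weights start random and are fitted
# by gradient descent, instead of being designed by hand.
import random

random.seed(0)
w = [random.uniform(-0.1, 0.1) for _ in range(3)]  # learned, not designed

# Toy training data: 1-D patches with target = right value minus left value.
patches = [([1, 0, 0], -1), ([0, 0, 1], 1), ([0, 1, 0], 0), ([1, 1, 1], 0)]

def respond(w, p):
    return sum(wi * pi for wi, pi in zip(w, p))

lr = 0.5
for _ in range(200):  # plain stochastic gradient descent on squared error
    for p, target in patches:
        err = respond(w, p) - target
        for i in range(3):
            w[i] -= lr * err * p[i]

print([round(wi, 2) for wi in w])  # approaches [-1, 0, 1]
```

The learned weights converge to [-1, 0, 1], a central-difference kernel very close to the middle row of the hand-designed Sobel filter: the data, not a human expert, determined the feature.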
Why do we need feature learning?
Designing good features for a specific field takes years of experience
Even then, the designed features are rarely the best possible
Feature-learning methods, in contrast, change only slightly from task to task
But other problems arise: expertise is still needed to set problem-specific parameters
Even so, results with learned features are better than with engineered ones
Autonomous driving
Understanding the surroundings:
Listing the objects
Localizing the objects
Estimating the depth
* Geiger, Andreas, Philip Lenz, and Raquel Urtasun. "Are we ready for autonomous driving? The KITTI vision benchmark suite." 2012 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2012.
Sensory Information
Images (example figures omitted from the transcript)
* Geiger, Andreas, Philip Lenz, and Raquel Urtasun. "Are we ready for autonomous driving? The KITTI vision benchmark suite." 2012 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2012.
* Chadwick, Simon, Will Maddern, and Paul Newman. "Distant vehicle detection using radar and vision." arXiv preprint arXiv:1901.10951 (2019).
* Caesar, Holger, et al. "nuScenes: A multimodal dataset for autonomous driving." arXiv preprint arXiv:1903.11027 (2019).
* Jain, Ashesh, et al. "Car that knows before you do: Anticipating maneuvers via learning temporal driving models." Proceedings of the IEEE International Conference on Computer Vision. 2015.
Temporal Information
Sequences of images
Video
Point measurements (x, y, z coordinates and reflectance R):
x    y    z    R
1.2  2.3  4.5  0.8
1.4  0.4  1.1  0.3
0.3  1.3  1.5  0.7
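KITTI-style LiDAR scans are commonly distributed as flat binary files of float32 (x, y, z, reflectance) records, matching the table above. A minimal stdlib sketch of parsing that layout (the packed example bytes stand in for a real scan file):

```python
# Parse a flat buffer of (x, y, z, reflectance) float32 records,
# the record layout used by KITTI-style LiDAR .bin scans.
import struct

def read_points(payload: bytes):
    """Unpack consecutive little-endian 4-float records (16 bytes each)."""
    n = len(payload) // 16
    return [struct.unpack_from("<4f", payload, i * 16) for i in range(n)]

# Example: the three points from the table above, packed and parsed back.
raw = struct.pack("<12f", 1.2, 2.3, 4.5, 0.8,
                          1.4, 0.4, 1.1, 0.3,
                          0.3, 1.3, 1.5, 0.7)
points = read_points(raw)
print(points[0])  # (x, y, z, reflectance) of the first point
```

In practice the buffer would come from `open(path, "rb").read()`; the point list can then be fed to any of the LiDAR-based detection methods listed under the topics below.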
Combining sensors and time gives:
More information about the surroundings
Handling of occlusion
Robustness to sensor failure
Knowledge of previous actions
Complementary views
But there is no single solution that fits everything in one framework
Always review the literature
Topics
1. 3D Object detection using different data modalities
2. Comparison of only LiDAR-based and fusion-based 3D object detection methods
3. Challenges in video object detection and tracking for 2D and 3D bounding boxes
4. Convolutional recurrent neural networks for object detection and tracking
5. Semi-supervised, self-supervised, and unsupervised learning methods for object detection
6. Transfer learning for 3D object detection
Procedure
2. Choose three topics that are of your interest (24.04.2020)
3. You will get a notification e-mail regarding the assigned topic (30.04.2020)
4. Start looking at the resources and the state-of-the-art papers (30.04-15.05.2020)
5. Initial meeting to discuss the collected materials and expected results (15.05.2020)
6. Midterm session to discuss the progress together (12.06.2020)
7. Write a seminar paper on your work and submit the first draft (10.07.2020)
8. Present your work (17.07, 24.07, 31.07.2020)
9. Submit the final version of your paper (07.08.2020)
10. Write a peer-review on the assigned paper of your peers (14.08.2020)
Note: All meetings take place online
Information about the Seminar
Time and Location: All meetings are on Fridays, 9:00-11:00, online
Check the web page of the seminar regularly
Gitlab Repository
Each group will be granted access to their own repository
Update your repository regularly with your work
All the work should be uploaded to Git Repo before deadlines
After the deadlines, no submissions will be accepted
There is good documentation for GitLab here
Grading
Extracting the related state-of-the-art resources (20%)
Writing a high quality scientific paper (40%)
Writing a review (10%)
Presenting the work (30%)
Notes on Plagiarism
Avoid any kind of copy & paste!
Cite ALL of the scientific works, ideas, and concepts you use!
What if …?
The responsible department at TUM will officially initiate the investigation
General Information and Resources (Hyperlinks)
IEEE LaTeX template for writing scientific papers
LaTeX editor for the final report
You can use ShareLaTeX for writing
A good reference on How to Write a Scientific Paper
Your presentation must not be like this!
A useful tool to manage your references and citations
References
Lu, Y., Lu, C., & Tang, C. K. (2017). Online video object detection using association LSTM. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2344-2352).
Shi, Xingjian, et al. "Convolutional LSTM network: A machine learning approach for precipitation nowcasting." Advances in Neural Information Processing Systems. 2015.
Ku, Jason, et al. "Joint 3D proposal generation and object detection from view aggregation." 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2018.
Köpüklü, Okan, and Gerhard Rigoll. "Analysis on temporal dimension of inputs for 3d convolutional neural networks." 2018 IEEE International Conference on Image Processing, Applications and Systems (IPAS). IEEE, 2018.
Bergmann, Philipp, Tim Meinhardt, and Laura Leal-Taixe. "Tracking without bells and whistles." arXiv preprint arXiv:1903.05625 (2019).
Tripathi, S., Lipton, Z. C., Belongie, S., & Nguyen, T. (2016). Context matters: Refining object detection in video with recurrent neural networks. arXiv preprint arXiv:1607.04648.
"Depth Completion from Sparse LiDAR Data with Depth-Normal Constraints." https://arxiv.org/pdf/1910.06727.pdf