Deep Learning Based 3D Vision

  • Living reference work entry
Computer Vision

Synonyms

Machine learning for 3D vision

Definition

The field of 3D vision covers a broad range of techniques for estimating 3D information from one or more images, such as the absolute or relative pose of a camera and the 3D structure of the scene. For instance, visual odometry and visual localization consist of estimating the pose of a camera in its environment, while stereo matching and multi-view stereo aim to reconstruct the 3D structure of the scene from different viewpoints. Conventional techniques for 3D vision tasks rely on low-level image features, such as sparse keypoints or dense photometric matching. These strategies are efficient under favorable conditions but tend to be sensitive to poorly textured scenes and usually do not encapsulate contextual and semantic...

Author information

Correspondence to In So Kweon or Francois Rameau.

Copyright information

© 2020 Springer Nature Switzerland AG

About this entry

Cite this entry

Kweon, I.S., Rameau, F. (2020). Deep Learning Based 3D Vision. In: Computer Vision. Springer, Cham. https://doi.org/10.1007/978-3-030-03243-2_848-1

  • DOI: https://doi.org/10.1007/978-3-030-03243-2_848-1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-03243-2

  • Online ISBN: 978-3-030-03243-2

  • eBook Packages: Springer Reference Computer Sciences; Reference Module Computer Science and Engineering
