Deep Learning Based 3D Vision

Kweon, In So; Rameau, Francois

doi:10.1007/978-3-030-03243-2_848-1

Synonyms

Machine learning for 3D vision

Definition

The field of 3D vision covers a broad range of techniques for estimating 3D information from one or more images, such as the absolute or relative pose of a camera and the 3D structure of the scene. For instance, visual odometry and visual localization estimate the pose of a camera in its environment, while stereo matching and multi-view stereo reconstruct the 3D structure of the scene from different viewpoints. Conventional techniques for these tasks rely on low-level image features, such as sparse keypoints or dense photometric matching. These strategies are efficient under favorable conditions but tend to be sensitive to poorly textured scenes and usually do not encapsulate contextual and semantic...
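
The sensitivity of photometric matching to poorly textured scenes can be illustrated with a toy example. The sketch below is not taken from the entry: it assumes a rectified stereo pair reduced to a single 1D scanline, uses a sum-of-absolute-differences (SAD) window cost, and relies on hypothetical helper names (`sad_disparity`, `shift_right_view`) and synthetic intensity values chosen only for illustration.

```python
def shift_right_view(left, disparity):
    """Synthesize a right scanline so that right[x - d] == left[x].

    The last value is repeated at the border (a simplifying assumption).
    """
    n = len(left)
    return [left[min(x + disparity, n - 1)] for x in range(n)]


def sad_disparity(left, right, x, half_window, max_disparity):
    """Return (best_disparity, costs) for column x of the left scanline.

    Each candidate disparity d is scored by the sum of absolute intensity
    differences between a window around left[x] and the window around
    right[x - d]. The lowest cost wins; ties go to the smallest d.
    """
    costs = []
    for d in range(max_disparity + 1):
        cost = 0
        for k in range(-half_window, half_window + 1):
            cost += abs(left[x + k] - right[x + k - d])
        costs.append(cost)
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs


TRUE_DISPARITY = 5

# Textured scanline: every pixel has a distinct intensity.
textured = [(37 * i) % 101 for i in range(64)]
best_textured, costs_textured = sad_disparity(
    textured, shift_right_view(textured, TRUE_DISPARITY),
    x=30, half_window=3, max_disparity=10,
)

# Textureless scanline: constant intensity everywhere.
flat = [128] * 64
best_flat, costs_flat = sad_disparity(
    flat, shift_right_view(flat, TRUE_DISPARITY),
    x=30, half_window=3, max_disparity=10,
)

print("textured: best disparity", best_textured,
      "| candidates at minimum cost:", costs_textured.count(min(costs_textured)))
print("flat:     best disparity", best_flat,
      "| candidates at minimum cost:", costs_flat.count(min(costs_flat)))
```

On the textured scanline the cost has a single minimum at the true disparity. On the flat scanline every candidate has the same cost, so the reported disparity is arbitrary. This ambiguity is one reason purely photometric methods struggle without additional context.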

Author information

Authors and Affiliations

KAIST, Daejeon, South Korea
In So Kweon & Francois Rameau

Corresponding authors

Correspondence to In So Kweon or Francois Rameau.

Section Editor information

KAIST, Daejeon, South Korea
In So Kweon

About this entry

Cite this entry

Kweon, I.S., Rameau, F. (2020). Deep Learning Based 3D Vision. In: Computer Vision. Springer, Cham. https://doi.org/10.1007/978-3-030-03243-2_848-1

DOI: https://doi.org/10.1007/978-3-030-03243-2_848-1
Published: 20 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03243-2
Online ISBN: 978-3-030-03243-2
eBook Packages: Springer Reference Computer Sciences; Reference Module Computer Science and Engineering
