Structure and Motion from Depth and Correspondence Models
Recovering structure and motion from videos is a well-studied, comprehensive 3D vision task that involves (1) image calibration, (2) two-view pose initialization, and (3) multi-view Structure-from-Motion (SfM). Prior art consists of optimization-based methods built on sparse image correspondences. This thesis develops systematic approaches that enhance these classic solutions with deep learning models. We introduce EdgeDepth and PMatch for dense monocular depth map estimation and dense binocular correspondence map estimation, respectively. Because classic approaches typically rely on sparse and accurate inputs, they are less suited to the dense yet high-variance predictions of depth and correspondence models. As a solution, we propose to optimize through the robust inlier-counting scoring function that is widely used in RANdom SAmple Consensus (RANSAC). Our system is structured as follows: (1) For image calibration, we introduce WildCamera, which applies a RANSAC algorithm to a dense incidence field regressed by a deep model. It calibrates in-the-wild monocular images without a checkerboard. (2) For two-view pose estimation, we introduce LightedDepth. It estimates the optimal pose by aligning the depth map with the correspondence map, maximizing the number of projective inliers. (3) For multi-view SfM over a local system of $3$ to $9$ frames, RSfM extends this strategy to a Hough Transform. (4) For the general SfM task, we generalize the discrete inlier-counting scoring function of RSfM to a smoothed scoring function by marginalizing over thresholds. Together, these components form a comprehensive system that recovers structure and motion from two-view, local multi-view, and large-scale multi-view images using dense monocular depth maps and binocular correspondence maps. Compared to prior art, our methods show improvement across two-view, small-scale, and large-scale multi-view systems.
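To illustrate the scoring idea named in the abstract, the following minimal Python sketch contrasts a discrete inlier count at a single threshold with a score averaged over several thresholds. It is not the thesis implementation: the function names, the use of generic per-point residuals, and the uniform weighting over thresholds are all assumptions made for illustration.

```python
import numpy as np


def inlier_count_score(residuals, threshold):
    """Discrete RANSAC-style score: how many residuals fall below one threshold."""
    residuals = np.asarray(residuals, dtype=float)
    return int(np.sum(residuals < threshold))


def marginalized_score(residuals, thresholds):
    """Smoothed score: inlier count averaged over a set of thresholds.

    Uniform weighting over thresholds is an assumption of this sketch.
    """
    residuals = np.asarray(residuals, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    # inlier[i, j] is True when residual j is below threshold i.
    inlier = residuals[None, :] < thresholds[:, None]
    return float(inlier.sum(axis=1).mean())


# Hypothetical residuals (e.g. reprojection errors in pixels) for one candidate pose.
residuals = [0.1, 0.5, 2.0, 3.5]
single = inlier_count_score(residuals, threshold=1.0)
smoothed = marginalized_score(residuals, thresholds=[1.0, 3.0, 4.0])
```

In this sketch, a candidate whose residuals sit just above a single hard threshold still earns partial credit from the larger thresholds, so the averaged score changes more gradually than the single-threshold count.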
- In Collections: Electronic Theses & Dissertations
- Copyright Status: Attribution-NonCommercial 4.0 International
- Material Type: Theses
- Authors: Zhu, Shengjie
- Thesis Advisors: Liu, Xiaoming
- Committee Members: Jain, Anil K.; Morris, Daniel; Boddeti, Vishnu
- Date Published: 2025
- Subjects: Computer science
- Program of Study: Computer Science - Doctor of Philosophy
- Degree Level: Doctoral
- Language: English
- Pages: 147 pages
- Permalink: https://doi.org/doi:10.25335/e9ng-b125