Cortex-inspired developmental learning networks for stereo vision
How does the human brain make sense of the 3D world when its visual input, the retinal images, is only two-dimensional? The brain exploits multiple depth cues to create a 3D model of the world. Despite the importance of this subject to both scientists and engineers, the computational mechanisms underlying stereo vision in the human brain are still largely unknown. This thesis is an attempt at creating a developmental model of stereo vision in the visual cortex. By developmental we mean that the features of each neuron are developed, rather than hand-crafted, so that the limited resource is optimally used. This approach helps us learn more about biological stereo vision, and it also yields results superior to those of traditional computer vision approaches, e.g., under weak textures. Developmental networks, such as Where-What Networks (WWN), have shown promise for simultaneous attention and recognition while handling variations in scale, location and type as well as inter-class variations. Moreover, in a simpler prior setting, they have shown sub-pixel accuracy in disparity detection on challenging natural images. However, the previous work on stereo vision was limited to 20-pixel stripes of shifted images and could not scale to real-world problems. This dissertation presents work on building neuromorphic developmental models for stereo vision, focusing on 1) dynamic synapse retraction and growth as a method of developing more efficient receptive fields, 2) training on images that involve complex natural backgrounds, and 3) integration of depth perception with location and type information. In a setting of 5 object classes, 7 × 7 = 49 locations and 11 disparity levels, the network achieves a recognition rate above 95% for object shapes, a disparity detection error under one pixel, and a location error under 10 pixels. These results are reported in disjoint testing, using challenging natural and synthetic textures on both the background and the foreground objects.
- In Collections: Electronic Theses & Dissertations
- Copyright Status: In Copyright
- Material Type: Theses
- Authors: Solgi, Mojtaba
- Thesis Advisors: Weng, Juyang
- Committee Members: Stockman, George; Salem, Fathi; Liu, Taosheng
- Date: 2013
- Subjects: Perceptual learning; Learning models (Stochastic processes); Image processing; Depth perception; Computer vision; Binocular vision; Visual cortex
- Program of Study: Computer Science - Doctor of Philosophy
- Degree Level: Doctoral
- Language: English
- Pages: xvi, 136 pages
- ISBN: 9781303536991; 1303536994
- Permalink: https://doi.org/doi:10.25335/9x4k-yj42