Learning 3D Models from a Single Still Image
Google Tech Talks
January 29, 2008
ABSTRACT
We present an algorithm to convert standard digital pictures into
3-d models.
This is a challenging problem, since an image is formed by projecting the 3-d scene onto two dimensions, thus losing depth information. We take a supervised learning approach to this problem, using a Markov Random Field (MRF) to model both the image's depth cues and the relationships between different parts of the image. We show that even on unstructured scenes (indoor and outdoor environments, including forests, trees, buildings,
etc.), our algorithm frequently recovers fairly accurate 3-d models.
We use our method to create visually pleasing 3-d flythroughs from the
image. We also present a few extensions of these ideas, such as additionally incorporating triangulation (stereo) cues, and using multiple images to produce large-scale 3-d models. Finally, we apply our methods to two robotics applications: (a) high-speed off-road obstacle avoidance on an autonomously driven remote-controlled car, and (b) having a robot unload items from a dishwasher.
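To give a flavor of the MRF formulation mentioned above, the sketch below shows a toy Gaussian MRF over per-patch depths: each patch has a noisy unary depth cue, and pairwise terms encourage neighboring patches to have similar depths. This is an illustrative simplification, not the actual features or model from the talk — the energy weights, the chain neighborhood, and the `map_depths` function are all made up for this example. MAP inference in this quadratic model reduces to solving a linear system.

```python
import numpy as np

def map_depths(cues, neighbors, w_unary=1.0, w_smooth=4.0):
    """MAP inference in a toy Gaussian MRF over per-patch depths.

    cues      : noisy per-patch depth estimates (unary cues)
    neighbors : list of (i, j) index pairs of adjacent patches

    Minimizes  w_unary  * sum_i (d_i - cues_i)^2
             + w_smooth * sum_{(i,j)} (d_i - d_j)^2,
    which has the closed form (w_unary*I + w_smooth*L) d = w_unary*cues,
    where L is the graph Laplacian of the neighbor structure.
    """
    n = len(cues)
    L = np.zeros((n, n))
    for i, j in neighbors:
        L[i, i] += 1
        L[j, j] += 1
        L[i, j] -= 1
        L[j, i] -= 1
    A = w_unary * np.eye(n) + w_smooth * L
    b = w_unary * np.asarray(cues, dtype=float)
    return np.linalg.solve(A, b)

# A 1-d chain of 5 patches; the middle cue is a noisy outlier.
cues = [1.0, 1.1, 5.0, 1.2, 1.3]
chain = [(0, 1), (1, 2), (2, 3), (3, 4)]
depths = map_depths(cues, chain)
```

Because the system matrix is an M-matrix whose rows sum to the unary weight, each inferred depth is a convex combination of the cues, so the smoothness terms pull the outlier patch toward its neighbors without pushing any depth outside the range of the cues.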
To convert your own image of an outdoor scene, landscape, etc. to a 3-d model, please visit: http://make3d.stanford.edu
Joint work with Min Sun and Andrew Y. Ng.
Speaker: Ashutosh Saxena
Ashutosh is a PhD candidate with Prof. Andrew Y. Ng in the Computer
Science department at Stanford University. He received his B.Tech.
from the Indian Institute of Technology, Kanpur (IIT Kanpur) in 2004.
His research focuses on machine learning approaches to problems in
computer vision and robotic manipulation. Using data-driven machine
learning techniques, he has developed algorithms for creating 3-d models
from a single image, as well as algorithms for robotic manipulation tasks
such as opening doors and grasping previously unseen objects.