The LabelMe dataset and its applications to scene and object recognition

Video Link: https://www.youtube.com/watch?v=OPPFBaKc2eU



Duration: 1:04:11


Google Tech Talks
August 15, 2008

ABSTRACT

We seek to build a large collection of images with ground truth labels to be used for object detection and recognition research. We used the "Tom Sawyer fence painting" approach and developed a web-based tool that allows easy image annotation and instant sharing of such annotations. Using this annotation tool, we have collected a large dataset that spans many object categories, often containing multiple instances over a wide variety of images. We quantify the contents of the dataset and compare it against existing state-of-the-art datasets used for object recognition and detection.

We have applied this dataset to scene and object recognition. Current object recognition systems can only recognize a limited number of object categories; scaling up to many categories is the next challenge. We seek to build a system to recognize and localize many different object categories in complex scenes. We achieve this through a simple approach: by matching the input image, in an appropriate representation, to images in a large training set of labeled images. Due to regularities in object identities across similar scenes, the retrieved matches provide hypotheses for object identities and locations. We build a probabilistic model to transfer the labels from the retrieval set to the input image. We demonstrate the effectiveness of this approach and study algorithm component contributions using held-out test sets from the LabelMe database.
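The retrieve-then-transfer idea described above can be illustrated with a minimal sketch. This is not the probabilistic model from the talk: it swaps in plain Euclidean nearest-neighbor retrieval and simple vote counting. The two-number descriptors, the `TRAINING_SET` records, the `(x0, y0, x1, y1)` box format, and the function names are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict

# Hypothetical toy training set: each record has a global image descriptor
# and a list of (label, bounding box) annotations. Boxes are (x0, y0, x1, y1).
TRAINING_SET = [
    {"descriptor": [0.9, 0.1],
     "objects": [("car", (10, 60, 50, 90)), ("road", (0, 70, 100, 100))]},
    {"descriptor": [0.8, 0.2],
     "objects": [("car", (30, 60, 70, 90)), ("road", (0, 60, 100, 100))]},
    {"descriptor": [0.85, 0.1],
     "objects": [("building", (0, 0, 40, 60)), ("road", (0, 80, 100, 100))]},
    {"descriptor": [0.1, 0.9],
     "objects": [("sofa", (10, 50, 80, 90))]},
]


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query, training_set, k=3):
    """Return the k training records whose descriptors are closest to the query."""
    ranked = sorted(training_set, key=lambda rec: euclidean(query, rec["descriptor"]))
    return ranked[:k]


def transfer_labels(query, training_set, k=3, min_votes=2):
    """Propose object labels and locations for the query image.

    Assumption: a label is kept if at least `min_votes` annotated instances
    of it appear in the retrieval set; its location is the mean of their boxes.
    This voting rule stands in for the probabilistic model in the talk.
    """
    boxes_by_label = defaultdict(list)
    for rec in retrieve(query, training_set, k):
        for label, box in rec["objects"]:
            boxes_by_label[label].append(box)

    hypotheses = {}
    for label, boxes in boxes_by_label.items():
        if len(boxes) >= min_votes:
            # Average each box coordinate across the voting instances.
            hypotheses[label] = tuple(sum(c) / len(boxes) for c in zip(*boxes))
    return hypotheses


if __name__ == "__main__":
    # A street-like query descriptor retrieves the three street-like records.
    print(transfer_labels([0.88, 0.12], TRAINING_SET))
```

In this toy run the query is closest to the three street-like records, so "car" and "road" survive the vote while "building" (seen once) and "sofa" (not retrieved) are dropped. A real system would need a much richer image representation and a far larger labeled set.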

Joint work with Antonio Torralba, Bryan Russell, Kevin Murphy, Rob Fergus, and Ce Liu.

Speaker: Bill Freeman, MIT and Adobe Systems
Bill Freeman is a professor of Electrical Engineering and Computer Science at MIT, and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL). He studies computer vision, computer graphics, and machine learning, addressing how to represent, manipulate, and understand images. Before joining MIT, he worked for 9 years at Mitsubishi Electric Research Labs, for 6 years at the Polaroid Corporation, and for 1 year as a Foreign Expert at the Taiyuan University of Technology, Shanxi, China. He also works part time at Adobe's Creative Technologies Lab. His hobbies include flying cameras in kites.






