Large-scale Image Classification: ImageNet and ObjectBank

Video Link: https://www.youtube.com/watch?v=qdDHp29QVdw



Duration: 1:00:47


Google Tech Talk
May 5, 2011

Presented by Professor Fei-Fei Li, Stanford University

ABSTRACT

A key challenge in visual recognition is to recognize and label a large number of visual concepts, such as object and scene classes. In this talk, I'll discuss two recent projects in the Stanford Vision Lab on this topic. ImageNet is a large-scale image ontology built on the backbone structure of WordNet. We show briefly how ImageNet is constructed using Amazon Mechanical Turk. Then, given the largest publicly available dataset of tens of thousands of image classes, we ask how today's state-of-the-art computer vision algorithms perform on the problem of large-scale image classification. Next, I will discuss a new image representation called "Object Bank", which is a significant departure from previous image representation techniques such as Bag-of-Words models using low-level features (SIFT, HOG, GIST, etc.). We show that with the new Object Bank representation, a simple linear SVM classifier can achieve superior performance on all standard image classification datasets. Furthermore, sparsity algorithms make our representation more efficient and scalable for large scene datasets, and reveal semantically meaningful feature patterns.
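The abstract does not spell out how an Object Bank feature vector is assembled. As a rough illustration only, the sketch below assumes each image has already been run through a set of object detectors, each producing a 2D response map, and that each map is max-pooled over a spatial pyramid of 1x1, 2x2, and 4x4 grids before concatenation. The pooling scheme, the pyramid levels, and the names `pool_grid` and `object_bank_features` are assumptions made for this example, not details taken from the talk.

```python
from typing import List, Sequence

# One detector's scores over image locations (hypothetical input format).
ResponseMap = Sequence[Sequence[float]]


def pool_grid(response: ResponseMap, cells: int) -> List[float]:
    """Max-pool one response map over a cells x cells grid, row-major order."""
    rows, cols = len(response), len(response[0])
    if rows < cells or cols < cells:
        raise ValueError("response map is smaller than the pooling grid")
    pooled = []
    for gr in range(cells):
        r0, r1 = gr * rows // cells, (gr + 1) * rows // cells
        for gc in range(cells):
            c0, c1 = gc * cols // cells, (gc + 1) * cols // cells
            pooled.append(
                max(response[r][c] for r in range(r0, r1) for c in range(c0, c1))
            )
    return pooled


def object_bank_features(
    response_maps: Sequence[ResponseMap],
    levels: Sequence[int] = (1, 2, 4),  # assumed pyramid: 1 + 4 + 16 = 21 cells
) -> List[float]:
    """Concatenate pyramid-pooled responses from every detector into one vector."""
    features: List[float] = []
    for response in response_maps:
        for cells in levels:
            features.extend(pool_grid(response, cells))
    return features


# Two toy 4x4 "detector" response maps standing in for real detector output.
map_a = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
map_b = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 2]]
feats = object_bank_features([map_a, map_b])
```

Under these assumptions, each detector contributes 21 values, so the vector length grows linearly with the number of detectors. A vector like `feats` is what would then be passed to a linear SVM, one vector per image.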







Tags:
google tech talk
machine vision
storytelling
image database
image classification