Grounded Visual Generation
Video Link: https://www.youtube.com/watch?v=fLqF4isdWPg
Multi-modal data provides an exciting opportunity to train grounded generative models that synthesize images consistent with real-world phenomena. In this talk, I will share several of our recent efforts towards creating grounded visual generation models: (1) introducing user attention grounding for text-to-image synthesis, (2) improving text-to-image generation results with stronger language grounding, and (3) taking steps towards creating spatially grounded world models for embodied vision-and-language tasks.
Speaker: Jing Yu Koh, Google
MSR Deep Learning team: https://www.microsoft.com/en-us/research/group/deep-learning-group/