Visually Grounded Language Understanding and Generation
In this talk, I will present our latest work on comprehending and generating visually grounded language. First, we will discuss the challenging task of learning visually grounded language: I will introduce how to pretrain task-agnostic visiolinguistic representations for a variety of vision-and-language tasks. In the second part of the talk, I will describe our recent work on image captioning that produces natural language explicitly grounded in entities that object detectors find in the image. At the end of the talk, I will briefly discuss some ongoing efforts on vision-and-language multi-task learning and on generating goal-driven visual dialog without dialog data.
Talk slides: https://www.microsoft.com/en-us/research/uploads/prod/2019/11/Visually-Grounded-Language-Understanding-and-Generation-SLIDES.pdf
See more on this candidate talk at Microsoft Research: https://www.microsoft.com/en-us/research/video/visually-grounded-language-understanding-and-generation/