Research talk: WebQA: Multihop and multimodal

Video Link: https://www.youtube.com/watch?v=ZC82SL3Np0c
Duration: 14:52

Speaker: Yonatan Bisk, Assistant Professor, Carnegie Mellon University

Web search is fundamentally multimodal and multihop. Often, even before asking a question, people go directly to image search to find answers. Further, we rarely find an answer in a single source; instead, we aggregate information from several sources and reason through the implications. Although this happens every day, there is currently no unified question-answering benchmark that requires a single model to answer long-form natural language questions from both text and open-ended visual sources, as people do. The researchers propose to bridge this gap between the natural language and computer vision communities with WebQA. They show that multihop text queries are difficult for a large-scale transformer model, and that existing multimodal transformers and visual representations do not perform well on open-domain visual queries. Their challenge for the community is to create a unified multimodal reasoning model that transitions seamlessly between sources and reasons regardless of the source modality.

Learn more about the 2021 Microsoft Research Summit: https://Aka.ms/researchsummit




Tags:
deep learning
large-scale models
large-scale AI models
AI
artificial intelligence
microsoft research summit