MARI Grand Seminar - Large Language Models and Low Resource Languages

Published on 2023-05-01 ● Video Link: https://www.youtube.com/watch?v=X7c0T7uwtkM



Duration: 1:47:05


Watch our two-hour grand seminar on Large Language Models and Low Resource Languages. The event included a keynote by Dr. Monojit Choudhury titled “Predicting, Explaining and Optimizing Performance of LLMs across Languages,” in which he discussed whether, given a massively multilingual language model (MMLM), one can predict the accuracy of cross-lingual zero-shot and few-shot transfer for a task on target languages with little or no test data. He also gave an overview of Project LITMUS (Linguistically Informed Training and Testing of Multilingual Systems), which involved building several ML models for performance prediction, and discussed what was learnt about the factors that influence cross-lingual transfer.

The talk was followed by a panel discussion with experts from academia and research, including Dr. Monojit Choudhury, Dr. Edward Ombui, Dr. Sunayana Sitaram, and Dr. David Adelani, moderated by Maxamed Axmed.


Keynote Abstract:

Predicting, Explaining and Optimizing Performance of LLMs across Languages

Given a massively multilingual language model (MMLM), can we predict the accuracy of cross-lingual zero-shot and few-shot transfer for a task on target languages with little or no test data? This seemingly impossible task, if solved, can have several potential benefits. First, we could estimate the performance of a model even in languages where a test set is not available or is difficult to build. Second, one can predict training data configurations that would give a desired level of performance across a set of languages, and plan data collection accordingly; this in turn can lead to linguistically fair MMLM-based models. Third, as a byproduct, we would know which factors influence cross-lingual transfer. In this talk, I will give an overview of Project LITMUS (Linguistically Informed Training and Testing of Multilingual Systems), where we build several ML models for performance prediction; besides their applications, I will discuss what we learn about the factors that influence cross-lingual transfer.

Learn more about MARI: https://www.microsoft.com/en-us/research/group/microsoft-africa-research-institute-mari/



