Advances in Natural Language Generation for Indian Languages

Published on: 2024-07-12
Video Link: https://www.youtube.com/watch?v=Djmow7unR5Y



Duration: 59:38


Speaker: Dr. Raj Dabre
Host: Sunayana Sitaram

Much of the recent progress in natural language generation (NLG) has come in the context of English and, more generally, high-resource languages. Indian languages have yet to see similar paradigm shifts, even though their speakers make up about a fifth of the world's population. Two major constraints are data and compute, and in this talk I will touch on both. I will begin with our earliest work on IndicBART, which leveraged monolingual data and helped overcome the resource scarcity of Indian languages, as measured on the IndicNLG benchmark. I will then highlight three recent works: two focus on overcoming data scarcity via mass crawling, cleaning, and synthetic data creation, and the third focuses on compute scarcity by leveraging romanization alongside an existing strong English LLM. I hope this will lead to discussions that help push the boundary of language modeling and NLG for Indian languages.

See more at https://www.microsoft.com/en-us/research/video/advances-in-natural-language-generation-for-indian-languages/



