Building Better Language Models Through Global Understanding

Published on 2025-07-28 ● Video Link: https://www.youtube.com/watch?v=fcStqSMuoYw





Modern language models have achieved remarkable capabilities in English, but human knowledge and experience span thousands of languages, each encoding unique perspectives and problem-solving approaches. From the algorithmic precision required to handle Arabic’s root-pattern morphology to the contextual reasoning needed for Japanese’s topic-prominent structure, each language presents distinct computational challenges that push the boundaries of natural language processing. The talk will address pressing challenges in multilingual AI development, including training data imbalances, cross-lingual transfer limitations, safe and harmless generation, and the complexity of evaluation across different languages. I’ll discuss practical solutions and emerging research directions that could help bridge the current performance gap between high-resource and low-resource languages. By building truly multilingual AI systems, we not only expand access to technology but also develop more sophisticated models capable of handling the full spectrum of human language complexity.

Learn more about Microsoft Research Lab – Africa, Nairobi: https://www.microsoft.com/en-us/research/lab/microsoft-research-lab-africa-nairobi/seminars/



