Scalable and Efficient AI: From Supercomputers to Smartphones

Video Link: https://www.youtube.com/watch?v=bfexfASu9h4



Duration: 1:04:54


Speaker: Torsten Hoefler, Professor in the Scalable Parallel Computing Lab at ETH Zürich
Host: Saeed Maleki, Principal Research SDE at Microsoft Research

Billion-parameter artificial intelligence models have shown exceptional performance across a wide variety of tasks, ranging from natural language processing, computer vision, and image generation to mathematical reasoning and algorithm generation. These models usually require large parallel computing systems, often called "AI supercomputers", for their initial training. We will outline several techniques, spanning data ingestion, parallelization, and accelerator optimization, that improve the efficiency of such training systems. Yet training large models is only a small fraction of practical artificial intelligence computation. Efficient inference is even more challenging: models with hundreds of billions of parameters are expensive to use. We continue by discussing model compression and optimization techniques, such as fine-grained sparsity and quantization, that reduce model size and significantly improve inference efficiency. These techniques may eventually enable inference with powerful models on hand-held devices.
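
To make the two compression ideas named in the abstract concrete, here is a minimal toy sketch in plain Python. It is not taken from the talk: the function names, the example weights, the 50% sparsity level, and the choice of magnitude-based pruning with symmetric per-tensor int8 quantization are all assumptions made for illustration, and production systems use considerably more refined schemes.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights.

    Assumption: simple unstructured magnitude pruning, used here only
    as an illustrative stand-in for fine-grained sparsity.
    """
    k = int(len(weights) * sparsity)  # number of weights to drop
    if k == 0:
        return list(weights)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map the integer codes back to approximate float weights."""
    return [v * scale for v in q]


# Hypothetical toy weight vector (not real model data).
weights = [0.82, -0.05, 0.31, -1.27, 0.02, 0.64, -0.11, 0.09]

pruned = prune_by_magnitude(weights, sparsity=0.5)
q, scale = quantize_int8(pruned)
restored = dequantize(q, scale)

print("pruned:   ", pruned)
print("int8 codes:", q)
print("restored: ", restored)
```

In this sketch, half of the weights become exact zeros that a sparse format need not store, and each surviving weight is held as one 8-bit integer plus a single shared scale instead of a 32-bit float. The rounding error per weight is bounded by half the scale.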




Other Videos By Microsoft Research


2023-11-02 Task Focused IR in the Era of Generative AI Workshop: Invited Talks
2023-11-02 Task Focused IR in the Era of Generative AI Workshop: Intro + Keynote
2023-10-19 The Prompt with Trevor Noah | Episode 1: IHME Population Mapping
2023-10-03 Wildlife Conflict Resolution: Boma & Cattle Detection in the Masai Mara using AI
2023-09-26 CCEdit results
2023-09-22 WiDS Fireside Chat with Jaime Teevan and Ming Ye
2023-09-22 End-to-End Encrypted Group Chats with MLS: Design, Implementation and Verification
2023-09-22 Final intern talk: Improving Frechet Audio Distance for Generative Music Evaluation
2023-09-15 Microsoft Research India - who we are.
2023-08-09 Keypoint Detection for Measuring Body Size of Giraffes: Enhancing Accuracy and Precision
2023-08-04 Scalable and Efficient AI: From Supercomputers to Smartphones
2023-07-18 AI for Precision Health
2023-07-07 Multilingual Evaluation of Generative AI (MEGA)
2023-07-07 The Whole Truth and Nothing But the Truth: Faithful and Controllable Dialogue Response Generation...
2023-07-07 Privacy-Preserving Domain Adaptation of Semantic Parsers
2023-05-30 Microsoft’s Holoportation™ Communications Technology: Facilitating 3D Telemedicine
2023-05-05 Human-Centered AI: Ensuring Human Control While Increasing Automation
2023-05-03 Escapement: A Tool for Interactive Prototyping with Video via Sensor-Mediated Abstraction of Time
2023-05-03 AdHocProx: Sensing Mobile, Ad-Hoc Collaborative Device Formations using Dual Ultra-Wideband Radios
2023-05-01 MARI Grand Seminar - Large Language Models and Low Resource Languages
2023-04-27 Innovating through uncertainty: Getting super curious and combining disparate elements