Scalable and Efficient AI: From Supercomputers to Smartphones

Channel: Microsoft Research

Subscribers: 351,000

Published on August 4, 2023 1:39:57 PM

Video Link: https://www.youtube.com/watch?v=bfexfASu9h4

Duration: 1:04:54

1,718 views

Speakers: Torsten Hoefler, Professor in the Scalable Parallel Computing Lab at ETH Zürich
Host: Saeed Maleki, Principal Research SDE at Microsoft Research

Billion-parameter artificial intelligence models have shown exceptional performance in a wide variety of tasks, ranging from natural language processing, computer vision, and image generation to mathematical reasoning and algorithm generation. Those models usually require large parallel computing systems, often called "AI Supercomputers", for their initial training. We will outline several techniques, ranging from data ingestion and parallelization to accelerator optimization, that improve the efficiency of such training systems. Yet, training large models is only a small fraction of practical artificial intelligence computations. Efficient inference is even more challenging: models with hundreds of billions of parameters are expensive to use. We continue by discussing model compression and optimization techniques, such as fine-grained sparsity and quantization, that reduce model size and significantly improve efficiency during inference. These techniques may eventually enable inference with powerful models on hand-held devices.

