Accelerating Transformers via Kernel Density Estimation - Insu Han

Video Link: https://www.youtube.com/watch?v=TNNqB8hUp_o



Duration: 53:09


A Google TechTalk, presented by Insu Han, 2023/05/30
A Google Algorithms Seminar. ABSTRACT: The dot-product attention mechanism plays a crucial role in modern deep architectures (e.g., the Transformer) for sequence modeling; however, naïve exact computation of attention incurs time and memory costs that are quadratic in the sequence length, hindering the training of long-sequence models. The critical bottlenecks are the computation of the partition functions in the denominator of the softmax function and the multiplication of the softmax matrix with the matrix of values. Our key observation is that the former can be reduced to a variant of the kernel density estimation (KDE) problem, and an efficient KDE solver can be further utilized to accelerate the latter via subsampling-based fast matrix products. Our proposed KDEformer approximates the attention in sub-quadratic time with provable spectral norm bounds, whereas all prior results provide only entry-wise error bounds. Empirically, we verify that KDEformer outperforms other attention approximations in terms of accuracy, memory, and runtime on various pre-trained models.
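To make the reduction concrete, here is a minimal NumPy sketch (an illustrative identity check under our own variable names, not the KDEformer implementation) showing that the softmax partition function Z_i = sum_j exp(q_i . k_j) is exactly a weighted Gaussian kernel density query, since exp(q . k) = exp(|q|^2/2) * exp(|k|^2/2) * exp(-|q - k|^2/2):

import numpy as np

rng = np.random.default_rng(0)
n, d = 256, 64
Q = rng.standard_normal((n, d)) / d**0.5  # queries
K = rng.standard_normal((n, d)) / d**0.5  # keys

# Exact partition function of softmax attention: quadratic in n.
Z_exact = np.exp(Q @ K.T).sum(axis=1)

# The same quantity phrased as a weighted Gaussian KDE query:
#   Z_i = exp(|q_i|^2/2) * sum_j w_j * exp(-|q_i - k_j|^2 / 2),
# with weights w_j = exp(|k_j|^2/2). A fast KDE solver would
# approximate this sum in sub-quadratic time; here it is evaluated
# exactly, only to verify the identity.
w = np.exp(0.5 * (K**2).sum(axis=1))
sq_dists = ((Q[:, None, :] - K[None, :, :]) ** 2).sum(axis=-1)
Z_kde = np.exp(0.5 * (Q**2).sum(axis=1)) * (w * np.exp(-0.5 * sq_dists)).sum(axis=1)

assert np.allclose(Z_exact, Z_kde)  # identity holds up to floating-point error

Per the abstract, the same KDE solver is then reused to drive subsampling-based fast matrix products for the multiplication of the softmax matrix with the value matrix.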

Bio: Insu Han is a postdoctoral research fellow at Yale University, hosted by Amin Karbasi. He completed his Ph.D. in the School of Electrical Engineering at the Korea Advanced Institute of Science and Technology (KAIST) in 2021, under the supervision of Jinwoo Shin. Before that, he obtained his Bachelor's degree in Electrical Engineering with a minor in Mathematics at KAIST. He has worked on developing and analyzing approximate algorithms for large-scale machine learning problems and their applications. His most recent work focuses on accelerating the attention mechanism in large language models via fast kernel density estimation methods. In 2019, he was the recipient of the Microsoft Research Asia Fellowship.




Other Videos By Google TechTalks


2023-07-03 2023 Blockly Developer Summit Day 1-6: Generative Block Programming in MIT App Inventor
2023-07-03 2023 Blockly Developer Summit Day 2-11: Onboarding New Users
2023-07-03 2023 Blockly Developer Summit Day 2-15: Thoughts on Bidirectional Text to Blocks to Text
2023-07-03 2023 Blockly Developer Summit Day 2-6: Code.org - Sprite Lab
2023-07-03 2023 Blockly Developer Summit Day 2-7: How to Convince Teachers to Teach Coding
2023-07-03 2023 Blockly Developer Summit Day 2-14: Text to Blocks to Text with Layout
2023-07-03 2023 Blockly Developer Summit Day 2-8: Active STEM with Unruly Splats
2023-06-29 A Constant Factor Prophet Inequality for Online Combinatorial Auctions
2023-06-21 Open Problems in Mechanistic Interpretability: A Whirlwind Tour
2023-06-11 Online Prediction in Sub-linear Space
2023-06-06 Accelerating Transformers via Kernel Density Estimation - Insu Han
2023-06-06 Differentially Private Synthetic Data via Foundation Model APIs
2023-06-05 Foundation Models and Fair Use
2023-05-30 Differentially Private Online to Batch
2023-05-30 Differentially Private Diffusion Models Generate Useful Synthetic Images
2023-05-30 Improving the Privacy Utility Tradeoff in Differentially Private Machine Learning with Public Data
2023-05-30 Randomized Approach for Tight Privacy Accounting
2023-05-30 Almost Tight Error Bounds on Differentially Private Continual Counting
2023-05-30 EIFFeL: Ensuring Integrity for Federated Learning
2023-05-30 Differentially Private Diffusion Models
2023-05-15 Damian Grimling | Sentistocks | Sentimenti | web3 talks | March 9th 2023 | MC: Blake DeBenon