Abstracts: NeurIPS 2024 with Weizhu Chen
Video Link: https://www.youtube.com/watch?v=EIdZgX9Daic
Next-token prediction trains a language model on all tokens in a sequence. VP Weizhu Chen discusses his team’s 2024 NeurIPS paper on how distinguishing between useful and “noisy” tokens in pretraining can improve token efficiency and model performance.
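The idea of skipping "noisy" tokens can be illustrated with a toy loss function. The sketch below averages next-token loss over only a top fraction of positions instead of all of them. This is a simplified illustration, not the paper's method: the paper's selection rule is more involved (it scores tokens with the help of a reference model), whereas this sketch just keeps the highest per-token losses by rank, and `keep_ratio` is a hypothetical parameter.

```python
def selective_loss(token_losses, keep_ratio=0.6):
    """Toy selective language-modeling objective.

    Standard next-token training averages the loss over every
    position in the sequence. Here we instead keep only a
    fraction of tokens (the largest losses, by rank) and mask
    out the rest, so "noisy" low-signal positions do not
    contribute to the gradient. The rank-based selection rule
    is a simplification chosen for illustration.
    """
    k = max(1, int(len(token_losses) * keep_ratio))
    kept = sorted(token_losses, reverse=True)[:k]
    return sum(kept) / k


# Example: four per-token losses, keep the top half.
print(selective_loss([1.0, 2.0, 3.0, 4.0], keep_ratio=0.5))  # → 3.5
```

With `keep_ratio=1.0` this reduces to the ordinary mean loss over all tokens, which makes the contrast with full next-token prediction easy to see.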
Show notes: https://www.microsoft.com/en-us/research/podcast/abstracts-neurips-2024-with-weizhu-chen/
Listen to the Abstracts series: https://www.microsoft.com/en-us/research/podcast-series/abstracts/