Abstracts: NeurIPS 2024 with Weizhu Chen

Video link: https://www.youtube.com/watch?v=EIdZgX9Daic




Next-token prediction trains a language model on every token in a sequence. VP Weizhu Chen discusses his team's 2024 NeurIPS paper, which shows how distinguishing useful tokens from "noisy" ones during pretraining can improve both token efficiency and model performance.

Show notes: https://www.microsoft.com/en-us/research/podcast/abstracts-neurips-2024-with-weizhu-chen/

Listen to the Abstracts series: https://www.microsoft.com/en-us/research/podcast-series/abstracts/