Pushing the capabilities of Gemma 3 via distillation and RL fine-tuning
Subscribers: 2,520,000
Video Link: https://www.youtube.com/watch?v=y-jRGYnDgfM
Specialized capabilities (e.g., math, coding, multilinguality, tool use) are key areas of improvement in post-training. In this talk, we explore a novel strategy that combines large-scale distillation and RL fine-tuning to push specialized capabilities in language models while still improving their generality.
Subscribe to Google for Developers → https://goo.gle/developers
Speakers: Johan Ferret
Products Mentioned: Gemma