How Llama 2 works: Ghost Attention, Quality Supervised Fine-tuning, RLHF for Safety and Helpfulness
We go through the various mechanisms behind Llama 2.
Pre-training: 2 trillion tokens
Supervised Fine-tuning: Tens of thousands of high quality samples
RLHF: To make outputs safer and more helpful
Ghost Attention: To help make the attention mechanism work for longer prompts
I do not agree with all of them, but overall Llama 2 is a great model to use!
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Slides can be found here: https://github.com/tanchongmin/TensorFlow-Implementations/blob/main/Paper_Reviews/Llama%202.pdf
Part 1 here: https://www.youtube.com/watch?v=SBBFxwnABLM
How ChatGPT works: https://www.youtube.com/watch?v=wA8rjKueB3Q
Llama paper: https://arxiv.org/abs/2302.13971
Transformer Paper: https://arxiv.org/abs/1706.03762
Grouped Query Attention (GQA): https://arxiv.org/pdf/2305.13245.pdf
Rotary Positional Embeddings: https://arxiv.org/abs/2104.09864
Constitutional AI (Anthropic): https://arxiv.org/abs/2212.08073
RLHF Paper (OpenAI): https://arxiv.org/abs/2203.02155
Less is More for Alignment (LIMA): https://arxiv.org/abs/2305.11206
Phi-1 - Textbooks are all you Need (small but specialized model): https://arxiv.org/abs/2306.11644
Tiny Stories (small but specialized model): https://arxiv.org/abs/2305.07759
~~~~~~~~~~~~~~~~~~~~~~~~~~~
0:00 Ghost Attention
4:48 Llama 2 has the best Open Source Performance
7:43 Llama 2 vs Llama 1
11:23 Rotary Positional Embeddings (RoPE)
20:23 Overall Training Flow
21:11 Pre-training
25:50 Supervised Fine-Tuning (SFT)
32:52 Human Feedback to train Reward Models
47:14 Reinforcement Learning from Human Feedback (RLHF)
1:06:55 Discussion
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
AI and ML enthusiast. Likes to think about the essences behind breakthroughs of AI and explain it in a simple and relatable way. Also, I am an avid game creator.
Discord: https://discord.gg/bzp87AHJy5
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin