Learning via Self-Play: An AlphaGo/AlphaGo Zero Story
Since its historic win against Go champion Lee Sedol in 2016, AlphaGo has made headlines around the world; before that match, many thought AI would need another decade to surpass humans at Go. AlphaGo uses an initial supervised learning procedure to learn from the games of human professionals, then improves itself further through self-play reinforcement learning. AlphaGo Zero took this one step further: it learnt the game from the rules alone, without any human knowledge, and ended up outperforming the original AlphaGo!
It is exciting how reinforcement learning methods can reach superhuman levels through self-play, and this presentation gives a beginner's overview of the winning methods behind AlphaGo/AlphaGo Zero - namely:
(1) Monte Carlo Tree Search (which balances the explore-exploit tradeoff and serves as a way to look ahead and self-improve),
(2) a neural network to estimate how good a board position is (the value network), and
(3) a neural network to decide which moves to focus on (the policy network); a rough sketch of how these pieces fit together is shown below.
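The sketch below is a minimal, hypothetical Python illustration of how these pieces combine in an AlphaGo-Zero-style search; it is not DeepMind's code. The policy_network and value_network stubs, the Nim toy game, and the exploration constant c_puct are all assumptions made for illustration; the real system uses a single deep residual network evaluated on Go positions, plus many engineering details omitted here.

```python
import math
import random

# --- Hypothetical network stubs (assumptions, not the real networks) ---
# AlphaGo Zero uses one deep residual network with policy and value heads;
# these stubs return a uniform prior and a random value so the sketch runs.

def policy_network(state, legal_moves):
    """Prior probability for each legal move (controls search breadth)."""
    p = 1.0 / len(legal_moves)
    return {move: p for move in legal_moves}

def value_network(state):
    """Estimated value of the position for the side to move (controls depth)."""
    return random.uniform(-1.0, 1.0)

# --- Minimal MCTS with PUCT-style selection ---

class Node:
    def __init__(self, prior):
        self.prior = prior        # P(s, a): prior from the policy network
        self.visits = 0           # N(s, a): visit count
        self.value_sum = 0.0      # W(s, a): total backed-up value
        self.children = {}        # move -> Node

    def q(self):
        """Mean value, from the perspective of the player who moved into this node."""
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_puct=1.5):
    """Explore-exploit tradeoff: high Q (exploit) vs. high prior / few visits (explore)."""
    sqrt_total = math.sqrt(sum(c.visits for c in node.children.values()) + 1)
    def puct(item):
        _, child = item
        return child.q() + c_puct * child.prior * sqrt_total / (1 + child.visits)
    return max(node.children.items(), key=puct)

def mcts(root_state, num_simulations=200):
    root = Node(prior=1.0)
    for _ in range(num_simulations):
        node, state, path = root, root_state.copy(), [root]
        # 1. Select: walk down the tree with PUCT until reaching a leaf.
        while node.children:
            move, node = select_child(node)
            state.play(move)
            path.append(node)
        # 2. Expand: let the policy network propose children (breadth).
        if not state.is_terminal():
            priors = policy_network(state, state.legal_moves())
            for move, p in priors.items():
                node.children[move] = Node(prior=p)
        # 3. Evaluate: the value network replaces a full random rollout (depth).
        value = state.result() if state.is_terminal() else value_network(state)
        # 4. Back up: alternate the sign so each node scores its own mover.
        for n in reversed(path):
            value = -value
            n.visits += 1
            n.value_sum += value
    # Play the most-visited root move (in AlphaGo Zero, the root visit
    # distribution also becomes the training target for the policy network).
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]

# --- A toy game so the sketch actually runs (illustrative only) ---

class Nim:
    """Take 1-3 stones per turn; whoever takes the last stone wins."""
    def __init__(self, stones=10):
        self.stones = stones
    def copy(self):
        return Nim(self.stones)
    def legal_moves(self):
        return [n for n in (1, 2, 3) if n <= self.stones]
    def play(self, move):
        self.stones -= move
    def is_terminal(self):
        return self.stones == 0
    def result(self):
        return -1.0  # side to move faces an empty pile: the opponent just won

if __name__ == "__main__":
    print("MCTS picks:", mcts(Nim(10)))  # optimal Nim play takes 2, leaving a multiple of 4
```

The closing comment hints at the self-play loop covered later in the talk: because the searched move distribution is stronger than the raw policy, the root visit counts serve as training targets, so the network and the search improve each other without any human games.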
0:00 Intro (AlphaGo Movie)
3:11 Start of Talk
4:35 Explore-Exploit Tradeoff
8:28 Monte Carlo
11:40 Monte Carlo Tree Search
18:58 AlphaGo (Neural Networks + MCTS)
23:46 Policy Network (Breadth)
26:34 Value Network (Depth)
28:14 AlphaGo: An Overview
31:19 AlphaGo Zero (no human expert knowledge)
34:51 MCTS in AlphaGo Zero
37:40 Self-play
39:00 Simplicity is better: Human features can be distracting
39:23 AlphaGo Zero Performance
40:18 How to achieve superhuman performance?
41:27 Q&A