[AlphaGo Zero] Mastering the game of Go without human knowledge | TDLS

Video Link: https://www.youtube.com/watch?v=_x9bXso3wo4



Duration: 1:17:39


Toronto Deep Learning Series

For slides and more information, visit https://tdls.a-i.science/events/2019-02-25/

Discussion lead: Liam Hinzman
Discussion facilitators: Tahseen Shabab, Susan Chang

Mastering the Game of Go without Human Knowledge

"A long-standing goal of artificial intelligence is an algorithm that learns, tabula rasa, superhuman proficiency in challenging domains. Recently, AlphaGo became the first program to defeat a world champion in the game of Go. The tree search in AlphaGo evaluated positions and selected moves using deep neural networks. These neural networks were trained by supervised learning from human expert moves, and by reinforcement learning from self-play. Here, we introduce an algorithm based solely on reinforcement learning, without human data, guidance, or domain knowledge beyond game rules. AlphaGo becomes its own teacher: a neural network is trained to predict AlphaGo’s own move selections and also the winner of AlphaGo’s games. This neural network improves the strength of tree search, resulting in higher quality move selection and stronger self-play in the next iteration. Starting tabula rasa, our new program AlphaGo Zero achieved superhuman performance, winning 100-0 against the previously published, champion-defeating AlphaGo."
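
The abstract says one network is trained to predict both the search's move selections and the game winner. As a rough illustration, the sketch below combines a squared error on the predicted winner, a cross-entropy between the search's move distribution and the network's move probabilities, and an L2 penalty on the weights. The function name, argument names, and the default l2_coeff are hypothetical choices for this example, not code from the paper or the talk, and a real implementation would work on batched tensors rather than Python lists.

```python
import math


def self_play_loss(policy_probs, search_probs, value_pred, outcome,
                   weights=(), l2_coeff=1e-4):
    """Illustrative combined loss for one training position.

    policy_probs: network's move probabilities (hypothetical name)
    search_probs: move distribution produced by tree search
    value_pred:   network's predicted outcome in [-1, 1]
    outcome:      actual game result from the current player's view (-1 or +1)
    weights:      flat iterable of network parameters (for the L2 term)
    l2_coeff:     regularisation strength (placeholder default)
    """
    eps = 1e-12  # avoids log(0) when the network assigns zero probability

    # Value head: squared error between predicted and actual winner.
    value_loss = (outcome - value_pred) ** 2

    # Policy head: cross-entropy pulling the network toward the search policy.
    policy_loss = -sum(pi * math.log(p + eps)
                       for pi, p in zip(search_probs, policy_probs))

    # L2 penalty on the parameters.
    l2_loss = l2_coeff * sum(w * w for w in weights)

    return value_loss + policy_loss + l2_loss
```

For example, self_play_loss([0.9, 0.1], [1.0, 0.0], value_pred=1.0, outcome=1.0) is small because the network already agrees with the search and the result, while a uniform policy or a wrong value prediction increases it.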







Tags:
alphago zero
alphago
reinforcement learning
alpha go
deepmind