This Team won the Minecraft RL BASALT Challenge! (Paper Explanation & Interview with the authors)

Channel: Yannic Kilcher
Subscribers: 300,000

Published on January 11, 2022 9:17:53 PM ● Video Link: https://www.youtube.com/watch?v=a4P8v8lGFPw

Duration: 1:23:51

12,609 views

#minerl #minecraft #deeplearning

The MineRL BASALT challenge has no reward functions or technical descriptions of what is to be achieved. Instead, the goal of each task is given as a short natural language string, and the agent is evaluated by a team of human judges who rate both how well the goal has been fulfilled and how human-like the agent behaved. In this video, I interview KAIROS, the winning team of the 2021 challenge, and discuss how they used a combination of machine learning, efficient data collection, hand engineering, and a bit of knowledge about Minecraft to beat all other teams.

OUTLINE:
0:00 - Introduction
4:10 - Paper Overview
11:15 - Start of Interview
17:05 - First Approach
20:30 - State Machine
26:45 - Efficient Label Collection
30:00 - Navigation Policy
38:15 - Odometry Estimation
46:00 - Pain Points & Learnings
50:40 - Live Run Commentary
58:50 - What other tasks can be solved?
1:01:55 - What made the difference?
1:07:30 - Recommendations & Conclusion
1:11:10 - Full Runs: Waterfall
1:12:40 - Full Runs: Build House
1:17:45 - Full Runs: Animal Pen
1:20:50 - Full Runs: Find Cave

Paper: https://arxiv.org/abs/2112.03482
Code: https://github.com/viniciusguigo/kairos_minerl_basalt
Challenge Website: https://minerl.io/basalt/

Paper Title: Combining Learning from Human Feedback and Knowledge Engineering to Solve Hierarchical Tasks in Minecraft

Abstract:
Real-world tasks of interest are generally poorly defined by human-readable descriptions and have no pre-defined reward signals unless they are defined by a human designer. Conversely, data-driven algorithms are often designed to solve a specific, narrowly defined task with performance metrics that drive the agent's learning. In this work, we present the solution that won first place and was awarded the most human-like agent in the 2021 NeurIPS Competition MineRL BASALT Challenge: Learning from Human Feedback in Minecraft, which challenged participants to use human data to solve four tasks defined only by a natural language description and no reward function. Our approach uses the available human demonstration data to train an imitation learning policy for navigation and additional human feedback to train an image classifier. These modules, together with an estimated odometry map, are then combined into a state-machine designed based on human knowledge of the tasks that breaks them down in a natural hierarchy and controls which macro behavior the learning agent should follow at any instant. We compare this hybrid intelligence approach to both end-to-end machine learning and pure engineered solutions, which are then judged by human evaluators. Codebase is available at https://github.com/viniciusguigo/kairos_minerl_basalt.

Authors: Vinicius G. Goecks, Nicholas Waytowich, David Watkins, Bharat Prakash
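
To make the abstract's architecture more concrete, here is a minimal Python sketch of how a state machine might switch between a learned navigation behavior and a hand-engineered task behavior. It is an illustration only, not the KAIROS code: the `State` names, the `Agent` class, the `threshold` value, and the `goal_prob`, `dx`, and `dz` inputs are all assumptions made for this example. The classifier output and the per-step displacement are passed in as plain numbers instead of being computed from game frames.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    NAVIGATE = auto()      # follow the (learned) navigation policy
    EXECUTE_TASK = auto()  # run a scripted, hand-engineered behavior
    DONE = auto()


@dataclass
class Agent:
    threshold: float = 0.8         # hypothetical classifier confidence cutoff
    state: State = State.NAVIGATE
    x: float = 0.0                 # odometry estimate, summed from per-step moves
    z: float = 0.0

    def step(self, goal_prob: float, dx: float, dz: float) -> State:
        """Advance one tick and return the new state.

        goal_prob: assumed classifier score that a suitable location is in view.
        dx, dz: assumed displacement implied by the action just taken.
        """
        if self.state is State.NAVIGATE:
            # Dead-reckoning odometry: accumulate displacement while moving.
            self.x += dx
            self.z += dz
            if goal_prob >= self.threshold:
                self.state = State.EXECUTE_TASK
        elif self.state is State.EXECUTE_TASK:
            # The scripted behavior is treated as a single step in this sketch.
            self.state = State.DONE
        return self.state


agent = Agent()
for goal_prob, dx, dz in [(0.1, 1.0, 0.0), (0.9, 1.0, 0.5), (0.0, 0.0, 0.0)]:
    print(agent.step(goal_prob, dx, dz), (agent.x, agent.z))
```

Keeping the transitions in one small `step` method means the rule for leaving navigation can be read and adjusted by hand, which is the kind of human task knowledge the abstract says the state machine encodes.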

Links:
TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yannic-kilcher
LinkedIn: https://www.linkedin.com/in/ykilcher
BiliBili: https://space.bilibili.com/2017636191

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Tags:
deep learning, machine learning, arxiv, explained, neural networks, artificial intelligence, paper, minecraft, minerl, minerl basalt, minecraft machine learning, minecraft ai, human-like ai, minecraft bot, minecraft ai challenge, minecraft reinforcement learning, behavior cloning, kairos, minecraft kairos, minerl kairos, minerl winners, interview, with the authors, minecraft deep learning, minecraft behavior cloning, gail, generative adversarial imitation learning, state machine
