Galactica: A Large Language Model for Science (Drama & Paper Review)

Subscribers:
287,000
Video Link: https://www.youtube.com/watch?v=ZTs_mXwMCs8



Category:
Review
Duration: 51:33
43,504 views
1,815


#ai #galactica #meta

Galactica is a language model trained on a curated corpus of scientific documents, such as papers, knowledge bases, reviews, and other articles. The model can be used in a generative fashion to assist scientific writing, predict references, and much more, including a new approach to step-by-step reasoning that uses a clever encoding of intermediate steps. This video explains the paper, but also dives into the drama that ensued once Meta released a public demo of the model.

OUTLINE:
0:00 - Introduction
1:30 - Drama around the public demo
16:00 - Start of paper review
20:30 - Dataset construction and encoding
23:30 - Encoding step-by-step reasoning using a scratchpad
33:00 - Modelling scientific references & citations
35:05 - Prompt Pre-Training
37:10 - Architecture details
38:30 - Experimental results
49:20 - Conclusion

Paper: https://galactica.org/static/paper.pdf
Website: https://galactica.org/explore/

Abstract:
Information overload is a major obstacle to scientific progress. The explosive growth in scientific literature and data has made it ever harder to discover useful insights in a large mass of information. Today scientific knowledge is accessed through search engines, but they are unable to organize scientific knowledge alone. In this paper we introduce Galactica: a large language model that can store, combine and reason about scientific knowledge. We train on a large scientific corpus of papers, reference material, knowledge bases and many other sources. We outperform existing models on a range of scientific tasks. On technical knowledge probes such as LaTeX equations, Galactica outperforms the latest GPT-3 by 68.2% versus 49.0%. Galactica also performs well on reasoning, outperforming Chinchilla on mathematical MMLU by 41.3% to 35.7%, and PaLM 540B on MATH with a score of 20.4% versus 8.8%. It also sets a new state-of-the-art on downstream tasks such as PubMedQA and MedMCQA dev of 77.6% and 52.9%. And despite not being trained on a general corpus, Galactica outperforms BLOOM and OPT-175B on BIG-bench. We believe these results demonstrate the potential for language models as a new interface for science. We open source the model for the benefit of the scientific community.

Authors: Ross Taylor, Marcin Kardas, Guillem Cucurull, Thomas Scialom, Anthony Hartshorn, Elvis Saravia, Andrew Poulton, Viktor Kerkez, Robert Stojnic


Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
LinkedIn: https://www.linkedin.com/in/ykilcher

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n




Other Videos By Yannic Kilcher


2023-03-11 | This ChatGPT Skill will earn you $10B (also, AI reads your mind!) | ML News
2023-03-02 | LLaMA: Open and Efficient Foundation Language Models (Paper Explained)
2023-02-24 | Open Assistant Inference Backend Development (Hands-On Coding)
2023-02-04 | OpenAssistant - ChatGPT's Open Alternative (We need your help!)
2022-12-31 | Open Assistant Live Coding (Open-Source ChatGPT Replication)
2022-12-29 | AI Essay Competition (lab42)
2022-12-26 | Open Assistant Live Coding (Open-Source ChatGPT Replication)
2022-12-07 | ChatGPT: This AI has a JAILBREAK?! (Unbelievable AI Progress)
2022-11-27 | [ML News] GPT-4 Rumors | AI Mind Reading | Neuron Interaction Solved | AI Theorem Proving
2022-11-25 | CICERO: An AI agent that negotiates, persuades, and cooperates with people
2022-11-19 | Galactica: A Large Language Model for Science (Drama & Paper Review)
2022-11-13 | [ML News] Multiplayer Stable Diffusion | OpenAI needs more funding | Text-to-Video models incoming
2022-11-09 | The New AI Model Licenses have a Legal Loophole (OpenRAIL-M of BLOOM, Stable Diffusion, etc.)
2022-11-04 | ROME: Locating and Editing Factual Associations in GPT (Paper Explained & Author Interview)
2022-11-01 | Is Stability turning into OpenAI?
2022-10-21 | Neural Networks are Decision Trees (w/ Alexander Mattick)
2022-10-07 | This is a game changer! (AlphaTensor by DeepMind explained)
2022-10-02 | [ML News] OpenAI's Whisper | Meta Reads Brain Waves | AI Wins Art Fair, Annoys Humans
2022-09-18 | [ML News] Stable Diffusion Takes Over! (Open Source AI Art)
2022-09-17 | How to make your CPU as fast as a GPU - Advances in Sparsity w/ Nir Shavit
2022-09-13 | More Is Different for AI - Scaling Up, Emergence, and Paperclip Maximizers (w/ Jacob Steinhardt)



Tags:
deep learning
machine learning
arxiv
explained
neural networks
ai
artificial intelligence
paper
galactica
meta
meta ai
facebook ai
ai science
galactica ai
galactica model
yann lecun
research
fair
deep learning tutorial
what is deep learning
introduction to deep learning