[ML News] Chips, Robots, and Models

Channel: Yannic Kilcher (Switzerland)

Subscribers: 292,000

Published on April 30, 2024 7:13:42 PM

Video Link: https://www.youtube.com/watch?v=tRavLU8Ih4A

Duration: 39:13

28,939 views

985

OUTLINE:
00:00 - Intro
00:19 - Our next-generation Meta Training and Inference Accelerator
01:39 - ALOHA Unleashed
03:10 - Apple Inks $50M Deal with Shutterstock for AI Training Data
04:28 - OpenAI Researchers, Including Ally of Sutskever, Fired for Alleged Leaking
05:01 - Adobe's Ethical Firefly AI was Trained on Midjourney Images
05:52 - Trudeau announces $2.4 billion for AI-related investments
06:48 - RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
07:15 - CodeGemma - an official Google release for code LLMs
07:24 - Mistral AI: Cheaper, Better, Faster, Stronger
08:08 - Vezora/Mistral-22B-v0.1
09:00 - WizardLM-2, next generation state-of-the-art-LLM
09:31 - Idefics2, the strongest Vision-Language-Model (VLM) below 10B!
10:14 - BlinkDL/rwkv-6-world
10:50 - Pile-T5: Trained T5 on the Pile
11:35 - Model Card for Zephyr 141B-A39B
12:42 - Parler TTS
13:11 - RHO-1: Not all tokens are what you need
14:59 - Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs

References:
https://twitter.com/ayzwah/status/1780263768968273923
https://ai.meta.com/blog/next-generation-meta-training-inference-accelerator-AI-MTIA/?utm_source=twitter
https://twitter.com/soumithchintala/status/1778087952964374854?t=Mb-mQvm4YIZ35pVpEijs6g&s=09
https://deepnewz.com/tech/apple-inks-50m-deal-shutterstock-ai-training-data
https://twitter.com/TolgaBilge_/status/1778598047821291793?t=zInlPDRZzozcz7-pjFSnyA&s=09
https://twitter.com/javilopen/status/1778821749792034911?t=oGLiMj6GQdKTuM6GbiYrAg&s=09
https://twitter.com/paulg/status/1781329523155357914?t=vCQT2mJf5BbtjdN1BMFYFQ&s=09
https://twitter.com/RichardSocher/status/1776706907295846628
https://www.cbc.ca/news/politics/federal-government-ai-investment-1.7166234
https://arxiv.org/pdf/2404.07839
https://huggingface.co/blog/codegemma
https://mistral.ai/news/mixtral-8x22b/
https://twitter.com/MistralAILabs/status/1780606904273702932?t=JlSCcYulpJL74pNJbtSZag&s=09
https://huggingface.co/Vezora/Mistral-22B-v0.1
https://huggingface.co/Vezora/Mistral-22B-v0.2
https://twitter.com/WizardLM_AI/status/1779899325868589372?t=l0Fd-4mfdtz3np_gALKaLA&s=09
https://twitter.com/_philschmid/status/1779922877589889400?t=7q1xg1LRy80mV8JGRm4aqA&s=09
https://huggingface.co/BlinkDL/rwkv-6-world
https://blog.eleuther.ai/pile-t5/
https://huggingface.co/HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1
https://huggingface.co/MaziyarPanahi/zephyr-orpo-141b-A35b-v0.1-GGUF
https://twitter.com/reach_vb/status/1778138382633140276?t=Mb-mQvm4YIZ35pVpEijs6g&s=09
https://arxiv.org/pdf/2404.07965
https://arxiv.org/pdf/2404.05719
https://sambanova.ai/blog/samba-coe-the-power-of-routing-ml-models-at-scale
https://www.microsoft.com/en-us/research/project/vasa-1/
https://twitter.com/twelve_labs/status/1780939765405065254?t=5ONxSzdwnghsKcwq3IPmEQ&s=09
https://drive.google.com/file/d/1Av5jpsbH3g09TRD1PfRh0nLsYrN_iu7_/view
https://arxiv.org/pdf/2404.12387
https://arxiv.org/abs/2404.12241
https://arxiv.org/pdf/2404.12241
https://twitter.com/Alon_Jacoby/status/1780650122382049596
https://audiodialogues.github.io/
https://os-world.github.io/
https://ai.meta.com/blog/openeqa-embodied-question-answering-robotics-ar-glasses/?utm_source=twitter&utm_medium=organic_social&utm_content=video&utm_campaign=dataset
https://arxiv.org/pdf/2404.07503
https://arxiv.org/pdf/2404.06654
https://twitter.com/amanrsanger/status/1779620682340704386?t=UnOronFwkESwAXiE0i0R4A&s=09
https://huggingface.co/datasets/xai-org/RealworldQA
https://github.com/PygmalionAI/aphrodite-engine
https://github.com/jina-ai/reader/?tab=readme-ov-file
https://r.jina.ai/https://x.com/elonmusk
https://r.jina.ai/https://github.com/jina-ai/reader
https://github.com/rogeriochaves/langstream
https://twitter.com/mvpatel2000/status/1777891913313440215?t=m5POrtTTS33tgwmRztQj3w&s=09
https://github.com/databricks/megablocks
https://github.com/nus-apr/auto-code-rover
https://github.com/nus-apr/auto-code-rover/blob/main/preprint.pdf
https://twitter.com/karpathy/status/1683143097604243456?t=7V_ApJFbjrm4TbxM5n3nXA&s=09
https://twitter.com/karpathy/status/1777427944971083809?t=s6xYQmYkhQyiFU65Fwq9tw&s=09
https://github.com/BasedHardware/Friend
https://twitter.com/argmaxinc/status/1781382688819282132?t=vCQT2mJf5BbtjdN1BMFYFQ&s=09
https://twitter.com/awnihannun/status/1778519566437794109?t=8N5PjwlKJpGotTx_HXZQrQ&s=09
https://twitter.com/Prince_Canuma/status/1776399292036501898
https://pytorch.org/blog/torchtune-fine-tune-llms/

If you want to support me, the best thing to do is to share out the content :)

Other Videos By Yannic Kilcher

2024-11-23	TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters (Paper Explained)
2024-10-19	GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
2024-10-12	Were RNNs All We Needed? (Paper Explained)
2024-10-05	Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters (Paper)
2024-08-04	Privacy Backdoors: Stealing Data with Corrupted Pretrained Models (Paper Explained)
2024-07-08	Scalable MatMul-free Language Modeling (Paper Explained)
2024-06-26	Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools (Paper Explained)
2024-06-01	xLSTM: Extended Long Short-Term Memory
2024-05-21	[ML News] OpenAI is in hot waters (GPT-4o, Ilya Leaving, Scarlett Johansson legal action)
2024-05-01	ORPO: Monolithic Preference Optimization without Reference Model (Paper Explained)
2024-04-30	[ML News] Chips, Robots, and Models
2024-04-28	TransformerFAM: Feedback attention is working memory
2024-04-27	[ML News] Devin exposed | NeurIPS track for high school students
2024-04-24	Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
2024-04-23	[ML News] Llama 3 changes the game
2024-04-17	Hugging Face got hacked
2024-04-15	[ML News] Microsoft to spend 100 BILLION DOLLARS on supercomputer (& more industry news)
2024-04-13	[ML News] Jamba, CMD-R+, and other new models (yes, I know this is like a week behind 🙃)
2024-04-08	Flow Matching for Generative Modeling (Paper Explained)
2024-04-06	Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping (Searchformer)
2024-03-26	[ML News] Grok-1 open-sourced | Nvidia GTC | OpenAI leaks model names | AI Act

Tags: deep learning, machine learning, arxiv, explained, neural networks, ai, artificial intelligence, paper
