[ML News] Chips, Robots, and Models

Published on 2024-04-30
Video Link: https://www.youtube.com/watch?v=tRavLU8Ih4A



Duration: 39:13


OUTLINE:
00:00 - Intro
00:19 - Our next-generation Meta Training and Inference Accelerator
01:39 - ALOHA Unleashed
03:10 - Apple Inks $50M Deal with Shutterstock for AI Training Data
04:28 - OpenAI Researchers, Including Ally of Sutskever, Fired for Alleged Leaking
05:01 - Adobe's Ethical Firefly AI was Trained on Midjourney Images
05:52 - Trudeau announces $2.4 billion for AI-related investments
06:48 - RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
07:15 - CodeGemma - an official Google release for code LLMs
07:24 - Mistral AI: Cheaper, Better, Faster, Stronger
08:08 - Vezora/Mistral-22B-v0.1
09:00 - WizardLM-2, next generation state-of-the-art LLM
09:31 - Idefics2, the strongest Vision-Language-Model (VLM) below 10B!
10:14 - BlinkDL/rwkv-6-world
10:50 - Pile-T5: Trained T5 on the Pile
11:35 - Model Card for Zephyr 141B-A39B
12:42 - Parler TTS
13:11 - RHO-1: Not all tokens are what you need
14:59 - Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs

References:
https://twitter.com/ayzwah/status/1780263768968273923
https://ai.meta.com/blog/next-generation-meta-training-inference-accelerator-AI-MTIA/?utm_source=twitter
https://twitter.com/soumithchintala/status/1778087952964374854?t=Mb-mQvm4YIZ35pVpEijs6g&s=09
https://deepnewz.com/tech/apple-inks-50m-deal-shutterstock-ai-training-data
https://twitter.com/TolgaBilge_/status/1778598047821291793?t=zInlPDRZzozcz7-pjFSnyA&s=09
https://twitter.com/javilopen/status/1778821749792034911?t=oGLiMj6GQdKTuM6GbiYrAg&s=09
https://twitter.com/paulg/status/1781329523155357914?t=vCQT2mJf5BbtjdN1BMFYFQ&s=09
https://twitter.com/RichardSocher/status/1776706907295846628
https://www.cbc.ca/news/politics/federal-government-ai-investment-1.7166234
https://arxiv.org/pdf/2404.07839
https://huggingface.co/blog/codegemma
https://mistral.ai/news/mixtral-8x22b/
https://twitter.com/MistralAILabs/status/1780606904273702932?t=JlSCcYulpJL74pNJbtSZag&s=09
https://huggingface.co/Vezora/Mistral-22B-v0.1
https://huggingface.co/Vezora/Mistral-22B-v0.2
https://twitter.com/WizardLM_AI/status/1779899325868589372?t=l0Fd-4mfdtz3np_gALKaLA&s=09
https://twitter.com/_philschmid/status/1779922877589889400?t=7q1xg1LRy80mV8JGRm4aqA&s=09
https://huggingface.co/BlinkDL/rwkv-6-world
https://blog.eleuther.ai/pile-t5/
https://huggingface.co/HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1
https://huggingface.co/MaziyarPanahi/zephyr-orpo-141b-A35b-v0.1-GGUF
https://twitter.com/reach_vb/status/1778138382633140276?t=Mb-mQvm4YIZ35pVpEijs6g&s=09
https://arxiv.org/pdf/2404.07965
https://arxiv.org/pdf/2404.05719
https://sambanova.ai/blog/samba-coe-the-power-of-routing-ml-models-at-scale
https://www.microsoft.com/en-us/research/project/vasa-1/
https://twitter.com/twelve_labs/status/1780939765405065254?t=5ONxSzdwnghsKcwq3IPmEQ&s=09
https://drive.google.com/file/d/1Av5jpsbH3g09TRD1PfRh0nLsYrN_iu7_/view
https://arxiv.org/pdf/2404.12387
https://arxiv.org/abs/2404.12241
https://arxiv.org/pdf/2404.12241
https://twitter.com/Alon_Jacoby/status/1780650122382049596
https://audiodialogues.github.io/
https://os-world.github.io/
https://ai.meta.com/blog/openeqa-embodied-question-answering-robotics-ar-glasses/?utm_source=twitter&utm_medium=organic_social&utm_content=video&utm_campaign=dataset
https://arxiv.org/pdf/2404.07503
https://arxiv.org/pdf/2404.06654
https://twitter.com/amanrsanger/status/1779620682340704386?t=UnOronFwkESwAXiE0i0R4A&s=09
https://huggingface.co/datasets/xai-org/RealworldQA
https://github.com/PygmalionAI/aphrodite-engine
https://github.com/jina-ai/reader/?tab=readme-ov-file
https://r.jina.ai/https://x.com/elonmusk
https://r.jina.ai/https://github.com/jina-ai/reader
https://github.com/rogeriochaves/langstream
https://twitter.com/mvpatel2000/status/1777891913313440215?t=m5POrtTTS33tgwmRztQj3w&s=09
https://github.com/databricks/megablocks
https://github.com/nus-apr/auto-code-rover
https://github.com/nus-apr/auto-code-rover/blob/main/preprint.pdf
https://twitter.com/karpathy/status/1683143097604243456?t=7V_ApJFbjrm4TbxM5n3nXA&s=09
https://twitter.com/karpathy/status/1777427944971083809?t=s6xYQmYkhQyiFU65Fwq9tw&s=09
https://github.com/BasedHardware/Friend
https://twitter.com/argmaxinc/status/1781382688819282132?t=vCQT2mJf5BbtjdN1BMFYFQ&s=09
https://twitter.com/awnihannun/status/1778519566437794109?t=8N5PjwlKJpGotTx_HXZQrQ&s=09
https://twitter.com/Prince_Canuma/status/1776399292036501898
https://pytorch.org/blog/torchtune-fine-tune-llms/

If you want to support me, the best thing to do is to share out the content :)






