Embeddings Walkthrough (Part 2): Context-Dependent Embeddings, Shifting Embedding Space

Video Link: https://www.youtube.com/watch?v=YZTThlPg0rU



Category: Walkthrough
Duration: 1:15:20


We'll talk about how to bring the Transformer's next-token objective more in line with a sentence-meaning objective.

- Joint query and key similarity retrieval, e.g. Cohere Reranker
- Shifting embedding space via generating hypothetical documents or via hinting, e.g. Hypothetical Document Embeddings (HyDE), Recitation Augmented Language Models
- My experiments on changing the context for embeddings: prepending context, appending context, modifying the text chunk by context
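
To make two of the ideas in this list concrete, here is a minimal sketch, not the code from the video or the notebook. It assumes a toy bag-of-words embedder (`toy_embed`) in place of a real embedding model, a hard-coded `fake_llm` in place of a real LLM call, and made-up text chunks. `hyde_rank` embeds a hypothetical answer instead of the raw query (the HyDE idea), and `apply_context` shows one possible way to prepend or append context to a chunk before embedding it. All function names and string formats here are hypothetical.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def apply_context(chunk, context, mode="prepend"):
    """Attach context to a chunk before embedding (the format is an assumption)."""
    if mode == "prepend":
        return f"{context}: {chunk}"
    if mode == "append":
        return f"{chunk} ({context})"
    raise ValueError(f"unknown mode: {mode}")


def rank(query_text, chunks, embed=toy_embed):
    """Return chunks sorted from most to least similar to the query text."""
    q = embed(query_text)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)


def hyde_rank(query_text, chunks, generate, embed=toy_embed):
    """HyDE-style retrieval: embed a generated hypothetical answer, not the query."""
    return rank(generate(query_text), chunks, embed)


def fake_llm(question):
    """Hypothetical stand-in for an LLM that drafts an answer document."""
    return "Rayleigh scattering of sunlight by air molecules makes the sky blue"


# Made-up example data: the relevant chunk shares no words with the query.
chunks = [
    "Rayleigh scattering of sunlight by air molecules favours short wavelengths",
    "The blue whale is the largest animal",
]
query = "Why is the sky blue"

direct_top = rank(query, chunks)[0]               # matches on surface words
hyde_top = hyde_rank(query, chunks, fake_llm)[0]  # matches on answer content
```

In this toy setup the raw query ranks the whale chunk first because of shared surface words, while the hypothetical answer ranks the Rayleigh scattering chunk first. Note that the toy embedder ignores word order, so prepending and appending context produce the same vector here; with a real Transformer-based embedding model the two placements would generally produce different vectors.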

~~~
Part 1: https://www.youtube.com/watch?v=gVZryxJRdSY

Slides: https://github.com/tanchongmin/strictjson/blob/main/Experiments/Embeddings%20Walkthrough.pdf
Jupyter Notebook for my experiments: https://github.com/tanchongmin/strictjson/blob/main/Experiments/Context-Dependent-Embeddings.ipynb

OpenAI Sentence Embedding Paper: https://arxiv.org/abs/2201.10005
Cohere Reranker: https://docs.cohere.com/docs/reranking
HyDE: https://arxiv.org/abs/2212.10496
Recitation Augmented Language Models: https://arxiv.org/abs/2210.01296

~~~

0:00 Issues with Sentence Embeddings
15:30 How Retrieval Augmented Generation (RAG) is typically done
17:29 Joint Query and Key Processing without External Embeddings
36:16 Hypothetical Document Embeddings (HyDE)
41:55 Guiding LLM by Hinting
50:00 My idea: Context-Dependent Embeddings
53:16 Key idea: Multiple embeddings by context modification
1:05:50 Discussion

~~~

AI and ML enthusiast. Likes to think about the essence behind AI breakthroughs and explain it in a simple and relatable way. Also, I am an avid game creator.

Discord: https://discord.gg/bzp87AHJy5
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin




Other Videos By John Tan Chong Min


2024-04-29 StrictJSON (LLM Output Parser) Ask Me Anything #1
2024-04-22 Tutorial #14: Write latex papers with LLMs such as Llama 3!
2024-04-16 SORA Deep Dive: Predict patches from text, images or video
2024-04-09 OpenAI CLIP Embeddings: Walkthrough + Insights
2024-03-26 TaskGen - LLM Agentic Framework that Does More, Talks Less: Shared Variables, Memory, Global Context
2024-03-18 CRADLE (Part 2): An AI that can play Red Dead Redemption 2. Reflection, Memory, Task-based Planning
2024-03-11 CRADLE (Part 1) - AI that plays Red Dead Redemption 2. Towards General Computer Control and AGI
2024-03-05 TaskGen - A Task-based Agentic Framework using StrictJSON at the core
2024-02-27 SymbolicAI / ExtensityAI Paper Overview (Part 2) - Evaluation Benchmark Discussion!
2024-02-20 SymbolicAI / ExtensityAI Paper Overview (Part 1) - Key Philosophy Behind the Design - Symbols
2024-02-13 Embeddings Walkthrough (Part 2): Context-Dependent Embeddings, Shifting Embedding Space
2024-02-06 Embeddings Walkthrough (Part 1) - Bag of Words to word2vec to Transformer contextual embeddings
2024-01-29 V* - Better than GPT-4V? Iterative Context Refining for Visual Question Answer!
2024-01-23 AutoGen: A Multi-Agent Framework - Overview and Improvements
2024-01-09 AppAgent: Using GPT-4V to Navigate a Smartphone!
2024-01-08 Tutorial #13: StrictJSON, my first Python Package! - Get LLMs to output into a working JSON!
2023-12-20 "Are you smarter than an LLM?" game speedrun
2023-12-08 Is Gemini better than GPT4? Self-created benchmark - Fact Retrieval/Checking, Coding, Tool Use
2023-12-04 Learning, Fast and Slow: 10 Years Plan - Memory Soup, Hier. Planning, Emotions, Knowledge Sharing
2023-12-01 Tutorial #12: Use ChatGPT and off-the-shelf RAG on Terminal/Command Prompt/Shell - SymbolicAI
2023-11-20 JARVIS-1: Multi-modal (Text + Image) Memory + Decision Making with LLMs in MineCraft!