OpenAI Vector Embeddings - Talk to any book or document; Retrieval-Augmented Generation!

Video Link: https://www.youtube.com/watch?v=lIoLCip0HwM



Duration: 1:38:21


With ChatGPT becoming more useful and increasingly able to interface with external tools, vector databases can be the key to memory-augmented search. This enables applications such as giving ChatGPT more domain-specific knowledge, or chatting with a document such as a PDF or a book.
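As a rough illustration of memory-augmented search, the sketch below retrieves the most relevant text chunk for a question and places it in a prompt. It is not the code used in the video: `toy_embed` is a hypothetical word-count stand-in for a real embedding model, the in-memory list stands in for a vector database, and the chunks are made-up examples.

```python
import math
import re
from collections import Counter

def toy_embed(text):
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, chunks, k=1):
    """Return the k chunks most similar to the query."""
    q = toy_embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(query, chunks, k=1):
    """Put the retrieved chunks in front of the question for the language model."""
    context = "\n".join(retrieve(query, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Made-up chunks standing in for passages of a book or PDF.
chunks = [
    "Habits are formed through cue, craving, response and reward.",
    "Vector databases store embeddings for fast similarity search.",
    "TF-IDF weighs words by how rare they are across documents.",
]

print(build_prompt("How are habits formed?", chunks))
```

In a real application, the word counts would be replaced by embeddings from a model, and the sorted list by a query to a vector database.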

This week, we have a special guest. Manas, an undergraduate at Nanyang Technological University (NTU), Singapore, will showcase his Talk to Book applications, such as the one for Atomic Habits ( https://www.gptbook.club/atomic-habits ).

We will also be discussing how vector embeddings are generated and how they can be compared to one another.
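To make "compared to one another" concrete, here is a minimal sketch of three common ways to compare two vectors: dot product, cosine similarity and Euclidean distance. The three-dimensional vectors and their names are hypothetical; real embeddings have many more dimensions.

```python
import math

def dot(a, b):
    """Dot product: grows with both alignment and vector length."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b: ignores vector length."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean_distance(a, b):
    """Straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical toy "embeddings".
cat = [1.0, 2.0, 0.0]
kitten = [2.0, 4.0, 0.0]  # same direction as cat, twice the length
car = [0.0, 0.0, 3.0]     # orthogonal to both

for name, other in [("kitten", kitten), ("car", car)]:
    print(
        f"cat vs {name}: dot={dot(cat, other):.2f}, "
        f"cosine={cosine_similarity(cat, other):.2f}, "
        f"euclidean={euclidean_distance(cat, other):.2f}"
    )
```

Note that `cat` and `kitten` point in the same direction, so their cosine similarity is 1 (up to rounding) even though their Euclidean distance is not zero.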

Also, we have a special appearance by Tim Scarfe! He is the main creator of the Machine Learning Street Talk channel, which features insightful commentaries and interviews on popular Machine Learning topics. Check it out here: https://www.youtube.com/@MachineLearningStreetTalk

~~~~~~~~~~~~~~~~~~~~~~~

References:

GPT Book Club (Talk to any book by Manas): https://www.gptbook.club/atomic-habits
My own tutorial on using vector embeddings and retrieval from Pinecone: https://www.youtube.com/watch?v=rh-WNG4yJag
Slides: https://github.com/tanchongmin/TensorFlow-Implementations/blob/main/Paper_Reviews/OpenAI%20Vector%20Embeddings.pdf

OpenAI Embedding Paper: https://arxiv.org/abs/2201.10005
OpenAI Embeddings Page: https://platform.openai.com/docs/guides/embeddings

BERT Paper: https://arxiv.org/abs/1810.04805

Generative Agents Paper (Tim Scarfe and I were talking about agents being able to interact with one another with different personalities): https://arxiv.org/abs/2304.03442

Pinecone Vector Database: https://www.pinecone.io/

~~~~~~~~~~~~~~~~~~~~~~~

0:00 Introduction
1:03 Sharing by Manas on Atomic Habits GPT
8:38 Free-flow Discussion between Manas, Tim Scarfe, Mehul and me
28:50 Embedding Space
32:02 Traditional Approach: TF-IDF
35:43 Modern Approach: Vector Embeddings
37:03 Token Embeddings
39:58 3D Embedding Visualization
41:44 OpenAI Embedding Paper + my opinion on how embeddings can be trained
57:13 Issues with Contrastive Learning
1:04:36 Distance Metrics
1:09:04 Will we lose any meaning by normalizing vectors? (Note: Cosine Similarity is not affected)
1:14:13 External Vector Database (e.g. Pinecone)
1:15:22 Use Cases
1:16:06 Discussion
1:37:15 Conclusion
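The note at 1:09:04 can be checked with a small sketch. Normalizing a vector to unit length discards its magnitude but keeps its direction, so cosine similarity is unchanged. For unit vectors, the squared Euclidean distance equals 2 - 2 * cosine, so both measures rank neighbours the same way. The two-dimensional vectors below are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(v):
    return math.sqrt(dot(v, v))

def normalize(v):
    """Scale v to unit length; its direction is kept, its magnitude is lost."""
    length = norm(v)
    return [x / length for x in v]

def cosine(a, b):
    return dot(a, b) / (norm(a) * norm(b))

# Hypothetical toy vectors.
a = [3.0, 4.0]  # length 5
b = [1.0, 0.0]  # length 1

ua, ub = normalize(a), normalize(b)

cos_before = cosine(a, b)
cos_after = cosine(ua, ub)
sq_dist = sum((x - y) ** 2 for x, y in zip(ua, ub))

print("cosine before normalizing:", round(cos_before, 6))
print("cosine after normalizing: ", round(cos_after, 6))
print("squared distance of unit vectors:", round(sq_dist, 6))
print("2 - 2 * cosine:                  ", round(2 - 2 * cos_after, 6))
```

What is lost is only the original length of `a` (5 here), which matters only if the magnitude itself carries meaning for the application.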

~~~~~~~~~~~~~~~~~~~~~~~~~~~

AI and ML enthusiast. I like to think about the essence behind AI breakthroughs and explain it in a simple and relatable way. I am also an avid game creator.

Discord: https://discord.gg/fXCZCPYs
Online AI blog: https://delvingintotech.wordpress.com/
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Twitch: https://www.twitch.tv/johncm99
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin




Other Videos By John Tan Chong Min


2023-05-25 Hierarchy! The future of AI: How it helps representations and why it is important.
2023-05-18 Prediction builds representations! Fixed Bias speeds up learning!
2023-05-09 Memory: How is it encoded, retrieved and how it can be used for learning systems
2023-05-02 I created a Law Court Simulator with GPT4!
2023-05-02 I created a Law Court Simulator with ChatGPT!
2023-04-25 Creating a ChatGPT Harry Potter Text-based RPG game!
2023-04-25 Learn from just Memory Storage and Retrieval: Generative Agents Interacting in Simulation!
2023-04-18 The future is neuro-symbolic: Expressiveness of ChatGPT and generalizability of symbols (SymbolicAI)
2023-04-17 Can GPT4 solve the Abstraction and Reasoning Corpus (ARC) Challenge Zero-Shot?
2023-04-12 GPT4: Zero-shot Classification without any examples + Fine-tune with reflection
2023-04-11 OpenAI Vector Embeddings - Talk to any book or document; Retrieval-Augmented Generation!
2023-04-11 Tutorial #2: OpenAI Vector Embeddings and Pinecone for Retrieval-Augmented Generation
2023-04-04 Creating JARVIS: ChatGPT + APIs - HuggingGPT, Memory-Augmented Context, Meta GPT structures
2023-04-02 Is GPT4 capable of self-improving? Are we heading for AGI or AI doom?
2023-03-28 How Visual ChatGPT works + Toolformer/Wolfram Alpha. LLMs with Tools/APIs/Plugins is the way ahead!
2023-03-21 Tokenize any input, even continuous vectors! - Residual Vector Quantization - VALL-E (Part 2)
2023-03-07 Using Transformers to mimic anyone's voice! - VALL-E (Part 1)
2023-02-28 Learning Part-Whole Structure by Chunking - More Efficient than Deep Learning!!!
2023-02-21 High-level planning with large language models - SayCan
2023-02-13 Learning, Fast and Slow: Towards Fast and Adaptable Agents in Changing Environments
2023-02-07 Using Logic Gates as Neurons - Deep Differentiable Logic Gate Networks!


