Tutorial #2: OpenAI Vector Embeddings and Pinecone for Retrieval-Augmented Generation
LLMs like ChatGPT are known to hallucinate. If we ground the LLM in an external memory (e.g. a document or PDF), it may generate more reliable outputs. We can also augment the output with a reference link to the source (like Bing Search does)!
For this tutorial, we use OpenAI Embeddings, the tiktoken tokenizer, and Pinecone.
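To make the idea concrete, here is a minimal sketch of the retrieve-then-prompt flow. It is not the notebook's code: the embed function is a toy bag-of-words stand-in for OpenAI Embeddings, the in-memory list stands in for a Pinecone index, and the chunk texts, source labels, and function names are all made up for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, chunks, top_k=2):
    """Rank chunks by similarity to the query (the job a vector database does)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c["text"])), reverse=True)
    return ranked[:top_k]

def build_prompt(query, hits):
    """Put the retrieved chunks and their sources in front of the question."""
    context = "\n".join(f"[{h['source']}] {h['text']}" for h in hits)
    return (
        "Answer using only the context below and cite the source.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

# Hypothetical document chunks; in practice these come from your own documents.
chunks = [
    {"text": "pinecone stores vectors in an index for similarity search", "source": "doc-a"},
    {"text": "tiktoken counts tokens so chunks fit the model context", "source": "doc-b"},
    {"text": "game engines render frames many times per second", "source": "doc-c"},
]

query = "how does pinecone search vectors"
hits = retrieve(query, chunks, top_k=1)
prompt = build_prompt(query, hits)
print(prompt)
```

The resulting prompt is what would be sent to the chat model, so the answer can draw on the retrieved chunk and point back to its source.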
Disclaimer: Please do not openly show your OpenAI / Pinecone API key like I did. I am only showing it for educational purposes and have deleted the exposed key.
~~~~~~~~~~~~~~~~~~~
References:
Main Discussion Video: https://www.youtube.com/watch?v=lIoLCip0HwM
Original GPT4-Retrieval Augmentation Notebook: https://github.com/openai/openai-cookbook/tree/main/examples/vector_databases/pinecone
Modified Notebook (the one used in this tutorial): https://github.com/tanchongmin/TensorFlow-Implementations/blob/main/Tutorial/GPT4_Retrieval_Augmentation.ipynb
Pinecone: https://app.pinecone.io/organizations
OpenAI Chat Completions: https://platform.openai.com/docs/guides/chat/introduction
OpenAI Models: https://platform.openai.com/docs/models/gpt-3-5
OpenAI API Keys: https://platform.openai.com/account/api-keys
OpenAI API Usage: https://platform.openai.com/account/usage
LangChain Documentation: https://python.langchain.com/en/latest/
LangChain Recursive Character Text Splitter: https://python.langchain.com/en/latest/modules/indexes/text_splitters/examples/recursive_text_splitter.html
LangChain ReadTheDocs Documentation: https://python.langchain.com/en/latest/modules/indexes/document_loaders/examples/readthedocs_documentation.html
~~~~~~~~~~~~~~~~~~
0:00 Introduction
0:48 Prepare Documents for Loading
4:15 Generate Embeddings in Chunks
9:40 Retrieval-Augmented Generation
16:04 Conclusion
~~~~~~~~~~~~~~~~~~~
AI and ML enthusiast. I like to think about the essence behind AI breakthroughs and explain it in a simple and relatable way. I am also an avid game creator.
Discord: https://discord.gg/fXCZCPYs
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin