LLMs - Chunking Strategies and Chunking Refinement

Video Link: https://www.youtube.com/watch?v=719pA_hdg7c



Duration: 7:55


Check out my essays: https://aisc.substack.com/
OR book me to talk: https://calendly.com/amirfzpr
OR subscribe to our event calendar: https://lu.ma/aisc-llm-school
OR sign up for our LLM course: https://maven.com/aggregate-intellect/llm-systems

🟢 Chunking for precision, not just recall: The goal of chunking is to give the LLM the most relevant context, not simply everything that could be retrieved. Chunks should be selected carefully so that distracting, off-topic information is excluded from the prompt.
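The idea above can be sketched as a precision-oriented selection step: instead of always returning the top-k chunks, keep only those that clear a relevance threshold. This is a minimal sketch; the Jaccard token overlap is a stand-in for a real embedding or reranker score, and the threshold value is an arbitrary assumption.

```python
def token_overlap(query: str, chunk: str) -> float:
    """Jaccard overlap of token sets -- a crude stand-in for a
    real embedding-similarity or cross-encoder relevance score."""
    q, c = set(query.lower().split()), set(chunk.lower().split())
    return len(q & c) / len(q | c) if q | c else 0.0

def select_for_precision(query: str, chunks: list[str],
                         threshold: float = 0.15) -> list[str]:
    """Keep only chunks whose relevance clears the threshold,
    rather than always padding the context with the full top-k."""
    scored = sorted(((token_overlap(query, c), c) for c in chunks),
                    reverse=True)
    return [c for score, c in scored if score >= threshold]
```

With a real retriever the scoring function would change, but the shape of the filter (score, sort, cut at a threshold) stays the same.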

🟢 Refinement is key: Once the chunks are retrieved, they may need to be refined before being fed into the LLM. This refinement can involve summarization, entity extraction, or other techniques. The goal of refinement is to make the chunks more concise and easier for the LLM to understand.
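One simple form of such refinement is query-focused extraction: drop the sentences of a retrieved chunk that share nothing with the query. The sketch below uses bag-of-words overlap as a deliberately crude stand-in for LLM-based summarization or entity extraction; the regex sentence splitter is also an assumption.

```python
import re

def refine_chunk(query: str, chunk: str) -> str:
    """Query-focused extractive refinement: keep only the sentences
    of a chunk that share vocabulary with the query. A stand-in for
    heavier refinement such as summarization or entity extraction."""
    q_tokens = set(re.findall(r"\w+", query.lower()))
    sentences = re.split(r"(?<=[.!?])\s+", chunk)
    kept = [s for s in sentences
            if q_tokens & set(re.findall(r"\w+", s.lower()))]
    return " ".join(kept)
```

In a production system the "keep or drop" decision would come from a model rather than token overlap, but the pipeline position (between retrieval and the final prompt) is the same.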

🟢 Online vs. offline refinement: Some refinement can be done offline, before any query arrives. This is especially helpful when, for example, domain-specific language in the documents needs to be converted into more general-purpose phrasing before it is fed to the LLM. Other refinement tasks, such as query-focused summarization, must be done online because they depend on the specific query.
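An offline refinement pass of this kind can be as simple as a glossary rewrite applied at indexing time. The glossary entries below are hypothetical examples, not from the talk; in practice the mapping would be built per domain (possibly with an LLM's help) and run once over the corpus, not per query.

```python
# Hypothetical domain glossary -- illustrative entries only.
GLOSSARY = {
    "ACL": "access control list",
    "p95": "95th-percentile latency",
}

def normalize_offline(text: str) -> str:
    """Offline refinement: expand domain jargon into general-purpose
    language before chunks are indexed, so the LLM later sees
    vocabulary closer to its pretraining distribution."""
    for term, expansion in GLOSSARY.items():
        text = text.replace(term, expansion)
    return text
```

Because this runs before retrieval, it adds no per-query latency, which is the practical appeal of pushing refinement offline.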

🟢 Topic span: This is a technique for rearranging a document so that sentences about the same topic sit together. That helps LLMs, which may not reliably follow the linear progression of a document. It can be thought of as fine-grained topic modeling where the topics are designed for LLM consumption: the granular topics it identifies need not make sense to a human reader.
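A minimal sketch of the regrouping step: greedily attach each sentence to the first existing group it overlaps with, otherwise start a new group. Jaccard overlap of word sets stands in for the embedding-based similarity a real fine-grained topic model would use, and the threshold is an arbitrary assumption.

```python
import re

def _tokens(sentence: str) -> set[str]:
    return set(re.findall(r"\w+", sentence.lower()))

def topic_spans(sentences: list[str],
                threshold: float = 0.2) -> list[list[str]]:
    """Greedily cluster sentences into fine-grained topic groups,
    ignoring their original document order. A crude stand-in for
    embedding-based fine-grained topic modeling."""
    groups: list[tuple[set[str], list[str]]] = []
    for s in sentences:
        t = _tokens(s)
        for toks, members in groups:
            if len(t & toks) / max(len(t | toks), 1) >= threshold:
                members.append(s)   # same topic: extend the group
                toks |= t           # grow the group's vocabulary
                break
        else:
            groups.append((t, [s]))  # no match: start a new topic
    return [members for _, members in groups]
```

Concatenating the groups yields a reordered document where topically related sentences are adjacent, regardless of where they appeared originally.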







Tags:
deep learning
machine learning