LLMs - Chunking Strategies and Chunking Refinement

Video Link: https://www.youtube.com/watch?v=719pA_hdg7c



Duration: 7:55


Check out my essays: https://aisc.substack.com/
OR book me to talk: https://calendly.com/amirfzpr
OR subscribe to our event calendar: https://lu.ma/aisc-llm-school
OR sign up for our LLM course: https://maven.com/aggregate-intellect/llm-systems

🟢 Chunking for precision, not just recall: The goal of chunking is to give the LLM the most relevant context, not simply everything that could be retrieved. Chunks should be selected carefully so that distracting, off-topic information is excluded from the prompt.
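The idea above can be sketched as a precision-oriented selection step: instead of always returning the top-k chunks, keep only those that clear a relevance threshold. This is a minimal sketch; the Jaccard token overlap is a stand-in for a real embedding or reranker score, and the threshold value is an arbitrary assumption.

```python
def token_overlap(query: str, chunk: str) -> float:
    """Jaccard overlap of token sets -- a crude stand-in for a
    real embedding-similarity or cross-encoder relevance score."""
    q, c = set(query.lower().split()), set(chunk.lower().split())
    return len(q & c) / len(q | c) if q | c else 0.0

def select_for_precision(query: str, chunks: list[str],
                         threshold: float = 0.15) -> list[str]:
    """Keep only chunks whose relevance clears the threshold,
    rather than always padding the context with the full top-k."""
    scored = sorted(((token_overlap(query, c), c) for c in chunks),
                    reverse=True)
    return [c for score, c in scored if score >= threshold]
```

With a real retriever the scoring function would change, but the shape of the filter (score, sort, cut at a threshold) stays the same.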

🟢 Refinement is key: Once the chunks are retrieved, they may need to be refined before being fed into the LLM. This refinement can involve summarization, entity extraction, or other techniques. The goal of refinement is to make the chunks more concise and easier for the LLM to understand.
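One simple form of such refinement is query-focused extraction: drop the sentences of a retrieved chunk that share nothing with the query. The sketch below uses bag-of-words overlap as a deliberately crude stand-in for LLM-based summarization or entity extraction; the regex sentence splitter is also an assumption.

```python
import re

def refine_chunk(query: str, chunk: str) -> str:
    """Query-focused extractive refinement: keep only the sentences
    of a chunk that share vocabulary with the query. A stand-in for
    heavier refinement such as summarization or entity extraction."""
    q_tokens = set(re.findall(r"\w+", query.lower()))
    sentences = re.split(r"(?<=[.!?])\s+", chunk)
    kept = [s for s in sentences
            if q_tokens & set(re.findall(r"\w+", s.lower()))]
    return " ".join(kept)
```

In a production system the "keep or drop" decision would come from a model rather than token overlap, but the pipeline position (between retrieval and the final prompt) is the same.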

🟢 Online vs. offline refinement: Some refinement can be done offline, before any query arrives. This is especially helpful when, for example, domain-specific language in the documents needs to be converted into more general-purpose phrasing before it is fed to the LLM. Other refinement tasks, such as query-focused summarization, must be done online because they depend on the specific query.
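An offline refinement pass of this kind can be as simple as a glossary rewrite applied at indexing time. The glossary entries below are hypothetical examples, not from the talk; in practice the mapping would be built per domain (possibly with an LLM's help) and run once over the corpus, not per query.

```python
# Hypothetical domain glossary -- illustrative entries only.
GLOSSARY = {
    "ACL": "access control list",
    "p95": "95th-percentile latency",
}

def normalize_offline(text: str) -> str:
    """Offline refinement: expand domain jargon into general-purpose
    language before chunks are indexed, so the LLM later sees
    vocabulary closer to its pretraining distribution."""
    for term, expansion in GLOSSARY.items():
        text = text.replace(term, expansion)
    return text
```

Because this runs before retrieval, it adds no per-query latency, which is the practical appeal of pushing refinement offline.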

🟢 Topic span: This is a technique for rearranging a document so that sentences about the same topic sit together. That helps LLMs, which may not reliably follow the linear progression of a document. It can be thought of as fine-grained topic modeling where the topics are designed for LLM consumption: the granular topics it identifies need not make sense to a human reader.
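A minimal sketch of the regrouping step: greedily attach each sentence to the first existing group it overlaps with, otherwise start a new group. Jaccard overlap of word sets stands in for the embedding-based similarity a real fine-grained topic model would use, and the threshold is an arbitrary assumption.

```python
import re

def _tokens(sentence: str) -> set[str]:
    return set(re.findall(r"\w+", sentence.lower()))

def topic_spans(sentences: list[str],
                threshold: float = 0.2) -> list[list[str]]:
    """Greedily cluster sentences into fine-grained topic groups,
    ignoring their original document order. A crude stand-in for
    embedding-based fine-grained topic modeling."""
    groups: list[tuple[set[str], list[str]]] = []
    for s in sentences:
        t = _tokens(s)
        for toks, members in groups:
            if len(t & toks) / max(len(t | toks), 1) >= threshold:
                members.append(s)   # same topic: extend the group
                toks |= t           # grow the group's vocabulary
                break
        else:
            groups.append((t, [s]))  # no match: start a new topic
    return [members for _, members in groups]
```

Concatenating the groups yields a reordered document where topically related sentences are adjacent, regardless of where they appeared originally.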







Tags:
deep learning
machine learning