Normie Tools for Validating LLM Outputs

Video Link: https://www.youtube.com/watch?v=xbXEE7pqwMI



Duration: 14:20


Speaker: Benjamin Labaschin

Summary
=======
The speaker discusses the importance of validating outputs from large language models and the challenges involved. They suggest strategies such as monitoring embeddings drift, conducting A/B testing, and performing human evaluation. The speaker demonstrates their approach using a chatbot powered by Llama 2, walking through the code structure and validation process. They use Pydantic base models together with while loops to ensure the model returns an expected response, refining responses and catching formatting problems along the way. For more complex structured content, they discuss domain-specific languages (DSLs) and JSON schemas for specifying the desired output structure.

Topics
======

○ Validation Strategies
* Monitoring embeddings drift
* Conducting A/B testing
* Performing human evaluation
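As a minimal sketch of the embeddings-drift idea above (the centroid-based metric and pure-Python implementation are an illustration, not code from the talk), one can compare a baseline batch of embeddings against a current batch via cosine similarity of their mean vectors:

```python
import math

def mean_vector(embeddings):
    """Average a batch of embedding vectors component-wise."""
    dim = len(embeddings[0])
    return [sum(vec[i] for vec in embeddings) / len(embeddings) for i in range(dim)]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def drift_score(baseline_batch, current_batch):
    """1 - cosine similarity of the batch centroids: 0.0 means no drift."""
    return 1.0 - cosine_similarity(mean_vector(baseline_batch), mean_vector(current_batch))
```

In practice the batches would come from embedding recent model inputs or outputs; a score that creeps upward over time suggests the distribution has shifted and outputs deserve a closer look.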

○ API Message Validation
* Using Pydantic base models
* Checking for expected messages
* Using while loops for validation
* Refining model responses
* Setting criteria for response generation
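The Pydantic-plus-while-loop pattern listed above can be sketched roughly as follows. The `ChatReply` fields and the stubbed `call_llm` are hypothetical stand-ins for illustration, not the speaker's actual code:

```python
import json

from pydantic import BaseModel, ValidationError

class ChatReply(BaseModel):
    """Hypothetical expected response shape; the talk's fields may differ."""
    answer: str
    confidence: float

def call_llm(prompt: str, attempt: int) -> str:
    # Stand-in for a real chat-completion call (e.g. a Llama 2 endpoint):
    # returns malformed JSON on the first attempt, then a valid payload.
    if attempt == 0:
        return '{"answer": "Paris"}'  # missing the "confidence" field
    return '{"answer": "Paris", "confidence": 0.9}'

def get_validated_reply(prompt: str, max_attempts: int = 3) -> ChatReply:
    attempt = 0
    while attempt < max_attempts:  # keep asking until validation passes
        raw = call_llm(prompt, attempt)
        try:
            return ChatReply(**json.loads(raw))  # raises if shape is wrong
        except (json.JSONDecodeError, ValidationError):
            attempt += 1  # re-prompt and try again
    raise RuntimeError("model never produced a valid reply")
```

The while loop is the retry criterion: generation is repeated until the response parses into the expected Pydantic model or the attempt budget runs out, so formatting problems are caught before the response reaches downstream code.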

○ Handling Complex Content
* Using domain-specific language (DSL)
* Utilizing JSON schemas
* Specifying desired output structure
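A hedged sketch of the JSON-schema approach above (the schema and field names are invented for illustration): embed the desired output structure in the prompt, then check the response against it with a minimal structural check (a library such as jsonschema would do this in full):

```python
import json

# Hypothetical schema for a structured-extraction task.
RESPONSE_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "steps": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["title", "steps"],
}

def build_prompt(task: str) -> str:
    """Embed the schema in the prompt so the model knows the expected shape."""
    return (
        f"{task}\n"
        "Respond ONLY with JSON matching this schema:\n"
        f"{json.dumps(RESPONSE_SCHEMA, indent=2)}"
    )

def matches_schema(raw: str) -> bool:
    """Hand-rolled check for the two required fields and their types."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict) or not isinstance(data.get("title"), str):
        return False
    steps = data.get("steps")
    return isinstance(steps, list) and all(isinstance(s, str) for s in steps)
```

The same idea extends to a DSL: the prompt specifies the grammar of the expected output, and a parser for that grammar plays the role of `matches_schema`.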







Tags:
deep learning
machine learning