Can Sherpa (multi-agent LLM) Handle Multi-modality?

Channel:

LLMs Explained - Aggregate Intellect - AI.SCIENCE

Published on March 13, 2024 12:00:27 PM ● Video Link: https://www.youtube.com/watch?v=Kau7Gk2Olqo

Duration: 1:58

AF: Can Sherpa handle multimodality?

PC: Inside the Sherpa library, what we try to do is implement different agent execution strategies. The execution strategy itself is effectively multimodal, because it doesn't care what kind of task you're handling.

In the demo, I was able to generate images, code, and so on. Those things happened through "actions". One example is the diagram generation tool, which produces a diagram from the intermediate representation.

You need to describe the data in a string format so that a large language model can handle it. We have some default actions, and most of them handle text. But you can create your own actions to deal with images, for example by wrapping a custom image-to-text model.
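The custom-action idea can be sketched roughly as follows. This is a minimal illustration and not Sherpa's actual API: the `ImageToTextAction` class, its `run` method, and the `fake_captioner` stand-in are hypothetical names chosen for the example, and a real action would call an actual image-captioning model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ImageToTextAction:
    """Hypothetical action: turns an image into a string an LLM can read."""

    # Any callable that maps raw image bytes to a text description.
    captioner: Callable[[bytes], str]
    name: str = "image_to_text"

    def run(self, image_bytes: bytes) -> str:
        if not image_bytes:
            raise ValueError("empty image")
        # The agent only ever sees this string, never the image itself.
        return self.captioner(image_bytes).strip()


def fake_captioner(image_bytes: bytes) -> str:
    # Stand-in for a real image-captioning model (an assumption for the demo).
    return f" an image of {len(image_bytes)} bytes "


action = ImageToTextAction(captioner=fake_captioner)
description = action.run(b"\x89PNG...")
print(description)
```

The point of the sketch is the boundary: the image stays inside the action, and only its string description crosses over to the language model.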

AF: That's an opinionated design choice we have made: Sherpa only handles task orchestration at the agent level, and all of the data-specific activities are delegated to the tools. That way we have a separation of responsibilities.

Let's say a user provides an image to Sherpa and says: extract this information and do this math. The agents in Sherpa know to call a specific image-to-text tool to get a text description of the image. They then send that text to the math tool, which writes Python code for whatever the operation is and gets the result back, and then send the result to another tool that does the summarization. Maybe at the end they call yet another tool, send it the results, and it reads them out with a text-to-speech kind of tool, right?
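The chain described above can be sketched as a toy pipeline. Everything here is an assumption made for illustration: the tool names, the "string in, string out" contract, and the fixed `plan` are hypothetical, each tool is a stub, and in a real system the agent would decide which tool to call next rather than follow a hard-coded list.

```python
from typing import Callable, Dict, List

Tool = Callable[[str], str]  # hypothetical contract: string in, string out


def image_to_text(image_ref: str) -> str:
    # Stub for an image-captioning model.
    return "The receipt lists items costing 12, 30 and 8."


def math_tool(text: str) -> str:
    # Stub for a tool that would write and run Python; here we sum the integers.
    tokens = [tok.strip(".,") for tok in text.split()]
    return str(sum(int(tok) for tok in tokens if tok.isdigit()))


def summarize(text: str) -> str:
    # Stub for a summarization model.
    return f"Total: {text}"


def text_to_speech(text: str) -> str:
    # Stub: a real tool would produce audio; we return a marker string.
    return f"<audio:{text}>"


TOOLS: Dict[str, Tool] = {
    "image_to_text": image_to_text,
    "math": math_tool,
    "summarize": summarize,
    "text_to_speech": text_to_speech,
}


def run_plan(plan: List[str], user_input: str) -> str:
    """Orchestration only: pass each tool's output on to the next tool."""
    data = user_input
    for tool_name in plan:
        data = TOOLS[tool_name](data)
    return data


plan = ["image_to_text", "math", "summarize", "text_to_speech"]
result = run_plan(plan, "receipt.png")
print(result)
```

Note that `run_plan` contains no image, math, or audio logic at all, which is the separation of responsibilities being described.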

All of the modality handling is completely delegated to the tools, which makes the system very generalizable and scalable. You can focus on building the right tools and delegate all the LLM handling to Sherpa, because it will take care of that. And you can create an army of specialized models and systems and make them available to the system through APIs.
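One way to picture "available through APIs" is a thin wrapper that makes a remote model look like any other tool. This is only a sketch under stated assumptions: `RemoteTool`, the `{"input": ...}` / `{"output": ...}` JSON shape, and the example URL are all hypothetical, and the demo swaps in a fake transport so that no network call is made.

```python
import json
import urllib.request
from typing import Callable, Dict

Transport = Callable[[str, Dict[str, str]], Dict[str, str]]


def http_post_json(url: str, payload: Dict[str, str]) -> Dict[str, str]:
    """POST a JSON payload and decode the JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


class RemoteTool:
    """Hypothetical tool that forwards a string to a model behind an HTTP API."""

    def __init__(self, name: str, url: str, transport: Transport = http_post_json):
        self.name = name
        self.url = url
        self.transport = transport

    def run(self, text: str) -> str:
        # Assumed request/response schema; a real service defines its own.
        reply = self.transport(self.url, {"input": text})
        return reply["output"]


def fake_transport(url: str, payload: Dict[str, str]) -> Dict[str, str]:
    # Stand-in for a real service so the example runs offline.
    return {"output": payload["input"].upper()}


tool = RemoteTool("shout", "https://example.invalid/shout", transport=fake_transport)
answer = tool.run("hello")
print(answer)
```

Because the orchestrator only calls `run`, a specialized model can be replaced or scaled behind its endpoint without touching the agent logic.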
