Empirical - Open Source LLM Evaluation UI

Channel: John Tan Chong Min

Subscribers: 5,450

Published on May 11, 2024 2:33:32 PM ● Video Link: https://www.youtube.com/watch?v=oXPh0MJv0UM

Duration: 44:38

Views: 300

Had a great conversation with Empirical's CEO, Arjun Attam, today.

He has built a great open source tool that lets anyone evaluate any LLM, dataset and workflow. All you have to do is specify the LLM prompt / Python script in a .json configuration file (empiricalrc.json), along with whatever input/output dataset you will be using for the evaluation.
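To give a feel for what this kind of evaluation does under the hood, here is a minimal sketch in plain Python. It is not Empirical's API or config format: the dataset, the two stub "models" and the scoring function are all hypothetical stand-ins, assuming a JSON-parsing task like the one in the demo.

```python
import json
from typing import Callable

# Hypothetical dataset: each sample pairs an input message with the
# JSON object we expect the model to produce.
DATASET = [
    {"input": "I'm Alice, 30 years old.", "expected": {"name": "Alice", "age": 30}},
    {"input": "Bob here, aged 25.", "expected": {"name": "Bob", "age": 25}},
]

# Lookup table used by the stub models below (stand-ins for real LLM calls).
ANSWERS = {s["input"]: json.dumps(s["expected"]) for s in DATASET}


def careful_model(text: str) -> str:
    """Stub model that always returns clean JSON."""
    return ANSWERS[text]


def sloppy_model(text: str) -> str:
    """Stub model that sometimes wraps its JSON in extra prose."""
    answer = ANSWERS[text]
    return answer if "Alice" in text else "Sure! Here you go: " + answer


def score(output: str, expected: dict) -> float:
    """Return 1.0 if the output parses as JSON and matches expected, else 0.0."""
    try:
        return 1.0 if json.loads(output) == expected else 0.0
    except json.JSONDecodeError:
        return 0.0


def evaluate(model: Callable[[str], str], dataset: list) -> float:
    """Run the model on every sample and return the mean score."""
    scores = [score(model(s["input"]), s["expected"]) for s in dataset]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    for name, model in [("careful", careful_model), ("sloppy", sloppy_model)]:
        print(name, evaluate(model, DATASET))
```

Here the careful stub scores 1.0 and the sloppy stub scores 0.5, since one of its two outputs is not valid JSON. A tool like Empirical takes care of this loop for you, so you only describe the models, prompts and dataset.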

Essentially, Empirical's business model is to provide value for a general audience first, and then consult for customers to help them integrate LLMs into their workflows in an optimised fashion :)

Super easy to use too. Check out their GitHub for more information:
https://github.com/empirical-run/empirical

As a side note, we both share the same goals of helping others and making sure value is brought to the table first, before even thinking of compensation. That is why I started this YouTube channel too: to share knowledge and encourage discussion. I have enjoyed the journey from the very beginning :)

~~

Empirical Repo: https://github.com/empirical-run/empirical

My projects that are mentioned:
StrictJSON Repo: https://github.com/tanchongmin/strictjson
TaskGen Repo: https://github.com/simbianai/taskgen

~~

0:00 Introduction
1:03 Empirical Demo to evaluate LLM parsing JSON
6:03 empiricalrc.json configuration
17:16 How to use Empirical CLI
19:11 Results of gpt-3.5-turbo vs Llama 3 for JSON parsing (using StrictJSON for Llama 3)
20:54 Evaluating LLM output via Empirical UI
25:50 How to use Empirical for your workflow
28:56 Why Open Source?
31:40 How does Empirical Monetise?
35:08 Empirical’s Target Customers
38:36 Arjun’s Life Motivation - Empowering People via Technology
43:38 Concluding Remarks

~~

AI and ML enthusiast. Likes to think about the essence behind AI breakthroughs and explain it in a simple and relatable way. Also, I am an avid game creator.

Discord: https://discord.gg/bzp87AHJy5
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin

Other Videos By John Tan Chong Min

2024-08-21	AriGraph (Part 2) - Knowledge Graph Construction and Retrieval Details
2024-08-13	alphaXiv - Share Ideas, Build Collective Understanding, Interact with ANY open sourced paper authors
2024-07-30	AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents
2024-07-23	NeoPlanner - Continually Learning Planning Agent for Large Environments guided by LLMs
2024-07-17	Intelligence = Sampling + Filtering
2024-07-12	Michael Hodel: Reverse Engineering the Abstraction and Reasoning Corpus
2024-07-02	TaskGen Conversational Class v2: JARVIS, Psychology Counsellor, Sherlock Holmes Shop Assistant
2024-06-04	CodeAct: Code As Action Space of LLM Agents - Pros and Cons
2024-05-28	TaskGen Conversation with Dynamic Memory - Math Quizbot, Escape Room Solver, Psychology Counsellor
2024-05-21	Integrate ANY Python Function, CodeGen, CrewAI tool, LangChain tool with TaskGen! - v2.3.0
2024-05-11	Empirical - Open Source LLM Evaluation UI
2024-05-07	TaskGen Ask Me Anything #1
2024-04-29	StrictJSON (LLM Output Parser) Ask Me Anything #1
2024-04-22	Tutorial #14: Write latex papers with LLMs such as Llama 3!
2024-04-16	SORA Deep Dive: Predict patches from text, images or video
2024-04-09	OpenAI CLIP Embeddings: Walkthrough + Insights
2024-03-26	TaskGen - LLM Agentic Framework that Does More, Talks Less: Shared Variables, Memory, Global Context
2024-03-18	CRADLE (Part 2): An AI that can play Red Dead Redemption 2. Reflection, Memory, Task-based Planning
2024-03-11	CRADLE (Part 1) - AI that plays Red Dead Redemption 2. Towards General Computer Control and AGI
2024-03-05	TaskGen - A Task-based Agentic Framework using StrictJSON at the core
2024-02-27	SymbolicAI / ExtensityAI Paper Overview (Part 2) - Evaluation Benchmark Discussion!
