Building a LLM Judge with Weights & Biases
Video Link: https://www.youtube.com/watch?v=zaNR3WaPTfo
Evaluating LLM outputs accurately is critical to iterating quickly on an LLM system. Human annotation is slow and expensive, and using LLMs as judges promises to solve this. However, aligning an LLM judge with human judgements is hard, with many implementation details to consider. In this workshop we will explore:

Evaluating specialized LLMs using Weave
Productionizing the latest LLM-as-a-judge research
Improving on your existing judge
Building annotation UIs

#MicrosoftReactor

[eventID:23760]
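As a taste of the alignment problem the workshop addresses: one common way to check how well a judge agrees with human annotators is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The sketch below is a hypothetical, dependency-free illustration (the function name and labels are made up for this example, not taken from the workshop or from Weave):

```python
def cohens_kappa(judge_labels, human_labels):
    """Chance-corrected agreement between judge and human verdicts.

    Hypothetical helper for illustration: both inputs are equal-length
    lists of categorical labels (e.g. "pass"/"fail").
    """
    assert len(judge_labels) == len(human_labels) and judge_labels
    n = len(judge_labels)
    # Observed agreement: fraction of items where judge and human match.
    p_o = sum(j == h for j, h in zip(judge_labels, human_labels)) / n
    # Expected agreement by chance, from each rater's label frequencies.
    labels = set(judge_labels) | set(human_labels)
    p_e = sum(
        (judge_labels.count(l) / n) * (human_labels.count(l) / n)
        for l in labels
    )
    if p_e == 1.0:  # degenerate case: both raters always use one label
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Example: the judge matches the human on 3 of 4 items.
judge = ["pass", "pass", "fail", "pass"]
human = ["pass", "fail", "fail", "pass"]
print(cohens_kappa(judge, human))  # 0.5
```

A kappa near 1 means the judge tracks human judgement well; a value near 0 means its agreement is no better than chance, a signal that the judge prompt or rubric needs iteration.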