VeriTrail: Detect hallucination and trace provenance in AI workflows

Subscribers:
351,000
Published on ● Video Link: https://www.youtube.com/watch?v=pi6Km-vKSVY



Duration: 0:00
5,735 views
270


Dasha Metropolitansky, Research Data Scientist, Microsoft Research Special Projects, introduces VeriTrail, a new method for closed-domain hallucination detection in multi-step AI workflows. Unlike prior methods, VeriTrail provides traceability: it identifies where hallucinated content was likely introduced, and it establishes the provenance of faithful content by tracing a path to the source text. VeriTrail also outperforms baseline methods in hallucination detection. The combination of traceability and effective hallucination detection makes VeriTrail a powerful tool for auditing the integrity of content generated by language models.

Microsoft Research Special Projects VeriTrail paper: https://www.microsoft.com/en-us/research/publication/veritrail-closed-domain-hallucination-detection-with-traceability/
VeriTrail blog post: https://www.microsoft.com/en-us/research/blog/veritrail-detecting-hallucination-and-tracing-provenance-in-multi-step-ai-workflows/
Claimify video: https://www.microsoft.com/en-us/research/video/claimify-extracting-high-quality-claims-from-language-model-outputs/
Claimify paper: https://www.microsoft.com/en-us/research/publication/towards-effective-extraction-and-evaluation-of-factual-claims/
Claimify blog post: https://www.microsoft.com/en-us/research/blog/claimify-extracting-high-quality-claims-from-language-model-outputs/




Other Videos By Microsoft Research


2025-09-24Pushing boundaries of complex reasoning in small language models
2025-09-22zk-promises: Anonymous Moderation, Reputation, & Blocking from Anonymous Credentials with Callbacks
2025-09-22More is Less: Extra Features in Contactless Payments Break Security
2025-09-18Sub-Population Identification of Multi-morbidity in Sub-Saharan African Populations
2025-09-03Echoes in GenAI generations
2025-08-27Six Years of Rowhammer: Breakthroughs and Future Directions
2025-08-25Sub-Population Identification of Multi-morbidity in Sub-Saharan African Populations
2025-08-19MindJourney: Test-Time Scaling with World Models for Spatial Reasoning
2025-08-11Medical Bayesian Kiosk (2010)
2025-08-07Reimagining healthcare delivery and public health with AI
2025-08-05VeriTrail: Detect hallucination and trace provenance in AI workflows
2025-07-31Computational models for brain science
2025-07-30VoluMe: Authentic 3D Video Calls from Live Gaussian Splat Prediction
2025-07-28How I became a StoryTeller (and how YOU can too)
2025-07-28Make some noise: Teaching the language of audio to an LLM using sound tokens
2025-07-28Building Better Language Models Through Global Understanding
2025-07-24Navigating medical education in the era of generative AI
2025-07-22DAViD: Data-efficient and Accurate Vision Models from Synthetic Data
2025-07-21AI Testing and Evaluation: Reflections
2025-07-20Intern talk: Distilling Self-Supervised-Learning-Based Speech Quality Assessment into Compact Models
2025-07-15AI Testing and Evaluation: Learnings from cybersecurity