Building the Next Generation of Conversational AI

Channel: a16z

Subscribers: 204,000

Published on March 15, 2025 3:37:39 AM ● Video Link: https://www.youtube.com/watch?v=bTcpNQH8ViQ

9,854 views

Inside the Code: Ankit Kumar (Sesame) & Anjney Midha (a16z) on the Future of Voice AI

What goes into building a truly natural-sounding AI voice? In this episode, Sesame’s cofounder and CTO, Ankit Kumar, joins a16z’s Anjney Midha for a deep dive into the research and engineering behind their voice technology.

They discuss the technical challenges of real-time speech generation, the trade-offs in balancing personality with efficiency, and why the team is open-sourcing key components of their model. Ankit breaks down the complexities of multimodal AI, full-duplex conversation modeling, and the computational optimizations that enable low-latency interactions. They also explore the evolution of natural language as a user interface and its potential to redefine human-computer interaction.

Plus, we take audience questions on everything from scaling laws in speech synthesis to the role of in-context learning in making AI voices more expressive.

Key Takeaways:
How Sesame achieves natural voice interactions through real-time speech generation.
The impact of open-sourcing their speech model and what it means for AI research.
The role of full-duplex modeling in improving AI responsiveness.
How computational efficiency and system latency shape AI conversation quality.
The growing role of natural language as a user interface in AI-driven experiences.

For anyone interested in AI and voice technology, this episode offers an in-depth look at the latest advancements pushing the boundaries of human-computer interaction.

Follow everyone on X:
Ankit Kumar - https://x.com/_apkumar
Anjney Midha - https://x.com/anjneymidha

Check out everything a16z is doing with artificial intelligence, including articles, projects, and more podcasts here – https://a16z.com/ai/
