How load balancing AI workloads delivers faster user response times (demo)

Channel: Google Cloud
Subscribers: 296,000
Published on 2025-04-03 ● Video Link: https://www.youtube.com/watch?v=x_2ehkP1GJk



165 views


Join Emanuele Mazza, a Networking Product Specialist at Google Cloud, to learn how Cloud Load Balancing uses custom metrics, such as queue depth, to balance AI workloads, delivering faster responses to user prompts while optimizing TPU and GPU utilization. This demo gamifies the load balancer: the attendee competes against our load balancer operating in the region. The console shows how to configure the load balancer to select the least-loaded endpoints, which reduces the time requests spend waiting for a GPU and speeds up inference responses.
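To make the idea of balancing on queue depth concrete, here is a minimal Python sketch. It is not how Cloud Load Balancing is implemented or configured. It is a toy simulation under stated assumptions: the `Endpoint` class, the `queue_depth` field, the request costs, and the helper names are all hypothetical, and queues never drain during the run. It compares plain round robin with a picker that always chooses the endpoint reporting the smallest queue depth.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Endpoint:
    """Hypothetical inference backend; queue_depth is its pending work."""
    name: str
    queue_depth: int = 0


def pick_least_loaded(endpoints):
    """Choose the endpoint with the smallest reported queue depth.

    Ties go to the first endpoint in the list.
    """
    return min(endpoints, key=lambda e: e.queue_depth)


def make_round_robin():
    """Return a picker that cycles through endpoints, ignoring their load."""
    counter = count()

    def pick(endpoints):
        return endpoints[next(counter) % len(endpoints)]

    return pick


def simulate(costs, pick, n_endpoints=3):
    """Assign each request cost to an endpoint and return the deepest queue.

    Simplifying assumption: queues only grow (nothing is served during the
    run), so the deepest queue approximates the worst-case wait.
    """
    endpoints = [Endpoint(f"gpu-{i}") for i in range(n_endpoints)]
    for cost in costs:
        pick(endpoints).queue_depth += cost
    return max(e.queue_depth for e in endpoints)


# Hypothetical mix of long prompts (cost 8) and short prompts (cost 1).
COSTS = [8, 1, 1, 8, 1, 1, 8, 1, 1]

if __name__ == "__main__":
    print("round robin, deepest queue: ", simulate(COSTS, make_round_robin()))
    print("least loaded, deepest queue:", simulate(COSTS, pick_least_loaded))
```

With this particular request mix, round robin sends every long prompt to the same endpoint and ends with a deepest queue of 24, while the queue-depth picker spreads the work and ends with 11. The numbers depend entirely on the assumed costs, so treat them as an illustration of the principle rather than a benchmark.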

Learn more about Cloud Load Balancing here: https://cloud.google.com/load-balancing?e=48754805




Other Videos By Google Cloud


2025-04-07 Mercari improves UX on its ecommerce marketplace with Google AI and Weights & Biases
2025-04-07 Google Cloud and MLB deliver baseball globally with Media CDN
2025-04-07 Introduction to Vertex AI Studio: Course Preview
2025-04-06 Google Cloud and MLB deliver baseball globally with Media CDN
2025-04-04 New Way Now: Virgin Media O2 puts data at the heart of UK connectivity with Google Cloud
2025-04-03 How Vodafone uses Google Gemini to create new content with AI to boost employee productivity (demo)
2025-04-03 How Google Cloud and Deutsche Telekom are making AI a seamless part of daily life (demo)
2025-04-03 DT & Google Gemini: Building Autonomous Network Agents
2025-04-03 How AI network operations detect and resolve network issues quickly (demo)
2025-04-03 How load balancing AI workloads delivers faster user response times (demo)
2025-04-03 How to use Google Cloud Customer Engagement Suite to boost customer satisfaction (demo)
2025-04-03 How autonomous network operations are enabled with Google Cloud and Ericsson (demo)
2025-04-03 How to unlock enterprise expertise for employees with agents using Google Agentspace (demo)
2025-04-03 Cómo usar Google Cloud Customer Engagement Suite para aumentar la satisfacción del cliente
2025-04-03 How to create AI agents using Gemini on Vertex AI and AI Studio (demo)
2025-04-03 How to use multi-modal gen AI to elevate field agent knowledge and productivity (demo)
2025-04-03 Protecting cybersecurity with NVIDIA and Google Cloud
2025-04-03 WealthAPI delivers next-gen financial insights in real time with Gemini and DataStax
2025-04-03 Stax AI transforms trust accounting data with Google Cloud and MongoDB
2025-04-02 Replit makes software creation easy for everyone with Google Cloud and Anthropic