End-to-end Reinforcement Learning for the Large-scale Traveling Salesman Problem

Channel: Microsoft Research

Subscribers: 351,000

Published on December 12, 2022 8:34:55 PM ● Video Link: https://www.youtube.com/watch?v=qaOp1iRL-14

Duration: 30:07

643 views

2022 Data-driven Optimization Workshop: End-to-end Reinforcement Learning for the Large-scale Traveling Salesman Problem

Speaker: Yan Jin, Huazhong University of Science and Technology

The Traveling Salesman Problem (TSP) is one of the most studied routing problems arising in practical logistics applications. Traditional approaches rely on rules hand-crafted by experts and spend considerable time on iterative search. This limits their use in time-sensitive scenarios such as on-call routing and ride-hailing services. We propose an end-to-end approach based on hierarchical reinforcement learning for the large-scale TSP. Using a divide-and-conquer strategy, the upper-level policy chooses a small subset of cities from the remaining cities still to be traversed, while the lower-level policy applies a Transformer model to the chosen cities to solve a shortest-path problem with prescribed starting and ending cities. The two policies are trained jointly by reinforcement learning algorithms, and TSP solutions are generated directly, without any search procedure. The proposed approach takes advantage of the inference efficiency of the Transformer model and provides highly competitive results.
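The divide-and-conquer loop described in the abstract can be illustrated with a minimal Python sketch. This is not the speaker's implementation: both learned policies are replaced by simple stand-ins so the control flow is runnable. The upper level is assumed here to pick the k remaining cities nearest to the current end of the partial tour, and the lower level is assumed to be a brute-force search over the chosen cities instead of a Transformer. All function names, the choice of ending city, and the parameter k are hypothetical.

```python
import itertools
import math
import random


def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def path_length(cities, order):
    """Length of the open path that visits `order` in sequence."""
    return sum(dist(cities[a], cities[b]) for a, b in zip(order, order[1:]))


def choose_subset(cities, remaining, anchor, k):
    """Stand-in for the upper-level policy (assumption, not learned):
    return up to k remaining cities, nearest to `anchor` first."""
    return sorted(remaining, key=lambda i: dist(cities[anchor], cities[i]))[:k]


def solve_fixed_endpoints(cities, start, end, inner):
    """Stand-in for the lower-level policy (assumption, not a Transformer):
    brute-force the shortest path from `start` to `end` through `inner`."""
    best = min(
        itertools.permutations(inner),
        key=lambda p: path_length(cities, [start, *p, end]),
    )
    return [start, *best, end]


def hierarchical_tour(cities, k=5):
    """Build a tour segment by segment; no search over complete tours."""
    tour = [0]
    remaining = set(range(1, len(cities)))
    while remaining:
        start = tour[-1]
        subset = choose_subset(cities, remaining, start, k)
        end = subset[-1]  # hypothetical rule: farthest chosen city ends the segment
        segment = solve_fixed_endpoints(cities, start, end, subset[:-1])
        tour.extend(segment[1:])  # skip `start`, which is already in the tour
        remaining.difference_update(subset)
    return tour


def tour_length(cities, tour):
    """Length of the closed tour, returning to the first city."""
    return path_length(cities, tour + [tour[0]])


random.seed(0)
cities = [(random.random(), random.random()) for _ in range(30)]
tour = hierarchical_tour(cities, k=5)
print("visits every city once:", sorted(tour) == list(range(len(cities))))
print("tour length:", round(tour_length(cities, tour), 3))
```

Keeping k small bounds the cost of each lower-level call (at most 4! = 24 orderings for k = 5), which mirrors the idea that the lower level only ever sees a small subproblem regardless of the total number of cities.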

