The Compensated Coupling (or Why the Future is the Best Guide for the Present)

Published on ● Video Link: https://www.youtube.com/watch?v=wYxq-KJ7v5A



Category:
Guide
Duration: 31:56
338 views
7


Sid Banerjee (Cornell University)
https://simons.berkeley.edu/talks/compensated-coupling-or-why-future-best-guide-present
Joint IFML/Data-Driven Decision Processes Workshop

What makes online decision-making different from other decision-making/optimization problems? While it seems clear that the unique features are the sequential nature of taking actions and uncertainty in future outcomes, most techniques for solving such problems tend to obfuscate these features - so are these the best ways to think about these settings? I will present the compensated coupling: a simple paradigm for reasoning about and designing online decision-making policies, based on a sample-pathwise accounting of their performance compared to some benchmark policy. This approach generalizes many standard results used in studying Markov decision processes and reinforcement learning, but also gives us new policies which are much simpler and more effective than existing heuristics. For a large class of widely-studied control problems including online resource-allocation, dynamic pricing, generalized assignment, online bin packing, and bandits with knapsacks, I will illustrate how these new algorithms achieve constant regret (i.e., additive loss compared to the hindsight optimal which is independent of the horizon and state-space) under a wide range of conditions. Time permitting, I will try and describe how we can use this technique to incorporate side information and historical data in these settings, and achieve constant regret with as little as a single data trace.




Other Videos By Simons Institute for the Theory of Computing


2022-10-12Learning Across Bandits in High Dimension via Robust Statistics
2022-10-12Are Multicriteria MDPs Harder to Solve Than Single-Criteria MDPs?
2022-10-12A Game-Theoretic Approach to Offline Reinforcement Learning
2022-10-11The Statistical Complexity of Interactive Decision Making
2022-10-11A Tutorial on Finite-Sample Guarantees of Contractive Stochastic Approximation With...
2022-10-11A Tutorial on Finite-Sample Guarantees of Contractive Stochastic Approximation With...
2022-10-11Stochastic Bin Packing with Time-Varying Item Sizes
2022-10-10Constant Regret in Exchangeable Action Models: Overbooking, Bin Packing, and Beyond
2022-10-08On The Exploration In Load-Balancing Under Unknown Service Rates
2022-10-08Sample Complexity Of Policy-Based Methods Under Off-Policy Sampling And ...
2022-10-08The Compensated Coupling (or Why the Future is the Best Guide for the Present)
2022-10-08Higher-Dimensional Expansion of Random Geometric Complexes
2022-10-08On the Power of Preconditioning in Sparse Linear Regression
2022-10-07What Functions Do Transformers Prefer to Represent?
2022-10-01Optimality of Variational Inference for Stochastic Block Model
2022-10-01Machine Learning on Large-Scale Graphs
2022-10-01Survey on Sparse Graph Limits + A Toy Example
2022-10-01Long Range Dependence in Evolving Networks
2022-09-30Stochastic Processes on Sparse Graphs: Hydrodynamic Limits and Markov Approximations
2022-09-30Large Deviation Principle for the Norm of the Adjacency Matrix and the Laplacian Matrix of...
2022-09-30Longitudinal Network Models, Log-Linear Multigraph Models, and Implications to Estimation and...



Tags:
Simons Institute
theoretical computer science
UC Berkeley
Computer Science
Theory of Computation
Theory of Computing
Joint IFML/Data-Driven Decision Processes Workshop
Sid Banerjee