Robustly Learning Mixtures of (Clusterable) Gaussians via the SoS Proofs to Algorithms Method

Video Link: https://www.youtube.com/watch?v=uxrthnm1fzY



Duration: 1:00:48


Sam Hopkins, UC Berkeley
Probability, Geometry, and Computation in High Dimensions
Seminar, Oct. 6, 2020

Gaussian Mixture Models (GMMs) are the canonical generative models for non-homogeneous data: they have been studied in statistics and (later) computer science for more than 100 years. A seminal algorithm of Moitra and Valiant shows that for every fixed k, mixtures of k Gaussians in d dimensions can be learned in time polynomial in d. However, this algorithm is highly non-robust: if the underlying distribution of samples is merely close to being a GMM (or if any samples are corrupted), their algorithm fails to learn the distribution.
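(For reference, in standard notation not taken from the abstract: a mixture of k Gaussians in d dimensions is the distribution with density

    p(x) = \sum_{i=1}^{k} w_i \, \mathcal{N}(x; \mu_i, \Sigma_i), \qquad w_i \ge 0, \quad \sum_{i=1}^{k} w_i = 1,

and the learning goal is to estimate the weights w_i, means \mu_i, and covariances \Sigma_i from samples.)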

We design and analyze a new, robust algorithm to learn high-dimensional GMMs under the assumption that the mixture is clusterable, that is, every pair of Gaussians in the mixture has total variation distance close to 1. Prior robust algorithms for learning GMMs work only for spherical Gaussians, that is, Gaussians whose covariance is bounded above by the identity.
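(One standard way to formalize this clusterability assumption, consistent with the description above though the talk's exact quantitative condition may differ: for some small \epsilon > 0,

    d_{TV}\big( \mathcal{N}(\mu_i, \Sigma_i), \, \mathcal{N}(\mu_j, \Sigma_j) \big) \ge 1 - \epsilon \quad \text{for all } i \ne j.)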

In this talk I will emphasize the SoS Proofs to Algorithms method, which we use to design and analyze our algorithm. This method, which offers a mechanical way to turn a sufficiently simple proof of identifiability of a distribution's parameters into an efficient algorithm for learning those parameters, has proven exceptionally powerful across a wide range of high-dimensional statistics settings.
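(A rough schematic of the method, in my words rather than the talk's: if identifiability of the parameters \theta from robustly estimable moments admits a proof expressible as a degree-t sum-of-squares certificate, then one can solve a semidefinite program for a degree-t pseudo-expectation \tilde{\mathbb{E}} over candidate parameters satisfying the same constraints the proof uses, and round, e.g. by taking

    \hat{\theta} = \tilde{\mathbb{E}}[\theta],

to obtain an estimator computable in time d^{O(t)}.)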

https://simons.berkeley.edu/events/robustly-learning-mixtures-clusterable-gaussians-sos-proofs-algorithms-method



