Chasing the Long Tail: What Neural Networks Memorize and Why

Video Link: https://www.youtube.com/watch?v=w_BUN5tPiuA



Duration: 51:41
1,712 views


Vitaly Feldman (Apple ML Research)
https://simons.berkeley.edu/node/22921
Societal Considerations and Applications

Deep learning algorithms that achieve state-of-the-art results on image and text recognition tasks tend to fit the entire training dataset (nearly) perfectly, including mislabeled examples and outliers. This propensity to memorize seemingly useless data and the resulting large generalization gap have puzzled many practitioners and are not explained by existing theories of machine learning. We provide a simple conceptual explanation and a theoretical model demonstrating that memorization of outliers and mislabeled examples is necessary for achieving close-to-optimal generalization error when learning from long-tailed data distributions. Image and text data are known to follow such distributions, and therefore our results establish a formal link between these empirical phenomena. We then demonstrate the utility of memorization and support our explanation empirically. These results rely on a new technique for efficiently estimating the memorization and influence of training data points. Our results allow us to quantify the cost of limiting memorization in learning and explain the disparate effects that privacy and model compression have on different subgroups.
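
The estimation technique mentioned in the abstract measures, for each training example, how much the model's prediction on that example depends on whether the example was in the training set. Below is a minimal sketch of a subsampling-based estimator in that spirit (following the general idea in Feldman and Zhang's work): train many models on random subsets of the data, then compare accuracy on each example between models whose subset included it and models whose subset excluded it. The toy dataset, classifier, and parameter values are illustrative placeholders, not the authors' actual implementation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Sketch of subsampling-based memorization estimation.
# Everything below (dataset, classifier, constants) is a placeholder
# chosen to keep the example self-contained and runnable.

rng = np.random.default_rng(0)

# Toy dataset standing in for the real training set.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
n = len(y)

n_models = 100      # number of models trained on random subsets
subset_frac = 0.7   # fraction of the training set used per model

# correct[t, i]  = 1 if model t classifies training example i correctly
# included[t, i] = True if example i was in model t's training subset
correct = np.zeros((n_models, n))
included = np.zeros((n_models, n), dtype=bool)

for t in range(n_models):
    idx = rng.choice(n, size=int(subset_frac * n), replace=False)
    included[t, idx] = True
    model = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
    correct[t] = (model.predict(X) == y)

# Memorization estimate for example i: accuracy on i when i was in the
# training subset minus accuracy on i when it was held out.
eps = 1e-12
acc_in = (correct * included).sum(axis=0) / (included.sum(axis=0) + eps)
acc_out = (correct * ~included).sum(axis=0) / ((~included).sum(axis=0) + eps)
memorization = acc_in - acc_out

print("most memorized examples:", np.argsort(-memorization)[:10])
```

The same per-model bookkeeping can be reused to estimate the influence of a training example on a test example, by comparing test accuracy between models that did and did not see the training example; this avoids the cost of retraining with each example individually removed.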




Other Videos By Simons Institute for the Theory of Computing


2022-11-10Decision-Aware Learning for Global Health Supply Chains
2022-11-10Supply-Side Equilibria in Recommender Systems
2022-11-10What Really Matters for Fairness in Machine Learning: Delayed Impact and Other Desiderata
2022-11-10Predictive Modeling in Healthcare – Special Considerations
2022-11-10Bringing Order to Chaos: Navigating the Disagreement Problem in Explainable ML
2022-11-09Pipeline Interventions
2022-11-09Algorithmic Challenges in Ensuring Fairness at the Time of Decision
2022-11-09Improving Refugee Resettlement
2022-11-09Learning to Predict Arbitrary Quantum Processes
2022-11-09A Kerfuffle: Differential Privacy and the 2020 Census
2022-11-08Chasing the Long Tail: What Neural Networks Memorize and Why
2022-11-08Concurrent Composition Theorems for all Standard Variants of Differential Privacy
2022-11-08Privacy Management: Achieving the Possimpible
2022-11-07Privacy-safe Measurement on the Web: Open Questions From the Privacy Sandbox
2022-10-29Transmission Neural Networks: From Virus Spread Models to Neural Networks
2022-10-29Spatial Spread of Dengue Virus: Appropriate Spatial Scales for Transmission
2022-10-28A Global Comparison of COVID-19 Variant Waves and Relationships with Clinical and...
2022-10-28Diversity and Inequality in Information Diffusion on Social Networks
2022-10-28Learning through the Grapevine and the Impact of the Breadth and Depth of Social Networks
2022-10-28Just a Few Seeds More: The Inflated Value of Network Data for Diffusion...
2022-10-27Bayesian Learning in Social Networks



Tags:
Simons Institute
theoretical computer science
UC Berkeley
Computer Science
Theory of Computation
Theory of Computing
Epidemics and Information Diffusion
Vitaly Feldman