Differentially Private Sampling from Distributions

Subscribers:
348,000
Published on ● Video Link: https://www.youtube.com/watch?v=AkIPBjZAy9s



Duration: 44:48
672 views
11


A Google TechTalk, presented by Marika Swanberg, 2023-08-22
Google Algorithms Seminar. ABSTRACT: We initiate an investigation of private sampling from distributions. Given a dataset with n independent observations from an unknown distribution P, a sampling algorithm must output a single observation from a distribution that is close in total variation distance to P while satisfying differential privacy. Sampling abstracts the goal of generating small amounts of realistic-looking data.
We provide tight upper and lower bounds for the dataset size needed for this task for three natural families of distributions: arbitrary distributions on {1,…,k}, arbitrary product distributions on {0,1}^d, and product distributions on {0,1}^d with bias in each coordinate bounded away from 0 and 1. We demonstrate that, in some parameter regimes, private sampling requires asymptotically fewer observations than learning a description of P nonprivately; in other regimes, however, private sampling proves to be as difficult as private learning. Notably, for some classes of distributions, the overhead in the number of observations needed for private learning compared to non-private learning is completely captured by the number of observations needed for private sampling.

This work appeared at NeurIPS and the full version can be found here: https://arxiv.org/abs/2211.08193

Bio: Marika is a rising fifth year PhD student at Boston University advised by Adam Smith. Her research spans differential privacy, cryptography, and their intersection with legal questions, and more recently she has become interested in practical implementations of differential privacy. Before interning at Google, she was a visiting assistant professor at Reed College, and she has also done research for Tumult Labs.




Other Videos By Google TechTalks


2024-03-28How Your Brain Processes Code
2024-03-25Fixed-point Error Bounds for Mean-payoff Markov Decision Processes
2024-03-19One Tree to Rule Them All: Polylogarithmic Universal Steiner Trees
2024-01-26Understanding Oversmoothing in Graph Neural Networks (GNNs): Insights from Two Theoretical Studies
2023-12-05Socially Responsible Software Development (Teaching Software Design Systematically)
2023-12-04Understanding and Mitigating Copying in Diffusion Models
2023-12-04Efficient Training Image Extraction from Diffusion Models Ryan Webs
2023-11-30High-Dimensional Prediction for Sequential Decision Making
2023-09-01Representational Strengths and Limitations of Transformers
2023-09-01Steven Goldfeder | CEO Offchain Labs / Arbitrum | web3 talks | Aug 24 2023 | MC: Marlon Ruiz
2023-08-29Differentially Private Sampling from Distributions
2023-07-14Revisiting Nearest Neighbors from a Sparse Signal Approximation View
2023-07-032023 Blockly Developer Summit Day 2-5: Plug-ins Demonstration
2023-07-032023 Blockly Developer Summit DAY 1-5: The Future of Computational Thinking
2023-07-032023 Blockly Developer Summit DAY 1-7: Cubi - Extending Blockly for Teachers
2023-07-032023 Blockly Developer Summit DAY 1-12: Serialization and Visual Diff
2023-07-032023 Blockly Developer Summit Day 2-2: Blockly Themes for Accessibility
2023-07-032023 Blockly Developer Summit DAY 1-14: BlocksCAD - Math + Coding + Design
2023-07-032023 Blockly Developer Summit Day 2-3: Revisiting Performance in Blockly
2023-07-032023 Blockly Developer Summit DAY 1-11: Performance
2023-07-032023 Blockly Developer Summit Day 1-2 Year in Review and Roadmap