Improving the Privacy Utility Tradeoff in Differentially Private Machine Learning with Public Data

Video Link: https://www.youtube.com/watch?v=KzxAP20TMJc
Duration: 38:17
Subscribers: 348,000
Views: 363

A Google TechTalk, presented by Ashwinee Panda & Xinyu Tang (Princeton University), 2023/03/29
ABSTRACT: Differential privacy (DP) has become the de facto measure of privacy. By training machine learning models with Differentially Private Stochastic Gradient Descent (DP-SGD), we can provide provable guarantees that the trained model does not leak too much information about its training data. However, DP-SGD can compromise the accuracy of machine learning models, because gradient clipping increases the bias and adding Gaussian noise increases the variance of each gradient update. In this talk we present two algorithms, DP-RAFT and DOPE-SGD, that leverage public data to improve the privacy-utility tradeoff in DP-SGD. When ample public data is available to pretrain a model, we propose DP-RAFT, a recipe that privately selects the best hyperparameters for fine-tuning so as to maximize the signal-to-noise ratio of private updates. When only limited public data is available, we propose DOPE-SGD, an algorithm that applies advanced data augmentation to enhance the quality of public data and incorporates gradients from the (augmented) public data in clipping to reduce the effect of the noise added to the privatized gradients.
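To make the bias/variance point in the abstract concrete, here is a minimal NumPy sketch of a single DP-SGD update (per-example clipping followed by Gaussian noise). The function name, shapes, and parameters are illustrative assumptions for this note, not the speakers' implementation or the DP-RAFT/DOPE-SGD algorithms themselves.

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm, noise_multiplier, rng):
    """One privatized gradient update in the style of DP-SGD (sketch).

    per_example_grads: array of shape (batch_size, dim), one gradient per example.
    Clipping each per-example gradient to L2 norm <= clip_norm bounds the
    sensitivity of the sum (this is the source of bias); Gaussian noise with
    standard deviation noise_multiplier * clip_norm provides the privacy
    guarantee (this is the source of added variance).
    """
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    # Scale factor min(1, C / ||g_i||) clips without changing direction.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale            # bias from clipping
    summed = clipped.sum(axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=summed.shape)
    return (summed + noise) / len(per_example_grads)  # noisy mean update
```

With noise_multiplier set to 0 the function reduces to an ordinary clipped-gradient average, which is a convenient way to sanity-check the clipping logic in isolation.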




Other Videos By Google TechTalks


2023-07-03 | 2023 Blockly Developer Summit Day 2-14: Text to Blocks to Text with Layout
2023-07-03 | 2023 Blockly Developer Summit Day 2-8: Active STEM with Unruly Splats
2023-06-29 | A Constant Factor Prophet Inequality for Online Combinatorial Auctions
2023-06-21 | Open Problems in Mechanistic Interpretability: A Whirlwind Tour
2023-06-11 | Online Prediction in Sub-linear Space
2023-06-06 | Accelerating Transformers via Kernel Density Estimation - Insu Han
2023-06-06 | Differentially Private Synthetic Data via Foundation Model APIs
2023-06-05 | Foundation Models and Fair Use
2023-05-30 | Differentially Private Online to Batch
2023-05-30 | Differentially Private Diffusion Models Generate Useful Synthetic Images
2023-05-30 | Improving the Privacy Utility Tradeoff in Differentially Private Machine Learning with Public Data
2023-05-30 | Randomized Approach for Tight Privacy Accounting
2023-05-30 | Almost Tight Error Bounds on Differentially Private Continual Counting
2023-05-30 | EIFFeL: Ensuring Integrity for Federated Learning
2023-05-30 | Differentially Private Diffusion Models
2023-05-15 | Damian Grimling | Sentistocks | Sentimenti | web3 talks | March 9th 2023 | MC: Blake DeBenon
2023-04-21 | Branimir Rakic | CTO & Co-Founder of OriginTrail | web3 talks | Feb 27th 2023 | MC: Alex Ticamera
2023-04-15 | A Nearly Tight Analysis of Greedy k-means++
2023-04-15 | Introduction to Length-Constrained Expanders and Expander Decompositions
2023-04-07 | Improved Feature Importance Computation for Tree Models Based on the Banzhaf Value
2023-04-07 | A Unifying Theory of Distance to Calibration