Improving the Privacy Utility Tradeoff in Differentially Private Machine Learning with Public Data

Video Link: https://www.youtube.com/watch?v=KzxAP20TMJc
Duration: 38:17
Subscribers: 348,000
Views: 363

A Google TechTalk, presented by Ashwinee Panda & Xinyu Tang (Princeton University), 2023/03/29
ABSTRACT: Differential privacy (DP) has become the de facto measure of privacy. By training machine learning models with Differentially Private Stochastic Gradient Descent (DP-SGD), we can provide provable guarantees that the trained model does not leak too much information about its training data. However, DP-SGD can compromise the accuracy of machine learning models, because gradient clipping increases the bias and adding Gaussian noise increases the variance of each gradient update. In this talk we present two algorithms, DP-RAFT and DOPE-SGD, that leverage public data to improve the privacy-utility tradeoff in DP-SGD. When ample public data is available to pretrain a model, we propose DP-RAFT, a recipe that privately selects the best hyperparameters for fine-tuning so as to maximize the signal-to-noise ratio of private updates. When only limited public data is available, we propose DOPE-SGD, an algorithm that applies advanced data augmentation to enhance the quality of public data and incorporates gradients from the (augmented) public data in clipping to reduce the effect of the noise added to the privatized gradients.
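To make the bias/variance point in the abstract concrete, here is a minimal NumPy sketch of a single DP-SGD update (per-example clipping followed by Gaussian noise). The function name, shapes, and parameters are illustrative assumptions for this note, not the speakers' implementation or the DP-RAFT/DOPE-SGD algorithms themselves.

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm, noise_multiplier, rng):
    """One privatized gradient update in the style of DP-SGD (sketch).

    per_example_grads: array of shape (batch_size, dim), one gradient per example.
    Clipping each per-example gradient to L2 norm <= clip_norm bounds the
    sensitivity of the sum (this is the source of bias); Gaussian noise with
    standard deviation noise_multiplier * clip_norm provides the privacy
    guarantee (this is the source of added variance).
    """
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    # Scale factor min(1, C / ||g_i||) clips without changing direction.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale            # bias from clipping
    summed = clipped.sum(axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=summed.shape)
    return (summed + noise) / len(per_example_grads)  # noisy mean update
```

With noise_multiplier set to 0 the function reduces to an ordinary clipped-gradient average, which is a convenient way to sanity-check the clipping logic in isolation.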




Other Videos By Google TechTalks


2023-07-03 | 2023 Blockly Developer Summit Day 2-14: Text to Blocks to Text with Layout
2023-07-03 | 2023 Blockly Developer Summit Day 2-8: Active STEM with Unruly Splats
2023-06-29 | A Constant Factor Prophet Inequality for Online Combinatorial Auctions
2023-06-21 | Open Problems in Mechanistic Interpretability: A Whirlwind Tour
2023-06-11 | Online Prediction in Sub-linear Space
2023-06-06 | Accelerating Transformers via Kernel Density Estimation - Insu Han
2023-06-06 | Differentially Private Synthetic Data via Foundation Model APIs
2023-06-05 | Foundation Models and Fair Use
2023-05-30 | Differentially Private Online to Batch
2023-05-30 | Differentially Private Diffusion Models Generate Useful Synthetic Images
2023-05-30 | Improving the Privacy Utility Tradeoff in Differentially Private Machine Learning with Public Data
2023-05-30 | Randomized Approach for Tight Privacy Accounting
2023-05-30 | Almost Tight Error Bounds on Differentially Private Continual Counting
2023-05-30 | EIFFeL: Ensuring Integrity for Federated Learning
2023-05-30 | Differentially Private Diffusion Models
2023-05-15 | Damian Grimling | Sentistocks | Sentimenti | web3 talks | March 9th 2023 | MC: Blake DeBenon
2023-04-21 | Branimir Rakic | CTO & Co-Founder of OriginTrail | web3 talks | Feb 27th 2023 | MC: Alex Ticamera
2023-04-15 | A Nearly Tight Analysis of Greedy k-means++
2023-04-15 | Introduction to Length-Constrained Expanders and Expander Decompositions
2023-04-07 | Improved Feature Importance Computation for Tree Models Based on the Banzhaf Value
2023-04-07 | A Unifying Theory of Distance to Calibration