Differentially Private Synthetic Data via Foundation Model APIs

Video Link: https://www.youtube.com/watch?v=WvCfGPSzaUs



Duration: 55:30


A Google TechTalk, presented by Sivakanth Gopi, 2023/06/01
A Google Algorithms Seminar. ABSTRACT: Generating high-quality differentially private (DP) synthetic data is the holy grail of DP research. The current state of the art (SOTA) is to fine-tune a pretrained generative model on private data using DP-SGD and then use it to generate DP synthetic data. But DP fine-tuning is cumbersome, and we may not have access to the weights of pretrained foundation models.

In this work, we show that we can generate DP Synthetic Data via APIs (DPSDA), where we treat foundation models as black boxes and use only their inference APIs. Such an approach can leverage the power of large foundation models whose weights are not released, and it is easy to use because it doesn't require any model training. However, it comes with greater challenges: model access is strictly more restrictive, and privacy must additionally be protected from the API provider.

We present a new framework called Private Evolution (PE) to solve this problem and show its initial promise on synthetic images. Surprisingly, PE can match or even outperform state-of-the-art (SOTA) DP fine-tuning methods. For example, on CIFAR10 (with ImageNet as the public data), we achieve FID of 8 with privacy cost ϵ = 0.67, significantly improving the previous SOTA from ϵ = 32. Based on joint work with Zinan Lin, Janardhan Kulkarni, Harsha Nori and Sergey Yekhanin.
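To make the idea of Private Evolution concrete, here is a minimal toy sketch in Python. It is not the authors' implementation. It assumes a simplified loop: private samples vote for their nearest synthetic candidate, Gaussian noise is added to the vote histogram, and the surviving candidates are resampled and perturbed. The functions `random_api` and `variation_api` are hypothetical stand-ins for a foundation model's inference endpoints, the data are 2-D points rather than images, and distances are plain Euclidean rather than computed in an embedding space. The parameters `sigma` and `threshold` are illustrative and are not calibrated to any particular ϵ.

```python
import numpy as np


def random_api(n, rng):
    # Hypothetical stand-in for an API call that returns n random samples.
    return rng.normal(0.0, 3.0, size=(n, 2))


def variation_api(samples, rng, scale=0.3):
    # Hypothetical stand-in for an API call that returns a variation of each sample.
    return samples + rng.normal(0.0, scale, size=samples.shape)


def dp_nn_histogram(private, synthetic, sigma, threshold, rng):
    # Each private point votes for its nearest synthetic candidate.
    d2 = ((private[:, None, :] - synthetic[None, :, :]) ** 2).sum(axis=2)
    votes = np.bincount(d2.argmin(axis=1), minlength=len(synthetic)).astype(float)
    # Adding or removing one private point changes one bin by 1, so Gaussian
    # noise on the histogram is the only place privacy is spent in this sketch.
    noisy = votes + rng.normal(0.0, sigma, size=votes.shape)
    # Subtract a threshold and clip at zero to suppress bins that are mostly noise.
    return np.clip(noisy - threshold, 0.0, None)


def private_evolution(private, n_synthetic, iterations, sigma, threshold, seed=0):
    rng = np.random.default_rng(seed)
    synthetic = random_api(n_synthetic, rng)
    for _ in range(iterations):
        hist = dp_nn_histogram(private, synthetic, sigma, threshold, rng)
        total = hist.sum()
        # Fall back to uniform resampling if every bin was thresholded away.
        probs = hist / total if total > 0 else np.full(n_synthetic, 1.0 / n_synthetic)
        # Resample candidates in proportion to their noisy votes, then perturb them.
        idx = rng.choice(n_synthetic, size=n_synthetic, p=probs)
        synthetic = variation_api(synthetic[idx], rng)
    return synthetic
```

In this sketch the private data are touched only inside `dp_nn_histogram`, so the overall privacy cost would come from composing one Gaussian mechanism per iteration. Calling `private_evolution(private, n_synthetic=100, iterations=20, sigma=2.0, threshold=2.0)` on a tight cluster of private points should pull the synthetic population toward that cluster over the iterations.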

About the speaker: Sivakanth Gopi is a senior researcher in the Algorithms group at Microsoft Research Redmond. He received a PhD in Computer Science from Princeton University in 2018 during which he received a STOC best paper award for his work on private information retrieval. He completed his undergraduate studies at IIT Bombay with a major in computer science and a minor in mathematics. His main research interests are in coding theory and its applications to both theory and practice, and differential privacy.



