Efficient Training Image Extraction from Diffusion Models Ryan Webs

Video Link: https://www.youtube.com/watch?v=K-jeWq-RFaY



Duration: 34:38


A Google TechTalk, presented by Ryan Webster, 2023-09-13
Abstract: The recent demonstration of Carlini et al. shows that highly duplicated training images can be copied by diffusion models during generation, which is problematic in terms of data privacy and copyright. Known as an extraction attack, this method reconstructs training images using only a model's generated samples. Because the original attack requires on the order of GPU-years to perform, we provide a pipeline that runs in GPU-days and extracts a similar number of images. We first de-duplicate the public dataset LAION-2B and show that it contains a high level of duplicated images. We then provide whitebox and blackbox extraction attacks on par with the original attack, whilst requiring significantly fewer network evaluations. Because we can evaluate more samples, we expose the phenomenon of template copies, wherein a diffusion model copies a fixed image region and varies another. We demonstrate that new diffusion models that deduplicate their training set do not generate exact copies as in Carlini et al., but do generate templates. We conclude with several insights into copied images from a data perspective.
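The abstract does not say how the de-duplication step is implemented. As a rough illustration only, one common way to find near-duplicate images is to compare precomputed embedding vectors by cosine similarity and group those above a threshold. The sketch below assumes that approach; the function name `duplicate_groups`, the threshold of 0.95, and the toy vectors are all hypothetical and are not taken from the talk. It compares every pair of vectors, so it only suits small examples. A dataset the size of LAION-2B would need an approximate nearest-neighbour index instead.

```python
import numpy as np


def duplicate_groups(embeddings, threshold=0.95):
    """Group rows whose cosine similarity is >= threshold.

    Hypothetical helper for illustration: assumes every embedding is
    non-zero and that the whole similarity matrix fits in memory.
    """
    x = np.asarray(embeddings, dtype=np.float64)
    # Normalise rows so that a dot product equals cosine similarity.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    sim = x @ x.T
    n = len(x)

    # Union-find, so that chains of near-duplicates end up in one group.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if sim[i, j] >= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # Keep the smallest index as the group representative.
                    parent[max(ri, rj)] = min(ri, rj)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    # Report only groups with at least two members, i.e. actual duplicates.
    return [g for g in groups.values() if len(g) > 1]


if __name__ == "__main__":
    # Toy embeddings: rows 0 and 1 are nearly parallel, rows 2 and 3 are parallel.
    toy = [[1.0, 0.0], [0.999, 0.01], [0.0, 1.0], [0.0, 2.0]]
    print(duplicate_groups(toy))
```

The size of each returned group gives a duplication count per image, which is the kind of statistic the abstract refers to when it speaks of highly duplicated training images.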



