Efficient Training Image Extraction from Diffusion Models Ryan Webs

Channel:

Google TechTalks

Subscribers:

349,000

Published on December 4, 2023 6:30:49 PM

Video Link: https://www.youtube.com/watch?v=K-jeWq-RFaY

Duration: 34:38

890 views

A Google TechTalk, presented by Ryan Webster, 2023-09-13
Abstract: The recent demonstration of Carlini et al. shows that highly duplicated training images can be copied by diffusion models during generation, which is problematic in terms of data privacy and copyright. Their method, known as an extraction attack, reconstructs training images using only a model's generated samples. As the original work requires on the order of GPU-years to perform, we provide a pipeline that can run in GPU-days and can extract a similar number of images. We first de-duplicate the public dataset LAION-2B and demonstrate a high level of duplicated images. We then provide white-box and black-box extraction attacks on par with the original attack, while requiring significantly fewer network evaluations. Because we can evaluate more samples, we expose the phenomenon of template copies, wherein a diffusion model copies a fixed image region and varies another. We demonstrate that new diffusion models that de-duplicate their training set do not generate exact copies as in Carlini et al., but do generate templates. We conclude with several insights into copied images from a data perspective.
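
The abstract does not say how the de-duplication of LAION-2B was carried out. As a rough illustration only, near-duplicate images are often found by comparing image embeddings with cosine similarity. The sketch below assumes each image has already been mapped to an embedding vector; the function names, the 0.95 threshold, and the brute-force pairwise comparison are all hypothetical choices for a toy example, not the speaker's pipeline, and a dataset with billions of images would need an approximate nearest-neighbor index instead.

```python
import numpy as np


def find_near_duplicates(embeddings, threshold=0.95):
    """Return index pairs (i, j) with i < j whose cosine similarity >= threshold.

    Hypothetical helper for illustration: brute-force O(n^2) comparison,
    suitable only for small toy inputs.
    """
    x = np.asarray(embeddings, dtype=np.float64)
    # Normalize rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    x = x / np.clip(norms, 1e-12, None)
    sims = x @ x.T
    # Look only at the upper triangle to skip self-pairs and mirrored pairs.
    i_idx, j_idx = np.triu_indices(len(x), k=1)
    mask = sims[i_idx, j_idx] >= threshold
    return list(zip(i_idx[mask].tolist(), j_idx[mask].tolist()))


def duplicate_counts(n_items, pairs):
    """Count, for each item, how many near-duplicate partners it has."""
    counts = np.zeros(n_items, dtype=int)
    for i, j in pairs:
        counts[i] += 1
        counts[j] += 1
    return counts


# Toy embeddings (assumed values): items 0 and 1 are nearly identical.
toy_embeddings = [[1.0, 0.0], [0.999, 0.01], [0.0, 1.0]]
pairs = find_near_duplicates(toy_embeddings)
counts = duplicate_counts(len(toy_embeddings), pairs)
```

Items with a high partner count correspond to the highly duplicated images that the abstract identifies as the ones most at risk of being copied.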