Awesome GenAI Watermarking
1.0.0
This repository collects papers on watermarking methods for generative AI models. Watermarking embeds an imperceptible but recoverable signal (the payload) into a digital asset (the cover). With generative models, some approaches train the model itself to produce the watermark in every output, in a way that should be hard to disable. We refer to this as "Fingerprint Rooting" or just "Rooting".
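As a toy illustration of the payload/cover terminology (my own minimal sketch, not taken from any paper below), a least-significant-bit scheme embeds payload bits into pixel LSBs and recovers them afterwards; real watermarks are of course far more imperceptible and robust:

```python
import numpy as np

def embed(cover: np.ndarray, payload: np.ndarray) -> np.ndarray:
    """Hide payload bits (0/1) in the least significant bits of the cover pixels."""
    stego = cover.copy().ravel()
    stego[: payload.size] = (stego[: payload.size] & 0xFE) | payload
    return stego.reshape(cover.shape)

def recover(stego: np.ndarray, n_bits: int) -> np.ndarray:
    """Read the payload back out of the stego image."""
    return stego.ravel()[:n_bits] & 1

cover = np.random.randint(0, 256, size=(8, 8), dtype=np.uint8)
payload = np.random.randint(0, 2, size=16, dtype=np.uint8)
stego = embed(cover, payload)
assert np.array_equal(recover(stego, 16), payload)
```

Each stego pixel differs from the cover by at most 1, so the change is imperceptible; the downside is that such LSB schemes are trivially destroyed by compression or noise, which is exactly what the robust methods below address.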
Paper | Proceedings / Journal | Venue Year / Last Updated | Code | Alternative PDF Source | Notes |
---|---|---|---|---|---|
Watermarking is not Cryptography | IWDW | 2006 | - | Author webpage | - TODO |
Paper | Proceedings / Journal | Venue Year / Last Updated | Code | Alternative PDF Source | Notes |
---|---|---|---|---|---|
Artificial Fingerprinting for Generative Models: Rooting Deepfake Attribution in Training Data | ICCV | 2021 | - | Arxiv | - Roots GAN models by embedding watermarks into the training data to exploit transferability |
PTW: Pivotal Tuning Watermarking for Pre-Trained Image Generators | USENIX | 2023 | Github | Arxiv | - Focuses on GANs, but should also work for latent diffusion models |
The Stable Signature: Rooting Watermarks in Latent Diffusion Models | ICCV | 2023 | Github | Arxiv | - Meta/FAIR authors - Finetunes a model jointly with an encoder/decoder so that a secret message is revealed in its output - Robust to watermark removal and model purification (quality deterioration) - Static watermarking |
Stable Signature is Unstable: Removing Image Watermark from Diffusion Models | - | 2024 | - | Arxiv | - Stable Signature model purification via finetuning |
Flexible and Secure Watermarking for Latent Diffusion Model | ACM MM | 2023 | - | - | - References Stable Signature and improves on it by allowing different messages to be embedded without finetuning |
A Training-Free Plug-and-Play Watermark Framework for Stable Diffusion | - | 2024 | - | Arxiv | - TODO |
WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models | NeurIPS Workshop on Diffusion Models | 2023 | - | Arxiv | - TODO |
RoSteALS: Robust Steganography using Autoencoder Latent Space | CVPR Workshops (CVPRW) | 2023 | Github | Arxiv | - Post-hoc watermarking |
DiffusionShield: A Watermark for Copyright Protection against Generative Diffusion Models | NeurIPS Workshop on Diffusion Models | 2023 | - | Arxiv | - Not about rooting - Data poisoning: protected images reproduce the watermark if used as training data for a diffusion model |
A Recipe for Watermarking Diffusion Models | - | 2023 | Github | Arxiv | - Framework for 1. small unconditional/class-conditional DMs via training from scratch on watermarked data and 2. text-to-image DMs via finetuning a backdoor trigger-output pair - Many references on watermarking discriminative models - Static watermarking |
Intellectual Property Protection of Diffusion Models via the Watermark Diffusion Process | - | 2023 | - | Arxiv | - Threat model: verify ownership of a model given access to it - Hard to read - Explains the difference between static and dynamic watermarking with many references |
Securing Deep Generative Models with Universal Adversarial Signature | - | 2023 | Github | Arxiv | - 1. Finds an optimal signature for each image individually - 2. Finetunes a GenAI model on these images |
Watermarking Diffusion Model | - | 2023 | - | Arxiv | - Finetuning with a backdoor trigger-output pair - Static watermarking - CISPA authors |
Catch You Everything Everywhere: Guarding Textual Inversion via Concept Watermarking | - | 2023 | - | Arxiv | - Guards concepts obtained through textual inversion (An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion) against abuse by making the concepts identifiable in generated images - Very interesting references on company and government stances on watermarking |
Generative Watermarking Against Unauthorized Subject-Driven Image Synthesis | - | 2023 | - | Arxiv | - Differs from Glaze in that style synthesis from protected source images is not prevented, but made recognizable via watermarks - CISPA authors |
Towards the Vulnerability of Watermarking Artificial Intelligence Generated Content | - | 2024 | - | OpenReview | - Watermark removal and forgery in one method, using a GAN - References two types of watermarking: 1. learning/finetuning the model to produce watermarked output and 2. post-hoc watermarking after the fact (static vs. dynamic, see "Intellectual Property Protection of Diffusion Models via the Watermark Diffusion Process") |
Robustness of AI-Image Detectors: Fundamental Limits and Practical Attacks | ICLR | 2024 | Github | Arxiv | - Shows that low-budget watermarking methods are defeated by diffusion purification and proposes an attack that removes even high-budget watermarks via model substitution |
A Transfer Attack to Image Watermarks | - | 2024 | - | Arxiv | - Watermark removal via a "no-box" attack on detectors (no access to a detector API; instead trains a classifier to distinguish watermarked from vanilla images) |
EditGuard: Versatile Image Watermarking for Tamper Localization and Copyright Protection | CVPR | 2024 | Github | Arxiv | - Post-hoc watermarking with tamper localization |
Latent Watermark: Inject and Detect Watermarks in Latent Diffusion Space | - | 2024 | - | Arxiv | - Discusses 3 categories for watermarks with references: before, during, and after generation |
Stable Messenger: Steganography for Message-Concealed Image Generation | - | 2023 | - | Arxiv | - Post-hoc watermarking - Categorized as embedding during generation by "Latent Watermark: Inject and Detect Watermarks in Latent Diffusion Space", but I think it is actually post-hoc |
Paper | Proceedings / Journal | Venue Year / Last Updated | Code | Alternative PDF Source | Notes |
---|---|---|---|---|---|
StegaStamp: Invisible Hyperlinks in Physical Photographs | CVPR | 2020 | Github | Arxiv | - Watermarks physical images so the payload can be recovered from a video stream - "Towards the Vulnerability of Watermarking Artificial Intelligence Generated Content" speculates that DeepMind's SynthID works similarly |
ChartStamp: Robust Chart Embedding for Real-World Applications | ACM MM | 2022 | Github | - | - Like StegaStamp, but introduces less clutter in flat image regions |
Unadversarial Examples: Designing Objects for Robust Vision | NeurIPS | 2021 | Github | Arxiv | - Perturbations to make detection easier |
Paper | Proceedings / Journal | Venue Year / Last Updated | Code | Alternative PDF Source | Notes |
---|---|---|---|---|---|
RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images with Provable Guarantees | - | 2024 | Github | Arxiv | - Withdrawn from arxiv |
PiGW: A Plug-in Generative Watermarking Framework | - | 2024 | Did not look for it yet | Arxiv | - Withdrawn from arxiv |
Benchmarking the Robustness of Image Watermarks (Wait for ICML source) | ICML | 2024 | Github | Arxiv | - A benchmark/framework for testing watermarks against |
WMAdapter: Adding WaterMark Control to Latent Diffusion Models | - | 2024 | Did not look for it yet | Arxiv | - TODO |
Steganalysis on Digital Watermarking: Is Your Defense Truly Impervious? | - | 2024 | Did not look for it yet | Arxiv | - TODO |
Finding needles in a haystack: A Black-Box Approach to Invisible Watermark Detection | - | 2024 | Did not look for it yet | Arxiv | - TODO |
ProMark: Proactive Diffusion Watermarking for Causal Attribution | CVPR | 2024 | - | Arxiv | - TODO |
Watermarking Images in Self-Supervised Latent Spaces | ICASSP | 2022 | Github | Arxiv | - TODO |
Generative Autoencoders as Watermark Attackers: Analyses of Vulnerabilities and Threats | ICML Workshop DeployableGenerativeAI | 2023 | - | - | - Attack on pixel-watermarks using LDM autoencoders |
Invisible Image Watermarks Are Provably Removable Using Generative AI | - | 2023 | Github | Arxiv | - Not about rooting a model, but about removing watermarks with diffusion purification - Evaluates Stable Signature and Tree-Ring Watermarks; Tree-Ring is robust against their attack - Earlier version of "Generative Autoencoders as Watermark Attackers" |
WaterDiff: Perceptual Image Watermarks Via Diffusion Model | IVMSP-P2 Workshop at ICASSP | 2024 | - | - | - TODO |
Squint Hard Enough: Attacking Perceptual Hashing with Adversarial Machine Learning | USENIX | 2022 | - | - | - Attacks on perceptual hashes |
Evading Watermark based Detection of AI-Generated Content | CCS | 2023 | Github | Arxiv | - Evaluation of robustness of image watermarks + Adversarial sample for evasion |
Diffusion Models for Adversarial Purification | ICML | 2022 | Github | Arxiv | - Defense against adversarial perturbations, including imperceptible watermarks in images |
Flow-Based Robust Watermarking with Invertible Noise Layer for Black-Box Distortions | AAAI | 2023 | Github | - | - Like HiDDeN, a neural watermark encoder/extractor |
HiDDeN: Hiding Data With Deep Networks | ECCV | 2018 | Github | Arxiv | - Main tool used in Stable Signature - Contains differentiable approx. of JPEG compression - Dynamic watermarking |
Glaze: Protecting artists from style mimicry by text-to-image models | USENIX | 2023 | Github | Arxiv | - Not about rooting, but about preventing style theft |
DUAW: Data-free Universal Adversarial Watermark against Stable Diffusion Customization | - | 2023 | - | Arxiv | - Seems similar to Glaze at first glance; the authors may have been unlucky with parallel work |
Responsible Disclosure of Generative Models Using Scalable Fingerprinting | ICLR | 2022 | Github | Arxiv | - Rooting GAN models. Seems to have introduced the idea of scalably producing many models fast with large message space (TODO: check this later), similar to how Stable Signature did it later for stable diffusion. |
On Attribution of Deepfakes | - | 2020 | - | Arxiv | - Shows that an image can be crafted to look as if it was generated by a targeted model - Also proposes a framework for achieving deniability in such cases |
Towards Blind Watermarking: Combining Invertible and Non-invertible Mechanisms | ACM MM | 2022 | Github | Arxiv | - Is not about rooting a model, but about attacking post-hoc watermarking of images - Lots of references on invertible NNs |
DocDiff: Document Enhancement via Residual Diffusion Models | ACM MM | 2023 | Github | Arxiv | - Is not about rooting a model, but about post-hoc watermarking of images - Includes classic watermark removal |
Warfare: Breaking the Watermark Protection of AI-Generated Content | - | 2023 | Did not look for it yet | Arxiv | - Not about rooting a model, but about attacking post-hoc watermarking - Includes 1. watermark removal and 2. forging |
Leveraging Optimization for Adaptive Attacks on Image Watermarks | ICML (Poster) | 2024 | Did not look for it yet | Arxiv | - Is not about rooting a model, but about attacking post-hoc watermarking |
A Somewhat Robust Image Watermark against Diffusion-based Editing Models | - | 2023 | Did not look for it yet | Arxiv | - Is not about rooting a model, but about post-hoc watermarking of images - Takes watermarks literally and injects hidden images |
Hey That's Mine Imperceptible Watermarks are Preserved in Diffusion Generated Outputs | - | 2023 | - | Arxiv | - Is not about rooting a model. They show that watermarks in training data are recognizable in output and allow for intellectual property claims |
Free Fine-tuning: A Plug-and-Play Watermarking Scheme for Deep Neural Networks | ACM MM | 2023 | Did not look for it yet | Arxiv | - Not about generative models, but discriminative ones |
Adversarial Attack for Robust Watermark Protection Against Inpainting-based and Blind Watermark Removers | ACM MM | 2023 | Did not look for it yet | - | - Post-hoc watermark with enhanced robustness against inpainting |
A Novel Deep Video Watermarking Framework with Enhanced Robustness to H.264/AVC Compression | ACM MM | 2023 | Github | - | - Post-hoc watermark for videos |
Practical Deep Dispersed Watermarking with Synchronization and Fusion | ACM MM | 2023 | Did not look for it yet | Arxiv | - Post-hoc watermark for images with enhanced robustness to transformations |
Generalizable Synthetic Image Detection via Language-guided Contrastive Learning | - | 2023 | Github | Arxiv | - Is not about rooting, but GenAI image detection |
Enhancing the Robustness of Deep Learning Based Fingerprinting to Improve Deepfake Attribution | ACM MM-Asia | 2022 | - | - | - Is not about rooting, but transformation-robustness strategies for watermarks |
You are caught stealing my winning lottery ticket! Making a lottery ticket claim its ownership | NeurIPS | 2021 | Github | Arxiv | - Watermarking the sparsity mask of winning lottery tickets |
Self-Consuming Generative Models Go MAD | ICLR (Poster) | 2024 | - | Arxiv | - Contains a reason why GenAI detection is important: Removing generated content from training sets |
Paper | Proceedings / Journal | Venue Year / Last Updated | Code | Alternative PDF Source | Notes |
---|---|---|---|---|---|
Proactive Detection of Voice Cloning with Localized Watermarking | - | 2024 | Github | Arxiv | - Meta/FAIR author |
MaskMark: Robust Neural Watermarking for Real and Synthetic Speech | ICASSP | 2024 | Audio samples | IEEExplore | - |
Collaborative Watermarking for Adversarial Speech Synthesis | ICASSP | 2024 | - | Arxiv | - Meta/FAIR author |
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis | NeurIPS | 2020 | Github | Arxiv | - Very good GAN for speech synthesis (TODO: is this SotA?) - Can do live synthesis even on a CPU - Quality on par with autoregressive models |
Spoofed Training Data for Speech Spoofing Countermeasure Can Be Efficiently Created Using Neural Vocoders | ICASSP | 2023 | - | Arxiv | - Includes vocoder-generated training data to enhance the detection capabilities of countermeasures |
AudioQR: Deep Neural Audio Watermarks For QR Code | IJCAI | 2023 | Github | - | - Imperceptible QR-codes in audio for the visually impaired |
Paper | Proceedings / Journal | Venue Year / Last Updated | Code | Alternative PDF Source | Notes |
---|---|---|---|---|---|
ASVspoof 2021 Challenge | - | 2021 | Github | Arxiv | - Challenge for audio spoofing detection |
ADD 2022: the first Audio Deep Synthesis Detection Challenge | ICASSP | 2022 | Github | Arxiv | - Official Chinese challenge website (NO HTTPS!) |
Paper | Proceedings / Journal | Venue Year / Last Updated | Code | Alternative PDF Source | Notes |
---|---|---|---|---|---|
Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models | - | 2023 | Github | Arxiv | - |
Adversarial Watermarking Transformer: Towards Tracing Text Provenance with Data Hiding | S&P | 2021 | Github | Arxiv | - |
Resilient Watermarking for LLM-Generated Codes | - | 2024 | Github Appendix | Arxiv | - Code |
Provably Robust Multi-bit Watermarking for AI-generated Text via Error Correction Code | - | 2024 | - | Arxiv | - Error correction |
Provable Robust Watermarking for AI-Generated Text | ICLR | 2024 | Github | Arxiv | - Apparently good and robust LLM Watermarking |
Towards Codable Watermarking for Injecting Multi-Bits Information to LLMs | ICLR | 2024 | Github | Arxiv | - TODO |
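Several of the LLM watermarking papers above build on green-list/red-list schemes, in which the previous token pseudo-randomly partitions the vocabulary and generation is biased toward the "green" subset; detection is then a one-sided z-test on the green-token fraction. A minimal sketch of that statistical test (toy vocabulary size and parameters of my own choosing, not any specific paper's implementation):

```python
import math
import random

def green_list(prev_token: int, vocab_size: int, gamma: float = 0.5) -> set:
    # Pseudo-randomly select a fraction gamma of the vocabulary,
    # seeded by the previous token.
    rng = random.Random(prev_token)
    ids = list(range(vocab_size))
    rng.shuffle(ids)
    return set(ids[: int(gamma * vocab_size)])

def z_score(tokens: list, vocab_size: int, gamma: float = 0.5) -> float:
    # One-sided z-test: how far the observed green-token count lies
    # above the gamma * n mean expected for unwatermarked text.
    hits = sum(cur in green_list(prev, vocab_size, gamma)
               for prev, cur in zip(tokens, tokens[1:]))
    n = len(tokens) - 1
    return (hits - gamma * n) / math.sqrt(n * gamma * (1 - gamma))
```

A watermarked sampler that only (or preferentially) emits green tokens pushes the z-score far above what unwatermarked text achieves, so a fixed threshold gives a provable false-positive bound.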
Paper | Proceedings / Journal | Venue Year / Last Updated | Code | Alternative PDF Source | Notes |
---|---|---|---|---|---|
Stealing Machine Learning Models: Attacks and Countermeasures for Generative Adversarial Networks | ACSAC | 2021 | - | Arxiv | - |
Model Extraction Attack and Defense on Deep Generative Models | Journal of Physics | 2022 | - | - | - |
Model Extraction and Defenses on Generative Adversarial Networks | - | 2021 | - | Arxiv | - |
Paper | Proceedings / Journal | Venue Year / Last Updated | Code | Alternative PDF Source | Notes |
---|---|---|---|---|---|
A Comprehensive Survey on Robust Image Watermarking | Neurocomputing | 2022 | - | Arxiv | - Not about model rooting |
A Systematic Review on Model Watermarking for Neural Networks | Frontiers in Big Data | 2021 | - | Arxiv | - Not about model rooting |
A Comprehensive Review on Digital Image Watermarking | - | 2022 | - | Arxiv | - Not about model rooting |
Copyright Protection in Generative AI: A Technical Perspective | - | 2024 | - | Arxiv | - About IP protection in GenAI in general |
Security and Privacy on Generative Data in AIGC: A Survey | - | 2023 | - | Arxiv | - About security aspects in GenAI in general |
Detecting Multimedia Generated by Large AI Models: A Survey | - | 2024 | - | Arxiv | - About detecting GenAI in general |
Audio Deepfake Detection: A Survey | - | 2023 | - | Arxiv | - Contains an overview of spoofed-audio datasets, spoofing methods, and detection methods - Very good survey |
Summary of the systematization given in this review.
Goal | Explanation | Motivation |
---|---|---|
Fidelity | High prediction quality on original tasks | model performance shouldn't significantly degrade |
Robustness | Watermark should resist removal | protects against copyright evasion |
Reliability | Minimal false negatives | ensures rightful ownership is recognized |
Integrity | Minimal false positives | prevents wrongful accusations of theft |
Capacity | Supports large information amounts | allows comprehensive watermarks |
Secrecy | Watermark must be secret and undetectable | prevents unauthorized detection |
Efficiency | Fast watermark insertion and verification | avoids computational burden |
Generality | Independent of datasets and ML algorithms | facilitates widespread application |
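The reliability and integrity goals above reduce to a hypothesis test on the extracted payload. As a rough illustration of my own (not from the review): if an extractor matches k of n payload bits, the chance that an unwatermarked asset does at least as well under a fair-coin null bounds the false-positive rate:

```python
from math import comb

def fp_pvalue(matches: int, n_bits: int) -> float:
    # P[at least `matches` of `n_bits` random bits agree] under the null
    # hypothesis that each extracted bit is an independent fair coin flip.
    return sum(comb(n_bits, k) for k in range(matches, n_bits + 1)) / 2 ** n_bits

# A fully matched 48-bit payload gives a false-positive rate of 2**-48.
```

Detectors then declare a watermark present only when this p-value falls below a preset threshold, trading integrity (false positives) against reliability (false negatives).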