Detecting GAN-generated Imagery using Color Cues

Authors: Scott McCloskey, Michael Albright
Affiliation: Honeywell ACST
Venue: arXiv, 2018 (arXiv:1812.08247)

TL;DR

This paper identifies two forensic cues for distinguishing GAN-generated images from real camera imagery, both derived from how generator networks form color. The first cue exploits the fact that the generator's learned color-channel weights overlap heavily, unlike a camera's spectral sensitivities; the second leverages the fact that generator normalization suppresses saturated pixels. The saturation-based method achieves 0.70 AUC on fully synthetic images and 0.61 AUC on face-swapped images, while the color-based method performs only slightly above chance (0.56 and 0.54 AUC).

Contributions

- Analysis of GAN generator architecture, focusing on the color formation layer that collapses K > 3 depth channels to RGB
- Identification of two architectural properties that differ from real cameras:
  - Generator color weights overlap and include negative values, unlike camera spectral response functions
  - Normalization steps (pixel-wise or layer-wise) constrain output values to a uniform distribution, preventing the saturation and under-exposure regions present in natural images
- Two forensic detection methods based on these cues: color chromaticity histograms (r-g space) and saturation frequency features (over- and under-exposed pixel counts)
- Experimental evaluation on NIST Media Forensics Challenge 2018 datasets

Method

The paper analyzes two stages of GAN generation:

Color image formation: The generator's final layer converts K > 3 depth feature maps to RGB via a 1×1 convolution (or similar). This is conceptually similar to a camera's Bayer color filter array collapsing spectral information to RGB. However, the learned weights differ fundamentally from spectral response functions: they have negative values, exhibit high overlap across channels, and share common peaks. Real camera spectral sensitivities are non-negative, have limited overlap, and peak at different wavelengths.
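The color formation step described above can be sketched as a per-pixel weighted sum over depth. This is a minimal illustration, not the paper's code: the function name `collapse_to_rgb`, the nested-list layout, and the toy weights are all assumptions chosen for readability, and a real generator would use learned weights and an array library.

```python
def collapse_to_rgb(features, weights, bias=(0.0, 0.0, 0.0)):
    """Apply a 1x1 convolution: each output value is a weighted sum over depth.

    features: H x W x K nested lists of floats (hypothetical layout).
    weights:  3 x K nested lists, one row of K weights per output channel.
    """
    out = []
    for row in features:
        out_row = []
        for pixel in row:
            rgb = [
                sum(w * x for w, x in zip(channel_weights, pixel)) + b
                for channel_weights, b in zip(weights, bias)
            ]
            out_row.append(rgb)
        out.append(out_row)
    return out


# Toy example with K = 4 depth channels and made-up weights.
# The rows contain negative entries and all three share a large weight on
# depth channel 0, the two properties contrasted with camera responses.
toy_weights = [
    [0.9, -0.2, 0.1, 0.0],   # R
    [0.8, 0.3, -0.4, 0.1],   # G
    [0.7, -0.1, 0.2, -0.3],  # B
]
toy_features = [[[1.0, 0.5, -0.5, 0.25]]]  # a 1 x 1 "image" with 4 depth values
rgb = collapse_to_rgb(toy_features, toy_weights)
# rgb[0][0] is approximately [0.75, 1.175, 0.475]
```

Because every output channel draws on the same depth features with overlapping weights, the three channels are more strongly coupled than in a camera, where each color filter responds mainly to its own wavelength band.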

Normalization effects: Both pixel-wise normalization (dividing by the vector magnitude) and layer-wise normalization (standardizing each depth plane) constrain feature values to a fixed range before RGB conversion. This produces well-behaved, uniformly distributed output intensities. Real cameras, by contrast, receive irradiance with a logarithmic distribution, leading to regions of saturation (over-exposure) and under-exposure.
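To make the normalization argument concrete, here is a small sketch of pixel-wise normalization as described above (dividing each pixel's depth vector by its magnitude). The function name, the `eps` stabilizer, and the toy values are assumptions; Progressive GAN's own variant divides by the root-mean-square instead, which differs only by a constant factor.

```python
import math


def pixelwise_normalize(features, eps=1e-8):
    """Scale each pixel's depth vector to (approximately) unit length.

    features: H x W x K nested lists of floats (hypothetical layout).
    eps avoids division by zero for all-zero pixels (an assumption).
    """
    out = []
    for row in features:
        out_row = []
        for pixel in row:
            magnitude = math.sqrt(sum(x * x for x in pixel) + eps)
            out_row.append([x / magnitude for x in pixel])
        out.append(out_row)
    return out


# Two pixels whose features differ by a factor of 10 in overall strength.
features = [[[3.0, 4.0], [30.0, 40.0]]]
normalized = pixelwise_normalize(features)
# Both pixels map to roughly [0.6, 0.8]: the 10x difference is removed.
```

The example shows why extreme values are rare after this step: overall feature strength, the analogue of very bright or very dark scene content, is divided out before the RGB conversion.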

Two detection pipelines follow:

- Color forensics: Compute r-g chromaticity coordinates for each pixel and build a 2D histogram. Use a pre-trained Intensity Noise Histogram (INH) network (from prior work) fine-tuned on chromaticity histograms to classify as GAN or camera.
- Saturation forensics: Count pixels with intensity ≥ τ_o ∈ {240, 245, 250, 255} (over-exposed) and ≤ τ_u ∈ {0, 5, 10, 15} (under-exposed). Train a linear SVM on these eight feature counts.
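The feature extraction for both pipelines can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: the function names, the bin count of 8, the flat list of 8-bit intensities, and the choice to return fractions instead of raw counts (so images of different sizes are comparable) are all hypothetical. The classifiers (INH network, linear SVM) are not shown.

```python
OVER_THRESHOLDS = (240, 245, 250, 255)   # tau_o values from the text
UNDER_THRESHOLDS = (0, 5, 10, 15)        # tau_u values from the text


def rg_chromaticity_histogram(pixels, bins=8):
    """2D histogram over r = R/(R+G+B) and g = G/(R+G+B).

    pixels: iterable of (R, G, B) tuples. bins=8 is an arbitrary choice.
    """
    hist = [[0] * bins for _ in range(bins)]
    for r, g, b in pixels:
        total = r + g + b
        if total == 0:
            continue  # chromaticity is undefined for pure black
        r_idx = min(int(r / total * bins), bins - 1)
        g_idx = min(int(g / total * bins), bins - 1)
        hist[r_idx][g_idx] += 1
    return hist


def saturation_features(intensities):
    """Eight features: fraction of values at or beyond each threshold.

    intensities: non-empty flat list of 8-bit values (assumed layout).
    """
    n = len(intensities)
    over = [sum(v >= t for v in intensities) / n for t in OVER_THRESHOLDS]
    under = [sum(v <= t for v in intensities) / n for t in UNDER_THRESHOLDS]
    return over + under


hist = rg_chromaticity_histogram(
    [(255, 0, 0), (0, 0, 0), (100, 100, 100), (250, 250, 250)]
)
features = saturation_features([0, 3, 12, 128, 241, 255, 255, 200])
# features == [0.375, 0.25, 0.25, 0.25, 0.125, 0.25, 0.25, 0.375]
```

In the toy histogram the two gray pixels fall in the same bin, since chromaticity discards brightness; only the color balance of each pixel is kept.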

Results

Saturation-based detection:
- GAN Crop (fully synthetic regions): 0.70 AUC
- GAN Full (camera images with synthetic face replacements): 0.61 AUC
- Over-exposure features alone outperform combined over- and under-exposure features (0.70 vs. 0.67 AUC on Crop)

Color image forensics:
- GAN Crop: 0.56 AUC
- GAN Full: 0.54 AUC
- Performance is little better than random, likely because the pre-trained network expects different image statistics or because the evaluation set contains re-touched camera images

Training data: 1,387 GAN images from 30 LSUN categories (progressive GAN) + ImageNet camera images.
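For reading the AUC figures in this section: AUC is the probability that a randomly chosen GAN image receives a higher detector score than a randomly chosen camera image, so 0.5 is chance and 1.0 is perfect separation. A minimal sketch of that computation follows; the function name, labels, and scores are invented for illustration and are not data from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for GAN (positive), 0 for camera (negative).
    scores: detector outputs, higher meaning "more likely GAN".
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for three GAN images and two camera images.
toy_labels = [1, 1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.7, 0.5, 0.2]
auc = roc_auc(toy_labels, toy_scores)  # 5 of 6 pairs ranked correctly

# A detector that gives every image the same score is at chance level.
chance = roc_auc(toy_labels, [0.5] * 5)
```

The pairwise form is quadratic in the number of images, which is fine for a sanity check; a rank-based or library implementation would be the usual choice at evaluation scale.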

Connections

- Complementary to deepfake detection methods that target semantic artifacts (eye blinking, mismatched eye colors)
- Related to image forensics via statistical analysis of image formation; contrasts with learning-based forensic detectors that risk being circumvented by adversarial GAN fine-tuning
- Addresses synthetic media detection in the context of online disinformation

Notes

Strengths: The architectural analysis is insightful, and the contrast between generator color weights and camera spectral sensitivities is well illustrated. The saturation method is simple, interpretable, and achieves reasonable performance on fully synthetic images. The focus on architectural properties common across GANs (rather than image-specific artifacts) could generalize better as generator designs evolve.

Limitations: Color forensics underperformed, likely because the pre-trained network expects different image statistics or because of re-touched camera images in the evaluation set. The saturation signal is diluted on partially GAN-modified images (GAN Full), where the synthetic regions are small. The paper does not discuss robustness to image compression or other post-processing. Both methods are trained on images from a single GAN architecture (Progressive GAN) and may not transfer to other generative models.

Future work: The authors acknowledge the rapid pace of GAN innovation and the risk of their forensic cues becoming obsolete as architectures evolve. Re-training the full color-forensics network with more data, testing on other GAN variants, and examining robustness under compression are natural next steps.