SwapMe and FaceSwap Dataset

A face tampering dataset introduced in the paper "Two-Stream Neural Networks for Tampered Face Detection". It was created with two face-swapping applications: one commercial (the SwapMe iOS app) and one open-source (FaceSwap).

Composition

Tampered images: 2010 total
  - 1005 images generated using the SwapMe iOS app
  - 1005 images generated using the FaceSwap open-source software

Authentic images: 2800 total
  - 1400 per subset (one subset per swapping application)

Total: 4810 images

Creation methodology

  1. Selected 1005 target-source face pairs
  2. Generated one tampered image per pair using each application
  3. Applied post-processing including:
     - Boundary blurring for seamless blending
     - Image resizing to match context
     - Color/illumination blending

Only high-quality results were retained to ensure realistic tampering.
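The exact post-processing operations are not specified here, so the following is only a generic sketch of two of the listed steps (boundary blurring and color/illumination blending) on toy grayscale grids. It is not the method used by SwapMe or FaceSwap. The function names `box_blur`, `match_mean`, and `feathered_blend` are hypothetical, a box filter stands in for whatever blur the applications apply, and the resizing step is omitted.

```python
def box_blur(mask, radius=1):
    """Box-blur a 2D list of floats, averaging only over in-bounds neighbours."""
    h, w = len(mask), len(mask[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += mask[yy][xx]
                        count += 1
            out[y][x] = total / count
    return out


def match_mean(source, target):
    """Shift source intensities so their mean equals the target's mean
    (a crude stand-in for illumination matching)."""
    src = [v for row in source for v in row]
    tgt = [v for row in target for v in row]
    shift = sum(tgt) / len(tgt) - sum(src) / len(src)
    return [[v + shift for v in row] for row in source]


def feathered_blend(target, source, mask, radius=1):
    """Alpha-composite source over target using a blurred (feathered) mask,
    so the pasted region fades out at its boundary instead of ending abruptly."""
    alpha = box_blur(mask, radius)
    return [
        [a * s + (1.0 - a) * t for a, s, t in zip(a_row, s_row, t_row)]
        for a_row, s_row, t_row in zip(alpha, source, target)
    ]


# Toy 5x5 example: paste a bright "face" region into a dark target image.
target = [[0.0] * 5 for _ in range(5)]
source = [[100.0] * 5 for _ in range(5)]
mask = [[1.0 if 1 <= y <= 3 and 1 <= x <= 3 else 0.0 for x in range(5)]
        for y in range(5)]
blended = feathered_blend(target, source, mask)
```

In this toy case the centre pixel keeps the full source intensity, while pixels near the mask boundary take intermediate values. Softening the hard seam in this way is part of what makes a swap harder to spot by eye.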

Dataset characteristics

Advantages:
  - Large scale: 2010 tampered images suitable for training deep learning methods
  - High visual quality: Tampering is realistic and difficult to detect through visual inspection alone
  - Diverse content: Images cover diverse events (holidays, sports, conferences) and identities (different ages, genders, races)
  - Two algorithms: Uses two different face-swapping techniques to avoid overfitting to algorithm-specific artifacts
  - Face-specific: Focuses on facial regions rather than general image tampering

Challenges:
  - Limited to face regions in images with clear faces
  - Only two swapping algorithms (may not generalize to other techniques)
  - Not publicly released (as of the paper's publication in 2018)

Evaluation protocol

The paper uses a cross-algorithm training and testing protocol:
  - Train on FaceSwap subset (705 tampered + 1400 authentic images)
  - Test on SwapMe subset (300 tampered + 300 authentic images)

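Because the tampered test images come from an algorithm never seen during training, a model that relies on FaceSwap-specific artifacts will score poorly. The protocol therefore measures generalization to a different face-swapping technique.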
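The cross-algorithm protocol above can be sketched as a simple filter over labelled records. This is a minimal illustration and makes several assumptions: the record schema (`algo`, `label`) and the function names are hypothetical, and the records are synthetic placeholders built only to match the counts stated in the protocol, since the dataset itself is not public.

```python
def cross_algorithm_split(records, train_algo, test_algo):
    """Put every record from one swapping algorithm's subset in train and
    every record from the other's in test, so no algorithm appears in both."""
    train = [r for r in records if r["algo"] == train_algo]
    test = [r for r in records if r["algo"] == test_algo]
    return train, test


def tampered_fraction(split):
    """Fraction of images in a split that are tampered (label 1)."""
    return sum(r["label"] for r in split) / len(split)


# Synthetic placeholder records matching the protocol's stated counts.
# label 1 = tampered, label 0 = authentic.
records = (
    [{"algo": "FaceSwap", "label": 1}] * 705
    + [{"algo": "FaceSwap", "label": 0}] * 1400
    + [{"algo": "SwapMe", "label": 1}] * 300
    + [{"algo": "SwapMe", "label": 0}] * 300
)

train_set, test_set = cross_algorithm_split(records, "FaceSwap", "SwapMe")
print(len(train_set), len(test_set))
print(round(tampered_fraction(train_set), 3), tampered_fraction(test_set))
```

With these counts the training split has 2105 images, about one third of them tampered, while the 600-image test split is balanced.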

Baseline results

Face-level AUC on SwapMe test set:
  - Two-stream network (proposed): 0.927
  - Face classification stream alone: 0.854
  - Patch triplet stream alone: 0.875
  - Steganalysis features + SVM: 0.794
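For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen tampered face receives a higher score than a randomly chosen authentic face, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The function name `roc_auc` and the example scores are made up for illustration. They are not taken from the paper, whose evaluation code is not described here.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for tampered, 0 for authentic.
    scores: higher means "more likely tampered".
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one tampered and one authentic sample")
    # Each (tampered, authentic) pair contributes 1 if ranked correctly,
    # 0.5 if tied, and 0 otherwise.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative scores only: one tampered face (0.4) is ranked below
# one authentic face (0.5), so 8 of the 9 pairs are ordered correctly.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

The pairwise form is quadratic in the number of samples, which is fine for a 600-image test set. Library implementations typically use a sort-based computation instead.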

Notes

The SwapMe/FaceSwap dataset addresses limitations of prior face tampering datasets by focusing specifically on faces and using realistic post-processing. However, the lack of public release limits its broader impact compared to datasets like FaceForensics++ and DFDC, which became standard benchmarks for the field.