Face Reenactment
Face reenactment refers to techniques that transfer facial expressions, speech movements, head pose, or gaze from a source face to a target face while preserving the target person's identity. Unlike face-swap deepfakes, which replace the target's facial identity, reenactment modifies only the dynamic aspects of a face (expressions, mouth movements, and eye gaze) and leaves identity features intact.
Key techniques
Face2Face: Real-time photorealistic facial reenactment introduced by Thies et al. (2016). It fits a parametric 3D face model to video of both the source actor and the target person, then applies the source actor's expression parameters to the target and re-renders the target face. Because geometry, texture, and lighting are reconstructed from the target footage itself, the re-rendered face blends into the original frame and subtle expression changes carry over. Results can appear highly realistic and are often difficult for humans to detect.
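The core idea of a parametric face model can be illustrated with a toy linear model in which identity and expression are separate coefficient vectors. The sketch below is an assumption-laden simplification: `FaceModel`, `transfer_expression`, and the tiny bases are hypothetical names and data, and Face2Face itself estimates its parameters by fitting to video and uses a more careful transfer and re-rendering step than a direct coefficient copy.

```python
from dataclasses import dataclass
from typing import List

Vector = List[float]


@dataclass
class FaceModel:
    """Toy linear face model (hypothetical, for illustration only).

    geometry = mean + sum(alpha_k * identity_basis[k])
                    + sum(delta_k * expression_basis[k])
    """
    mean: Vector
    identity_basis: List[Vector]    # one basis vector per identity coefficient
    expression_basis: List[Vector]  # one basis vector per expression coefficient

    def reconstruct(self, alpha: Vector, delta: Vector) -> Vector:
        """Return the geometry for identity coefficients alpha and expression coefficients delta."""
        out = list(self.mean)
        for coeff, basis in zip(alpha, self.identity_basis):
            for i, value in enumerate(basis):
                out[i] += coeff * value
        for coeff, basis in zip(delta, self.expression_basis):
            for i, value in enumerate(basis):
                out[i] += coeff * value
        return out


def transfer_expression(model: FaceModel, target_alpha: Vector, source_delta: Vector) -> Vector:
    """Keep the target's identity coefficients, use the source's expression coefficients."""
    return model.reconstruct(target_alpha, source_delta)


if __name__ == "__main__":
    # Assumed toy data: 3 geometry values, 1 identity direction, 2 expression directions.
    model = FaceModel(
        mean=[0.0, 0.0, 0.0],
        identity_basis=[[1.0, 0.0, 0.0]],
        expression_basis=[[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    )
    print(transfer_expression(model, target_alpha=[2.0], source_delta=[0.5, -1.0]))
```

Separating the two coefficient sets is what lets reenactment change expression without touching identity: only `delta` comes from the source actor.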
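NeuralTextures: A neural rendering approach (Thies et al., 2019) that learns feature maps ("neural textures") representing facial appearance, together with a rendering network that turns sampled features into an image. The texture is sampled through the texture coordinates of a tracked 3D face, so expressions and poses are transferred by changing the tracked geometry while the learned texture preserves identity. It is generally considered more robust to extreme poses and expressions than Face2Face, although results still depend on the quality of the face tracking.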
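To make the notion of sampling a feature map through texture coordinates concrete, here is a minimal sketch of a bilinear lookup into a multi-channel texture. The function name `sample_bilinear` and the nested-list layout are assumptions for illustration; an actual neural rendering system learns the texture values and feeds the sampled features to a trained rendering network, neither of which is shown here.

```python
from typing import List

# Assumed layout: texture[row][column][channel]
Texture = List[List[List[float]]]


def sample_bilinear(texture: Texture, u: float, v: float) -> List[float]:
    """Sample a feature vector at normalized coordinates (u, v) in [0, 1].

    u runs along the width, v along the height. Coordinates outside
    [0, 1] are clamped to the texture border.
    """
    height, width = len(texture), len(texture[0])
    channels = len(texture[0][0])

    # Map normalized coordinates to continuous texel positions.
    x = min(max(u, 0.0), 1.0) * (width - 1)
    y = min(max(v, 0.0), 1.0) * (height - 1)

    # Integer corners surrounding the sample point.
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0

    features = []
    for c in range(channels):
        top = texture[y0][x0][c] * (1 - fx) + texture[y0][x1][c] * fx
        bottom = texture[y1][x0][c] * (1 - fx) + texture[y1][x1][c] * fx
        features.append(top * (1 - fy) + bottom * fy)
    return features


if __name__ == "__main__":
    # Assumed toy texture: 2x2 texels, 1 feature channel.
    toy_texture = [[[0.0], [1.0]], [[2.0], [3.0]]]
    print(sample_bilinear(toy_texture, 0.5, 0.5))
```

Changing the expression changes which `(u, v)` each pixel looks up, not the stored features, which is why the appearance encoded in the texture stays tied to the target identity.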
Audio-driven synthesis: Methods that generate facial animations synchronized to speech or music, particularly useful for virtual avatars and video dubbing.
Why face reenactment is harder to detect
Expression reenactment leaves the target's identity untouched, so the cues people normally rely on to verify who they are looking at remain intact and give no warning of manipulation. Detection methods must instead find subtle inconsistencies in expression flow, eye blinks, or lighting rather than obvious artifacts. Temporal inconsistencies (jerky movements, unnatural transitions) tend to be more reliable signals than frame-level features, but analyzing temporal dynamics requires processing sequences of frames and is computationally more expensive.
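As a rough illustration of a temporal signal, the sketch below scores how jerky a per-frame measurement is by averaging the absolute second difference across frames. The function `temporal_jerkiness` and the choice of input (for example, a mouth-opening distance derived from facial landmarks) are assumptions for illustration, not a published detector; real systems typically learn temporal features rather than hand-crafting them.

```python
from typing import Sequence


def temporal_jerkiness(signal: Sequence[float]) -> float:
    """Mean absolute second difference of a per-frame signal.

    A smoothly changing signal scores near zero. Abrupt
    frame-to-frame reversals produce a large score.
    """
    if len(signal) < 3:
        raise ValueError("need at least three frames")
    second_diffs = [
        abs(signal[i + 1] - 2 * signal[i] + signal[i - 1])
        for i in range(1, len(signal) - 1)
    ]
    return sum(second_diffs) / len(second_diffs)


if __name__ == "__main__":
    # Assumed toy signals, one value per frame.
    smooth = [0.0, 1.0, 2.0, 3.0, 4.0]
    jerky = [0.0, 3.0, 0.0, 3.0, 0.0]
    print(temporal_jerkiness(smooth), temporal_jerkiness(jerky))
```

The score needs at least three consecutive frames per value, which hints at the cost: temporal checks cannot be run on isolated frames.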
Detection and defenses
- MesoNet: a Compact Facial Video Forgery Detection Network (Afchar et al., 2018) reports a detection rate of about 95% on Face2Face videos under realistic compression
- Detectors typically look for blurring in the face region or subtle texture artifacts introduced by expression transfer
- Video compression significantly degrades detection performance; heavy compression can reduce accuracy by 10-15%
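Frame-level detectors of the kind listed above produce one score per face crop, so a video-level decision needs some aggregation step. The following is a minimal sketch of one common choice, averaging per-frame scores and thresholding the result. The function `classify_video`, the 0.5 threshold, and the example scores are assumptions for illustration; the scores would come from whatever frame classifier is in use.

```python
from statistics import mean
from typing import Sequence, Tuple


def classify_video(frame_scores: Sequence[float], threshold: float = 0.5) -> Tuple[float, bool]:
    """Aggregate per-frame manipulation scores into a video-level decision.

    frame_scores: assumed probabilities in [0, 1] that each frame is manipulated.
    Returns (video_score, is_flagged).
    """
    if not frame_scores:
        raise ValueError("no frame scores provided")
    video_score = mean(frame_scores)
    return video_score, video_score >= threshold


if __name__ == "__main__":
    # Assumed scores from a hypothetical frame-level classifier.
    score, flagged = classify_video([0.9, 0.8, 0.7, 0.2])
    print(f"video score={score:.2f}, flagged={flagged}")
```

Averaging dampens the effect of a few frames where compression has washed out the artifacts, at the price of missing videos in which only a short segment is manipulated.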
Related concepts
- Deepfakes — identity swap variant of facial manipulation
- Facial manipulation detection — general methods for detecting any facial manipulation
- Synthetic media — broader category of AI-generated content