Multimodal Detection
Methods and systems that jointly analyze multiple modalities (text, images, audio, video) to improve detection accuracy for tasks such as fake news, deception, deepfake, and misinformation detection. Multimodal approaches build on the observation that false or manipulated content often leaves traces in several modalities at once: visual inconsistencies, linguistic patterns, and audio anomalies that may each be subtle on their own but together provide a strong signal.
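As a rough illustration of how individually weak cues can add up, the sketch below performs a simple late fusion of per-modality scores by summing their log-odds. This is not the method of any paper listed on this page. The function name `fuse_scores`, the modality names, and the example probabilities are all hypothetical, and the approach assumes the modalities are conditionally independent given the label, which rarely holds exactly in practice.

```python
import math

def logit(p, eps=1e-6):
    """Log-odds of a probability, clamped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def fuse_scores(scores, prior=0.5):
    """Late fusion of per-modality probabilities (hypothetical helper).

    Assumes each modality's detector outputs P(fake | modality) and that
    the modalities are conditionally independent given the label, so the
    evidence from each one adds in log-odds space.
    """
    prior_logit = logit(prior)
    total = prior_logit + sum(logit(p) - prior_logit for p in scores.values())
    return 1.0 / (1.0 + math.exp(-total))

# Illustrative, made-up detector outputs: each is only mildly suspicious.
scores = {"text": 0.65, "image": 0.70, "audio": 0.60}
fused = fuse_scores(scores)
print(f"fused probability: {fused:.3f}")  # about 0.867
```

With these made-up inputs, no single modality exceeds 0.70, yet the fused score is about 0.87. Learned fusion models, such as those in the papers below, replace this fixed rule with trained feature-level or decision-level combination.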
Key Papers
- A Deep Learning Approach for Multimodal Deception Detection. Combines 3D-CNN video features, audio features, CNN features from transcripts, and micro-expressions for deception detection in courtroom videos
- Wang et al. (2018), EANN: Event Adversarial Neural Networks for Multi-Modal Fake News Detection. Combines text and image features and uses an adversarial event discriminator to learn event-invariant representations for fake news detection on social media
- Detecting GAN-generated Imagery using Color Cues. Detects GAN-generated images from color cues that distinguish generator output from camera imagery (a detector for GAN output, not itself a GAN-based method)
Related Topics
- Deception Detection (video-based deception cues)
- Deepfakes and synthetic media (multimodal manipulation)
- Fake news detection (combining text and images)