Multimodal Detection

Methods and systems that jointly analyze multiple modalities (text, images, audio, video) to improve detection accuracy for tasks such as fake news, deception, deepfake, and misinformation detection. Multimodal approaches build on the observation that false or manipulated content often leaves traces in several modalities at once: visual inconsistencies, linguistic patterns, and audio anomalies that may each be subtle on their own but together provide a strong signal.
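
The idea that individually weak cues can add up to a strong signal can be illustrated with a minimal late-fusion sketch. It is not taken from any of the papers listed on this page: the function name `fuse_scores`, the modality names, and the example probabilities are all hypothetical, and the fusion rule assumes that each modality's detector outputs a calibrated probability and that the modalities are conditionally independent, which real systems rarely satisfy exactly.

```python
import math
from typing import Mapping


def logit(p: float) -> float:
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse_scores(scores: Mapping[str, float], prior: float = 0.5) -> float:
    """Naive-Bayes-style late fusion of per-modality probabilities.

    Hypothetical helper: each value in `scores` is assumed to be a
    calibrated probability that the content is fake, produced by an
    independent detector for that modality.
    """
    prior_logit = logit(prior)
    total = prior_logit
    for p in scores.values():
        # Clamp to avoid infinite log-odds at exactly 0 or 1.
        p = min(max(p, 1e-6), 1.0 - 1e-6)
        # Add the evidence this modality contributes beyond the prior.
        total += logit(p) - prior_logit
    return sigmoid(total)


# Illustrative (made-up) detector outputs: each one is only mildly suspicious.
scores = {"text": 0.65, "image": 0.70, "audio": 0.60}
fused = fuse_scores(scores)
print(f"fused probability: {fused:.3f}")  # about 0.867, above every single score
```

Under these assumptions the fused score is higher than any single modality's score, because agreeing evidence accumulates in log-odds space. Learned fusion (for example, concatenating modality features before a classifier) is a common alternative when the independence assumption does not hold.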

Key Papers

A Deep Learning Approach for Multimodal Deception Detection - Combines a 3D-CNN on video, audio features, a CNN on transcripts, and micro-expression features for courtroom deception detection
Wang et al. (2018), EANN: Event Adversarial Neural Networks for Multi-Modal Fake News Detection - Combines text and image features with an adversarial event discriminator to learn event-invariant representations for fake news detection on social media
Detecting GAN-generated Imagery using Color Cues - Uses color-based forensic cues to distinguish GAN-generated images from real camera images

Related topics

Deception Detection (video-based deception cues)
Deepfakes and synthetic media (multimodal manipulation)
Fake news detection (combining text and images)