A Survey on Stance Detection for Mis- and Disinformation Identification

Authors: Momchil Hardalov, Arnav Arora, Preslav Nakov, Isabelle Augenstein
Venue: arXiv, 2021 (arxiv:2103.00242)

TL;DR

A comprehensive survey of stance detection for misinformation and disinformation identification. Stance detection, i.e. determining whether a text supports, denies, questions, or merely comments on a claim, plays a dual role: it can serve as a standalone fact-checking tool or as one component of a broader verification pipeline. The paper reviews task formulations, datasets, methods, and applications across multiple languages and platforms.

Contributions

Systematic review of stance detection literature with explicit focus on applications to misinformation and disinformation
Taxonomy of stance detection formulations: standalone fact-checking, component in multi-stage pipelines, and application to rumour verification
Comprehensive catalog of existing stance detection datasets in English and non-English languages, with source, target, context, and evidence characteristics
Overview of approaches from early lexical and feature-based methods to modern neural and pre-trained language models (BERT, RoBERTa, GPT)
Discussion of stance detection challenges specific to mis- and disinformation: class imbalance, implicit stance, and cross-platform variation

Method

The survey organizes stance detection literature along multiple dimensions:

Stance formulations: (i) Direct fact-checking, where the stance of the author toward the document is taken as the veracity label; (ii) A component task within fact-checking pipelines that also require evidence retrieval and justification; (iii) Rumour stance detection in social media threads
Stance definitions: From Pang & Lee's (2007) speaker-standpoint definition to Küçük & Can's (2020) classification-task framing with support/deny/question/comment/neutral categories
Datasets: Overview of 15+ English datasets (including Rumour Has It, PHEME, Emergent, FNC-1, RumourEval, FEVER, Snopes, TabFact) and non-English resources (Arabic FC, DART, AraStance)
Approaches: Feature engineering (lexical, topic models, graph features) → LSTM/CNN → BERT-based fine-tuning → cross-lingual transfer and pre-training strategies (RoBERTa fine-tuning, pattern-exploiting training, adversarial robustness)
Evidence handling: Methods for single vs. multiple evidence documents, retrieval-then-classification pipelines (FNC-1, FEVER)
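The retrieval-then-classification pattern mentioned above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names (`retrieve`, `classify_stance`, `verify`), the bag-of-words retriever, and the cue-word lists are hypothetical stand-ins, not the systems reviewed in the survey, which would use a trained retriever and a fine-tuned classifier instead.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(claim, documents, k=2):
    """Step 1: rank documents by lexical similarity to the claim, keep the top k."""
    claim_vec = Counter(tokenize(claim))
    scored = [(cosine(claim_vec, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


# Hypothetical cue words; a real system would learn these signals from data.
DENY_CUES = {"false", "hoax", "debunked", "fake", "denied"}
QUESTION_CUES = {"unclear", "unconfirmed", "allegedly"}
SUPPORT_CUES = {"confirmed", "confirms", "verified"}


def classify_stance(evidence):
    """Step 2: toy rule-based stance label for one evidence document.

    A real classifier would also condition on the claim text.
    """
    tokens = set(tokenize(evidence))
    if tokens & DENY_CUES:
        return "deny"
    if tokens & QUESTION_CUES:
        return "question"
    if tokens & SUPPORT_CUES:
        return "support"
    return "comment"


def verify(claim, documents, k=2):
    """Run retrieval, then label each retrieved document's stance."""
    return [(doc, classify_stance(doc)) for doc in retrieve(claim, documents, k)]


if __name__ == "__main__":
    docs = [
        "Officials confirmed the city council banned plastic bags last week.",
        "The claim that the council banned plastic bags is a hoax.",
        "Local team wins the football championship.",
    ]
    for doc, stance in verify("The city council banned plastic bags", docs):
        print(stance, "|", doc)
```

Keeping the two stages separate mirrors the pipeline framing: the retriever decides which documents count as evidence, and the stance step only ever sees that shortlist.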

Results

The survey documents significant progress but persistent challenges:

Modern pre-trained models (BERT, RoBERTa, GPT) substantially improve over feature-based baselines on most benchmarks
Best published results on FEVER reach ~70 F1 (Zhou et al. 2020 with graph neural networks); on FNC-1, systems approach the upper bound through careful feature engineering
Transfer and zero-shot learning strategies work across languages; for example, Arabic stance models exceed 76 F1 on the ANS dataset using mBERT
Key remaining challenges: class imbalance (unrelated posts dominate), implicit stance (sarcasm, negation), cross-platform variation, need for multi-hop reasoning on multiple evidence documents
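To see why class imbalance matters for evaluation, the following sketch compares accuracy with macro-averaged F1 on made-up labels in which "unrelated" dominates. The label names and counts are invented for illustration and are not taken from any dataset in the survey; the helper functions are likewise hypothetical.

```python
def accuracy(gold, pred):
    """Fraction of examples where the predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def f1_for_label(gold, pred, label):
    """F1 score for a single class, treating it as the positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over the classes present in gold."""
    labels = sorted(set(gold))
    return sum(f1_for_label(gold, pred, label) for label in labels) / len(labels)


if __name__ == "__main__":
    # Toy data: 8 of 10 examples are "unrelated".
    gold = ["unrelated"] * 8 + ["agree", "disagree"]
    # A degenerate classifier that always predicts the majority class.
    pred = ["unrelated"] * 10
    print(f"accuracy: {accuracy(gold, pred):.3f}")
    print(f"macro-F1: {macro_f1(gold, pred):.3f}")
```

On this toy input the majority-class predictor looks strong on accuracy but weak on macro-F1, because the two minority classes each contribute an F1 of zero to the average.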

Connections

Complements A Survey on Multimodal Disinformation Detection on stance as a component of broader verification
Cited by and extends prior rumour-focused surveys including RumourEval 2019: Determining Rumour Veracity and Support for Rumours on shared task formulations
Related to Rumour Verification pipeline work via Propagation-based fake news detection
Overlaps with Fact-checking and corrections literature, particularly SemEval-2017 Task 8: RumourEval: Determining rumour veracity and support for rumours and The Fact Extraction and VERification (FEVER) Shared Task on task design

Notes

This is a foundational and comprehensive reference for anyone working in automated misinformation detection. Strengths: systematic taxonomy of formulations, breadth across task variants and languages, clear positioning of stance within broader verification pipelines. The paper captures the state of the art circa 2021 and reflects the NLP community's shift from feature engineering toward pre-trained contextualized embeddings. Useful for grounding design choices in new stance detection work and for understanding the historical context of the shift from FNC-1 to FEVER and beyond.