Natural Language Processing

Computational linguistics and NLP methods for analyzing, understanding, and detecting false claims in text.

Key Papers

Efficient Estimation of Word Representations in Vector Space: Introduces two efficient architectures, CBOW and Skip-gram, for learning word representations at scale; foundational to modern NLP pipelines and widely used as a feature extractor in fake news detection systems.
Distributed Representations of Words and Phrases and their Compositionality: Extends Skip-gram with phrase handling, negative sampling, and subsampling of frequent words; demonstrates that word vectors have compositional structure, which enables semantic arithmetic; reports 72% accuracy on a phrase analogy task.
Misinformation Detection on YouTube Using Video Captions: Applies pre-trained word embeddings (GloVe, Word2Vec) to YouTube video captions for misinformation classification; reports F1-scores of 0.92–0.95 for binary classification and 0.85–0.90 for three-class classification, outperforming metadata-only baselines.
Automated identification of media bias in news articles: an interdisciplinary literature review: Maps manual media bias analysis methods from the social sciences onto computational NLP approaches; identifies opportunities for applying NLP to bias arising from event selection, source selection, labeling, and other forms.
A Survey on Natural Language Processing for Fake News Detection (Oshikawa, Qian, & Wang, 2020): Comprehensive survey that systematically compares NLP task formulations, datasets, and methods (preprocessing, machine learning models, rhetorical approaches, evidence collection); finds that attention-based LSTM models outperform hand-crafted linguistic features, though metadata remains critical.
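
Two ideas recur in the papers above: semantic arithmetic on word vectors, and pooling pre-trained embeddings into a fixed-length feature vector for a classifier. The sketch below illustrates both using only the standard library. It is not taken from any of the listed papers: the vectors in `TOY_VECTORS` are invented 3-dimensional values chosen for illustration, and the helper names `cosine`, `analogy`, and `mean_embedding` are hypothetical. Mean pooling is one common way to turn embeddings into features and may differ from what the cited work does.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only. Real Word2Vec
# or GloVe vectors are learned from a corpus and are much higher-dimensional.
TOY_VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.5, 0.5, 0.5],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def analogy(a, b, c, vectors):
    """Answer 'a is to b as c is to ?' via vec(b) - vec(a) + vec(c).

    The three query words are excluded from the candidates.
    """
    target = [y - x + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


def mean_embedding(tokens, vectors):
    """Average the vectors of known tokens into one feature vector.

    Out-of-vocabulary tokens are skipped; if none are known, a zero
    vector of the embedding dimension is returned.
    """
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        dim = len(next(iter(vectors.values())))
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


if __name__ == "__main__":
    print(analogy("man", "king", "woman", TOY_VECTORS))
    print(mean_embedding(["man", "woman", "unseen"], TOY_VECTORS))
```

With these toy values, `analogy("man", "king", "woman", TOY_VECTORS)` selects `queen`. In a real pipeline, the output of something like `mean_embedding` over a caption's tokens would be passed to a downstream classifier, with the vectors loaded from a pre-trained model rather than written by hand.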