Information Retrieval
Information Retrieval (IR) is the task of finding relevant documents or passages from a large collection in response to a query or claim. In the context of misinformation and fact-checking, IR is used to retrieve evidence documents and passages that support or refute claims. Classic approaches use TF-IDF and BM25; modern approaches employ neural ranking models. For fact verification, IR is typically the first component of a pipeline: retrieve candidate documents, then select relevant sentences, then classify claim veracity.
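As an illustration of the classic lexical approach mentioned above, here is a minimal sketch of BM25 scoring in plain Python. The whitespace tokenizer, the toy corpus, the function names, and the parameter values `k1=1.5` and `b=0.75` are assumptions chosen for illustration; production systems typically rely on a search library rather than hand-rolled scoring.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; real systems use proper analyzers.
    return text.lower().split()


def bm25_rank(query, documents, k1=1.5, b=0.75):
    """Return (doc_index, score) pairs sorted by descending BM25 score."""
    if not documents:
        return []
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for doc in tokenized:
        df.update(set(doc))

    scores = []
    for idx, doc in enumerate(tokenized):
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Smoothed IDF variant that stays non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            length_norm = 1 - b + b * len(doc) / avg_len
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append((idx, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy corpus standing in for an evidence collection.
corpus = [
    "the eiffel tower is located in paris france",
    "bananas are rich in potassium",
    "paris is the capital of france",
]
ranking = bm25_rank("eiffel tower paris", corpus)
```

In a verification pipeline, the top-ranked documents from a step like this would be passed on to sentence selection and then to the veracity classifier.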
Key papers
- [[2020-guu-realm]] — Jointly pre-trains a neural knowledge retriever with a language model using masked language modeling; retrieves from a textual knowledge corpus and achieves state-of-the-art Open-QA results
- Lewis et al. (2020) — Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks — combines dense passage retrieval with seq2seq models for knowledge-intensive tasks; proposes RAG, in which the retriever and generator components are trained end-to-end
- Thorne et al. (2018) — FEVER: A Large-Scale Dataset for Fact Extraction and VERification — analyzes the document and sentence retrieval components of a fact verification pipeline; uses the DrQA TF-IDF implementation for document retrieval, achieving 55.30% recall@5 on the Wikipedia evidence retrieval task
- Vo & Lee (2020) — Where Are the Facts? Searching for Fact-checked Information to Alleviate the Spread of Fake News — neural ranking approach for retrieving fact-checking articles relevant to original tweets; uses BM25 for candidate retrieval, then re-ranks with a Multimodal Attention Network (MAN) that combines text and image features; demonstrates a 4.7% NDCG@1 improvement over text-only retrieval on the Snopes dataset; applies multimodal retrieval to the novel task of finding fact-checked evidence for social media claims
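The entries above report retrieval quality as recall@k and NDCG@k. The sketch below shows one common, generic way to compute both metrics for a single query. The function names, document IDs, and relevance labels are hypothetical, and the exact evaluation protocols in the cited papers may differ from these definitions.

```python
import math


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top-k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(ranked_ids[:k]) & relevant) / len(relevant)


def ndcg_at_k(ranked_ids, relevance, k):
    """NDCG@k, where `relevance` maps a document ID to its graded gain."""
    # Discounted cumulative gain of the system ranking (rank positions start at 1).
    dcg = sum(
        relevance.get(doc_id, 0) / math.log2(position + 2)
        for position, doc_id in enumerate(ranked_ids[:k])
    )
    # Ideal DCG: the best achievable ordering of the known gains.
    ideal_gains = sorted(relevance.values(), reverse=True)[:k]
    idcg = sum(gain / math.log2(position + 2) for position, gain in enumerate(ideal_gains))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical ranking for a single claim; "d1" and "d2" are the gold evidence documents.
ranked = ["d3", "d1", "d7", "d2"]
gold = {"d1": 1, "d2": 1}

recall_2 = recall_at_k(ranked, gold.keys(), k=2)
recall_4 = recall_at_k(ranked, gold.keys(), k=4)
ndcg_1 = ndcg_at_k(ranked, gold, k=1)
```

Corpus-level figures are usually obtained by averaging these per-query values over all claims in the evaluation set.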
Related topics
- Fact-checking and corrections — IR retrieves evidence for fact verification
- Claim Verification — IR is the first stage of automated claim verification
- Natural Language Inference — IR is combined with NLI in end-to-end verification systems