TELLER: A Trustworthy Framework For Explainable, Generalizable and Controllable Fake News Detection

Authors: Hui Liu, Wenya Wang, Haoru Li, Haoliang Li

Venue: arXiv preprint, arXiv:2402.07776

TL;DR

TELLER proposes a dual-system framework for fake news detection that combines LLM-driven cognition (decomposing content into yes/no questions) with a neural-symbolic decision system (learning interpretable rules). The approach prioritizes explainability, generalizability, and human controllability, achieving 76% accuracy on GossipCop and over 80% on three other datasets while maintaining transparency in decision-making.

Contributions

  1. A systematic framework (TELLER) for trustworthy fake news detection that operationalizes explainability, generalizability, and controllability as design principles.
  2. A dual-system architecture: a cognition system that decomposes human fact-checking expertise into logical predicates via LLMs, and a decision system that learns interpretable rules via a neural-symbolic model (Disjunctive Normal Form layer).
  3. Comprehensive empirical evaluation on four datasets (LIAR, Constraint, PolitiFact, GossipCop) demonstrating feasibility, explainability, generalizability, and controllability.
  4. Demonstration that the framework outperforms direct LLM prompting and enables human intervention through rule adjustment.

Method

Cognition System

The cognition system mimics human fact-checking by decomposing the detection problem into interpretable yes/no questions. Using LLMs (FLAN-T5, Llama2, GPT-3.5-turbo), the system:

  1. Generates a set of question templates Q from human expertise, where each template Qᵢ corresponds to a logical predicate Pᵢ
  2. For input news T, instantiates these templates with concrete claims extracted from T to form logic atoms
  3. Computes truth values μᵢ for each logic atom by querying the LLM
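The instantiation step above can be sketched as follows. This is a minimal illustration, not the paper's code: the `QuestionTemplate` and `LogicAtom` classes, the `instantiate` helper, the predicate name, and the example question are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuestionTemplate:
    predicate: str  # hypothetical name for predicate P_i
    text: str       # question template Q_i with named placeholders


@dataclass(frozen=True)
class LogicAtom:
    predicate: str
    question: str   # fully instantiated yes/no question


def instantiate(template: QuestionTemplate, **slots: str) -> LogicAtom:
    """Fill a template's placeholders to obtain one logic atom."""
    return LogicAtom(template.predicate, template.text.format(**slots))


# Hypothetical template; not taken from the paper.
consistency = QuestionTemplate(
    predicate="P_consistent",
    text="Is the statement '{claim}' consistent with the rest of the message?",
)
atom = instantiate(consistency, claim="The city banned all cars in 2020")
print(atom.question)
```

Each resulting question is what would then be sent to the LLM to obtain a truth value μᵢ.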

The paper proposes two strategies for obtaining truth values:

  - For open-source LLMs (FLAN-T5, Llama2): apply a softmax over the logits of the answer tokens (yes/no), which mitigates the influence of irrelevant tokens
  - For closed-source LLMs (GPT-3.5-turbo): sample m times and count affirmative responses
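Both ways of turning LLM answers into a truth value can be sketched in a model-agnostic way. The function names, the `ask` callable, and the default `m` below are assumptions made for illustration; how answers or logits are actually obtained depends on the model being used.

```python
import math
from typing import Callable


def truth_value_by_sampling(ask: Callable[[str], str], question: str, m: int = 10) -> float:
    """Query the model m times and return the fraction of affirmative answers."""
    yes = sum(ask(question).strip().lower().startswith("yes") for _ in range(m))
    return yes / m


def truth_value_from_logits(yes_logit: float, no_logit: float) -> float:
    """Softmax restricted to the 'yes' and 'no' tokens, ignoring all other tokens."""
    top = max(yes_logit, no_logit)  # subtract the max for numerical stability
    e_yes = math.exp(yes_logit - top)
    e_no = math.exp(no_logit - top)
    return e_yes / (e_yes + e_no)
```

Restricting the softmax to the two answer tokens is what removes the influence of unrelated vocabulary entries: probability mass assigned to any other token simply does not enter the normalization.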

Decision System

The decision system learns to aggregate logic atom truth values into a final veracity prediction using a neural-symbolic approach:

  1. Stacks C conjunctive layers (Sᴸ∧) and J disjunctive layers (Sᴸ∨) in alternation, with each layer corresponding to a truthfulness label
  2. Each Sᴸ∧ learns a conjunction of logic atoms corresponding to a candidate rule
  3. The final DNF Layer applies softmax to produce a probability distribution over labels

The decision system can learn disjunctive normal form rules end-to-end (e.g., "label true if rule₁₂₃ OR rule₂₇ is true"), enabling both interpretability and error correction of imperfect LLM predictions.
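A hard (Boolean) reading of such a DNF rule set can be sketched as below. This only illustrates the rule semantics; the actual decision system is a trainable neural-symbolic layer working on soft truth values. The `Literal` encoding, the example rules, and the `predict` helper are assumptions.

```python
from typing import Dict, List, Sequence, Tuple

Literal = Tuple[int, bool]    # (index of logic atom, negated?)
Conjunction = List[Literal]   # one candidate rule: AND over literals


def eval_conjunction(conj: Conjunction, atoms: Sequence[bool]) -> bool:
    """A conjunction holds when every literal matches its atom's truth value."""
    return all(atoms[i] != negated for i, negated in conj)


def eval_dnf(rules: List[Conjunction], atoms: Sequence[bool]) -> bool:
    """A DNF formula holds when at least one of its conjunctions holds."""
    return any(eval_conjunction(conj, atoms) for conj in rules)


def predict(label_rules: Dict[str, List[Conjunction]], atoms: Sequence[bool], default: str) -> str:
    """Return the first label whose DNF formula is satisfied, else the default."""
    for label, rules in label_rules.items():
        if eval_dnf(rules, atoms):
            return label
    return default


# Hypothetical rules: "true" if (atom0 AND NOT atom1) OR atom2.
label_rules = {"true": [[(0, False), (1, True)], [(2, False)]]}
print(predict(label_rules, [True, False, False], default="false"))
```

Because each satisfied conjunction can be read off directly, this form makes it visible which questions drove a given decision.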

Results

Binary classification (Closed-domain):

  - TELLER achieves 76.53% accuracy on GossipCop (closed setting)
  - Over 80% accuracy on Constraint, PolitiFact, and LIAR
  - Outperforms GPT-3.5-turbo Direct prompting by significant margins
  - F1 scores exceed Direct by an average of 7% and 6% in the respective settings

Cross-domain generalization:

  - Consistently outperforms Direct prompting across all four datasets with no generalization algorithm
  - Negligible performance drop compared to in-domain training

Multi-classification (LIAR fine-grained labels):

  - Framework outperforms Direct for the FLAN-T5 and Llama2 series, demonstrating robustness to noisy LLM predictions

Explainability:

  - Questions and logic atoms are human-readable and verifiable
  - Learned DNF rules are interpretable and can be manually adjusted
  - Symbolic components enable identification of which questions drove decisions

Controllability:

  - Manual adjustment of DNF Layer weights enables human intervention
  - Demonstrates feasibility of human experts correcting misaligned rules
  - Intervention experiments show consistent improvement when adjusting low-confidence predictions
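The idea of intervening on rule weights can be illustrated with a toy example. It assumes a rule is a plain weight vector over logic atoms and scores evidence with a weighted sum; this is a simplification for illustration and not the paper's DNF Layer. The names `rule_score` and `override_weight` are hypothetical.

```python
from typing import List, Sequence


def rule_score(weights: Sequence[float], truth_values: Sequence[float]) -> float:
    """Weighted evidence for one rule; truth values in [0, 1] are mapped to [-1, 1]."""
    return sum(w * (2.0 * mu - 1.0) for w, mu in zip(weights, truth_values))


def override_weight(weights: Sequence[float], index: int, new_weight: float) -> List[float]:
    """Return a copy of the weights with one entry replaced by an expert-chosen value."""
    adjusted = list(weights)
    adjusted[index] = new_weight
    return adjusted


weights = [1.0, -1.0, 0.5]   # toy learned weights over three logic atoms
truth = [1.0, 0.0, 0.0]      # toy truth values from the cognition system

before = rule_score(weights, truth)
# An expert judges the third atom irrelevant to this rule and zeroes its weight.
after = rule_score(override_weight(weights, 2, 0.0), truth)
print(before, after)
```

Since the edit touches a single, named weight, its effect on the rule is easy to inspect and to revert.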

Notes

Strengths:

  - Novel approach to explainability through decomposed questions rather than post-hoc explanations
  - Systematic treatment of three critical aspects (explainability, generalizability, controllability)
  - Strong empirical results across diverse datasets and LLMs
  - Human-in-the-loop capability through rule adjustment
  - Practical applicability: the cognition system can leverage different LLMs

Limitations acknowledged:

  - Trustworthiness limited to algorithmic design; data collection and deployment governance remain open
  - Integrating external knowledge sources improved performance but added complexity
  - DNF Layer expressiveness constrained by simple architecture; more sophisticated decision models could enhance results
  - Trade-off between trustworthiness and decision system complexity

Open questions:

  - How does performance scale as question template sets grow?
  - What is the optimal level of human expert involvement for practical deployment?
  - Can the framework extend to real-time or streaming fact-checking scenarios?