TELLER: A Trustworthy Framework For Explainable, Generalizable and Controllable Fake News Detection¶

Authors: Hui Liu, Wenya Wang, Haoru Li, Haoliang Li

Venue: arXiv preprint, arXiv:2402.07776

TL;DR¶

TELLER proposes a dual-system framework for fake news detection that combines LLM-driven cognition (decomposing content into yes/no questions) with a neural-symbolic decision system (learning interpretable rules). The approach prioritizes explainability, generalizability, and human controllability, achieving 76% accuracy on GossipCop and over 80% on three other datasets while maintaining transparency in decision-making.

Contributions¶

A systematic framework (TELLER) for trustworthy fake news detection that operationalizes explainability, generalizability, and controllability as design principles.
A dual-system architecture: a cognition system that decomposes human fact-checking expertise into logical predicates via LLMs, and a decision system that learns interpretable rules via a neural-symbolic model (Disjunctive Normal Form layer).
Comprehensive empirical evaluation on four datasets (LIAR, Constraint, PolitiFact, GossipCop) demonstrating feasibility, explainability, generalizability, and controllability.
Demonstration that the framework outperforms direct LLM prompting and enables human intervention through rule adjustment.

Method¶

Cognition System¶

The cognition system mimics human fact-checking by decomposing the detection problem into interpretable yes/no questions. Using LLMs (FLAN-T5, Llama2, GPT-3.5-turbo), the system:

Generates a set of question templates Q from human expertise, where each template Qᵢ corresponds to a logical predicate Pᵢ
For input news T, instantiates these templates with concrete claims extracted from T to form logic atoms
Computes truth values μᵢ for each logic atom by querying the LLM
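
The template-to-atom step can be sketched as follows. This is a minimal illustration rather than the paper's code: the template wording, the `{message}` placeholder, and the names `QuestionTemplate`, `TEMPLATES`, and `instantiate` are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuestionTemplate:
    predicate: str  # name of the logical predicate P_i
    text: str       # yes/no question with a {message} placeholder


# Hypothetical templates; the paper's actual question set is not reproduced here.
TEMPLATES = [
    QuestionTemplate(
        "P1",
        "Is the following message consistent with common knowledge? Message: {message}",
    ),
    QuestionTemplate(
        "P2",
        "Does the following message use sensational language? Message: {message}",
    ),
]


def instantiate(templates, message):
    """Fill every template with the input news to get (predicate, question) pairs."""
    return [(t.predicate, t.text.format(message=message)) for t in templates]


atoms = instantiate(TEMPLATES, "Drinking hot water cures the flu.")
for predicate, question in atoms:
    print(predicate, "->", question)
```

Each resulting question would then be sent to the LLM, and its answer converted into the truth value μᵢ of the corresponding logic atom.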

The paper proposes two strategies for obtaining truth values:
- For open-source LLMs (FLAN-T5, Llama2), whose token logits are accessible: apply a softmax over the logits of the answer tokens to mitigate the influence of irrelevant tokens
- For closed-source LLMs (GPT-3.5-turbo), whose logits are not exposed: sample m times and count affirmative responses
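
A rough sketch of the two scoring strategies, one based on logits and one based on repeated sampling, is given below. It assumes that the logit-based variant restricts the softmax to the "yes" and "no" tokens, which is one reading of "mitigating irrelevant tokens". The function names are hypothetical, and how the paper maps these scores onto the final μᵢ (for example by scaling or thresholding) is not covered in this summary.

```python
import math


def truth_from_logits(yes_logit, no_logit):
    """Softmax restricted to the 'yes' and 'no' tokens; returns P(yes) in [0, 1]."""
    m = max(yes_logit, no_logit)  # subtract the max for numerical stability
    e_yes = math.exp(yes_logit - m)
    e_no = math.exp(no_logit - m)
    return e_yes / (e_yes + e_no)


def truth_from_samples(answers):
    """Fraction of sampled answers that start with 'yes' (case-insensitive)."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    yes = sum(1 for a in answers if a.strip().lower().startswith("yes"))
    return yes / len(answers)


print(truth_from_logits(2.0, 0.0))
print(truth_from_samples(["Yes", "yes.", "No", "Yes", "no"]))
```

Both functions return a score in [0, 1], so the decision system can consume either one without knowing which LLM produced it.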

Decision System¶

The decision system learns to aggregate logic atom truth values into a final veracity prediction using a neural-symbolic approach:

Stacks C conjunctive layers (Sᴸ∧) and J disjunctive layers (Sᴸ∨) in alternation, with each layer corresponding to a truthfulness label
Each Sᴸ∧ learns a conjunction of logic atoms corresponding to a candidate rule; each Sᴸ∨ then combines candidate rules by disjunction
The final DNF Layer applies a softmax to produce a probability distribution over labels

The decision system can learn disjunctive normal form rules end-to-end (e.g., "label true if rule₁₂₃ OR rule₂₇ is true"), enabling both interpretability and error correction of imperfect LLM predictions.
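
To make the "OR of ANDs" structure concrete, here is a small forward-pass sketch. It assumes a semi-symbolic neuron in the style used by earlier neural DNF work, with bias `delta * (max|w| - sum|w|)` and a tanh activation; whether TELLER's layers match this form exactly is an assumption. The weights are set by hand, not learned, and the atom names are hypothetical.

```python
import math


def semi_symbolic(inputs, weights, delta):
    """One soft logic neuron over truth values in [-1, 1] (-1 false, +1 true).

    delta=+1 makes it behave like AND, delta=-1 like OR. A negative weight
    negates its input, and a zero weight ignores it.
    """
    abs_w = [abs(w) for w in weights]
    bias = delta * (max(abs_w) - sum(abs_w))
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return math.tanh(z)


def dnf_forward(atoms, conj_weights, disj_weights):
    """Conjunctive layer -> disjunctive layer -> softmax over labels."""
    rules = [semi_symbolic(atoms, w, +1) for w in conj_weights]
    scores = [semi_symbolic(rules, w, -1) for w in disj_weights]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return rules, [e / total for e in exps]


# Hypothetical atoms: [consistent_with_background, sensational_language, credible_source]
atoms = [1.0, -1.0, 1.0]
conj_weights = [
    [6.0, 0.0, 6.0],    # rule0: consistent AND credible
    [0.0, 6.0, -6.0],   # rule1: sensational AND NOT credible
    [-6.0, 0.0, 0.0],   # rule2: NOT consistent
]
disj_weights = [
    [6.0, 0.0, 0.0],    # label "real": rule0
    [0.0, 6.0, 6.0],    # label "fake": rule1 OR rule2
]
rules, probs = dnf_forward(atoms, conj_weights, disj_weights)
print([round(r, 3) for r in rules], [round(p, 3) for p in probs])
```

With these hand-set weights, only rule0 evaluates close to +1, so the "real" label receives the larger probability. In a trained model the weights would be learned from data, and the same structure lets a reader trace which rule produced the prediction.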

Results¶

Binary classification (closed-domain):
- TELLER achieves 76.53% accuracy on GossipCop (closed setting)
- Over 80% accuracy on Constraint, PolitiFact, and LIAR
- Outperforms Direct prompting of GPT-3.5-turbo by significant margins
- F1 scores exceed Direct by an average of 7% and 6% in the respective settings

Cross-domain generalization:
- Consistently outperforms Direct prompting across all four datasets with no generalization algorithm
- Negligible performance drop compared to in-domain training

Multi-classification (LIAR fine-grained labels):
- The framework outperforms Direct for the FLAN-T5 and Llama2 series, demonstrating robustness to noisy LLM predictions

Explainability:
- Questions and logic atoms are human-readable and verifiable
- Learned DNF rules are interpretable and can be manually adjusted
- Symbolic components enable identification of which questions drove decisions
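
One way to read a rule off a learned conjunctive weight row is to threshold the weights, as sketched below. The threshold value, the atom names, and the function `rule_to_text` are illustrative assumptions; the paper's own extraction procedure may differ.

```python
def rule_to_text(weights, atom_names, threshold=0.5):
    """Render one conjunctive weight row as a readable rule.

    Weights above +threshold keep the atom, weights below -threshold negate it,
    and anything in between is treated as 'atom not used'.
    """
    literals = []
    for w, name in zip(weights, atom_names):
        if w > threshold:
            literals.append(name)
        elif w < -threshold:
            literals.append("NOT " + name)
    # An empty conjunction is vacuously true.
    return " AND ".join(literals) if literals else "TRUE"


# Hypothetical atom names and weights, for illustration only.
names = ["consistent_with_background", "sensational_language", "credible_source"]
print(rule_to_text([5.8, 0.1, 6.2], names))
print(rule_to_text([0.0, 4.9, -5.5], names))
```

Because every literal maps back to a yes/no question, a reader can check the rule against the LLM's recorded answers for a given news item.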

Controllability:
- Manual adjustment of DNF Layer weights enables human intervention
- Demonstrates feasibility of human experts correcting misaligned rules
- Intervention experiments show consistent improvement when adjusting low-confidence predictions
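
A toy version of such an intervention is sketched below: an expert removes one atom from a rule by zeroing its weight, and the rule is re-evaluated. Hard true/false semantics are swapped in for the learned soft layers to keep the example short. The functions `rule_fires` and `drop_atom`, the threshold, and the atom ordering are all hypothetical.

```python
def rule_fires(weights, atoms, threshold=0.5):
    """Hard-logic reading of one conjunctive rule over atoms in {-1, +1}."""
    for w, x in zip(weights, atoms):
        if w > threshold and x < 0:   # rule needs the atom true, but it is false
            return False
        if w < -threshold and x > 0:  # rule needs the atom false, but it is true
            return False
    return True


def drop_atom(weights, index):
    """Expert intervention: remove one atom from a rule by zeroing its weight."""
    edited = list(weights)  # copy, so the original rule is left untouched
    edited[index] = 0.0
    return edited


# Hypothetical atoms: [consistent_with_background, sensational_language, credible_source]
rule = [6.0, 0.0, 6.0]     # consistent AND credible
atoms = [1, -1, -1]        # consistent, not sensational, source judged not credible
before = rule_fires(rule, atoms)
edited_rule = drop_atom(rule, 2)  # expert decides the credibility atom is unreliable
after = rule_fires(edited_rule, atoms)
print(before, after)
```

Here the rule does not fire before the edit and does fire afterwards, which shows how a single weight change can alter a prediction in a way the expert can anticipate.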

Connections¶

Related to Fake news detection methods through neural-symbolic approaches combining neural and logical components
Addresses Explainable AI concerns in fake news detection systems
Builds on Trustworthy AI frameworks that emphasize transparency and human oversight
Complements Fact-checking and corrections systems by decomposing verification into interpretable steps
Uses Neural-symbolic AI models for interpretable decision-making

Notes¶

Strengths:
- Novel approach to explainability through decomposed questions rather than post-hoc explanations
- Systematic treatment of three critical aspects (explainability, generalizability, controllability)
- Strong empirical results across diverse datasets and LLMs
- Human-in-the-loop capability through rule adjustment
- Practical applicability: the cognition system can leverage different LLMs

Limitations acknowledged:
- Trustworthiness limited to algorithmic design; data collection and deployment governance remain open
- Integrating external knowledge sources improved performance but added complexity
- DNF Layer expressiveness constrained by simple architecture; more sophisticated decision models could enhance results
- Trade-off between trustworthiness and decision system complexity

Open questions:
- How does performance scale as question template sets grow?
- What is the optimal level of human expert involvement for practical deployment?
- Can the framework extend to real-time or streaming fact-checking scenarios?