The Role of User Profiles for Fake News Detection¶

Authors: Kai Shu, Xinyi Zhou, Suhang Wang, Reza Zafarani, Huan Liu Venue: arXiv:1904.13355 [cs.SI], April 2019

TL;DR¶

Content-only fake news detectors plateau because fake news is crafted to mimic real news textually. This paper systematically characterizes Twitter users who predominantly share fake versus real news across both partitions of FakeNewsNet (PolitiFact and GossipCop), comparing six explicit Twitter API attributes and four inferred implicit ones between the two groups. A User Profile Feature (UPF) vector — formed by averaging per-user feature vectors across all users who shared a given news item — outperforms RST and LIWC text baselines by wide margins, reaching F1 0.904 on PolitiFact and F1 0.966 on GossipCop with Random Forest, with account registration age as the single most informative feature.

Contributions¶

Defines two user-group selection procedures (absolute share count and Fake news Ratio $FR(i) = n_i^{(f)} / (n_i^{(r)} + n_i^{(f)})$) to identify representative fake-news and real-news propagators after bot filtering via Botometer (score threshold 0.5).
Statistical comparative analysis (t-tests) of implicit profile features — predicted age, Big Five personality, inferred geolocation, profile image object type, and political bias score — between $U^{(f)}$ and $U^{(r)}$, showing statistically significant differences (p < 0.05) on both datasets.
Constructs the UPF vector by averaging per-user feature representations across sharers of each news item and evaluates it against RST and LIWC content baselines across five classifiers (LR, NB, DT, RF, AdaBoost).
Feature importance analysis (Gini impurity in Random Forest) ranking features by discriminative power for fake/real classification.

Method¶

User group construction. After Botometer-based bot removal (14–21% of accounts filtered depending on partition and class), users are partitioned into "Only Fake," "Only Real," and "Fake and Real" groups. Representative sets $U^{(f)}$ and $U^{(r)}$ are selected by combining top-$K$ absolute sharers from the exclusive partitions with users satisfying $FR \in [1-t,1]$ or $FR \in [0,t]$ (threshold $t = 0.2$, chosen for stability). Notably, a larger fraction of bot accounts appears among fake-news sharers than real-news sharers in both datasets.

Implicit profile features. Four demographic/behavioral attributes are inferred from tweet history and profile images, not from API metadata directly: - Age: linear regression over a psycholinguistic lexicon applied to recent tweets; younger users skew toward fake-news sharing. - Personality: predicted via the Pear tool (unsupervised Big Five model); fake-news sharers show lower Neuroticism scores than real-news sharers. - Geolocation: predicted at city level using the pigeo tool (Location Indicative Words); distribution differs between groups, with more real-news sharing in the US East Coast in these datasets. - Profile image type: VGG16 on ImageNet (1000 classes); image category distributions differ, with "wig" and "mask" categories over-represented among fake-news sharers. - Political bias: scored on $[-1, 1]$ based on interest-similarity to partisan accounts; fake-news sharers skew right-leaning, real-news sharers skew neutral.

Explicit profile features. Six fields from the Twitter API: Verified (binary), RegisterTime (days since account creation), StatusCount (total posts), FavorCount (total favorites), FollowerCount, FollowingCount. Fake-news sharers have older accounts (registered ~130 days earlier on PolitiFact), fewer followers, more following counts, and publish fewer posts.

UPF vector and classification. For news item $a$ shared by user set $\mathcal{U}$: $$\mathbf{f} = \frac{1}{|\mathcal{U}|} \sum_{u_i \in \mathcal{U}} \mathbf{u}_i$$ where $\mathbf{u}_i$ concatenates all implicit and explicit features. Profile image features are PCA-reduced from 1000 to 10 dimensions. The UPF vector is the input to a binary fake/real classifier. Baselines use RST discourse features and LIWC psycholinguistic features extracted from news article text.

Results¶

Random Forest on 80/20 train/test split (5 repetitions, average reported):

Method	PolitiFact F1	GossipCop F1
RST	0.781	0.593
LIWC	0.834	0.730
UPF	0.904	0.966
RST + UPF	0.915	0.967
LIWC + UPF	0.919	0.963

UPF alone substantially outperforms both content baselines. Adding content features (RST or LIWC) to UPF yields marginal further gains, confirming the two feature families are nearly orthogonal.

UPF is robust to classifier: on PolitiFact RF (F1 0.904), SVM (0.865), DT (0.847), LR (0.842); on GossipCop RF (0.966), DT (0.931), SVM (0.918), LR (0.914).

Feature importance (Gini impurity, RF): RegisterTime (0.937) >> Verified (0.099) > Political Bias (0.063) > Personality (0.036) > StatusCount (0.035). The dominance of RegisterTime reflects the practice of creating new accounts specifically to amplify fake news. On GossipCop, implicit features alone (F1 0.962) nearly match the all-features ceiling (0.966), while explicit features alone (0.895) lag further.

Connections¶

Uses the FakeNewsNet dataset for both PolitiFact and GossipCop partitions.
Directly extends the social-context detection paradigm: the authors position UPF as a social-context approach complementary to content-based methods.
Political bias as a feature is explored further in the political bias topic.
The feature engineering topic covers the broader landscape of hand-crafted feature approaches for fake news detection.
See user profiles topic for work building on this direction.
Zhou et al. (2020) — SAFE is the content-based counterpart from the same group: a multi-modal neural approach on the same FakeNewsNet benchmark that complements UPF by operating without social context.

Notes¶

The dominance of RegisterTime in feature importance (Gini 0.937 vs. all others < 0.1) suggests the model is largely capturing a single signal — account age — rather than a rich demographic profile. This is corroborated by how close explicit features alone (F1 0.865 on PolitiFact) come to the full UPF (F1 0.904).

The political bias finding (fake-news sharers skew right-leaning) is artifact-dependent: PolitiFact and GossipCop cover US political and entertainment news, and the bias scoring method itself is calibrated on US partisan proxies. These findings should not be generalized to other national contexts or topic domains.

The inference pipeline for implicit features (Pear, pigeo, VGG16, political bias scorer) is computationally non-trivial and requires access to users' tweet histories — a significant operational hurdle for real-time or high-volume deployment. The paper does not discuss latency or scalability.

Statistical comparisons (t-tests for RQ2) report p-values but not effect sizes, limiting interpretability of the magnitude of group differences.