Multi-task learning¶

Multi-task learning (MTL) is a machine learning paradigm where a single model jointly learns multiple related tasks. By sharing representations between tasks, MTL can improve generalization and data efficiency, particularly when tasks are complementary or when some tasks have limited training data.

Motivation¶

Multi-task learning is motivated by the observation that learning shared representations across related tasks can:

- Reduce overfitting by increasing effective training set size
- Improve generalization through inductive bias from auxiliary tasks
- Leverage unlabeled or partially labeled data for auxiliary tasks
- Capture shared structure between tasks more efficiently than single-task models

Architecture patterns¶

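Hard parameter sharing¶

In hard parameter sharing, shared network layers process the input for every task, and task-specific output layers then predict each task's labels. Because every task's loss updates the same shared layers, the model is pushed toward representations that are useful across tasks.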
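
A shared layer followed by per-task output layers can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not a reference implementation: the task names (`veracity`, `stance`), the layer sizes, the `tanh` activation, and the loss weights are all hypothetical, and a real model would be trained with an autodiff library.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative initialisation)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class HardSharingModel:
    """One shared hidden layer feeding one output head per task."""

    def __init__(self, n_features, n_hidden, classes_per_task, seed=0):
        rng = random.Random(seed)
        self.shared = init_layer(n_features, n_hidden, rng)
        self.heads = {task: init_layer(n_hidden, n_classes, rng)
                      for task, n_classes in classes_per_task.items()}

    def encode(self, x):
        # Shared representation: computed the same way for every task.
        return [math.tanh(h) for h in linear(x, *self.shared)]

    def predict(self, x, task):
        # Task-specific head applied on top of the shared representation.
        return softmax(linear(self.encode(x), *self.heads[task]))


def joint_loss(model, x, labels, task_weights):
    """Weighted sum of per-task cross-entropy losses for one example."""
    return sum(task_weights[task] * -math.log(model.predict(x, task)[label])
               for task, label in labels.items())


# Hypothetical setup: 3 veracity classes and 4 stance classes.
model = HardSharingModel(n_features=4, n_hidden=3,
                         classes_per_task={"veracity": 3, "stance": 4})
x = [0.2, -1.0, 0.5, 0.0]
veracity_probs = model.predict(x, "veracity")
stance_probs = model.predict(x, "stance")
loss = joint_loss(model, x, {"veracity": 1, "stance": 2},
                  {"veracity": 1.0, "stance": 0.5})
```

Both heads read the output of `encode`, so a gradient step on `joint_loss` would update the shared layer using signal from every task, while each head is updated only by its own task.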

Soft parameter sharing¶

In soft parameter sharing, each task has its own model, but the task-specific parameters are regularized to stay similar to one another. This allows more task-specific flexibility while still encouraging representation sharing.
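
One common way to keep task-specific parameters similar is to add a penalty on the distance between them. The sketch below assumes a squared L2 distance summed over all task pairs. The function names, the `strength` hyperparameter, and the example numbers are hypothetical, and other similarity penalties are equally possible.

```python
def squared_distance(params_a, params_b):
    """Squared L2 distance between two equally sized parameter vectors."""
    return sum((a - b) ** 2 for a, b in zip(params_a, params_b))


def soft_sharing_penalty(task_params, strength):
    """Scaled sum of squared distances over every pair of tasks."""
    tasks = sorted(task_params)
    total = 0.0
    for i, first in enumerate(tasks):
        for second in tasks[i + 1:]:
            total += squared_distance(task_params[first], task_params[second])
    return strength * total


def regularized_objective(task_losses, task_params, strength):
    """Sum of per-task losses plus the parameter-similarity penalty."""
    return sum(task_losses.values()) + soft_sharing_penalty(task_params, strength)


# Hypothetical parameter vectors and losses for two tasks.
task_params = {"veracity": [1.0, 2.0, 3.0], "stance": [1.0, 2.5, 2.0]}
task_losses = {"veracity": 0.8, "stance": 1.1}

penalty = soft_sharing_penalty(task_params, strength=0.1)
objective = regularized_objective(task_losses, task_params, strength=0.1)
```

With `strength` set to 0 the tasks are trained independently. Increasing it pulls the task-specific parameters toward each other.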

Applications in misinformation detection¶

- Rumour verification: joint learning of veracity (main task) with stance classification and rumour detection (auxiliary tasks); see Kochkina et al. (2018)
- Fake news classification: combining headline, body, and credibility prediction tasks
- Claim verification: joint learning of evidence retrieval and claim-evidence relevance ranking

Effectiveness factors¶

Multi-task learning effectiveness depends on:

- Task relatedness: Highly related tasks benefit more from shared representations
- Data balance: When tasks have imbalanced dataset sizes, auxiliary tasks with abundant labels can help main tasks with sparse labels
- Label distribution properties: Tasks with lower kurtosis (more balanced class distributions) and higher entropy show greater MTL gains
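
The label distribution properties above can be estimated directly from a task's labels. The sketch below assumes one possible set of definitions: Shannon entropy in bits over the class proportions, and kurtosis as the fourth standardized moment of those proportions. Studies of MTL effectiveness may define these statistics differently, and the label lists are made up for illustration.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Relative frequency of each class present in `labels`."""
    counts = Counter(labels)
    total = len(labels)
    return [count / total for count in counts.values()]


def label_entropy(labels):
    """Shannon entropy (in bits) of the class distribution."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))


def label_kurtosis(labels):
    """Fourth standardized moment of the class proportions.

    Returns None when all classes are equally frequent, because the
    variance is zero and the statistic is undefined.
    """
    props = class_proportions(labels)
    mean = sum(props) / len(props)
    m2 = sum((p - mean) ** 2 for p in props) / len(props)
    if math.isclose(m2, 0.0, abs_tol=1e-12):
        return None
    m4 = sum((p - mean) ** 4 for p in props) / len(props)
    return m4 / m2 ** 2


# Made-up label sets: one dominated by a single class, one more balanced.
skewed = ["a"] * 90 + ["b"] * 5 + ["c"] * 3 + ["d"] * 2
balanced = ["a"] * 40 + ["b"] * 30 + ["c"] * 20 + ["d"] * 10

skewed_stats = (label_entropy(skewed), label_kurtosis(skewed))
balanced_stats = (label_entropy(balanced), label_kurtosis(balanced))
```

Under these definitions the more balanced label set has the higher entropy and the lower kurtosis, which is the profile the list above associates with larger MTL gains.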

Key papers¶

- Baly et al. (2019). Multi-Task Ordinal Regression for Jointly Predicting the Trustworthiness and the Leading Political Ideology of News Media. Applies copula ordinal regression to jointly model outlet factuality and political ideology; auxiliary tasks at different bias granularities reduce prediction error.
- Kumar & Carley (2019). Tree LSTMs with Convolution Units to Predict Stance and Rumor Veracity in Social Media Conversations. Demonstrates multi-task learning (stance and rumor veracity) with Tree LSTM architectures and an alternating task training strategy; achieves 12% and 15% F1-macro improvements over single-task baselines on the PHEME dataset.
- Kochkina et al. (2018). All-in-one: Multi-task Learning for Rumour Verification. Demonstrates MTL benefits for rumour veracity prediction with stance classification and rumour detection as auxiliary tasks; analyzes the link between dataset properties and MTL effectiveness.
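
The alternating task training strategy mentioned for Kumar & Carley (2019) can be pictured as a round-robin schedule over task batches. The sketch below is a generic illustration and not the paper's procedure: the function name, the batch identifiers, and the choice to recycle a smaller task's batches are all assumptions.

```python
from itertools import cycle


def alternating_schedule(task_batches, n_steps):
    """Yield (task, batch) pairs, switching to the next task on every step.

    Batches of each task are cycled, so a task with less data is
    revisited more often (an assumption of this sketch).
    """
    iterators = {task: cycle(batches) for task, batches in task_batches.items()}
    task_order = cycle(sorted(task_batches))
    for _ in range(n_steps):
        task = next(task_order)
        yield task, next(iterators[task])


# Hypothetical batch identifiers: three stance batches, one veracity batch.
task_batches = {"stance": ["s0", "s1", "s2"], "veracity": ["v0"]}
schedule = list(alternating_schedule(task_batches, n_steps=5))
```

In a training loop, each yielded pair would drive one update of the shared parameters using only that task's loss.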
