Multi-task learning

Multi-task learning (MTL) is a machine learning paradigm where a single model jointly learns multiple related tasks. By sharing representations between tasks, MTL can improve generalization and data efficiency, particularly when tasks are complementary or when some tasks have limited training data.

Motivation

Multi-task learning is motivated by the observation that learning shared representations across related tasks can:

  • Reduce overfitting by increasing the effective training set size
  • Improve generalization through the inductive bias supplied by auxiliary tasks
  • Leverage unlabeled or partially labeled data for auxiliary tasks
  • Capture shared structure between tasks more efficiently than single-task models

Architecture patterns

Hard parameter sharing

A shared set of layers processes the input for every task, and task-specific output layers then predict each task's labels. Because all tasks update the same shared parameters, the model is forced to learn representations that serve every task.
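The following is a minimal sketch of hard parameter sharing in NumPy, not a reference implementation. It assumes a single shared tanh layer, linear output heads, and a weighted sum of per-task cross-entropy losses as the joint objective. The function names, the task names `veracity` and `stance`, and the loss weights are all hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_hard_sharing(in_dim, hidden_dim, task_classes):
    """Create one shared layer plus one output head per task.

    task_classes maps a (hypothetical) task name to its number of classes.
    """
    return {
        "shared_W": rng.normal(0.0, 0.1, (in_dim, hidden_dim)),
        "shared_b": np.zeros(hidden_dim),
        "heads": {
            task: {"W": rng.normal(0.0, 0.1, (hidden_dim, n)), "b": np.zeros(n)}
            for task, n in task_classes.items()
        },
    }

def forward(params, x):
    """Return per-task logits computed from one shared representation."""
    # The hidden representation h is computed once and reused by every head.
    h = np.tanh(x @ params["shared_W"] + params["shared_b"])
    return {t: h @ head["W"] + head["b"] for t, head in params["heads"].items()}

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy for a batch of integer class labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def joint_loss(params, x, labels_by_task, weights):
    """Weighted sum of task losses (an assumed, commonly used objective)."""
    logits = forward(params, x)
    return sum(weights[t] * cross_entropy(logits[t], y)
               for t, y in labels_by_task.items())

# Usage: two tasks share the same hidden layer but keep separate heads.
params = init_hard_sharing(in_dim=8, hidden_dim=16,
                           task_classes={"veracity": 3, "stance": 4})
x = rng.normal(size=(5, 8))
labels = {"veracity": np.array([0, 1, 2, 0, 1]),
          "stance": np.array([0, 1, 2, 3, 0])}
loss = joint_loss(params, x, labels, weights={"veracity": 1.0, "stance": 0.5})
```

Gradients of `joint_loss` with respect to `shared_W` receive contributions from every task, which is where the sharing pressure comes from. Each head is updated only by its own task's loss.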

Soft parameter sharing

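Each task has its own model with its own parameters, but a regularization term penalizes differences between corresponding parameters across tasks. This allows more task-specific flexibility than hard sharing while still encouraging the models to learn similar representations.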
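One way to express this regularizer is a penalty on the squared L2 distance between corresponding parameters of each pair of task models. The sketch below assumes that all task models have identically shaped parameters stored under the same names. The function name `soft_sharing_penalty` and the `strength` coefficient are hypothetical, and other distance measures are possible.

```python
import numpy as np

def soft_sharing_penalty(task_params, strength=0.01):
    """Penalize squared L2 distance between corresponding task parameters.

    task_params maps a task name to a dict of same-shaped NumPy arrays.
    The returned value would be added to the sum of the per-task losses.
    """
    tasks = list(task_params)
    penalty = 0.0
    # Visit every unordered pair of tasks exactly once.
    for i in range(len(tasks)):
        for j in range(i + 1, len(tasks)):
            for name in task_params[tasks[i]]:
                diff = task_params[tasks[i]][name] - task_params[tasks[j]][name]
                penalty += float(np.sum(diff ** 2))
    return strength * penalty

# Usage: two hypothetical task models whose weights differ by 1 in each entry.
example_params = {
    "veracity": {"W": np.ones((2, 2))},
    "stance": {"W": np.zeros((2, 2))},
}
example_penalty = soft_sharing_penalty(example_params, strength=0.5)
```

A larger `strength` pulls the task models closer together and approaches hard sharing in the limit. A value of zero trains the tasks independently.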

Applications in misinformation detection

  • Rumour verification: Joint learning of veracity (main task) with stance classification and rumour detection (auxiliary tasks); see Kochkina et al. 2018
  • Fake news classification: Combining headline, body, and credibility prediction tasks
  • Claim verification: Joint learning of evidence retrieval and claim-evidence relevance ranking

Effectiveness factors

Multi-task learning effectiveness depends on:

  • Task relatedness: Highly related tasks benefit more from shared representations
  • Data balance: When tasks have imbalanced dataset sizes, auxiliary tasks with abundant labels can help main tasks with sparse labels
  • Label distribution properties: Tasks with lower kurtosis (more balanced class distributions) and higher entropy show greater MTL gains
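As a rough illustration of the label distribution factor, the entropy of a task's empirical label distribution can be computed from its labels. This is a sketch using Shannon entropy in bits. The function name and the toy label lists are hypothetical. Kurtosis is left out because the text does not specify how it is computed over labels.

```python
import math
from collections import Counter

def label_entropy(labels):
    """Shannon entropy (in bits) of the empirical label distribution."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Usage: compare a balanced and a skewed toy label set for the same classes.
balanced = ["support", "deny", "support", "deny"]
skewed = ["support", "support", "support", "deny"]
balanced_entropy = label_entropy(balanced)
skewed_entropy = label_entropy(skewed)
```

The balanced labels give the higher entropy of the two, which matches the direction of the factor described above.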
