Ordinal regression

Ordinal regression (or ordinal classification) is a supervised learning task where the target variable has a natural order or ranking. Unlike nominal classification (e.g., cat vs. dog, where there is no natural ordering), ordinal targets have meaningful progression: low → medium → high, or negative → neutral → positive.

Examples in misinformation research

Factuality assessment: Rating news outlets or articles on a 3-point scale (low, mixed, high) or claim veracity on a 5-point scale (false, mostly-false, mixed, mostly-true, true). Order matters: confusing "high" with "low" is worse than confusing "high" with "mixed."

Political ideology: Rating news outlets on a 7-point left-right spectrum (extreme-left, left, center-left, center, center-right, right, extreme-right). Predicting extreme-left when the true label is extreme-right is a gross error (the opposite end of the spectrum), whereas predicting right is only a near miss.

Sentiment analysis: 5-point sentiment scale (very negative → very positive). Misclassifying positive as negative is worse than misclassifying positive as neutral.

Why ordinal regression matters

Better loss functions: Standard classification loss (0-1) treats all misclassifications equally. Ordinal regression uses loss functions that penalize a prediction more heavily the farther it falls from the true class.

- E.g., confusing class 1 with class 2 is penalized less than confusing class 1 with class 5
- Common metrics: Mean Absolute Error (MAE), ordinal-aware metrics
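The difference between the two kinds of loss can be seen on a pair of errors. The snippet below is a minimal sketch that assumes labels are integer-coded ranks from 1 to 5; the function names are illustrative and not taken from any library.

```python
def zero_one_loss(y_true: int, y_pred: int) -> int:
    """Return 1 for any misclassification and 0 for a correct prediction."""
    return int(y_true != y_pred)


def absolute_loss(y_true: int, y_pred: int) -> int:
    """Return the number of ordinal steps between the two labels."""
    return abs(y_true - y_pred)


# Assumption: classes are coded as the integers 1..5 in rank order.
near_miss = (1, 2)  # true class 1, predicted class 2
far_miss = (1, 5)   # true class 1, predicted class 5

# 0-1 loss cannot tell the two errors apart.
print(zero_one_loss(*near_miss), zero_one_loss(*far_miss))  # 1 1

# Absolute loss grows with the distance between the classes.
print(absolute_loss(*near_miss), absolute_loss(*far_miss))  # 1 4
```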

Improved predictions: By respecting the ordering, a model that is uncertain about an ambiguous case can fall back on a class adjacent to the likely answer instead of jumping to a distant one.

Data efficiency: Ordinal structure provides inductive bias, improving generalization with limited data.

Common approaches

Threshold-based: Learn a single regression model to predict a continuous value, then apply K-1 ordered thresholds to convert that value into one of K class labels. Simple, interpretable.
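The thresholding step can be sketched as follows. The scores are assumed to come from some already-fitted regression model, and the cut points 0.5 and 1.5 are hypothetical values chosen for illustration; in practice they would be learned or tuned on validation data.

```python
from bisect import bisect_right


def apply_thresholds(score: float, thresholds: list[float]) -> int:
    """Map a continuous score to a class index 0..K-1.

    `thresholds` holds K-1 cut points sorted in increasing order.
    A score equal to a cut point is assigned to the higher class.
    """
    return bisect_right(thresholds, score)


LABELS = ["low", "mixed", "high"]
THRESHOLDS = [0.5, 1.5]  # hypothetical cut points, not from the source

# Hypothetical outputs of a regression model.
scores = [-0.2, 0.7, 1.49, 2.3]
predicted = [LABELS[apply_thresholds(s, THRESHOLDS)] for s in scores]
print(predicted)  # ['low', 'mixed', 'mixed', 'high']
```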

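Binary decomposition (cumulative): Train K-1 binary classifiers, where classifier k distinguishes "≤ k" from "> k" for k = 1 to K-1. Unlike standard one-vs-rest, which trains K classifiers of the form "class k vs. all others", these cumulative targets encode the ordering directly. Independently trained classifiers are not guaranteed to produce monotonically consistent probabilities, so the outputs may need to be reconciled.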
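The K-1 binary outputs still have to be combined into a single class prediction. One common way, sketched below, is to difference the estimated probabilities P(y > k). The clipping and renormalization are an assumption added here to cope with classifiers whose outputs are not monotone; the function name and the example probabilities are hypothetical.

```python
def class_probabilities(p_greater: list[float]) -> list[float]:
    """Turn K-1 estimates of P(y > k), k = 1..K-1, into K class probabilities.

    P(y = 1) = 1 - P(y > 1)
    P(y = k) = P(y > k-1) - P(y > k)   for 1 < k < K
    P(y = K) = P(y > K-1)
    """
    bounds = [1.0] + list(p_greater) + [0.0]
    # Clip negative differences, which appear when the binary classifiers
    # disagree about the ordering, then renormalize so the result sums to 1.
    raw = [max(bounds[i] - bounds[i + 1], 0.0) for i in range(len(bounds) - 1)]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical outputs of four binary classifiers on a 5-point scale.
p_greater = [0.9, 0.7, 0.2, 0.05]
probs = class_probabilities(p_greater)

# Classes are numbered 1..K, so add 1 to the argmax index.
predicted_class = 1 + max(range(len(probs)), key=probs.__getitem__)
print(predicted_class)  # 3
```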

Copula ordinal regression: Model the joint distribution of multiple ordinal variables capturing dependencies (e.g., factuality and bias jointly). Used when multiple ordinal tasks are correlated.

Neural approaches: Embed inputs in a shared representation space, then apply ordinal-aware loss functions (e.g., ordinal cross-entropy).
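"Ordinal cross-entropy" does not name a single standard loss. One possible reading, sketched below, encodes each label as K-1 cumulative binary targets and sums a binary cross-entropy over K-1 output logits. This is an illustrative assumption, not necessarily the loss used in any particular paper, and the logits are made up rather than produced by a real network.

```python
import math


def cumulative_targets(label: int, num_classes: int) -> list[int]:
    """Encode a label in 0..K-1 as K-1 binary targets: target[k] = 1 if label > k."""
    return [int(label > k) for k in range(num_classes - 1)]


def ordinal_bce(logits: list[float], label: int) -> float:
    """Sum of binary cross-entropies over the K-1 threshold logits."""
    targets = cumulative_targets(label, len(logits) + 1)
    loss = 0.0
    for z, t in zip(logits, targets):
        # Numerically stable form of -[t*log(sigmoid(z)) + (1-t)*log(1-sigmoid(z))].
        loss += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return loss


# Hypothetical logits for a 5-class problem that favour class index 2.
logits = [4.0, 3.0, -3.0, -4.0]

print(cumulative_targets(2, 5))  # [1, 1, 0, 0]

# The loss grows as the true label moves away from the favoured class.
for label in (2, 3, 4):
    print(label, round(ordinal_bce(logits, label), 3))
```

With this encoding, each additional threshold on the wrong side of the label adds its own penalty, which is what makes distant errors cost more than adjacent ones.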

Evaluation metrics

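Mean Absolute Error (MAE): Average distance between predicted and true ordinal class, with classes coded as consecutive integers. Respects ordering, but implicitly treats adjacent classes as equally spaced.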

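Mean Absolute Error - Macro (MAE_M): Macro-averaged variant that computes MAE separately for each true class and then averages across classes, so rare classes count as much as frequent ones. More robust to class imbalance than simple MAE.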
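The effect of macro-averaging shows up clearly on an imbalanced toy example. The sketch below assumes integer-coded labels and averages only over classes that actually occur in the true labels; the data are invented for illustration.

```python
from collections import defaultdict


def mae(y_true: list[int], y_pred: list[int]) -> float:
    """Mean absolute distance between true and predicted class indices."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_mae(y_true: list[int], y_pred: list[int]) -> float:
    """MAE computed per true class, then averaged over the classes present."""
    errors = defaultdict(list)
    for t, p in zip(y_true, y_pred):
        errors[t].append(abs(t - p))
    per_class = [sum(e) / len(e) for e in errors.values()]
    return sum(per_class) / len(per_class)


# Invented data: eight items of class 0, two of class 2,
# and a degenerate model that always predicts class 0.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 2, 2]
y_pred = [0] * 10

print(mae(y_true, y_pred))        # 0.4
print(macro_mae(y_true, y_pred))  # 1.0
```

Plain MAE looks acceptable because the majority class dominates, while the macro average exposes that the minority class is always missed by two steps.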

Ordinal-aware accuracy: Variants that assign partial credit for near-miss predictions.
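One simple variant of this idea counts a prediction as correct when it lands within a fixed number of steps of the true class. The sketch below is one such variant; the function name, the default tolerance of 1, and the data are assumptions made for illustration.

```python
def accuracy_within(y_true: list[int], y_pred: list[int], tolerance: int = 1) -> float:
    """Fraction of predictions at most `tolerance` ordinal steps from the truth."""
    hits = sum(abs(t - p) <= tolerance for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Invented labels on a 5-point scale coded 0..4.
y_true = [0, 1, 2, 3, 4]
y_pred = [0, 2, 2, 0, 3]

print(accuracy_within(y_true, y_pred, tolerance=0))  # 0.4 (exact accuracy)
print(accuracy_within(y_true, y_pred, tolerance=1))  # 0.8 (near misses accepted)
```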

Spearman correlation: Rank correlation between predictions and true labels, capturing whether the ranking is preserved.
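Ordinal labels produce many ties, so the rank computation needs to handle them. The following is a dependency-free sketch that assigns tied values their average rank and then takes the Pearson correlation of the ranks; it is undefined (division by zero) when either input is constant. In practice a library routine such as `scipy.stats.spearmanr` would normally be used instead.

```python
import math


def average_ranks(values: list[float]) -> list[float]:
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x: list[float], y: list[float]) -> float:
    """Pearson correlation of the average ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Invented example: predictions compress the scale but keep the order.
y_true = [1, 2, 3, 4, 5]
y_pred = [1, 1, 2, 2, 3]
print(round(spearman(y_true, y_pred), 3))
```

Here the correlation stays high even though most predictions are off by one or two classes, which illustrates that Spearman correlation measures ranking quality rather than calibration of the predicted class.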

Key papers

Multi-Task Ordinal Regression for Jointly Predicting the Trustworthiness and the Leading Political Ideology of News Media: applies Copula Ordinal Regression to jointly predict outlet trustworthiness (3-point) and political ideology (7-point); demonstrates that multi-task learning with auxiliary tasks at different ordinal granularities improves performance
General ordinal regression foundation: Ordinal Regression for Machine Learning and sentiment analysis literature (e.g., SemEval sentiment tasks using 5-point scales)

Classification: more general supervised learning task
Regression: predicting continuous values; ordinal regression bridges classification and regression
Machine learning: broader field
Multi-task learning: joint learning of multiple ordinal tasks (e.g., bias and factuality) can exploit their correlation