Data augmentation
Data augmentation expands a training dataset by generating synthetic examples or by applying transformations to existing ones. In NLP, methods range from simple rule-based transformations, such as token swapping and deletion, to adversarial example generation aimed at improving model robustness and generalization.
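The rule-based end of that range can be illustrated with a minimal sketch of token swapping and token deletion. This is only an illustration under stated assumptions: whitespace tokenization, and hypothetical helper names (`random_swap`, `random_deletion`, `augment`) and parameters (`n_swaps`, `p`, `seed`) that do not come from any particular library.

```python
import random


def random_swap(tokens, n_swaps=1, rng=None):
    """Return a copy of tokens with n_swaps random pairs of positions exchanged."""
    rng = rng or random.Random()
    out = list(tokens)
    if len(out) < 2:
        return out  # nothing to swap
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)  # two distinct positions
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, p=0.1, rng=None):
    """Drop each token independently with probability p, keeping at least one."""
    rng = rng or random.Random()
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    # Assumption: an empty example is useless, so fall back to one random token.
    return kept if kept else [rng.choice(tokens)]


def augment(sentence, n_aug=4, p=0.1, seed=None):
    """Generate n_aug variants of a sentence, alternating swap and deletion."""
    rng = random.Random(seed)
    tokens = sentence.split()  # assumption: whitespace tokenization is adequate
    variants = []
    for k in range(n_aug):
        if k % 2 == 0:
            new_tokens = random_swap(tokens, n_swaps=1, rng=rng)
        else:
            new_tokens = random_deletion(tokens, p=p, rng=rng)
        variants.append(" ".join(new_tokens))
    return variants


if __name__ == "__main__":
    for variant in augment("the movie was surprisingly good", n_aug=4, seed=0):
        print(variant)
```

Passing a `seed` makes the generated variants reproducible, which helps when comparing models trained with and without augmentation. Transformations like these can change a sentence's meaning, so the original label is only assumed, not guaranteed, to carry over to each variant.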
Key papers
- TextAttack: A Framework for Adversarial Attacks, Data Augmentation, and Adversarial Training in NLP. Provides an automated toolkit for data augmentation through pre-implemented adversarial attack recipes, which can generate training data for adversarial training.
Related topics
- Adversarial training (using augmented data for training)
- Adversarial robustness (motivation for augmentation)
- Model evaluation (evaluating augmentation effectiveness)