Neural language models
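Neural language models are statistical models trained to predict tokens (words or subwords) from their context. Autoregressive models such as GPT-2 and GPT-3 predict the next token given the preceding ones; bidirectional models such as BERT predict masked tokens from context on both sides; T5 is an encoder-decoder model trained to reconstruct masked spans. These models are trained on massive corpora using transformer architectures and serve as the backbone for diverse NLP applications.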
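As a rough sketch of the difference between the autoregressive and bidirectional training objectives, the two losses can be written as below. The notation is an assumption introduced for illustration and does not come from the sources cited here: x_1, ..., x_T is a token sequence, θ denotes the model parameters, and M is the set of masked positions.

```latex
% Autoregressive (next-token) objective: each token is predicted from its left context
\mathcal{L}_{\mathrm{AR}}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(x_t \mid x_{<t}\right)

% Masked (bidirectional) objective: masked tokens are predicted from the unmasked tokens on both sides
\mathcal{L}_{\mathrm{MLM}}(\theta) = -\sum_{t \in M} \log p_\theta\left(x_t \mid x_{\setminus M}\right)
```

Only the first form defines a left-to-right factorization of the sequence probability, which is why autoregressive models are the ones typically used for open-ended generation.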
Key properties
Scale effects:
Model capability improves with parameter count and training data size. GPT-3 (175B parameters) exhibits few-shot learning and reasoning abilities that are weak or absent in smaller models such as GPT-2 (1.5B parameters) and BERT-large (340M parameters), which suggests that some behaviors emerge with scale.
Exposure bias and training-inference mismatch:
During training, a model conditions on human-written prefixes (teacher forcing); at inference, it conditions on tokens it generated itself. Errors can compound under this shift in the conditioning distribution, producing characteristic artifacts: repetition, loss of coherence over long sequences, and vocabulary concentrated on high-likelihood tokens.
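The repetition artifact can be illustrated with a toy model. The following is a minimal sketch, assuming a bigram count model and greedy decoding as stand-ins for a neural model and its decoder; the corpus and the helper names (`train_bigram`, `greedy_generate`, `distinct_ratio`) are hypothetical and chosen only for this example.

```python
from collections import Counter, defaultdict

# Hypothetical "human-written" training corpus (illustration only).
CORPUS = [
    "the cat sat on the mat",
    "the cat sat on the rug",
    "the dog saw the cat",
]


def train_bigram(corpus):
    """Count next-token frequencies per token.

    Every context seen here is a human-written token, which mirrors
    teacher forcing in a very reduced form.
    """
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def greedy_generate(counts, start, max_len):
    """Free-running generation: each step conditions on the model's own last output."""
    out = [start]
    for _ in range(max_len - 1):
        followers = counts.get(out[-1])
        if not followers:
            break  # no continuation was observed for this token
        out.append(followers.most_common(1)[0][0])
    return out


def distinct_ratio(tokens):
    """Fraction of distinct tokens; lower values indicate more repetition."""
    return len(set(tokens)) / len(tokens)


if __name__ == "__main__":
    model = train_bigram(CORPUS)
    generated = greedy_generate(model, "the", 12)
    human = CORPUS[0].split()
    print(" ".join(generated))
    print("generated distinct ratio:", round(distinct_ratio(generated), 3))
    print("human distinct ratio:", round(distinct_ratio(human), 3))
```

With this corpus, greedy decoding from "the" cycles through "the cat sat on", so the generated sequence has a much lower distinct-token ratio than the human sentence. A bigram model is far simpler than a transformer, so this shows only the shape of the artifact, not its cause in neural models.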
Output distribution properties:
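Language models concentrate probability mass on a narrow set of high-likelihood tokens, and common decoding strategies draw generated text mostly from that set. Human text, by contrast, more often uses tokens the model ranks low, drawing on a wider range of the vocabulary. This asymmetry is the foundation of white-box generated text detection.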
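A rank-based check in the spirit of this asymmetry can be sketched as follows. This is a minimal illustration, not the method of any cited paper: the per-position probabilities are made-up numbers standing in for a real model's predicted distributions, and `token_rank` and `top_k_fraction` are hypothetical helper names.

```python
def token_rank(probs, token):
    """1-based rank of `token` when the vocabulary is sorted by descending probability."""
    p = probs[token]
    return 1 + sum(1 for q in probs.values() if q > p)


def top_k_fraction(step_probs, tokens, k):
    """Fraction of observed tokens that fall within the model's top-k predictions."""
    ranks = [token_rank(probs, tok) for probs, tok in zip(step_probs, tokens)]
    return sum(r <= k for r in ranks) / len(ranks)


# Assumed predicted distributions at three positions (toy values, not model output).
STEP_PROBS = [
    {"the": 0.6, "a": 0.3, "one": 0.1},
    {"cat": 0.5, "dog": 0.3, "ocelot": 0.2},
    {"sat": 0.7, "slept": 0.2, "brooded": 0.1},
]

GENERATED = ["the", "cat", "sat"]  # always the top-ranked token
HUMAN = ["a", "ocelot", "sat"]     # includes lower-ranked tokens

if __name__ == "__main__":
    print("generated top-1 fraction:", top_k_fraction(STEP_PROBS, GENERATED, 1))
    print("human top-1 fraction:", round(top_k_fraction(STEP_PROBS, HUMAN, 1), 3))
```

In a white-box setting, `STEP_PROBS` would come from the scoring model's forward pass over the text being examined. A high share of top-ranked tokens is then one signal, among others, that the text may have been generated.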
Key papers
- Ippolito et al. (2019) — Automatic Detection of Generated Text is Easiest when Humans are Fooled: Comparative study of human and automatic detection of GPT-2 outputs across decoding strategies; identifies brittleness of automatic detectors to distribution shift and greater robustness of human judgment.
- Gehrmann et al. (2019) — GLTR: Analysis of language model output distribution properties showing that generated text concentrates on top-ranked (high-probability) tokens; introduces a statistical detection framework exploiting this asymmetry.
- Solaiman et al. (2019) — OpenAI GPT-2 Release Report: Comprehensive risk analysis and detection strategies for large language models; studies effects of sampling method on both generation quality and detectability.
- Zellers et al. (2019) — Grover: Explores controlled generation of news-like text and the role of conditional vs. unconditional generation in creating realistic artifacts.
- Adelani et al. (2019) — Generating Sentiment-Preserving Fake Online Reviews: Demonstrates practical attack using fine-tuned GPT-2 to generate fake product reviews indistinguishable from authentic ones; shows both human raters and automated detectors fail to reliably identify machine-generated reviews.
Related topics
- Language Models — broader field including non-neural models
- Natural Language Generation — generation strategies and quality
- Generated text detection — detecting outputs from neural language models
- Synthetic media — generation of other modalities (video, audio)