Word Embeddings¶

Word embeddings are continuous vector representations of words, learned from large corpora, that capture semantic and syntactic relationships. They form the foundation of modern NLP systems by mapping words into a dense vector space in which similar words have similar vectors.
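
To make "similar words have similar representations" concrete, here is a minimal sketch that compares words by cosine similarity. The three 4-dimensional vectors are hand-written assumptions for illustration only; real embeddings are learned from a corpus and usually have a few hundred dimensions. The helper names `cosine_similarity` and `most_similar` are hypothetical and not taken from any particular library.

```python
import math

# Hypothetical toy embeddings, written by hand for illustration.
# Learned embeddings would come from a model trained on a large corpus.
embeddings = {
    "cat": [0.9, 0.8, 0.1, 0.0],
    "dog": [0.8, 0.9, 0.2, 0.1],
    "car": [0.1, 0.0, 0.9, 0.8],
}


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, table):
    """Return the other word in `table` whose vector is closest to `word`."""
    candidates = [w for w in table if w != word]
    return max(candidates, key=lambda w: cosine_similarity(table[word], table[w]))


print(most_similar("cat", embeddings))
```

With these toy vectors, "cat" lies much closer to "dog" than to "car", which is the kind of neighbourhood structure learned embeddings are expected to show.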

Key Papers¶

Efficient Estimation of Word Representations in Vector Space — CBOW and Skip-gram architectures enabling efficient learning of high-quality word embeddings from billions of words; demonstrates linear structure of embeddings enabling vector arithmetic (king − man + woman ≈ queen)
Distributed Representations of Words and Phrases and their Compositionality — Extends word embedding methods with phrase representations and techniques (negative sampling, subsampling) that improve both speed and accuracy; demonstrates compositional structure allowing meaningful vector addition
Anand, Chakraborty & Park (2016) — We used Neural Networks to Detect Clickbaits: You won't believe what happened Next! — Applies distributed word embeddings (with character-level CNN) to clickbait detection; achieves 98% accuracy on 15K headlines
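
The vector arithmetic mentioned in the first entry (king − man + woman ≈ queen) can be sketched as a nearest-neighbour search around the combined vector. The 3-dimensional vectors below are hand-made assumptions whose axes can loosely be read as (royalty, male, female); they are not learned embeddings, and the `analogy` helper is a hypothetical name. The query words are excluded from the candidates, as is common in analogy evaluations.

```python
import math

# Hypothetical toy vectors, chosen by hand so the analogy works exactly.
vectors = {
    "king": [0.9, 0.9, 0.1],
    "queen": [0.9, 0.1, 0.9],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def analogy(a, b, c, table):
    """Find the word closest to vec(a) - vec(b) + vec(c), skipping a, b and c."""
    target = [x - y + z for x, y, z in zip(table[a], table[b], table[c])]
    candidates = [w for w in table if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, table[w]))


print(analogy("king", "man", "woman", vectors))
```

On learned embeddings the match is approximate rather than exact, so the result depends on the training corpus and on which words are in the vocabulary.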

Related topics¶

Text Representations (broader category of representation learning)
Natural Language Processing (broader NLP methods)
Phrase Embeddings (extension to multi-word units)
