All terms · Model Architecture

Embedding

A numerical representation of text (or other data) that captures meaning, enabling semantic search and comparison.

An embedding is a vector—a list of numbers—that represents the meaning of a word, sentence, or document. Embeddings are derived from neural networks trained on large text corpora. The key insight is that embeddings capture meaning: similar words and concepts have similar embeddings (they sit close together in the vector space), so you can find semantically related documents even if they don't share keywords.

Embeddings power retrieval-augmented generation (RAG), semantic search, and clustering. You convert documents into embeddings, store them in a vector database, and when a user asks a question, you embed their question and find the most similar documents. Embedding models come in various sizes—small models are fast and cheap, large models are slower but more accurate.
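The retrieval step above can be sketched in a few lines. This is a minimal, illustrative version: the "vector database" is a plain dictionary, and the embedding values are made up rather than produced by a real embedding model, which is how a production system would obtain them.

```python
import math

def cosine(a, b):
    # Similarity score: 1.0 means identical direction, near 0 means unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Toy in-memory "vector database": document text -> precomputed embedding.
# Real embeddings have hundreds or thousands of dimensions; these 3-d vectors
# are hypothetical values chosen only to illustrate the ranking step.
vector_db = {
    "How to change a car tire": [0.90, 0.10, 0.20],
    "Recipe for banana bread":  [0.10, 0.90, 0.30],
    "Buying a used automobile": [0.85, 0.15, 0.25],
}

def retrieve(query_embedding, db, top_k=2):
    # Rank stored documents by similarity to the query embedding.
    ranked = sorted(db.items(), key=lambda kv: cosine(query_embedding, kv[1]),
                    reverse=True)
    return [doc for doc, _ in ranked[:top_k]]

# Pretend this vector is the embedding of "vehicle maintenance tips".
query = [0.88, 0.12, 0.22]
print(retrieve(query, vector_db))
```

The two car-related documents rank above the recipe even though none of them share the word "vehicle" with the query—which is the point of semantic search.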

Embeddings are generated by models like OpenAI's text-embedding-3, Cohere Embed, or open-source alternatives.

Example

The embedding of "car" and "automobile" would be very close (similar meaning); "car" and "banana" would be far apart.
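This comparison is typically made with cosine similarity. The sketch below uses small, hand-picked 3-dimensional vectors as stand-in embeddings (real models output much longer vectors), so the exact numbers are illustrative, not model output.

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors:
    # 1.0 = same direction, ~0 = unrelated, -1.0 = opposite.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings; a real model would produce these vectors.
car        = [0.90, 0.10, 0.30]
automobile = [0.88, 0.12, 0.33]
banana     = [0.10, 0.95, 0.20]

print(cosine_similarity(car, automobile))  # near 1.0: similar meaning
print(cosine_similarity(car, banana))      # much lower: unrelated meaning
```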