AI & machine learning

Embedding

A numerical representation of a word, sentence, or image as a list of numbers (a vector) that captures its meaning. Similar things have similar embeddings, which makes them useful for search, clustering, and recommendation.

Also known as: vector embedding, vector representation

Embeddings are how modern AI systems represent meaning numerically. An embedding model takes an input (a word, a sentence, a paragraph, an image, an audio clip) and produces a fixed-length vector of numbers, usually somewhere between 384 and 4096 dimensions. The clever part is that the model is trained so that similar inputs produce nearby vectors: the cosine similarity between two embeddings is a good proxy for how semantically related their inputs are.
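As a minimal sketch of that comparison, here is cosine similarity computed by hand on toy 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions; the specific numbers are invented for illustration):

```python
from math import sqrt

def cosine_similarity(a, b):
    # Dot product of the two vectors, divided by the product of their lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": close in meaning means close in vector space.
cat = [0.9, 0.1, 0.2]
kitten = [0.85, 0.15, 0.25]
invoice = [0.1, 0.9, 0.8]

print(cosine_similarity(cat, kitten))   # near 1.0: similar inputs
print(cosine_similarity(cat, invoice))  # much lower: unrelated inputs
```

A score near 1.0 means the inputs are semantically close; scores fall toward 0 (or below, for some models) as meanings diverge.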

The most common use case is semantic search. Instead of matching documents by keyword, you embed both the query and the documents, then return the documents whose embeddings are closest to the query embedding. This works much better than keyword search for queries like “how do I fix my squeaky door” because the embedding captures intent and meaning rather than literal word matches. Almost every modern search system uses embeddings somewhere in the stack, from Google’s web search to internal documentation tools to RAG (retrieval-augmented generation) systems for LLMs.
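The whole pipeline fits in a few lines. In this sketch, `embed` is a stand-in for a real embedding model (it returns hand-picked toy vectors so the example runs); the document titles and vector values are invented:

```python
def embed(text):
    # Stand-in for a real embedding model: a fixed lookup of toy vectors.
    toy_vectors = {
        "how do I fix my squeaky door": [0.8, 0.1, 0.1],
        "Lubricating stiff hinges with household oil": [0.75, 0.2, 0.15],
        "Quarterly sales report, Q3": [0.05, 0.9, 0.3],
        "Resetting your home router": [0.2, 0.3, 0.9],
    }
    return toy_vectors[text]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / ((sum(x * x for x in a) ** 0.5) * (sum(x * x for x in b) ** 0.5))

documents = [
    "Lubricating stiff hinges with household oil",
    "Quarterly sales report, Q3",
    "Resetting your home router",
]

query = "how do I fix my squeaky door"
q_vec = embed(query)
# Rank documents by similarity to the query embedding, best match first.
ranked = sorted(documents, key=lambda d: cosine(q_vec, embed(d)), reverse=True)
print(ranked[0])
```

Note that the top result shares no keywords with the query; the match comes entirely from the vectors being close.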

Embeddings are also the foundation of vector databases (Pinecone, Weaviate, Chroma, pgvector), which specialise in storing and querying high-dimensional vectors efficiently. A vector database can find the 10 most similar embeddings out of billions in a few milliseconds using approximate nearest neighbour algorithms. This is the technology that makes RAG systems work: store all your company's documents as embeddings, then for each user query, retrieve the most relevant documents and feed them to an LLM as context.
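Conceptually, the core query is just "top k by similarity". This sketch does it exactly, by brute force; a real vector database gets the same answer from billions of vectors without scanning them all, using approximate nearest-neighbour indexes such as HNSW. The document IDs and vectors here are invented:

```python
import heapq

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / ((sum(x * x for x in a) ** 0.5) * (sum(x * x for x in b) ** 0.5))

def top_k(query_vec, index, k=2):
    # Exact top-k by cosine similarity over (doc_id, vector) pairs.
    # Vector databases replace this linear scan with an ANN index.
    return heapq.nlargest(k, index, key=lambda item: cosine(query_vec, item[1]))

index = [
    ("doc-1", [0.9, 0.1, 0.0]),
    ("doc-2", [0.8, 0.2, 0.1]),
    ("doc-3", [0.0, 0.1, 0.9]),
]

for doc_id, _ in top_k([1.0, 0.0, 0.0], index):
    print(doc_id)
```

In a RAG system, the text of those top-k documents is then pasted into the LLM's prompt as context for answering the user's question.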

In DeAI, embedding models are usually one of the workloads that decentralised inference networks support alongside chat-style LLMs. They're cheaper to run than full chat models (smaller, faster), which makes them a natural fit for distributed inference. Embeddings are also relevant for decentralised search and recommendation systems. The OYM coverage of inference projects (Venice, Morpheus, io.net) usually notes whether their API supports embedding endpoints alongside text generation.

Related terms