Interactive reference for embedding models used in RAG, semantic search, and clustering. Each model lists MTEB score, embedding dimensions, max context, license, and per-token cost. Specs verified against the original paper, HuggingFace model card, and the MTEB leaderboard.
An embedding model maps text (or images, code, audio) to a fixed-length vector in which cosine similarity tracks semantic similarity. Embedding models power RAG retrieval, semantic search, deduplication, clustering, and recommendation. The two axes that matter in production: retrieval quality (MTEB score) and deployment footprint (dimensions, context length, cost).
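The core retrieval operation is cosine similarity between vectors. A minimal sketch in pure Python, using toy 4-dimensional vectors as stand-ins for real model output (production embeddings are 768 to 3072 dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors: 1.0 = same
    direction, 0.0 = orthogonal, -1.0 = opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings for a query and two documents.
query = [0.10, 0.30, -0.20, 0.90]
doc_similar = [0.12, 0.28, -0.18, 0.85]   # near-duplicate of the query
doc_unrelated = [-0.70, 0.10, 0.60, -0.10]

# The semantically closer document scores higher.
assert cosine_similarity(query, doc_similar) > cosine_similarity(query, doc_unrelated)
```

In a real pipeline the vectors come from a model API and the comparison runs over an index (e.g. a vector database) rather than a pair of lists, but the ranking criterion is the same.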
Google's July 2025 Gemini-based embedder. 3072 dims, 2K context, Matryoshka to 1536/768, 100+ languages. Topped MTEB multilingual at launch. $0.15 per 1M tokens.
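Matryoshka support means the 3072-dim vector can be truncated to a prefix (e.g. 1536 or 768 dims) with modest quality loss, cutting storage and search cost. A minimal sketch of the truncation step; the vector values here are illustrative, not real model output:

```python
import math

def truncate_matryoshka(vec: list[float], dims: int) -> list[float]:
    """Keep the first `dims` components and re-normalize to unit length.
    Matryoshka-trained models pack the most information into the leading
    dimensions, so the prefix remains a usable embedding."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

# Hypothetical 8-dim vector standing in for a 3072-dim embedding.
full = [0.4, -0.1, 0.3, 0.2, 0.05, -0.02, 0.01, 0.01]
short = truncate_matryoshka(full, 4)
assert len(short) == 4
# Result is unit-length, so cosine similarity reduces to a dot product.
assert abs(sum(x * x for x in short) - 1.0) < 1e-9
```

Truncate both query and document vectors to the same prefix length before comparing them.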
Cohere's April 2025 flagship. 1536 dims, 128K context, Matryoshka, multimodal (text + images), 100+ languages. API-only, $0.12 per 1M tokens.