Embedding providers convert text into vector representations that enable semantic search. These vectors capture the meaning of text, allowing Cognee to find conceptually related content even when the wording is different.
New to configuration? See the Setup Configuration Overview for the complete workflow: install extras → create .env → choose providers → handle pruning.

Supported Providers

Cognee supports multiple embedding providers:
  • OpenAI — Text embedding models via OpenAI API (default)
  • Azure OpenAI — Text embedding models via Azure OpenAI Service
  • Google Gemini — Embedding models via Google AI
  • Mistral — Embedding models via Mistral AI
  • Ollama — Local embedding models via Ollama
  • Fastembed — CPU-friendly local embeddings
  • Custom — OpenAI-compatible embedding endpoints
LLM/Embedding Configuration: If you configure only LLM or only embeddings, the other defaults to OpenAI. Ensure you have a working OpenAI API key, or configure both LLM and embeddings to avoid unexpected defaults.
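The silent OpenAI default described above can be caught with a small pre-flight check before starting Cognee. The sketch below is illustrative only: `check_provider_pairing` is a hypothetical helper, not part of Cognee, and `LLM_PROVIDER` is assumed here to be the LLM-side counterpart of `EMBEDDING_PROVIDER`.

```python
import os


def check_provider_pairing(env=None):
    """Return warnings when only one of LLM / embeddings is configured.

    Hypothetical helper, not a Cognee API. LLM_PROVIDER is an assumed
    variable name for the LLM-side provider setting.
    """
    env = os.environ if env is None else env
    llm_set = bool(env.get("LLM_PROVIDER"))
    emb_set = bool(env.get("EMBEDDING_PROVIDER"))

    warnings = []
    # Exactly one side configured: the other side falls back to OpenAI.
    if llm_set != emb_set:
        missing = "embeddings" if llm_set else "the LLM"
        warnings.append(
            f"Only one side is configured; {missing} will default to OpenAI."
        )
    return warnings


if __name__ == "__main__":
    for message in check_provider_pairing():
        print("WARNING:", message)
```

An empty list means either both sides are configured or neither is (in which case both use the OpenAI default).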

Configuration

Set these environment variables in your .env file:
  • EMBEDDING_PROVIDER — The provider to use (openai, gemini, mistral, ollama, fastembed, custom)
  • EMBEDDING_MODEL — The specific embedding model to use
  • EMBEDDING_DIMENSIONS — The vector dimension size (must match your vector store)
  • EMBEDDING_API_KEY — Your API key (falls back to LLM_API_KEY if not set)
  • EMBEDDING_ENDPOINT — Custom endpoint URL (for Azure, Ollama, or custom providers)
  • EMBEDDING_API_VERSION — API version (for Azure OpenAI)
  • EMBEDDING_MAX_TOKENS — Maximum tokens per request (optional)
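To make the variables above concrete, here is a minimal sketch of how they could be read into a typed settings object, including the `LLM_API_KEY` fallback. `EmbeddingSettings` and `load_embedding_settings` are hypothetical names used for illustration; Cognee's own configuration loader may behave differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class EmbeddingSettings:
    """Illustrative container for the EMBEDDING_* variables (not a Cognee class)."""
    provider: str
    model: Optional[str]
    dimensions: Optional[int]
    api_key: Optional[str]
    endpoint: Optional[str]
    api_version: Optional[str]
    max_tokens: Optional[int]


def _optional_int(value: Optional[str]) -> Optional[int]:
    # Unset or empty variables become None instead of raising.
    return int(value) if value else None


def load_embedding_settings(env: Mapping[str, str] = os.environ) -> EmbeddingSettings:
    provider = env.get("EMBEDDING_PROVIDER", "openai")  # OpenAI is the default
    api_key = env.get("EMBEDDING_API_KEY")
    # Fall back to LLM_API_KEY, except for custom providers.
    if not api_key and provider != "custom":
        api_key = env.get("LLM_API_KEY")
    return EmbeddingSettings(
        provider=provider,
        model=env.get("EMBEDDING_MODEL"),
        dimensions=_optional_int(env.get("EMBEDDING_DIMENSIONS")),
        api_key=api_key,
        endpoint=env.get("EMBEDDING_ENDPOINT"),
        api_version=env.get("EMBEDDING_API_VERSION"),
        max_tokens=_optional_int(env.get("EMBEDDING_MAX_TOKENS")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the fallback logic easy to exercise in tests.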

Provider Setup Guides

OpenAI is the default provider and serves embeddings through its hosted API.
EMBEDDING_PROVIDER="openai"
EMBEDDING_MODEL="openai/text-embedding-3-large"
EMBEDDING_DIMENSIONS="3072"
# Optional
# EMBEDDING_API_KEY=sk-...   # falls back to LLM_API_KEY if omitted
# EMBEDDING_ENDPOINT=https://api.openai.com/v1
# EMBEDDING_API_VERSION=
# EMBEDDING_MAX_TOKENS=8191
Use Azure OpenAI Service for embeddings with your own deployment. Keep EMBEDDING_PROVIDER set to openai and prefix the model name with azure/.
EMBEDDING_PROVIDER="openai"
EMBEDDING_MODEL="azure/text-embedding-3-large"
EMBEDDING_ENDPOINT="https://<your-az>.cognitiveservices.azure.com/openai/deployments/text-embedding-3-large"
EMBEDDING_API_KEY="az-..."
EMBEDDING_API_VERSION="2023-05-15"
EMBEDDING_DIMENSIONS="3072"
Use Google’s embedding models for semantic search.
EMBEDDING_PROVIDER="gemini"
EMBEDDING_MODEL="gemini/text-embedding-004"
EMBEDDING_API_KEY="AIza..."
EMBEDDING_DIMENSIONS="768"
Use Mistral’s embedding models for high-quality vector representations.
EMBEDDING_PROVIDER="mistral"
EMBEDDING_MODEL="mistral/mistral-embed"
EMBEDDING_API_KEY="sk-mis-..."
EMBEDDING_DIMENSIONS="1024"
Installation: Install the required dependency:
pip install "mistral-common[sentencepiece]"
Run embedding models locally with Ollama for privacy and cost control.
EMBEDDING_PROVIDER="ollama"
EMBEDDING_MODEL="nomic-embed-text:latest"
EMBEDDING_ENDPOINT="http://localhost:11434/api/embed"
EMBEDDING_DIMENSIONS="768"
HUGGINGFACE_TOKENIZER="nomic-ai/nomic-embed-text-v1.5"
Installation: Install Ollama from ollama.ai and pull your desired embedding model:
ollama pull nomic-embed-text:latest
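Before pointing Cognee at a local Ollama server, it can help to confirm that the endpoint responds and that the returned vectors have the expected length. The sketch below talks to Ollama directly using only the standard library; it assumes the `/api/embed` request shape (`model` plus `input`, answered with an `embeddings` list) and is not how Cognee itself issues requests. `build_embed_request` and `embed` are hypothetical helper names.

```python
import json
import urllib.request


def build_embed_request(texts, model="nomic-embed-text:latest",
                        endpoint="http://localhost:11434/api/embed"):
    """Build (but do not send) a POST request for Ollama's embed endpoint."""
    payload = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, **kwargs):
    """Send the request and return one vector per input text."""
    request = build_embed_request(texts, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["embeddings"]


if __name__ == "__main__":
    # Requires a running Ollama server with the model already pulled.
    vectors = embed(["hello world"])
    print("vector length:", len(vectors[0]))
```

The printed vector length should match the value you set in EMBEDDING_DIMENSIONS.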
Use Fastembed for CPU-friendly local embeddings without GPU requirements.
EMBEDDING_PROVIDER="fastembed"
EMBEDDING_MODEL="sentence-transformers/all-MiniLM-L6-v2"
EMBEDDING_DIMENSIONS="384"
Installation: Fastembed is included by default with Cognee.
Known Issues:
  • As of September 2025, Fastembed is not compatible with Python 3.13 or later
Use OpenAI-compatible embedding endpoints from other providers.
EMBEDDING_PROVIDER="custom"
EMBEDDING_MODEL="provider/your-embedding-model"
EMBEDDING_ENDPOINT="https://your-endpoint.example.com/v1"
EMBEDDING_API_KEY="provider-..."
EMBEDDING_DIMENSIONS="<match-your-model>"

Advanced Options

# Rate limiting for embedding requests
EMBEDDING_RATE_LIMIT_ENABLED="true"
EMBEDDING_RATE_LIMIT_REQUESTS="10"
EMBEDDING_RATE_LIMIT_INTERVAL="5"
# Mock embeddings for testing (returns zero vectors)
MOCK_EMBEDDING="true"
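One plausible reading of the rate-limit settings above is "at most EMBEDDING_RATE_LIMIT_REQUESTS requests per EMBEDDING_RATE_LIMIT_INTERVAL". The sketch below illustrates that idea with a sliding-window limiter. It is an assumption-laden illustration, not Cognee's implementation: the interval is assumed to be in seconds, and `SlidingWindowLimiter` is a hypothetical class.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` calls within any `interval`-long window.

    Hypothetical illustration; the interval unit (seconds) is an assumption.
    """

    def __init__(self, max_requests=10, interval=5.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.interval = interval
        self.clock = clock          # injectable so tests need not sleep
        self._stamps = deque()      # timestamps of accepted requests

    def try_acquire(self):
        """Return True and record the request if it fits in the window."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.interval:
            self._stamps.popleft()
        if len(self._stamps) < self.max_requests:
            self._stamps.append(now)
            return True
        return False
```

A caller would check `try_acquire()` before each embedding request and wait or retry when it returns False.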

Important Notes

  • Dimension Consistency: EMBEDDING_DIMENSIONS must match your vector store collection schema
  • API Key Fallback: If EMBEDDING_API_KEY is not set, Cognee uses LLM_API_KEY (except for custom providers)
  • Tokenization: For Ollama and Hugging Face models, set HUGGINGFACE_TOKENIZER for proper token counting
  • Performance: Local providers (Ollama, Fastembed) are typically slower than hosted APIs, but they keep data on your machine and avoid per-request API costs
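The dimension-consistency note is the one most likely to cause hard-to-diagnose failures, so a quick guard can be worthwhile when switching models. The following is a minimal sketch under the assumption that embeddings arrive as lists of floats; `validate_dimensions` is a hypothetical helper and not part of Cognee.

```python
def validate_dimensions(vectors, expected_dimensions):
    """Raise ValueError if any vector's length differs from the expected size.

    Hypothetical helper: `expected_dimensions` would normally be the integer
    value of EMBEDDING_DIMENSIONS.
    """
    for index, vector in enumerate(vectors):
        if len(vector) != expected_dimensions:
            raise ValueError(
                f"vector {index} has {len(vector)} dimensions, "
                f"expected {expected_dimensions}"
            )
```

For example, calling it with 768-dimensional vectors while EMBEDDING_DIMENSIONS is 384 raises immediately instead of failing later at the vector store.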