Run Cognee entirely on your own machine, with no cloud API key required. The key rule is that the LLM provider and the embedding provider must both be configured to use a local backend; if you configure only one, the other falls back to OpenAI.
Before you start:
  • Complete Quickstart to understand basic operations
  • Install Ollama if using the Ollama options below
After switching to a local provider for the first time, call cognee.prune.prune_system(metadata=True) before running cognify to ensure there are no stale vector collections from the previous (OpenAI) embedding dimensions.
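A minimal sketch of that reset step, assuming (as in the Quickstart) that Cognee's top-level operations are coroutines that must be awaited. The helper name reset_and_cognify and the sample text are illustrative, and the cognee module is passed in as a parameter only to make the call order explicit:

```python
import asyncio


async def reset_and_cognify(cognee, text):
    """Clear stale system state, then ingest and process `text`.

    `cognee` is the imported cognee module. Assumption: prune_system, add,
    and cognify are awaitable, as in the Quickstart.
    """
    # Drop vector collections built with the previous embedding dimensions.
    await cognee.prune.prune_system(metadata=True)
    # Re-ingest the text and rebuild with the local providers.
    await cognee.add(text)
    await cognee.cognify()


# Usage (requires cognee installed and the .env described in this guide):
#   import cognee
#   asyncio.run(reset_and_cognify(cognee, "Cognee runs locally with Ollama."))
```

The prune call comes first so that cognify never writes new 768-dimension vectors into a collection created for a different embedding size.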
Fully local setup using Ollama for both text generation and embeddings.
Prerequisites: Install Ollama and pull the required models:
ollama pull llama3.1:8b
ollama pull nomic-embed-text:latest
.env configuration:
# LLM — Ollama
LLM_PROVIDER="ollama"
LLM_MODEL="llama3.1:8b"
LLM_ENDPOINT="http://localhost:11434/v1"
LLM_API_KEY="ollama"

# Embeddings — Ollama
EMBEDDING_PROVIDER="ollama"
EMBEDDING_MODEL="nomic-embed-text:latest"
EMBEDDING_ENDPOINT="http://localhost:11434/api/embed"
EMBEDDING_DIMENSIONS="768"
HUGGINGFACE_TOKENIZER="nomic-ai/nomic-embed-text-v1.5"
LLM_API_KEY="ollama" is a placeholder required by the client library — Ollama itself does not validate it. HUGGINGFACE_TOKENIZER is the HuggingFace repo ID of the tokenizer used for token counting when sending requests to the Ollama embedding endpoint.
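If you want to confirm that EMBEDDING_DIMENSIONS matches what the model actually returns, the following sketch probes the embedding endpoint directly. It assumes Ollama's native embed API accepts a JSON body with "model" and "input" and answers with an "embeddings" list of vectors; the function names are illustrative and not part of Cognee:

```python
import json
from urllib.request import Request, urlopen


def build_embed_request(endpoint, model, text):
    """Build a POST request for Ollama's native embedding endpoint."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return Request(endpoint, data=body, headers=headers, method="POST")


def embedding_dimension(payload):
    """Return the length of the first vector in an embed response payload."""
    return len(payload["embeddings"][0])


def dimensions_match(endpoint, model, expected, timeout=30):
    """Send one probe text and compare the vector length with `expected`."""
    request = build_embed_request(endpoint, model, "dimension probe")
    with urlopen(request, timeout=timeout) as response:
        return embedding_dimension(json.load(response)) == expected


# Usage (requires a running Ollama server with the model pulled):
#   dimensions_match("http://localhost:11434/api/embed",
#                    "nomic-embed-text:latest", 768)
```

A mismatch here usually means EMBEDDING_DIMENSIONS was copied from a different model's configuration.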

Troubleshooting

Cognee is free and open source: running it locally with Ollama needs no account, subscription, or paid API key. You are not being asked to pay for anything.
The LLMAPIKeyNotSetError appears because Ollama is one of the providers for which Cognee requires a non-empty LLM_API_KEY, even though Ollama itself ignores the value. If LLM_API_KEY is unset or empty, Cognee raises the error before it ever contacts your local server.
The fix is to set any placeholder string. The convention is ollama:
LLM_PROVIDER="ollama"
LLM_MODEL="llama3.1:8b"
LLM_ENDPOINT="http://localhost:11434/v1"
LLM_API_KEY="ollama"
For the local examples above, keep LLM_API_KEY="ollama" in place. Fastembed does not need an embedding API key, and Ollama embeddings use the same local placeholder. Use a complete .env configuration, as shown above, so that neither provider falls back to OpenAI.
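Both failure modes described so far (one provider left unset, or an empty LLM_API_KEY) can be caught before Cognee starts. The sketch below is a standalone pre-flight check, not part of Cognee; parse_env and check_local_config are hypothetical helper names, and the parser handles only simple KEY="value" lines like the ones in this guide:

```python
def parse_env(text):
    """Parse simple KEY="value" lines; comments and blank lines are skipped."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_local_config(env):
    """Return a list of problems that would break a fully local setup."""
    problems = []
    # An unset provider is the case where the other side falls back to OpenAI.
    for key in ("LLM_PROVIDER", "EMBEDDING_PROVIDER"):
        if not env.get(key):
            problems.append(f"{key} is unset, so it would fall back to OpenAI")
    # Cognee needs a non-empty key even though Ollama ignores the value.
    if not env.get("LLM_API_KEY"):
        problems.append("LLM_API_KEY is empty; set a placeholder such as 'ollama'")
    return problems


# Usage:
#   with open(".env", encoding="utf-8") as handle:
#       for problem in check_local_config(parse_env(handle.read())):
#           print(problem)
```

An empty list means both providers are set and the placeholder key is present; it does not verify that the endpoints are reachable.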
If Cognee can’t reach Ollama, work through these checks:
  1. Ollama is running. Start the server with ollama serve, or open the Ollama desktop app. Verify with:
    curl http://localhost:11434/api/tags
    
  2. Endpoints match Ollama’s API surface. The LLM endpoint must use the OpenAI-compatible path (/v1), and the embedding endpoint must use the native Ollama path (/api/embed):
    LLM_ENDPOINT="http://localhost:11434/v1"
    EMBEDDING_ENDPOINT="http://localhost:11434/api/embed"
    
  3. The required models are pulled. Cognee does not pull models on demand:
    ollama pull llama3.1:8b
    ollama pull nomic-embed-text:latest
    
  4. Running Cognee in Docker? localhost inside the container does not point at Ollama on the host. Use host.docker.internal instead:
    LLM_ENDPOINT="http://host.docker.internal:11434/v1"
    EMBEDDING_ENDPOINT="http://host.docker.internal:11434/api/embed"
    
  5. Repeated timeouts under load. Ollama processes requests sequentially. If the default EMBEDDING_BATCH_SIZE of 36 overwhelms it, lower the batch size:
    EMBEDDING_BATCH_SIZE="5"
    
For more detail on embedding-side tuning, see Embedding Providers → Ollama.
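The server, model, and Docker checks in the list above can also be scripted. This is a rough sketch using only the Python standard library; it assumes the /api/tags response is a JSON object with a "models" list whose entries carry a "name" field, and the helper names are illustrative:

```python
import json
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def fetch_tags(base_url="http://localhost:11434", timeout=5):
    """GET /api/tags from Ollama and return the decoded JSON payload."""
    with urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
        return json.load(response)


def missing_models(tags_payload, required):
    """Return the required model names absent from an /api/tags payload."""
    pulled = {model.get("name") for model in tags_payload.get("models", [])}
    return [name for name in required if name not in pulled]


def for_docker(endpoint):
    """Swap a localhost host for host.docker.internal, keeping port and path."""
    parts = urlsplit(endpoint)
    if parts.hostname not in ("localhost", "127.0.0.1"):
        return endpoint
    netloc = "host.docker.internal"
    if parts.port:
        netloc += f":{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


# Usage (requires a running Ollama server):
#   tags = fetch_tags()
#   print(missing_models(tags, ["llama3.1:8b", "nomic-embed-text:latest"]))
#   print(for_docker("http://localhost:11434/v1"))
```

If fetch_tags raises a connection error, the server is not reachable at that address; if missing_models returns any names, pull them with ollama pull before running cognify.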

LLM Providers

Configure OpenAI, Azure, Gemini, Anthropic, Ollama, or custom LLM providers

Embedding Providers

Set up OpenAI, Mistral, Ollama, Fastembed, or custom embedding services

Setup Configuration

Full configuration reference for all backends