- Complete Quickstart to understand basic operations
- Install Ollama if using the Ollama options below
After switching to a local provider for the first time, call `cognee.prune.prune_system(metadata=True)` before running cognify to ensure there are no stale vector collections from the previous (OpenAI) embedding dimensions.

- Ollama (LLM + Embeddings)
- Ollama LLM + Fastembed
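
The reset described in the note above can be sketched as follows. This is a minimal illustration, not a prescribed workflow: `reset_then_cognify` is a hypothetical helper name, and the sketch assumes the data has to be added again after the system stores are cleared.

```python
import asyncio


async def reset_then_cognify(text: str) -> None:
    """Clear stale system state, then rebuild with the current provider."""
    # Imported lazily so the sketch can be loaded without Cognee installed.
    import cognee

    # Clear Cognee's system stores (including metadata) so that collections
    # built with the previous embedding dimensions are not reused.
    await cognee.prune.prune_system(metadata=True)

    # Assumption: data is added again after the reset, then processed.
    await cognee.add(text)
    await cognee.cognify()


# Example call, run once after switching providers:
# asyncio.run(reset_then_cognify("Cognee turns documents into a knowledge graph."))
```

Running the reset only once, right after the provider switch, avoids discarding work on every run.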
Fully local setup using Ollama for both text generation and embeddings.

Prerequisites: Install Ollama and pull the required models.

.env configuration:
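
A minimal sketch of such a configuration, written as a Python dictionary rendered to `.env` lines. Only `LLM_API_KEY="ollama"` and the `HUGGINGFACE_TOKENIZER` key come from this page. The remaining key names are assumptions based on Cognee's `LLM_*` / `EMBEDDING_*` naming, the endpoint URLs assume Ollama's default port and paths, and every `<...>` value is a placeholder to fill in from the Setup Configuration reference.

```python
from typing import Dict

# Assumed key names and placeholder values; verify each against the
# configuration reference before use.
OLLAMA_ENV: Dict[str, str] = {
    "LLM_PROVIDER": "ollama",                                  # assumed key name
    "LLM_MODEL": "<your-ollama-llm-model>",                    # placeholder
    "LLM_ENDPOINT": "http://localhost:11434/v1",               # OpenAI-compatible path
    "LLM_API_KEY": "ollama",                                   # placeholder, not validated by Ollama
    "EMBEDDING_PROVIDER": "ollama",                            # assumed key name
    "EMBEDDING_MODEL": "<your-ollama-embedding-model>",        # placeholder
    "EMBEDDING_ENDPOINT": "http://localhost:11434/api/embed",  # native Ollama path (assumed)
    "EMBEDDING_DIMENSIONS": "<dimensions-of-that-model>",      # placeholder
    "HUGGINGFACE_TOKENIZER": "<huggingface-repo-id>",          # placeholder
}


def render_dotenv(settings: Dict[str, str]) -> str:
    """Render KEY="value" lines in the order the settings were declared."""
    return "\n".join(f'{key}="{value}"' for key, value in settings.items())


if __name__ == "__main__":
    print(render_dotenv(OLLAMA_ENV))
```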
`LLM_API_KEY="ollama"` is a placeholder required by the client library; Ollama itself does not validate it.

`HUGGINGFACE_TOKENIZER` is the HuggingFace repo ID of the tokenizer used for token counting when sending requests to the Ollama embedding endpoint.

Troubleshooting
`LLMAPIKeyNotSetError: LLM API key is not set` on a fully local setup
Cognee is free and open source. Running it locally with Ollama needs no account, subscription, or paid API key, and you are not being asked to pay for anything.

The error appears because Ollama is one of the providers for which Cognee requires a non-empty `LLM_API_KEY`, even though Ollama itself ignores the value. If `LLM_API_KEY` is unset or empty, Cognee raises `LLMAPIKeyNotSetError` before it ever contacts your local server.

The fix is to set any placeholder string. The convention is `ollama`: `LLM_API_KEY="ollama"`.

For the local examples above, keep `LLM_API_KEY="ollama"` in place. Fastembed does not need an embedding API key, and Ollama embeddings use the same local placeholder. Use the complete .env blocks in the tabs above so neither provider falls back to OpenAI.
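
One way to guard against this error from Python is to fill in the placeholder whenever the variable is missing or blank. This is a sketch only: `ensure_llm_api_key` is a hypothetical helper, not part of Cognee, and it assumes Cognee reads the environment when its configuration is first loaded, so the call should happen before `cognee` is imported.

```python
import os
from typing import MutableMapping


def ensure_llm_api_key(env: MutableMapping[str, str], placeholder: str = "ollama") -> str:
    """Set LLM_API_KEY to a placeholder if it is unset or blank; return the final value."""
    current = env.get("LLM_API_KEY", "").strip()
    if not current:
        # Ollama ignores the value, so any non-empty string works.
        env["LLM_API_KEY"] = placeholder
    return env["LLM_API_KEY"]


# Typical use, before importing cognee:
# ensure_llm_api_key(os.environ)
```

An existing non-empty key is left untouched, so the helper is safe to call when a hosted provider is configured as well.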
`Cannot connect to host` / connection refused with Ollama
If Cognee can’t reach Ollama, work through these checks:
- Ollama is running. Start the server with `ollama serve`, or open the Ollama desktop app, then verify that it responds.
- Endpoints match Ollama’s API surface. The LLM endpoint must use the OpenAI-compatible path and the embedding endpoint uses the native Ollama path.
- The required models are pulled. Cognee does not pull models on demand.
- Running Cognee in Docker? `localhost` inside the container does not point at Ollama on the host. Use `host.docker.internal` instead.
- Repeated timeouts under load. Ollama processes requests sequentially. If the default `EMBEDDING_BATCH_SIZE` of `36` overwhelms it, lower the batch size.
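
Several of the checks above can be scripted with the standard library. The sketch below is illustrative only: `to_docker_host` and `pulled_models` are hypothetical helper names, and it assumes Ollama's default address `http://localhost:11434` and its `/api/tags` endpoint, which lists locally pulled models.

```python
import json
import urllib.request
from typing import List, Optional
from urllib.parse import urlsplit, urlunsplit


def to_docker_host(url: str) -> str:
    """Rewrite localhost/127.0.0.1 to host.docker.internal, keeping port and path."""
    parts = urlsplit(url)
    if parts.hostname not in ("localhost", "127.0.0.1"):
        return url  # already points somewhere other than the container itself
    netloc = "host.docker.internal"
    if parts.port is not None:
        netloc += f":{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


def pulled_models(base_url: str = "http://localhost:11434",
                  timeout: float = 3.0) -> Optional[List[str]]:
    """Return the names of pulled models, or None if Ollama cannot be reached."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError):
        # Connection refused, timeout, or a non-JSON response.
        return None
    return [model["name"] for model in payload.get("models", [])]
```

A result of `None` from `pulled_models()` suggests the server is not running or the address is wrong. A list that lacks the configured model names suggests the models still need to be pulled. Inside a container, `to_docker_host("http://localhost:11434/v1")` yields the `host.docker.internal` form of the same endpoint.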
LLM Providers
Configure OpenAI, Azure, Gemini, Anthropic, Ollama, or custom LLM providers
Embedding Providers
Set up OpenAI, Mistral, Ollama, Fastembed, or custom embedding services
Setup Configuration
Full configuration reference for all backends