Local Setup (No API Key) - Cognee Documentation

Run Cognee entirely on your own machine, with no cloud API key required. The key rule is that the LLM provider and the embedding provider must both be configured to use a local backend; configuring only one will cause the other to fall back to OpenAI.

Before you start:

Complete Quickstart to understand basic operations
Install Ollama if using the Ollama options below

After switching to a local provider for the first time, call cognee.prune.prune_system(metadata=True) before running cognify to ensure there are no stale vector collections from the previous (OpenAI) embedding dimensions.
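A minimal sketch of that reset step, assuming the async `cognee.add` and `cognee.cognify` calls covered in the Quickstart. The helper name `reset_then_cognify` is hypothetical, and the Cognee module is passed in as an argument only so that the call order is explicit and easy to test.

```python
import asyncio


async def reset_then_cognify(cognee, text):
    """Clear stale system state, then ingest and process `text`.

    `cognee` is the imported cognee module. Passing it in is an
    illustrative choice, not something Cognee requires.
    """
    # Remove stale system state left over from the previous embedding setup.
    await cognee.prune.prune_system(metadata=True)
    # Re-add the data and rebuild it with the local providers.
    await cognee.add(text)
    await cognee.cognify()


# Usage (needs cognee installed and a local .env as described on this page):
#   import cognee
#   asyncio.run(reset_then_cognify(cognee, "Cognee runs locally with Ollama."))
```

The prune call comes first on purpose: running cognify before it would write new vectors next to collections created with the old dimensions.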

Ollama (LLM + Embeddings)
Ollama LLM + Fastembed

Fully local setup using Ollama for both text generation and embeddings.

Prerequisites: Install Ollama and pull the required models:

ollama pull llama3.1:8b
ollama pull nomic-embed-text:latest

.env configuration:

# LLM — Ollama
LLM_PROVIDER="ollama"
LLM_MODEL="llama3.1:8b"
LLM_ENDPOINT="http://localhost:11434/v1"
LLM_API_KEY="ollama"

# Embeddings — Ollama
EMBEDDING_PROVIDER="ollama"
EMBEDDING_MODEL="nomic-embed-text:latest"
EMBEDDING_ENDPOINT="http://localhost:11434/api/embed"
EMBEDDING_DIMENSIONS="768"
HUGGINGFACE_TOKENIZER="nomic-ai/nomic-embed-text-v1.5"

LLM_API_KEY="ollama" is a placeholder required by the client library — Ollama itself does not validate it. HUGGINGFACE_TOKENIZER is the HuggingFace repo ID of the tokenizer used for token counting when sending requests to the Ollama embedding endpoint.
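To confirm that EMBEDDING_DIMENSIONS matches what the model actually returns, you can query the embedding endpoint directly. The sketch below uses only the Python standard library and assumes Ollama's native `/api/embed` response carries an `embeddings` list of vectors; the function names are illustrative and not part of Cognee.

```python
import json
import urllib.request


def embedding_dimensions(response_body):
    """Return the length of the first vector in an /api/embed response."""
    return len(response_body["embeddings"][0])


def fetch_embedding(endpoint, model, text):
    """POST one input string to Ollama's native embed endpoint."""
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    request = urllib.request.Request(
        endpoint, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Usage (needs a running Ollama server with the model pulled):
#   body = fetch_embedding(
#       "http://localhost:11434/api/embed", "nomic-embed-text:latest", "hello"
#   )
#   print(embedding_dimensions(body))  # compare with EMBEDDING_DIMENSIONS
```

If the printed value differs from EMBEDDING_DIMENSIONS, update the .env value before running cognify.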

Ollama LLM + Fastembed

Uses Ollama for text generation and Fastembed for CPU-friendly local embeddings (no Ollama embedding model required).

Prerequisites: Install Ollama and pull the LLM model:

ollama pull llama3.1:8b

Install the Fastembed extra (not bundled with the base cognee package):

pip install 'cognee[fastembed]'

See the Fastembed setup notes for supported models and dimensions.

.env configuration:

# LLM — Ollama
LLM_PROVIDER="ollama"
LLM_MODEL="llama3.1:8b"
LLM_ENDPOINT="http://localhost:11434/v1"
LLM_API_KEY="ollama"

# Embeddings — Fastembed (CPU, no API key)
EMBEDDING_PROVIDER="fastembed"
EMBEDDING_MODEL="sentence-transformers/all-MiniLM-L6-v2"
EMBEDDING_DIMENSIONS="384"
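Either .env block above can be sanity-checked before starting Cognee. The sketch below is a hypothetical helper, not part of Cognee: it parses simple `KEY="value"` lines and reports which of the two provider variables is missing, since a missing one would fall back to OpenAI.

```python
def parse_env(text):
    """Parse simple KEY="value" lines; comments and blank lines are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


def unset_providers(env):
    """Return the provider variables that are missing or empty."""
    required = ("LLM_PROVIDER", "EMBEDDING_PROVIDER")
    return [key for key in required if not env.get(key)]


# Usage:
#   with open(".env") as handle:
#       print(unset_providers(parse_env(handle.read())))
```

An empty list means both providers are set explicitly. This parser is deliberately simple and does not handle every dotenv feature, such as multi-line values or inline comments.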

Troubleshooting

`LLMAPIKeyNotSetError: LLM API key is not set` on a fully local setup

Cognee is free and open source. Running it locally with Ollama needs no account, subscription, or paid API key; you are not being asked to pay for anything.

The error appears because Ollama is one of the providers for which Cognee requires a non-empty LLM_API_KEY, even though Ollama itself ignores the value. If LLM_API_KEY is unset or empty, Cognee raises LLMAPIKeyNotSetError before it ever contacts your local server.

The fix is to set any placeholder string. The convention is ollama:

LLM_PROVIDER="ollama"
LLM_MODEL="llama3.1:8b"
LLM_ENDPOINT="http://localhost:11434/v1"
LLM_API_KEY="ollama"

For the local examples above, keep LLM_API_KEY="ollama" in place. Fastembed does not need an embedding API key, and Ollama embeddings use the same local placeholder. Use the complete .env blocks in the tabs above so neither provider falls back to OpenAI.
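If you set configuration from Python rather than a .env file, the same guard can be expressed as a small helper. This is a sketch under assumptions: `with_placeholder_key` is a hypothetical name, and it assumes the environment is updated before Cognee reads its configuration.

```python
def with_placeholder_key(env):
    """Return a copy of `env` with LLM_API_KEY filled in for Ollama."""
    fixed = dict(env)
    if fixed.get("LLM_PROVIDER") == "ollama" and not fixed.get("LLM_API_KEY"):
        # Any non-empty string works; Ollama does not validate it.
        fixed["LLM_API_KEY"] = "ollama"
    return fixed


# Usage:
#   import os
#   os.environ.update(with_placeholder_key(os.environ))
```

The helper leaves an existing key untouched and does nothing for other providers, so it cannot overwrite a real API key.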

`Cannot connect to host` / connection refused with Ollama

If Cognee can’t reach Ollama, work through these checks:

Ollama is running. Start the server with ollama serve, or open the Ollama desktop app. Verify with:
```
curl http://localhost:11434/api/tags
```
Endpoints match Ollama’s API surface. The LLM endpoint must use the OpenAI-compatible path and the embedding endpoint uses the native Ollama path:
```
LLM_ENDPOINT="http://localhost:11434/v1"
EMBEDDING_ENDPOINT="http://localhost:11434/api/embed"
```
The required models are pulled. Cognee does not pull models on demand:
```
ollama pull llama3.1:8b
ollama pull nomic-embed-text:latest
```
Running Cognee in Docker? localhost inside the container does not point at Ollama on the host. Use host.docker.internal instead:
```
LLM_ENDPOINT="http://host.docker.internal:11434/v1"
EMBEDDING_ENDPOINT="http://host.docker.internal:11434/api/embed"
```
Repeated timeouts under load. Ollama processes requests sequentially. If the default EMBEDDING_BATCH_SIZE of 36 overwhelms it, lower the batch size:
```
EMBEDDING_BATCH_SIZE="5"
```

For more detail on embedding-side tuning, see Embedding Providers → Ollama.
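The first and third checks above (server reachable, models pulled) can be combined into one script. The sketch below uses only the standard library and assumes the `/api/tags` response lists pulled models under a `models` key with a `name` field; the function names are illustrative.

```python
import json
import urllib.request


def missing_models(tags_body, required):
    """Return the required model names absent from an /api/tags response."""
    available = {model["name"] for model in tags_body.get("models", [])}
    return [name for name in required if name not in available]


def fetch_tags(base_url="http://localhost:11434"):
    """Fetch the list of pulled models; raises URLError if Ollama is down."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as response:
        return json.load(response)


# Usage (inside Docker, pass "http://host.docker.internal:11434" instead):
#   needed = ["llama3.1:8b", "nomic-embed-text:latest"]
#   print(missing_models(fetch_tags(), needed))
```

A connection error from `fetch_tags` points at the server or the endpoint host, while a non-empty result from `missing_models` means an `ollama pull` is still needed.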

LLM Providers

Configure OpenAI, Azure, Gemini, Anthropic, Ollama, or custom LLM providers

Embedding Providers

Set up OpenAI, Mistral, Ollama, Fastembed, or custom embedding services

Setup Configuration

Full configuration reference for all backends