New to configuration? See the Setup Configuration Overview for the complete workflow: install extras → create .env → choose providers → handle pruning.

Supported Providers
Cognee supports multiple LLM providers:

- OpenAI — GPT models via OpenAI API (default)
- Azure OpenAI — GPT models via Azure OpenAI Service
- Google Gemini — Gemini models via Google AI
- Anthropic — Claude models via Anthropic API
- AWS Bedrock — Models available via AWS Bedrock
- Ollama — Local models via Ollama
- LM Studio — Local models via LM Studio
- Custom — OpenAI-compatible endpoints (like vLLM)
Configuration
Environment Variables
Set these environment variables in your .env file:

- LLM_PROVIDER — The provider to use (openai, gemini, anthropic, ollama, custom)
- LLM_MODEL — The specific model to use
- LLM_API_KEY — Your API key for the provider
- LLM_ENDPOINT — Custom endpoint URL (for Azure, Ollama, or custom providers)
- LLM_API_VERSION — API version (for Azure OpenAI)
- LLM_MAX_TOKENS — Maximum tokens per request (optional)
Why do model names have a prefix like gemini/ or openrouter/?

Cognee routes all LLM requests through LiteLLM, which uses provider prefixes to identify the correct API endpoint. For example, Google lists their model as gemini-2.0-flash, but in Cognee you must write gemini/gemini-2.0-flash. This prefix tells LiteLLM to use the Gemini API. The same applies to custom providers — openrouter/, hosted_vllm/, lm_studio/, etc. See each provider section below for the correct format.

Provider Setup Guides
OpenAI (Default)
OpenAI is the default provider and works out of the box with minimal configuration.
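A minimal .env sketch for the default OpenAI setup (the model name is illustrative; substitute any OpenAI model your account can access):

```shell
LLM_PROVIDER="openai"
LLM_MODEL="gpt-4o-mini"   # example model; any OpenAI model you have access to
LLM_API_KEY="sk-..."      # your OpenAI API key
```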
Azure OpenAI
Use Azure OpenAI Service with your own deployment.
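A sketch of an Azure OpenAI .env, assuming you have already created a deployment in the Azure portal. The endpoint, deployment name, and API version below are placeholders, and the azure/ prefix follows LiteLLM's prefix convention:

```shell
LLM_PROVIDER="openai"                                # assumption: Azure uses the OpenAI-style adapter
LLM_MODEL="azure/my-gpt4o-deployment"                # placeholder: azure/<your-deployment-name>
LLM_ENDPOINT="https://my-resource.openai.azure.com"  # placeholder: your Azure OpenAI resource endpoint
LLM_API_KEY="..."                                    # your Azure OpenAI key
LLM_API_VERSION="2024-02-01"                         # placeholder; check your Azure resource's API version
```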
Google Gemini
Use Google’s Gemini models for text generation.
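Using the prefix rule described above, a Gemini .env might look like:

```shell
LLM_PROVIDER="gemini"
LLM_MODEL="gemini/gemini-2.0-flash"   # note the gemini/ prefix required by LiteLLM
LLM_API_KEY="..."                     # your Google AI API key
```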
Anthropic
Use Anthropic’s Claude models for reasoning tasks.
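An Anthropic .env sketch; the model id is illustrative, and depending on your version the model string may also need a LiteLLM anthropic/ prefix:

```shell
LLM_PROVIDER="anthropic"
LLM_MODEL="claude-3-5-sonnet-20241022"   # example model id; check Anthropic's docs for current models
LLM_API_KEY="sk-ant-..."                 # your Anthropic API key
```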
AWS Bedrock
Use models available on AWS Bedrock for various tasks. For Bedrock specifically, you will also need to provide some AWS-specific information. There are multiple ways of connecting to Bedrock models:

- Using an API key and region. Simply generate your key on AWS and put it in the LLM_API_KEY env variable.
- Using AWS credentials. You can specify only AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY; there is no need for LLM_API_KEY. In this case, if you are using temporary credentials (e.g. an AWS_ACCESS_KEY_ID starting with ASIA...), you must also specify AWS_SESSION_TOKEN.
- Using AWS profiles. Create a file such as ~/.aws/credentials and store your credentials inside it.
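The first two connection options above translate to .env entries like the following. All values are placeholders, and the region variable name is an assumption based on standard AWS conventions:

```shell
# Option 1: API key and region
LLM_API_KEY="..."            # Bedrock API key generated on AWS
AWS_REGION="eu-central-1"    # placeholder region; variable name assumed

# Option 2: AWS credentials (no LLM_API_KEY needed)
AWS_ACCESS_KEY_ID="AKIA..."
AWS_SECRET_ACCESS_KEY="..."
# AWS_SESSION_TOKEN="..."    # required only for temporary credentials (key id starting with ASIA)
```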
Model Name
The name of the model may differ based on the region (the name begins with eu for Europe, us for the USA, etc.).
Ollama (Local)
Run models locally with Ollama for privacy and cost control.
LLM_API_KEY="ollama" is a placeholder required by the client library — Ollama itself does not validate it.

Installation: Install Ollama from ollama.ai and pull your desired model.

Zero-API-key setup: To avoid falling back to OpenAI for embeddings, you must also configure the embedding provider to use a local backend. See the Local Setup (No API Key) section of the configuration overview for a complete .env example using Ollama or Fastembed for both LLM and embeddings.

Known Issues

- NoDataError with mixed providers: Using Ollama as the LLM provider and OpenAI as the embedding provider may fail with NoDataError. Workaround: configure both LLM and embeddings to use the same local provider (see the local setup guide above).
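Pulling a model and pointing Cognee at the local Ollama server might look like this. The model name is illustrative, and the endpoint is an assumption based on Ollama's default port (11434) and its OpenAI-compatible API:

```shell
# Pull a model locally (example model)
ollama pull llama3.1

# .env
LLM_PROVIDER="ollama"
LLM_MODEL="llama3.1"                      # the model you pulled
LLM_ENDPOINT="http://localhost:11434/v1"  # assumption: Ollama's OpenAI-compatible endpoint
LLM_API_KEY="ollama"                      # placeholder; not validated by Ollama
```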
LM Studio (Local)
Run models locally with LM Studio for privacy and cost control.

Installation: Install LM Studio from lmstudio.ai and download your desired model from LM Studio's interface. Load your model, start the LM Studio server, and Cognee will be able to connect to it.
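With the LM Studio server running, a .env sketch might look like this. The model name is a placeholder, port 1234 is LM Studio's default server port, and the provider value is an assumption based on the custom-provider prefix convention described later on this page:

```shell
LLM_PROVIDER="custom"                    # assumption: reached as an OpenAI-compatible endpoint
LLM_MODEL="lm_studio/my-local-model"     # placeholder: lm_studio/ prefix + your loaded model's name
LLM_ENDPOINT="http://localhost:1234/v1"  # LM Studio's default local server address
LLM_API_KEY="lm-studio"                  # placeholder; LM Studio does not validate it
```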
Set up instructor mode
The LLM_INSTRUCTOR_MODE env variable controls the LiteLLM instructor mode, i.e. the model's response type. The appropriate mode may vary depending on the model, and you may need to change it accordingly.
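Setting the instructor mode is a one-line .env entry; json_mode is one valid value (other modes exist, e.g. tool-calling modes, depending on the model):

```shell
LLM_INSTRUCTOR_MODE="json_mode"   # ask Instructor to use JSON-mode structured output
```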
Custom Providers
Use OpenAI-compatible endpoints like OpenRouter or other services. See Fallback Provider in Advanced Options for full details.

Custom Provider Prefixes: When using LLM_PROVIDER="custom", you must include the correct provider prefix in your model name. Cognee forwards requests to LiteLLM, which uses these prefixes to route requests correctly. Common prefixes include:

- hosted_vllm/ — vLLM servers
- openrouter/ — OpenRouter
- lm_studio/ — LM Studio
- openai/ — OpenAI-compatible APIs
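Putting the prefix rule together, an OpenRouter .env sketch (the model id is illustrative):

```shell
LLM_PROVIDER="custom"
LLM_MODEL="openrouter/openai/gpt-4o-mini"    # openrouter/ prefix + the model id as OpenRouter lists it
LLM_ENDPOINT="https://openrouter.ai/api/v1"  # OpenRouter's OpenAI-compatible base URL
LLM_API_KEY="sk-or-..."                      # your OpenRouter API key
```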
vLLM
Use vLLM for high-performance model serving with an OpenAI-compatible API. To find the correct model name, see the vLLM documentation.
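A hedged sketch of serving Gemma with vLLM and connecting Cognee to it, assuming a vLLM server on localhost:8000 (the default for vllm serve) and an illustrative model id:

```shell
# Serve a model with vLLM (example model id)
vllm serve google/gemma-2-9b-it

# .env
LLM_PROVIDER="custom"
LLM_MODEL="hosted_vllm/google/gemma-2-9b-it"  # hosted_vllm/ prefix + the model id you served
LLM_ENDPOINT="http://localhost:8000/v1"       # vLLM's OpenAI-compatible endpoint
LLM_API_KEY="dummy"                           # placeholder; vLLM requires no real key by default
```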
Advanced Options
Rate Limiting
Control client-side throttling for LLM calls to manage API usage and costs.

How it works:
- Client-side limiter: Cognee paces outbound LLM calls before they reach the provider
- Moving window: Spreads allowance across the time window for smoother throughput
- Per-process scope: In-memory limits don’t share across multiple processes/containers
- Auto-applied: Works with all providers (OpenAI, Gemini, Anthropic, Ollama, Custom)
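A configuration sketch in .env form. The variable names below are assumptions based on Cognee's rate-limiting settings; check the current configuration reference for the exact names:

```shell
LLM_RATE_LIMIT_ENABLED="true"   # assumed name: turn the client-side limiter on
LLM_RATE_LIMIT_REQUESTS="60"    # assumed name: allowed requests per interval
LLM_RATE_LIMIT_INTERVAL="60"    # assumed name: window length in seconds
```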
60 requests per 60 seconds ≈ 1 request/second average rate.
Fallback Provider
Cognee supports a primary-plus-fallback model configuration that automatically retries a failed request against a secondary provider. This is useful when your primary provider may reject certain content, and you want a fallback to handle those cases gracefully.

When the fallback triggers

The fallback is invoked only on content policy violations from the primary provider:

- ContentFilterFinishReasonError — the provider's output filter blocked the response
- ContentPolicyViolationError — the request was rejected for policy reasons
- InstructorRetryException containing "content management policy"

The fallback chain is only used when LLM_PROVIDER is set to openai or custom. Other providers (Anthropic, Gemini, Mistral, Bedrock, Ollama) do not currently support the fallback chain.

Configuration

Set these three variables alongside your primary LLM configuration. For LLM_PROVIDER="custom", all three fallback variables (FALLBACK_MODEL, FALLBACK_ENDPOINT, FALLBACK_API_KEY) must be set; if any is missing, Cognee raises a ContentPolicyFilterError instead of falling back. For LLM_PROVIDER="openai", only FALLBACK_MODEL and FALLBACK_API_KEY are required; FALLBACK_ENDPOINT is accepted but currently unused for the OpenAI adapter.

Variable reference

| Variable | Description |
|---|---|
| FALLBACK_MODEL | Model identifier for the fallback provider (use LiteLLM prefix format, e.g. openrouter/openai/gpt-4o-mini) |
| FALLBACK_ENDPOINT | Base URL for the fallback provider’s API (required for custom, optional for openai) |
| FALLBACK_API_KEY | API key for the fallback provider |
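Combining the variables above, a custom-provider fallback .env sketch (endpoints and model ids are placeholders):

```shell
# Primary provider
LLM_PROVIDER="custom"
LLM_MODEL="hosted_vllm/my-model"               # placeholder primary model
LLM_ENDPOINT="http://localhost:8000/v1"
LLM_API_KEY="dummy"

# Fallback (all three required when LLM_PROVIDER="custom")
FALLBACK_MODEL="openrouter/openai/gpt-4o-mini"
FALLBACK_ENDPOINT="https://openrouter.ai/api/v1"
FALLBACK_API_KEY="sk-or-..."
```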
Notes
- If EMBEDDING_API_KEY is not set, Cognee falls back to LLM_API_KEY for embeddings
- Rate limiting helps manage API usage and costs
- Structured output frameworks ensure consistent data extraction from LLM responses
- If you are using Instructor as the structured output framework, you can control the response type of the LLM through the LLM_INSTRUCTOR_MODE env variable, which sets the corresponding instructor mode (e.g. json_mode for JSON output)
Related pages

- Embedding Providers — Configure embedding providers for semantic search
- Overview — Return to the setup configuration overview
- Relational Databases — Set up SQLite or Postgres for metadata storage