Deploy Cognee locally or on a server with Docker Compose. The included docker-compose.yml uses profiles so you can start only the services you need.

Prerequisites

  • Docker and Docker Compose v2+
  • Git

Quick Start

git clone https://github.com/topoteretes/cognee.git
cd cognee
cp .env.template .env
Edit .env and set your LLM API key:
LLM_API_KEY="your_api_key"
Then start the Cognee API server (no profile needed):
docker compose up --build cognee
The API is then available at http://localhost:8000, with interactive docs at http://localhost:8000/docs.

Service Profiles

Each optional service is gated behind a profile. Use --profile to activate one or more:
Profile    Service      Port(s)      Purpose
(none)     cognee       8000, 5678   Core API server
mcp        cognee-mcp   8000, 5678   MCP server for IDE integrations
ui         frontend     3000         Experimental web UI
neo4j      neo4j        7474, 7687   Neo4j graph database
chromadb   chromadb     3002         ChromaDB vector database
postgres   postgres     5432         PostgreSQL + pgvector
redis      redis        6379         Redis caching
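Several profiles can be combined by repeating the --profile flag. The sketch below is a hypothetical helper, not part of the Cognee repository, that prints the resulting command for a list of profile names; the function name compose_up_cmd is an assumption made for illustration.

```shell
# Hypothetical helper (not shipped with Cognee): print the docker compose
# command that activates every profile passed as an argument.
# Example: compose_up_cmd postgres neo4j
compose_up_cmd() {
    cmd="docker compose"
    for profile in "$@"; do
        # Each profile gets its own --profile flag.
        cmd="$cmd --profile $profile"
    done
    # With no arguments this prints the plain command, which starts only
    # the services that are not gated behind a profile.
    printf '%s up --build\n' "$cmd"
}
```

Printing the command instead of running it keeps the helper safe to try; pipe the output to sh once it looks right. Docker Compose also reads a comma-separated COMPOSE_PROFILES environment variable as an alternative to repeated flags.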

Data Persistence

The compose file mounts your local cognee/ source directory and .env file into the container. Named volumes persist database data between restarts:
Service    Volume
postgres   postgres_data
chromadb   .chromadb_data/ (local dir)
redis      redis_data
To ingest files from your host machine, uncomment and update the following volume entry in docker-compose.yml:
# - /path/to/your/data:/data

Environment Variables

The cognee container reads configuration from .env at startup. Key variables:
Variable                      Default              Description
LLM_API_KEY                   (required)           API key for your LLM provider
LLM_MODEL                     openai/gpt-4o-mini   LLM model to use
DB_PROVIDER                   sqlite               Relational DB: sqlite or postgres
GRAPH_DATABASE_PROVIDER       kuzu                 Graph DB: kuzu, neo4j, etc.
VECTOR_DB_PROVIDER            lancedb              Vector DB: lancedb, chromadb, pgvector, etc.
CORS_ALLOWED_ORIGINS          *                    Restrict to specific domains in production
REQUIRE_AUTHENTICATION        false                Enable JWT auth for the API
COGNEE_SKIP_CONNECTION_TEST   false                Skip LLM/embedding connectivity checks on startup. Accepts true, 1, or yes.
chunk_size                    1500                 Max tokens per chunk during cognify (see Chunkers)
chunk_overlap                 10                   Overlap between chunks in words (only affects LangchainChunker)
See the full list of options in Setup Configuration.
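Because LLM_API_KEY is required, it can be worth checking the .env file before starting the container. The following is a minimal sketch, assuming a simple KEY=value layout with optional quotes; the function name env_key_is_set is hypothetical, and the placeholder it rejects is the your_api_key value from the Quick Start.

```shell
# Hypothetical pre-flight check (not part of Cognee): succeed only if KEY
# has a non-empty, non-placeholder value in the given env file.
# Example: env_key_is_set .env LLM_API_KEY || echo "Set LLM_API_KEY first"
env_key_is_set() {
    env_file=$1
    key=$2
    # Take the last assignment of KEY, drop the "KEY=" prefix, strip quotes.
    value=$(grep -E "^${key}=" "$env_file" 2>/dev/null \
        | tail -n 1 | cut -d= -f2- | tr -d "\"'")
    # Reject empty values and the template placeholder.
    [ -n "$value" ] && [ "$value" != "your_api_key" ]
}
```

The check only inspects the file; it does not verify that the key is accepted by the LLM provider.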

Common Setups

PostgreSQL with pgvector is a good production choice for the relational database. Add to your .env:
DB_PROVIDER=postgres
DB_HOST=postgres
DB_PORT=5432
DB_USERNAME=cognee
DB_PASSWORD=cognee
DB_NAME=cognee_db
Start both services:
docker compose --profile postgres up --build
For production deployments with a dedicated graph database, add to your .env:
# Relational DB
DB_PROVIDER=postgres
DB_HOST=postgres
DB_PORT=5432
DB_USERNAME=cognee
DB_PASSWORD=cognee
DB_NAME=cognee_db

# Graph DB
GRAPH_DATABASE_PROVIDER=neo4j
GRAPH_DATABASE_URL=bolt://neo4j:7687
GRAPH_DATABASE_NAME=neo4j
GRAPH_DATABASE_USERNAME=neo4j
GRAPH_DATABASE_PASSWORD=pleaseletmein
Start the stack:
docker compose --profile postgres --profile neo4j up --build
Neo4j browser is available at http://localhost:7474.
To use ChromaDB as the vector store, add to your .env:
VECTOR_DB_PROVIDER=chromadb
VECTOR_DB_URL=http://chromadb:8000
VECTOR_DB_KEY=your_chroma_token
Start:
docker compose --profile chromadb up --build
Run the MCP server alongside the API:
docker compose --profile mcp up --build cognee-mcp
The MCP server uses SSE transport on port 8000 (separate container). Configure your IDE to point to http://localhost:8000/sse.

Stopping and Cleaning Up

# Stop containers (preserves volumes)
docker compose down

# Stop and remove volumes (deletes all data)
docker compose down --volumes

Additional Information

The default Docker image includes a fixed set of extras from the repository Dockerfile. If you need features behind another optional dependency, add the matching --extra <name> flag to both uv sync lines in the Dockerfile, then rebuild the image.
For a table of available extras and common combinations, see Installation. For a table of supported file types and their loaders, see Loaders.
Example: adding the docs extra
# First uv sync (--no-install-project):
RUN --mount=type=cache,target=/root/.cache/uv \
    uv sync --extra debug --extra api --extra postgres --extra neo4j \
             --extra llama-index --extra ollama --extra mistral --extra groq \
             --extra anthropic --extra chromadb --extra docs \
             --frozen --no-install-project --no-dev --no-editable

# Second uv sync (installs the project):
RUN --mount=type=cache,target=/root/.cache/uv \
    uv sync --extra debug --extra api --extra postgres --extra neo4j \
             --extra llama-index --extra ollama --extra mistral --extra groq \
             --extra anthropic --extra chromadb --extra docs \
             --frozen --no-dev --no-editable
This same pattern works for other extras such as scraping, redis, tracing, monitoring, or docling.
Rebuild after updating the Dockerfile:
docker compose up --build cognee
Even when Cognee is configured to use external databases (Postgres, pgvector, Neo4j, etc.), local writable paths are still required. DATA_ROOT_DIRECTORY (default .data_storage) and SYSTEM_ROOT_DIRECTORY (default .cognee_system) hold ingestion artifacts, file caches, and loader outputs; they are not bypassed by pointing the relational, vector, or graph backends elsewhere.
Inside the container these resolve to /app/cognee/.data_storage and /app/cognee/.cognee_system. If either path is read-only or owned by another user, ingestion fails with:
PermissionError: [Errno 13] Permission denied: '/app/cognee/.data_storage/...'
To fix this, mount writable volumes for both directories:
services:
  cognee:
    image: cognee/cognee:main
    volumes:
      - cognee_data:/app/cognee/.data_storage
      - cognee_system:/app/cognee/.cognee_system
    environment:
      DB_PROVIDER: postgres
      # ... remaining DB / graph / vector settings

volumes:
  cognee_data:
  cognee_system:
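To confirm that both storage paths are actually writable, a small probe can help. This is a sketch under stated assumptions: the function name dir_is_writable is hypothetical, and it tests writability by creating and removing a temporary file rather than by inspecting ownership or mount flags.

```shell
# Hypothetical diagnostic (not part of Cognee): succeed only if a file can
# be created inside DIR.
# Example:
#   dir_is_writable /app/cognee/.data_storage   || echo "data dir not writable"
#   dir_is_writable /app/cognee/.cognee_system  || echo "system dir not writable"
dir_is_writable() {
    dir=$1
    # A missing directory counts as not writable.
    [ -d "$dir" ] || return 1
    probe="$dir/.write_probe_$$"
    # Run the redirect in a subshell so a failure cannot abort the caller.
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}
```

One way to run it is from a shell inside the container, for example one opened with docker compose exec cognee sh, assuming the image ships a POSIX shell. Run it as the same user the Cognee process uses, since a root shell can write where that user cannot.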
If you relocate the storage paths with DATA_ROOT_DIRECTORY and SYSTEM_ROOT_DIRECTORY, mount the volumes at the same paths:
services:
  cognee:
    image: cognee/cognee:main
    volumes:
      - cognee_data:/var/cognee/data
      - cognee_system:/var/cognee/system
    environment:
      DATA_ROOT_DIRECTORY: /var/cognee/data
      SYSTEM_ROOT_DIRECTORY: /var/cognee/system
      DB_PROVIDER: postgres
      # ... remaining DB / graph / vector settings

volumes:
  cognee_data:
  cognee_system:
The following is a working Postgres + pgvector + Neo4j compose example. It includes healthchecks on both postgres and neo4j so that Cognee does not start before either database is ready; without them, Cognee races Neo4j's Bolt listener and exits with a connection error:
services:
  postgres:
    image: pgvector/pgvector:pg17
    environment:
      POSTGRES_USER: cognee
      POSTGRES_PASSWORD: cognee
      POSTGRES_DB: cognee_db
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U cognee -d cognee_db"]
      interval: 10s
      timeout: 5s
      retries: 5

  neo4j:
    image: neo4j:5.26
    environment:
      NEO4J_AUTH: neo4j/pleaseletmein
    healthcheck:
      test: ["CMD-SHELL", "cypher-shell -u neo4j -p pleaseletmein 'RETURN 1'"]
      interval: 10s
      timeout: 5s
      retries: 10
      start_period: 30s

  cognee:
    image: cognee/cognee:main
    depends_on:
      postgres:
        condition: service_healthy
      neo4j:
        condition: service_healthy
    volumes:
      - cognee_data:/app/cognee/.data_storage
      - cognee_system:/app/cognee/.cognee_system
    environment:
      DB_PROVIDER: postgres
      DB_HOST: postgres
      DB_PORT: 5432
      DB_USERNAME: cognee
      DB_PASSWORD: cognee
      DB_NAME: cognee_db
      VECTOR_DB_PROVIDER: pgvector
      GRAPH_DATABASE_PROVIDER: neo4j
      GRAPH_DATABASE_URL: bolt://neo4j:7687
      GRAPH_DATABASE_USERNAME: neo4j
      GRAPH_DATABASE_PASSWORD: pleaseletmein

volumes:
  cognee_data:
  cognee_system:
See Storage & Logging for the related env vars, or S3 storage if you want to point these directories at S3 instead of local volumes.
When Cognee starts before PostgreSQL finishes initializing, the first API call triggers LLM/embedding connectivity checks (setup_and_check_environment) and may hit the database before it accepts connections, producing [Errno 111] Connection refused or [Errno 99] Cannot assign requested address.
The recommended fix is to add a healthcheck and a depends_on condition to your docker-compose.yml:
services:
  postgres:
    image: pgvector/pgvector:pg17
    environment:
      POSTGRES_USER: cognee
      POSTGRES_PASSWORD: cognee
      POSTGRES_DB: cognee_db
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U cognee -d cognee_db"]
      interval: 10s
      timeout: 5s
      retries: 5

  cognee:
    image: cognee/cognee:main
    depends_on:
      postgres:
        condition: service_healthy
    environment:
      DB_PROVIDER: postgres
      DB_HOST: postgres
      DB_PORT: 5432
      DB_USERNAME: cognee
      DB_PASSWORD: cognee
      DB_NAME: cognee_db
This delays the cognee container until PostgreSQL passes its health check.
Alternative fix: bypass the connectivity check.
If you cannot modify the compose file (e.g. third-party orchestration), set COGNEE_SKIP_CONNECTION_TEST=true to skip the LLM/embedding startup probe entirely. The check is performed only once (on first run), so the trade-off is that misconfigured endpoints are not caught until the first real request.
COGNEE_SKIP_CONNECTION_TEST=true
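The variable is documented as accepting true, 1, or yes. The sketch below shows one way a wrapper script might check whether a given value would enable the skip. The function name is_skip_value is hypothetical, and both the case-insensitive match and the rule that every other value means "do not skip" are assumptions rather than confirmed Cognee behaviour.

```shell
# Hypothetical helper (not part of Cognee): succeed if VALUE is one of the
# documented skip values: true, 1, or yes.
# Example: is_skip_value "$COGNEE_SKIP_CONNECTION_TEST" && echo "probe skipped"
is_skip_value() {
    # Assumption: matching is case-insensitive, so lowercase the input first.
    normalized=$(printf '%s' "${1:-}" | tr '[:upper:]' '[:lower:]')
    case "$normalized" in
        true|1|yes) return 0 ;;
        # Assumption: anything else, including an empty value, means "do not skip".
        *) return 1 ;;
    esac
}
```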
To use UnstructuredLoader for .docx, .pptx, .xlsx, .epub, and similar formats, add --extra docs as shown above and rebuild the image.
That same docs extra also enables AdvancedPdfLoader. For layout-aware or OCR-based PDF extraction, you additionally need poppler-utils and tesseract-ocr in the runtime stage of your Dockerfile (the second FROM python:3.12-slim-bookworm block):
RUN apt-get update && apt-get install -y \
    libpq5 \
    curl \
    poppler-utils \
    tesseract-ocr \
    && rm -rf /var/lib/apt/lists/*
Once rebuilt, the loaders activate automatically for supported file types with no code changes required.

Need help?

Join our community for Docker deployment support.