# Installation

Set up your environment and install Cognee to start building AI memory.

<Info>
  Python **3.10 – 3.14** is required to run Cognee.
</Info>
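
To confirm that an interpreter falls inside this range before installing, a check along the following lines can help. This is a minimal sketch: `is_supported` is a hypothetical helper name, not part of Cognee, and the bounds are copied from the note above.

```python
import sys

# Supported range taken from the note above (inclusive on both ends).
MIN_VERSION = (3, 10)
MAX_VERSION = (3, 14)


def is_supported(version=None):
    """Return True if (major, minor) lies within the supported range."""
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) <= MAX_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported() else "outside the supported range"
    print(f"Python {current} is {status}")
```

Running the script with the interpreter you plan to use for the virtual environment reports whether it is in range.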

## Setup Notes

<AccordionGroup>
  <Accordion title="Environment Configuration">
    * We recommend creating a `.env` file in your project root
    * Cognee supports many configuration options, and a `.env` file keeps them organized
  </Accordion>

  <Accordion title="API Keys & Models">
    You have two main options for configuring LLM and embedding providers:

    **Option 1: OpenAI (Simplest)**

    * Single API key handles both LLM and embeddings
    * Uses `gpt-4o-mini` for the LLM and `text-embedding-3-small` for embeddings by default
    * Works out of the box with minimal configuration

    **Option 2: Other Providers**

    * Configure both LLM and embedding providers separately
    * Supports Gemini, Anthropic, Ollama, and more
    * Requires setting both `LLM_*` and `EMBEDDING_*` variables

    <Info>
      By default, Cognee uses OpenAI for both the LLM and embeddings. If you change the LLM provider but don't configure embeddings, embeddings still default to OpenAI.
    </Info>
  </Accordion>

  <Accordion title="Virtual Environment">
    * We recommend using [uv](https://github.com/astral-sh/uv) for virtual environment management
    * Run the following commands to create and activate a virtual environment:

    ```bash theme={null}
    uv venv && source .venv/bin/activate
    ```
  </Accordion>

  <Accordion title="Windows Setup">
    On Windows, the setup steps differ slightly from Linux/macOS.

    **Virtual environment activation**

    Use PowerShell or Command Prompt instead of `source`:

    <Tabs>
      <Tab title="PowerShell">
        ```powershell theme={null}
        uv venv
        .venv\Scripts\Activate.ps1
        ```

        If you see an execution-policy error, run this first (current user only):

        ```powershell theme={null}
        Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
        ```
      </Tab>

      <Tab title="Command Prompt (CMD)">
        ```cmd theme={null}
        uv venv
        .venv\Scripts\activate.bat
        ```
      </Tab>
    </Tabs>

    **Creating the `.env` file**

    Copy the template from the project root, then open it in any text editor (Notepad, VS Code, etc.):

    <Tabs>
      <Tab title="PowerShell">
        ```powershell theme={null}
        Copy-Item .env.template .env
        ```
      </Tab>

      <Tab title="Command Prompt (CMD)">
        ```cmd theme={null}
        copy .env.template .env
        ```
      </Tab>
    </Tabs>

    The `.env` file must be saved in the **project root** — the same directory from which you run Python. Cognee calls `load_dotenv()` at import time and searches upward from the working directory.

    **Path values on Windows**

    When setting `DATA_ROOT_DIRECTORY` or `SYSTEM_ROOT_DIRECTORY` in your `.env` file, use **forward slashes** or **double backslashes**. In double-quoted `.env` values, python-dotenv treats a backslash as the start of an escape sequence, so single backslashes may not be read literally:

    ```ini theme={null}
    # Forward slashes (recommended)
    DATA_ROOT_DIRECTORY="C:/Users/YourName/cognee/.cognee_data"
    SYSTEM_ROOT_DIRECTORY="C:/Users/YourName/cognee/.cognee_system"

    # Or double backslashes
    DATA_ROOT_DIRECTORY="C:\\Users\\YourName\\cognee\\.cognee_data"
    ```

    A `~` home-directory prefix also works and is cross-platform:

    ```ini theme={null}
    DATA_ROOT_DIRECTORY="~/.cognee_data"
    ```

    **Setting env vars without a `.env` file (optional)**

    If you prefer to set variables directly in your shell session instead of using a file:

    <Tabs>
      <Tab title="PowerShell">
        ```powershell theme={null}
        $env:LLM_API_KEY = "your_openai_api_key"
        ```
      </Tab>

      <Tab title="Command Prompt (CMD)">
        ```cmd theme={null}
        set LLM_API_KEY=your_openai_api_key
        ```
      </Tab>
    </Tabs>

    <Warning>
      Variables set this way are session-scoped and lost when the terminal closes. A `.env` file is recommended for persistent configuration.
    </Warning>

    **Line endings**

    Python-dotenv handles both Windows (CRLF) and Unix (LF) line endings automatically, so line endings are not a concern.
  </Accordion>

  <Accordion title="Optional">
    <AccordionGroup>
      <Accordion title="Database">
        * To use PostgreSQL as your relational database, you need a PostgreSQL database and the `postgres` extra
      </Accordion>
    </AccordionGroup>
  </Accordion>
</AccordionGroup>
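
The path rule from the Windows Setup notes above can be applied mechanically. The sketch below uses only the standard library; `to_env_path` and `env_line` are hypothetical helper names, not part of Cognee, and the example path is the placeholder used in the notes.

```python
from pathlib import PureWindowsPath


def to_env_path(windows_path: str) -> str:
    """Rewrite a Windows path with forward slashes for use in a .env file."""
    # PureWindowsPath parses backslash separators on any operating system.
    return PureWindowsPath(windows_path).as_posix()


def env_line(name: str, windows_path: str) -> str:
    """Format a double-quoted .env assignment for a path variable."""
    return f'{name}="{to_env_path(windows_path)}"'


print(env_line("DATA_ROOT_DIRECTORY", r"C:\Users\YourName\cognee\.cognee_data"))
# prints: DATA_ROOT_DIRECTORY="C:/Users/YourName/cognee/.cognee_data"
```

Forward slashes sidestep escape handling entirely, which is why the notes recommend them over double backslashes.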

## Setup

<Tabs>
  <Tab title="OpenAI (Recommended)">
    <Card>
      **Environment:** Add your OpenAI API key to your `.env` file:

      ```bash theme={null}
      LLM_API_KEY="your_openai_api_key"
      ```

      **Installation:** Install Cognee with the default package:

      ```bash theme={null}
      uv pip install cognee
      ```

      **What this gives you**: Cognee installed with default local databases (SQLite, LanceDB, Kuzu) — no external servers required.

      <Info>
        This single API key handles both the LLM and embeddings. By default, Cognee uses `gpt-4o-mini` for the LLM and `text-embedding-3-small` for embeddings.
      </Info>
    </Card>
  </Tab>

  <Tab title="Other Providers (Gemini, Anthropic, etc.)">
    <Card>
      **Environment:** Configure both LLM and embedding providers in your `.env` file. Here is an example for Gemini:

      ```bash theme={null}
      # LLM
      LLM_PROVIDER="gemini"
      LLM_MODEL="gemini/gemini-flash-latest"
      LLM_API_KEY="your_gemini_api_key"

      # Embeddings
      EMBEDDING_PROVIDER="gemini"
      EMBEDDING_MODEL="gemini/gemini-embedding-001"
      EMBEDDING_API_KEY="your_gemini_api_key"
      ```

      <Info>
        Make sure to configure both LLM and embedding settings. If you only set one, the other will default to OpenAI.
      </Info>

      **Installation:** Install Cognee, then add provider-specific extras only when needed. For Gemini, no extra is required. For other providers, install the matching extra, for example:

      ```bash theme={null}
      uv pip install "cognee[anthropic]"
      ```

      **What this gives you**: Cognee installed with your chosen providers and default local databases.

      For detailed configuration options, see our [LLM](/setup-configuration/llm-providers) and [Embeddings](/setup-configuration/embedding-providers) guides.
    </Card>
  </Tab>
</Tabs>
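
Because an unconfigured side falls back to OpenAI, it can be worth checking the environment before the first run. The following is a rough sketch, not Cognee's own validation: `sides_defaulting_to_openai` is a hypothetical helper, and it assumes that a missing `LLM_PROVIDER` or `EMBEDDING_PROVIDER` (the variable names from the Gemini example above) means that side uses the OpenAI default. It inspects only the mapping it is given and does not read a `.env` file itself.

```python
import os


def sides_defaulting_to_openai(env=None):
    """List which of LLM / EMBEDDING has no provider set in `env`."""
    env = os.environ if env is None else env
    return [
        side
        for side in ("LLM", "EMBEDDING")
        if not env.get(f"{side}_PROVIDER", "").strip()
    ]


# Only the LLM side is configured here, so embeddings would use OpenAI.
partial = {"LLM_PROVIDER": "gemini", "LLM_API_KEY": "your_gemini_api_key"}
for side in sides_defaulting_to_openai(partial):
    print(f"{side}_PROVIDER is not set; {side} will default to OpenAI")
```

Calling the helper with no argument checks the current process environment instead of a hand-built dictionary.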

## Extras and Common Installation Combinations

Cognee's base installation (`pip install cognee`) includes everything needed to run with OpenAI and the default local databases (SQLite, LanceDB, Kuzu). Optional extras unlock additional providers, integrations, and features.

Install one or more extras with:

```bash theme={null}
pip install "cognee[extra1,extra2]"
# or with uv:
uv pip install "cognee[extra1,extra2]"
```
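
If you generate install commands from a script (for example in a setup tool or CI job), the requirement string can be assembled from a list of extras. This is an illustrative sketch only; `requirement_for` is a hypothetical helper and does not check that the extras exist.

```python
def requirement_for(extras):
    """Build a requirement string such as 'cognee[neo4j,aws]'."""
    # Strip whitespace, skip empty names, and drop duplicates while keeping order.
    unique = list(dict.fromkeys(name.strip() for name in extras if name.strip()))
    return f"cognee[{','.join(unique)}]" if unique else "cognee"


print(requirement_for(["neo4j", "aws"]))
# prints: cognee[neo4j,aws]
```

Quote the result when passing it to a shell, as in the commands above, so the square brackets are not interpreted by the shell.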

<AccordionGroup>
  <Accordion title="Common installation combinations">
    If you already know the stack you want, these combinations cover the most common setups:

    | Use case                                            | Install                                  |
    | --------------------------------------------------- | ---------------------------------------- |
    | PostgreSQL as the database backend                  | `uv pip install "cognee[postgres]"`      |
    | Neo4j graph store + AWS S3 storage                  | `uv pip install "cognee[neo4j,aws]"`     |
    | Distributed execution on Modal                      | `uv pip install "cognee[distributed]"`   |
    | Code graph analysis                                 | `uv pip install "cognee[codegraph]"`     |
    | Full monitoring (Sentry + Langfuse + OpenTelemetry) | `uv pip install "cognee[monitoring]"`    |
    | Web scraping + extended document formats            | `uv pip install "cognee[scraping,docs]"` |
    | BAML structured output backend                      | `uv pip install "cognee[baml]"`          |
    | Anthropic Claude models                             | `uv pip install "cognee[anthropic]"`     |
  </Accordion>

  <Accordion title="LLM & Embedding Providers">
    These extras install provider SDKs. You still need to set the corresponding environment variables. See [LLM Providers](/setup-configuration/llm-providers) and [Embedding Providers](/setup-configuration/embedding-providers).

    | Extra         | Packages installed                | When to use                                  |
    | ------------- | --------------------------------- | -------------------------------------------- |
    | `anthropic`   | `anthropic>=0.27`                 | Use Claude models (claude-3-5-sonnet, etc.)  |
    | `groq`        | `groq>=0.8.0,<1.0.0`              | Use Groq-hosted inference                    |
    | `mistral`     | `mistral-common`, `mistralai`     | Use Mistral AI models                        |
    | `huggingface` | `transformers>=4.46.3,<5`         | Use HuggingFace models for LLM or embeddings |
    | `ollama`      | `transformers>=4.46.3,<5`         | Use Ollama for local model serving           |
    | `llama-cpp`   | `llama-cpp-python[server]>=0.3.0` | Run GGUF models locally via llama.cpp        |
    | `azure`       | `azure-identity>=1.15.0,<2`       | Azure OpenAI or other Azure-hosted models    |
    | `fastembed`   | `fastembed<=0.6.0`, `onnxruntime` | Fast local embeddings without a GPU          |

    <Info>
      There is no separate `gemini` extra. Gemini is supported through `litellm`, which is already part of the base installation.
    </Info>
  </Accordion>

  <Accordion title="Vector & Graph Stores">
    | Extra             | Packages installed                       | When to use                                                  |
    | ----------------- | ---------------------------------------- | ------------------------------------------------------------ |
    | `postgres`        | `psycopg2`, `pgvector`, `asyncpg`        | Use PostgreSQL as relational DB and pgvector as vector store |
    | `postgres-binary` | `psycopg2-binary`, `pgvector`, `asyncpg` | Same as `postgres` but uses pre-compiled binary wheels       |
    | `neo4j`           | `neo4j>=5.28.0,<6`                       | Use Neo4j as the graph store                                 |
    | `neptune`         | `langchain_aws>=0.2.22`                  | Use Amazon Neptune as the graph store                        |
    | `chromadb`        | `chromadb>=0.6,<0.7`, `pypika`           | Use ChromaDB as the vector store                             |
    | `graphiti`        | `graphiti-core>=0.7.0,<0.8`              | Use Graphiti for temporal knowledge graphs                   |
  </Accordion>

  <Accordion title="Data Ingestion & Processing">
    | Extra         | Packages installed                                                                     | When to use                                                                                             |
    | ------------- | -------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- |
    | `docs`        | `unstructured` (with csv, doc, docx, epub, md, ppt, pptx, xlsx, pdf, and more), `lxml` | Parse Office documents, PDFs via unstructured, and other rich formats beyond the built-in PyPDF support |
    | `docling`     | `docling>=2.54`, `transformers>=4.55`                                                  | Use Docling for advanced document parsing                                                               |
    | `scraping`    | `tavily-python`, `beautifulsoup4`, `playwright`, `lxml`, `protego`, `APScheduler`      | Web scraping, URL ingestion, and scheduled crawling                                                     |
    | `codegraph`   | `fastembed`, `transformers`, `tree-sitter`, `tree-sitter-python`                       | Build code graphs from Python repositories                                                              |
    | `langchain`   | `langsmith`, `langchain_text_splitters`, `langchain-core`                              | Use LangChain text splitters or LangSmith tracing                                                       |
    | `llama-index` | `llama-index-core>=0.14.20,<0.15`                                                      | Use LlamaIndex data loaders and connectors                                                              |
    | `dlt`         | `dlt[sqlalchemy]>=1.9.0,<2`                                                            | Ingest data via DLT pipelines                                                                           |
  </Accordion>

  <Accordion title="Infrastructure & Storage">
    | Extra         | Packages installed      | When to use                                                        |
    | ------------- | ----------------------- | ------------------------------------------------------------------ |
    | `distributed` | `modal>=1.0.5,<2.0.0`   | Run cognee pipelines on Modal for distributed/serverless execution |
    | `redis`       | `redis>=5.0.3,<6.0.0`   | Use Redis for caching instead of the default in-memory/disk cache  |
    | `aws`         | `s3fs[boto3]==2025.3.2` | Use Amazon S3 for file storage                                     |
    | `baml`        | `baml-py==0.206.0`      | Use BAML as a structured output backend                            |
  </Accordion>

  <Accordion title="Observability & Monitoring">
    | Extra        | Packages installed                                                     | When to use                                                                                        |
    | ------------ | ---------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------- |
    | `tracing`    | `opentelemetry-api`, `opentelemetry-sdk`, OTLP exporters (gRPC + HTTP) | Export traces via OpenTelemetry to any compatible backend                                          |
    | `monitoring` | Everything in `tracing` plus `sentry-sdk[fastapi]`, `langfuse`         | Full monitoring stack: Sentry for errors, Langfuse for LLM observability, OpenTelemetry for traces |
    | `posthog`    | `posthog>=3.5.0,<4`                                                    | Send usage analytics to PostHog                                                                    |
  </Accordion>

  <Accordion title="Evaluation">
    | Extra      | Packages installed                                        | When to use                                           |
    | ---------- | --------------------------------------------------------- | ----------------------------------------------------- |
    | `deepeval` | `deepeval>=3.0.1,<4`                                      | Run LLM evaluation benchmarks with DeepEval           |
    | `evals`    | `plotly`, `gdown`, `pandas`, `matplotlib`, `scikit-learn` | Internal evaluation tooling with plotting and metrics |
  </Accordion>

  <Accordion title="Development & Tooling">
    | Extra      | Packages installed                               | When to use                                                         |
    | ---------- | ------------------------------------------------ | ------------------------------------------------------------------- |
    | `notebook` | `notebook>=7.1.0,<8`                             | Run Jupyter notebooks                                               |
    | `dev`      | pytest, mypy, ruff, pre-commit, mkdocs, and more | Full development environment for contributing to cognee             |
    | `debug`    | `debugpy>=1.8.9,<2.0.0`                          | Attach a remote debugger (e.g. VS Code) to a running cognee process |
  </Accordion>

  <Accordion title="Missing dependency errors (ImportError)">
    If you encounter an `ImportError` when using a cognee feature, it usually means a required extra has not been installed.

    | ImportError mentions                     | Install                                         |
    | ---------------------------------------- | ----------------------------------------------- |
    | `neo4j`                                  | `cognee[neo4j]`                                 |
    | `modal`                                  | `cognee[distributed]`                           |
    | `playwright`, `tavily`, `beautifulsoup4` | `cognee[scraping]`                              |
    | `unstructured`                           | `cognee[docs]`                                  |
    | `docling`                                | `cognee[docling]`                               |
    | `fastembed`                              | `cognee[fastembed]` or `cognee[codegraph]`      |
    | `tree_sitter`                            | `cognee[codegraph]`                             |
    | `psycopg2`, `asyncpg`, `pgvector`        | `cognee[postgres]` or `cognee[postgres-binary]` |
    | `redis`                                  | `cognee[redis]`                                 |
    | `s3fs`, `boto3`                          | `cognee[aws]`                                   |
    | `baml`                                   | `cognee[baml]`                                  |
    | `anthropic`                              | `cognee[anthropic]`                             |
    | `groq`                                   | `cognee[groq]`                                  |
    | `mistralai`                              | `cognee[mistral]`                               |
    | `llama_cpp`                              | `cognee[llama-cpp]`                             |
    | `opentelemetry`                          | `cognee[tracing]` or `cognee[monitoring]`       |
    | `sentry_sdk`, `langfuse`                 | `cognee[monitoring]`                            |
    | `graphiti`                               | `cognee[graphiti]`                              |
    | `chromadb`                               | `cognee[chromadb]`                              |
    | `deepeval`                               | `cognee[deepeval]`                              |
    | `dlt`                                    | `cognee[dlt]`                                   |
  </Accordion>
</AccordionGroup>
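
The lookup in the missing-dependency table above can also be automated. The sketch below copies a few rows of that table into a dictionary and scans the error message for a matching keyword; `EXTRAS_BY_KEYWORD` and `suggest_extras` are hypothetical names for illustration, and Cognee itself may report missing extras differently.

```python
# Keyword -> extras, copied from a few rows of the table above.
EXTRAS_BY_KEYWORD = {
    "neo4j": ["neo4j"],
    "modal": ["distributed"],
    "unstructured": ["docs"],
    "fastembed": ["fastembed", "codegraph"],
    "tree_sitter": ["codegraph"],
    "asyncpg": ["postgres", "postgres-binary"],
    "opentelemetry": ["tracing", "monitoring"],
    "llama_cpp": ["llama-cpp"],
}


def suggest_extras(error):
    """Return the extras whose keyword appears in the ImportError message."""
    message = str(error)
    suggestions = []
    for keyword, extras in EXTRAS_BY_KEYWORD.items():
        if keyword in message:
            suggestions.extend(e for e in extras if e not in suggestions)
    return suggestions


try:
    import neo4j  # noqa: F401  (only importable if the package is installed)
except ImportError as error:
    options = " or ".join(f'"cognee[{name}]"' for name in suggest_extras(error))
    print(f"Missing dependency. Try: uv pip install {options}")
```

Extend the dictionary with the remaining table rows if you want full coverage.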

## Next Steps

<CardGroup cols={2}>
  <Card title="Run Your First Example" href="/getting-started/quickstart" icon="play">
    **Quickstart Tutorial**

    Get started with Cognee by running your first knowledge graph example.
  </Card>

  <Card title="Explore Advanced Features" href="/core-concepts" icon="compass">
    **Core Concepts**

    Dive deeper into Cognee's powerful features and capabilities.
  </Card>
</CardGroup>
