# Sessions

> Step-by-step guide to using sessions for conversational memory in Cognee

This guide shows how to enable conversational memory with sessions. When you reuse the same `session_id` across searches, Cognee remembers previous questions and answers, so follow-up questions can rely on earlier context.

## Before You Start

* Complete [Quickstart](../getting-started/quickstart) to understand basic operations
* Ensure you have [LLM Providers](../setup-configuration/llm-providers) configured
* Read [Sessions and Caching](../core-concepts/sessions-and-caching) for a conceptual overview
* Configure your cache adapter before using sessions. See [Cache Adapters](../core-concepts/sessions-and-caching#cache-adapters) for Redis and Filesystem setup instructions.

## Code in Action

```python theme={null}
import asyncio
import cognee
from cognee import SearchType

async def main():
    # Prepare knowledge base
    await cognee.add([
        "Alice moved to Paris in 2010. She works as a software engineer.",
        "Bob lives in New York. He is a data scientist.",
        "Alice and Bob met at a conference in 2015."
    ])
    await cognee.cognify()

    # First search - starts a new session (default user is used when none is passed)
    result1 = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION,
        query_text="Where does Alice live?",
        session_id="conversation_1"
    )
    print("First answer:", result1[0])

    # Follow-up search - uses conversation history
    result2 = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION,
        query_text="What does she do for work?",
        session_id="conversation_1"  # Same session
    )
    print("Follow-up answer:", result2[0])
    # The LLM knows "she" refers to Alice from previous context

    # Different session - no memory of previous conversation
    result3 = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION,
        query_text="What does she do for work?",
        session_id="conversation_2"  # New session
    )
    print("New session answer:", result3[0])
    # This won't know who "she" refers to

asyncio.run(main())
```

<Note>
  This example works with either the Redis or the Filesystem adapter. Configure your chosen adapter by following the instructions linked in [Before You Start](#before-you-start).
</Note>

## What Just Happened

### Step 1: Prepare Knowledge Base

```python theme={null}
await cognee.add([
    "Alice moved to Paris in 2010. She works as a software engineer.",
    "Bob lives in New York. He is a data scientist.",
    "Alice and Bob met at a conference in 2015."
])
await cognee.cognify()
```

Before you can search with sessions, you need to have data in your knowledge base. Use `cognee.add()` to ingest data and `cognee.cognify()` to build the knowledge graph.

### Step 2: Use Session ID in Searches

```python theme={null}
result = await cognee.search(
    query_type=SearchType.GRAPH_COMPLETION,
    query_text="Where does Alice live?",
    session_id="conversation_1"
)
```

The `session_id` parameter creates or continues a conversation thread. All searches with the same `session_id` share conversation history.

### Step 3: Follow-up Questions

```python theme={null}
result = await cognee.search(
    query_type=SearchType.GRAPH_COMPLETION,
    query_text="What does she do for work?",
    session_id="conversation_1"  # Same session
)
```

When you use the same `session_id`, Cognee automatically includes previous Q\&A turns in the LLM prompt, enabling contextual follow-up questions.

### Step 4: Multiple Sessions

```python theme={null}
# Session 1
await cognee.search(query_text="Question 1", session_id="session_1")
await cognee.search(query_text="Follow-up", session_id="session_1")

# Session 2 (independent)
await cognee.search(query_text="Question 1", session_id="session_2")
```

Each `session_id` maintains its own conversation history. Sessions are isolated from each other.
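
As a mental model only, session isolation can be pictured as a dictionary keyed by `session_id`. The sketch below is a toy in-memory stand-in, not Cognee's cache adapter; `ToySessionStore` and its methods are hypothetical names used purely for illustration.

```python
from collections import defaultdict


class ToySessionStore:
    """Hypothetical in-memory stand-in for a session cache (not Cognee's implementation)."""

    def __init__(self):
        # Each session_id maps to its own list of (question, answer) turns.
        self._turns = defaultdict(list)

    def record(self, session_id, question, answer):
        """Append one Q&A turn to the given session."""
        self._turns[session_id].append((question, answer))

    def history(self, session_id):
        """Return a copy of the turns stored for this session only, oldest first."""
        return list(self._turns.get(session_id, []))


store = ToySessionStore()
store.record("session_1", "Where does Alice live?", "Paris")
store.record("session_1", "What does she do for work?", "Software engineer")
store.record("session_2", "Where does Bob live?", "New York")

print(len(store.history("session_1")))  # 2
print(len(store.history("session_2")))  # 1
print(store.history("session_3"))       # []
```

Because every lookup goes through the `session_id` key, a turn recorded in `session_1` can never appear in the history of `session_2`, which mirrors the isolation described above.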

## Advanced Usage

<Accordion title="Custom Session IDs">
  Use meaningful session IDs to organize conversations:

  ```python theme={null}
  # User-specific sessions
  await cognee.search(query_text="...", session_id=f"user_{user_id}_chat")

  # Topic-specific sessions
  await cognee.search(query_text="...", session_id="project_planning")
  await cognee.search(query_text="...", session_id="bug_discussion")
  ```

  Session IDs are arbitrary strings—use whatever naming scheme fits your application.
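
If you want a consistent naming scheme, a small helper can build the IDs for you. This is a minimal sketch: `make_session_id` is a hypothetical helper, and the lowercase, underscore-separated format is a design choice of this example rather than anything Cognee requires.

```python
import re


def make_session_id(user_id, topic):
    """Build a readable session ID such as 'user_42_project_planning' (hypothetical helper)."""
    # Lowercase the topic and collapse every run of non-alphanumeric characters into "_".
    slug = re.sub(r"[^a-z0-9]+", "_", topic.lower()).strip("_")
    return f"user_{user_id}_{slug}"


print(make_session_id(42, "Project Planning"))  # user_42_project_planning
```

The returned string can be passed directly as the `session_id` argument of `cognee.search()`.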
</Accordion>

<Accordion title="Session persistence and clearing">
  Sessions expire according to the configured cache TTL. The default `SESSION_TTL_SECONDS` is 7 days; set it to `0` to keep cached sessions until the cache is cleared. To clear sessions, use `cognee.prune.prune_system(..., cache=True)` or wipe your cache backend (e.g. Redis keys or the filesystem cache directory).

  Sessions can also be bridged into the knowledge graph with `improve()`, which can persist cached Q\&A, persist agent traces, and distill accepted session guidance into `session_learnings`:

  ```python theme={null}
  await cognee.improve(
      dataset="project_memory",
      session_ids=["conversation_1"],
  )
  ```

  If you only want to distill one finished session's gated guidance, call `cognee.session.distill_session()` directly:

  ```python theme={null}
  result = await cognee.session.distill_session(
      "conversation_1",
      dataset="project_memory",
  )
  ```

  The direct result includes `status` and `documents`. A completed run writes accepted lesson documents back into the dataset; statuses such as `no_gated_entries` or `no_accepted_lessons` mean there was nothing durable enough to persist.
</Accordion>

<Accordion title="Default Session ID">
  If you omit `session_id` and caching is enabled, Cognee uses the session ID `default_session` and still stores each search turn there. To scope conversations explicitly, pass a `session_id`.
</Accordion>

<Accordion title="Inspect Session History">
  Use `cognee.session.get_session()` to retrieve stored Q\&A entries for a session. Entries are returned in chronological order (oldest first). Use `entries[-1]` for the most recent entry. If the session does not exist or the cache backend is unavailable, this call returns an empty list instead of raising an error.

  <ParamField path="session_id" type="str" default="default_session">
    Identifier of the session to retrieve. It must match the `session_id` previously passed to `cognee.search()`.
  </ParamField>

  <ParamField path="last_n" type="Optional[int]" default="None">
    Maximum number of most-recent entries to return. When `None`, all stored entries are returned.
  </ParamField>

  <ParamField path="user" type="Optional[User]" default="None">
    User that owns the session. When `None`, Cognee resolves it from the current session context or falls back to the default user.
  </ParamField>

  Returns `List[SessionQAEntry]`, which may be empty.

  ```python theme={null}
  import asyncio
  import cognee

  async def main():
      # Fetch the five most recent entries; they are returned oldest first
      entries = await cognee.session.get_session(session_id="conversation_1", last_n=5)
      for entry in entries:
          print(f"[{entry.time}] Q: {entry.question}")
          print(f"           A: {entry.answer}")
          print(f"  feedback score: {entry.feedback_score}")

  asyncio.run(main())
</Accordion>

<Accordion title="Upstream context and the include_context flag">
  Every Q\&A entry stored in the session cache contains a `context` field. Depending on how the completion was generated, this field may be empty or may contain a stored summary of the retrieved context for that turn. You can inspect it programmatically when reading session history.

  **The `include_context` flag:**

  <Note>
    `get_session_manager()` is an internal, lower-level API rather than the usual SDK entry point. Prefer `cognee.session.get_session()` unless you specifically need formatted history control such as `include_context`.
  </Note>

  `SessionManager.get_session()` and `SessionManager.format_entries()` both accept `include_context: bool` (default `True`). When `True`, a `CONTEXT:` line is included for each entry in the formatted history string; when `False`, it is omitted.

  This flag is not exposed on `cognee.search()` or `cognee.session.get_session()`. If you need it, use the lower-level `SessionManager` directly.
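
To make the effect of the flag concrete, here is a toy formatter that behaves the way the paragraph above describes. It is a sketch for illustration only: `format_history` is a hypothetical function, the entries are plain dictionaries, and the `QUESTION:` and `ANSWER:` labels are assumptions. Only the `CONTEXT:` line comes from the description above, and the exact layout Cognee produces may differ.

```python
def format_history(entries, include_context=True):
    """Render Q&A entries as text, optionally adding a CONTEXT: line per entry.

    Toy formatter for illustration; not Cognee's SessionManager.
    """
    lines = []
    for entry in entries:
        lines.append(f"QUESTION: {entry['question']}")
        if include_context:
            # Emitted only when the flag is True.
            lines.append(f"CONTEXT: {entry['context']}")
        lines.append(f"ANSWER: {entry['answer']}")
    return "\n".join(lines)


entries = [
    {
        "question": "Where does Alice live?",
        "context": "Alice moved to Paris in 2010.",
        "answer": "Paris",
    },
]

print(format_history(entries, include_context=True))
print("---")
print(format_history(entries, include_context=False))
```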

  **Additional information:**

  <AccordionGroup>
    <Accordion title="Example: reading context from past entries">
      ```python theme={null}
      import asyncio
      import cognee

      async def main():
          entries = await cognee.session.get_session(session_id="my_session", last_n=5)
          for entry in entries:
              print("Q:", entry.question)
              print("Stored context:", entry.context)   # may be empty or a stored summary
              print("A:", entry.answer)

      asyncio.run(main())
      ```
    </Accordion>

    <Accordion title="Does the LLM automatically see context from previous turns?">
      No. When Cognee builds conversation history for the LLM during `cognee.search()`, it uses `include_context=False` internally — previous questions and answers are included in the prompt, but the stored context from those earlier turns is omitted. Fresh graph context is retrieved for the current query only. This keeps prompts compact and avoids re-sending large context blobs.
    </Accordion>

    <Accordion title="Using SessionManager directly">
      If your use case requires the LLM to see stored context from a prior turn (for example, to trace provenance or build a richer prompt), use the lower-level `SessionManager` to retrieve formatted history with `include_context=True`, then pass that history to your own LLM call.

      ```python theme={null}
      import asyncio

      from cognee.infrastructure.session.get_session_manager import get_session_manager
      from cognee.modules.users.methods import get_default_user

      async def main():
          user = await get_default_user()
          sm = get_session_manager()

          # Formatted string with CONTEXT included per entry (default)
          history_with_context = await sm.get_session(
              user_id=str(user.id),
              session_id="my_session",
              formatted=True,
              include_context=True,    # includes CONTEXT: line for each Q&A turn
          )
          print(history_with_context)

      asyncio.run(main())
      ```
    </Accordion>
  </AccordionGroup>
</Accordion>

<Accordion title="Disabling Sessions">
  Omitting `session_id` does not disable session storage: Cognee still writes to `default_session` when caching is on. To avoid storing any session data, set `CACHING=false` or ensure no cache backend is available. The system gracefully handles missing cache backends; searches run without conversational memory.
</Accordion>

<Accordion title="Sessions and search types">
  Sessions only affect the `GRAPH_COMPLETION`, `RAG_COMPLETION`, and `TRIPLET_COMPLETION` search types. Other modes (`CHUNKS`, `SUMMARIES`, etc.) neither read nor write session history. When building conversation history, Cognee includes at most the last 10 session entries. For multi-tenant setups or background jobs, pass an explicit `user` so that searches do not fall back to the default user.
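
The two rules above (only completion-style search types use history, and at most the last 10 entries are included) can be sketched as follows. This is a simplified model, not Cognee's code: `history_for_prompt` is a hypothetical function, and search types are represented as plain strings rather than `SearchType` members.

```python
SESSION_AWARE_TYPES = {"GRAPH_COMPLETION", "RAG_COMPLETION", "TRIPLET_COMPLETION"}
MAX_HISTORY_ENTRIES = 10


def history_for_prompt(entries, query_type):
    """Return the entries a session-aware search would consider, oldest first."""
    if query_type not in SESSION_AWARE_TYPES:
        # Modes such as CHUNKS or SUMMARIES do not use session history.
        return []
    # Keep only the most recent entries; shorter lists are returned unchanged.
    return entries[-MAX_HISTORY_ENTRIES:]


turns = [f"turn {i}" for i in range(1, 13)]  # 12 stored turns

print(history_for_prompt(turns, "GRAPH_COMPLETION")[0])  # turn 3
print(len(history_for_prompt(turns, "CHUNKS")))          # 0
```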
</Accordion>

<Accordion title="Token usage tracking and cost estimation">
  Cognee can record **approximate LLM usage per session** for session-scoped completion flows. This is most relevant when you use [recall](/core-concepts/main-operations/recall) in a conversational pattern and want to inspect prompt/completion volume over time.

  **What is tracked**

  When an LLM completion is generated inside an active session scope, Cognee accumulates these values on the session record:

  * `tokens_in` — estimated prompt tokens
  * `tokens_out` — estimated completion tokens
  * `cost_usd` — estimated spend using Cognee's built-in pricing table

  These values are approximate. Cognee estimates tokens with a character-based heuristic rather than exact provider-reported usage, so use them for monitoring and rough cost estimation rather than invoice-grade billing.

  Because the estimate is computed locally from prompt and completion text, session token and cost tracking works the same way across every configured [LLM provider](/setup-configuration/llm-providers) — OpenAI, Anthropic, Gemini (Google AI Studio and Vertex AI), Bedrock, Mistral, Ollama, and any `custom` LiteLLM-routed provider. It does not depend on the provider returning usage metadata, and it works in Docker and self-hosted deployments as long as caching is enabled.

  Cost figures come from Cognee's built-in pricing table keyed on `LLM_MODEL`. If your model is not in that table (for example, a Vertex AI custom endpoint or a self-hosted `custom` model), `tokens_in` and `tokens_out` are still recorded, but `cost_usd` may be `0` or based on a fallback rate. Use the token counts for those models and compute cost from your provider's pricing.
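
For intuition, a character-based estimate with a pricing-table lookup might look like the sketch below. Every number here is an assumption: the ratio of 4 characters per token, the model name `example-model`, and its prices are placeholders, and the function names are hypothetical. Cognee's actual heuristic and pricing table may differ.

```python
import math

CHARS_PER_TOKEN = 4  # assumed ratio; Cognee's heuristic may use a different value

# Hypothetical prices in USD per 1,000 tokens; substitute your provider's real rates.
PRICE_PER_1K = {"example-model": {"input": 0.001, "output": 0.002}}


def estimate_tokens(text):
    """Estimate the token count from the character count, rounding up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_cost_usd(model, tokens_in, tokens_out):
    """Estimate spend; return 0.0 when the model is missing from the price table."""
    price = PRICE_PER_1K.get(model)
    if price is None:
        return 0.0
    return tokens_in / 1000 * price["input"] + tokens_out / 1000 * price["output"]


tokens_in = estimate_tokens("Where does Alice live?")
tokens_out = estimate_tokens("Alice lives in Paris.")

print(tokens_in, tokens_out)
print(estimate_cost_usd("example-model", tokens_in, tokens_out))
print(estimate_cost_usd("unlisted-model", tokens_in, tokens_out))
```

The last call shows the case described above: token counts are still available for a model that is missing from the table, so you can multiply them by your provider's published rates yourself.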

  **When counts are recorded**

  * Session tracking depends on caching being available.
  * If you omit `session_id`, Cognee still uses `default_session` rather than disabling session usage tracking.
  * The tracked counts reflect **session-scoped completion calls**, not every LLM-capable step in the broader ingestion pipeline.

  For broader guidance on where LLM calls happen outside session accounting:

  * See [Recall](/core-concepts/main-operations/recall) for retrieval and `only_context=True`
  * See [Cognify](/core-concepts/main-operations/legacy-operations/cognify) for ingestion-time call counts

  **Session visibility in the HTTP API**

  Session-listing and session-detail HTTP endpoints return sessions visible to the caller at read time. That includes sessions owned by the requesting user, sessions owned by child agents whose `parent_user_id` matches the requesting user's `id`, and sessions visible through dataset read permissions.

  Sessions remain stored under the user that actually created them, so the `user_id` on each `session_records` row is unchanged. Aggregate totals such as `stats` and `cost-by-model` therefore include usage from child agents when those sessions are visible to the parent. See [Users](/core-concepts/multi-user-mode/permissions-system/users) for how to create agent users with `parent_user_id`.

  **Reading usage via the HTTP API**

  ```http theme={null}
  GET /api/v1/sessions
  GET /api/v1/sessions/{session_id}
  GET /api/v1/sessions/stats?range=30d
  GET /api/v1/sessions/cost-by-model?range=30d
  ```

  * `GET /api/v1/sessions` lists sessions visible to the caller
  * `GET /api/v1/sessions/{session_id}` returns per-session fields such as `tokens_in`, `tokens_out`, and `cost_usd`
  * `GET /api/v1/sessions/stats?range=30d` returns aggregate totals for the requested time range (`24h`, `7d`, `30d`, or `all`)
  * `GET /api/v1/sessions/cost-by-model?range=30d` breaks usage down by model
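
If you call these endpoints from a script, a small URL builder helps keep the `range` values valid. This is a sketch under assumptions: `BASE_URL` is a placeholder for your own server address, `sessions_url` is a hypothetical helper, and it assumes `cost-by-model` accepts the same `range` values as `stats`. Authentication is not shown because it depends on your deployment.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"  # placeholder; use your Cognee API server address
VALID_RANGES = {"24h", "7d", "30d", "all"}


def sessions_url(path="", range_=None):
    """Build a URL under /api/v1/sessions, e.g. path='stats' with range_='30d'."""
    url = f"{BASE_URL}/api/v1/sessions"
    if path:
        url += f"/{path}"
    if range_ is not None:
        if range_ not in VALID_RANGES:
            raise ValueError(f"range must be one of {sorted(VALID_RANGES)}")
        url += "?" + urlencode({"range": range_})
    return url


print(sessions_url())                       # list sessions
print(sessions_url("conversation_1"))       # one session's details
print(sessions_url("stats", "30d"))         # aggregate totals
print(sessions_url("cost-by-model", "7d"))  # usage broken down by model
```

Pass the resulting URL to the HTTP client of your choice together with whatever credentials your server expects.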
</Accordion>

<Columns cols={3}>
  <Card title="Sessions and Caching" icon="brain" href="/core-concepts/sessions-and-caching">
    Understand how sessions work conceptually
  </Card>

  <Card title="Search Basics" icon="search" href="/guides/search-basics">
    Learn about search parameters and types
  </Card>

  <Card title="Setup Configuration" icon="settings" href="/setup-configuration/overview">
    Configure cache adapters and providers
  </Card>
</Columns>