Connect your own Python LLM agent to Cognee MCP to give it persistent knowledge graph memory. The mcp Python SDK lets you call all Cognee MCP tools programmatically — no IDE or chat client required.
Prefer the v1.0 memory tools (remember, recall, forget_memory, improve) for new agent integrations. The legacy tools (cognify, search, delete) are still available when you need lower-level control.

Prerequisites

  • Python 3.10+
  • LLM_API_KEY environment variable set (OpenAI key or equivalent)
  • mcp package installed: pip install "mcp>=1.12.0"
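You can sanity-check these prerequisites from Python before launching the server. This is an optional sketch; `check_prerequisites` is a hypothetical helper, not part of the mcp SDK:

```python
import os
import sys

def check_prerequisites() -> list[str]:
    """Return a list of problems; an empty list means the basics look satisfied."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10+ is required")
    if not os.environ.get("LLM_API_KEY"):
        problems.append("LLM_API_KEY is not set")
    try:
        import mcp  # noqa: F401
    except ImportError:
        problems.append('the mcp package is missing: pip install "mcp>=1.12.0"')
    return problems
```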

Connection Options

Launch the MCP server as a subprocess. The mcp SDK handles the process lifecycle and communicates over stdin/stdout.
# Install the mcp client SDK and uv, then clone the Cognee repository
pip install "mcp>=1.12.0" uv
git clone https://github.com/topoteretes/cognee.git

import asyncio
import os
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server_params = StdioServerParameters(
    command="uv",
    args=["--directory", "/path/to/cognee/cognee-mcp", "run", "cognee-mcp"],
    env={**os.environ, "LLM_API_KEY": os.environ["LLM_API_KEY"]},
)

async def main():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Store knowledge
            await session.call_tool("remember", arguments={
                "data": "Acme Corp signed a $1.2M healthcare contract in Q1 2025.",
                "dataset_name": "sales",
            })

            # Retrieve context for your LLM prompt
            result = await session.call_tool("recall", arguments={
                "query": "healthcare contracts",
                "search_type": "GRAPH_COMPLETION",
            })
            context = result.content[0].text
            print(context)

asyncio.run(main())
Replace /path/to/cognee/cognee-mcp with the absolute path to the cognee-mcp directory in your cloned repository.
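A tool call can return more than one content part, so indexing `result.content[0].text` may drop information. A small helper makes the extraction more robust; this is an illustrative sketch, assuming text parts expose a `.text` attribute (as the SDK's `TextContent` does):

```python
def extract_text(result) -> str:
    """Join every text part of an MCP tool result into a single context string.

    Non-text parts (anything without a `.text` attribute) are skipped.
    """
    parts = [part.text for part in result.content if getattr(part, "text", None)]
    return "\n".join(parts)
```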

Inject context into your LLM calls

Once you have the retrieved context string, pass it to your LLM as part of the system or user prompt:
import openai

client = openai.AsyncOpenAI()

async def answer_with_memory(session, question: str) -> str:
    result = await session.call_tool("recall", arguments={
        "query": question,
        "search_type": "GRAPH_COMPLETION",
    })
    context = result.content[0].text

    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": f"Use this context to answer:\n\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content
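The prompt assembly above can be factored into a pure helper, which is easy to unit-test without a live server or API key (`build_messages` is an illustrative name, not part of any SDK):

```python
def build_messages(context: str, question: str) -> list[dict]:
    """Assemble chat messages that place the recalled context in the system role."""
    return [
        {"role": "system", "content": f"Use this context to answer:\n\n{context}"},
        {"role": "user", "content": question},
    ]
```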

Key tools for agent context

Tool             Purpose
remember         v1.0 API — store data with optional session scoping
recall           v1.0 API — smart retrieval with session awareness
forget_memory    v1.0 API — delete a dataset or wipe everything
improve          v1.0 API — enrich the graph and bridge session memory into permanent memory
cognify          Legacy tool — ingest text, files, or URLs into the knowledge graph
search           Legacy tool — retrieve context (GRAPH_COMPLETION, RAG_COMPLETION, CHUNKS)
cognify_status   Poll background indexing progress
prune            Reset all memory (useful in tests)
See the Tools Reference for all available tools and parameters.
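Because cognify indexes in the background, an agent that writes with the legacy tools often needs to wait before searching. A hedged polling sketch follows; the "PROCESSING" substring check is an assumption about the status text, so adjust it to whatever your server version actually returns:

```python
import asyncio

async def wait_for_cognify(session, poll_seconds: float = 2.0, max_polls: int = 60) -> str:
    """Poll the cognify_status tool until its text no longer reports in-progress work.

    The "PROCESSING" marker is an assumed convention, not a documented contract.
    """
    for _ in range(max_polls):
        result = await session.call_tool("cognify_status", arguments={})
        status = result.content[0].text
        if "PROCESSING" not in status.upper():
            return status
        await asyncio.sleep(poll_seconds)
    raise TimeoutError("cognify did not finish within the polling window")
```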

Need Help?

Join Our Community

Get support and connect with other developers using Cognee MCP.