Minimal end-to-end example using `cognee.agent_memory` and `LLMGateway`
A minimal comparison of the same async function in three modes: without memory, with a fixed retrieval query, and with a query derived from a function argument.

Before you start:
- Complete the Quickstart, or have Cognee installed and configured
All three variants use the same dataset and the same LLM helper. Memory is built once with add() and cognify() before any of the three functions are called.

LLMGateway.acreate_structured_output() automatically picks up whatever memory the decorator retrieved and prepends it to the text input; no extra code is needed inside the function.
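The prepend step can be sketched in plain Python. This is a toy illustration of the idea, not cognee's actual internals; `build_prompt` and the hard-coded memory list are invented for this sketch:

```python
def build_prompt(retrieved_memory: list[str], text_input: str) -> str:
    # Toy version of the prepend step: retrieved chunks are joined and
    # placed before the caller's text input, so the model sees the
    # context without any extra code inside the decorated function.
    if not retrieved_memory:
        return text_input
    return "Context:\n" + "\n".join(retrieved_memory) + "\n\nQuestion:\n" + text_input


# Hypothetical retrieved chunk, mirroring the fact stored in the example below.
memory = ["Internal product note: the codename is Maple Panda."]
print(build_prompt(memory, "What animal does the internal codename refer to?"))
```

With an empty retrieval result, the prompt is just the original text input, which matches the no-memory behavior described below.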
```python
import asyncio

import cognee
from cognee.infrastructure.llm.LLMGateway import LLMGateway

DATASET_NAME = "agent_memory_demo"


async def setup_memory() -> None:
    await cognee.add(
        (
            "Internal product note: the private codename for the first supported "
            "`cognee.agent_memory` release is Maple Panda."
        ),
        dataset_name=DATASET_NAME,
    )
    await cognee.cognify(datasets=[DATASET_NAME])


async def ask_llm(question: str) -> str:
    return await LLMGateway.acreate_structured_output(
        text_input=question,
        system_prompt="Answer briefly.",
        response_model=str,
    )
```
The question goes to the LLM with no retrieved context. The model answers from training data alone and will not know the internal codename.
```python
@cognee.agent_memory(
    memory_query_fixed="What animal does the internal codename refer to?",
    dataset_name=DATASET_NAME,
)
async def answer_with_memory() -> str:
    return await ask_llm("What animal does the internal codename refer to?")


async def main() -> None:
    await setup_memory()
    answer = await answer_with_memory()
    print(answer)


if __name__ == "__main__":
    asyncio.run(main())
```
Before the function runs, Cognee retrieves memory using the fixed query. The same retrieval runs on every call, regardless of input. If retrieval succeeds, the printed answer should refer to Panda.

Use this when the function has one stable job and should always retrieve the same kind of context.
The retrieval query is taken from the question argument. Each call retrieves memory that matches what was actually asked, so retrieval changes with input.

Use this when the function handles different questions and retrieval should follow each incoming request.
The decorator does not create memory — it only retrieves it. The example works because the fact was stored first with add() and processed with cognify().