What is the recall operation

The recall() operation is the main retrieval entry point in Cognee v1.0. It searches memory using the best available source for the request.
  • Auto-routing by default: when you do not specify a search type, recall() classifies the query and picks the best retrieval strategy automatically.
  • Session-aware: with session_id, it can search session cache entries first and fall through to the permanent graph if needed.
  • Graph-backed: for permanent memory, recall() runs graph retrieval and returns contextual results.
  • Source tagging: results include _source metadata so you can tell whether they came from "session" or "graph".

Where recall fits

  • Use recall() as the default way to ask questions over memory in v1.0.
  • Use it after Remember has created either permanent or session memory.
  • Use explicit query_type only when you want to force a specific retrieval mode.
  • Use datasets to scope results to a specific knowledge base.

What happens under the hood

  1. Check session scope
    • If you pass session_id without datasets and without query_type, Cognee first searches the session cache directly.
    • Session search is keyword-based over stored question, context, and answer fields.
  2. Choose the retrieval strategy
    • If query_type is provided, Cognee uses it directly.
    • Otherwise, if auto_route=True, a rule-based router picks the best strategy based on your query. See Auto-routing behavior below for the kinds of patterns it recognizes.
    • If auto_route=False, Cognee falls back to GRAPH_COMPLETION.
  3. Run graph retrieval when needed
    • If session search finds nothing, or if graph retrieval is requested, recall() queries the permanent knowledge graph.
    • The retrieval strategy selected in step 2 determines how that graph query is executed.
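The decision order above can be sketched as a small pure function. This is a toy model of the documented steps, not Cognee's actual implementation: plan_recall, the "ROUTER_CHOICE" placeholder, and the returned dictionary shape are all hypothetical names introduced for illustration.

```python
from typing import Any, Optional, Sequence


def plan_recall(
    session_id: Optional[str] = None,
    datasets: Optional[Sequence[str]] = None,
    query_type: Optional[Any] = None,
    auto_route: bool = True,
) -> dict:
    """Toy model of the documented decision order (hypothetical helper)."""
    # Step 1: the session cache is tried first only when nothing
    # scopes the request to the permanent graph.
    session_first = bool(session_id) and not datasets and query_type is None

    # Step 2: choose the retrieval strategy.
    if query_type is not None:
        strategy = query_type          # an explicit type is used directly
    elif auto_route:
        strategy = "ROUTER_CHOICE"     # placeholder for the router's pick
    else:
        strategy = "GRAPH_COMPLETION"  # documented fallback

    # Step 3 (graph retrieval) would run with `strategy` if the session
    # search finds nothing or the graph path was requested.
    return {"session_first": session_first, "strategy": strategy}
```

Under this model, plan_recall(session_id="s1") reports a session-first lookup, while adding datasets or query_type switches it to the graph path.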

After recall finishes

  • Session-only recall: you get matching session entries when cache hits exist, each tagged with _source="session".
  • Graph-backed recall: you get results from the permanent graph, tagged with _source="graph" when returned as dicts.
  • Hybrid behavior: with session_id plus graph-scoping inputs like datasets or query_type, recall uses the permanent graph path rather than session-only lookup.
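Because dict results carry a _source tag, callers can separate session hits from graph hits after the fact. The helper below is a minimal sketch: group_by_source and the "untagged" bucket are hypothetical, and the sample result fields other than _source are made up for the example.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def group_by_source(results: Iterable[Any]) -> Dict[str, List[Any]]:
    """Bucket results by their _source tag (hypothetical helper)."""
    grouped: Dict[str, List[Any]] = defaultdict(list)
    for item in results:
        # Only dict results are tagged; anything else goes to "untagged".
        if isinstance(item, dict):
            source = item.get("_source", "untagged")
        else:
            source = "untagged"
        grouped[source].append(item)
    return dict(grouped)


# Illustrative data only; real result fields may differ.
sample = [
    {"_source": "session", "answer": "cached answer"},
    {"_source": "graph", "text": "graph context"},
    "plain answer string",
]
buckets = group_by_source(sample)
```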

Examples and details

Auto-routing behavior

The built-in router uses query patterns to choose an underlying search mode:
  • Summary-style prompts like “overview”, “summary”, or “key takeaways” bias toward summary retrieval.
  • Relationship questions like “how are X and Y connected?” bias toward graph context-extension retrieval.
  • Time-oriented questions like “when”, “before”, “after”, or year ranges bias toward temporal retrieval.
  • Code-focused queries like “coding rules” or async def bias toward coding-rules retrieval.
  • Exact quoted phrases bias toward lexical chunk search.
If you want direct control, pass query_type explicitly and bypass the router.
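To make the idea of pattern-based routing concrete, here is a toy router built from the example patterns listed above. It is not Cognee's router: the regular expressions, the rule order, the lowercase labels, and the graph-completion default for unmatched queries are all assumptions made for this sketch.

```python
import re

# (pattern, label) pairs checked in order; the first match wins.
# Labels are illustrative names, not Cognee search-type identifiers.
RULES = [
    (re.compile(r'"[^"]+"|“[^”]+”'), "lexical"),
    (re.compile(r"\b(overview|summary|key takeaways)\b", re.I), "summary"),
    (re.compile(r"\bcoding rules\b|\basync def\b", re.I), "coding_rules"),
    (re.compile(r"\b(when|before|after)\b|\b(19|20)\d{2}\b", re.I), "temporal"),
    (re.compile(r"\bhow (are|is)\b.*\b(connected|related)\b", re.I),
     "graph_context_extension"),
]


def route(query: str, default: str = "graph_completion") -> str:
    """Return the label of the first rule that matches the query."""
    for pattern, label in RULES:
        if pattern.search(query):
            return label
    return default  # assumed default when no rule matches
```

A rule list like this is easy to audit and needs no model call, which is the usual reason to prefer rules for a first routing pass.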
Most users should think in terms of asking memory a question, not selecting a retriever manually.
Under the hood, retrieval strategies (graph completion, summary, temporal, lexical, coding-rules) do the actual work. recall() is intentionally one layer above them:
  • you call recall()
  • Cognee chooses or accepts a query_type
  • the matching retriever runs underneath
  • the result comes back through the recall API
If you want to understand or control the lower-level retrieval behavior directly, see the lower-level search reference.
What recall() returns depends on the retrieval path:
  • Session-only recall returns matching session cache entries.
  • Graph-backed recall returns the output of the underlying retrieval mode selected by the router or query_type.
  • Depending on the retrieval path, that can be a plain answer string, retrieved context, or structured result dictionaries.
  • Returned dict results are tagged with _source so callers can distinguish session from graph results.
  • only_context=True skips the final LLM answer-generation step and returns retrieved context instead.
  • verbose=True exposes extra retrieval details from the lower-level graph search path.
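Since the return value can be a plain string, retrieved context, or structured dictionaries, calling code may want to normalize it before display or logging. The sketch below shows one way to do that; to_text_items and its output format are hypothetical, and it only assumes the shapes described above.

```python
from typing import Any, List


def to_text_items(result: Any) -> List[str]:
    """Flatten a recall result into printable lines (hypothetical helper)."""
    if result is None:
        return []
    if isinstance(result, str):
        return [result]  # plain answer or context string
    if isinstance(result, dict):
        # Show the _source tag as a prefix and the remaining fields as-is.
        tag = result.get("_source", "unknown")
        body = {k: v for k, v in result.items() if k != "_source"}
        return [f"[{tag}] {body}"]
    if isinstance(result, (list, tuple)):
        lines: List[str] = []
        for item in result:
            lines.extend(to_text_items(item))
        return lines
    return [str(result)]  # any other shape: fall back to str()
```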
Set only_context=True when you want the retrieved context without asking Cognee to produce a final answer.
When only_context=True, Cognee stops after retrieval formatting:
  • the LLM answer is not generated
  • the final completion step is skipped
  • the returned value is the retrieved context rather than a synthesized answer
results = await cognee.recall(
    query_text="Tell me about NLP",
    only_context=True,
)
This is useful when you want to:
  • inspect what was retrieved before answer generation
  • feed the retrieved context into your own prompt or downstream logic
  • debug retrieval quality separately from answer quality
In other words, only_context=True keeps recall focused on retrieval output instead of answer generation.
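One way to feed retrieved context into your own prompt is sketched below. The recall function is injected so the example runs without Cognee installed; fake_recall is a stand-in, and build_prompt and the prompt layout are hypothetical. The only parts taken from the text above are the query_text and only_context parameters.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def build_prompt(recall: Callable[..., Awaitable[Any]], question: str) -> str:
    """Fetch context only, then assemble a prompt for a model of your choice."""
    context = await recall(query_text=question, only_context=True)
    # The exact shape of the context is not assumed here: join sequences
    # into text, and let the f-string stringify anything else.
    if isinstance(context, (list, tuple)):
        context = "\n".join(str(part) for part in context)
    return f"Context:\n{context}\n\nQuestion: {question}"


async def fake_recall(query_text: str, only_context: bool = False) -> list:
    """Stand-in for cognee.recall so the sketch runs on its own."""
    return [f"(retrieved context for: {query_text})"]


prompt = asyncio.run(build_prompt(fake_recall, "Tell me about NLP"))
```

With Cognee installed, passing cognee.recall in place of fake_recall should exercise the same path, assuming it accepts these keyword arguments as shown in the snippet above.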
answers = await cognee.recall(
    query_text="Give me an overview of this dataset.",
    datasets=["product_docs"],
)
  • datasets limits graph retrieval to named datasets.
  • With both datasets and session_id, recall becomes hybrid: session context is available, but graph retrieval is scoped to the selected dataset.
Option                              What it does
query_text                          The natural-language query to answer.
query_type                          Forces a specific underlying search type instead of using auto-routing.
datasets                            Restricts graph retrieval to specific dataset names.
top_k                               Limits the number of returned results. Defaults to 10.
auto_route                          Enables the rule-based query router. Defaults to True.
session_id                          Enables session-aware retrieval and session-cache lookup.
only_context                        Returns retrieved context only. The final LLM answer is not generated.
system_prompt / system_prompt_path  Customizes the generation prompt.
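If you build recall calls in several places, it can help to collect the options in one typed object. The dataclass below is a sketch only: RecallOptions and to_kwargs are hypothetical names, the field types are guesses, and only the top_k and auto_route defaults come from the table above. Options left unset are omitted so the library's own defaults apply.

```python
from dataclasses import asdict, dataclass
from typing import Any, List, Optional


@dataclass
class RecallOptions:
    """Typed bundle of the options listed above (hypothetical helper)."""
    query_text: str
    query_type: Optional[Any] = None      # type not specified here; left open
    datasets: Optional[List[str]] = None
    top_k: int = 10                       # documented default
    auto_route: bool = True               # documented default
    session_id: Optional[str] = None
    only_context: Optional[bool] = None   # default not documented; omit if unset
    system_prompt: Optional[str] = None

    def to_kwargs(self) -> dict:
        # Drop unset options so they are not passed explicitly.
        return {k: v for k, v in asdict(self).items() if v is not None}


opts = RecallOptions(
    query_text="Give me an overview of this dataset.",
    datasets=["product_docs"],
)
kwargs = opts.to_kwargs()
```

The resulting dictionary could then be unpacked into the call, for example cognee.recall(**kwargs), assuming the parameter names match the table.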
recall() runs Search under the hood for graph-backed retrieval.
Use legacy Search directly when you need to select a specific retriever, inspect retrieval internals, or use advanced parameters not exposed by recall(). See also Search Basics for the full set of retrieval options.
