What is the recall operation

The .recall operation is the main retrieval entry point in Cognee v1.0. It searches memory using the best available source for the request.
  • Auto-routing by default: when you do not specify a search type, recall() classifies the query and picks the best retrieval strategy automatically.
  • Session-aware: with session_id, it can search session cache entries first and fall through to the permanent graph if needed.
  • Graph-backed: for permanent memory, recall() runs graph retrieval and returns contextual results.
  • Source tagging: when results are returned as dicts, they include _source metadata so you can tell whether they came from "session" or "graph".

Where recall fits

  • Use recall() as the default way to ask questions over memory in v1.0.
  • Use it after Remember has created either permanent or session memory.
  • Use explicit query_type only when you want to force a specific retrieval mode.
  • Use datasets to scope results to a specific knowledge base.

What happens under the hood

  1. Check session scope
    • If you pass session_id without datasets and without query_type, Cognee first searches the session cache directly.
    • Session search is keyword-based over stored question, context, and answer fields.
  2. Choose the retrieval strategy
    • If query_type is provided, Cognee uses it directly.
    • Otherwise, if auto_route=True, a rule-based router picks the best strategy based on your query. See Auto-routing behavior below for the kinds of patterns it recognizes.
    • If auto_route=False, Cognee falls back to GRAPH_COMPLETION.
  3. Run graph retrieval when needed
    • If session search finds nothing, or if graph retrieval is requested, recall() queries the permanent knowledge graph.
    • The retrieval strategy selected in step 2 determines how that graph query is executed.
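The three steps above can be sketched as a small decision function. This is an illustrative model of the documented flow, not Cognee's implementation: `plan_recall`, `session_hits`, the `router` callable, and the returned dict shape are all hypothetical names introduced for this example.

```python
from typing import Callable, Optional, Sequence


def plan_recall(
    query_text: str,
    session_id: Optional[str] = None,
    datasets: Optional[Sequence[str]] = None,
    query_type: Optional[str] = None,
    auto_route: bool = True,
    session_hits: int = 0,  # hypothetical: number of session-cache matches
    router: Callable[[str], str] = lambda _query: "ROUTER_CHOICE",  # stand-in router
) -> dict:
    """Return which path a recall-like call would take (illustrative only)."""
    # Step 1: session-only lookup applies when session_id is the only scoping input.
    session_first = bool(session_id) and not datasets and query_type is None
    if session_first and session_hits > 0:
        return {"source": "session", "strategy": None}

    # Step 2: choose the retrieval strategy.
    if query_type is not None:
        strategy = query_type
    elif auto_route:
        strategy = router(query_text)
    else:
        strategy = "GRAPH_COMPLETION"

    # Step 3: fall through to (or go straight to) the permanent graph.
    return {"source": "graph", "strategy": strategy}
```

Calling `plan_recall("q", session_id="s1", session_hits=0, auto_route=False)` models a session miss that falls through to the graph with the `GRAPH_COMPLETION` fallback.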

After recall finishes

  • Session-only recall: you get matching session entries when cache hits exist, each tagged with _source="session".
  • Graph-backed recall: you get results from the permanent graph, tagged with _source="graph" when returned as dicts.
  • Hybrid behavior: with session_id plus graph-scoping inputs like datasets or query_type, recall uses the permanent graph path rather than session-only lookup.
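Because dict results carry a `_source` tag, callers can separate session hits from graph hits after the call returns. The helper below is a minimal sketch under assumptions: `group_by_source` is a hypothetical name, the `answer` and `text` keys in the sample data are made up for illustration, and only the `_source` key comes from the description above.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def group_by_source(results: Iterable[Any]) -> Dict[str, List[Any]]:
    """Bucket results by their _source tag; untagged items go under 'untagged'."""
    grouped: Dict[str, List[Any]] = defaultdict(list)
    for item in results:
        if isinstance(item, dict):
            grouped[item.get("_source", "untagged")].append(item)
        else:
            # Plain strings (for example, a generated answer) carry no tag.
            grouped["untagged"].append(item)
    return dict(grouped)


# Sample data with made-up fields; only "_source" mirrors the documented tag.
sample = [
    {"answer": "cached reply", "_source": "session"},
    {"text": "graph fact", "_source": "graph"},
    "a plain answer string",
]
grouped = group_by_source(sample)
print(sorted(grouped))  # ['graph', 'session', 'untagged']
```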

Examples and details

recall() can run while another dataset is still being ingested or indexed in the background.
  • There is no global lock that blocks retrieval while remember() is processing.
  • recall() only sees data that has already made it through indexing.
  • If a dataset is mid-run, results may be incomplete until that run reaches completion.
If you need to confirm a dataset is fully ready before querying it, check its status with cognee.datasets.get_status() or the indexing-status guidance on Remember.
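If you prefer to wait for a dataset before querying it, a generic polling loop is one option. The sketch below deliberately does not call Cognee: `wait_until_ready` and the `is_ready` callable are hypothetical, and you would implement `is_ready` on top of whatever status check you use (the return shape of `cognee.datasets.get_status()` is not described here, so it is not assumed).

```python
import asyncio
from typing import Awaitable, Callable


async def wait_until_ready(
    is_ready: Callable[[], Awaitable[bool]],  # hypothetical status check you supply
    poll_seconds: float = 2.0,
    timeout_seconds: float = 300.0,
) -> bool:
    """Poll is_ready() until it returns True or the timeout elapses."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout_seconds
    while True:
        if await is_ready():
            return True
        if loop.time() >= deadline:
            return False
        await asyncio.sleep(poll_seconds)
```

The function returns `False` on timeout instead of raising, so the caller can decide whether to query anyway and accept possibly incomplete results.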

Auto-routing behavior

The built-in router uses query patterns to choose an underlying search mode:
  • Summary-style prompts like “overview”, “summary”, or “key takeaways” bias toward summary retrieval.
  • Relationship questions like “how are X and Y connected?” bias toward graph context-extension retrieval.
  • Time-oriented questions like “when”, “before”, “after”, or year ranges bias toward temporal retrieval.
  • Code-focused queries like “coding rules” or async def bias toward coding-rules retrieval.
  • Exact quoted phrases bias toward lexical chunk search.
If you want direct control, pass query_type explicitly and bypass the router.
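To make the idea of pattern-based routing concrete, here is a toy router. It is not Cognee's router: the regular expressions, the rule order, and the label strings are assumptions chosen to mirror the bullets above, and the real patterns are likely broader.

```python
import re

# Ordered (label, pattern) rules; the first match wins.
# Labels and ordering are illustrative, not Cognee's search type names.
_RULES = [
    ("lexical", re.compile(r'"[^"]+"')),
    ("coding_rules", re.compile(r"\bcoding rules?\b|\basync def\b", re.I)),
    ("summary", re.compile(r"\b(overview|summary|key takeaways)\b", re.I)),
    ("temporal", re.compile(r"\b(when|before|after)\b|\b\d{4}\s*(-|to)\s*\d{4}\b", re.I)),
    ("graph_context_extension", re.compile(r"\bhow (are|is) .+ (connected|related)\b", re.I)),
]


def route(query: str, default: str = "graph_completion") -> str:
    """Return the label of the first matching rule, or the default."""
    for label, pattern in _RULES:
        if pattern.search(query):
            return label
    return default


print(route("Give me an overview of this dataset."))  # summary
print(route("How are Alice and Bob connected?"))      # graph_context_extension
print(route("Tell me about NLP"))                     # graph_completion
```

A first-match-wins list keeps the behavior easy to reason about, but it also means rule order matters when a query matches more than one pattern.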
Most users should think in terms of asking memory a question, not selecting a retriever manually.
Under the hood, retrieval strategies (graph completion, summary, temporal, lexical, coding-rules) do the actual work. recall() is intentionally one layer above them:
  • you call recall()
  • Cognee chooses or accepts a query_type
  • the matching retriever runs underneath
  • the result comes back through the recall API
If you want to understand or control the lower-level retrieval behavior directly, see the lower-level search reference.
What recall() returns depends on the retrieval path and the options you pass:
  • Session-only recall returns matching session cache entries.
  • Graph-backed recall returns the output of the underlying retrieval mode selected by the router or query_type.
  • Depending on the retrieval path, that can be a plain answer string, retrieved context, or structured result dictionaries.
  • Returned dict results are tagged with _source so callers can distinguish session from graph results.
  • only_context=True skips the final LLM answer-generation step and returns retrieved context instead.
  • verbose=True exposes extra retrieval details from the lower-level graph search path.
Set only_context=True when you want the retrieved context without asking Cognee to produce a final answer.
When only_context=True, Cognee stops after retrieval formatting:
  • the LLM answer is not generated
  • the final completion step is skipped
  • the returned value is the retrieved context rather than a synthesized answer
results = await cognee.recall(
    query_text="Tell me about NLP",
    only_context=True,
)
This is useful when you want to:
  • inspect what was retrieved before answer generation
  • feed the retrieved context into your own prompt or downstream logic
  • debug retrieval quality separately from answer quality
In other words, only_context=True keeps recall focused on retrieval output instead of answer generation.
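One way to feed retrieved context into your own prompt is sketched below. The helper names `context_to_text` and `build_prompt` and the prompt wording are hypothetical, and the code assumes the context arrives either as a string or as an iterable of items that can be converted with `str()`; check the actual return shape in your setup.

```python
from typing import Any, Iterable, Union


def context_to_text(context: Union[str, Iterable[Any]]) -> str:
    """Flatten retrieved context (a string or an iterable of items) into text."""
    if isinstance(context, str):
        return context
    return "\n".join(str(item) for item in context)


def build_prompt(question: str, context: Union[str, Iterable[Any]]) -> str:
    """Combine retrieved context and a question into a single prompt string."""
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context_to_text(context)}\n\n"
        f"Question: {question}"
    )
```

Keeping prompt assembly in your own code makes it easier to compare retrieval quality and answer quality separately.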
To scope graph retrieval to a specific dataset, pass datasets:
answers = await cognee.recall(
    query_text="Give me an overview of this dataset.",
    datasets=["product_docs"],
)
  • datasets limits graph retrieval to named datasets.
  • dataset_ids scopes retrieval by dataset UUID instead of name. When supplied, it takes precedence over datasets and the name-to-UUID resolution step is skipped.
  • With session_id plus either datasets or dataset_ids, recall becomes hybrid: session context is available, but graph retrieval is scoped to the selected dataset or dataset UUIDs.
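The precedence rule between dataset_ids and datasets can be modeled in a few lines. This is a sketch of the documented behavior only: `resolve_scope` and the `name_to_id` mapping are hypothetical stand-ins for whatever name-to-UUID resolution Cognee performs internally.

```python
from typing import Dict, List, Optional, Sequence
from uuid import UUID


def resolve_scope(
    datasets: Optional[Sequence[str]] = None,
    dataset_ids: Optional[Sequence[UUID]] = None,
    name_to_id: Optional[Dict[str, UUID]] = None,  # hypothetical lookup table
) -> Optional[List[UUID]]:
    """Return the dataset UUIDs to search, or None for no dataset scoping."""
    if dataset_ids:
        # UUIDs take precedence; names are ignored and no lookup happens.
        return list(dataset_ids)
    if datasets:
        lookup = name_to_id or {}
        # Raises KeyError for names missing from the lookup table.
        return [lookup[name] for name in datasets]
    return None
```

When both arguments are supplied, the names are never consulted, which matches the note that the resolution step is skipped.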
| Option | What it does |
| --- | --- |
| query_text | The natural-language query to answer. |
| query_type | Forces a specific underlying search type instead of using auto-routing. |
| datasets | Restricts graph retrieval to specific dataset names. |
| dataset_ids | Restricts graph retrieval by dataset UUIDs. Takes precedence over datasets when both are set. |
| top_k | Limits the number of returned results. Defaults to 10. |
| auto_route | Enables the rule-based query router. Defaults to True. |
| session_id | Enables session-aware retrieval and session-cache lookup. |
| only_context | Returns retrieved context only. The final LLM answer is not generated. |
| system_prompt / system_prompt_path | Customizes the generation prompt. |
recall() runs Search under the hood for graph-backed retrieval.
Use legacy Search directly when you need to select a specific retriever, inspect retrieval internals, or use advanced parameters not exposed by recall(). See also Search Basics for the full set of retrieval options.
| | recall() | search() (legacy) |
| --- | --- | --- |
| Recommended for | Most v1.0 use cases | Advanced retriever control |
| Search type selection | Auto-routes by default; pass query_type to override | Always explicit; defaults to GRAPH_COMPLETION |
| Session behavior | Searches session-cache QA entries by keyword; falls through to graph on miss | session_id writes/reads conversation history only; no cache lookup |
| Source tagging | Results carry _source ("session" or "graph") | No source tagging |
| Parameter surface | Compact; power-user options via **kwargs | Full surface: neighborhood_depth, node_type, triplet_distance_penalty with explicit defaults |
Use recall() when you want to ask a question and get an answer. It handles routing, session lookup, and source attribution automatically.
Use search() directly when you need a specific retriever, lower-level parameters such as neighborhood_depth or node_type, or when building custom pipelines.