Recall

What is the recall operation

The .recall operation is the main retrieval entry point in Cognee v1.0. It searches memory using the best available source for the request.

Auto-routing by default: when you do not specify a search type, recall() classifies the query and picks the best retrieval strategy automatically.
Session-aware: with session_id, it can search session cache entries first and fall through to the permanent graph if needed.
Graph-backed by default: for permanent memory, recall() runs graph retrieval — not plain embedding similarity. GRAPH_COMPLETION is the fallback when auto-routing does not choose a more specific strategy.
Source tagging: each recall result includes a source field so you can tell whether it came from "session", "graph", "trace", or "graph_context".

Where recall fits

Use recall() as the default way to ask questions over memory in v1.0.
Use it after Remember has created either permanent or session memory.
Use explicit query_type only when you want to force a specific retrieval mode.
Use datasets to scope results to a specific knowledge base.

What happens under the hood

1. Check session scope
- If you pass session_id without datasets and without query_type, Cognee first searches the session cache directly.
- Session search is keyword-based over stored question, context, and answer fields.
2. Choose the retrieval strategy
- If query_type is provided, Cognee uses it directly.
- Otherwise, if auto_route=True, a rule-based router picks the best strategy based on your query. See Auto-routing behavior below for the kinds of patterns it recognizes.
- If auto_route=False, Cognee falls back to GRAPH_COMPLETION.
3. Run graph retrieval when needed
- If session search finds nothing, or if graph retrieval is requested, recall() queries the permanent knowledge graph.
- The retrieval strategy selected in step 2 determines how that graph query is executed.
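
The decision order described in this section can be modeled as a small pure function. This is only an illustrative sketch of the documented behavior, not Cognee's implementation: plan_recall and the "AUTO_ROUTED" placeholder are hypothetical names introduced here.

```python
from typing import Optional, Sequence, Tuple


def plan_recall(
    session_id: Optional[str] = None,
    datasets: Optional[Sequence[str]] = None,
    query_type: Optional[str] = None,
    auto_route: bool = True,
) -> Tuple[bool, str]:
    """Return (session_first, strategy) for a recall call.

    Hypothetical model of the documented steps; not Cognee's real code.
    """
    # Check session scope: the session cache is searched first only when
    # session_id is passed without datasets and without query_type.
    session_first = bool(session_id) and not datasets and query_type is None

    # Choose the retrieval strategy: an explicit query_type wins, then the
    # rule-based router, then the GRAPH_COMPLETION fallback.
    if query_type is not None:
        strategy = query_type
    elif auto_route:
        strategy = "AUTO_ROUTED"  # placeholder: the router decides per query
    else:
        strategy = "GRAPH_COMPLETION"

    # The strategy applies to the graph query, which runs when session
    # search finds nothing or when graph retrieval is requested.
    return session_first, strategy
```

For example, plan_recall(session_id="chat-42") reports a session-first lookup, while adding datasets or query_type switches it to the graph path.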

After recall finishes

Session-only recall: you get matching session entries when cache hits exist, each tagged with source="session".
Graph-backed recall: you get normalized graph result objects tagged with source="graph".
Hybrid behavior: with session_id plus graph-scoping inputs like datasets or query_type, recall uses the permanent graph path rather than session-only lookup.

Examples and details

Prerequisites before calling recall

recall() only reads from memory — it never initializes anything itself. Populate memory first with remember() (or the legacy add() + cognify() sequence). The first ingestion run creates the relational, vector, and graph databases and the default user.

import cognee

await cognee.remember("Einstein was born in Ulm.")  # creates databases + ingests
results = await cognee.recall("Where was Einstein born?")

If you call recall() before any data exists, it raises RecallPreconditionError (HTTP 422), triggered by the underlying DatabaseNotCreatedError (“The database has not been created yet. Please call await setup() first.”) or UserNotFoundError. The fix is to run remember() (or add() + cognify()) first.

recall() can also fail when the configured LLM provider (or LiteLLM proxy) reports that its token budget is exhausted. In that case it raises LLMPaymentRequiredError, which the API surfaces as HTTP 402 (Payment Required) with body {"error": "Token budget exhausted", "detail": "..."}. This error is terminal: Cognee does not retry budget-exhaustion failures, so treat a 402 as final for the request and prompt the user to top up their token budget rather than reissuing the call.

recall() also accepts only its documented parameters; there is no catch-all **kwargs. Passing an unsupported keyword such as node_type raises a TypeError. Use node_name to scope retrieval to specific nodes or node sets; node_type belongs to the legacy search() API. See the recall() API reference for the full parameter list.
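
One way to act on these three failure modes is to map each exception to a next step. The sketch below matches exceptions by class name because this page does not state where the exception classes are imported from; classify_recall_failure and its return labels are hypothetical helpers, not part of Cognee.

```python
def classify_recall_failure(exc: BaseException) -> str:
    """Map a recall() failure to a suggested next action (illustrative)."""
    # Compare class names across the MRO so subclasses are recognized too.
    # Import paths for these exceptions are not assumed here.
    names = {cls.__name__ for cls in type(exc).__mro__}
    if "RecallPreconditionError" in names:
        return "populate_memory_first"  # run remember() before recall()
    if "LLMPaymentRequiredError" in names:
        return "top_up_budget_do_not_retry"  # terminal: do not reissue
    if isinstance(exc, TypeError):
        return "fix_call_arguments"  # e.g. an unsupported keyword
    return "unhandled"


# Usage sketch (not executed here):
#     try:
#         results = await cognee.recall("Where was Einstein born?")
#     except Exception as exc:
#         action = classify_recall_failure(exc)
```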

Recall while indexing is still running

recall() can run while another dataset is still being ingested or indexed in the background.

There is no global lock that blocks retrieval while remember() is processing.
recall() only sees data that has already made it through indexing.
If a dataset is mid-run, results may be incomplete until that run reaches completion.

If you need to confirm a dataset is fully ready before querying it, check its status with cognee.datasets.get_status() or the indexing-status guidance on Remember.
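
If you want to block until a dataset is ready, a generic polling helper is one option. In this sketch, is_ready is a hypothetical coroutine that you would write around cognee.datasets.get_status(); the arguments and return shape of get_status() are not described on this page, so they are deliberately left out.

```python
import asyncio
from typing import Awaitable, Callable


async def wait_until_ready(
    is_ready: Callable[[], Awaitable[bool]],
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
) -> bool:
    """Poll is_ready() until it returns True or the timeout elapses."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout_s
    while True:
        if await is_ready():
            return True
        if loop.time() >= deadline:
            return False  # still indexing; results may be incomplete
        await asyncio.sleep(interval_s)
```

A False return does not mean recall() will fail. It only signals that results may still be incomplete.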

Auto-routing behavior

The built-in router uses query patterns to choose an underlying search mode:

Summary-style prompts like “overview”, “summary”, or “key takeaways” bias toward summary retrieval.
Relationship questions like “how are X and Y connected?” bias toward graph context-extension retrieval.
Time-oriented questions like “when”, “before”, “after”, or year ranges bias toward temporal retrieval.
Code-focused queries like “coding rules” or async def bias toward coding-rules retrieval.
Exact quoted phrases bias toward lexical chunk search.

If you want direct control, pass query_type explicitly and bypass the router.
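
To make the pattern categories concrete, here is a toy rule-based router. It is not Cognee's router: the regular expressions, the rule order, and the lowercase labels are all assumptions chosen for illustration, and the labels are descriptive names rather than SearchType values.

```python
import re

# Ordered (label, pattern) rules; the first match wins.
_RULES = [
    ("lexical", re.compile(r'"[^"]+"')),  # exact quoted phrase
    ("coding_rules", re.compile(r"coding rules|\basync def\b", re.IGNORECASE)),
    ("temporal", re.compile(r"\b(when|before|after)\b|\b(19|20)\d{2}\b", re.IGNORECASE)),
    ("context_extension", re.compile(r"\b(connected|related to|relationship)\b", re.IGNORECASE)),
    ("summary", re.compile(r"\b(overview|summary|key takeaways)\b", re.IGNORECASE)),
]


def toy_route(query: str) -> str:
    """Return a descriptive strategy label for a query (illustrative only)."""
    for label, pattern in _RULES:
        if pattern.search(query):
            return label
    # No pattern matched: fall back to graph completion.
    return "graph_completion"
```

Because the first matching rule wins, a query that fits several categories is routed by rule order, which is one reason to pass query_type explicitly when the choice matters.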

How recall relates to retrievers

Most users should think in terms of asking memory a question, not selecting a retriever manually. Under the hood, retrieval strategies (graph completion, summary, temporal, lexical, coding-rules) do the actual work. recall() is intentionally one layer above them:

you call recall()
Cognee chooses or accepts a query_type
the matching retriever runs underneath
the result comes back through the recall API

If you want to understand or control the lower-level retrieval behavior directly, see the lower-level search reference.

What recall returns

Session-only recall returns matching session cache entries.
Graph-backed recall returns the output of the underlying retrieval mode selected by the router or query_type.
Depending on the retrieval path, each graph result object’s text can contain a plain answer, retrieved context, chunk text, or rendered structured output.
Each recall result is tagged with source so callers can distinguish session, graph, trace, and graph-context results.
only_context=True skips the final LLM answer-generation step and returns retrieved context instead.
verbose=True exposes extra retrieval details from the lower-level graph search path.

Session Results

When recall() returns session cache hits, the result is a list of ResponseQAEntry objects tagged with source="session". These entries follow the session QA shape documented in Sessions and Caching, with fields such as:

time
qa_id
question
context
answer
feedback_text
feedback_score
source

Graph Results

When recall() falls through to graph retrieval, it returns normalized ResponseGraphEntry objects tagged with source="graph". Each entry includes text, kind, search_type, optional dataset_id and dataset_name, metadata, and raw.

For chunk and summary results (CHUNKS, CHUNKS_LEXICAL, SUMMARIES), metadata carries stable source identifiers: data_id (the ingested Data item’s id), chunk_id (the chunk’s own node id), chunk_index, and document_name. These let you map a result back to what you ingested and inspect the exact cited chunk. Only keys present in the underlying payload are included; completion-style results carry an empty metadata dict. For GRAPH_COMPLETION (and other completion) answers, the same data_id/chunk_id are instead surfaced inline in the Evidence: bullets (- chunk N of document NAME (data_id: …, chunk_id: …): "snippet") when include_references=True. This is additive; no DB migration is required.

The text field is derived from the underlying retrieval mode selected by the router or query_type:

Search type(s)	What `text` represents	Notes
`GRAPH_COMPLETION`, `RAG_COMPLETION`, `HYBRID_COMPLETION`, `TRIPLET_COMPLETION`, `GRAPH_COMPLETION_DECOMPOSITION`, `GRAPH_SUMMARY_COMPLETION`, `GRAPH_COMPLETION_COT`, `GRAPH_COMPLETION_CONTEXT_EXTENSION`	Natural-language answer	With `only_context=True`, `text` contains the formatted context string instead.
`TEMPORAL`	Time-aware answer text	`only_context=True` returns retrieved temporal context instead of an answer.
`AGENTIC_COMPLETION`	Agentic answer text	Requires the resolved graph scope to contain exactly one dataset.
`CHUNKS`	Chunk text	`raw` preserves the normalized chunk payload.
`CHUNKS_LEXICAL`	Ranked chunk text	`raw` may include scores and chunk metadata.
`SUMMARIES`	Summary text	`raw` preserves the normalized summary payload.
`CYPHER`, `NATURAL_LANGUAGE`, `CODING_RULES`	Rendered structured row/object output	`raw` preserves the normalized structured payload. `CYPHER` and `NATURAL_LANGUAGE` are disabled when `ALLOW_CYPHER_QUERY=false`.
`FEELING_LUCKY`	Varies	Uses whichever search type the router selects.

Read recall results with attribute access (result.text, result.raw, result.source), not result.get(...). The text_result, context_result, and objects_result keys belong to the legacy search(verbose=True) API, which returns plain dicts. See recall() — Return value for the full per-source field list.
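
The sketch below shows attribute-style access in practice: bucketing results by source and building a short citation from metadata. It runs against stand-in objects built with SimpleNamespace, not real ResponseQAEntry or ResponseGraphEntry instances; group_by_source, citation, and the sample values are hypothetical.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_source(results):
    """Bucket recall results by their source tag using attribute access."""
    buckets = defaultdict(list)
    for result in results:
        buckets[result.source].append(result)
    return dict(buckets)


def citation(result):
    """Build a short citation from whichever metadata keys are present."""
    meta = getattr(result, "metadata", None) or {}
    keys = ("document_name", "chunk_index", "data_id", "chunk_id")
    return ", ".join(f"{key}={meta[key]}" for key in keys if key in meta)


# Stand-in objects shaped like the documented entries (illustrative values).
sample = [
    SimpleNamespace(source="session", question="Where was Einstein born?", answer="Ulm"),
    SimpleNamespace(
        source="graph",
        text="Einstein was born in Ulm.",
        metadata={"document_name": "bio.txt", "chunk_index": 0},
    ),
]
grouped = group_by_source(sample)
```

Since only keys present in the underlying payload are included, citation() checks each key rather than assuming all four identifiers exist.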

Using only_context

Set only_context=True when you want the retrieved context without asking Cognee to produce a final answer. When only_context=True, Cognee stops after retrieval formatting:

the LLM answer is not generated
the final completion step is skipped
the returned value is the retrieved context rather than a synthesized answer

results = await cognee.recall(
    query_text="Tell me about NLP",
    only_context=True,
)

This is useful when you want to:

inspect what was retrieved before answer generation
feed the retrieved context into your own prompt or downstream logic
debug retrieval quality separately from answer quality

In other words, only_context=True keeps recall focused on retrieval output instead of answer generation.
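
As a sketch of feeding retrieved context into your own prompt, the helper below joins the text of each result into a single prompt string. build_prompt, the separator, and the prompt wording are hypothetical choices, and the demo uses stand-in objects instead of a live recall() call.

```python
from types import SimpleNamespace


def build_prompt(question, context_results, max_chars=4000):
    """Join retrieved context texts into a prompt for your own LLM call."""
    # Keep only results that carry non-empty text.
    blocks = [r.text for r in context_results if getattr(r, "text", "")]
    context = "\n---\n".join(blocks)[:max_chars]
    return (
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer using only the context above."
    )


# Stand-ins for results returned by recall(..., only_context=True).
demo_results = [SimpleNamespace(text="NLP studies language."), SimpleNamespace(text="")]
demo_prompt = build_prompt("Tell me about NLP", demo_results)
```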

Session cache reads and writes during recall

session_id makes recall session-aware, but the exact behavior depends on the recall mode.

Session reads:

With session_id and no explicit query_type, recall() searches session-cache QA entries by keyword and returns matches tagged with source="session".
If session_id is used with datasets or dataset_ids, recall can combine session lookup with graph retrieval.
If you pass an explicit query_type, the default path is graph-backed retrieval. Session history can still be used during the LLM completion step, but session-cache QA lookup is not part of the default source set unless you pass scope.

Graph-backed completion with session history:

When graph-backed recall runs with session_id and the LLM completion step is enabled, Cognee loads prior session conversation history and prepends it to the completion prompt. If improve() has saved a graph_context snapshot for that session, that snapshot is also prepended as background knowledge.

Session writes:

When the LLM completion step runs through session-enabled graph retrieval, Cognee appends a new QA entry to the session cache with the question, stored context, and generated answer. The next compatible recall on the same session_id can see that entry.

Using only_context=True:

Pass only_context=True to skip the final LLM completion. Because the QA write happens during completion, only_context=True also avoids adding a new QA entry to the session cache.

# Session-aware recall with answer generation can write a new QA entry
answer = await cognee.recall(
    query_text="What did we decide about pricing?",
    session_id="chat-42",
)

# Returns retrieved context only; skips final LLM completion and QA write
context = await cognee.recall(
    query_text="What did we decide about pricing?",
    session_id="chat-42",
    only_context=True,
)

Use only_context=True when you want retrieval output without LLM summarization and without growing the session cache.

Dataset scoping

answers = await cognee.recall(
    query_text="Give me an overview of this dataset.",
    datasets=["product_docs"],
)

Scoping is exclusive. When you pass datasets (or dataset_ids), retrieval runs against only those datasets — Cognee does not pull from any other dataset, even ones the current user can read.
When you omit both, recall searches across all datasets the current user has read access to. Supply a dataset to narrow that down to a single knowledge base.
datasets limits graph retrieval to named datasets. With backend access control enabled, names are resolved only against datasets owned by the current user.
dataset_ids scopes retrieval by dataset UUID instead of name. Use this for shared datasets that the current user can access but did not create. When supplied, it takes precedence over datasets and the name-to-UUID resolution step is skipped.
With session_id plus either datasets or dataset_ids, recall becomes hybrid: session context is available, but graph retrieval is scoped to the selected dataset or dataset UUIDs.

If you set query_type=SearchType.AGENTIC_COMPLETION, the resolved graph scope must contain exactly one dataset. Passing multiple dataset names or UUIDs, or leaving scope broad enough to match multiple readable datasets, can raise 422 InvalidAgenticDatasetScope.

If Alice created shared_dataset and Bob only has permission to use it, Bob should query it with dataset_ids=[shared_id], not datasets=["shared_dataset"]. Name-based lookup can fail for non-owners even when they have read access.
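
The scoping rules above can be captured in a small helper that builds keyword arguments for recall(). This is a sketch under stated assumptions: scope_kwargs is a hypothetical helper, the pre-check for AGENTIC_COMPLETION only mirrors the documented "exactly one dataset" rule on the client side, and UUID values are passed through unchanged because their accepted type is not specified here.

```python
from typing import Any, Dict, List, Optional, Sequence


def scope_kwargs(
    names: Optional[Sequence[str]] = None,
    ids: Optional[Sequence[Any]] = None,
    agentic: bool = False,
) -> Dict[str, List[Any]]:
    """Build dataset-scoping kwargs for recall() (illustrative helper)."""
    # dataset_ids takes precedence over datasets when both are supplied.
    if ids:
        kwargs: Dict[str, List[Any]] = {"dataset_ids": list(ids)}
    elif names:
        kwargs = {"datasets": list(names)}
    else:
        kwargs = {}  # broad scope: all datasets the user can read

    if agentic:
        selected = next(iter(kwargs.values()), [])
        if len(selected) != 1:
            # Fail early instead of waiting for the server-side 422.
            raise ValueError("AGENTIC_COMPLETION needs exactly one dataset in scope")
    return kwargs
```

For a shared dataset such as the Alice and Bob case, Bob would call scope_kwargs(ids=[shared_id]) and unpack the result into recall().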

Parameters

Basic Parameters

Option	What it does
`query_text`	The natural-language query to answer.
`query_type`	Forces a specific underlying search type instead of using auto-routing.
`datasets`	Restricts graph retrieval to specific dataset names. Names are resolved only against datasets owned by the current user.
`dataset_ids`	Restricts graph retrieval by dataset UUIDs. Use this for shared datasets the current user did not create. Takes precedence over `datasets` when both are set.
`top_k`	Limits the number of returned results. Defaults to `15`.
`auto_route`	Enables the rule-based query router. Defaults to `True`.
`session_id`	Enables session-aware retrieval and session-cache lookup.
`only_context`	Returns retrieved context only. The final LLM answer is not generated.
`system_prompt` / `system_prompt_path`	Customizes the generation prompt.

Advanced Parameters

Option	What it does
`node_name` / `node_name_filter_operator`	Restricts graph retrieval to matching node names or node sets.
`wide_search_top_k`	Expands the initial candidate set used by graph-completion retrieval before ranking.
`triplet_distance_penalty`	Adjusts scoring for triplet-based retrieval paths.
`feedback_influence`	Applies stored feedback weights during ranking where supported.
`verbose`	Returns extra retrieval details from the lower-level graph search path.
`retriever_specific_config`	Passes advanced configuration directly to the selected retriever.
`user`	Runs recall under a specific user context, affecting dataset access and session-cache lookup.
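
Because recall() rejects unknown keywords with a TypeError, it can help to check arguments before calling it. The set below is assembled only from names that appear on this page, so it may be incomplete relative to the API reference; DOCUMENTED_RECALL_PARAMS and unsupported_keywords are hypothetical names.

```python
# Parameter names taken from the tables above, plus include_references and
# scope, which are mentioned elsewhere on this page. Possibly incomplete.
DOCUMENTED_RECALL_PARAMS = frozenset({
    "query_text", "query_type", "datasets", "dataset_ids", "top_k",
    "auto_route", "session_id", "only_context", "system_prompt",
    "system_prompt_path", "node_name", "node_name_filter_operator",
    "wide_search_top_k", "triplet_distance_penalty", "feedback_influence",
    "verbose", "retriever_specific_config", "user",
    "include_references", "scope",
})


def unsupported_keywords(**kwargs):
    """Return the sorted keyword names that are not in the documented set."""
    return sorted(set(kwargs) - DOCUMENTED_RECALL_PARAMS)
```

A non-empty result, such as ["node_type"], points to a keyword that belongs to the legacy search() API rather than recall().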

Under the hood — legacy operations

recall() runs Search under the hood for graph-backed retrieval. Use legacy Search directly when you need to select a specific retriever, inspect retrieval internals, or use advanced parameters not exposed by recall(). See also Search Basics for the full set of retrieval options.

recall() vs search(): when to use each

	`recall()`	`search()` (legacy)
Recommended for	Most v1.0 use cases	Advanced retriever control
Search type selection	Auto-routes by default; pass `query_type` to override	Always explicit; defaults to `GRAPH_COMPLETION`
Session behavior	Searches session-cache QA entries by keyword; falls through to graph on miss	`session_id` writes/reads conversation history only — no cache lookup
Source tagging	Results carry `source` (`"session"`, `"graph"`, `"trace"`, or `"graph_context"`)	No source tagging
Parameter surface	Compact; explicit keyword options documented in the API reference	Full surface — `neighborhood_depth`, `node_type`, `triplet_distance_penalty` with explicit defaults

Use recall() when you want to ask a question and get an answer. It handles routing, session lookup, and source attribution automatically.

Use search() directly when you need a specific retriever, lower-level parameters such as neighborhood_depth or node_type, or when building custom pipelines.

Remember

Create permanent or session memory

Improve

Enrich the graph for better future recall
