# Search

> Query your AI memory with vectors, graphs, and LLMs.

<Note>
  `search()` is a legacy operation. In Cognee v1.0, most users should use [recall()](/core-concepts/main-operations/recall) instead. Legacy `search()` remains useful when you need explicit control over retrievers and lower-level retrieval parameters.
</Note>

## What is search

`search` lets you ask questions over everything you've ingested and cognified.\
Under the hood, Cognee blends **vector similarity**, **graph structure**, and **LLM reasoning** to return answers with context and provenance.

## The big picture

* **Dataset-aware**: searches run against one or more datasets you can read *(requires `ENABLE_BACKEND_ACCESS_CONTROL=true`)*
* **Multiple modes**: from simple chunk lookup to graph-aware Q\&A
* **Hybrid retrieval**: vectors find relevant pieces; graphs provide structure; LLMs compose answers
* **Conversational memory**: for GRAPH\_COMPLETION, RAG\_COMPLETION, and TRIPLET\_COMPLETION, use `session_id` to maintain conversation history across searches *(requires caching enabled)*. When caching is on, omitting `session_id` uses `default_session` and still stores history. Other search types do not use session history.
* **Safe by default**: permissions are checked before any retrieval
* **Observability**: telemetry is emitted for query start/completion
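
As a rough sketch of the conversational-memory point above, the helper below sends several questions under one `session_id` so that later questions can build on earlier turns. It assumes caching is enabled. `ask_in_session` and its `search` parameter are hypothetical names used only for illustration; `query_text` and `session_id` are the parameters named on this page.

```python
async def ask_in_session(questions, session_id, search=None):
    """Run each question through search() under the same session_id."""
    if search is None:
        # Imported lazily so the helper can be exercised without Cognee installed.
        import cognee

        search = cognee.search

    answers = []
    for question in questions:
        # Reusing session_id lets session-aware modes see earlier turns.
        answers.append(await search(query_text=question, session_id=session_id))
    return answers
```

Call it from async code, for example `await ask_in_session(["Who wrote the report?", "When?"], session_id="support-chat")`. No search type is passed, so the default graph-aware completion is used, which is one of the session-aware modes listed above.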

<Warning>
  **Dataset scoping** requires specific configuration. See [permissions system](/core-concepts/multi-user-mode/permissions-system/datasets#dataset-isolation-how-access-is-enforced) for details on access control requirements and supported database setups.
</Warning>

## Where search fits

Use `search` after you've run `.add` and `.cognify`.
At that point, your dataset has chunks, summaries, embeddings, and a knowledge graph—so queries can leverage both **similarity** and **structure**.
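
The ordering described above can be sketched as a single coroutine. This is a minimal illustration, not a full setup guide: `ingest_and_search` and its `client` parameter are hypothetical, and the sketch assumes Cognee is installed and configured with an LLM provider when `client` is left unset.

```python
async def ingest_and_search(text, question, client=None):
    """Add data, build the knowledge graph, then query it."""
    if client is None:
        # Imported lazily so the function can be tested with a stand-in client.
        import cognee

        client = cognee

    await client.add(text)   # ingest raw data
    await client.cognify()   # build chunks, summaries, embeddings, and the graph
    # No query_type is passed, so the default graph-aware completion is used.
    return await client.search(query_text=question)
```

Run it with `asyncio.run(ingest_and_search("Some text", "A question about it"))`. Searching before the first two steps have completed leaves the retrievers with nothing to query.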

<Warning>
  If you see `DatabaseNotCreatedError: please call await setup() first`, Cognee has not been initialized yet. This usually means you called `search`, or another operation that resolves the default user, before initialization happened. Run `cognee.add(...)` or `cognee.remember(...)` first, then retry your operation.
</Warning>

<Note>
  Completion-type search modes (`GRAPH_COMPLETION`, `RAG_COMPLETION`, `TRIPLET_COMPLETION`, and the other `*_COMPLETION` types) call the LLM to compose an answer, so they can fail when the configured LLM provider (or LiteLLM proxy) reports that its token budget is exhausted. In that case `search()` raises `LLMPaymentRequiredError`, which the API surfaces as **HTTP 402 (Payment Required)** with body `{"error": "Token budget exhausted", "detail": "..."}`. This error is **terminal** — Cognee does not retry budget-exhaustion failures — so treat a `402` as final for the request and prompt the user to top up their token budget rather than reissuing the call.
</Note>
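
On the client side, the note above amounts to treating `402` as a stop signal rather than a retryable failure. The sketch below shows one way to encode that, using only the status code and body shape quoted in the note. The function name and the `action` labels are hypothetical, and how other status codes should be handled is left open.

```python
import json

PAYMENT_REQUIRED = 402  # status the API uses for an exhausted token budget


def classify_search_response(status_code, body):
    """Map an HTTP status and raw body to a client-side action."""
    if status_code == PAYMENT_REQUIRED:
        try:
            payload = json.loads(body)
        except (TypeError, ValueError):
            payload = {}
        if not isinstance(payload, dict):
            payload = {}
        return {
            "action": "stop_and_top_up",  # terminal: do not reissue the call
            "error": payload.get("error", "Token budget exhausted"),
            "detail": payload.get("detail", ""),
        }
    if 200 <= status_code < 300:
        return {"action": "use_result", "error": None, "detail": ""}
    # Any other failure is outside the scope of this sketch.
    return {"action": "handle_other_error", "error": f"HTTP {status_code}", "detail": ""}
```

A caller would show the `error` and `detail` fields to the user when `action` is `stop_and_top_up`, and skip any retry loop for that request.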

## How it works (conceptually)

1. **Scope & permissions**\
   Resolve target datasets (by name or id) and enforce read access.

   For most search types, this scope can contain multiple datasets. `AGENTIC_COMPLETION` is stricter and requires the resolved scope to contain exactly one dataset.

2. **Mode dispatch**\
   Pick a search mode (default: **graph-aware completion**) and route to its retriever.

3. **Retrieve → (optional) generate**\
   Collect context via vectors and/or graph traversal; some modes then ask an LLM to compose a final answer.

4. **Return results**\
   Depending on mode: a natural-language answer string, chunk/summary dicts with metadata, graph records, or Cypher results.
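
Steps 1 and 2 can be pictured with a small, self-contained sketch. It is a conceptual model, not Cognee's implementation: both function names are hypothetical, the mapping is a subset of the Retrievers table on this page, and raising on an unreadable dataset (rather than silently dropping it) is an assumption.

```python
# Subset of the search-type-to-retriever mapping listed in the Retrievers table.
RETRIEVER_BY_TYPE = {
    "GRAPH_COMPLETION": "GraphCompletionRetriever",
    "RAG_COMPLETION": "CompletionRetriever",
    "CHUNKS": "ChunksRetriever",
}


def resolve_scope(requested, readable, search_type):
    """Step 1: reject datasets the caller cannot read, then apply mode rules."""
    denied = [name for name in requested if name not in readable]
    if denied:
        raise PermissionError(f"No read access to: {', '.join(denied)}")
    if search_type == "AGENTIC_COMPLETION" and len(requested) != 1:
        raise ValueError("AGENTIC_COMPLETION requires exactly one dataset")
    return list(requested)


def pick_retriever(search_type="GRAPH_COMPLETION"):
    """Step 2: route the search type to its retriever (default: graph completion)."""
    if search_type not in RETRIEVER_BY_TYPE:
        raise ValueError(f"Unsupported search type in this sketch: {search_type}")
    return RETRIEVER_BY_TYPE[search_type]
```

The point of the ordering is that permission checks run before any retriever is chosen, so no retrieval happens against a dataset the caller cannot read.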

For a practical guide to using search with examples and detailed parameter explanations, see [Search Basics](/guides/search-basics).

## Retrievers

Each search type is handled by a **retriever**. The pipeline is: `get_retrieved_objects` → `get_context_from_objects` → `get_completion_from_context` (skipped when `only_context=True`).

| Search type                           | Retriever                                |
| ------------------------------------- | ---------------------------------------- |
| GRAPH\_COMPLETION                     | GraphCompletionRetriever                 |
| RAG\_COMPLETION                       | CompletionRetriever                      |
| HYBRID\_COMPLETION                    | HybridRetriever                          |
| CHUNKS                                | ChunksRetriever                          |
| SUMMARIES                             | SummariesRetriever                       |
| GRAPH\_SUMMARY\_COMPLETION            | GraphSummaryCompletionRetriever          |
| GRAPH\_COMPLETION\_COT                | GraphCompletionCotRetriever              |
| GRAPH\_COMPLETION\_CONTEXT\_EXTENSION | GraphCompletionContextExtensionRetriever |
| TRIPLET\_COMPLETION                   | TripletRetriever                         |
| CHUNKS\_LEXICAL                       | BM25ChunksRetriever                      |
| CODING\_RULES                         | CodingRulesRetriever                     |
| TEMPORAL                              | TemporalRetriever                        |
| CYPHER                                | CypherSearchRetriever                    |
| NATURAL\_LANGUAGE                     | NaturalLanguageRetriever                 |

You can register a custom retriever for a search type via `use_retriever(SearchType, RetrieverClass)`; the class must implement the same three-step interface (`BaseRetriever`). See the API reference for `BaseRetriever` and `register_retriever`.
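
To make the three-step interface concrete, here is a toy retriever that follows the same method names. It is a standalone sketch: it does not subclass Cognee's `BaseRetriever`, the method signatures are assumptions, and the "completion" step is a stub where a real retriever would call an LLM. `KeywordRetriever` and `run_pipeline` are hypothetical names.

```python
class KeywordRetriever:
    """Toy retriever mirroring the three-step pipeline (not Cognee's BaseRetriever)."""

    def __init__(self, documents):
        self.documents = documents

    async def get_retrieved_objects(self, query):
        # Keep documents that contain any query term (case-insensitive).
        terms = query.lower().split()
        return [doc for doc in self.documents if any(t in doc.lower() for t in terms)]

    async def get_context_from_objects(self, query, retrieved_objects):
        # Format the raw objects into one context string.
        return "\n".join(retrieved_objects)

    async def get_completion_from_context(self, query, retrieved_objects, context):
        # Stub: a real retriever would send the context to an LLM here.
        return f"Answer to {query!r} based on {len(retrieved_objects)} passage(s)."


async def run_pipeline(retriever, query, only_context=False):
    """Run the three steps in order, skipping generation when only_context is set."""
    objects = await retriever.get_retrieved_objects(query)
    context = await retriever.get_context_from_objects(query, objects)
    if only_context:
        return context
    return await retriever.get_completion_from_context(query, objects, context)
```

Before registering a real custom retriever, check the `BaseRetriever` API reference for the exact signatures, since the ones above are only illustrative.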

### Multi-query (batch)

**GraphCompletionRetriever**, **GraphCompletionCotRetriever**, and **GraphCompletionContextExtensionRetriever** support **batch mode**: pass `query_batch` (a non-empty list of strings) instead of `query`. You get one result per query; session cache is not used in batch mode. The public `cognee.search()` API accepts only a single `query_text`; batch is available when you use the retrievers directly (e.g. in custom pipelines).
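
The input contract for batch mode can be sketched as a small validation helper. This is illustrative only: `normalize_queries` is a hypothetical function, and rejecting calls that pass both `query` and `query_batch` is an assumption drawn from the "instead of" wording above.

```python
def normalize_queries(query=None, query_batch=None):
    """Return the list of queries to run, one result expected per entry."""
    if (query is None) == (query_batch is None):
        raise ValueError("Pass exactly one of `query` or `query_batch`")
    if query is not None:
        return [query]
    if not isinstance(query_batch, list) or not query_batch:
        raise ValueError("`query_batch` must be a non-empty list of strings")
    if not all(isinstance(item, str) for item in query_batch):
        raise ValueError("`query_batch` must contain only strings")
    return list(query_batch)
```

A single query is treated as a batch of one here, which keeps the "one result per query" rule uniform across both call styles.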

<Accordion title="GRAPH_COMPLETION (default)" defaultOpen={true}>
  Graph-aware question answering.

  * **What it does**: Finds relevant graph triplets using vector hints across indexed fields, resolves them into readable context, and asks an LLM to answer your question grounded in that context.
  * **Why it’s useful**: Combines fuzzy matching (vectors) with precise structure (graph) so answers reflect relationships, not just nearby text.
  * **Output**: `str` by default. With `only_context=True`, returns the formatted context string instead of an answer. With `verbose=True`, returns a payload containing `text_result`, `context_result`, and `objects_result`. With `ENABLE_BACKEND_ACCESS_CONTROL=true`, results are wrapped with `dataset_name` and `dataset_id`.
</Accordion>

<Accordion title="RAG_COMPLETION">
  Retrieve-then-generate over text chunks.

  * **What it does**: Pulls top-k chunks via vector search, stitches a context window, then asks an LLM to answer.
  * **When to use**: You want fast, text-only RAG without graph structure.
  * **Scope**: Honors `node_name` and `node_name_filter_operator`, so chunk retrieval can be limited to specific node sets.
  * **Output**: `str` by default. With `only_context=True`, returns the formatted context string instead of an answer. With `verbose=True`, returns a payload containing `text_result`, `context_result`, and `objects_result`. With `ENABLE_BACKEND_ACCESS_CONTROL=true`, results are wrapped with `dataset_name` and `dataset_id`.
</Accordion>

<Accordion title="HYBRID_COMPLETION">
  Blended chunk + entity retrieval with LLM completion.

  * **What it does**: Builds a single context from three channels (BM25 lexical chunks, semantic vector chunks, and entity/graph context made of matched entities plus their connected edges), then asks an LLM to answer grounded in that combined context.
    * **Chunks**: The lexical and vector chunks are merged and de-duplicated up to `chunks_top_k`.
    * **Entity edges**: Each entity's edge bullets are ordered by how relevant the edge text is to the query: entity-type edges are pinned first, then query-ranked edges, then remaining edges in graph order.
    * **Related facts**: The combined context ends with a `## Related facts` section holding up to `facts_top_k` edge-derived facts ranked by similarity to the query, excluding facts already shown as entity edge bullets. Facts whose text comes from chunk→entity "contains" edges are rendered as `Name: description` glossary-style lines.
  * **When to use**: You want both keyword precision (BM25) and semantic recall (vectors) alongside entity relationships in one answer, without choosing between lexical and vector chunk search.
  * **Output**: `str` by default. With `only_context=True`, returns the formatted context string instead of an answer. With `verbose=True`, returns a payload containing `text_result`, `context_result`, and `objects_result`. With `ENABLE_BACKEND_ACCESS_CONTROL=true`, results are wrapped with `dataset_name` and `dataset_id`.
  * **Cost**: Issues both a BM25 lexical chunk lookup and a semantic vector-store query per search, so expect slightly higher query load than a single-channel mode.
  * **Limitation**: Does not support `query_batch` (single `query` only).

  **Configuration (`retriever_specific_config`)**

  | Key                            | Default        | What it does                                                                                                                                                             |
  | ------------------------------ | -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
  | `chunks_top_k`                 | `top_k` (`15`) | Max merged chunks (BM25 + vector) included in context.                                                                                                                   |
  | `entities_top_k`               | `top_k` (`15`) | Number of entities retrieved for entity/graph context.                                                                                                                   |
  | `max_edges_per_entity`         | `10`           | Max connected edges listed per entity.                                                                                                                                   |
  | `facts_top_k`                  | `top_k` (`15`) | Max edge-derived facts included in the `## Related facts` section. Set to `0` to disable the section. Facts are also skipped when the search is scoped with `node_name`. |
  | `include_global_context_index` | `false`        | When `true`, prepends a global-context section (root text + top global-context summaries) to the answer context.                                                         |
  | `global_context_index_top_k`   | `3`            | Number of global-context summaries to include when `include_global_context_index=true`.                                                                                  |

  Pass these through `retriever_specific_config`.
</Accordion>

<Accordion title="CHUNKS">
  Direct chunk retrieval.

  * **What it does**: Returns the most similar text chunks to your query via vector search.
  * **When to use**: You want raw passages/snippets to display or post-process.
  * **Output**: `list[dict]`. Each chunk item includes `id`, `text`, `chunk_index`, `chunk_size`, and `cut_type`. With `ENABLE_BACKEND_ACCESS_CONTROL=true`, results are wrapped with `dataset_name` and `dataset_id`. See [Citation and Source Tracking](/guides/search-basics#citation-and-source-tracking) for details and examples.
</Accordion>

<Accordion title="SUMMARIES">
  Search over precomputed summaries.

  * **What it does**: Vector search on `TextSummary` content for concise, high-signal hits.
  * **When to use**: You prefer short summaries instead of full chunks.
  * **Output**: `list[dict]`. Each summary item includes `id` and `text`. With `ENABLE_BACKEND_ACCESS_CONTROL=true`, results are wrapped with `dataset_name` and `dataset_id`. See [Citation and Source Tracking](/guides/search-basics#citation-and-source-tracking) for details and examples.
</Accordion>

<Accordion title="GRAPH_SUMMARY_COMPLETION">
  Graph-aware summary answering.

  * **What it does**: Builds graph context like GRAPH\_COMPLETION, then condenses it before answering.
  * **When to use**: You want a tighter, summary-first response.
  * **Output**: `str` by default. With `only_context=True`, returns the formatted context string instead of an answer. With `verbose=True`, returns a payload containing `text_result`, `context_result`, and `objects_result`. With `ENABLE_BACKEND_ACCESS_CONTROL=true`, results are wrapped with `dataset_name` and `dataset_id`.
</Accordion>

<Accordion title="GRAPH_COMPLETION_COT">
  Chain-of-thought over the graph.

  * **What it does**: Starts from `GRAPH_COMPLETION` (retrieve triplets, draft an answer), then runs `max_iter` reasoning rounds. Each round validates the current answer, generates a follow-up question, fetches new triplets for it, merges them into the context, and regenerates the answer.
  * **When to use**: Complex multi-hop questions where the answer depends on traversing several relationships (for example `A → B → C → D`). Each round can pull in one additional hop's worth of triplets, so deeper chains need more iterations.
  * **Output**: `str` by default. With `only_context=True`, returns the formatted context string instead of an answer. With `verbose=True`, returns a payload containing `text_result`, `context_result`, and `objects_result`. With `ENABLE_BACKEND_ACCESS_CONTROL=true`, results are wrapped with `dataset_name` and `dataset_id`.

  **Controlling depth with `max_iter`**

  `max_iter` (default `4`) is the number of follow-up rounds after the initial retrieval. Higher values let the retriever traverse further from the seed entities but multiply LLM calls: each round adds a validation call, a follow-up-generation call, a new triplet fetch, and a completion call. Latency and token cost scale roughly linearly with `max_iter`; at the default of `4`, that is about 12 extra LLM calls (three per round) on top of the initial answer.

  Pass it through `retriever_specific_config`.
</Accordion>

<Accordion title="GRAPH_COMPLETION_CONTEXT_EXTENSION">
  Iterative context expansion.

  * **What it does**: Starts like `GRAPH_COMPLETION`: vector similarity seeds relevant triplets, then graph retrieval resolves them into context. It then runs up to `context_extension_rounds` extension rounds (default `4`), using each generated answer as a follow-up query to fetch and merge additional triplets until no new triplets are found.
  * **When to use**: Open-ended or exploratory queries that need broader subgraph coverage than a single `GRAPH_COMPLETION` pass.
  * **Output**: `str` by default. With `only_context=True`, returns the formatted context string instead of an answer. With `verbose=True`, returns a payload containing `text_result`, `context_result`, and `objects_result`. With `ENABLE_BACKEND_ACCESS_CONTROL=true`, results are wrapped with `dataset_name` and `dataset_id`.
</Accordion>

<Accordion title="NATURAL_LANGUAGE">
  Natural language to Cypher to execution.

  * **What it does**: Infers a Cypher query from your question using the graph schema, runs it, and returns the results.
  * **When to use**: You want structured graph answers without writing Cypher.
  * **Output**: `list[dict[str, Any]]` containing executed graph-query result rows. With `ENABLE_BACKEND_ACCESS_CONTROL=true`, results are wrapped with `dataset_name` and `dataset_id`.
  * **Availability**: Disabled when `ALLOW_CYPHER_QUERY=false`.
</Accordion>

<Accordion title="CYPHER">
  Run Cypher directly.

  * **What it does**: Executes your Cypher query against the graph database.
  * **When to use**: You know the schema and want full control.
  * **Output**: `list[dict[str, Any]]` containing raw Cypher result rows. With `ENABLE_BACKEND_ACCESS_CONTROL=true`, results are wrapped with `dataset_name` and `dataset_id`.
  * **Availability**: Disabled when `ALLOW_CYPHER_QUERY=false`.
</Accordion>

<Accordion title="CODING_RULES">
  Code-focused retrieval (coding rules / codebase search).

  * **What it does**: Retrieves rules or code context from the `coding_agent_rules` nodeset and returns structured code information.
  * **When to use**: Codebases or coding guidelines indexed by Cognee (e.g. via memify).
  * **Output**: `list[dict[str, Any]]` containing structured code/rule retrieval objects. Exact fields depend on the retrieved rule/context objects. With `ENABLE_BACKEND_ACCESS_CONTROL=true`, results are wrapped with `dataset_name` and `dataset_id`.
  * **Prereq**: The `coding_agent_rules` nodeset must be populated (e.g. via [memify](/guides/memify-quickstart)).
</Accordion>

<Accordion title="TRIPLET_COMPLETION">
  Triplet-based retrieval with LLM completion (no full graph traversal).

  * **What it does**: Retrieves graph triplets by vector similarity, resolves them to text, and asks an LLM to answer.
  * **When to use**: You want triplet-level context without full graph expansion.
  * **Scope**: Honors `node_name` and `node_name_filter_operator`, so triplet retrieval can be limited to specific node sets.
  * **Output**: `str` by default. With `only_context=True`, returns the formatted context string instead of an answer. With `verbose=True`, returns a payload containing `text_result`, `context_result`, and `objects_result`. With `ENABLE_BACKEND_ACCESS_CONTROL=true`, results are wrapped with `dataset_name` and `dataset_id`.
  * **Prereq**: Triplet embeddings must exist. Either set `TRIPLET_EMBEDDING=true` before running [cognify](/core-concepts/main-operations/legacy-operations/cognify), or run the [`create_triplet_embeddings`](/guides/memify-triplet-embeddings) memify pipeline. The retriever reads from the `Triplet_text` collection.
</Accordion>

<Accordion title="CHUNKS_LEXICAL">
  Lexical (keyword-style) chunk search.

  * **What it does**: Returns chunks that match your query using token-based BM25 lexical ranking, not semantic embeddings.
  * **When to use**: Exact-term or keyword-style lookups; stopword-aware search.
  * **Output**: `list[dict]` containing ranked chunk-like results. Depending on backend and normalization, items may include scores in addition to chunk text and metadata. With `ENABLE_BACKEND_ACCESS_CONTROL=true`, results are wrapped with `dataset_name` and `dataset_id`.
</Accordion>

<Accordion title="TEMPORAL">
  Time-aware retrieval.

  * **What it does**: Retrieves and ranks content by temporal relevance (dates, events) and answers with time context.
  * **When to use**: Queries about "before/after X", "in 2020", or event timelines.
  * **Output**: `str` by default. With `only_context=True`, returns the formatted temporal context instead of an answer. With `ENABLE_BACKEND_ACCESS_CONTROL=true`, results are wrapped with `dataset_name` and `dataset_id`. See [Time-awareness](/guides/time-awareness) for setup.
</Accordion>

<Accordion title="FEELING_LUCKY">
  Automatic mode selection.

  * **What it does**: Uses an LLM to choose the most suitable search type for your query, then runs it. Falls back to `RAG_COMPLETION` if mode selection fails.
  * **When to use**: Exploratory or one-off queries when you are unsure which mode fits best. For production or latency-sensitive workloads, prefer a specific search type.
  * **Output**: Varies. Returns the same shape as whichever search type the router selects.
</Accordion>

<Accordion title="INSIGHTS (MCP only)">
  Graph-edge result format specific to the Cognee MCP server.

  * **What it does**: Runs search in the Cognee MCP server and formats the raw triplet results as readable relationship lines instead of asking an LLM to compose an answer.
  * **When to use**: You want to inspect raw graph edges in MCP rather than receive a natural-language response.
  * **Output**: A newline-separated list of relationship statements. `INSIGHTS` is MCP-only and is not available through the Python `SearchType` enum.
</Accordion>

<Info>
  **Feedback** is handled via [Sessions](/core-concepts/sessions-and-caching): use `cognee.session.add_feedback` and `cognee.session.delete_feedback`. See the [Sessions Guide](/guides/sessions) and [Feedback System](/guides/feedback-system) for full details.
</Info>

## Further details

<AccordionGroup>
  <Accordion title="Searching across both graph and vector stores">
    You do **not** need to run two separate searches and merge the results — the graph-completion family already queries both stores in a single call.

    For `GRAPH_COMPLETION`, `GRAPH_COMPLETION_DECOMPOSITION`, `GRAPH_SUMMARY_COMPLETION`, `GRAPH_COMPLETION_COT`, and `GRAPH_COMPLETION_CONTEXT_EXTENSION`, retrieval works in two combined steps:

    1. **Vector store** — seed nodes and edges are found by embedding similarity across indexed fields (`brute_force_triplet_search` over the vector index collections).
    2. **Graph store** — those seeds are resolved against the knowledge graph so the surrounding triplets (nodes + relationships) come back as structured context.

    The merged graph + vector context is then formatted and (unless `only_context=True`) passed to the LLM. So a single `GRAPH_COMPLETION` call is the built-in way to "look at both the graph and the vector store" at once.

    `TRIPLET_COMPLETION` is related but different: it searches precomputed triplet text in the vector store (`Triplet_text`) and uses those triplet payloads as context, without expanding the seeds through the graph store.

    ```python theme={null}
    import asyncio

    import cognee
    from cognee import SearchType


    async def main():
        # One call uses vector similarity to seed and graph structure to expand
        results = await cognee.search(
            query_text="How are X and Y connected?",
            query_type=SearchType.GRAPH_COMPLETION,
        )
        print(results)

    asyncio.run(main())
    ```

    If you instead want raw vector-only matches (`CHUNKS`, `SUMMARIES`) and raw graph rows (`CYPHER`, `NATURAL_LANGUAGE`) and prefer to merge them yourself, run each search type separately and combine the returned lists. For multi-part questions, `GRAPH_COMPLETION_DECOMPOSITION` with `retriever_specific_config={"decomposition_mode": "combined_triplets_context"}` splits the query into subqueries, fetches triplets for each, and de-duplicates them into one merged context before answering.
  </Accordion>

  <Accordion title="Retrieval pipeline and raw results">
    Every retriever follows the same three-step internal pipeline:

    * **`get_retrieved_objects`** — fetches raw graph triplets (edges + nodes) or vector chunks from the store
    * **`get_context_from_objects`** — formats those objects into a context string (e.g. `"Nodes: … Connections: …"` for graph modes)
    * **`get_completion_from_context`** — sends context to the LLM to produce a natural-language answer *(skipped when `only_context=True`)*

    **Completion modes** (`GRAPH_COMPLETION`, `RAG_COMPLETION`, `TRIPLET_COMPLETION`, etc.) run all three steps and return a **string answer**. That is why `GRAPH_COMPLETION` returns a plain string by default: the raw graph objects are consumed internally to assemble the LLM prompt.

    **Retrieval-only modes** (`CHUNKS`, `SUMMARIES`) skip the LLM step and return **structured dicts** directly.

    **Inspecting the raw retrieved objects**: add `verbose=True` to receive `text_result` (LLM answer), `context_result` (formatted context string), and `objects_result` (raw graph edges/nodes or chunk objects) together. Use `only_context=True` to skip LLM generation and return the formatted context string directly. For structured chunk dicts with no LLM call at all, use `SearchType.CHUNKS`. See [Search Basics — raw source objects](/guides/search-basics#citation-and-source-tracking) for code examples.
  </Accordion>
</AccordionGroup>
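
The three output shapes described under "Retrieval pipeline and raw results" can be compared side by side with a sketch like the one below. `compare_outputs` and its `search` parameter are hypothetical; `only_context` and `verbose` are the flags named on this page. The exact structure of each return value may vary with the search type and with `ENABLE_BACKEND_ACCESS_CONTROL`, so the sketch returns them untouched.

```python
async def compare_outputs(question, search=None):
    """Run one question three ways: answer, context only, and verbose payload."""
    if search is None:
        # Imported lazily so the helper can be exercised without Cognee installed.
        import cognee

        search = cognee.search

    answer = await search(query_text=question)                      # LLM answer
    context = await search(query_text=question, only_context=True)  # no LLM generation
    detailed = await search(query_text=question, verbose=True)      # answer + context + objects
    return {"answer": answer, "context": context, "verbose": detailed}
```

This issues three separate searches, so it is meant for inspection and debugging rather than for production request paths.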

<Columns cols={2}>
  <Card title="Add" icon="plus" href="/core-concepts/main-operations/legacy-operations/add">
    First bring data into Cognee
  </Card>

  <Card title="Cognify" icon="brain-cog" href="/core-concepts/main-operations/legacy-operations/cognify">
    Build the knowledge graph that search queries
  </Card>

  <Card title="Architecture" icon="building" href="/core-concepts/architecture">
    Understand how vector and graph stores work together
  </Card>

  <Card title="Sessions and Caching" icon="message-square" href="/core-concepts/sessions-and-caching">
    Enable conversational memory with sessions
  </Card>
</Columns>