A minimal guide to using cognee.search() to query your processed datasets. Covers the basic call, core parameters, and result shapes. Before you start:
  • Complete Quickstart to understand basic operations
  • Ensure you have LLM Providers configured for LLM-backed search types
  • Run cognee.cognify(...) to build the graph before searching
  • Keep at least one dataset with read permission for the user running the search

Code in Action

import asyncio
import cognee

async def main():
    answers = await cognee.search(
        query_text="What are the main themes in my data?"
    )
    for answer in answers:
        print(answer)

if __name__ == "__main__":
    asyncio.run(main())
SearchType.GRAPH_COMPLETION is the default. It returns an LLM-generated answer backed by context retrieved from your knowledge graph.

What Just Happened

cognee.search() retrieved relevant context from the knowledge graph and passed it to the LLM to generate an answer. Results are returned as a list.

Parameters Reference

All examples below assume you are inside an async function. Import helpers when needed:
from cognee import SearchType
from cognee.modules.engine.models.node_set import NodeSet
  • query_text (str, required): The question or phrase to search for.
    answers = await cognee.search(query_text="Who owns the rollout plan?")
    
  • query_type (SearchType, optional, default: SearchType.GRAPH_COMPLETION): Sets the search mode. See Search Types for the full list and Retrievers for how each type maps to a retriever.
    await cognee.search(
        query_text="List coding guidelines",
        query_type=SearchType.CODING_RULES,
    )
    
  • top_k (int, optional, default: 10): Maximum number of results to return.
    await cognee.search(query_text="Summaries please", top_k=3)
    
  • system_prompt_path (str, optional, default: "answer_simple_question.txt"): Path to a prompt file packaged with your project.
    await cognee.search(
        query_text="Explain the roadmap in bullet points",
        system_prompt_path="prompts/bullets.txt",
    )
    
  • system_prompt (Optional[str]): Inline prompt string. Overrides system_prompt_path when set.
    await cognee.search(
        query_text="Give me a confident answer",
        system_prompt="Answer succinctly and state confidence at the end.",
    )
    
  • only_context (bool, optional, default: False): Skip the LLM completion step and return the raw retrieved context directly. Useful for inspecting what the retriever found, building custom prompts, or downstream processing without an LLM call. It works with any search type: for LLM-completion types (GRAPH_COMPLETION, RAG_COMPLETION, etc.) it returns the text that would have been sent to the LLM; for retrieval-only types (CHUNKS, SUMMARIES) the behavior is effectively unchanged because no final LLM call is made.
    context = await cognee.search(
        query_text="What did we promise the client?",
        only_context=True,
    )
    # Returns a list of context strings (one per dataset searched), e.g.:
    # ["Alice promised the client a delivery by Q3. Bob confirmed the SLA in writing."]
    print(context)
    
    When ENABLE_BACKEND_ACCESS_CONTROL=false (default) and a single dataset is searched, Cognee unwraps the list one level for backwards compatibility, so you typically get a single string or a flat list of strings:
    # Single dataset — context is a string or list of strings
    context = await cognee.search(
        query_text="What are the deployment steps?",
        only_context=True,
    )
    # e.g. "Step 1: Build the image. Step 2: Push to registry. Step 3: Apply manifests."
    
    Add verbose=True to get the full breakdown (text_result, context_result, objects_result) alongside the context, instead of the unwrapped value:
    results = await cognee.search(
        query_text="What did we promise the client?",
        only_context=True,
        verbose=True,
    )
    # Returns a list of dicts, one per dataset:
    # [
    #   {
    #     "text_result": None,          # LLM completion — None because only_context=True skipped it
    #     "context_result": "Alice promised the client a delivery by Q3.",
    #     "objects_result": [...]        # Raw retriever objects (graph nodes/edges), if any
    #   }
    # ]
    
    When ENABLE_BACKEND_ACCESS_CONTROL=true, the result always includes dataset metadata regardless of verbose:
    # Access control enabled — each item includes dataset info
    results = await cognee.search(
        query_text="What did we promise the client?",
        only_context=True,
    )
    # [
    #   {
    #     "dataset_id": UUID("..."),
    #     "dataset_name": "client_contracts",
    #     "dataset_tenant_id": UUID("..."),
    #     "search_result": "Alice promised the client a delivery by Q3."
    #   }
    # ]
    
  • wide_search_top_k (int, optional, default: 100): Caps initial candidate retrieval for graph-completion retrievers before ranking. Increase for broader recall on large graphs.
  • triplet_distance_penalty (float, optional, default: 3.5): Penalty applied in graph retrieval ranking. Controls how triplet distance influences final result ordering.
  • retriever_specific_config (dict, optional): Per-retriever options. Examples: response_model for typed LLM output; max_iter for GRAPH_COMPLETION_COT; context_extension_rounds for GRAPH_COMPLETION_CONTEXT_EXTENSION. See the API reference for full keys.
  • verbose (bool, optional, default: False): When true, results include text_result, context_result, and objects_result fields alongside the answer.
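The tuning parameters above can be combined in a single call. A minimal sketch of assembling them (the values shown are illustrative assumptions, not recommended defaults):

```python
# Illustrative tuning options for a graph-completion search.
# The values here are assumptions for the example, not recommended defaults.
search_kwargs = {
    "query_text": "Key integration risks",
    "top_k": 5,                        # return at most 5 results
    "wide_search_top_k": 200,          # widen the initial candidate pool on a large graph
    "triplet_distance_penalty": 2.0,   # lower penalty -> triplet distance matters less in ranking
    "retriever_specific_config": {"max_iter": 3},  # e.g. for GRAPH_COMPLETION_COT
    "verbose": True,                   # include text/context/objects breakdown per result
}

# Inside an async function:
# results = await cognee.search(**search_kwargs)
```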

Node Set Filtering

These options scope the search to specific node sets. Set both node_type and node_name to filter — use the same names you passed to cognee.add(..., node_set=[...]). See NodeSets for background.
  • node_type (Optional[Type], optional, default: NodeSet): The graph model to search. Leave as NodeSet unless you have a custom node model.
  • node_name (Optional[List[str]]): Names of the node sets to include.
    from cognee.modules.engine.models.node_set import NodeSet
    
    await cognee.search(
        query_text="What discounts did TechSupply offer?",
        node_type=NodeSet,
        node_name=["vendor_conversations"],
    )
    
  • node_name_filter_operator (str, optional, default: "OR"): Controls how multiple node-set names are combined. "OR" returns results connected to any of the listed node sets; "AND" returns results connected to all of them. Case-insensitive.
    # OR (default) — results touching any of the listed node sets
    await cognee.search(
        query_text="Summarize procurement rules",
        node_type=NodeSet,
        node_name=["procurement_policies", "purchase_history"],
        node_name_filter_operator="OR",
    )
    
    # AND — results that belong to every listed node set
    await cognee.search(
        query_text="What topics span both domains?",
        node_type=NodeSet,
        node_name=["procurement_policies", "purchase_history"],
        node_name_filter_operator="AND",
    )
    
Node-set filtering applies to graph-completion search types (GRAPH_COMPLETION, GRAPH_COMPLETION_COT, GRAPH_COMPLETION_CONTEXT_EXTENSION, GRAPH_SUMMARY_COMPLETION, TEMPORAL). It has no effect on CHUNKS, SUMMARIES, RAG_COMPLETION, CYPHER, or NATURAL_LANGUAGE.
  • session_id (Optional[str]): Links this search to a conversation session. When the same session_id is reused, previous Q&A turns are included in the LLM prompt. Only effective for GRAPH_COMPLETION, RAG_COMPLETION, and TRIPLET_COMPLETION; other search types do not read or write session history. If omitted while caching is enabled, Cognee writes to default_session.
    await cognee.search(
        query_text="Where does Alice live?",
        session_id="conversation_1"
    )
    # Later — the LLM has access to the previous answer
    await cognee.search(
        query_text="What does she do for work?",
        session_id="conversation_1"
    )
    
    See Sessions Guide for complete examples. To record feedback on answers, see the Feedback System.
  • datasets (Optional[Union[list[str], str]]): Limit search to specific dataset names.
    await cognee.search(
        query_text="Key risks",
        datasets=["risk_register", "exec_summary"],
    )
    
  • dataset_ids (Optional[Union[list[UUID], UUID]]): Same as datasets, using UUIDs instead of names.
    from uuid import UUID
    await cognee.search(
        query_text="Customer feedback",
        dataset_ids=[UUID("aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee")],
    )
    
  • user (Optional[User]): The user to run the search as. Required for multi-tenant flows or background jobs.
    from cognee.modules.users.methods import get_user
    user = await get_user(user_id)
    await cognee.search(query_text="Team OKRs", user=user)
    
    When ENABLE_BACKEND_ACCESS_CONTROL=true:
    • Result shape: Searches run only on datasets the user can access. Results are returned as a list of per-dataset objects (dataset_name, dataset_id, search_result). Use verbose=True to include text_result, context_result, and objects_result in each item.
    • Parallel execution: Multiple datasets are searched concurrently using asyncio.gather() — total time is roughly that of the slowest single-dataset search.
    • If no user is given, get_default_user() is used (created if missing); an error is raised only if this user lacks dataset permissions.
    • If datasets is not set, all datasets readable by the user are searched. An error is raised if none are accessible or if a requested dataset is forbidden.
    PermissionDeniedError will be raised unless you search with the same user that added the data or grant access to the default user.
    When ENABLE_BACKEND_ACCESS_CONTROL=false:
    • Dataset filters (datasets, dataset_ids) are ignored — all data is searched.
    • Results are returned as a plain list (e.g. ["answer1", "answer2"]). If only one dataset is searched and the retriever returns a list, Cognee may unwrap one level for backwards compatibility.
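Because the result shape depends on ENABLE_BACKEND_ACCESS_CONTROL and on how many datasets were searched, code that runs in both modes may want to normalize results before using them. A minimal sketch of such a helper (normalize_results is a hypothetical name, not part of cognee's API):

```python
def normalize_results(results):
    """Flatten cognee.search() output into a plain list of result payloads,
    regardless of access-control mode or single-dataset unwrapping."""
    # Single-dataset searches may unwrap to a bare string.
    if isinstance(results, str):
        return [results]
    flat = []
    for item in results:
        # With access control enabled, items are per-dataset dicts.
        if isinstance(item, dict) and "search_result" in item:
            payload = item["search_result"]
            flat.extend(payload if isinstance(payload, list) else [payload])
        else:
            flat.append(item)
    return flat
```

With this in place, downstream code can iterate one flat list whether the deployment uses access control or not.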

Citation and Source Tracking

Provenance is available at two levels:
  1. Dataset level — when ENABLE_BACKEND_ACCESS_CONTROL=true, results are wrapped with dataset_name and dataset_id.
  2. Chunk/summary level — CHUNKS and SUMMARIES results include an id you can use to look up the item in the graph.
Search results do not include source file paths. raw_data_location and document name live on the Document node, not on retrieved chunk/summary payloads. To trace a result back to a file, use dataset provenance or query the graph by id.
SearchType.CHUNKS returns a list of dicts with these fields:
  • id (str): Chunk UUID
  • text (str): Raw chunk text
  • chunk_index (int): Position of this chunk within the source document
  • chunk_size (int): Token count of this chunk
  • cut_type (str): How the boundary was chosen (sentence_end, paragraph_end, etc.)
import asyncio
import cognee
from cognee import SearchType

async def main():
    await cognee.add("path/to/policy.pdf", dataset_name="docs")
    await cognee.cognify(datasets=["docs"])

    results = await cognee.search(
        query_text="What is the refund policy?",
        query_type=SearchType.CHUNKS,
    )

    for chunk in results:
        print("Text:        ", chunk["text"])
        print("Chunk index: ", chunk["chunk_index"])
        print("Chunk ID:    ", chunk["id"])  # UUID for graph lookups
        print()

asyncio.run(main())
SearchType.SUMMARIES returns a list of dicts with these fields:
  • id (str): Summary UUID
  • text (str): Summary text
results = await cognee.search(
    query_text="What is the refund policy?",
    query_type=SearchType.SUMMARIES,
)

for summary in results:
    print("Summary:", summary["text"])
    print("ID:     ", summary["id"])
When ENABLE_BACKEND_ACCESS_CONTROL=true, every result is wrapped with dataset information:
import asyncio
import cognee
from cognee import SearchType

async def main():
    results = await cognee.search(
        query_text="What is the refund policy?",
        query_type=SearchType.CHUNKS,
        datasets=["docs"],
    )

    for dataset_result in results:
        print("Dataset:   ", dataset_result["dataset_name"])
        print("Dataset ID:", dataset_result["dataset_id"])
        for chunk in dataset_result["search_result"]:
            print("  Text:", chunk["text"])
            print("  Chunk ID:", chunk["id"])

asyncio.run(main())
When ENABLE_BACKEND_ACCESS_CONTROL=false, results are a plain list with no dataset_name or dataset_id wrapper.
For modes that return a generated answer (GRAPH_COMPLETION, RAG_COMPLETION, etc.), use verbose=True to receive the raw retrieved objects alongside the answer:
results = await cognee.search(
    query_text="Summarize the launch timeline",
    verbose=True,
)

for result in results:
    print("Answer:         ", result.get("text_result"))
    print("Context passed: ", result.get("context_result"))
    print("Source objects: ", result.get("objects_result"))

Additional Examples

A complete runnable script:
import asyncio
import cognee

async def main():
    # Start clean (optional)
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    await cognee.add(
        [
            "Alice moved to Paris in 2010. She works as a software engineer.",
            "Bob lives in New York. He is a data scientist.",
            "Alice and Bob met at a conference in 2015.",
        ]
    )
    await cognee.cognify()

    answers = await cognee.search(query_text="What are the main themes in my data?")
    for answer in answers:
        print(answer)

if __name__ == "__main__":
    asyncio.run(main())
Additional examples are available on our GitHub.

Custom Prompts

Learn about custom prompts for tailored answers

Permission Snippets

Multi-tenant deployment patterns

API Reference

Explore all search types and parameters

Sessions

Enable conversational memory with sessions

Agent Memory Decorator

Attach retrieval to an agent function boundary