# NodeSets

> Tagging and grouping data in Cognee

## What are NodeSets?

A **NodeSet** lets you group parts of your AI memory at the dataset level. You create NodeSets by passing a simple list of tags when adding data to Cognee:

```python
await cognee.remember(..., node_set=["projectA", "finance"])
```

These tags travel with your data into the knowledge graph, where they become first-class nodes connected by `belongs_to_set` edges. You can later filter retrieval to only those subsets.

## How they flow through Cognee

* **[Remember](../main-operations/remember)**:
  * NodeSets are attached as simple tags to datasets or documents
  * This happens when you first ingest data
  * The underlying graph-building step carries them into Documents and Chunks
  * They are materialized as real `NodeSet` nodes in the graph and connected with `belongs_to_set` edges
* **[Recall](../main-operations/recall)**:
  * NodeSets help define meaningful retrieval subsets
  * Use `recall()` with `node_name` and `node_name_filter_operator` to scope retrieval to specific node-set subsets
* **[Improve](../main-operations/improve)**:
  * The default improvement path runs memify-style enrichment
  * Related enrichment flows can create node sets such as `coding_agent_rules` or `user_sessions_from_cache`
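
As a rough mental model of the Remember step above, the sketch below builds a tiny in-memory graph in plain Python. It does not use Cognee's internals: the `materialize_node_sets` helper, the tuple-based edge format, and the document and chunk IDs are all hypothetical. Only the idea of `NodeSet` nodes and the `belongs_to_set` edge name come from the description above.

```python
def materialize_node_sets(documents):
    """Build NodeSet nodes and belongs_to_set edges from tagged documents.

    Hypothetical helper for illustration only; not part of Cognee's API.
    """
    node_sets = set()
    edges = []
    for doc_id, doc in documents.items():
        # Chunks inherit the tags of the document they were cut from.
        members = [doc_id, *doc["chunks"]]
        for tag in doc["node_set"]:
            node_sets.add(tag)  # one NodeSet node per distinct tag
            for member in members:
                edges.append((member, "belongs_to_set", tag))
    return node_sets, edges


# Made-up documents and chunk IDs, tagged the way node_set=[...] would tag them.
documents = {
    "doc_1": {"node_set": ["projectA", "finance"], "chunks": ["doc_1#0", "doc_1#1"]},
    "doc_2": {"node_set": ["projectA"], "chunks": ["doc_2#0"]},
}

node_sets, edges = materialize_node_sets(documents)
print(sorted(node_sets))
for edge in edges:
    print(edge)
```

Each distinct tag yields exactly one NodeSet node, no matter how many documents or chunks point to it, which is what makes the tag usable as a shared anchor later.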

## Why they matter

* Provide a lightweight way to organize and tag your data
* Enable graph-based filtering, traversal, and reporting
* Ideal for creating project-, domain-, or user-defined subsets of your knowledge graph

## Example: tagging data with NodeSets

```python
import asyncio
import cognee

async def main():
    # reset Cognee’s memory and metadata for a clean run
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # remember a document linked only to the "AI_Memory" node set
    await cognee.remember(
        "Cognee builds AI memory from raw documents.",
        node_set=["AI_Memory"]
    )

    # remember a document linked to both "AI_Memory" and "Graph_RAG" node sets
    await cognee.remember(
        "Cognee combines vector search with graph reasoning.",
        node_set=["AI_Memory", "Graph_RAG"]
    )

if __name__ == "__main__":
    asyncio.run(main())
```

## What just happened?

* You reset Cognee’s memory so you’re working with a clean graph.
* You remembered two documents, each tagged with one or more `NodeSet` labels.
  * The first document is only linked to `AI_Memory`.
  * The second document is linked to both `AI_Memory` and `Graph_RAG`.
* During `remember()`, Cognee:
  * Created `NodeSet` nodes (`AI_Memory`, `Graph_RAG`) in the graph.
  * Attached each document to the corresponding NodeSets.
  * Extracted entities and relationships from the documents, then linked those entities back to the same NodeSets.

This means the tags you add flow down into the extracted entities:

* **“Cognee”** appears in both documents → connects to **both NodeSets**.
* **“AI memory”** appears only in the first → connects only to **AI\_Memory**.
* **“Vector search”** appears only in the second → connects to **both NodeSets**, since that document belongs to **AI\_Memory** and **Graph\_RAG**.
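
The inheritance rule above can be reproduced with a few lines of plain Python. This is a sketch, not Cognee's extraction pipeline: the per-document entity lists are hand-written stand-ins for what an entity extractor might return, and `entity_node_sets` is a hypothetical helper.

```python
from collections import defaultdict


def entity_node_sets(documents):
    """Map each entity to the union of NodeSets of the documents mentioning it.

    Hypothetical helper for illustration only; not part of Cognee's API.
    """
    result = defaultdict(set)
    for doc in documents:
        for entity in doc["entities"]:
            # An entity inherits every tag of every document it appears in.
            result[entity].update(doc["node_set"])
    return dict(result)


# Entities are assumed here; a real run extracts them from the text.
documents = [
    {"node_set": ["AI_Memory"], "entities": ["Cognee", "AI memory"]},
    {"node_set": ["AI_Memory", "Graph_RAG"], "entities": ["Cognee", "Vector search"]},
]

for entity, sets in entity_node_sets(documents).items():
    print(entity, sorted(sets))
# Cognee ['AI_Memory', 'Graph_RAG']
# AI memory ['AI_Memory']
# Vector search ['AI_Memory', 'Graph_RAG']
```

Note that an entity's NodeSets depend on the tags of its source documents, not on how often or where in the text it appears.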

With these NodeSets in place, you can filter searches, scope queries, and navigate the graph by project or domain:

<AccordionGroup>
  <Accordion title="Filtering searches by NodeSet" defaultOpen>
    When filtering with multiple NodeSet names, you can choose whether results must be connected to **all** selected names or to **any** of them. By default, Cognee uses **any** (OR-style matching).
    Set the `node_name_filter_operator` parameter of `recall()` to `AND` or `OR` to control this behavior.

    ```python
    import cognee
    from cognee import SearchType
    from cognee.modules.engine.models.node_set import NodeSet

    # OR (default) — results from any of the listed node sets
    results = await cognee.recall(
        "What are the key topics?",
        query_type=SearchType.GRAPH_COMPLETION,
        node_type=NodeSet,
        node_name=["AI_Memory", "Graph_RAG"],
        node_name_filter_operator="OR",
    )

    # AND — results must belong to all listed node sets simultaneously
    results = await cognee.recall(
        "What concepts appear in both?",
        query_type=SearchType.GRAPH_COMPLETION,
        node_type=NodeSet,
        node_name=["AI_Memory", "Graph_RAG"],
        node_name_filter_operator="AND",
    )
    ```

    <Note>
      Node-set filtering works with graph-completion search types (`GRAPH_COMPLETION`, `GRAPH_COMPLETION_COT`, `GRAPH_COMPLETION_CONTEXT_EXTENSION`, `GRAPH_SUMMARY_COMPLETION`, `TEMPORAL`). It is not applied for `CHUNKS`, `SUMMARIES`, `RAG_COMPLETION`, `CYPHER`, or `NATURAL_LANGUAGE`.
    </Note>
  </Accordion>

  <Accordion title="Scoping queries to specific NodeSets">
    Use the same NodeSet names you assigned during `remember()` to limit retrieval to a focused subset of your graph. This is useful when one dataset contains multiple topics, teams, or workflows but you only want answers grounded in one slice of that memory.

    In practice, NodeSets let you keep a shared dataset while still asking targeted questions like "show only finance concepts" or "search just the Graph\_RAG material" without splitting everything into separate datasets.

    ```python
    import cognee
    from cognee import SearchType
    from cognee.modules.engine.models.node_set import NodeSet

    # Recall only within the finance-related subset
    results = await cognee.recall(
        "What risks are mentioned in the quarterly report?",
        query_type=SearchType.GRAPH_COMPLETION,
        node_type=NodeSet,
        node_name=["finance"],
    )

    # Recall only within the Graph_RAG subset
    results = await cognee.recall(
        "How does Cognee combine graph reasoning with retrieval?",
        query_type=SearchType.GRAPH_COMPLETION,
        node_type=NodeSet,
        node_name=["Graph_RAG"],
    )
    ```
  </Accordion>

  <Accordion title="Navigating data by project or domain">
    Because NodeSets become first-class graph nodes, they can act as anchors for exploration as well as filtering. A project-level NodeSet like `project_alpha` or a domain-level NodeSet like `compliance` gives you a stable entry point into the related documents, chunks, entities, and relationships.

    This makes NodeSets a lightweight way to organize one knowledge graph around the mental model your team already uses: project, customer, topic, workflow, or domain.

    ```python
    import cognee

    # Tag documents by project and domain during ingestion
    await cognee.remember(
        "Project Alpha must satisfy EU compliance requirements.",
        node_set=["project_alpha", "compliance"],
    )

    await cognee.remember(
        "Project Alpha rollout depends on infrastructure readiness.",
        node_set=["project_alpha", "operations"],
    )
    ```

    After the `remember()` workflow finishes, `project_alpha`, `compliance`, and `operations` become graph anchors you can use to explore related information by project or by domain.
  </Accordion>
</AccordionGroup>
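
To make the OR and AND matching semantics concrete, here is a minimal sketch in plain Python. It models only the set logic and says nothing about how Cognee implements `recall()`: `matches_node_sets` and the `memberships` dictionary are hypothetical, with the memberships taken from the tagging example earlier on this page.

```python
def matches_node_sets(item_sets, node_name, operator="OR"):
    """Return True if an item's NodeSets satisfy the filter.

    Hypothetical helper for illustration only; not part of Cognee's API.
    """
    wanted = set(node_name)
    owned = set(item_sets)
    if operator == "AND":
        return wanted <= owned  # item must belong to every listed NodeSet
    if operator == "OR":
        return bool(wanted & owned)  # item must belong to at least one
    raise ValueError(f"Unsupported operator: {operator!r}")


# Which NodeSets each entity is connected to in the tagging example.
memberships = {
    "Cognee": {"AI_Memory", "Graph_RAG"},
    "AI memory": {"AI_Memory"},
    "Vector search": {"AI_Memory", "Graph_RAG"},
}

selected = ["AI_Memory", "Graph_RAG"]
any_match = [n for n, s in memberships.items() if matches_node_sets(s, selected, "OR")]
all_match = [n for n, s in memberships.items() if matches_node_sets(s, selected, "AND")]

print(any_match)  # ['Cognee', 'AI memory', 'Vector search']
print(all_match)  # ['Cognee', 'Vector search']
```

With a single name in the list, OR and AND give the same result, so the operator only matters once you filter on two or more NodeSets.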

<Columns cols={3}>
  <Card title="Remember" icon="plus" href="/core-concepts/main-operations/remember">
    Where NodeSets are first attached
  </Card>

  <Card title="Improve" icon="brain-cog" href="/core-concepts/main-operations/improve">
    How enrichment flows add more NodeSet-based structure
  </Card>

  <Card title="Search" icon="search" href="/core-concepts/main-operations/legacy-operations/search">
    Use NodeSets as anchors in queries
  </Card>
</Columns>
