# Custom Graph Model

> Step-by-step guide to creating custom graph models and using `remember()` with them

A minimal guide to creating custom graph models and loading them into Cognee with `remember()`.

**Before you start:**

* Complete [Quickstart](getting-started/quickstart) to understand basic operations
* Ensure you have [LLM Providers](setup-configuration/llm-providers) configured
* Have some structured data you want to model

## What Custom Graph Models Do

* Define a schema (Pydantic models inheriting from `DataPoint`) that constrains how Cognee extracts entities and relationships from text
* Guide `remember()` to produce graph nodes and edges that match the defined domain model
* Improve consistency of labels, properties, and relationship structure across ingested documents
* Control search quality by choosing which fields are embedded and indexed, via `index_fields` in each model's `metadata`

## Code in Action

### Step 1: Define Your Entity Classes

```python
from typing import List

from cognee.low_level import DataPoint


class Activity(DataPoint):
    name: str
    metadata: dict = {"index_fields": ["name"], "identity_fields": ["name"]}


class Person(DataPoint):
    name: str
    likes: List[Activity] | None = None
    metadata: dict = {"index_fields": ["name"], "identity_fields": ["name"]}
```

Create Pydantic models that inherit from `DataPoint` to represent the node types in your custom graph. The nested `likes: List[Activity]` field tells Cognee to extract `Activity` nodes and the `likes` edges connecting each `Person` to them.

The `metadata` dict controls how each node is indexed and identified:

* **`index_fields`**: the fields Cognee embeds and indexes for retrieval (here, `name`).
* **`identity_fields`**: the fields used to derive a deterministic node id, namespaced by class name. Without them, each node gets a random id, so repeated mentions never merge. With `"identity_fields": ["name"]`, every mention of `Alice`, or of a shared activity like `board games`, collapses into a single node instead of creating duplicates.
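
To make the merging behavior concrete, here is a minimal sketch of how a deterministic id can be derived from a class name plus identity field values. It uses only the standard library; the helper name `deterministic_id`, the `uuid5` namespace, and the string layout are assumptions for illustration, and Cognee's actual derivation may differ.

```python
import uuid


def deterministic_id(model_name: str, identity_values: dict) -> uuid.UUID:
    """Derive a stable id from the class name plus the identity field values.

    Hypothetical helper: not part of Cognee's API.
    """
    # Sort keys so field order does not change the result.
    parts = [f"{key}={identity_values[key]}" for key in sorted(identity_values)]
    # Prefixing the class name keeps a Person and an Activity with the same name apart.
    return uuid.uuid5(uuid.NAMESPACE_OID, f"{model_name}:" + "|".join(parts))


first = deterministic_id("Activity", {"name": "board games"})
second = deterministic_id("Activity", {"name": "board games"})
other_class = deterministic_id("Person", {"name": "board games"})

print(first == second)       # True: repeated mentions map to one node id
print(first == other_class)  # False: the class name namespaces the id
```

Because the id depends only on the class name and the identity values, two separate extractions of the same activity resolve to the same node.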

### Step 2: Define Your Top-Level Graph Container

```python
class PeopleGraph(DataPoint):
    people: List[Person]
```

Wrap your top-level entities in a container model. This is the model you pass as `graph_model`, and it tells the LLM to return a list of `Person` entities (each with their nested activities).
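
As a rough illustration of how a nested container maps to nodes and edges, the sketch below flattens a hand-built `PeopleGraph`-shaped object into `(person, "likes", activity)` triples. It uses plain dataclasses as stand-ins for the `DataPoint` models, and `likes_edges` is a hypothetical helper; the instance is written by hand from the sample sentences, not produced by an LLM or by Cognee.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Activity:  # stand-in for the DataPoint model, for illustration only
    name: str


@dataclass
class Person:
    name: str
    likes: List[Activity] = field(default_factory=list)


@dataclass
class PeopleGraph:
    people: List[Person]


def likes_edges(graph: PeopleGraph) -> List[Tuple[str, str, str]]:
    """Flatten nested `likes` lists into (person, "likes", activity) triples."""
    return [
        (person.name, "likes", activity.name)
        for person in graph.people
        for activity in person.likes
    ]


graph = PeopleGraph(
    people=[
        Person("Alice", [Activity("biking"), Activity("swimming")]),
        Person("Bob", [Activity("playing basketball")]),
    ]
)
print(likes_edges(graph))
```

Each nested list entry becomes one edge, and the field name (`likes`) becomes the relationship label.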

<Accordion title="How graph_model constrains LLM extraction">
  When you pass `graph_model=...`, that model **is** the structured-output schema the LLM must fill in. Internally, Cognee hands your model to the LLM as the `response_model` for structured extraction, so the LLM can only return entities and relationships that fit the fields you declared — it is not free to invent an arbitrary shape.

  * **Default:** without `graph_model`, Cognee uses its general-purpose `KnowledgeGraph` schema (free-form nodes and edges).
  * **Custom model:** when you pass a `DataPoint` subclass, only the fields you declare on each subclass become part of the extraction schema. Inherited `DataPoint` infrastructure fields (like `id`) are stripped out, so they do not expand the LLM's response schema. Adding a field (e.g. `age: int` on `Person`) tells the LLM to extract that value; nested `DataPoint` fields (like `likes: List[Activity]`) tell it to extract those related entities and the edges between them.
  * **`custom_prompt` vs `graph_model`:** they play different roles. `graph_model` defines the *shape* (which fields and relationships are allowed), while `custom_prompt` replaces the system prompt that tells the LLM *what to look for*. Use them together for predictable, domain-specific extraction.
</Accordion>

### Step 3: Remember Your Data with the Custom Model

```python
from cognee import remember

CUSTOM_PROMPT = (
    "Extract all people mentioned in the text. "
    "For each person, extract ALL activities they like, including shared activities."
)

text = (
    "Alice likes biking and swimming. Bob likes playing basketball. "
    "Alice and Bob are friends. "
    "Charlie likes skiing. "
    "Alice and Bob like playing board games together."
)

# `await` needs an async context; see the Full Example for a runnable script.
await remember(
    text,
    graph_model=PeopleGraph,
    custom_prompt=CUSTOM_PROMPT,
    self_improvement=False,
)
```

This ingests the text and builds the graph in one call. The custom `graph_model` acts as the extraction schema, and `CUSTOM_PROMPT` tells the LLM what to look for — here, every person and all the activities they like, including activities shared between people.

### Step 4: Visualize Your Data

```python
import os

from cognee import visualize_graph

# Write the HTML file next to this script, under .artifacts/.
graph_path = os.path.join(os.path.dirname(__file__), ".artifacts", "hobbies_graph.html")
await visualize_graph(graph_path)
```

This renders the generated graph to `hobbies_graph.html` so you can verify nodes, relationships, and overall schema behavior.
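
This guide does not say whether `visualize_graph` creates a missing output folder. If you want to be safe, a small helper can create `.artifacts` before building the path. The sketch below is an assumption-labeled convenience, not part of Cognee: `artifact_path` is a hypothetical name, and it is demonstrated against a temporary directory so it has no side effects.

```python
import os
import tempfile


def artifact_path(base_dir: str, filename: str) -> str:
    """Return base_dir/.artifacts/filename, creating the folder if needed.

    Hypothetical helper: not part of Cognee's API.
    """
    artifacts_dir = os.path.join(base_dir, ".artifacts")
    os.makedirs(artifacts_dir, exist_ok=True)  # no error if it already exists
    return os.path.join(artifacts_dir, filename)


# Demonstrated against a temporary directory instead of the script folder.
with tempfile.TemporaryDirectory() as tmp:
    graph_path = artifact_path(tmp, "hobbies_graph.html")
    print(os.path.isdir(os.path.dirname(graph_path)))  # True
```

In a script, you would pass `os.path.dirname(__file__)` as `base_dir` and hand the returned path to `visualize_graph`.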

## Use in Custom Tasks and Pipelines

This pattern is useful when you need predictable, domain-specific extraction inside custom workflows.

* Reuse the same graph schema across tasks to keep outputs consistent
* Run `remember(graph_model=...)` in workflows where downstream logic expects a specific graph shape
* Combine with custom prompts or custom tasks to refine extraction
* Validate pipeline results with `visualize_graph` before promoting changes to production

## Additional Examples

More examples of custom data models are available in the [Cognee GitHub repository](https://github.com/topoteretes/cognee/tree/main/examples/guides).

## Full Example

<Accordion title="Latest guide">
  ```python
  import asyncio
  import os
  from typing import List

  from cognee import forget, remember, visualize_graph
  from cognee.low_level import DataPoint

  CUSTOM_PROMPT = (
      "Extract all people mentioned in the text. "
      "For each person, extract ALL activities they like, including shared activities."
  )


  class Activity(DataPoint):
      name: str
      metadata: dict = {"index_fields": ["name"], "identity_fields": ["name"]}


  class Person(DataPoint):
      name: str
      likes: List[Activity] | None = None
      metadata: dict = {"index_fields": ["name"], "identity_fields": ["name"]}


  class PeopleGraph(DataPoint):
      people: List[Person]


  async def main():
      await forget(everything=True)

      text = (
          "Alice likes biking and swimming. Bob likes playing basketball. "
          "Alice and Bob are friends. "
          "Charlie likes skiing. "
          "Alice and Bob like playing board games together."
      )

      await remember(
          text,
          graph_model=PeopleGraph,
          custom_prompt=CUSTOM_PROMPT,
          self_improvement=False,
      )

      graph_path = os.path.join(os.path.dirname(__file__), ".artifacts", "hobbies_graph.html")
      await visualize_graph(graph_path)


  if __name__ == "__main__":
      asyncio.run(main())
  ```
</Accordion>

<Accordion title="Legacy guide">
  ```python
  import asyncio
  import os
  from typing import List

  from cognee import add, cognify, prune, visualize_graph
  from cognee.low_level import DataPoint

  CUSTOM_PROMPT = (
      "Extract a simple graph containing Programming Language and Fields that it is used in."
  )


  # Define a custom graph model for programming languages.
  class FieldType(DataPoint):
      name: str = "Field"


  class Field(DataPoint):
      name: str
      is_type: FieldType
      metadata: dict = {"index_fields": ["name"]}


  class ProgrammingLanguageType(DataPoint):
      name: str = "Programming Language"


  class ProgrammingLanguage(DataPoint):
      name: str
      used_in: List[Field] = None
      is_type: ProgrammingLanguageType
      metadata: dict = {"index_fields": ["name"]}


  async def visualize_data():
      graph_file_path = os.path.join(
          os.path.dirname(__file__), ".artifacts", "custom_graph_model_entity_schema_definition.html"
      )
      await visualize_graph(graph_file_path)


  async def main():
      # Prune data and system metadata before running, only if we want "fresh" state.
      await prune.prune_data()
      await prune.prune_system(metadata=True)

      text = "The Python programming language is widely used in data analysis, web development, and machine learning."

      await add(text)
      await cognify(graph_model=ProgrammingLanguage, custom_prompt=CUSTOM_PROMPT)

      await visualize_data()


  if __name__ == "__main__":
      asyncio.run(main())
  ```
</Accordion>

<Note>
  These examples show the complete workflow, including `metadata` for indexing. The same pattern extends to more deeply nested models with several relationship fields per entity.
</Note>

<Columns cols={3}>
  <Card title="Low-Level LLM" icon="cpu" href="/guides/low-level-llm">
    Learn about direct LLM interaction
  </Card>

  <Card title="Core Concepts" icon="brain" href="/core-concepts/overview">
    Understand knowledge graph fundamentals
  </Card>

  <Card title="API Reference" icon="code" href="/api-reference/introduction">
    Explore API endpoints
  </Card>
</Columns>