A minimal guide to creating custom graph models and loading them into Cognee with remember(). Before you start:
  • Complete Quickstart to understand basic operations
  • Ensure you have LLM Providers configured
  • Have some structured data you want to model

What Custom Graph Models Do

  • Define a schema (Pydantic models inheriting from DataPoint) that constrains how Cognee extracts entities and relationships from text
  • Guide remember() to produce graph nodes and edges that match the defined domain model
  • Improve consistency of labels, properties, and relationship structure across ingested documents
  • Control search quality by choosing which fields are indexed via metadata

Code in Action

Step 1: Define Your Entity Type

class FieldType(DataPoint):
    name: str = "Field"

class ProgrammingLanguageType(DataPoint):
    name: str = "Programming Language"
Create Pydantic models that inherit from DataPoint to represent the node types in your custom graph.

Step 2: Define Your Entity Classes

class Field(DataPoint):
    name: str
    is_type: FieldType
    metadata: dict = {"index_fields": ["name"]}

class ProgrammingLanguage(DataPoint):
    name: str
    used_in: Optional[List[Field]] = None
    is_type: ProgrammingLanguageType
    metadata: dict = {"index_fields": ["name"]}
These models define the graph structure that remember() should extract. The metadata dict is recommended because its index_fields entry tells Cognee which fields to embed and index for retrieval.
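To see the nested shape these schemas describe, the same structure can be sketched with plain Pydantic models standing in for DataPoint. This is a sketch only: in Cognee the classes inherit from cognee.low_level.DataPoint (which adds ids and metadata handling), and the nodes are produced by extraction rather than constructed by hand.

```python
from typing import List, Optional
from pydantic import BaseModel

# Plain Pydantic stand-in for cognee's DataPoint, for illustration only.
class FieldType(BaseModel):
    name: str = "Field"

class Field(BaseModel):
    name: str
    is_type: FieldType

class ProgrammingLanguageType(BaseModel):
    name: str = "Programming Language"

class ProgrammingLanguage(BaseModel):
    name: str
    used_in: Optional[List[Field]] = None
    is_type: ProgrammingLanguageType

# One language node pointing at a nested field node, the shape
# extraction is constrained to produce:
python = ProgrammingLanguage(
    name="Python",
    used_in=[Field(name="data analysis", is_type=FieldType())],
    is_type=ProgrammingLanguageType(),
)
```

Walking `python.used_in[0].name` yields "data analysis", showing how the relationship structure is encoded directly in the model fields.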

Step 3: Remember Your Data with the Custom Model

text = "The Python programming language is widely used in data analysis, web development, and machine learning."

await remember(
    text,
    graph_model=ProgrammingLanguage,
    custom_prompt=CUSTOM_PROMPT,
    self_improvement=False,
)
This ingests the text and builds the graph in one call. The custom graph_model acts as the extraction schema, and CUSTOM_PROMPT tells the LLM what relationships to look for.

Step 4: Visualize Your Data

await visualize_data()
This renders the generated graph to an HTML file so you can verify nodes, relationships, and overall schema behavior.

Use in Custom Tasks and Pipelines

This pattern is useful when you need predictable, domain-specific extraction inside custom workflows.
  • Reuse the same graph schema across tasks to keep outputs consistent
  • Run remember(graph_model=...) in workflows where downstream logic expects a specific graph shape
  • Combine with custom prompts or custom tasks to refine extraction
  • Validate pipeline results with visualize_graph before promoting changes to production
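The reuse pattern above can be sketched as a small helper that pins the schema and prompt once, so every workflow call ingests with the same graph shape. This is an illustrative sketch, not a Cognee API: in real use the ingest function would be remember(), but here it is injected so the pattern can be exercised without a live backend.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical helper (not part of Cognee): fix graph_model and
# custom_prompt so every call produces the same graph shape.
def make_ingestor(
    ingest: Callable[..., Awaitable[None]],
    graph_model,
    custom_prompt: str,
) -> Callable[[str], Awaitable[None]]:
    async def ingest_text(text: str) -> None:
        await ingest(
            text,
            graph_model=graph_model,
            custom_prompt=custom_prompt,
            self_improvement=False,
        )
    return ingest_text

# Exercise the helper with a fake in place of remember().
calls = []

async def fake_remember(text, **kwargs):
    calls.append((text, kwargs["graph_model"], kwargs["custom_prompt"]))

ingest_language = make_ingestor(fake_remember, "ProgrammingLanguage", "CUSTOM_PROMPT")
asyncio.run(ingest_language("Python is widely used in data analysis."))
```

Downstream tasks can then assume every graph was built against the same schema, which is what makes the pipeline's output shape predictable.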

Additional examples

Additional examples of custom data models are available on our GitHub.

Full Example

import os
import asyncio
from typing import List, Optional

from cognee import prune, remember, visualize_graph
from cognee.low_level import DataPoint

CUSTOM_PROMPT = (
    "Extract a simple graph containing Programming Language and Fields that it is used in."
)


# Define a custom graph model for programming languages.
class FieldType(DataPoint):
    name: str = "Field"


class Field(DataPoint):
    name: str
    is_type: FieldType
    metadata: dict = {"index_fields": ["name"]}


class ProgrammingLanguageType(DataPoint):
    name: str = "Programming Language"


class ProgrammingLanguage(DataPoint):
    name: str
    used_in: Optional[List[Field]] = None
    is_type: ProgrammingLanguageType
    metadata: dict = {"index_fields": ["name"]}


async def visualize_data():
    graph_file_path = os.path.join(
        os.path.dirname(__file__), ".artifacts", "custom_graph_model_entity_schema_definition.html"
    )
    await visualize_graph(graph_file_path)


async def main():
    # Optionally prune data and system metadata first to start from a fresh state.
    await prune.prune_data()
    await prune.prune_system(metadata=True)

    text = "The Python programming language is widely used in data analysis, web development, and machine learning."

    await remember(
        text,
        graph_model=ProgrammingLanguage,
        custom_prompt=CUSTOM_PROMPT,
        self_improvement=False,
    )

    await visualize_data()


if __name__ == "__main__":
    asyncio.run(main())
This example shows the complete workflow with metadata for indexing. In practice, you can create complex nested models with multiple relationships and sophisticated data structures.
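As an illustration of the more complex nested models mentioned above, the same pattern extends to deeper hierarchies. BaseModel again stands in for DataPoint, and the Framework and Project classes with their relationship names are invented for this sketch, not part of Cognee.

```python
from typing import List
from pydantic import BaseModel

# Hypothetical multi-level schema; class and relationship names
# (Framework, Project, written_in, uses) are invented for illustration.
class ProgrammingLanguage(BaseModel):
    name: str

class Framework(BaseModel):
    name: str
    written_in: ProgrammingLanguage

class Project(BaseModel):
    name: str
    uses: List[Framework]

project = Project(
    name="Example Service",
    uses=[Framework(name="Django", written_in=ProgrammingLanguage(name="Python"))],
)
```

Each level of nesting becomes a relationship in the extracted graph, so a three-level model like this yields Project → Framework → ProgrammingLanguage edges.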
