Custom Data Models

A minimal guide to creating custom data models and inserting them directly into the knowledge graph using add_data_points.

Before you start:

Complete the Quickstart to understand basic operations
Ensure you have LLM Providers configured
Have some structured data that you want to model

What Custom Data Models Do

Define your own Pydantic models that inherit from DataPoint
Insert structured data directly into the knowledge graph without cognify
Create relationships between data points programmatically
Control exactly what gets indexed and how

Code in Action

import asyncio
from typing import Any
from pydantic import SkipValidation

import cognee
from cognee.infrastructure.engine import DataPoint
from cognee.infrastructure.engine.models.Edge import Edge
from cognee.tasks.storage import add_data_points

class Person(DataPoint):
    name: str
    # Keep it simple for forward refs / mixed values
    knows: SkipValidation[Any] = None  # single Person or list[Person]
    # Recommended: specify which fields to index for search
    metadata: dict = {"index_fields": ["name"]}

async def main():
    # Start clean (optional in your app)
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    alice = Person(name="Alice")
    bob = Person(name="Bob")
    charlie = Person(name="Charlie")

    # Create relationships - field name becomes edge label
    alice.knows = bob
    # You can also do lists: alice.knows = [bob, charlie]
    
    # Optional: add weights and custom relationship types
    bob.knows = (Edge(weight=0.9, relationship_type="friend_of"), charlie)

    await add_data_points([alice, bob, charlie])

asyncio.run(main())

This example covers the complete workflow, including the metadata used for indexing and an optional weighted edge. The same pattern extends to models with several relationship fields and nested DataPoints.

What Just Happened

Step 1: Define Your Data Model

class Person(DataPoint):
    name: str
    knows: SkipValidation[Any] = None
    # Recommended: specify which fields to index for search
    metadata: dict = {"index_fields": ["name"]}

Create a Pydantic model that inherits from DataPoint. Use SkipValidation[Any] for fields that hold other DataPoints; this avoids forward-reference issues and lets the field accept either a single instance or a list. Setting metadata is recommended: index_fields tells Cognee which fields to embed and store in the vector database for search.
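To make the role of index_fields concrete, here is a small conceptual sketch in plain Python. It is not Cognee's implementation: ToyPerson is a hypothetical stand-in for a DataPoint subclass, and fields_to_embed is an illustrative helper that only shows the idea of selecting the listed fields for embedding.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPerson:
    """Hypothetical stand-in for a DataPoint subclass (not a Cognee class)."""
    name: str
    nickname: str = ""
    # Mirrors the metadata convention from the guide: only "name" is indexed.
    metadata: dict = field(default_factory=lambda: {"index_fields": ["name"]})


def fields_to_embed(point) -> dict:
    """Return only the attributes named in metadata["index_fields"]."""
    names = point.metadata.get("index_fields", [])
    return {n: getattr(point, n) for n in names}


alice = ToyPerson(name="Alice", nickname="Al")
print(fields_to_embed(alice))  # {'name': 'Alice'}
```

In this toy version, nickname is stored on the object but left out of the embedded fields because it is not listed in index_fields.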

Step 2: Create Data Instances

alice = Person(name="Alice")
bob = Person(name="Bob")
charlie = Person(name="Charlie")

Instantiate your models with the data you want to store. Each instance becomes a node in the knowledge graph.

Step 3: Create Relationships

alice.knows = bob
# Optional: add weights and custom relationship types
bob.knows = (Edge(weight=0.9, relationship_type="friend_of"), charlie)

Assigning a DataPoint instance to a field creates an edge, and the field name becomes the relationship label by default. Weights are optional: to attach a weight, a custom relationship type, or other metadata, assign a tuple of an Edge and the target instance instead.
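The labeling rule can be illustrated with a short plain-Python sketch. This is an assumption-laden illustration rather than Cognee's code: ToyEdge is a hypothetical stand-in for Edge with only two attributes, and resolve_edge is an invented helper showing how a field value could be read as either a bare target or an (edge, target) tuple.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToyEdge:
    """Hypothetical stand-in for Cognee's Edge (only two attributes)."""
    weight: Optional[float] = None
    relationship_type: Optional[str] = None


def resolve_edge(field_name: str, value: Any):
    """Return (label, weight, target) for one assigned field value."""
    is_edge_tuple = (
        isinstance(value, tuple)
        and len(value) == 2
        and isinstance(value[0], ToyEdge)
    )
    if is_edge_tuple:
        edge, target = value
        # A custom relationship type overrides the field name as the label.
        return (edge.relationship_type or field_name, edge.weight, target)
    # Bare assignment: the field name is the label and there is no weight.
    return (field_name, None, value)


print(resolve_edge("knows", "bob"))
print(resolve_edge("knows", (ToyEdge(weight=0.9, relationship_type="friend_of"), "charlie")))
```

The first call falls back to the field name knows, while the second picks up friend_of and the weight 0.9 from the edge object.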

Step 4: Insert into Graph

await add_data_points([alice, bob, charlie])

add_data_points converts your DataPoint instances into nodes and edges in the knowledge graph and handles the graph structure and indexing for you. Because name is listed in index_fields, it is embedded and stored in the vector database for search.
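As a rough mental model of the conversion, the sketch below flattens linked objects into a node list and an edge list. It is a simplified illustration in plain Python, not what add_data_points actually does internally: ToyPerson and to_graph are hypothetical names, and the real function also handles storage and embedding.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(eq=False)
class ToyPerson:
    """Hypothetical stand-in for the Person model from the guide."""
    name: str
    knows: Any = None  # single ToyPerson or list of ToyPerson


def to_graph(points):
    """Walk the objects once and collect (nodes, edges)."""
    nodes, edges, seen = [], [], set()
    stack = list(points)
    while stack:
        p = stack.pop()
        if id(p) in seen:
            continue  # each object becomes exactly one node
        seen.add(id(p))
        nodes.append(p.name)
        targets = p.knows if isinstance(p.knows, list) else [p.knows]
        for t in targets:
            if t is not None:
                # The field name "knows" is used as the edge label.
                edges.append((p.name, "knows", t.name))
                stack.append(t)
    return nodes, edges


alice, bob, charlie = ToyPerson("Alice"), ToyPerson("Bob"), ToyPerson("Charlie")
alice.knows = bob
bob.knows = [charlie]

nodes, edges = to_graph([alice, bob, charlie])
print(sorted(nodes))
print(sorted(edges))
```

Each instance appears as one node even though bob and charlie are reachable both from the input list and through a knows reference.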

Use in Custom Tasks and Pipelines

This approach is particularly useful when creating custom tasks and pipelines where you need to:

Insert structured data programmatically
Define specific relationships between known entities
Control exactly what gets indexed and how
Integrate with external data sources or APIs

You can combine this with cognify to extract knowledge from unstructured text, then add your own structured data on top.
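A minimal sketch of that combination might look like the following. It assumes the standard cognee.add and cognee.cognify calls alongside the add_data_points import used earlier in this guide; the function name cognify_then_add_structured and the sample text are illustrative, and running it requires Cognee to be installed with an LLM provider configured.

```python
import asyncio


async def cognify_then_add_structured(text: str) -> None:
    """Extract a graph from text, then add hand-built DataPoints on top."""
    # Imports are deferred so this module loads even without Cognee installed.
    import cognee
    from cognee.infrastructure.engine import DataPoint
    from cognee.tasks.storage import add_data_points

    class Person(DataPoint):
        name: str
        metadata: dict = {"index_fields": ["name"]}

    # 1. Let Cognee extract knowledge from unstructured text.
    await cognee.add(text)
    await cognee.cognify()

    # 2. Insert your own structured data directly, without cognify.
    await add_data_points([Person(name="Alice")])


# Example invocation (needs Cognee and a configured LLM provider):
# asyncio.run(cognify_then_add_structured("Alice joined the research team."))
```

Whether the hand-built Person node links to entities that cognify extracted depends on how Cognee identifies nodes, so treat this as a starting point rather than a guaranteed merge.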

