Why Use This Integration
- Schema-Aware Graphs: Foreign key relationships are preserved as first-class edges in the knowledge graph
- Deterministic Graph Construction: Structured data bypasses LLM entity extraction — no hallucination risk
- Mixed Ingestion: Combine structured (dlt) and unstructured (text, PDF) data in the same dataset
- Multiple Input Modes: Pass explicit dlt resources, CSV file paths, or database connection strings
- Write Dispositions: Control how data is synced — merge (upsert), append, or replace
Installation
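A typical install, assuming the standard PyPI package names (dlt may already ship as a cognee dependency, depending on your version):

```shell
pip install cognee
# only needed if your cognee version does not already pull in dlt:
pip install dlt
```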
Quick Start
1. Ingest a dlt Resource
Define a dlt resource and pass it to `cognee.add()`. dlt automatically normalizes nested structures (such as a list of `pets` inside each user) and creates separate tables with foreign key relationships.
2. Build and Query the Graph
Once `cognify` completes, use `cognee.search` to query the graph. See Search for all available search types.
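A sketch of the query step, assuming cognee is installed and configured and data was added as in step 1; the `SearchType` import location and the `GRAPH_COMPLETION` member are assumptions that may vary by cognee version:

```python
async def main():
    # imports deferred so the sketch reads standalone
    import cognee
    from cognee import SearchType  # assumed import location

    await cognee.cognify()  # build the knowledge graph from added data
    results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION,  # assumed search type
        query_text="Which users own dogs?",
    )
    for result in results:
        print(result)

# Run with: import asyncio; asyncio.run(main())
```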
Other Input Modes
CSV Auto-Detection
Pass a `.csv` file path and cognee will create a dlt source automatically.
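For example (a sketch assuming cognee is installed; the file contents and dataset name are made up, and the cognee call is commented since it needs a configured environment):

```python
import csv, pathlib, tempfile

path = pathlib.Path(tempfile.mkdtemp()) / "customers.csv"
with path.open("w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "name", "plan"])
    writer.writeheader()
    writer.writerows([
        {"id": 1, "name": "Alice", "plan": "pro"},
        {"id": 2, "name": "Bob", "plan": "free"},
    ])

# The .csv extension is enough for cognee to wrap the file in a dlt source:
#   import asyncio, cognee
#   asyncio.run(cognee.add(str(path), dataset_name="analytics"))
assert path.read_text().splitlines()[0] == "id,name,plan"
```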
Database Connection String
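To try this locally, one option is a throwaway SQLite file; the `sqlite:///` URI form and the table names here are assumptions (use the connection-string format of your own database), and the cognee call is commented since it needs a configured environment:

```python
import pathlib, sqlite3, tempfile

db = pathlib.Path(tempfile.mkdtemp()) / "shop.db"
con = sqlite3.connect(db)
con.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE "order" (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        total REAL
    );
    INSERT INTO customer VALUES (1, 'Alice');
    INSERT INTO "order" VALUES (10, 1, 99.5);
""")
con.commit()
count = con.execute('SELECT COUNT(*) FROM "order"').fetchone()[0]
con.close()
assert count == 1

# cognee detects the connection string and ingests the tables via dlt;
# the foreign key customer_id -> customer.id becomes a graph edge:
#   import asyncio, cognee
#   asyncio.run(cognee.add(f"sqlite:///{db}", dataset_name="shop"))
```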
Ingest tables directly from an existing database by passing a connection string.
Mixed Structured + Unstructured
Combine dlt resources with unstructured text in a single dataset. Structured data creates deterministic graph nodes from the schema, while unstructured text goes through LLM-based entity extraction; both end up in the same knowledge graph.
Write Dispositions
Control how data is synced on repeated runs using the `write_disposition` parameter:
- `merge` (default): upsert by primary key, updating existing rows and inserting new ones. Best for data that changes over time.
- `append`: always insert without deduplication. Use for time-series data and event logs.
- `replace`: drop and recreate tables on each run. Use for full snapshot refreshes.
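The three behaviors can be illustrated with a toy in-memory sync; this mirrors the semantics described above, not cognee's or dlt's actual implementation:

```python
def sync(table, batch, disposition, key="id"):
    """Apply one batch to an existing table under a given disposition."""
    if disposition == "replace":
        return list(batch)              # full snapshot refresh
    if disposition == "append":
        return table + list(batch)      # keep everything, no dedup
    if disposition == "merge":          # upsert by primary key
        merged = {row[key]: row for row in table}
        merged.update({row[key]: row for row in batch})
        return list(merged.values())
    raise ValueError(disposition)

day1 = [{"id": 1, "status": "new"}, {"id": 2, "status": "new"}]
day2 = [{"id": 2, "status": "closed"}, {"id": 3, "status": "new"}]

assert len(sync(day1, day2, "merge")) == 3    # id 2 updated in place
assert len(sync(day1, day2, "append")) == 4   # duplicates kept
assert sync(day1, day2, "replace") == day2    # only the latest snapshot
```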
How It Works
- Source Detection: cognee identifies dlt resources, CSV files, and connection strings in the input
- Pipeline Execution: A dlt pipeline loads data into a per-dataset staging database
- Schema Extraction: Table schemas, primary keys, and foreign keys are extracted
- Graph Construction: Each row becomes a document node; foreign keys become edges between nodes
- LLM Bypass: Structured rows skip chunking, entity extraction, and summarization — the graph is built entirely from schema metadata
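A toy version of steps 3-4, with made-up table data, showing why no LLM is needed: the schema metadata alone determines the nodes and edges.

```python
rows = {
    "customer": [{"id": 1, "name": "Alice"}],
    "order": [{"id": 10, "customer_id": 1, "total": 99.5}],
}
# (source table, FK column, target table, target PK) from the extracted schema
foreign_keys = [("order", "customer_id", "customer", "id")]

# Step 3-4: every row becomes a node keyed by (table, primary key) ...
nodes = {(table, r["id"]): r for table, rs in rows.items() for r in rs}
# ... and every foreign-key value becomes an edge between two nodes.
edges = [
    ((src, r["id"]), (dst, r[col]))
    for src, col, dst, _dst_pk in foreign_keys
    for r in rows[src]
]

assert (("order", 10), ("customer", 1)) in edges
```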
The `primary_key` parameter controls upsert behavior. If not specified, cognee auto-detects it from an `id` column or falls back to the first column. Set `DLT_MAX_ROWS_PER_TABLE` (default: 50) to control the maximum number of rows ingested per table.
Use Cases
CRM and Relational Data
Load customer, order, and product tables from a database. Foreign keys between tables (e.g., `order.customer_id` → `customer.id`) become graph edges, enabling cross-table queries like “Which customers ordered product X?”
CSV Analytics Pipeline
Point cognee at CSV exports from analytics tools. Each row becomes a searchable node in the graph, and you can combine them with unstructured reports in the same dataset.
Event Log Ingestion
Use `write_disposition="append"` to stream event batches into cognee without deduplication. Query across the full event history with natural language.
Database Mirroring
Use `write_disposition="merge"` to keep cognee’s graph in sync with a live database. Rows that are removed upstream are automatically cleaned up.
Add Operation
Learn more about data ingestion in cognee
dlt Documentation
Official dlt documentation and guides