What is the cognify operation
The `.cognify` operation takes the data you ingested with Add and turns plain text into structured knowledge: chunks, embeddings, summaries, nodes, and edges that live in Cognee’s vector and graph stores. It prepares your data for downstream operations like Search.
- Transforms ingested data: builds chunks, embeddings, and summaries; always comes after Add
- Graph creation: extracts entities and relationships to form a knowledge graph
- Vector indexing: makes everything searchable via embeddings
- Dataset-scoped: runs per dataset, respecting ownership and permissions
- Incremental loading: you can run `.cognify` multiple times as your dataset grows, and Cognee will skip what’s already processed
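For orientation, a minimal end-to-end flow might look like the sketch below. It assumes the async `cognee.add` and `cognee.cognify` entry points from the Python SDK; exact signatures may differ between versions:

```python
import asyncio
import cognee

async def main():
    # Ingest raw text into the default dataset (Add always precedes cognify).
    await cognee.add("Cognee turns documents into a searchable knowledge graph.")

    # Build chunks, embeddings, summaries, and graph nodes/edges from what was added.
    await cognee.cognify()

asyncio.run(main())
```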
What happens under the hood
The `.cognify` pipeline is made of six ordered Tasks. Each task takes the output of the previous one and moves your data closer to becoming a searchable knowledge graph.
- Classify documents — wrap each ingested file as a `Document` object with metadata and optional node sets
- Check permissions — enforce that you have the right to modify the target dataset
- Extract chunks — split documents into smaller pieces (paragraphs, sections)
- Extract graph — use LLMs to identify entities and relationships, inserting them into the graph DB
- Summarize text — generate summaries for each chunk, stored as `TextSummary` DataPoints
- Add data points — embed nodes and summaries, write them into the vector store, and update graph edges
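To make the ordering concrete, here is a toy sketch of the same six steps chained together. Every function name and data shape below is a hypothetical stand-in, not Cognee's internal API; the point is only that each task consumes the previous task's output:

```python
import asyncio

# Hypothetical stand-ins for Cognee's internal cognify tasks; names and shapes
# are illustrative only, not the library's actual implementation.
async def classify_documents(files):
    return [{"document": f, "type": "text"} for f in files]

async def check_permissions(dataset, user):
    pass  # would raise if `user` lacks write access to `dataset`

async def extract_chunks(documents):
    return [{"text": d["document"], "tokens": 42} for d in documents]

async def extract_graph(chunks):
    return {"nodes": [], "edges": []}  # LLM-driven entity/relationship extraction

async def summarize_text(chunks):
    return [{"summary": c["text"][:80]} for c in chunks]

async def add_data_points(graph, summaries):
    pass  # embed summaries/nodes, write to the vector store, update graph edges

async def cognify_pipeline(files, dataset="main_dataset", user="default_user"):
    documents = await classify_documents(files)   # 1. wrap each file as a Document
    await check_permissions(dataset, user)         # 2. enforce write access on the dataset
    chunks = await extract_chunks(documents)       # 3. split documents into DocumentChunks
    graph = await extract_graph(chunks)            # 4. extract entities and relationships
    summaries = await summarize_text(chunks)       # 5. one TextSummary per chunk
    await add_data_points(graph, summaries)        # 6. persist to vector and graph stores

asyncio.run(cognify_pipeline(["meeting_notes.txt"]))
```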
After cognify finishes
When `.cognify` completes for a dataset:
- DocumentChunks exist in memory as the granular breakdown of your files
- Summaries are stored and indexed in the vector database for semantic search
- Knowledge graph nodes and edges are committed to the graph database
- Dataset metadata is updated with token counts and pipeline status
- Your dataset is now query-ready: you can run Search or graph queries immediately
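As a quick check that the dataset is query-ready, a semantic search can run immediately after cognify. A minimal sketch, assuming `cognee.search` and a `SearchType` enum exported at the package root (argument names and enum members may differ between SDK versions):

```python
import asyncio
import cognee
from cognee import SearchType  # assumption: exported at the package root in your version

async def ask():
    # Assumes .cognify has already completed for this dataset.
    # Semantic search over the chunks and summaries produced by cognify.
    results = await cognee.search(
        query_text="What does Cognee build?",
        query_type=SearchType.CHUNKS,  # assumed enum member; adjust to your version
    )
    for result in results:
        print(result)

asyncio.run(ask())
```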
Examples and details
Pipeline tasks (detailed)
- Classify documents
  - Turns raw `Data` rows into `Document` objects
  - Chooses the right document type (PDF, text, image, audio, etc.)
  - Attaches metadata and optional node sets
- Check permissions
  - Verifies that the user has write access to the dataset
- Extract chunks
  - Splits documents into `DocumentChunk`s using a chunker
  - Updates token counts in the relational DB
- Extract graph
  - Calls the LLM to extract entities and relationships
  - Deduplicates nodes and edges, commits to the graph DB
- Summarize text
  - Generates concise summaries per chunk
  - Stores them as `TextSummary` DataPoints for vector search
- Add data points
  - Converts summaries and other DataPoints into graph + vector nodes
  - Embeds them in the vector store, persists in the graph DB
Datasets and permissions
- Cognify always runs on a dataset
- You must have write access to the dataset
- Permissions are enforced at pipeline start
- Each dataset maintains its own cognify status and token counts
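A hedged sketch of running cognify against a named dataset, assuming `cognee.add` takes a `dataset_name` argument and `cognee.cognify` takes a `datasets` list, as in recent SDK versions (the dataset name below is made up):

```python
import asyncio
import cognee

async def build_dataset():
    # Ingest into a specific dataset ("product_docs" is a made-up name).
    await cognee.add("Our API supports OAuth2 and API keys.", dataset_name="product_docs")

    # Cognify only that dataset; write permission is checked before any task modifies it.
    await cognee.cognify(datasets=["product_docs"])

asyncio.run(build_dataset())
```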
Incremental loading
- By default, `.cognify` processes all data in a dataset
- With `incremental_loading=True`, only new or updated files are processed
- Saves time and compute for large, evolving datasets
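The `incremental_loading=True` flag is passed straight to cognify. A sketch of a refresh run (the dataset name and added content are made up; the `datasets` and `dataset_name` arguments are assumptions based on recent SDK versions):

```python
import asyncio
import cognee

async def refresh():
    # New material lands in the existing dataset (text here; file paths work too).
    await cognee.add("June meeting notes: roadmap review.", dataset_name="product_docs")

    # Only new or updated files are reprocessed; everything else is skipped.
    await cognee.cognify(datasets=["product_docs"], incremental_loading=True)

asyncio.run(refresh())
```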
Final outcome
- Vector database contains embeddings for summaries and nodes
- Graph database contains entities and relationships
- Relational database tracks token counts and pipeline run status
- Your dataset is now ready for Search (semantic or graph-based)
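For graph-based queries, the same dataset can be asked about relationships rather than raw text. The sketch below assumes an `INSIGHTS`-style `SearchType` member that returns graph relationships; adjust to whatever your installed version exposes:

```python
import asyncio
import cognee
from cognee import SearchType  # assumed export; adjust to your installed version

async def explore_graph():
    # Graph-based query: draws on the entities and edges cognify committed
    # to the graph database (INSIGHTS is an assumed enum member).
    insights = await cognee.search(
        query_text="How are the extracted entities related?",
        query_type=SearchType.INSIGHTS,
    )
    for insight in insights:
        print(insight)

asyncio.run(explore_graph())
```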