Improve

What is the improve operation

The improve() operation enriches an existing Cognee graph after data has already been ingested.

Graph enrichment: by default, improve() runs Cognee’s built-in enrichment pass on an existing dataset, adding derived retrieval structures that make later recall work better.
Session bridging: with session_ids, it moves useful session memory into the permanent graph.
Session distillation: with session_ids, it can turn gated session guidance into curated, entity-anchored lesson documents tagged under session_learnings.
Feedback-aware: it can raise or lower the importance of graph elements based on feedback attached to session answers that used those elements during retrieval.
Global context indexing: with build_global_context_index=True, it builds dataset-level bucket and root summaries that can later be prepended during graph completion retrieval.
Truth-subspace build: with build_truth_subspace=True, it can build truth-subspace anchors from distilled session_learnings.
Sync back to sessions: after enrichment, it can write new graph relationships back into session cache for faster future session recall.

Where improve fits

Use improve() after Remember when you want to enrich an existing graph further.
Use it at the end of a chat or agent session to bridge short-term session memory into permanent memory.
Use it when you want custom extraction or enrichment tasks.
Use it instead of re-ingesting everything when the graph already exists and you want additive enrichment.

What happens under the hood

Without session IDs

Run graph enrichment
- improve() runs an enrichment pass on the target dataset.
- By default, this extracts and indexes triplet datapoints when triplet embeddings are enabled.
- For coding-rule retrieval, pass explicit coding-rule extraction and enrichment tasks.
Optionally build the global context index
- When build_global_context_index=True, Cognee builds semantic summary buckets over existing TextSummary nodes.
- It also creates a root summary for the dataset.
- This index can later be included in graph completion search with include_global_context_index=True.

With session IDs

When session_ids is provided, Cognee can run these stages:

Apply feedback weights
- Session feedback updates feedback_weight on graph nodes and edges that were used during retrieval.
- In practice, this means highly rated answers can make their source graph elements more influential later, while poorly rated answers can reduce their influence.
Persist session Q&A
- Question-and-answer content from the session is cognified into the permanent graph.
- Persisted session content is tagged under the user_sessions_from_cache node set.
Persist agent traces
- Structured trace steps from agent/tool activity are cognified into the graph, so tool outcomes can become long-term memory instead of staying only in cache.
Extract session context
- Pending trace windows can be summarized into session-context lessons before distillation.
- These lessons join the same gated guidance pool used by conversational session context.
Distill sessions
- Cognee loads session Q&A plus active session-context guidance.
- Guidance is eligible only when it has not been rated harmful and its confidence passes the distillation gate.
- A curator proposes durable lessons, a writer/rejecter checks them against existing lessons and graph entities, and accepted lessons are added and cognified into the target dataset.
- Distilled documents are tagged under session_learnings and a session-specific node set.
Optionally build the truth subspace
- When build_truth_subspace=True, Cognee builds truth-subspace anchors from the distilled session_learnings.
- This stage is opt-in and only applies when session_ids is provided.
Run enrichment
- The dataset goes through the normal enrichment pass.
Optionally build the global context index
- When build_global_context_index=True, Cognee builds the same retrieval-ready summary layer after enrichment.
Sync graph back to sessions
- Newly enriched graph relationships are copied back into the session cache as human-readable context.
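The stage ordering described above can be summarized as a small planning function. This is only an illustrative sketch: the stage labels are hypothetical names chosen here to mirror the headings on this page, not identifiers from Cognee, and the real pipeline may skip a stage when there is nothing for it to do.

```python
from typing import List, Optional, Sequence


def planned_stages(
    session_ids: Optional[Sequence[str]] = None,
    build_global_context_index: bool = False,
    build_truth_subspace: bool = False,
) -> List[str]:
    """Return the stage labels from this page, in the order they are listed."""
    stages: List[str] = []
    if session_ids:
        stages += [
            "apply_feedback_weights",
            "persist_session_qa",
            "persist_agent_traces",
            "extract_session_context",
            "distill_sessions",
        ]
        # The truth subspace is opt-in and only applies with session_ids.
        if build_truth_subspace:
            stages.append("build_truth_subspace")
    # Enrichment runs with or without sessions.
    stages.append("run_enrichment")
    if build_global_context_index:
        stages.append("build_global_context_index")
    if session_ids:
        stages.append("sync_graph_to_sessions")
    return stages


print(planned_stages())
print(planned_stages(["chat_1"], build_global_context_index=True))
```

Reading the two printed lists side by side shows that the session stages wrap around the same enrichment pass rather than replacing it.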

After improve finishes

Without session IDs: the target dataset has gone through the enrichment pass and is ready for better downstream retrieval.
With session IDs: session feedback, Q&A, trace activity, and accepted distilled lessons can be persisted into the permanent graph, enrichment runs, and new graph context may be synced back into those sessions.
With build_global_context_index=True: the dataset also has bucket and root summaries available for graph completion retrieval.
With build_truth_subspace=True: distilled session_learnings can also become truth-subspace anchors for opt-in hybrid reranking.

Examples and details

What graph enrichment means

In the context of improve(), graph enrichment means adding new derived retrieval structures or knowledge on top of an already-built graph instead of re-ingesting the original source data from scratch.

By default, that usually means:

extracting and indexing triplet datapoints when triplet embeddings are enabled
improving how the graph can be searched later
optionally adding extra derived structures through custom tasks

So improve() is not the stage that first creates the graph. Instead, it makes an existing graph more useful for future retrieval.

What feedback weights mean

Feedback weights are stored importance signals attached to graph elements that were used during retrieval.

When a session answer has feedback and Cognee knows which graph nodes or edges helped produce that answer, improve() can update those elements’ feedback_weight.
Positive feedback can make those elements more influential in future ranking.
Negative feedback can make them less influential.
If no retrieval trace exists for the session, or if no feedback was captured, this stage may have little or nothing to update.

This is one of the ways Cognee lets memory quality improve over time without retraining the model itself.
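To make the idea concrete, here is a toy model of a feedback-driven weight update. The update rule (moving a weight toward the rating by a fraction alpha) is an assumption made for illustration and is not taken from Cognee's implementation. The alpha argument plays a role similar to the feedback_alpha parameter described on this page, and the element keys are made up.

```python
def update_feedback_weight(current: float, rating: float, alpha: float = 0.1) -> float:
    """Move `current` toward `rating` by a fraction `alpha` (all values in [0, 1])."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    updated = current + alpha * (rating - current)
    # Clamp so repeated updates never leave the [0, 1] range.
    return min(1.0, max(0.0, updated))


# Hypothetical graph elements, all starting at a neutral weight.
weights = {"node:pricing_policy": 0.5, "edge:policy->refund": 0.5}

# Only elements that were used to produce the rated answer are updated.
used_in_answer = ["node:pricing_policy"]
for element in used_in_answer:
    weights[element] = update_feedback_weight(weights[element], rating=1.0, alpha=0.2)

print(weights)
```

In this sketch a larger alpha makes a single rating matter more, while elements that were not part of the retrieval trace keep their previous weight.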

What session distillation means

Session distillation is the part of improve(session_ids=[...]) that turns short-term session guidance into long-term graph memory.

During a session, Cognee can accumulate guidance such as goals, rules, preferences, and lessons learned. Distillation does not blindly persist every guidance entry. It first filters out entries that have harmful feedback or low confidence, then uses a curator/writer pass to keep only durable lessons that are supported by the session and not already known.

Accepted lessons are rendered as standalone markdown documents and written back through add() + cognify() into the target dataset. They are tagged with the session_learnings node set so later systems, including truth-subspace reranking, can find them.

If you want to run only this stage for one finished session, use:

result = await cognee.session.distill_session(
    "support_chat_7",
    dataset="product_docs",
)

print(result.status)
print(result.documents)

result.status is one of completed, no_gated_entries, no_proposed_lessons, or no_accepted_lessons.
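The filtering step that precedes curation can be pictured with a short sketch. Everything here is hypothetical: the GuidanceEntry shape, the "harmful" label, and the 0.7 threshold are assumptions chosen for illustration, since this page does not specify how Cognee stores guidance or what the gate's confidence cutoff is.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GuidanceEntry:
    text: str
    confidence: float
    feedback: Optional[str] = None  # hypothetical labels, e.g. "helpful" or "harmful"


def passes_gate(entry: GuidanceEntry, min_confidence: float = 0.7) -> bool:
    """An entry is eligible only if it is not rated harmful and is confident enough."""
    return entry.feedback != "harmful" and entry.confidence >= min_confidence


def gated_entries(entries: List[GuidanceEntry], min_confidence: float = 0.7) -> List[GuidanceEntry]:
    """Keep only the entries that would be handed to the curator."""
    return [entry for entry in entries if passes_gate(entry, min_confidence)]


session_guidance = [
    GuidanceEntry("Prefer concise answers for billing questions", confidence=0.9),
    GuidanceEntry("Always escalate refund requests", confidence=0.95, feedback="harmful"),
    GuidanceEntry("User may prefer dark mode", confidence=0.3),
]

eligible = gated_entries(session_guidance)
print([entry.text for entry in eligible])
```

If a filter like this leaves nothing for the curator, that would plausibly correspond to the no_gated_entries status listed above.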

How custom improvement tasks work

improve() supports power-user overrides for custom extraction and enrichment tasks.

extraction_tasks lets you define what intermediate subgraph or source material should be prepared.
enrichment_tasks lets you define what new derived structures should be added to the graph.
This is how you move beyond the default enrichment pass and create domain-specific memory behavior.
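The split between the two task lists can be illustrated with plain Python functions. This is a conceptual sketch only: the function names, the keyword heuristic, and the record shape are invented here, and how such callables are wrapped and passed as extraction_tasks or enrichment_tasks is not shown on this page, so that step is deliberately left out.

```python
from typing import Dict, Iterable, List

RULE_WORDS = {"must", "should", "never", "always"}


def extract_rule_candidates(chunks: Iterable[str]) -> List[str]:
    """Extraction step: prepare the source material (sentences that look like rules)."""
    candidates = []
    for chunk in chunks:
        words = set(chunk.lower().split())
        if words & RULE_WORDS:
            candidates.append(chunk.strip())
    return candidates


def build_rule_records(candidates: List[str]) -> List[Dict[str, str]]:
    """Enrichment step: derive new structures that could be attached to the graph."""
    return [{"type": "CodingRule", "text": text} for text in candidates]


chunks = [
    "Always use type hints in public functions.",
    "The repository was created in 2021.",
    "Never commit secrets to version control.",
]

records = build_rule_records(extract_rule_candidates(chunks))
print(records)
```

The design point is the separation of concerns: the extraction step decides what to look at, and the enrichment step decides what to add.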

Build the global context index

Use the global context index when later answers need document-wide or dataset-wide orientation in addition to local graph facts.

await cognee.improve(
    dataset="project_memory",
    build_global_context_index=True,
)

This builds GlobalContextSummary buckets over TextSummary nodes and one root summary for the dataset. During graph completion search, enable it with retriever_specific_config={"include_global_context_index": True}.

See Global Context Index for the full model and retrieval example.

What improve produces

Enriched graph structures on the target dataset
Triplet-embedding style retrieval artifacts (when triplet embeddings are enabled)
Optional global context bucket and root summaries
Optional persistence of session Q&A into the permanent graph
Optional persistence of agent trace steps into the permanent graph
Optional distilled session-learning documents under session_learnings
Optional feedback-based weighting updates on graph elements used during retrieval
Optional truth-subspace anchors when build_truth_subspace=True
Optional sync of newly enriched graph context back into session cache

What improve needs before it can help

A target dataset must already exist.
That dataset should already contain graph memory, usually created by Remember.
Session bridging only applies when you pass session_ids.
Feedback-based weighting only helps when those sessions contain feedback and retrieval traces.
Session distillation only produces documents when the session contains gated guidance that survives curation.

So if you run improve() on a dataset with no existing graph content, or pass sessions that have no useful cached interactions, the operation may run successfully but add little new value.

Bridge session memory into the graph

await cognee.improve(
    dataset="rules_demo",
    session_ids=["chat_1", "chat_2"],
)

This applies feedback weights from those sessions.
It persists the session Q&A into the permanent graph.
It persists agent trace steps when present.
It distills accepted session guidance into session_learnings.
It enriches the graph and syncs new context back into the sessions.

Parameters

Basic Parameters

Option	What it does
`dataset`	The dataset name or UUID to improve. Defaults to `main_dataset`.
`session_ids`	Bridges those sessions into the permanent graph and syncs graph context back.
`run_in_background`	Runs the improvement pipeline asynchronously.
`build_global_context_index`	Builds semantic bucket summaries and a root dataset summary after enrichment. Skipped in background mode.
`build_truth_subspace`	Builds truth-subspace anchors from distilled `session_learnings`. Only runs when `session_ids` is provided.
`node_name`	Restricts improvement to specific named entities or node sets.
`feedback_alpha`	Controls how strongly session feedback changes graph weights.

Advanced Parameters

Option	What it does
`extraction_tasks` / `enrichment_tasks`	Overrides the default enrichment task set.
`data`	Supplies explicit data to advanced improvement pipelines when supported.
`node_type`	Changes which node type the enrichment pass targets.
`user`	Runs improve under a specific user context, affecting dataset access and session ownership.
`vector_db_config` / `graph_db_config`	Overrides database backend configuration for the vector or graph stores.
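Two of the options above are silently skipped in certain combinations: build_global_context_index in background mode, and build_truth_subspace without session_ids. A small helper can surface those cases before the call is made. The helper below is a hypothetical convenience, not part of Cognee. It uses only parameter names from the tables above, and raising an error instead of skipping is a design choice made for this sketch.

```python
from typing import Any, Dict, Optional, Sequence


def build_improve_kwargs(
    dataset: str = "main_dataset",
    session_ids: Optional[Sequence[str]] = None,
    run_in_background: bool = False,
    build_global_context_index: bool = False,
    build_truth_subspace: bool = False,
) -> Dict[str, Any]:
    """Assemble keyword arguments for improve(), rejecting combinations that would be skipped."""
    if build_truth_subspace and not session_ids:
        raise ValueError("build_truth_subspace only runs when session_ids is provided")
    if build_global_context_index and run_in_background:
        raise ValueError("build_global_context_index is skipped in background mode")

    kwargs: Dict[str, Any] = {"dataset": dataset}
    if session_ids:
        kwargs["session_ids"] = list(session_ids)
    if run_in_background:
        kwargs["run_in_background"] = True
    if build_global_context_index:
        kwargs["build_global_context_index"] = True
    if build_truth_subspace:
        kwargs["build_truth_subspace"] = True
    return kwargs


kwargs = build_improve_kwargs("rules_demo", session_ids=["chat_1", "chat_2"])
print(kwargs)

# Inside an async function, the call would then be:
#     await cognee.improve(**kwargs)
```

Failing early here trades a little strictness for clarity, since a skipped stage would otherwise only show up as a missing index or missing anchors later.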

Under the hood — legacy operations

improve() runs Memify under the hood for its enrichment pass.

Use legacy Memify directly when you need fine-grained control over extraction and enrichment tasks (for example, to supply custom task lists or target specific pipeline stages). Note that remember(..., self_improvement=True) already calls improve() for you after permanent ingestion.

Remember

Ingest and build memory in one call

Recall

Query the improved graph and session memory
