Global Context Index

The global context index is an optional summary layer that helps Cognee answer questions that depend on the broader shape of a dataset, not only the closest graph facts. Normal graph retrieval is local: Cognee searches for graph edges, chunks, summaries, and entities that match the query. That works well when the answer is near a few specific facts. The global context index adds a higher-level map: semantic buckets of TextSummary nodes and a root summary of the dataset.

Why use it

Use the global context index when answers often depend on document-wide or dataset-wide context:

long documents where important details are spread across chapters or sections
evolving conversations where the final state depends on earlier updates
project memory where the answer needs the overall plan, risks, and current status
policy or research corpora where local facts need broader framing

It is most useful when you want retrieval to include both:

local evidence from the graph
global orientation from compact dataset summaries

How it works

During normal ingestion and enrichment, Cognee creates DocumentChunk and TextSummary datapoints. The global context index adds the GlobalContextSummary layers shown inside the dashed lines:

--------------------------------------------------
root GlobalContextSummary
  -> optional higher-level GlobalContextSummary bucket
    -> GlobalContextSummary bucket
--------------------------------------------------
      -> TextSummary
        -> DocumentChunk

The build process is bottom-up, starting from TextSummary nodes. The retrieval hierarchy is top-down, starting from the root summary.

The index groups TextSummary nodes, not raw DocumentChunk nodes directly.
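
The bottom-up build can be pictured with a small toy sketch. It is not Cognee's implementation. The `Node` dataclass, the fixed-size grouping that stands in for semantic clustering, the `bucket_size` parameter, and the `summarize` stub that stands in for the LLM call are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a summary node; not a Cognee class."""
    kind: str  # "TextSummary" or "GlobalContextSummary"
    text: str
    children: List["Node"] = field(default_factory=list)


def summarize(texts: List[str]) -> str:
    # Stub for the LLM summarization call: simply joins the inputs.
    return " | ".join(texts)


def build_level(nodes: List[Node], bucket_size: int) -> List[Node]:
    # Fixed-size grouping stands in for semantic clustering.
    buckets = []
    for start in range(0, len(nodes), bucket_size):
        members = nodes[start:start + bucket_size]
        text = summarize([m.text for m in members])
        buckets.append(Node("GlobalContextSummary", text, members))
    return buckets


def build_index(text_summaries: List[Node], bucket_size: int = 2) -> Node:
    if bucket_size < 2 or not text_summaries:
        raise ValueError("need bucket_size >= 2 and at least one summary")
    # Bottom-up: bucket the TextSummary nodes, then keep bucketing the
    # buckets until the top level is small enough to sit under one root.
    level = build_level(text_summaries, bucket_size)
    while len(level) > bucket_size:
        level = build_level(level, bucket_size)
    return Node("GlobalContextSummary", summarize([n.text for n in level]), level)


summaries = [Node("TextSummary", f"summary {i}") for i in range(5)]
root = build_index(summaries)
```

With five summaries and a bucket size of two, the sketch produces a root, one intermediate level of buckets, and a lowest level of buckets that point at the TextSummary nodes, which matches the shape of the diagram above.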

What retrieval adds

When enabled for graph completion search, Cognee prepends a global context prelude to the usual graph context:

World summary:
...

Relevant areas:
...

<normal graph context follows>

The World summary comes from the root GlobalContextSummary. The Relevant areas are the top matching non-root GlobalContextSummary bucket texts for the query. This gives the model a compact map before it reads the local graph facts.
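
As a rough illustration of how such a prelude could be assembled, the sketch below ranks bucket summaries by a relevance score and places them ahead of the graph context. The function name `build_context`, the `(score, text)` input shape, the "higher score is more relevant" convention, and the `- ` bullet formatting are assumptions for the example, not Cognee's actual internals.

```python
from typing import List, Tuple


def build_context(
    root_summary: str,
    scored_buckets: List[Tuple[float, str]],
    graph_context: str,
    top_k: int = 3,
) -> str:
    # Keep the top_k highest-scoring non-root bucket summaries.
    top = sorted(scored_buckets, key=lambda pair: pair[0], reverse=True)[:top_k]
    areas = "\n".join(f"- {text}" for _, text in top)
    # The prelude goes first; the usual graph context follows unchanged.
    return (
        f"World summary:\n{root_summary}\n\n"
        f"Relevant areas:\n{areas}\n\n"
        f"{graph_context}"
    )


context = build_context(
    root_summary="Rollout plan for the billing service.",
    scored_buckets=[(0.2, "Hiring"), (0.9, "Rollout timeline"), (0.7, "Known risks")],
    graph_context="rollout --scheduled_for--> Q3",
    top_k=2,
)
print(context)
```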

Build the index

The index is opt-in. Build it after memory has been created:

import cognee

# Run inside an async function (or a notebook that supports top-level await).
await cognee.improve(
    dataset="product_docs",
    build_global_context_index=True,
)

improve() first runs the normal enrichment pass, then builds the global context index.

build_global_context_index=True is skipped when run_in_background=True, because ordered background pipeline chaining is not currently supported for this step.
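
Because the build is skipped in background mode, a caller may want to check its own flags before relying on the index. The sketch below only restates that rule in code. `index_will_be_built` and the `improve_kwargs` dictionary are hypothetical helpers for the example and are not part of Cognee.

```python
def index_will_be_built(build_global_context_index: bool, run_in_background: bool) -> bool:
    # Restates the documented rule: the build runs only when it is requested
    # and improve() is not running in the background.
    return build_global_context_index and not run_in_background


# Hypothetical keyword arguments a caller might pass to cognee.improve().
improve_kwargs = {
    "dataset": "product_docs",
    "build_global_context_index": True,
    "run_in_background": True,
}

if not index_will_be_built(
    improve_kwargs["build_global_context_index"],
    improve_kwargs["run_in_background"],
):
    # Fall back to a foreground run so the index step is not skipped.
    improve_kwargs["run_in_background"] = False
```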

Use it during search

Enable it through retriever_specific_config on graph completion search:

import cognee
from cognee import SearchType

results = await cognee.recall(
    query_text="What is the current state of the rollout plan?",
    query_type=SearchType.GRAPH_COMPLETION,
    datasets=["product_docs"],
    retriever_specific_config={
        "include_global_context_index": True,
        "global_context_index_top_k": 3,
    },
)

To inspect exactly what will be sent as context, use only_context=True:

context = await cognee.recall(
    query_text="What changed after the second meeting?",
    query_type=SearchType.GRAPH_COMPLETION,
    datasets=["product_docs"],
    only_context=True,
    retriever_specific_config={
        "include_global_context_index": True,
        "global_context_index_top_k": 3,
    },
)
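
When inspecting the context, it can help to separate the prelude from the graph context. The sketch below assumes the context is available as a single string laid out as in "What retrieval adds", with blank lines between the three parts and no blank lines inside the relevant areas. That layout and the helper name `split_prelude` are assumptions; the actual return shape of `only_context=True` may differ.

```python
from typing import Dict, Optional


def split_prelude(context: str) -> Dict[str, Optional[str]]:
    world_marker = "World summary:\n"
    areas_marker = "\n\nRelevant areas:\n"
    if not context.startswith(world_marker) or areas_marker not in context:
        # No prelude found: treat the whole string as graph context.
        return {"world_summary": None, "relevant_areas": None, "graph_context": context}
    world, rest = context[len(world_marker):].split(areas_marker, 1)
    # The first blank line after the areas marks the start of the graph context.
    areas, _, graph = rest.partition("\n\n")
    return {"world_summary": world, "relevant_areas": areas, "graph_context": graph}


sample = "World summary:\nA\n\nRelevant areas:\nB\nC\n\nG1\n\nG2"
parts = split_prelude(sample)
```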

Configuration

| Option | Default | Where it is used | What it does |
| --- | --- | --- | --- |
| `build_global_context_index` | `False` | `cognee.improve()` | Builds the bucket and root summaries after enrichment. |
| `include_global_context_index` | `False` | `retriever_specific_config` | Prepends global context during `GRAPH_COMPLETION` retrieval. |
| `global_context_index_top_k` | `3` | `retriever_specific_config` | Number of non-root bucket summaries to include as relevant areas. |
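
The two retrieval options can be kept together in a small helper so that every search call uses the same settings. This is a convenience sketch: `global_context_config` is a hypothetical function, and rejecting negative `top_k` values is an assumption rather than documented Cognee behavior. Only the two dictionary keys and their defaults come from the table above.

```python
from typing import Any, Dict


def global_context_config(enabled: bool = False, top_k: int = 3) -> Dict[str, Any]:
    # Defaults mirror the table: disabled, three relevant areas.
    if top_k < 0:
        raise ValueError("top_k must not be negative")
    return {
        "include_global_context_index": enabled,
        "global_context_index_top_k": top_k,
    }


# Pass the result as retriever_specific_config in a search call.
config = global_context_config(enabled=True, top_k=5)
```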

Benefits and tradeoffs

The main benefit is better long-range coherence. The model can see a compact summary of the dataset before it reasons over the retrieved graph context. This can reduce failures where local retrieval finds a relevant fragment but misses the broader story.

The tradeoff is that the index is lossy. A bucket summary is an orientation aid, not a replacement for source chunks or graph facts. The most reliable answers still come from combining global context with precise retrieved evidence.

Building the index also adds work: Cognee clusters summaries and calls the LLM to summarize buckets and the root.

When not to use it

You may not need the global context index when:

your dataset is small enough that normal retrieval already has enough context
queries are mostly simple fact lookups
you need the fastest possible enrichment pass
you want retrieval context to contain only direct local graph evidence

For small datasets, start without it. Add it when you see questions that need broader orientation or multi-part memory.
