What is the memify operation
The memify operation runs enrichment pipelines on an existing knowledge graph. It requires a graph built by Add and Cognify; it does not ingest raw data or build a graph from scratch.
Every memify pipeline is composed of two stages:
- Extraction: selects or prepares data from the existing graph, for example pulling document chunks, loading graph triplets, or reading cached sessions.
- Enrichment: processes the extracted data and writes new or updated nodes and edges back to the graph. Depending on the pipeline, that can mean indexing triplet datapoints, deriving coding rules, or consolidating entity descriptions.
Calling await cognee.memify() with no arguments runs the default pipeline. You can also call one of the other built-in pipelines directly, or supply your own custom tasks.
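The extraction-then-enrichment shape can be illustrated with a small self-contained sketch. Everything here (the graph dict, the task functions, memify_sketch) is a toy stand-in for illustration, not cognee's actual API:

```python
import asyncio

# Toy stand-ins for a memify-style pipeline: an extraction stage that
# selects data from an existing "graph", and an enrichment stage that
# derives new records from it. Names are illustrative, not cognee's.

graph = {
    "nodes": [{"name": "Alice", "type": "Entity"}, {"name": "Bob", "type": "Entity"}],
    "edges": [("Alice", "knows", "Bob")],
}

async def extract_triplets(g):
    # Extraction: pull (source, relationship, target) triplets from the graph.
    return list(g["edges"])

async def index_triplets(triplets, store):
    # Enrichment: convert each triplet to text and "index" it.
    for s, r, t in triplets:
        store.append(f"{s} {r} {t}")
    return store

async def memify_sketch(g):
    extracted = await extract_triplets(g)       # stage 1: extraction
    return await index_triplets(extracted, [])  # stage 2: enrichment

index = asyncio.run(memify_sketch(graph))
print(index)  # ['Alice knows Bob']
```

The two stages are deliberately separate: swapping either function changes the pipeline without touching the other stage, which is the same composition model memify exposes through its task parameters.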
Parameters (cognee.memify)
- extraction_tasks (List[Task], default: config-dependent): tasks that select or prepare the data to process. By default, memify uses triplet-datapoint extraction when triplet embeddings are enabled; otherwise the default extraction stage can be empty.
- enrichment_tasks (List[Task], default: [Task(index_data_points, task_config={"batch_size": 100})]): tasks that create or update nodes and edges from the extracted data. When omitted, memify indexes the default extracted datapoints.
- data (Any, default: None): input data forwarded to the first extraction task. When None, memify loads the graph (or a filtered subgraph) as input.
- dataset (str or UUID, default: "main_dataset"): the dataset to process. The user must have write access.
- node_type (Type, default: NodeSet): filter the graph to nodes of this type. Only used when data is None.
- node_name (List[str], default: None): filter the graph to nodes with these names. Only used when data is None.
- run_in_background (bool, default: False): if True, memify starts processing and returns immediately. Use the returned pipeline_run_id to monitor progress.
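How data, node_type, and node_name interact can be sketched in a few lines. This is hypothetical logic mirroring the parameter descriptions above (resolve_input and the dict-based graph are assumptions, not cognee internals):

```python
# Hypothetical input-selection logic, as described by the parameters:
# an explicit `data` value wins; otherwise the graph is loaded and
# optionally narrowed by node_type and node_name.

def resolve_input(data=None, graph=None, node_type=None, node_name=None):
    if data is not None:
        return data  # forwarded directly to the first extraction task
    nodes = graph
    if node_type is not None:
        nodes = [n for n in nodes if n["type"] == node_type]
    if node_name is not None:
        nodes = [n for n in nodes if n["name"] in node_name]
    return nodes

graph = [
    {"name": "Alice", "type": "Entity"},
    {"name": "repo_rules", "type": "NodeSet"},
]

# Explicit data bypasses graph loading and both filters.
assert resolve_input(data=["chunk-1"], graph=graph) == ["chunk-1"]

# With data=None, the filters select a subgraph.
assert resolve_input(graph=graph, node_type="NodeSet") == [
    {"name": "repo_rules", "type": "NodeSet"}
]
```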
Built-in pipelines
Cognee ships a default memify pipeline plus several convenience pipelines. The default pipeline runs when you call cognee.memify() with no arguments. Other helpers wrap cognee.memify() with their own task sets.
Default enrichment (triplet datapoints)
Runs when you call await cognee.memify() with no task arguments.
- Extraction (get_triplet_datapoints): when triplet embeddings are enabled, reads graph triplets (source -> relationship -> target) and converts each to an indexable datapoint
- Enrichment (index_data_points): indexes those datapoints in the vector DB
The default extraction stage is config-dependent. When triplet embeddings are disabled, memify does not run the triplet extraction task automatically.
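The data flow of the default pipeline can be sketched as a toy: triplets become text datapoints, then get indexed in batches. The batch size of 100 mirrors the documented default task_config; the function bodies are stand-ins, not cognee internals:

```python
# Toy sketch of the default stages: turn graph triplets into text
# datapoints (extraction), then consume them in batches (enrichment).
# batch_size=100 mirrors the documented default; the rest is illustrative.

def get_triplet_datapoints(edges):
    # One indexable text per (source, relationship, target) triplet.
    return [f"{s} {r} {t}" for s, r, t in edges]

def index_data_points(datapoints, batch_size=100):
    # Split into successive batches, as a batched indexing task would.
    return [datapoints[i:i + batch_size] for i in range(0, len(datapoints), batch_size)]

edges = [
    ("Berlin", "is_capital_of", "Germany"),
    ("Paris", "is_capital_of", "France"),
]
batches = index_data_points(get_triplet_datapoints(edges))
print(batches)  # [['Berlin is_capital_of Germany', 'Paris is_capital_of France']]
```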
Triplet embeddings
Calls cognee.memify() with triplet-specific tasks via await create_triplet_embeddings(user, dataset).
- Extraction (get_triplet_datapoints): reads graph triplets (source -> relationship -> target) and converts each to an embeddable text
- Enrichment (index_data_points): indexes those texts in the vector DB under the Triplet_text collection

Output: the Triplet_text vector collection. Enables SearchType.TRIPLET_COMPLETION queries.
Guide: Triplet Embeddings Guide
Coding rules (custom workflow)
Coding-rule extraction still exists, but it is no longer the default memify pipeline.
- Extraction (extract_subgraph_chunks): pulls document chunk texts from the existing graph
- Enrichment (add_rule_associations): sends chunks to the LLM, which derives coding-rule associations

Output: Rule nodes connected to source chunks via rule_associated_from edges, grouped under the coding_agent_rules node set. Enables SearchType.CODING_RULES queries.
Guide: Memify Quickstart
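The output shape of this workflow can be sketched with a stubbed "LLM": each derived rule links back to its source chunk via a rule_associated_from edge and joins the coding_agent_rules node set. The derive_rules stub and dict/tuple shapes are illustrative assumptions:

```python
# Toy sketch of the coding-rules output shape: rules link back to their
# source chunks and are grouped under the coding_agent_rules node set.

def derive_rules(chunk_text):
    # Stand-in for the LLM call that extracts rule statements from a chunk.
    return [line for line in chunk_text.splitlines() if line.startswith("Always")]

def add_rule_associations(chunks):
    nodes, edges = [], []
    for chunk_id, text in chunks:
        for rule in derive_rules(text):
            nodes.append({"rule": rule, "node_set": "coding_agent_rules"})
            edges.append((rule, "rule_associated_from", chunk_id))
    return nodes, edges

nodes, edges = add_rule_associations(
    [("chunk-1", "Always pin dependencies.\nUse four-space indents.")]
)
print(edges)  # [('Always pin dependencies.', 'rule_associated_from', 'chunk-1')]
```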
Session persistence
Calls cognee.memify() with session-specific tasks via await persist_sessions_in_knowledge_graph_pipeline(user, session_ids). Requires caching to be enabled.
- Extraction (extract_user_sessions): reads Q&A data from the session cache for the specified session IDs
- Enrichment (cognify_session): processes session data through cognee.add + cognee.cognify

Output: the user_sessions_from_cache node set.
Guide: Session Persistence Guide
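The session flow can be sketched end to end: cached Q&A pairs are read for the requested session IDs (extraction) and flattened into text suitable for feeding to add + cognify (enrichment). The cache dict and both function bodies are illustrative stand-ins, not cognee's implementations:

```python
# Toy sketch of session persistence: read cached Q&A for selected
# sessions, then flatten each session into one document.

session_cache = {
    "s1": [("What is memify?", "An enrichment pipeline runner.")],
    "s2": [("Default dataset?", "main_dataset")],
}

def extract_user_sessions(cache, session_ids):
    # Extraction: only the requested sessions are pulled from the cache.
    return {sid: cache[sid] for sid in session_ids if sid in cache}

def cognify_session(sessions):
    # Enrichment: flatten each session's Q&A pairs into one text document,
    # the kind of input that would then go through add + cognify.
    return {
        sid: "\n".join(f"Q: {q}\nA: {a}" for q, a in qa)
        for sid, qa in sessions.items()
    }

docs = cognify_session(extract_user_sessions(session_cache, ["s1"]))
print(docs)  # {'s1': 'Q: What is memify?\nA: An enrichment pipeline runner.'}
```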
Entity consolidation
Calls cognee.memify() with entity-consolidation tasks via await consolidate_entity_descriptions_pipeline(). Useful when entity descriptions are fragmented or repetitive across chunks after cognify.
- Extraction (get_entities_with_neighborhood): loads Entity nodes along with their edges and neighbors
- Enrichment (generate_consolidated_entities -> add_data_points): sends each entity and its neighborhood to the LLM, which returns a refined description

Output: Entity descriptions are written back in place; no new nodes are created.
Guide: Entity Consolidation Guide
Cognify
Build the knowledge graph that memify enriches
Memify Quickstart
Run the default memify pipeline step by step
Search
Query the enriched graph with specialized search types