
What is the memify operation

The .memify operation runs enrichment pipelines on an existing knowledge graph. It requires a graph built by Add and Cognify — it does not ingest raw data or build the graph from scratch. Every memify pipeline is composed of two stages:
  • Extraction — selects or prepares data from the existing graph. For example, pulling document chunks, loading graph triplets, or reading cached sessions.
  • Enrichment — processes the extracted data (typically via LLM) and writes new or updated nodes and edges back to the graph. For example, deriving coding rules, indexing triplet embeddings, or consolidating entity descriptions.
Memify chains extraction tasks and enrichment tasks into a single pipeline and runs them in sequence. When you call await cognee.memify() with no arguments, it runs the default pipeline. You can also call one of the other built-in pipelines directly, or supply your own custom tasks. cognee.memify() accepts the following parameters:
  • extraction_tasks (List[Task], default: [Task(extract_subgraph_chunks)]) — tasks that select or prepare the data to process. When omitted, memify pulls document chunks from the existing graph.
  • enrichment_tasks (List[Task], default: [Task(add_rule_associations, rules_nodeset_name="coding_agent_rules")]) — tasks that create or update nodes and edges from the extracted data. When omitted, memify derives coding-rule associations.
  • data (Any, default: None) — input data forwarded to the first extraction task. When None, memify loads the graph (or a filtered subgraph) as input.
  • dataset (str or UUID, default: "main_dataset") — the dataset to process. The user must have write access.
  • node_type (Type, default: NodeSet) — filter the graph to nodes of this type. Only used when data is None.
  • node_name (List[str], default: None) — filter the graph to nodes with these names. Only used when data is None.
  • run_in_background (bool, default: False) — if True, memify starts processing and returns immediately. Use the returned pipeline_run_id to monitor progress.

Built-in pipelines

Cognee ships four built-in pipelines. Each one calls cognee.memify() with a pre-configured pair of extraction and enrichment tasks. The default pipeline runs when you call cognee.memify() with no arguments. The other three are convenience functions that call cognee.memify() internally with their own tasks.
Default pipeline (coding rules)

Runs when you call await cognee.memify() with no task arguments.
  • Extraction (extract_subgraph_chunks) — pulls document chunk texts from the existing graph
  • Enrichment (add_rule_associations) — sends chunks to the LLM, which derives coding-rule associations
Produces: Rule nodes connected to source chunks via rule_associated_from edges, grouped under the coding_agent_rules node set. Enables SearchType.CODING_RULES queries.
Guide: Memify Quickstart
Triplet embeddings

Calls cognee.memify() with triplet-specific tasks via await create_triplet_embeddings(user, dataset).
  • Extraction (get_triplet_datapoints) — reads graph triplets (source → relationship → target) and converts each into an embeddable text
  • Enrichment (index_data_points) — indexes those texts in the vector DB under the Triplet_text collection
Produces: a searchable Triplet_text vector collection. Enables SearchType.TRIPLET_COMPLETION queries.
Guide: Triplet Embeddings Guide
Session persistence

Calls cognee.memify() with session-specific tasks via await persist_sessions_in_knowledge_graph_pipeline(user, session_ids). Requires caching to be enabled.
  • Extraction (extract_user_sessions) — reads Q&A data from the session cache for the specified session IDs
  • Enrichment (cognify_session) — processes the session data through cognee.add and cognee.cognify
Produces: new graph nodes from the session content, grouped under the user_sessions_from_cache node set.
Guide: Session Persistence Guide
Entity consolidation

Calls cognee.memify() with entity-consolidation tasks via await consolidate_entity_descriptions_pipeline(). Useful when entity descriptions are fragmented or repetitive across chunks after cognify.
  • Extraction (get_entities_with_neighborhood) — loads Entity nodes along with their edges and neighbors
  • Enrichment (generate_consolidated_entities, add_data_points) — sends each entity and its neighborhood to the LLM, which returns a refined description that is then written back to the graph
Produces: updated Entity descriptions written back in place — no new nodes are created.
Guide: Entity Consolidation Guide