Changelog - Cognee Documentation

Cognee releases with highlights and links to the full release notes on GitHub.

Unreleased

Changes queued for the next release. This section is updated as unreleased work is merged and is folded into a versioned release section when the release is published.

Fixes the Mistral transcription adapter sending an entire Windows path as the API file_name. MistralAdapter.create_transcript derived the file name with input.split("/")[-1]; on Windows the audio path uses backslash separators (e.g. C:\audio\clip.mp3) and contains no forward slashes, so the split left the value unchanged and the whole path — rather than the basename clip.mp3 — was sent to the Mistral transcription API. The basename is now derived with str(input).replace("\\", "/").split("/")[-1], normalizing both \ and / separators (the same handling used by _normalize_filename in cognee/tasks/ingestion/utils.py), so Windows and POSIX paths both send just the file name. The change is backward-compatible: POSIX paths resolve to the same basename as before, and no public API signature, configuration option, or environment variable changed (fixes #3587, PR #3588).
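As a minimal sketch of the normalization described above, the expression can be wrapped in a small helper. The name `basename_from_any_path` is hypothetical and is not the adapter's actual code; it only illustrates why replacing backslashes first makes the split work for both path styles.

```python
def basename_from_any_path(path) -> str:
    """Return the last path component, treating both backslash and '/' as separators."""
    # Normalize backslashes to forward slashes, then keep the final segment.
    return str(path).replace("\\", "/").split("/")[-1]


windows_name = basename_from_any_path(r"C:\audio\clip.mp3")
posix_name = basename_from_any_path("/home/user/audio/clip.mp3")

# Previous logic: a Windows path has no "/", so the whole path survives the split.
old_behavior = r"C:\audio\clip.mp3".split("/")[-1]
```

With this helper both inputs reduce to `clip.mp3`, while the old expression leaves the Windows path untouched.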
Fixes the stored file size not refreshing when a file is re-ingested. On the re-ingest update path, ingest_data assigned the new size to data_point.file_size, but the Data model defines the column as data_size (the name already used correctly on the new-record branch). SQLAlchemy silently ignored the nonexistent attribute, so the persisted data_size stayed at its original value after a file was re-added with new or changed content. The assignment now targets data_point.data_size, so re-ingestion records the current size. No public API signature, configuration option, environment variable, or database migration changed (fixes #3160, PR #3578).
Fixes regex entity extraction config files failing to load on platforms whose default locale encoding is not UTF-8 (commonly Windows). RegexEntityConfig._load_config previously opened the config JSON with open(path, "r"), which relies on the platform’s locale encoding; on a non-UTF-8 system a config containing non-ASCII characters (for example Unicode entity names, descriptions, or regex patterns) could raise a decode error and fail to load. The file is now opened with an explicit encoding="utf-8", so configs load consistently across platforms. The change is backward-compatible: existing UTF-8/ASCII configs load exactly as before, no config edits or migration are required, and no public API signature, configuration option, or environment variable changed (fixes #3316, PR #3337).
Fixes a TypeError when instantiating the Amazon Neptune Analytics graph adapter (NeptuneGraphDB). GraphDBInterface declares is_empty() as an abstract method, but the Neptune adapter never implemented it, so the class was abstract and any attempt to construct it (including test collection in cognee/tests/test_neptune_analytics_graph.py) raised TypeError: Can't instantiate abstract class NeptuneGraphDB with abstract method is_empty. The adapter now implements async is_empty() -> bool, which runs a small openCypher node-existence query (MATCH (n) RETURN true LIMIT 1) and returns True when the graph contains no nodes and False otherwise; it relies on Neptune’s openCypher support. This is a non-breaking bug fix that only adds the required method — no public SDK function, configuration option, or environment variable changed (closes #3407, PR #3457).
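The failure mode is general Python behavior for abstract base classes, so it can be reproduced without Neptune. The sketch below uses hypothetical stand-in classes (`GraphInterfaceSketch`, `IncompleteAdapter`, `CompleteAdapter`) and replaces the openCypher query with an in-memory node count; it is an illustration, not the adapter's code.

```python
import asyncio
from abc import ABC, abstractmethod


class GraphInterfaceSketch(ABC):
    """Hypothetical stand-in for an interface with one abstract method."""

    @abstractmethod
    async def is_empty(self) -> bool: ...


class IncompleteAdapter(GraphInterfaceSketch):
    """Omits is_empty(), so the class itself stays abstract."""


class CompleteAdapter(GraphInterfaceSketch):
    def __init__(self, node_count: int) -> None:
        # Assumption: an integer stands in for a real node-existence query.
        self._node_count = node_count

    async def is_empty(self) -> bool:
        return self._node_count == 0


try:
    IncompleteAdapter()
    instantiation_error = None
except TypeError as exc:
    # Python refuses to construct a class with unimplemented abstract methods.
    instantiation_error = exc

empty_result = asyncio.run(CompleteAdapter(node_count=0).is_empty())
non_empty_result = asyncio.run(CompleteAdapter(node_count=3).is_empty())
```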
Fixes embedding retries wasting the full back-off window on deterministic “context window too small” failures. When an over-length embedding input is split down to a single string that still exceeds the model’s context window but can no longer be divided, the engines now raise a new terminal EmbeddingContextWindowTooSmallError (a subclass of EmbeddingException, default message Text is too short to split further but exceeds context window.) and add it to their retry_if_not_exception_type set, so the failure returns immediately instead of consuming the ~128-second retry/back-off window. This applies to LiteLLMEmbeddingEngine, FastembedEmbeddingEngine, and OpenAICompatibleEmbeddingEngine; generic EmbeddingException failures remain retryable. The OpenAICompatibleEmbeddingEngine also now imports the shared EmbeddingException/EmbeddingContextWindowTooSmallError from cognee.infrastructure.databases.exceptions instead of defining a local EmbeddingException. No public API signature, configuration option, or environment variable changed; code that already catches EmbeddingException continues to catch the new subclass (fixes #3319, PR #3424).
Fixes .txt prompt templates being HTML-escaped on the wire. render_prompt configured its Jinja2 environment with autoescape=select_autoescape(["html", "xml", "txt"]), and because every prompt template shipped with Cognee is a .txt file, every interpolated variable in every rendered LLM prompt was HTML-escaped for all providers: apostrophes became &#39;, triplet arrows in retrieval context became --&gt;, ampersands became &amp;, and angle brackets became &lt;/&gt;, including in the user’s own question in completion prompts. Autoescape now covers only markup templates (["html", "xml"]), so .txt prompts render their content verbatim while .html/.xml templates remain escaped. The user-visible effect is restored prompt fidelity, reduced token waste, and fewer subtle parsing/extraction issues; the change is internal to prompt rendering and non-breaking, with no public API signature, configuration option, or environment variable changed (SDK-203, PR #4115).
Fixes writes failing on Postgres-backed graph and vector stores when node/edge fields or vector payloads contain NUL bytes (\u0000). Postgres text columns and JSONB reject the \u0000 escape, and while the vector store’s json column accepts it on insert, the payload::jsonb casts used by search/merge queries later reject it — so ingesting content with an embedded NUL byte could error. A shared sanitize_relational_payload helper now strips NUL bytes from strings and recurses through nested containers (dicts, lists, tuples), decoding bytes/bytearray values as UTF-8 with replacement so invalid byte sequences do not break persistence. It is applied in the Postgres graph adapter to the id, name, type, and properties of nodes and to the source_id, target_id, relationship_name, and properties of edges (ids sanitized identically on both sides so references stay consistent), and in PGVectorAdapter to each data point’s serialized payload. This is an internal serialization fix for the Postgres graph and PGVector adapters; no public API signature, configuration option, environment variable, or database migration changed (PR #4153).
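The recursion described above might look roughly like the following. This is a sketch under stated assumptions, not the shipped `sanitize_relational_payload`: the function name `sanitize_payload_sketch` is hypothetical, and sanitizing dictionary keys as well as values is an assumption the entry does not confirm.

```python
from typing import Any


def sanitize_payload_sketch(value: Any) -> Any:
    """Recursively strip NUL characters; decode bytes as UTF-8 with replacement."""
    if isinstance(value, (bytes, bytearray)):
        # Invalid byte sequences become U+FFFD instead of raising.
        value = bytes(value).decode("utf-8", errors="replace")
    if isinstance(value, str):
        return value.replace("\x00", "")
    if isinstance(value, dict):
        # Assumption: keys are sanitized the same way as values.
        return {
            sanitize_payload_sketch(key): sanitize_payload_sketch(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [sanitize_payload_sketch(item) for item in value]
    if isinstance(value, tuple):
        return tuple(sanitize_payload_sketch(item) for item in value)
    return value  # numbers, None, booleans, and other types pass through unchanged


cleaned = sanitize_payload_sketch(
    {"name": "Ada\x00 Lovelace", "tags": ["a\x00b", b"raw\x00\xff"], "pair": ("x\x00", 1)}
)
```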
Fixes ingestion crashing on Windows when a string starts with / or is drive-relative. In save_data_item_to_storage, the absolute-path branch treated any string beginning with / (or, on Windows, one whose second character is :) as a local file path and called Path(...).as_uri() on it. On Windows, os.path.normpath("/etc/hosts") yields a drive-relative path and a drive-relative input like C:notes.txt stays drive-relative, so as_uri() raised ValueError (relative paths cannot be expressed as file: URIs) — any POSIX-style path string or plain text note starting with / crashed add(). The branch is now additionally guarded by Path(os.path.normpath(data_item)).is_absolute(), so on the current platform only genuinely absolute paths convert to a file: URI; non-absolute /-prefixed or drive-relative strings fall through to the existing relative-path/text handling and are ingested as text (saved to Cognee’s data storage as a text file). POSIX behavior is unchanged (/... paths still convert to file: URIs) and genuine Windows absolute paths (C:\...) still convert as before. No public API signature, configuration option, or environment variable changed, and accept_local_file_path continues to govern acceptance of true absolute paths (fixes #3887, PR #3892).
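The platform difference behind this fix can be reproduced on any operating system with the pure-path classes from `pathlib`. The sketch below only illustrates the `is_absolute()` guard; the sample strings are illustrative and the code is not taken from save_data_item_to_storage.

```python
import ntpath
from pathlib import PurePosixPath, PureWindowsPath

candidates = ["/etc/hosts", "C:notes.txt", r"C:\data\notes.txt"]

# PureWindowsPath and ntpath apply Windows rules on any OS, so results are reproducible.
windows_is_absolute = {
    text: PureWindowsPath(ntpath.normpath(text)).is_absolute() for text in candidates
}

# Under POSIX rules a leading "/" is enough to make the path absolute.
posix_is_absolute = PurePosixPath("/etc/hosts").is_absolute()
```

Under Windows rules only the path with both a drive and a root counts as absolute, which is why the `/`-prefixed and drive-relative strings now fall through to text handling there.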
Fixes S3 ingestion failing on Windows with PermissionError (WinError 32). In data_item_to_text_file, the S3 branch downloaded the object into a tempfile.NamedTemporaryFile created with the default delete=True and then passed temp_file.name to the loader, which reopens the file by name while Cognee’s handle is still open. On Windows that reopen raises PermissionError [WinError 32], so every S3 ingestion failed. The temp file is now created with delete=False, its handle is flushed and closed before the loader reopens it, and it is removed with os.unlink in a finally block (guarded against OSError) so no temp file is leaked — mirroring the delete=False pattern already used by the SQLAlchemy and ladybug S3 temp-file paths. POSIX behavior (Linux/macOS) is unchanged and temporary files are still cleaned up after use. No public API signature, configuration option, or environment variable changed (fixes #3339, PR #3340).
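The delete=False pattern described above can be sketched with the standard library alone. The helper name `read_via_named_temp_file` is hypothetical, and a plain `open()` call stands in for the loader that reopens the file by name.

```python
import os
import tempfile


def read_via_named_temp_file(payload: bytes) -> tuple[bytes, str]:
    """Write payload to a temp file, close it, reopen it by name, then clean up."""
    temp_file = tempfile.NamedTemporaryFile(delete=False, suffix=".bin")
    try:
        temp_file.write(payload)
        temp_file.flush()
        temp_file.close()  # release the handle so a second open succeeds on Windows
        with open(temp_file.name, "rb") as reopened:  # stands in for the loader
            return reopened.read(), temp_file.name
    finally:
        temp_file.close()  # harmless if the file is already closed
        try:
            os.unlink(temp_file.name)
        except OSError:
            pass  # best-effort cleanup, mirroring the guarded unlink


round_tripped, temp_path = read_via_named_temp_file(b"s3 object bytes")
```

Because cleanup lives in the `finally` block, the temporary file is removed whether or not the read succeeds.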
Fixes DLT orphan cleanup leaving forgotten rows in the per-dataset graph and vector stores under ENABLE_BACKEND_ACCESS_CONTROL. When re-ingesting a DLT source (or a document source such as Notion/Slack/Google Drive) after rows were removed upstream, Cognee reconciles the corpus by deleting rows no longer present. Under access control the graph and vector engines are dataset-scoped, but the cleanup ran outside the dataset DB context — most visibly on the background-ingest path, where orphan_cleanup runs before any pipeline establishes that context — so delete_data_nodes_and_edges resolved the default engines and the graph + vector purge silently targeted the wrong database, leaving the forgotten row’s chunks and entities in place and still retrievable (only the relational record was removed). The per-orphan deletion now runs inside set_database_global_context_variables(dataset.id, dataset.owner_id), so the graph, vector, and relational stores are all purged for the correct dataset. Cleanup remains best-effort: partial failures are logged and retried on the next ingest rather than failing the add. No public API signature, configuration option, or environment variable changed (SDK-189, PR #4090).
Fixes two failures in the SQL session-cache backend (CACHE_BACKEND=postgres / sqlite) when ids are UUID-like or when many writers target the same session concurrently. First, cache key columns (user_id, session_id, qa_id, entry_id, log_key, and the KV key) now use a StringKey TypeDecorator that coerces stringable ids such as uuid.UUID to str in the bind processor: the asyncpg dialect renders explicit bind casts, so an id bound as a uuid.UUID made Postgres parse text = uuid and raise 42883 (“operator does not exist”), while SQLite rejected the non-str bind outright — passing a uuid.UUID id (rather than a string) to the adapter could therefore fail to read or write. The decorator normalizes every read and write to the same string regardless of the caller’s type; because DDL is delegated to the underlying Text impl, the emitted column stays plain TEXT and existing tables need no migration. Second, on Postgres each same-session write transaction now takes a transaction-scoped pg_advisory_xact_lock keyed by (table, user_id, session_id) before writing, so concurrent writers to one session queue instead of deadlocking on the sliding-TTL UPDATE (SQLSTATE 40P01); the lock auto-releases at COMMIT/ROLLBACK, delete_session acquires the per-table locks in a fixed order so a multi-table writer can’t cycle with single-table writers, and the whole mechanism is a no-op on SQLite (which serializes writers with its own single-writer lock). The tradeoff is that concurrent writes to the same session may serialize slightly; writes across different sessions are unaffected. No public API signature, configuration option, environment variable, or database migration changed (PR #4182).
Fixes the cognee-mcp client failing or hanging when listing datasets or checking status against Cognee Cloud, and makes API-mode cognify submit a background run instead of blocking. In API mode the client’s list_datasets requested /api/v1/datasets (no trailing slash) and relied on the server’s 307 redirect to the canonical /api/v1/datasets/; the client does not follow redirects, so the call failed with an HTTP error on the redirect response, and the redirect Location could additionally downgrade to http:// against the HTTPS-only edge (see the server-side fix, CLO-320). list_datasets now calls the canonical trailing-slash route /api/v1/datasets/ directly, so no redirect is involved. Separately, the client applied its single client-wide timeout=300.0 to every request, so a hung or black-holed read-only GET froze the caller for a full five minutes; a per-request READ_TIMEOUT_SECONDS = 30.0 is now applied to the dataset list and status GETs, while the 300s client-wide default remains for other requests. READ_TIMEOUT_SECONDS is a hardcoded module constant in cognee-mcp/src/cognee_client.py, not an environment variable or documented setting; the tradeoff is that a valid but very slow (>30s) GET will now time out. Finally, the client’s cognify POST now sends run_in_background: true, so the request submits the pipeline run on the server and returns immediately instead of holding the HTTP request open for the whole run; the MCP cognify tool already returned immediately and directs callers to poll dataset status, which now reflects the server-side background run. No public MCP tool signature, configuration option, or environment variable changed (CLO-322, PR #4184).

v1.4.0

This release bumps the package version from 1.3.0 to 1.4.0 and refreshes uv.lock. The release cut itself introduces no functional code, public API, configuration, or environment-variable changes; the entries below are the accumulated work promoted from the development branch in this release (SDK-197). No migration or action is required for existing integrations.

Highlights

Fixes RAG_COMPLETION and TRIPLET_COMPLETION searches ignoring the node_name filter. The public search() API already accepted node_name (and node_name_filter_operator) to restrict results to specific node sets, but for these two search types the argument was silently dropped: CompletionRetriever and TripletRetriever never received it, so their vector lookups (DocumentChunk_text / Triplet_text) searched the whole collection and returned chunks/triplets from outside the requested node set(s). Both retrievers now accept node_name and node_name_filter_operator, the search-type factory (get_search_type_retriever_instance) forwards them, and they are passed through to the vector search so results are scoped to the given node set(s) using the chosen AND/OR operator. The change is backward-compatible: node_name defaults to None (no filtering), so calls that never set it behave exactly as before, and no public API signature, configuration option, or environment variable changed (COG-5868, PR #4053).
Changes visualize_graph() and GET /api/v1/visualize to render a bounded, relevant subgraph by default instead of the whole graph. The renderer now selects a small set of seed nodes, expands their k-hop neighborhood, and caps the result at max_nodes, keeping renders fast and readable on large graphs. Seeds are resolved by priority — explicit seed_node_ids > a recall() or search result’s graph provenance (recall_result, via used_graph_element_ids) > a query string’s nearest (distance-ranked) vector hits > the graph’s highest-degree nodes as a fallback — so a bare visualize_graph() call still shows a representative view, and “show me the subgraph behind this answer” and query-seeded views are deterministic and capped. New keyword-only parameters were added to visualize_graph(): full, query, seed_node_ids, recall_result, neighborhood_depth (default 2), neighborhood_seed_top_k (default 10), and max_nodes (default 500); when a neighborhood exceeds max_nodes, nodes are kept by hop distance from the seeds and edges survive only when both endpoints do (no dangling edges). To restore the previous whole-graph render, pass full=True (or ?full=true on the endpoint). GET /api/v1/visualize gains matching query params full, query, seed_node_ids, neighborhood_depth, neighborhood_seed_top_k, and max_nodes (recall_result is Python-only). The change is backward-compatible: the new parameters are keyword-only, so existing positional callers keep working; the underlying renderer and shared graph primitives are reused. See Graph Visualization → Bounded subgraph by default (SDK-140, PR #3985).
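The selection rule described above (expand k hops from the seeds, cap at max_nodes, keep nodes by hop distance, drop dangling edges) can be sketched with a plain breadth-first search. This is a generic illustration, not Cognee's renderer: `bounded_subgraph` is a hypothetical function and the graph is a simple list of undirected edge pairs.

```python
from collections import deque


def bounded_subgraph(edges, seeds, depth=2, max_nodes=500):
    """Return (nodes, edges) within `depth` hops of `seeds`, capped at `max_nodes`."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)

    seeds = list(dict.fromkeys(seeds))  # drop duplicate seeds, keep order
    kept = []
    seen = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    # Breadth-first search visits nodes in order of hop distance from the seeds,
    # so stopping at the cap keeps the closest nodes.
    while queue and len(kept) < max_nodes:
        node, hops = queue.popleft()
        kept.append(node)
        if hops == depth:
            continue
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))

    kept_set = set(kept)
    # Keep an edge only when both endpoints survived, so no edge dangles.
    kept_edges = [(s, t) for s, t in edges if s in kept_set and t in kept_set]
    return kept, kept_edges


chain = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "e")]
nodes, surviving_edges = bounded_subgraph(chain, seeds=["a"], depth=2, max_nodes=3)
```

In this example the cap of three nodes keeps the seed and its two direct neighbors, and the edge from `b` to `c` is dropped because `c` did not survive.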
Fixes two cognee-mcp bugs that surfaced on the first remember of a clean direct-mode (stdio) MCP session. First, the session-backed remember() flow now completes the session-to-graph bridge cleanly instead of tripping over dataset setup on the first write. Second, MCP startup migration output no longer pollutes the stdio JSON-RPC channel, so clients do not misread migration chatter as protocol data. The remember() / cognee.remember() signatures are unchanged and no configuration option or environment variable changed; the only user-visible tradeoff is a brief one-time delay on the first remember while the session is initialized and bridged (SDK-192, PR #4091).
Fixes improve() runs failing to persist agent-trace feedback on multi-tenant (ENABLE_BACKEND_ACCESS_CONTROL=true) deployments. The agent-trace-feedback persistence path now forwards the authenticated user into cognee.add() and cognee.cognify(): cognify_agent_trace_feedback accepts a user parameter and passes it to both calls, and persist_agent_trace_feedbacks_in_knowledge_graph_pipeline supplies the pipeline’s user to that enrichment task. Previously these add/cognify calls ran as the default user, which has no write ACL on multi-tenant deployments, so trace persistence raised a 403 PermissionDeniedError and the improve() run showed errored memify-pipeline stages while feedbacks were silently skipped. The user parameter these internal pipelines and tasks already accepted is unchanged, and no public improve()/memify() API signature, configuration option, or environment variable was modified (COG-5893, PR #4097).

v1.3.0

This release bumps the package version from 1.2.2 to 1.3.0 and regenerates the dependency lockfiles (poetry.lock, uv.lock, cognee-mcp/uv.lock). The release cut itself introduces no functional code changes; the entries below are the accumulated work promoted from the development branch in this release. Deployers upgrading should re-lock and reinstall to pick up the refreshed dependency graph.

Highlights

Fixes document classification for text-like and unknown file extensions. classify_documents previously looked up the document class with EXTENSION_TO_DOCUMENT_CLASS[data_item.extension], which raised KeyError during ingestion for extensions the map didn’t cover — including common text formats (md, json, xml, yaml) and uppercase variants such as .PDF or .CSV. The extension is now normalized to lowercase before lookup, md/json/xml/yaml are mapped to TextDocument, and any unrecognized extension falls back to TextDocument instead of crashing the pipeline. Files that previously failed to ingest are now classified and processed. The change is backward-compatible: already-mapped extensions classify exactly as before, and no public API signature, configuration option, or environment variable changed (fixes #3657, PR #3662).
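The lookup change can be illustrated with a dictionary and a default. In this sketch only `TextDocument` comes from the entry above; the map name, the other class labels, and the stripping of a leading dot are hypothetical placeholders rather than Cognee's actual mapping.

```python
# Hypothetical map: values are placeholder labels, not real Cognee classes,
# except "TextDocument", which the entry names as the fallback.
EXTENSION_TO_CLASS_SKETCH = {
    "pdf": "PdfLikeDocument",
    "csv": "CsvLikeDocument",
    "txt": "TextDocument",
    "md": "TextDocument",
    "json": "TextDocument",
    "xml": "TextDocument",
    "yaml": "TextDocument",
}


def classify_extension(extension: str) -> str:
    """Normalize the extension, then fall back to TextDocument when unmapped."""
    normalized = extension.lower().lstrip(".")
    # dict.get with a default replaces the bare indexing that raised KeyError.
    return EXTENSION_TO_CLASS_SKETCH.get(normalized, "TextDocument")


uppercase_pdf = classify_extension(".PDF")
yaml_file = classify_extension("yaml")
unknown_file = classify_extension("xyz")
```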
Fixes the CLI cognify command’s --ontology-file flag, which previously had no effect. The command passed ontology_file_path= to cognee.cognify(), but cognify() accepts only a config object and silently swallowed the unsupported argument through **kwargs, so the ontology was never loaded. The command now translates --ontology-file into the canonical ontology config structure (an rdflib resolver with fuzzy matching, built with the same factory cognify() uses for its env-based fallback), validates up front that every referenced path exists and otherwise raises a clear Ontology file not found: <paths> error, and accepts multiple ontology files as a comma-separated list. Separately, a failed CLI command now always prints its error message: cognee cognify failures raise a CliCommandException whose raiseable_exception field is unset, and the entry point previously printed the message only when that field was set, so the command exited with code 1 but no explanation. Exit codes are unchanged (still 1 on failure), and no new flag, public API signature, or environment variable was introduced (PR #3997).
Improves concurrency and throughput of LanceDB subprocess mode (VECTOR_DB_SUBPROCESS_ENABLED=true) by replacing the session-wide RPC lock with id-based routing. Each async RPC now carries a per-request id and a main-process reader thread routes responses to per-call futures, so concurrent call_async operations run in parallel instead of serializing behind a single lock. A new SUBPROCESS_WORKER_MAX_INFLIGHT environment variable (default 16) bounds how many async operations a worker runs at once; it must be > 0 or worker initialization raises ValueError rather than silently degrading. Failure semantics also change: a per-call timeout or cancellation now resolves only that call and no longer tears down the entire subprocess session — the session ends only on genuine crash/shutdown/respawn events, which propagate a SubprocessTransportError to any still-pending calls. Synchronous calls (such as the Kuzu graph backend) continue to run serially and are unchanged. In internal Locust benchmarks, /api/v1/add average latency dropped from ~1371ms to ~246ms and p95 from ~9300ms to ~520ms, with overall throughput up ~21% (PR #2826).
Fixes propagation of the authenticated user into the memify session- and feedback-persistence pipelines, correcting multi-tenant attribution. The persist_sessions_in_knowledge_graph_pipeline and persist_agent_trace_feedbacks_in_knowledge_graph_pipeline functions now set the session user context (set_session_user_context_variable(user)) before running memify, so persisted sessions and agent-trace feedbacks are recorded against the intended authenticated user instead of a default/missing user. The user parameter these pipelines already accept is unchanged — no public API signature, parameter, or environment variable was modified, and no data migration is required. Logs and stored knowledge-graph entries may now show different (correct) user associations; custom memify pipeline hooks that relied on the previous missing-user behavior should be verified (PR #3950).
Adds optional per-stage LLM model routing so the extraction, summarization, and query stages can each run on a different model or provider. Each stage reads an optional LLM_<STAGE>_* environment group — LLM_EXTRACTION_*, LLM_SUMMARIZATION_*, and LLM_QUERY_*, each accepting MODEL, PROVIDER, ENDPOINT, API_KEY, and API_VERSION — whose set fields override the base LLM_* values for that stage while any unset field falls back to LLM_*. Because extraction runs once per chunk and dominates token spend, a common setup routes a cheaper or local model for extraction while keeping a stronger model for summarization and query-time reasoning (for example LLM_EXTRACTION_MODEL="ollama_chat/llama3.1", LLM_EXTRACTION_PROVIDER="ollama", LLM_EXTRACTION_ENDPOINT="http://localhost:11434"). Under the hood LLMConfig.stage_config(stage) returns a copy of the base config with any stage overrides applied, and a pipeline_stage(stage) context manager sets the existing llm_config ContextVar to that merged config for the duration of the stage; the client cache key is already derived from the context config, so each stage transparently gets its own cached client. This is fully backward-compatible and requires no action for single-model setups: the stage fields default to empty, so with no LLM_<STAGE>_* variables set the effective config is identical to today, and no extraction, summarization, retrieval, or SDK call signature changed. See the “Per-Stage Model Routing” section of LLM Providers for the full field reference and examples (PR #3961).
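The fallback rule (set stage fields override, unset fields inherit from the base) can be sketched as a small merge over an environment mapping. `LLMSettingsSketch` and `stage_settings` are hypothetical names for illustration and are not LLMConfig.stage_config itself; the base values are made up, while the LLM_EXTRACTION_* values are the ones quoted above.

```python
from dataclasses import dataclass, replace

FIELD_NAMES = ("model", "provider", "endpoint", "api_key", "api_version")


@dataclass(frozen=True)
class LLMSettingsSketch:
    model: str = ""
    provider: str = ""
    endpoint: str = ""
    api_key: str = ""
    api_version: str = ""


def stage_settings(base: LLMSettingsSketch, stage: str, env: dict) -> LLMSettingsSketch:
    """Return a copy of `base` with any non-empty LLM_<STAGE>_* values applied."""
    prefix = f"LLM_{stage.upper()}_"
    overrides = {
        name: env[prefix + name.upper()]
        for name in FIELD_NAMES
        if env.get(prefix + name.upper())  # empty or missing means "inherit"
    }
    return replace(base, **overrides)


env = {
    "LLM_EXTRACTION_MODEL": "ollama_chat/llama3.1",
    "LLM_EXTRACTION_PROVIDER": "ollama",
    "LLM_EXTRACTION_ENDPOINT": "http://localhost:11434",
}
base = LLMSettingsSketch(model="base-model", provider="base-provider", api_key="base-key")
extraction = stage_settings(base, "extraction", env)
query = stage_settings(base, "query", env)
```

Here the extraction stage picks up the three overridden fields and inherits the API key, while the query stage, with no LLM_QUERY_* values set, is identical to the base.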
Fixes the missing exception type in error logs. When an exception is logged, setup_logging()’s structlog exception_handler processor records the exception class name in the exception_type field. That field guarded its assignment with hasattr(exc_type, __name__), which used the module’s own __name__ (the string "cognee.shared.logging_utils") rather than the literal "__name__"; since an exception class never has an attribute by that name the check was always false, so exception_type was never added to the log event. The guard now checks hasattr(exc_type, "__name__"), so logged exceptions record their type (e.g. ValueError) alongside the existing exception_message, identifying what failed rather than only that something failed. No public API, configuration option, or environment variable changed (PR #3998).
Fixes the exception_type field being silently omitted from logged exceptions. The custom exception_handler structlog processor in cognee.shared.logging_utils.setup_logging checked hasattr(exc_type, __name__), where the unquoted __name__ resolved to the module’s name rather than the literal attribute name — so the check almost never passed and event_dict["exception_type"] was never set. The argument is now the string "__name__", so log records for exceptions correctly capture the exception class name (exception_type = exc_type.__name__). This only affects the metadata attached to logged exceptions; no public API signature, configuration option, or environment variable changed (fixes #3709, PR #3849).
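The bug is easy to reproduce in isolation. In the sketch below the variable `module_dunder_name` holds the string that the bare name `__name__` evaluates to inside cognee.shared.logging_utils, which makes the example behave the same wherever it is run; the surrounding handler code is omitted.

```python
exc_type = ValueError

# Inside the logging module, the bare name __name__ evaluates to this string.
module_dunder_name = "cognee.shared.logging_utils"

# Buggy guard: asks whether the class has an attribute named after the module.
buggy_guard = hasattr(exc_type, module_dunder_name)

# Fixed guard: the string literal "__name__" names the attribute itself.
fixed_guard = hasattr(exc_type, "__name__")

# With the fixed guard the class name is available for the log event.
exception_type = exc_type.__name__ if fixed_guard else None
```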
Restores a synchronous get_vector_engine() as a deprecated backward-compatibility shim and establishes get_vector_engine_async() as the canonical async accessor for the vector engine. Both are exported from cognee.infrastructure.databases.vector. get_vector_engine() is safe to call synchronously from any context — it does no async work when constructing the engine handle — but it now emits a DeprecationWarning, and the returned adapter’s methods (embed_text, search, get_connection, …) remain coroutines that must be awaited inside a running event loop. Released users who called get_vector_engine() without await are unaffected. Dev users who adopted the unreleased async form await get_vector_engine() should switch to await get_vector_engine_async(), which keeps a uniform “await the engine getter” contract alongside await get_graph_engine() (PR #3967).
Speeds up graph extraction for inputs with many chunks by removing a quadratic scan in extract_graph_from_data. DLT row chunks (whose graph is built deterministically from schema metadata rather than by the LLM) are excluded from the extraction path; previously each chunk was matched against the DLT set with a repeated list-membership check that triggered a full Pydantic __eq__ comparison per pair, so the filter cost scaled with len(data_chunks) × len(dlt_chunks). The function now partitions data_chunks into DLT and non-DLT lists in a single pass and returns integrated + dlt_chunks, making the extraction hot path linear in the number of chunks (a micro-benchmark reports roughly a 9,000× speedup at 4,000 chunks). Outputs are identical and the change is internal to extract_graph_from_data — no public API signature, configuration option, or environment variable changed (fixes #4015, PR #4017).
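A single-pass partition of the kind described above might look like the following. How Cognee identifies a DLT chunk is not stated here, so the `is_dlt` flag and the `ChunkSketch` class are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ChunkSketch:
    text: str
    is_dlt: bool  # assumption: a simple flag marks DLT row chunks


def partition_chunks(data_chunks):
    """Split chunks into (non_dlt, dlt) in one pass, preserving input order."""
    non_dlt, dlt = [], []
    for chunk in data_chunks:
        # One constant-time check per chunk, instead of a list-membership scan
        # that compares the chunk against every DLT chunk.
        (dlt if chunk.is_dlt else non_dlt).append(chunk)
    return non_dlt, dlt


chunks = [
    ChunkSketch("intro", False),
    ChunkSketch("row 1", True),
    ChunkSketch("body", False),
    ChunkSketch("row 2", True),
]
non_dlt_chunks, dlt_chunks = partition_chunks(chunks)
```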
Preserves external ontology IRIs end-to-end and adds an RDF/SPARQL read surface plus RDF ingestion. DataPoint gains an optional ontology_uri field (defaults to None) that carries the external IRI a node is grounded in, threaded through expand_with_nodes_and_edges so persisted nodes keep their identifier instead of collapsing it to a local label. A new read surface (cognee.modules.graph.rdf) exposes the memory graph as RDF: graph_data_to_rdf, export_memory_graph_to_rdf, serialize_memory_graph, and query_memory_graph_sparql. Ungrounded nodes receive minted IRIs under https://cognee.ai/graph/... so the RDF is well-formed, is_a maps to rdf:type (individual→class) or rdfs:subClassOf (class→class), and other relations become predicate IRIs. RDF ingestion (cognee.modules.ontology.rdf_xml.rdf_ingest — ingest_rdf, load_rdf_graph, build_datapoints_from_rdf) parses TBox/ABox into EntityType/Entity datapoints that preserve verbatim IRIs, with identity derived from the IRI so re-ingesting the same RDF is idempotent. The change is backward-compatible: ontology_uri defaults to None, no DB migration is required, and the RDF surface rides on the existing rdflib ontology dependency (ensure rdflib and any parser backends are available in your runtime to use RDF export/ingest). See Ontologies → RDF read/write surface (PR #3928).

v1.2.2

This patch release bumps the package version from 1.2.1 to 1.2.2 and refreshes uv.lock. It introduces truth-subspace retrieval improvements, opt-in feedback weighting, and reliability fixes for S3-backed LanceDB setups.

Highlights

Adds the truth subspace builder, which compiles accepted session learnings into centroids and slots that can be used to align and rerank retrieval results.
Adds opt-in truth-subspace reranking and learned feedback weighting for graph search. The default influence remains 0.0; enable it with DEFAULT_FEEDBACK_INFLUENCE or per-call feedback_influence values.
Adds build_truth_subspace to the Improve API so truth-subspace indexes can be rebuilt as part of the enrichment flow.
Tracks the active dataset through request-local context so retrieval and background tasks can keep dataset-scoped truth state aligned.
Fixes LanceDB dataset provisioning for S3-backed system roots by avoiding direct local directory creation for S3 paths.
Adds demos and tests for truth-subspace building, reranking, feedback influence, and graph truth-state persistence.

Other fixes

Removes the Sentry and Langfuse third-party observability integrations while keeping the OpenTelemetry tracing layer intact. The Observer.LANGFUSE enum value, the Langfuse branch in get_observe(), and the LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY / LANGFUSE_HOST configuration fields and environment variables are gone, and Sentry initialization is dropped from the API. The @observe decorator now maps only to OpenTelemetry — it emits an OTEL span when tracing is enabled (COGNEE_TRACING_ENABLED=true) and is a no-op otherwise — so existing @observe usage keeps working unchanged with no call-site edits. Packaging (breaking): the monitoring extra is removed in favor of the existing tracing extra, which installs only the OpenTelemetry API/SDK and OTLP exporters. Migrate with pip install cognee[tracing] in place of pip install cognee[monitoring]. Users who relied on Sentry or Langfuse should switch to an OTLP-compatible backend; configure it via OTEL_EXPORTER_OTLP_ENDPOINT and related OTEL_* variables (see OpenTelemetry Tracing). Lockfiles (uv.lock / poetry.lock) that still reference the removed sentry-sdk / langfuse packages should be regenerated.
Fixes a crash when running a pipeline in the background (run_in_background=True) with no explicit datasets. The background runner (run_pipeline_as_background_process) now reads the effective user from the run’s params first and only falls back to the default user when none was supplied, then resolves the run across all datasets that user has write access to. Previously user was bound only on the fallback path, so the usual case — a user passed in params — left user unassigned and raised UnboundLocalError: cannot access local variable 'user' before the run started. No API or CLI changes are required.
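The error pattern is a local variable bound on only one branch. The two functions below are hypothetical reductions of that pattern, not the code of run_pipeline_as_background_process.

```python
def resolve_user_buggy(params: dict, default_user: str) -> str:
    if "user" not in params:
        user = default_user  # bound only on the fallback path
    return user  # raises UnboundLocalError when params supplied a user


def resolve_user_fixed(params: dict, default_user: str) -> str:
    user = params.get("user")  # read the effective user from params first
    if user is None:
        user = default_user  # fall back only when none was supplied
    return user


try:
    resolve_user_buggy({"user": "alice"}, "default")
    buggy_error = None
except UnboundLocalError as exc:
    buggy_error = exc

supplied_user = resolve_user_fixed({"user": "alice"}, "default")
fallback_user = resolve_user_fixed({}, "default")
```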
Fixes a lock-starvation bug in single-session improve(). When improve() is called with one session_id, it holds a per-session lock so concurrent auto-improve, idle-watcher, and SessionEnd runs serialize instead of duplicating work. Previously the lock was released only after a successful run (and on early stage 1–2 failures), so an exception in a later stage — default enrichment (memify), the global context index, or the graph-to-session sync — left the lock held permanently, and every subsequent improve() for that session silently returned {} until the process restarted. All stages are now wrapped in a single try/finally, so the session lock is always released on exit regardless of which stage fails. No public signature, parameter, return type, configuration option, or environment variable changed. The fix prevents the issue from recurring; sessions already stuck from before the upgrade still need a process restart to clear the held lock (closes #3313, PR #3317).
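The effect of wrapping every stage in one try/finally can be sketched as follows. This is a simplified illustration: `improve_sketch` is hypothetical, a `threading.Lock` stands in for whatever per-session lock Cognee uses, and the stages are reduced to a single flag.

```python
import threading

session_lock = threading.Lock()  # assumption: stand-in for the per-session lock


def improve_sketch(fail_in_late_stage: bool) -> dict:
    if not session_lock.acquire(blocking=False):
        return {}  # another run holds the lock, so this call is skipped
    try:
        # All stages run inside the same try block.
        if fail_in_late_stage:
            raise RuntimeError("late stage failed")
        return {"status": "ok"}
    finally:
        session_lock.release()  # always released, whichever stage fails


try:
    improve_sketch(fail_in_late_stage=True)
except RuntimeError:
    pass  # the failure still propagates to the caller

result_after_failure = improve_sketch(fail_in_late_stage=False)
```

Without the `finally`, the failed first call would leave the lock held and the second call would return `{}`.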
Fixes a CLI startup crash on a fresh, uninitialized database (for example a new Postgres) when no --user-id is passed. When resolve_cli_user() resolves the default user, it now catches DatabaseNotCreatedError, runs the database migrations to create the schema, and retries — so the command proceeds with the default user instead of failing on first run. The recovery is automatic when resolving the default user, including omitted --user-id and non-strict fallback-to-default paths; normal calls against an already-initialized database are unaffected and incur no extra overhead. No new flag, configuration option, or environment variable is introduced (fixes #3267, PR #3308).
Allows one automatic retry on structured-output (instructor) calls in the generic LLM API adapter (LLM_PROVIDER="custom" and other generic OpenAI-compatible providers). The adapter’s acreate_structured_output now passes max_retries=2 to instructor on both the primary and the content-policy fallback request, where it previously allowed no retry. When the model returns output that fails instructor’s schema parsing or validation, instructor reissues the request once before surfacing an InstructorRetryException. The user-visible effect is fewer transient structured-output failures, at the cost of a slight latency increase on the rare request that is retried. No public API signature, configuration option, or environment variable changed, and the existing retry-warning logs are unchanged (PR #3413).

v1.2.1

View on GitHub Patch release that bumps the package version from 1.2.0 to 1.2.1 and refreshes uv.lock. This release follows v1.2.0 with targeted reliability fixes for dataset-scoped ingestion, background task lifetime, and dataset helper authorization.

Highlights

Fixes remember(..., dataset_id=...) so it now forwards dataset_id to add(). Previously dataset_id was used only to build the cognify() target while add() silently ingested raw data into the default main_dataset, so cognify() ran on the intended (but empty) dataset and produced no new graph. Ingestion and graph building now target the same dataset. No API or migration changes are required; callers who passed dataset_id and saw missing results should upgrade.
Anchors fire-and-forget background tasks so Python’s garbage collector can no longer abort them mid-run. Background syncs (cognee.api.v1.sync.sync.sync) and background pipeline runs (run_pipeline_as_background_process) now hold a strong reference to their in-flight asyncio.Task in a module-level set (_BACKGROUND_SYNC_TASKS / _BACKGROUND_PIPELINE_TASKS) until the task finishes, with a done-callback that discards the reference on completion. Previously the event loop kept only a weak reference, so the GC could collect a still-running task and silently abort a background sync or pipeline run. This fixes those intermittent silent aborts; the only side effect is a small, transient increase in retained memory while tasks run (released as each task completes). No public API signature, request/response schema, configuration option, or environment variable changed.
Fixes cognee.datasets.has_data() raising AttributeError. The method now forwards the full User object to its internal authorization helper instead of user.id, so calls succeed and return the expected bool. The signature, parameters, and behavior are otherwise unchanged.
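The background-task anchoring described above is a standard asyncio idiom: keep a strong reference to each in-flight task and drop it from a done-callback. A minimal sketch follows; _BACKGROUND_TASKS and spawn_background are hypothetical stand-ins for the module-level sets named in the entry, not Cognee's actual code.

```python
import asyncio

# Hypothetical module-level registry; the changelog names the real ones
# _BACKGROUND_SYNC_TASKS and _BACKGROUND_PIPELINE_TASKS.
_BACKGROUND_TASKS: set[asyncio.Task] = set()


def spawn_background(coro) -> asyncio.Task:
    """Schedule a coroutine and hold a strong reference until it finishes."""
    task = asyncio.create_task(coro)
    _BACKGROUND_TASKS.add(task)
    # Discard the reference on completion so the set does not grow without bound.
    task.add_done_callback(_BACKGROUND_TASKS.discard)
    return task


async def _work(results: list) -> None:
    await asyncio.sleep(0)
    results.append("finished")


async def demo() -> tuple[int, int, list]:
    results: list = []
    task = spawn_background(_work(results))
    held_while_running = len(_BACKGROUND_TASKS)
    await task
    await asyncio.sleep(0)  # give the done-callback a chance to run
    return held_while_running, len(_BACKGROUND_TASKS), results
```

The retained memory is limited to tasks that are still running, which matches the transient increase the entry mentions.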

v1.2.0

View on GitHub Release that promotes accumulated dev work after the v1.2.0 development builds, bumping the package version to 1.2.0 and refreshing the lockfile (uv.lock). Highlights include ChromaDB search enhancements, BM25 lexical chunk search, search-answer reference evidence, session-context guidance enabled by default, and a range of Postgres/Neptune adapter, visualization, logging, and memory-stability fixes.

Highlights

Adds ChromaDB vector search support for include_payload=False, so callers can omit metadata payloads from returned ScoredResult values when they only need ids and scores.
Adds ChromaDB node_name filtering for search() and batch_search(), including OR and AND semantics through node_name_filter_operator.
Prevents Entity and EntityType node id collisions by namespacing generated ids by node category.
Excludes internal EntityType taxonomy nodes and their is_a edges from the schema inventory output (get_schema_inventory and the visualize schema inventory endpoint). Consumers no longer receive a separate EntityType type group or is_a relationship aggregates; entity instances are still grouped under their resolved semantic type.
Improves ontology parsing for file-like inputs with filename/content-type detection, RDFLib fallback formats, and clearer initialization errors when parsing fails.
Reaps subprocess database workers deterministically at interpreter exit. The cognee_db_workers harness now registers an atexit handler that force-terminates any still-live LanceDB/Kuzu worker processes on shutdown, instead of relying on garbage-collector and __del__ ordering that is not guaranteed to run at interpreter exit (notably for Windows spawn daemon workers). This helps avoid leftover worker processes and shutdown hangs when running with graph_database_subprocess_enabled=true or vector_db_subprocess_enabled=true.
Moves the Ollama adapter’s blocking client calls off the asyncio event loop. The LLM_PROVIDER="ollama" adapter wraps a synchronous OpenAI-compatible client, so its chat-completion, audio-transcription, and image/vision calls previously ran inline and blocked the running event loop for the full duration of each Ollama request, serializing concurrent async callers (for example the per-chunk extraction that cognify() fans out). These calls are now dispatched through asyncio.to_thread, so they execute in worker threads and no longer stall the loop. Public async signatures are unchanged and no configuration or migration changes are required; because requests now run in worker threads, any objects shared with the Ollama client should be thread-safe.
Serializes concurrent decodes on the shared llama.cpp local in-process model. The LLM_PROVIDER="llama_cpp" local (in-process) adapter now guards calls into its single llama_cpp.Llama instance with a lock, so the per-chunk extraction that cognify() fans out via asyncio.gather/asyncio.to_thread no longer decodes on the same non-thread-safe instance concurrently. This helps avoid native GGML_ASSERT crashes from corrupted KV-cache/logits state during local llama.cpp runs; in-process requests are now processed one at a time (use server mode for parallel decoding).
Simplifies the structured-output schema sent to the LLM during graph extraction when a custom graph_model (a DataPoint subclass) is used. extract_content_graph now converts the model to a plain BaseModel that keeps only the fields you declare on each subclass — DataPoint infrastructure fields (such as id, created_at, version, type, belongs_to_set) and the metadata field are dropped from the schema the LLM is asked to fill — and then rehydrates the LLM result back into your original DataPoint model via model_validate. The LLM extracts only your domain fields, while declared metadata defaults (for example {"index_fields": ["name"]}) are preserved on the rehydrated objects, so indexing behavior is unchanged. This is a no-action change for callers of the high-level extraction, cognify, and remember APIs.
Fixes Neptune (GRAPH_DATABASE_PROVIDER="neptune") edge writes for relationship types that contain spaces, hyphens, or openCypher reserved words. The adapter now backtick-quotes (and escapes embedded backticks in) the relationship type when interpolating it into the generated openCypher MERGE statements for both single-edge and batched (UNWIND) edge upserts, preventing query syntax errors and unsafe interpolation. Also fixes the batched-edge fallback path so that when a batch insert fails, the per-edge retry iterates the edges for that relationship instead of the relationship grouping map. No configuration or migration changes are required, but generated/logged openCypher will now show backtick-quoted relationship-type names.
Reuses a single aiohttp.ClientSession across anonymous telemetry requests instead of opening a new session per call. This avoids a repeated DNS + TCP + TLS handshake to the telemetry endpoint on every event, helping lower latency and connection churn for telemetry. The shared session is created lazily inside the running event loop and rebuilt transparently when the loop changes (for example across tests or asyncio.run boundaries) or after it is closed; telemetry stays best-effort and never raises. No new configuration is required, and telemetry can still be turned off with TELEMETRY_DISABLED=true.
Tolerates a missing dataset_database table on PostgreSQL during startup migrations and pruning. The run_startup_migrations() vector step and the graph/vector prune routines now also catch the asyncpg ProgrammingError / UndefinedTableError, in addition to the SQLite OperationalError already handled. Running against a fresh PostgreSQL/pgvector database (for example the pgvector example) now skips the step with a warning instead of crashing with an undefined-table error.
Reduces peak memory use of the Postgres graph adapter (GRAPH_DATABASE_PROVIDER="postgres") for graph node and edge relational upserts. add_nodes/add_edges now stream each batch to Postgres in fixed-size chunks (1000 rows per INSERT ... ON CONFLICT statement) instead of compiling one large multi-thousand-row statement, and JSONB property columns are serialized once at execute time via an engine-level json_serializer (the UUID/datetime-aware JSONEncoder) rather than a per-row json.loads(json.dumps(...)) round-trip. This helps avoid the transient allocation churn and memory spikes seen on large single-batch writes (the commit reports roughly a 20x reduction). The number of rows written, the upsert/conflict semantics, and the data stored are unchanged; no configuration or migration changes are required.
Switches CHUNKS_LEXICAL search to BM25 ranking. Lexical chunk searches now rank exact-term matches with BM25 instead of the previous Jaccard-style scorer, and the retriever filters default stop words unless explicitly configured otherwise. API signatures stay the same, but result ordering can change for SearchType.CHUNKS_LEXICAL.
Adds lightweight references (Evidence) to completion-style search answers via a new include_references flag (default true) on search(), recall(), and the POST /api/v1/search and POST /api/v1/recall request bodies. When enabled, a deterministic Evidence: block is appended to the answer text, assembled in-process (no extra LLM call) from the retrieved chunk payloads, falling back to entity → chunk → document graph traversal when chunk metadata is missing. The response schema and return types are unchanged — Evidence is added to the answer text only. Because this changes default answer text, snapshot and evaluation baselines will diff; set include_references=False to restore the exact prior output. Older indexes lacking the new document_id/document_name chunk fields use the graph fallback where available or omit the Evidence block silently.
Disables local-variable rendering in logged exception tracebacks. setup_logging() now configures the console renderer with RichTracebackFormatter(show_locals=False), so when an exception is logged the traceback no longer expands each frame’s local variables. In the retrieval/search path those locals can hold graph objects carrying embedding vectors and deep node/edge references, and rendering them recursively spiked memory to multiple GB and OOM-killed the process (notably in CI) whenever an exception was logged mid-search. Tracebacks themselves are still logged; only the per-frame locals dump is omitted. No configuration changes are required.
Restores Cognee’s safe uncaught-exception hook. setup_logging() now installs sys.excepthook so non-KeyboardInterrupt exceptions are logged through structlog before Python’s default traceback is printed, and falls back to plain traceback output if rich rendering itself fails. No configuration changes are required.
Propagates relational DATABASE_CONNECT_ARGS SSL settings to the Postgres maintenance, PGVector, and graph Postgres engines, so connections to managed Postgres that enforce SSL (for example Neon, RDS/Aurora, Azure Database for PostgreSQL) succeed. Previously only the main relational engine received these args, so CREATE/DROP DATABASE maintenance, per-dataset PGVector engines, and the GRAPH_DATABASE_PROVIDER="postgres" graph engine could fail with missing-SSL errors. The maintenance engine also maps the libpq sslmode key to the asyncpg ssl key, and rewrites a Neon -pooler. host to its direct endpoint because CREATE/DROP DATABASE cannot run through Neon’s PgBouncer pooler. No configuration schema change is required: supply asyncpg SSL options via the existing DATABASE_CONNECT_ARGS and they are now honored across Cognee’s Postgres engines. Leaving DATABASE_CONNECT_ARGS unset remains a no-op for in-cluster Postgres. Deployments on managed Postgres with enforced SSL should upgrade.
Fixes two interaction glitches in the graph visualization (visualize_graph) story view. Clicking a node no longer displaces it: the click-vs-drag threshold is raised from 3px to 6px so trackpad jitter on a plain click is no longer treated as a drag that reheated the force simulation and sent the clicked node (and, in the Force and Flow layouts, the whole layout) flying off the canvas. In the Story layout, dragging now drives the node position directly instead of reheating the pinned grid, and a released node snaps back cleanly to its lane. Separately, the pipeline stage-header pills (shown in the Story and Flow layouts) are now drawn in a final pass after edges, nodes, and labels, so a dense graph panned toward the top of the viewport can no longer paint over them. Generated visualization HTML changes only; no API, configuration, or migration changes are required.
Retries the Kuzu/Ladybug JSON extension load on the live connection when it is missing at runtime. In the subprocess graph worker (graph_database_subprocess_enabled=true), if LOAD EXTENSION fails with a “not been installed” error, the worker now runs INSTALL on the active connection and retries the load once; if that INSTALL fails it raises with the real underlying cause instead of the generic load error. This recovers from cases where the best-effort warm-up install on the throwaway database did not complete (for example a transient network error while downloading the extension on a fresh machine). The warm-up install path also now logs its failure cause to stderr ([ladybug worker] warm-up INSTALL JSON failed: ...) instead of swallowing it silently, so these conditions are diagnosable from worker/CI logs. The retry may perform an extension install on the live connection and add a small startup delay; no configuration or migration changes are required.
Relaxes the bundled Ladybug graph-store dependency from the ladybug==0.16.0 pin to ladybug>=0.16.0,<0.18, so installs can pick up the 0.17.x line. The database migration worker’s storage-version table now maps the 0.17 on-disk format (catalog code 41) to 0.17.1, so an existing 0.16.x/0.17.x graph database is recognized as current and is not flagged for legacy migration (migration still targets only pre-0.15.0 databases). No manual database schema changes are required; deployers upgrading should re-lock dependencies (refresh uv.lock) and redeploy database workers to pick up the new range.
Adds a session-context guidance layer and turns it on by default. The cache AUTO_FEEDBACK setting now defaults to true (previously false), so when CACHING is enabled, session-capable completion searches run one additional structured-output LLM call per answered turn under the resolved session (default_session when session_id is omitted) to analyze the current turn against the previous one. The analysis can rewrite the turn into an effective query used for retrieval, accumulate durable per-session guidance grouped into goals, rules, preferences, and lessons_learned that can be injected into later answers, and gate a follow-up turn — returning a short acknowledgement (the analysis reply, or "Got it.") instead of running retrieval and completion. The step fails open to answering the original query when analysis errors or no session is available. Because guidance and the effective query can change retrieval inputs, session answers and turn gating may differ from history-only sessions, and per-turn latency and token usage increase. Set AUTO_FEEDBACK=false to disable and restore plain conversation-history replay. See Sessions and Caching.
Bypasses the Instructor structured-output pipeline when acreate_structured_output is called with response_model=str on the default OpenAI, generic, and Ollama LLM adapters. Plain-text requests are now sent directly to the provider and the model’s raw string content is returned, instead of being wrapped in Instructor’s JSON/tool-call schema. This avoids repeated parse failures and retry storms on local llama.cpp-compatible servers that don’t honor those schemas, and can lower latency for string completions. Rate limiting still applies to these direct calls. Passing a Pydantic model is unchanged — it still returns a validated model instance — so this is a no-action change for callers.
Fixes the condition that gates name-to-UUID resolution of the datasets argument in search(). The check was wrapped in a single-element list ([all(...)]), which is always truthy, so the name-resolution path ran for any non-None datasets value. It now runs only when every entry in datasets is a string. Passing dataset names (the documented usage) is unaffected; the only behavior change is that non-string entries supplied through datasets (for example already-resolved UUIDs) are no longer forced through name-based authorization lookup — pass UUIDs via dataset_ids as before. No API signature, default, or migration change is required.
Corrects two .env.template knob names that the config loader was ignoring. The template previously listed LLM_MAX_TOKENS and EMBEDDING_MAX_TOKENS, but Cognee’s settings classes read these values from LLM_MAX_COMPLETION_TOKENS (default 16384) and EMBEDDING_MAX_COMPLETION_TOKENS (default 8191). Anyone who copied the old template and set the chunk-sizing limits under the previous names had them silently ignored, so chunk sizing fell back to the defaults. If you relied on those entries, rename them to LLM_MAX_COMPLETION_TOKENS / EMBEDDING_MAX_COMPLETION_TOKENS in your .env. No code or schema changes are required. The same .env.template update also documents additional already-supported settings (LLM/embedding tuning and rate limiting, chunking, session cache, graph/vector connection and subprocess tuning, Langfuse monitoring, llama.cpp, and the auth-token secrets) as commented examples with their defaults.
Fixes an UnboundLocalError in file metadata extraction (get_file_metadata) when the underlying file-like object cannot seek. content_hash is now initialized before the seek/hash attempt, so when file.seek(0) raises io.UnsupportedOperation (the error is still logged), metadata is returned with an empty content_hash instead of crashing. This makes ingestion more robust for non-seekable file-like inputs; no configuration or API changes are required.
Preserves structured search completions in the result payload. SearchResultPayload.completion now models a single dict, a Pydantic BaseModel, and a list of models in addition to the previous str / list-of-string / list-of-dict shapes, and a custom serializer dumps model instances to their dict representation. This fixes searches that pass a non-string response_model (typed LLM output via retriever_specific_config): the structured object is kept as-is instead of being dropped or coerced into an empty model. The default string-answer path is unchanged; no configuration or migration changes are required.
Caps LLM retries and recovers from over-length embedding input. The structured-output and transcription adapters (anthropic, azure_openai, gemini, generic_llm_api, llama_cpp, mistral, ollama, openai) now stop on a fixed number of attempts (stop_after_attempt) instead of the previous time-based stop_after_delay(128) window, and the instructor retry counts were lowered (for example structured-output generation tops out at 3–4 attempts, while Bedrock’s and the other adapters’ inner instructor max_retries drop to 1–2). This makes transient failures fail faster and at lower cost, with a small reduction in resilience to intermittent errors — watch your LLM error/latency/cost metrics after upgrading. Separately, LiteLLMEmbeddingEngine now recovers from over-length embedding input: a context-window error or a 400 BadRequestError matching maximum input length triggers recursive split-and-pool (splitting the batch, or splitting a single string into overlapping halves and averaging the resulting vectors) instead of failing, while other 400 errors still fail fast. No API or configuration changes are required. See Embedding Providers → Timeout and Retry Behavior.
Bounds the input-data preview persisted in the pipeline_runs.run_info column so a single run cannot grow the table without limit. On pipeline run start, error, and completion, the audit-only run_info data is summarized: a list of Data records is still reduced to their IDs and empty input is still recorded as "None", but any other payload is now stringified and truncated to a 512-character preview ending with ... [truncated, <N> chars total] instead of being stored verbatim. run_info is never read back during processing; persist large raw inputs (for example text passed to add()/cognify()) elsewhere if you need the full payload. No configuration or migration changes are required.
Fixes forget(everything=True) under multi-tenant per-dataset database isolation (ENABLE_BACKEND_ACCESS_CONTROL=true). The everything branch no longer runs inside a single-dataset database context; the per-dataset context is now established per dataset inside the underlying delete-all flow. Previously, entering a single-dataset context with no dataset reference could try to create a dataset_database row for a non-existent dataset and fail the operation. Single-dataset, single-item, and memory_only modes are unchanged, and the public forget() signature, return shapes, and error messages are unchanged.
Forwards FALLBACK_ENDPOINT to the OpenAI adapter’s content-policy fallback request (LLM_PROVIDER="openai"). Previously this api_base override was not applied, so the fallback completion always went to the default OpenAI endpoint even when FALLBACK_ENDPOINT was set; now the fallback request is routed to the configured base URL. Deployments that set FALLBACK_ENDPOINT to an OpenAI-compatible proxy or alternate endpoint will see their fallback traffic go there. FALLBACK_ENDPOINT remains optional for openai — when unset, the fallback still uses the default OpenAI endpoint.
Caps the instructor dependency at <1.15.3 (previously <2.0.0) and lowers the litellm minimum to >=1.83.7 (previously >=1.84.0). This pins structured-output extraction to a known-good instructor range and widens the compatible litellm window; lockfiles (poetry.lock, uv.lock) are refreshed to match. No API or behavioral changes — callers using the high-level cognify/search APIs are unaffected. Developers and deployers should re-lock and reinstall dependencies to pick up the new constraints.
Detects Markdown, JSON, XML, and YAML files by extension during file-type guessing. guess_file_type now returns deterministic types for .md/.markdown (text/markdown), .json (application/json), .xml (application/xml), and .yaml/.yml (application/yaml) instead of relying on content-based detection, which has no magic-number signature for these formats and fell back to text/plain/txt. The recorded file metadata (mime_type and extension) for these files now reflects their actual format. Loader selection is unchanged — TextLoader already handled these extensions — so no action or migration is required.
Adds a GET /api/v1/proposals/{proposal_id} endpoint for reviewing a stored skill-improvement proposal before applying it. The endpoint takes a required dataset_id query parameter and returns the proposal’s status (proposed/applied), confidence, rationale, model_name, and before/after procedures (old_procedure/proposed_procedure); it is read-only and never mutates the graph (applying still goes through POST /api/v1/remember/entry with skill_improvement). It returns 403 when the caller is not authorized for the dataset and 404 when the proposal is not found.
Adds no-code, inline skill ingestion. POST /api/v1/remember (with content_type=skills) now accepts skills_text (a SKILL.md markdown body as a string) and skill_name (the skill name/slug, defaults to skill) form fields, so a skill can be ingested without uploading a file — when skills_text is set and no files are uploaded, it is written to a temporary SKILL.md and ingested through the existing skills pipeline. A new POST /api/v1/skills endpoint exposes the same inline ingestion via a JSON body (skills_text, optional skill_name, and one of dataset_name/dataset_id).
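The search() datasets fix listed above comes down to a Python truthiness detail: a non-empty list is truthy even when its only element is False. The sketch below illustrates that detail under assumed names; both functions are hypothetical and do not reproduce Cognee's actual condition.

```python
def should_resolve_names_buggy(datasets) -> bool:
    # [all(...)] is a one-element list, so it is truthy whatever all() returns.
    return bool(datasets is not None and [all(isinstance(d, str) for d in datasets)])


def should_resolve_names_fixed(datasets) -> bool:
    # Name resolution applies only when every entry is a string.
    return datasets is not None and all(isinstance(d, str) for d in datasets)
```

With a mixed list such as ["main", 42], the first function still selects the name-resolution path while the second does not; for a list of names both agree.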

Notes

Includes a behavior-preserving cleanup of the LiteLLM embedding engine (LiteLLMEmbeddingEngine): no public __init__ signature, env-var (MOCK_EMBEDDING, EMBEDDING_ENDPOINT), or default changes, and embedding behavior is unchanged.
Deployers upgrading should re-lock dependencies (refresh uv.lock) and reinstall, then rebuild/redeploy to pick up the updated dependency set.
The ontology parser update improves file-like parsing behavior; upload endpoint format restrictions should be documented separately if they change.

v1.1.3

View on GitHub Patch release focused on API-mode robustness and dependency safety. It enables remote pipeline status checks for MCP/API deployments, improves vector retrieval behavior for empty input, and tightens the instructor dependency range.

Highlights

Enables cognify_status in API mode. The MCP can resolve dataset IDs remotely and read pipeline status from GET /api/v1/datasets/status, so self-hosted API deployments can check background pipeline status without local database access.
Adds API-mode support to CogneeClient.get_pipeline_status, which now queries the server’s /api/v1/datasets/status endpoint instead of raising NotImplementedError.
Makes LanceDB retrieval return an empty list when called with an empty id list, preventing avoidable errors for callers that sometimes have no vector ids to fetch.
Pins instructor below 1.15.3 and refreshes lock metadata. Deployers with exact dependency pins should re-lock or reinstall against the updated constraints.
Refreshes the README with clearer Cognee positioning, branding, and a research paper link.

v1.1.2

View on GitHub Patch release with a refreshed public frontend, improved Cloud UI workflows, and a Postgres graph adapter compatibility fix for asyncpg/PostgreSQL 16.

Highlights

Syncs the public frontend with the SaaS application, bringing updated dashboard, search, dataset, connection, onboarding, knowledge graph, and graph model editor experiences.
Adds conversation-based search history and refreshed multi-dataset search flows in the frontend.
Improves connection and onboarding flows with a connection modal, step-by-step agent setup guidance, new quickstart assets, and updated loading visuals.
Adds memory customization UI support for datasets, including graph models, custom prompts, and ontology-related configuration.
Fixes Postgres graph neighborhood expansion under asyncpg/PostgreSQL 16 by casting recursive CTE seed parameters to text[].

Notable Changes

Bumps the package version from 1.1.1 to 1.1.2 and refreshes lockfiles.
Aligns frontend API routes and local development behavior with the OSS backend.
Updates API key, tenant, configuration, dataset, ingestion, ontology, search-history, session, analytics, and user frontend modules.
Adds frontend assets for quickstarts, agent integrations, loading states, and graph previews.
Adds regression coverage for Postgres graph neighborhood seed array typing and retries a flaky usage-logger e2e path in CI.

Fixes and Improvements

Postgres neighborhood query parameter typing: The Postgres graph adapter’s get_neighborhood query now casts the seed parameter to text[] (unnest(CAST(:seeds AS text[]))) in its recursive CTE seed row. Deployments using GRAPH_DATABASE_PROVIDER=postgres with asyncpg/PostgreSQL 16 should no longer hit parameter type inference errors when expanding neighbors from seed node ids.
Cloud UI refresh: Dashboard, dataset, dataset detail, connections, search, onboarding, knowledge graph, and graph model editor screens were refreshed and aligned with current Cloud workflows.
Search and dataset workflows: Search now supports conversation history and multi-dataset recall flows, while dataset pages add improved status polling, graph access, and memory customization entry points.
Connect Agent flow: The frontend adds clearer connection setup prompts, modal-based setup guidance, and integration visual assets.
Frontend resilience: Error handling, loading states, analytics logging, tenant context, user configuration, and local fetch behavior were updated across the public frontend.

v1.1.1

View on GitHub Patch release that promotes accumulated dev work after v1.1.1.dev0, with agent-management APIs, graph visualization updates, custom graph-model support in remember, and backend stability fixes.

Highlights

Adds agent management and connection endpoints for listing, creating, inspecting, registering, unregistering, and deleting agents and their active connections.
Reworks graph visualization with a pipeline-aware Story layout, Schema view, improved labels, legends, and modular visualization components.
Adds graph_model support to the remember REST endpoint, letting API callers pass a JSON-serialized graph schema into ingestion.
Expands graph and retrieval behavior with local Neo4j dataset handling, global context graph bucketing, improved edge text, and node_name filtering for chunk retrieval.
Improves LLM, PGVector, remember/session, prune, forget, and graph-projection error handling.

Notable Changes

Bumps the package version from 1.1.0 to 1.1.1 and refreshes the release lockfiles.
Splits agent lifecycle and connection handling into dedicated modules and API routes, including persisted agent connection state and agent-session names.
Adds SDK/API support for retrieving specific agent configuration and for inspecting current agent connections.
Adds local Neo4j dataset database handling and updates graph database selection to recognize that handler.
Reworks global context index internals with graph bucketing, scoring, build, update, load, summarize, and persistence flows.
Improves edge indexing and rendering by preserving natural edge descriptions, generating fallback edge text from metadata, and rendering relationship labels inside edge markup.
Updates CI and test coverage across database adapters, agents, visualization, global context indexing, retrieval filters, and LLM configuration.

Fixes and Improvements

Remember custom graph models: The remember REST endpoint now accepts an optional graph_model form field, parses the JSON schema into a graph model, and forwards it into the ingestion flow.
Agent lifecycle and connections: Agent endpoints now separate agent resources from agent connections, support agent-session names, persist connection metadata, mark unregistering agents inactive, and expose connection detail.
Graph visualization: Story view spacing, column pinning, schema rendering, edge-label rendering, and fallback labeling were improved so generated graph views are easier to inspect.
Graph ingestion and retrieval: Edges with unprojectable endpoints are skipped instead of failing graph projection, KnowledgeGraph subclasses follow the knowledge-graph integration path, chunk retrieval receives node_name filters, and forget can handle dataset values that are string UUIDs.
PGVector metadata consistency: create_collection now reflects SQLAlchemy metadata only after the table-creation transaction commits, avoiding stale metadata entries when table creation rolls back.
LLM adapters: Generic LLM API transcription and Ollama image transcription now raise clear ValueError messages for empty responses, Mistral guards against None messages before reading content, and OpenAI instructor mode is honored.
Session remember routing: remember(session_id=...) now routes through the JSON /entry endpoint in API mode, and using custom_prompt with session_id raises a clear ValueError.
Operational stability: Prune errors and dataset lookup issues are handled more defensively, brittle batch-query test settings were adjusted, and optional LLM configuration can be passed through CI.
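
The string-UUID handling mentioned for forget can be pictured with a small sketch. This is an illustration only, not Cognee's implementation: normalize_dataset_id is a hypothetical helper, and the assumption is that a dataset argument may arrive as a UUID object, a UUID-formatted string, or a plain dataset name.

```python
from typing import Union
from uuid import UUID


def normalize_dataset_id(dataset: Union[str, UUID]) -> Union[str, UUID]:
    """Return a UUID when `dataset` is a UUID or a UUID-formatted string.

    Any other string is returned unchanged and treated as a dataset name.
    Hypothetical helper for illustration; not part of the Cognee API.
    """
    if isinstance(dataset, UUID):
        return dataset
    try:
        return UUID(dataset)
    except ValueError:
        # Not UUID-formatted, so assume the caller passed a dataset name.
        return dataset
```

Note that uuid.UUID also accepts 32 hexadecimal characters without hyphens, so a dataset name of that exact shape would be interpreted as an ID under this sketch.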

v1.1.0.dev1

View on GitHub Developer preview release on the way to v1.1.0. This release includes API, retrieval, permissions, storage-runtime, and backend consistency changes.

Highlights

Adds database subprocess workers for LanceDB and Kuzu so native database work can run outside the main Cognee process. The wheel now includes the cognee_db_workers package.
Exposes more ingestion controls through the public API and remote client paths, including chunk sizing and background execution options for remember() and cognify().
Adds dataset_ids support to recall(), making shared-dataset retrieval more reliable when dataset names are not owned by the calling user.
Expands permission management with DELETE endpoints for dataset permissions, roles, and user-role membership.
Improves session visibility so parent users can see sessions created by child-agent users where appropriate.

Notable Changes

Adds graph_database_subprocess_enabled and vector_db_subprocess_enabled configuration, plus Kuzu tuning variables for threads, buffer pool size, and max DB size.
Keeps belongs_to_set metadata consistent across dataset deletion and shared-node/vector upserts in LanceDB, PGVector, and Neo4j paths.
Adds include_payload behavior to Neptune Analytics vector search.
Improves Postgres hybrid batching by respecting embedding-engine batch size.
Improves infer-schema text sampling and prompting.
Rewrites the examples README into a fuller index and adds performance-testing support with Locust.
Deprecates .env.example as the canonical template in favor of .env.template.
Bumps the package version from 1.0.9 to 1.1.0.dev1 and refreshes lockfiles.
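
As a rough picture of what respecting the embedding-engine batch size means for the Postgres hybrid batching change, the sketch below splits a list of texts into batches no larger than a given limit. The batched function and its parameters are hypothetical names chosen for illustration; the actual adapter code may be structured differently.

```python
from typing import Iterator, List, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        # The final batch may be shorter than batch_size.
        yield list(items[start:start + batch_size])
```

Each yielded batch would then be passed to the embedding engine in a separate call, so no single request exceeds the engine's limit.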

v1.0.3

View on GitHub Patch release with bug fixes and stability improvements on top of v1.0.2.

Highlights

Promotes accumulated dev work to main for the v1.0.3 release
Adds session lifecycle APIs, unified memory/session handling, and dashboard support
Introduces dataset queueing for async context management and ingestion flows
Ships new relational migrations, including session lifecycle tables and parent_user_id
Expands recall/remember and cloud routing behavior, plus frontend onboarding and Connect Agent updates

Notable Changes

Added session endpoints, metrics, and supporting persistence work
Added dataset queue infrastructure and follow-up fixes for background processing
Added database migrations for new tables and user/dataset ownership handling
Updated recall, remember, improve, and search-related API behavior
Added frontend work for Connect Agent, dashboard/activity views, API keys, and onboarding
Included guide updates, workflow/tooling changes, and dependency updates such as litellm and onnxruntime

Bug Fixes

PostgreSQL null-byte compatibility: Embedded null bytes (\x00) in node or edge string fields no longer cause errors when using PostgreSQL as the relational backend. Null bytes are now automatically stripped from all string values (including nested attributes) before writes to the relational store. This sanitization is transparent — affected strings are silently cleaned rather than rejected.
Fixed duplicate graph nodes caused by DataPoint.id being regenerated during graph construction. The original id is now preserved when converting DataPoint instances into graph nodes, ensuring node identity is stable across graph extraction passes.
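
The null-byte sanitization described above can be sketched as a recursive pass over nested values. This is a minimal illustration under assumptions, not the code Cognee ships: strip_null_bytes is a hypothetical name, and only plain dicts, lists, and tuples are treated as containers.

```python
from typing import Any


def strip_null_bytes(value: Any) -> Any:
    """Return a copy of `value` with null bytes removed from every string.

    Dicts, lists, and tuples are walked recursively; other types are
    returned unchanged. Hypothetical helper for illustration only.
    """
    if isinstance(value, str):
        return value.replace("\x00", "")
    if isinstance(value, dict):
        return {key: strip_null_bytes(item) for key, item in value.items()}
    if isinstance(value, list):
        return [strip_null_bytes(item) for item in value]
    if isinstance(value, tuple):
        return tuple(strip_null_bytes(item) for item in value)
    return value
```

Because the function builds new containers rather than mutating its input, the caller's original node or edge data is left untouched, which matches the "silently cleaned rather than rejected" behavior at write time.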

v1.0.2

View on GitHub Patch release with bug fixes and stability improvements on top of v1.0.1.

Bug Fixes

LanceDB schema migration: “contained null values” errors (raised when old rows lack a field required by a newer DataPoint schema) are now treated as recoverable schema drift. The affected table is automatically rebuilt from the current schema instead of raising a hard failure.
cognee-mcp Docker image build: Added missing build-essential and libpq-dev system packages to the builder stage so that cognee[postgres] can compile psycopg2 from source on Linux.

Dependency Updates

Bumped llama-index-core requirement from >=0.13.0,<0.14 to >=0.14.20,<0.15 for the llama-index extra.
Pinned nltk>=3.9.3,<4 explicitly in the docs extra to satisfy unstructured’s dependency until unstructured v0.21.0.

v1.0.1

View on GitHub Patch release with bug fixes on top of v1.0.0.

v1.0.0

View on GitHub

Highlights

New high-level API: remember, recall, improve, and forget cover the full memory lifecycle in four operations
Session-aware memory via session_id — short-term context that can be promoted into the permanent graph
Unified recall replaces the previous search call with automatic retrieval strategy selection
Legacy operations (add, cognify, search, memify) remain available as lower-level building blocks

New Features

cognee.remember(data, session_id=...) — ingest and graph in one call; supports permanent or session memory
cognee.recall(query, session_id=...) — query across both the permanent graph and session cache
cognee.improve(...) — enrich an existing graph with feedback-based weighting and session promotion
cognee.forget(dataset=..., session_id=...) — delete data, datasets, or full session memory

v0.5.4.dev1

Released: March 5, 2026
View on GitHub

Highlights

Developer preview release focused on quality, performance, and developer ergonomics
Faster ingestion and sync
Improved search relevance and new filtering options
Stability fixes for memory creation, deletion, and CLI workflows
Internal refactoring and dependency upgrades

New Features

Bulk import CLI for faster batched ingestion
Search filters for tags and date ranges
Optional per-collection ingestion throttling

Improvements

Lower latency for ingestion and sync
Better search ranking
More robust deletion and duplicate handling
Clearer CLI messages and debug logs

Bug Fixes

Fixed duplicate memories under concurrent ingestion
Fixed partial state after deletion
Fixed CLI export formatting issues
Fixed intermittent retrieval failures under load

v0.5.3

Released: February 27, 2026
View on GitHub

Highlights

New graph visualization improvements
Expanded permissions and user management work
SessionManager and cache/session persistence work
Search and graph retrieval improvements
Multiple stability and CI/CD fixes

Notable Changes

Added role-based permission checks and permission endpoints
Added graph visualization updates, including node set coloring
Added return type hints to API functions
Added chunk associations for the memify pipeline
Added vector filtering based on node sets
Fixed delete flow bugs, health check issues, MCP issues, and several config/integration issues

v0.5.3.dev1

Released: February 20, 2026
View on GitHub

Highlights

Added vector filtering based on node sets
Added principal Cognee configuration
Fixed health check issues
Fixed FalkorDB adapter port bug
Fixed Ollama image ingestion argument issue

Notes

Includes a small set of targeted fixes and feature work on top of v0.5.3.dev0
Introduced one new contributor in this release
