Cognee SDK Reference


This reference provides an overview of the core classes, functions, and modules included in the Cognee SDK. It serves as a starting point for developers to understand the building blocks of Cognee, how to integrate with its pipelines and tasks, and how to manage data flow using Datapoints.

Note: This is a conceptual reference. For the most accurate and up-to-date details, refer to the official Cognee documentation and source code repository.

Core Concepts

Tasks

Tasks are atomic units of work. A task:

  • Accepts input data (as typed pydantic models).
  • Transforms the data.
  • Returns output data (as typed pydantic models).

Tasks can perform operations such as:

  • Chunking text
  • Generating embeddings
  • Classifying entities
  • Inferring relationships between nodes
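The task shape described above (typed input, transformation, typed output) can be sketched with the standard library alone. This is a minimal illustration, not Cognee's implementation: `TextInput`, `ChunkOutput`, and `chunk_text` are hypothetical names, and plain dataclasses stand in for the Pydantic models the SDK uses.

```python
from dataclasses import dataclass


@dataclass
class TextInput:
    """Hypothetical input schema: one raw document."""
    text: str


@dataclass
class ChunkOutput:
    """Hypothetical output schema: the document split into chunks."""
    chunks: list[str]


def chunk_text(data: TextInput, max_words: int = 5) -> ChunkOutput:
    """Split the input text into chunks of at most max_words words."""
    words = data.text.split()
    chunks = [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]
    return ChunkOutput(chunks=chunks)


result = chunk_text(
    TextInput(text="Cognee turns raw text into a knowledge graph"),
    max_words=3,
)
print(result.chunks)
```

The function takes one typed object and returns another, so it can be tested in isolation and later slotted into a larger workflow.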

Pipelines

Pipelines are sequences of tasks orchestrated to perform a complex workflow. By connecting multiple tasks, you form a pipeline that ingests raw data, processes it, and outputs structured, meaningful artifacts (e.g., a knowledge graph).

Key points:

  • Pipelines ensure that the output of each task is a valid input for the next.
  • Pipelines allow you to version, maintain, and scale complex data flows.
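One way to picture the output-to-input guarantee is a small runner that checks each task's declared input type before calling it. The sketch below is an assumption-laden illustration, not the Cognee pipeline engine: `run_pipeline`, `chunk`, `summarize`, and the three dataclasses are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Callable, get_type_hints


@dataclass
class RawText:
    text: str


@dataclass
class Chunks:
    items: list[str]


@dataclass
class ChunkStats:
    count: int
    longest: int


def chunk(data: RawText) -> Chunks:
    """Split text into sentence-like chunks on periods."""
    return Chunks(items=[s.strip() for s in data.text.split(".") if s.strip()])


def summarize(data: Chunks) -> ChunkStats:
    """Reduce the chunks to simple statistics."""
    lengths = [len(c) for c in data.items]
    return ChunkStats(count=len(lengths), longest=max(lengths, default=0))


def run_pipeline(tasks: list[Callable[[Any], Any]], data: Any) -> Any:
    """Run tasks in order, checking each input against the task's annotation."""
    for task in tasks:
        hints = get_type_hints(task)
        # The first non-return annotation is the task's expected input type.
        expected = next(t for name, t in hints.items() if name != "return")
        if not isinstance(data, expected):
            raise TypeError(
                f"{task.__name__} expects {expected.__name__}, "
                f"got {type(data).__name__}"
            )
        data = task(data)
    return data


stats = run_pipeline(
    [chunk, summarize],
    RawText(text="Graphs link entities. Embeddings capture meaning."),
)
print(stats)
```

Passing the tasks in the wrong order, for example `[summarize, chunk]`, raises `TypeError` before any work is done on mismatched data.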

Datapoints

Datapoints are Pydantic models that serve as standardized input/output schemas for tasks. They:

  • Define the shape of data passing between tasks.
  • Provide validation and consistent typing.
  • Make pipelines more robust and maintainable by catching schema errors early.
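To make "catching schema errors early" concrete, the sketch below validates a datapoint at construction time. It deliberately uses a standard-library dataclass as a stand-in for a Pydantic model so that it runs without extra dependencies; `EntityPoint` and its fields are hypothetical and are not part of the Cognee SDK.

```python
from dataclasses import dataclass, fields


@dataclass
class EntityPoint:
    """Hypothetical datapoint: a named entity with a confidence score."""
    name: str
    label: str
    confidence: float

    def __post_init__(self) -> None:
        # Reject wrong types at construction time, before any task runs.
        # This simple check works because every annotation here is a plain class.
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, f.type):
                raise TypeError(
                    f"{f.name} must be {f.type.__name__}, "
                    f"got {type(value).__name__}"
                )
        # Enforce a value constraint in the same place.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")


point = EntityPoint(name="Ada Lovelace", label="PERSON", confidence=0.97)
print(point)
```

A malformed value such as `confidence="high"` fails when the object is built, rather than deep inside a later task. With Pydantic, the same effect comes from field type annotations and validators.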

Modules

cognee

The core module, which provides the high-level entry points to Cognee's functionality:

  • cognee.add(text: str): Add text documents to the metastore.
  • cognee.cognify(): Run the default pipeline to generate a knowledge graph from ingested data.
  • cognee.search(...): Query the knowledge graph or embeddings.
  • cognee.prune: Utilities to reset or prune data and system states.
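Put together, the entry points listed above might be used as follows. This is a hedged sketch: it assumes the calls are coroutines that must be awaited, and that `cognee.search` accepts `query_type` and `query_text` keyword arguments. Both assumptions should be checked against the installed version. The helper name `build_and_query` is hypothetical.

```python
import asyncio


async def build_and_query(text: str, question: str):
    """Ingest text, build the graph, and run one search.

    Requires an installed and configured cognee (including LLM credentials).
    """
    # Imported lazily so this file can be loaded without cognee installed.
    import cognee
    from cognee.api.v1.search import SearchType

    await cognee.add(text)   # assumption: add() is awaitable
    await cognee.cognify()   # assumption: cognify() is awaitable
    return await cognee.search(
        query_type=SearchType.INSIGHTS,  # search mode named in this reference
        query_text=question,             # assumption: keyword name may vary by version
    )


# To run against a configured installation:
# results = asyncio.run(build_and_query("Some document text.", "What is described?"))
```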

cognee.api.v1

Contains HTTP API endpoints and related logic for:

  • Searching and retrieving insights.
  • Managing knowledge graphs, embeddings, and document states through REST APIs.
  • Integrating Cognee into your applications with a web-based interface.

Key submodules and functions:

  • cognee.api.v1.search.SearchType: An enum that selects the search mode (e.g., INSIGHTS, COMPLETION).

cognee.prune

Provides functions to reset or prune data:

  • cognee.prune.prune_data(): Clears stored data (documents, embeddings, graph nodes).
  • cognee.prune.prune_system(metadata=True): Clears system metadata, allowing a “clean slate” before running a new pipeline.
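A reset before a fresh run could look like the sketch below. It assumes both prune functions are coroutines, which should be verified against the installed version; the wrapper name `reset_cognee` is hypothetical.

```python
import asyncio


async def reset_cognee() -> None:
    """Clear stored data, then system metadata, for a clean slate."""
    # Imported lazily so this file can be loaded without cognee installed.
    import cognee

    await cognee.prune.prune_data()                 # assumption: awaitable
    await cognee.prune.prune_system(metadata=True)  # assumption: awaitable


# To run against a configured installation:
# asyncio.run(reset_cognee())
```

Both calls are destructive, so a wrapper like this is best kept out of code paths that run against production data.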

cognee.add

Functionality for adding raw data into Cognee’s metastore:

  • cognee.add(text: str): Ingest textual data for later processing.
  • Future expansions may include adding different data types (e.g., code, HTML, or binary documents after conversion).

Classes & Functions

Task

Definition:

from cognee import Task

# InputModel and OutputModel are placeholders for your own Pydantic models.
class MyCustomTask(Task):
    def run(self, input_datapoint: InputModel) -> OutputModel:
        # Custom logic here
        return OutputModel(...)
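For experimenting with this pattern without Cognee installed, the following self-contained variant may help. Everything in it is a stand-in: the `Task` base class is a hypothetical abstract class rather than the one exported by the SDK, and `InputModel` and `OutputModel` are illustrative dataclasses in place of Pydantic models.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class Task(ABC):
    """Stand-in base class (not cognee's): a task exposes a single run() method."""

    @abstractmethod
    def run(self, input_datapoint):
        ...


@dataclass
class InputModel:
    """Illustrative input schema."""
    text: str


@dataclass
class OutputModel:
    """Illustrative output schema."""
    word_count: int


class MyCustomTask(Task):
    def run(self, input_datapoint: InputModel) -> OutputModel:
        # Custom logic: count whitespace-separated words.
        return OutputModel(word_count=len(input_datapoint.text.split()))


output = MyCustomTask().run(InputModel(text="tasks are atomic units of work"))
print(output)
```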