Cognee SDK Reference

This reference provides an overview of the core classes, functions, and modules included in the Cognee SDK. It serves as a starting point for developers to understand the building blocks of Cognee, how to integrate with its pipelines and tasks, and how to manage data flow using Datapoints.

Note: This is a conceptual reference. For the most accurate and up-to-date details, refer to the official Cognee documentation and source code repository.

Core Concepts

Tasks

Tasks are atomic units of work. A task:

  • Accepts input data (as typed pydantic models).
  • Transforms the data.
  • Returns output data (as typed pydantic models).

Tasks can perform operations such as:

  • Chunking text
  • Generating embeddings
  • Classifying entities
  • Inferring relationships between nodes
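The task contract described above (typed input, a transformation, typed output) can be sketched in a few lines. This is only an illustration, not Cognee's implementation: the names TextInput, ChunkOutput, and chunk_text are hypothetical, and stdlib dataclasses stand in for pydantic models so the sketch runs without extra dependencies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextInput:
    """Hypothetical input schema: one raw document."""
    text: str


@dataclass(frozen=True)
class ChunkOutput:
    """Hypothetical output schema: the document split into chunks."""
    chunks: list[str]


def chunk_text(data: TextInput, max_words: int = 5) -> ChunkOutput:
    """Split the text into chunks of at most `max_words` words."""
    words = data.text.split()
    chunks = [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]
    return ChunkOutput(chunks=chunks)


result = chunk_text(
    TextInput(text="Cognee turns raw documents into a knowledge graph"),
    max_words=3,
)
print(result.chunks)
# ['Cognee turns raw', 'documents into a', 'knowledge graph']
```

Because both ends of the task are typed, another task can consume ChunkOutput without knowing how the chunks were produced.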

Pipelines

Pipelines are sequences of tasks orchestrated to perform a complex workflow. By connecting multiple tasks, you form a pipeline that ingests raw data, processes it, and outputs structured, meaningful artifacts (e.g., a knowledge graph).

Key points:

  • Pipelines ensure that the output of one task is a valid input for the next.
  • Pipelines allow you to version, maintain, and scale complex data flows.
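To make the chaining idea concrete, here is a minimal sketch of a pipeline runner, assuming each task is a plain callable. The helper run_pipeline and the three example tasks are hypothetical and are not part of the Cognee API; they only show how each task's output becomes the next task's input.

```python
from typing import Any, Callable, Sequence


def run_pipeline(tasks: Sequence[Callable[[Any], Any]], data: Any) -> Any:
    """Run tasks in order, passing each task's output to the next task."""
    for task in tasks:
        data = task(data)
    return data


# Three hypothetical tasks: split into sentences, lowercase, count words.
def split_sentences(text: str) -> list[str]:
    return [s.strip() for s in text.split(".") if s.strip()]


def lowercase(sentences: list[str]) -> list[str]:
    return [s.lower() for s in sentences]


def count_words(sentences: list[str]) -> dict[str, int]:
    return {s: len(s.split()) for s in sentences}


pipeline_output = run_pipeline(
    [split_sentences, lowercase, count_words],
    "Tasks transform data. Pipelines chain tasks.",
)
print(pipeline_output)
# {'tasks transform data': 3, 'pipelines chain tasks': 3}
```

The order of the task list matters: count_words expects a list of sentences, so it can only run after split_sentences.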

Datapoints

Datapoints are pydantic models used as standardized input/output schemas for tasks. They:

  • Define the shape of data passing between tasks.
  • Provide validation and consistent typing.
  • Make pipelines more robust and maintainable by catching schema errors early.
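The following sketch shows what "catching schema errors early" can look like in practice. It is an assumption-laden stand-in: a small dataclass base (ValidatedModel, hypothetical) checks field types at construction time, which is the kind of check a pydantic model performs automatically. It only handles plain types such as str and int, not generics like list[str].

```python
from dataclasses import dataclass, fields


@dataclass
class ValidatedModel:
    """Base class that checks each field against its annotated type."""

    def __post_init__(self) -> None:
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, f.type):
                raise TypeError(
                    f"{type(self).__name__}.{f.name} expects "
                    f"{f.type.__name__}, got {type(value).__name__}"
                )


@dataclass
class DocumentChunk(ValidatedModel):
    """Hypothetical Datapoint: one chunk of a document."""
    text: str
    index: int


ok = DocumentChunk(text="hello", index=0)

# A wrong type is rejected when the object is built, before any task runs.
try:
    DocumentChunk(text="hello", index="zero")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
# DocumentChunk.index expects int, got str
```

Failing at construction means a malformed record stops at the boundary between two tasks instead of surfacing later inside unrelated logic.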

Modules

cognee

The core module that provides high-level entry points to Cognee functionalities:

  • cognee.add(text: str): Add text documents to the metastore.
  • cognee.cognify(): Run the default pipeline to generate a knowledge graph from ingested data.
  • cognee.search(...): Query the knowledge graph or embeddings.
  • cognee.prune: Utilities to reset or prune data and system state.

cognee.api.v1

Contains HTTP API endpoints and related logic for:

  • Searching and retrieving insights.
  • Managing knowledge graphs, embeddings, and document states through REST APIs.
  • Integrating Cognee into your applications with a web-based interface.

Key submodules and functions:

  • cognee.api.v1.search.SearchType: An enum that selects the search mode (e.g., INSIGHTS, CONTEXTUAL).

cognee.prune

Provides functions to reset or prune data:

  • cognee.prune.prune_data(): Clears stored data (documents, embeddings, graph nodes).
  • cognee.prune.prune_system(metadata=True): Clears system metadata, allowing a “clean slate” before running a new pipeline.

cognee.add

Functionality for adding raw data into Cognee’s metastore:

  • cognee.add(text: str): Ingest textual data for later processing.
  • Future versions may support other data types (e.g., code, HTML, or binary documents after conversion).
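The entry points listed in this section can be combined into a short end-to-end run. The sketch below rests on one assumption not stated in this reference: that these functions are coroutines and must be awaited. It uses only the calls named above, needs a configured Cognee installation to actually run, and leaves out cognee.search because its arguments are not specified here.

```python
import asyncio


async def main() -> None:
    # Imported here so this file still loads where cognee is not installed.
    import cognee

    # Start from a clean slate (assumed to be awaitable).
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # Ingest text, then build the knowledge graph with the default pipeline.
    await cognee.add("Cognee builds knowledge graphs from text.")
    await cognee.cognify()


# To execute the workflow: asyncio.run(main())
```

Pruning first keeps the example reproducible, since earlier runs may have left documents or graph nodes behind.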

Classes & Functions

Task

Definition:

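from pydantic import BaseModel
from cognee import Task

class InputModel(BaseModel):   # hypothetical input Datapoint
    text: str

class OutputModel(BaseModel):  # hypothetical output Datapoint
    text: str

class MyCustomTask(Task):
    def run(self, input_datapoint: InputModel) -> OutputModel:
        # Custom logic here (placeholder: normalize whitespace)
        return OutputModel(text=" ".join(input_datapoint.text.split()))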