Cognee SDK Reference
This reference provides an overview of the core classes, functions, and modules included in the Cognee SDK. It serves as a starting point for developers to understand the building blocks of Cognee, how to integrate with its pipelines and tasks, and how to manage data flow using Datapoints.
Note: This is a conceptual reference. For the most accurate and up-to-date details, refer to the official Cognee documentation and source code repository.
Core Concepts
Tasks
Tasks are atomic units of work. A task:
- Accepts input data (as typed pydantic models).
- Transforms the data.
- Returns output data (as typed pydantic models).
Tasks can perform operations such as:
- Chunking text
- Generating embeddings
- Classifying entities
- Inferring relationships between nodes
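The input, transform, output shape described above can be sketched with a small chunking task. This is an illustration only: it uses standard-library dataclasses as stand-ins for pydantic models so that it runs without extra dependencies, and `TextInput`, `ChunkOutput`, and `chunk_text` are hypothetical names rather than part of the Cognee SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextInput:
    """Hypothetical input schema (stand-in for a pydantic model)."""
    text: str
    chunk_size: int = 5  # maximum number of words per chunk


@dataclass(frozen=True)
class ChunkOutput:
    """Hypothetical output schema (stand-in for a pydantic model)."""
    chunks: tuple[str, ...]


def chunk_text(data: TextInput) -> ChunkOutput:
    """Split the input text into chunks of at most `chunk_size` words."""
    if data.chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    words = data.text.split()
    chunks = tuple(
        " ".join(words[i : i + data.chunk_size])
        for i in range(0, len(words), data.chunk_size)
    )
    return ChunkOutput(chunks=chunks)
```

Because the task only sees a typed input and returns a typed output, it can be tested in isolation and swapped for another chunking strategy without touching the rest of a workflow.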
Pipelines
Pipelines are sequences of tasks orchestrated to perform a complex workflow. By connecting multiple tasks, you form a pipeline that ingests raw data, processes it, and outputs structured, meaningful artifacts (e.g., a knowledge graph).
Key points:
- Pipelines ensure that the output of one task is valid input for the next.
- Pipelines allow you to version, maintain, and scale complex data flows.
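To make the chaining idea concrete, here is a minimal sketch of a runner that feeds each task's output into the next task and checks the type at every hand-off. It is not Cognee's pipeline implementation: `run_pipeline`, `tokenize`, `count_words`, and the three data classes are hypothetical, and plain dataclasses stand in for pydantic models.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence, get_type_hints


@dataclass
class RawText:
    text: str


@dataclass
class Tokens:
    words: list


@dataclass
class WordCount:
    count: int


def tokenize(data: RawText) -> Tokens:
    """First task: split raw text into words."""
    return Tokens(words=data.text.split())


def count_words(data: Tokens) -> WordCount:
    """Second task: count the words produced by `tokenize`."""
    return WordCount(count=len(data.words))


def run_pipeline(tasks: Sequence[Callable[[Any], Any]], initial: Any) -> Any:
    """Run tasks in order, feeding each output into the next task."""
    current = initial
    for task in tasks:
        hints = get_type_hints(task)
        # The first non-return annotation is the task's declared input type.
        expected = next(t for name, t in hints.items() if name != "return")
        if not isinstance(current, expected):
            raise TypeError(
                f"{task.__name__} expects {expected.__name__}, "
                f"got {type(current).__name__}"
            )
        current = task(current)
    return current
```

Calling `run_pipeline([tokenize, count_words], RawText(text="one two three"))` yields a `WordCount`, while reversing the task order fails at the first hand-off with a `TypeError` instead of producing a confusing error deeper in the flow.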
Datapoints
Datapoints are pydantic models used as standardized input/output schemas for tasks. They:
- Define the shape of data passing between tasks.
- Provide validation and consistent typing.
- Make pipelines more robust and maintainable by catching schema errors early.
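The "catching schema errors early" point can be illustrated with a schema that rejects bad values at construction time. The sketch below hand-writes the checks in a dataclass so it runs with the standard library alone; a pydantic model would declare the same constraints and raise its own `ValidationError`. `EntityDatapoint` and its fields are hypothetical and not taken from the Cognee SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityDatapoint:
    """Hypothetical datapoint describing one extracted entity."""
    name: str
    entity_type: str
    confidence: float

    def __post_init__(self) -> None:
        # Validate on construction so a bad record never reaches a later task.
        for field_name in ("name", "entity_type"):
            value = getattr(self, field_name)
            if not isinstance(value, str) or not value:
                raise ValueError(f"{field_name} must be a non-empty string")
        is_number = isinstance(self.confidence, (int, float)) and not isinstance(
            self.confidence, bool
        )
        if not is_number:
            raise ValueError("confidence must be a number")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
```

With this in place, `EntityDatapoint(name="Ada Lovelace", entity_type="person", confidence=0.9)` succeeds, while an empty name or a confidence of `1.5` raises immediately at the point where the record is created.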
Modules
cognee
The core module that provides high-level entry points to Cognee functionality:
- cognee.add(text: str): Add text documents to the metastore.
- cognee.cognify(): Run the default pipeline to generate a knowledge graph from ingested data.
- cognee.search(...): Query the knowledge graph or embeddings.
- cognee.prune: Utilities to reset or prune data and system states.
cognee.api.v1
Contains HTTP API endpoints and related logic for:
- Searching and retrieving insights.
- Managing knowledge graphs, embeddings, and document states through REST APIs.
- Integrating Cognee into your applications with a web-based interface.
Key submodules and functions:
- cognee.api.v1.search.SearchType: Enum for specifying different search modes (e.g., INSIGHTS, CONTEXTUAL).
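As a rough sketch of how the entry points listed for the cognee module might combine with SearchType, the helper below ingests text, builds the graph, and runs one query. Several details are assumptions rather than facts from this reference: that these calls are coroutines that must be awaited, and that `cognee.search` accepts `query_type` and `query_text` keyword arguments. `build_graph_and_search` is a hypothetical name. Check the signatures of your installed version before relying on it.

```python
async def build_graph_and_search(text: str, query: str):
    """Ingest text, build the knowledge graph, and query it (sketch)."""
    # Imported inside the function so the helper can be defined even where
    # cognee is not installed or configured.
    import cognee
    from cognee.api.v1.search import SearchType

    # Assumption: add, cognify, and search are coroutines.
    await cognee.add(text)   # store the raw text in the metastore
    await cognee.cognify()   # run the default pipeline over ingested data

    # Assumption: search takes the mode and the query text as keywords.
    return await cognee.search(
        query_type=SearchType.INSIGHTS,
        query_text=query,
    )
```

With a configured Cognee installation, the helper could be driven from synchronous code via `asyncio.run(build_graph_and_search(text, query))`.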
cognee.prune
Provides functions to reset or prune data:
- cognee.prune.prune_data(): Clears stored data (documents, embeddings, graph nodes).
- cognee.prune.prune_system(metadata=True): Clears system metadata, allowing a “clean slate” before running a new pipeline.
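A reset step built from the two functions above might look like the following sketch. It assumes both calls are coroutines, which this reference does not state, and `reset_cognee` is a hypothetical helper name.

```python
async def reset_cognee() -> None:
    """Clear stored data and system metadata before a fresh run (sketch)."""
    # Imported inside the function so the helper can be defined even where
    # cognee is not installed or configured.
    import cognee

    # Assumption: both prune functions are coroutines.
    await cognee.prune.prune_data()                 # documents, embeddings, graph nodes
    await cognee.prune.prune_system(metadata=True)  # system metadata
```

Both calls are destructive, so a helper like this belongs in test setup or one-off scripts rather than in code paths that run against data you want to keep.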
cognee.add
Functionality for adding raw data into Cognee’s metastore:
- cognee.add(text: str): Ingest textual data for later processing.
- Future expansions may include adding different data types (e.g., code, HTML, or binary documents after conversion).
Classes & Functions
Task
Definition:
from cognee import Task

# InputModel and OutputModel stand for your own pydantic models.
class MyCustomTask(Task):
    def run(self, input_datapoint: InputModel) -> OutputModel:
        # Custom logic here
        return OutputModel(...)