Tasks: Smallest Executable Units
Tasks are Cognee’s smallest executable units. They wrap any Python callable and give it a uniform interface for batching, error handling, and logging. While they can work with anything, Tasks are most powerful when creating or enriching DataPoints.
What are Tasks
Tasks are Cognee’s smallest executable units. They:
- Wrap any Python callable (function, coroutine, generator, async generator).
- Give a uniform interface for batching, error handling, and logging.
- Can work with anything, but are most powerful when creating or enriching DataPoints.
Why Tasks Exist
- Normalize different kinds of Python functions so they behave consistently.
- Enable stream-based processing: outputs flow directly into the next step.
- Provide batching controls for efficiency, especially with LLM or I/O-heavy operations.
- Form the building blocks of higher-level Pipelines.
Core Concepts
- Execution: run functions in a consistent way, whether they are synchronous functions, coroutines, or generators.
- Batching: configurable with `task_config`.
- Composition: Tasks can be chained, so one Task’s output is the next Task’s input.
- Flexibility: Tasks don’t need to handle DataPoints, but Cognee’s defaults encourage it.
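As a rough illustration of the batching concept above, the sketch below groups a stream of items into fixed-size batches. The `batched` helper and the `batch_size` key are assumptions made for this example; the text only states that batching is configured through `task_config`.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group items into lists of at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter batch
        yield batch


# Hypothetical config shape: "batch_size" is an assumed key name.
task_config = {"batch_size": 2}
batches = list(batched(["a", "b", "c", "d", "e"], task_config["batch_size"]))
```

Batching like this lets an expensive step (for example an LLM call) handle several items per invocation instead of one at a time.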
Dependencies & Ordering
Tasks often assume a certain input type and produce an expected output type. Example flow (educational, not exhaustive):
- Raw data → Documents
- Documents → Chunks
- Chunks → Entities and relationships
- Entities/Chunks → Summaries
- Any DataPoint → Storage
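One way to picture these input and output expectations is to compare type annotations between neighbouring steps. The sketch below is only an illustration: `Document`, `Chunk`, `to_documents`, `to_chunks`, and `check_order` are hypothetical names, and nothing here claims that Cognee validates ordering this way.

```python
import inspect
from dataclasses import dataclass
from typing import Callable, List, Sequence, get_type_hints


@dataclass
class Document:  # hypothetical stand-in type
    text: str


@dataclass
class Chunk:  # hypothetical stand-in type
    text: str


def to_documents(raw: List[str]) -> List[Document]:
    return [Document(text=r) for r in raw]


def to_chunks(documents: List[Document]) -> List[Chunk]:
    # Naive split on ". " purely for illustration.
    return [Chunk(text=p) for d in documents for p in d.text.split(". ") if p]


def check_order(steps: Sequence[Callable]) -> None:
    """Raise TypeError when a step's return annotation does not match
    the annotation of the next step's first parameter."""
    for producer, consumer in zip(steps, steps[1:]):
        produced = get_type_hints(producer).get("return")
        first_param = next(iter(inspect.signature(consumer).parameters))
        consumed = get_type_hints(consumer).get(first_param)
        if produced != consumed:
            raise TypeError(
                f"{producer.__name__} produces {produced}, "
                f"but {consumer.__name__} expects {consumed}"
            )


check_order([to_documents, to_chunks])  # compatible order, no error
chunks = to_chunks(to_documents(["One. Two"]))
```

Swapping the two steps makes `check_order` raise a `TypeError`, which mirrors why the flow above has to run in the listed order.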
Built-in Tasks
- Ingestion: `resolve_data_directories`, `ingest_data`
- Classification: `classify_documents`
- Access control: `check_permissions_on_dataset`
- Chunking: `extract_chunks_from_documents`
- Graph extraction: `extract_graph_from_data`
- Summarization: `summarize_text`, `summarize_code`
- Persistence: `add_data_points`
Examples and details
Task API & Constructor
- `executable`: Any Python callable (function, coroutine, generator, async generator)
- `task_config`: Configuration for batching, error handling, and logging
- `default_params`: Parameters that are always passed to the executable
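To make the three parameters concrete, here is a minimal stand-in class. It is not Cognee's `Task` implementation: `SketchTask`, its `run` method, the `batch_size` key, and the `truncate` function are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, Optional


class SketchTask:
    """Illustrative stand-in only; not Cognee's Task class."""

    def __init__(
        self,
        executable: Callable[..., Any],
        task_config: Optional[Dict[str, Any]] = None,
        **default_params: Any,
    ) -> None:
        self.executable = executable
        # Assumed default: process one item at a time.
        self.task_config = task_config or {"batch_size": 1}
        # Stored once, then passed on every call.
        self.default_params = default_params

    def run(self, data: Any) -> Any:
        return self.executable(data, **self.default_params)


def truncate(text: str, max_length: int) -> str:
    return text[:max_length]


task = SketchTask(truncate, task_config={"batch_size": 10}, max_length=5)
result = task.run("knowledge graph")
```

The point of `default_params` in this sketch is that callers only supply the data; fixed settings such as `max_length` travel with the task.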
Supported Task Types
Cognee automatically detects and handles different Python function types:
- Functions: Standard synchronous functions
- Coroutines: Async functions using `async def`
- Generators: Functions that yield multiple values
- Async Generators: Async functions that yield multiple values
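The four kinds listed above can be told apart with the standard library's `inspect` module. The sketch below shows one possible way to detect each kind and expose all of them through a single async-generator interface; `kind_of` and `run_uniform` are hypothetical helpers, and Cognee's own detection logic may differ.

```python
import asyncio
import inspect
from typing import Any, AsyncIterator, Callable


def kind_of(fn: Callable[..., Any]) -> str:
    if inspect.isasyncgenfunction(fn):
        return "async generator"
    if inspect.iscoroutinefunction(fn):
        return "coroutine"
    if inspect.isgeneratorfunction(fn):
        return "generator"
    return "function"


async def run_uniform(fn: Callable[..., Any], *args: Any) -> AsyncIterator[Any]:
    """Yield results from any of the four kinds through one interface."""
    kind = kind_of(fn)
    if kind == "async generator":
        async for item in fn(*args):
            yield item
    elif kind == "coroutine":
        yield await fn(*args)
    elif kind == "generator":
        for item in fn(*args):
            yield item
    else:
        yield fn(*args)


def double(x):
    return x * 2


async def triple(x):
    return x * 3


def count_up(n):
    for i in range(n):
        yield i


async def count_down(n):
    for i in range(n, 0, -1):
        yield i


async def collect(fn, arg):
    return [item async for item in run_uniform(fn, arg)]


results = {
    kind_of(fn): asyncio.run(collect(fn, 3))
    for fn in (double, triple, count_up, count_down)
}
```

Single-value callables yield exactly one item here, so downstream code can consume every kind with the same `async for` loop.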
Writing a Custom Task
- Predictable inputs and outputs
- Easy to chain together
- Clear data flow between steps
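A function with these properties might look like the sketch below. `TextChunk`, `Keyword`, and `extract_keywords` are hypothetical names, and the dataclasses stand in for real DataPoint types; in Cognee such a function would then be wrapped in a Task.

```python
import asyncio
from dataclasses import dataclass
from typing import List


@dataclass
class TextChunk:  # hypothetical stand-in for a chunk DataPoint
    text: str


@dataclass
class Keyword:  # hypothetical stand-in for a derived DataPoint
    word: str
    source_text: str


async def extract_keywords(chunks: List[TextChunk], min_length: int = 6) -> List[Keyword]:
    """Take a list of chunks, return a list of keywords; no side effects."""
    keywords: List[Keyword] = []
    for chunk in chunks:
        for word in chunk.text.split():
            cleaned = word.strip(".,;:!?").lower()
            if len(cleaned) >= min_length:
                keywords.append(Keyword(word=cleaned, source_text=chunk.text))
    return keywords


chunks = [TextChunk("Tasks wrap callables."), TextChunk("Pipelines chain Tasks.")]
keywords = asyncio.run(extract_keywords(chunks))
```

The typed signature states what goes in and what comes out, which is what makes the function easy to place between other steps.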
Execution Flow
Tasks execute in sequence within Pipelines, with each Task’s output becoming the next Task’s input. This creates a data transformation pipeline that builds up to the final knowledge graph.
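The sequential hand-off can be sketched as a small runner that feeds each step's result into the next one. This is a simplified model, not Cognee's pipeline runner: `run_in_sequence` and the three example steps are hypothetical, and batching and error handling are left out.

```python
import asyncio
import inspect
from typing import Any, Callable, Dict, List, Sequence


async def run_in_sequence(steps: Sequence[Callable[[Any], Any]], data: Any) -> Any:
    """Pass data through each step in order, awaiting async steps."""
    for step in steps:
        data = step(data)
        if inspect.isawaitable(data):
            data = await data
    return data


def split_sentences(text: str) -> List[str]:
    return [s.strip() for s in text.split(".") if s.strip()]


async def uppercase_all(sentences: List[str]) -> List[str]:
    return [s.upper() for s in sentences]


def count_words(sentences: List[str]) -> Dict[str, int]:
    return {s: len(s.split()) for s in sentences}


final = asyncio.run(
    run_in_sequence(
        [split_sentences, uppercase_all, count_words],
        "Tasks wrap callables. Pipelines chain tasks.",
    )
)
```

Each step only needs to understand the shape of the previous step's output, which is why sync and async steps can be mixed freely in this sketch.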