Tasks: Smallest Executable Units

Tasks are Cognee’s smallest executable units. They wrap any Python callable and give it a uniform interface for batching, error handling, and logging. A Task can wrap any callable, but Tasks are most useful when they create or enrich DataPoints.

What are Tasks

Tasks are Cognee’s smallest executable units.
  • They wrap any Python callable (function, coroutine, generator, or async generator).
  • They give that callable a uniform interface for batching, error handling, and logging.
  • They can work with any data, but are most useful when creating or enriching DataPoints.
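
To make the idea of wrapping "any Python callable" concrete, here is a minimal sketch of how the four callable kinds can be normalized into a single async stream. It uses only the standard library; run_uniformly and the toy functions are hypothetical names chosen for illustration and are not part of Cognee's API.

```python
import asyncio
import inspect
from typing import Any, AsyncIterator, Callable, List


async def run_uniformly(fn: Callable[..., Any], *args: Any) -> AsyncIterator[Any]:
    """Hypothetical helper: expose any of the four callable kinds as an async stream."""
    if inspect.isasyncgenfunction(fn):
        # Async generator: forward each item as it is produced.
        async for item in fn(*args):
            yield item
    elif inspect.isgeneratorfunction(fn):
        # Plain generator: iterate it synchronously and forward each item.
        for item in fn(*args):
            yield item
    elif inspect.iscoroutinefunction(fn):
        # Coroutine: await it and emit its single result.
        yield await fn(*args)
    else:
        # Regular function: call it and emit its single result.
        yield fn(*args)


# Four toy callables, one of each kind.
def double(x: int) -> int:
    return x * 2


async def triple(x: int) -> int:
    return x * 3


def count_up(n: int):
    yield from range(n)


async def count_down(n: int):
    for i in range(n, 0, -1):
        yield i


async def collect(fn: Callable[..., Any], *args: Any) -> List[Any]:
    # Drain the uniform stream into a list so the results are easy to inspect.
    return [item async for item in run_uniformly(fn, *args)]


results = {
    "function": asyncio.run(collect(double, 2)),
    "coroutine": asyncio.run(collect(triple, 3)),
    "generator": asyncio.run(collect(count_up, 3)),
    "async generator": asyncio.run(collect(count_down, 3)),
}
```

In this sketch every callable ends up behind the same async-iterator interface, so the caller never needs to know which kind it was given.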

Why Tasks Exist

  • Normalize different kinds of Python functions so they behave consistently.
  • Enable stream-based processing: outputs flow directly into the next step.
  • Provide batching controls for efficiency, especially with LLM or I/O-heavy operations.
  • Form the building blocks of higher-level Pipelines.

Core Concepts

  • Execution: run functions in a consistent way, whether they are synchronous, asynchronous, or generators.
  • Batching: configured per Task through task_config.
  • Composition: Tasks can be chained, so one Task’s output becomes the next Task’s input.
  • Flexibility: Tasks don’t need to handle DataPoints, but Cognee’s defaults encourage it.
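
Batching and composition can be illustrated with a small stand-in class. This is a sketch under stated assumptions: MiniTask and run_chain are hypothetical, and the "batch_size" key inside task_config is an assumed name used here for illustration; consult the Cognee reference for the options that task_config actually accepts.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Optional


class MiniTask:
    """Hypothetical stand-in for a Task; not Cognee's actual class."""

    def __init__(
        self,
        fn: Callable[[List[Any]], Iterable[Any]],
        task_config: Optional[Dict[str, Any]] = None,
    ) -> None:
        self.fn = fn
        # "batch_size" is an assumed key name for this sketch; default is 1.
        self.batch_size = (task_config or {}).get("batch_size", 1)

    def run(self, items: Iterable[Any]) -> Iterator[Any]:
        """Group incoming items into batches and stream the wrapped function's outputs."""
        batch: List[Any] = []
        for item in items:
            batch.append(item)
            if len(batch) == self.batch_size:
                yield from self.fn(batch)
                batch = []
        if batch:
            # Flush the final, possibly smaller, batch.
            yield from self.fn(batch)


def run_chain(tasks: List[MiniTask], items: Iterable[Any]) -> List[Any]:
    """Feed each task's output stream into the next task."""
    stream: Iterable[Any] = items
    for task in tasks:
        stream = task.run(stream)
    return list(stream)


batch_sizes_seen: List[int] = []


def upper_batch(batch: List[str]) -> List[str]:
    batch_sizes_seen.append(len(batch))  # record how the input was batched
    return [s.upper() for s in batch]


def add_length(batch: List[str]) -> List[tuple]:
    return [(s, len(s)) for s in batch]


output = run_chain(
    [MiniTask(upper_batch, {"batch_size": 2}), MiniTask(add_length)],
    ["a", "bb", "ccc"],
)
```

Because each run method is a generator, items flow through the chain lazily: the second step starts consuming results as soon as the first step finishes a batch.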

Dependencies & Ordering

Tasks often assume a certain input type and produce an expected output type, so the order in which they run matters. Example flow (illustrative, not exhaustive):
  • Raw data → Documents
  • Documents → Chunks
  • Chunks → Entities and relationships
  • Entities/Chunks → Summaries
  • Any DataPoint → Storage
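
The first steps of the flow above can be sketched with plain functions whose type hints make the ordering constraint visible. Everything here is a toy: Document, Chunk, to_documents, to_chunks, and store are hypothetical names for illustration, not Cognee's built-in types or tasks.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Document:
    text: str


@dataclass
class Chunk:
    text: str
    source_index: int  # position of the originating document


def to_documents(raw: List[str]) -> List[Document]:
    """Raw data -> Documents: trim whitespace and wrap each string."""
    return [Document(text=r.strip()) for r in raw]


def to_chunks(docs: List[Document], size: int = 10) -> List[Chunk]:
    """Documents -> Chunks: split each document into fixed-size pieces."""
    chunks: List[Chunk] = []
    for index, doc in enumerate(docs):
        for start in range(0, len(doc.text), size):
            chunks.append(Chunk(doc.text[start:start + size], index))
    return chunks


def store(chunks: List[Chunk], storage: List[Chunk]) -> int:
    """Chunks -> Storage: append to an in-memory list and return the count stored."""
    storage.extend(chunks)
    return len(chunks)


storage: List[Chunk] = []
documents = to_documents(["  hello world  ", "tasks"])
chunks = to_chunks(documents)
stored = store(chunks, storage)
```

Each function accepts exactly what the previous one returns, which is why swapping two steps (for example, chunking before documents exist) would fail.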

Built-in Tasks

  • Ingestion: resolve_data_directories, ingest_data
  • Classification: classify_documents
  • Access control: check_permissions_on_dataset
  • Chunking: extract_chunks_from_documents
  • Graph extraction: extract_graph_from_data
  • Summarization: summarize_text, summarize_code
  • Persistence: add_data_points

Examples and details