Cogwit layers orchestration and managed services on top of the open-source Cognee storage model. This document explains how the main components fit together.
Behind the scenes, every pipeline step runs as a Modal job that talks to managed LanceDB, Kuzu, and PostgreSQL clusters.

System Overview

Cogwit’s architecture centers on three main layers that work together to provide a managed knowledge processing platform.

Modal provides the compute foundation for all Cogwit operations:
  • API Services: Hosts the FastAPI service that handles all REST endpoints and authentication (see Cogwit SDK)
  • Notebook Sandbox: Provides isolated environments for running user code with 24-hour timeout support (see Cogwit Notebooks)
  • Container Orchestration: Every API request runs inside a Modal container with secrets managed internally by Cogwit
  • Code Execution: Notebook code runs in short-lived sandboxes that forward the user’s Cogwit API key to the managed API
This infrastructure ensures reliable, scalable execution while keeping all compute resources managed by Cogwit.
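
As a rough illustration of the key-forwarding idea described above, the sketch below builds (but does not send) an HTTP request that carries a user's API key. The base URL, the `/datasets` path, and the `X-Api-Key` header name are placeholders chosen for this example; this document does not specify how Cogwit actually transmits the key.

```python
from urllib.request import Request

# Hypothetical values: the real base URL, endpoint paths, and header name
# are not given in this document.
API_BASE_URL = "https://api.example.invalid"
API_KEY_HEADER = "X-Api-Key"


def build_request(path: str, api_key: str) -> Request:
    """Build (without sending) a GET request that forwards the user's API key."""
    if not api_key:
        raise ValueError("an API key is required")
    return Request(
        url=f"{API_BASE_URL}/{path.lstrip('/')}",
        headers={API_KEY_HEADER: api_key},
        method="GET",
    )


# The sandbox only ever holds the user's key; infrastructure secrets
# (Modal, S3, databases) stay on the managed side.
request = build_request("/datasets", api_key="demo-key")
```

The point of the design is that sandboxed code authenticates exactly like any other API client, so it never needs direct access to storage or database credentials.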

Storage Services (Managed by Cogwit)

All data persistence is handled through Cogwit’s managed storage infrastructure:
  • S3 – Central object storage in Cogwit’s managed infrastructure for all raw uploads, LanceDB tables, and Kuzu graph files
  • LanceDB – Vector database that stores embeddings generated during the cognify process
  • Kuzu – Graph database that maintains knowledge graph relationships and entities
  • PostgreSQL – Relational database for users, datasets, permissions, quotas, and billing records
Each dataset maintains separate storage namespaces for isolation, and all workers share the same state through Cogwit’s managed S3 infrastructure.
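
One way to picture per-dataset namespaces is as distinct key prefixes inside a shared bucket. The following is a minimal sketch under that assumption; the `datasets/<id>/...` layout and the `DatasetNamespace` class are hypothetical and are not taken from Cogwit's actual storage schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetNamespace:
    """Hypothetical key-prefix layout for one dataset in a shared bucket."""

    dataset_id: str

    def __post_init__(self) -> None:
        # Reject ids that could escape or collide with another namespace.
        if not self.dataset_id or "/" in self.dataset_id:
            raise ValueError("dataset_id must be non-empty and contain no '/'")

    def _prefix(self, kind: str) -> str:
        return f"datasets/{self.dataset_id}/{kind}/"

    @property
    def raw_uploads(self) -> str:
        return self._prefix("raw")

    @property
    def lancedb_tables(self) -> str:
        return self._prefix("lancedb")

    @property
    def kuzu_graph(self) -> str:
        return self._prefix("kuzu")


ns_a = DatasetNamespace("ds-a")
ns_b = DatasetNamespace("ds-b")
print(ns_a.raw_uploads)
print(ns_b.lancedb_tables)
```

Because every worker derives the same prefixes from the dataset identifier alone, any worker can locate a dataset's files in shared storage without coordinating with the others, and no two datasets share a prefix.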

Key Architectural Principles

  • Dataset Isolation: All processing happens at the dataset level, with separate storage namespaces (see permissions & security for details)
  • Managed Infrastructure: Users don’t configure Modal, S3, or database credentials; everything is managed by Cogwit
  • Compatibility: Storage schemas remain compatible with self-hosted Cognee for easy migration
