Datasets: The Core Unit of Data
A dataset is a logical container for related documents and their processed knowledge graphs. All data in Cognee belongs to a dataset. When you add documents to Cognee using cognee.add(), they are processed and stored within a specific dataset.
Dataset-scoped permissions — All permissions in Cognee are defined at the dataset level, never for individual documents.
Ownership and Permissions
When a principal creates a dataset, they become its owner. A principal is any entity that can have permissions, like a user, tenant, or role. Ownership cannot be changed. The owner has full control and can grant permissions to others. There are four types of permissions you can grant on a dataset:
 - Read: View documents and query the knowledge graph.
 - Write: Add, modify, or remove documents and data.
 - Delete: Remove the entire dataset.
 - Share: Grant permissions to other principals.
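The ownership and permission rules above can be modelled in a few lines. This is a minimal sketch of the concept only, not Cognee's implementation: the class `DatasetACL`, its methods, and the string principal IDs are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Set


class Permission(Enum):
    READ = "read"
    WRITE = "write"
    DELETE = "delete"
    SHARE = "share"


@dataclass
class DatasetACL:
    """Hypothetical in-memory stand-in for one dataset's permissions."""

    owner_id: str
    grants: Dict[str, Set[Permission]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # The creator becomes the owner and starts with every permission.
        self.grants[self.owner_id] = set(Permission)

    def has(self, principal_id: str, permission: Permission) -> bool:
        return permission in self.grants.get(principal_id, set())

    def grant(self, granter_id: str, principal_id: str, permission: Permission) -> None:
        # Only a principal holding "share" may grant permissions to others.
        if not self.has(granter_id, Permission.SHARE):
            raise PermissionError(f"{granter_id} lacks share permission")
        self.grants.setdefault(principal_id, set()).add(permission)


acl = DatasetACL(owner_id="alice")
acl.grant("alice", "bob", Permission.READ)
```

In this model, bob can read but cannot write, and any attempt by bob to grant access to a third principal raises `PermissionError` because he does not hold the share permission.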
 
Dataset Isolation: How Access Is Enforced
Cognee can enforce strict data isolation between datasets, but it’s important to understand when this happens.
 - Isolation is optional: Dataset boundaries are only enforced when the ENABLE_BACKEND_ACCESS_CONTROL setting is true.
 - Without isolation: If this setting is false, dataset parameters are ignored during searches, and queries run across all data in the system, regardless of permissions.
 - Database support: True isolation is currently supported when using Kùzu for the graph store, LanceDB for the vector store, and SQLite or Postgres for the relational database. Other database backends (like Neo4j or Qdrant) do not support dataset isolation.
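The effect of the ENABLE_BACKEND_ACCESS_CONTROL setting on search scope can be illustrated with a simplified model. The sketch assumes the setting is read from an environment variable of the same name. The helper functions are hypothetical and do not reflect how Cognee resolves datasets internally.

```python
import os


def access_control_enabled() -> bool:
    # Assumption: the setting is an environment variable compared
    # case-insensitively against "true"; unset means disabled.
    return os.environ.get("ENABLE_BACKEND_ACCESS_CONTROL", "false").lower() == "true"


def datasets_to_search(requested, readable, all_datasets):
    """Return the dataset names a query would cover in this simplified model."""
    if not access_control_enabled():
        # Without isolation, the dataset parameter is ignored and the
        # query runs across everything in the system.
        return sorted(all_datasets)
    # With isolation, only requested datasets the caller can read are searched.
    return sorted(set(requested) & set(readable))
```

With the setting off, a request for one dataset still covers every dataset in the system. With it on, the same request is narrowed to the datasets the caller can read.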
 
Using Datasets in Operations
Datasets integrate with Cognee’s main operations:
 - add: Direct new content into a specific dataset by name or ID. If no dataset is specified, a default main_dataset is used.
 - cognify: Choose which dataset(s) to transform into AI memory stored in the graph and vector stores.
 - search: Scope queries to run only against datasets you have read access to.
 - memify: Apply optional semantic enrichment on a per-dataset basis.
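The default-dataset behaviour of add can be pictured as a simple routing rule. The following toy model is only an illustration: `store` and `add_document` are hypothetical stand-ins, not Cognee functions. Only the name main_dataset comes from the text above.

```python
from collections import defaultdict
from typing import Optional

DEFAULT_DATASET = "main_dataset"  # default name stated in the text above

# Hypothetical in-memory stand-in: dataset name -> list of documents.
store = defaultdict(list)


def add_document(document: str, dataset_name: Optional[str] = None) -> str:
    """Route a document to the named dataset, or to the default one."""
    target = dataset_name or DEFAULT_DATASET
    store[target].append(document)
    return target


add_document("quarterly report", dataset_name="finance")
add_document("untagged note")  # no dataset given, so it lands in main_dataset
```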
Technical Details
Operation Permission Requirements
Different operations require different permissions:
 - add / cognify operations → require write permission
 - search operations → require read permission
 - delete operations → require delete permission
 - Permission management → requires share permission
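The mapping above is small enough to express as a lookup table. This is an illustrative sketch. The dictionary, the operation key "manage_permissions", and the function `is_allowed` are hypothetical names, not part of Cognee.

```python
# Operation -> permission it requires, as listed above.
REQUIRED_PERMISSION = {
    "add": "write",
    "cognify": "write",
    "search": "read",
    "delete": "delete",
    "manage_permissions": "share",  # hypothetical key for permission management
}


def is_allowed(operation: str, granted: set) -> bool:
    """Check whether a set of granted permission names covers an operation."""
    return REQUIRED_PERMISSION[operation] in granted


reader = {"read"}
editor = {"read", "write"}
```

A principal with only read permission can search but cannot add or cognify. Adding write permission unlocks both, while delete and permission management remain unavailable.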
Dataset Creation Methods
Cognee provides two helper methods for creating datasets:
 - create_dataset(): This is a lower-level function that only inserts the dataset record. It expects the caller to manage the Access Control List (ACL) entries separately.
 - create_authorized_dataset(): This is the recommended method for most user-facing flows. It wraps create_dataset() and then immediately grants the creator full read/write/delete/share permissions. This ensures the dataset is usable as soon as it’s created, especially when ENABLE_BACKEND_ACCESS_CONTROL is active.
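The difference between the two helpers can be sketched with toy versions. The real signatures are not given here, so everything below is an assumption: the `toy_` functions, their parameters, and the in-memory dictionaries are hypothetical and only mirror the behaviour described above.

```python
import uuid

datasets = {}  # dataset id -> record (stand-in for the datasets table)
acl = {}       # (dataset id, principal id) -> set of permission names


def toy_create_dataset(name: str, owner_id: str) -> str:
    """Insert only the dataset record; ACL entries are left to the caller."""
    dataset_id = str(uuid.uuid4())
    datasets[dataset_id] = {"id": dataset_id, "name": name, "owner_id": owner_id}
    return dataset_id


def toy_create_authorized_dataset(name: str, owner_id: str) -> str:
    """Create the record, then grant the creator all four permissions."""
    dataset_id = toy_create_dataset(name, owner_id)
    acl[(dataset_id, owner_id)] = {"read", "write", "delete", "share"}
    return dataset_id


bare_id = toy_create_dataset("notes", "alice")
ready_id = toy_create_authorized_dataset("reports", "alice")
```

In this model the bare dataset has no ACL entry at all, so with access control enabled even its creator could not use it until permissions are added separately.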
Dataset Model Fields
The core dataset metadata is stored in a relational (SQL) database. The datasets table includes:
 - id: Unique identifier (UUID primary key)
 - name: Human-readable name
 - owner_id: ID of the principal who created the dataset
 - created_at: Timestamp when created
 - updated_at: Timestamp when last modified
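For readers who prefer to see the fields as a type, the record could be mirrored as a dataclass. This is a sketch, not Cognee's model class: the class name `DatasetRecord`, the `touch` method, the UUID type for owner_id, and the use of UTC timestamps are all assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from uuid import UUID, uuid4


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class DatasetRecord:
    """Illustrative mirror of one row in the datasets table."""

    name: str                                  # human-readable name
    owner_id: UUID                             # principal who created the dataset (type assumed)
    id: UUID = field(default_factory=uuid4)    # primary key
    created_at: datetime = field(default_factory=_now)
    updated_at: datetime = field(default_factory=_now)

    def touch(self) -> None:
        # Record a modification by refreshing updated_at.
        self.updated_at = _now()


record = DatasetRecord(name="reports", owner_id=uuid4())
record.touch()
```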
Limitations
- Dataset ownership cannot be transferred.
 - When access control is enabled, the graph and vector stores are restricted to Kùzu and LanceDB, respectively.
 - Cross-dataset searches are not supported directly. Queries are always scoped to a single dataset. To search multiple datasets, you must run separate queries for each one you have access to.
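The workaround for the last limitation, one query per dataset, amounts to a loop that collects results by dataset name. The sketch below takes the single-dataset search as a parameter. `search_each`, `fake_search`, and the sample corpus are hypothetical and stand in for whatever search call your application uses.

```python
from typing import Callable, Dict, Iterable, List


def search_each(
    query: str,
    dataset_names: Iterable[str],
    search_one: Callable[[str, str], List[str]],
) -> Dict[str, List[str]]:
    """Run one single-dataset query per name and collect results by dataset."""
    return {name: search_one(query, name) for name in dataset_names}


# Hypothetical stand-in for a single-dataset search: a substring match
# over a tiny in-memory corpus.
corpus = {
    "contracts": ["renewal terms for 2024", "supplier agreement"],
    "support": ["ticket about renewal pricing"],
}


def fake_search(query: str, dataset_name: str) -> List[str]:
    return [doc for doc in corpus[dataset_name] if query in doc]


results = search_each("renewal", ["contracts", "support"], fake_search)
```

Keeping results keyed by dataset name preserves the information about where each hit came from, which a merged list would lose.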