Data Management (CRUD)

What is CRUD?

CRUD stands for:

  1. Create: Add new records or data entries.
  2. Read: Retrieve existing data (search, filter, or simply fetch details).
  3. Update: Modify existing data, such as changing a record’s content or updating settings.
  4. Delete: Remove data entries that are no longer needed or are incorrect.

These operations form the backbone of data management. In cognee, you’ll use CRUD concepts to add data to your datasets, retrieve them for inspection or processing, update system settings or dataset details, and remove data or entire datasets as needed.

Cognee manages data and knowledge graphs through a set of CRUD operations that align with standard RESTful API conventions. This page walks through how CRUD works within cognee and offers practical examples of how to integrate these operations into your workflow.

RESTful Alignment

In a RESTful context, each HTTP method typically corresponds to one of these CRUD actions. Cognee’s API follows these conventions, making it straightforward for developers to integrate cognee into their applications or data pipelines.

CRUD Action | RESTful Context | How Cognee Handles It
Create      | POST            | Upload new datasets or documents for analysis
Read        | GET             | Retrieve the current state of datasets, graphs, and system settings
Update      | PUT/PATCH       | Refine configurations and reprocess data to improve AI-driven results
Delete      | DELETE          | Clean up when a dataset is no longer needed or remove specific data entries
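
The mapping in the table can be expressed as a small lookup. This is a minimal sketch of the general REST convention only; the names `CrudAction`, `HTTP_METHODS`, and `crud_action_for` are illustrative and are not part of cognee.

```python
from enum import Enum


class CrudAction(Enum):
    CREATE = "create"
    READ = "read"
    UPDATE = "update"
    DELETE = "delete"


# Conventional REST mapping, mirroring the table above.
HTTP_METHODS = {
    CrudAction.CREATE: ("POST",),
    CrudAction.READ: ("GET",),
    CrudAction.UPDATE: ("PUT", "PATCH"),
    CrudAction.DELETE: ("DELETE",),
}


def crud_action_for(method: str) -> CrudAction:
    """Return the CRUD action conventionally tied to an HTTP method."""
    wanted = method.upper()
    for action, methods in HTTP_METHODS.items():
        if wanted in methods:
            return action
    raise ValueError(f"no conventional CRUD action for {method!r}")
```

Treat this as a convention rather than a strict rule: an individual endpoint may use a different method than the table suggests, as the settings update endpoint on this page does with POST.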

Cognee’s Endpoints

Below is a quick reference for how these CRUD operations map to cognee’s API endpoints. You can find detailed information in cognee’s API Reference.

1. Create

  • Add Data
    POST /add
    Upload new data (e.g., files, documents) to a specified dataset. The uploaded files will be stored and available for further processing (e.g., chunk extraction, graph population).
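
As an illustration of what an upload could look like on the wire, the sketch below builds (but does not send) a multipart POST /add request using only the standard library. The base URL and the form field names `datasetName` and `data` are assumptions made for this example; check the API Reference for the fields your cognee version expects.

```python
import uuid
from urllib.request import Request

# Assumption: a local cognee server; adjust to your deployment.
BASE_URL = "http://localhost:8000/api/v1"


def build_add_request(dataset_name: str, filename: str, content: bytes) -> Request:
    """Build a multipart/form-data POST /add request without sending it."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        # Assumed field name for the target dataset.
        f'Content-Disposition: form-data; name="datasetName"\r\n\r\n'
        f"{dataset_name}\r\n"
        f"--{boundary}\r\n"
        # Assumed field name for the uploaded file.
        f'Content-Disposition: form-data; name="data"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return Request(
        url=f"{BASE_URL}/add",
        data=head + content + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; an HTTP client library with built-in multipart support is usually more convenient in real code.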

2. Read

  • Get Datasets
    GET /datasets
    Retrieve a list of available datasets.

  • Get Dataset Data
    GET /datasets/{dataset_id}/data
    Retrieves the data entries (documents, files, etc.) associated with a specific dataset.

  • Get Dataset Graph
    GET /datasets/{dataset_id}/graph
    Obtain the graph visualization URL for a particular dataset.

  • Get Dataset Status
    GET /datasets/status
    Check the status of one or more datasets.

  • Get Raw Data
    GET /datasets/{dataset_id}/data/{data_id}/raw
    Downloads the original file for a specific data entry.

  • Get Settings
    GET /settings
    Retrieves current system configurations (LLM settings, vector DB configurations, etc.).
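
The read endpoints above share a small set of path templates. A minimal sketch of filling them in safely follows; the helper names are hypothetical, and the identifiers are percent-encoded so that an unusual ID cannot alter the path structure.

```python
from urllib.parse import quote


def _segment(value: str) -> str:
    """Percent-encode a value so it is safe as a single path segment."""
    if not value:
        raise ValueError("path segment must be non-empty")
    return quote(value, safe="")


def read_paths(dataset_id: str, data_id: str) -> dict[str, str]:
    """Return the GET paths listed above for one dataset and one data entry."""
    ds, item = _segment(dataset_id), _segment(data_id)
    return {
        "datasets": "/datasets",
        "dataset_data": f"/datasets/{ds}/data",
        "dataset_graph": f"/datasets/{ds}/graph",
        "dataset_status": "/datasets/status",
        "raw_data": f"/datasets/{ds}/data/{item}/raw",
        "settings": "/settings",
    }
```

Any query parameters an endpoint accepts (for example, to select which datasets a status check covers) are not shown here because this page does not specify them.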

3. Update

  • Save or Update Settings
    POST /settings
    Updates system configurations. This does not modify datasets directly but affects how data is processed and stored.

4. Delete

  • Delete a Dataset
    DELETE /datasets/{dataset_id}
    Removes a specific dataset by its ID, including all associated data from cognee’s storage.
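
Because deleting a dataset also removes all of its associated data, client code may want an explicit confirmation step before issuing the request. The sketch below shows one way to do that; the base URL and the `confirm` flag are assumptions of this example, not part of cognee's API.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: a local cognee server; adjust to your deployment.
BASE_URL = "http://localhost:8000/api/v1"


def build_delete_request(dataset_id: str, *, confirm: bool = False) -> Request:
    """Build a DELETE /datasets/{dataset_id} request, refusing unless confirmed."""
    if not confirm:
        raise ValueError("deleting a dataset removes all of its data; pass confirm=True")
    if not dataset_id:
        raise ValueError("dataset_id must be non-empty")
    # Encode the ID so it stays a single path segment.
    url = f"{BASE_URL}/datasets/{quote(dataset_id, safe='')}"
    return Request(url, method="DELETE")
```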

Deleting a Single Document from a Dataset

Currently, cognee does not offer a single “delete document” endpoint for partially removing files from a dataset’s graph. However, you can remove a file from the dataset’s metastore using a script provided in the codebase, which ensures the file will not be processed in subsequent runs.

Once removed from the metastore:

  1. The file will no longer appear in the dataset listing.
  2. If you re-run cognify on that dataset, the removed file will not be included in the new graph build.

Note: A feature for removing a single document from the knowledge graph itself is in development. Until it is released, manually removing the file from the metastore is the best workaround if you do not want to delete the entire dataset.


Example CRUD Workflow in cognee

A typical sequence using cognee’s RESTful API might look like this:

1. Create - upload your documents or data to a new dataset

POST /add

2. Read - verify the dataset was created and inspect its contents

GET /datasets
GET /datasets/{dataset_id}/data

3. Cognify - once the dataset is verified, trigger cognitive processing (e.g., generate embeddings, extract knowledge graphs, etc.)

POST /cognify
{ "datasets": ["..."] }

4. Read (Graph Insights) - check the resulting graph and insights

GET /datasets/{dataset_id}/graph
GET /datasets/status

5. Update - adjust system settings if necessary

POST /settings
{ "llm_provider": "...", "vector_db": "..." }

6. Search - now that the dataset is processed, run queries to discover insights

POST /search
{ "dataset_id": "{dataset_id}", "query": "...?" }

7. Delete - remove the dataset entirely if no longer needed

DELETE /datasets/{dataset_id}
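
The sequence above can be written down as data before any request is sent, which makes the order of calls easy to review or log. This is a sketch under stated assumptions: `Step` and `plan_workflow` are hypothetical names, the `datasets` list is assumed to hold the dataset identifier, and the settings body is supplied by the caller because its values are deployment-specific.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Step:
    label: str
    method: str
    path: str
    body: Optional[dict] = None

    def render(self) -> str:
        """Format the step as 'METHOD /path' plus a JSON body when present."""
        line = f"{self.method} {self.path}"
        return line if self.body is None else f"{line} {json.dumps(self.body)}"


def plan_workflow(dataset_id: str, query: str, settings: dict) -> list[Step]:
    """Return the workflow steps in the order described above."""
    return [
        Step("create", "POST", "/add"),
        Step("read", "GET", "/datasets"),
        Step("read", "GET", f"/datasets/{dataset_id}/data"),
        # Assumption: the list identifies datasets by the same ID used in paths.
        Step("cognify", "POST", "/cognify", {"datasets": [dataset_id]}),
        Step("read", "GET", f"/datasets/{dataset_id}/graph"),
        Step("read", "GET", "/datasets/status"),
        Step("update", "POST", "/settings", settings),
        Step("search", "POST", "/search", {"dataset_id": dataset_id, "query": query}),
        Step("delete", "DELETE", f"/datasets/{dataset_id}"),
    ]
```

Calling `render()` on each step yields one request line per step, which is convenient for a dry run before wiring the plan to an HTTP client.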

See an example with code.

Cognee SDK Overview: CRUD in Practice

The cognee SDK offers a structured approach to data management through its core components, which align with CRUD operations. Here’s a summary from the CRUD perspective:

Create -> Data Ingestion: cognee.add(text: str) adds new text data to the metastore for future graph and embedding generation.

Read -> Data Retrieval: cognee.search(...) searches the knowledge graph or embeddings, enabling quick data exploration.

Update -> Data Processing: cognee.cognify() processes ingested data, including generating embeddings and building knowledge graphs.

Delete -> Data Pruning: cognee.prune.prune_data() removes stored documents, embeddings, or graph elements when they are no longer relevant.

These functions let you manage the full data lifecycle within cognee, making it easier to create, read, update, and delete data elements programmatically.
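
The four SDK calls can be chained into one lifecycle. The sketch below assumes they are awaited as coroutines and that `search` accepts a `query_text` keyword; both may differ between cognee versions. To stay runnable without a cognee installation, it drives the lifecycle against a hypothetical stand-in object (`make_fake_sdk`) that only records calls.

```python
import asyncio
from types import SimpleNamespace


async def run_lifecycle(sdk, text: str, query: str):
    """Create, process, read, then delete, using cognee-style coroutines."""
    await sdk.add(text)                            # Create: ingest text
    await sdk.cognify()                            # Update: build graph and embeddings
    results = await sdk.search(query_text=query)   # Read: keyword name is an assumption
    await sdk.prune.prune_data()                   # Delete: remove stored data
    return results


def make_fake_sdk(log: list):
    """Stand-in that records calls instead of touching any storage."""
    async def add(text):
        log.append(("add", text))

    async def cognify():
        log.append(("cognify",))

    async def search(query_text):
        log.append(("search", query_text))
        return [f"stub result for {query_text!r}"]

    async def prune_data():
        log.append(("prune_data",))

    return SimpleNamespace(
        add=add,
        cognify=cognify,
        search=search,
        prune=SimpleNamespace(prune_data=prune_data),
    )


calls = []
results = asyncio.run(
    run_lifecycle(make_fake_sdk(calls), "Cognee builds knowledge graphs.", "What does cognee build?")
)
```

With cognee installed and configured, passing the imported `cognee` module as `sdk` would exercise the real functions instead, subject to the signature assumptions noted above.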


Join the Conversation!

If you have additional questions or suggestions, feel free to reach out on our Discord channel or open an issue in our GitHub repo.