Project Modules Documentation
This document provides an overview of the modules in the project, their purposes, and the key functionalities they handle.
1. Settings Module
Handles configuration and settings management for the system.
- save_llm_config.py: Saves configuration for language model settings.
- save_vector_db_config.py: Saves settings for vector database configurations.
- get_settings.py: Retrieves general settings.
- get_current_settings.py: Retrieves the currently active settings.
2. Ingestion Module
Manages data ingestion, classification, and identification processes.
- save_data_to_file.py: Saves ingested data to files.
- classify.py: Classifies datasets during ingestion.
- discover_directory_datasets.py: Discovers datasets in specified directories.
- get_matched_datasets.py: Retrieves datasets matching specific criteria.
- identify.py: Identifies data characteristics.
- Submodule: data_types
  - TextData.py: Handles text data ingestion.
  - BinaryData.py: Manages binary data ingestion.
  - IngestionData.py: Defines generic ingestion data structures.
- Submodule: exceptions
  - exceptions.py: Defines custom exceptions for ingestion-related issues.
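As an illustration of directory-based dataset discovery, the sketch below treats every directory that directly contains files as one dataset. The function name `discover_datasets` and the dot-separated naming rule are assumptions made for this example; they are not taken from discover_directory_datasets.py.

```python
import os


def discover_datasets(root_dir):
    """Map dataset names to the files found under root_dir (hypothetical helper).

    Assumption: a dataset is any directory at or below root_dir that directly
    contains files; its name is the relative path with separators replaced by dots.
    """
    datasets = {}
    for current_dir, _subdirs, files in os.walk(root_dir):
        if not files:
            continue  # directories without files are not datasets in this sketch
        relative = os.path.relpath(current_dir, root_dir)
        name = "root" if relative == "." else relative.replace(os.sep, ".")
        datasets[name] = sorted(os.path.join(current_dir, f) for f in files)
    return datasets
```

Calling `discover_datasets("data")` on a tree with `data/a/x.txt` and `data/a/b/y.txt` would yield two entries under this naming rule, `a` and `a.b`.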
3. Graph Module
Focuses on graph-related operations, including graph creation, manipulation, and utility functions.
- Submodule: utils
  - Utility scripts for node and edge handling, such as:
    - convert_node_to_data_point.py
    - deduplicate_nodes_and_edges.py
- Submodule: cognee_graph
  - Core graph handling, including:
    - CogneeGraph.py: Main graph class.
    - CogneeAbstractGraph.py: Abstract base for graph implementations.
- Submodule: models
  - EdgeType.py: Defines edge types within the graph.
- Submodule: exceptions
  - exceptions.py: Custom exceptions for graph operations.
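To make the node and edge deduplication utility more concrete, here is a minimal sketch of the general idea. The function name `dedupe_graph_items` and the data shapes (nodes as dicts with an "id" key, edges as tuples) are assumptions for illustration and may differ from what deduplicate_nodes_and_edges.py actually does.

```python
def dedupe_graph_items(nodes, edges):
    """Drop repeated nodes and edges while keeping first-seen order.

    Assumptions: each node is a dict with an "id" key, and each edge is a
    hashable (source_id, target_id, relationship_name) tuple.
    """
    seen_node_ids = set()
    unique_nodes = []
    for node in nodes:
        if node["id"] not in seen_node_ids:
            seen_node_ids.add(node["id"])
            unique_nodes.append(node)

    seen_edges = set()
    unique_edges = []
    for edge in edges:
        if edge not in seen_edges:
            seen_edges.add(edge)
            unique_edges.append(edge)

    return unique_nodes, unique_edges
```

Keeping the first occurrence preserves insertion order, which makes the result deterministic for a given input.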
4. Pipelines Module
Executes and manages workflows made up of interconnected tasks. It provides tools for defining tasks, organizing them into pipelines, running them sequentially or in parallel, and monitoring their execution status.
- models/: Defines pipeline and task models.
- operations/: Contains operations for running tasks, parallelization, logging pipeline statuses, and retrieving pipeline states.
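The difference between sequential and parallel execution can be sketched with plain `asyncio`. Everything named here (`run_sequentially`, `run_in_parallel`, `add_one`, `double`) is hypothetical and not part of the pipelines module's API; `asyncio.gather` runs the tasks concurrently, which stands in for parallel execution in this example.

```python
import asyncio


async def run_sequentially(tasks, data):
    """Feed the output of each task into the next one."""
    for task in tasks:
        data = await task(data)
    return data


async def run_in_parallel(tasks, data):
    """Run every task on the same input concurrently; results keep task order."""
    return await asyncio.gather(*(task(data) for task in tasks))


# Two toy tasks used only for the demonstration below.
async def add_one(value):
    return value + 1


async def double(value):
    return value * 2


if __name__ == "__main__":
    print(asyncio.run(run_sequentially([add_one, double], 3)))  # (3 + 1) * 2 = 8
    print(asyncio.run(run_in_parallel([add_one, double], 3)))   # [4, 6]
```

In the sequential case each task depends on the previous result, so order matters; in the concurrent case the tasks are independent and only share the input.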
5. Chunking Module
Handles text chunking for processing and storage.
- TextChunker.py: Main chunking logic.
- models/DocumentChunk.py: Defines the structure of document chunks.
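For readers unfamiliar with text chunking, the following sketch shows one common approach: fixed-size word windows with a small overlap between neighbouring chunks. The function `chunk_text`, its parameters, and the dict fields are assumptions for illustration; TextChunker.py and DocumentChunk.py may use a different strategy and structure.

```python
def chunk_text(text, max_words=50, overlap=10):
    """Split text into word-based chunks of at most max_words words.

    Consecutive chunks share `overlap` words so that context is not lost
    at chunk boundaries. Returns a list of dicts (a hypothetical chunk shape).
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("require max_words > 0 and 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap
    chunks = []
    for index, start in enumerate(range(0, len(words), step)):
        piece = words[start:start + max_words]
        chunks.append({
            "chunk_index": index,
            "text": " ".join(piece),
            "word_count": len(piece),
        })
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

Word-based windows are the simplest option; sentence- or paragraph-aware splitting would follow the same shape with a different unit.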
6. Cognify Module
Handles configuration and initialization of the system.
- config.py: System configuration settings.
7. Search Module
Manages search functionality, including query and result logging.
- models/: Defines search-related models such as Query and Result.
- operations/: Includes scripts for handling queries and results.
8. Retrieval Module
Handles retrieval operations.
- description_to_codepart_search.py: Maps descriptions to code parts.
- brute_force_triplet_search.py: Implements a brute-force approach to triplet searches.
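A brute-force search scores every candidate rather than using an index. The sketch below applies that idea to (source, relationship, target) triplets. The function name `search_triplets_exhaustively` and the word-overlap score are assumptions chosen for simplicity; the scoring used in brute_force_triplet_search.py is not described here and is likely different.

```python
def search_triplets_exhaustively(query, triplets, top_k=3):
    """Score every triplet against the query and return the best top_k.

    Assumption: relevance is the number of lowercase words a triplet shares
    with the query, a stand-in for whatever scoring the real module uses.
    """
    query_words = set(query.lower().split())

    def score(triplet):
        triplet_words = set(" ".join(triplet).lower().split())
        return len(query_words & triplet_words)

    # Exhaustive pass: every triplet is scored, none is skipped.
    scored = [(score(triplet), triplet) for triplet in triplets]
    scored = [item for item in scored if item[0] > 0]
    # The sort is stable, so ties keep their original input order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [triplet for _score, triplet in scored[:top_k]]
```

The cost grows linearly with the number of triplets, which is the usual trade-off of a brute-force approach: simple and exact, but slow on large graphs.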
9. Users Module
Manages user-related functionality, including authentication, permissions, and user data.
- Submodule: methods
  - Handles user-related operations such as creation, deletion, and authentication.
- Submodule: models
  - Defines user-related models, including User, Group, and Permission.
- Submodule: permissions
  - Manages permissions on documents and resources.
- Submodule: exceptions
  - Custom exceptions for user-related operations.
- Submodule: authentication
  - Handles user authentication mechanisms.
10. Data Module
Handles data operations, processing, and management.
- Submodule: methods
  - Includes dataset and data management scripts.
- Submodule: processing
  - Processes document types like ImageDocument, AudioDocument, and TextDocument.
- Submodule: operations
  - Operations like translation, language detection, and metadata handling.
- Submodule: extraction
  - Extracts topics, summaries, and categories.
  - Includes knowledge graph extraction utilities.
11. Engine Module
Provides utilities and models for system operations.
- Submodule: utils
  - Node and edge generation utilities.
- Submodule: models
  - Defines entities and their types.
For further details on each module, refer to the inline documentation and comments within the respective files.