Modal Deployment

Deploy Cognee on Modal for serverless, auto-scaling knowledge graph processing with minimal infrastructure management.
Modal is a cloud platform that lets you run code remotely with automatic scaling, perfect for variable Cognee workloads.

Why Modal?

Serverless Scaling

Automatically scales based on workload without server management

Cost Efficient

Pay only for compute time used, ideal for batch processing

Fast Deployment

Deploy within seconds with minimal configuration

GPU Support

Access to powerful GPUs for LLM processing when needed

Prerequisites

1. Modal Account

Create a free account at modal.com

2. Install Modal CLI

pip install modal
modal token new

3. Environment Variables

Set up your environment variables:
# Required
export OPENAI_API_KEY="your-openai-api-key"

# Optional - for external databases
export POSTGRES_URL="postgresql://user:pass@host:5432/db"
export NEO4J_URL="bolt://user:pass@host:7687"
export QDRANT_URL="http://host:6333"
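A missing API key is a common deployment failure (see Troubleshooting below), so it can help to validate the environment before deploying. The following is a hypothetical pre-flight helper, not part of Cognee or Modal; the variable names match the exports above.

```python
import os

# Hypothetical pre-flight check (not part of Cognee): confirm required
# variables are set and report which optional database URLs are present.
REQUIRED = ["OPENAI_API_KEY"]
OPTIONAL = ["POSTGRES_URL", "NEO4J_URL", "QDRANT_URL"]

def check_env() -> dict:
    missing = [name for name in REQUIRED if not os.environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {missing}")
    # Optional variables simply fall back to embedded databases when unset.
    return {name: bool(os.environ.get(name)) for name in OPTIONAL}
```

Run this locally before `modal run`; a RuntimeError here is cheaper than a failed container start.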

Quick Deployment

1. Clone Repository

git clone https://github.com/topoteretes/cognee.git
cd cognee

2. Install Dependencies

# Install with uv (recommended)
uv sync --dev --all-extras --reinstall

# Activate virtual environment
source .venv/bin/activate

3. Deploy to Modal

# Run the Modal deployment script
modal run -d modal_deployment.py

The -d flag runs the deployment in detached mode. Monitor progress in your Modal dashboard.

4. Monitor Deployment

Visit your Modal dashboard to monitor the deployment status and view logs.
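For orientation, the shape of a Modal deployment script looks roughly like the sketch below. The repository's real modal_deployment.py is more complete; the app name, secret name, and process_document function here are illustrative assumptions, not Cognee's actual definitions.

```python
import modal

# Minimal sketch of a Modal deployment script. Names are illustrative,
# not the ones used in Cognee's actual modal_deployment.py.
app = modal.App("cognee-app")

# Container image: a slim Debian base with cognee installed.
image = modal.Image.debian_slim().pip_install("cognee")

@app.function(
    image=image,
    timeout=600,  # raise this if large datasets hit container timeouts
    secrets=[modal.Secret.from_name("openai-api-key")],  # assumed secret name
)
def process_document(text: str) -> None:
    import cognee  # imported inside the container, where it is installed
    # ... call cognee.add(...) / cognee.cognify() here ...

@app.local_entrypoint()
def main() -> None:
    # .remote() runs the function in a Modal container; for many documents,
    # .map() fans out across parallel containers.
    process_document.remote("example document")
```

Storing OPENAI_API_KEY as a Modal secret (rather than baking it into the image) keeps credentials out of your code and logs.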

Configuration Options

  • Basic Setup
  • Production Setup
  • Hybrid Setup

The Basic Setup uses the default configuration with embedded databases, suitable for quick testing:

# modal_deployment.py configuration
GRAPH_DATABASE = "networkx"
VECTOR_DATABASE = "lancedb"
RELATIONAL_DATABASE = "sqlite"

Deployment Architecture

Compute Resources

Modal automatically provisions compute resources based on your workload:
  • CPU: 2-16 cores per container
  • Memory: 4-64 GB RAM per container
  • GPU: Optional NVIDIA GPUs for LLM processing
  • Storage: Ephemeral storage per container

Modal scales your deployment automatically:
  • Cold Start: ~2-5 seconds to spin up new containers
  • Concurrent Processing: Multiple containers for parallel workloads
  • Auto-shutdown: Containers shut down when idle to save costs

Configure persistent storage for your data:
  • Volumes: Modal volumes for persistent file storage
  • External DBs: Connect to managed database services
  • S3 Integration: Direct S3 access for large datasets

Monitoring & Debugging

Modal Dashboard

Real-time Monitoring: View logs, metrics, and container status in the Modal web interface.

Log Streaming

Live Logs: Stream logs directly to your terminal:
modal logs cognee-app

Video Tutorial

Cost Optimization

Batch Processing: Group multiple documents together to maximize container utilization and reduce cold start costs.
Database Costs: Consider using Modal’s built-in storage for development and external managed services for production.
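The batch processing tip above can be sketched in plain Python. This is a hypothetical helper, not a Cognee or Modal API; the idea is that fewer, larger remote calls mean fewer cold starts and better utilization per container.

```python
from typing import Iterable, Iterator

# Hypothetical helper illustrating the batching tip: group documents
# before sending them to Modal so each container processes a full batch.
def batched(docs: Iterable[str], batch_size: int) -> Iterator[list[str]]:
    batch: list[str] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch

# Each batch would then be a single remote call, e.g.:
# for batch in batched(documents, 25):
#     process_documents.remote(batch)  # hypothetical Modal function
```

Tune the batch size against container memory: larger batches amortize cold starts but risk the memory errors described under Troubleshooting.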

Troubleshooting

Container Timeout
  • Increase timeout limits in modal_deployment.py
  • Break large datasets into smaller batches

Memory Errors
  • Increase container memory allocation
  • Use streaming processing for large files

Missing API Keys
  • Ensure all required environment variables are set
  • Use Modal secrets for sensitive data

Database Connections
  • Verify database URLs and credentials
  • Check network connectivity from Modal containers
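For the "verify database URLs" step, a stdlib-only sanity check can catch malformed URLs before a container ever tries to connect. This is a hypothetical helper; the schemes and default ports match the example URLs in the Prerequisites section.

```python
from urllib.parse import urlparse

# Hypothetical URL sanity check: parse each database URL and confirm it
# has the scheme and host the target service expects.
EXPECTED_PORTS = {
    "postgresql": 5432,  # POSTGRES_URL
    "bolt": 7687,        # NEO4J_URL
    "http": 6333,        # QDRANT_URL (Qdrant's default REST port)
}

def check_db_url(url: str) -> tuple[str, str, int]:
    parsed = urlparse(url)
    if parsed.scheme not in EXPECTED_PORTS:
        raise ValueError(f"Unexpected scheme: {parsed.scheme!r}")
    if not parsed.hostname:
        raise ValueError("URL has no hostname")
    port = parsed.port or EXPECTED_PORTS[parsed.scheme]
    return parsed.scheme, parsed.hostname, port
```

A parse failure here points to a bad URL; if parsing succeeds but connections still fail, check credentials and network reachability from the Modal container.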

Next Steps

Scale Up

Production Deployment: Configure external databases and optimize for production workloads.

Monitor Usage

Track Costs: Monitor compute usage and optimize batch sizes for cost efficiency.

Need Help?

Join our community for Modal deployment support and best practices.