Configuration

🚀 Configure Vector and Graph Stores

You can configure the vector and graph stores either through environment variables in your .env file or programmatically. Configuration is handled with Pydantic Settings.

There is a global configuration object (cognee.config), plus individual configurations at the pipeline and data store levels.

Check available configuration options:

from cognee.infrastructure.databases.vector import get_vectordb_config
from cognee.infrastructure.databases.graph.config import get_graph_config
from cognee.infrastructure.databases.relational import get_relational_config
from cognee.infrastructure.llm.config import get_llm_config

print(get_vectordb_config().to_dict())
print(get_graph_config().to_dict())
print(get_relational_config().to_dict())
print(get_llm_config().to_dict())

Set the environment variables in your .env file, and Pydantic will pick them up:

VECTOR_DB_PROVIDER="lancedb"
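To preview which key/value pairs a .env file contains before handing it to cognee, a minimal standard-library sketch follows. The helper `parse_env_text` is hypothetical and is not part of cognee or Pydantic; it handles only simple `KEY=value` lines with optional quotes and ignores everything else, so treat it as an illustration rather than a replacement for the real loader.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines (hypothetical helper, not cognee's loader)."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


sample = """
# stores used by cognee
VECTOR_DB_PROVIDER="lancedb"
GRAPH_DATABASE_PROVIDER = 'networkx'
"""

settings = parse_env_text(sample)
print(settings)
```

Passing the contents of your own .env file instead of `sample` gives a quick view of what is actually set.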

Otherwise, you can set the configuration yourself:

import cognee

cognee.config.set_llm_provider('ollama')

Make sure to keep your API keys secure and never commit them to version control.

Settings for database engines

Relational Engines

SQLite is the default and needs no further settings (the line below can be omitted as well):

DB_PROVIDER=sqlite

A Postgres instance requires a bit more. Add the following to your .env file:

DB_PROVIDER=postgres
DB_HOST=127.0.0.1
DB_PORT=5432
DB_USERNAME=cognee
DB_PASSWORD=cognee
DB_NAME=cognee_db
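Before starting cognee, it can help to confirm that these values describe a reachable database by assembling them into a connection URL for another client such as psql. The sketch below is only a sanity check: `postgres_url` is a hypothetical helper, and how cognee builds its own connection string internally is not shown on this page.

```python
from urllib.parse import quote


def postgres_url(env: dict[str, str]) -> str:
    """Build a postgresql:// URL from the DB_* variables (hypothetical helper)."""
    # Percent-encode credentials so characters like '@' or '/' stay unambiguous.
    user = quote(env["DB_USERNAME"], safe="")
    password = quote(env["DB_PASSWORD"], safe="")
    host = env["DB_HOST"]
    port = env["DB_PORT"]
    name = env["DB_NAME"]
    return f"postgresql://{user}:{password}@{host}:{port}/{name}"


example_env = {
    "DB_PROVIDER": "postgres",
    "DB_HOST": "127.0.0.1",
    "DB_PORT": "5432",
    "DB_USERNAME": "cognee",
    "DB_PASSWORD": "cognee",
    "DB_NAME": "cognee_db",
}

url = postgres_url(example_env)
print(url)
```

The resulting URL can be passed to a Postgres client to verify host, port, and credentials independently of cognee.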

Vector Engines

LanceDB

LanceDB is the default vector database. It is a file-based vector store. To use it, add the following to your .env file (the line can be omitted, since LanceDB is the default):

VECTOR_DB_PROVIDER="lancedb"

PGVector

If you are already using Postgres, PGVector is a natural choice.

VECTOR_DB_PROVIDER="pgvector"

If you’re using AWS-hosted Postgres, you need to run CREATE EXTENSION vector; before first use. See the official AWS docs.

Qdrant

VECTOR_DB_PROVIDER="qdrant"
VECTOR_DB_URL=https://url-to-your-qdrant-cloud-instance.cloud.qdrant.io:6333
VECTOR_DB_KEY=your-qdrant-api-key

Weaviate

VECTOR_DB_PROVIDER="weaviate"
VECTOR_DB_URL=https://url-to-your-weaviate-cloud-instance.weaviate.cloud
VECTOR_DB_KEY=your-weaviate-api-key

Embedding Engines

The embedding engine embeds data before it is stored in the vector database. Cognee supports different embedding providers through adapters.

To configure it, add the following to your .env file:

Example with Mistral:

EMBEDDING_PROVIDER=mistral
EMBEDDING_MODEL=mistral/mistral-embed
EMBEDDING_API_KEY=mistral-api-key
EMBEDDING_MAX_TOKENS=8000
EMBEDDING_DIMENSIONS=1024

Example with Azure OpenAI Embeddings:

EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=azure/text-embedding-3-large
EMBEDDING_ENDPOINT=https://azure-project.openai.azure.com/openai/deployments/text-embedding-3-large
EMBEDDING_API_KEY=azure-openai-api-key
EMBEDDING_API_VERSION=2024-12-01-preview

Example with Fastembed:

Fastembed is used for the codegraph pipeline because OpenAI models usually hit their rate limits when ingesting a codebase.

EMBEDDING_PROVIDER=fastembed
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
EMBEDDING_DIMENSIONS=384
EMBEDDING_MAX_TOKENS=256
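The numeric settings in these examples are easy to mistype, so a small check may be useful. The sketch below assumes that EMBEDDING_DIMENSIONS is meant to equal the length of the vectors the chosen model returns; that interpretation is an assumption, and both `embedding_limits` and `vector_matches` are hypothetical helpers rather than cognee functions.

```python
def embedding_limits(env: dict[str, str]) -> tuple[int, int]:
    """Return (dimensions, max_tokens) as integers (hypothetical helper)."""
    dimensions = int(env["EMBEDDING_DIMENSIONS"])
    max_tokens = int(env["EMBEDDING_MAX_TOKENS"])
    if dimensions <= 0 or max_tokens <= 0:
        raise ValueError(
            "EMBEDDING_DIMENSIONS and EMBEDDING_MAX_TOKENS must be positive"
        )
    return dimensions, max_tokens


def vector_matches(vector: list[float], env: dict[str, str]) -> bool:
    """Check a vector's length against EMBEDDING_DIMENSIONS (assumed meaning)."""
    dimensions, _ = embedding_limits(env)
    return len(vector) == dimensions


fastembed_env = {
    "EMBEDDING_PROVIDER": "fastembed",
    "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2",
    "EMBEDDING_DIMENSIONS": "384",
    "EMBEDDING_MAX_TOKENS": "256",
}

print(embedding_limits(fastembed_env))
print(vector_matches([0.0] * 384, fastembed_env))
print(vector_matches([0.0] * 1024, fastembed_env))
```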

Graph Engines

NetworkX

NetworkX is the default graph database. It is a file-based graph store. To use it, add the following to your .env file:

GRAPH_DATABASE_PROVIDER="networkx"

It persists the data as JSON in the .cognee_system/databases/cognee_graph.pkl file.

Kuzu

For Kuzu graph database, add the following to your .env file:

GRAPH_DATABASE_PROVIDER="kuzu"

Its persistence works slightly differently: it writes files to the .cognee_system/databases/cognee_graph.pkl directory.

Neo4j

Neo4j needs a bit more setup. A default configuration for a locally hosted Neo4j instance looks like this:

GRAPH_DATABASE_PROVIDER="neo4j"
GRAPH_DATABASE_URL=bolt://localhost:7687
GRAPH_DATABASE_USERNAME=neo4j
GRAPH_DATABASE_PASSWORD=pleaseletmein

If you installed Neo4j manually, don’t forget to install the apoc and graph-data-science plugins.


For the cloud instance, add the following to your .env file:

GRAPH_DATABASE_PROVIDER=neo4j
GRAPH_DATABASE_USERNAME=neo4j
GRAPH_DATABASE_URL=neo4j+ssc://12345678.databases.neo4j.io
GRAPH_DATABASE_PASSWORD=neo4j-api-key
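The local and cloud examples differ mainly in the URL scheme: bolt:// is an unencrypted direct connection, while neo4j+ssc:// is an encrypted, routed connection that accepts self-signed certificates. A minimal sketch for inspecting a GRAPH_DATABASE_URL before use follows; `describe_neo4j_url` is a hypothetical helper, and falling back to port 7687 reflects Neo4j's default Bolt port rather than anything cognee-specific.

```python
from urllib.parse import urlparse

# URI schemes accepted by the official Neo4j drivers.
NEO4J_SCHEMES = {"bolt", "bolt+s", "bolt+ssc", "neo4j", "neo4j+s", "neo4j+ssc"}


def describe_neo4j_url(url: str) -> dict:
    """Split a Neo4j connection URL into its parts (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.scheme not in NEO4J_SCHEMES:
        raise ValueError(f"Unsupported Neo4j URL scheme: {parsed.scheme!r}")
    return {
        "scheme": parsed.scheme,
        "host": parsed.hostname,
        # 7687 is Neo4j's default Bolt port when none is given.
        "port": parsed.port or 7687,
        # The "+s" and "+ssc" suffixes both request an encrypted connection.
        "encrypted": "+" in parsed.scheme,
    }


local = describe_neo4j_url("bolt://localhost:7687")
cloud = describe_neo4j_url("neo4j+ssc://12345678.databases.neo4j.io")
print(local)
print(cloud)
```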

FalkorDB

VECTOR_DB_PROVIDER=falkordb
VECTOR_DB_URL=localhost
VECTOR_DB_PORT=6379

GRAPH_DATABASE_PROVIDER=falkordb
GRAPH_DATABASE_URL=localhost
GRAPH_DATABASE_PORT=6379
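Since each provider expects a different set of variables, a quick completeness check can catch a missing entry before cognee starts. The sketch below is built only from the examples on this page: the `REQUIRED` table and the `missing_variables` helper are hypothetical, not part of cognee, and the table may be incomplete for providers or options not covered here.

```python
# Variables each provider needs, as inferred from the examples on this page.
# Providers that work without extra settings (lancedb, networkx, kuzu) are omitted.
REQUIRED = {
    "VECTOR_DB_PROVIDER": {
        "qdrant": ["VECTOR_DB_URL", "VECTOR_DB_KEY"],
        "weaviate": ["VECTOR_DB_URL", "VECTOR_DB_KEY"],
        "falkordb": ["VECTOR_DB_URL", "VECTOR_DB_PORT"],
    },
    "GRAPH_DATABASE_PROVIDER": {
        "neo4j": [
            "GRAPH_DATABASE_URL",
            "GRAPH_DATABASE_USERNAME",
            "GRAPH_DATABASE_PASSWORD",
        ],
        "falkordb": ["GRAPH_DATABASE_URL", "GRAPH_DATABASE_PORT"],
    },
}


def missing_variables(env: dict[str, str]) -> list[str]:
    """List variables the selected providers need but env lacks (hypothetical)."""
    missing = []
    for provider_key, providers in REQUIRED.items():
        provider = env.get(provider_key, "").lower()
        for name in providers.get(provider, []):
            if not env.get(name):
                missing.append(name)
    return missing


falkordb_env = {
    "VECTOR_DB_PROVIDER": "falkordb",
    "VECTOR_DB_URL": "localhost",
    "VECTOR_DB_PORT": "6379",
    "GRAPH_DATABASE_PROVIDER": "falkordb",
    "GRAPH_DATABASE_URL": "localhost",
    "GRAPH_DATABASE_PORT": "6379",
}
incomplete_neo4j_env = {
    "GRAPH_DATABASE_PROVIDER": "neo4j",
    "GRAPH_DATABASE_URL": "bolt://localhost:7687",
}

print(missing_variables(falkordb_env))
print(missing_variables(incomplete_neo4j_env))
```

Calling `missing_variables(dict(os.environ))` after importing `os` would run the same check against the current process environment.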