Canonical reference for configuring cognee-rust. The complete, field-level source of truth is the Settings struct and the ConfigManager runtime API — build the rustdoc with cargo doc -p cognee-lib --no-deps --open to browse every field and setter with its type. This page groups those fields by subsystem and gives the env-var name and default for each.

How configuration resolves

Three layers, lowest precedence first:
  1. Defaults: Settings::default() in crates/lib/src/config.rs.
  2. Persisted config file (CLI only): JSON at ~/.config/cognee-rust/config.json ($XDG_CONFIG_HOME/cognee-rust/config.json), managed by cognee-cli config. See crates/cli/src/config_store.rs.
  3. Environment variables: bound by Settings::overlay_from_env(). A .env file in the working directory (or any ancestor) is loaded automatically via dotenv.
So: defaults < config.json < env. At runtime, code can also mutate settings through ConfigManager’s set_* methods (below) or the binding config APIs. Parsing notes: booleans accept true|1|yes / false|0|no (cognee_utils::parse_env_bool); empty env values are treated as unset; numeric vars that fail to parse are ignored.
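The layering and parsing rules above can be sketched in a few lines of plain Rust. This is an illustration only, not the actual Settings::overlay_from_env() or cognee_utils::parse_env_bool code: the function names parse_bool and resolve_u32 are hypothetical, the environment is modelled as a HashMap so the example stays deterministic, and trimming whitespace is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a boolean env parser.
/// Accepts only the spellings listed in the parsing notes; anything else is None.
fn parse_bool(raw: &str) -> Option<bool> {
    match raw.trim() {
        "true" | "1" | "yes" => Some(true),
        "false" | "0" | "no" => Some(false),
        _ => None,
    }
}

/// Resolve one numeric setting across the three layers:
/// built-in default < persisted config file < environment variable.
fn resolve_u32(
    default: u32,
    from_file: Option<u32>,
    env: &HashMap<String, String>,
    var: &str,
) -> u32 {
    let from_env = env
        .get(var)
        .map(|v| v.trim())
        .filter(|v| !v.is_empty()) // an empty env value is treated as unset
        .and_then(|v| v.parse::<u32>().ok()); // an unparsable number is ignored
    from_env.or(from_file).unwrap_or(default)
}
```

With a default of 2 and a persisted value of 4, an env value of "6" wins, while an empty or non-numeric env value falls through to the persisted 4.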

LLM

Read by the LLM adapter. For more detail, see tools/cli (retries) and the cognee-llm rustdoc (adapter).

| Env var (aliases) | Settings field | Default |
|---|---|---|
| LLM_PROVIDER | llm_provider | openai |
| LLM_MODEL / OPENAI_MODEL | llm_model | openai/gpt-5-mini |
| LLM_API_KEY / OPENAI_TOKEN | llm_api_key | (empty) |
| LLM_ENDPOINT / OPENAI_URL | llm_endpoint | (empty) |
| LLM_API_VERSION | llm_api_version | (empty) |
| LLM_TEMPERATURE | llm_temperature | 0.0 |
| LLM_STREAMING | llm_streaming | false |
| LLM_MAX_COMPLETION_TOKENS / LLM_MAX_TOKENS | llm_max_completion_tokens | 16384 |
| LLM_MAX_RETRIES | llm_max_retries | 2 |
| LLM_MAX_PARALLEL_REQUESTS | llm_max_parallel_requests | 20 |
| MOCK_LLM | llm_mock | false |
| MOCK_LLM_CASSETTE | llm_cassette | (empty) |
| COGNEE_RECORD_LLM | llm_record_path | (empty) |

A fallback LLM (llm_fallback_provider/_model/_endpoint/_api_key) is configurable programmatically (no env binding). MOCK_LLM + cassettes power the offline benchmark — see performance/mock-benchmark.md.
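Several rows in the LLM table list an alias next to the primary variable name. A minimal sketch of how such a lookup could work is shown here. It assumes the primary name takes precedence over the alias, which this page does not state; first_set and llm_model are hypothetical helpers, and the environment is again modelled as a HashMap.

```rust
use std::collections::HashMap;

/// Return the first non-empty value among `names`, checked in the order given.
/// Assumption: the primary name is listed first and therefore wins over aliases.
fn first_set<'a>(env: &'a HashMap<String, String>, names: &[&str]) -> Option<&'a str> {
    names
        .iter()
        .filter_map(|name| env.get(*name))
        .map(|v| v.trim())
        .find(|v| !v.is_empty())
}

/// Resolve the model name from LLM_MODEL or its alias OPENAI_MODEL,
/// falling back to the documented default.
fn llm_model(env: &HashMap<String, String>) -> String {
    first_set(env, &["LLM_MODEL", "OPENAI_MODEL"])
        .unwrap_or("openai/gpt-5-mini")
        .to_string()
}
```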

Embedding

Read by EmbeddingConfig::from_env() (crates/embedding/src/config.rs).

| Env var (aliases) | Settings field | Default |
|---|---|---|
| EMBEDDING_PROVIDER | embedding_provider | openai (onnx on Android) |
| EMBEDDING_MODEL | embedding_model_name | text-embedding-3-small (BGE-Small-v1.5 on Android) |
| EMBEDDING_DIMENSIONS | embedding_dimensions | 1536 (384 on Android) |
| EMBEDDING_ENDPOINT | embedding_endpoint | (empty) |
| EMBEDDING_API_KEY (falls back to LLM_API_KEY) | embedding_api_key | (empty) |
| EMBEDDING_API_VERSION | embedding_api_version | (empty) |
| EMBEDDING_MODEL_PATH / COGNEE_E2E_EMBED_MODEL_PATH | embedding_model_path | ./target/models/BGE-Small-v1.5-model_quantized.onnx |
| EMBEDDING_TOKENIZER_PATH / COGNEE_E2E_TOKENIZER_PATH | embedding_tokenizer_path | ./target/models/bge-small-tokenizer.json |
| EMBEDDING_MAX_SEQUENCE_LENGTH | embedding_max_sequence_length | 512 |
| EMBEDDING_BATCH_SIZE | embedding_batch_size | 32 |
| MOCK_EMBEDDING | (provider override) | false (also accepts deterministic) |

Provider values: onnx, fastembed, openai, openai_compatible, ollama, mock.
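To make the provider list and the API-key fallback concrete, here is a small sketch. The EmbeddingProvider enum and embedding_api_key function are hypothetical and are not the types in crates/embedding/src/config.rs; only the six provider strings and the fallback from EMBEDDING_API_KEY to LLM_API_KEY come from this page.

```rust
use std::str::FromStr;

/// Hypothetical enum mirroring the documented provider values.
#[derive(Debug, Clone, Copy, PartialEq)]
enum EmbeddingProvider {
    Onnx,
    Fastembed,
    OpenAi,
    OpenAiCompatible,
    Ollama,
    Mock,
}

impl FromStr for EmbeddingProvider {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim() {
            "onnx" => Ok(Self::Onnx),
            "fastembed" => Ok(Self::Fastembed),
            "openai" => Ok(Self::OpenAi),
            "openai_compatible" => Ok(Self::OpenAiCompatible),
            "ollama" => Ok(Self::Ollama),
            "mock" => Ok(Self::Mock),
            other => Err(format!("unknown embedding provider: {other}")),
        }
    }
}

/// Use the embedding key when it is set and non-empty, otherwise the LLM key.
fn embedding_api_key(embedding_key: Option<&str>, llm_key: Option<&str>) -> Option<String> {
    embedding_key
        .filter(|k| !k.is_empty())
        .or(llm_key.filter(|k| !k.is_empty()))
        .map(str::to_string)
}
```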

Vector database

| Env var | Settings field | Default |
|---|---|---|
| VECTOR_DB_PROVIDER | vector_db_provider | lancedb (embedded, persistent) on non-Android; falls back to brute-force (in-memory) on Android |
| VECTOR_DB_URL | vector_db_url | (empty, which resolves to {system_root_directory}/databases/cognee.lancedb; set to :memory: to force the in-memory brute-force store) |
| VECTOR_DB_HOST / VECTOR_DB_PORT | vector_db_host / vector_db_port | (empty) / 1234 |
| VECTOR_DB_NAME / VECTOR_DB_KEY | vector_db_name / vector_db_key | (empty) |
| VECTOR_DB_USERNAME / VECTOR_DB_PASSWORD | | (empty) |

Supported providers:
  • lancedb — embedded Apache-Arrow / Lance vector store, on disk. Default on every target except Android. The on-disk layout matches the Python SDK’s default LanceDB store, so a Rust deployment can be opened from Python and vice versa.
  • brute-force — pure-Rust in-memory linear scan. Default on Android (where LanceDB’s native stack does not cross-compile). Selected on any target by setting vector_db_url = ":memory:".
  • pgvector — Postgres + the pgvector extension; requires the pgvector Cargo feature on the binary build.
Qdrant support lives in the closed cognee-cloud-rs product as the cognee-vector-qdrant crate and is not part of OSS. See tools/backends. In OSS, setting vector_db_provider to qdrant is rejected at component initialization: it returns a config error rather than falling back to another store.
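The selection rules for the vector store can be summarised as a small decision function. This is a sketch under stated assumptions, not the OSS initialization code: VectorBackend and select_vector_backend are hypothetical names, the error strings are invented for illustration, and checking the :memory: URL before the provider name is an assumption about ordering.

```rust
/// Hypothetical enum for the three OSS vector stores described above.
#[derive(Debug, PartialEq)]
enum VectorBackend {
    LanceDb,
    BruteForce,
    PgVector,
}

/// Pick a backend from the provider name and URL.
/// Assumption: a `:memory:` URL is checked first and overrides the provider.
fn select_vector_backend(provider: &str, url: &str) -> Result<VectorBackend, String> {
    if url == ":memory:" {
        return Ok(VectorBackend::BruteForce);
    }
    match provider {
        "lancedb" => Ok(VectorBackend::LanceDb),
        "brute-force" => Ok(VectorBackend::BruteForce),
        "pgvector" => Ok(VectorBackend::PgVector),
        // Rejected with a config error instead of silently falling back.
        "qdrant" => Err("qdrant is not available in the OSS build".to_string()),
        other => Err(format!("unknown vector_db_provider: {other}")),
    }
}
```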

Graph database

| Env var | Settings field | Default |
|---|---|---|
| GRAPH_DATABASE_PROVIDER | graph_database_provider | ladybug |
| GRAPH_FILE_PATH | graph_file_path | (empty; defaults under the system root) |
| GRAPH_DATABASE_URL | graph_database_url | (empty) |
| GRAPH_DATABASE_HOST / GRAPH_DATABASE_PORT | graph_database_host / graph_database_port | (empty) / 123 |
| GRAPH_DATABASE_NAME / GRAPH_DATABASE_KEY | | (empty) |
| GRAPH_DATABASE_USERNAME / GRAPH_DATABASE_PASSWORD | | (empty) |

Supported providers: ladybug/kuzu (embedded), postgres (feature pggraph). When Postgres graph credentials are unset they fall back to the relational DB_* config (see roadmap/cognify-compatibility-plan.md).

Relational database

| Env var | Settings field | Default |
|---|---|---|
| DATABASE_URL | relational_db_url | sqlite:./cognee.db?mode=rwc |
| DB_PROVIDER | db_provider | sqlite |
| DB_HOST / DB_PORT | db_host / db_port | localhost / 5432 |
| DB_NAME | db_name | cognee_db |
| DB_USERNAME / DB_PASSWORD | | (empty) |

Chunking & tokenizer

Read by crates/chunking/src/config.rs. Most chunking knobs are Settings/CognifyConfig fields without env bindings: chunk_strategy (default PARAGRAPH), chunk_size (default 1500), chunk_overlap (default 10), and chunk_engine. The token counter is env-selected:

| Env var | Purpose | Default |
|---|---|---|
| COGNEE_TOKEN_COUNTER | tiktoken / word / huggingface (hf) | auto from embedding provider |
| HUGGINGFACE_TOKENIZER | model id when counter = huggingface | (empty) |

Ontology

| Env var | Settings field | Default |
|---|---|---|
| ONTOLOGY_FILE_PATH | ontology_file_path | (empty) |
| ONTOLOGY_RESOLVER | ontology_resolver | rdflib |
| ONTOLOGY_MATCHING_STRATEGY | ontology_matching_strategy | fuzzy |

System paths, users & datasets

| Env var | Settings field | Default |
|---|---|---|
| COGNEE_SYSTEM_ROOT_DIRECTORY | system_root_directory | ./.cognee_system |
| COGNEE_DATA_ROOT_DIRECTORY | data_root_directory | ./.data_storage |
| CACHE_ROOT_DIRECTORY | cache_root_directory | ./.cognee_cache |
| COGNEE_DEFAULT_USER_ID | default_user_id | nil UUID |
| COGNEE_DEFAULT_DATASET_NAME | default_dataset_name | main_dataset |
| DEFAULT_USER_EMAIL / DEFAULT_USER_PASSWORD | | default_user@example.com / (empty) |
| ENABLE_BACKEND_ACCESS_CONTROL | enable_access_control | false |

Setting system_root_directory cascades to the default graph_file_path and vector_db_url unless those are set explicitly.
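The cascade can be illustrated for vector_db_url, whose derived location ({system_root_directory}/databases/cognee.lancedb) is given in the vector database section. The helper below is a hypothetical sketch, not the Settings implementation; the default graph_file_path is left out because this page only says it lives under the system root.

```rust
use std::path::Path;

/// Return the explicit URL when one is set; otherwise derive the LanceDB
/// location from the system root directory.
fn effective_vector_db_url(system_root: &Path, explicit_url: &str) -> String {
    if !explicit_url.is_empty() {
        // An explicit value is never overwritten by the cascade.
        return explicit_url.to_string();
    }
    system_root
        .join("databases")
        .join("cognee.lancedb")
        .to_string_lossy()
        .into_owned()
}
```

Changing the root moves the derived path with it, while an explicit value such as :memory: is returned unchanged.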

Session / cache & rate limiting

| Env var | Settings field | Default |
|---|---|---|
| CACHE_BACKEND | cache_backend | fs |
| CACHE_HOST / CACHE_PORT | | localhost / 6379 |
| SESSION_TTL_SECONDS | session_ttl_seconds | 604800 (7d) |
| CACHING | enable_caching | true |
| LLM_RATE_LIMIT_ENABLED / _REQUESTS / _INTERVAL | | false / 60 / 60 |
| EMBEDDING_RATE_LIMIT_ENABLED / _REQUESTS / _INTERVAL | | false / 60 / 60 |

Logging

Canonical table. Binding READMEs and .env.example link here. cognee writes structured logs to stdout and (when writable) to a rotating file. File logging is owned by cognee-logging, initialised by the CLI and HTTP server via cognee_logging::init_logging.

| Env var | Default | Purpose |
|---|---|---|
| COGNEE_LOG_FILE | true | Master file-logging toggle (false/0/no disables). |
| COGNEE_LOGS_DIR | ~/.cognee/logs | Log directory (falls back to /tmp/cognee_logs if unwritable). |
| COGNEE_LOG_FORMAT | plain | plain (Python-compatible) or json. Applies to stdout + file. |
| COGNEE_LOG_ROTATION | daily | daily / hourly / minutely / never. |
| COGNEE_LOG_BACKUP_COUNT | 5 | Rotated files retained by age. |
| COGNEE_LOG_MAX_FILES | 10 | Hard cap on retained log files. |
| LOG_FILE_NAME | (timestamped) | Override the log file name. |
| RUST_LOG / LOG_LEVEL | info | Level filter (RUST_LOG preferred). |

Multi-process warning — when several cognee processes share one log file via LOG_FILE_NAME, rotation is not coordinated; concurrent rotation can corrupt the log. For sharded workers, give each shard its own COGNEE_LOGS_DIR (or unset LOG_FILE_NAME per shard).
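One way to follow the per-shard advice is to set the environment on each worker before it is spawned. The sketch below only builds the command and does not run it. The program name passed in and the shard-N directory naming are illustrative assumptions; only the COGNEE_LOGS_DIR and LOG_FILE_NAME variables come from this page.

```rust
use std::path::Path;
use std::process::Command;

/// Build (but do not spawn) a worker command whose logs go to a
/// shard-specific directory, so no two workers rotate the same file.
fn shard_command(program: &str, logs_root: &Path, shard: u32) -> Command {
    let mut cmd = Command::new(program);
    // Each shard gets its own log directory under the common root.
    cmd.env("COGNEE_LOGS_DIR", logs_root.join(format!("shard-{shard}")));
    // Drop any inherited fixed file name so the default naming applies.
    cmd.env_remove("LOG_FILE_NAME");
    cmd
}
```

A supervisor would call shard_command once per worker and then spawn() each result.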

Observability & telemetry

cognee emits OpenTelemetry traces (behind the telemetry feature) and opt-out product analytics. The deep references are observability/opentelemetry.md and observability/send_telemetry.md; the env surface:

| Env var | Default | Purpose |
|---|---|---|
| COGNEE_TRACING_ENABLED | false | Activate OTLP trace export. |
| OTEL_EXPORTER_OTLP_ENDPOINT | (empty) | OTLP collector endpoint (non-empty also activates tracing). |
| OTEL_SERVICE_NAME | cognee | Service name attribute. |
| OTEL_EXPORTER_OTLP_HEADERS / _PROTOCOL | (empty) / grpc | Exporter headers / protocol. |
| OTEL_SPAN_PROCESSOR | batch | batch or simple. |
| OTEL_TRACES_SAMPLER / _ARG | (empty) | Sampler selection. |
| TELEMETRY_DISABLED, or ENV set to test or dev | (unset) | Opt out of product analytics. |
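The two activation rules in the table (tracing turns on via the flag or a non-empty endpoint, and analytics turns off via TELEMETRY_DISABLED or a test/dev ENV) can be sketched as predicates. These helpers are hypothetical and take the raw env values as arguments. Treating any non-empty TELEMETRY_DISABLED value as an opt-out is an assumption, since this page does not say which values count.

```rust
/// True for the documented truthy spellings: true, 1, yes.
fn is_truthy(raw: Option<&str>) -> bool {
    matches!(raw.map(str::trim), Some("true" | "1" | "yes"))
}

/// Tracing is active when COGNEE_TRACING_ENABLED is truthy
/// or OTEL_EXPORTER_OTLP_ENDPOINT is non-empty.
fn tracing_active(tracing_enabled: Option<&str>, otlp_endpoint: Option<&str>) -> bool {
    is_truthy(tracing_enabled) || otlp_endpoint.map_or(false, |e| !e.trim().is_empty())
}

/// Product analytics stay on unless the user opted out.
/// Assumption: any non-empty TELEMETRY_DISABLED value counts as an opt-out.
fn analytics_enabled(telemetry_disabled: Option<&str>, env: Option<&str>) -> bool {
    let opted_out = telemetry_disabled.map_or(false, |v| !v.is_empty());
    let non_production = matches!(env, Some("test" | "dev"));
    !(opted_out || non_production)
}
```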

HTTP server

The server binary reads its own env surface (crates/http-server/src/config.rs) — host/port, auth, body limits, pipeline registry, notebooks, health probes. See tools/http-server and http-server/architecture.md §config.

Cloud

Cloud/Auth0 configuration (COGNEE_CLOUD_URL, COGNEE_AUTH0_*) and the serve()/disconnect() flow live in the closed cognee-cloud-rs product (the cognee-cloud crate) and are not part of OSS.

Runtime configuration API

ConfigManager (Arc<RwLock<Settings>>) exposes typed setters used by the bindings and CLI. Families: set_llm_*, set_embedding_*, set_vector_db_*, set_graph_*, set_chunk_*, set_relational_db_*, set_*_root_directory, set_ontology_*, set_classification_model / set_summarization_model / set_summarization_schema, plus four bulk setters (set_llm_config, set_embedding_config, set_vector_db_config, set_graph_db_config) and a generic set(key, value). Introspection: read(), version(), get_settings() (secrets masked). Full signatures are in the ConfigManager rustdoc. The binding ergonomics (granular JS setters vs generic set in Python/C) are documented in tools/bindings §config.
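To show the shape of such an API, here is a deliberately tiny stand-in built on Arc<RwLock<...>>. DemoSettings and DemoConfigManager are hypothetical types with three fields, not the real Settings or ConfigManager; the "***" mask and the error messages are assumptions. The real signatures are in the rustdoc.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical three-field stand-in for the real Settings struct.
#[derive(Debug, Clone, Default)]
struct DemoSettings {
    llm_model: String,
    llm_max_retries: u32,
    llm_api_key: String,
}

/// Hypothetical manager: cheap to clone, all clones share one settings value.
#[derive(Clone, Default)]
struct DemoConfigManager {
    inner: Arc<RwLock<DemoSettings>>,
}

impl DemoConfigManager {
    /// Typed setter, in the style of the set_llm_* family.
    fn set_llm_model(&self, model: &str) {
        self.inner.write().unwrap().llm_model = model.to_string();
    }

    /// Generic string-keyed setter; unknown keys and bad values are errors.
    fn set(&self, key: &str, value: &str) -> Result<(), String> {
        let mut s = self.inner.write().unwrap();
        match key {
            "llm_model" => s.llm_model = value.to_string(),
            "llm_max_retries" => {
                s.llm_max_retries = value
                    .parse::<u32>()
                    .map_err(|e| format!("{key}: {e}"))?
            }
            "llm_api_key" => s.llm_api_key = value.to_string(),
            _ => return Err(format!("unknown key: {key}")),
        }
        Ok(())
    }

    /// Snapshot with secrets masked (the "***" mask is an assumption).
    fn get_settings(&self) -> DemoSettings {
        let mut copy = self.inner.read().unwrap().clone();
        if !copy.llm_api_key.is_empty() {
            copy.llm_api_key = "***".to_string();
        }
        copy
    }
}
```

Keeping the lock inside the manager means callers never hold a guard across their own code, and the masked snapshot can be logged without leaking the key.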

CLI config subcommand

cognee-cli config get|set|unset <key> reads/writes the persisted JSON file. The settable keys are the snake_case Settings field names — see known_keys(). Example:
cognee-cli config set llm_max_retries 4
cognee-cli config get llm_model
cognee-cli config unset embedding_endpoint