Settings struct and the
ConfigManager runtime API — build the rustdoc with
cargo doc -p cognee-lib --no-deps --open to browse every field and setter with
its type. This page groups those fields by subsystem and gives the env-var name
and default for each.
How configuration resolves
Three layers, lowest precedence first:
- Defaults: Settings::default() in crates/lib/src/config.rs.
- Persisted config file (CLI only): JSON at ~/.config/cognee-rust/config.json ($XDG_CONFIG_HOME/cognee-rust/config.json), managed by cognee-cli config. See crates/cli/src/config_store.rs.
- Environment variables: bound by Settings::overlay_from_env(). A .env file in the working directory (or any ancestor) is loaded automatically via dotenv.
defaults < config.json < env. At runtime, code can also mutate settings
through ConfigManager’s set_* methods (below) or the binding config APIs.
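The precedence order can be illustrated with a minimal sketch. It is not the actual Settings implementation: MiniSettings, overlay and resolve are hypothetical names, the two fields are a small subset chosen for illustration, and both layers are modelled as plain string maps.

```rust
use std::collections::HashMap;

/// Hypothetical two-field stand-in for the real `Settings` struct.
#[derive(Debug, Clone, PartialEq)]
pub struct MiniSettings {
    pub llm_provider: String,
    pub llm_max_retries: u32,
}

impl Default for MiniSettings {
    fn default() -> Self {
        // Layer 1: defaults (values taken from the LLM table on this page).
        Self {
            llm_provider: "openai".to_string(),
            llm_max_retries: 2,
        }
    }
}

/// Apply one layer of string overrides on top of `settings`.
/// Keys that are absent, empty, or (for numbers) unparsable leave the
/// current value untouched.
fn overlay(settings: &mut MiniSettings, layer: &HashMap<String, String>) {
    if let Some(v) = layer.get("llm_provider").filter(|v| !v.is_empty()) {
        settings.llm_provider = v.clone();
    }
    if let Some(n) = layer
        .get("llm_max_retries")
        .and_then(|v| v.parse::<u32>().ok())
    {
        settings.llm_max_retries = n;
    }
}

/// Resolve in precedence order: defaults < config.json < env.
pub fn resolve(
    file: &HashMap<String, String>,
    env: &HashMap<String, String>,
) -> MiniSettings {
    let mut settings = MiniSettings::default();
    overlay(&mut settings, file); // layer 2: persisted config file
    overlay(&mut settings, env); // layer 3: environment variables win
    settings
}
```

Because each layer is applied in turn, a key set only in the file survives, while a key set in both the file and the environment takes the environment value.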
Parsing notes: booleans accept true|1|yes / false|0|no (cognee_utils::parse_env_bool);
empty env values are treated as unset; numeric vars that fail to parse are ignored.
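The parsing rules can be sketched as follows. This is a hypothetical re-implementation for illustration only: the real helper is cognee_utils::parse_env_bool, whose signature is not shown on this page, and the trimming and case-insensitive matching here are assumptions.

```rust
/// Boolean rule: true|1|yes and false|0|no are accepted.
/// Anything else, including the empty string, is treated as unset (`None`).
/// Assumption: input is trimmed and matched case-insensitively.
pub fn parse_bool(raw: &str) -> Option<bool> {
    match raw.trim().to_ascii_lowercase().as_str() {
        "true" | "1" | "yes" => Some(true),
        "false" | "0" | "no" => Some(false),
        _ => None,
    }
}

/// Numeric rule: a value that fails to parse is ignored,
/// so the current setting is kept.
pub fn overlay_u32(current: u32, raw: Option<&str>) -> u32 {
    raw.and_then(|v| v.trim().parse::<u32>().ok())
        .unwrap_or(current)
}
```

With these helpers, overlay_u32(20, Some("abc")) keeps the current value 20, and parse_bool("") yields None rather than false.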
LLM
Read by the LLM adapter. The deep reference is tools/cli for retries and the cognee-llm rustdoc for the adapter.
| Env var (aliases) | Settings field | Default |
|---|---|---|
| LLM_PROVIDER | llm_provider | openai |
| LLM_MODEL / OPENAI_MODEL | llm_model | openai/gpt-5-mini |
| LLM_API_KEY / OPENAI_TOKEN | llm_api_key | (empty) |
| LLM_ENDPOINT / OPENAI_URL | llm_endpoint | (empty) |
| LLM_API_VERSION | llm_api_version | (empty) |
| LLM_TEMPERATURE | llm_temperature | 0.0 |
| LLM_STREAMING | llm_streaming | false |
| LLM_MAX_COMPLETION_TOKENS / LLM_MAX_TOKENS | llm_max_completion_tokens | 16384 |
| LLM_MAX_RETRIES | llm_max_retries | 2 |
| LLM_MAX_PARALLEL_REQUESTS | llm_max_parallel_requests | 20 |
| MOCK_LLM | llm_mock | false |
| MOCK_LLM_CASSETTE | llm_cassette | (empty) |
| COGNEE_RECORD_LLM | llm_record_path | (empty) |
The fallback LLM (llm_fallback_provider/_model/_endpoint/_api_key) is configurable
programmatically only (no env binding). MOCK_LLM and cassettes power the offline
benchmark; see performance/mock-benchmark.md.
Embedding
Read by EmbeddingConfig::from_env() (crates/embedding/src/config.rs).
| Env var (aliases) | Settings field | Default |
|---|---|---|
| EMBEDDING_PROVIDER | embedding_provider | openai (onnx on Android) |
| EMBEDDING_MODEL | embedding_model_name | text-embedding-3-small (BGE-Small-v1.5 on Android) |
| EMBEDDING_DIMENSIONS | embedding_dimensions | 1536 (384 on Android) |
| EMBEDDING_ENDPOINT | embedding_endpoint | (empty) |
| EMBEDDING_API_KEY (falls back to LLM_API_KEY) | embedding_api_key | (empty) |
| EMBEDDING_API_VERSION | embedding_api_version | (empty) |
| EMBEDDING_MODEL_PATH / COGNEE_E2E_EMBED_MODEL_PATH | embedding_model_path | ./target/models/BGE-Small-v1.5-model_quantized.onnx |
| EMBEDDING_TOKENIZER_PATH / COGNEE_E2E_TOKENIZER_PATH | embedding_tokenizer_path | ./target/models/bge-small-tokenizer.json |
| EMBEDDING_MAX_SEQUENCE_LENGTH | embedding_max_sequence_length | 512 |
| EMBEDDING_BATCH_SIZE | embedding_batch_size | 32 |
| MOCK_EMBEDDING | (provider override) | false (also accepts deterministic) |
Providers: onnx, fastembed, openai, openai_compatible, ollama, mock.
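The EMBEDDING_API_KEY fallback noted in the table can be sketched as a small pure function. The function name and the choice to treat whitespace-only values as unset are assumptions for illustration; the actual logic lives in EmbeddingConfig::from_env().

```rust
/// Pick the embedding API key: EMBEDDING_API_KEY wins when it is set and
/// non-empty, otherwise LLM_API_KEY is used. Returns `None` when neither
/// is usable. (Hypothetical helper, not the crate's real API.)
pub fn embedding_api_key(
    embedding_key: Option<&str>,
    llm_key: Option<&str>,
) -> Option<String> {
    // Empty env values are treated as unset, as described in the parsing notes.
    fn non_empty(value: Option<&str>) -> Option<String> {
        value
            .map(str::trim)
            .filter(|s| !s.is_empty())
            .map(String::from)
    }
    non_empty(embedding_key).or_else(|| non_empty(llm_key))
}
```

In this sketch an empty EMBEDDING_API_KEY does not block the fallback, so a single LLM_API_KEY is enough when both services share a credential.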
Vector database
| Env var | Settings field | Default |
|---|---|---|
| VECTOR_DB_PROVIDER | vector_db_provider | lancedb (embedded, persistent) on non-Android; falls back to brute-force (in-memory) on Android |
| VECTOR_DB_URL | vector_db_url | (empty; defaults to {system_root_directory}/databases/cognee.lancedb; set to :memory: to force the in-memory brute-force store) |
| VECTOR_DB_HOST / VECTOR_DB_PORT | vector_db_host / vector_db_port | (empty) / 1234 |
| VECTOR_DB_NAME / VECTOR_DB_KEY | vector_db_name / vector_db_key | (empty) |
| VECTOR_DB_USERNAME / VECTOR_DB_PASSWORD | … | (empty) |
- lancedb: embedded Apache-Arrow / Lance vector store, on disk. Default on every target except Android. The on-disk layout matches the Python SDK’s default LanceDB store, so a Rust deployment can be opened from Python and vice versa.
- brute-force: pure-Rust in-memory linear scan. Default on Android (where LanceDB’s native stack does not cross-compile). Selected on any target by setting vector_db_url = ":memory:".
- pgvector: Postgres + the pgvector extension; requires the pgvector Cargo feature on the binary build.
Qdrant support ships in the closed cognee-cloud-rs product as the cognee-vector-qdrant crate
and is not part of OSS. See tools/backends.
Setting vector_db_provider to qdrant is rejected at component
initialization in OSS (it returns a config error rather than falling back).
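The selection rules above can be summarised in a minimal sketch. The enum, function name, and error strings are hypothetical, and checking :memory: before the provider name is an assumption; the real component initialization may order these checks differently.

```rust
use std::path::PathBuf;

/// Hypothetical summary of the OSS vector backends described above.
#[derive(Debug, PartialEq)]
pub enum VectorStore {
    LanceDb(PathBuf),
    BruteForce,
    PgVector,
}

/// Choose a backend from `vector_db_provider` and `vector_db_url`.
/// `pgvector_feature` stands in for the `pgvector` Cargo feature flag.
pub fn select_vector_store(
    provider: &str,
    url: &str,
    system_root: &str,
    pgvector_feature: bool,
) -> Result<VectorStore, String> {
    // Assumption: ":memory:" forces the brute-force store on any target.
    if url == ":memory:" {
        return Ok(VectorStore::BruteForce);
    }
    match provider {
        "lancedb" => {
            // An empty URL defaults to a path under the system root.
            let path = if url.is_empty() {
                PathBuf::from(system_root)
                    .join("databases")
                    .join("cognee.lancedb")
            } else {
                PathBuf::from(url)
            };
            Ok(VectorStore::LanceDb(path))
        }
        "brute-force" => Ok(VectorStore::BruteForce),
        "pgvector" if pgvector_feature => Ok(VectorStore::PgVector),
        "pgvector" => Err("pgvector requires the pgvector Cargo feature".to_string()),
        // Rejected with a config error instead of silently falling back.
        "qdrant" => Err("qdrant is not part of OSS".to_string()),
        other => Err(format!("unknown vector_db_provider: {other}")),
    }
}
```

Returning an error for qdrant, rather than defaulting to lancedb, mirrors the documented behaviour that a misconfigured provider fails loudly at initialization.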
Graph database
| Env var | Settings field | Default |
|---|---|---|
| GRAPH_DATABASE_PROVIDER | graph_database_provider | ladybug |
| GRAPH_FILE_PATH | graph_file_path | (empty; defaults under the system root) |
| GRAPH_DATABASE_URL | graph_database_url | (empty) |
| GRAPH_DATABASE_HOST / GRAPH_DATABASE_PORT | graph_database_host / graph_database_port | (empty) / 123 |
| GRAPH_DATABASE_NAME / GRAPH_DATABASE_KEY | … | (empty) |
| GRAPH_DATABASE_USERNAME / GRAPH_DATABASE_PASSWORD | … | (empty) |
Providers: ladybug/kuzu (embedded), postgres (feature pggraph).
When Postgres graph credentials are unset they fall back to the relational DB_*
config (see roadmap/cognify-compatibility-plan.md).
Relational database
| Env var | Settings field | Default |
|---|---|---|
| DATABASE_URL | relational_db_url | sqlite:./cognee.db?mode=rwc |
| DB_PROVIDER | db_provider | sqlite |
| DB_HOST / DB_PORT | db_host / db_port | localhost / 5432 |
| DB_NAME | db_name | cognee_db |
| DB_USERNAME / DB_PASSWORD | … | (empty) |
Chunking & tokenizer
Read by crates/chunking/src/config.rs. Most
chunking knobs (chunk_strategy default PARAGRAPH, chunk_size 1500,
chunk_overlap 10, chunk_engine) are Settings/CognifyConfig fields without
env bindings. The token counter is env-selected:
| Env var | Purpose | Default |
|---|---|---|
| COGNEE_TOKEN_COUNTER | tiktoken / word / huggingface (hf) | auto from embedding provider |
| HUGGINGFACE_TOKENIZER | model id when counter = huggingface | (empty) |
Ontology
| Env var | Settings field | Default |
|---|---|---|
| ONTOLOGY_FILE_PATH | ontology_file_path | (empty) |
| ONTOLOGY_RESOLVER | ontology_resolver | rdflib |
| ONTOLOGY_MATCHING_STRATEGY | ontology_matching_strategy | fuzzy |
System paths, users & datasets
| Env var | Settings field | Default |
|---|---|---|
| COGNEE_SYSTEM_ROOT_DIRECTORY | system_root_directory | ./.cognee_system |
| COGNEE_DATA_ROOT_DIRECTORY | data_root_directory | ./.data_storage |
| CACHE_ROOT_DIRECTORY | cache_root_directory | ./.cognee_cache |
| COGNEE_DEFAULT_USER_ID | default_user_id | nil UUID |
| COGNEE_DEFAULT_DATASET_NAME | default_dataset_name | main_dataset |
| DEFAULT_USER_EMAIL / DEFAULT_USER_PASSWORD | … | default_user@example.com / (empty) |
| ENABLE_BACKEND_ACCESS_CONTROL | enable_access_control | false |
system_root_directory cascades to the default graph_file_path and
vector_db_url unless those are set explicitly.
Session / cache & rate limiting
| Env var | Settings field | Default |
|---|---|---|
| CACHE_BACKEND | cache_backend | fs |
| CACHE_HOST / CACHE_PORT | … | localhost / 6379 |
| SESSION_TTL_SECONDS | session_ttl_seconds | 604800 (7d) |
| CACHING | enable_caching | true |
| LLM_RATE_LIMIT_ENABLED / _REQUESTS / _INTERVAL | … | false / 60 / 60 |
| EMBEDDING_RATE_LIMIT_ENABLED / _REQUESTS / _INTERVAL | … | false / 60 / 60 |
Logging
Canonical table. Binding READMEs and .env.example link here.
cognee writes structured logs to stdout and (when writable) to a rotating file. File logging is owned by cognee-logging, initialised by the CLI and HTTP server via cognee_logging::init_logging.
| Env var | Default | Purpose |
|---|---|---|
| COGNEE_LOG_FILE | true | Master file-logging toggle (false/0/no disables). |
| COGNEE_LOGS_DIR | ~/.cognee/logs | Log directory (falls back to /tmp/cognee_logs if unwritable). |
| COGNEE_LOG_FORMAT | plain | plain (Python-compatible) or json. Applies to stdout + file. |
| COGNEE_LOG_ROTATION | daily | daily / hourly / minutely / never. |
| COGNEE_LOG_BACKUP_COUNT | 5 | Rotated files retained by age. |
| COGNEE_LOG_MAX_FILES | 10 | Hard cap on retained log files. |
| LOG_FILE_NAME | (timestamped) | Override the log file name. |
| RUST_LOG / LOG_LEVEL | info | Level filter (RUST_LOG preferred). |
Multi-process warning: when several cognee processes share one log file via LOG_FILE_NAME, rotation is not coordinated; concurrent rotation can corrupt the log. For sharded workers, give each shard its own COGNEE_LOGS_DIR (or unset LOG_FILE_NAME per shard).
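One way to follow the per-shard advice is to derive a distinct log directory for each worker before launching it. The sketch below only builds the command and does not spawn anything; the shard-N directory naming and the program name passed in are assumptions, not something cognee prescribes.

```rust
use std::path::{Path, PathBuf};
use std::process::Command;

/// Hypothetical layout: one sub-directory per shard under a common base.
pub fn shard_logs_dir(base: &Path, shard: usize) -> PathBuf {
    base.join(format!("shard-{shard}"))
}

/// Build (but do not spawn) a worker command whose file logging is
/// isolated from the other shards.
pub fn shard_command(program: &str, base: &Path, shard: usize) -> Command {
    let mut cmd = Command::new(program);
    // Each shard gets its own COGNEE_LOGS_DIR; a shared LOG_FILE_NAME is
    // removed so rotation never touches the same file from two processes.
    cmd.env("COGNEE_LOGS_DIR", shard_logs_dir(base, shard))
        .env_remove("LOG_FILE_NAME");
    cmd
}
```

A supervisor could call shard_command once per shard index and then spawn each returned command.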
Observability & telemetry
cognee emits OpenTelemetry traces (behind the telemetry feature) and opt-out
product analytics. The deep references are
observability/opentelemetry.md and
observability/send_telemetry.md; the env surface:
| Env var | Default | Purpose |
|---|---|---|
| COGNEE_TRACING_ENABLED | false | Activate OTLP trace export. |
| OTEL_EXPORTER_OTLP_ENDPOINT | (empty) | OTLP collector endpoint (non-empty also activates tracing). |
| OTEL_SERVICE_NAME | cognee | Service name attribute. |
| OTEL_EXPORTER_OTLP_HEADERS / _PROTOCOL | (empty) / grpc | Exporter headers / protocol. |
| OTEL_SPAN_PROCESSOR | batch | batch or simple. |
| OTEL_TRACES_SAMPLER / _ARG | (empty) | Sampler selection. |
| TELEMETRY_DISABLED, ENV=test\|dev | (unset) | Opt out of product analytics. |
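The two activation paths for tracing (the explicit flag, or a non-empty collector endpoint) can be expressed as one predicate. This is a sketch with a hypothetical function name; it assumes the same true|1|yes boolean rule described under the parsing notes, and it does not model the telemetry Cargo feature gate.

```rust
/// Returns true when OTLP trace export would be activated, given the raw
/// values of COGNEE_TRACING_ENABLED and OTEL_EXPORTER_OTLP_ENDPOINT.
pub fn tracing_active(tracing_enabled: Option<&str>, otlp_endpoint: Option<&str>) -> bool {
    // Path 1: the explicit flag is set to a truthy value.
    let flag = matches!(
        tracing_enabled
            .map(|v| v.trim().to_ascii_lowercase())
            .as_deref(),
        Some("true" | "1" | "yes")
    );
    // Path 2: a non-empty endpoint also activates tracing.
    let endpoint_set = otlp_endpoint.map_or(false, |v| !v.trim().is_empty());
    flag || endpoint_set
}
```

Under this reading, setting only OTEL_EXPORTER_OTLP_ENDPOINT is enough to turn tracing on, while an empty endpoint with the flag unset leaves it off.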
HTTP server
The server binary reads its own env surface (crates/http-server/src/config.rs) —
host/port, auth, body limits, pipeline registry, notebooks, health probes. See
tools/http-server and
http-server/architecture.md §config.
Cloud
Cloud/Auth0 configuration (COGNEE_CLOUD_URL, COGNEE_AUTH0_*) and the
serve()/disconnect() flow live in the closed cognee-cloud-rs product
(the cognee-cloud crate) and are not part of OSS.
Runtime configuration API
ConfigManager (Arc<RwLock<Settings>>) exposes typed setters used by the
bindings and CLI. Families: set_llm_*, set_embedding_*, set_vector_db_*,
set_graph_*, set_chunk_*, set_relational_db_*, set_*_root_directory,
set_ontology_*, set_classification_model / set_summarization_model /
set_summarization_schema, plus four bulk setters (set_llm_config,
set_embedding_config, set_vector_db_config, set_graph_db_config) and a
generic set(key, value). Introspection: read(), version(), get_settings()
(secrets masked). Full signatures are in the
ConfigManager rustdoc. The binding ergonomics
(granular JS setters vs generic set in Python/C) are documented in
tools/bindings §config.
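The shape of this API (shared state behind Arc<RwLock<...>>, typed setters, a generic set, and a masked read) can be illustrated with a toy model. DemoSettings and DemoConfigManager are hypothetical and cover two fields only; the "***" mask and the String error type are assumptions, so consult the ConfigManager rustdoc for the real signatures.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical two-field stand-in for `Settings`.
#[derive(Debug, Clone, Default)]
pub struct DemoSettings {
    pub llm_model: String,
    pub llm_api_key: String,
}

/// Toy manager: clones share the same underlying settings.
#[derive(Clone, Default)]
pub struct DemoConfigManager {
    inner: Arc<RwLock<DemoSettings>>,
}

impl DemoConfigManager {
    /// Typed setter, in the style of the `set_llm_*` family.
    pub fn set_llm_model(&self, model: &str) {
        self.inner.write().expect("lock poisoned").llm_model = model.to_string();
    }

    /// Generic setter keyed by the snake_case field name.
    pub fn set(&self, key: &str, value: &str) -> Result<(), String> {
        let mut settings = self.inner.write().expect("lock poisoned");
        match key {
            "llm_model" => settings.llm_model = value.to_string(),
            "llm_api_key" => settings.llm_api_key = value.to_string(),
            other => return Err(format!("unknown key: {other}")),
        }
        Ok(())
    }

    /// Snapshot with secrets masked (the mask string is an assumption).
    pub fn get_settings(&self) -> DemoSettings {
        let mut copy = self.inner.read().expect("lock poisoned").clone();
        if !copy.llm_api_key.is_empty() {
            copy.llm_api_key = "***".to_string();
        }
        copy
    }
}
```

Because the state sits behind an Arc, a clone handed to a binding or the CLI observes writes made through any other clone.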
CLI config subcommand
cognee-cli config get|set|unset <key> reads/writes the persisted JSON file.
The settable keys are the snake_case Settings field names — see
known_keys(). Example: