Structured output backends ensure reliable data extraction from LLM responses. Cognee supports two frameworks that convert LLM text into structured Pydantic models for knowledge graph extraction and other tasks.
New to configuration? See the Setup Configuration Overview for the complete workflow: install extras → create .env → choose providers → handle pruning.

Supported Frameworks

Cognee supports two structured output approaches:
  • LiteLLM + Instructor — Provider-agnostic client with Pydantic coercion (default)
  • BAML — DSL-based framework with type registry and guardrails
Both frameworks produce the same Pydantic-validated outputs, so your application code remains unchanged regardless of which backend you choose.

How It Works

Cognee uses a unified interface that abstracts the underlying framework:
from cognee.infrastructure.llm.LLMGateway import LLMGateway
await LLMGateway.acreate_structured_output(text, system_prompt, response_model)
The STRUCTURED_OUTPUT_FRAMEWORK environment variable determines which backend processes your requests, but the API remains identical.
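To make the idea of one call signature with a swappable backend concrete, here is a minimal, self-contained sketch of environment-driven dispatch. It is not Cognee's implementation: the two backend functions are hypothetical placeholders that never call an LLM, and the `Person` dataclass stands in for a Pydantic response model so that the example runs with the standard library alone.

```python
import asyncio
import os
from dataclasses import dataclass


@dataclass
class Person:
    """Stand-in for a Pydantic response model (assumption for this sketch)."""
    name: str


async def _instructor_backend(text, system_prompt, response_model):
    # Hypothetical placeholder: a real backend would call the LLM here.
    return response_model(name=text.strip())


async def _baml_backend(text, system_prompt, response_model):
    # Hypothetical placeholder with the same contract as the one above.
    return response_model(name=text.strip())


_BACKENDS = {"instructor": _instructor_backend, "baml": _baml_backend}


async def acreate_structured_output(text, system_prompt, response_model):
    """Route the call to the backend named by STRUCTURED_OUTPUT_FRAMEWORK."""
    framework = os.environ.get("STRUCTURED_OUTPUT_FRAMEWORK", "instructor").lower()
    try:
        backend = _BACKENDS[framework]
    except KeyError:
        raise ValueError(f"Unknown structured output framework: {framework!r}") from None
    return await backend(text, system_prompt, response_model)


def demo(framework):
    """Select a backend through the environment, then make the same call."""
    os.environ["STRUCTURED_OUTPUT_FRAMEWORK"] = framework
    return asyncio.run(acreate_structured_output(" Ada ", "Extract the name.", Person))
```

Calling `demo("instructor")` and `demo("baml")` goes through the same function with the same arguments; only the environment variable differs, which mirrors how the caller's code stays unchanged when the backend is switched.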

Configuration

Instructor is the default framework and needs no extra install. It uses LiteLLM and the instructor library to coerce LLM responses into Pydantic models.
STRUCTURED_OUTPUT_FRAMEWORK=instructor
Optionally, control how the model is prompted for structured output:
# Override instructor mode (e.g. json_mode, tool_call, markdown_json_mode)
# Leave unset to use the provider's default — see "Instructor Modes" below.
LLM_INSTRUCTOR_MODE=json_schema_mode

Instructor Modes

When STRUCTURED_OUTPUT_FRAMEWORK=instructor, the instructor mode controls how Cognee asks the model for structured output — for example via the model’s native JSON-schema response, a plain JSON object, or a tool/function call. The value of LLM_INSTRUCTOR_MODE is passed directly to the instructor library’s Mode, so it must be one of instructor’s supported mode strings. LLM_INSTRUCTOR_MODE is empty by default. When it is unset, Cognee either applies a provider-specific mode or defers to the underlying Instructor/LiteLLM default, so in most cases you don’t need to set it at all:
LLM_PROVIDER | Behavior when LLM_INSTRUCTOR_MODE is unset
--- | ---
openai, azure with gpt-5 models | json_schema_mode
openai, azure with other models | use Instructor/LiteLLM default
AWS Bedrock | json_schema_mode
ollama, gemini, custom (OpenAI-compatible), llama.cpp | json_mode
anthropic | anthropic_tools
mistral | mistral_tools
Common values you can set explicitly include json_schema_mode, json_mode, tool_call, and markdown_json_mode.
Which mode for OpenAI models (e.g. gpt-5-mini)? Leave LLM_INSTRUCTOR_MODE unset, or set json_schema_mode — Cognee applies json_schema_mode to gpt-5 models, and it is the recommended mode for OpenAI models that support native JSON-schema responses. Only override it when you point Cognee at a custom or local OpenAI-compatible endpoint that rejects JSON-schema responses; in that case try json_mode first, then markdown_json_mode or tool_call.
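The defaults described in this section can be read as a simple lookup. The sketch below restates them in Python for illustration only; it is not Cognee's code. The function name `resolve_instructor_mode`, the provider identifiers `"bedrock"` and `"llama_cpp"`, and the substring check for `gpt-5` are assumptions made for this example.

```python
import os

# Providers that fall back to json_mode when no mode is set.
# The identifier "llama_cpp" is an assumed spelling for llama.cpp.
_JSON_MODE_PROVIDERS = {"ollama", "gemini", "custom", "llama_cpp"}


def resolve_instructor_mode(provider, model, env=None):
    """Return a mode string, or None to defer to the Instructor/LiteLLM default."""
    env = os.environ if env is None else env

    # An explicit LLM_INSTRUCTOR_MODE always wins over provider defaults.
    explicit = env.get("LLM_INSTRUCTOR_MODE", "").strip()
    if explicit:
        return explicit

    provider = provider.lower()
    if provider in {"openai", "azure"}:
        # Assumption: gpt-5 models are recognised by their model name.
        return "json_schema_mode" if "gpt-5" in model.lower() else None
    if provider == "bedrock":  # assumed identifier for AWS Bedrock
        return "json_schema_mode"
    if provider in _JSON_MODE_PROVIDERS:
        return "json_mode"
    if provider == "anthropic":
        return "anthropic_tools"
    if provider == "mistral":
        return "mistral_tools"
    return None
```

For example, `resolve_instructor_mode("openai", "gpt-5-mini", env={})` yields `json_schema_mode`, while passing `env={"LLM_INSTRUCTOR_MODE": "tool_call"}` returns the override regardless of provider.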

Important Notes

  • Unified Interface: Your application code uses the same acreate_structured_output() call regardless of framework
  • Provider Flexibility: Both frameworks support the same LLM providers
  • Output Consistency: Both return results validated against the same Pydantic models
  • Performance: Framework choice doesn’t significantly impact performance

Troubleshooting

The error Input should be a valid string ... input_type=dict appears during recall() / search() with completion search types such as GRAPH_COMPLETION, GRAPH_SUMMARY_COMPLETION, GRAPH_COMPLETION_COT, and RAG_COMPLETION.
Cause. These search types ask the LLM for a plain-text answer (the retriever uses response_model=str). When the configured instructor mode doesn’t match what your model/provider actually supports, the model wraps its answer in a JSON object instead of returning plain text. The instructor backend then can’t coerce that dict into the expected string field, so Pydantic raises the validation error. This is common with OpenAI-compatible, custom, and local (Ollama / LM Studio) endpoints.
Fixes:
  • Align the instructor mode with your provider. OpenAI/Azure gpt-4o/gpt-5 models work with json_schema_mode, which Cognee applies by default to gpt-5 models. Endpoints that don’t support JSON-schema responses usually need a different mode:
    # Try one of these for OpenAI-compatible / local endpoints
    LLM_INSTRUCTOR_MODE=json_mode
    # LLM_INSTRUCTOR_MODE=markdown_json_mode
    # LLM_INSTRUCTOR_MODE=tool_call
    
  • Switch to BAML if a small/local model keeps wrapping answers in JSON. BAML bypasses instructor’s coercion and is more forgiving of loose model output:
    STRUCTURED_OUTPUT_FRAMEWORK=baml
    BAML_LLM_PROVIDER=openai
    BAML_LLM_MODEL=gpt-4o-mini
    BAML_LLM_API_KEY=sk-...
    # For local/OpenAI-compatible endpoints:
    # BAML_LLM_ENDPOINT=http://localhost:11434/v1
    # BAML_LLM_API_KEY=ollama
    
  • Skip the LLM completion step to confirm retrieval works independently of model output formatting. Pass only_context=True to return the retrieved context directly — see Search Basics. If retrieval succeeds with only_context=True, the problem is the structured-output configuration above, not your graph.
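When working through the first fix, it can help to try the alternative modes in a fixed order and stop at the first one that works. The sketch below shows only that loop. The `probe` callable is hypothetical and must be supplied by you, for example a function that sets LLM_INSTRUCTOR_MODE for a fresh process, runs one completion search, and reports success. How and when Cognee picks up a changed mode is not covered here.

```python
# Order suggested above for endpoints that reject JSON-schema responses.
FALLBACK_MODES = ("json_mode", "markdown_json_mode", "tool_call")


def first_working_mode(probe, modes=FALLBACK_MODES):
    """Return the first mode for which probe(mode) is truthy, else None."""
    for mode in modes:
        try:
            if probe(mode):
                return mode
        except Exception:
            # Treat any error (for example a validation failure) as "did not work".
            continue
    return None


def example_probe(mode):
    """Hypothetical stand-in that simulates an endpoint accepting only tool_call."""
    if mode == "json_mode":
        raise ValueError("Input should be a valid string")
    return mode == "tool_call"
```

With the simulated probe, `first_working_mode(example_probe)` skips the two failing modes and settles on `tool_call`. If no mode succeeds the function returns `None`, which is the point at which switching to BAML is worth considering.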
