# Structured Output Backends

> Configure structured output frameworks for reliable data extraction in Cognee

Structured output backends ensure reliable data extraction from LLM responses. Cognee supports two frameworks that convert LLM text into structured Pydantic models for knowledge graph extraction and other tasks.

<Info>
  **New to configuration?**

  See the [Setup Configuration Overview](./overview) for the complete workflow:

  install extras → create `.env` → choose providers → handle pruning.
</Info>

## Supported Frameworks

Cognee supports two structured output approaches:

* **LiteLLM + Instructor** — Provider-agnostic client with Pydantic coercion (default)
* **BAML** — DSL-based framework with type registry and guardrails

Both frameworks return instances of the Pydantic `response_model` you request, so your application code stays the same whichever backend you choose.

## How It Works

Cognee uses a unified interface that abstracts the underlying framework:

```python theme={null}
from cognee.infrastructure.llm.LLMGateway import LLMGateway

# Call from inside an async function; response_model is a Pydantic model class
result = await LLMGateway.acreate_structured_output(text, system_prompt, response_model)
```

The `STRUCTURED_OUTPUT_FRAMEWORK` environment variable determines which backend processes your requests, but the API remains identical.
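
As a rough illustration of the dispatch described above, the sketch below reads `STRUCTURED_OUTPUT_FRAMEWORK` and resolves it to one of the two backends. It is not Cognee's actual implementation: the `resolve_framework` helper, the case-insensitive matching, and the error raised for unknown values are assumptions made for this example. Only the variable name, the two values, and the `instructor` default come from this page.

```python
import os

# Values documented on this page; "instructor" is the documented default.
SUPPORTED_FRAMEWORKS = ("instructor", "baml")
DEFAULT_FRAMEWORK = "instructor"


def resolve_framework(env=None):
    """Return the backend name selected by STRUCTURED_OUTPUT_FRAMEWORK.

    Hypothetical helper for illustration; Cognee's own lookup may differ.
    """
    env = os.environ if env is None else env
    raw = env.get("STRUCTURED_OUTPUT_FRAMEWORK", "").strip().lower()
    if not raw:
        # Unset or empty: fall back to the default backend.
        return DEFAULT_FRAMEWORK
    if raw not in SUPPORTED_FRAMEWORKS:
        raise ValueError(
            f"Unsupported STRUCTURED_OUTPUT_FRAMEWORK={raw!r}; "
            f"expected one of {SUPPORTED_FRAMEWORKS}"
        )
    return raw
```

Calling `resolve_framework()` with no arguments inspects the current process environment, while passing a plain dictionary makes the helper easy to exercise in tests without touching real environment variables.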

## Configuration

<Tabs>
  <Tab title="LiteLLM + Instructor (Default)">
    The default framework — no extra install needed. Uses LiteLLM and the `instructor` library to coerce LLM responses into Pydantic models.

    ```dotenv theme={null}
    STRUCTURED_OUTPUT_FRAMEWORK=instructor
    ```

    Optionally, control how the model is prompted for structured output:

    ```dotenv theme={null}
    # Override the instructor mode (other values include json_mode, tool_call, md_json)
    LLM_INSTRUCTOR_MODE=json_schema_mode
    ```
  </Tab>

  <Tab title="BAML">
    BAML is an alternative structured output framework that uses a DSL-based type registry to extract data. It is particularly useful when small local models served through Ollama (such as `llama3.1:8b` or `qwen3.5:0.8b`) struggle to produce valid structured output with instructor, causing repeated `InstructorRetryException` errors.

    **Installation**: BAML requires a separate install:

    ```bash theme={null}
    pip install "cognee[baml]"
    ```

    **Configuration**: BAML uses its own LLM settings, independent of the main `LLM_*` variables:

    ```dotenv theme={null}
    STRUCTURED_OUTPUT_FRAMEWORK=baml

    # BAML-specific LLM settings (required)
    BAML_LLM_PROVIDER=openai
    BAML_LLM_MODEL=gpt-4o-mini
    BAML_LLM_API_KEY=sk-...

    # Optional BAML overrides
    # BAML_LLM_ENDPOINT=https://api.openai.com/v1
    # BAML_LLM_API_VERSION=
    # BAML_LLM_TEMPERATURE=0.0
    ```

    `BAML_LLM_PROVIDER` and `BAML_LLM_MODEL` accept the same provider names and model identifiers as the main LLM configuration. You can point BAML at a different model than your main LLM — for example, use a small Ollama model for general text generation while routing structured extraction through a cloud model.

    <Warning>
      If `STRUCTURED_OUTPUT_FRAMEWORK=baml` is set but the `cognee[baml]` extra is not installed, Cognee will raise an `ImportError` on startup. Run `pip install "cognee[baml]"` to resolve it.
    </Warning>

    <Accordion title="Using BAML for Small Local Models">
      Small Ollama models (e.g. `llama3.1:8b`, `qwen3.5:0.8b`) often fail to produce valid JSON-structured output when using the default instructor backend, resulting in repeated `InstructorRetryException` errors during `cognify` for types like `KnowledgeGraph` or `SummarizedContent`.

      Switching to BAML bypasses instructor entirely and uses BAML's own extraction pipeline, which is more forgiving with smaller models:

      ```dotenv theme={null}
      # Main LLM — small local model via Ollama
      LLM_PROVIDER=ollama
      LLM_MODEL=llama3.1:8b
      LLM_ENDPOINT=http://localhost:11434/v1
      LLM_API_KEY=ollama

      # Use BAML for structured extraction (can point to a different, more capable model)
      STRUCTURED_OUTPUT_FRAMEWORK=baml
      BAML_LLM_PROVIDER=openai
      BAML_LLM_MODEL=gpt-4o-mini
      BAML_LLM_API_KEY=sk-...
      ```

      See the [Local Setup guide](/guides/local-setup) for a complete Ollama configuration including embeddings.
    </Accordion>
  </Tab>
</Tabs>
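
Before switching to BAML, it can help to verify the prerequisites from the BAML tab in one place. The preflight sketch below checks that the three required `BAML_LLM_*` variables are set and that a BAML Python module can be found. The helper name `baml_preflight` and the module name `baml_py` are assumptions made for this example (this page only states that the `cognee[baml]` extra is required), so adjust the module name to whatever your installation provides.

```python
import importlib.util
import os

# Required settings listed in the BAML configuration above.
REQUIRED_BAML_VARS = ("BAML_LLM_PROVIDER", "BAML_LLM_MODEL", "BAML_LLM_API_KEY")


def baml_preflight(env=None, module_name="baml_py"):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper. `module_name` is an assumed import name for the
    package installed by the `cognee[baml]` extra.
    """
    env = os.environ if env is None else env
    problems = []

    selected = env.get("STRUCTURED_OUTPUT_FRAMEWORK", "instructor").strip().lower()
    if selected != "baml":
        return problems  # BAML is not selected, so there is nothing to check

    # Each required variable must be present and non-empty.
    for name in REQUIRED_BAML_VARS:
        if not env.get(name, "").strip():
            problems.append(f"{name} is not set")

    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(module_name) is None:
        problems.append(
            f'module {module_name!r} not found; try: pip install "cognee[baml]"'
        )
    return problems
```

Running a check like this at application startup surfaces a missing variable or a missing install as a readable message, instead of an `ImportError` or a failed request later on.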

## Important Notes

* **Unified Interface**: Your application code uses the same `acreate_structured_output()` call regardless of framework
* **Provider Flexibility**: Both frameworks support the same LLM providers
* **Output Consistency**: Both return Pydantic-validated instances of the `response_model` you pass in
* **Performance**: Framework choice doesn't significantly impact performance

<Columns cols={3}>
  <Card title="LLM Providers" icon="brain" href="/setup-configuration/llm-providers">
    Configure LLM providers for text generation
  </Card>

  <Card title="Overview" icon="settings" href="/setup-configuration/overview">
    Return to setup configuration overview
  </Card>

  <Card title="Custom Prompts" icon="text-wrap" href="/guides/custom-prompts">
    Learn about custom prompt configuration
  </Card>
</Columns>
