Why DeepEval?
DeepEval is an open-source evaluation framework that provides ready-made metrics (both traditional and LLM-as-a-judge) for LLM pipelines. Compared with hand-rolled evaluation scripts, DeepEval lets you:

- Track Contextual Relevancy, Contextual Precision/Recall, Coverage, and more.
- Swap between automatic string-based metrics (EM/F1) and LLM-based scoring with a single flag.
- Re-use the same metrics across different projects and datasets.
DeepEval stores no data – it simply runs metrics locally or via your preferred LLM. That makes it a perfect drop-in evaluator for Cognee’s pipelines.
DeepEval inside Cognee
Cognee ships with a dedicated `DeepEvalAdapter`. When enabled, every answer produced by your pipeline is scored with the metrics you choose. The adapter:

- Transforms Cognee's `Answer` objects into DeepEval's `LLMTestCase` format.
- Runs the selected metrics.
- Stores the raw scores alongside rationales so they appear in Cognee's HTML dashboard.
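The transformation step can be sketched as follows. This is a minimal, self-contained illustration: the `Answer` fields and the `to_test_case` helper are assumptions (local stand-ins, not Cognee's actual classes), while the target field names (`input`, `actual_output`, `expected_output`, `retrieval_context`) mirror DeepEval's `LLMTestCase`.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Answer:
    """Hypothetical stand-in for Cognee's Answer object."""
    question: str
    answer: str
    golden_answer: Optional[str] = None
    retrieved_context: List[str] = field(default_factory=list)


@dataclass
class LLMTestCase:
    """Local stand-in mirroring the core fields of
    deepeval.test_case.LLMTestCase."""
    input: str
    actual_output: str
    expected_output: Optional[str] = None
    retrieval_context: Optional[List[str]] = None


def to_test_case(ans: Answer) -> LLMTestCase:
    """Map a pipeline answer onto DeepEval's test-case shape."""
    return LLMTestCase(
        input=ans.question,
        actual_output=ans.answer,
        expected_output=ans.golden_answer,
        retrieval_context=ans.retrieved_context or None,
    )


case = to_test_case(
    Answer(
        "What is Cognee?",
        "A memory engine for LLM apps.",
        retrieved_context=["Cognee builds knowledge graphs."],
    )
)
print(case.input)  # → What is Cognee?
```

Once an answer is in this shape, any retrieval-aware metric (for example Contextual Relevancy) can score it against the `retrieval_context` it carries.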
Quick Start
1. Install Cognee (DeepEval is declared in `pyproject.toml`, so you automatically get the dependency).
2. Set your LLM API key so DeepEval can run LLM-based metrics:

```shell
export LLM_API_KEY=...
```

3. (Optional) Configure the model DeepEval should call.
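For the optional model configuration, something like the following works; the exact variable names are assumptions, so check your Cognee version's settings for the keys it reads.

```shell
# Assumed variable names — verify against your Cognee version's configuration.
export LLM_PROVIDER="openai"
export LLM_MODEL="gpt-4o-mini"
```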
4. Run a standard Cognee pipeline (add → cognify → search). The evaluation executor will automatically invoke DeepEval.
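The add → cognify → search flow might look like the sketch below. Cognee exposes top-level `add`, `cognify`, and `search` coroutines, but the exact `search` signature varies between releases, so treat the argument name as an assumption.

```python
import asyncio


async def run_pipeline(document: str, question: str):
    # Imported lazily so the sketch can be read without cognee installed.
    import cognee

    await cognee.add(document)   # ingest raw text
    await cognee.cognify()       # build the knowledge graph
    # Argument name is an assumption; some releases also take a search type.
    return await cognee.search(query_text=question)


# With an LLM_API_KEY set, the evaluation executor scores the results
# with DeepEval as part of the run:
# results = asyncio.run(
#     run_pipeline("Cognee is a memory engine.", "What is Cognee?")
# )
```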
Useful Links
- DeepEval integration guide – deepeval.com » Cognee
- DeepEval docs – deepeval.com/docs
Join the conversation on Discord and let us know how DeepEval works for you!