Feedback

autogen.beta.eval._types.Feedback dataclass

Feedback(key, score=None, value=None, comment=None, detail=None)

A single piece of feedback produced by a scorer.

Exactly one of score or value is typically populated — score for numeric / boolean grades, value for categorical labels. Both being None is valid and represents a "no signal" feedback (e.g. a scorer that crashed mid-evaluation).
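The three cases above can be sketched as follows. This is a minimal illustration that uses a local stand-in dataclass mirroring the documented signature rather than importing the real class; the exact shapes of ScoreValue and ValueLabel are assumptions, and has_signal is a hypothetical helper that is not part of the library.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Union

# Assumed alias shapes; the real ScoreValue / ValueLabel definitions are not shown in this page.
ScoreValue = Optional[Union[bool, int, float]]
ValueLabel = Optional[str]


@dataclass
class Feedback:
    """Local stand-in mirroring the documented signature (not the real import)."""
    key: str
    score: ScoreValue = None
    value: ValueLabel = None
    comment: Optional[str] = None
    detail: Optional[Dict[str, Any]] = field(default=None, compare=False)


# Numeric / boolean grade: populate score.
graded = Feedback(key="exact_match", score=True)

# Categorical label: populate value.
labelled = Feedback(key="tone", value="formal")

# "No signal": both score and value stay None; comment carries the reason.
no_signal = Feedback(key="exact_match", comment="scorer raised ValueError")


def has_signal(fb: Feedback) -> bool:
    """Hypothetical helper: True when either score or value is populated."""
    return fb.score is not None or fb.value is not None
```

Checking `is not None` rather than truthiness matters here, because a score of False or 0.0 is still a graded signal.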

PARAMETER DESCRIPTION
key

Stable identifier for this feedback, usually the scorer's function name. Pass rates and stats on autogen.beta.eval.RunResult are looked up by this key.

TYPE: str

score

Numeric or boolean grade.

TYPE: ScoreValue DEFAULT: None

value

Categorical label, used for slicing aggregates.

TYPE: ValueLabel DEFAULT: None

comment

Free-form human-readable explanation. Surfaces in run JSON; useful for LLM-as-judge rationales or scorer error traces.

TYPE: str | None DEFAULT: None

detail

Optional structured evidence behind this feedback: a JSON-safe mapping serialized into the run JSON for programmatic access (e.g. a failure attribution's decisive step / responsible agent, or a judge's per-order swap verdicts). Supplementary only: aggregation never reads detail; score / value remain the graded signal. Typed at the source (producers serialize a typed model into it), so it is evidence, not a grab-bag.

TYPE: dict[str, Any] | None DEFAULT: None
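To make the roles of key and detail concrete, the sketch below groups feedback by key and computes a pass rate from score alone, never touching detail. It is an illustration under stated assumptions, not the RunResult implementation: Feedback is a local stand-in, pass_rate_by_key is a hypothetical helper, "truthy score counts as a pass" is an assumed rule, and the detail field names are invented for the example.

```python
import json
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, Optional, Union


@dataclass
class Feedback:
    """Local stand-in for the documented dataclass (not the real import)."""
    key: str
    score: Optional[Union[bool, int, float]] = None  # assumed ScoreValue shape
    value: Optional[str] = None                      # assumed ValueLabel shape
    comment: Optional[str] = None
    detail: Optional[Dict[str, Any]] = field(default=None, compare=False)


def pass_rate_by_key(feedbacks: Iterable[Feedback]) -> Dict[str, float]:
    """Hypothetical helper: share of truthy scores per key.

    Entries with score=None ("no signal") are skipped, and detail is never read.
    """
    passed = defaultdict(int)
    graded = defaultdict(int)
    for fb in feedbacks:
        if fb.score is None:
            continue
        graded[fb.key] += 1
        if fb.score:
            passed[fb.key] += 1
    return {key: passed[key] / graded[key] for key in graded}


feedbacks = [
    Feedback("exact_match", score=True),
    # Hypothetical detail payload: plain JSON-safe values only.
    Feedback("exact_match", score=False,
             detail={"decisive_step": 3, "responsible_agent": "planner"}),
    Feedback("exact_match", comment="scorer crashed"),  # no signal
    Feedback("length_ok", score=True),
]

rates = pass_rate_by_key(feedbacks)

# detail is a plain mapping, so it serializes with the standard json module.
detail_json = json.dumps(feedbacks[1].detail, sort_keys=True)
```

Keeping detail out of the aggregation path means a scorer can change the evidence it attaches without shifting any pass rate.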

key instance-attribute

key

score class-attribute instance-attribute

score = None

value class-attribute instance-attribute

value = None

comment class-attribute instance-attribute

comment = None

detail class-attribute instance-attribute

detail = field(default=None, compare=False)
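The compare=False setting on detail has a visible consequence: with the standard dataclasses module, a field declared this way is left out of the generated equality check. The sketch below shows that behaviour on a local stand-in class; it assumes the real dataclass keeps the default eq=True, which this page does not state, and the detail contents are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Feedback:
    """Local stand-in mirroring the documented fields (not the real import)."""
    key: str
    score: Optional[float] = None
    value: Optional[str] = None
    comment: Optional[str] = None
    # compare=False excludes this field from the generated __eq__.
    detail: Optional[Dict[str, Any]] = field(default=None, compare=False)


with_evidence = Feedback("judge", score=1.0, detail={"swap_verdicts": ["A", "A"]})
without_evidence = Feedback("judge", score=1.0)

# Same key / score / value / comment, different detail: still equal.
same_signal = with_evidence == without_evidence

# A different score breaks equality regardless of detail.
different_signal = with_evidence == Feedback("judge", score=0.0)
```

This matches the description of detail as supplementary: two feedbacks with the same graded signal compare equal even when their attached evidence differs.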