Task
autogen.beta.eval.dataset.task.Task dataclass
Task(task_id, inputs, reference_outputs=None, tags=(), metadata=dict())
A single task in an evaluation suite.
Tasks are typically loaded from JSONL via `Suite.from_jsonl` or built inline via `Suite.from_list`. The runner passes `inputs["input"]` to `agent.ask(...)`; every other field is passed through to scorers unchanged.
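The flow described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the library's runner: `StandInTask`, `EchoAgent`, `exact_match`, and `run_task` are hypothetical names, the synchronous `ask` is an assumption made for brevity, and the `"answer"` key inside `reference_outputs` is invented for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class StandInTask:
    """Hypothetical stand-in mirroring the Task signature; not the real class."""
    task_id: str
    inputs: dict[str, Any]
    reference_outputs: Optional[dict[str, Any]] = None  # type is an assumption
    tags: tuple[str, ...] = ()
    metadata: dict[str, Any] = field(default_factory=dict)


class EchoAgent:
    """Hypothetical agent; a real agent would call a model."""

    def ask(self, prompt: str) -> str:
        return prompt.upper()


def exact_match(task: StandInTask, output: str) -> bool:
    """Hypothetical reference-based scorer that receives the whole task."""
    if task.reference_outputs is None:
        return False
    return output == task.reference_outputs.get("answer")


def run_task(agent: EchoAgent, task: StandInTask) -> dict[str, Any]:
    # Only inputs["input"] is sent to the agent.
    output = agent.ask(task.inputs["input"])
    # The task itself, with all other fields, is handed to the scorer unchanged.
    return {
        "task_id": task.task_id,
        "output": output,
        "passed": exact_match(task, output),
    }


task = StandInTask(
    task_id="t1",
    inputs={"input": "hello"},
    reference_outputs={"answer": "HELLO"},
)
result = run_task(EchoAgent(), task)
```

The point of the split is that the agent sees only the prompt, while scorers can still read `reference_outputs`, `tags`, and `metadata` from the same task.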
| PARAMETER | DESCRIPTION |
|---|---|
| task_id | Stable identifier for this task. Auto-generated as TYPE: |
| inputs | The task's input payload. Must contain at least an |
| reference_outputs | Expected outputs, consumed by reference-based scorers (e.g. |
| tags | Free-form labels, useful for filtering or slicing ( |
| metadata | Anything else the dataset carries; it surfaces in the run JSON so scorers and reports can consume it. |
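To make the field roles concrete, here is a small sketch that parses JSONL records into a stand-in dataclass and slices the result by tag. `StandInTask` and `load_tasks` are hypothetical and only mirror the signature shown at the top of this page; the real loader is `Suite.from_jsonl`, and its record layout and `task_id` auto-generation are not reproduced here. The sketch assumes each record uses keys named after the fields.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class StandInTask:
    """Hypothetical stand-in with the same fields and defaults as the signature."""
    task_id: str
    inputs: dict[str, Any]
    reference_outputs: Optional[dict[str, Any]] = None  # type is an assumption
    tags: tuple[str, ...] = ()
    metadata: dict[str, Any] = field(default_factory=dict)


def load_tasks(jsonl_text: str) -> list[StandInTask]:
    """Parse one JSON object per non-empty line (assumed record layout)."""
    tasks = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        tasks.append(
            StandInTask(
                task_id=record["task_id"],
                inputs=record["inputs"],
                reference_outputs=record.get("reference_outputs"),
                tags=tuple(record.get("tags", ())),
                metadata=record.get("metadata", {}),
            )
        )
    return tasks


JSONL = """\
{"task_id": "add-1", "inputs": {"input": "What is 2 + 2?"}, "reference_outputs": {"answer": "4"}, "tags": ["math"]}
{"task_id": "greet-1", "inputs": {"input": "Say hello."}, "metadata": {"source": "smoke"}}
"""

tasks = load_tasks(JSONL)
# Tags allow slicing a suite, for example keeping only the math tasks.
math_tasks = [t for t in tasks if "math" in t.tags]
```

Optional fields fall back to the defaults from the signature: `None` for `reference_outputs`, an empty tuple for `tags`, and an empty dict for `metadata`.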