run_pairwise
autogen.beta.eval.runtime.runner.run_pairwise async
run_pairwise(suite, *, variant_a, variant_b, comparators, store_dir, model_config=None, variant_a_name='A', variant_b_name='B', concurrency=4, run_id=None, label=None, stream=None)
Produce traces for two variants over a suite, then compare them.
Convenience wrapper over autogen.beta.eval.evaluate_pairwise: runs each variant across the suite, capturing one Trace per task keyed by task_id, then pairwise-compares the two sets of traces. This mirrors run_agent, which produces traces for a single variant and then grades them with autogen.beta.eval.evaluate_traces. To grade pre-existing traces without re-running the variants, call evaluate_pairwise directly.
label is a shared identifier recorded on the result, as in run_agent. Pass stream to observe the PairwiseStarted, PairwiseCompared, and PairwiseCompleted lifecycle events as the comparison runs.
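The produce-then-compare flow described above can be illustrated with a small standalone sketch. It does not import autogen and is not the library's implementation: the Trace dataclass, produce_traces, compare_pairwise, sketch_run_pairwise, the async-callable variants, and the comparator returning "A", "B", or "tie" are all hypothetical stand-ins chosen for illustration, and the real types and signatures may differ.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Trace:
    """Hypothetical stand-in for a captured trace (not the library's Trace)."""
    task_id: str
    output: str


async def produce_traces(tasks, variant, concurrency=4):
    """Run one variant over every task, at most `concurrency` at a time.

    Returns a dict of traces keyed by task_id.
    """
    sem = asyncio.Semaphore(concurrency)

    async def run_one(task_id, prompt):
        async with sem:
            return Trace(task_id, await variant(prompt))

    traces = await asyncio.gather(
        *(run_one(task_id, prompt) for task_id, prompt in tasks.items())
    )
    return {trace.task_id: trace for trace in traces}


def compare_pairwise(traces_a, traces_b, comparator):
    """Compare the two trace sets on the task_ids present in both."""
    shared = traces_a.keys() & traces_b.keys()
    return {tid: comparator(traces_a[tid], traces_b[tid]) for tid in shared}


async def sketch_run_pairwise(tasks, variant_a, variant_b, comparator, concurrency=4):
    """Produce traces for both variants, then compare them task by task."""
    traces_a = await produce_traces(tasks, variant_a, concurrency)
    traces_b = await produce_traces(tasks, variant_b, concurrency)
    return compare_pairwise(traces_a, traces_b, comparator)


# Two toy variants (assumed to be async callables taking a prompt string).
async def terse(prompt):
    return prompt.upper()


async def verbose(prompt):
    return prompt.upper() + "!"


def longer_wins(trace_a, trace_b):
    """Toy comparator: prefer the longer output, otherwise call it a tie."""
    if len(trace_a.output) > len(trace_b.output):
        return "A"
    if len(trace_b.output) > len(trace_a.output):
        return "B"
    return "tie"


tasks = {"t1": "hi", "t2": "yo"}
result = asyncio.run(sketch_run_pairwise(tasks, terse, verbose, longer_wins))
print(result)
```

Keeping trace production separate from comparison is what makes the decoupled path possible: in this sketch, compare_pairwise can be called on its own with trace dicts that were produced earlier, which is the role the text assigns to evaluate_pairwise.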