Agents
Agents are the central primitive in AG2. They maintain state, interact with models, execute tools, and handle user interactions through a clean, conversation-focused API.
Core Communication Primitives#
The API is built around a few simple methods:
Agent.ask(...)initiates a new turn and blocks until it finishes, returning anAgentReply.AgentReply.ask(...)continues an existing conversation, preserving its context and history.Agent.run(...)/AgentReply.run(...)are the observable counterparts: they open a turn you can watch live and drive on demand, returning anAgentRunhandle. See Watching a Turn Live withrunbelow.
The final result of any turn is safely stored in reply.response; use reply.body for the text.
Basic Communication Example#
Here's how easily you can start and continue a conversation:
Empowering Agents with Tools#
Agents can seamlessly use Python functions as tools. When you provide a list of @tool-decorated functions to an agent, it automatically manages the entire execution lifecycle (model requests to execution and returning results).
Adding Human-in-the-Loop (HITL)#
Sometimes an agent needs human guidance. You can configure an agent to handle HumanInputRequest events. This is especially effective inside tools where you can get confirmation before taking a sensitive action.
Observing Agent Actions#
Need to know exactly what the agent is doing? Pass a MemoryStream when calling ask(). You can attach event subscribers to log actions, save history to a database, or update a user interface in real time.
Watching a Turn Live with run#
ask() blocks until the whole turn is done. When you want to watch a turn unfold — stream tokens to a UI as they arrive, or steer the agent mid-turn — use run() instead. It returns an AgentRun async context manager.
The turn does not advance on its own. Call run.start() to drive it in the background, then read run.stream.join() — an async iterator of live events — at the same time. await run.result() returns the turn's final AgentReply.
Iterating over events while the turn runs#
How it behaves:
run.start()drives the turn in a scope-owned background task; it returns immediately (it is not awaited). Without it,run.stream.join()would block forever — nothing advances the turn.run.stream.join()yields every event the turn emits;- Token streaming needs
streaming=Trueon the model config (as above); the model then emitsModelMessageChunkevents as it generates. await run.result()returns the sameAgentReplythe started turn produced (idempotent, re-raising the same failure on retry). Leaving the block without awaiting it cancels a still-running turn.- To drive inline instead, skip
start()andawait run.result()directly; cancelling that await — e.g.await asyncio.wait_for(run.result(), timeout=5)— cancels the turn.
Do not timeout individual join() items
Consume run.stream.join() with a plain async for, or use join(max_events=N) when you know how many events you need. Avoid wrapping individual pulls such as await asyncio.wait_for(events.__anext__(), timeout=...): cancelling one __anext__() closes the current iterator, so later events will not be yielded by that iterator. If you need a time limit, put one asyncio.timeout(...) around the whole loop instead of around each item.
AgentReply.run(...) is the same thing for a continuation: it watches a follow-up turn on an existing conversation, just as AgentReply.ask(...) continues one with a blocking call.
ask is run you don't have to drive
await agent.ask(...) is exactly run() + await result() rolled into one call. Reach for ask when you just want the answer, and run when you need to observe or steer the turn.
Feeding the turn while it runs#
A running turn keeps an inbox. Call run.enqueue(...) to push a follow-up message into it — from a concurrent task or while iterating events — and the turn consumes it at its next model call, without starting a new turn. Here we wait for the first tool result, then steer the turn:
enqueue is non-blocking — it only appends to the inbox. When the message is consumed depends on timing:
- before
result()→ merged into the turn's first model call; - while the turn is running → consumed by that same turn's next or final model call;
- after the turn finishes → waits for the next turn on this stream.
Building with an AI coding assistant?
See Coding with AI Assistants to set up Claude Code, Cursor, Copilot, or another assistant with AG2 skills and project rules so it writes against the current ag2 API.