# Compaction
Compaction reduces a stream's event history to respect runtime constraints — event count or token budget. It is the constraint-respecting counterpart to Aggregation.
Compaction removes. Aggregation creates. They are separate concerns.
## When to use it
Long-running conversations accumulate events faster than the model's context window can absorb. Use compaction to cap the size of history that flows into the next LLM call.
| Symptom | Strategy |
|---|---|
| History getting close to provider token limit | TailWindowCompact or SummarizeCompact |
| Need to keep recent events and forget old ones cheaply | TailWindowCompact |
| Want a short summary of old events to preserve context | SummarizeCompact |
## CompactStrategy protocol
Every strategy exposes the same shape: an async `compact(events, ctx, store)` method that returns a new event list to replace the current history. Strategies must preserve the causal ordering of retained events; no reshuffling.
## CompactTrigger
A dataclass describing when compaction should fire. Any configured threshold that is exceeded triggers compaction.
Leaving a field at 0 disables that threshold. CompactTrigger() alone does nothing — you must opt into at least one condition.
CompactTrigger is a plain data object — it records when you want compaction to fire, but does not fire it. Strategies are invoked explicitly via await strategy.compact(...).
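A plausible shape for the dataclass. Hedge: the field names `max_events` and `max_tokens` are assumptions based on the "event count or token budget" constraints above; only `chars_per_token` appears verbatim elsewhere on this page, and the `enabled()` helper is purely illustrative:

```python
from dataclasses import dataclass


@dataclass
class CompactTrigger:
    # Assumed field names; leaving a field at 0 disables that threshold.
    max_events: int = 0
    max_tokens: int = 0
    # Divisor for the cheap char-based token estimate used when checking
    # max_tokens (see "Driving compaction").
    chars_per_token: int = 4

    def enabled(self) -> bool:
        """Illustrative helper: True once at least one threshold is set."""
        return bool(self.max_events or self.max_tokens)


print(CompactTrigger().enabled())               # False: does nothing by default
print(CompactTrigger(max_events=200).enabled()) # True
```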
## Built-in strategies
Both built-ins are importable from autogen.beta.compact.
### TailWindowCompact
Keeps the last N events, drops the rest. Zero LLM cost. Suitable when old context has diminishing value and recency is what matters.
Passing a KnowledgeStore is optional. If provided, dropped events are persisted to /log/ as a numbered segment — see the KnowledgeStore docs — so they can be replayed later via EventLogWriter.load(). If omitted, dropped events are discarded.
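The retention rule itself is just a tail slice. A minimal sketch of the behavior, assuming a `keep_last` constructor parameter (the real parameter name is not shown on this page) and stubbing out the store persistence:

```python
import asyncio


class TailWindowCompact:
    """Sketch: keep the last N events, drop the rest."""

    def __init__(self, keep_last: int):
        self.keep_last = keep_last

    async def compact(self, events, ctx=None, store=None):
        kept = list(events[-self.keep_last:])
        # In the real strategy, the dropped prefix would be persisted to the
        # KnowledgeStore's /log/ segment here when `store` is provided.
        return kept


recent = asyncio.run(TailWindowCompact(keep_last=3).compact(list(range(10))))
print(recent)  # [7, 8, 9]
```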
### SummarizeCompact
Summarizes the dropped portion with one LLM call and inserts a CompactionSummary event at the head of the retained history. Use it when you want to preserve some sense of the old conversation instead of simply forgetting it.
The summarization model is independent of the agent's main model, so you can pick a smaller, cheaper one. Token usage is recorded on the strategy instance as strategy.last_usage.
## CompactionSummary
The synthetic event inserted by SummarizeCompact at the head of history.
CompactionSummary is on the allowlist of ConversationPolicy, so it passes through the assembly chain and reaches the LLM as context — without requiring special handling elsewhere.
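A sketch of the event and of how SummarizeCompact places it at the head of retained history. Hedge: the field names `text` and `dropped_count`, and the `prepend_summary` helper, are illustrative and not confirmed by this page:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompactionSummary:
    text: str           # LLM-produced summary of the dropped events
    dropped_count: int  # illustrative field: how many events were summarized


def prepend_summary(summary_text, dropped, kept):
    """Retained history after SummarizeCompact: summary first, tail after."""
    return [CompactionSummary(summary_text, len(dropped)), *kept]


history = prepend_summary(
    "User asked about X; agent ran two searches.",
    dropped=["e1", "e2"],
    kept=["e3", "e4"],
)
print(history[0].dropped_count)  # 2
```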
## Driving compaction
To drive compaction, check the trigger's thresholds against the current history and invoke the strategy explicitly when any threshold is exceeded.
For the token-based threshold, estimate with `sum(len(str(e)) for e in events) / trigger.chars_per_token`.
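A sketch of that driving loop. Hedge: `max_events` and `max_tokens` are assumed trigger field names, and `maybe_compact` is a hypothetical helper; only the char-based estimate and the `compact(events, ctx, store)` call come from this page:

```python
import asyncio
from dataclasses import dataclass


@dataclass
class CompactTrigger:  # assumed field names; 0 disables a threshold
    max_events: int = 0
    max_tokens: int = 0
    chars_per_token: int = 4


def should_compact(events, trigger) -> bool:
    if trigger.max_events and len(events) > trigger.max_events:
        return True
    if trigger.max_tokens:
        # Cheap token estimate: total chars divided by chars_per_token.
        estimated = sum(len(str(e)) for e in events) / trigger.chars_per_token
        if estimated > trigger.max_tokens:
            return True
    return False


async def maybe_compact(history, trigger, strategy, ctx=None, store=None):
    if should_compact(history, trigger):
        return await strategy.compact(history, ctx, store)
    return history


class Tail3:
    async def compact(self, events, ctx=None, store=None):
        return list(events[-3:])


result = asyncio.run(maybe_compact(list(range(10)), CompactTrigger(max_events=5), Tail3()))
print(result)  # [7, 8, 9]
```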
## Writing a custom strategy
Any object with an async `compact(events, ctx, store)` method satisfies the protocol. A few ideas:
- Drop tool noise. Keep `ModelRequest`/`ModelResponse`, drop `ToolCallEvent`/`ToolResultEvent` older than some boundary.
- Priority retention. Score events (e.g. keep every `ModelResponse` but decimate `ToolCallEvent`s).
- Segmented summarization. Run `SummarizeCompact` in chunks to produce multiple `CompactionSummary` events over time rather than one big one.
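A sketch of the first idea, dropping tool noise older than a boundary. The event classes here are stand-ins, classifying by type name is just one possible approach, and `tool_tail` is a hypothetical parameter:

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ToolCallEvent:
    name: str


@dataclass
class ModelResponse:
    text: str


class DropToolNoise:
    """Keep model traffic everywhere; keep tool events only in a recent tail."""

    NOISY = ("ToolCallEvent", "ToolResultEvent")

    def __init__(self, tool_tail: int = 20):
        self.tool_tail = tool_tail  # tool events newer than this survive

    async def compact(self, events, ctx=None, store=None):
        boundary = max(0, len(events) - self.tool_tail)
        # Retained order matches the original order: no reshuffling.
        return [
            e
            for i, e in enumerate(events)
            if i >= boundary or type(e).__name__ not in self.NOISY
        ]


events = [ToolCallEvent("search"), ModelResponse("a"),
          ToolCallEvent("fetch"), ModelResponse("b")]
compacted = asyncio.run(DropToolNoise(tool_tail=1).compact(events))
print([type(e).__name__ for e in compacted])  # ['ModelResponse', 'ModelResponse']
```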