
Model Configuration#

The AG2 framework provides an explicit, predictable, and type-safe way to configure Large Language Models (LLMs) for your agents. The configuration API is designed to provide a consistent developer experience across different model providers while maintaining strong typing support.

Supported Providers#

AG2 supports multiple LLM providers through dedicated configuration classes. Each provider requires its respective optional dependencies to be installed.

| Provider | Configuration Class | Installation Command |
| --- | --- | --- |
| OpenAI Responses | OpenAIResponsesConfig | pip install "ag2[openai]" |
| OpenAI | OpenAIConfig | pip install "ag2[openai]" |
| Anthropic | AnthropicConfig | pip install "ag2[anthropic]" |
| Gemini | GeminiConfig | pip install "ag2[gemini]" |
| Gemini on Vertex AI | VertexAIConfig | pip install "ag2[gemini]" |
| Ollama | OllamaConfig | pip install "ag2[ollama]" |
| DashScope | DashScopeConfig | pip install "ag2[dashscope]" |

(Note: OpenAIConfig can also be used with OpenAI-compatible endpoints; see the self-hosted section below.)


How to Configure a Model#

Basic Configuration#

To configure a model, import the specific provider's configuration class and initialize it with your desired parameters. The most common parameters are model, api_key, and base_url.

from autogen.beta.config import OpenAIResponsesConfig

# Configure an OpenAI Responses API model
config = OpenAIResponsesConfig(
    model="gpt-4.1-nano",
    api_key="sk-...",
    streaming=True
)
from autogen.beta.config import OpenAIConfig

# Configure an OpenAI model
config = OpenAIConfig(
    model="gpt-4o-mini",
    api_key="sk-...",
    temperature=0.2,
    streaming=True
)
from autogen.beta.config import AnthropicConfig

# Configure an Anthropic model
config = AnthropicConfig(
    model="claude-haiku-4-5-20251001",
    api_key="sk-ant-...",
    streaming=True
)
from autogen.beta.config import GeminiConfig

# Configure a Gemini model
config = GeminiConfig(
    model="gemini-3-flash-preview",
    api_key="...",
    streaming=True
)
from autogen.beta.config import OllamaConfig

# Configure an Ollama model
config = OllamaConfig(
    model="qwen3.5:latest",
    streaming=True
)
from autogen.beta.config import DashScopeConfig

# Configure a DashScope model
config = DashScopeConfig(
    model="qwen-plus",
    api_key="...",
    streaming=True
)

Tip

AG2 Beta is designed to be async- and streaming-first, so for the best user experience it is recommended to enable streaming for models that support it. As shown above, streaming has been set to True in each config.

Using Environment Variables#

For security and convenience, you don't need to hardcode your API keys. If api_key is not explicitly provided, the configuration will automatically attempt to load it from your environment variables.

The system looks for provider-specific keys (e.g., OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY).
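For example, you can export the key once in your shell or deployment environment (the key values below are placeholders):

```shell
# Export the provider-specific key; the config classes read it automatically
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
```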

from autogen.beta.config import OpenAIConfig

# Automatically falls back to OPENAI_API_KEY from the environment
config = OpenAIConfig(model="gpt-5")

Google Vertex AI (Gemini)#

For Gemini on Vertex AI (Google Cloud), use the dedicated VertexAIConfig class. GeminiConfig covers the public Developer API (api_key); VertexAIConfig covers the Vertex path (GCP project, location, and Google-issued credentials).

Authentication accepts any of the following: a path to a service-account key file, Application Default Credentials (ADC), or an explicit google.auth credentials object.

from autogen.beta.config import VertexAIConfig

config = VertexAIConfig(
    model="gemini-3-flash-preview",
    project="my-gcp-project",
    location="us-central1",
    credentials="/path/to/service-account-key.json",
    # Path to a service-account JSON key file downloaded from
    # GCP Console -> IAM & Admin -> Service Accounts -> Keys.
)

The service account needs the Vertex AI User (roles/aiplatform.user) IAM role on the project.
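If the role is missing, it can be granted with gcloud (the project and service-account names below are placeholders):

```shell
# Grant the Vertex AI User role to the service account on the project
gcloud projects add-iam-policy-binding my-gcp-project \
    --member="serviceAccount:my-sa@my-gcp-project.iam.gserviceaccount.com" \
    --role="roles/aiplatform.user"
```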

from autogen.beta.config import VertexAIConfig

# Run `gcloud auth application-default login` first, or ensure
# GOOGLE_APPLICATION_CREDENTIALS points to a key file. With nothing
# passed to `credentials`, google-genai resolves ADC automatically.
config = VertexAIConfig(
    model="gemini-3-flash-preview",
    project="my-gcp-project",
    location="us-central1",
)
import google.auth
from autogen.beta.config import VertexAIConfig

creds, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)

config = VertexAIConfig(
    model="gemini-3-flash-preview",
    project="my-gcp-project",
    location="us-central1",
    credentials=creds,
)

Use this path for impersonated credentials, workload identity, or any other google.auth.credentials.Credentials source.
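As a sketch of the impersonation case, credentials built with the google-auth library can be passed the same way (the target principal below is a placeholder, and your ADC identity needs the Service Account Token Creator role on it):

```python
import google.auth
from google.auth import impersonated_credentials

from autogen.beta.config import VertexAIConfig

# Start from your own ADC identity, then impersonate a service account.
# target_principal is a placeholder for illustration.
source_creds, _ = google.auth.default()
creds = impersonated_credentials.Credentials(
    source_credentials=source_creds,
    target_principal="vertex-runner@my-gcp-project.iam.gserviceaccount.com",
    target_scopes=["https://www.googleapis.com/auth/cloud-platform"],
)

config = VertexAIConfig(
    model="gemini-3-flash-preview",
    project="my-gcp-project",
    location="us-central1",
    credentials=creds,
)
```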

Environment variables#

Instead of passing these parameters explicitly, you can rely on the underlying google-genai SDK, which resolves any field left unset from the following environment variables:

| Environment variable | Used by | Equivalent parameter | Notes |
| --- | --- | --- | --- |
| GOOGLE_API_KEY | GeminiConfig | api_key | Takes precedence over GEMINI_API_KEY if both are set. |
| GEMINI_API_KEY | GeminiConfig | api_key | Developer API key. |
| GOOGLE_CLOUD_PROJECT | VertexAIConfig | project | GCP project ID. |
| GOOGLE_CLOUD_LOCATION | VertexAIConfig | location | GCP region (or global). |
| GOOGLE_APPLICATION_CREDENTIALS | VertexAIConfig | credentials | Path to a service-account JSON key file, read via ADC. |

With the three Vertex variables set in the environment, configuration collapses to just the model name:

export GOOGLE_CLOUD_PROJECT=my-gcp-project
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json
from autogen.beta.config import VertexAIConfig

# All Vertex auth parameters resolved from the environment.
config = VertexAIConfig(model="gemini-3-flash-preview")

Self-Hosted and OpenAI-Compatible Models (vLLM, LM Studio, etc.)#

If you are using a self-hosted model or an API that is compatible with the OpenAI format (such as vLLM, LM Studio, FastChat, or Together AI), you can use the OpenAIConfig class and specify a custom base_url.

from autogen.beta.config import OpenAIConfig

# Configure a vLLM or other OpenAI-compatible endpoint
config = OpenAIConfig(
    model="qwen-3",
    base_url="http://localhost:8000/v1",
    # Some endpoints don't require an API key, but the client expects a non-empty string
    api_key="NotRequired",
)

Tip

If you are running a self-hosted server via HTTPS without a valid SSL certificate (e.g., a local self-signed certificate), you can disable SSL checks by passing a custom httpx.AsyncClient with verify=False to the configuration:

import httpx
from autogen.beta.config import OpenAIConfig

config = OpenAIConfig(
    model="qwen-3",
    base_url="https://localhost:8000/v1",
    api_key="NotRequired",
    http_client=httpx.AsyncClient(verify=False)
)

Extra Body Parameters#

Some OpenAI API-compatible providers require additional, provider-specific parameters in the request body. Use the extra_body parameter on OpenAIConfig to pass these through directly to the API call.

This is useful for enabling features like extended thinking on self-hosted or third-party models:

from autogen.beta.config import OpenAIConfig

# NVIDIA NIM
nemotron = OpenAIConfig(
    model="nvidia/nemotron-3-super-120b-a12b",
    base_url="https://integrate.api.nvidia.com/v1",
    extra_body={"chat_template_kwargs": {"thinking": True}},
)

Reusing and Overriding Configurations#

Model configurations are immutable. To reuse a configuration across multiple agents with slight variations (e.g., a different model version or temperature), use the .copy() method, which returns a new, updated instance without mutating the original.

from autogen.agent import Agent
from autogen.beta.config import OpenAIConfig

base_config = OpenAIConfig(model="gpt-5")

agent1 = Agent(
    "Assistant",
    # Create a new configuration with updated temperature
    config=base_config.copy(temperature=0.2),
)

agent2 = Agent(
    "AnotherAssistant",
    # Create a new configuration with updated model and temperature
    config=base_config.copy(model="gpt-5-mini", temperature=0.8),
)

Delaying Model Configuration#

In many use cases, you may want to separate the logic of defining your agent (tools, system messages, instructions) from configuring the specific model it uses. This allows you to construct an agent once and dynamically provide the model configuration later during execution.

You can accomplish this by passing the configuration to the .ask() method when interacting with the agent. This is especially useful for applications like web servers where the user might bring their own API key or choose a different model on the fly.

from autogen.agent import Agent
from autogen.beta.config import OpenAIConfig

# Define an agent without an initial model config,
# or with a default one you plan to override later
agent = Agent(
    "Assistant",
    prompt="You are a helpful assistant.",
    # other tools and settings...
)

# Ask the agent, passing the explicit model configuration
response = await agent.ask(
    "Hello!",
    config=OpenAIConfig(
        model="gpt-5",
        api_key="sk-user-specific-key"
    )
)

Warning

Providing a configuration or client directly to the ask() method completely overrides the original model configuration assigned to the agent for that specific turn.

from autogen.agent import Agent
from autogen.beta.config import OpenAIConfig

agent = Agent(
    "Assistant",
    config=OpenAIConfig(model="gpt-5"),
)

response = await agent.ask(
    "Hello!",
    # overrides the original model configuration
    config=OpenAIConfig(model="gpt-5-mini")
)