
Model Configuration#

The AG2 framework provides an explicit, predictable, and type-safe way to configure Large Language Models (LLMs) for your agents. The configuration API is designed to provide a consistent developer experience across different model providers while maintaining strong typing support.

Supported Providers#

AG2 supports multiple LLM providers through dedicated configuration classes. Each provider requires its respective optional dependencies to be installed.

| Provider | Configuration Class | Installation Command |
| --- | --- | --- |
| OpenAI Responses | OpenAIResponsesConfig | pip install "ag2[openai]" |
| OpenAI | OpenAIConfig | pip install "ag2[openai]" |
| Anthropic | AnthropicConfig | pip install "ag2[anthropic]" |
| Gemini | GeminiConfig | pip install "ag2[gemini]" |
| Gemini on Vertex AI | VertexAIConfig | pip install "ag2[gemini]" |
| Ollama | OllamaConfig | pip install "ag2[ollama]" |
| DashScope | DashScopeConfig | pip install "ag2[dashscope]" |

(Note: OpenAIConfig can also be used with OpenAI-compatible endpoints; see the self-hosted section below.)


How to Configure a Model#

Basic Configuration#

To configure a model, import the specific provider's configuration class and initialize it with your desired parameters. The most common parameters are model, api_key, and base_url.

from autogen.beta.config import OpenAIResponsesConfig

# Configure an OpenAI Responses API model
config = OpenAIResponsesConfig(
    model="gpt-4.1-nano",
    api_key="sk-...",
    streaming=True
)
from autogen.beta.config import OpenAIConfig

# Configure an OpenAI model
config = OpenAIConfig(
    model="gpt-4o-mini",
    api_key="sk-...",
    temperature=0.2,
    streaming=True
)
from autogen.beta.config import AnthropicConfig

# Configure an Anthropic model
config = AnthropicConfig(
    model="claude-haiku-4-5-20251001",
    api_key="sk-ant-...",
    streaming=True
)
from autogen.beta.config import GeminiConfig

# Configure a Gemini model
config = GeminiConfig(
    model="gemini-3-flash-preview",
    api_key="...",
    streaming=True
)
from autogen.beta.config import OllamaConfig

# Configure an Ollama model
config = OllamaConfig(
    model="qwen3.5:latest",
    streaming=True
)
from autogen.beta.config import DashScopeConfig

# Configure a DashScope model
config = DashScopeConfig(
    model="qwen-plus",
    api_key="...",
    streaming=True
)

Tip

AG2 Beta is designed to be async- and streaming-first, so for the best user experience it is recommended to enable streaming for models that support it. As shown above, streaming has been set to True in each config.

Using Environment Variables#

For security and convenience, you don't need to hardcode your API keys. If api_key is not explicitly provided, the configuration will automatically attempt to load it from your environment variables.

The system looks for provider-specific keys (e.g., OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY).
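For example, you can export the key once in your shell or deployment environment (the key values below are placeholders):

```shell
# Export the provider-specific key; the config classes read it automatically
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
```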

from autogen.beta.config import OpenAIConfig

# Automatically falls back to OPENAI_API_KEY from the environment
config = OpenAIConfig(model="gpt-5")

Google Vertex AI (Gemini)#

For Gemini on Vertex AI (Google Cloud), use the dedicated VertexAIConfig class. GeminiConfig covers the public Developer API (api_key); VertexAIConfig covers the Vertex path (GCP project, location, and Google-issued credentials).

Authentication accepts any of the following: a path to a service-account key file, Application Default Credentials (ADC), or an explicit google.auth credentials object.

from autogen.beta.config import VertexAIConfig

config = VertexAIConfig(
    model="gemini-3-flash-preview",
    project="my-gcp-project",
    location="us-central1",
    credentials="/path/to/service-account-key.json",
    # Path to a service-account JSON key file downloaded from
    # GCP Console -> IAM & Admin -> Service Accounts -> Keys.
)

The service account needs the Vertex AI User (roles/aiplatform.user) IAM role on the project.
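If the role is missing, it can be granted with gcloud (the project and service-account names below are placeholders):

```shell
# Grant the Vertex AI User role to the service account on the project
gcloud projects add-iam-policy-binding my-gcp-project \
    --member="serviceAccount:my-sa@my-gcp-project.iam.gserviceaccount.com" \
    --role="roles/aiplatform.user"
```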

from autogen.beta.config import VertexAIConfig

# Run `gcloud auth application-default login` first, or ensure
# GOOGLE_APPLICATION_CREDENTIALS points to a key file. With nothing
# passed to `credentials`, google-genai resolves ADC automatically.
config = VertexAIConfig(
    model="gemini-3-flash-preview",
    project="my-gcp-project",
    location="us-central1",
)
import google.auth
from autogen.beta.config import VertexAIConfig

creds, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)

config = VertexAIConfig(
    model="gemini-3-flash-preview",
    project="my-gcp-project",
    location="us-central1",
    credentials=creds,
)

Use this path for impersonated credentials, workload identity, or any other google.auth.credentials.Credentials source.
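As a sketch of the impersonation case, credentials built with the google-auth library can be passed the same way (the target principal below is a placeholder, and your ADC identity needs the Service Account Token Creator role on it):

```python
import google.auth
from google.auth import impersonated_credentials

from autogen.beta.config import VertexAIConfig

# Start from your own ADC identity, then impersonate a service account.
# target_principal is a placeholder for illustration.
source_creds, _ = google.auth.default()
creds = impersonated_credentials.Credentials(
    source_credentials=source_creds,
    target_principal="vertex-runner@my-gcp-project.iam.gserviceaccount.com",
    target_scopes=["https://www.googleapis.com/auth/cloud-platform"],
)

config = VertexAIConfig(
    model="gemini-3-flash-preview",
    project="my-gcp-project",
    location="us-central1",
    credentials=creds,
)
```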

Environment variables#

Instead of passing these parameters explicitly, you can rely on the underlying google-genai SDK, which resolves any field left unset from the following environment variables:

| Environment variable | Used by | Equivalent parameter | Notes |
| --- | --- | --- | --- |
| GOOGLE_API_KEY | GeminiConfig | api_key | Takes precedence over GEMINI_API_KEY if both are set. |
| GEMINI_API_KEY | GeminiConfig | api_key | Developer API key. |
| GOOGLE_CLOUD_PROJECT | VertexAIConfig | project | GCP project ID. |
| GOOGLE_CLOUD_LOCATION | VertexAIConfig | location | GCP region (or global). |
| GOOGLE_APPLICATION_CREDENTIALS | VertexAIConfig | credentials | Path to a service-account JSON key file, read via ADC. |

With the three Vertex variables set in the environment, configuration collapses to just the model name:

export GOOGLE_CLOUD_PROJECT=my-gcp-project
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json
from autogen.beta.config import VertexAIConfig

# All Vertex auth parameters resolved from the environment.
config = VertexAIConfig(model="gemini-3-flash-preview")

Self-Hosted and OpenAI-Compatible Models (vLLM, LM Studio, etc.)#

If you are using a self-hosted model or an API that is compatible with the OpenAI format (such as vLLM, LM Studio, FastChat, or Together AI), you can use the OpenAIConfig class and specify a custom base_url.

from autogen.beta.config import OpenAIConfig

# Configure a vLLM or other OpenAI-compatible endpoint
config = OpenAIConfig(
    model="qwen-3",
    base_url="http://localhost:8000/v1",
    # Some endpoints don't require an API key, but the client expects a non-empty string
    api_key="NotRequired",
)

Tip

If you are running a self-hosted server via HTTPS without a valid SSL certificate (e.g., a local self-signed certificate), you can disable SSL checks by passing a custom httpx.AsyncClient with verify=False to the configuration:

import httpx
from autogen.beta.config import OpenAIConfig

config = OpenAIConfig(
    model="qwen-3",
    base_url="https://localhost:8000/v1",
    api_key="NotRequired",
    http_client=httpx.AsyncClient(verify=False)
)

Extra Body Parameters#

Some OpenAI API-compatible providers require additional, provider-specific parameters in the request body. Use the extra_body parameter on OpenAIConfig to pass these through directly to the API call.

This is useful for enabling features like extended thinking on self-hosted or third-party models:

from autogen.beta.config import OpenAIConfig

# NVIDIA NIM
nemotron = OpenAIConfig(
    model="nvidia/nemotron-3-super-120b-a12b",
    base_url="https://integrate.api.nvidia.com/v1",
    extra_body={"chat_template_kwargs": {"thinking": True}},
)

Reusing and Overriding Configurations#

Model configurations are immutable. To reuse a configuration across multiple agents with slight variations (e.g., a different model version or temperature), use the .copy() method, which returns a new, updated instance without mutating the original.

from autogen.agent import Agent
from autogen.beta.config import OpenAIConfig

base_config = OpenAIConfig(model="gpt-5")

agent1 = Agent(
    "Assistant",
    # Create a new configuration with updated temperature
    config=base_config.copy(temperature=0.2),
)

agent2 = Agent(
    "AnotherAssistant",
    # Create a new configuration with updated model and temperature
    config=base_config.copy(model="gpt-5-mini", temperature=0.8),
)

Delaying Model Configuration#

In many use cases, you may want to separate the logic of defining your agent (tools, system messages, instructions) from configuring the specific model it uses. This allows you to construct an agent once and dynamically provide the model configuration later during execution.

You can accomplish this by passing the configuration to the .ask() method when interacting with the agent. This is especially useful for applications like web servers where the user might bring their own API key or choose a different model on the fly.

from autogen.agent import Agent
from autogen.beta.config import OpenAIConfig

# Define an agent without an initial model config,
# or with a default one you plan to override later
agent = Agent(
    "Assistant",
    prompt="You are a helpful assistant.",
    # other tools and settings...
)

# Ask the agent, passing the explicit model configuration
response = await agent.ask(
    "Hello!",
    config=OpenAIConfig(
        model="gpt-5",
        api_key="sk-user-specific-key"
    )
)

Warning

Providing a configuration or client directly to the ask() method completely overrides the original model configuration assigned to the agent for that specific turn.

from autogen.agent import Agent
from autogen.beta.config import OpenAIConfig

agent = Agent(
    "Assistant",
    config=OpenAIConfig(model="gpt-5"),
)

response = await agent.ask(
    "Hello!",
    # overrides the original model configuration
    config=OpenAIConfig(model="gpt-5-mini")
)