AnthropicV2Client

autogen.llm_clients.anthropic_v2.AnthropicV2Client #

AnthropicV2Client(api_key=None, base_url=None, timeout=None, response_format=None, **kwargs)

Bases: ModelClient

Anthropic Messages API client implementing ModelClientV2 protocol.

This client works with Anthropic's Messages API (client.messages.create) which returns structured output with thinking blocks, tool calls, and more.

Key Features:

- Preserves thinking blocks as ReasoningContent (extended thinking feature)
- Handles tool calls and results
- Supports native structured outputs (beta API) and JSON Mode fallback
- Provides backward compatibility via create_v1_compatible()
- Supports multiple authentication methods (API key, AWS Bedrock, GCP Vertex)

Example

client = AnthropicV2Client(api_key="...")

# Get rich response with thinking
response = client.create({
    "model": "claude-3-5-sonnet-20241022",
    "messages": [{"role": "user", "content": "Explain quantum computing"}],
})

# Access thinking blocks
for reasoning in response.reasoning:
    print(f"Thinking: {reasoning.reasoning}")

# Get text response
print(f"Answer: {response.text}")

Initialize Anthropic Messages API client.

PARAMETER DESCRIPTION
api_key

Anthropic API key (or set ANTHROPIC_API_KEY env var)

TYPE: str | None DEFAULT: None

base_url

Optional base URL for the API

TYPE: str | None DEFAULT: None

timeout

Optional timeout in seconds

TYPE: int | None DEFAULT: None

response_format

Optional response format for structured outputs

TYPE: type[BaseModel] | dict | None DEFAULT: None

**kwargs

Additional arguments passed to Anthropic client

TYPE: Any DEFAULT: {}

Source code in autogen/llm_clients/anthropic_v2.py
def __init__(
    self,
    api_key: str | None = None,
    base_url: str | None = None,
    timeout: int | None = None,
    response_format: type[BaseModel] | dict | None = None,
    **kwargs: Any,
):
    """
    Initialize Anthropic Messages API client.

    Args:
        api_key: Anthropic API key (or set ANTHROPIC_API_KEY env var)
        base_url: Optional base URL for the API
        timeout: Optional timeout in seconds
        response_format: Optional response format for structured outputs
        **kwargs: Additional arguments passed to Anthropic client
    """
    if anthropic_import_exception is not None:
        raise anthropic_import_exception

    # Store credentials
    self._api_key = api_key or os.getenv("ANTHROPIC_API_KEY")

    # Validate credentials
    if self._api_key is None:
        raise ValueError(
            "API key is required to use the Anthropic API. Set api_key parameter or ANTHROPIC_API_KEY environment variable."
        )

    # Initialize Anthropic client
    client_kwargs = {"api_key": self._api_key}
    if base_url:
        client_kwargs["base_url"] = base_url
    if timeout:
        client_kwargs["timeout"] = timeout
    self._client = Anthropic(**client_kwargs, **kwargs)  # type: ignore[misc]

    # Store response format for structured outputs
    self._response_format: type[BaseModel] | dict | None = response_format
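
The credential lookup in the constructor can be sketched in isolation. `resolve_api_key` and its `env` argument are hypothetical names used only for illustration; they are not part of the library. The sketch mirrors the order shown above: explicit argument first, then the ANTHROPIC_API_KEY environment variable, otherwise a ValueError.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def resolve_api_key(api_key: str | None = None, env: Mapping[str, str] | None = None) -> str:
    """Hypothetical helper mirroring the lookup order used in __init__."""
    # Allow a custom mapping so the lookup can be exercised without touching os.environ.
    env = os.environ if env is None else env
    # An explicit (non-empty) argument wins over the environment variable.
    key = api_key or env.get("ANTHROPIC_API_KEY")
    if key is None:
        raise ValueError(
            "API key is required to use the Anthropic API. "
            "Set api_key parameter or ANTHROPIC_API_KEY environment variable."
        )
    return key
```

Because the source uses `api_key or os.getenv(...)`, an empty string passed as api_key also falls through to the environment variable.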

RESPONSE_USAGE_KEYS class-attribute instance-attribute #

RESPONSE_USAGE_KEYS = ['prompt_tokens', 'completion_tokens', 'total_tokens', 'cost', 'model']

ModelClientResponseProtocol #

Bases: Protocol

choices instance-attribute #

choices

model instance-attribute #

model

Choice #

Bases: Protocol

message instance-attribute #

message

Message #

Bases: Protocol

content instance-attribute #

content

create #

create(params)

Create a completion and return UnifiedResponse with all features preserved.

This method implements ModelClient.create() but returns UnifiedResponse instead of ModelClientResponseProtocol. The rich UnifiedResponse structure is compatible via duck typing - it has .model attribute and works with message_retrieval().

Automatically selects the best structured output method:

- Native structured outputs for Claude Sonnet 4.5+ (guaranteed schema compliance)
- JSON Mode for older models (prompt-based with <json_response> tags)
- Standard completion for requests without response_format

PARAMETER DESCRIPTION
params

Request parameters including:

- model: Model name (e.g., "claude-3-5-sonnet-20241022")
- messages: List of message dicts
- temperature: Optional temperature
- max_tokens: Optional max completion tokens
- tools: Optional tool definitions
- response_format: Optional Pydantic BaseModel or JSON schema dict
- **other Anthropic parameters

TYPE: dict[str, Any]

RETURNS DESCRIPTION
UnifiedResponse

UnifiedResponse with thinking blocks, tool calls, and all content preserved

Source code in autogen/llm_clients/anthropic_v2.py
def create(self, params: dict[str, Any]) -> UnifiedResponse:  # type: ignore[override]
    """
    Create a completion and return UnifiedResponse with all features preserved.

    This method implements ModelClient.create() but returns UnifiedResponse instead
    of ModelClientResponseProtocol. The rich UnifiedResponse structure is compatible
    via duck typing - it has .model attribute and works with message_retrieval().

    Automatically selects the best structured output method:
    - Native structured outputs for Claude Sonnet 4.5+ (guaranteed schema compliance)
    - JSON Mode for older models (prompt-based with <json_response> tags)
    - Standard completion for requests without response_format

    Args:
        params: Request parameters including:
            - model: Model name (e.g., "claude-3-5-sonnet-20241022")
            - messages: List of message dicts
            - temperature: Optional temperature
            - max_tokens: Optional max completion tokens
            - tools: Optional tool definitions
            - response_format: Optional Pydantic BaseModel or JSON schema dict
            - **other Anthropic parameters

    Returns:
        UnifiedResponse with thinking blocks, tool calls, and all content preserved
    """
    model = params.get("model")
    response_format = params.get("response_format") or self._response_format

    # Route to appropriate implementation based on model and response_format
    if response_format:
        self._response_format = response_format
        params["response_format"] = response_format

        # Try native structured outputs if model supports it
        if supports_native_structured_outputs(model) and has_messages_parse_api():
            try:
                return self._create_with_native_structured_output(params)
            except (BadRequestError, AttributeError, ValueError) as e:  # type: ignore[misc]
                # Fallback to JSON Mode if native API not supported or schema invalid
                self._log_structured_output_fallback(e, model, response_format, params)
                return self._create_with_json_mode(params)
        else:
            # Use JSON Mode for older models or when beta API unavailable
            return self._create_with_json_mode(params)
    else:
        # Standard completion without structured outputs
        return self._create_standard(params)
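
The routing performed by create() can be summarized as a small decision function. This is a sketch only: `choose_strategy` is a hypothetical name, and the two callables stand in for the library's supports_native_structured_outputs() and has_messages_parse_api() checks, whose internals are not shown on this page.

```python
from __future__ import annotations

from typing import Any, Callable


def choose_strategy(
    params: dict[str, Any],
    default_response_format: Any = None,
    supports_native: Callable[[str | None], bool] = lambda model: False,
    has_parse_api: Callable[[], bool] = lambda: False,
) -> str:
    """Hypothetical mirror of the branch order in create()."""
    # A per-request response_format takes precedence over the one set at construction.
    response_format = params.get("response_format") or default_response_format
    if not response_format:
        return "standard"
    # Native structured outputs need both a capable model and the beta parse API.
    if supports_native(params.get("model")) and has_parse_api():
        return "native"
    return "json_mode"
```

In the real method the "native" path additionally falls back to JSON Mode when BadRequestError, AttributeError, or ValueError is raised.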

load_config #

load_config(params)

Load the configuration for the Anthropic API client.

Source code in autogen/llm_clients/anthropic_v2.py
def load_config(self, params: dict[str, Any]) -> dict[str, Any]:
    """Load the configuration for the Anthropic API client."""
    anthropic_params = {}

    anthropic_params["model"] = params.get("model")
    assert anthropic_params["model"], "Please provide a `model` in the config_list to use the Anthropic API."

    anthropic_params["temperature"] = validate_parameter(
        params, "temperature", (float, int), False, 1.0, (0.0, 1.0), None
    )
    anthropic_params["max_tokens"] = validate_parameter(params, "max_tokens", int, False, 4096, (1, None), None)
    anthropic_params["timeout"] = validate_parameter(params, "timeout", int, True, None, (1, None), None)
    anthropic_params["top_k"] = validate_parameter(params, "top_k", int, True, None, (1, None), None)
    anthropic_params["top_p"] = validate_parameter(params, "top_p", (float, int), True, None, (0.0, 1.0), None)
    anthropic_params["stop_sequences"] = validate_parameter(params, "stop_sequences", list, True, None, None, None)
    anthropic_params["stream"] = validate_parameter(params, "stream", bool, False, False, None, None)
    if "thinking" in params:
        anthropic_params["thinking"] = params["thinking"]

    if anthropic_params["stream"]:
        warnings.warn(
            "Streaming is not currently supported, streaming will be disabled.",
            UserWarning,
        )
        anthropic_params["stream"] = False

    # Note the Anthropic API supports "tool" for tool_choice but you must specify the tool name so we will ignore that here
    # Dictionary, see options here: https://docs.anthropic.com/en/docs/build-with-claude/tool-use/overview#controlling-claudes-output
    # type = auto, any, tool, none | name = the name of the tool if type=tool
    anthropic_params["tool_choice"] = validate_parameter(params, "tool_choice", dict, True, None, None, None)

    return anthropic_params
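
The defaults applied by load_config() can be illustrated with a simplified stand-in. `sketch_load_config` is a hypothetical function: it copies the default values visible in the source above but omits the type and range checks done by validate_parameter, and it raises ValueError where the source uses an assert.

```python
from __future__ import annotations

import warnings
from typing import Any

# Defaults as they appear in the load_config source above.
DEFAULTS: dict[str, Any] = {
    "temperature": 1.0,
    "max_tokens": 4096,
    "timeout": None,
    "top_k": None,
    "top_p": None,
    "stop_sequences": None,
    "stream": False,
    "tool_choice": None,
}


def sketch_load_config(params: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical, validation-free approximation of load_config()."""
    if not params.get("model"):
        raise ValueError("Please provide a `model` to use the Anthropic API.")
    config: dict[str, Any] = {"model": params["model"]}
    for key, default in DEFAULTS.items():
        config[key] = params.get(key, default)
    # "thinking" is passed through untouched, and only when the caller supplied it.
    if "thinking" in params:
        config["thinking"] = params["thinking"]
    # Streaming requests are downgraded with a warning rather than rejected.
    if config["stream"]:
        warnings.warn("Streaming is not currently supported, streaming will be disabled.", UserWarning)
        config["stream"] = False
    return config
```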

create_v1_compatible #

create_v1_compatible(params)

Create completion in backward-compatible ChatCompletion format.

This method provides compatibility with existing AG2 code that expects ChatCompletion format. Note that thinking blocks will be preserved in the content string with [Thinking] tags, matching V1 behavior.

PARAMETER DESCRIPTION
params

Same parameters as create()

TYPE: dict[str, Any]

RETURNS DESCRIPTION
ChatCompletion

ChatCompletion object compatible with OpenAI format

Warning

This method may lose some information when converting to the legacy format. Prefer create() for new code.

Source code in autogen/llm_clients/anthropic_v2.py
def create_v1_compatible(self, params: dict[str, Any]) -> ChatCompletion:
    """
    Create completion in backward-compatible ChatCompletion format.

    This method provides compatibility with existing AG2 code that expects
    ChatCompletion format. Note that thinking blocks will be preserved in
    the content string with [Thinking] tags, matching V1 behavior.

    Args:
        params: Same parameters as create()

    Returns:
        ChatCompletion object compatible with OpenAI format

    Warning:
        This method may lose some information when converting to the legacy format.
        Prefer create() for new code.
    """
    # Get rich response
    unified_response = self.create(params)

    # Build message text with proper thinking block formatting (matching V1 behavior)
    message_text = ""
    for msg in unified_response.messages:
        # Extract reasoning blocks (thinking content)
        reasoning_blocks = msg.get_reasoning()
        # Extract text content blocks
        text_blocks = [b for b in msg.content if isinstance(b, TextContent)]

        # Combine thinking content (multiple blocks joined with \n\n)
        thinking_content = "\n\n".join([r.reasoning for r in reasoning_blocks])
        # Combine text content (multiple blocks joined with \n\n)
        text_content = "\n\n".join([t.text for t in text_blocks])

        # Format like V1: [Thinking]\n{thinking}\n\n{text}
        if thinking_content and text_content:
            message_text = f"[Thinking]\n{thinking_content}\n\n{text_content}"
        elif thinking_content:
            message_text = f"[Thinking]\n{thinking_content}"
        elif text_content:
            message_text = text_content
        break  # Anthropic responses have single message

    # Extract tool calls if present
    tool_calls = None
    for msg in unified_response.messages:
        tool_call_blocks = msg.get_tool_calls()
        if tool_call_blocks:
            tool_calls = [
                ChatCompletionMessageToolCall(
                    id=tc.id,
                    function={"name": tc.name, "arguments": tc.arguments},
                    type="function",
                )
                for tc in tool_call_blocks
            ]
            break

    # Build ChatCompletion
    message = ChatCompletionMessage(
        role="assistant",
        content=message_text,
        function_call=None,
        tool_calls=tool_calls,
    )

    choices = [Choice(finish_reason=unified_response.finish_reason or "stop", index=0, message=message)]

    return ChatCompletion(
        id=unified_response.id,
        model=unified_response.model,
        created=int(time.time()),
        object="chat.completion",
        choices=choices,
        usage=CompletionUsage(
            prompt_tokens=unified_response.usage.get("prompt_tokens", 0),
            completion_tokens=unified_response.usage.get("completion_tokens", 0),
            total_tokens=unified_response.usage.get("total_tokens", 0),
        ),
        cost=unified_response.cost or 0.0,
    )
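
The [Thinking] formatting rule used by create_v1_compatible() is easiest to see on plain strings. `format_v1_text` is a hypothetical helper that takes the already-extracted reasoning and text strings; it reproduces only the joining and tagging logic from the source above.

```python
from __future__ import annotations


def format_v1_text(thinking_blocks: list[str], text_blocks: list[str]) -> str:
    """Hypothetical helper reproducing the V1 content-string layout."""
    # Multiple blocks of the same kind are joined with a blank line.
    thinking = "\n\n".join(thinking_blocks)
    text = "\n\n".join(text_blocks)
    if thinking and text:
        return f"[Thinking]\n{thinking}\n\n{text}"
    if thinking:
        return f"[Thinking]\n{thinking}"
    # Text only, or an empty string when the message has neither kind of block.
    return text
```

Since the thinking content is folded into the same string as the answer, callers of the V1 format cannot reliably separate the two afterwards, which is one reason create() is preferred for new code.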

message_retrieval #

message_retrieval(response)

Retrieve messages from response in OpenAI-compatible format.

Returns a list of strings for text-only messages, or a list of ChatCompletionMessage objects when tool calls or multimodal content (image, audio, video) are present.

PARAMETER DESCRIPTION
response

UnifiedResponse from create()

TYPE: UnifiedResponse

RETURNS DESCRIPTION
list[str] | list[ChatCompletionMessage]

List of strings (for text-only messages) OR list of ChatCompletionMessage objects (for tool calls or multimodal content)

Source code in autogen/llm_clients/anthropic_v2.py
def message_retrieval(self, response: UnifiedResponse) -> list[str] | list[ChatCompletionMessage]:  # type: ignore[override]
    """
    Retrieve messages from response in OpenAI-compatible format.

    Returns list of strings for text-only messages, or list of dicts when
    tool calls or complex content is present.

    Args:
        response: UnifiedResponse from create()

    Returns:
        List of strings (for text-only) OR list of message dicts (for tool calls/complex content)
    """
    result: list[str] | list[ChatCompletionMessage] = []

    for msg in response.messages:
        # Check for tool calls
        tool_calls = msg.get_tool_calls()

        # Check for complex/multimodal content that needs dict format
        has_complex_content = any(
            isinstance(block, (ImageContent, AudioContent, VideoContent)) for block in msg.content
        )

        if tool_calls or has_complex_content:
            # Return OpenAI-compatible dict format
            message_dict = ChatCompletionMessage(
                role=msg.role.value if hasattr(msg.role, "value") else msg.role,
                content=msg.get_text() or None,
            )

            # Add tool calls in OpenAI format
            if tool_calls:
                message_dict.tool_calls = [
                    ChatCompletionMessageToolCall(
                        id=tc.id,
                        type="function",
                        function={"name": tc.name, "arguments": tc.arguments},
                    )
                    for tc in tool_calls
                ]

            result.append(message_dict)
        else:
            # Simple text content - apply FormatterProtocol if available
            content = msg.get_text()

            # If response_format implements FormatterProtocol (has format() method), use it
            if isinstance(self._response_format, FormatterProtocol):
                try:
                    # Try to parse and format
                    parsed = self._response_format.model_validate_json(content)  # type: ignore[union-attr]
                    content = parsed.format()  # type: ignore[union-attr]
                except Exception:
                    # If parsing fails, return as-is
                    pass

            result.append(content)

    return result
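
Because message_retrieval() can return two shapes, calling code has to handle both. The sketch below uses a hypothetical `entry_to_text` helper and a SimpleNamespace as a stand-in for a ChatCompletionMessage; it assumes only that message objects expose a `content` attribute that may be None, as in the source above.

```python
from __future__ import annotations

from types import SimpleNamespace
from typing import Any


def entry_to_text(entry: Any) -> str:
    """Hypothetical helper: get display text from either return shape."""
    # Text-only messages come back as plain strings.
    if isinstance(entry, str):
        return entry
    # Message objects carry content=None when the reply is tool calls only.
    return getattr(entry, "content", None) or ""


# Stand-in for a tool-call message; the real object is a ChatCompletionMessage.
tool_message = SimpleNamespace(role="assistant", content=None, tool_calls=["<tool call>"])
texts = [entry_to_text(e) for e in ["plain answer", tool_message]]
```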

cost #

cost(response)

Calculate cost from response usage.

Implements ModelClient.cost() but accepts UnifiedResponse via duck typing.

PARAMETER DESCRIPTION
response

UnifiedResponse with usage information

TYPE: UnifiedResponse

RETURNS DESCRIPTION
float

Cost in USD for the API call

Source code in autogen/llm_clients/anthropic_v2.py
def cost(self, response: UnifiedResponse) -> float:  # type: ignore[override]
    """
    Calculate cost from response usage.

    Implements ModelClient.cost() but accepts UnifiedResponse via duck typing.

    Args:
        response: UnifiedResponse with usage information

    Returns:
        Cost in USD for the API call
    """
    if not response.usage:
        return 0.0

    model = response.model
    prompt_tokens = response.usage.get("prompt_tokens", 0)
    completion_tokens = response.usage.get("completion_tokens", 0)

    return _calculate_cost(prompt_tokens, completion_tokens, model)
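
The internals of _calculate_cost are not shown on this page. As a rough illustration only, the sketch below assumes a linear price per million tokens; `sketch_cost` and both price arguments are hypothetical, and the prices used in the usage line are placeholders, not actual Anthropic pricing.

```python
from __future__ import annotations


def sketch_cost(usage: dict[str, int], input_usd_per_mtok: float, output_usd_per_mtok: float) -> float:
    """Hypothetical linear cost model; not the library's _calculate_cost."""
    # Same early exit as cost(): no usage information means zero cost.
    if not usage:
        return 0.0
    prompt_tokens = usage.get("prompt_tokens", 0)
    completion_tokens = usage.get("completion_tokens", 0)
    total = prompt_tokens * input_usd_per_mtok + completion_tokens * output_usd_per_mtok
    return total / 1_000_000


# Placeholder prices, chosen only to make the arithmetic easy to follow.
example = sketch_cost({"prompt_tokens": 1000, "completion_tokens": 500}, 3.0, 15.0)
```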

get_usage staticmethod #

get_usage(response)

Extract usage statistics from response.

Implements ModelClient.get_usage() but accepts UnifiedResponse via duck typing.

PARAMETER DESCRIPTION
response

UnifiedResponse from create()

TYPE: UnifiedResponse

RETURNS DESCRIPTION
dict[str, Any]

Dict with keys from RESPONSE_USAGE_KEYS

Source code in autogen/llm_clients/anthropic_v2.py
@staticmethod
def get_usage(response: UnifiedResponse) -> dict[str, Any]:  # type: ignore[override]
    """
    Extract usage statistics from response.

    Implements ModelClient.get_usage() but accepts UnifiedResponse via duck typing.

    Args:
        response: UnifiedResponse from create()

    Returns:
        Dict with keys from RESPONSE_USAGE_KEYS
    """
    return {
        "prompt_tokens": response.usage.get("prompt_tokens", 0),
        "completion_tokens": response.usage.get("completion_tokens", 0),
        "total_tokens": response.usage.get("total_tokens", 0),
        "cost": response.cost or 0.0,
        "model": response.model,
    }

openai_func_to_anthropic staticmethod #

openai_func_to_anthropic(openai_func)

Convert OpenAI function format to Anthropic format.

PARAMETER DESCRIPTION
openai_func

OpenAI function definition

TYPE: dict

RETURNS DESCRIPTION
dict

Anthropic function definition

Source code in autogen/llm_clients/anthropic_v2.py
@staticmethod
def openai_func_to_anthropic(openai_func: dict) -> dict:
    """Convert OpenAI function format to Anthropic format.

    Args:
        openai_func: OpenAI function definition

    Returns:
        Anthropic function definition
    """
    res = openai_func.copy()
    res["input_schema"] = res.pop("parameters")

    # Preserve strict field if present (for Anthropic structured outputs)
    # strict=True enables guaranteed schema validation for tool inputs
    if "strict" in openai_func:
        res["strict"] = openai_func["strict"]
        # Transform schema to add required additionalProperties: false for all objects
        # Anthropic requires this for strict tools
        res["input_schema"] = transform_schema_for_anthropic(res["input_schema"])

    return res
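
For a function definition without a `strict` field, the conversion amounts to renaming one key. The sketch below restates that non-strict path on its own; `to_anthropic_tool` is a hypothetical name, and the strict branch is left out because it depends on the library's transform_schema_for_anthropic helper.

```python
from __future__ import annotations


def to_anthropic_tool(openai_func: dict) -> dict:
    """Hypothetical restatement of the non-strict conversion path."""
    # Shallow copy, so the caller's dict keeps its "parameters" key.
    res = openai_func.copy()
    # Anthropic names the JSON schema "input_schema" instead of "parameters".
    res["input_schema"] = res.pop("parameters")
    return res


openai_func = {
    "name": "get_weather",
    "description": "Look up the weather for a city",
    "parameters": {"type": "object", "properties": {"city": {"type": "string"}}},
}
tool = to_anthropic_tool(openai_func)
```

The copy is shallow, so the schema object itself is shared between the input and the result. A definition that lacks a "parameters" key raises KeyError.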

convert_tools_to_functions staticmethod #

convert_tools_to_functions(tools)

Convert tool definitions into Anthropic-compatible functions, updating nested $ref paths in property schemas.

PARAMETER DESCRIPTION
tools

List of tool definitions

TYPE: list

RETURNS DESCRIPTION
list

List of functions with updated $ref paths

Source code in autogen/llm_clients/anthropic_v2.py
@staticmethod
def convert_tools_to_functions(tools: list) -> list:
    """Convert tool definitions into Anthropic-compatible functions,
    updating nested $ref paths in property schemas.

    Args:
        tools: List of tool definitions

    Returns:
        List of functions with updated $ref paths
    """

    def update_refs(obj: Any, defs_keys: set[str], prop_name: str) -> None:
        """Recursively update $ref values that start with "#/$defs/"."""
        if isinstance(obj, dict):
            for key, value in obj.items():
                if key == "$ref" and isinstance(value, str) and value.startswith("#/$defs/"):
                    ref_key = value[len("#/$defs/") :]
                    if ref_key in defs_keys:
                        obj[key] = f"#/properties/{prop_name}/$defs/{ref_key}"
                else:
                    update_refs(value, defs_keys, prop_name)
        elif isinstance(obj, list):
            for item in obj:
                update_refs(item, defs_keys, prop_name)

    functions = []
    for tool in tools:
        if tool.get("type") == "function" and "function" in tool:
            function = tool["function"]
            parameters = function.get("parameters", {})
            properties = parameters.get("properties", {})
            for prop_name, prop_schema in properties.items():
                if "$defs" in prop_schema:
                    defs_keys = set(prop_schema["$defs"].keys())
                    update_refs(prop_schema, defs_keys, prop_name)
            functions.append(function)
    return functions
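
The effect of the $ref rewriting is clearer on a concrete schema. The sketch below is a condensed restatement of the nested update_refs helper under the hypothetical name `rewrite_refs`, applied to a made-up tool definition with a single property that carries its own $defs.

```python
from __future__ import annotations

from typing import Any

PREFIX = "#/$defs/"


def rewrite_refs(obj: Any, defs_keys: set[str], prop_name: str) -> None:
    """Condensed restatement of update_refs; mutates obj in place."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key == "$ref" and isinstance(value, str) and value.startswith(PREFIX):
                ref_key = value[len(PREFIX):]
                # Only references to definitions owned by this property are rewritten.
                if ref_key in defs_keys:
                    obj[key] = f"#/properties/{prop_name}/$defs/{ref_key}"
            else:
                rewrite_refs(value, defs_keys, prop_name)
    elif isinstance(obj, list):
        for item in obj:
            rewrite_refs(item, defs_keys, prop_name)


# Made-up property schema: "trip" defines City locally and references it.
trip_schema = {
    "$defs": {"City": {"type": "string"}},
    "type": "object",
    "properties": {
        "origin": {"$ref": "#/$defs/City"},
        "other": {"$ref": "#/$defs/Unknown"},
    },
}
rewrite_refs(trip_schema, set(trip_schema["$defs"]), "trip")
```

After the call, the reference to City points at the property's own definitions, while the reference to Unknown is left unchanged because it is not among that property's $defs. The schemas are modified in place, so the tool definitions passed to convert_tools_to_functions() are changed as well.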