Additional Model Request Fields with Amazon Bedrock in AG2#

This notebook demonstrates how to use additional_model_request_fields with Amazon Bedrock in AG2. This feature allows you to pass model-specific parameters directly to the Bedrock API, including advanced features like Claude’s thinking configuration.

What are Additional Model Request Fields?#

additional_model_request_fields is a powerful feature that enables you to:

  • Pass model-specific parameters: Access Bedrock features not directly exposed in AG2’s standard configuration
  • Enable advanced features: Use cutting-edge capabilities like Claude’s thinking mode
  • Customize model behavior: Fine-tune model responses with provider-specific options
  • Future-proof your code: Easily adopt new Bedrock features as they become available

How It Works#

When you provide additional_model_request_fields in your LLM configuration, AG2:

  1. Extracts these fields from your config
  2. Passes them directly to Bedrock’s additional_model_request_fields parameter
  3. Allows the model to use these advanced features

This is particularly useful for features like:

  • Thinking mode (Claude models): Extended reasoning with configurable token budgets
  • Model-specific parameters: Any parameter supported by your chosen Bedrock model
  • Experimental features: New capabilities before they’re fully integrated into AG2
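The pass-through described above can be sketched as follows. This is a simplified illustration, not AG2’s actual implementation; the `build_converse_kwargs` helper is hypothetical, while the output field names mirror the boto3 `bedrock-runtime` `converse()` API.

```python
# Illustrative sketch: how an AG2-style config entry maps onto a
# bedrock-runtime converse() call. The helper below is hypothetical;
# AG2 performs an equivalent extraction internally.

def build_converse_kwargs(config_entry: dict, messages: list) -> dict:
    """Split an AG2-style config entry into keyword arguments for converse()."""
    extra = config_entry.get("additional_model_request_fields", {})
    kwargs = {
        "modelId": config_entry["model"],
        "messages": messages,
        "inferenceConfig": {
            "maxTokens": config_entry.get("max_tokens", 4096),
            "temperature": config_entry.get("temperature", 1),
        },
    }
    if extra:
        # Forwarded verbatim -- Bedrock validates the contents per model
        kwargs["additionalModelRequestFields"] = extra
    return kwargs

entry = {
    "model": "eu.anthropic.claude-3-7-sonnet-20250219-v1:0",
    "max_tokens": 4096,
    "temperature": 1,
    "additional_model_request_fields": {
        "thinking": {"type": "enabled", "budget_tokens": 1024}
    },
}
kwargs = build_converse_kwargs(entry, messages=[])
print(kwargs["additionalModelRequestFields"])
```

The key point is the last step: whatever you place under `additional_model_request_fields` is forwarded untouched, so Bedrock itself validates it against the chosen model.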

Requirements#

  • Python >= 3.10
  • AG2 installed with bedrock extra: pip install ag2[bedrock]
  • AWS credentials configured (via environment variables, IAM role, or AWS credentials file)
  • A Bedrock model that supports the features you want to use
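If you use environment variables, the ones referenced later in this notebook can be set like this. All values below are placeholders; substitute your own region and credentials, or rely on an IAM role or named profile instead of static keys.

```shell
# Placeholder values -- replace with your own region and credentials.
export AWS_REGION="eu-central-1"
export AWS_ACCESS_KEY="YOUR_ACCESS_KEY_ID"
export AWS_SECRET_ACCESS_KEY="YOUR_SECRET_ACCESS_KEY"
export AWS_PROFILE="default"   # alternatively, use a named profile instead of keys
```

With python-dotenv (installed below), the same variables can live in a local `.env` file instead.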

Model Compatibility#

Thinking Configuration is supported by:

  • anthropic.claude-3-7-sonnet-20250219-v1:0 (and newer Claude models)
  • eu.anthropic.claude-3-7-sonnet-20250219-v1:0 (EU region)

Check the Bedrock model documentation for the latest list of models and their supported features.

Installation#

Install required packages if not already installed:

%pip install ag2[bedrock] python-dotenv --upgrade

Setup: Import Libraries and Configure AWS Credentials#

import os

from dotenv import load_dotenv

from autogen import ConversableAgent, LLMConfig

load_dotenv()

print("Libraries imported successfully!")

Part 1: Understanding Thinking Configuration#

Claude’s thinking configuration enables extended reasoning capabilities. When enabled, the model can:

  • Perform deeper reasoning before generating a response
  • Use a configurable token budget for internal “thinking”
  • Show more thorough problem-solving processes

Thinking Configuration Parameters#

  • type: Set to "enabled" to activate thinking mode
  • budget_tokens: Maximum number of tokens the model can use for thinking (must be less than max_tokens)

Important: When using thinking mode, ensure your max_tokens is greater than budget_tokens to allow space for both thinking and the actual response.
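A small helper can catch a misconfigured budget before any request is sent. This is a hypothetical convenience function, not part of AG2; the 1024-token minimum matches the ValidationException behavior described later in this notebook.

```python
def validate_thinking_config(max_tokens: int, budget_tokens: int, minimum: int = 1024) -> None:
    """Raise ValueError for thinking settings Bedrock would reject.

    Claude's thinking mode requires budget_tokens >= 1024, and the budget
    must be strictly less than max_tokens so the response has room.
    """
    if budget_tokens < minimum:
        raise ValueError(f"budget_tokens must be >= {minimum}, got {budget_tokens}")
    if budget_tokens >= max_tokens:
        raise ValueError(
            f"budget_tokens ({budget_tokens}) must be less than max_tokens ({max_tokens})"
        )

validate_thinking_config(max_tokens=4096, budget_tokens=1024)  # passes silently
```

Calling it with `budget_tokens=512` raises a ValueError locally instead of a round-trip to Bedrock ending in a ValidationException.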

Part 2: Basic Example - Enabling Thinking Mode#

Let’s create a basic example with the thinking configuration enabled:

# Configure LLM with Bedrock and thinking mode enabled
llm_config = LLMConfig(
    config_list=[
        {
            "api_type": "bedrock",
            "model": "eu.anthropic.claude-3-7-sonnet-20250219-v1:0",
            "api_key": os.getenv("BEDROCK_API_KEY"),
            "aws_region": os.getenv("AWS_REGION"),
            "aws_access_key": os.getenv("AWS_ACCESS_KEY"),
            "aws_secret_key": os.getenv("AWS_SECRET_ACCESS_KEY"),
            "aws_profile_name": os.getenv("AWS_PROFILE"),
            # Enable thinking mode via additional_model_request_fields
            "additional_model_request_fields": {
                "thinking": {
                    "type": "enabled",
                    "budget_tokens": 1024,  # Allocate 1024 tokens for thinking
                }
            },
            "temperature": 1,
            "max_tokens": 4096,  # Must be greater than budget_tokens
        }
    ],
)

print("Bedrock LLM configuration created with thinking mode enabled!")
# Create an agent with thinking mode
conv_agent = ConversableAgent(
    name="conv_agent",
    llm_config=llm_config,
    system_message="You are a helpful assistant that thinks deeply about problems before responding.",
    max_consecutive_auto_reply=1,
    human_input_mode="NEVER",
)

print(f"Agent '{conv_agent.name}' created successfully!")

Part 3: Example 1 - Simple Question with Thinking#

Let’s test the agent with a question that benefits from extended reasoning:

print("=== Example 1: Simple question with thinking mode ===")

result = conv_agent.run(
    message="What is the capital of France? Also include a brief summary of it.",
    max_turns=5,
).process()

print("\nResponse received!")

Part 4: Example 2 - Complex Reasoning Problem#

Thinking mode is particularly useful for complex problems that require deep reasoning:

print("=== Example 2: Complex reasoning problem ===")

complex_result = conv_agent.run(
    message="""Analyze the following scenario: A company wants to reduce its carbon footprint by 50% over 5 years.
    They currently use 100% fossil fuel energy. They're considering:
    1. Switching to renewable energy (solar/wind)
    2. Implementing energy efficiency measures
    3. Carbon offset programs

    What combination of strategies would be most effective? Consider cost, feasibility, and long-term impact.""",
    max_turns=5,
).process()

print("\nComplex reasoning response received!")

Part 5: Adjusting Thinking Budget#

You can adjust the budget_tokens based on your needs:

  • Invalid budget (<1024 tokens): Values below 1024 are rejected by Bedrock with a ValidationException
  • Medium budget (1024-2048 tokens): Balanced reasoning for most problems
  • Higher budget (2048-4096 tokens): For very complex problems requiring deep analysis

Note: Higher budgets increase token usage and cost, but may improve response quality for complex tasks.

# Example with different thinking budgets
thinking_configs = {
    "incorrect": {"type": "enabled", "budget_tokens": 512},  # Below the 1024 minimum; Bedrock rejects this
    "medium": {"type": "enabled", "budget_tokens": 1024},
    "high": {"type": "enabled", "budget_tokens": 2048},
}

# Create a config with an incorrect (too-low) thinking budget to demonstrate the error
llm_config_incorrect = LLMConfig(
    config_list=[
        {
            "api_type": "bedrock",
            "model": "eu.anthropic.claude-3-7-sonnet-20250219-v1:0",
            "api_key": os.getenv("BEDROCK_API_KEY"),
            "aws_region": os.getenv("AWS_REGION"),
            "aws_access_key": os.getenv("AWS_ACCESS_KEY"),
            "aws_secret_key": os.getenv("AWS_SECRET_ACCESS_KEY"),
            "aws_profile_name": os.getenv("AWS_PROFILE"),
            "additional_model_request_fields": {"thinking": thinking_configs["incorrect"]},
        }
    ],
)

print("Configuration with incorrect thinking budget created!")
conv_agent = ConversableAgent(
    name="conv_agent",
    llm_config=llm_config_incorrect,
    system_message="You are a helpful assistant that thinks deeply about problems before responding.",
    max_consecutive_auto_reply=1,
    human_input_mode="NEVER",
)

conv_agent.run(
    message="What is the capital of France? Also research the surrounding area.",
    max_turns=5,
).process()

Summary#

In this notebook, we’ve learned:

  1. ✅ What additional_model_request_fields is and how it works
  2. ✅ How to enable Claude’s thinking configuration
  3. ✅ How to configure thinking budget tokens
  4. ✅ How the feature works under the hood

Key Takeaways#

  • additional_model_request_fields allows you to pass model-specific parameters to Bedrock
  • Thinking mode enables extended reasoning with configurable token budgets
  • Always ensure max_tokens > budget_tokens when using thinking mode
  • Thinking mode is particularly useful for complex reasoning tasks
  • Monitor token usage and costs when using extended thinking

Next Steps#

  • Experiment with different budget_tokens values for your use cases
  • Try combining thinking mode with other AG2 features
  • Explore other additional_model_request_fields supported by your Bedrock model
  • Check AWS Bedrock documentation for new features and capabilities
