
Grok & OpenAI-API-Compatible

Grok is a family of AI models developed by xAI. Since Grok follows OpenAI's API specification, it works seamlessly with AG2's existing OpenAI client infrastructure — no extra dependencies needed.

This guide shows how to configure Grok (and other OpenAI-compatible APIs) with AG2, including function calling.

Available Grok Models#

The table below lists popular Grok models available via the API (as of March 2026). See xAI Models and Pricing for the full and up-to-date list.

| Model ID | Context | Input / Output (per 1M tokens) | Capabilities |
|---|---|---|---|
| grok-4-1-fast-reasoning | 2M | $0.20 / $0.50 | Reasoning, functions, structured output, vision |
| grok-4-1-fast-non-reasoning | 2M | $0.20 / $0.50 | Functions, structured output, vision |
| grok-4-0709 | 256K | $3.00 / $15.00 | Reasoning, functions, structured output |
| grok-code-fast-1 | 256K | $0.20 / $1.50 | Reasoning, functions, structured output (code-optimized) |
| grok-3 | 131K | $3.00 / $15.00 | Functions, structured output |
| grok-3-mini | 131K | $0.30 / $0.50 | Reasoning, functions, structured output |

Requirements#

To get started, ensure you meet the following requirements:

  1. Install the AG2 package:
    pip install ag2[openai]
    

!!! tip If you have been using autogen or ag2, all you need to do is upgrade it using:

pip install -U autogen[openai]
or
pip install -U ag2[openai]
as autogen and ag2 are aliases for the same PyPI package.

  2. Obtain a Grok API key:
  • Sign up for a Grok account at x.ai to generate your API key.
  • Refer to the official Grok documentation for more information about obtaining and using the API key.

Configuration for Grok#

Basic Configuration#

Here's how to configure AG2 to use Grok:

import os
from autogen import LLMConfig

# Basic Grok configuration
llm_config = LLMConfig(
    {
        "model": "grok-4-1-fast-non-reasoning",
        "api_type": "openai",  # Grok is OpenAI-compatible
        "base_url": "https://api.x.ai/v1",
        "api_key": os.environ.get("XAI_API_KEY"),
    },
    temperature=0.7,
)

Grok offers real-time web search (web_search) and X/Twitter search (x_search) tools through xAI's Responses API. To use these with AG2, set api_type to "responses" and enable web search via built_in_tools. Requires AG2 >= 0.12:

llm_config = LLMConfig(
    {
        "model": "grok-4-1-fast-non-reasoning",
        "api_type": "responses",
        "base_url": "https://api.x.ai/v1",
        "api_key": os.environ.get("XAI_API_KEY"),
        "built_in_tools": ["web_search"],
    },
    temperature=0.5,
)

API Parameters#

Grok supports standard OpenAI-compatible parameters, which you can include in your configuration. Commonly used ones include:

  • temperature (number 0..2): Controls randomness in outputs
  • max_tokens (integer): Maximum tokens in the response
  • top_p (number 0..1): Nucleus sampling parameter
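These parameters go at the top level of LLMConfig, not inside the per-model entry. A minimal sketch (the specific values here are illustrative, not recommendations):

```python
import os
from autogen import LLMConfig

llm_config = LLMConfig(
    {
        "model": "grok-4-1-fast-non-reasoning",
        "api_type": "openai",
        "base_url": "https://api.x.ai/v1",
        "api_key": os.environ.get("XAI_API_KEY"),
    },
    temperature=0.7,  # randomness of sampling (0..2)
    max_tokens=500,   # cap on response length
    top_p=0.9,        # nucleus sampling cutoff
)
```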

Basic Conversation Example#

Here's a simple example demonstrating a conversation with Grok:

import os
from autogen import AssistantAgent, UserProxyAgent, LLMConfig

grok_config = LLMConfig(
    {
        "model": "grok-4-1-fast-non-reasoning",
        "api_type": "openai",
        "base_url": "https://api.x.ai/v1",
        "api_key": os.getenv("XAI_API_KEY"),
    },
    temperature=0.5,
    max_tokens=1000,
)

# Create agents
assistant = AssistantAgent(
    name="grok_assistant",
    system_message="You are a helpful AI assistant powered by Grok.",
    llm_config=grok_config,
)

user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=3,
    is_termination_msg=lambda x: (x.get("content") or "").rstrip().endswith("TERMINATE"),
)

# Start conversation
user_proxy.initiate_chat(
    assistant,
    message="Explain the key differences between Python's asyncio and threading.",
    max_turns=2,
)

Function Calling with Grok#

Grok supports advanced function calling capabilities. Here's how to set it up properly with AG2:

Warning

Critical Pattern: Functions must be registered with the ConversableAgent, not passed to LLMConfig.tools. Passing function objects to LLMConfig.tools will cause a "TypeError: Object of type function is not JSON serializable" error.

import os
from typing import Annotated
from autogen import AssistantAgent, UserProxyAgent, LLMConfig

# Configure Grok for function calling
function_config = LLMConfig(
    {
        "model": "grok-4-1-fast-non-reasoning",
        "api_key": os.getenv("XAI_API_KEY"),
        "base_url": "https://api.x.ai/v1",
        "api_type": "openai",
    },
    temperature=0.3,
    max_tokens=800,
)

# Create function-calling assistant
function_assistant = AssistantAgent(
    name="grok_function_assistant",
    system_message="You are a helpful assistant that can call functions to get weather information and perform calculations. Use the available tools when appropriate.",
    llm_config=function_config,
)

user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=3,
    is_termination_msg=lambda x: (x.get("content") or "").rstrip().endswith("TERMINATE"),
)

# Define and register functions using the decorator pattern
@user_proxy.register_for_execution()
@function_assistant.register_for_llm(description="Get current weather for a city.")
def get_weather(city: Annotated[str, "The city name"]) -> str:
    # Mock function - in production, call a real weather API
    return f"The current weather in {city} is sunny with a temperature of 22 degrees C."

@user_proxy.register_for_execution()
@function_assistant.register_for_llm(description="Calculate a mathematical expression safely.")
def calculate_math(expression: Annotated[str, "Mathematical expression to evaluate"]) -> str:
    try:
        # Simple evaluation - use a safer parser in production
        result = eval(expression.replace("^", "**"))
        return f"The result of {expression} is {result}"
    except Exception:
        return f"Could not evaluate the expression: {expression}"

# Test function calling
user_proxy.initiate_chat(
    function_assistant,
    message="What's the weather like in Tokyo? Also, can you calculate 15 * 23 + 7?",
    max_turns=2,
)

Function Calling Features#

Grok's function calling API supports several advanced features:

Tool Choice Options#

You can control how Grok uses functions with the tool_choice parameter:

# Automatic function calling (default)
llm_config = LLMConfig({
    "model": "grok-4-1-fast-non-reasoning",
    "api_type": "openai",
    "base_url": "https://api.x.ai/v1",
    "api_key": os.environ.get("XAI_API_KEY"),
    "tool_choice": "auto",  # Model decides when to call functions
})

# Force function calling
llm_config = LLMConfig({
    # ... config
    "tool_choice": "required",  # Always call a function
})

# Disable function calling
llm_config = LLMConfig({
    # ... config
    "tool_choice": "none",  # Never call functions
})

Parallel Function Calling#

Grok supports calling multiple functions simultaneously, which is enabled by default:

llm_config = LLMConfig({
    "model": "grok-4-1-fast-non-reasoning",
    # ... other config
    "parallel_tool_calls": True,  # Enable parallel function calls (default)
})

Best Practices#

1. Security#

  • Always use environment variables for API keys
  • Never hardcode sensitive information in your code

2. Configuration#

  • Use the correct api_type for your provider ("openai" for most compatible services)
  • Place common parameters like temperature at the top level of LLMConfig
  • Use extra_body for provider-specific features not covered by standard parameters
  • Use extra_headers for custom HTTP headers required by your server (e.g., authentication tokens, routing headers for VLLM or other OpenAI-compatible servers)
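As a sketch of the last two points, extra_body fields are merged into the request payload and extra_headers into the HTTP headers. The field and header names below are illustrative placeholders; check your provider's API reference for the actual ones it accepts:

```python
import os
from autogen import LLMConfig

llm_config = LLMConfig(
    {
        "model": "grok-4-1-fast-non-reasoning",
        "api_type": "openai",
        "base_url": "https://api.x.ai/v1",
        "api_key": os.environ.get("XAI_API_KEY"),
        # Provider-specific request fields not in the standard OpenAI schema
        "extra_body": {"search_parameters": {"mode": "auto"}},
        # Custom HTTP headers, e.g. routing or auth for a gateway (hypothetical)
        "extra_headers": {"X-Request-Source": "ag2-example"},
    },
    temperature=0.5,
)
```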

3. Function Registration#

  • Register functions with @agent.register_for_llm() and @proxy.register_for_execution() decorators
  • Use type annotations and docstrings for better function calling performance
  • Consider using the decorator pattern for complex registration scenarios

4. Error Handling#

  • Implement proper error handling for API calls
  • Use secure evaluation methods for mathematical expressions
  • Test function calling thoroughly before production use
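For instance, the calculate_math tool above could swap eval for a small ast-based evaluator that only accepts arithmetic nodes. This is a sketch of the idea, not a hardened parser:

```python
import ast
import operator

# Whitelisted binary and unary operators
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def safe_eval(expression: str) -> float:
    """Evaluate a plain arithmetic expression without eval()."""
    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression: {expression}")
    # mode="eval" rejects statements; only an expression body is parsed
    return _eval(ast.parse(expression.replace("^", "**"), mode="eval").body)

print(safe_eval("15 * 23 + 7"))  # 352
```

Anything outside the whitelist (names, calls, attribute access) raises ValueError instead of executing.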

Troubleshooting#

Common Issues#

ValidationError: Extra inputs are not permitted
  • Cause: Placing parameters like temperature inside config_list entries
  • Solution: Move parameters to the top level of LLMConfig

TypeError: Object of type function is not JSON serializable
  • Cause: Passing function objects to LLMConfig.tools
  • Solution: Use @agent.register_for_llm() and @proxy.register_for_execution() decorators instead

Model not found warnings
  • Cause: AG2 doesn't recognize the model name for cost calculation
  • Solution: Add custom pricing with the price parameter or ignore the warning
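For the cost warning, AG2's price parameter takes [input, output] cost per 1K tokens. A sketch using the grok-3-mini rates from the table above ($0.30 / $0.50 per 1M tokens):

```python
import os
from autogen import LLMConfig

llm_config = LLMConfig(
    {
        "model": "grok-3-mini",
        "api_type": "openai",
        "base_url": "https://api.x.ai/v1",
        "api_key": os.environ.get("XAI_API_KEY"),
        # [input, output] cost per 1K tokens
        "price": [0.0003, 0.0005],
    },
)
```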

Other OpenAI-Compatible Providers#

The configuration pattern above works with any service that implements the OpenAI chat completions API — just change base_url and api_key:

import os
from autogen import LLMConfig

llm_config = LLMConfig(
    {
        "model": "your-model-name",
        "api_type": "openai",
        "base_url": "https://your-provider.com/v1",  # swap this
        "api_key": os.environ.get("YOUR_API_KEY"),    # and this
    },
    temperature=0.7,
)

Some popular OpenAI-compatible providers:

| Provider | base_url | Example models |
|---|---|---|
| Together AI | https://api.together.xyz/v1 | meta-llama/Llama-3-70b-chat-hf |
| Fireworks AI | https://api.fireworks.ai/inference/v1 | accounts/fireworks/models/llama-v3p1-70b-instruct |
| Groq | https://api.groq.com/openai/v1 | llama-3.3-70b-versatile |

For provider-specific features, use extra_body or extra_headers to pass additional parameters without changing the core configuration.

Additional Resources#