Code Execution

Give your AG2 agents the ability to execute code with the PythonCodeExecutionTool. Using a Python environment you provide (the system interpreter, a local virtual environment, or a Docker container), your agents can generate Python code, run it, and use the results.

The ability to execute code enables your agents not only to write code for applications but also to test hypotheses, validate existing code, and iterate until they reach a working solution.

Warning

Executing code, particularly code generated by an LLM, has inherent risks.

Different environments offer different levels of isolation:

- System environment: No isolation, code runs directly on your system
- Virtual environment: Package isolation only, code still executes within your operating system
- Docker environment: Container isolation, provides stronger separation from your host system

The PythonCodeExecutionTool combines a Python environment and a working directory, both of which you need to specify.

Installation#

No additional extras or packages are required for basic usage, and the tool works on macOS, Windows, and Linux.

For Docker environments, you need to have Docker installed and running on your system.

You will need to have the Python versions you want to run installed. The tool will not install a specific Python version for you.

Environments#

The tool can run code in three environments:

1. System Environment: equivalent to your terminal or command prompt; it uses the Python interpreter available by default at the system level.
2. Virtual Environment (venv): uses Python's virtual environment capabilities and has its own independent set of Python packages.
3. Docker Environment: executes code within Docker containers, providing stronger isolation and environment control.

You create one of these environments and pass it to the tool through the python_environment parameter of the PythonCodeExecutionTool.

Tip

In terms of isolation and security:

- System environment is the least isolated.
- Virtual environment provides package isolation but still executes on your system.
- Docker environment provides the strongest isolation and is recommended for untrusted code.

System Environment#

The SystemPythonEnvironment uses the operating system's Python environment. Code run in it has access to all the packages installed there.

The SystemPythonEnvironment takes a single optional parameter, executable, which is the path (including filename) to a Python executable. This is useful if you have multiple Python interpreters installed and want the code to run with a specific one.
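If you are unsure which path to pass as executable, the standard library can list candidates. The following is a minimal sketch, independent of AG2; the helper name find_interpreters and the command names it searches for are illustrative assumptions.

```python
import shutil
import sys


def find_interpreters(names=("python3", "python")):
    """Return a dict of command name -> path for interpreters found on PATH."""
    found = {}
    for name in names:
        path = shutil.which(name)  # None when the command is not on PATH
        if path is not None:
            found[name] = path
    return found


# The interpreter running this script is normally a usable choice as well.
print("Current interpreter:", sys.executable)
print("Interpreters on PATH:", find_interpreters())
```

Any of the printed paths could then be supplied as the executable value, provided it points at the interpreter whose installed packages you want the code to see.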

Venv Environment#

The VenvPythonEnvironment uses or creates a virtual environment to run the code in. This provides package isolation from the operating system Python environment.

The parameters for the VenvPythonEnvironment allow you to use existing, or create new, virtual environments.

Parameter	Description
`python_version`	If you do not specify a `python_path` or a `venv_path` with an existing virtual environment, a virtual environment with the specified Python version will be created. That Python version must already be installed; the tool will not install it.
`python_path`	If you want to use a specific Python interpreter, you can pass in its path (including filename). This takes precedence over `python_version`, but will not be used if `venv_path` is provided and a virtual environment already exists there.
`venv_path`	If you want to use an existing virtual environment or want a new one to be created in a specific location, you can specify a path for it.

Tip

If you pass in a python_version, the tool will look for the Python interpreter executable in typical locations on disk. If yours is installed elsewhere, use python_path to specify the correct location.

Note: The virtual environment directory will not be removed by the tool.
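The precedence between the three parameters can be sketched as a small decision function. This only illustrates the rules stated in the table and is not AG2's implementation: the function name, the returned sentences, and the use of a pyvenv.cfg file to detect an existing virtual environment are assumptions.

```python
import os


def describe_venv_choice(python_version=None, python_path=None, venv_path=None):
    """Return a sentence describing which documented rule applies (illustration only)."""
    # Assumption: an existing virtual environment is recognised by its pyvenv.cfg file.
    if venv_path is not None and os.path.isfile(os.path.join(venv_path, "pyvenv.cfg")):
        return f"use the existing virtual environment at {venv_path}"

    location = venv_path if venv_path is not None else "a location chosen by the tool"
    if python_path is not None:
        # python_path takes precedence over python_version.
        return f"create a virtual environment at {location} with the interpreter {python_path}"
    if python_version is not None:
        return f"create a virtual environment at {location} with Python {python_version}"
    return "not covered by the rules above"


print(describe_venv_choice(python_version="3.11"))
print(describe_venv_choice(python_version="3.11", python_path="/opt/python/bin/python3"))
```

The second call shows python_path winning over python_version; the path used is a placeholder.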

Docker Environment#

The DockerPythonEnvironment executes code within Docker containers, providing stronger isolation from the host system and greater flexibility in environment configuration.

Key parameters for the DockerPythonEnvironment include:

Parameter	Description
`image`	Docker image to use (e.g., `python:3.11-slim`). Defaults to Python 3.11.
`pip_packages`	List of pip packages to install in the container.
`requirements_file`	Path to a requirements.txt file to install in the container.
`volumes`	Dictionary mapping host paths to container paths for mounting volumes.
`dockerfile`	Optional path to a Dockerfile to build and use instead of pulling an image.
`cleanup_container`	Whether to remove the container after use. Defaults to `True`.
`keep_container_running`	Whether to keep the container running after execution. Defaults to `False`.

Tip

The Docker environment offers the strongest isolation for executing untrusted code. It's highly recommended for running code from LLMs in production systems or when working with potentially risky operations.

Note: Docker must be installed and running on your system to use this environment.
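Before creating a Docker environment, it can help to confirm that Docker is reachable. The following is a minimal sketch using only the standard library and the docker CLI; the helper name docker_is_available is hypothetical and not part of AG2.

```python
import shutil
import subprocess


def docker_is_available(timeout: float = 10.0) -> bool:
    """Return True when the docker CLI is on PATH and `docker info` succeeds."""
    if shutil.which("docker") is None:
        return False
    try:
        completed = subprocess.run(
            ["docker", "info"],  # exits with a non-zero code when the daemon is unreachable
            capture_output=True,
            timeout=timeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return completed.returncode == 0


print("Docker available:", docker_is_available())
```

A check like this lets you fall back to another environment, or fail early with a clear message, instead of hitting an error when the container is created.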

Working Directory#

The tool will create a script.py file with the code and execute it. The location of this file is specified by passing in a WorkingDirectory to the working_directory parameter of the PythonCodeExecutionTool.

If you don't want to specify a directory you can use WorkingDirectory's create_tmp method to create a temporary one.

With Docker environments, the working directory is mounted inside the container, allowing files to be shared between the host and the container.

Note: The working directory will not be removed by the tool.
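The write-then-execute pattern described above can be sketched with the standard library. This is not the tool's implementation: the function name is hypothetical, and running the script with the current interpreter and with the working directory as its current directory are assumptions made for the illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_directory(code: str, directory: Path) -> subprocess.CompletedProcess:
    """Write code to script.py inside directory and run it with the current interpreter."""
    script = directory / "script.py"
    script.write_text(code, encoding="utf-8")
    return subprocess.run(
        [sys.executable, str(script)],
        cwd=directory,  # assumption: relative paths in the code resolve against this directory
        capture_output=True,
        text=True,
    )


# mkdtemp creates a directory that is not removed automatically.
work_dir = Path(tempfile.mkdtemp(prefix="ag2_sketch_"))
result = run_in_directory("print('hello from script.py')", work_dir)
print(result.stdout.strip())
print("Script kept at:", work_dir / "script.py")
```

As with the tool, the directory and the script remain on disk afterwards, so cleaning them up is left to you.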

Timeout#

To stop code that runs longer than expected, the tool accepts a timeout parameter (in seconds). The default timeout is 30 seconds.
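The effect of a timeout can be sketched with subprocess from the standard library. This is an illustration of the general technique, not the tool's code; the function name and the message it returns are assumptions.

```python
import subprocess
import sys


def run_with_timeout(code: str, timeout: float = 30.0) -> str:
    """Run code in a child interpreter; return its stdout or a timeout message."""
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,  # the child is killed once this many seconds have passed
        )
    except subprocess.TimeoutExpired:
        return f"Execution timed out after {timeout} seconds"
    return completed.stdout


print(run_with_timeout("print('done')"))
print(run_with_timeout("import time; time.sleep(5)", timeout=0.5))
```

The first call finishes well within the limit, while the second is stopped and reported as timed out. Returning a message rather than raising gives the calling agent something it can act on.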

Packages#

The tool does not install missing packages that the code depends on, except in Docker environments, where you can specify packages to install.

For system and virtual environments, create and configure your environment with the needed packages beforehand.

For Docker environments, you can specify packages directly via the pip_packages parameter or with a requirements_file.
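For system and virtual environments, a quick check can show which modules are missing before an agent starts generating code. The following is a minimal sketch with a hypothetical helper name. It inspects the interpreter that runs it, so it should be run with the target environment's interpreter, and it expects top-level module names, which can differ from pip package names.

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level module names that cannot be found by this interpreter."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# "json" ships with Python; the second name is deliberately made up.
print(missing_packages(["json", "module_that_should_not_exist_xyz"]))
```

An empty list means every listed module can be imported in that environment.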

Examples#

System Environment#

from autogen import ConversableAgent, LLMConfig, register_function

# Import the environment, working directory, and code execution tool
from autogen.environments import SystemPythonEnvironment, WorkingDirectory
from autogen.tools.experimental import PythonCodeExecutionTool

with SystemPythonEnvironment(executable="/usr/local/bin/python") as sys_py_env:
    with WorkingDirectory(path="/tmp/ag2_working_dir/") as wd:
        # Create our code execution tool, using the environment and working directory from the above context managers
        python_executor = PythonCodeExecutionTool(
            timeout=60,
            # If not using the context managers above, you can set the working directory and python environment here
            # working_directory=wd,
            # python_environment=sys_py_env,
        )

with LLMConfig(model="gpt-4o", api_type="openai"):

    # code_runner has the code execution tool available to execute
    code_runner = ConversableAgent(
        name="code_runner",
        system_message="You are a code executor agent, when you don't execute code write the message 'TERMINATE' by itself.",
        human_input_mode="NEVER",
    )

    # question_agent has the code execution tool available to its LLM
    question_agent = ConversableAgent(
        name="question_agent",
        system_message=("You are a developer AI agent. "
            "Send all your code suggestions to the python_executor tool where it will be executed and result returned to you. "
            "Keep refining the code until it works."
        ),
    )

# Register the python execution tool with the agents
register_function(
    python_executor,
    caller=question_agent,
    executor=code_runner,
    description="Run Python code",
)

result = code_runner.initiate_chat(
    recipient=question_agent,
    message=("Write Python code to print the current Python version followed by the numbers 1 to 11. "
             "Make a syntax error in the first version and fix it in the second version."
    ),
    max_turns=5,
)

print(f"Result: {result.summary}")

Venv Environment#

from autogen import ConversableAgent, LLMConfig, register_function

# Import the environment, working directory, and code execution tool
from autogen.environments import VenvPythonEnvironment, WorkingDirectory
from autogen.tools.experimental import PythonCodeExecutionTool

# Create a new virtual environment using a Python version
# Change this to match a version you have installed
venv = VenvPythonEnvironment(python_version="3.11")

# Create a temporary directory
working_dir = WorkingDirectory.create_tmp()

# Create our code execution tool
python_executor = PythonCodeExecutionTool(
    working_directory=working_dir,
    python_environment=venv,
)

with LLMConfig(model="gpt-4o", api_type="openai"):

    # code_runner has the code execution tool available to execute
    code_runner = ConversableAgent(
        name="code_runner",
        system_message="You are a code executor agent, when you don't execute code write the message 'TERMINATE' by itself.",
        human_input_mode="NEVER",
    )

    # question_agent has the code execution tool available to its LLM
    question_agent = ConversableAgent(
        name="question_agent",
        system_message=("You are a developer AI agent. "
            "Send all your code suggestions to the python_executor tool where it will be executed and result returned to you. "
            "Keep refining the code until it works."
        ),
    )

# Register the python execution tool with the agents
register_function(
    python_executor,
    caller=question_agent,
    executor=code_runner,
    description="Run Python code",
)

result = code_runner.initiate_chat(
    recipient=question_agent,
    message=("Write a Python program to write a poem to a file. "
             "Follow up with another program to read the poem from the file and print it."
    ),
    max_turns=5,
)

print(f"Result: {result.summary}")

Docker Environment#

from autogen import ConversableAgent, LLMConfig, register_function

# Import the environment, working directory, and code execution tool
from autogen.environments import DockerPythonEnvironment, WorkingDirectory
from autogen.tools.experimental import PythonCodeExecutionTool

with DockerPythonEnvironment(image="python:3.11-slim", pip_packages=["numpy", "pandas", "matplotlib"]) as docker_env:
    # When you exit the DockerPythonEnvironment context manager it will delete the Docker container (unless you set cleanup_container=False)

    with WorkingDirectory(path="/tmp/ag2_working_dir/") as wd:
        # Create our code execution tool, using the environment and working directory from the above context managers
        python_executor = PythonCodeExecutionTool(
            timeout=60,
            # If not using the context managers above, you can set the working directory and python environment here
            # working_directory=wd,
            # python_environment=docker_env,
        )

    with LLMConfig(model="gpt-4o", api_type="openai"):

        # code_runner has the code execution tool available to execute
        code_runner = ConversableAgent(
            name="code_runner",
            system_message="You are a code executor agent, when you don't execute code write the message 'TERMINATE' by itself.",
            human_input_mode="NEVER",
        )

        # question_agent has the code execution tool available to its LLM
        question_agent = ConversableAgent(
            name="question_agent",
            system_message=("You are a developer AI agent. "
                "Send all your code suggestions to the python_executor tool where it will be executed and result returned to you. "
                "Keep refining the code until it works."
            ),
        )

    # Register the python execution tool with the agents
    register_function(
        python_executor,
        caller=question_agent,
        executor=code_runner,
        description="Run Python code",
    )

    result = code_runner.initiate_chat(
        recipient=question_agent,
        message=("Write Python code to print the current Python version followed by the numbers 1 to 11. "
                "Make a syntax error in the first version and fix it in the second version."
        ),
        max_turns=5,
    )

    print(f"Result: {result.summary}")

Comparing Execution Environments#

Feature	System Environment	Virtual Environment	Docker Environment
Isolation	Minimal	Package-level	Container-level
Security	Low	Medium	Higher
Setup Complexity	Low	Medium	Higher
Resource Overhead	Minimal	Low	Medium
Dependency Management	Shared	Isolated	Fully isolated
Persistence	System-wide	Directory-based	Image/Container based
Recommended Use	Trusted code, simple scripts	General development	Untrusted code, complex dependencies