Code Execution
Add code execution as a capability for your AG2 agents with the PythonCodeExecutionTool!
Using a Python environment you provide, whether that's the system one, a local virtual one, or a Docker container, your agents can generate and run Python code and use the results.
The ability to execute code enables your agents not only to create code for applications but also to test hypotheses, validate existing code, and iterate on code until a working solution is found.
Warning
Executing code, particularly code generated by an LLM, has inherent risks.
Different environments offer different levels of isolation:
- System environment: No isolation; code runs directly on your system
- Virtual environment: Package isolation only; code still executes within your operating system
- Docker environment: Container isolation; provides stronger separation from your host system
The PythonCodeExecutionTool consists of a Python environment and a working directory, both of which you need to specify.
Installation
No additional extras/packages are required for basic usage, and this tool will work for Mac, Windows, and Linux.
For Docker environments, you need to have Docker installed and running on your system.
You will need to have the Python versions you want to run installed. The tool will not install a specific Python version for you.
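Because the tool does not install interpreters or Docker for you, it can help to check what is available before configuring an environment. The sketch below uses only the standard library's shutil.which; the helper name find_tools and the candidate command names are illustrative assumptions, not part of AG2.

```python
import shutil


def find_tools(commands):
    """Map each command name to its full path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in commands}


# Candidate names are examples only; adjust them to the versions you need.
available = find_tools(["python3", "python3.11", "docker"])
for name, path in available.items():
    print(f"{name}: {path or 'not found'}")
```

Finding docker on the PATH only shows that the client is installed; it does not confirm that the Docker daemon is running.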
Environments
There are three environments the tool can run code in:
1. System Environment: equivalent to your terminal or command prompt; it uses the default Python interpreter (and version) available at the system level.
2. Virtual Environment (venv): uses Python's virtual environment capabilities and has its own independent set of Python packages.
3. Docker Environment: executes code within Docker containers, providing stronger isolation and environment control.
You create one of these and pass it in using the python_environment parameter of the PythonCodeExecutionTool.
Tip
In terms of isolation and security:
- System environment is the least isolated
- Virtual environment provides package isolation but still executes on your system
- Docker environment provides the strongest isolation and is recommended for untrusted code
System Environment
The SystemPythonEnvironment uses the operating system's Python environment. It allows you to run code with all the installed packages.
The SystemPythonEnvironment has a single, optional parameter, executable, which can be a path (including filename) to the Python executable. This is useful if you have multiple Python interpreters installed and want to execute the code with a specific one.
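Before passing a path as executable, you may want to confirm which Python version that interpreter actually provides. The following is a minimal sketch using the standard library's subprocess module; python_version_of is a hypothetical helper, not an AG2 function, and sys.executable stands in for the path you intend to use.

```python
import subprocess
import sys


def python_version_of(executable):
    """Ask the interpreter at `executable` for its major.minor version."""
    completed = subprocess.run(
        [executable, "-c", "import sys; print('%d.%d' % sys.version_info[:2])"],
        capture_output=True,
        text=True,
        check=True,  # raise if the interpreter cannot be run successfully
    )
    return completed.stdout.strip()


# sys.executable is the interpreter running this script; replace it with the
# path you plan to pass as `executable`.
print(python_version_of(sys.executable))
```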
Venv Environment
The VenvPythonEnvironment uses or creates a virtual environment to run the code in. This provides package isolation from the operating system Python environment.
The parameters for the VenvPythonEnvironment allow you to use existing, or create new, virtual environments.
Parameter | Description |
---|---|
python_version | If you do not specify a python_path or a venv_path with an existing virtual environment, a virtual environment with the specified Python version will be created. That Python version must already be installed; the tool will not install it. |
python_path | If you want to use a specific Python interpreter, you can pass in its path (including filename). This takes precedence over python_version, but will not be used if venv_path is provided and a virtual environment already exists there. |
venv_path | If you want to use an existing virtual environment or want a new one to be created in a specific location, you can specify a path for it. |
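
The precedence between these three parameters can be easier to follow as code. The sketch below restates the table's rules as a plain function. The names describe_venv_choice and venv_exists are hypothetical and used only for illustration, not part of AG2, and the final fallback (no parameters given) is an assumption because the table does not cover that case.

```python
def describe_venv_choice(python_version=None, python_path=None,
                         venv_path=None, venv_exists=False):
    """Describe which setting wins, following the table's stated precedence.

    This restates the documentation; it is not AG2's implementation.
    """
    # An existing virtual environment at venv_path is reused as-is.
    if venv_path is not None and venv_exists:
        return f"reuse existing venv at {venv_path}"
    location = venv_path if venv_path is not None else "a tool-chosen location"
    # python_path takes precedence over python_version.
    if python_path is not None:
        return f"create venv at {location} using interpreter {python_path}"
    if python_version is not None:
        return f"create venv at {location} using Python {python_version}"
    # Not covered by the table: this fallback is an assumption.
    return f"create venv at {location} using the default interpreter"


print(describe_venv_choice(python_version="3.11", python_path="/opt/py/bin/python"))
print(describe_venv_choice(python_path="/opt/py/bin/python", venv_path=".venv", venv_exists=True))
```
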
Tip
If you pass in a python_version, the tool will look for the Python interpreter executable in typical locations on disk. However, if yours is installed elsewhere, you will need to use the python_path parameter to specify the correct location.
Note: The virtual environment directory will not be removed by the tool.
Docker Environment
The DockerPythonEnvironment executes code within Docker containers, providing stronger isolation from the host system and greater flexibility in environment configuration.
Key parameters for the DockerPythonEnvironment include:
Parameter | Description |
---|---|
image | Docker image to use (e.g., python:3.11-slim). Defaults to Python 3.11. |
pip_packages | List of pip packages to install in the container. |
requirements_file | Path to a requirements.txt file to install in the container. |
volumes | Dictionary mapping host paths to container paths for mounting volumes. |
dockerfile | Optional path to a Dockerfile to build and use instead of pulling an image. |
cleanup_container | Whether to remove the container after use. Defaults to True. |
keep_container_running | Whether to keep the container running after execution. Defaults to False. |
Tip
The Docker environment offers the strongest isolation for executing untrusted code. It's highly recommended for running code from LLMs in production systems or when working with potentially risky operations.
Note: Docker must be installed and running on your system to use this environment.
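To make the image and volumes parameters more concrete, the sketch below builds the kind of docker run command line that mounts a host directory and runs a script inside a container. It only assembles the argument list and executes nothing. The helper build_docker_run_args and the /workspace container path are assumptions for illustration, and AG2 may manage its containers quite differently.

```python
def build_docker_run_args(image, volumes, workdir, script_name="script.py"):
    """Build an argument list for `docker run`; nothing is executed here."""
    args = ["docker", "run", "--rm"]  # --rm removes the container on exit
    for host_path, container_path in volumes.items():
        # -v mounts a host path at the given path inside the container
        args += ["-v", f"{host_path}:{container_path}"]
    # -w sets the working directory inside the container
    args += ["-w", workdir, image, "python", script_name]
    return args


example_args = build_docker_run_args(
    image="python:3.11-slim",
    volumes={"/tmp/ag2_working_dir": "/workspace"},  # host path -> container path
    workdir="/workspace",
)
print(" ".join(example_args))
```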
Working Directory
The tool will create a script.py file with the code and execute it. The location of this file is specified by passing a WorkingDirectory to the working_directory parameter of the PythonCodeExecutionTool.
If you don't want to specify a directory, you can use WorkingDirectory's create_tmp method to create a temporary one.
With Docker environments, the working directory is mounted inside the container, allowing files to be shared between the host and the container.
Note: The working directory will not be removed by the tool.
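The write-then-run flow described above can be sketched with the standard library alone. This is an illustration of the idea under stated assumptions, not AG2's implementation: run_in_directory is a hypothetical helper, and the current interpreter (sys.executable) stands in for the configured Python environment.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_directory(code, directory):
    """Write `code` to script.py inside `directory`, run it, return stdout."""
    script = Path(directory) / "script.py"
    script.write_text(code, encoding="utf-8")
    completed = subprocess.run(
        [sys.executable, script.name],
        cwd=directory,  # relative paths in the code resolve against this directory
        capture_output=True,
        text=True,
    )
    return completed.stdout


work_dir = tempfile.mkdtemp(prefix="ag2_sketch_")
output = run_in_directory("print('hello from script.py')", work_dir)
print(output, end="")
# The directory and script.py are still on disk at this point; remove them
# yourself (for example with shutil.rmtree) once they are no longer needed.
```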
Timeout
To help handle code that runs longer than is reasonable, a timeout parameter (in seconds) is provided. The default timeout is 30 seconds.
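As a rough illustration of what a timeout on code execution means in practice, the sketch below runs code in a child interpreter and stops waiting once the limit is reached. It relies on the timeout argument of the standard library's subprocess.run; run_with_timeout and its return shape are hypothetical, and how AG2 reports a timeout to the agent is not shown here.

```python
import subprocess
import sys


def run_with_timeout(code, timeout_seconds):
    """Run `code` in a child interpreter; report whether it finished in time."""
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # the child is killed once this elapses
        )
    except subprocess.TimeoutExpired:
        return {"timed_out": True, "stdout": ""}
    return {"timed_out": False, "stdout": completed.stdout}


fast = run_with_timeout("print('done')", timeout_seconds=30)
slow = run_with_timeout("import time; time.sleep(5)", timeout_seconds=0.5)
print(fast["timed_out"], slow["timed_out"])
```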
Packages
Missing packages that the code depends on are not installed automatically, except in Docker environments, where you can specify packages to install.
For system and virtual environments, create and configure your environment with the needed packages beforehand.
For Docker environments, you can specify packages directly via the pip_packages parameter or with a requirements_file.
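For system and virtual environments, one way to catch missing dependencies early is to check whether the modules can be found before any agent-generated code runs. The sketch below uses the standard library's importlib.util.find_spec; missing_packages is a hypothetical helper, not part of AG2. It checks the interpreter it runs under, so run it with the same interpreter your environment uses.

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# "json" ships with Python; the second name is deliberately made up.
print(missing_packages(["json", "surely_not_installed_pkg_xyz"]))
```

Keep in mind that a package's import name can differ from the name you install with pip, so pass the names used in import statements.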
Examples
System Environment
from autogen import ConversableAgent, LLMConfig, register_function

# Import the environment, working directory, and code execution tool
from autogen.environments import SystemPythonEnvironment, WorkingDirectory
from autogen.tools.experimental import PythonCodeExecutionTool

with SystemPythonEnvironment(executable="/usr/local/bin/python") as sys_py_env:
    with WorkingDirectory(path="/tmp/ag2_working_dir/") as wd:
        # Create our code execution tool, using the environment and working directory from the above context managers
        python_executor = PythonCodeExecutionTool(
            timeout=60,
            # If not using the context managers above, you can set the working directory and python environment here
            # working_directory=wd,
            # python_environment=sys_py_env,
        )

        with LLMConfig(model="gpt-4o", api_type="openai"):
            # code_runner has the code execution tool available to execute
            code_runner = ConversableAgent(
                name="code_runner",
                system_message="You are a code executor agent, when you don't execute code write the message 'TERMINATE' by itself.",
                human_input_mode="NEVER",
            )

            # question_agent has the code execution tool available to its LLM
            question_agent = ConversableAgent(
                name="question_agent",
                system_message=(
                    "You are a developer AI agent. "
                    "Send all your code suggestions to the python_executor tool where it will be executed and result returned to you. "
                    "Keep refining the code until it works."
                ),
            )

        # Register the python execution tool with the agents
        register_function(
            python_executor,
            caller=question_agent,
            executor=code_runner,
            description="Run Python code",
        )

        result = code_runner.initiate_chat(
            recipient=question_agent,
            message=(
                "Write Python code to print the current Python version followed by the numbers 1 to 11. "
                "Make a syntax error in the first version and fix it in the second version."
            ),
            max_turns=5,
        )

        print(f"Result: {result.summary}")
Venv Environment
from autogen import ConversableAgent, LLMConfig, register_function

# Import the environment, working directory, and code execution tool
from autogen.environments import VenvPythonEnvironment, WorkingDirectory
from autogen.tools.experimental import PythonCodeExecutionTool

# Create a new virtual environment using a Python version
# Change this to match a version you have installed
venv = VenvPythonEnvironment(python_version="3.11")

# Create a temporary directory
working_dir = WorkingDirectory.create_tmp()

# Create our code execution tool
python_executor = PythonCodeExecutionTool(
    working_directory=working_dir,
    python_environment=venv,
)

with LLMConfig(model="gpt-4o", api_type="openai"):
    # code_runner has the code execution tool available to execute
    code_runner = ConversableAgent(
        name="code_runner",
        system_message="You are a code executor agent, when you don't execute code write the message 'TERMINATE' by itself.",
        human_input_mode="NEVER",
    )

    # question_agent has the code execution tool available to its LLM
    question_agent = ConversableAgent(
        name="question_agent",
        system_message=(
            "You are a developer AI agent. "
            "Send all your code suggestions to the python_executor tool where it will be executed and result returned to you. "
            "Keep refining the code until it works."
        ),
    )

# Register the python execution tool with the agents
register_function(
    python_executor,
    caller=question_agent,
    executor=code_runner,
    description="Run Python code",
)

result = code_runner.initiate_chat(
    recipient=question_agent,
    message=(
        "Write a Python program to write a poem to a file. "
        "Follow up with another program to read the poem from the file and print it."
    ),
    max_turns=5,
)

print(f"Result: {result.summary}")
Docker Environment
from autogen import ConversableAgent, LLMConfig, register_function

# Import the environment, working directory, and code execution tool
from autogen.environments import DockerPythonEnvironment, WorkingDirectory
from autogen.tools.experimental import PythonCodeExecutionTool

with DockerPythonEnvironment(image="python:3.11-slim", pip_packages=["numpy", "pandas", "matplotlib"]) as docker_env:
    # When you exit the DockerPythonEnvironment context manager it will delete the Docker container (unless you set cleanup_container=False)
    with WorkingDirectory(path="/tmp/ag2_working_dir/") as wd:
        # Create our code execution tool, using the environment and working directory from the above context managers
        python_executor = PythonCodeExecutionTool(
            timeout=60,
            # If not using the context managers above, you can set the working directory and python environment here
            # working_directory=wd,
            # python_environment=docker_env,
        )

        with LLMConfig(model="gpt-4o", api_type="openai"):
            # code_runner has the code execution tool available to execute
            code_runner = ConversableAgent(
                name="code_runner",
                system_message="You are a code executor agent, when you don't execute code write the message 'TERMINATE' by itself.",
                human_input_mode="NEVER",
            )

            # question_agent has the code execution tool available to its LLM
            question_agent = ConversableAgent(
                name="question_agent",
                system_message=(
                    "You are a developer AI agent. "
                    "Send all your code suggestions to the python_executor tool where it will be executed and result returned to you. "
                    "Keep refining the code until it works."
                ),
            )

        # Register the python execution tool with the agents
        register_function(
            python_executor,
            caller=question_agent,
            executor=code_runner,
            description="Run Python code",
        )

        result = code_runner.initiate_chat(
            recipient=question_agent,
            message=(
                "Write Python code to print the current Python version followed by the numbers 1 to 11. "
                "Make a syntax error in the first version and fix it in the second version."
            ),
            max_turns=5,
        )

        print(f"Result: {result.summary}")
Comparing Execution Environments
Feature | System Environment | Virtual Environment | Docker Environment |
---|---|---|---|
Isolation | Minimal | Package-level | Container-level |
Security | Low | Medium | Higher |
Setup Complexity | Low | Medium | Higher |
Resource Overhead | Minimal | Low | Medium |
Dependency Management | Shared | Isolated | Fully isolated |
Persistence | System-wide | Directory-based | Image/Container based |
Recommended Use | Trusted code, simple scripts | General development | Untrusted code, complex dependencies |