Open In Colab Open on GitHub

Cohere is a cloud based platform serving their own LLMs, in particular the Command family of models.

Cohere’s API differs from OpenAI’s, which is the native API used by AutoGen, so to use Cohere’s LLMs you need to use this library.

You will need a Cohere account and create an API key. See their website for further details.

Features

When using this client class, AutoGen’s messages are automatically tailored to accommodate the specific requirements of Cohere’s API.

Additionally, this client class provides support for function/tool calling and will track token usage and cost correctly as per Cohere’s API costs (as of July 2024).

Getting started

First you need to install the pyautogen package to use AutoGen with the Cohere API library.

pip install pyautogen[cohere]

Cohere provides a number of models to use, included below. See the list of models here.

See the sample OAI_CONFIG_LIST below showing how the Cohere client class is used by specifying the api_type as cohere.

[
    {
        "model": "gpt-35-turbo",
        "api_key": "your OpenAI Key goes here",
    },
    {
        "model": "gpt-4-vision-preview",
        "api_key": "your OpenAI Key goes here",
    },
    {
        "model": "dalle",
        "api_key": "your OpenAI Key goes here",
    },
    {
        "model": "command-r-plus",
        "api_key": "your Cohere API Key goes here",
        "api_type": "cohere"
    },
    {
        "model": "command-r",
        "api_key": "your Cohere API Key goes here",
        "api_type": "cohere"
    },
    {
        "model": "command",
        "api_key": "your Cohere API Key goes here",
        "api_type": "cohere"
    }
]

As an alternative to the api_key key and value in the config, you can set the environment variable COHERE_API_KEY to your Cohere key.

Linux/Mac:

export COHERE_API_KEY="your_cohere_api_key_here"

Windows:

set COHERE_API_KEY=your_cohere_api_key_here

API parameters

The following parameters can be added to your config for the Cohere API. See this link for further information on them and their default values.

  • temperature (number > 0)
  • p (number 0.01..0.99)
  • k (number 0..500)
  • max_tokens (null, integer >= 0)
  • seed (null, integer)
  • frequency_penalty (number 0..1)
  • presence_penalty (number 0..1)
  • client_name (null, string)

Example:

[
    {
        "model": "command-r",
        "api_key": "your Cohere API Key goes here",
        "api_type": "cohere",
        "client_name": "autogen-cohere",
        "temperature": 0.5,
        "p": 0.2,
        "k": 100,
        "max_tokens": 2048,
        "seed": 42,
        "frequency_penalty": 0.5,
        "presence_penalty": 0.2
    }
]

Two-Agent Coding Example

In this example, we run a two-agent chat with an AssistantAgent (primarily a coding agent) to generate code to count the number of prime numbers between 1 and 10,000 and then it will be executed.

We’ll use Cohere’s Command R model which is suitable for coding.

import os

config_list = [
    {
        # Let's choose the Command-R model
        "model": "command-r",
        # Provide your Cohere's API key here or put it into the COHERE_API_KEY environment variable.
        "api_key": os.environ.get("COHERE_API_KEY"),
        # We specify the API Type as 'cohere' so it uses the Cohere client class
        "api_type": "cohere",
    }
]

Importantly, we have tweaked the system message so that the model doesn’t return the termination keyword, which we’ve changed to FINISH, with the code block.

from pathlib import Path

from autogen import AssistantAgent, UserProxyAgent
from autogen.coding import LocalCommandLineCodeExecutor

# Setting up the code executor
workdir = Path("coding")
workdir.mkdir(exist_ok=True)
code_executor = LocalCommandLineCodeExecutor(work_dir=workdir)

# Setting up the agents

# The UserProxyAgent will execute the code that the AssistantAgent provides
user_proxy_agent = UserProxyAgent(
    name="User",
    code_execution_config={"executor": code_executor},
    is_termination_msg=lambda msg: "FINISH" in msg.get("content"),
)

system_message = """You are a helpful AI assistant who writes code and the user executes it.
Solve tasks using your coding and language skills.
In the following cases, suggest python code (in a python coding block) for the user to execute.
Solve the task step by step if you need to. If a plan is not provided, explain your plan first. Be clear which step uses code, and which step uses your language skill.
When using code, you must indicate the script type in the code block. The user cannot provide any other feedback or perform any other action beyond executing the code you suggest. The user can't modify your code. So do not suggest incomplete code which requires users to modify. Don't use a code block if it's not intended to be executed by the user.
Don't include multiple code blocks in one response. Do not ask users to copy and paste the result. Instead, use 'print' function for the output when relevant. Check the execution result returned by the user.
If the result indicates there is an error, fix the error and output the code again. Suggest the full code instead of partial code or code changes. If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.
When you find an answer, verify the answer carefully. Include verifiable evidence in your response if possible.
IMPORTANT: Wait for the user to execute your code and then you can reply with the word "FINISH". DO NOT OUTPUT "FINISH" after your code block."""

# The AssistantAgent, using Cohere's model, will take the coding request and return code
assistant_agent = AssistantAgent(
    name="Cohere Assistant",
    system_message=system_message,
    llm_config={"config_list": config_list},
)
/usr/local/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  from .autonotebook import tqdm as notebook_tqdm
# Start the chat, with the UserProxyAgent asking the AssistantAgent the message
chat_result = user_proxy_agent.initiate_chat(
    assistant_agent,
    message="Provide code to count the number of prime numbers from 1 to 10000.",
)
User (to Cohere Assistant):

Provide code to count the number of prime numbers from 1 to 10000.

--------------------------------------------------------------------------------
Cohere Assistant (to User):

Here's the code to count the number of prime numbers from 1 to 10,000:
```python
# Prime Number Counter
count = 0
for num in range(2, 10001):
    if num > 1:
        for div in range(2, num):
            if (num % div) == 0:
                break
        else:
            count += 1
print(count)
```

My plan is to use two nested loops. The outer loop iterates through numbers from 2 to 10,000. The inner loop checks if there's any divisor for the current number in the range from 2 to the number itself. If there's no such divisor, the number is prime and the counter is incremented.

Please execute the code and let me know the output.

--------------------------------------------------------------------------------

>>>>>>>> NO HUMAN INPUT RECEIVED.

>>>>>>>> USING AUTO REPLY...

>>>>>>>> EXECUTING CODE BLOCK (inferred language is python)...
User (to Cohere Assistant):

exitcode: 0 (execution succeeded)
Code output: 1229


--------------------------------------------------------------------------------
Cohere Assistant (to User):

That's correct! The code you executed successfully found 1229 prime numbers within the specified range.

FINISH.

--------------------------------------------------------------------------------

>>>>>>>> NO HUMAN INPUT RECEIVED.

Tool Call Example

In this example, instead of writing code, we will show how Cohere’s Command R+ model can perform parallel tool calling, where it recommends calling more than one tool at a time.

We’ll use a simple travel agent assistant program where we have a couple of tools for weather and currency conversion.

We start by importing libraries and setting up our configuration to use Command R+ and the cohere client class.

import json
import os
from typing import Literal

from typing_extensions import Annotated

import autogen

config_list = [
    {"api_type": "cohere", "model": "command-r-plus", "api_key": os.getenv("COHERE_API_KEY"), "cache_seed": None}
]

Create our two agents.

# Create the agent for tool calling
chatbot = autogen.AssistantAgent(
    name="chatbot",
    system_message="""For currency exchange and weather forecasting tasks,
        only use the functions you have been provided with.
        Output 'HAVE FUN!' when an answer has been provided.""",
    llm_config={"config_list": config_list},
)

# Note that we have changed the termination string to be "HAVE FUN!"
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    is_termination_msg=lambda x: x.get("content", "") and "HAVE FUN!" in x.get("content", ""),
    human_input_mode="NEVER",
    max_consecutive_auto_reply=1,
)

Create the two functions, annotating them so that those descriptions can be passed through to the LLM.

We associate them with the agents using register_for_execution for the user_proxy so it can execute the function and register_for_llm for the chatbot (powered by the LLM) so it can pass the function definitions to the LLM.

# Currency Exchange function

CurrencySymbol = Literal["USD", "EUR"]

# Define our function that we expect to call


def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float:
    if base_currency == quote_currency:
        return 1.0
    elif base_currency == "USD" and quote_currency == "EUR":
        return 1 / 1.1
    elif base_currency == "EUR" and quote_currency == "USD":
        return 1.1
    else:
        raise ValueError(f"Unknown currencies {base_currency}, {quote_currency}")


# Register the function with the agent


@user_proxy.register_for_execution()
@chatbot.register_for_llm(description="Currency exchange calculator.")
def currency_calculator(
    base_amount: Annotated[float, "Amount of currency in base_currency"],
    base_currency: Annotated[CurrencySymbol, "Base currency"] = "USD",
    quote_currency: Annotated[CurrencySymbol, "Quote currency"] = "EUR",
) -> str:
    quote_amount = exchange_rate(base_currency, quote_currency) * base_amount
    return f"{format(quote_amount, '.2f')} {quote_currency}"


# Weather function


# Example function to make available to model
def get_current_weather(location, unit="fahrenheit"):
    """Get the weather for some location"""
    if "chicago" in location.lower():
        return json.dumps({"location": "Chicago", "temperature": "13", "unit": unit})
    elif "san francisco" in location.lower():
        return json.dumps({"location": "San Francisco", "temperature": "55", "unit": unit})
    elif "new york" in location.lower():
        return json.dumps({"location": "New York", "temperature": "11", "unit": unit})
    else:
        return json.dumps({"location": location, "temperature": "unknown"})


# Register the function with the agent


@user_proxy.register_for_execution()
@chatbot.register_for_llm(description="Weather forecast for US cities.")
def weather_forecast(
    location: Annotated[str, "City name"],
) -> str:
    weather_details = get_current_weather(location=location)
    weather = json.loads(weather_details)
    return f"{weather['location']} will be {weather['temperature']} degrees {weather['unit']}"

We pass through our customer’s message and run the chat.

Finally, we ask the LLM to summarise the chat and print that out.

# start the conversation
res = user_proxy.initiate_chat(
    chatbot,
    message="What's the weather in New York and can you tell me how much is 123.45 EUR in USD so I can spend it on my holiday? Throw a few holiday tips in as well.",
    summary_method="reflection_with_llm",
)

print(f"LLM SUMMARY: {res.summary['content']}")
user_proxy (to chatbot):

What's the weather in New York and can you tell me how much is 123.45 EUR in USD so I can spend it on my holiday? Throw a few holiday tips in as well.

--------------------------------------------------------------------------------
chatbot (to user_proxy):

I will use the weather_forecast function to find out the weather in New York, and the currency_calculator function to convert 123.45 EUR to USD. I will then search for 'holiday tips' to find some extra information to include in my answer.
***** Suggested tool call (45212): weather_forecast *****
Arguments: 
{"location": "New York"}
*********************************************************
***** Suggested tool call (16564): currency_calculator *****
Arguments: 
{"base_amount": 123.45, "base_currency": "EUR", "quote_currency": "USD"}
************************************************************

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING FUNCTION weather_forecast...

>>>>>>>> EXECUTING FUNCTION currency_calculator...
user_proxy (to chatbot):

user_proxy (to chatbot):

***** Response from calling tool (45212) *****
New York will be 11 degrees fahrenheit
**********************************************

--------------------------------------------------------------------------------
user_proxy (to chatbot):

***** Response from calling tool (16564) *****
135.80 USD
**********************************************

--------------------------------------------------------------------------------
chatbot (to user_proxy):

The weather in New York is 11 degrees Fahrenheit. 

€123.45 is worth $135.80. 

Here are some holiday tips:
- Make sure to pack layers for the cold weather
- Try the local cuisine, New York is famous for its pizza
- Visit Central Park and take in the views from the top of the Rockefeller Centre

HAVE FUN!

--------------------------------------------------------------------------------
LLM SUMMARY: The weather in New York is 11 degrees Fahrenheit. 123.45 EUR is worth 135.80 USD. Holiday tips: make sure to pack warm clothes and have a great time!

We can see that Command R+ recommended we call both tools and passed through the right parameters. The user_proxy executed them and this was passed back to Command R+ to interpret them and respond. Finally, Command R+ was asked to summarise the whole conversation.