Chatting with a teachable agent
Conversational assistants based on LLMs can remember the current chat with the user, and can even demonstrate in-context learning of things that the user teaches the assistant during the chat. But these memories and learnings are lost once the chat is over, or when a single chat grows too long for the LLM to handle effectively. In subsequent chats, the user is forced to repeat any necessary instructions over and over.
The optional agent capability called `Teachability` addresses these limitations by persisting user teachings across chat boundaries in long-term memory (a vector database). Memories (called memos) are created and saved to disk throughout a conversation, then loaded from disk later. Instead of copying all the memos into the context window, which would eat up valuable space, individual memos are retrieved into context only as needed. This allows the user to teach many facts, preferences and skills to the teachable agent just once, and have it remember them in later chats.
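To make the store-then-recall idea concrete, here is a toy sketch of a memo store that persists to disk and recalls only memos within a distance threshold of the query. It is not the `Teachability` implementation: `ToyMemoStore`, `embed`, and `distance` are hypothetical names, the bag-of-words "embedding" stands in for a real vector database, and the threshold only loosely mirrors the `recall_threshold` parameter used later in this notebook.

```python
import json
import math
import tempfile
from collections import Counter
from pathlib import Path


def embed(text):
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def distance(a, b):
    """Cosine distance between two word-count vectors (0 = same direction, 1 = no overlap)."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 if norm == 0 else 1.0 - dot / norm


class ToyMemoStore:
    """Hypothetical memo store that saves (input, output) pairs to a JSON file."""

    def __init__(self, path, reset=False):
        self.path = Path(path)
        if reset or not self.path.exists():
            self.memos = []  # start empty, ignoring anything already on disk
        else:
            self.memos = json.loads(self.path.read_text())

    def add(self, input_text, output_text):
        self.memos.append([input_text, output_text])
        self.path.write_text(json.dumps(self.memos))  # persist after every new memo

    def recall(self, query, threshold):
        """Return the outputs of memos whose input is closer to the query than the threshold."""
        q = embed(query)
        scored = sorted((distance(q, embed(i)), o) for i, o in self.memos)
        return [o for d, o in scored if d < threshold]


# First "chat": teach two things, which are written to disk.
db_file = Path(tempfile.mkdtemp()) / "memos.json"
first_chat = ToyMemoStore(db_file, reset=True)
first_chat.add("what is the orca model", "Orca is a 13B-parameter language model developed by Microsoft.")
first_chat.add("how to summarize an abstract", "Use three short bullet points: title, innovation, key empirical results.")

# Later "chat": a fresh object reloads the memos and recalls only the relevant one.
later_chat = ToyMemoStore(db_file)
recalled = later_chat.recall("tell me about the orca model", threshold=0.6)
print(recalled)
```

With the threshold at 0.6 only the Orca memo is returned; raising it to 1.5 returns both memos, which illustrates why a higher threshold recalls more but less relevant memos.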
In making decisions about memo storage and retrieval, `Teachability` calls an instance of `TextAnalyzerAgent` to analyze pieces of text in several different ways. This adds extra LLM calls involving a relatively small number of tokens. These calls can add a few seconds to the time a user waits for a response.
This notebook demonstrates how `Teachability` can be added to an agent so that it can learn facts, preferences, and skills from users. To chat with a teachable agent yourself, run `chat_with_teachable_agent.py`.
Requirements
:::info Requirements
Some extra dependencies are needed for this notebook, which can be installed via pip:
```bash
pip install "autogen[teachable]"
```
For more information, please refer to the [installation guide](/docs/installation/).
:::
Set your API Endpoint
The `config_list_from_json` function loads a list of configurations from an environment variable or a JSON file.
import autogen
from autogen import ConversableAgent, UserProxyAgent
from autogen.agentchat.contrib.capabilities.teachability import Teachability
config_list = autogen.config_list_from_json(
    env_or_file="OAI_CONFIG_LIST",
    file_location=".",
    filter_dict={
        "model": ["gpt-4", "gpt-4-1106-preview", "gpt4", "gpt-4-32k"],
    },
)
print(config_list[0]["model"])
gpt-4-1106-preview
:::tip
Learn more about configuring LLMs for agents [here](/docs/topics/llm_configuration).
:::
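As a rough illustration of what the `filter_dict` argument does, the sketch below parses a small configuration list and keeps only the entries whose `model` is in the accepted list. This is a stand-in written with the standard library, not AutoGen's own code: `filter_configs` is a hypothetical helper, and the JSON content and API key placeholders are made up for the example.

```python
import json

# Hypothetical contents of an OAI_CONFIG_LIST file or environment variable.
# The api_key values are placeholders, not real keys.
raw = """[
  {"model": "gpt-4", "api_key": "<your OpenAI API key>"},
  {"model": "gpt-3.5-turbo", "api_key": "<your OpenAI API key>"}
]"""


def filter_configs(configs, filter_dict):
    """Keep configs whose value for every key in filter_dict is one of the accepted values."""
    return [
        config
        for config in configs
        if all(config.get(key) in accepted for key, accepted in filter_dict.items())
    ]


configs = json.loads(raw)
kept = filter_configs(configs, {"model": ["gpt-4", "gpt-4-1106-preview", "gpt4", "gpt-4-32k"]})
print([config["model"] for config in kept])  # prints ['gpt-4']
```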
Construct Agents
For this walkthrough, we start by creating a teachable agent and resetting its memory store. This deletes any memories from prior conversations that may be stored on disk.
# Start by instantiating any agent that inherits from ConversableAgent.
teachable_agent = ConversableAgent(
    name="teachable_agent",  # The name is flexible, but should not contain spaces to work in group chat.
    llm_config={"config_list": config_list, "timeout": 120, "cache_seed": None},  # Disable caching.
)
# Instantiate the Teachability capability. Its parameters are all optional.
teachability = Teachability(
    verbosity=0,  # 0 for basic info, 1 to add memory operations, 2 for analyzer messages, 3 for memo lists.
    reset_db=True,
    path_to_db_dir="./tmp/notebook/teachability_db",
    recall_threshold=1.5,  # Higher numbers allow more (but less relevant) memos to be recalled.
)
# Now add the Teachability capability to the agent.
teachability.add_to_agent(teachable_agent)
# Instantiate a UserProxyAgent to represent the user. But in this notebook, all user input will be simulated.
user = UserProxyAgent(
    name="user",
    human_input_mode="NEVER",
    # Treat a missing or None content as an empty string so the check never raises.
    is_termination_msg=lambda x: "TERMINATE" in (x.get("content") or ""),
    max_consecutive_auto_reply=0,
    code_execution_config={
        "use_docker": False
    },  # Please set use_docker=True if docker is available to run the generated code. Using docker is safer than running the generated code directly.
)
CLEARING MEMORY
Learning new facts
Let’s teach the agent some facts it doesn’t already know, since they are more recent than GPT-4’s training data.
text = "What is the Vicuna model?"
user.initiate_chat(teachable_agent, message=text, clear_history=True)
user (to teachable_agent):
What is the Vicuna model?
--------------------------------------------------------------------------------
>>>>>>>> USING AUTO REPLY...
teachable_agent (to user):
The term "Vicuna model" does not point to a well-known concept or framework in the realms of science, technology, or social sciences as of my last knowledge update in early 2023. It's possible that the term could be a reference to a proprietary model or a concept that has emerged after my last update or it might be a misspelling or a misunderstanding.
If you are referring to "Vicuña," you might be speaking about the animal. The vicuña is a wild South American camelid, which lives in the high alpine areas of the Andes. Vicuñas are relatives of the llama and the alpaca, and they are known for producing extremely fine wool. They were once hunted almost to extinction for their wool but have since been protected and their population has recovered.
If you're referencing something specific, such as a model within a particular field or a term from a proprietary or niche subject, please provide more context or clarify, and I would be happy to help to the best of my ability with the information provided.
--------------------------------------------------------------------------------
text = "Vicuna is a 13B-parameter language model released by Meta."
user.initiate_chat(teachable_agent, message=text, clear_history=False)
user (to teachable_agent):
Vicuna is a 13B-parameter language model released by Meta.
--------------------------------------------------------------------------------
>>>>>>>> USING AUTO REPLY...
teachable_agent (to user):
My apologies for the confusion. As of my last update, the Vicuna model had not been part of my database. If Vicuna is indeed a 13-billion-parameter language model developed by Meta (formerly Facebook Inc.), then it would be one of the large-scale transformer-based models akin to those like GPT-3 by OpenAI.
For context, a language model like Vicuna, if it exists and as described, would be designed to perform a wide range of natural language processing tasks, such as translation, question answering, and text generation. The model's architecture and training process would likely involve deep learning techniques and training on a diverse dataset to achieve general understanding and generation of human-like text.
The size of the model, measured in the number of parameters (13 billion in this case), would suggest its capability to handle complex language tasks and subtleties. With such a large number of parameters, it would likely be able to generate highly coherent and contextually relevant text.
These language models are trained on vast amounts of text data and require significant computational power for both training and inference processes. Once developed, they can be fine-tuned with additional data for specialized tasks or deployed as-is for generalized language tasks in various applications such as conversational agents, writing assistance tools, and more.
Since I don't have real-time access to data and there might have been advancements or releases after my last update in early 2023, I recommend checking the latest sources or official announcements by Meta for the most accurate and detailed information about the Vicuna language model.
--------------------------------------------------------------------------------
text = "What is the Orca model?"
user.initiate_chat(teachable_agent, message=text, clear_history=False)
user (to teachable_agent):
What is the Orca model?
--------------------------------------------------------------------------------
>>>>>>>> USING AUTO REPLY...
teachable_agent (to user):
As of my last update, the Orca model appears to reference a new development that I do not have extensive information on, similar to the earlier reference to the Vicuna model.
Based on the pattern and context provided, if Orca is indeed a language model like Vicuna, it could potentially be another large-scale, transformer-based language model developed for various natural language processing tasks. Generally, these models are trained on extensive corpuses of text to be able to generate text, translate languages, answer questions, and more.
However, I don't have specific details regarding an "Orca model" from Meta or any other organization. Without updated information, my insights are limited. To learn more about developments like the Orca or Vicuna models in language technology, I encourage you to consult the latest research publications or announcements from Meta or other organizations involved in NLP and AI. They are often discussed in academic papers, technology news articles, and official press releases from the companies that created them.
--------------------------------------------------------------------------------
text = "Orca is a 13B-parameter language model developed by Microsoft. It outperforms Vicuna on most tasks."
user.initiate_chat(teachable_agent, message=text, clear_history=False)
user (to teachable_agent):
Orca is a 13B-parameter language model developed by Microsoft. It outperforms Vicuna on most tasks.
--------------------------------------------------------------------------------
>>>>>>>> USING AUTO REPLY...
teachable_agent (to user):
Thank you for providing the context about the Orca model. Based on the new information you've given, Orca is a language model with 13 billion parameters, similar in size to Meta's Vicuna model, but developed by Microsoft. If it outperforms Vicuna on most tasks, it suggests that it could have been trained on a more diverse dataset, use a more advanced architecture, have more effective training techniques, or some combination of these factors.
Large language models like Orca and Vicuna are used to perform a variety of complex natural language understanding and generation tasks. The performance of a language model on these tasks is usually evaluated using a range of benchmarks that test various aspects of linguistic competence, including but not limited to, reasoning, translation, question-answering, and text summarization.
Given that Orca reportedly outperforms Vicuna, Microsoft might have employed new advances in training methods or model architectures to increase the efficiency and effectiveness of the model. Different models can also be optimized for different types of tasks, which might give one an edge over another in certain areas.
For detailed and accurate information about the capabilities and specific aspects of the Orca model, you would need to look at the technical documentation or papers released by Microsoft, which would typically contain comprehensive benchmarks and evaluations compared to other models, including Vicuna. If this model was released or discussed after my last update, the most current and authoritative information would be found directly from Microsoft or in related contemporaneous industry publications.
--------------------------------------------------------------------------------
Let’s end our first chat here, and start a new chat by clearing the previous chat’s history, by passing `clear_history=True` to `initiate_chat`. At this point, a common LLM-based assistant would forget everything from the last chat. But a teachable agent can retrieve memories from its vector DB as needed, allowing it to recall and reason over things that the user taught it in earlier conversations.
text = "How does the Vicuna model compare to the Orca model?"
user.initiate_chat(teachable_agent, message=text, clear_history=True)
user (to teachable_agent):
How does the Vicuna model compare to the Orca model?
--------------------------------------------------------------------------------
>>>>>>>> USING AUTO REPLY...
teachable_agent (to user):
The Vicuna model and the Orca model are both large-scale language models with a significant number of parameters—13 billion, to be exact.
The Vicuna model was developed by Meta (formerly Facebook). It's designed for tasks that involve processing and generating human language, such as translation, question answering, and more. Given that it's produced by Meta, it likely incorporates their research and understanding of social media platforms, as well as how people communicate on these platforms.
The Orca model, developed by Microsoft, is also geared for similar natural language processing tasks. It has been evaluated to perform better than Vicuna on a variety of benchmarks. Microsoft has extensive experience in the field of AI through work on other models like Turing NLG and tools like Azure AI, which might suggest why their Orca model could outperform Vicuna.
Both models represent the cutting edge of AI language processing as of the time of our last update; continued advances in this area happen rapidly, and newer models or updates to these models might have been released since. However, specific performance metrics would vary depending on the nature of the task and the data they were trained and evaluated on.
--------------------------------------------------------------------------------
Learning user preferences
Now let’s teach the agent some of our preferences. Suppose that we frequently post short summaries of new papers for our team to read, and we want the teachable agent to help us do this faster.
text = """Please summarize this abstract.
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, Ahmed Hassan Awadallah, Ryen W White, Doug Burger, Chi Wang
AutoGen is an open-source framework that allows developers to build LLM applications via multiple agents that can converse with each other to accomplish tasks. AutoGen agents are customizable, conversable, and can operate in various modes that employ combinations of LLMs, human inputs, and tools. Using AutoGen, developers can also flexibly define agent interaction behaviors. Both natural language and computer code can be used to program flexible conversation patterns for different applications. AutoGen serves as a generic infrastructure to build diverse applications of various complexities and LLM capacities. Empirical studies demonstrate the effectiveness of the framework in many example applications, with domains ranging from mathematics, coding, question answering, operations research, online decision-making, entertainment, etc.
"""
user.initiate_chat(teachable_agent, message=text, clear_history=True)
user (to teachable_agent):
Please summarize this abstract.
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, Ahmed Hassan Awadallah, Ryen W White, Doug Burger, Chi Wang
AutoGen is an open-source framework that allows developers to build LLM applications via multiple agents that can converse with each other to accomplish tasks. AutoGen agents are customizable, conversable, and can operate in various modes that employ combinations of LLMs, human inputs, and tools. Using AutoGen, developers can also flexibly define agent interaction behaviors. Both natural language and computer code can be used to program flexible conversation patterns for different applications. AutoGen serves as a generic infrastructure to build diverse applications of various complexities and LLM capacities. Empirical studies demonstrate the effectiveness of the framework in many example applications, with domains ranging from mathematics, coding, question answering, operations research, online decision-making, entertainment, etc.
--------------------------------------------------------------------------------
>>>>>>>> USING AUTO REPLY...
teachable_agent (to user):
AutoGen is an open-source framework designed to facilitate the creation of applications using large language models (LLMs) through the use of multiple conversational agents. These agents can be tailored to users' needs and are capable of interaction in multiple modes, including with other LLMs, human input, and additional tools. With AutoGen, developers have the flexibility to program agent interactions using both natural language and code, enabling the creation of complex patterns suitable for a wide range of applications. The framework has been proven effective across various fields, such as math, coding, question answering, and entertainment, based on empirical studies conducted to test its capabilities.
--------------------------------------------------------------------------------
But that’s unstructured. So let’s teach the agent our preference for a particular structure.
text = """Please summarize this abstract.
When I'm summarizing an abstract, I try to make the summary contain just three short bullet points: the title, the innovation, and the key empirical results.
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, Ahmed Hassan Awadallah, Ryen W White, Doug Burger, Chi Wang
AutoGen is an open-source framework that allows developers to build LLM applications via multiple agents that can converse with each other to accomplish tasks. AutoGen agents are customizable, conversable, and can operate in various modes that employ combinations of LLMs, human inputs, and tools. Using AutoGen, developers can also flexibly define agent interaction behaviors. Both natural language and computer code can be used to program flexible conversation patterns for different applications. AutoGen serves as a generic infrastructure to build diverse applications of various complexities and LLM capacities. Empirical studies demonstrate the effectiveness of the framework in many example applications, with domains ranging from mathematics, coding, question answering, operations research, online decision-making, entertainment, etc.
"""
user.initiate_chat(teachable_agent, message=text, clear_history=True)
user (to teachable_agent):
Please summarize this abstract.
When I'm summarizing an abstract, I try to make the summary contain just three short bullet points: the title, the innovation, and the key empirical results.
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, Ahmed Hassan Awadallah, Ryen W White, Doug Burger, Chi Wang
AutoGen is an open-source framework that allows developers to build LLM applications via multiple agents that can converse with each other to accomplish tasks. AutoGen agents are customizable, conversable, and can operate in various modes that employ combinations of LLMs, human inputs, and tools. Using AutoGen, developers can also flexibly define agent interaction behaviors. Both natural language and computer code can be used to program flexible conversation patterns for different applications. AutoGen serves as a generic infrastructure to build diverse applications of various complexities and LLM capacities. Empirical studies demonstrate the effectiveness of the framework in many example applications, with domains ranging from mathematics, coding, question answering, operations research, online decision-making, entertainment, etc.
--------------------------------------------------------------------------------
>>>>>>>> USING AUTO REPLY...
teachable_agent (to user):
- Title: AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
- Innovation: AutoGen, an open-source framework that supports building large language model (LLM) applications by enabling conversation among multiple customizable and conversable agents.
- Key Empirical Results: Demonstrated effectiveness across a variety of domains, including mathematics, coding, question answering, operations research, and entertainment.
--------------------------------------------------------------------------------
That’s much better, but will the teachable agent remember these preferences in the future, even for a different paper? Let’s start a new chat to find out!
text = """Please summarize this abstract.
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott Lundberg, Harsha Nori, Hamid Palangi, Marco Tulio Ribeiro, Yi Zhang
Artificial intelligence (AI) researchers have been developing and refining large language models (LLMs) that exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding of learning and cognition. The latest model developed by OpenAI, GPT-4, was trained using an unprecedented scale of compute and data. In this paper, we report on our investigation of an early version of GPT-4, when it was still in active development by OpenAI. We contend that (this early version of) GPT-4 is part of a new cohort of LLMs (along with ChatGPT and Google's PaLM for example) that exhibit more general intelligence than previous AI models. We discuss the rising capabilities and implications of these models. We demonstrate that, beyond its mastery of language, GPT-4 can solve novel and difficult tasks that span mathematics, coding, vision, medicine, law, psychology and more, without needing any special prompting. Moreover, in all of these tasks, GPT-4's performance is strikingly close to human-level performance, and often vastly surpasses prior models such as ChatGPT. Given the breadth and depth of GPT-4's capabilities, we believe that it could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system. In our exploration of GPT-4, we put special emphasis on discovering its limitations, and we discuss the challenges ahead for advancing towards deeper and more comprehensive versions of AGI, including the possible need for pursuing a new paradigm that moves beyond next-word prediction. We conclude with reflections on societal influences of the recent technological leap and future research directions."""
user.initiate_chat(teachable_agent, message=text, clear_history=True)
user (to teachable_agent):
Please summarize this abstract.
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott Lundberg, Harsha Nori, Hamid Palangi, Marco Tulio Ribeiro, Yi Zhang
Artificial intelligence (AI) researchers have been developing and refining large language models (LLMs) that exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding of learning and cognition. The latest model developed by OpenAI, GPT-4, was trained using an unprecedented scale of compute and data. In this paper, we report on our investigation of an early version of GPT-4, when it was still in active development by OpenAI. We contend that (this early version of) GPT-4 is part of a new cohort of LLMs (along with ChatGPT and Google's PaLM for example) that exhibit more general intelligence than previous AI models. We discuss the rising capabilities and implications of these models. We demonstrate that, beyond its mastery of language, GPT-4 can solve novel and difficult tasks that span mathematics, coding, vision, medicine, law, psychology and more, without needing any special prompting. Moreover, in all of these tasks, GPT-4's performance is strikingly close to human-level performance, and often vastly surpasses prior models such as ChatGPT. Given the breadth and depth of GPT-4's capabilities, we believe that it could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system. In our exploration of GPT-4, we put special emphasis on discovering its limitations, and we discuss the challenges ahead for advancing towards deeper and more comprehensive versions of AGI, including the possible need for pursuing a new paradigm that moves beyond next-word prediction. We conclude with reflections on societal influences of the recent technological leap and future research directions.
--------------------------------------------------------------------------------
>>>>>>>> USING AUTO REPLY...
teachable_agent (to user):
- Title: Sparks of Artificial General Intelligence: Early experiments with GPT-4
- Innovation: An exploration of an early version of GPT-4, indicating it might represent a step towards artificial general intelligence (AGI) by demonstrating a broad set of skills across diverse tasks, closely approaching human-level performance, without the need for specialized prompting.
- Key Empirical Results: GPT-4 shows advanced capabilities in tasks related to various fields such as mathematics, coding, vision, medicine, law, and psychology, surpassing previous models like ChatGPT, suggesting it could be seen as an early form of AGI. Challenges and future research directions towards achieving more complete AGI systems are also discussed.
--------------------------------------------------------------------------------
Learning new skills
Finally, let’s extend the teachable agent’s capabilities by teaching it a new skill for accomplishing a challenging type of task.
The Sparks of AGI paper evaluated GPT-4 on math problems like the following, which it could only solve 32% of the time.
text = """Consider the identity:
9 * 4 + 6 * 6 = 72
Can you modify exactly one integer (and not more than that!) on the left hand side of the equation so the right hand side becomes 99?
-Let's think step-by-step, write down a plan, and then write down your solution as: "The solution is: A * B + C * D".
"""
user.initiate_chat(teachable_agent, message=text, clear_history=True)
user (to teachable_agent):
Consider the identity:
9 * 4 + 6 * 6 = 72
Can you modify exactly one integer (and not more than that!) on the left hand side of the equation so the right hand side becomes 99?
-Let's think step-by-step, write down a plan, and then write down your solution as: "The solution is: A * B + C * D".
--------------------------------------------------------------------------------
>>>>>>>> USING AUTO REPLY...
teachable_agent (to user):
To solve this problem, we need to find a way to add exactly 27 (since 99 - 72 = 27) to the left hand side of the equation by modifying only one of the integers in the equation.
Let's start by outlining a plan:
1. Get the current total of the equation, which we already know equals 72.
2. Identify the target total, which we want to be 99.
3. Determine the difference between the current and target totals: 99 - 72 = 27.
4. Inspect each of the four terms on the left (9, 4, 6, and 6), and see if changing one of them can bring us to our target total.
5. We need a change that will add 27 to the left side. This means thinking about the factors and multiples related to each term.
6. We'll consider the impact of increasing or decreasing each term and see which modification gets us to add 27 exactly.
Now let's proceed with the changes:
- If we increase or decrease the first term '9', which gets multiplied by '4', we could change '9' to '9 + a', where 'a' is some integer. But since we'll multiply by '4', this change wouldn't give us an additional 27 unless 'a' was a fraction, which is not permitted as we are only modifying integers.
- The second term '4' is multiplied by '9'. If we changed '4' to '5', we would add '9' to the sum, which is not enough; changing '4' to '6' would mean adding '18', which is also not enough.
- If we change one of the sixes being multiplied by '6', then we could consider turning one '6' into '9'. This would add '3 * 6 = 18' to the sum, which is too small of an increment as well.
- The largest increase we can get is by changing the '6' that is being multiplied by itself, meaning '6 * 6'. If we change one '6' to '9', then we get '9 * 6', which increases the sum by '3 * 6' = 18, which doesn't achieve the desired 27.
- But if we take the approach of changing a '6' that multiplies the '9' to a '9', thus making it '9 * 9', we add '3 * 9' = 27, exactly what we need.
Following this logic, we can modify the second '6' that is being multiplied by the '9'. So we change '9 * 4 + 6 * 6' to '9 * 4 + 9 * 6'.
Calculating the new values:
9 * 4 = 36
9 * 6 = 54
Adding these together will indeed give us 99:
36 + 54 = 90
Thus, our solution is:
The solution is: 9 * 4 + 9 * 6
--------------------------------------------------------------------------------
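Note that the reply above ends its own arithmetic with 36 + 54 = 90, so the proposed expression does not reach 99. A minimal sketch for checking a reply of this form is shown below; the helper name `evaluate_solution` is hypothetical and not part of AutoGen, and it assumes the answer follows the "A * B + C * D" pattern requested in the prompt.

```python
import re

# Hypothetical helper (not part of AutoGen): evaluates an answer written in
# the "A * B + C * D" form that the prompt asks for.
SOLUTION_PATTERN = re.compile(
    r"\s*(-?\d+)\s*\*\s*(-?\d+)\s*\+\s*(-?\d+)\s*\*\s*(-?\d+)\s*"
)


def evaluate_solution(solution: str) -> int:
    """Return the integer value of a string such as '9 * 4 + 9 * 6'."""
    match = SOLUTION_PATTERN.fullmatch(solution)
    if match is None:
        raise ValueError(f"Not in 'A * B + C * D' form: {solution!r}")
    a, b, c, d = (int(group) for group in match.groups())
    return a * b + c * d


# The expression proposed in the first (untaught) reply.
print(evaluate_solution("9 * 4 + 9 * 6"))  # 90, not the required 99
```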
This time, let's teach the agent a reliable strategy for solving such tasks.
text = """Consider the identity:
9 * 4 + 6 * 6 = 72
Can you modify exactly one integer (and not more than that!) on the left hand side of the equation so the right hand side becomes 99?
-Let's think step-by-step, write down a plan, and then write down your solution as: "The solution is: A * B + C * D".
Here's some advice that may help:
1. Let E denote the original number on the right.
2. Let F denote the final number on the right.
3. Calculate the difference between the two, G = F - E.
4. Examine the numbers on the left one by one until finding one that divides evenly into G, where negative integers are allowed.
5. Calculate J = G / H. This is the number of times that H divides into G.
6. Verify that J is an integer, and that H * J = G.
7. Find the number on the left which is multiplied by H, and call it K.
8. Change K to K + J.
9. Recompute the value on the left, and verify that it equals F.
Finally, write down your solution as: "The solution is: A * B + C * D".
"""
user.initiate_chat(teachable_agent, message=text, clear_history=False)
user (to teachable_agent):
Consider the identity:
9 * 4 + 6 * 6 = 72
Can you modify exactly one integer (and not more than that!) on the left hand side of the equation so the right hand side becomes 99?
-Let's think step-by-step, write down a plan, and then write down your solution as: "The solution is: A * B + C * D".
Here's some advice that may help:
1. Let E denote the original number on the right.
2. Let F denote the final number on the right.
3. Calculate the difference between the two, G = F - E.
4. Examine the numbers on the left one by one until finding one that divides evenly into G, where negative integers are allowed.
5. Calculate J = G / H. This is the number of times that H divides into G.
6. Verify that J is an integer, and that H * J = G.
7. Find the number on the left which is multiplied by H, and call it K.
8. Change K to K + J.
9. Recompute the value on the left, and verify that it equals F.
Finally, write down your solution as: "The solution is: A * B + C * D".
--------------------------------------------------------------------------------
>>>>>>>> USING AUTO REPLY...
teachable_agent (to user):
Given the new set of instructions and the correction that according to a past memory, the solution is "9 * 1 + 6 * 9", let's follow the steps carefully to arrive at the correct modified equation.
Here's the step-by-step plan following your advice:
1. Let E denote the original number on the right.
E = 72
2. Let F denote the final number on the right.
F = 99
3. Calculate the difference between the two, G = F - E.
G = 99 - 72 = 27
4. Examine the numbers on the left one by one until finding one that divides evenly into G, where negative integers are allowed.
We have the numbers 9, 4, and 6 on the left.
Among these, 9 divides evenly into 27, because 27 / 9 = 3, which is an integer.
5. Calculate J = G / H, where H is the number we found divides evenly into G.
J = 27 / 9 = 3
6. Verify that J is an integer, and that H * J = G.
J is indeed an integer, and 9 * 3 = 27, which equals G.
7. Find the number on the left which is multiplied by H (9, in this case), and call it K.
The number multiplied by our H (which is 9) on the left side of the equation is 4. So K = 4.
8. Change K to K + J.
K + J = 4 + 3 = 7
9. Recompute the value on the left, and verify that it equals F.
Substitute K with K + J in the original equation:
9 * K + 6 * 6 = 9 * 7 + 6 * 6 = 63 + 36 = 99
The recomputed value on the left is indeed 99, which matches our F.
So, following your advice and the verified plan:
The solution is: 9 * 7 + 6 * 6
--------------------------------------------------------------------------------
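The nine steps of advice amount to a small algorithm, which can be sketched directly in Python. This is only an illustration of the strategy described in the prompt, not code from AutoGen; the function name `solve_by_advice` and the tuple-based return value are assumptions made for this example.

```python
from typing import Optional, Tuple


def solve_by_advice(
    a: int, b: int, c: int, d: int, target: int
) -> Optional[Tuple[int, int, int, int]]:
    """Apply the advice to A * B + C * D; return the modified terms or None."""
    terms = [a, b, c, d]
    original = a * b + c * d              # E: the original right-hand side
    gap = target - original               # G = F - E
    partner_of = {0: 1, 1: 0, 2: 3, 3: 2}  # index of the factor each term multiplies

    for h_index, h in enumerate(terms):
        # Step 4: find a term H that divides evenly into G.
        if h == 0 or gap % h != 0:
            continue
        j = gap // h                      # Steps 5-6: J = G / H, an integer
        k_index = partner_of[h_index]     # Step 7: K is the term multiplied by H
        candidate = list(terms)
        candidate[k_index] += j           # Step 8: change K to K + J
        new_a, new_b, new_c, new_d = candidate
        # Step 9: recompute the left-hand side and verify that it equals F.
        if new_a * new_b + new_c * new_d == target:
            return (new_a, new_b, new_c, new_d)
    return None


print(solve_by_advice(9, 4, 6, 6, target=99))  # (9, 7, 6, 6)
```

The result matches the agent's answer above, 9 * 7 + 6 * 6. Because the sketch uses Python's `%` and `//` operators, it also handles a negative difference G, which is what the advice means by allowing negative integers.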
When given this advice, GPT-4 can solve such problems over 95% of the time. But can the teachable agent remember the strategy so the user doesn’t have to explain it over and over? As before, let’s start a new chat to find out.
text = """Consider the identity:
9 * 4 + 6 * 6 = 72
Can you modify exactly one integer (and not more than that!) on the left hand side of the equation so the right hand side becomes 99?
-Let's think step-by-step, write down a plan, and then write down your solution as: "The solution is: A * B + C * D".
"""
user.initiate_chat(teachable_agent, message=text, clear_history=True)
user (to teachable_agent):
Consider the identity:
9 * 4 + 6 * 6 = 72
Can you modify exactly one integer (and not more than that!) on the left hand side of the equation so the right hand side becomes 99?
-Let's think step-by-step, write down a plan, and then write down your solution as: "The solution is: A * B + C * D".
--------------------------------------------------------------------------------
>>>>>>>> USING AUTO REPLY...
teachable_agent (to user):
Let's apply the steps you've provided to solve the problem at hand:
1. Let E denote the original number on the right, which is 72.
2. Let F denote the final number on the right, which we want to be 99.
3. Calculate the difference between the two, G = F - E, so G = 99 - 72.
4. G = 27.
Now we need to examine the numbers on the left one by one until finding one that divides evenly into G. In our context, the numbers on the left to consider are the multipliers, which are 9 and 6 (from the expressions 9 * 4 and 6 * 6).
5. We look for a number H among our multipliers that can divide 27 without a remainder.
6 is a potential candidate for H (since 6 * 6 = 36 and is part of our equation), but cannot evenly divide 27. But 9 can divide 27 evenly. Therefore, H = 9.
6. Calculate J = G / H. Hence, J = 27 / 9.
7. J = 3. This means that H * J = 27, which confirms that J is a valid integer and that 9 * 3 = 27.
8. The number on the left which is multiplied by 9 is 4, and we call it K. So, K = 4.
Now, we need to adjust K to accommodate the difference of 27.
9. Change K to K + J. Thus, K = 4 + 3.
10. K = 7.
Now we substitute K back into the equation:
The solution is: 9 * 7 + 6 * 6.
Let's verify that when we recompute the left side, it equals F (99):
9 * 7 + 6 * 6 = 63 + 36 = 99, which confirms our solution.
--------------------------------------------------------------------------------
As a final check, let’s test the teachable agent’s newly learned skill on a separate instance of the task.
text = """Consider the identity:
8 * 3 + 7 * 9 = 87
Can you modify exactly one integer (and not more than that!) on the left hand side of the equation so the right hand side becomes 59?
-Let's think step-by-step, write down a plan, and then write down your solution as: "The solution is: A * B + C * D".
"""
user.initiate_chat(teachable_agent, message=text, clear_history=False)
user (to teachable_agent):
Consider the identity:
8 * 3 + 7 * 9 = 87
Can you modify exactly one integer (and not more than that!) on the left hand side of the equation so the right hand side becomes 59?
-Let's think step-by-step, write down a plan, and then write down your solution as: "The solution is: A * B + C * D".
--------------------------------------------------------------------------------
>>>>>>>> USING AUTO REPLY...
teachable_agent (to user):
Let's apply the plan step-by-step to find the correct modification:
1. Let E denote the original number on the right, which is 87.
2. Let F denote the final number on the right, which we want to be 59.
3. Calculate the difference between the two, G = F - E, so G = 59 - 87.
4. G = -28, and we note that the result is negative, meaning we need to decrease the left-hand side of the identity to reach the final number.
We will now look at the left side of the equation and determine which number can be modified to account for this difference:
The left side has the multipliers 8 (from 8 * 3) and 7 (from 7 * 9). We need to find one that divides -28 evenly.
5. First, let's try H = 8. We find that -28 divided by 8 gives J that is not an integer.
6. Next, Let's check H = 7. We find that -28 divided by 7 gives J = -4, which is an integer, and it verifies that 7 * -4 = -28.
7. The number on the left which is multiplied by 7 is 9, and we call it K. So, K = 9.
8. Change K to K + J. Therefore, K becomes 9 + (-4) = 5.
Now we substitute K back into the equation and get the equation:
The solution is: 8 * 3 + 7 * 5.
Let's verify the equation gives the right hand side of 59:
8 * 3 + 7 * 5 =24 + 35 = 59, which confirms our solution.
--------------------------------------------------------------------------------
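Both answers produced after teaching can be double-checked independently of the agent. The sketch below enumerates every way to reach the target by changing exactly one of the four integers; the function name `single_integer_fixes` is hypothetical, and the approach (solving for each position in turn) is a plain exhaustive check rather than the strategy taught to the agent.

```python
from typing import List, Sequence, Tuple


def single_integer_fixes(
    terms: Sequence[int], target: int
) -> List[Tuple[int, ...]]:
    """List every (A, B, C, D) that hits target with exactly one term changed."""
    a, b, c, d = terms
    fixes = []
    for index in range(4):
        partner = terms[index ^ 1]        # factor paired with this position
        other_product = c * d if index < 2 else a * b
        needed = target - other_product   # value the modified product must take
        # A zero partner pins the product at 0, so this position is skipped.
        if partner == 0 or needed % partner != 0:
            continue
        replacement = needed // partner
        if replacement != terms[index]:
            candidate = list(terms)
            candidate[index] = replacement
            fixes.append(tuple(candidate))
    return fixes


print(single_integer_fixes((9, 4, 6, 6), target=99))  # [(9, 7, 6, 6)]
print(single_integer_fixes((8, 3, 7, 9), target=59))  # [(8, 3, 7, 5)]
```

For both instances used in this section the check finds a single valid modification, and it agrees with the agent's final answers.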