Skip to content

Blog#

Introducing RealtimeAgent Capabilities in AG2

TL;DR: - RealtimeAgent is coming in the AG2 0.6 release, enabling real-time conversational AI. - Features include real-time voice interactions, seamless task delegation to Swarm teams, and Twilio-based telephony integration. - Learn how to integrate Twilio and RealtimeAgent into your swarm in this blogpost.

Realtime API Support: What's New?

We're thrilled to announce the release of RealtimeAgent, extending AG2's capabilities to support real-time conversational AI tasks. This new experimental feature makes it possible for developers to build agents capable of handling voice-based interactions with minimal latency, integrating OpenAI’s Realtime API, Twilio for telephony, and AG2’s Swarm orchestration.

ReasoningAgent Update - Beam Search, MCTS, and LATS for LLM Reasoning

Key Updates in this Release:

  1. Configuration Changes
  2. All reasoning parameters are now configured through a single reason_config dictionary
  3. Breaking Change: Parameters like max_depth, beam_size, and answer_approach have moved from constructor arguments into reason_config

  4. New Search Strategies

  5. Added Monte Carlo Tree Search (MCTS) as an alternative to Beam Search
  6. Introduced Language Agent Tree Search (LATS) - an enhancement to MCTS that incorporates reflection prior to the next round of simulation.

  7. Enhanced Features

  8. New forest_size parameter enables maintaining multiple independent reasoning trees
  9. Support for ground truth answers in prompts to generate training data for LLM fine-tuning

Tree of Thoughts

Introduction

In our previous post, we introduced the ReasoningAgent, which utilized Beam Search for systematic reasoning. Today, we include MCTS (Monte Carlo Tree Search) and Language Agent Tree Search (LATS) as alternative search strategies, which present advantages in different scenarios.

Our previous ReasoningAgent draws inspiration from OpenAI's 2023 paper, Let's Verify Step by Step, as well as the 2024 O1 feature. The landscape of contemporary research is rich, with notable works such as DeepSeek-R1, Macro-O1, and OpenR.

Knowledgeable Agents with FalkorDB Graph RAG

FalkorDB Web

TL;DR: * We introduce a new ability for AG2 agents, Graph RAG with FalkorDB, providing the power of knowledge graphs * Structured outputs, using OpenAI models, provide strict adherence to data models to improve reliability and agentic flows * Nested chats are now available with a Swarm

FalkorDB Graph RAG

Typically, RAG uses vector databases, which store information as embeddings, mathematical representations of data points. When a query is received, it's also converted into an embedding, and the vector database retrieves the most similar embeddings based on distance metrics.

Graph-based RAG, on the other hand, leverages graph databases, which represent knowledge as a network of interconnected entities and relationships. When a query is received, Graph RAG traverses the graph to find relevant information based on the query's structure and semantics.

ReasoningAgent - Tree of Thoughts with Beam Search in AG2

TL;DR: * We introduce ReasoningAgent, an AG2 agent that implements tree-of-thought reasoning with beam search to solve complex problems. * ReasoningAgent explores multiple reasoning paths in parallel and uses a grader agent to evaluate and select the most promising paths. * The exploration trajectory and thought tree can be saved locally for further analysis. These logs can even be saved as SFT dataset and preference dataset for DPO and PPO training.

Tree of Thoughts

Introduction

Large language models (LLMs) have shown impressive capabilities in various tasks, but they can still struggle with complex reasoning problems that require exploring multiple solution paths. To address this limitation, we introduce ReasoningAgent, an AG2 agent that implements tree-of-thought reasoning with beam search.

The key idea behind ReasoningAgent is to: 1. Generate multiple possible reasoning steps at each point 2. Evaluate these steps using a grader agent 3. Keep track of the most promising paths using beam search 4. Continue exploring those paths while pruning less promising ones

This approach allows the agent to systematically explore different reasoning strategies while managing computational resources efficiently.

Agentic testing for prompt leakage security

Prompt leakage social img

Introduction

As Large Language Models (LLMs) become increasingly integrated into production applications, ensuring their security has never been more crucial. One of the most pressing security concerns for these models is prompt injection, specifically prompt leakage.

LLMs often rely on system prompts (also known as system messages), which are internal instructions or guidelines that help shape their behavior and responses. These prompts can sometimes contain sensitive information, such as confidential details or internal logic, that should never be exposed to external users. However, with careful probing and targeted attacks, there is a risk that this sensitive information can be unintentionally revealed.

To address this issue, we have developed the Prompt Leakage Probing Framework, a tool designed to probe LLM agents for potential prompt leakage vulnerabilities. This framework serves as a proof of concept (PoC) for creating and testing various scenarios to evaluate how easily system prompts can be exposed. By automating the detection of such vulnerabilities, we aim to provide a powerful tool for testing the security of LLMs in real-world applications.

AgentOps, the Best Tool for AutoGen Agent Observability

AgentOps and AutoGen

TL;DR

  • AutoGen® offers detailed multi-agent observability with AgentOps.
  • AgentOps offers the best experience for developers building with AutoGen in just two lines of code.
  • Enterprises can now trust AutoGen in production with detailed monitoring and logging from AgentOps.

AutoGen is excited to announce an integration with AgentOps, the industry leader in agent observability and compliance. Back in February, Bloomberg declared 2024 the year of AI Agents. And it's true! We've seen AI transform from simplistic chatbots to autonomously making decisions and completing tasks on a user's behalf.

However, as with most new technologies, companies and engineering teams can be slow to develop processes and best practices. One part of the agent workflow we're betting on is the importance of observability. Letting your agents run wild might work for a hobby project, but if you're building enterprise-grade agents for production, it's crucial to understand where your agents are succeeding and failing. Observability isn't just an option; it's a requirement.

As agents evolve into even more powerful and complex tools, you should view them increasingly as tools designed to augment your team's capabilities. Agents will take on more prominent roles and responsibilities, take action, and provide immense value. However, this means you must monitor your agents the same way a good manager maintains visibility over their personnel. AgentOps offers developers observability for debugging and detecting failures. It provides the tools to monitor all the key metrics your agents use in one easy-to-read dashboard. Monitoring is more than just a “nice to have”; it's a critical component for any team looking to build and scale AI agents.

Enhanced Support for Non-OpenAI Models

agents

TL;DR

  • AutoGen has expanded integrations with a variety of cloud-based model providers beyond OpenAI.
  • Leverage models and platforms from Gemini, Anthropic, Mistral AI, Together.AI, and Groq for your AutoGen agents.
  • Utilise models specifically for chat, language, image, and coding.
  • LLM provider diversification can provide cost and resilience benefits.

In addition to the recently released AutoGen Google Gemini client, new client classes for Mistral AI, Anthropic, Together.AI, and Groq enable you to utilize over 75 different large language models in your AutoGen agent workflow.

These new client classes tailor AutoGen's underlying messages to each provider's unique requirements and remove that complexity from the developer, who can then focus on building their AutoGen workflow.

Using them is as simple as installing the client-specific library and updating your LLM config with the relevant api_type and model. We'll demonstrate how to use them below.

The community is continuing to enhance and build new client classes as cloud-based inference providers arrive. So, watch this space, and feel free to discuss or develop another one.