Realtime API

RealtimeAgent with Gemini API

Deprecated

RealtimeAgent is deprecated as of v0.12 and will be removed in v0.14. It relies on deprecated realtime API endpoints. This blog post and associated notebooks will also be removed in v0.14.

Realtime agent communication with Gemini live API

TL;DR:

Why is this important?

We previously supported a Realtime Agent powered by OpenAI. In December 2024, Google rolled out Gemini 2.0, which includes the multi-modal live APIs. These APIs enable advanced capabilities such as real-time processing of audio inputs in live conversational settings. To ensure developers can fully leverage the capabilities of the latest LLMs, we now also support a RealtimeAgent powered by Gemini.

Real-Time Voice Interactions over WebRTC

Deprecated

RealtimeAgent is deprecated as of v0.12 and will be removed in v0.14. It relies on deprecated realtime API endpoints. This blog post and associated notebooks will also be removed in v0.14.

Realtime agent communication over WebRTC

TL;DR:
- Build a real-time voice application using WebRTC and connect it with the RealtimeAgent. Demo implementation.
- Optimized for Real-Time Interactions: Experience seamless voice communication with minimal latency and enhanced reliability.

Real-Time Voice Interactions with the WebSocket Audio Adapter

Deprecated

RealtimeAgent is deprecated as of v0.12 and will be removed in v0.14. It relies on deprecated realtime API endpoints. This blog post and associated notebooks will also be removed in v0.14.

Realtime agent communication over websocket

TL;DR:
- Demo implementation: Implement a website using WebSockets and communicate using voice with the RealtimeAgent.
- Introducing WebSocketAudioAdapter: Stream audio directly from your browser using WebSockets.
- Simplified Development: Connect to real-time agents quickly and effortlessly with minimal setup.

Realtime over WebSockets

In our previous blog post, we introduced a way to interact with the RealtimeAgent using TwilioAudioAdapter. While effective, this approach required a setup-intensive process involving Twilio integration, account configuration, number forwarding, and other complexities. Today, we're excited to introduce the WebSocketAudioAdapter, a streamlined approach to real-time audio streaming directly via a web browser.

This post explores the features, benefits, and implementation of the WebSocketAudioAdapter, showing how it transforms the way we connect with real-time agents.

Introducing RealtimeAgent Capabilities in AG2

Deprecated

RealtimeAgent is deprecated as of v0.12 and will be removed in v0.14. It relies on deprecated realtime API endpoints. This blog post and associated notebooks will also be removed in v0.14.

TL;DR:
- RealtimeAgent is coming in the AG2 0.6 release, enabling real-time conversational AI.
- Features include real-time voice interactions, seamless task delegation to Swarm teams, and Twilio-based telephony integration.
- Learn how to integrate Twilio and RealtimeAgent into your swarm in this blog post.

Realtime API Support: What's New?

We're thrilled to announce the release of RealtimeAgent, extending AG2's capabilities to support real-time conversational AI tasks. This new experimental feature makes it possible for developers to build agents capable of handling voice-based interactions with minimal latency, integrating OpenAI’s Realtime API, Twilio for telephony, and AG2’s Swarm orchestration.