This OpenAI demo shows how real-time AI agents actually work

OpenAI’s new demo shows developers how to build real-time voice agents with handoffs, tools and production-style orchestration.

Tuesday, April 28, 2026, 4 min read

Voice AI sounds simple until it has to work in real time. That is where things get messy. OpenAI has released an open-source demonstration called Realtime API Agents Demo, showing how developers can build production-style voice agents step by step.

The project combines the OpenAI Realtime API with the OpenAI Agents SDK, giving teams a clearer look at orchestration, handoffs and tool calling without digging through a full commercial product stack.

A demo that shows the moving parts

The repository is designed to make real-time agent design easier to understand. Instead of presenting one large, complex system, it breaks the process into reusable patterns. Developers can study how conversations are handled, how agents call tools and how control moves between different specialist agents.

This is useful because voice agents are different from text chatbots. They need to respond quickly, handle interruptions and continue conversations without breaking context. The demo focuses on exactly these challenges.

The two patterns developers should notice

The first major design is called the Chat Supervisor pattern. In this setup, a fast voice agent handles immediate conversation with the user. A more capable text model sits behind it and takes care of deeper reasoning, tool calls and complex decisions.
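
The routing idea can be sketched in a few lines of TypeScript. This is an illustration only, not code from the repository: the `Turn` and `Supervisor` types, the `quickReply` rules and the filler phrase are all assumptions, and the supervisor is a plain function standing in for the slower text model.

```typescript
// Hypothetical types for illustration; not the OpenAI Agents SDK API.
type Turn = { role: "user" | "assistant"; text: string };

// Stand-in for the slower, more capable text model behind the voice agent.
type Supervisor = (history: Turn[]) => string;

// Replies the fast voice agent may give on its own (assumed rules).
function quickReply(utterance: string): string | null {
  if (/^(hi|hello|hey)\b/i.test(utterance)) return "Hello! How can I help?";
  if (/\bthank(s| you)\b/i.test(utterance)) return "You're welcome.";
  return null;
}

function voiceAgentReply(
  history: Turn[],
  utterance: string,
  supervisor: Supervisor
): Turn[] {
  const next: Turn[] = [...history, { role: "user", text: utterance }];

  const direct = quickReply(utterance);
  if (direct !== null) {
    // Simple turn: answer immediately, no supervisor round trip.
    return [...next, { role: "assistant", text: direct }];
  }

  // Harder turn: say a filler phrase, then relay the supervisor's answer.
  const filler: Turn = { role: "assistant", text: "One moment while I check." };
  const answer: Turn = { role: "assistant", text: supervisor(next) };
  return [...next, filler, answer];
}
```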

This keeps the voice experience responsive while still allowing the system to handle harder tasks. The second pattern is Sequential Handoffs. This allows a user to move between specialist agents, such as authentication, returns or sales, without restarting the conversation. For customer support teams, this structure is especially relevant.
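
A rough sketch of the handoff idea, under stated assumptions: the agent names come from the examples above, but the `SpecialistAgent` and `Conversation` shapes, the allowed transfer routes and the `handoff` helper are hypothetical and do not reflect the SDK's own types. The point it shows is that only the active agent changes, while the transcript carries over.

```typescript
// Hypothetical shapes for illustration; not the SDK's own types.
type AgentName = "authentication" | "returns" | "sales";

interface SpecialistAgent {
  name: AgentName;
  instructions: string;
  handoffs: AgentName[]; // agents this one is allowed to transfer to
}

interface Conversation {
  activeAgent: AgentName;
  transcript: string[]; // shared context that survives every handoff
}

// Assumed routing table: which specialist may pass the user to which.
const AGENTS: Record<AgentName, SpecialistAgent> = {
  authentication: {
    name: "authentication",
    instructions: "Verify the caller's identity.",
    handoffs: ["returns", "sales"],
  },
  returns: {
    name: "returns",
    instructions: "Help with returns and refunds.",
    handoffs: ["sales"],
  },
  sales: {
    name: "sales",
    instructions: "Help the caller choose a product.",
    handoffs: ["returns"],
  },
};

function handoff(conversation: Conversation, target: AgentName): Conversation {
  const current = AGENTS[conversation.activeAgent];
  if (!current.handoffs.some((name) => name === target)) {
    throw new Error(`${current.name} cannot hand off to ${target}`);
  }
  // Only the active agent changes; the transcript is kept and extended.
  return {
    activeAgent: target,
    transcript: [
      ...conversation.transcript,
      `[handoff] ${current.name} -> ${target}`,
    ],
  };
}
```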

Built for practical experimentation

The demo is built as a Next.js TypeScript app. It includes environment variables, scenario selection and placeholders for tool logic, approvals and guardrails. This makes it easier for developers to copy the pattern and adapt it to their own use case.

The repository also includes simple examples, including a greeter-to-haiku flow and a Customer Service Retail scenario. These examples help teams understand the mechanics before adding their own prompts, APIs and business rules.

What the Realtime API changes

Traditional chat systems work through separate requests. A user sends a message, the model responds, and the process repeats. The Realtime API works differently. It keeps a long-lived session active, allowing the model to process audio and text incrementally.

It can stream speech back, call functions and handle interruptions during the same live interaction. This is critical for voice agents. Users expect conversations to feel natural, not delayed or overly scripted.
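
To make the contrast with request-and-response concrete, here is a minimal sketch of how a client might fold a stream of session events into state. The event names and the `SessionState` shape are invented for illustration; the real Realtime API defines its own event schema. The sketch assumes that when the user starts speaking mid-response, queued audio is dropped and the rest of that response is ignored.

```typescript
// Hypothetical event names; the real Realtime API has its own schema.
type SessionEvent =
  | { type: "audio_delta"; chunk: string }
  | { type: "user_speech_started" }
  | { type: "function_call"; name: string }
  | { type: "response_done" };

interface SessionState {
  playbackQueue: string[]; // audio chunks waiting to be played
  pendingCalls: string[]; // tool calls requested by the model
  interrupted: boolean; // true while the current response is cut off
}

const initialState: SessionState = {
  playbackQueue: [],
  pendingCalls: [],
  interrupted: false,
};

function reduceSession(state: SessionState, event: SessionEvent): SessionState {
  switch (event.type) {
    case "audio_delta":
      // After an interruption, ignore the rest of this response's audio.
      if (state.interrupted) return state;
      return { ...state, playbackQueue: [...state.playbackQueue, event.chunk] };
    case "user_speech_started":
      // Barge-in: stop talking over the user by clearing queued audio.
      return { ...state, playbackQueue: [], interrupted: true };
    case "function_call":
      return { ...state, pendingCalls: [...state.pendingCalls, event.name] };
    case "response_done":
      // The next response starts with a clean slate.
      return { ...state, interrupted: false };
  }
}
```

Every event arrives on the same open session, so audio, tool calls and interruptions are handled by one loop rather than by separate requests.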

How real-time orchestration works

The demo shows developers how to define one or more RealtimeAgent objects. Each agent can include instructions, tools, handoffs and output guardrails. A RealtimeRunner then opens the live session and manages the interaction.

Developers can send audio or text into the session, listen for SDK events and access lower-level Realtime API events when they need more control. The system also supports human approvals for sensitive tool actions, which matters for production use.
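
The approval step can be pictured as a gate in front of each sensitive tool. The sketch below is an assumption-laden illustration, not the SDK's interface: the `Tool`, `ToolResult` and `Approver` types and the `callTool` helper are hypothetical, and the approver is a plain function standing in for a human decision.

```typescript
// Hypothetical shapes for illustration; not the SDK's own interface.
type ToolArgs = Record<string, string>;

interface Tool {
  name: string;
  needsApproval: boolean; // true for sensitive actions such as refunds
  run: (args: ToolArgs) => string;
}

type ToolResult =
  | { status: "completed"; output: string }
  | { status: "rejected"; reason: string };

// Stand-in for a human reviewer: returns true to allow the call.
type Approver = (toolName: string, args: ToolArgs) => boolean;

function callTool(tool: Tool, args: ToolArgs, approve: Approver): ToolResult {
  // Sensitive tools run only after the approver says yes.
  if (tool.needsApproval && !approve(tool.name, args)) {
    return { status: "rejected", reason: `${tool.name} was not approved` };
  }
  return { status: "completed", output: tool.run(args) };
}
```

In this sketch a rejected call never reaches `run`, so the agent can tell the user the action was declined instead of silently performing it.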

A stronger foundation for customer-facing AI

Voice agents are becoming more relevant across support, education, healthcare, commerce and personal assistance. The Realtime API already supports browser-based WebRTC, server-side WebSocket and SIP for phone-based use cases. It also supports text, audio and image inputs, with text and audio outputs.

This gives developers flexibility. They can build web assistants, phone agents or internal voice tools using the same underlying approach.

What Indian product teams can take away

For Indian startups, the demo lowers the barrier to building serious voice AI. Many sectors, from fintech to edtech and customer support, depend on high-volume conversations. A structured voice agent can reduce waiting time, improve consistency and support multilingual expansion over time.

The key is to begin carefully. Teams should start with narrow workflows, add approvals for sensitive actions and test latency, accuracy and fallback behaviour before scaling.

Where this could go next

OpenAI’s demo is not a finished product. It is a pattern library for developers who want to understand how real-time agents should be assembled. That makes it valuable. As voice becomes a more natural interface for software, teams will need reliable ways to manage agents, tools, memory and handoffs. OpenAI’s repository gives builders a practical starting point for that shift.
