VERCEL

Build realtime voice agents on AI Gateway

Bùi Đăng Minh•Monday, 29/6/2026, 10:00 (GMT+7)•4 min read

AI Gateway now supports audio. You can add realtime voice, text to speech, and speech to text using the same calls you already use for text, image, and video, all routed through AI Gateway alongside every other modality.

Audio launches with models from OpenAI and xAI. Each call gets the same provider routing, observability, spend controls, and bring-your-own-key support you already use for your other models.

These capabilities are in beta and available in AI SDK 7.

Realtime voice: Live audio in and out, for streaming, low-latency sessions. Use it for two-way voice agents and live conversation.

Text to speech: Text in, audio file out, single request. Use it for voiceovers, spoken responses, and audio versions of written content.

Speech to text: Recorded audio in, text out, single request.
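
As a rough illustration of how the three audio capabilities differ, the sketch below maps a task's input, output, and session style to realtime voice, text to speech, or speech to text. Every name in it (`Capability`, `AudioTask`, `pickCapability`) is hypothetical and exists only for this example; it is not part of the AI Gateway or AI SDK API, and it makes no network calls.

```typescript
// Hypothetical names throughout: this models the capability breakdown
// described in the article, not the AI Gateway or AI SDK API.
type Capability = "realtime-voice" | "text-to-speech" | "speech-to-text";

interface AudioTask {
  input: "audio" | "text";
  output: "audio" | "text";
  // true for a live, streaming session; false for a single request
  live: boolean;
}

function pickCapability(task: AudioTask): Capability | undefined {
  if (task.live) {
    // Live audio in and out over a streaming, low-latency session.
    return task.input === "audio" && task.output === "audio"
      ? "realtime-voice"
      : undefined;
  }
  // Single request: text in, audio file out.
  if (task.input === "text" && task.output === "audio") {
    return "text-to-speech";
  }
  // Single request: recorded audio in, text out.
  if (task.input === "audio" && task.output === "text") {
    return "speech-to-text";
  }
  // Any other combination is outside the three audio capabilities.
  return undefined;
}
```

For instance, a voiceover job would be described as `{ input: "text", output: "audio", live: false }`, which this sketch maps to text to speech, while a two-way voice agent would be `{ input: "audio", output: "audio", live: true }`.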

Source: Vercel (@vercel & @addyosmani)