Avatar Realtime opens a live streaming session where an avatar speaks in real time — useful for live agents, kiosks, and voice assistants with a face. You create a session, poll for the playback URL, and play it.
Avatar Realtime streams at 720p only.
Create a session
POST /v3/avatar-realtime — choose how to drive speech with the type field:
tts — speak a script (avatar_id, voice_id, text)
audio — lip-sync to your own audio (avatar_id, audio)
text_stream — stream text live, e.g. from an LLM (avatar_id, voice_id, text)
curl -X POST "https://api.heygen.com/v3/avatar-realtime" \
-H "X-Api-Key: $HEYGEN_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"type": "tts",
"avatar_id": "Daisy-inskirt-20220818",
"voice_id": "1bd001e7e50f421d891986aad5158bc8",
"text": "Hi there — welcome to HeyGen Avatar Realtime."
}'
{ "data": { "stream_id": "a1b2c3d4-..." } }
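The same request can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the helper names `build_session_payload` and `create_session` are hypothetical, and treating every field listed for a type as required is an assumption made for local validation.

```python
import json
import urllib.request

API_BASE = "https://api.heygen.com/v3/avatar-realtime"

# Fields listed for each session type above. Treating all of them as
# required is an assumption of this sketch, not a documented rule.
REQUIRED_FIELDS = {
    "tts": ("avatar_id", "voice_id", "text"),
    "audio": ("avatar_id", "audio"),
    "text_stream": ("avatar_id", "voice_id", "text"),
}


def build_session_payload(session_type, **fields):
    """Return the JSON body for a create-session request, checked locally."""
    if session_type not in REQUIRED_FIELDS:
        raise ValueError(f"unknown session type: {session_type!r}")
    missing = [name for name in REQUIRED_FIELDS[session_type] if name not in fields]
    if missing:
        raise ValueError(f"{session_type} is missing: {', '.join(missing)}")
    return {"type": session_type, **fields}


def create_session(api_key, payload):
    """POST the payload and return the stream_id from the response."""
    request = urllib.request.Request(
        API_BASE,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["data"]["stream_id"]
```

A call such as `create_session(key, build_session_payload("tts", avatar_id=..., voice_id=..., text=...))` would mirror the curl request above; validating before sending just surfaces a missing field without a network round trip.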
Get the playback URL
GET /v3/avatar-realtime/{stream_id} — poll until the session is ready, then play the HLS url in any HLS player.
curl -X GET "https://api.heygen.com/v3/avatar-realtime/a1b2c3d4-..." \
-H "X-Api-Key: $HEYGEN_API_KEY"
{ "data": { "stream_id": "a1b2c3d4-...", "status": "ready", "url": "https://.../stream.m3u8" } }
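The polling step might look like the sketch below. Only the `ready` status and the `url` field come from the response shown above; the function names, the one-second interval, and the 60-second timeout are assumptions chosen for illustration, and any other status value is simply treated as "not ready yet".

```python
import json
import time
import urllib.request

API_BASE = "https://api.heygen.com/v3/avatar-realtime"


def fetch_session(api_key, stream_id):
    """GET the session and return the contents of its "data" object."""
    request = urllib.request.Request(
        f"{API_BASE}/{stream_id}", headers={"X-Api-Key": api_key}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["data"]


def wait_for_playback_url(fetch, interval=1.0, timeout=60.0,
                          sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until status is "ready", then return the HLS url.

    fetch is any zero-argument callable returning the "data" dict.
    interval and timeout are illustrative defaults, not documented values.
    """
    deadline = clock() + timeout
    while True:
        data = fetch()
        if data.get("status") == "ready":
            return data["url"]
        if clock() >= deadline:
            raise TimeoutError("session was not ready before the timeout")
        sleep(interval)
```

In use, the fetcher is bound to one session, for example `wait_for_playback_url(lambda: fetch_session(key, stream_id))`. Passing the fetcher, sleep, and clock in as arguments keeps the loop testable without network access.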
Stream more text
POST /v3/avatar-realtime/{stream_id}/text: for text_stream sessions, append text as it becomes available. The avatar keeps speaking on the open stream.
curl -X POST "https://api.heygen.com/v3/avatar-realtime/a1b2c3d4-.../text" \
-H "X-Api-Key: $HEYGEN_API_KEY" \
-H "Content-Type: application/json" \
-d '{ "text": "Here are the results I found." }'
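When the text comes from an LLM token stream, one possible approach is to group tokens into sentences before appending them, so each request carries a speakable unit. The sketch below assumes that approach; sentence buffering is a design choice of this example rather than an API requirement, and `post_text`, `sentence_chunks`, and `stream_chunks` are hypothetical helper names.

```python
import json
import urllib.request

API_BASE = "https://api.heygen.com/v3/avatar-realtime"


def post_text(api_key, stream_id, text):
    """Append one piece of text to an open text_stream session."""
    request = urllib.request.Request(
        f"{API_BASE}/{stream_id}/text",
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={"X-Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        response.read()


def sentence_chunks(tokens, terminators=".!?"):
    """Group streamed tokens into sentences ending in a terminator."""
    buffer = ""
    for token in tokens:
        buffer += token
        if buffer.rstrip().endswith(tuple(terminators)):
            yield buffer.strip()
            buffer = ""
    if buffer.strip():
        # Flush whatever is left when the token stream ends.
        yield buffer.strip()


def stream_chunks(chunks, send):
    """Send each chunk in order and return how many were sent."""
    sent = 0
    for chunk in chunks:
        send(chunk)
        sent += 1
    return sent
```

A caller would wire these together with something like `stream_chunks(sentence_chunks(llm_tokens), lambda text: post_text(key, stream_id, text))`, where `llm_tokens` is whatever iterable the LLM client yields.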
Limits
| Limit | Default | Notes |
|---|---|---|
| Idle timeout (text_stream) | 30 sec | The session closes if no new text arrives for 30 seconds. |
| Max session length | 1 hour | Sessions are capped at one hour. |
| Concurrent sessions per space | 3 | Maximum simultaneous realtime sessions. |
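A client can track the two time-based limits locally to see how close a session is to being closed. The sketch below is a rough client-side estimate only: the `SessionClock` class is hypothetical, and it assumes both timers start when the object is created, which may not match how the server measures them.

```python
import time

IDLE_TIMEOUT_SECONDS = 30    # text_stream idle timeout from the table above
MAX_SESSION_SECONDS = 3600   # one-hour session cap from the table above


class SessionClock:
    """Client-side estimate of time left before a session closes."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        # Assumption: both timers start now, on the client side.
        self.started_at = clock()
        self.last_text_at = self.started_at

    def mark_text_sent(self):
        """Record that new text was just appended to the stream."""
        self.last_text_at = self._clock()

    def idle_seconds_left(self):
        """Seconds until the idle timeout, never below zero."""
        idle_for = self._clock() - self.last_text_at
        return max(0.0, IDLE_TIMEOUT_SECONDS - idle_for)

    def session_seconds_left(self):
        """Seconds until the one-hour cap, never below zero."""
        elapsed = self._clock() - self.started_at
        return max(0.0, MAX_SESSION_SECONDS - elapsed)
```

Calling `mark_text_sent()` after each successful append keeps the idle estimate current; a caller could then warn or send more text when `idle_seconds_left()` gets small.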
Pricing
Billed per second of session duration (720p only): $0.05 / sec self-serve, 0.05 credits / sec on Enterprise. See Self-Serve Pricing and Enterprise Pricing.
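As a worked example of the per-second rate, the sketch below multiplies a whole number of seconds by 0.05. The result is in dollars on self-serve and in credits on Enterprise, since the numeric rate is the same. The `session_cost` helper is hypothetical, and how partial seconds are rounded is not stated here, so the function only accepts whole seconds.

```python
from decimal import Decimal

RATE_PER_SECOND = Decimal("0.05")  # dollars (self-serve) or credits (Enterprise)


def session_cost(duration_seconds):
    """Return the cost of a session lasting a whole number of seconds."""
    if not isinstance(duration_seconds, int) or duration_seconds < 0:
        raise ValueError("duration_seconds must be a non-negative integer")
    return RATE_PER_SECOND * duration_seconds


ten_minutes = session_cost(600)    # 600 x 0.05 = 30
full_hour = session_cost(3600)     # 3600 x 0.05 = 180, the one-hour cap
```

`Decimal` is used instead of floats so that repeated per-second sums do not pick up binary rounding error.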