The HeyGen API gives you the pieces of a produced video — a talking avatar, a background-music catalog, and a sound-effects catalog. Hyperframes is the compositor that stitches them together: an HTML composition can lay an avatar clip into a designed scene, add a lower third and motion graphics, score it with music, and punch in sound effects — then render the whole thing to a single MP4. This guide wires all four APIs into one pipeline:
1. Generate the avatar video

Create the talking-head clip with POST /v3/videos. Pick an engine: Avatar IV (default), Avatar V (highest fidelity), or Avatar III. The request below selects Avatar V.
curl -X POST "https://api.heygen.com/v3/videos" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "avatar",
    "avatar_id": "YOUR_LOOK_ID",
    "script": "Here are our Q2 results.",
    "voice_id": "YOUR_VOICE_ID",
    "resolution": "1080p",
    "engine": { "type": "avatar_v" }
  }'
Poll GET /v3/videos/{video_id} until status is completed, then read video_url. See the Digital Twin guide for the full request-and-poll flow.
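The poll step above can be sketched in Python as a small helper with a time limit. This is a minimal sketch, not part of any HeyGen SDK: `wait_for_status`, its `fetch` callback, and the interval and timeout defaults are hypothetical, and the `failed` status and `failure_message` field are borrowed from the end-to-end example later in this guide.

```python
import time
from typing import Callable


def wait_for_status(fetch: Callable[[], dict], done: str = "completed",
                    failed: str = "failed", interval: float = 10.0,
                    timeout: float = 900.0,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch() until its "status" equals `done`.

    Raises RuntimeError if the job reports `failed` and TimeoutError if
    `timeout` seconds pass first. The defaults are illustrative guesses.
    """
    deadline = time.monotonic() + timeout
    while True:
        data = fetch()  # one status lookup, e.g. a GET on the job URL
        status = data.get("status")
        if status == done:
            return data
        if status == failed:
            raise RuntimeError(data.get("failure_message", "job failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout} seconds")
        sleep(interval)
```

With requests, `fetch` could be `lambda: requests.get(url, headers=HEADERS, timeout=30).json()["data"]`, assuming the response shape used in the end-to-end example. Passing the lookup in as a callback keeps the loop testable without network access.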
2. Find background music

Search the music catalog with a plain-language description and take the top track’s audio_url (see Background music).
curl -X GET "https://api.heygen.com/v3/audio/sounds?query=upbeat%20corporate%20background&limit=1" \
  -H "x-api-key: YOUR_API_KEY"
3. Find sound effects

Use the same endpoint with type=sound_effects to grab a whoosh for your title reveal or a chime for a stat pop (see Sound effects).
curl -X GET "https://api.heygen.com/v3/audio/sounds?query=whoosh%20for%20a%20scene%20change&type=sound_effects&limit=1" \
  -H "x-api-key: YOUR_API_KEY"
Each audio_url is a short-lived pre-signed link. Feed it straight into the render step below rather than caching it.
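The two searches differ only in the type parameter, so one helper can build both request URLs. The sketch below reproduces the URLs from the curl commands above using only the standard library. `sound_search_url` is a hypothetical name, and leaving out type for a music search simply mirrors the step 2 request rather than any documented default.

```python
from typing import Optional
from urllib.parse import quote, urlencode

BASE = "https://api.heygen.com"


def sound_search_url(query: str, sound_type: Optional[str] = None,
                     limit: int = 1) -> str:
    """Build a /v3/audio/sounds search URL like the curl examples above."""
    params = {"query": query}
    if sound_type is not None:
        params["type"] = sound_type  # e.g. "sound_effects", as in step 3
    params["limit"] = limit
    # quote_via=quote encodes spaces as %20; the default (quote_plus) emits "+".
    return f"{BASE}/v3/audio/sounds?{urlencode(params, quote_via=quote)}"


music_search = sound_search_url("upbeat corporate background")
sfx_search = sound_search_url("whoosh for a scene change", "sound_effects")
```

Building the query string from a dict avoids hand-encoding the description, which matters once the search text contains characters such as `&` or `?`.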
4. Compose the scene in Hyperframes

Build a composition that arranges the avatar clip and audio inside a designed frame. Expose each asset URL as a composition variable so you can inject the URLs from steps 1–3 at render time — the same bundle then works for any avatar clip and any track.
index.html
<body
  data-composition-variables='{
    "avatarUrl": "",
    "musicUrl": "",
    "sfxUrl": ""
  }'
>
  <!-- The avatar clip, placed into your branded scene -->
  <video data-hf-src="avatarUrl" class="avatar"></video>

  <!-- Lower third, titles, charts — your motion graphics go here -->
  <div class="lower-third">Q2 Revenue · $1.2M</div>

  <!-- Score + one-shot sound effect, layered over the scene -->
  <audio data-hf-src="musicUrl" data-hf-volume="0.3"></audio>
  <audio data-hf-src="sfxUrl" data-hf-start="0.5"></audio>
</body>
The exact media, timing, and audio APIs live in the Hyperframes developer docs — the snippet above is illustrative. For design patterns, see the Hyperframes cookbook.
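Because the composition declares its variables up front, a local pre-flight check can catch a missing URL before a render is submitted. The sketch below reads the data-composition-variables attribute from the illustrative index.html above and lists any declared name without a value. `declared_variables` and `missing_variables` are hypothetical helpers, and whether Hyperframes performs a similar check itself is not something this guide states.

```python
import json
from html.parser import HTMLParser


class _VariableReader(HTMLParser):
    """Collects the JSON held in the first data-composition-variables attribute."""

    def __init__(self):
        super().__init__()
        self.declared = None

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "data-composition-variables" and self.declared is None:
                self.declared = json.loads(value)


def declared_variables(html: str) -> dict:
    """Return the variables a composition declares, or {} if it declares none."""
    reader = _VariableReader()
    reader.feed(html)
    return reader.declared or {}


def missing_variables(html: str, supplied: dict) -> list:
    """Declared names that have no non-empty value in `supplied`, sorted."""
    return sorted(k for k in declared_variables(html) if not supplied.get(k))
```

Calling `missing_variables(index_html, {"avatarUrl": avatar_url})` on the snippet above would report `musicUrl` and `sfxUrl` as still unset.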
5. Render the finished video

Package the composition as a .zip, then submit it to POST /v3/hyperframes/renders, passing the URLs from the earlier steps as variables. HeyGen renders the composed scene — avatar, graphics, music, and SFX baked into one file.
curl -X POST "https://api.heygen.com/v3/hyperframes/renders" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "project": { "type": "asset_id", "asset_id": "YOUR_PROJECT_ZIP_ASSET_ID" },
    "format": "mp4",
    "resolution": "1080p",
    "aspect_ratio": "16:9",
    "variables": {
      "avatarUrl": "https://.../avatar-video.mp4",
      "musicUrl": "https://.../background-music.wav",
      "sfxUrl": "https://.../whoosh.wav"
    },
    "title": "Q2 results — composed"
  }'
Poll GET /v3/hyperframes/renders/{render_id} until status is completed and download the final video_url. Full render options are in Hyperframes Cloud Rendering.
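The packaging step can be scripted with Python's standard zipfile module. This is a minimal sketch under assumptions: `zip_composition` is a hypothetical helper, and storing index.html at the archive root is a guess about the expected layout, so check the Hyperframes docs for the real requirements. Uploading the archive to obtain an asset_id is not covered here.

```python
import zipfile
from pathlib import Path


def zip_composition(src_dir: str, zip_path: str) -> list:
    """Zip every file under src_dir, storing paths relative to src_dir.

    Returns the archive member names. Write zip_path outside src_dir so
    the archive does not try to include itself.
    """
    root = Path(src_dir)
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Forward-slash relative path, e.g. "index.html" at the root.
                arcname = path.relative_to(root).as_posix()
                archive.write(path, arcname)
                names.append(arcname)
    return names
```

For example, `zip_composition("my-composition", "project.zip")` would bundle a hypothetical my-composition directory into project.zip, ready to upload.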

End-to-end example

import time

import requests

API_KEY = "YOUR_API_KEY"
BASE = "https://api.heygen.com"
HEADERS = {"x-api-key": API_KEY, "Content-Type": "application/json"}


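def poll(url, done="completed", key="status", interval=10, timeout=1800):
    """Poll `url` until data[key] equals `done`; raise on failure or timeout."""
    deadline = time.monotonic() + timeout  # the 30-minute cap is an arbitrary default
    while time.monotonic() < deadline:
        resp = requests.get(url, headers=HEADERS, timeout=30)
        resp.raise_for_status()  # surface HTTP errors instead of a KeyError later
        data = resp.json()["data"]
        if data[key] == done:
            return data
        if data[key] == "failed":
            raise RuntimeError(data.get("failure_message", "job failed"))
        time.sleep(interval)
    raise TimeoutError(f"{url} did not reach {done!r} within {timeout}s")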


# 1. Avatar video
vid = requests.post(f"{BASE}/v3/videos", headers=HEADERS, json={
    "type": "avatar",
    "avatar_id": "YOUR_LOOK_ID",
    "script": "Here are our Q2 results.",
    "voice_id": "YOUR_VOICE_ID",
    "resolution": "1080p",
    "engine": {"type": "avatar_v"},
}).json()["data"]
avatar_url = poll(f"{BASE}/v3/videos/{vid['video_id']}")["video_url"]

# 2 + 3. Music and a sound effect (same endpoint, different type)
# Pass the search terms via params so requests handles the URL encoding.
music_url = requests.get(
    f"{BASE}/v3/audio/sounds",
    params={"query": "upbeat corporate background", "limit": 1},
    headers=HEADERS).json()["data"][0]["audio_url"]
sfx_url = requests.get(
    f"{BASE}/v3/audio/sounds",
    params={"query": "whoosh for a scene change", "type": "sound_effects", "limit": 1},
    headers=HEADERS).json()["data"][0]["audio_url"]

# 4 + 5. Compose everything in Hyperframes and render
render = requests.post(f"{BASE}/v3/hyperframes/renders", headers=HEADERS, json={
    "project": {"type": "asset_id", "asset_id": "YOUR_PROJECT_ZIP_ASSET_ID"},
    "format": "mp4",
    "resolution": "1080p",
    "aspect_ratio": "16:9",
    "variables": {
        "avatarUrl": avatar_url,
        "musicUrl": music_url,
        "sfxUrl": sfx_url,
    },
    "title": "Q2 results — composed",
}).json()["data"]
final = poll(f"{BASE}/v3/hyperframes/renders/{render['render_id']}")
print("Final video:", final["video_url"])
Every step here is also available through the HeyGen MCP, so an agent can run the whole avatar → audio → compose → render pipeline on its own.