Prerequisites

An image of a person (PNG or JPEG) — accessible via a public URL or uploaded as an asset
A voice_id for the voice you want. Use GET /v3/voices to browse options.

Step 1 — Generate the video

Use POST /v3/videos with image_url or image_asset_id instead of avatar_id:
curl -X POST "https://api.heygen.com/v3/videos" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "image_url": "https://example.com/person.jpg",
    "script": "Hello! This video was generated directly from a photo, with no avatar setup needed.",
    "voice_id": "YOUR_VOICE_ID",
    "title": "Image to Video Demo",
    "resolution": "1080p",
    "aspect_ratio": "16:9"
  }'
image_url, image_asset_id, and avatar_id are mutually exclusive. Use exactly one.
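
The exclusivity rule above can be checked on the client before the request is sent. The following is a minimal sketch, not part of the HeyGen API: `build_image_source` is a hypothetical helper name, and the only facts it relies on are the three field names listed above.

```python
def build_image_source(image_url=None, image_asset_id=None, avatar_id=None):
    """Return a dict holding exactly one visual source field.

    Hypothetical helper: it only enforces the "use exactly one" rule
    for image_url, image_asset_id, and avatar_id.
    """
    candidates = {
        "image_url": image_url,
        "image_asset_id": image_asset_id,
        "avatar_id": avatar_id,
    }
    # Keep only the fields the caller actually supplied.
    provided = {key: value for key, value in candidates.items() if value is not None}
    if len(provided) != 1:
        raise ValueError(
            "Provide exactly one of image_url, image_asset_id, avatar_id; "
            f"got {sorted(provided) or 'none'}"
        )
    return provided
```

The returned dict can be merged into the request body, for example `{**build_image_source(image_url="https://example.com/person.jpg"), "script": "...", "voice_id": "YOUR_VOICE_ID"}`.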

Step 2 — Poll for completion

Video generation is asynchronous. Poll GET /v3/videos/{video_id} until the status reaches completed:
curl -X GET "https://api.heygen.com/v3/videos/YOUR_VIDEO_ID" \
  -H "x-api-key: YOUR_API_KEY"
| Status | Meaning |
| --- | --- |
| pending | Queued for processing |
| processing | Video is being generated |
| completed | Ready; video_url is available |
| failed | Something went wrong |
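
Since only completed and failed are terminal states, a polling loop usually benefits from an overall deadline so that it cannot run forever. Here is a minimal sketch under stated assumptions: `wait_for_video` and its `fetch_status` argument are hypothetical names, and the timeout and interval defaults are illustrative values, not documented limits.

```python
import time

# Statuses after which polling should stop, taken from the table above.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_video(fetch_status, timeout_s=600.0, interval_s=10.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until a terminal status appears or time runs out.

    fetch_status is a hypothetical callable that returns the status
    payload for one video as a dict with at least a "status" key.
    sleep and clock are injectable so the loop can be tested offline.
    """
    deadline = clock() + timeout_s
    while True:
        data = fetch_status()
        if data["status"] in TERMINAL_STATUSES:
            return data
        if clock() >= deadline:
            raise TimeoutError(
                f"Video still {data['status']!r} after {timeout_s} s"
            )
        sleep(interval_s)
```

With the requests library, `fetch_status` could be a small function that performs the GET request shown above and returns the parsed JSON body's status object. The caller then inspects the returned `status` to tell completed from failed.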

Full example

import requests
import time

API_KEY = "YOUR_API_KEY"
BASE = "https://api.heygen.com"
HEADERS = {"x-api-key": API_KEY, "Content-Type": "application/json"}

# 1. Generate video from an image URL
resp = requests.post(f"{BASE}/v3/videos", headers=HEADERS, json={
    "image_url": "https://example.com/person.jpg",
    "script": "Welcome! This entire video was created from a single photograph.",
    "voice_id": "YOUR_VOICE_ID",
    "title": "Image-to-Video Example",
    "resolution": "1080p",
    "aspect_ratio": "16:9"
})
resp.raise_for_status()  # Stop early on a 4xx/5xx response instead of failing on a missing key
video_id = resp.json()["data"]["video_id"]
print(f"Video created: {video_id}")

# 2. Poll until done
while True:
    status_resp = requests.get(f"{BASE}/v3/videos/{video_id}", headers=HEADERS)
    status_resp.raise_for_status()  # Surface HTTP errors rather than a KeyError below
    data = status_resp.json()["data"]
    print(f"Status: {data['status']}")
    if data["status"] == "completed":
        print(f"Download: {data['video_url']}")
        break
    elif data["status"] == "failed":
        print(f"Error: {data.get('failure_message')}")
        break
    time.sleep(10)

Using audio instead of a script

You can lip-sync to a custom audio file instead of generating speech from text. Pass audio_url or audio_asset_id instead of script + voice_id:
{
  "image_url": "https://example.com/person.jpg",
  "audio_url": "https://example.com/narration.mp3",
  "title": "Image-to-Video with custom audio"
}
script and audio_url/audio_asset_id are mutually exclusive. If you provide a script, you must also provide a voice_id.
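
The rule above can be enforced before sending a request. This is a minimal sketch with a hypothetical helper name, `build_audio_source`. It assumes that audio_url and audio_asset_id are also mutually exclusive with each other, which the text implies but does not state outright.

```python
def build_audio_source(script=None, voice_id=None,
                       audio_url=None, audio_asset_id=None):
    """Return the speech-related fields for a video request.

    Hypothetical helper: returns either script + voice_id, or exactly
    one of audio_url / audio_asset_id (the latter is an assumption).
    """
    audio = {
        key: value
        for key, value in {
            "audio_url": audio_url,
            "audio_asset_id": audio_asset_id,
        }.items()
        if value is not None
    }
    if script is not None:
        if audio:
            raise ValueError("script cannot be combined with audio_url or audio_asset_id")
        if voice_id is None:
            raise ValueError("script requires a voice_id")
        return {"script": script, "voice_id": voice_id}
    if len(audio) != 1:
        raise ValueError(
            "Provide script + voice_id, or exactly one of audio_url / audio_asset_id"
        )
    # voice_id is left out here: no speech is synthesized from custom audio.
    return audio
```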

Optional parameters

| Parameter | Type | Description |
| --- | --- | --- |
| title | string | Display name in the HeyGen dashboard |
| resolution | string | 1080p or 720p |
| aspect_ratio | string | 16:9 or 9:16 |
| remove_background | boolean | Remove the image background from the video |
| background | object | Set a solid color or image background |
| voice_settings | object | Adjust speed (0.5–1.5), pitch (-50 to +50), locale |
| callback_url | string | Webhook URL for completion notification |
| callback_id | string | Your own ID echoed back in the webhook payload |
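
The documented value ranges can be validated locally before a request is made. The sketch below uses a hypothetical helper, `build_options`. The key names inside voice_settings (`speed`, `pitch`, `locale`) are assumptions inferred from the table's description, so check them against the API reference before relying on them.

```python
ALLOWED_RESOLUTIONS = {"1080p", "720p"}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16"}


def build_options(resolution=None, aspect_ratio=None, speed=None, pitch=None,
                  locale=None, callback_url=None, callback_id=None):
    """Collect optional request fields, rejecting out-of-range values."""
    options = {}
    if resolution is not None:
        if resolution not in ALLOWED_RESOLUTIONS:
            raise ValueError(f"resolution must be one of {sorted(ALLOWED_RESOLUTIONS)}")
        options["resolution"] = resolution
    if aspect_ratio is not None:
        if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
            raise ValueError(f"aspect_ratio must be one of {sorted(ALLOWED_ASPECT_RATIOS)}")
        options["aspect_ratio"] = aspect_ratio

    # Assumed key names inside voice_settings; the table lists only the ranges.
    voice_settings = {}
    if speed is not None:
        if not 0.5 <= speed <= 1.5:
            raise ValueError("speed must be between 0.5 and 1.5")
        voice_settings["speed"] = speed
    if pitch is not None:
        if not -50 <= pitch <= 50:
            raise ValueError("pitch must be between -50 and +50")
        voice_settings["pitch"] = pitch
    if locale is not None:
        voice_settings["locale"] = locale
    if voice_settings:
        options["voice_settings"] = voice_settings

    if callback_url is not None:
        options["callback_url"] = callback_url
    if callback_id is not None:
        options["callback_id"] = callback_id
    return options
```

For example, `build_options(resolution="720p", speed=1.2, callback_id="job-42")` yields a dict that can be merged into the JSON body of POST /v3/videos.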

Image-to-video vs. Photo Avatar

| Criteria | Image-to-Video | Photo Avatar |
| --- | --- | --- |
| Setup | None: pass an image and go | Requires POST /v3/avatars first |
| Reusability | One-off per image URL | Reusable across many videos |
| Motion prompt | Not supported | Supported |
| Expressiveness | Not supported | high / medium / low |
| Best for | Quick tests, one-off content | Recurring brand content |
If you plan to generate multiple videos with the same person, create a Photo Avatar once and reuse its avatar_id. This saves processing time and unlocks motion and expressiveness controls.
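
As a rough illustration of that reuse, the sketch below builds one request body per script for a single avatar_id. `payloads_for_scripts` is a hypothetical helper. It assumes an avatar-based request accepts script and voice_id in the same way as the image-based requests in this guide, and it does not cover creating the Photo Avatar itself.

```python
def payloads_for_scripts(avatar_id, voice_id, scripts):
    """Build one POST /v3/videos body per script, all sharing one avatar_id.

    Assumption: avatar-based requests take script and voice_id just like
    the image_url requests shown in this guide.
    """
    return [
        {"avatar_id": avatar_id, "script": script, "voice_id": voice_id}
        for script in scripts
    ]
```

Each resulting dict can be sent as the JSON body of its own POST /v3/videos request.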