HeyGen’s self-serve (Pay-As-You-Go) plan lets you purchase USD balance when you need it — no monthly subscription, no commitments.

How Billing Works

When you authenticate with an API Key (x-api-key header), you are billed under the API tier. Usage is deducted from your prepaid USD wallet. Check your balance at any time:
GET /v3/user/me → wallet
OAuth vs API Key: If you authenticate with an OAuth bearer token, usage is billed against your web plan, not the API tier. Check your web plan balance with GET /v3/user/me → subscription.
Using an API Key is recommended for automation and integration workflows: it provides higher concurrency limits and is more flexible for programmatic use.
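As a minimal sketch of the balance check with API key authentication, the request can be built with the Python standard library. The base URL below is an assumption (this page only gives the path), and the function names are hypothetical helpers, not part of any SDK.

```python
import json
import urllib.request

# Assumption: the base URL is not stated on this page; adjust it to your environment.
BASE_URL = "https://api.heygen.com"


def build_me_request(api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a GET /v3/user/me request authenticated with the x-api-key header."""
    return urllib.request.Request(
        f"{base_url}/v3/user/me",
        headers={"x-api-key": api_key, "Accept": "application/json"},
        method="GET",
    )


def fetch_me(api_key: str) -> dict:
    """Send the request and decode the JSON body (this performs a network call)."""
    with urllib.request.urlopen(build_me_request(api_key), timeout=30) as resp:
        return json.load(resp)
```

With an API key, the balance is expected under the wallet field of the response; its exact shape is not specified on this page, so inspect the decoded body before relying on particular keys.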

Pricing

All rates are in USD. Usage is billed per second of output duration, except Avatar Creation, which is billed per call.

Video Agent

| Feature | Rate |
|---|---|
| Prompt to Video | $0.0333 / sec |

Video Generation

| Avatar Type | Engine | Rate |
|---|---|---|
| Public Avatar | IV | $0.1 / sec |
| Digital Twin | IV | $0.1 / sec |
| Photo Avatar | IV | $0.1 / sec |

Video Translation

| Mode | Rate |
|---|---|
| Speed Mode | $0.05 / sec |
| Precision Mode | $0.1 / sec |
| Proofread | $0.0083 / sec |

Overdub

| Mode | Rate |
|---|---|
| Speed Mode | $0.05 / sec |
| Precision Mode | $0.1 / sec |

Text-to-Speech

| Model | Rate |
|---|---|
| Speech (Starfish) | $0.000333 / sec |

Avatar Creation

| Operation | Rate |
|---|---|
| Digital Twin | $1 per call |
| Photo Avatar | $1 per call |
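The per-second rates above can be turned into a rough cost estimate before submitting a job. The sketch below is illustrative only: the dictionary keys are hypothetical labels rather than API identifiers, and no rounding is applied because this page does not state the billing granularity.

```python
from decimal import Decimal

# USD per second of output, copied from the pricing tables above.
# The keys are hypothetical labels chosen for this example.
RATES_PER_SECOND = {
    "video_agent": Decimal("0.0333"),
    "avatar_video": Decimal("0.1"),
    "translation_speed": Decimal("0.05"),
    "translation_precision": Decimal("0.1"),
    "translation_proofread": Decimal("0.0083"),
    "overdub_speed": Decimal("0.05"),
    "overdub_precision": Decimal("0.1"),
    "tts_starfish": Decimal("0.000333"),
}


def estimate_cost(feature: str, output_seconds: float) -> Decimal:
    """Return rate x duration in USD, without rounding."""
    rate = RATES_PER_SECOND[feature]
    return rate * Decimal(str(output_seconds))


# A 60-second avatar video: 0.1 x 60
print(estimate_cost("avatar_video", 60))  # prints 6.0
```

Decimal is used instead of float so that small per-second rates do not accumulate binary rounding error in the estimate.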

Concurrency Limits

| Plan | Max Concurrent Video Jobs |
|---|---|
| Pay-As-You-Go | 10 |
Concurrent jobs include any asynchronous generation in progress: Video Agent sessions, avatar video renders, and video translations. Exceeding the limit returns 429 Too Many Requests with a Retry-After header.
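One way to stay under the concurrency limit on the client side is to gate jobs behind a semaphore. This is a sketch under the assumption that each job is a coroutine that holds its slot until the generation has finished (submit, then poll to completion); the helper name is hypothetical.

```python
import asyncio

MAX_CONCURRENT_JOBS = 10  # Pay-As-You-Go limit from the table above


async def run_with_cap(jobs, limit: int = MAX_CONCURRENT_JOBS):
    """Run zero-argument coroutine functions with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        # A slot is held for the whole lifetime of the job.
        async with semaphore:
            return await job()

    # Results come back in the same order as `jobs`.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

A client-side cap does not replace handling 429 responses, since other processes using the same account also count toward the limit.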

Endpoint Limits

Video Generation Input

Resources provided to POST /v3/videos must meet these limits. Invalid resources will cause render failures.
| Resource Type | Supported Formats | Max File Size | Max Resolution |
|---|---|---|---|
| Video | MP4, WebM | 100 MB | < 2K |
| Image | JPG, PNG | 50 MB | < 2K |
| Audio | WAV, MP3 | 50 MB | |
Requirements:
  • Resource URLs must be publicly accessible (no authentication required).
  • The file extension must match the actual file format.
  • Files must not be corrupted or malformed.
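A pre-flight check against the format and size limits above might look like the following sketch. It is an assumption-laden helper, not an official validator: it treats 1 MB as 1,000,000 bytes (the stricter reading, since the unit is not defined here), accepts .jpeg as an alias of .jpg, and does not check resolution or file integrity.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

MB = 1_000_000  # Assumption: conservative reading of "MB"; the page does not define it.

# Allowed extensions and maximum size per resource type, from the table above.
LIMITS = {
    "video": ({".mp4", ".webm"}, 100 * MB),
    "image": ({".jpg", ".jpeg", ".png"}, 50 * MB),  # ".jpeg" is an assumed alias
    "audio": ({".wav", ".mp3"}, 50 * MB),
}


def check_resource(kind: str, url: str, size_bytes: int) -> list[str]:
    """Return a list of problems found; an empty list means no problem was detected."""
    extensions, max_bytes = LIMITS[kind]
    problems = []
    # Look only at the URL path so query strings do not hide the extension.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    if suffix not in extensions:
        problems.append(f"unsupported {kind} extension: {suffix or '(none)'}")
    if size_bytes > max_bytes:
        problems.append(f"{kind} is {size_bytes} bytes, above the {max_bytes}-byte limit")
    return problems
```

Passing this check does not guarantee a successful render; the URL must still be publicly accessible and the file contents must match the extension.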

Avatar Input

  • Script text: Maximum 5,000 characters.
  • Audio input: Maximum 10 minutes (600 seconds).

Video Agent Input

  • Prompt: 1–10,000 characters.
  • File attachments: Up to 20 files. Supported types: image (PNG, JPEG), video (MP4, WebM), audio (MP3, WAV), and PDF.
  • Files can be provided as an asset_id (from POST /v3/assets), an HTTPS URL, or base64-encoded content.

Asset Upload (POST /v3/assets)

  • Maximum file size: 32 MB.
  • Supported types: Image (PNG, JPEG), video (MP4, WebM), audio (MP3, WAV), and PDF.

Text-to-Speech Input (POST /v3/voices/speech)

  • Text length: 1–5,000 characters.
  • Speed multiplier: 0.5× to 2.0×.
  • Input type: Plain text or SSML markup.
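The text-to-speech limits above can be checked before calling the endpoint. This is a small sketch with a hypothetical helper name; it counts Python string characters, and whether SSML tags count toward the 5,000-character limit is not stated here, so the check simply measures the full string.

```python
def check_speech_input(text: str, speed: float) -> list[str]:
    """Return a list of violations of the documented text and speed limits."""
    problems = []
    if not 1 <= len(text) <= 5000:
        problems.append(f"text length {len(text)} is outside 1-5000 characters")
    if not 0.5 <= speed <= 2.0:
        problems.append(f"speed {speed} is outside 0.5-2.0")
    return problems
```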

Output Video Specifications

  • Frame rate: 25 fps for videos containing avatars.
  • Resolution: Width and height must each be between 128 and 4,096 pixels. Default output is 1080p.
  • Aspect ratio: 16:9 or 9:16.
  • Maximum scenes: 50 per video.
  • Maximum duration: 30 minutes.
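The resolution and aspect-ratio rules above can be expressed as a simple check on requested output dimensions. The sketch assumes the ratio must be exactly 16:9 or 9:16; whether the API tolerates near-matches is not stated on this page.

```python
from fractions import Fraction

ALLOWED_RATIOS = {Fraction(16, 9), Fraction(9, 16)}


def check_output_size(width: int, height: int) -> list[str]:
    """Return a list of violations of the documented size and ratio limits."""
    problems = []
    for name, value in (("width", width), ("height", height)):
        if not 128 <= value <= 4096:
            problems.append(f"{name} {value} is outside 128-4096 px")
    # Skip the ratio test for non-positive sizes, which the range check already flags.
    if width > 0 and height > 0 and Fraction(width, height) not in ALLOWED_RATIOS:
        problems.append(f"{width}x{height} is not 16:9 or 9:16")
    return problems
```

For example, 1920x1080 and 1080x1920 pass, while a square 1024x1024 request is flagged for its aspect ratio.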

Pagination

Most list endpoints use cursor-based pagination with a limit parameter and next_token for the next page.
| Endpoint | Default | Max |
|---|---|---|
| GET /v3/videos | 10 | 100 |
| GET /v3/avatars | 20 | 50 |
| GET /v3/avatars/looks | 20 | 50 |
| GET /v3/voices | 20 | 100 |
| GET /v3/video-agents/styles | 20 | 100 |
| GET /v3/video-translations | 10 | 100 |
| GET /v3/webhooks/endpoints | 10 | 100 |
| GET /v3/webhooks/events | 10 | 100 |
| GET /v3/video-agents/sessions/{id}/resources | 8 | 100 |
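Cursor-based pagination usually reduces to a loop that passes the last cursor back until none is returned. The sketch below assumes each page is a dict with items under "data" and the cursor under "next_token"; those response keys and the fetch_page callable are assumptions for illustration, so adapt them to the actual response.

```python
from typing import Callable, Iterator, Optional


def iter_items(
    fetch_page: Callable[[int, Optional[str]], dict],
    limit: int,
) -> Iterator[dict]:
    """Yield items from every page, following next_token until it is absent."""
    token: Optional[str] = None
    while True:
        # fetch_page is a hypothetical callable that performs one list request.
        page = fetch_page(limit, token)
        yield from page.get("data", [])
        token = page.get("next_token")
        if not token:
            break
```

Keep limit at or below the endpoint's maximum from the table above; for example, at most 50 for GET /v3/avatars.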

Rate Limiting

All endpoints enforce rate limits. When exceeded, the API returns 429 Too Many Requests with a Retry-After header indicating the number of seconds to wait before retrying.
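A client can honor the Retry-After header with a small retry loop. In this sketch, send is a hypothetical callable that performs one request and returns a (status, headers) pair; the attempt cap and the fallback delay are assumptions, not documented values.

```python
import time
from typing import Callable, Mapping, Tuple


def send_with_retry(
    send: Callable[[], Tuple[int, Mapping[str, str]]],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
    fallback_delay: float = 1.0,
) -> Tuple[int, Mapping[str, str]]:
    """Call send() until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers = send()
        if status != 429 or attempt == max_attempts:
            break
        try:
            # The header is documented as a number of seconds.
            delay = float(headers.get("Retry-After", fallback_delay))
        except ValueError:
            delay = fallback_delay
        sleep(max(delay, 0.0))
    return status, headers
```

The lookup assumes the headers mapping uses the exact key "Retry-After"; with a case-sensitive dict from a custom HTTP layer, normalize header names first.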