Enterprise Pricing

Overview

Enterprise plans are billed in credits. Credits are consumed when a video, translation, or speech job completes successfully — you are not charged for failed jobs. Most rates are per second of output video (or audio) produced. The rate depends on the feature and the quality mode you select.

Credit balances and remaining usage are available via GET /v3/users/me. Contact your account team to purchase additional credits or adjust your credit pool.

OAuth vs API Key: If you authenticate with an OAuth bearer token, usage is billed against your web plan, not the API tier. For automation and integration workflows, use an API key instead: API key authentication provides higher concurrency limits and is better suited to programmatic use.

Pricing

Video Generation — Avatar IV & V

Avatar Type	Rate
Photo Avatar	0.1 credits / sec
Digital Twin	0.1 credits / sec
Studio Avatar	0.1 credits / sec

Video Generation — Avatar III

Avatar III is available to existing customers only and is not offered to new users. For all new integrations, use Avatar IV or Avatar V.

Avatar Type	Rate
Photo Avatar	0.0033 credits / sec
Digital Twin	0.0033 credits / sec
Studio Avatar	0.0033 credits / sec

Video Agent

Feature	Rate
Prompt to Video	0.0667 credits / sec

Video Translation

Mode	Rate
Speed — Audio Only	0.05 credits / sec
Speed — Lip Sync	0.05 credits / sec
Precision — Lip Sync	0.1 credits / sec
Proofread	0.00833 credits / sec

Lipsync

Mode	Rate
Speed	0.05 credits / sec
Precision	0.1 credits / sec

Text-to-Speech

Model	Rate
Speech — Starfish	0.000333 credits / sec

Avatar Creation

Operation	Rate
Digital Twin	1 credit / call
Photo Avatar	1 credit / call
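
As a worked example of the per-second rates above, the sketch below estimates the credits for a job from its output duration. The dictionary keys and the `estimate_credits` helper are illustrative names, not API identifiers; the values are copied from the tables in this section. Actual billing may round differently, so treat the result as an estimate.

```python
from decimal import Decimal

# Credits per second of output, copied from the pricing tables above.
# The keys are illustrative labels, not identifiers used by the API.
RATES_PER_SECOND = {
    "avatar_iv_v": Decimal("0.1"),
    "avatar_iii": Decimal("0.0033"),
    "video_agent": Decimal("0.0667"),
    "translation_speed": Decimal("0.05"),
    "translation_precision": Decimal("0.1"),
    "translation_proofread": Decimal("0.00833"),
    "lipsync_speed": Decimal("0.05"),
    "lipsync_precision": Decimal("0.1"),
    "tts_starfish": Decimal("0.000333"),
}


def estimate_credits(feature: str, output_seconds: float) -> Decimal:
    """Estimate credits as rate x output duration in seconds."""
    if output_seconds < 0:
        raise ValueError("output_seconds must not be negative")
    # Go through str() so a float such as 90.5 converts to an exact Decimal.
    return RATES_PER_SECOND[feature] * Decimal(str(output_seconds))
```

For instance, `estimate_credits("avatar_iv_v", 60)` multiplies 0.1 credits per second by 60 seconds, which gives 6 credits for a one-minute Avatar IV or V video.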

Concurrency Limits

Plan	Max Concurrent Video Jobs
Enterprise	20+ (varies by contract)

Concurrent jobs include any asynchronous generation in progress: Video Agent sessions, avatar video renders, and video translations. Exceeding the limit returns 429 Too Many Requests with a Retry-After header.
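
One way to stay under the concurrency limit is to cap in-flight jobs on the client side. The sketch below assumes a hypothetical `run_job` callable that submits one job and blocks until that job finishes (for example by polling), so each worker thread corresponds to one concurrent job. `MAX_CONCURRENT_JOBS` is a placeholder; use the value from your contract.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_JOBS = 20  # placeholder: the real limit varies by contract


def run_with_cap(jobs, run_job, limit=MAX_CONCURRENT_JOBS):
    """Run run_job(job) for every job with at most `limit` running at once.

    run_job is a hypothetical callable that must block until its job is
    done; otherwise the worker count does not reflect jobs in progress.
    Results are returned in the same order as `jobs`.
    """
    with ThreadPoolExecutor(max_workers=limit) as pool:
        return list(pool.map(run_job, jobs))
```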

Endpoint Limits

Video Generation Input

Resources provided to POST /v3/videos must meet these limits. Invalid resources will cause render failures.

Resource Type	Supported Formats	Max File Size	Max Resolution
Video	MP4, WebM	100 MB	< 2K
Image	JPG, PNG	50 MB	< 2K
Audio	WAV, MP3	50 MB	—

Requirements:

Resource URLs must be publicly accessible (no authentication required).
The file extension must match the actual file format.
Files must not be corrupted or malformed.
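
A pre-flight check along the following lines can catch format and size problems before a render is submitted. This is a minimal sketch: `check_resource` is a hypothetical helper, the caller is assumed to know the file size already, treating "MB" as 1024 × 1024 bytes and accepting `.jpeg` as JPG are both assumptions, and resolution and corruption checks are not covered.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

MB = 1024 * 1024  # assumption: the table's MB is read as 1024 * 1024 bytes

# Allowed extensions and size caps from the table above, by resource type.
# ".jpeg" is assumed to count as JPG.
LIMITS = {
    "video": ({".mp4", ".webm"}, 100 * MB),
    "image": ({".jpg", ".jpeg", ".png"}, 50 * MB),
    "audio": ({".wav", ".mp3"}, 50 * MB),
}


def check_resource(kind: str, url: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means these checks passed."""
    extensions, max_bytes = LIMITS[kind]
    problems = []
    # Look only at the URL path, so query strings do not hide the extension.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    if suffix not in extensions:
        problems.append(f"unsupported {kind} extension: {suffix or '(none)'}")
    if size_bytes > max_bytes:
        problems.append(f"{kind} exceeds {max_bytes // MB} MB")
    return problems
```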

Avatar Input

Script text: Maximum 5,000 characters.
Audio input: Maximum 10 minutes (600 seconds).

Video Agent Input

Prompt: 1–10,000 characters.
File attachments: Up to 20 files. Supported types: image (PNG, JPEG), video (MP4, WebM), audio (MP3, WAV), and PDF.
Files can be provided as an asset_id (from POST /v3/assets), an HTTPS URL, or base64-encoded content.

Asset Upload (`POST /v3/assets`)

Maximum file size: 32 MB.
Supported types: Image (PNG, JPEG), video (MP4, WebM), audio (MP3, WAV), and PDF.

Text-to-Speech Input (`POST /v3/voices/speech`)

Text length: 1–5,000 characters.
Speed multiplier: 0.5× to 2.0×.
Input type: Plain text or SSML markup.
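
The text-length and speed ranges above can be checked on the client before calling the endpoint. In this sketch, `check_tts_request` is a hypothetical helper, and counting characters with `len()` (Unicode code points) is an assumption about how the API measures length.

```python
def check_tts_request(text: str, speed: float) -> None:
    """Raise ValueError if the input is outside the documented ranges."""
    # Assumption: the 5,000-character limit counts Unicode code points.
    if not 1 <= len(text) <= 5000:
        raise ValueError("text must be 1 to 5,000 characters long")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed multiplier must be between 0.5 and 2.0")
```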

Output Video Specifications

Frame rate: 25 fps for videos containing avatars.
Resolution: Width and height must each be between 128 and 4,096 pixels. Default output is 1080p (up to 4K on Enterprise).
Aspect ratio: 16:9 or 9:16.
Maximum scenes: 50 per video.
Maximum duration: Custom (contact your account team).
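
The resolution and aspect-ratio constraints above can be combined into one check. The sketch below uses a hypothetical `check_dimensions` helper and assumes the aspect ratio must match 16:9 or 9:16 exactly; the API may tolerate small deviations.

```python
from fractions import Fraction

ALLOWED_RATIOS = {Fraction(16, 9), Fraction(9, 16)}


def check_dimensions(width: int, height: int) -> None:
    """Raise ValueError if width/height break the documented constraints."""
    for name, value in (("width", width), ("height", height)):
        if not 128 <= value <= 4096:
            raise ValueError(f"{name} must be between 128 and 4096 pixels")
    # Fraction reduces the ratio, so 1920x1080 and 3840x2160 both equal 16/9.
    # Assumption: the ratio must match exactly.
    if Fraction(width, height) not in ALLOWED_RATIOS:
        raise ValueError("aspect ratio must be 16:9 or 9:16")
```

For example, 1920 × 1080 and 1080 × 1920 pass, while 1000 × 1000 fails the aspect-ratio check.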

Pagination

Most list endpoints use cursor-based pagination with a limit parameter and next_token for the next page.

Endpoint	Default	Max
`GET /v3/videos`	10	100
`GET /v3/avatars`	20	50
`GET /v3/avatars/looks`	20	50
`GET /v3/voices`	20	100
`GET /v3/video-agents/styles`	20	100
`GET /v3/video-translations`	10	100
`GET /v3/webhooks/endpoints`	10	100
`GET /v3/webhooks/events`	10	100
`GET /v3/video-agents/sessions/{id}/resources`	8	100
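
A cursor-based listing loop might look like the sketch below. `fetch_page` is a hypothetical callable that performs the HTTP request for one page; the `data` field name is an assumption about the response body, and `next_token` is assumed to be absent or empty on the last page.

```python
from typing import Callable, Iterator, Optional


def iter_items(
    fetch_page: Callable[[int, Optional[str]], dict],
    limit: int,
) -> Iterator[dict]:
    """Yield every item from a paginated list endpoint, page by page.

    fetch_page(limit, next_token) is a hypothetical callable that returns
    the decoded JSON body of one page.
    """
    token: Optional[str] = None
    while True:
        page = fetch_page(limit, token)
        # Assumption: the items of a page live under a "data" key.
        yield from page.get("data", [])
        token = page.get("next_token")
        if not token:  # no cursor means this was the last page
            break
```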

Rate Limiting

All endpoints enforce rate limits. When exceeded, the API returns 429 Too Many Requests with a Retry-After header indicating the number of seconds to wait before retrying.
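
The retry behaviour described above can be sketched as follows. `send` is a hypothetical callable that performs one request and returns `(status_code, headers, body)`; the one-second fallback for a missing `Retry-After` header and the default of five attempts are assumptions, not documented values.

```python
import time


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it stops returning 429, honouring Retry-After.

    send is a hypothetical callable returning (status_code, headers, body).
    Returns (status_code, body) of the first response that is not a 429.
    """
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts:
            # Retry-After carries the wait in seconds; fall back to 1 second
            # (an assumption) if the header is missing.
            sleep(float(headers.get("Retry-After", 1)))
    raise RuntimeError("still rate limited after retries")
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.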
