# Create Video

> Creates a video from a HeyGen avatar or an arbitrary image, driven by a text script or pre-recorded audio for lip-sync. Set the `engine` field to choose the Avatar III, Avatar IV, or Avatar V engine; Avatar IV is used when `engine` is omitted.
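
A request to this endpoint can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not an official client: the helper names `build_avatar_payload` and `build_request` and the placeholder IDs are hypothetical, the `x-api-key` header name is taken from the 401 error message in the spec below, and only the standard library is used. The request is prepared but not sent.

```python
import json
import uuid
import urllib.request

# Server URL plus path, as listed in the OpenAPI spec below.
API_URL = "https://api.heygen.com/v3/videos"


def build_avatar_payload(avatar_id, script=None, voice_id=None,
                         audio_url=None, engine_type=None):
    """Build a type='avatar' request body (hypothetical helper)."""
    # The spec marks script and audio_url as mutually exclusive.
    if script is not None and audio_url is not None:
        raise ValueError("script and audio_url are mutually exclusive")
    payload = {"type": "avatar", "avatar_id": avatar_id}
    if script is not None:
        payload["script"] = script
        if voice_id is not None:
            # Optional here: the avatar's default voice is the fallback.
            payload["voice_id"] = voice_id
    if audio_url is not None:
        payload["audio_url"] = audio_url
    if engine_type is not None:
        # For example "avatar_v"; omitting engine selects Avatar IV.
        payload["engine"] = {"type": engine_type}
    return payload


def build_request(payload, api_key, idempotency_key=None):
    """Prepare, but do not send, the POST request (hypothetical helper)."""
    headers = {
        "Content-Type": "application/json",
        "x-api-key": api_key,
        # A UUID is described by the spec as a safe default key.
        "Idempotency-Key": idempotency_key or str(uuid.uuid4()),
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


payload = build_avatar_payload(
    "YOUR_AVATAR_ID", script="Hello from the API.", engine_type="avatar_v"
)
req = build_request(payload, api_key="YOUR_API_KEY")
```

Passing `req` to `urllib.request.urlopen` would send it; per the response schema below, a successful reply should be a JSON body whose `data` object carries `video_id` and `status`. Reusing the same `Idempotency-Key` on a retry is what lets the server replay the original response instead of creating a second video.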


## OpenAPI

````yaml /openapi/external-api.json post /v3/videos
openapi: 3.1.0
info:
  title: HeyGen External API
  version: 1.0.0
  description: >-
    HeyGen's external API for programmatic AI video creation. See
    https://docs.heygen.com for full documentation.
  contact:
    name: HeyGen Product Infra
    url: https://heygen.com
servers:
  - url: https://api.heygen.com
    description: Production
security:
  - ApiKeyAuth: []
  - BearerAuth: []
tags:
  - name: Video Agent
    description: Create videos from text prompts using AI
  - name: Videos
    description: Create, list, retrieve, and delete videos
  - name: Voices
    description: Text-to-speech and voice management
  - name: Audio
    description: Search the background-music and sound-effects catalog
  - name: Video Translate
    description: Translate videos into other languages
  - name: AI Clipping
    description: Turn long-form videos into ready-to-share short clips with captions
  - name: User
    description: Account information and billing
  - name: Avatars
    description: List and manage avatars and looks
  - name: Avatar Realtime
    description: >-
      Low-latency streaming avatar sessions — create a stream, poll for its HLS
      URL, push text, consume per-word timestamps
  - name: Assets
    description: Upload files for use in video creation
  - name: Webhooks
    description: Manage webhook endpoints and events
  - name: Lipsync
    description: Dub or replace audio on existing videos
  - name: Brand
    description: >-
      Brand-related resources — brand kits (colors, fonts, logos) and brand
      glossaries (custom term translations)
  - name: HyperFrames
    description: Render HyperFrames composition zips into video — separate from /v3/videos
paths:
  /v3/videos:
    post:
      tags:
        - Videos
      summary: Create Video
      description: >-
        Creates a video from a HeyGen avatar or an arbitrary image. Supports
        scripts or pre-recorded audio for lip-sync. Supports the Avatar III,
        Avatar IV, and Avatar V engines; set the 'engine' field to select.
        Avatar IV is used by default when 'engine' is omitted.
      operationId: createVideo
      parameters:
        - $ref: '#/components/parameters/IdempotencyKey'
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/CreateVideoV3RequestBody'
      responses:
        '200':
          description: Successful response
          content:
            application/json:
              schema:
                type: object
                properties:
                  data:
                    $ref: '#/components/schemas/CreateAvatarVideoResponse'
        '400':
          description: Invalid request parameters
          content:
            application/json:
              schema:
                type: object
                properties:
                  error:
                    $ref: '#/components/schemas/StandardAPIError'
              example:
                error:
                  code: invalid_parameter
                  message: >-
                    Exactly one visual source required: avatar_id, image_url, or
                    image_asset_id.
                  param: avatar_id
                  doc_url: null
        '401':
          description: Authentication failed
          content:
            application/json:
              schema:
                type: object
                properties:
                  error:
                    $ref: '#/components/schemas/StandardAPIError'
              example:
                error:
                  code: authentication_failed
                  message: Invalid or expired API key. Verify your x-api-key header.
                  param: null
                  doc_url: null
        '409':
          $ref: '#/components/responses/IdempotencyInProgress'
        '429':
          description: Rate limit exceeded
          content:
            application/json:
              schema:
                type: object
                properties:
                  error:
                    $ref: '#/components/schemas/StandardAPIError'
              example:
                error:
                  code: rate_limit_exceeded
                  message: >-
                    Too many requests. Retry after the duration specified in the
                    Retry-After header.
                  param: null
                  doc_url: null
          headers:
            Retry-After:
              description: Seconds to wait before retrying
              schema:
                type: integer
      security:
        - ApiKeyAuth: []
        - BearerAuth: []
components:
  parameters:
    IdempotencyKey:
      name: Idempotency-Key
      in: header
      required: false
      description: >-
        Optional client-supplied key for safely retrying mutations. Subsequent
        calls within 24 hours that share this key replay the original response —
        even if the request body differs slightly (a warning is logged). A retry
        that arrives while the original is still in flight gets a 409
        `request_in_progress`. Keys must be 1–255 characters from
        `[A-Za-z0-9_:.-]`; a UUID is a safe default. Scope is per-endpoint and
        per-resource: the same key on a different route or path parameter is
        independent.
      schema:
        type: string
        pattern: ^[A-Za-z0-9_\-:.]{1,255}$
        maxLength: 255
        minLength: 1
      example: 550e8400-e29b-41d4-a716-446655440000
  schemas:
    CreateVideoV3RequestBody:
      description: Discriminated union for POST /v3/videos request body.
      discriminator:
        mapping:
          avatar:
            $ref: '#/components/schemas/CreateVideoFromAvatar'
          cinematic_avatar:
            $ref: '#/components/schemas/CreateVideoFromCinematicAvatar'
          image:
            $ref: '#/components/schemas/CreateVideoFromImage'
        propertyName: type
      oneOf:
        - $ref: '#/components/schemas/CreateVideoFromAvatar'
        - $ref: '#/components/schemas/CreateVideoFromImage'
        - $ref: '#/components/schemas/CreateVideoFromCinematicAvatar'
      title: CreateVideoV3RequestBody
    CreateAvatarVideoResponse:
      properties:
        video_id:
          description: Unique identifier for the created video.
          examples:
            - v_abc123def456
          title: Video Id
          type: string
        status:
          description: Initial video status (e.g. 'waiting').
          examples:
            - waiting
          title: Status
          type: string
        output_format:
          $ref: '#/components/schemas/VideoOutputFormat'
          default: mp4
          description: Resolved output format for the video.
      required:
        - video_id
        - status
      title: CreateAvatarVideoResponse
      type: object
    StandardAPIError:
      type: object
      properties:
        code:
          type: string
          description: Machine-readable error code
          example: invalid_parameter
        message:
          type: string
          description: Human-readable error message
          example: Video not found
        param:
          type:
            - string
            - 'null'
          description: Which request field caused the error
        doc_url:
          type:
            - string
            - 'null'
          description: Link to error documentation
      required:
        - code
        - message
    CreateVideoFromAvatar:
      additionalProperties: false
      description: >-
        Create a video from a HeyGen avatar (video or photo avatar).


        Provide an avatar_id to use a previously created avatar. Supports all
        avatar types: studio_avatar, digital_twin, and photo_avatar. Optionally
        set ``engine`` to select Avatar V for eligible avatars; when omitted,
        the server defaults to Avatar IV.
      properties:
        title:
          anyOf:
            - type: string
            - type: 'null'
          default: null
          description: Display title for the video in the HeyGen dashboard.
          title: Title
        resolution:
          anyOf:
            - $ref: '#/components/schemas/VideoResolution'
            - type: 'null'
          default: null
          description: Output video resolution.
        aspect_ratio:
          anyOf:
            - $ref: '#/components/schemas/VideoAspectRatio'
            - type: 'null'
          default: '16:9'
          description: >-
            Output video aspect ratio. Supported values: '16:9', '9:16', '4:5',
            '5:4', '1:1', 'auto'. Defaults to '16:9'. 'auto' preserves the
            source's aspect ratio (avatar source frames or uploaded image),
            short-edge anchored to the requested resolution and capped at the
            tier's long edge. Falls back to '16:9' when source dimensions can't
            be read.
          x-cli-default: auto
          x-mcp-default: auto
        fit:
          anyOf:
            - $ref: '#/components/schemas/AvatarFit'
            - type: 'null'
          default: null
          description: >-
            How the subject is fitted to the output canvas. 'cover' scales to
            fill the frame (may crop edges). 'contain' scales to fit entirely
            within the frame (may show background). When omitted, the server
            picks the best option based on the source and canvas orientations.
        background:
          anyOf:
            - $ref: '#/components/schemas/BackgroundSetting'
            - type: 'null'
          default: null
          description: Background settings for the video.
        remove_background:
          anyOf:
            - type: boolean
            - type: 'null'
          default: null
          description: >-
            Remove the avatar background. Video avatars must be trained with
            matting enabled.
          title: Remove Background
        callback_url:
          anyOf:
            - type: string
            - type: 'null'
          default: null
          description: Webhook URL to receive a POST notification when the video is ready.
          title: Callback Url
        callback_id:
          anyOf:
            - type: string
            - type: 'null'
          default: null
          description: Caller-defined identifier echoed back in the webhook payload.
          title: Callback Id
        watermark:
          anyOf:
            - $ref: '#/components/schemas/WatermarkInput'
            - type: 'null'
          default: null
          description: >-
            Custom watermark image to overlay on the video (PNG or JPEG).
            Available as a premium option for select Enterprise customers. To
            request access, please contact our support team.
          x-cli-visible: false
          x-mcp-visible: false
        caption:
          anyOf:
            - $ref: '#/components/schemas/CaptionSetting'
            - type: 'null'
          default: null
          description: >-
            Caption generation settings. A sidecar subtitle file is always
            returned via subtitle_url; set 'style' to additionally burn captions
            into the rendered video.
        output_format:
          $ref: '#/components/schemas/VideoOutputFormat'
          default: mp4
          description: >-
            Output container. 'webm' returns a video with a transparent
            background (alpha channel); 'mp4' (default) returns a standard
            video. 'webm' requires an avatar that supports matting. When 'webm'
            is selected, any 'background' value is rejected and background
            removal is applied automatically — the caller does not need to set
            'remove_background'.
        script:
          anyOf:
            - minLength: 1
              type: string
            - type: 'null'
          default: null
          description: >-
            Text script for the avatar to speak. Pair with voice_id, or omit
            voice_id when using avatar_id to use the avatar's default voice.
            Mutually exclusive with audio_url/audio_asset_id.
          title: Script
        voice_id:
          anyOf:
            - type: string
            - type: 'null'
          default: null
          description: >-
            Voice ID for text-to-speech. Required when script is provided,
            unless avatar_id is set (the avatar's default voice is used as
            fallback).
          title: Voice Id
        audio_url:
          anyOf:
            - type: string
            - type: 'null'
          default: null
          description: >-
            Public URL of an audio file to lip-sync. Mutually exclusive with
            script.
          title: Audio Url
        audio_asset_id:
          anyOf:
            - type: string
            - type: 'null'
          default: null
          description: >-
            HeyGen asset ID of an uploaded audio file. Mutually exclusive with
            script.
          title: Audio Asset Id
        voice_settings:
          anyOf:
            - $ref: '#/components/schemas/VoiceSettingsInput'
            - type: 'null'
          default: null
          description: Voice tuning parameters (speed, pitch, volume, locale, engine settings).
        type:
          const: avatar
          description: Must be 'avatar' for avatar-based video creation.
          title: Type
          type: string
        avatar_id:
          description: HeyGen avatar ID (video avatar or photo avatar look ID).
          title: Avatar Id
          type: string
        motion_prompt:
          anyOf:
            - type: string
            - type: 'null'
          default: null
          description: >-
            Natural-language prompt controlling avatar body motion and hand
            gestures. Supported for photo avatars on either engine, and for
            video avatars when engine.type is 'avatar_v'. Rejected for video
            avatars on the default Avatar IV engine.
          title: Motion Prompt
        expressiveness:
          anyOf:
            - $ref: '#/components/schemas/Expressiveness'
            - type: 'null'
          default: null
          description: >-
            Avatar expressiveness level. Applies to photo avatars on the Avatar
            IV engine only; rejected when engine.type is 'avatar_v'. Defaults to
            'low' when omitted.
        engine:
          anyOf:
            - discriminator:
                mapping:
                  avatar_iii:
                    $ref: '#/components/schemas/AvatarIIIEngineConfig'
                  avatar_iv:
                    $ref: '#/components/schemas/AvatarIVEngineConfig'
                  avatar_v:
                    $ref: '#/components/schemas/AvatarVEngineConfig'
                propertyName: type
              oneOf:
                - $ref: '#/components/schemas/AvatarVEngineConfig'
                - $ref: '#/components/schemas/AvatarIVEngineConfig'
                - $ref: '#/components/schemas/AvatarIIIEngineConfig'
            - type: 'null'
          default: null
          description: >-
            Engine configuration for video generation. Pass {"type": "avatar_v"}
            to enable cross-reference-driven animation for higher quality. Check
            supported_api_engines on the avatar look to confirm eligibility.
            Defaults to Avatar IV when omitted.
          title: Engine
      required:
        - type
        - avatar_id
      title: CreateVideoFromAvatar
      type: object
    CreateVideoFromCinematicAvatar:
      additionalProperties: false
      description: >-
        Create a video from a text prompt plus avatar and asset references
        (Cinematic Avatar).


        Cinematic Avatar generates a video from a natural-language ``prompt``
        guided by reference content: one to three avatar looks and optional
        reference assets (images / videos / audio). Unlike the ``avatar`` and
        ``image`` modes there is no script or voice; motion and speech are
        driven entirely by the prompt and the supplied references. Backed by
        the Seedance generation pipeline.
      properties:
        type:
          const: cinematic_avatar
          description: Must be 'cinematic_avatar' for prompt-and-reference video creation.
          title: Type
          type: string
        prompt:
          description: Natural-language prompt describing the video to generate.
          maxLength: 10000
          minLength: 1
          title: Prompt
          type: string
        avatar_id:
          description: >-
            Avatar look ID(s) used as visual references. Provide 1 to 3 look
            IDs.
          items:
            type: string
          title: Avatar Id
          type: array
        references:
          anyOf:
            - items:
                discriminator:
                  mapping:
                    asset_id:
                      $ref: '#/components/schemas/AssetId'
                    base64:
                      $ref: '#/components/schemas/AssetBase64'
                    url:
                      $ref: '#/components/schemas/AssetUrl'
                  propertyName: type
                oneOf:
                  - $ref: '#/components/schemas/AssetUrl'
                  - $ref: '#/components/schemas/AssetId'
                  - $ref: '#/components/schemas/AssetBase64'
              type: array
            - type: 'null'
          default: null
          description: >-
            Reference assets (images, videos, or audio) guiding the generation.
            Each accepts a URL, an asset_id, or inline base64. Combined limits:
            at most 3 videos and 9 images across avatars and references.
          title: References
        aspect_ratio:
          $ref: '#/components/schemas/VideoAspectRatio'
          default: '16:9'
          description: >-
            Output aspect ratio. Supported for cinematic_avatar: '16:9', '9:16',
            '1:1'. Defaults to '16:9'.
        resolution:
          default: 720p
          description: >-
            Output resolution. Supported for cinematic_avatar: '720p', '1080p'.
            Defaults to '720p'.
          enum:
            - 720p
            - 1080p
          title: Resolution
          type: string
        auto_duration:
          default: false
          description: Let the model choose the video length. When true, omit duration.
          title: Auto Duration
          type: boolean
        duration:
          anyOf:
            - maximum: 15
              minimum: 4
              type: integer
            - type: 'null'
          default: null
          description: >-
            Video length in seconds (4–15). Defaults to 10. Omit when
            auto_duration is true.
          title: Duration
        enhance_prompt:
          default: false
          description: Enable server-side prompt enhancement.
          title: Enhance Prompt
          type: boolean
        title:
          anyOf:
            - type: string
            - type: 'null'
          default: null
          description: Display title for the video in the HeyGen dashboard.
          title: Title
      required:
        - type
        - prompt
        - avatar_id
      title: CreateVideoFromCinematicAvatar
      type: object
    CreateVideoFromImage:
      additionalProperties: false
      description: |-
        Create a video by animating an arbitrary image.

        Provide an image via URL, asset ID, or inline base64. The image will be
        animated with lip-sync to the provided audio or generated speech.
      properties:
        title:
          anyOf:
            - type: string
            - type: 'null'
          default: null
          description: Display title for the video in the HeyGen dashboard.
          title: Title
        resolution:
          anyOf:
            - $ref: '#/components/schemas/VideoResolution'
            - type: 'null'
          default: null
          description: Output video resolution.
        aspect_ratio:
          anyOf:
            - $ref: '#/components/schemas/VideoAspectRatio'
            - type: 'null'
          default: '16:9'
          description: >-
            Output video aspect ratio. Supported values: '16:9', '9:16', '4:5',
            '5:4', '1:1', 'auto'. Defaults to '16:9'. 'auto' preserves the
            source's aspect ratio (avatar source frames or uploaded image),
            short-edge anchored to the requested resolution and capped at the
            tier's long edge. Falls back to '16:9' when source dimensions can't
            be read.
          x-cli-default: auto
          x-mcp-default: auto
        fit:
          anyOf:
            - $ref: '#/components/schemas/AvatarFit'
            - type: 'null'
          default: null
          description: >-
            How the subject is fitted to the output canvas. 'cover' scales to
            fill the frame (may crop edges). 'contain' scales to fit entirely
            within the frame (may show background). When omitted, the server
            picks the best option based on the source and canvas orientations.
        background:
          anyOf:
            - $ref: '#/components/schemas/BackgroundSetting'
            - type: 'null'
          default: null
          description: Background settings for the video.
        remove_background:
          anyOf:
            - type: boolean
            - type: 'null'
          default: null
          description: >-
            Remove the avatar background. Video avatars must be trained with
            matting enabled.
          title: Remove Background
        callback_url:
          anyOf:
            - type: string
            - type: 'null'
          default: null
          description: Webhook URL to receive a POST notification when the video is ready.
          title: Callback Url
        callback_id:
          anyOf:
            - type: string
            - type: 'null'
          default: null
          description: Caller-defined identifier echoed back in the webhook payload.
          title: Callback Id
        watermark:
          anyOf:
            - $ref: '#/components/schemas/WatermarkInput'
            - type: 'null'
          default: null
          description: >-
            Custom watermark image to overlay on the video (PNG or JPEG).
            Available as a premium option for select Enterprise customers. To
            request access, please contact our support team.
          x-cli-visible: false
          x-mcp-visible: false
        caption:
          anyOf:
            - $ref: '#/components/schemas/CaptionSetting'
            - type: 'null'
          default: null
          description: >-
            Caption generation settings. A sidecar subtitle file is always
            returned via subtitle_url; set 'style' to additionally burn captions
            into the rendered video.
        output_format:
          $ref: '#/components/schemas/VideoOutputFormat'
          default: mp4
          description: >-
            Output container. 'webm' returns a video with a transparent
            background (alpha channel); 'mp4' (default) returns a standard
            video. 'webm' requires an avatar that supports matting. When 'webm'
            is selected, any 'background' value is rejected and background
            removal is applied automatically — the caller does not need to set
            'remove_background'.
        script:
          anyOf:
            - minLength: 1
              type: string
            - type: 'null'
          default: null
          description: >-
            Text script to speak. Requires voice_id, because this request type
            has no avatar_id and therefore no default avatar voice to fall back
            on. Mutually exclusive with audio_url/audio_asset_id.
          title: Script
        voice_id:
          anyOf:
            - type: string
            - type: 'null'
          default: null
          description: >-
            Voice ID for text-to-speech. Required when script is provided; this
            request type has no avatar_id, so there is no default avatar voice
            to fall back on.
          title: Voice Id
        audio_url:
          anyOf:
            - type: string
            - type: 'null'
          default: null
          description: >-
            Public URL of an audio file to lip-sync. Mutually exclusive with
            script.
          title: Audio Url
        audio_asset_id:
          anyOf:
            - type: string
            - type: 'null'
          default: null
          description: >-
            HeyGen asset ID of an uploaded audio file. Mutually exclusive with
            script.
          title: Audio Asset Id
        voice_settings:
          anyOf:
            - $ref: '#/components/schemas/VoiceSettingsInput'
            - type: 'null'
          default: null
          description: Voice tuning parameters (speed, pitch, volume, locale, engine settings).
        type:
          const: image
          description: Must be 'image' for image-based video creation.
          title: Type
          type: string
        image:
          description: Image to animate. Accepts URL, asset ID, or base64-encoded data.
          discriminator:
            mapping:
              asset_id:
                $ref: '#/components/schemas/AssetId'
              base64:
                $ref: '#/components/schemas/AssetBase64'
              url:
                $ref: '#/components/schemas/AssetUrl'
            propertyName: type
          oneOf:
            - $ref: '#/components/schemas/AssetUrl'
            - $ref: '#/components/schemas/AssetId'
            - $ref: '#/components/schemas/AssetBase64'
          title: Image
        motion_prompt:
          anyOf:
            - type: string
            - type: 'null'
          default: null
          description: >-
            Natural-language prompt controlling avatar body motion. Photo
            avatars only.
          title: Motion Prompt
        expressiveness:
          anyOf:
            - $ref: '#/components/schemas/Expressiveness'
            - type: 'null'
          default: null
          description: >-
            Avatar expressiveness level. Photo avatars only. Defaults to 'low'
            when omitted.
      required:
        - type
        - image
      title: CreateVideoFromImage
      type: object
    VideoOutputFormat:
      description: Output container for the generated video.
      enum:
        - mp4
        - webm
      title: VideoOutputFormat
      type: string
    VideoResolution:
      description: Output video resolution.
      enum:
        - 4k
        - 1080p
        - 720p
      title: VideoResolution
      type: string
    VideoAspectRatio:
      description: |-
        Output video aspect ratio.

        - ``16:9`` / ``9:16``: classic landscape / portrait.
        - ``4:5`` / ``5:4`` / ``1:1``: social-media-friendly ratios. Output is
          short-edge anchored to the requested resolution (e.g. ``1080p`` 1:1 →
          1080x1080, ``1080p`` 4:5 → 1080x1350).
        - ``auto``: preserve the source's aspect ratio. The dimensions are
          derived from the avatar's source frames (``avatar_id``) or the
          uploaded image (``image_url`` / ``image_asset_id``), short-edge
          anchored to the requested resolution and capped at the tier's long
          edge. Falls back to ``16:9`` when source dimensions can't be read.
      enum:
        - '16:9'
        - '9:16'
        - '4:5'
        - '5:4'
        - '1:1'
        - auto
      title: VideoAspectRatio
      type: string
    AvatarFit:
      description: How the avatar is scaled to the output canvas.
      enum:
        - contain
        - cover
      title: AvatarFit
      type: string
    BackgroundSetting:
      description: Background configuration for the generated video.
      properties:
        type:
          description: >-
            Background type. 'color' uses a solid hex color; 'image' uses an
            image from url or asset_id.
          enum:
            - color
            - image
          title: Type
          type: string
        value:
          anyOf:
            - type: string
            - type: 'null'
          default: null
          description: Hex color code (e.g. '#ff0000'). Required when type is 'color'.
          title: Value
        url:
          anyOf:
            - type: string
            - type: 'null'
          default: null
          description: >-
            URL of the background image. Used when type is 'image'. Mutually
            exclusive with asset_id.
          title: Url
        asset_id:
          anyOf:
            - type: string
            - type: 'null'
          default: null
          description: >-
            HeyGen asset ID of the background image. Used when type is 'image'.
            Mutually exclusive with url.
          title: Asset Id
      required:
        - type
      title: BackgroundSetting
      type: object
    WatermarkInput:
      additionalProperties: false
      description: Watermark configuration for video creation.
      properties:
        image:
          description: Image asset to use as the watermark overlay (PNG or JPEG).
          discriminator:
            mapping:
              asset_id:
                $ref: '#/components/schemas/AssetId'
              base64:
                $ref: '#/components/schemas/AssetBase64'
              url:
                $ref: '#/components/schemas/AssetUrl'
            propertyName: type
          oneOf:
            - $ref: '#/components/schemas/AssetUrl'
            - $ref: '#/components/schemas/AssetId'
            - $ref: '#/components/schemas/AssetBase64'
          title: Image
        scale:
          default: 1
          description: >-
            Scale multiplier for the watermark image. 1.0 renders at native
            size.
          exclusiveMinimum: 0
          maximum: 2
          title: Scale
          type: number
        opacity:
          default: 1
          description: Watermark opacity. 0.0 is fully transparent, 1.0 is fully opaque.
          maximum: 1
          minimum: 0
          title: Opacity
          type: number
        placement:
          anyOf:
            - $ref: '#/components/schemas/WatermarkPlacement'
            - type: 'null'
          default: null
          description: >-
            Watermark placement. Defaults to bottom-right with standard margins
            when omitted.
      required:
        - image
      title: WatermarkInput
      type: object
    CaptionSetting:
      description: >-
        Caption generation settings for video creation.


        A sidecar subtitle file is always generated and returned via
        ``subtitle_url`` in the chosen ``file_format``. When ``style`` is also
        set, captions are additionally burned into the rendered video; the
        sidecar is still delivered.
      properties:
        file_format:
          $ref: '#/components/schemas/CaptionFileFormat'
          default: srt
          description: Output format for the sidecar caption file.
        style:
          anyOf:
            - $ref: '#/components/schemas/CaptionStyle'
            - type: 'null'
          default: null
          description: >-
            Visual style for burning captions into the rendered video. Omit for
            sidecar-only captions.
      title: CaptionSetting
      type: object
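    # Illustrative sketch (not part of the schema): the two CaptionSetting
    # shapes described above, written as YAML for readability.
    #
    # Sidecar-only captions ('style' omitted, so nothing is burned in):
    #   file_format: srt
    #
    # Sidecar plus burned-in captions:
    #   file_format: srt
    #   style: default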
    VoiceSettingsInput:
      description: >-
        Voice tuning parameters for text-to-speech.


        Applies only when 'script' and 'voice_id' are provided, not when
        audio_url/audio_asset_id is used (uploaded audio bypasses TTS).
      properties:
        speed:
          default: 1
          description: Playback speed multiplier. 0.5 (half speed) to 1.5 (1.5x speed).
          maximum: 1.5
          minimum: 0.5
          title: Speed
          type: number
        pitch:
          default: 0
          description: Pitch adjustment in semitones. -50 to +50.
          maximum: 50
          minimum: -50
          title: Pitch
          type: number
        volume:
          default: 1
          description: >-
            Voice audio volume. 1.0 = full, 0.0 = silent. Useful when mixing
            spoken voice with background audio.
          maximum: 1
          minimum: 0
          title: Volume
          type: number
        locale:
          anyOf:
            - type: string
            - type: 'null'
          default: null
          description: Locale/accent hint for multi-lingual voices (e.g. 'en-US').
          title: Locale
        engine_settings:
          anyOf:
            - discriminator:
                mapping:
                  elevenlabs:
                    $ref: '#/components/schemas/ElevenLabsEngineSettings'
                  fish:
                    $ref: '#/components/schemas/FishEngineSettings'
                  starfish:
                    $ref: '#/components/schemas/StarfishEngineSettings'
                propertyName: engine_type
              oneOf:
                - $ref: '#/components/schemas/ElevenLabsEngineSettings'
                - $ref: '#/components/schemas/FishEngineSettings'
                - $ref: '#/components/schemas/StarfishEngineSettings'
            - type: 'null'
          default: null
          description: >-
            Engine-specific voice tuning, discriminated by 'engine_type'. Use
            the variant matching the engine backing the chosen voice (e.g.
            engine_type='elevenlabs' for ElevenLabs-backed voices). The request
            is rejected if the voice_id is not compatible with the selected
            engine.
          title: Engine Settings
      title: VoiceSettingsInput
      type: object
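    # Illustrative sketch (not part of the schema): a VoiceSettingsInput value
    # for a voice assumed to be ElevenLabs-backed, written as YAML for
    # readability. All numbers are placeholders inside the documented ranges
    # (speed 0.5..1.5, pitch -50..50, volume 0..1).
    #
    #   speed: 1.1
    #   pitch: 0
    #   volume: 0.9
    #   locale: en-US
    #   engine_settings:
    #     engine_type: elevenlabs          # must match the engine backing the voice
    #     model: eleven_multilingual_v2
    #     stability: 0.5
    #     similarity_boost: 0.75
    #
    # Every field is optional; an empty object keeps the defaults shown above.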
    Expressiveness:
      description: Avatar expressiveness level for photo avatars.
      enum:
        - high
        - medium
        - low
      title: Expressiveness
      type: string
    AvatarIIIEngineConfig:
      additionalProperties: false
      description: |-
        Avatar III engine configuration.

        A single engine value that resolves to the right product based on the
        avatar's look type (mirroring how ``avatar_iv`` already serves both
        photo and video avatars):

        - Video avatar looks (``digital_twin``, ``studio_avatar``) -> Digital Twin (supports 4K)
        - ``photo_avatar`` look -> Photo Avatar (no 4K output)

        Not supported for raw image input (``type: "image"``). ``motion_prompt``
        and ``expressiveness`` are not supported with this engine.
      properties:
        type:
          const: avatar_iii
          description: >-
            Engine type discriminator. Must be 'avatar_iii'. Resolves to Digital
            Twin for video avatar looks (digital_twin, studio_avatar) and Photo
            Avatar for photo_avatar looks; not supported for raw image input.
            Check supported_api_engines on the avatar look to confirm
            eligibility.
          title: Type
          type: string
      required:
        - type
      title: AvatarIIIEngineConfig
      type: object
    AvatarIVEngineConfig:
      additionalProperties: false
      description: Avatar IV engine configuration (default behavior).
      properties:
        type:
          const: avatar_iv
          description: Engine type discriminator. Must be 'avatar_iv'.
          title: Type
          type: string
      required:
        - type
      title: AvatarIVEngineConfig
      type: object
    AvatarVEngineConfig:
      additionalProperties: false
      description: Avatar V engine configuration with cross-reference-driven animation.
      properties:
        type:
          const: avatar_v
          description: >-
            Engine type discriminator. Must be 'avatar_v'. Check
            supported_api_engines on the avatar look to confirm eligibility.
          title: Type
          type: string
        reference_look_id:
          anyOf:
            - type: string
            - type: 'null'
          default: null
          description: >-
            Optional instant_avatar look to use as the animation reference. Must
            be an `instant_avatar` look (studio / photo / other look types are
            rejected) belonging to the same avatar group as `avatar_id`. When
            omitted, video avatars self-reference and photo avatars auto-select
            the best instant_avatar sibling in their group.
          title: Reference Look Id
      required:
        - type
      title: AvatarVEngineConfig
      type: object
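    # Illustrative sketch (not part of the schema): one engine config value per
    # engine, written as YAML for readability. The look ID is a hypothetical
    # placeholder, not a real identifier.
    #
    # Avatar III (no extra properties are allowed):
    #   type: avatar_iii
    #
    # Avatar IV:
    #   type: avatar_iv
    #
    # Avatar V with an explicit animation reference:
    #   type: avatar_v
    #   reference_look_id: <instant_avatar look ID from the same avatar group>
    #
    # Because each config sets additionalProperties to false, sending
    # 'reference_look_id' together with 'avatar_iii' or 'avatar_iv' would not
    # validate against these schemas.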
    AssetId:
      additionalProperties: false
      description: Asset input via HeyGen asset ID from the asset upload endpoint.
      properties:
        type:
          const: asset_id
          description: Input type discriminator
          title: Type
          type: string
        asset_id:
          description: HeyGen asset ID from the asset upload endpoint
          title: Asset Id
          type: string
      required:
        - type
        - asset_id
      title: AssetId
      type: object
    AssetBase64:
      additionalProperties: false
      description: Asset input via base64-encoded content.
      properties:
        type:
          const: base64
          description: Input type discriminator
          title: Type
          type: string
        media_type:
          description: MIME type of the encoded content (e.g. "image/png")
          title: Media Type
          type: string
        data:
          description: Base64-encoded file content
          title: Data
          type: string
      required:
        - type
        - media_type
        - data
      title: AssetBase64
      type: object
      x-mcp-visible: false
    AssetUrl:
      additionalProperties: false
      description: Asset input via publicly accessible HTTPS URL.
      properties:
        type:
          const: url
          description: Input type discriminator
          title: Type
          type: string
        url:
          description: Publicly accessible HTTPS URL for the asset
          title: Url
          type: string
      required:
        - type
        - url
      title: AssetUrl
      type: object
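    # Illustrative sketch (not part of the schema): the three asset input
    # variants, selected by the 'type' discriminator and written as YAML for
    # readability. All values are placeholders.
    #
    # By URL:
    #   type: url
    #   url: https://example.com/portrait.png
    #
    # By asset ID:
    #   type: asset_id
    #   asset_id: <ID returned by the asset upload endpoint>
    #
    # By inline base64 content:
    #   type: base64
    #   media_type: image/png
    #   data: <base64-encoded file content>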
    WatermarkPlacement:
      additionalProperties: false
      description: Watermark placement configuration.
      properties:
        position:
          $ref: '#/components/schemas/WatermarkPosition'
          default: bottom_right
          description: Anchor corner for the watermark.
        offset_x:
          anyOf:
            - maximum: 1
              minimum: -1
              type: number
            - type: 'null'
          default: null
          description: >-
            Fine-tune horizontal position. Fraction of frame width; 0.05 shifts
            5% rightward, -0.05 shifts 5% leftward.
          title: Offset X
        offset_y:
          anyOf:
            - maximum: 1
              minimum: -1
              type: number
            - type: 'null'
          default: null
          description: >-
            Fine-tune vertical position. Fraction of frame height; 0.05 shifts
            5% downward, -0.05 shifts 5% upward.
          title: Offset Y
      title: WatermarkPlacement
      type: object
    CaptionFileFormat:
      description: Supported caption file output formats.
      enum:
        - srt
      title: CaptionFileFormat
      type: string
    CaptionStyle:
      description: Visual style applied when burning captions into the rendered video.
      enum:
        - default
      title: CaptionStyle
      type: string
    ElevenLabsEngineSettings:
      description: >-
        Engine-specific voice settings for ElevenLabs-backed voices.


        Inherits the ElevenLabs tuning fields (model, stability,
        similarity_boost, style, use_speaker_boost) along with the eleven_v3
        stability validator from
        `movio.api_service.app.api_types.video.ElevenLabsSettings`.
      properties:
        model:
          anyOf:
            - $ref: '#/components/schemas/ElevenLabsModel'
            - type: 'null'
          default: null
          description: The model ID to use for ElevenLabs.
        similarity_boost:
          anyOf:
            - maximum: 1
              minimum: 0
              type: number
            - type: 'null'
          default: null
          description: The similarity boost parameter for ElevenLabs.
          title: Similarity Boost
        stability:
          anyOf:
            - maximum: 1
              minimum: 0
              type: number
            - type: 'null'
          default: null
          description: The stability parameter for ElevenLabs.
          title: Stability
        style:
          anyOf:
            - maximum: 1
              minimum: 0
              type: number
            - type: 'null'
          default: null
          description: The style parameter for ElevenLabs.
          title: Style
        use_speaker_boost:
          anyOf:
            - type: boolean
            - type: 'null'
          default: null
          description: Whether to use speaker boost for ElevenLabs.
          title: Use Speaker Boost
        engine_type:
          const: elevenlabs
          description: >-
            Engine type discriminator. Must be 'elevenlabs' for
            ElevenLabs-backed voices.
          title: Engine Type
          type: string
      required:
        - engine_type
      title: ElevenLabsEngineSettings
      type: object
    FishEngineSettings:
      description: |-
        Engine-specific voice settings for Fish Audio-backed voices.

        Inherits Fish's tuning fields (model, stability, similarity).
      properties:
        model:
          anyOf:
            - $ref: '#/components/schemas/FishModel'
            - type: 'null'
          default: null
          description: Fish Audio model version (default 's1').
        stability:
          anyOf:
            - maximum: 1
              minimum: 0
              type: number
            - type: 'null'
          default: null
          description: Stability parameter; higher is more consistent.
          title: Stability
        similarity:
          anyOf:
            - maximum: 1
              minimum: 0
              type: number
            - type: 'null'
          default: null
          description: Similarity parameter; how closely to match the source voice.
          title: Similarity
        engine_type:
          const: fish
          description: >-
            Engine type discriminator. Must be 'fish' for Fish Audio-backed
            voices.
          title: Engine Type
          type: string
      required:
        - engine_type
      title: FishEngineSettings
      type: object
    StarfishEngineSettings:
      description: >-
        Engine selection for Starfish-backed voices.


        Starfish has no user-tunable settings today; set
        ``engine_type='starfish'`` to force Starfish routing on voices that
        support multiple engines.
      properties:
        engine_type:
          const: starfish
          description: >-
            Engine type discriminator. Must be 'starfish' for Starfish-backed
            voices.
          title: Engine Type
          type: string
      required:
        - engine_type
      title: StarfishEngineSettings
      type: object
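    # Illustrative sketch (not part of the schema): 'engine_settings' values
    # for the Fish and Starfish variants, written as YAML for readability.
    # The numbers are placeholders within the documented 0..1 ranges.
    #
    # Fish Audio-backed voice:
    #   engine_type: fish
    #   model: s1
    #   stability: 0.6
    #   similarity: 0.8
    #
    # Starfish-backed voice (only the discriminator is defined):
    #   engine_type: starfish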
    WatermarkPosition:
      description: Anchor corner for a custom watermark overlay.
      enum:
        - top_left
        - top_right
        - bottom_left
        - bottom_right
      title: WatermarkPosition
      type: string
    ElevenLabsModel:
      description: >-
        ElevenLabs model IDs exposed on the public API.


        Only current models are included. Deprecated models (monolingual_v1,
        multilingual_v1, turbo_v2) are not accepted. The web app auto-remaps
        them to newer equivalents; the API does not offer models that are no
        longer recommended.
      enum:
        - eleven_multilingual_v2
        - eleven_turbo_v2_5
        - eleven_flash_v2_5
        - eleven_v3
      title: ElevenLabsModel
      type: string
    FishModel:
      description: >-
        Fish Audio model version. Mirrors the choices exposed on the web
        (FISH_MODELS).
      enum:
        - s1
        - s2-pro
      title: FishModel
      type: string
  responses:
    IdempotencyInProgress:
      description: >-
        A prior request with this Idempotency-Key is still in progress. Wait for
        the original request to complete, then retry.
      content:
        application/json:
          schema:
            type: object
            properties:
              error:
                $ref: '#/components/schemas/StandardAPIError'
          example:
            error:
              code: request_in_progress
              message: >-
                A request with this Idempotency-Key is already in progress.
                Retry shortly.
              param: null
              doc_url: null
  securitySchemes:
    ApiKeyAuth:
      type: apiKey
      in: header
      name: x-api-key
      description: HeyGen API key. Obtain from your HeyGen dashboard.
    BearerAuth:
      type: http
      scheme: bearer
      description: OAuth2 bearer token.
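      # Illustrative sketch (not part of the schema): the request header each
      # scheme implies. Send one or the other; the values are placeholders.
      #
      # ApiKeyAuth:
      #   x-api-key: <your HeyGen API key>
      #
      # BearerAuth (standard HTTP bearer form):
      #   Authorization: Bearer <OAuth2 access token>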

````