# Stream Avatar Realtime Word Timestamps

> Opens a Server-Sent Events (`text/event-stream`) stream of per-word timestamps for a session. Each `data:` frame carries one `WordBatch` (`{"words": [{"word", "start", "end"}, ...]}`). Times are in seconds from the start of the streamed audio, the same convention as the OpenAI Whisper and ElevenLabs alignment APIs. Punctuation is emitted as its own word event. Each batch is capped at ~1s of audio or 10 words, whichever comes first. Late subscribers receive the full session history, and completed sessions are served from a durable snapshot. The stream ends with a single `event: end` frame carrying a `WordsEndEvent`.
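
The frame layout described above can be consumed with a small line-based parser. The following is a minimal sketch, not an official client: the helper names (`iter_sse_events`, `collect_words`, `is_punctuation`) are hypothetical, the sample timings are made up for illustration, and the `{}` body of the `end` frame is a placeholder because the `WordsEndEvent` fields are not documented on this page.

```python
import json
import unicodedata
from typing import Dict, Iterable, Iterator, List, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from already-decoded SSE lines."""
    event: str = "message"  # SSE default event name when no `event:` field is sent
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line dispatches the buffered frame
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):  # comment or keep-alive line
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips a single leading space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)
    # A frame that is never terminated by a blank line is dropped.


def collect_words(lines: Iterable[str]) -> List[Dict]:
    """Flatten every WordBatch into one list, stopping at the `end` frame."""
    words: List[Dict] = []
    for event, data in iter_sse_events(lines):
        if event == "end":
            break
        words.extend(json.loads(data)["words"])
    return words


def is_punctuation(token: str) -> bool:
    """True when every character of the token is Unicode punctuation."""
    return bool(token) and all(
        unicodedata.category(ch).startswith("P") for ch in token
    )


# Illustrative frames only; real timings and the end-frame payload will differ.
SAMPLE = [
    'data: {"words": [{"word": "Hello", "start": 0.0, "end": 0.4},'
    ' {"word": ",", "start": 0.4, "end": 0.45}]}',
    "",
    'data: {"words": [{"word": "world", "start": 0.5, "end": 0.9}]}',
    "",
    "event: end",
    "data: {}",
    "",
]

words = collect_words(SAMPLE)
spoken = [w["word"] for w in words if not is_punctuation(w["word"])]
print(spoken)  # ['Hello', 'world']
```

Because late subscribers receive the full session history, a client that reconnects may prefer to rebuild its word list from scratch rather than append to an earlier one.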



## OpenAPI

````yaml /openapi/external-api.json get /v3/avatar-realtime/{stream_id}/words
openapi: 3.1.0
info:
  title: HeyGen External API
  version: 1.0.0
  description: >-
    HeyGen's external API for programmatic AI video creation. See
    https://docs.heygen.com for full documentation.
  contact:
    name: HeyGen Product Infra
    url: https://heygen.com
servers:
  - url: https://api.heygen.com
    description: Production
security:
  - ApiKeyAuth: []
  - BearerAuth: []
tags:
  - name: Video Agent
    description: Create videos from text prompts using AI
  - name: Videos
    description: Create, list, retrieve, and delete videos
  - name: Voices
    description: Text-to-speech and voice management
  - name: Audio
    description: Search the background-music and sound-effects catalog
  - name: Video Translate
    description: Translate videos into other languages
  - name: User
    description: Account information and billing
  - name: Avatars
    description: List and manage avatars and looks
  - name: Avatar Realtime
    description: >-
      Low-latency streaming avatar sessions — create a stream, poll for its HLS
      URL, push text, consume per-word timestamps
  - name: Assets
    description: Upload files for use in video creation
  - name: Webhooks
    description: Manage webhook endpoints and events
  - name: Lipsync
    description: Dub or replace audio on existing videos
  - name: Brand
    description: >-
      Brand-related resources — brand kits (colors, fonts, logos) and brand
      glossaries (custom term translations)
  - name: HyperFrames
    description: Render HyperFrames composition zips into video — separate from /v3/videos
  - name: AI Clipping
    description: Turn long-form videos into ready-to-share short clips with captions
paths:
  /v3/avatar-realtime/{stream_id}/words:
    get:
      tags:
        - Avatar Realtime
      summary: Stream Avatar Realtime Word Timestamps
      description: >-
        Opens a Server-Sent Events (`text/event-stream`) stream of per-word
        timestamps for a session. Each `data:` frame carries one `WordBatch`
        (`{"words": [{"word", "start", "end"}, ...]}`). Times are in seconds
        from the start of the streamed audio, the same convention as the
        OpenAI Whisper and ElevenLabs alignment APIs. Punctuation is emitted
        as its own word event. Each batch is capped at ~1s of audio or 10
        words, whichever comes first. Late subscribers receive the full
        session history, and completed sessions are served from a durable
        snapshot. The stream ends with a single
        `event: end` frame carrying a `WordsEndEvent`.
      operationId: streamAvatarRealtimeWords
      parameters:
        - name: stream_id
          in: path
          required: true
          schema:
            type: string
          description: Streaming session identifier returned by POST /v3/avatar-realtime.
      responses:
        '200':
          description: >-
            Server-Sent Events stream (`text/event-stream`). Each `data:` frame
            carries one `WordBatch` JSON payload; the stream terminates with a
            single `event: end` frame whose data is a `WordsEndEvent`.
          content:
            text/event-stream:
              schema:
                $ref: '#/components/schemas/WordBatch'
        '400':
          description: Invalid request parameters
          content:
            application/json:
              schema:
                type: object
                properties:
                  error:
                    $ref: '#/components/schemas/StandardAPIError'
        '401':
          description: Authentication failed
          content:
            application/json:
              schema:
                type: object
                properties:
                  error:
                    $ref: '#/components/schemas/StandardAPIError'
              example:
                error:
                  code: authentication_failed
                  message: Invalid or expired API key. Verify your x-api-key header.
                  param: null
                  doc_url: null
        '404':
          description: Resource not found
          content:
            application/json:
              schema:
                type: object
                properties:
                  error:
                    $ref: '#/components/schemas/StandardAPIError'
              example:
                error:
                  code: not_found
                  message: Streaming session not found.
                  param: null
                  doc_url: null
        '429':
          description: Rate limit exceeded
          content:
            application/json:
              schema:
                type: object
                properties:
                  error:
                    $ref: '#/components/schemas/StandardAPIError'
              example:
                error:
                  code: rate_limit_exceeded
                  message: >-
                    Too many requests. Retry after the duration specified in the
                    Retry-After header.
                  param: null
                  doc_url: null
          headers:
            Retry-After:
              description: Seconds to wait before retrying
              schema:
                type: integer
      security:
        - ApiKeyAuth: []
        - BearerAuth: []
components:
  schemas:
    WordBatch:
      description: One SSE `data:` payload, a batch of words spanning at most ~1s of audio or 10 words.
      properties:
        words:
          description: Words in this batch, ordered by start time.
          items:
            $ref: '#/components/schemas/WordEvent'
          title: Words
          type: array
      required:
        - words
      title: WordBatch
      type: object
    StandardAPIError:
      type: object
      properties:
        code:
          type: string
          description: Machine-readable error code
          example: invalid_parameter
        message:
          type: string
          description: Human-readable error message
          example: Video not found
        param:
          type:
            - string
            - 'null'
          description: Which request field caused the error
        doc_url:
          type:
            - string
            - 'null'
          description: Link to error documentation
      required:
        - code
        - message
    WordEvent:
      description: |-
        One spoken word and its time bounds in the streamed audio.

        Punctuation (periods, commas, etc.) is emitted as its own event with its
        own time bounds — clients that don't care can filter them out. Times are
        in seconds, matching OpenAI Whisper and ElevenLabs alignment APIs.
      properties:
        word:
          description: >-
            The spoken token. May be a word, partial word, or a punctuation
            mark.
          title: Word
          type: string
        start:
          description: >-
            Word start time in seconds, measured from the beginning of the
            streamed audio.
          title: Start
          type: number
        end:
          description: >-
            Word end time in seconds, measured from the beginning of the
            streamed audio.
          title: End
          type: number
      required:
        - word
        - start
        - end
      title: WordEvent
      type: object
  securitySchemes:
    ApiKeyAuth:
      type: apiKey
      in: header
      name: x-api-key
      description: HeyGen API key. Obtain from your HeyGen dashboard.
    BearerAuth:
      type: http
      scheme: bearer
      description: OAuth2 bearer token.

````
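
As a rough illustration of the operation above, the request can be built with Python's standard library alone. This is a sketch under assumptions: `build_words_request` and `stream_lines` are hypothetical helper names, and the example sends the `x-api-key` header from the `ApiKeyAuth` scheme (an `Authorization: Bearer ...` header would be the `BearerAuth` equivalent).

```python
import urllib.parse
import urllib.request
from typing import Iterator

BASE_URL = "https://api.heygen.com"  # production server from the spec


def build_words_request(stream_id: str, api_key: str) -> urllib.request.Request:
    """Build the GET request for /v3/avatar-realtime/{stream_id}/words."""
    # Percent-encode the path parameter so unusual characters stay in one segment.
    encoded_id = urllib.parse.quote(stream_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/v3/avatar-realtime/{encoded_id}/words",
        headers={"x-api-key": api_key, "Accept": "text/event-stream"},
        method="GET",
    )


def stream_lines(req: urllib.request.Request) -> Iterator[str]:
    """Open the stream and yield decoded text lines as they arrive.

    Non-2xx responses (400, 401, 404, 429) raise urllib.error.HTTPError;
    for a 429, the Retry-After header is available on the exception's
    `headers` attribute.
    """
    with urllib.request.urlopen(req) as resp:
        for raw in resp:
            yield raw.decode("utf-8")


# Usage (performs a network call, so it is left commented out):
# req = build_words_request("<stream_id>", "<your-api-key>")
# for line in stream_lines(req):
#     print(line, end="")
```

The yielded lines are raw SSE text, so they can be passed to any SSE parser to recover the individual `WordBatch` payloads and the final `end` frame.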