Generate Speech
Synthesize speech audio from text using a specified voice. The voice must support the starfish engine; use GET /v3/voices?engine=starfish to find compatible voices. Input may be plain text or SSML, and the speed multiplier ranges from 0.5x to 2.0x. The response returns a URL to the generated audio file along with its duration and, optionally, word-level timestamps. See the Text to Speech guide.
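As a minimal sketch of the voice lookup mentioned above, the helper below only builds the listing URL with the engine filter. The path and query parameter come from this page; the API host is not stated here, so the base URL is a caller-supplied assumption and `api.example.com` is a placeholder.

```python
from urllib.parse import urlencode


def voices_url(base_url: str, engine: str = "starfish") -> str:
    """Build the voice listing URL filtered by engine.

    base_url is an assumption: this page does not state the API host.
    """
    query = urlencode({"engine": engine})
    return f"{base_url.rstrip('/')}/v3/voices?{query}"


# Placeholder host, for illustration only.
print(voices_url("https://api.example.com"))
```

Send the request with whichever HTTP client you prefer, authenticated with your HeyGen API key.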
Authorizations
HeyGen API key. Obtain from your HeyGen dashboard.
Body
Request body for text-to-speech generation.
Text to synthesize (1-5000 characters).
Voice ID to use. The voice must support the starfish engine. Filter compatible voices by passing engine=starfish to the voice listing endpoint.
Type of the input: 'text' for plain text, 'ssml' for SSML markup. Defaults to 'text'.
Speed multiplier (0.5-2.0).
Base language code (e.g. 'en', 'pt', 'zh'). Optional; auto-detected from text when omitted.
BCP-47 locale tag (e.g. 'en-US', 'pt-BR'). When set, language is inferred from locale.
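The constraints above can be checked client-side before sending a request. The sketch below is illustrative only: the JSON field names (`text`, `voice_id`, `input_type`, `speed`, `language`, `locale`) are assumptions, since this page lists the field descriptions without their names. Check the API schema for the real keys.

```python
from typing import Optional


def build_tts_body(
    text: str,
    voice_id: str,
    input_type: str = "text",
    speed: Optional[float] = None,
    language: Optional[str] = None,
    locale: Optional[str] = None,
) -> dict:
    """Validate the documented limits and assemble a request body.

    All dictionary keys here are hypothetical field names.
    """
    if not 1 <= len(text) <= 5000:
        raise ValueError("text must be 1-5000 characters")
    if input_type not in ("text", "ssml"):
        raise ValueError("input_type must be 'text' or 'ssml'")
    if speed is not None and not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")

    body = {"text": text, "voice_id": voice_id, "input_type": input_type}
    if speed is not None:
        body["speed"] = speed
    if locale is not None:
        # Language is inferred from the locale, so only the locale is sent.
        body["locale"] = locale
    elif language is not None:
        body["language"] = language
    return body


print(build_tts_body("Hello there.", "my-voice-id", speed=1.25, locale="en-US"))
```

Optional fields are omitted when unset so that the server-side defaults and language auto-detection apply.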
Response
Successful response
Response payload for text-to-speech generation.
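A small sketch of reading the response into a typed object follows. The payload is described above as carrying an audio URL, a duration, and optional word-level timestamps, but the key names used here (`audio_url`, `duration`, `word_timestamps`) are assumptions, and the sample dictionary is made up for illustration rather than taken from a real response.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SpeechResult:
    audio_url: str
    duration: float
    word_timestamps: Optional[list] = None  # present only when returned


def parse_speech_response(payload: dict[str, Any]) -> SpeechResult:
    """Map a decoded JSON payload onto SpeechResult (hypothetical keys)."""
    return SpeechResult(
        audio_url=payload["audio_url"],
        duration=float(payload["duration"]),
        word_timestamps=payload.get("word_timestamps"),
    )


# Made-up sample payload, for illustration only.
sample = {"audio_url": "https://example.com/audio.mp3", "duration": 2.5}
result = parse_speech_response(sample)
print(result.audio_url, result.duration, result.word_timestamps)
```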

