heygen-avatar - HeyGen Documentation

Skill: heygen-avatar · Invoke: /heygen-avatar [name_or_description] · Source

heygen-avatar creates and manages persistent HeyGen avatars — a reusable face + voice identity powered by HeyGen Avatar V. It handles identity extraction, avatar generation, voice selection, and saves everything to an AVATAR-<NAME>.md file so every downstream video reuses the same look and voice. This is the correct first step when no avatar exists yet. It returns an avatar_id and voice_id you pass directly to heygen-video.

When to Use

Give the agent a face + voice so it can present videos — “bring yourself to life”, “create your avatar”, “design a presenter”.
Put the user in videos as themselves — “create my avatar”, “a digital twin of me”, “I want my face in a video”.
Build a named character presenter — “create an avatar called Cleo”.
Establish identity before making videos — run this, then chain into heygen-video.

Not for: generating videos (use heygen-video), translating videos (use heygen-translate), or TTS-only tasks.

Creation Modes

Two creation types, two scoping modes — covering everything from agent personas to real-person twins.

Type A — From prompt (default)

Generates the avatar's appearance with AI from a text description. This is the default path for agents and named characters. Signature: create_prompt_avatar(name, prompt, avatar_id?, avatar_group_id?).

Type B — From reference photo

Builds a real-person digital twin from an uploaded image (URL, asset_id, or base64). This path is opt-in, for when the user wants a photorealistic result. Signature: create_photo_avatar(name, file, avatar_group_id?).

Scoping mode	Behavior
New character (omit `avatar_id` / `avatar_group_id`)	Creates a brand-new character with its own group.
New look (prompt: pass a look’s `avatar_id` · photo: include `avatar_group_id`)	Adds a variation — outfit, pose, orientation — to an existing character. The default for iteration.

For prompt avatars, avatar_id (a look ID) is the visual reference that keeps the character’s identity consistent — the new look saves to that avatar’s group automatically. avatar_group_id only selects which group the result is saved to; it does not condition the generation. See Create Avatar for the full behavior matrix.

The Group ID is the stable character identity and never changes. Individual look IDs are ephemeral, so always resolve them fresh from the group at runtime. In an AVATAR file, the Group ID is the only avatar identifier to treat as durable; any look IDs recorded alongside it may be stale.
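As a rough illustration of the two scoping modes, the sketch below assembles keyword arguments for a prompt-avatar call. The helper prompt_avatar_args and its reference_look_id parameter are hypothetical names introduced here; only name, prompt, avatar_id, and avatar_group_id come from the signature quoted under Type A, and the look ID used in the example is a placeholder.

```python
from typing import Optional


def prompt_avatar_args(
    name: str,
    prompt: str,
    reference_look_id: Optional[str] = None,
    avatar_group_id: Optional[str] = None,
) -> dict:
    """Build keyword arguments for a create_prompt_avatar call.

    Hypothetical helper: it only decides which optional keys to include.
    """
    args = {"name": name, "prompt": prompt}
    if reference_look_id is not None:
        # New look: the look ID is the visual reference, and the result
        # is saved to that look's group automatically.
        args["avatar_id"] = reference_look_id
    if avatar_group_id is not None:
        # Only selects the destination group; per the text above it does
        # not condition the generation.
        args["avatar_group_id"] = avatar_group_id
    return args


# New character: both optional IDs are omitted.
new_character = prompt_avatar_args("Cleo", "a warm documentary narrator")

# New look: pass an existing look's ID (placeholder value).
new_look = prompt_avatar_args(
    "Cleo", "same character, cinematic lighting", reference_look_id="look_123"
)
```

Keeping the optional keys out of the dictionary entirely, rather than passing None, mirrors the "omit" wording in the scoping table.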

Workflow

Who are we creating for?

Routes to agent (default), user (explicit “my”/“me”), or named character. Reads workspace SOUL.md / IDENTITY.md for the agent’s identity before asking anything.

Identity extraction

Pulls appearance and voice traits from workspace files first; asks the user conversationally only for what’s genuinely missing — one or two questions at a time, never a form.

Avatar creation

Builds the prompt (or uploads the photo), shows it to the user for approval, then calls the creation API. Identity traits map to HeyGen enums (age, gender, ethnicity, style, orientation, pose).

Voice

Two paths: Design (describe the voice and get 3 semantic matches with audio previews) or Browse (filter the catalog by language + gender). Both are language-aware and match the user’s detected language. The skill waits for the user to pick.

Save & maintain aliases

Writes the AVATAR-<NAME>.md file with Group ID, Voice ID, and looks. Maintains AVATAR-AGENT.md / AVATAR-USER.md symlinks so consumer skills resolve identity generically.

Test (optional)

Generates a short greeting video in the user’s language to preview the avatar in action.
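The alias maintenance in the "Save & maintain aliases" step could look roughly like the sketch below. The helper name point_alias is hypothetical, and the skill's actual implementation is not shown in this page; the sketch only assumes that the alias is a symlink in the workspace root pointing at an AVATAR-<NAME>.md file in the same directory.

```python
import os
from pathlib import Path


def point_alias(workspace: Path, alias: str, avatar_file: str) -> Path:
    """Point an alias such as AVATAR-AGENT.md at an AVATAR-<NAME>.md file."""
    if not (workspace / avatar_file).is_file():
        raise FileNotFoundError(avatar_file)
    link = workspace / alias
    if link.is_symlink():
        # Replace an existing alias, including a dangling one.
        link.unlink()
    elif link.exists():
        # Refuse to overwrite a regular file that happens to use the alias name.
        raise FileExistsError(f"{alias} exists and is not a symlink")
    # A relative target keeps the link valid if the workspace is moved.
    os.symlink(avatar_file, link)
    return link
```

Checking is_symlink() before exists() matters because exists() follows the link and reports False for a dangling alias.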

Identity Field Enums

When building a prompt avatar, identity traits map to these HeyGen enums:

Field	Values
age	Young Adult · Early Middle Age · Late Middle Age · Senior · Unspecified
gender	Man · Woman · Unspecified
ethnicity	White · Black · Asian American · East Asian · South East Asian · South Asian · Middle Eastern · Pacific · Hispanic · Unspecified
style	Realistic · Pixar · Cinematic · Vintage · Noir · Cyberpunk · Unspecified
orientation	square · horizontal · vertical
pose	half_body · close_up · full_body
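A minimal sketch of checking extracted traits against the table above before building a prompt avatar. The allowed values are copied from the table; the IDENTITY_ENUMS constant and the validate_traits helper are hypothetical names, not part of the skill.

```python
IDENTITY_ENUMS = {
    "age": {"Young Adult", "Early Middle Age", "Late Middle Age", "Senior", "Unspecified"},
    "gender": {"Man", "Woman", "Unspecified"},
    "ethnicity": {
        "White", "Black", "Asian American", "East Asian", "South East Asian",
        "South Asian", "Middle Eastern", "Pacific", "Hispanic", "Unspecified",
    },
    "style": {"Realistic", "Pixar", "Cinematic", "Vintage", "Noir", "Cyberpunk", "Unspecified"},
    "orientation": {"square", "horizontal", "vertical"},
    "pose": {"half_body", "close_up", "full_body"},
}


def validate_traits(traits: dict) -> list:
    """Return a list of problems; an empty list means every trait is valid."""
    problems = []
    for field, value in traits.items():
        allowed = IDENTITY_ENUMS.get(field)
        if allowed is None:
            problems.append(f"unknown field: {field}")
        elif value not in allowed:
            problems.append(f"{field}: {value!r} is not one of {sorted(allowed)}")
    return problems
```

Matching is case-sensitive here, which is an assumption; note that orientation and pose use lowercase values while the other fields use title case.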

The AVATAR File

Every avatar gets one AVATAR-<NAME>.md file at the workspace root — the single source of truth that consumer skills read.

# Avatar: <Name>

## Appearance
- Age / Gender / Ethnicity / Hair / Build / Features / Style

## Voice
- Tone / Accent / Energy / Think (one-line analogy)

## HeyGen
- Group ID: <stable character anchor — never changes>
- Voice ID / Voice Name / Voice Designed / Voice Seed
- Looks: landscape=<look_id>, portrait=<look_id>, square=<look_id>
- Last Synced: <ISO timestamp>

The top sections are portable natural language any platform can use; the HeyGen section is runtime config that skills read to make API calls.
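A consumer skill might read the runtime config along these lines. This is a sketch that assumes each entry in a real file is a "- Key: value" bullet under the "## HeyGen" heading (the template above abbreviates some entries); read_heygen_section is a hypothetical helper, and the IDs in the sample are placeholders.

```python
import re


def read_heygen_section(text: str) -> dict:
    """Collect '- Key: value' bullets found under the '## HeyGen' heading."""
    config = {}
    in_section = False
    for line in text.splitlines():
        if line.startswith("## "):
            # Track whether the current second-level section is the HeyGen one.
            in_section = line.strip() == "## HeyGen"
            continue
        if not in_section:
            continue
        # Split on the first colon only, so timestamp values stay intact.
        match = re.match(r"-\s*([^:]+):\s*(.*)$", line.strip())
        if match:
            config[match.group(1).strip()] = match.group(2).strip()
    return config


sample = """# Avatar: Cleo

## Voice
- Tone: warm

## HeyGen
- Group ID: group_placeholder
- Voice ID: voice_placeholder
- Last Synced: 2025-01-01T00:00:00Z
"""

config = read_heygen_section(sample)
```

Only the Group ID and Voice ID would then be passed on; look IDs are better resolved from the group at call time, as noted under Creation Modes.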

Example Prompts

Prompt	What happens
“Bring yourself to life — create your own avatar.”	Reads identity files, builds the agent’s prompt avatar, designs a voice.
“Create my avatar — I have a headshot. What look works for a founder intro?”	Asks discovery questions, uploads the photo, recommends setting and tone.
“Create an avatar called Cleo, a warm documentary narrator.”	Builds a named character with a matched voice.
“Give me a new look — something more cinematic.”	Adds a look under the existing group (the New look scoping mode) and updates the AVATAR file.

View the full SKILL.md

Includes the complete workflow, reference docs for avatar creation, asset routing, and troubleshooting.
