Visemes

In VRChat, “visemes” are the 15 mouth‑shape targets that the Oculus LipSync library drives while you speak.

VRChat passes your live microphone audio through that library, reduces the result to a single integer (0‑14) every frame, and writes it to the built‑in Animator parameter Viseme (often shown in the docs as “VisemeOculus”). Your avatar’s FX layer (or the SDK’s default layer) turns that number into blend‑shape or bone animation so the mouth appears to pronounce your words in real time. For more detail, see the VRChat Creators Wiki.

Viseme Slots

Index | Short code | Typical phonemes   | What the mouth does
------+------------+--------------------+--------------------------------
0     | sil        | silence / pause    | Neutral, lips relaxed
1     | pp         | p, b, m            | Lips fully closed, slight pout
2     | ff         | f, v               | Lower lip touches upper teeth
3     | th         | th                 | Tongue between teeth
4     | dd         | t, d               | Tip of tongue touches ridge
5     | kk         | k, g               | Back of tongue touches palate
6     | ch         | ch, j              | Lips rounded, jaw slightly down
7     | ss         | s, z, sh           | Teeth almost together, lips wide
8     | nn         | n                  | Tongue presses ridge, lips apart
9     | rr         | r                  | Lips slightly rounded, cheeks firm
10    | aa         | a as in “father”   | Jaw open, oval mouth
11    | e          | eh as in “men”     | Mouth wider, mid‑open
12    | i / ih     | ee as in “meet”    | Lips stretched, jaw higher
13    | o / oh     | aw as in “bought”  | Lips rounded, jaw mid‑open
14    | u / ou     | oo as in “boot”    | Lips rounded, slightly forward

Oculus docs use the longer spellings ih/oh/ou; VRChat’s parameter list trims them to i/o/u.
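
As a rough illustration of how the index maps to a mouth shape, the slot table can be written as a plain lookup. This is a minimal sketch, not VRChat or Oculus code: the names VISEME_CODES, viseme_code and one_hot_weights are hypothetical, the long spellings ih/oh/ou are assumed for slots 12 to 14, and the 0 to 100 scale follows Unity's usual blend‑shape weight range.

```python
# Viseme short codes in index order, copied from the slot table.
# Assumption: slots 12-14 use the longer spellings (ih/oh/ou).
VISEME_CODES = [
    "sil", "pp", "ff", "th", "dd", "kk", "ch", "ss",
    "nn", "rr", "aa", "e", "ih", "oh", "ou",
]


def viseme_code(index: int) -> str:
    """Return the short code for a Viseme parameter value (0-14)."""
    if not 0 <= index < len(VISEME_CODES):
        raise ValueError(f"viseme index out of range: {index}")
    return VISEME_CODES[index]


def one_hot_weights(index: int) -> dict[str, float]:
    """Give the active viseme's shape full weight and every other shape zero."""
    active = viseme_code(index)
    return {code: (100.0 if code == active else 0.0) for code in VISEME_CODES}


if __name__ == "__main__":
    print(viseme_code(10))
    print(one_hot_weights(1))
```

Because only one index is active at a time, the raw parameter always produces a single fully weighted shape; any blending between shapes has to be added downstream.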


Wiring the Avatar

  1. Add blend‑shapes (shape keys) or a jaw bone. Each shapekey should be named exactly like the codes above (case‑sensitive in Unity). Keep the sil key—even if it only moves one vertex—to stop Unity deleting it on import.
  2. Set the Avatar Descriptor’s Lip‑Sync mode to “Viseme Blend Shape”. Click Auto Detect! first; if the SDK guesses wrong, pick the correct shapekey for each slot from its dropdown.
  3. Test in‑editor: play the scene, enable the Lip Sync preview on the descriptor, or simply talk into your mic while the Game view is active.
  4. Fine‑tune in your FX Animator (optional). The int parameter Viseme updates every frame; you can:
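    • Drive a 15‑way BlendTree that weights each shapekey. Unity blend trees only accept float parameters, so the tree needs a float version of the viseme value.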
    • Branch to custom mouth animations (e.g., a “big scream” version of aa) when volume is high by also reading the Voice float (0‑1).
  5. Performance tip: keep viseme blend‑shapes on a separate head mesh, so blend‑shape updates only have to process the head’s vertices rather than the whole body’s.
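
Two of the steps above can be sketched outside Unity to make the logic concrete. The block below is an illustrative assumption, not the SDK's Auto Detect routine or an actual Animator setup: missing_visemes checks a list of shape‑key names against the viseme codes with an exact, case‑sensitive match, and pick_mouth_state mirrors the "big scream" branch from step 4. The function names, the aa_scream state name, and the 0.8 threshold are all hypothetical.

```python
# Viseme short codes in index order.
# Assumption: slots 12-14 use the longer spellings (ih/oh/ou).
VISEME_CODES = (
    "sil", "pp", "ff", "th", "dd", "kk", "ch", "ss",
    "nn", "rr", "aa", "e", "ih", "oh", "ou",
)


def missing_visemes(shape_names):
    """Return the viseme codes that have no exactly matching shape key.

    The comparison is case-sensitive, so "AA" does not satisfy "aa".
    """
    present = set(shape_names)
    return [code for code in VISEME_CODES if code not in present]


def pick_mouth_state(viseme: int, voice: float, scream_threshold: float = 0.8) -> str:
    """Choose a mouth state from the Viseme int (0-14) and the Voice float (0-1).

    Hypothetical rule: a loud "aa" switches to a separate "aa_scream" state;
    every other combination uses the plain viseme code.
    """
    code = VISEME_CODES[viseme]
    if code == "aa" and voice >= scream_threshold:
        return "aa_scream"
    return code


if __name__ == "__main__":
    print(missing_visemes(["sil", "pp", "AA"]))
    print(pick_mouth_state(10, 0.9))
```

In an Animator, the same branch would typically be two transitions out of a shared state, one conditioned on Viseme alone and one on Viseme plus a Voice threshold.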

Troubleshooting quick‑hits

Symptom                                   | Likely cause                                      | Fix
------------------------------------------+---------------------------------------------------+--------------------------------------------------------
Mouth never moves                         | Lip‑Sync mode still on Jaw Flap or Default        | Switch to Viseme Blend Shape
Wrong shapes (e.g., “th” looks like “ff”) | Mis‑matched dropdown slots                        | Re‑assign each viseme in the Descriptor
Shapes snap instead of blend              | Using an Int blend‑tree with thresholds too close | Use a 1D float tree driven by 0‑14 or add smoothing
Shapes missing on Quest                   | Head mesh not included in the Android build       | In Build tab, mark head mesh for both PC & Android
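
The "add smoothing" fix for snapping shapes can be illustrated with a small exponential‑smoothing step. This is a sketch under stated assumptions rather than anything VRChat provides: smooth_weights and its half_life parameter are hypothetical, weights use Unity's usual 0 to 100 blend‑shape scale, and inside an avatar the equivalent effect would have to be built with Animator layers rather than scripts.

```python
NUM_VISEMES = 15
FULL_WEIGHT = 100.0  # Unity blend-shape weights conventionally run 0-100


def smooth_weights(current, target_index, dt, half_life=0.05):
    """Ease every viseme weight toward its target instead of snapping.

    current      -- list of NUM_VISEMES weights from the previous frame
    target_index -- the Viseme value (0-14) reported for this frame
    dt           -- seconds elapsed since the previous frame
    half_life    -- seconds for the remaining error to halve (assumed tunable)
    """
    if len(current) != NUM_VISEMES:
        raise ValueError("expected one weight per viseme")
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    # Frame-rate independent blend factor: 0 when dt is 0, approaching 1 as dt grows.
    alpha = 1.0 - 0.5 ** (dt / half_life)
    result = []
    for i, weight in enumerate(current):
        target = FULL_WEIGHT if i == target_index else 0.0
        result.append(weight + (target - weight) * alpha)
    return result


if __name__ == "__main__":
    weights = [0.0] * NUM_VISEMES
    weights[0] = FULL_WEIGHT  # start at "sil"
    for _ in range(3):
        weights = smooth_weights(weights, target_index=10, dt=1 / 60)
    print([round(w, 1) for w in weights])
```

A shorter half_life makes the mouth more responsive but closer to the original snapping; a longer one looks softer but can lag behind fast speech.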

Take‑away

Visemes are simply numbered mouth cues. Name 15 blend‑shapes (or equivalent bone poses) to match the Oculus set, point your Avatar Descriptor at them, and VRChat’s built‑in Viseme parameter will make your character lip‑sync automatically—no extra scripts required.

See Also