# AI Lipsync API

> Sync any face video to any audio track via MuApi.ai — Sync, Latentsync, Creatify, Veed lipsync models behind one API.

## Overview

Upload a face video and an audio file (or generate the audio from text with a TTS endpoint first) and get back a video in which the mouth tracks the audio. This is the standard pipeline for AI avatars, dubbed content, multilingual product videos, and reactive character animations.

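- Models: Sync, Latentsync, Creatify, and Veed Lipsync, plus the LTX, WAN2.2, InfiniteTalk, and Kling Avatar talking-video models listed under Models in This Category
- Inputs: video URL + audio URL for the video-to-video lipsync models, or a still image + audio for the avatar and image-to-video models (pair with TTS endpoints for text-driven flows)
- Quality, speed, and cost vary by model: Sync is the highest fidelity, Latentsync is the fastest, and per-call costs are given in the model list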

## API Pattern

Every model in this category uses the same submit-then-poll API:

```http
POST https://api.muapi.ai/api/v1/{model}
x-api-key: YOUR_API_KEY
Content-Type: application/json
```

Response: `{ "request_id": "abc123", "status": "processing" }`. Poll `GET https://api.muapi.ai/api/v1/predictions/{request_id}/result` until `status` is `completed` (or `failed`, in which case stop and inspect the response). On success, the result URLs are in the `outputs[]` array. Optionally pass `?webhook=https://your-server` on the submit call to receive a callback instead of polling.
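Because the webhook address travels as a query-string value, it needs percent-encoding whenever it carries its own path or query parameters. The following is a minimal Python sketch that assumes only what is stated above (the `webhook` query parameter on the submit URL). `build_submit_url` is a hypothetical helper for illustration, not part of any MuApi SDK.

```python
from typing import Optional
from urllib.parse import quote, urlencode

BASE_URL = "https://api.muapi.ai/api/v1"


def build_submit_url(model: str, webhook: Optional[str] = None) -> str:
    """Return the submit URL for `model`, optionally with a webhook callback.

    Hypothetical helper: it only assembles the URL described in this
    section and does not send any request.
    """
    # Model slugs such as "ltx-2.3-lipsync" pass through unchanged;
    # quoting guards against stray characters in user-supplied names.
    url = f"{BASE_URL}/{quote(model, safe='')}"
    if webhook:
        # urlencode percent-encodes ":", "/", "?" and "=" in the callback URL.
        url += "?" + urlencode({"webhook": webhook})
    return url


if __name__ == "__main__":
    print(build_submit_url("sync-lipsync", "https://example.com/hook?job=1"))
```

The shape of the callback body is not described here, so the sketch stops at building the URL.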

Get an API key at https://muapi.ai/access-keys.

## Quick Start

```bash
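# 1. Submit. Replace '{}' with the model's input fields (see the per-model llms.txt).
REQUEST_ID=$(curl -s -X POST https://api.muapi.ai/api/v1/sync-lipsync \
  -H "x-api-key: $MUAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{}' | jq -r .request_id)

# Stop early if the submit call did not return a request ID.
if [ -z "$REQUEST_ID" ] || [ "$REQUEST_ID" = "null" ]; then
  echo "submit failed: no request_id in response" >&2
  exit 1
fi

# 2. Poll every 3 seconds until the job completes or fails
while :; do
  RESP=$(curl -s "https://api.muapi.ai/api/v1/predictions/$REQUEST_ID/result" -H "x-api-key: $MUAPI_API_KEY")
  STATUS=$(echo "$RESP" | jq -r .status)
  [ "$STATUS" = "completed" ] && echo "$RESP" | jq .outputs && break
  [ "$STATUS" = "failed" ] && echo "$RESP" && exit 1
  sleep 3
done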
```
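The shell loop above keeps polling for as long as the status is neither `completed` nor `failed`. For longer-running scripts, a deadline can be useful. The sketch below is one possible Python equivalent using only the standard library. It assumes the result endpoint, the `x-api-key` header, and the `status` / `outputs` fields shown in this document. The names `fetch_result` and `wait_for_outputs` and the 600-second default timeout are illustrative assumptions.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "https://api.muapi.ai/api/v1"


def fetch_result(request_id: str, api_key: str) -> dict:
    """GET the prediction result once and decode the JSON body."""
    req = urllib.request.Request(
        f"{BASE_URL}/predictions/{request_id}/result",
        headers={"x-api-key": api_key},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_outputs(
    fetch: Callable[[], dict],
    timeout_s: float = 600.0,   # assumed default, tune per model
    interval_s: float = 3.0,    # same interval as the shell loop
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> list:
    """Call `fetch` until the job completes, fails, or the deadline passes."""
    deadline = clock() + timeout_s
    while True:
        result = fetch()
        status = result.get("status")
        if status == "completed":
            return result.get("outputs", [])
        if status == "failed":
            raise RuntimeError(f"prediction failed: {result}")
        if clock() >= deadline:
            raise TimeoutError(f"no final status after {timeout_s} seconds")
        sleep(interval_s)
```

A typical call would be `wait_for_outputs(lambda: fetch_result(request_id, api_key))`. Passing the fetch function in as an argument keeps the loop testable without network access.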

## Models in This Category

- [ltx-2.3-lipsync](https://muapi.ai/playground/ltx-2.3-lipsync): LTX-2.3 LipSync generates a realistic talking video by synchronizing mouth movements to an input audio clip. It preserves facial identity, head position, light…
  - Endpoint: `POST https://api.muapi.ai/api/v1/ltx-2.3-lipsync`
  - Per-model llms.txt: https://muapi.ai/playground/ltx-2.3-lipsync/llms.txt
  - Cost: 0.260 credits per call
- [wan2.2-speech-to-video](https://muapi.ai/playground/wan2.2-speech-to-video): WAN2.2 Speech-to-Video transforms a static image into a talking video by synchronizing lip movements and facial expressions with an audio input. Simply provide…
  - Endpoint: `POST https://api.muapi.ai/api/v1/wan2.2-speech-to-video`
  - Per-model llms.txt: https://muapi.ai/playground/wan2.2-speech-to-video/llms.txt
  - Cost: 0.200 credits per call
- [infinitetalk-image-to-video](https://muapi.ai/playground/infinitetalk-image-to-video): InfiniteTalk Image-to-Video brings still portraits and character photos to life by generating natural, realistic talking videos. You provide a single face imag…
  - Endpoint: `POST https://api.muapi.ai/api/v1/infinitetalk-image-to-video`
  - Per-model llms.txt: https://muapi.ai/playground/infinitetalk-image-to-video/llms.txt
  - Cost: 0.200 credits per call
- [sync-lipsync](https://muapi.ai/playground/sync-lipsync): Generate realistic lipsync animations from audio using advanced algorithms for high-quality synchronization.
  - Endpoint: `POST https://api.muapi.ai/api/v1/sync-lipsync`
  - Per-model llms.txt: https://muapi.ai/playground/sync-lipsync/llms.txt
  - Cost: 0.040 credits per call
- [kling-v1-avatar-pro](https://muapi.ai/playground/kling-v1-avatar-pro): Kling AI Avatar Pro is the premium tier for making high-quality talking avatars. You upload a character image plus an audio file, and the model generates a rea…
  - Endpoint: `POST https://api.muapi.ai/api/v1/kling-v1-avatar-pro`
  - Per-model llms.txt: https://muapi.ai/playground/kling-v1-avatar-pro/llms.txt
  - Cost: 0.650 credits per call
- [kling-v2-avatar-pro](https://muapi.ai/playground/kling-v2-avatar-pro): AI-Avatar v2 Pro takes a reference image of a person/character and an audio dialogue clip, then generates a realistic talking-avatar video. It preserves identi…
  - Endpoint: `POST https://api.muapi.ai/api/v1/kling-v2-avatar-pro`
  - Per-model llms.txt: https://muapi.ai/playground/kling-v2-avatar-pro/llms.txt
  - Cost: 0.750 credits per call
- [veed-lipsync](https://muapi.ai/playground/veed-lipsync): Generate realistic lipsync from any audio using VEED's latest model
  - Endpoint: `POST https://api.muapi.ai/api/v1/veed-lipsync`
  - Per-model llms.txt: https://muapi.ai/playground/veed-lipsync/llms.txt
  - Cost: 0.040 credits per call
- [kling-v2-avatar-standard](https://muapi.ai/playground/kling-v2-avatar-standard): AI-Avatar v2 Standard generates a talking-avatar video from a reference image and an audio dialogue. It performs accurate lip-sync, natural facial expressions,…
  - Endpoint: `POST https://api.muapi.ai/api/v1/kling-v2-avatar-standard`
  - Per-model llms.txt: https://muapi.ai/playground/kling-v2-avatar-standard/llms.txt
  - Cost: 0.350 credits per call
- [latent-sync](https://muapi.ai/playground/latent-sync): LatentSync is a video-to-video model that generates lip sync animations from audio using advanced algorithms for high-quality synchronization.
  - Endpoint: `POST https://api.muapi.ai/api/v1/latent-sync`
  - Per-model llms.txt: https://muapi.ai/playground/latent-sync/llms.txt
  - Cost: 0.040 credits per call
- [ltx-2-19b-lipsync](https://muapi.ai/playground/ltx-2-19b-lipsync): LTX-2-19B LipSync generates a realistic talking video by synchronizing a person’s mouth movements to an input audio clip. It preserves facial identity, head po…
  - Endpoint: `POST https://api.muapi.ai/api/v1/ltx-2-19b-lipsync`
  - Per-model llms.txt: https://muapi.ai/playground/ltx-2-19b-lipsync/llms.txt
  - Cost: 0.200 credits per call
- [creatify-lipsync](https://muapi.ai/playground/creatify-lipsync): Realistic lipsync video - optimized for speed, quality, and consistency.
  - Endpoint: `POST https://api.muapi.ai/api/v1/creatify-lipsync`
  - Per-model llms.txt: https://muapi.ai/playground/creatify-lipsync/llms.txt
  - Cost: 0.040 credits per call
- [kling-v1-avatar-standard](https://muapi.ai/playground/kling-v1-avatar-standard): Kling AI Avatar Standard creates talking avatar videos from a single image + audio input. It supports realistic humans, animals, or stylized characters, produc…
  - Endpoint: `POST https://api.muapi.ai/api/v1/kling-v1-avatar-standard`
  - Per-model llms.txt: https://muapi.ai/playground/kling-v1-avatar-standard/llms.txt
  - Cost: 0.350 credits per call
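
To compare models on price, the per-call costs listed above can be placed in a small lookup table. The Python sketch below copies the credit values from this list as they appear here (they may change over time). `estimate_credits` and `cheapest_models` are hypothetical helper names, and `Decimal` is used so that totals do not pick up floating-point rounding noise.

```python
from decimal import Decimal
from typing import List

# Credits per call, copied from the model list in this document.
CREDITS_PER_CALL = {
    "ltx-2.3-lipsync": Decimal("0.260"),
    "wan2.2-speech-to-video": Decimal("0.200"),
    "infinitetalk-image-to-video": Decimal("0.200"),
    "sync-lipsync": Decimal("0.040"),
    "kling-v1-avatar-pro": Decimal("0.650"),
    "kling-v2-avatar-pro": Decimal("0.750"),
    "veed-lipsync": Decimal("0.040"),
    "kling-v2-avatar-standard": Decimal("0.350"),
    "latent-sync": Decimal("0.040"),
    "ltx-2-19b-lipsync": Decimal("0.200"),
    "creatify-lipsync": Decimal("0.040"),
    "kling-v1-avatar-standard": Decimal("0.350"),
}


def estimate_credits(model: str, calls: int) -> Decimal:
    """Total credits for `calls` submissions to `model`."""
    if calls < 0:
        raise ValueError("calls must be non-negative")
    return CREDITS_PER_CALL[model] * calls


def cheapest_models() -> List[str]:
    """Model slugs that share the lowest per-call cost, sorted by name."""
    lowest = min(CREDITS_PER_CALL.values())
    return sorted(m for m, cost in CREDITS_PER_CALL.items() if cost == lowest)


if __name__ == "__main__":
    print(estimate_credits("kling-v2-avatar-pro", 10))
    print(cheapest_models())
```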

## FAQ

**Can I do multilingual lipsync?**

Yes — pair a TTS endpoint (Suno, MMAudio) with a lipsync endpoint to translate and re-sync a video to a new language end-to-end via the workflow builder.

**What audio formats work?**

MP3, WAV, and M4A are all accepted. Submit the audio as a public URL, or upload it first via `/api/v1/upload_file`.
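
A client can catch obviously unsupported files before spending a call. The Python sketch below checks only the file extension in the URL path against the three formats named above. That is an assumption about what makes a reasonable pre-check, since how the API itself validates audio is not stated here. `has_accepted_audio_extension` is a hypothetical helper name.

```python
import posixpath
from urllib.parse import urlparse

# Formats named in this FAQ entry.
ACCEPTED_AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a"}


def has_accepted_audio_extension(url: str) -> bool:
    """Return True if the URL path ends in .mp3, .wav, or .m4a.

    Only the path is inspected, so query strings such as signed-URL
    tokens are ignored. The check is case-insensitive.
    """
    path = urlparse(url).path
    extension = posixpath.splitext(path)[1].lower()
    return extension in ACCEPTED_AUDIO_EXTENSIONS


if __name__ == "__main__":
    print(has_accepted_audio_extension("https://example.com/voice.MP3?sig=abc"))
```

URLs without a file extension fail this check even if they serve valid audio, so treat a negative result as a warning rather than proof.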

## Agent Integration

MuApi ships an MCP server so agents (Claude Code, Cursor, custom) can call every model in this category as a tool:

```bash
npm install -g muapi-cli
muapi auth login
muapi mcp serve  # exposes all MuApi models as MCP tools
```

## Resources

- Category page: https://muapi.ai/playground/group/lipsync
- Global llms.txt: https://muapi.ai/llms.txt
- API docs: https://muapi.ai/docs
- OpenAPI spec: https://api.muapi.ai/openapi.json