Music & Speech Documentation

MuAPI provides audio synthesis and lip-synchronization tools: music generation with Suno, lip-sync for video, and audio utilities based on MMAudio.

1. Music Generation (Suno)

Create, remix, and extend professional music tracks using the Suno model family.

  • Endpoints:
    • POST /api/v1/suno-create-music: Generate new tracks from prompts.
    • POST /api/v1/suno-remix-music: Create variations of existing audio.
    • POST /api/v1/suno-extend-music: Continue an existing track with new sections.
  • Features: High-fidelity audio, genre/mood control, and seamless extensions.
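As a rough illustration of how the three Suno endpoints above might be called, the sketch below builds a POST request for each action without sending it. Only the endpoint paths come from this document. The host (`api.example.com`), the `x-api-key` header, and the `prompt` field are assumptions made for the example and should be checked against the actual API reference.

```python
import json
import urllib.request

# Assumptions (not confirmed by this document): the host, the "x-api-key"
# header name, and the payload fields are placeholders for illustration.
BASE_URL = "https://api.example.com"  # hypothetical host

# Endpoint paths as listed in this section; the short keys are our own labels.
SUNO_ENDPOINTS = {
    "create": "/api/v1/suno-create-music",
    "remix": "/api/v1/suno-remix-music",
    "extend": "/api/v1/suno-extend-music",
}


def build_suno_request(action: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for one Suno endpoint."""
    if action not in SUNO_ENDPOINTS:
        raise ValueError(f"unknown Suno action: {action!r}")
    return urllib.request.Request(
        url=BASE_URL + SUNO_ENDPOINTS[action],
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


if __name__ == "__main__":
    # "prompt" is an assumed field name.
    req = build_suno_request("create", {"prompt": "lo-fi piano, rainy mood"}, "YOUR_API_KEY")
    print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send the request. Building and sending are kept separate here so the request can be inspected first.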

2. Lip-Synchronization

Synchronize a character's lip movements with an audio track using one of the supported sync models.

  • Models Supported:
    • Sync-Lipsync: Optimized for high-fidelity facial alignment.
    • LatentSync: Faster inference with smooth temporal consistency.
    • Creatify/Veed: Specialized models for different video formats.
  • Endpoints:
    • POST /api/v1/sync-lipsync
    • POST /api/v1/latentsync-video
    • POST /api/v1/creatify-lipsync
    • POST /api/v1/veed-lipsync
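Because each model has its own endpoint, a client will typically map a model choice to a path. The following is a minimal sketch of that routing. The endpoint paths are the ones listed above, while the host, the short model keys, and the `video_url` / `audio_url` field names are hypothetical and used only for illustration.

```python
import json
from typing import Tuple

BASE_URL = "https://api.example.com"  # hypothetical host

# Paths are from the endpoint list above; the keys are our own labels.
LIPSYNC_ENDPOINTS = {
    "sync-lipsync": "/api/v1/sync-lipsync",
    "latentsync": "/api/v1/latentsync-video",
    "creatify": "/api/v1/creatify-lipsync",
    "veed": "/api/v1/veed-lipsync",
}


def build_lipsync_job(model: str, video_url: str, audio_url: str) -> Tuple[str, bytes]:
    """Return (url, JSON body) for a lip-sync request; nothing is sent."""
    try:
        path = LIPSYNC_ENDPOINTS[model]
    except KeyError:
        raise ValueError(f"unsupported lip-sync model: {model!r}") from None
    # "video_url" and "audio_url" are assumed field names, not documented ones.
    body = json.dumps({"video_url": video_url, "audio_url": audio_url}).encode("utf-8")
    return BASE_URL + path, body


if __name__ == "__main__":
    url, body = build_lipsync_job(
        "latentsync",
        "https://example.com/clip.mp4",
        "https://example.com/voice.wav",
    )
    print(url)
```

Keeping the mapping in one dictionary means switching between models is a single-argument change for the caller.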

3. Audio & Music Utilities (MMAudio)

  • Text-to-Audio: Generate Foley, sound effects, or speech.
  • Video-to-Video Audio: Generate audio that is synchronized with the motion in an existing video.
  • Endpoints:
    • POST /api/v1/mmaudio-v2/text-to-audio
    • POST /api/v1/mmaudio-v2/video-to-video
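The two MMAudio endpoints differ in whether a source video is supplied. One way to sketch this is a helper that picks the endpoint from its inputs. The paths below are taken from this section. The host, the `prompt` and `video_url` field names, and the idea that the video endpoint also accepts a prompt are assumptions for illustration only.

```python
import json
from typing import Optional, Tuple

BASE_URL = "https://api.example.com"  # hypothetical host
MMAUDIO_PREFIX = "/api/v1/mmaudio-v2"


def build_mmaudio_job(prompt: str, video_url: Optional[str] = None) -> Tuple[str, bytes]:
    """Pick the MMAudio endpoint from the inputs and return (url, JSON body)."""
    payload = {"prompt": prompt}  # "prompt" is an assumed field name
    if video_url is None:
        # No source video: plain text-to-audio generation.
        path = MMAUDIO_PREFIX + "/text-to-audio"
    else:
        # A source video is given: request audio aligned to that video.
        path = MMAUDIO_PREFIX + "/video-to-video"
        payload["video_url"] = video_url  # assumed field name
    return BASE_URL + path, json.dumps(payload).encode("utf-8")


if __name__ == "__main__":
    print(build_mmaudio_job("footsteps on gravel")[0])
    print(build_mmaudio_job("footsteps on gravel", "https://example.com/walk.mp4")[0])
```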