AI Image Generator

MuApi gives you a single HTTP endpoint to call every leading text-to-image model in production. Submit a prompt, get back a request ID, poll for the result. No model-specific SDKs, no cold starts, no surprise billing — every model is priced per call in transparent credits, and you only pay for successful generations. Mix and match Flux Schnell for fast iteration, Seedream and HiDream for photoreal output, GPT-Image and Reve for text rendering, or Midjourney and Qwen for stylized art — all via the same `x-api-key` header and JSON body shape.

  • 30+ text-to-image models exposed as `POST /api/v1/{model}` — Flux Dev/Schnell/Kontext, Seedream, GPT-Image, HiDream, Qwen, Reve, Midjourney, Wan
  • Standard submit-then-poll workflow for every model — one client integration covers them all
  • Auto-generated input forms in the playground; full Pydantic schema available via OpenAPI
  • Credits charged on successful completion only — failures are free
  • Optional webhook callback per request to skip polling

Quick Start

Every model in this category uses the same submit-then-poll API. Replace `flux-schnell` with any model endpoint from the list below.

# 1. Submit
curl -X POST https://api.muapi.ai/api/v1/flux-schnell \
  -H "x-api-key: $MUAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"a serene mountain lake at dawn, photorealistic"}'
# → {"request_id":"abc123","status":"processing"}

# 2. Poll until completed
curl https://api.muapi.ai/api/v1/predictions/abc123/result \
  -H "x-api-key: $MUAPI_API_KEY"
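The two calls above can be wrapped in a small polling helper. The following is a minimal Python sketch, not an official client: `wait_for_result` and its `fetch` parameter are hypothetical names, the `"failed"` status value is an assumption (this page only shows `processing` and `completed`), and the interval and timeout defaults are arbitrary. `fetch` stands for any callable that performs the GET against `/api/v1/predictions/{request_id}/result` and returns the parsed JSON.

```python
import time
from typing import Any, Callable, Dict


def wait_for_result(
    fetch: Callable[[str], Dict[str, Any]],
    request_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch(request_id) until status is 'completed' or the wait budget is spent."""
    waited = 0.0  # counts time spent sleeping, not time spent inside fetch
    while True:
        result = fetch(request_id)
        status = result.get("status")
        if status == "completed":
            return result
        if status == "failed":  # assumed failure status; not documented on this page
            raise RuntimeError(f"request {request_id} failed: {result}")
        if waited >= timeout_s:
            raise TimeoutError(f"request {request_id} still {status!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

Passing `fetch` and `sleep` in as parameters keeps the loop independent of any HTTP library and makes it easy to exercise with canned responses.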

Top 5 Image Generator Models

| Model | Cost | Best For |
| --- | --- | --- |
| nano-banana | $0.030 | Nano Banana is an advanced AI model excelling in natural language-driven image generation and editing. It produces hyper-realistic, physics-aware visuals with seamless style transformations. |
| nano-banana-pro | $0.120 | Nano Banana Pro is the next-generation image generation model developed by Google DeepMind, following the original Nano Banana (also known as Gemini 2.5 Flash Image). It offers advanced text-to-image capabilities with improved resolution. |
| nano-banana-2 | $0.060 | Nano Banana 2 (Gemini 3.1 Flash Image) is Google's most advanced image generation model, combining speed with high-fidelity 4K output and revolutionary character consistency. |
| flux-dev | $0.015 | Generate stunning visuals from simple text prompts. Flux Dev transforms your ideas into high-quality, creative images using powerful AI vision models. Perfect for design, storytelling, concept art, and marketing. |
| bytedance-seedream-v3 | $0.030 | Seedream is designed for generating visually rich and artistic images from text prompts. It excels at fantasy, anime, surrealism, and vibrant color compositions, and is ideal for creative visuals, storyboards, and concept art. |

All 59 Models

flux-dev
Text to Image | 11% off | $0.015 (was $0.0167)

Generate stunning visuals from simple text prompts. Flux Dev transforms your ideas into high-quality, creative images using powerful AI vision models. Perfect for design, storytelling, concept art, and marketing.

bytedance-seedream-v4.5
Text to Image | 11% off | $0.050 (was $0.0556)

Seedream-v4.5 is ByteDance’s advanced text-to-image diffusion model designed for generating high-detail, high-contrast, cinematic and stylized images. It excels at surreal fantasy concepts, sci-fi worlds, product visuals, photoreal scenes, and artistic compositions with strong prompt adherence and crisp detail.

hidream-i1-fast
Text to Image | 11% off | $0.008 (was $0.0089)

Optimized for speed, this variant generates images in just a few steps. Ideal for previews, real-time applications, and use cases where fast results are more important than fine detail.

hidream-i1-full
Text to Image | 10% off | $0.040 (was $0.0444)

The most advanced version of HiDream I1, delivering high-resolution, detailed images with superior prompt understanding. Best suited for production, content creation, and high-fidelity applications.

hidream-i1-dev
Text to Image | 10% off | $0.020 (was $0.0222)

Optimized for speed, this variant generates images in just a few steps. Ideal for previews, real-time applications, and use cases where fast results are more important than fine detail.

wan2.1-text-to-image
Text to Image | 10% off | $0.030 (was $0.0333)

WAN 2.1 is a powerful AI model that transforms text prompts into high-resolution, photorealistic images. It excels at detailed object rendering, realistic lighting, and fine textures, making it ideal for visual content, concept art, advertising, and digital storytelling.

flux-kontext-pro-t2i
Text to Image | 10% off | $0.030 (was $0.0333)

Flux Kontext Pro T2I offers fast and reliable generation with creative flexibility. It supports stylized prompts, character design, and fantasy themes while maintaining clear subject coherence.

gpt4o-text-to-image
Text to Image | 10% off | $0.040 (was $0.0444)

Generate images from text prompts using GPT-4o's vision capabilities. Ideal for basic concept visuals, diagrams, and abstract compositions.

qwen-image
Text to Image | 10% off | $0.030 (was $0.0333)

Generate high-quality, detailed images from text prompts in various styles — from realistic to artistic — perfect for creative visuals, product shots, and concept art.

ideogram-v3-t2i
Text to Image | 10% off | $0.020 (was $0.0222)

Ideogram v3 is an advanced text-to-image model designed for creating highly detailed and visually striking images directly from text prompts. It’s especially good for artistic compositions, design mockups, concept art, and photorealistic scenes. With strong support for text rendering inside images, it’s widely used for posters, typography-based art, and creative branding.

nano-banana
Text to Image | 10% off | $0.030 (was $0.0333)

Nano Banana is an advanced AI model excelling in natural language-driven image generation and editing. It produces hyper-realistic, physics-aware visuals with seamless style transformations.

google-imagen4-fast
Text to Image | 10% off | $0.020 (was $0.0222)

Imagen 4 Fast is optimized for speed and accessibility, allowing you to generate high-quality images in seconds. While slightly less detailed than the Ultra version, it excels at rapid ideation, drafts, storyboarding, and casual creativity.

google-imagen4-ultra
Text to Image | 11% off | $0.060 (was $0.0667)

Imagen 4 Ultra is Google’s flagship model, designed for photorealism, rich textures, and production-level imagery. It produces crisp, high-resolution visuals with advanced detail, lighting precision, and natural compositions.

sdxl-image
Text to Image | 10% off | $0.004 (was $0.0044)

SDXL is a high-quality, large Stable Diffusion model for creating photorealistic and stylized images from text. It excels at fine detail, realistic lighting, and complex scenes.

bytedance-seedream-v4
Text to Image | 10% off | $0.040 (was $0.0444)

Seedream v4 generates stunning, high-fidelity images from text prompts. It’s designed for creativity with strong support for realism, fantasy, and artistic styles.

hunyuan-image-2.1
Text to Image | 11% off | $0.035 (was $0.0389)

Hunyuan Image is a powerful text-to-image generation model that produces photorealistic and highly detailed visuals. It excels at creating portraits, environments, and concept art with strong consistency and realism. Designed for versatility, it supports both natural photography styles and imaginative artistic outputs.

chroma-image
Text to Image | 10% off | $0.020 (was $0.0222)

Chroma Image is an advanced text-to-image generation model designed for high-quality, creative, and versatile visuals. It can produce anything from photorealistic portraits and products to imaginative concept art, fantasy illustrations, and cinematic scenes.

perfect-pony-xl
Text to Image | 10% off | $0.020 (was $0.0222)

Pony XL is a high-quality image generation model based on Stable Diffusion XL architecture. It specializes in character art, hybrid styles, and producing detailed, polished visuals even with simpler prompts.

wan2.5-text-to-image
Text to Image | 10% off | $0.040 (was $0.0444)

WAN 2.5 Text-to-Image generates high-quality, realistic or stylized images from textual descriptions. It supports detailed visual storytelling, cinematic compositions, and versatile styles — from portraits and product shots to landscapes and fantasy scenes.

leonardoai-phoenix-1.0
Text to Image | 11% off | $0.050 (was $0.0556)

LeonardoAI Phoenix 1.0 is a professional-grade AI image model designed for realistic, cinematic, and highly detailed visuals. It excels at interpreting complex prompts, rendering text within images, and creating high-resolution outputs suitable for editorial, commercial, or creative projects.

leonardoai-lucid-origin
Text to Image | 10% off | $0.030 (was $0.0333)

Lucid Origin is LeonardoAI’s advanced image generation model, designed for ultra-realistic, vibrant, and highly detailed visuals. It excels at creating photorealistic portraits, landscapes, product shots, and stylized art while faithfully following complex prompts.

nano-banana-pro
Text to Image | 10% off | $0.120 (was $0.1333)

Nano Banana Pro is the next-generation image generation model developed by Google DeepMind, following the original Nano Banana (also known as Gemini 2.5 Flash Image). It offers advanced text-to-image capabilities with improved resolution.

kling-o1-text-to-image
Text to Image | 11% off | $0.036 (was $0.0400)

Kling O1 Text-to-Image is a high-fidelity creative image model that converts rich natural-language prompts into ultra-detailed stills. It excels at cinematic composition, realistic lighting, and coherent scene detail—great for concept art, environment renders, character portraits, and stylized imagery with photoreal or illustrative looks.

z-image-turbo
Text to Image | 11% off | $0.007 (was $0.0078)

Z-Image Turbo is a high-speed text-to-image model optimized for fast creative generation. It produces detailed, high-contrast, high-resolution images with strong stylization control. Ideal for rapid concept creation, visual exploration, product ideas, fantasy scenes, and cinematic composition tests. Designed for low latency and strong prompt adherence.

flux-2-dev
Text to Image | 11% off | $0.015 (was $0.0167)

Flux 2 Dev is a powerful text-to-image diffusion model designed for high-quality, fast, and highly detailed visual generation. It excels at creating cinematic lighting, vibrant compositions, surreal concepts, characters, products, and worlds with strong prompt following and artistic control. Ideal for rapid image ideation, visual storytelling, and concept art.

flux-2-flex
Text to Image | 11% off | $0.090 (was $0.1000)

Flux-2-Flex Text-to-Image is a flexible, high-fidelity generative model capable of producing detailed, imaginative, and stylistically rich scenes from text alone. It excels at surreal concepts, fantasy environments, sci-fi structures, cinematic atmospheres, and high-resolution artistic compositions with strong prompt adherence.

flux-2-pro
Text to Image | 11% off | $0.032 (was $0.0356)

Flux-2-Pro Text-to-Image is a premium, high-fidelity generative model capable of producing ultra-realistic, cinematic, and deeply detailed images from text prompts. It excels at complex lighting, layered compositions, surreal visual concepts, and professional art-grade rendering suitable for concept art, advertising visuals, and world-building.

vidu-q2-text-to-image
Text to Image | 10% off | $0.040 (was $0.0444)

VIDU Text-to-Image Q2 is a high-quality generative model focused on producing vivid, dynamic, and cinematic still images using natural language prompts. It excels at atmospheric depth, expressive lighting, surreal concepts, and motion-infused compositions typical of VIDU’s visual identity.

gpt-image-1.5
Text to Image | 10% off | $0.054 (was $0.0600)

GPT-Image-1.5 is a high-quality text-to-image generation model designed for rich visual reasoning, detailed compositions, and strong prompt understanding. It excels at complex scenes, symbolic imagery, cinematic lighting, surreal concepts, product visuals, and imaginative world-building while maintaining coherence and fine detail.

wan2.6-text-to-image
Text to Image | 10% off | $0.040 (was $0.0444)

WAN 2.6 Text-to-Image generates detailed, cinematic still images from text prompts. It focuses on strong composition, atmospheric lighting, and clear subject structure, making it suitable for fantasy and sci-fi environments, surreal concepts, architectural visuals, and dramatic world-building imagery.

flux-2-klein-9b
Text to Image | 10% off | $0.013 (was $0.0144)

Flux-2-Klein-9B is a mid-size text-to-image model that balances detail quality and generation speed. It handles richer lighting, better textures, and more nuanced scenes than smaller variants, while still working well with clear, grounded prompts. Ideal for polished illustrations, product visuals, mascots, and everyday scenes with character.

z-image-base
Text to Image | 10% off | $0.013 (was $0.0144)

Z-Image Base is a general-purpose text-to-image model designed for reliable, high-quality image generation from natural language prompts. It focuses on clear composition, good prompt adherence, and versatile output across everyday scenes, product-style visuals, characters, and creative concepts.

bytedance-seedream-v5.0
Text to Image | 9% off | $0.033 (was $0.0361)

Seedream 5.0 Lite is ByteDance’s next-generation text-to-image model, delivering high-fidelity AI art with advanced visual reasoning and precise typography. Supporting up to 4K resolution and cinematic detail, it excels at complex scene construction, consistent character generation, and real-time knowledge integration for accurate, contextually relevant visuals.

z-image-p
Text to Image | 10% off | $0.004 (was $0.0044)

Z-Image P is based on PiAPI's Qubico/z-image text-to-image model.

qwen-image-2.0
Text to Image | 10% off | $0.040 (was $0.0444)

Qwen 2.0 Text to Image model with enhanced realism.

qwen-image-2.0-pro
Text to Image | 11% off | $0.090 (was $0.1000)

Qwen 2.0 Pro Text to Image model with maximum realism and fidelity.

tiktok-carousel
Text to Image | 10% off | $0.028 (was $0.0311)

AI TikTok Carousel Generator — create viral TikTok carousel posts from a single text prompt. Choose a proven storytelling format (Problem-Solution, Listicle, Tutorial, Before & After), set your slide count (3-10), and get stunning AI-generated images at 1080x1920 portrait resolution, ready to upload to TikTok.

flux-2-klein-4b-turbo
Text to Image | 14% off | $0.005 (was $0.0058)

Flux-2-Klein-4B Turbo is an ultra-fast, high-efficiency text-to-image model. It is a distilled version of the Klein 4B model, designed for near-instant rendering while maintaining impressive adherence to prompts. Perfect for rapid prototyping, real-time creative tools, and applications where speed is paramount.

flux-2-klein-9b-turbo
Text to Image | 17% off | $0.006 (was $0.0072)

Flux-2-Klein-9B Turbo is a high-performance, mid-size text-to-image model. This distilled variant of Klein 9B provides a superior balance of speed and detail, delivering richer textures and complex scenes with significantly reduced generation times. Ideal for polished illustrations and character-rich visuals where performance is key.

nano-banana-2
Text to Image | 11% off | $0.060 (was $0.0667)

Nano Banana 2 (Gemini 3.1 Flash Image) is Google's most advanced image generation model, combining speed with high-fidelity 4K output and revolutionary character consistency.

gpt-image-2-text-to-image
Text to Image | 11% off | $0.090 (was $0.1000)

Generate high-quality images from text prompts using GPT Image 2, which accepts prompts of up to 20,000 characters for detailed and precise image creation.

midjourney-niji
Text to Image | 10% off | $0.100 (was $0.1111)

Generate 4 anime and illustration-style images per run with Midjourney Niji. Optimized for character art, manga, and stylized illustrations. Supports reference image guidance.

grok-imagine-text-to-image-quality
Text to Image | 10% off | $0.100 (was $0.1111)

Grok Imagine Quality is xAI's high-fidelity text-to-image mode that prioritizes accuracy and detail over speed. It produces sharper, more visually accurate images with stronger lighting, depth, and artistic clarity. Each run returns 6 images.

flux-kontext-dev-t2i
Text to Image | 10% off | $0.020 (was $0.0222)

Generates an image from a text prompt, with optional reference image for pose or style guidance. Ideal for controlled, consistent image creation using just a description.

ai-anime-generator
Text to Image | 10% off | $0.030 (was $0.0333)

Create stunning anime-style artwork instantly with our AI Anime Generator. Customize characters, scenes, and styles effortlessly in seconds!

flux-kontext-max-t2i
Text to Image | 11% off | $0.060 (was $0.0667)

Flux Kontext Max T2I delivers photorealistic or cinematic-quality images with exceptional detail. It's optimized for high-end visuals — from realistic humans to polished product renders.

bytedance-seedream-v3
Text to Image | 10% off | $0.030 (was $0.0333)

Seedream is designed for generating visually rich and artistic images from text prompts. It excels at fantasy, anime, surrealism, and vibrant color compositions — ideal for creative visuals, storyboards, and concept art.

flux-krea-dev
Text to Image | 11% off | $0.015 (was $0.0167)

Flux Krea Dev is a text-to-image model built by Black Forest Labs in collaboration with Krea AI, designed to generate highly photorealistic images that avoid the common 'AI look' artifacts (plastic skin, overexposed lighting, synthetic textures). It emphasizes real texture, natural lighting, and aesthetic control.

neta-lumina
Text to Image | 10% off | $0.020 (was $0.0222)

Neta Lumina is a powerful anime-style text-to-image model developed by Neta.art Lab. It is built on Lumina-Image-2.0 and fine-tuned on over 13 million high-quality anime images. It offers strong understanding of multilingual prompts, excellent detail fidelity, and support for Danbooru tags, and it leans into niche styles such as furry, Guofeng, pets, and scenic backgrounds.

hunyuan-image-3.0
Text to Image | 10% off | $0.065 (was $0.0722)

Hunyuan Image 3.0 brings together powerful architecture (Mixture-of-Experts + autoregressive style) to produce richly detailed and coherent images from complex prompts. It can read narrative descriptions, render text and signage cleanly, and support multiple visual styles — from photorealism to illustrations.

google-imagen4
Text to Image | 10% off | $0.030 (was $0.0333)

Google Imagen 4 is the latest text-to-image AI model from DeepMind, designed to produce stunningly photorealistic images with crisp detail, accurate text rendering, and creative flexibility. It supports high-resolution output (up to 2K), generates visuals in seconds, and embeds SynthID watermarks for authenticity.

grok-imagine-text-to-image
Text to Image | 11% off | $0.050 (was $0.0556)

Grok Imagine is xAI’s high-quality image generation model that transforms text prompts into detailed, stylish, and visually expressive images. It excels at creating vivid scenes, characters, environments, and concept art with strong lighting, depth, and artistic clarity. Each run returns 6 images.

reve-text-to-image
Text to Image | 11% off | $0.032 (was $0.0356)

Generate images from text prompts using Reve's vision capabilities. Ideal for basic concept visuals, diagrams, and abstract compositions.

flux-2-klein-4b
Text to Image | 14% off | $0.010 (was $0.0116)

Flux-2-Klein-4B is a lightweight, fast text-to-image model optimized for clear subject rendering, good prompt adherence, and efficient generation. It works best with simple compositions, everyday scenes, and cute or friendly visuals, making it ideal for UI graphics, demos, thumbnails, mascots, and quick creative iterations.

wan2.7-text-to-image
Text to Image | 11% off | $0.050 (was $0.0556)

Alibaba WAN 2.7 Text-to-Image generates high-quality images from text prompts with thinking mode for enhanced image quality.

wan2.7-text-to-image-pro
Text to Image | 10% off | $0.100 (was $0.1111)

Alibaba WAN 2.7 Text-to-Image Pro generates high-quality images up to 4K from text prompts with thinking mode for enhanced image quality.

midjourney-v8
Text to Image | 10% off | $0.100 (was $0.1111)

Generate 4 photorealistic images per run with Midjourney V8. Improved coherence and detail over V7. Supports text-to-image and reference image guidance.

flux-schnell
Text to Image | 10% off | $0.003 (was $0.0033)

Flux Schnell is a lightning-fast image generation model designed for rapid iterations. It delivers good visual quality from text prompts almost instantly, making it perfect for real-time concept testing, brainstorming, and UI-integrated experiences.

midjourney-v7
Text to Image | 10% off | $0.100 (was $0.1111)

Generate 4 photorealistic images per run with Midjourney V7. Supports text-to-image and reference image guidance via `source_image_url`.

Frequently Asked Questions

Which AI image generator API is the best?

It depends on the trade-off you want. Flux Schnell and Seedream lead on photoreal speed; HiDream and Reve produce the most accurate in-image text; Midjourney and Qwen Image lead on stylized artistic output; GPT-Image is the strongest generalist with editing support. MuApi exposes all of them, so you can A/B test models in code without rewriting your client.
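Because only the path segment changes between models, an A/B comparison can reuse one request builder. The sketch below is illustrative rather than an official client: `build_submit_requests` is a hypothetical helper, and it assumes each model accepts a body containing only `prompt` (individual models may require or accept more fields; check the schema for each). It builds `urllib.request.Request` objects without sending them.

```python
import json
import urllib.request
from typing import Dict, Iterable

BASE_URL = "https://api.muapi.ai/api/v1"


def build_submit_requests(
    models: Iterable[str], prompt: str, api_key: str
) -> Dict[str, urllib.request.Request]:
    """Build one POST request per model, all sharing the same headers and JSON body."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    headers = {"x-api-key": api_key, "Content-Type": "application/json"}
    return {
        model: urllib.request.Request(
            f"{BASE_URL}/{model}", data=body, headers=headers, method="POST"
        )
        for model in models
    }
```

Each request can then be sent with `urllib.request.urlopen(req)` and the returned `request_id` values polled side by side.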

How do I authenticate?

Send your MuApi key as the `x-api-key` header on every request. Create one at https://muapi.ai/access-keys.
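One way to keep the key out of source code is to read it from the environment, as the curl example does with `MUAPI_API_KEY`. This is a minimal sketch; `auth_headers` is a hypothetical helper name, not part of any MuApi SDK.

```python
import os
from typing import Dict, Mapping, Optional


def auth_headers(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the headers used on every request, reading the key from MUAPI_API_KEY."""
    env = os.environ if env is None else env
    key = env.get("MUAPI_API_KEY", "").strip()
    if not key:
        # Fail early instead of sending a request with an empty key.
        raise RuntimeError("MUAPI_API_KEY is not set")
    return {"x-api-key": key, "Content-Type": "application/json"}
```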

What does a request look like?

POST a JSON body to `https://api.muapi.ai/api/v1/{model}` (e.g. `flux-schnell`). You get back a `request_id`. Poll `GET /api/v1/predictions/{request_id}/result` until `status` is `completed`, then read the image URLs from `outputs[]`.
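Reading the result can be sketched as follows. `image_urls` is a hypothetical helper, and it assumes `outputs` is a flat list of URL strings; the exact response shape may differ per model.

```python
from typing import Any, Dict, List


def image_urls(result: Dict[str, Any]) -> List[str]:
    """Return the URLs in outputs[] once the request has completed."""
    status = result.get("status")
    if status != "completed":
        raise ValueError(f"result not ready, status is {status!r}")
    outputs = result.get("outputs") or []
    # Keep only string entries; anything else is outside this sketch's assumption.
    return [item for item in outputs if isinstance(item, str)]
```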

Is there a free tier?

Yes — new accounts get free credits to try every model in the playground before committing.

How is pricing calculated?

Each model has a per-call cost in credits, listed on the playground page. Some models (e.g. variable-resolution or duration-based ones) compute cost dynamically from the request payload — the playground shows the live calculation as you tweak inputs.
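For fixed-price models, a batch budget is simple arithmetic over successful calls. The sketch below uses three per-call prices copied from the listing on this page (shown there in dollars); `PRICE_PER_CALL` and `estimate_cost` are hypothetical names, and models with dynamic pricing are not covered.

```python
from decimal import Decimal
from typing import Dict

# Per-call prices copied from the model listing on this page; they may change.
PRICE_PER_CALL: Dict[str, Decimal] = {
    "flux-schnell": Decimal("0.003"),
    "flux-dev": Decimal("0.015"),
    "nano-banana-pro": Decimal("0.120"),
}


def estimate_cost(successful_calls: Dict[str, int]) -> Decimal:
    """Sum price * count over successful calls only, since failed calls are not charged."""
    return sum(
        (PRICE_PER_CALL[model] * count for model, count in successful_calls.items()),
        Decimal("0"),
    )
```

For example, 1,000 successful `flux-schnell` calls plus 200 successful `flux-dev` calls come to 3.000 + 3.000 = 6.000 under these prices. `Decimal` avoids the rounding drift that binary floats introduce when summing many small amounts.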

Can I use this from an agent or LLM?

Yes. MuApi ships an MCP server (`muapi mcp serve`) that exposes every model as a tool — drop it into Claude Code, Cursor, or any MCP-aware agent and call image generation by name.