Claude Fable 5 API: Pricing, Refusals & Setup
The official quickstart shows you claude-opus-4-8 boilerplate and calls it a day. That's not enough to ship Fable 5. This model has its own behaviors — an always-on thinking surface, a refusal stop reason that returns an empty array, a hard 400 on the thinking budget you used to pass, and a flat rejection of zero-data-retention orgs. This guide is the Fable-5-specific integration walkthrough, against Muapi's lower-cost endpoint, with code you can paste and run.
What Is Claude Fable 5 (and Why Call It Through Muapi)?
Claude Fable 5 is Anthropic's most capable widely released model: top-tier complex reasoning, long-form and creative writing, expert-level coding and debugging, and multimodal image-plus-text analysis — all with a 1M-token context window (the default) and up to 128K output tokens. Muapi exposes it as a Text-to-Text endpoint (claude-fable-5, provider anthropic) at $8.00 per million input tokens and $40.00 per million output tokens — well below calling api.anthropic.com directly. This guide is for developers who want a working integration plus the Fable-5-specific behaviors that break code written for older Opus models. Grab a key and the live endpoint on the Muapi playground for claude-fable-5.
Pricing: Muapi vs. Calling Anthropic Directly
Muapi bills per token. The charge is deducted after each call, computed from the token counts the model returns:
| Provider | Input | Output | Notes |
|---|---|---|---|
| Muapi (claude-fable-5) | $8.00 / M | $40.00 / M | Minimum $0.001 per call |
| Anthropic (official, per Muapi's comparison) | ~$15.00 / M | ~$75.00 / M | Direct via api.anthropic.com |
| Fal.ai | — | — | Not available |
A worked example: a call with 2,000 input tokens and 1,000 output tokens costs (2,000 / 1,000,000 × $8.00) + (1,000 / 1,000,000 × $40.00) = $0.016 + $0.040 = $0.056. The same call against Anthropic's rates in Muapi's comparison runs roughly $0.030 + $0.075 = $0.105 — close to double.
A note on the comparison: Anthropic's published list price for Fable 5 is commonly cited at around $10/M input and $50/M output, not the $15/$75 in Muapi's table, so treat the higher figure as Muapi's comparison data rather than a universal quote. At $10/$50 the worked example above would cost $0.020 + $0.050 = $0.070, still above Muapi's $0.056. Fable 5 is not available on Fal.ai, so of the providers compared here Muapi is the lower-cost managed path.
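The arithmetic in the worked example can be wrapped in a small estimator. This is a minimal sketch: the rates are copied from the figures quoted above, the names `RATES`, `MIN_CHARGE`, and `estimate_cost` are hypothetical, and treating the $0.001 minimum as a simple floor on each call is an assumption about how Muapi applies it.

```python
# USD per million tokens (input, output), taken from the figures quoted in this guide.
RATES = {
    "muapi": (8.00, 40.00),
    "anthropic_muapi_table": (15.00, 75.00),     # the figure in Muapi's comparison
    "anthropic_commonly_cited": (10.00, 50.00),  # approximate list price
}
MIN_CHARGE = 0.001  # Muapi's stated per-call minimum; applied as a floor (assumption)


def estimate_cost(input_tokens, output_tokens, provider="muapi"):
    """Estimate the USD cost of one call from its token counts."""
    in_rate, out_rate = RATES[provider]
    cost = input_tokens / 1_000_000 * in_rate + output_tokens / 1_000_000 * out_rate
    if provider == "muapi":
        cost = max(cost, MIN_CHARGE)
    return round(cost, 6)


if __name__ == "__main__":
    for name in RATES:
        print(name, estimate_cost(2_000, 1_000, name))
```

Running it with 2,000 input and 1,000 output tokens reproduces the $0.056 and $0.105 figures from the worked example.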
Quickstart: Your First Fable 5 Call on Muapi
The request body takes prompt (string), an optional system_prompt (string), and an optional image_url (string) for multimodal requests. There are two endpoints:
- /claude-fable-5: async. You receive a request_id and poll for the result. Good for workflows and automation.
- /claude-fable-5/stream: live SSE. Good for chat UIs and progress feedback.
Hard tasks can run for many minutes at higher effort, so set generous timeouts and prefer streaming. The canonical safe pattern is to read stop_reason before touching content[0] — a refusal returns an empty array.
# cURL: submit to the async endpoint, then poll for the result
RID=$(curl -s https://muapi.ai/api/claude-fable-5 \
  -H "Authorization: Bearer $MUAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Summarize the architectural tradeoffs in this diagram.",
    "system_prompt": "You are a precise senior staff engineer.",
    "image_url": "https://example.com/architecture.png"
  }' | jq -r '.request_id')

curl -s "https://muapi.ai/api/claude-fable-5/result/$RID" \
  -H "Authorization: Bearer $MUAPI_KEY"

# cURL: live streaming variant (Server-Sent Events)
curl -N https://muapi.ai/api/claude-fable-5/stream \
  -H "Authorization: Bearer $MUAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Write a 600-word noir opening set in a server room."}'

# Python: submit to the async endpoint and poll, checking stop_reason before reading content
import os
import time

import requests

BASE = "https://muapi.ai/api"
MUAPI_KEY = os.environ["MUAPI_KEY"]  # same variable the cURL examples use
HEADERS = {"Authorization": f"Bearer {MUAPI_KEY}", "Content-Type": "application/json"}


def handle_refusal(resp):
    # Pre-output refusal: empty content, not billed. Mid-stream: partial output, billed (discard it).
    return None


def call_fable5(prompt, system_prompt=None, image_url=None, timeout=900):
    body = {"prompt": prompt}
    if system_prompt:
        body["system_prompt"] = system_prompt
    if image_url:
        body["image_url"] = image_url
    # Three Fable-5 gotchas, in one payload:
    body["model"] = "claude-fable-5"  # 1. exact ID, NO date suffix
    body["thinking"] = {"type": "adaptive"}  # 2. omit, or adaptive; never budget_tokens
    body["betas"] = ["server-side-fallback-2026-06-01"]  # 3. opt into fallbacks by default
    body["fallbacks"] = [{"model": "claude-opus-4-8"}]

    submit = requests.post(f"{BASE}/claude-fable-5", json=body, headers=HEADERS, timeout=30)
    submit.raise_for_status()  # surface 400s (bad thinking config, retention) immediately
    rid = submit.json()["request_id"]

    deadline = time.time() + timeout
    while time.time() < deadline:
        poll = requests.get(f"{BASE}/claude-fable-5/result/{rid}", headers=HEADERS, timeout=30)
        poll.raise_for_status()
        r = poll.json()
        if r.get("status") == "completed":
            # CHECK stop_reason BEFORE indexing content: a refusal has an empty array
            if r.get("stop_reason") == "refusal":
                return handle_refusal(r)
            return r["content"][0]["text"]
        time.sleep(2)
    raise TimeoutError("Fable 5 task exceeded timeout; prefer the /stream endpoint")
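
The streaming endpoint is only shown through cURL above, so here is a hedged sketch of the parsing half of a Python consumer. It follows the generic Server-Sent Events framing (events separated by blank lines, payload carried on `data:` lines). The helper name `iter_sse_data` is hypothetical, and the shape of each payload Muapi sends is not documented in this guide, so the sketch yields raw strings and leaves decoding to the caller.

```python
def iter_sse_data(lines):
    """Yield the joined data payload of each SSE event from an iterable of text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        if line.startswith("data:"):
            value = line[len("data:"):]
            if value.startswith(" "):
                value = value[1:]  # the format strips one leading space
            buffer.append(value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.
    if buffer:
        # Lenient choice: flush a trailing event even without a final blank line.
        yield "\n".join(buffer)


if __name__ == "__main__":
    sample = ["data: Hello", "", ": keep-alive", "data: world", ""]
    for payload in iter_sse_data(sample):
        print(payload)
```

To use it against the live endpoint, one option is to open the request with `requests.post(..., stream=True)` and pass `(raw.decode("utf-8") for raw in resp.iter_lines())` as `lines`; decoding the bytes explicitly avoids relying on the response's guessed text encoding.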
Four Fable 5 Gotchas That Break Code Written for Older Models
1. Model ID must be exactly claude-fable-5 — no date suffix. Older Anthropic habits push you toward claude-fable-5-20260101. That ID does not exist and will fail. The string claude-fable-5 is complete as written.
2. Thinking is always on. Fable 5 thinks on every request. Omit the thinking parameter entirely, or pass {"type": "adaptive"}. Both {"type": "enabled", "budget_tokens": N} and {"type": "disabled"} return a 400 — budget_tokens worked on Opus 4.7, but it is fully removed here. Control depth with effort, not a token budget.
3. stop_reason: "refusal" returns HTTP 200 with an empty (or partial) content array. Safety classifiers can decline a request. A pre-output decline gives you an empty content array and is not billed at all; a mid-stream decline gives you partial output that is billed (discard it). Either way, branch on stop_reason first — content[0] on a refused request throws.
4. No zero data retention. Fable 5 requires 30-day retention and is not available under ZDR. If your org is configured for zero data retention (or anything below 30 days), every request returns 400 invalid_request_error regardless of payload. If a working integration suddenly 400s with a clean body, check the org's retention setting before debugging the request.
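The first two gotchas can be caught before a request leaves your process. The following is a minimal sketch of a client-side check, assuming the payload keys used in the quickstart (`model`, `thinking`); the names `preflight` and `FABLE_MODEL_ID` are hypothetical. The retention requirement is an org-level setting and cannot be verified from the payload, so it is deliberately out of scope here.

```python
import re

FABLE_MODEL_ID = "claude-fable-5"
_DATE_SUFFIX = re.compile(r"-\d{8}$")  # e.g. the "-20260101" habit from older model IDs


def preflight(body):
    """Return a list of problems the gotchas above say will be rejected; empty means OK."""
    problems = []

    model = body.get("model")
    if model is not None and model != FABLE_MODEL_ID and model.startswith(FABLE_MODEL_ID):
        if _DATE_SUFFIX.search(model):
            problems.append(f"model {model!r} has a date suffix; use {FABLE_MODEL_ID!r}")
        else:
            problems.append(f"model {model!r} is not exactly {FABLE_MODEL_ID!r}")

    thinking = body.get("thinking")
    if thinking is not None:
        if "budget_tokens" in thinking:
            problems.append("thinking.budget_tokens is removed; omit thinking or use adaptive")
        elif thinking.get("type") != "adaptive":
            problems.append(f"thinking type {thinking.get('type')!r} is rejected; use adaptive")

    return problems


if __name__ == "__main__":
    legacy = {
        "model": "claude-fable-5-20260101",
        "thinking": {"type": "enabled", "budget_tokens": 1024},
    }
    for problem in preflight(legacy):
        print(problem)
```

Returning a list rather than raising on the first problem lets a migration script report every issue in a legacy payload in one pass.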
Handling Refusals Gracefully with Server-Side Fallbacks
Fable 5's classifiers target research biology and most cybersecurity content — but benign adjacent work (security tooling, life-sciences tasks) can trip a false positive. Rather than failing the request, opt into server-side fallbacks by default:
body["betas"] = ["server-side-fallback-2026-06-01"]
body["fallbacks"] = [{"model": "claude-opus-4-8"}]
On a policy decline, the API transparently re-serves the same request with Claude Opus 4.8 inside the same call. Billing is handled automatically: a decline before any output isn't billed, the rescue bills at the fallback model's own rates, and a mid-stream decline bills the streamed partial. The beta identifier must be exactly server-side-fallback-2026-06-01. If, after the chain runs, the final stop_reason is still "refusal", the whole chain declined, so surface that to the user rather than retrying the identical prompt. Ship this in new Fable 5 code from day one; it costs nothing on the happy path and rescues the false positives.
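The opt-in and the stop_reason-first check described above can be sketched as two small helpers. This assumes the response shape used in the quickstart (a `stop_reason` string and a `content` list of blocks carrying `text`); the names `with_fallbacks` and `interpret_result` and the outcome labels are hypothetical.

```python
FALLBACK_BETA = "server-side-fallback-2026-06-01"


def with_fallbacks(body, fallback_model="claude-opus-4-8"):
    """Return a copy of the payload with server-side fallbacks opted in."""
    out = dict(body)
    betas = list(out.get("betas", []))
    if FALLBACK_BETA not in betas:
        betas.append(FALLBACK_BETA)
    out["betas"] = betas
    out.setdefault("fallbacks", [{"model": fallback_model}])
    return out


def interpret_result(resp):
    """Return (outcome, text), checking stop_reason before touching content."""
    content = resp.get("content") or []
    if resp.get("stop_reason") == "refusal":
        if not content:
            return ("refused_empty", None)    # declined before any output, not billed
        return ("refused_partial", None)      # billed partial output, discarded
    text = "".join(block["text"] for block in content if "text" in block)
    return ("ok", text)


if __name__ == "__main__":
    print(with_fallbacks({"prompt": "Audit this firewall config."}))
    print(interpret_result({"stop_reason": "refusal", "content": []}))
```

Either refusal outcome after the fallback chain has run means the whole chain declined, so the caller should surface it instead of resubmitting the same prompt.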
What to Build With It: Reasoning, Writing, Code, and Vision
- Complex reasoning: multi-step logic, deep analysis, and research synthesis. Run it through the async /claude-fable-5 endpoint with a focused system_prompt and generous timeouts.
- Creative writing: long-form stories, scripts, and essays with nuanced style control. Stream from /claude-fable-5/stream so users see prose as it lands.
- Coding and engineering: write, review, debug, and refactor across any language. Put the spec in prompt and engineering conventions in system_prompt.
- Multimodal analysis: pass a diagram, screenshot, or document via image_url alongside a text prompt to extract structured insight from the image.
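
One way to encode the four use cases above is a small request builder that picks the endpoint path and assembles the body. This is a sketch only: `build_request` and the use-case labels are hypothetical, and sending coding and multimodal requests to the async endpoint is an assumption, since the list only names an endpoint for reasoning and creative writing.

```python
ASYNC_PATH = "/claude-fable-5"
STREAM_PATH = "/claude-fable-5/stream"
USE_CASES = {"reasoning", "writing", "coding", "vision"}


def build_request(use_case, prompt, system_prompt=None, image_url=None):
    """Return (endpoint_path, body) for one of the use cases listed above."""
    if use_case not in USE_CASES:
        raise ValueError(f"unknown use case: {use_case!r}")
    if use_case == "vision" and not image_url:
        raise ValueError("vision requests need an image_url")

    # Creative writing streams so prose appears as it lands; the rest poll (assumption).
    path = STREAM_PATH if use_case == "writing" else ASYNC_PATH

    body = {"prompt": prompt}
    if system_prompt:
        body["system_prompt"] = system_prompt
    if image_url:
        body["image_url"] = image_url
    return path, body


if __name__ == "__main__":
    print(build_request(
        "coding",
        "Refactor this function to remove the global state.",
        system_prompt="Follow PEP 8 and keep public signatures stable.",
    ))
```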
Conclusion & Next Steps
Three things carry the integration: Muapi's lower price ($8/$40 per million tokens, against the $15/$75 in its own comparison table), the safe stop_reason-first calling pattern that never blind-indexes content[0], and the always-on adaptive thinking surface (omit the param or pass {"type": "adaptive"}, never budget_tokens). Add server-side fallbacks by default, confirm your org meets 30-day retention, and prefer streaming for anything that might run long. Then grab a key and run the quickstart on the Muapi playground for claude-fable-5.
FAQ
Why does my claude-fable-5 request return a 400 when I pass thinking with budget_tokens, even though it worked on Opus 4.7?
budget_tokens is fully removed on Fable 5. Thinking is always on, and the only accepted configuration is adaptive — either omit the thinking parameter entirely or pass {"type": "adaptive"}. Both {"type": "enabled", "budget_tokens": N} and {"type": "disabled"} return a 400. There is no token-budget knob; control reasoning depth with effort instead. Delete the budget_tokens plumbing and the request will succeed.
How do I handle stop_reason "refusal" when the content array is empty — was I billed, and should I retry?
A refusal comes back as a successful HTTP 200 with stop_reason: "refusal". If the classifier fired before any output, the content array is empty and you were not billed (no input or output tokens). If it fired mid-stream, you have partial output that was billed — discard it rather than treating it as complete. Always check stop_reason before reading content[0]. Don't retry the identical prompt; instead enable server-side fallbacks (betas: ["server-side-fallback-2026-06-01"] + fallbacks: [{"model": "claude-opus-4-8"}]) so a false-positive decline is re-served by Opus 4.8 in the same call. A final stop_reason: "refusal" after the chain means the whole chain declined.
Why do all my Fable 5 requests fail with a 400 invalid_request_error when my org has zero data retention configured?
Fable 5 requires 30-day data retention and is not available under zero data retention. If your org's retention configuration is ZDR — or anything below 30 days — every request returns 400 invalid_request_error no matter how valid the payload is. The fix is on the org's retention configuration, not the request body. Once the org meets the 30-day requirement, the same payloads go through unchanged.





