muapi/moderate-text

Text to Text

Classify any text snippet across OpenAI's standard moderation categories (sexual, hate, harassment, self-harm, violence, and more). Returns a boolean flag plus per-category booleans and confidence scores — a drop-in safety gate for chat inputs, generated text, and free-form prompts.

Overview

About this model

Classify any text snippet across the standard policy categories — sexual, hate, harassment, self-harm, violence, and more — with confidence scores per category. A drop-in safety gate for chat inputs, generated text, and free-form prompts.

  1. Chat applications: Filter user messages before forwarding to an LLM.
  2. Prompt safety: Pre-check prompts before billing a generation request.
  3. Content review: Score open-text submissions for downstream moderation queues.

Pricing & Value

Cost analysis

muapiapp: $0.001 per call

Flat fee per text moderation call. No volume tiers.

Fal.ai: Not available

No equivalent first-party text-moderation model.

Replicate: Not available

No equivalent first-party text-moderation model.

* Competitor pricing is estimated based on similar model architectures and usage tiers.

Technical Details

Configuration schema

Text (string)

Text to moderate. Truncated to 8000 characters before evaluation.

Default Value: A beautiful sunset over the mountains.

Implementation Guide

Developer documentation

How to Use Text Moderation

  1. Submit the text: POST text (truncated to 8000 chars before evaluation). No other fields required.

  2. Read the verdict: Moderation typically finishes within a second, so you can poll /api/v1/predictions/{request_id}/result immediately after submitting and expect the result to be ready on the first or second attempt.

  3. Inspect categories and scores: output.flagged is the binary gate. output.categories is a dict of {category: boolean} showing which policies were tripped, and output.category_scores gives a confidence score per category (0–1).
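
The three steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the official client. Only the result path and the `output.flagged`, `output.categories`, and `output.category_scores` fields come from this guide. The host, the submit path, the `x-api-key` header, the `text` field name, and the `request_id` response field are hypothetical placeholders; check them against the muapi API reference before use. The HTTP transport is passed in as a callable so the flow can be exercised without network access.

```python
import json
import time
import urllib.request

# Placeholders: replace with the values from the muapi API reference.
BASE_URL = "https://api.example.invalid"   # hypothetical host
SUBMIT_PATH = "/api/v1/moderate-text"      # hypothetical submit path
RESULT_PATH = "/api/v1/predictions/{request_id}/result"  # path given in the guide


def http_json(method, url, payload=None, api_key=""):
    """Send a JSON request with urllib and decode the JSON response."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    # The header name "x-api-key" is an assumption.
    headers = {"Content-Type": "application/json", "x-api-key": api_key}
    req = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def moderate(text, send=http_json, attempts=5, delay=0.5):
    """Submit text, poll the result endpoint, and return the `output` dict."""
    submitted = send("POST", BASE_URL + SUBMIT_PATH, {"text": text})
    request_id = submitted["request_id"]  # response field name is an assumption
    url = BASE_URL + RESULT_PATH.format(request_id=request_id)
    for _ in range(attempts):
        result = send("GET", url)
        output = result.get("output")
        if output is not None:
            return output
        time.sleep(delay)  # result not ready yet; wait and poll again
    raise TimeoutError("moderation result not ready after polling")


def tripped_categories(output):
    """Return the sorted names of categories whose boolean is true."""
    return sorted(name for name, hit in output["categories"].items() if hit)
```

To send a real key, wrap the transport, for example `moderate(text, send=functools.partial(http_json, api_key=KEY))`. A caller would then block the message when `output["flagged"]` is true and log `tripped_categories(output)` for review.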

Common Questions

Frequently asked

How long can the text be?

Up to 8000 characters; anything longer is truncated before evaluation.
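
Because text past the limit is dropped before evaluation, longer inputs can be split client-side so that every part is checked. The helper below is a minimal sketch, not part of the muapi API: the function name is hypothetical, and it assumes the service counts characters the way Python's `len` does, which may not hold exactly.

```python
MAX_CHARS = 8000  # limit stated in the FAQ


def split_for_moderation(text, limit=MAX_CHARS):
    """Split text into consecutive pieces of at most `limit` characters."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Step through the string in fixed-size windows; empty input yields [].
    return [text[i:i + limit] for i in range(0, len(text), limit)]
```

Each piece is then moderated as its own call, and the whole input is treated as flagged if any piece is flagged. Hard cuts can split a phrase across two pieces, so overlapping the windows slightly may be worth considering for borderline content.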

Which languages are supported?

The moderator handles a broad set of languages, but English yields the most reliable category scores.

What are the categories?

Sexual, sexual/minors, hate, hate/threatening, harassment, harassment/threatening, self-harm and its sub-types, violence, and violence/graphic.
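
If the binary flag is too coarse for a given use case, the per-category scores can be compared against thresholds of your own. The sketch below is illustrative only: the function name is hypothetical, the exact key strings returned by the API (for example `self-harm` or `violence/graphic`) are assumed to match the category names listed above, and the threshold values are arbitrary examples rather than recommendations.

```python
def over_threshold(category_scores, thresholds, default=0.5):
    """Return {category: score} for scores at or above their threshold.

    Categories without an explicit threshold fall back to `default`.
    """
    return {
        name: score
        for name, score in category_scores.items()
        if score >= thresholds.get(name, default)
    }
```

A stricter threshold for a sensitive category and a looser one elsewhere could be expressed as `over_threshold(output["category_scores"], {"self-harm": 0.05, "violence": 0.7})`.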