Getting Started
Installation
Choose any of the four install methods below. The recommended approach is the MCP config, which requires zero global installs.
1. MCP Config (recommended)
Add the following to your .mcp.json or claude_desktop_config.json:
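A minimal server entry might look like the following (the command and args are illustrative; adjust them to match the package's published binary):

```json
{
  "mcpServers": {
    "prompt-control-plane": {
      "command": "npx",
      "args": ["-y", "claude-prompt-optimizer-mcp"]
    }
  }
}
```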
2. npx (no install)
3. npm global
4. curl
Your First Optimization
Once installed, use the optimize_prompt tool in any MCP-compatible client (Claude Code, Cursor, Windsurf). Here is a step-by-step walkthrough:
1. Send a raw prompt to optimize_prompt:
2. Review the PreviewPack response. It contains:
   - quality_before — A multi-dimensional score from 0–100 with traceable deductions
   - compiled_prompt — The structured, optimized prompt formatted for your target LLM
   - blocking_questions — Ambiguities that must be resolved before approval
   - cost_estimate — Token counts and costs across 11 models from 4 providers
   - model_recommendation — The recommended model for this task
3. Answer blocking questions (if any) using refine_prompt:
4. Approve the final result:
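As a sketch, the step 1 tool call might look like this (the envelope shape depends on your MCP client; argument names are illustrative):

```json
{
  "tool": "optimize_prompt",
  "arguments": {
    "prompt": "Write a blog post about our new feature",
    "target": "claude"
  }
}
```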
Key principle: All analysis is deterministic. The same input always produces the same output. Zero LLM calls happen inside the MCP — all intelligence comes from the host model (Claude, GPT, etc.).
Architecture
Prompt Control Plane processes your prompts before they reach any LLM. It is a pre-LLM quality gate — every prompt is scored, analyzed, compiled, and policy-checked before execution. Zero LLM calls inside.
The pipeline analyzes intent, scores quality, detects risks, compiles to your target format, estimates cost across providers, and routes to the optimal model. The result is packaged into a PreviewPack for human review — nothing reaches the LLM until the user approves.
What’s in a PreviewPack?
Every call to optimize_prompt, refine_prompt, or pre_flight returns a PreviewPack — a complete analysis bundle for human review before anything is sent to an LLM.
| Field | Type | Description | Tier |
|---|---|---|---|
| request_id | string | Unique ID for traceability | All |
| session_id | string | Session identifier for iterative refinement | All |
| quality_before | QualityScore | Multi-dimensional quality score (0–100) with traceable deductions | All |
| compiled_prompt | string | The optimized prompt in target format (Claude XML, OpenAI, or Markdown) | All |
| intent_spec | IntentSpec | Decomposed intent: task type, audience, tone, constraints, definition of done | All |
| compilation_checklist | object | 9-item structural coverage checklist (role, goal, constraints, etc.) | All |
| blocking_questions | Question[] | Questions that must be resolved before the prompt is ready | All |
| assumptions | Assumption[] | Assumptions made during compilation (review & override) | All |
| cost_estimate | CostEstimate | Token counts + cost across 11 models from 4 providers | All |
| model_recommendation | string | Optimal model based on task complexity, risk, and budget | All |
| changes_made | string[] | List of modifications applied during compilation | All |
| target | OutputTarget | claude (XML), openai (system/user), or generic (Markdown) | All |
| storage_health | "ok" \| "degraded" | Storage status — pipeline continues in degraded mode with fail-open semantics | All |
Enterprise-only pipeline stages that influence the PreviewPack:
- Policy Gate — can block or warn before compilation begins (Enterprise only)
- Custom Rules — additional rules in steps 3–4 affect risk score and blocking questions (Enterprise only)
- Audit Log — every PreviewPack approval is written to a tamper-evident JSONL log (Enterprise only)
Verification
Prompt Control Plane is designed to be trustworthy by default. Here’s what that means in practice.
| Guarantee | What It Means For You |
|---|---|
| Reproducible results | Run the same prompt twice — get the same score, same routing, same cost estimate. No randomness, ever. |
| Accurate cost estimates | Pricing verified against live provider rates for Anthropic, OpenAI, Google, and Perplexity. Updated regularly. |
| Reliable quality scores | Multi-dimensional scoring catches vagueness, missing constraints, scope creep, and hallucination risk — before the prompt reaches an LLM. |
| Safe compression | Guaranteed to never increase token count. Code blocks, tables, and structured content are always preserved. |
| Stable contracts | Output shapes are locked and versioned. Integrations won’t break across updates. |
| Fully offline | Zero external calls. Your prompts never leave your machine. No API keys required for analysis. |
| Tamper-evident audit | Cryptographically chained log of every decision. If any entry is modified, the chain breaks. (Enterprise) |
Extensively tested. Run the suite yourself to verify — completes in under 5 seconds.
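Assuming a standard npm setup, the suite can be run from the package source directory (commands are illustrative):

```shell
npm install
npm test
```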
Tool Reference
Prompt Control Plane exposes 19 tools via MCP. Three are metered (count against your plan); the rest are free and unlimited. Every response includes a request_id for traceability.
All 19 Tools at a Glance
| Tool | Category | Cost | Plan |
|---|---|---|---|
| optimize_prompt | Analysis | Metered | All Plans |
| check_prompt | Analysis | Free | All Plans |
| classify_task | Analysis | Free | All Plans |
| route_model | Routing | Free | All Plans |
| pre_flight | Routing | Metered | All Plans |
| refine_prompt | Refinement | Metered | All Plans |
| approve_prompt | Refinement | Free | All Plans |
| estimate_cost | Cost | Free | All Plans |
| compress_context | Cost | Free | All Plans |
| prune_tools | Cost | Free | All Plans |
| configure_optimizer | Config | Free | All Plans* |
| get_usage | Config | Free | All Plans |
| prompt_stats | Config | Free | All Plans |
| set_license | License | Free | All Plans |
| license_status | License | Free | All Plans |
| list_sessions | Sessions | Free | Enterprise |
| export_session | Sessions | Free | Enterprise |
| delete_session | Sessions | Free | Enterprise |
| purge_sessions | Sessions | Free | Enterprise |
*configure_optimizer: Enterprise settings (policy_mode, audit_log, config lock, session_retention_days) require Enterprise tier.
The main entry point. Analyzes a raw prompt, detects ambiguities, compiles an optimized version, scores quality, and estimates cost across providers. Returns a PreviewPack for review.
PreviewPack — Contains session_id, quality_before (0–100), compiled_prompt, blocking_questions, assumptions, cost_estimate, model_recommendation, compilation_checklist, and changes_made.
Quick pass/fail check of a prompt. Returns a quality score, the top issues detected, and a suggestion for improvement. No compilation, no session. Ideal for CI gates or lightweight checks.
Quality score (0–100), pass/fail verdict based on configured strictness threshold, top 2 issues, and an improvement suggestion.
Classifies a prompt by task type (13 types), reasoning complexity (6 levels), dimensional risk score, and suggests an optimization profile. Useful for understanding a prompt before committing to optimization.
taskType, complexity (simple_factual / analytical / multi_step / creative / long_context / agent_orchestration), riskScore (0–100), riskDimensions (underspec, hallucination, scope, constraint), suggestedProfile, and signals.
Routes to the optimal model based on task complexity, risk, budget, and latency preferences. Returns a full recommendation with a decision_path audit trail showing every routing decision. Two-step routing: (1) complexity + risk determine the default tier, (2) budget/latency overrides produce the final tier.
ModelRecommendation — Contains primary (model, provider, temperature, maxTokens), fallback, confidence (0–100), costEstimate, rationale, tradeoffs, savings_vs_default, and decision_path array.
Full pre-flight analysis in a single call: classifies task, assesses risk, routes model, and scores quality. Returns a complete decision bundle. Counts as 1 metered use (does NOT call optimize_prompt internally).
Task classification, complexity result, risk score with dimensions, model recommendation with decision_path, quality score, cost estimate, and pre-flight deltas (estimated token savings from compression and pruning).
Refines a prompt by answering blocking questions or providing manual edits. Re-runs the full analysis pipeline and returns an updated PreviewPack.
Updated PreviewPack with re-scored quality, re-compiled prompt, and reduced blocking questions.
Sign-off gate. Approves the compiled prompt and returns the final optimized prompt ready for use. Fails if blocking questions remain. In Enterprise enforce mode, runs a policy enforcement check before allowing approval.
The final compiled prompt, quality score before compilation, and the compilation checklist summary.
Estimates token count and cost across all supported providers for any prompt text. No session needed. Covers 11 models from Anthropic, OpenAI, Google, and Perplexity.
CostEstimate — Contains input_tokens, estimated_output_tokens, per-model costs array (model, provider, input/output cost in USD), recommended_model, and recommendation_reason.
Pricing covers models across Anthropic, OpenAI, Google, and Perplexity. Updated regularly. See Models for supported models.
Compresses context (code, docs) by removing irrelevant sections while respecting protected zones (fenced code, tables, lists, JSON, YAML). Multi-stage pipeline intelligently removes boilerplate while preserving meaningful content.
compressed_context, removed_sections, original_tokens, compressed_tokens, tokens_saved, savings_percent, and mode.
Scores and ranks MCP tools by relevance to a task intent. In prune mode, marks the lowest-scoring tools for removal to save context tokens. Respects mention protection and the always-relevant tool set (search, read, write, edit, bash).
Ranked tool list with relevance_score (0–100), signals, and tokens_saved_estimate per tool. Plus pruned_count, pruned_tools, and total tokens_saved_estimate.
Configures optimizer behavior. Supports mode, threshold, strictness, default target, ephemeral mode, session limits, and Enterprise features (policy mode, audit log, config lock with passphrase protection).
The full OptimizerConfig object and list of applied_changes.
Returns current usage count, plan limits, remaining quota, tier information, and first/last used timestamps.
total_optimizations, limits (TierLimits), remaining (lifetime + monthly), tier, first_used_at, last_used_at.
Returns aggregated optimization statistics: total optimized, total approved, average quality score, top task types, blocking question frequency, and estimated cost savings in USD.
total_optimized, total_approved, average_score_before, task_type_counts, blocking_question_counts, estimated_cost_savings_usd.
Activates a Pro, Power, or Enterprise license key. Uses Ed25519 asymmetric signature verification (public key only, offline, zero dependencies). The license payload contains tier, issued_at, expires_at, and license_id — no PII.
status ("activated"), tier, expires_at, license_id, and limits for the activated tier.
Checks current license status, tier, and expiry. Re-validates expiry on every read. Returns purchase URLs if no license is active.
LicenseData (or null with purchase links if inactive): tier, issued_at, expires_at, license_id, valid, validation_error (if expired).
Lists all optimization sessions with metadata (no raw prompts). Returns newest-first, with task type, quality score, state, and prompt hash.
SessionListResponse — Array of SessionRecord (session_id, state, task_type, quality_before, prompt_hash, target) plus total_sessions and storage_path.
Exports full session details including raw prompt, compiled prompt, and reproducibility metadata. Auto-calculates rule_set_hash, rule_set_version, risk_score, and custom_rules_applied.
SessionExport — Contains raw_prompt, compiled_prompt, quality_before/after, rule_set_hash, rule_set_version, and metadata (target, task_type, complexity, risk_score, custom_rules_applied, engine_version, policy_mode, policy_hash).
Deletes a single optimization session by ID. Audit-logged in Enterprise tier.
deleted: true and session_id, or a not_found error.
Bulk-purges optimization sessions by age policy or deletes all. Safe-by-default: requires explicit parameters. Supports dry_run preview, keep_last protection, and config-based session_retention_days fallback.
PurgeResult — deleted_count, retained_count, scanned_count, deleted_session_ids (capped at 100), dry_run, cutoff_date, effective_older_than_days.
Enterprise Features
Enterprise tier unlocks governance features designed for regulated environments. All features are local-only, deterministic, and require no external services.
Policy Enforcement Enterprise
Switch from advisory mode (issues are reported but never block) to enforce mode, where BLOCKING rules gate every prompt operation.
Setup
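As a sketch, enforce mode might be enabled through configure_optimizer (the policy_mode parameter name is taken from the tool description; the exact call shape depends on your client):

```json
{
  "tool": "configure_optimizer",
  "arguments": {
    "policy_mode": "enforce"
  }
}
```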
Advisory vs Enforce
- advisory (default) — Rules fire and are reported in the PreviewPack, but no operation is blocked. Suitable for onboarding and iteration.
- enforce — BLOCKING rules prevent approve_prompt from succeeding. The response includes a policy_violations array describing each violation. Risk threshold gating also blocks high-risk prompts from approval.
When violations are found
In enforce mode, approve_prompt returns an error with policy_violations listing each triggered BLOCKING rule (rule_id, description, severity, risk_dimension). The user must resolve the issues via refine_prompt or adjust the prompt before re-attempting approval.
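A violation response might look roughly like this (field names follow the description above; the exact shape may differ):

```json
{
  "error": "policy_violations",
  "policy_violations": [
    {
      "rule_id": "no_unverified_claims",
      "description": "Prompt requests statistics without a source constraint",
      "severity": "BLOCKING",
      "risk_dimension": "hallucination"
    }
  ]
}
```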
Audit Logging Enterprise
Append-only, tamper-evident JSONL audit trail. Every significant action generates an entry with SHA-256 hash chaining.
Enable
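As a sketch, the audit trail might be enabled via configure_optimizer (the audit_log parameter name is taken from the tool description):

```json
{
  "tool": "configure_optimizer",
  "arguments": {
    "audit_log": true
  }
}
```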
JSONL format
Each line is a JSON object with these fields:
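As an illustration only, an entry might look like the following (field names other than prev_hash and integrity_hash are assumptions based on the privacy note in this section):

```json
{"timestamp":"2025-01-01T12:00:00Z","event":"approve_prompt","session_id":"abc123","risk_score":12,"outcome":"approved","prev_hash":"…","integrity_hash":"…"}
```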
Verify integrity
If any line in the JSONL file is deleted or modified, the hash chain breaks. To verify: iterate the log, recompute SHA-256(prev_hash + line) for each entry, and compare against the stored integrity_hash. A mismatch indicates tampering.
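A verification loop along those lines can be sketched in TypeScript (the field names, the GENESIS sentinel, and the choice to hash the entry serialized without its own integrity_hash are all assumptions; adapt them to the actual log format):

```typescript
import { createHash } from "node:crypto";

// Sketch of audit-chain verification. Field names (prev_hash, integrity_hash)
// and the "GENESIS" sentinel are assumptions, not the shipped format.
type AuditEntry = { prev_hash: string; integrity_hash: string } & Record<string, unknown>;

const sha256 = (s: string) => createHash("sha256").update(s).digest("hex");

// Serialize an entry without its own integrity_hash field.
function entryBody(entry: AuditEntry): string {
  const { integrity_hash, ...rest } = entry;
  return JSON.stringify(rest);
}

// Illustrative writer that mirrors the verifier: chain each new line
// to the previous entry's integrity hash.
function makeLine(fields: Record<string, unknown>, prevHash: string): string {
  const partial = { ...fields, prev_hash: prevHash };
  const integrity_hash = sha256(prevHash + JSON.stringify(partial));
  return JSON.stringify({ ...partial, integrity_hash });
}

// Walk the JSONL lines, recomputing each hash; any edit, deletion, or
// reordering breaks the chain and returns false.
function verifyChain(lines: string[]): boolean {
  let prevHash = "GENESIS";
  for (const line of lines) {
    const entry = JSON.parse(line) as AuditEntry;
    if (entry.prev_hash !== prevHash) return false;
    if (sha256(prevHash + entryBody(entry)) !== entry.integrity_hash) return false;
    prevHash = entry.integrity_hash;
  }
  return true;
}
```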
Privacy invariant: The audit log never stores raw_prompt, compiled_prompt, or prompt_preview content. Only metadata (event type, session ID, risk score, outcome) is recorded.
Config Lock Enterprise
Lock your governance configuration with a passphrase. Once locked, no one can change policy, strictness, or audit settings without the correct secret.
Lock
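As a sketch (parameter names here are guesses; only "config lock with passphrase protection" is stated in the tool reference):

```json
{
  "tool": "configure_optimizer",
  "arguments": {
    "config_lock": true,
    "passphrase": "your-secret-passphrase"
  }
}
```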
Unlock
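Unlocking would pass the same passphrase back (again, parameter names are illustrative guesses):

```json
{
  "tool": "configure_optimizer",
  "arguments": {
    "config_lock": false,
    "passphrase": "your-secret-passphrase"
  }
}
```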
The passphrase is stored as a SHA-256 hash only — never in plaintext. Every lock/unlock attempt (successful or blocked) is audit-logged if the audit trail is enabled.
Custom Rules Enterprise
Define organization-specific prompt governance rules that run alongside the built-in rule engine.
JSON Schema
Place your rules in ~/.prompt-control-plane/custom-rules.json:
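For example, a single hypothetical rule conforming to the field constraints in this section (the top-level "rules" key is an assumption):

```json
{
  "rules": [
    {
      "id": "no_absolute_dates",
      "description": "Flag prompts that hardcode absolute calendar dates",
      "pattern": "\\b20[0-9]{2}-[0-9]{2}-[0-9]{2}\\b",
      "applies_to": "all",
      "severity": "NON-BLOCKING",
      "risk_dimension": "hallucination",
      "risk_weight": 5
    }
  ]
}
```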
Rule fields
| Field | Type | Constraints |
|---|---|---|
| id | string | snake_case, max 64 chars, regex: ^[a-z][a-z0-9_]{0,63}$ |
| description | string | Max 200 chars |
| pattern | string | JavaScript regex, max 500 chars |
| negative_pattern | string | Optional exclusion regex, max 500 chars |
| applies_to | string | "code" \| "prose" \| "all" |
| severity | string | "BLOCKING" \| "NON-BLOCKING" |
| risk_dimension | string | "hallucination" \| "constraint" \| "underspec" \| "scope" |
| risk_weight | number | 1–25 |
Hard cap: 25 rules per configuration. Invalid regex patterns are skipped with a warning (skip-on-error pattern).
Validate custom rules
Session Retention Enterprise
Configure automatic session cleanup by age policy.
Set retention
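As a sketch, using the session_retention_days setting named in the tool reference:

```json
{
  "tool": "configure_optimizer",
  "arguments": {
    "session_retention_days": 30
  }
}
```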
Manual purge with dry_run
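A dry-run purge might look like this (dry_run and keep_last are named in the tool reference; older_than_days is inferred from the effective_older_than_days result field):

```json
{
  "tool": "purge_sessions",
  "arguments": {
    "older_than_days": 30,
    "keep_last": 10,
    "dry_run": true
  }
}
```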
Purge only touches session files. Config, audit log, and license data are never deleted.
CLI Linter (prompt-lint)
A standalone CLI for linting prompt files. Reuses the same scoring, rules, and analysis engine. No MCP dependency — works in any CI pipeline or terminal.
All Flags
| Flag | Description |
|---|---|
| --file <glob> | Glob pattern for prompt files to lint (e.g., "prompts/**/*.txt") |
| --json | Output results as JSON (for machine consumption) |
| --strict | Use strict threshold (75) |
| --relaxed | Use relaxed threshold (40) |
| --threshold N | Set a custom threshold (0–100). Overrides --strict/--relaxed. |
| --validate-custom-rules | Validate the custom rules JSON file and exit |
Exit Codes
| Code | Meaning |
|---|---|
| 0 | All files pass the quality threshold |
| 1 | One or more files fail the threshold |
| 2 | Error (invalid args, file not found, etc.) |
CI Usage Example
Reads from stdin if no --file is provided:
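Illustrative invocations, assuming the prompt-lint binary is exposed via npx (only the flags come from the table above):

```shell
# Lint all prompt files; a non-zero exit code fails the CI step
npx prompt-lint --file "prompts/**/*.txt" --strict

# Or lint a single prompt from stdin with a custom threshold
cat prompt.txt | npx prompt-lint --threshold 70
```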
GitHub Action
Run prompt linting as a GitHub Actions check. Fails the workflow if any prompt file scores below the threshold.
Workflow Example
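A workflow along these lines (the action reference is illustrative; substitute the published action path, and note that files and strictness are the documented inputs):

```yaml
name: prompt-lint
on: [pull_request]
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: OWNER/prompt-control-plane-action@v1  # illustrative path
        with:
          files: "prompts/**/*.md"
          strictness: strict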
Action Inputs
| Input | Required | Default | Description |
|---|---|---|---|
| files | Yes | — | Glob pattern for prompt files |
| threshold | No | — | Minimum score (0–100). Overrides strictness. |
| strictness | No | standard | Preset: standard (60), strict (75), relaxed (40) |
| version | No | tag ref | npm version to install |
Programmatic API
Import the optimization pipeline directly into your Node.js application. Pure, synchronous, deterministic. No I/O, no side effects, no MCP server required.
Quick Start
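A minimal sketch, assuming optimize() takes the raw prompt plus an options object (the option name target is an assumption; check the exported types for the real signature):

```typescript
import { optimize } from "claude-prompt-optimizer-mcp";

// Illustrative call; option names are assumptions, not the published API.
const result = optimize("Summarize this RFC for a non-technical audience", {
  target: "claude",
});

// Result field names follow the PreviewPack documented above.
console.log(result.quality_before);
console.log(result.compiled_prompt);
```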
optimize() Signature
OptimizeResult Shape
Available Exports
The barrel export at claude-prompt-optimizer-mcp provides access to all core functions:
| Category | What’s Available |
|---|---|
| Convenience | optimize() — one-call pipeline |
| Analysis | Intent parsing, task detection, complexity classification |
| Scoring | Quality scoring, structural checklist generation |
| Compilation | Multi-target prompt compilation, context compression |
| Cost | Token estimation, cost calculation, model routing |
| Risk | Rule evaluation, risk scoring, risk level derivation |
| Profiles | Optimization profile resolution and suggestions |
| Enterprise | Policy evaluation, audit logging, session management |
| Custom Rules | CustomRulesManager, customRules |
| Types | All TypeScript interfaces and types (see types.ts) |
Dual Entry Points
| Import | Effect |
|---|---|
| import { optimize } from 'claude-prompt-optimizer-mcp' | Pure API — no side effects, no server |
| import 'claude-prompt-optimizer-mcp/server' | Starts MCP stdio server (side-effect import) |
Environment Variables
| Variable | Default | Purpose |
|---|---|---|
| PROMPT_CONTROL_PLANE_PRO | unset | Set to true to enable pro tier via environment variable. Tier priority: license key (cryptographically verified) > env var > default free. |
| PROMPT_CONTROL_PLANE_LOG_LEVEL | info | Log verbosity: debug, info, warn, error |
| PROMPT_CONTROL_PLANE_LOG_PROMPTS | unset (false) | Set to true to enable raw prompt logging. Never enable in shared environments. |
Security: Prompt logging (PROMPT_CONTROL_PLANE_LOG_PROMPTS) writes raw prompt content to stderr. Never enable this in production, CI, or any shared environment where logs are aggregated.
Storage
All data is stored locally in ~/.prompt-control-plane/. No cloud services, no telemetry, no data leaves your machine.
Directory Layout
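The layout was not reproduced here; based on the features described above, it is roughly as follows (file names other than custom-rules.json are assumptions):

```text
~/.prompt-control-plane/
├── config.json          # optimizer configuration (assumed name)
├── custom-rules.json    # Enterprise custom rules
├── audit.jsonl          # tamper-evident audit log (assumed name)
├── license.json         # activated license data (assumed name)
└── sessions/            # one file per optimization session
```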
Storage Health
Every metered tool response includes a storage_health field: "ok" or "degraded". In degraded mode, the pipeline continues with fail-open semantics (no data loss, but usage may not persist to disk). This design ensures zero downtime even if the filesystem is temporarily unavailable.
Session Limits
| Setting | Default | Description |
|---|---|---|
| max_sessions | 200 | Maximum number of session files |
| max_session_size_kb | 50 | Maximum size per session file |
| max_session_dir_mb | 20 | Absolute cap on session directory size |
When limits are exceeded, the oldest sessions are automatically cleaned up.
Phase A / Phase B
The current release (Phase A) uses local file-based storage. All tool handlers use an async StorageInterface that abstracts the backend. Phase B will swap to Cloudflare Worker + Supabase without changing any tool handler code, test contracts, or interfaces.