Product · 01 / Machine learning

Promptomize.

A supervised rewriter that lifts LLM accuracy by an average of 35% on our benchmark suite.


The problem

Most teams using an LLM in production are leaving accuracy on the table because the prompts they ship were written by a human, once, in a rush. The model is capable of more — but the prompt is the ceiling. Existing tooling helps you track prompts. Almost nothing helps you improve them.

The product

Promptomize is a supervised model trained on 18,000 (input, weak prompt, strong prompt, eval-score) tuples. Given a frozen base prompt and a small held-out evaluation set, it rewrites the prompt and reports the projected lift before you ever push it to production.
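One such training tuple might look like the sketch below. The field names and example values are illustrative only, not Promptomize's actual schema:

```python
from dataclasses import dataclass

# Hypothetical shape of one (input, weak prompt, strong prompt,
# eval-score) training tuple. Field names are illustrative.
@dataclass(frozen=True)
class RewriteTuple:
    input: str          # task input fed to the LLM
    weak_prompt: str    # the original human-written prompt
    strong_prompt: str  # the improved rewrite
    eval_score: float   # score the strong prompt earned on the eval set

example = RewriteTuple(
    input="Invoice #4821: Widgets x3 @ $19.00, shipping $5.00 ...",
    weak_prompt="Get the total.",
    strong_prompt=(
        "You are a financial analyst. Return only the invoice total "
        "as a decimal number, with no currency symbol."
    ),
    eval_score=0.91,
)
```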

How it works

  • Ingestion. You provide a base prompt and 20–200 evaluation examples (input + golden output).
  • Rewriting. The Promptomize model produces 8 candidate rewrites with diverse strategies — chain-of-thought, role priming, format scaffolding, decomposition.
  • Evaluation. Each candidate is scored against your held-out set on the target backend (GPT, Claude, Gemini, open-weight).
  • Selection. The best variant is returned with a confidence interval and a delta against your baseline.

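The four steps above can be sketched end to end. This is a minimal sketch, not Promptomize's implementation: the backend call is stubbed with a seeded random choice, and every function name here is an assumption. A real runner would call the target LLM and compare against your golden outputs:

```python
import hashlib
import math
import random

def run_backend(prompt: str, task_input: str) -> str:
    """Stub for a call to the target backend (GPT, Claude, Gemini, ...).
    Seeded from the inputs so the sketch is reproducible."""
    digest = hashlib.sha256((prompt + "\x00" + task_input).encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return rng.choice(["right", "wrong"])

def per_example_scores(prompt: str, eval_set: list[tuple[str, str]]) -> list[float]:
    # Evaluation: 1.0 where the backend's answer matches the golden output.
    return [1.0 if run_backend(prompt, x) == gold else 0.0 for x, gold in eval_set]

def select_best(base_prompt: str, candidates: list[str],
                eval_set: list[tuple[str, str]]) -> dict:
    """Score every candidate rewrite on the held-out set, then return
    the best one with a 95% CI and a delta against the baseline."""
    baseline = sum(per_example_scores(base_prompt, eval_set)) / len(eval_set)
    scored = []
    for cand in candidates:
        s = per_example_scores(cand, eval_set)
        mean = sum(s) / len(s)
        se = math.sqrt(mean * (1 - mean) / len(s))  # normal approximation
        scored.append((mean, (mean - 1.96 * se, mean + 1.96 * se), cand))
    best_mean, ci95, best = max(scored)
    return {"prompt": best, "accuracy": best_mean,
            "ci95": ci95, "delta_vs_baseline": best_mean - baseline}
```

With 20-200 evaluation examples the normal-approximation interval is crude; a bootstrap over the per-example scores is a common tightening.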
Results

Across our internal benchmark suite — 14 tasks spanning reasoning, code generation, classification, and structured extraction — Promptomize lifts accuracy by an average of 35% versus the human baseline. The largest wins are on structured-extraction tasks (+58% F1 on financial-table extraction). The smallest wins are on creative writing, where evaluation is noisier.

Stack

  • Training: PyTorch, Lightning, DeepSpeed on 8×A100.
  • Backend: Modal for GPU rewrite endpoints, Postgres for tuples & evals, Redis for the rewrite cache.
  • Frontend: Next.js 15 App Router, React Server Components, Tailwind, shadcn/ui.
  • Eval harness: a custom multi-backend runner with deterministic sampling and per-task scoring.
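As a sketch of what "deterministic sampling and per-task scoring" can mean in such a harness (the registry, scorer functions, and parameter names below are assumptions, not the actual implementation):

```python
from typing import Callable

# Per-task scorers: each task type gets its own scoring rule.
# These two are illustrative stand-ins.
SCORERS: dict[str, Callable[[str, str], float]] = {
    "classification": lambda pred, gold: float(pred.strip() == gold.strip()),
    "extraction": lambda pred, gold: float(gold.lower() in pred.lower()),
}

def run_eval(task: str, outputs: list[str], golds: list[str]) -> float:
    """Mean per-task score of model outputs against golden outputs."""
    scorer = SCORERS[task]
    return sum(scorer(p, g) for p, g in zip(outputs, golds)) / len(golds)

# Deterministic sampling: pin decoding so repeated runs score the
# same transcript (exact parameter names vary by backend).
DECODE_PARAMS = {"temperature": 0.0, "seed": 1234}
```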

Status

Promptomize is in private beta with 14 paying teams. The hosted product is closed-loop; a self-hosted CLI is on the way for teams that need to keep prompts and evals on-prem. Expected GA: Q3 2026.
