Now in beta · Claude Sonnet 4.6 supported

Save up to 90% on
LLM API costs.

promptzip automatically injects prompt caching into every request. One line of code. No changes to your app. You only pay when you save.

One line of code

your_app.py
# Before
import openai
client = openai.OpenAI(api_key="sk-...")

# After — one change, that's it
from promptzip import optimize
client = optimize(openai.OpenAI(api_key="pz-..."), proxy_url="https://api.promptzip.io")

# Everything else stays the same
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": system_prompt},  # ← auto-cached
        {"role": "user",   "content": user_message},
    ]
)

print(client.savings_report())
# {"cached_tokens": 23520, "saved_percent": 99.4, "target_reached": true}

How it works

01

Install & connect

Run pip install promptzip, then wrap your client. Takes about 60 seconds.

client = optimize(openai.OpenAI(api_key="pz-..."))
02

We inject caching

Our proxy automatically adds Anthropic's cache_control marker to every system prompt. No Anthropic expertise needed.

# proxy injects cache_control
# on every request ✓
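As a rough illustration of what this injection could look like for an Anthropic Messages API request, the sketch below rewrites a request body so the system prompt carries a cache breakpoint. The helper name inject_cache_control and the model id are hypothetical, and this is not promptzip's actual proxy code; only the cache_control field and the block form of system come from Anthropic's API.

```python
import copy


def inject_cache_control(body: dict) -> dict:
    """Return a copy of an Anthropic Messages request body whose system
    prompt is marked cacheable. Hypothetical helper, for illustration only."""
    out = copy.deepcopy(body)
    system = out.get("system")
    if not system:
        return out  # no system prompt, nothing to cache

    # The API accepts `system` as a plain string or as a list of content
    # blocks; normalize to the list form so cache_control can be attached.
    if isinstance(system, str):
        system = [{"type": "text", "text": system}]

    # Mark the last system block: the cached prefix covers everything up to
    # and including the block that carries cache_control.
    system[-1].setdefault("cache_control", {"type": "ephemeral"})
    out["system"] = system
    return out


request = {
    "model": "claude-sonnet-4-6",  # placeholder model id (assumption)
    "max_tokens": 256,
    "system": "You are a support agent for ACME...",
    "messages": [{"role": "user", "content": "Where is my order?"}],
}
patched = inject_cache_control(request)
print(patched["system"])
```

The original request is left untouched, so the same transformation can be applied per request without side effects. Prompts shorter than the provider's minimum cacheable length would still be billed at the normal rate.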
03

Watch savings grow

Real-time dashboard shows tokens saved, money saved, cache hit rate.

99.4% cached
$63/day saved

Real benchmark results

Claude Sonnet 4.6 · 10 requests · 2,365-token system prompt

Without promptzip

0%

cached tokens

$70.9/day

at 10k req/day

With promptzip

99.4%

cached tokens

$7.1/day

at 10k req/day

That's about $1,890/month saved ($63/day over 30 days) for a typical customer support bot.
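The daily figures above can be approximated with a short calculation. This is a sketch under stated assumptions: the per-token prices are not given on this page, so the code assumes $3.00 per million input tokens and cached reads billed at one tenth of that, and it counts only the system-prompt tokens.

```python
REQUESTS_PER_DAY = 10_000
SYSTEM_PROMPT_TOKENS = 2_365

# Assumed prices in USD per million input tokens (not stated on this page).
BASE_INPUT_PRICE = 3.00
CACHED_READ_PRICE = 0.30


def daily_cost(requests: int, tokens: int, price_per_mtok: float) -> float:
    """Daily cost in USD of sending `tokens` input tokens on every request."""
    return requests * tokens / 1_000_000 * price_per_mtok


without_cache = daily_cost(REQUESTS_PER_DAY, SYSTEM_PROMPT_TOKENS, BASE_INPUT_PRICE)
with_cache = daily_cost(REQUESTS_PER_DAY, SYSTEM_PROMPT_TOKENS, CACHED_READ_PRICE)
saved_per_day = without_cache - with_cache

print(f"without caching: ${without_cache:.2f}/day")
print(f"with caching:    ${with_cache:.2f}/day")
print(f"cost reduction:  {saved_per_day / without_cache:.0%}")
```

Under these assumptions the model lands close to the two daily figures shown above. It ignores cache-write surcharges, user-message tokens, and output tokens, so real bills will differ.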

Pricing that makes sense

You only pay when we save you money. No savings, no charge.

20%

of what we save you

You save $1,000/mo → you pay $200
You save $0 → you pay $0
Tracked automatically, invoiced monthly
Works with OpenAI, Anthropic, Gemini
Start saving for free →

No credit card required to start
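The pricing rule above reduces to a one-line formula. The sketch below is illustrative only: the function name monthly_fee is hypothetical, and clamping negative savings to zero is an assumption based on "No savings, no charge".

```python
FEE_RATE = 0.20  # 20% of measured savings, per the pricing above


def monthly_fee(savings_usd: float) -> float:
    """Fee owed for a month. Savings at or below zero produce no charge
    (assumption drawn from the 'No savings, no charge' statement)."""
    return max(savings_usd, 0.0) * FEE_RATE


for savings in (1_000, 0):
    print(f"save ${savings}/mo -> pay ${monthly_fee(savings):.2f}")
```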