Save up to 99% on
LLM API costs.

promptzip automatically injects prompt caching into every request. One line of code. No changes to your app. You only pay when you save.

Get started free →

One line of code

your_app.py

# Before
import openai
client = openai.OpenAI(api_key="sk-...")

# After — one change, that's it
from promptzip import optimize
client = optimize(openai.OpenAI(api_key="pz-..."), proxy_url="https://api.promptzip.io")

# Everything else stays the same
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": system_prompt},  # ← auto-cached
        {"role": "user",   "content": user_message},
    ]
)

print(client.savings_report())
# {"cached_tokens": 23520, "saved_percent": 99.4, "target_reached": true}

How it works

Install & connect

Run pip install promptzip, then wrap your client. It takes about 60 seconds.

client = optimize(openai.OpenAI(api_key="pz-..."))

We inject caching

Our proxy automatically adds cache_control to every system prompt. No Anthropic expertise needed.

// proxy injects cache_control
// on every request ✓
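For readers curious what "adding cache_control" means at the request level, here is a minimal sketch. It assumes an Anthropic Messages API style payload, where the system prompt may be a string or a list of text blocks and a block is marked cacheable with cache_control set to {"type": "ephemeral"}. The helper name inject_cache_control and the sample payload are hypothetical illustrations, not promptzip's actual proxy code.

```python
import copy


def inject_cache_control(payload: dict) -> dict:
    """Return a copy of a Messages-style payload with the system prompt marked cacheable.

    Hypothetical helper for illustration only; not promptzip's implementation.
    """
    out = copy.deepcopy(payload)  # never mutate the caller's request
    system = out.get("system")
    if not system:
        return out  # no system prompt, nothing to cache

    # The system prompt can arrive as a plain string; normalize it to a block list.
    if isinstance(system, str):
        system = [{"type": "text", "text": system}]

    # Marking the last system block sets the end of the cacheable prefix.
    system[-1]["cache_control"] = {"type": "ephemeral"}
    out["system"] = system
    return out


request = {
    "model": "<model-id>",  # placeholder, not a real model name
    "system": "You are a helpful support agent.",
    "messages": [{"role": "user", "content": "Where is my order?"}],
}
cached_request = inject_cache_control(request)
print(cached_request["system"])
```

The original request is left unchanged, so the same wrapper can be applied to every outgoing call without side effects.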

Watch savings grow

Real-time dashboard shows tokens saved, money saved, cache hit rate.

99.4% cached
$63/day saved

Real benchmark results

Claude Sonnet 4.6 · 10 requests · 2,365-token system prompt

Without promptzip

cached tokens

$70.9/day

at 10k req/day

With promptzip

99.4%

cached tokens

$7.1/day

at 10k req/day

That's $1,890/month saved ($63/day over 30 days) for a typical customer support bot.
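The daily figures above can be roughly reproduced with a short calculation. This is a sketch under stated assumptions: the two per-token rates below ($3.00 per million uncached input tokens, $0.30 per million cached input tokens) are assumed for illustration and should be checked against your provider's current price list. It counts only system prompt tokens and ignores cache writes, user messages, and output tokens.

```python
REQUESTS_PER_DAY = 10_000
SYSTEM_PROMPT_TOKENS = 2_365
CACHED_FRACTION = 0.994  # share of prompt tokens served from cache

# Assumed rates in USD per million input tokens (not taken from this page).
INPUT_RATE = 3.00
CACHE_READ_RATE = 0.30


def daily_cost(cached_fraction: float) -> float:
    """Daily system-prompt cost in USD for a given cached share."""
    tokens = REQUESTS_PER_DAY * SYSTEM_PROMPT_TOKENS
    cached = tokens * cached_fraction
    uncached = tokens - cached
    return (uncached * INPUT_RATE + cached * CACHE_READ_RATE) / 1_000_000


baseline = daily_cost(0.0)
optimized = daily_cost(CACHED_FRACTION)
print(f"without caching: ${baseline:.2f}/day")
print(f"with caching:    ${optimized:.2f}/day")
print(f"saved per month: ${(baseline - optimized) * 30:.0f}")
```

Under these assumptions the baseline comes to about $70.95/day and the cached case to about $7.48/day, close to the benchmark figures. The small gap arises because the sketch bills the uncached 0.6% at the full input rate.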

Pricing that makes sense

You only pay when we save you money. No savings, no charge.

20%

of what we save you

✓ You save $1,000/mo → you pay $200

✓ You save $0 → you pay $0

✓ Tracked automatically, invoiced monthly

✓ Works with OpenAI, Anthropic, Gemini
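The pricing rule above reduces to one line of arithmetic. The sketch below is illustrative only; the function name monthly_fee and the clamping of negative savings to zero are assumptions, not a description of how promptzip computes invoices.

```python
def monthly_fee(savings_usd: float, rate: float = 0.20) -> float:
    """Fee as a share of measured savings; zero when nothing was saved."""
    return max(savings_usd, 0.0) * rate


for savings in (1_000.0, 0.0):
    fee = monthly_fee(savings)
    print(f"saved ${savings:,.0f} -> fee ${fee:,.0f}, you keep ${savings - fee:,.0f}")
```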

Start saving for free →

No credit card required to start
