
The Real Cost of Running Clawdbot (And How I Cut Mine by 70%)

9 min read

Last month I got my Anthropic API bill and nearly choked on my coffee. $83. For a personal assistant. That's more than my phone plan.

The thing is, I was being dumb about it. I had Clawdbot running on Claude Sonnet for everything, including stuff like "what time is it in Tokyo" and "remind me to call mom at 5." That's like taking an Uber Black to the mailbox.

So I spent a weekend figuring out how to actually optimize this. Got my bill down to $22 the following month. Same functionality, same experience. Here's everything I learned.

First, let's talk about what you're actually paying for

Clawdbot itself is free. It's built on OpenClaw, which is open source. You don't pay for the software. What you pay for is two things:

  1. Somewhere to run it (a server, your own computer, whatever)
  2. AI API calls (every time Clawdbot thinks, it costs a fraction of a cent)

The first part is easy to figure out. The second part is where people get surprised.

The hosting side: simpler than you think

You need something that stays on 24/7 so your bot is always available. Here are your real options:

A $5/month VPS – DigitalOcean, Vultr, Hetzner. This is what most people start with. It works. Clawdbot doesn't need much horsepower, so the cheapest tier is fine.

Your old laptop or Mac Mini – If you've got one sitting around, just use it. Cost: $0 (plus maybe $2/month in electricity). I wrote a whole article about the Mac Mini route if you're curious.

AWS/GCP free tier – Both offer a free year for new accounts. Decent option if you're just trying things out.

So hosting is somewhere between $0 and $6/month. Not the problem. The API bill is the problem.

Where the real money goes: API pricing explained

Every time you send Clawdbot a message, it sends your text to an AI model, gets a response, and sends it back. You pay per "token"; a token is roughly three-quarters of a word, so figure about 750 words per 1,000 tokens.

Here's what the major models charge per million tokens right now:

| Model | Input ($/M tokens) | Output ($/M tokens) | Vibe |
|-------|--------------------|---------------------|------|
| Claude Haiku 3 | $0.25 | $1.25 | Cheap and surprisingly good |
| Claude Haiku 4.5 | $1.00 | $5.00 | Fast, smart, affordable |
| Claude Sonnet 4.5 | $3.00 | $15.00 | The sweet spot for most people |
| Claude Opus 4.6 | $5.00 | $25.00 | The big brain, big price |
| GPT-4o | $2.50 | $10.00 | OpenAI's workhorse |
| GPT-4o-mini | $0.15 | $0.60 | Dirt cheap |
| Gemini 2.0 Flash | $0.10 | $0.40 | Google's budget beast |
| Gemini 2.5 Pro | $1.25 | $10.00 | Google's top tier |
| Ollama (local) | $0 | $0 | Free, runs on your hardware |

Those numbers look small until you realize a typical back-and-forth conversation uses 2,000–5,000 tokens. A longer task, like summarizing a document or writing an email draft, can hit 10,000–20,000 tokens easily.
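
Those rates are easy to sanity-check yourself. Here's a minimal sketch of the arithmetic, using prices from the table above; the per-exchange token counts are illustrative, not measured:

```python
# Rough per-exchange cost estimator. Prices are the per-million-token rates
# from the table above; the 2,000 in / 1,000 out split is just an example.

PRICES = {  # model -> (input $/M tokens, output $/M tokens)
    "claude-haiku-4.5": (1.00, 5.00),
    "claude-sonnet-4.5": (3.00, 15.00),
    "gemini-2.0-flash": (0.10, 0.40),
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one exchange: tokens divided by a million, times the rate."""
    in_price, out_price = PRICES[model]
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

# A typical 3,000-token exchange on each model:
for model in PRICES:
    print(f"{model}: ${cost_usd(model, 2_000, 1_000):.4f}")
```

Run that and you'll see why the model choice matters: the same exchange costs about 2 cents on Sonnet and a twentieth of a cent on Flash.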

My actual bills: a case study

Let me show you my real numbers from three months of use. I track this stuff because I'm a nerd.

Month 1: The "I don't care" phase – $83

I set up Clawdbot with Claude Sonnet 4.5 as the default model for everything. Used it maybe 15–20 times a day. Asked it to do everything from writing emails to answering trivia questions. Didn't set any token limits. Didn't think about it at all.

Breakdown:

  • ~2.1 million input tokens
  • ~850K output tokens
  • Hosting: $5 (DigitalOcean)
  • API: $78
  • Total: $83

Month 2: The "oh crap" adjustment – $41

After seeing that bill, I made two changes: switched simple queries to Haiku, and set a max token limit of 2,000 for casual conversations. Still used Sonnet for anything that needed real thinking: writing, analysis, coding help.

Breakdown:

  • ~1.8M input tokens (mix of Haiku and Sonnet)
  • ~600K output tokens
  • Hosting: $5
  • API: $36
  • Total: $41

Month 3: The optimized setup – $22

This is where I got serious. More on the specific tricks below, but the short version: I routed 80% of my requests through Haiku, batched similar tasks, shortened my system prompts, and started using Gemini Flash for throwaway questions.

Breakdown:

  • ~2.5M input tokens (mostly Haiku + Gemini Flash)
  • ~700K output tokens
  • Hosting: $5
  • API: $17
  • Total: $22

Same number of daily interactions. Actually more, because once it got cheaper I used it more freely. The experience barely changed.

The 7 tricks that actually saved me money

1. Use the right model for the job

This is the single biggest lever. Most of what you ask a personal assistant to do doesn't need a frontier model. Setting reminders, answering quick questions, simple lookups: Haiku handles all of this perfectly.

I keep Sonnet for things that actually need it: writing longer content, analyzing documents, complex reasoning. That's maybe 20% of my usage.

Think of it like cars. You don't drive a Ferrari to get groceries. A Honda gets you there just fine.

Savings: 50–60% of my API bill.
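
In code, the routing can be as dumb as a keyword check. Here's a hypothetical sketch; the model names and the keyword list are my assumptions, not anything built into OpenClaw:

```python
# Hypothetical keyword-based model router: cheap model by default, escalate
# only when the request looks like real work. Tune HEAVY_HINTS to your usage.

CHEAP_MODEL = "claude-haiku-4.5"
SMART_MODEL = "claude-sonnet-4.5"

HEAVY_HINTS = ("write", "draft", "analyze", "summarize", "review", "code", "debug")

def pick_model(message: str) -> str:
    text = message.lower()
    # Long messages or "do real work" verbs go to the smarter model.
    if any(hint in text for hint in HEAVY_HINTS) or len(text) > 500:
        return SMART_MODEL
    return CHEAP_MODEL

pick_model("what time is it in Tokyo?")       # returns the cheap model
pick_model("draft a reply to this email")     # returns the smart model
```

A classifier model could make smarter routing decisions, but for a personal bot a keyword list covers most of the win.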

2. Trim your system prompt

Here's something most people don't realize: your system prompt gets sent with every single message. If your system prompt is 500 tokens, and you send 50 messages a day, that's 25,000 tokens per day just on the system prompt. Over a month, that's 750,000 tokens you're paying for โ€” and they're all the same text, repeated.

I cut my system prompt from ~600 tokens down to ~150. Removed the fluff, kept the essentials. The bot behaved exactly the same.

Savings: 10–15% of total input tokens.
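
The math here is easy to check. A quick back-of-the-envelope calculation (50 messages a day on Sonnet's input rate; the usage numbers are illustrative):

```python
# Monthly cost of the system prompt alone, since it rides along with every
# single message you send.

def prompt_overhead_usd(prompt_tokens: int, msgs_per_day: int,
                        input_price_per_m: float, days: int = 30) -> float:
    return prompt_tokens * msgs_per_day * days / 1e6 * input_price_per_m

before = prompt_overhead_usd(600, 50, 3.00)  # 600-token prompt on Sonnet
after = prompt_overhead_usd(150, 50, 3.00)   # trimmed to 150 tokens
print(f"${before:.2f} -> ${after:.2f} per month")  # $2.70 -> $0.68 per month
```

A couple of dollars sounds minor, but it scales linearly with message volume, and it's pure waste: the same text, billed again on every request.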

3. Set max token limits

Without a cap, the model will sometimes write a 500-word response to a yes/no question. Setting a reasonable max output token limit (I use 1,000 for casual chat, 4,000 for longer tasks) prevents this.

In OpenClaw's config, you can set this per-model. Do it.

Savings: 15–20% of output tokens.
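
I don't want to guess at OpenClaw's exact config schema here, but the idea is simple enough to sketch: keep a per-task cap and pass it as `max_tokens` on every request. The task names and caps below are my own:

```python
# Hypothetical per-task output caps, passed as max_tokens with each request.
# Check OpenClaw's docs for the actual config key; this just shows the idea.

MAX_TOKENS = {
    "chat": 1_000,   # casual back-and-forth
    "task": 4_000,   # document summaries, email drafts, longer work
}

def request_params(task_type: str, model: str) -> dict:
    return {
        "model": model,
        # Unknown task types fall back to the conservative chat cap.
        "max_tokens": MAX_TOKENS.get(task_type, 1_000),
    }

request_params("chat", "claude-haiku-4.5")  # max_tokens capped at 1,000
```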

4. Use Gemini Flash for throwaway stuff

Google's Gemini 2.0 Flash costs $0.10 per million input tokens. That's 10x cheaper than Haiku 4.5 and 30x cheaper than Sonnet. For things like "what's the weather" or "convert 50 USD to EUR", it's more than good enough.

OpenClaw supports multiple model backends through OpenRouter. Set up Gemini Flash as a secondary model and route simple queries to it.

Savings: another 10–15% if you have a lot of simple queries.

5. Don't send your entire conversation history every time

By default, most AI setups send the full conversation context with each new message. A 30-message conversation means message #31 includes all 30 previous messages as input tokens. This adds up fast.

Configure a reasonable context window; I use the last 10 messages. For most personal assistant tasks, you don't need the bot to remember what you asked it three hours ago.

Savings: 20–30% of input tokens for heavy users.
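
The trimming itself is a few lines. A minimal sketch, assuming the common chat-message format of role/content dicts:

```python
# Keep the system prompt plus only the last N messages when building the
# next request. N=10 matches the setting described above.

def trim_history(messages: list[dict], keep_last: int = 10) -> list[dict]:
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-keep_last:]

history = [{"role": "system", "content": "You are a helpful assistant."}]
history += [{"role": "user", "content": f"message {i}"} for i in range(30)]
trimmed = trim_history(history)
len(trimmed)  # 11: the system prompt plus the last 10 messages
```

A fancier version would summarize the dropped messages instead of discarding them, but for a personal assistant the simple cut works fine.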

6. Use OpenRouter as your gateway

OpenRouter lets you access Claude, GPT, Gemini, and dozens of other models through a single API key. They pass through provider pricing at cost. The real benefit isn't price โ€” it's flexibility. You can switch models without changing your setup, compare costs, and take advantage of whichever provider is cheapest for your use case.

It also makes tricks #1 and #4 much easier to implement.
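
OpenRouter speaks the OpenAI-compatible chat format, so switching providers is just a different model string in the same request. A sketch (the model IDs are examples; check OpenRouter's catalog for current names, and note this only builds the request rather than sending it):

```python
# Build a chat request against OpenRouter's OpenAI-compatible endpoint.
# Swapping providers means changing the model string, nothing else.
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(model: str, user_message: str, api_key: str) -> urllib.request.Request:
    body = json.dumps({
        "model": model,  # e.g. "anthropic/claude-3-haiku" or "google/gemini-2.0-flash-001"
        "messages": [{"role": "user", "content": user_message}],
    }).encode()
    return urllib.request.Request(
        OPENROUTER_URL,
        data=body,
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )

req = build_request("anthropic/claude-3-haiku", "convert 50 USD to EUR", "YOUR_KEY")
# urllib.request.urlopen(req) would send it, once you plug in a real key.
```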

7. Set a spending alert (seriously)

Every API provider lets you set spending limits or alerts. Do this before anything else. I have mine set at $30/month with an alert at $20. If something goes wrong (a bug that loops requests, an unexpectedly long task), you won't wake up to a $500 bill.

  • Anthropic: Console → Settings → Spending Limits
  • OpenAI: Platform → Settings → Limits
  • OpenRouter: Settings → Credits

This isn't a savings trick, it's insurance. But I'm including it because I've seen people on Reddit posting about surprise $400 bills. Don't be that person.

Real monthly costs for different types of users

Based on my experience and what I've seen from other Clawdbot users:

The casual user (5–10 messages/day)

  • Mostly quick questions and reminders
  • Model: Haiku or Gemini Flash
  • Hosting: $5 VPS or $0 (own hardware)
  • API: $3–10/month
  • Total: $3–15/month

The daily driver (20–40 messages/day)

  • Mix of quick tasks and longer requests
  • Model: Haiku for simple stuff, Sonnet for complex
  • Hosting: $5
  • API: $15–35/month
  • Total: $20–40/month

The power user (50+ messages/day, automation)

  • Heavy document processing, coding help, automation
  • Model: Sonnet primary, Opus for hard problems
  • Hosting: $5–20 (might need more server resources)
  • API: $40–100/month
  • Total: $45–120/month

Most people I've talked to land in that middle category. $20–40/month for a personal AI assistant that's available 24/7 on your phone, in your group chats, doing actual tasks. That's less than a gym membership you don't use.

The nuclear option: go fully local

If you really want to pay $0 in API costs, you can run models locally with Ollama. Llama 3, Mistral, Phi: there are solid open-source models that run on consumer hardware.

The catch: you need a decent GPU (or an Apple Silicon Mac with enough RAM) to get usable response times. A Mac Mini with 16GB RAM can run 7B–13B parameter models reasonably well. For bigger models, you're looking at a GPU with 12GB+ VRAM.

The quality isn't quite Claude or GPT-4o level, but for a lot of personal assistant tasks, it's perfectly fine. And the price is unbeatable: literally zero ongoing cost.

I run a hybrid setup now: Ollama for basic stuff when I'm home (routed through my Mac Mini), and cloud APIs for everything else. Best of both worlds.

The bottom line

Running Clawdbot doesn't have to be expensive. The people who end up with big bills are usually doing one of two things: using a premium model for everything, or not setting any limits.

With a little bit of setup (routing simple tasks to cheaper models, trimming your context window, setting spending caps), you can run a genuinely useful AI assistant for $15–25/month. That's a couple of coffees.

And if you haven't set up Clawdbot yet, the setup is free. We handle the technical stuff, you just tell us which messaging platform you want to use. The whole point is to make this accessible, not to drain your wallet.

Start cheap, see what you actually use it for, and scale up from there. That's the smart way to do it.