The Real Cost of Running Clawdbot (And How I Cut Mine by 70%)
Last month I got my Anthropic API bill and nearly choked on my coffee. $83. For a personal assistant. That's more than my phone plan.
The thing is, I was being dumb about it. I had Clawdbot running on Claude Sonnet for everything, including stuff like "what time is it in Tokyo" and "remind me to call mom at 5." That's like taking an Uber Black to the mailbox.
So I spent a weekend figuring out how to actually optimize this. Got my bill down to $22 the following month. Same functionality, same experience. Here's everything I learned.
First, let's talk about what you're actually paying for
Clawdbot itself is free. It's built on OpenClaw, which is open source. You don't pay for the software. What you pay for is two things:
- Somewhere to run it (a server, your own computer, whatever)
- AI API calls (every time Clawdbot thinks, it costs a fraction of a cent)
The first part is easy to figure out. The second part is where people get surprised.
The hosting side: simpler than you think
You need something that stays on 24/7 so your bot is always available. Here are your real options:
A $5/month VPS – DigitalOcean, Vultr, Hetzner. This is what most people start with. It works. Clawdbot doesn't need much horsepower, so the cheapest tier is fine.
Your old laptop or Mac Mini – If you've got one sitting around, just use it. Cost: $0 (plus maybe $2/month in electricity). I wrote a whole article about the Mac Mini route if you're curious.
AWS/GCP free tier – AWS gives new accounts a free year on its smallest instances; GCP has $300 in starter credits plus an always-free micro VM. Decent option if you're just trying things out.
So hosting is somewhere between $0 and $6/month. Not the problem. The API bill is the problem.
Where the real money goes: API pricing explained
Every time you send Clawdbot a message, it sends your text to an AI model, gets a response, and sends it back. You pay per "token" – a token is roughly three-quarters of a word, give or take.
Here's what the major models charge per million tokens right now:
| Model | Input ($/M) | Output ($/M) | Vibe |
|-------|-------------|--------------|------|
| Claude Haiku 3 | $0.25 | $1.25 | Cheap and surprisingly good |
| Claude Haiku 4.5 | $1.00 | $5.00 | Fast, smart, affordable |
| Claude Sonnet 4.5 | $3.00 | $15.00 | The sweet spot for most people |
| Claude Opus 4.6 | $5.00 | $25.00 | The big brain, big price |
| GPT-4o | $2.50 | $10.00 | OpenAI's workhorse |
| GPT-4o-mini | $0.15 | $0.60 | Dirt cheap |
| Gemini 2.0 Flash | $0.10 | $0.40 | Google's budget beast |
| Gemini 2.5 Pro | $1.25 | $10.00 | Google's top tier |
| Ollama (local) | $0 | $0 | Free, runs on your hardware |
Those numbers look small until you realize a typical back-and-forth conversation uses 2,000–5,000 tokens. A longer task, like summarizing a document or writing an email draft, can hit 10,000–20,000 tokens easily.
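To make those numbers concrete, here's a quick back-of-the-envelope calculator using the prices from the table above. The model keys are just labels I picked, not official API names:

```python
# Per-million-token prices (USD) from the table above: (input, output).
PRICES = {
    "haiku-4.5": (1.00, 5.00),
    "sonnet-4.5": (3.00, 15.00),
    "gemini-2.0-flash": (0.10, 0.40),
}

def conversation_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one exchange."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# A typical 5,000-token chat (4,000 in / 1,000 out) on Sonnet vs. Flash:
print(round(conversation_cost("sonnet-4.5", 4000, 1000), 4))        # 0.027
print(round(conversation_cost("gemini-2.0-flash", 4000, 1000), 4))  # 0.0008
```

Fractions of a cent per message, which is why it only hurts once you multiply by hundreds of messages a month.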
My actual bills: a case study
Let me show you my real numbers from three months of use. I track this stuff because I'm a nerd.
Month 1: The "I don't care" phase – $83
I set up Clawdbot with Claude Sonnet 4.5 as the default model for everything. Used it maybe 15–20 times a day. Asked it to do everything from writing emails to answering trivia questions. Didn't set any token limits. Didn't think about it at all.
Breakdown:
- ~2.1 million input tokens
- ~850K output tokens
- Hosting: $5 (DigitalOcean)
- API: $78
- Total: $83
Month 2: The "oh crap" adjustment – $41
After seeing that bill, I made two changes: switched simple queries to Haiku, and set a max token limit of 2,000 for casual conversations. Still used Sonnet for anything that needed real thinking: writing, analysis, coding help.
Breakdown:
- ~1.8M input tokens (mix of Haiku and Sonnet)
- ~600K output tokens
- Hosting: $5
- API: $36
- Total: $41
Month 3: The optimized setup – $22
This is where I got serious. More on the specific tricks below, but the short version: I routed 80% of my requests through Haiku, batched similar tasks, shortened my system prompts, and started using Gemini Flash for throwaway questions.
Breakdown:
- ~2.5M input tokens (mostly Haiku + Gemini Flash)
- ~700K output tokens
- Hosting: $5
- API: $17
- Total: $22
Same number of daily interactions. Actually more, because once it got cheaper I used it more freely. The experience barely changed.
The 7 tricks that actually saved me money
1. Use the right model for the job
This is the single biggest lever. Most of what you ask a personal assistant to do doesn't need a frontier model. Setting reminders, answering quick questions, simple lookups: Haiku handles all of this perfectly.
I keep Sonnet for things that actually need it: writing longer content, analyzing documents, complex reasoning. That's maybe 20% of my usage.
Think of it like cars. You don't drive a Ferrari to get groceries. A Honda gets you there just fine.
Savings: 50–60% of my API bill.
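Here's roughly what that routing looks like in practice. This is a simplified sketch, not Clawdbot's actual logic; the keyword list is my own guess at what signals a "real thinking" task:

```python
# Keyword-based model router: default to the cheap model, escalate to
# Sonnet only when the message looks like a writing/analysis/coding task.
# Trigger words and model names are my own choices, not Clawdbot defaults.
COMPLEX_HINTS = ("write", "draft", "analyze", "summarize", "debug", "refactor")

def pick_model(message: str) -> str:
    text = message.lower()
    if any(hint in text for hint in COMPLEX_HINTS):
        return "claude-sonnet-4.5"   # real thinking: writing, analysis, code
    return "claude-haiku-4.5"        # reminders, lookups, quick questions

print(pick_model("remind me to call mom at 5"))     # claude-haiku-4.5
print(pick_model("draft an email to my landlord"))  # claude-sonnet-4.5
```

Crude, but that's the point: even a dumb router that's right 80% of the time captures most of the savings.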
2. Trim your system prompt
Here's something most people don't realize: your system prompt gets sent with every single message. If your system prompt is 500 tokens, and you send 50 messages a day, that's 25,000 tokens per day just on the system prompt. Over a month, that's 750,000 tokens you're paying for, and they're all the same text, repeated.
I cut my system prompt from ~600 tokens down to ~150. Removed the fluff, kept the essentials. The bot behaved exactly the same.
Savings: 10–15% of total input tokens.
3. Set max token limits
Without a cap, the model will sometimes write a 500-word response to a yes/no question. Setting a reasonable max output token limit (I use 1,000 for casual chat, 4,000 for longer tasks) prevents this.
In OpenClaw's config, you can set this per-model. Do it.
Savings: 15–20% of output tokens.
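I won't reproduce OpenClaw's exact config keys from memory, but the idea looks like this in plain Python: a per-task cap that gets passed as the standard `max_tokens` request parameter that Anthropic- and OpenAI-style APIs accept:

```python
# Per-task output caps. The task categories are my own invention; the
# max_tokens field is the standard per-request output limit.
MAX_TOKENS = {"chat": 1_000, "long_task": 4_000}

def request_params(task: str, model: str) -> dict:
    # Unknown task types fall back to the conservative chat cap.
    return {"model": model, "max_tokens": MAX_TOKENS.get(task, 1_000)}

print(request_params("chat", "claude-haiku-4.5"))
# {'model': 'claude-haiku-4.5', 'max_tokens': 1000}
```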
4. Use Gemini Flash for throwaway stuff
Google's Gemini 2.0 Flash costs $0.10 per million input tokens. That's 10x cheaper than Haiku and 30x cheaper than Sonnet. For things like "what's the weather" or "convert 50 USD to EUR", it's more than good enough.
OpenClaw supports multiple model backends through OpenRouter. Set up Gemini Flash as a secondary model and route simple queries to it.
Savings: another 10–15% if you have a lot of simple queries.
5. Don't send your entire conversation history every time
By default, most AI setups send the full conversation context with each new message. A 30-message conversation means message #31 includes all 30 previous messages as input tokens. This adds up fast.
Configure a reasonable context window. I use the last 10 messages. For most personal assistant tasks, you don't need the bot to remember what you asked it three hours ago.
Savings: 20–30% of input tokens for heavy users.
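A minimal version of that trimming, assuming the usual role/content message format: keep the system prompt, drop everything but the last N turns.

```python
# Rolling context window: always keep system messages, but only the
# last N conversational messages. N=10 matches what I use; tune to taste.
def trim_history(messages: list[dict], keep_last: int = 10) -> list[dict]:
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-keep_last:]

history = [{"role": "system", "content": "You are a helpful assistant."}]
history += [{"role": "user", "content": f"message {i}"} for i in range(30)]

trimmed = trim_history(history)
print(len(trimmed))  # 11: the system prompt plus the last 10 messages
```

A smarter version would summarize the dropped messages instead of discarding them, but even this blunt cutoff is where most of the savings come from.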
6. Use OpenRouter as your gateway
OpenRouter lets you access Claude, GPT, Gemini, and dozens of other models through a single API key. They pass through provider pricing at cost. The real benefit isn't price โ it's flexibility. You can switch models without changing your setup, compare costs, and take advantage of whichever provider is cheapest for your use case.
It also makes trick #1 and #4 much easier to implement.
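For the curious, here's the shape of an OpenRouter request. It speaks the OpenAI-compatible chat completions API; this sketch only builds the payload (no network call), and the model slugs are illustrative, so check OpenRouter's model list for exact names:

```python
import json

# OpenRouter's OpenAI-compatible chat completions endpoint.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(api_key: str, model: str, user_message: str, max_tokens: int = 1000):
    """Return (url, headers, body) for an OpenRouter chat request."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,  # e.g. "google/gemini-2.0-flash-001" (slug is illustrative)
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": user_message}],
    }
    return OPENROUTER_URL, headers, json.dumps(payload)

url, headers, body = build_request("sk-or-...", "google/gemini-2.0-flash-001", "hi")
```

Swapping models is then a one-string change in `model`, which is exactly why tricks #1 and #4 get easier.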
7. Set a spending alert (seriously)
Every API provider lets you set spending limits or alerts. Do this before anything else. I have mine set at $30/month with an alert at $20. If something goes wrong (a bug that loops requests, an unexpectedly long task) you won't wake up to a $500 bill.
- Anthropic: Console → Settings → Spending Limits
- OpenAI: Platform → Settings → Limits
- OpenRouter: Settings → Credits
This isn't a savings trick, it's insurance. But I'm including it because I've seen people on Reddit posting about surprise $400 bills. Don't be that person.
Real monthly costs for different types of users
Based on my experience and what I've seen from other Clawdbot users:
The casual user (5–10 messages/day)
- Mostly quick questions and reminders
- Model: Haiku or Gemini Flash
- Hosting: $5 VPS or $0 (own hardware)
- API: $3–10/month
- Total: $3–15/month
The daily driver (20–40 messages/day)
- Mix of quick tasks and longer requests
- Model: Haiku for simple stuff, Sonnet for complex
- Hosting: $5
- API: $15–35/month
- Total: $20–40/month
The power user (50+ messages/day, automation)
- Heavy document processing, coding help, automation
- Model: Sonnet primary, Opus for hard problems
- Hosting: $5–20 (might need more server resources)
- API: $40–100/month
- Total: $45–120/month
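If you want to sanity-check which tier you'd land in, the arithmetic behind those ranges is simple. The tokens-per-message figures below are my assumptions (a short exchange runs a few thousand tokens, per the earlier section):

```python
# Rough monthly API cost estimator. Prices are $ per million tokens.
def monthly_api_cost(msgs_per_day: int, tokens_in_per_msg: int,
                     tokens_out_per_msg: int, in_price: float,
                     out_price: float) -> float:
    """Estimated API spend over a 30-day month, in dollars."""
    per_msg = (tokens_in_per_msg * in_price + tokens_out_per_msg * out_price) / 1_000_000
    return per_msg * msgs_per_day * 30

# "Daily driver" on Sonnet 4.5 pricing: 30 messages/day, ~3K in / 600 out.
print(round(monthly_api_cost(30, 3000, 600, 3.00, 15.00), 2))  # 16.2
```

Run your own numbers before committing to a model; the input-token figure is the one that balloons once long conversation histories get resent.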
Most people I've talked to land in that middle category. $20–40/month for a personal AI assistant that's available 24/7 on your phone, in your group chats, doing actual tasks. That's less than a gym membership you don't use.
The nuclear option: go fully local
If you really want to pay $0 in API costs, you can run models locally with Ollama. Llama 3, Mistral, Phi: there are solid open-source models that run on consumer hardware.
The catch: you need a decent GPU (or an Apple Silicon Mac with enough RAM) to get usable response times. A Mac Mini with 16GB RAM can run 7B–13B parameter models reasonably well. For bigger models, you're looking at a GPU with 12GB+ VRAM.
The quality isn't quite Claude or GPT-4o level, but for a lot of personal assistant tasks, it's perfectly fine. And the price is unbeatable: literally zero ongoing cost.
I run a hybrid setup now: Ollama for basic stuff when I'm home (routed through my Mac Mini), and cloud APIs for everything else. Best of both worlds.
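My hybrid routing is nothing fancy; conceptually it's just "try the local Ollama server, fall back to the cloud." Port 11434 is Ollama's default; everything else here (the function names, the fallback model) is specific to my setup and purely illustrative:

```python
import socket

def ollama_reachable(host: str = "localhost", port: int = 11434,
                     timeout: float = 0.5) -> bool:
    """True if something is listening on the local Ollama port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def pick_backend(simple_task: bool) -> str:
    # Simple tasks go local when Ollama is up; everything else (and any
    # time the Mac Mini is asleep) goes to a cheap cloud model.
    if simple_task and ollama_reachable():
        return "ollama/llama3"            # free, runs on my hardware
    return "anthropic/claude-haiku-4.5"   # cloud fallback
```

The short connection timeout matters: you don't want every message stalling for seconds when the local box is offline.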
The bottom line
Running Clawdbot doesn't have to be expensive. The people who end up with big bills are usually doing one of two things: using a premium model for everything, or not setting any limits.
With a little bit of setup (routing simple tasks to cheaper models, trimming your context window, setting spending caps) you can run a genuinely useful AI assistant for $15–25/month. That's a couple of coffees.
And if you haven't set up Clawdbot yet, the setup is free. We handle the technical stuff, you just tell us which messaging platform you want to use. The whole point is to make this accessible, not to drain your wallet.
Start cheap, see what you actually use it for, and scale up from there. That's the smart way to do it.