Day 19 of 28 · OpenClaw Challenge
Optimizing Token Usage
⏱ 8 min
📊 Intermediate
Your agent is connected, remembering, and automating. But there's something you need to know: all of this costs money.
Every time the agent thinks, it uses tokens, and tokens cost real dollars. Today you'll learn how to cut your costs by 60% or more without losing any capability.
Why default settings destroy your wallet
Here's the problem: OpenClaw's default configuration sends every single request to the most expensive AI model. It doesn't matter if you're asking it to classify an email or solve a complex research problem: it uses the same heavy-duty model for everything.
That's like taking a Ferrari to the grocery store. Sure, it works. But it's wildly expensive for a simple trip.
80% of your agent's tasks don't need the most powerful model. Sorting emails? Simple model. Formatting a report? Simple model. Scheduling a meeting? Simple model.
You only need the heavy-duty model for complex reasoning, nuanced writing, and multi-step problem-solving.
Knowledge Check
Why are OpenClaw's default cost settings problematic?
A
The default settings limit how many requests you can make
B
OpenClaw charges a monthly subscription
C
The AI models are overpriced
D
Every request goes to the most expensive model, even simple tasks that a cheaper model handles equally well
The default routing sends everything to the most powerful (and expensive) model. Simple tasks like sorting, formatting, and scheduling don't need that power. Routing them to cheaper models saves money with zero quality loss.
Model tiering: the biggest win
The fix is model tiering: route different tasks to different models based on complexity.
Tier 1 — Heavy tasks (complex reasoning, creative writing, nuanced analysis) → use the most capable model. About 20% of your tasks.
Tier 2 — Medium tasks (summarization, email drafting, research synthesis) → use a mid-tier model. About 30% of your tasks.
Tier 3 — Simple tasks (classification, formatting, parsing, scheduling) → use the cheapest, fastest model. About 50% of your tasks.
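The three tiers above can be sketched as a small routing table. This is a minimal illustration, not OpenClaw's actual configuration: the model names, task categories, and per-request prices are assumptions taken from the examples in this lesson.

```python
# Hypothetical tiering router. Model names, task categories, and prices
# are illustrative; OpenClaw's real config keys may differ.
SIMPLE = {"classify", "format", "parse", "schedule"}   # Tier 3 (~50% of tasks)
MEDIUM = {"summarize", "draft_email", "synthesize"}    # Tier 2 (~30% of tasks)

TIERS = {
    3: {"model": "fast-model",     "avg_cost": 0.02},
    2: {"model": "mid-tier-model", "avg_cost": 0.08},
    1: {"model": "top-tier-model", "avg_cost": 0.18},
}

def tier_for(task_type: str) -> int:
    """Return the cheapest tier that can handle the task type."""
    if task_type in SIMPLE:
        return 3
    if task_type in MEDIUM:
        return 2
    return 1  # complex reasoning, creative writing, nuanced analysis

def route(task_type: str) -> str:
    """Pick a model for the task based on its tier."""
    return TIERS[tier_for(task_type)]["model"]
```

The key design choice: default to the most capable tier only when a task matches nothing cheaper, so unknown work degrades toward quality rather than toward cost.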
How much did I spend on AI tokens last month?
Last month's usage:
💰 **Total cost**: $187.40
📊 **Breakdown**:
— 847 requests at top-tier pricing ($0.18 avg)
— 92% of those were simple tasks (email sorting, scheduling, formatting)
🔥 **If you enable model tiering**:
— Estimated cost: $68.20
— Savings: **$119.20/month (63%)**
— Zero quality loss on simple tasks
Want me to enable smart routing?
Model tiering enabled ✓
Simple tasks → fast model ($0.02/request avg)
Medium tasks → mid-tier model ($0.08/request avg)
Complex tasks → top-tier model ($0.18/request avg)
I'll route automatically based on task complexity.
Knowledge Check
What percentage of a typical user's tasks need the most powerful (and expensive) AI model?
A
100% β you need the best for everything
B
About 20% — the other 80% work equally well with cheaper models
Only about 20% of tasks truly need top-tier reasoning: complex analysis, nuanced writing, multi-step problem solving. The other 80% (sorting, formatting, scheduling, simple drafting) run perfectly on cheaper, faster models.
Fix session memory bloat
The second biggest cost driver is context size. Every conversation with your agent carries context: memory, instructions, tool definitions, conversation history.
The default configuration loads about 50 KB of context per session. That's like carrying a 50-page briefing document into every conversation, even when you just want to check the weather.
With proper configuration, you can trim that to 8 KB — the essentials only. Less context means fewer tokens per request, which means lower costs across every interaction.
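One way to enforce a budget like that is to rank context sections by importance and keep them in order until the budget runs out. A minimal sketch, assuming sections can be ranked up front; the 8 KB target comes from this lesson, and the section names are made up:

```python
# Sketch: enforce a per-session context budget by keeping high-priority
# sections first. The 8 KB budget matches the lesson; section names are
# illustrative, not OpenClaw's actual context layout.
CONTEXT_BUDGET = 8 * 1024  # bytes

def trim_context(sections, budget=CONTEXT_BUDGET):
    """sections: (name, text) pairs, most important first.
    Keep sections in priority order until the budget is exhausted."""
    kept, used = [], 0
    for name, text in sections:
        size = len(text.encode("utf-8"))
        if used + size <= budget:
            kept.append(name)
            used += size  # everything kept here is re-sent on every request
    return kept, used
```

Because context is re-sent with every request, anything this function drops saves tokens on every single interaction, not just once.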
Knowledge Check
How does reducing context size save money?
A
It makes the AI model cheaper
B
It only saves money on the first request
C
It reduces your API key's monthly limit
D
Every request sends less data, using fewer tokens — and you pay per token
You pay per token, and context is sent with every request. If your context is 50 KB, every single interaction costs more than it needs to. Trimming to 8 KB means you're paying 84% less in context overhead on every request.
The "thinking" token trap
Some AI models have a thinking mode: they reason internally before responding. This reasoning produces "thinking tokens" that you pay for but never see.
A simple question like "what time is my next meeting?" might generate 200 thinking tokens internally before the model gives you a 20-token answer. You paid for 220 tokens but only saw 20.
For simple tasks, thinking mode is pure waste. Make sure it's disabled for Tier 3 (simple) tasks.
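The rule can be expressed as a per-tier options helper. The option names below are hypothetical, not OpenClaw's actual API; the point is simply that thinking is on for Tier 1 and off for everything else, and that billed tokens include the thinking you never see:

```python
# Sketch: per-tier request options that disable thinking for simple tasks.
# Option names ("thinking", "max_thinking_tokens") are assumptions, not
# OpenClaw's real API surface.
def request_options(tier: int) -> dict:
    thinking = tier == 1  # deep reasoning only for heavy (Tier 1) tasks
    return {
        "thinking": thinking,
        "max_thinking_tokens": 2000 if thinking else 0,
    }

def billed_tokens(answer_tokens: int, thinking_tokens: int) -> int:
    """You pay for thinking tokens even though you only see the answer."""
    return answer_tokens + thinking_tokens
```

With these numbers, the meeting-time example above bills 220 tokens for a 20-token answer; disabling thinking for Tier 3 cuts that request's token count by roughly 90%.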
Show me the cost of my last 5 requests.
Here are your last 5 requests with costs:
1. "Sort my inbox" — $0.02 (Tier 3, no thinking)
2. "Draft client proposal" — $0.14 (Tier 1, thinking enabled)
3. "What's the weather?" — $0.02 (Tier 3, no thinking)
4. "Analyze competitor pricing strategy" — $0.16 (Tier 1, thinking enabled)
5. "Schedule team sync" — $0.02 (Tier 3, no thinking)
**Before optimization**: all 5 would have been $0.14-0.18 each = ~$0.80 total
**After optimization**: $0.36 total → **55% savings**
Knowledge Check
What are "thinking tokens"?
A
Tokens that make the AI smarter
B
Internal reasoning tokens the AI generates before responding — you pay for them but never see them
C
Tokens you earn as rewards for using the agent
D
Tokens used for memory storage
Thinking tokens are the AI's internal reasoning process. They're useful for complex tasks (you want the agent to think deeply about a hard problem), but wasteful for simple tasks (you don't need deep reasoning to check the weather).
Local models for free tasks
For truly repetitive, simple work — classifying emails, parsing data, formatting text — you can run a local model on your own hardware at zero cost.
Tools like Ollama let you run small AI models right on your machine. They're not as powerful as cloud models, but they're perfect for Tier 3 tasks. And they're completely free β no API costs, no token charges.
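A sketch of what routing Tier 3 work to a local Ollama instance might look like. Ollama's REST API does listen on `http://localhost:11434` by default and exposes an `/api/generate` endpoint with this payload shape; the tier logic, the `backend_for` helper, and the model name are assumptions for illustration:

```python
# Sketch: send Tier 3 work to a local Ollama server, everything else to
# the cloud. The endpoint and payload follow Ollama's documented API;
# the routing logic itself is a made-up example.
import json
import urllib.request

def run_local(prompt: str, model: str = "llama3.2") -> str:
    """Call a locally running Ollama model; no tokens are billed."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

def backend_for(tier: int) -> str:
    """Tier 3 tasks can run locally for free; the rest stay in the cloud."""
    return "local" if tier == 3 else "cloud"
```

Running this requires Ollama installed and the model pulled (`ollama pull llama3.2`); the trade-off is lower capability, which is exactly why only Tier 3 tasks should land there.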
The savings add up
Here's what real users report after optimizing:
Light usage ($200/month) → drops to around $70/month
Medium usage ($800/month) → drops to around $300/month
Heavy usage ($3,000/month) → drops to around $1,000/month
That's a 63-67% reduction across the board. Same capability, same results, dramatically less cost.
Final Check
What's the #1 thing you should do RIGHT NOW to reduce your OpenClaw costs?
A
Switch to a completely free AI model
B
Use the agent less frequently
C
Turn off all automations
D
Enable model tiering — route simple tasks to cheaper models
Model tiering is the biggest single win. You don't have to use the agent less or sacrifice quality. You just stop using the most expensive model for tasks that don't need it. It's the grocery-store-Ferrari problem solved.
💰
Day 19 Complete
"Same agent, same results, 63% less cost. Model tiering is the move."
Tomorrow — Day 20
Workflows — Chaining Tools Together
Now that you're running lean, let's build powerful multi-tool workflows.