Day 19 of 28 · OpenClaw Challenge
Optimizing Token Usage
⏱ 8 min
📊 Intermediate
Your agent is connected, remembering, and automating. But there's something you need to know: all of this costs money.
Every time the agent thinks, it uses tokens, and tokens cost real dollars. Today you'll learn how to cut your costs by 60% or more without losing any capability.
Why default settings destroy your wallet
Here's the problem: OpenClaw's default configuration sends every single request to the most expensive AI model. It doesn't matter if you're asking it to classify an email or solve a complex research problem: it uses the same heavy-duty model for everything.
That's like taking a Ferrari to the grocery store. Sure, it works. But it's wildly expensive for a simple trip.
80% of your agent's tasks don't need the most powerful model. Sorting emails? Simple model. Formatting a report? Simple model. Scheduling a meeting? Simple model.
You only need the heavy-duty model for complex reasoning, nuanced writing, and multi-step problem-solving.
Knowledge Check
Why are OpenClaw's default cost settings problematic?
A
The default settings limit how many requests you can make
B
OpenClaw charges a monthly subscription
C
The AI models are overpriced
D
Every request goes to the most expensive model, even simple tasks that a cheaper model handles equally well
The default routing sends everything to the most powerful (and expensive) model. Simple tasks like sorting, formatting, and scheduling don't need that power. Routing them to cheaper models saves money with zero quality loss.
Model tiering: the biggest win
The fix is model tiering: route different tasks to different models based on complexity.
Tier 1 — Heavy tasks (complex reasoning, creative writing, nuanced analysis) → use the most capable model. About 20% of your tasks.
Tier 2 — Medium tasks (summarization, email drafting, research synthesis) → use a mid-tier model. About 30% of your tasks.
Tier 3 — Simple tasks (classification, formatting, parsing, scheduling) → use the cheapest, fastest model. About 50% of your tasks.
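The three tiers above can be sketched as a small routing table. This is a minimal illustration, not OpenClaw's actual configuration: the model names, task categories, and per-request prices are assumptions taken from the examples in this lesson.

```python
# Hypothetical tiering router. Model names, task categories, and prices
# are illustrative; OpenClaw's real config keys may differ.
SIMPLE = {"classify", "format", "parse", "schedule"}   # Tier 3 (~50% of tasks)
MEDIUM = {"summarize", "draft_email", "synthesize"}    # Tier 2 (~30% of tasks)

TIERS = {
    3: {"model": "fast-model",     "avg_cost": 0.02},
    2: {"model": "mid-tier-model", "avg_cost": 0.08},
    1: {"model": "top-tier-model", "avg_cost": 0.18},
}

def tier_for(task_type: str) -> int:
    """Return the cheapest tier that can handle the task type."""
    if task_type in SIMPLE:
        return 3
    if task_type in MEDIUM:
        return 2
    return 1  # complex reasoning, creative writing, nuanced analysis

def route(task_type: str) -> str:
    """Pick a model for the task based on its tier."""
    return TIERS[tier_for(task_type)]["model"]
```

The key design choice: default to the most capable tier only when a task matches nothing cheaper, so unknown work degrades toward quality rather than toward cost.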
How much did I spend on AI tokens last month?
Last month's usage:
💰 **Total cost**: $187.40
📊 **Breakdown**:
— 847 requests at top-tier pricing ($0.18 avg)
— 92% of those were simple tasks (email sorting, scheduling, formatting)
🔥 **If you enable model tiering**:
— Estimated cost: $68.20
— Savings: **$119.20/month (63%)**
— Zero quality loss on simple tasks
Want me to enable smart routing?
Model tiering enabled ✓
Simple tasks → fast model ($0.02/request avg)
Medium tasks → mid-tier model ($0.08/request avg)
Complex tasks → top-tier model ($0.18/request avg)
I'll route automatically based on task complexity.
Knowledge Check
What percentage of a typical user's tasks need the most powerful (and expensive) AI model?
A
100% β you need the best for everything
B
About 20% — the other 80% work equally well with cheaper models
Only about 20% of tasks truly need top-tier reasoning: complex analysis, nuanced writing, multi-step problem solving. The other 80% (sorting, formatting, scheduling, simple drafting) run perfectly on cheaper, faster models.
Fix session memory bloat
The second biggest cost driver is context size. Every conversation with your agent carries context: memory, instructions, tool definitions, conversation history.
The default configuration loads about 50 KB of context per session. That's like carrying a 50-page briefing document into every conversation, even when you just want to check the weather.
With proper configuration, you can trim that to 8 KB — the essentials only. Less context means fewer tokens per request, which means lower costs across every interaction.
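One way to enforce a budget like that is to rank context sections by importance and keep them in order until the budget runs out. A minimal sketch, assuming sections can be ranked up front; the 8 KB target comes from this lesson, and the section names are made up:

```python
# Sketch: enforce a per-session context budget by keeping high-priority
# sections first. The 8 KB budget matches the lesson; section names are
# illustrative, not OpenClaw's actual context layout.
CONTEXT_BUDGET = 8 * 1024  # bytes

def trim_context(sections, budget=CONTEXT_BUDGET):
    """sections: (name, text) pairs, most important first.
    Keep sections in priority order until the budget is exhausted."""
    kept, used = [], 0
    for name, text in sections:
        size = len(text.encode("utf-8"))
        if used + size <= budget:
            kept.append(name)
            used += size  # everything kept here is re-sent on every request
    return kept, used
```

Because context is re-sent with every request, anything this function drops saves tokens on every single interaction, not just once.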
Knowledge Check
How does reducing context size save money?
A
It makes the AI model cheaper
B
It only saves money on the first request
C
It reduces your API key's monthly limit
D
Every request sends less data, using fewer tokens — and you pay per token
You pay per token, and context is sent with every request. If your context is 50 KB, every single interaction costs more than it needs to. Trimming to 8 KB means you're paying 84% less in context overhead on every request.
The "thinking" token trap
Some AI models have a thinking mode: they reason internally before responding. This reasoning produces "thinking tokens" that you pay for but never see.
A simple question like "what time is my next meeting?" might generate 200 thinking tokens internally before the model gives you a 20-token answer. You paid for 220 tokens but only saw 20.
For simple tasks, thinking mode is pure waste. Make sure it's disabled for Tier 3 (simple) tasks.
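The rule can be expressed as a per-tier options helper. The option names below are hypothetical, not OpenClaw's actual API; the point is simply that thinking is on for Tier 1 and off for everything else, and that billed tokens include the thinking you never see:

```python
# Sketch: per-tier request options that disable thinking for simple tasks.
# Option names ("thinking", "max_thinking_tokens") are assumptions, not
# OpenClaw's real API surface.
def request_options(tier: int) -> dict:
    thinking = tier == 1  # deep reasoning only for heavy (Tier 1) tasks
    return {
        "thinking": thinking,
        "max_thinking_tokens": 2000 if thinking else 0,
    }

def billed_tokens(answer_tokens: int, thinking_tokens: int) -> int:
    """You pay for thinking tokens even though you only see the answer."""
    return answer_tokens + thinking_tokens
```

With these numbers, the meeting-time example above bills 220 tokens for a 20-token answer; disabling thinking for Tier 3 cuts that request's token count by roughly 90%.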
Show me the cost of my last 5 requests.
Here are your last 5 requests with costs:
1. "Sort my inbox" — $0.02 (Tier 3, no thinking)
2. "Draft client proposal" — $0.14 (Tier 1, thinking enabled)
3. "What's the weather?" — $0.02 (Tier 3, no thinking)
4. "Analyze competitor pricing strategy" — $0.16 (Tier 1, thinking enabled)
5. "Schedule team sync" — $0.02 (Tier 3, no thinking)
**Before optimization**: all 5 would have been $0.14-0.18 each = ~$0.80 total
**After optimization**: $0.36 total → **55% savings**
Knowledge Check
What are "thinking tokens"?
A
Tokens that make the AI smarter
B
Internal reasoning tokens the AI generates before responding — you pay for them but never see them
C
Tokens you earn as rewards for using the agent
D
Tokens used for memory storage
Thinking tokens are the AI's internal reasoning process. They're useful for complex tasks (you want the agent to think deeply about a hard problem), but wasteful for simple tasks (you don't need deep reasoning to check the weather).
Local models for free tasks
For truly repetitive, simple work — classifying emails, parsing data, formatting text — you can run a local model on your own hardware at zero cost.
Tools like Ollama let you run small AI models right on your machine. They're not as powerful as cloud models, but they're perfect for Tier 3 tasks. And they're completely free β no API costs, no token charges.
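A sketch of what routing Tier 3 work to a local Ollama instance might look like. Ollama's REST API does listen on `http://localhost:11434` by default and exposes an `/api/generate` endpoint with this payload shape; the tier logic, the `backend_for` helper, and the model name are assumptions for illustration:

```python
# Sketch: send Tier 3 work to a local Ollama server, everything else to
# the cloud. The endpoint and payload follow Ollama's documented API;
# the routing logic itself is a made-up example.
import json
import urllib.request

def run_local(prompt: str, model: str = "llama3.2") -> str:
    """Call a locally running Ollama model; no tokens are billed."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

def backend_for(tier: int) -> str:
    """Tier 3 tasks can run locally for free; the rest stay in the cloud."""
    return "local" if tier == 3 else "cloud"
```

Running this requires Ollama installed and the model pulled (`ollama pull llama3.2`); the trade-off is lower capability, which is exactly why only Tier 3 tasks should land there.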
The savings add up
Here's what real users report after optimizing:
Light usage ($200/month) → drops to around $70/month
Medium usage ($800/month) → drops to around $300/month
Heavy usage ($3,000/month) → drops to around $1,000/month
That's a 63-67% reduction across the board. Same capability, same results, dramatically less cost.
Final Check
What's the #1 thing you should do RIGHT NOW to reduce your OpenClaw costs?
A
Switch to a completely free AI model
B
Use the agent less frequently
C
Turn off all automations
D
Enable model tiering — route simple tasks to cheaper models
Model tiering is the biggest single win. You don't have to use the agent less or sacrifice quality. You just stop using the most expensive model for tasks that don't need it. It's the grocery-store-Ferrari problem solved.
💰
Day 19 Complete
"Same agent, same results, 63% less cost. Model tiering is the move."
Tomorrow — Day 20
Workflows — Chaining Tools Together
Now that you're running lean, let's build powerful multi-tool workflows.