Token Efficiency in OpenClaw: Let Scripts Do the Heavy Lifting

Friday, February 6, 2026 AI

Scraped Article

Here's how I cut token costs by moving from "smart polling" to "dumb scripts + smart triggers". The Problem When you first set up an AI assistant (OpenClaw, in my case), it's tempting to make it do everything. Check your email every hour. Monitor your print farm. Sync your bookmarks. Watch for errors. But here's the math problem: if your agent wakes up every few minutes to "check on things," and each check burns tokens just to read context and decide "nothing happening" — you're spending real money on nothing. I learned this the hard way, with the problem being magnified while I was building a "swarm" coding system. Looking back at my session logs: 260+ Swarm Dispatcher sessions where Opus checked if there was work to do 315+ "Queue empty" responses — each one burning $0.01-0.07 in tokens 8 consecutive HEARTBEAT_OK responses in 16 minutes — all on the most expensive model The Swarm dispatcher alone burned $10-20 just checking empty queues. Every two minutes, Opus woke up, loaded context, checked an endpoint, found nothing, and responded. Repeat 260+ times. And that was just one feature I was building...not even accounting for all the other things I was having OpenClaw do. How Heartbeats Work OpenClaw has a "heartbeat" system — a periodic poll (default every 30 minutes) that wakes your agent and asks "anything need attention?" The agent reads a HEARTBEAT.⁠md file in your workspace, which lists tasks to check: emails, calendar, printer status, whatever you've configured. The idea is good: your agent stays aware of the world without you manually asking. But every heartbeat is a full model invocation — context loading, reasoning, tool calls, response. Even if the answer is "nothing happening," you've paid for the model to figure that out. The Philosophy Models are expensive thinkers. Scripts are free doers. If a task can be expressed as deterministic logic ("if X, then Y"), it belongs in a script. Models should only engage when there's actual ambiguity — formatting for humans, deciding whether something is worth reporting, or handling edge cases that would be painful to code. The Three-Tier Model Strategy Not all model work is equal. OpenClaw can/will use browsers a lot to find data or perform actions. But "browsing" itself doesn't need a fancy model. Here's what I measured when running the same browser task on different models: Opus: $0.089 per turn, ~$0.15+ per task Haiku: $0.017 per turn, ~$0.03 per task 5x cheaper for identical work. This led to a tiered approach: Premium (Opus) — Main conversation, complex decisions Workhorse (Sonnet 4.5) — Research, writing, analysis Cheap (Haiku 4.5) — Browser automation, mechanical tasks The key insight: your main session should orchestrate, not execute. When I need browser work done, I spawn a Haiku sub-agent. Research tasks get Sonnet. Opus stays focused on conversation. One gotcha: model names matter. OpenClaw initially used claude-3-5-haiku instead of claude-haiku-4-5, which caused silent fallback to Opus. Always verify the model actually applied. The Cron + Script Pattern Before (Token-Heavy) Heartbeat → Model wakes → Reads HEARTBEAT.⁠md → Figures out what to check → Runs commands → Interprets output → Decides action → Maybe reports Every step burns tokens. The model is thinking about things that don't require thought. After (Token-Light) Cron fires → Script runs (zero tokens) → Script handles all logic → Only calls model if there's something to report → Model formats & sends Real Examples Swarm Dispatcher Old way: Opus cron job every 2 minutes checking dispatch queue, $0.01-0.07 per tick New way: Native code in hankOS checks the queue, only invokes model when work exists Savings: 260+ empty invocations eliminated = $10-20 saved Print Farm Monitoring Old way: Model checks printer status every 5 minutes, compares to last state, decides if alert needed New way: Bash script does the diff, only outputs if something changed Savings: ~50 model invocations/day → ~3 Auth Watchdogs Old way: Model checks if API tokens are valid during every heartbeat New way: Script returns exit code 0 (valid) or 1 (invalid). Model only wakes on failure. Savings: Moved from every-heartbeat to 6-hour cron = 12x reduction in invocations Browser Automation Old way: Opus doing the clicking, burning $0.089 per turn New way: Haiku sub-agent, $0.017 per turn Savings: 5x cost reduction, plus 168k tokens isolated from main session context Implementation Tips 1. Write Scripts That Output Nothing on Success Then your cron job prompt becomes: "Run the script. If there's output, send it to me. If not, stay silent." 2. Use delivery: "none" by default Most cron jobs shouldn't announce themselves. Set delivery mode to none and have the agent explicitly send messages only when warranted. 3. Pre-Format in Scripts Don't make the model format tables or summaries. Do it in the script: The model's job becomes "send this" — not "understand this data and present it nicely." 4. Isolated Sessions for Everything