@kavinbm | Scrollback

Kavin @kavinbm

Earlier this week my OpenClaw Agent burnt through over 150M tokens in a day (!).

The 1st optimization: Enabled 1hr long cache on Claude Opus so that duplicate context is charged at a 90% discount. Important as OC sends whole files in the prompt

The 2nd: Opus Orchestra with Opus acting as a conductor across multiple models:

• Opus 4.6 — all direct conversations, trade decisions, anything touching money, deep analysis
• Sonnet 4.5 — sub-agents, daily briefs, CRM ingestion, structured research
• Gemini 3 Flash — heartbeats, healthchecks, trigger scans, keyword monitoring

Cron jobs across Flash, Sonnet and Opus

Escalation rule: Cheap model detects something → reports to main session → Opus makes the call.

Also enabled: Memory flush at 80k tokens (saves context to memory files before compaction) & Compaction threshold bumped to 80k (from default 40k).

Token consumption is down 80% 🙏

Feb 15, 2026 View on X →

https://x.com/kavinbm/status/2023031683826508129

Sunday, February 15, 2026 AI