https://x.com/kavinbm/status/2023031683826508129
Kavin @kavinbm
Earlier this week my OpenClaw Agent burnt through over 150M tokens in a day (!).
The 1st optimization: enabled the 1-hour prompt cache on Claude Opus so that repeated context is charged at a 90% discount. Important, as OpenClaw (OC) sends whole files in the prompt.
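To see why the cache discount matters when whole files are resent on every turn, here is a rough back-of-the-envelope sketch. It assumes cached tokens are billed at 10% of the normal input price (the 90% discount mentioned above) and deliberately ignores any cache-write surcharge and all output tokens. The function names are hypothetical and the figures are illustrative, not measurements from the post.

```python
def billed_input_units(prompt_tokens: int, cached_tokens: int,
                       read_multiplier: float = 0.10) -> float:
    """Return billed input, measured in 'full-price token' units.

    cached_tokens are charged at read_multiplier times the normal price
    (0.10 is the assumed 90% discount). Cache-write costs are ignored.
    """
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    fresh_tokens = prompt_tokens - cached_tokens
    return fresh_tokens + cached_tokens * read_multiplier


def savings_fraction(prompt_tokens: int, cached_tokens: int) -> float:
    """Fraction of input cost saved compared with sending everything uncached."""
    if prompt_tokens == 0:
        return 0.0
    return 1 - billed_input_units(prompt_tokens, cached_tokens) / prompt_tokens


if __name__ == "__main__":
    # Illustrative only: a 100k-token prompt where 80k tokens repeat earlier context.
    print(f"{savings_fraction(100_000, 80_000):.0%} saved on input")
```

Under these assumptions, a prompt that is 80% repeated context costs about 72% less on the input side, which is why the longer cache lifetime helps an agent that keeps resending the same files.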
The 2nd: Opus Orchestra with Opus acting as a conductor across multiple models:
• Opus 4.6 — all direct conversations, trade decisions, anything touching money, deep analysis
• Sonnet 4.5 — sub-agents, daily briefs, CRM ingestion, structured research
• Gemini 3 Flash — heartbeats, healthchecks, trigger scans, keyword monitoring
Cron jobs across Flash, Sonnet and Opus
Escalation rule: Cheap model detects something → reports to main session → Opus makes the call.
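The tiering and escalation rule above can be sketched as a small routing table. This is a minimal illustration, not OpenClaw's actual configuration: the task-type keys, `pick_model`, `escalate`, and `Finding` are hypothetical names, the model strings are the labels used in the post rather than API model IDs, and sending unknown task types to Opus is an assumption.

```python
from dataclasses import dataclass

CONDUCTOR = "Opus 4.6"

# Task type -> model tier, following the split described in the post.
ROUTES = {
    "conversation": "Opus 4.6",
    "trade_decision": "Opus 4.6",
    "deep_analysis": "Opus 4.6",
    "sub_agent": "Sonnet 4.5",
    "daily_brief": "Sonnet 4.5",
    "crm_ingestion": "Sonnet 4.5",
    "structured_research": "Sonnet 4.5",
    "heartbeat": "Gemini 3 Flash",
    "healthcheck": "Gemini 3 Flash",
    "trigger_scan": "Gemini 3 Flash",
    "keyword_monitoring": "Gemini 3 Flash",
}


@dataclass
class Finding:
    source_task: str  # task type that noticed something, e.g. "trigger_scan"
    detail: str       # what was detected


def pick_model(task_type: str, touches_money: bool = False) -> str:
    """Choose a model tier; anything touching money always goes to the conductor."""
    if touches_money:
        return CONDUCTOR
    # Assumption: unknown task types default to the conductor rather than a cheap tier.
    return ROUTES.get(task_type, CONDUCTOR)


def escalate(finding: Finding) -> dict:
    """Cheap model detects something and reports it; the conductor makes the call."""
    return {
        "reported_by": pick_model(finding.source_task),
        "decided_by": CONDUCTOR,
        "detail": finding.detail,
    }


if __name__ == "__main__":
    print(pick_model("heartbeat"))
    print(escalate(Finding("trigger_scan", "keyword matched in feed")))
```

The point of the design is that the cheap tiers only observe and report; any decision, and anything involving money, is made by the most capable model.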
Also enabled: Memory flush at 80k tokens (saves context to memory files before compaction) & Compaction threshold bumped to 80k (from default 40k).
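The flush-before-compaction ordering can be illustrated with a toy session object. This is a sketch under stated assumptions, not OpenClaw's implementation: the `Session` class, its method names, and the memory-file format are hypothetical, and "compaction" here simply drops older messages, whereas a real agent would more likely summarise them.

```python
from pathlib import Path


class Session:
    """Toy context window that saves to a memory file before compacting."""

    def __init__(self, memory_file: Path, threshold: int = 80_000, keep_last: int = 2):
        self.memory_file = memory_file
        self.threshold = threshold   # 80k, the value from the post
        self.keep_last = keep_last   # hypothetical: messages kept after compaction
        self.messages: list[tuple[str, int]] = []  # (text, token_count)

    @property
    def tokens(self) -> int:
        return sum(count for _, count in self.messages)

    def add(self, text: str, token_count: int) -> None:
        self.messages.append((text, token_count))
        if self.tokens >= self.threshold:
            self._flush_memory()  # save first, so nothing is lost...
            self._compact()       # ...then shrink the context

    def _flush_memory(self) -> None:
        # Append every message currently in context to the memory file.
        with self.memory_file.open("a", encoding="utf-8") as handle:
            for text, _ in self.messages:
                handle.write(text + "\n")

    def _compact(self) -> None:
        # Stand-in for real compaction: keep only the most recent messages.
        self.messages = self.messages[-self.keep_last:] if self.keep_last else []


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        session = Session(Path(tmp) / "memory.md")
        for i in range(3):
            session.add(f"note {i}", 30_000)
        print(session.tokens, "tokens left in context after compaction")
```

Raising the threshold from 40k to 80k means this flush-and-compact cycle runs less often, at the cost of carrying a larger context between cycles.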
Token consumption is down 80% 🙏
Feb 15, 2026
Sunday, February 15, 2026 AI