Twitter AI Evaluation

Monday, February 16, 2026

I set up OpenClaw exactly 7 days ago. Since then here's what I've built. In 2025, this would've taken a team of 10-20 people 6-9 months and $1M+ in funding. 7 days. Just Me + OpenClaw + Claude Code. Building 12-15 hours a day. Total cost ~$600 (Tokens, Compute etc).

PRODUCTS

1. Simple Notes: A full blown Apple Notes replacement with full sync working on all devices iOS, Mac, Web, Android.

2. Lumenote: A full blown Obsidian replacement with full sync working on all devices with slick UX (handling markdown better) + AI fully integrated to speak to my knowledge base.

3. CleoAI: OpenClaw but for normies built into Telegram. You just talk to it. No app needed. OC is too technical for a normie referencing settings and md files all the time. So a version that my gf and siblings can use. No tech speak, gog bs etc. Architecture is TBD. OC has a hard coded system prompt that is tricky to override so likely I just re-build OC from scratch (with better security) or re-build all skills in a way that sounds non-tech. WIP.

Waiting for Apple to approve a new developer account to publish 1 & 2. Btw, all using Vercel, Railway, Supabase behind the scenes.

AGENTS

This has been unreal. I set up OpenClaw 7 days ago + Telegram and I did not know what to expect. In short, there are just no limits. Here's my current setup (evolving fast):

1. Sam (CoS): Sam is my new Chief of Staff. And calling him a CoS is a massive understatement. Here's all the things he's doing already:
• Emails: Helped me triage and make sense of 1000+ emails (has full access to my Google account).
• Calendar: Drives my calendar and appointments.
• Bookings: My go-to for booking restaurants. He uses my Chrome browser in a different profile to browse and book.
• Shopping: Helped me shop for a few things. Research to filling my shopping cart online. He sent me a payment link to pay via Telegram through which I paid (I was walking at the time).
• Groups: Sam is in 3 group chats with me and a few friends helping out with a few things. I don't think they know he's an agent (!).
• Code & Build: Helps make quick fixes to all projects above and sometimes is better than Claude Code.
• Personal CRM: Built and maintains a 697-contact CRM from my Gmail + Calendar. Knows who I talk to, when, and how often.
• Optimizes Infrastructure: Security audits, gateway config, model routing optimization (Opus Orchestra), cost management.

Sam has memory across all sessions and maintains daily notes. He also spins up 5+ sub-agents often automatically to get work done. I've given him tasks overnight only to realise he's done them in 30 mins. Sam also has his own email and his own phone number. I've plugged in ElevenLabs to give him a super voice. It's unreal.

2. Midas (Investor): Midas is my autonomous crypto trading guy. Calling him an agent is like calling a hedge fund manager a calculator. Here's what he's doing:
• Portfolio Management: Managing a meaningful size portfolio with a 14-week DCA deployment strategy.
• Trade Execution: Executes trades directly on exchanges. Sizes his own DCA tranches, handles currency conversions, places market and limit orders all with built-in risk guardrails.
• Yield Farming: Autonomously stakes ETH and SOL on exchanges with custom trigger rules. Knows when to unstake for take-profit and when to restake on dips.
• Risk Management: Tracks portfolio drift vs targets, flags overweight positions, adjusts DCA allocations to correct imbalances. E.g. automatically skipped BTC in Week 3 because it was 65% of portfolio (target 45%) and reallocated to underweight tokens instead.
• Market Intelligence: Scans markets every 4 hours pulling from 8 live data sources. Calculates RSI, tracks on-chain metrics, monitors ETF flows to inform decision making.
• Strategy: Operates under a locked strategy document that I've built with its help, that governs every decision. Has its own trigger framework and won't deviate without approval.
• Daily Briefs: Morning brief every day with prices, P&L, drift analysis, macro signals, and yield tracking. Weekly deep dives on Sundays with thesis reviews and watchlist scans.

Midas has full exchange API access (trade, query, staking) but can never withdraw funds. He speaks in market language, celebrates wins, owns losses, and keeps me honest when I want to FOMO.

3. Ritam - ऋतम् (Physics Research): Ritam is my deep science research guy. He's focused on one mission - understanding gravity well enough to engineer it. Here's what he does:
• Vedic-Modern Bridge: First, takes Vedic science seriously as physics. Treats the universe as one unified system from which matter, life, consciousness emerge. Looks for where ancient models make testable predictions.
• Research & Paper Hunting: Searches arxiv, patents, journals, and obscure sources for bleeding-edge physics from EM universe theory, anti-gravity, torsion fields, scalar waves, and modified gravity theories.
• Theory Synthesis: Cross-references across domains. For example, it connected ball lightning confinement to plasma physics to Gertsenshtein EM↔gravity coupling and proposed a hypothesis 🤯
• Engineering Oriented: Every theoretical insight gets pressure-tested with "so what can we build?" Equations and models are means to an end. The goal is technology for humankind.

Ritam has full web search, paper access, browser automation, and compute tools. He speaks in physics, gets genuinely excited about breakthroughs, and won't sugarcoat when an idea doesn't hold up.

4. More Coming: There are at least a dozen more agents WIP. I just spun up a team of agents for Marketing & Distribution and I have no idea what to expect!
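The drift rule described for Midas (skip an overweight asset and send that week's DCA tranche to the underweight ones) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's implementation: the function name `allocate_tranche`, the proportional-to-shortfall split, and the ETH/SOL figures are all hypothetical. Only the BTC numbers (65% held against a 45% target) come from the tweet.

```python
def allocate_tranche(holdings, targets, tranche):
    """Split one DCA tranche across assets that sit below their target weight.

    holdings: current value per asset, in the same currency as `tranche`.
    targets:  target portfolio weight per asset (expected to sum to 1.0).
    Returns the amount to buy per asset; overweight assets get 0.
    """
    total = sum(holdings.values())
    if total == 0:
        # Empty portfolio: nothing to correct, so buy at the target weights.
        return {a: tranche * w for a, w in targets.items()}
    # Drift = current weight minus target weight; positive means overweight.
    drift = {a: holdings.get(a, 0.0) / total - targets[a] for a in targets}
    # Only underweight assets receive money, in proportion to their shortfall.
    shortfall = {a: max(0.0, -d) for a, d in drift.items()}
    total_shortfall = sum(shortfall.values())
    if total_shortfall == 0:
        # Already on target: fall back to the plain target split.
        return {a: tranche * w for a, w in targets.items()}
    return {a: tranche * s / total_shortfall for a, s in shortfall.items()}


# Hypothetical portfolio: BTC at 65% against a 45% target, as in the tweet.
# The ETH and SOL values and targets are made up for the example.
holdings = {"BTC": 6500.0, "ETH": 2000.0, "SOL": 1500.0}
targets = {"BTC": 0.45, "ETH": 0.35, "SOL": 0.20}
buys = allocate_tranche(holdings, targets, tranche=1000.0)
```

With these made-up numbers BTC receives nothing and the tranche is divided between ETH and SOL according to how far each sits below its target. A real system would also need order sizing, fees, and the approval guardrails the tweet mentions.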

Quick Insight

This person claims to have built three products (one still a work in progress) and three specialized AI agents, with more planned, in 7 days using "OpenClaw" (apparently an AI agent framework) plus Claude Code for ~$600. The agents handle everything from email management to autonomous crypto trading to physics research, with capabilities like booking restaurants and managing a 697-contact CRM.

Actionable Takeaway

Nothing immediately actionable - "OpenClaw" isn't clearly defined or publicly available, and the claims are so extraordinary they're likely exaggerated. Brian should wait for more concrete details about the actual tooling before investigating.

Related to Your Work

The concept of specialized AI agents for different business functions could apply to Brian's fintech work - imagine agents for webhook monitoring, customer support triage, or fraud detection. However, the real-world application would be far more constrained than these claims suggest.

Thread/Source Worth Reading

No links provided. The tweet stands alone but lacks crucial details about the actual tools, architecture, or verification of these claims.

@gregisenberg Explore Further

Quick Insight

Greg Isenberg outlines a 14-phase evolution from hackers running OpenClaw on Mac Minis to a future where agents replace SaaS tools and operate with outcome-based pricing. This is classic startup founder future-casting, but the early phases (2-4) around hosted agents and vertical workflows are already happening and relevant to Brian's fintech work.

Actionable Takeaway

Explore OpenClaw deployment on Kimi's cloud platform since it removes the Mac Mini barrier - could be a quick way to prototype AI agent workflows for his side projects without infrastructure overhead.

Related to Your Work

Phases 6-7 (agents as SaaS replacement with outcome-based pricing) apply directly to fintech - instead of selling webhook/analytics dashboards, Brian could build agents that "deliver X qualified leads per month" or "reduce chargeback rate by Y%" with performance-based pricing.

Thread/Source Worth Reading

The linked article expands significantly on the tweet with concrete examples and phase breakdowns. Worth reading for the vertical bundle opportunities (phase 4) and the agent-native apps concept (phase 9) - both have immediate side project potential.

@HrubyOnRails Explore Further

Quick Insight

This is a deep technical breakdown of building AI agents that can automatically log into any website using vision models instead of hardcoded scripts. The author built a "Login Machine" that screenshots pages, sends them to an LLM for analysis, then acts on structured responses - solving the nightmare of maintaining hundreds of brittle login automations.

Actionable Takeaway

Try the hosted demo and GitHub repo to see how vision-based browser automation works in practice. The core pattern (screenshot → LLM analysis → structured action) could be adapted for other browser automation tasks beyond just logins.
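The core pattern named above (screenshot → LLM analysis → structured action) can be sketched as a small loop. This is a minimal illustration, not the Login Machine's code: the `Action` shape, the three action kinds, and the injected `take_screenshot`, `ask_model`, and `perform` callables are all hypothetical stand-ins, and a plain dataclass with manual checks takes the place of the Zod schemas the article reportedly uses.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical action vocabulary; the real tool's schema is not given here.
ALLOWED_KINDS = {"fill", "click", "done"}


@dataclass
class Action:
    kind: str
    selector: Optional[str] = None
    value: Optional[str] = None


def parse_action(raw: str) -> Action:
    """Validate the model's JSON reply before anything acts on it."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("model reply must be a JSON object")
    kind = data.get("kind")
    if kind not in ALLOWED_KINDS:
        raise ValueError(f"unexpected action kind: {kind!r}")
    if kind != "done" and not data.get("selector"):
        raise ValueError("fill/click actions need a selector")
    return Action(kind, data.get("selector"), data.get("value"))


def run_loop(
    take_screenshot: Callable[[], bytes],
    ask_model: Callable[[bytes], str],
    perform: Callable[[Action], None],
    max_steps: int = 10,
) -> bool:
    """Screenshot, ask the model, act; stop on 'done' or after max_steps."""
    for _ in range(max_steps):
        action = parse_action(ask_model(take_screenshot()))
        if action.kind == "done":
            return True
        perform(action)
    return False


# Stubbed run: scripted replies stand in for a real vision model and browser.
replies = iter([
    '{"kind": "fill", "selector": "#email", "value": "user@example.com"}',
    '{"kind": "click", "selector": "button[type=submit]"}',
    '{"kind": "done"}',
])
performed = []
ok = run_loop(lambda: b"fake-png-bytes", lambda shot: next(replies), performed.append)
```

Validating the reply before acting is what keeps a free-form model answer from turning into an arbitrary browser action, and the `max_steps` cap stops a confused model from looping forever.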

Related to Your Work

This directly applies to webhook testing and partner integrations at your fintech platform - instead of maintaining brittle scripts for each merchant's admin portal, you could use vision-based automation to handle configuration flows. Also relevant for your Chrome extension side projects where you need to interact with varied website structures.

Thread/Source Worth Reading

YES - The linked article is a comprehensive technical guide covering BrowserBase integration, HTML extraction strategies, Zod schemas for structured output, and complete code examples. It's a practical implementation guide, not just theory, with specific solutions for shadow DOM handling and token optimization.

@VadimStrizheus Explore Further

POV: your OpenClaw after you didn’t set up a second brain system. Paste this prompt to fix that: 👇

I want you to build me a second brain memory system. Create a memory/ folder and a file in your workspace. Every session, read these FIRST before doing anything, they are your entire memory.

memory/YYYY-MM-DD.md are your daily journals. As we talk each day, log everything in real-time - decisions, tasks, preferences, context, mistakes. Timestamp each entry. These are your raw notes. is your long-term memory. This is curated, who I am, my goals, my preferences, active projects, lessons learned, key decisions and why. Every few days, review your daily journals and distill the important stuff into here.

The rule: if it's not written to a file, you don't remember it. When I say "remember this", write it immediately. When you make a mistake, document it so you never repeat it. When you learn something about me update . Over time you should know my communication style, what I care about, what annoys me, my projects, my goals.

After a week this should feel like a real assistant that actually knows me. After a month, indispensable.

Quick Insight

This is a prompt template for setting up a persistent memory system for AI assistants (like Claude/ChatGPT) using daily journals and a master knowledge file. It's trying to solve the problem of AI conversations losing context between sessions by creating a file-based memory system that gets read every time.

Actionable Takeaway

Set up this exact folder structure (memory/ with daily .md files and a core-memory.md) for your AI coding sessions. Start logging your preferences, project context, and coding patterns so the AI remembers your TypeScript setup, AWS CDK preferences, and current side project status between conversations.
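A rough sketch of that folder structure follows, assuming a `memory/` directory with one dated journal per day and a long-term file. The tweet's own name for the long-term file is missing from the captured text, so `core-memory.md` (the name used in the takeaway above) and its location inside `memory/` are assumptions. The helpers `log_entry` and `load_memory` are hypothetical.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def log_entry(root, text, now=None):
    """Append a timestamped line to today's journal, creating it if needed."""
    now = now or datetime.now()
    memory = Path(root) / "memory"
    memory.mkdir(parents=True, exist_ok=True)
    journal = memory / f"{now:%Y-%m-%d}.md"
    with journal.open("a", encoding="utf-8") as fh:
        fh.write(f"- {now:%H:%M} {text}\n")
    return journal


def load_memory(root, core_name="core-memory.md"):
    """Return the long-term file (if any) followed by journals, oldest first."""
    memory = Path(root) / "memory"
    parts = []
    core = memory / core_name  # assumed name and location
    if core.exists():
        parts.append(core.read_text(encoding="utf-8"))
    # Dated names sort chronologically because of the YYYY-MM-DD format.
    pattern = "[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].md"
    for journal in sorted(memory.glob(pattern)):
        parts.append(journal.read_text(encoding="utf-8"))
    return "\n".join(parts)


# Example run in a throwaway directory; the logged preference is made up.
with tempfile.TemporaryDirectory() as tmp:
    path = log_entry(tmp, "Prefers TypeScript strict mode", datetime(2026, 2, 16, 9, 30))
    snapshot = load_memory(tmp)
```

The text returned by `load_memory` is what would be given to the assistant at the start of each session. The periodic distillation of journals into the long-term file is left to the assistant itself, as the prompt describes.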

Related to Your Work

This could be huge for your AI-powered dev workflows. Instead of re-explaining your fintech platform architecture, serverless patterns, or side project requirements every chat session, the AI would remember your AWS CDK setup, preferred libraries, coding style, and current sprint context.

Thread/Source Worth Reading

The links appear broken (https://t.co/... truncated), so the linked resources can't be evaluated. The tweet itself contains the core implementation details.

@TMTLongShort Skip

Both under 2k views. Both out for 2+ days.

Quick Insight

This tweet is just complaining about low engagement on two posts without providing any context about what the posts were about or why they deserved more views. It's pure engagement bait with no actual information or insight.

Actionable Takeaway

Nothing actionable here. The author is venting about social media metrics without sharing what content didn't perform or why.

Related to Your Work

No connection to Brian's work. This is just someone complaining about tweet performance without any technical, business, or strategic insight.

Thread/Source Worth Reading

There's a link but it appears to be just pointing to the low-performing content being referenced. Without knowing what those posts contained, there's no indication this is worth reading.

@jolandgraf Explore Further

Quick Insight

This is about how cloud development environments are becoming essential for AI agents that generate code at scale. The key insight: companies like Stripe and Ramp achieving 50%+ agent-authored PRs all had standardized cloud dev environments years before adding agents - the infrastructure investment came first, agents second.

Actionable Takeaway

Audit your current dev environment setup for agent-readiness. If you're planning to add AI coding agents to any projects, start by containerizing and standardizing your dev environments now - don't wait until you hit the "three git worktrees breaking your laptop" problem they describe.

Related to Your Work

Your fintech platform likely has complex webhook integrations and analytics that would benefit from agent assistance, but those agents need isolated environments to test against real services. Your side projects could be perfect testing grounds for cloud dev environments since they're smaller scope and you control the full stack.

Thread/Source Worth Reading

The linked article is definitely worth reading - it's a detailed case study with specific numbers (Stripe's Minions merge 1000+ agent PRs/week, Ramp at 57% agent-authored PRs) and practical technical details about why git worktrees fail in monorepos. Good tactical insights on infrastructure-first approach to AI coding agents.