Why Not Just Use the Claude App? Same Brain, Different Body

The case for building NanoClaw.

A founder asked us a fair question recently. We'd just finished walking through how BitSafe runs on a fleet of AI agents, and the reply was blunt: "What's the point of all this when you could just use Claude?"

It deserves a real answer instead of a defensive one. Here's the honest version.

NanoClaw runs on Claude

Start with what NanoClaw actually is. It isn't a Claude competitor. It runs on Claude: the same models, with the Claude Agent SDK underneath. We didn't build a smarter brain. We rented the same one Anthropic sells and gave it a different body.

So the question isn't "Claude or NanoClaw." Both are Claude. The real comparison is narrower and more interesting: Anthropic's packaged consumer surface, the claude.ai app with Projects and memory on desktop and mobile, against your own system of state, scheduling, and shared surfaces running on those same models.

Put it in one line: the model is rented; the context is owned. Anyone can rent the model. What you build around it is the part that's yours.

The Claude app is a great product

For a lot of people, the app is the right call, and we'll say so plainly. There's nothing to set up and nothing to maintain. Pricing is flat and per seat, so you always know the bill. Projects and memory work well for one person's work: drop in your files and the assistant carries your context across chats. And because it's Anthropic's own surface, new model features land there first.

If you're one person doing artifact work like drafting or analysis, the app alone is often all you need. We're not going to pretend otherwise.

The picture changes at the company level. Four differences explain why.

1. Sessions versus state

A chat is a conversation. It happens, then it's gone. Projects and memory soften that, but they're per user, they're opaque, and you can't query them. They also live inside Anthropic's product, shaped for one person's recall.

Think of a chat as a brilliant contractor with amnesia. Great work, but you re-brief them every morning. The substrate underneath our agents is more like an employee with tenure and a filing system.

That substrate is the durable asset: the CRM, the tasks, the documents, the meeting notes. It's structured and relational, with real permissions. Agents read from it and write back to it, so every run leaves it a little better than it found it. A chat history doesn't compound. A database does.
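To make "a database compounds" concrete, here is a minimal sketch of one agent run reading from shared state and writing back to it. Everything in it is an assumption for illustration: the `contacts` table, the `enrich_roles` function, and the `lookup` callable stand in for whatever the real substrate and model call look like, and SQLite stands in for the actual store.

```python
import sqlite3
from typing import Callable, Optional


def open_substrate() -> sqlite3.Connection:
    """Create an in-memory stand-in for the shared store (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT NOT NULL, role TEXT)"
    )
    db.executemany(
        "INSERT INTO contacts (name, role) VALUES (?, ?)",
        [("Ada", None), ("Grace", "CTO"), ("Linus", None)],
    )
    db.commit()
    return db


def missing_roles(db: sqlite3.Connection) -> int:
    """Anyone can query the state: how many contacts still lack a role?"""
    return db.execute("SELECT COUNT(*) FROM contacts WHERE role IS NULL").fetchone()[0]


def enrich_roles(
    db: sqlite3.Connection, lookup: Callable[[str], Optional[str]]
) -> int:
    """One agent run: read the gaps, fill what it can, write back. Returns rows updated."""
    gaps = db.execute("SELECT id, name FROM contacts WHERE role IS NULL").fetchall()
    updated = 0
    for contact_id, name in gaps:
        role = lookup(name)  # a model call in a real system; a plain function here
        if role is not None:
            db.execute("UPDATE contacts SET role = ? WHERE id = ?", (role, contact_id))
            updated += 1
    db.commit()
    return updated


if __name__ == "__main__":
    db = open_substrate()
    print("gaps before:", missing_roles(db))
    enrich_roles(db, {"Ada": "Engineer"}.get)
    print("gaps after one run:", missing_roles(db))
```

The second run starts from what the first one left behind, and every teammate or agent querying the table sees the same improved rows. A chat transcript has no equivalent of `missing_roles`.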

2. Single-player versus multiplayer

The Claude app is personal by design. My chats, my memory. If one person works out a sharp workflow, it lives in their chat history and dies there.

In our system, work lands where the team already works. When one agent enriches a CRM record, all 22 of us see the better record straight away. A non-technical teammate doesn't write a prompt to get value out of it. They mention an agent on a page or press a button. The leverage isn't trapped with whoever figured it out.

3. On-demand versus ambient

The app works while you're typing at it. Stop typing and the work stops.

NanoClaw runs about 80 scheduled jobs that fire whether anyone is at a keyboard or not: monitoring, enrichment, digests, alerts. One of them, a contact-role enricher, has run roughly 7,700 times. Nobody sits through 7,700 chat sessions. That work has a cron shape, and a chat window isn't shaped like that.
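Here is roughly what "cron-shaped" means in code. This is a sketch, not NanoClaw's scheduler: the `Job` class, the `tick` function, and the job names are all hypothetical, and a real deployment would more likely lean on cron or an existing scheduling service than a hand-rolled loop.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List, Optional


@dataclass
class Job:
    name: str
    every: timedelta                 # how often the job should fire
    action: Callable[[], None]       # the work itself (an agent run, in practice)
    last_run: Optional[datetime] = None
    runs: int = 0

    def is_due(self, now: datetime) -> bool:
        # A job that has never run is due immediately.
        return self.last_run is None or now - self.last_run >= self.every


def tick(jobs: List[Job], now: datetime) -> List[str]:
    """Run every job that is due at `now` and return the names that fired."""
    fired = []
    for job in jobs:
        if job.is_due(now):
            job.action()
            job.last_run = now
            job.runs += 1
            fired.append(job.name)
    return fired


if __name__ == "__main__":
    jobs = [
        Job("contact-role-enricher", timedelta(hours=1), lambda: None),
        Job("daily-digest", timedelta(days=1), lambda: None),
    ]
    start = datetime(2025, 1, 1)
    # Simulate 48 hours of wall-clock time, one tick per hour, nobody typing.
    for hour in range(48):
        tick(jobs, start + timedelta(hours=hour))
    for job in jobs:
        print(job.name, job.runs)
```

Nothing in the loop waits for a person. The run counts climb because time passes, which is how a single job ends up with thousands of runs.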

4. Output versus operations

The app's sweet spot is artifacts, like drafts and analysis. Real, useful work.

Most of our fleet doesn't produce artifacts. It performs operations. It moves a deal to the next stage, files meeting notes against the right account, updates a status, flags an inconsistency between two records. The unit of value is updated state, not generated text. That's how we replaced our old CRM with a Notion-native one in eight weeks and switched the old system off in May 2026. Picture AI as a faster typewriter and you miss that it can also be a better back office.

Governance is the difference that grows with you

There's a less glamorous difference that matters more as a team scales: who's accountable.

We run roughly 60 governed agents, 58 of them active. Each one has an owner, a registry entry, scoped permissions, and a change log. Instruction changes move through propose, then approve, then apply, and humans stay in the loop on anything that writes.
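The propose, approve, apply flow can be sketched as a small state machine. The field names, the `apply_change` function, and the rule that a proposer can't approve their own change are assumptions made for this example, not a description of our actual registry.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional


@dataclass
class Agent:
    name: str
    owner: str
    permissions: FrozenSet[str]      # scoped permissions, e.g. {"crm:read"}
    instructions: str
    change_log: List[Dict[str, str]] = field(default_factory=list)


@dataclass
class Change:
    agent: Agent
    new_instructions: str
    proposed_by: str
    approved_by: Optional[str] = None
    applied: bool = False


def propose(agent: Agent, new_instructions: str, proposed_by: str) -> Change:
    """Step 1: record the intent. Nothing about the agent changes yet."""
    return Change(agent, new_instructions, proposed_by)


def approve(change: Change, approver: str) -> None:
    """Step 2: a human signs off. Assumed rule: no self-approval."""
    if approver == change.proposed_by:
        raise PermissionError("a change cannot be approved by its proposer")
    change.approved_by = approver


def apply_change(change: Change) -> None:
    """Step 3: only approved changes touch the agent, and each leaves a log entry."""
    if change.approved_by is None:
        raise PermissionError("change has not been approved")
    if change.applied:
        raise ValueError("change was already applied")
    agent = change.agent
    agent.change_log.append(
        {
            "from": agent.instructions,
            "to": change.new_instructions,
            "proposed_by": change.proposed_by,
            "approved_by": change.approved_by,
        }
    )
    agent.instructions = change.new_instructions
    change.applied = True
```

The change log is what answers "who signed off on this?" months later. In a chat-only setup, the equivalent record is someone's memory of what they typed.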

A team running individual Claude accounts is a room full of people freelancing with prompts. There's no audit trail, and no clean answer when someone asks what AI actually does here and who signed off on it. We build custody infrastructure for a living, and we don't accept single points of failure there. It would be strange to run our own operations as one person's chat history.

"But I run an agency with multiple clients"

This is the strongest version of the objection, so let's concede the real points first. A Project per client gives you clean multi-tenancy at no cost. Agencies sell deliverables, and deliverables are artifact work, which is exactly the app's strength. Clients churn, which punishes any investment in long-lived substrate. Harness maintenance is time you can't bill. And some client contracts forbid centralizing their data at all, which turns a sessions-only setup into a compliance feature rather than a gap.

All fair. Here's the other side.

The agency is itself a company. Clients come and go, but your pipeline, proposals, staffing, and invoicing are constant. That's substrate worth owning even if no client data ever touches it.

Repeatability across clients is the agency business model. Running the same onboarding, the same monthly report, and the same monitoring routine for 20 clients is one parameterized skill, not 20 fresh chats. In a chat-only world, quality drifts with whoever happens to be typing that day.

A client is a row. Per-client views, per-client permissions, per-client scheduled jobs: databases have done this for fifty years.
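As a sketch of "a client is a row": one skill, written once, with the client record as its only parameter. The `Client` fields, the report format, and the sample data are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Client:
    name: str
    report_day: int              # day of the month the report is due
    metrics: Tuple[str, ...]     # which metrics this client's engagement covers


def monthly_report(client: Client, data: Dict[str, Dict[str, int]]) -> str:
    """One parameterized skill: the steps are fixed, only the row differs."""
    lines = [f"Monthly report for {client.name}"]
    for metric in client.metrics:
        value = data.get(client.name, {}).get(metric, "n/a")
        lines.append(f"- {metric}: {value}")
    return "\n".join(lines)


def run_for_all(
    clients: Iterable[Client], data: Dict[str, Dict[str, int]], today: int
) -> Dict[str, str]:
    """Per-client scheduling: produce reports only for clients due today."""
    return {c.name: monthly_report(c, data) for c in clients if c.report_day == today}


if __name__ == "__main__":
    clients = [
        Client("Acme", 1, ("leads", "spend")),
        Client("Globex", 15, ("leads",)),
    ]
    data = {"Acme": {"leads": 12, "spend": 300}}
    for name, report in run_for_all(clients, data, today=1).items():
        print(report)
```

Adding a twenty-first client means adding a row, not briefing a new chat. The report reads the same no matter who is on shift that day.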

An agency's moat is its methodology, and sessions don't compound. Substrate turns "we did this once for client X" into "this is how we do it for everyone," which is the path from custom work to productized services.

Leverage is the margin story. An ambient fleet covers every client at once and breaks the billable-hour ceiling. A chat app only earns while someone types.

And governance matters more with client data, not less. Owned agents with scoped permissions and an audit trail are the defensible answer when a client asks how you use AI on their account.

So the sequencing for an agency is the reverse of ours. Start chat-everywhere. Graduate a workflow into the system the third time you repeat it. The system isn't day-one infrastructure. It's what you build once the repetition becomes visible.

The part that surprises people

We run a weekly build-versus-buy audit where NanoClaw evaluates its own components against off-the-shelf alternatives, including Anthropic's own features. When the packaged product catches up to something we built, we retire our version.

So the honest answer to "why not just use Claude?" is that we ask ourselves that exact question every week, with data in front of us. The point was never loyalty to a harness. It's having a system that can even ask the question and act on the answer. NanoClaw is open source, so anyone can do the same.

Where your company's memory lives

None of this is free. The system carries real costs. Someone maintains the harness, governance takes discipline, and you have to route work to the right model at the right price. It's an infrastructure investment with infrastructure-shaped costs, and it pays back only when coordination and shared state are your actual bottleneck.

For one person, use the app. It's a great product and the math is simple.

For a company, the question isn't which tool is smarter. They're the same brain. The question is where your company's memory lives: in scattered chat histories that walk out the door when people do, or in a shared system that gets better every time an agent touches it.

Same brain. Different body. That's the whole case.
