Harnesses, Not Frameworks — The New Shape of AI Tools
On April 18, 2026, Gregor Zunic — co-founder of Browser Use — posted this:
> Introducing: Browser Harness. A self-healing harness that can complete virtually any browser task. We got tired of browser frameworks restricting the LLM. So we removed the framework.
No framework. Direct CDP. One websocket to Chrome. A helpers.py the agent edits on the fly. Drop-in for Claude Code and Codex.
This isn’t just a browser automation tool. It’s the clearest statement yet of a pattern that’s been quietly taking over AI tooling in 2026: the harness.
What’s a Harness?
A harness is the minimum wrapping around an LLM that lets it do useful work. It exposes a tool surface — usually filesystem, shell, maybe HTTP — and then gets out of the way.
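A minimal sketch of what "a tool surface and nothing else" means in practice. This is illustrative, not any real harness's code: three raw tools and a one-line dispatcher, with no workflow layer on top.

```python
import subprocess

# Hypothetical harness sketch: expose raw tools, let the LLM decide the rest.

def shell(cmd: str) -> str:
    """Run a shell command and return its combined output."""
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return result.stdout + result.stderr

def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

def write_file(path: str, content: str) -> str:
    with open(path, "w") as f:
        f.write(content)
    return f"wrote {len(content)} bytes to {path}"

TOOLS = {"shell": shell, "read": read_file, "write": write_file}

def dispatch(name: str, *args: str) -> str:
    """The entire 'framework': look up the tool the LLM asked for and run it."""
    return TOOLS[name](*args)
```

Everything else — planning, retries, sequencing — is left to the model.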
Compare the two shapes:
| Framework | Harness |
|---|---|
| Defines workflows, steps, DAGs | No workflow. The LLM decides. |
| Abstracts away the underlying tools | Exposes raw tools (shell, CDP, fs) |
| Prescribes what the agent should do | Prescribes what the agent can do |
| Breaks when the task doesn’t fit the template | Bends, because there’s no template |
| Optimizes for dumb models | Optimizes for smart models |
Frameworks made sense in 2023. Models weren’t reliable enough to trust with raw capability, so you built rails. LangChain, AutoGPT, CrewAI — all variations on “let me hand-hold this LLM through a pipeline.”
Models got smarter. The rails started costing more than they saved.
Claude Code Was the First Real Harness
Claude Code shipped in early 2025 with a radical design: no orchestration, no planner module, no memory graph. Just an LLM with Bash, Read, Edit, Write, Grep, and a few web tools. That’s it.
The bet was that a smart enough model, given file system access and a shell, could do the orchestration itself. And it could. Karpathy called it “the only AI tool I actually use every day.”
Codex landed on the same shape a few months later. Different model, same philosophy: give the LLM a sandbox and tools, not a framework.
Browser Harness is this pattern arriving in browser automation. Instead of Selenium-style step definitions or Playwright-style APIs wrapped in agent scaffolding, you get a raw Chrome DevTools Protocol connection and a helpers file the agent rewrites when something breaks.
That’s the “self-healing” part. There’s no retry logic, no fallback strategy, no parser for error states. The LLM reads the error, edits the helper, tries again. The code base is the memory.
Why Harnesses Are Winning
Three things shifted in parallel:
- Tool use got reliable. Claude 4 and GPT-5 follow tool schemas consistently enough that you don’t need a validator layer catching malformed calls.
- Context windows stopped being scarce. A 1M-token context means you can load the whole codebase, the whole browser DOM, the whole doc set — and let the model re-read instead of pre-chunking.
- Models learned to recover. When a call fails, a modern LLM edits the tool, writes a new helper, or changes approach. Framework authors used to write that recovery logic by hand. The model does it better.
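The recovery behavior in that last point can be sketched as a loop. This is a toy illustration, not any tool's implementation: `ask_model` stubs out the real LLM call that would receive the traceback and return a patched helpers file.

```python
import importlib.util
import pathlib

# Toy self-healing loop: run a helper; on failure, let the 'model' rewrite it.

def ask_model(error: str, source: str) -> str:
    """Stub standing in for an LLM call. A real harness would send the error
    and current source to the model and get back a rewritten file."""
    return source.replace("1 / 0", "42")  # canned 'fix' for the demo below

def run_with_self_heal(helpers_path: pathlib.Path, func: str, attempts: int = 3):
    for _ in range(attempts):
        # Reload helpers.py fresh each attempt -- the file IS the memory.
        spec = importlib.util.spec_from_file_location("helpers", helpers_path)
        mod = importlib.util.module_from_spec(spec)
        try:
            spec.loader.exec_module(mod)
            return getattr(mod, func)()
        except Exception as err:
            source = helpers_path.read_text()
            helpers_path.write_text(ask_model(repr(err), source))
    raise RuntimeError("helper still failing after repairs")
```

Notice there is no hand-written recovery logic specific to any failure: the error text goes to the model, and the patched file comes back.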
Once those three are true, every abstraction layer between the LLM and the raw tool is a liability. It’s code that you maintain, that the model has to work around, that breaks when the task is even slightly off-pattern.
Zunic’s line is the tell: “I challenge anyone to find a task that DOESN’T work.” Frameworks have known failure modes. Harnesses don’t — or rather, their failure mode is the LLM itself, which keeps getting better.
The Harness Stack in 2026
If you squint, you can see the stack forming:
- Coding harness: Claude Code, Codex, Cursor agent mode
- Browser harness: Browser Harness (Browser Use)
- Research harness: Karpathy’s autoresearch — program.md + Claude Code
- Data harness: Emerging — direct DB access + shell
The common shape: LLM + raw tool + persistent working directory. The working directory is where context accumulates, where helpers get written, where the model’s memory lives between turns.
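That common shape might look like this on disk. The layout is purely illustrative, not any tool's actual structure:

```
project/
├── CLAUDE.md        # standing instructions the harness loads each session
├── helpers.py       # tools the agent writes and repairs for itself
├── notes/           # accumulated context: docs, quirks, past decisions
│   └── auth-flow.md
└── src/             # the actual work product
```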
Harnesses Run on Context
Here’s the part that matters if you’re building with these tools: a harness is only as good as the context it’s handed.
Claude Code without a CLAUDE.md is a generic coding assistant. Claude Code with a well-curated CLAUDE.md, a library of reference docs, and a knowledge folder it can grep — that’s what Karpathy uses. That’s the 10x version.
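For a sense of what "well-curated" means, a CLAUDE.md might read like this (contents invented for illustration, not from any real project):

```markdown
# Project notes for the agent

- Run tests with `make test`; never commit without a green run.
- API docs live in notes/api/ — grep there before guessing an endpoint.
- Auth quirk: staging login requires the LEGACY_SSO=1 env var.
- Past decisions are logged in notes/decisions.md; read before refactoring.
```

Each line saves the model a round of rediscovery.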
Same for Browser Harness. The helpers.py it edits on the fly starts from somewhere. If you seed that somewhere with patterns, auth flows, site-specific quirks you’ve documented — the harness gets leverage. If you hand it a blank file, it has to rediscover everything.
The harness does the work. The context library is where your advantage lives.
Where Save Fits
Every harness we’ve talked about reads Markdown from disk. CLAUDE.md, AGENTS.md, reference docs, saved documentation pages, API notes — all Markdown, all sitting in a folder the agent can see.
Save is a one-click converter from any webpage to clean Markdown. Documentation pages, blog posts, Stack Overflow answers, GitHub READMEs, API references — whatever the next harness you run will need to read.
The people getting the most out of Claude Code and Browser Harness in 2026 aren’t building more framework. They’re curating better libraries. The harness is free. The context is the moat.
Save turns any webpage into Markdown your AI harness can read — install the extension and start building the library that makes your agents smarter.