After “coding is solved”: plan-first, parallel-agent ops, and sandboxing become the workflow
Feb 20
6 min read
147 docs
Boris Cherny’s strongest claim yet: coding (for his work) is “largely solved,” and the real frontier is end-to-end agentic ops—backed by +200% PR productivity and Claude reviewing 100% of PRs. Plus: Cursor’s cross-OS agent sandboxing, Claude Code perf/regression signals, and new lightweight OpenClaw clones worth cloning.

🔥 TOP SIGNAL

Boris Cherny (Head of Claude Code) is blunt: for the kinds of programming he does, “coding is largely solved”, and the frontier is shifting to adjacent, end-to-end agentic work (project management, paying tickets, general ops) rather than better IDE autocomplete. In that world, throughput isn’t hypothetical: he says Anthropic saw +200% productivity per engineer (measured in PRs), and Claude now reviews 100% of pull requests (with human review still in the loop).

🛠️ TOOLS & MODELS

  • Claude Code — stability + performance signals

    • v2.1.47: long-running sessions use less memory.
    • Team guidance: keep reporting issues and they’ll fix them.
    • Practitioner complaint: Theo reports Claude Code has “regressed an absurd amount” with UI/feedback issues (timestamps not updating, missing “thinking,” multi-minute hangs with 0 output) and suggests it “needs to be rewritten from scratch.”
  • Cursor — agent sandboxing shipped across desktop OSes

    • Cursor says it rolled out agent sandboxing on macOS, Linux, and Windows over the last three months.
    • Mechanism: agents run freely inside a sandbox, only requesting approval when they need to step outside it.
    • Implementation write-up: http://cursor.com/blog/agent-sandboxing.
  • OpenAI Codex — pricing/availability + compute pressure

    • @thsottiaux: Codex is included with a ChatGPT subscription (even Plus has “very generous” usage); they attribute this to gpt-5.3-codex achieving “SoTA at lower cost”.
    • Same source: candidates increasingly ask how much dedicated inference compute they’ll have, and usage/user is growing faster than user count → compute could be scarce.
  • Gemini 3.1 Pro — dev-workflow positioning (ramping up)

  • GitHub Copilot → Zed editor (GA)

  • Model choice drift + self-hosting pressure (reported trend)

    • Salvatore Sanfilippo says he’s seeing excellent programmers move off US models (Codex, Claude Code) toward Chinese open-weight models like Kimi 2.5 and GLM5, often via providers or by building in-house Nvidia GPU inference to avoid outages and keep sensitive data internal.
    • He frames DeepSeek v4 as a potentially major moment if it lands as SOTA (as rumors suggest), putting pressure on OpenAI/Anthropic business sustainability.
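
To make the Cursor mechanism above concrete (run freely inside the sandbox, ask before stepping outside), here is a minimal sketch of an "allow inside, ask outside" gate. It is not Cursor's implementation; their write-up covers the real OS-level sandboxing. The names `SANDBOX_ROOT`, `needs_approval`, and `run_action` are hypothetical, and the check only looks at file paths.

```python
from pathlib import Path

# Hypothetical sandbox root; a real sandbox is enforced by the OS, not by a path check.
SANDBOX_ROOT = Path("/workspace/project")


def needs_approval(target: str, root: Path = SANDBOX_ROOT) -> bool:
    """Return True when `target` resolves to a location outside the sandbox root."""
    # Joining then resolving collapses ".." segments; absolute targets replace the root.
    resolved = (root / target).resolve()
    return not resolved.is_relative_to(root.resolve())


def run_action(target: str, approve=lambda path: False) -> str:
    """Allow in-sandbox actions silently; ask the approver for anything else."""
    if not needs_approval(target):
        return "allowed"
    return "approved" if approve(target) else "denied"
```

The design point is that the approval prompt becomes the exception rather than the default: only paths that escape the root ever reach the `approve` callback.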

💡 WORKFLOWS & TRICKS

  • “Plan mode → execute” as a default loop (Claude Code / Boris Cherny)

    1. Start the task in plan mode (he says he does this for ~80% of tasks).
    2. Iterate on the plan (model goes back-and-forth).
    3. Once the plan is good, let it execute; he’ll auto-accept edits after that.
    • Implementation detail: plan mode is literally a prompt injection: “please don’t write any code yet”.
  • Parallel agents, but treat “state” as a first-class problem

    • Cherny: he runs ~5 agents in parallel while working (terminal/desktop/iOS) and highlights that you can run many sessions in parallel.
    • Kent C. Dodds: similar “utter chaos” workflow, with multiple projects, “a couple cloud agents” each, plus a locally guided agent.
    • Failure mode (real): Simon Willison describes “parallel agent psychosis”: losing track of where a feature lives across branches/worktrees/instances.
    • Recovery trick: after hacking in /tmp and crashing, Willison recovered the code from ~/.claude/projects/ session logs, and Claude Code could extract and recreate the missing feature.
  • Turn your feedback firehose into PRs (fast iteration loop)

    • Cherny’s pattern: point Quad/Cowork at an internal Slack feedback thread; it proposes changes and opens PRs quickly, which encourages more feedback because users feel heard.
    • Bug-fix loop: “as long as the description is good,” he can fix a bug in minutes by delegating to Claude.
  • Token policy as a productivity lever (especially early)

    • Cherny recommends giving engineers as many tokens as possible early (even “unlimited tokens” as a perk) so they try ideas that would otherwise feel too expensive; optimize/cost-cut after an idea works.
  • Avoid over-orchestration: tools + goal > rigid workflows (model-first design principle)

    • Cherny: don’t “box the model in” with strict step-by-step workflows; give it tools + a goal and let it figure it out. He argues heavy scaffolding mattered a year ago but often isn’t necessary now.
  • “Ephemeral app” mindset + AI-native interfaces (Karpathy)

    • Karpathy built a one-off cardio experiment dashboard with Claude; it had to reverse engineer a treadmill cloud API, process/debug data, and build a web UI; he still had to chase bugs (units, calendar alignment).
    • His takeaway: the app-store model feels outdated for long-tail needs; instead, the industry needs AI-native sensors/actuators with agent-friendly APIs/CLIs so agents don’t have to click HTML UIs or reverse engineer services.
  • Agent “memory” ops in practice (LangSmith Agent Builder)
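
On the session-log recovery trick from the parallel-agents item above: a minimal sketch for finding which logs mention a lost feature, assuming the logs are text files somewhere under `~/.claude/projects/` (the directory Willison names). The file layout and encoding are assumptions, and `find_sessions_mentioning` is a hypothetical helper. It only locates candidate files; in Willison's account the actual reconstruction was done by Claude Code.

```python
from pathlib import Path


def find_sessions_mentioning(needle: str, root: Path) -> list[Path]:
    """Return files under `root` whose text contains `needle`, newest first."""
    hits = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        # Assumption: logs are text; undecodable bytes are dropped rather than fatal.
        text = path.read_text(encoding="utf-8", errors="ignore")
        if needle in text:
            hits.append(path)
    return sorted(hits, key=lambda p: p.stat().st_mtime, reverse=True)
```

A typical call would pass `Path.home() / ".claude" / "projects"` as `root` and a distinctive identifier from the lost code as `needle`, then hand the matching files to the agent.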

👤 PEOPLE TO WATCH

  • Boris Cherny: production-grade Claude Code habits (plan mode, parallel sessions) + strong claims about where “after coding” goes.
  • Andrej Karpathy: high-signal framing: ephemeral bespoke apps + “AI-native CLI/API” requirements for tools and hardware vendors.
  • Simon Willison: the best micro-case study of parallel-agent failure/recovery using session logs as the source of truth.
  • Steve Ruiz (tldraw): pragmatic company-building: code gets easier, but alignment/positioning/communication get harder, and he’s automating the overhead away.
  • Theo: sharp practitioner critique on Claude Code regressions plus continued pressure on “harness vs infra” policy differences across vendors.
  • François Chollet: frames agentic coding as ML optimization (spec/tests as constraints) and asks what the “Keras of agentic coding” will be; @swyx suggests DSPy as the presumptive community default.

🎬 WATCH & LISTEN

1) Boris Cherny — “Plan mode” as the default starter move (~1:09:52–1:10:41)

Hook: a simple, copyable workflow: force planning first (no code), iterate the plan, then execute + auto-accept when the plan is solid.
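
A minimal sketch of that loop, assuming the plan-mode instruction is simply appended to the task prompt, as Cherny describes. The `model` callable, `plan_first`, and the `review` callback are hypothetical stand-ins, not Claude Code's API.

```python
from typing import Callable, Optional

# Wording taken from Cherny's description of plan mode.
PLAN_INSTRUCTION = "please don’t write any code yet"


def plan_first(
    task: str,
    model: Callable[[str], str],
    review: Callable[[str], Optional[str]],
    max_rounds: int = 5,
) -> str:
    """Iterate on a plan until the reviewer accepts it, then ask the model to execute.

    `review` returns None to accept the plan, or feedback text to request a revision.
    If `max_rounds` is exhausted, the latest plan is executed as-is.
    """
    prompt = f"{task}\n\n{PLAN_INSTRUCTION}"
    plan = model(prompt)
    for _ in range(max_rounds):
        feedback = review(plan)
        if feedback is None:
            break
        plan = model(f"{prompt}\n\nCurrent plan:\n{plan}\n\nFeedback:\n{feedback}")
    # Execution step: the plan-mode instruction is dropped.
    return model(f"{task}\n\nFollow this plan:\n{plan}")
```

In this sketch `review` stands in for the human going back and forth on the plan; the auto-accept step corresponds to the final call, where no further review happens.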

2) Boris Cherny — “Coding is largely solved… what’s next?” (~0:18:19–0:19:06)

Hook: his thesis on why the frontier is shifting from IDE coding to adjacent operational tasks and general automation.

3) Steve Ruiz — daily automated release notes from landed PRs (~0:20:35–0:21:02)

Hook: treat agents like scheduled staff: every day, Claude scans the last 24h PRs and drafts “release notes we’d publish if we shipped main today”.
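
A sketch of the selection step behind that routine, assuming PR records have already been fetched into dictionaries with a `merged_at` timestamp. The record shape, `prs_merged_since`, and `release_notes_prompt` are hypothetical; fetching the PRs, scheduling the job, and calling the model are left out.

```python
from datetime import datetime, timedelta


def prs_merged_since(prs: list[dict], now: datetime, hours: int = 24) -> list[dict]:
    """Keep PRs whose `merged_at` falls within the last `hours` before `now`."""
    cutoff = now - timedelta(hours=hours)
    # Unmerged PRs carry no timestamp and are skipped.
    return [pr for pr in prs if pr.get("merged_at") and cutoff <= pr["merged_at"] <= now]


def release_notes_prompt(prs: list[dict]) -> str:
    """Build the drafting prompt from the selected PR numbers and titles."""
    lines = [f"- #{pr['number']}: {pr['title']}" for pr in prs]
    return (
        "Draft the release notes we'd publish if we shipped main today.\n"
        "PRs landed in the last 24h:\n" + "\n".join(lines)
    )
```

Passing `now` explicitly keeps the window deterministic, which makes the daily job easy to re-run for a past date.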

📊 PROJECTS & REPOS


Editorial take: As agents make code cheap, the new edge is orchestration discipline: plan-first loops, sandboxing, session-log recoverability, and AI-native interfaces that don’t force your agent to “be the computer.”

Summary
Coverage start: Feb 19 at 7:00 AM
Coverage end: Feb 20 at 7:00 AM
Frequency: Daily
Published: Feb 20 at 8:14 AM
Reading time: 6 min
Research time: 4 hrs 58 min
Documents scanned: 147
Documents used: 22
Citations: 50
Sources monitored: 110 / 110