Hours of research in one daily brief, on your terms.

Tell us what you need to stay on top of. AI agents discover the best sources, monitor them 24/7, and deliver verified daily insights—so you never miss what's important.

Set up your daily brief agent
Discovering relevant sources...
Syncing sources 0/180...
Extracting information
Generating brief

Recent briefs

SwiftUI Agent Workflows Get Real as Cursor Pushes Demo-First Review
Mar 28
6 min read
91 docs
Romain Huet
Vadim
Cursor
+8
Simon Willison published the most copyable workflow of the day: small native SwiftUI tools built with Claude Opus 4.6 and GPT-5.4, Git checkpoints, and aggressive pattern reuse. Also inside: Cursor's million-commit signal, Codex starter prompts, Kody's secret-safe auth flow, and the eval checklist worth stealing.

🔥 TOP SIGNAL

Today's best practical signal is Simon Willison's fully documented SwiftUI loop: Claude Opus 4.6 and GPT-5.4 were good enough to build tiny macOS utilities as single-file apps without opening Xcode. His repeatable pattern is simple: start from a concrete monitoring question, ask for a native app in /tmp, checkpoint early with git init and git commit, then add narrow follow-up prompts until the app becomes a menu bar tool. The caveat is the useful part too: Willison says he barely reviewed the generated code and was not confident the reported system metrics were always correct, so treat this as a fast prototyping loop, not automatic ground truth.

🛠️ TOOLS & MODELS

  • SwiftUI is suddenly a live agent target. Simon says Claude Opus 4.6 and GPT-5.4 are both competent enough at SwiftUI to build single-file macOS apps without Xcode; separately, Codex now has a use-case gallery with starter prompts you can open directly in the app, including an iOS workflow with SwiftUI skills folded into the plugin.
  • Cursor is leaning into autonomy plus richer review. Michael Truell says Cursor cloud agents produced over a million commits in the last two weeks and that these were essentially all AI-generated with little human intervention because the agents run code on their own computers. Cursor also now shows demos, not diffs: agents can use the software they build and send back video of the result.
  • Kody is shipping a real secret-boundary pattern. Kent C. Dodds says Kody can now dynamically handle secrets for auth with any provider and any auth mechanism without giving the model access to the secret. He used that flow from Claude to build a Spotify control app with secure OAuth handling.
  • Claude Code got a power-user hook filter. Hooks now support an if field using permission-rule syntax, so you can run them on specific bash commands instead of every command.
  • Linear's agent numbers are getting hard to ignore. Theo highlighted Linear's reported stats: coding agents are installed in more than 75% of enterprise workspaces, agent-completed work volume grew 5x in the last three months, and agents authored nearly 25% of new issues. Linear also rolled out a workspace agent, Skills, and Automations.
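As a rough illustration of the hook filter mentioned above, a PreToolUse hook scoped to git push commands might look like this in a Claude Code settings file. The if value uses permission-rule syntax per the item above; the surrounding field names follow Claude Code's existing hooks format, and the script path is purely hypothetical, so verify the exact schema against the official docs:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "if": "Bash(git push:*)",
        "hooks": [
          { "type": "command", "command": "./scripts/confirm-push.sh" }
        ]
      }
    ]
  }
}
```

The win is that the confirmation script only fires on pushes instead of slowing down every bash command the agent runs.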

💡 WORKFLOWS & TRICKS

  • Steal Simon's tiny-native-app loop. Ask the agent the underlying diagnostic question first; once it proves the data is accessible, prompt it to create a native SwiftUI app in /tmp; immediately commit a working baseline; ask the model to suggest next features; then request one concrete improvement at a time such as per-process stats, reverse DNS, layout changes, and a menu bar icon. For the next app, point the agent at the first repo and tell it to imitate the useful pieces. Transcript + code: Bandwidther, Gpuer, repo 1, repo 2.
  • Use Git as agent memory, not just backup. Willison's explicit move was to git init and git commit what you have so far before branching into new features, and he argues repositories are core tooling for ambitious agent work because they let you inspect and reverse changes cleanly.
  • Teach the framework before asking for the app. Simon had Claude clone Starlette 1.0 and generate a skill markdown file with feature examples, copied that into Claude skills, then started a fresh session asking for a task app. Claude wrote the app and manually tested it with TestClient, which is a very usable pattern for framework-heavy work. Repo: TaskFlow.
  • Keep secrets out of model context. Kent's concrete prompt was "Use Kody to create an app that can control my Spotify"; Claude then guided app setup, handled OAuth securely, and never needed the tokens or client secrets in-model. He says the same pattern works in any MCP-supporting client. Demo link: Spotify control app.
  • Eval coding agents like software, not vibes. LangChain's checklist is concrete: manually review 20-50 real traces before building eval infrastructure; define unambiguous pass/fail; separate capability evals from regression evals; start with full-turn evals that verify final response, trajectory, and actual state changes; for multi-turn, use N-1 testing; and run coding-agent trials in fresh containers or VMs before wiring regression evals into CI/CD gates. Full checklist: LangChain's guide.
  • Tune the tool surface before the prompt. LangChain's most reusable advice: tool interface design can eliminate entire classes of agent errors; their example is requiring absolute file paths so navigation mistakes become impossible.
  • Recurring pattern across operators: prototype first, spec second. Theo says his old Twitch loop was to build a scrappy version in 1-3 days to test UX and surface technical constraints, then either write a much better spec or just ship the first pass. He still thinks code is often the best planning artifact.
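The absolute-path advice above is easy to sketch: push the constraint into the tool's contract so a navigation mistake fails loudly instead of silently reading the wrong file. A minimal sketch, assuming a simple file-reading tool (the function name and error message are ours, not LangChain's):

```python
import os
import tempfile

def read_file_tool(path: str) -> str:
    # Hypothetical agent tool wrapper: the interface itself rejects
    # relative paths, so "wrong working directory" errors become
    # impossible rather than merely discouraged in the prompt.
    if not os.path.isabs(path):
        raise ValueError(f"read_file requires an absolute path, got {path!r}")
    with open(path, encoding="utf-8") as f:
        return f.read()

# Usage: a relative path fails fast at the tool boundary.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("hello")
try:
    read_file_tool("notes.txt")
except ValueError as err:
    print("rejected:", err)
print(read_file_tool(tmp.name))  # prints "hello"
```

The design point is that the check lives in the tool signature's validation, not in the system prompt, so it holds no matter what the model writes.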

👤 PEOPLE TO WATCH

  • Simon Willison — still one of the best public labs for agentic coding because he publishes the prompts, transcripts, repos, and the warnings about what he does not trust yet.
  • Kent C. Dodds — worth following if you care about agent security that survives contact with real APIs; Kody's secret-handling and Spotify demo are concrete, not theoretical.
  • Romain Huet — good pulse on where Codex is getting operationally better: starter-prompt workflows today, and strong user anecdotes on bug-fixing performance. In one example he amplified, a user said Codex one-shotted a bug that Claude Code had spent four hours on, and Huet said he sees many cases like that.
  • Michael Truell — if Cursor's cloud agents are already pushing over a million essentially AI-generated commits in two weeks, that is a real scale signal for unattended code execution.
  • Theo — strong signal on workflow judgment: prototype-first building, skepticism of role-played org-chart skills, and a useful reminder that automations may be more valuable than many devs assume.

🎬 WATCH & LISTEN

  • 6:07-9:29 — Theo on prototype-first development. Worth the time for one specific loop: build a rough version fast, use it to discover UX and dependency truth, then decide whether to spec properly or just ship the prototype.
  • 17:06-20:31 — Theo on code as planning. The spicy part is optional; the useful part is his claim that role-played agent personas and bloated planning docs are often worse than letting the model do a first-pass build and learning from that artifact.

📊 PROJECTS & REPOS

  • Bandwidther — network bandwidth monitor built from minimal prompts. The real value is that the full transcript is public, so you can copy the exact build loop instead of guessing.
  • Gpuer — GPU/RAM monitor built by pointing the agent at Bandwidther and asking for a similar app. Best example in today's batch of repo-to-repo recombination as a workflow primitive.
  • TaskFlow — useful less as an app and more as a pattern: create a framework skill first, then generate and test a fuller app against that skill.
  • Kody PR #61 — secure secret routing for any provider and auth mechanism without exposing the secret to the model. This is one of the most important practical safety patterns in today's feed.

Editorial take: the sharpest teams today are not chasing maximum autonomy; they're building small reusable loops, keeping Git and evals close, and drawing hard boundaries around secrets and real-world state.

Works in Progress Leads Today’s High-Signal Founder Picks
Mar 28
2 min read
143 docs
arctotherium
Sam Bowman
Patrick Collison
+1
Today’s clearly organic recommendations are sparse but useful: Patrick Collison strongly endorses Works in Progress with a direct subscription link, while Marc Andreessen points readers to a thread on the 2015-2022 history of internet platform moderation.

Most compelling recommendation

Only clearly organic, non-self-promotional recommendations are included below.

Patrick Collison’s Works in Progress pick is the strongest recommendation in today’s set because it combines a direct endorsement with an exact, immediately usable link.

  • Title: Works in Progress
  • Content type: Publication subscription
  • Author/creator: Works in Progress
  • Link/URL: https://worksinprogress.co/print/
  • Who recommended it: Patrick Collison
  • Key takeaway: Collison says it "continues to get better with each issue" and explicitly recommends subscribing.
  • Why it matters: The upcoming issue is described as covering ASML, new Hindu megatemples, egg freezing, and Australia’s small boat crossings policy, which gives readers a concrete sense of the publication’s range.

"Works in Progress continues to get better with each issue. Highly recommend a subscription"

Also notable

  • Title: Master thread on the 2015-2022 closure of the Internet
  • Content type: X thread
  • Author/creator: @arctotherium42
  • Link/URL: https://x.com/arctotherium42/status/2037324942069342679
  • Who recommended it: Marc Andreessen
  • Key takeaway: Andreessen signals that the thread is something readers should not forget; the thread describes 2015-2022 as a period when major internet platforms moved from broad openness to stricter narrative enforcement, often with governments and NGOs involved in moderation.
  • Why it matters: It gives readers the exact thread Andreessen is elevating as historically important, along with a strong signal that he thinks it is worth revisiting.

Agents Stretch Into Multi-Day Work as Inference Takes Center Stage
Mar 28
4 min read
157 docs
Jack Clark
Jeff Dean
+7
Today’s clearest thread is duration: leading researchers say agents are starting to sustain hours- or days-long work, while chips and datacenters are being redesigned around inference-heavy demand. The same cycle is sharpening the labor debate and the fight for default distribution on major devices.

The big shift

Longer-running agents are moving from benchmark wins to extended workflows

Onstage at GTC, Jeff Dean said the clearest recent gains are on verifiable tasks such as math and coding, citing Gemini gold medals in the IMO and ICPC, and added that agent workflows now let models pursue tasks that run for hours or even days with less close supervision. Jack Clark described agents more plainly as language models that use tools over time, and said Anthropic now sees some research projects shrink from two to three weeks to one to two days.

A separate controlled experiment shared on /r/MachineLearning pointed in the same direction: a Claude Code agent with access to 2M+ CS papers beat an otherwise identical setup by 3.2% on TinyStories after 100 automated experiments. Why it matters: The newest gains are increasingly tied to sustained tool use, retrieval, and iteration, not just better one-turn answers.

The infrastructure response

Inference is overtaking training as the main systems problem

"Inference is the job now."

Bill Dally said more than 90% of datacenter power is already going to inference, with different hardware needs emerging for training, prefill, and decode; he said Nvidia is targeting low-latency architectures that could run relatively large models at 10,000-20,000 tokens per second per user. He also argued that moving data dominates energy cost: reading a value from external memory can be roughly 1,000x costlier than a multiply-add, which is why Nvidia is exploring SRAM-local computation and stacked DRAM.

The same session suggested AI is starting to reshape chip design itself. Nvidia said NBCell can port cell libraries overnight with results that match or exceed human designs, Prefix RL produced adders 20-30% better on size, power, and timing metrics, and Google said AlphaChip has already helped with multiple TPU generations. Why it matters: The center of gravity is moving from raw training scale toward inference latency, memory movement, and AI-assisted hardware design.

Demand is showing up in both revenue and concrete

In a Plain English interview, the host said analysts estimate Anthropic's annual recurring revenue more than doubled from $9B in December 2025 to more than $20B in March 2026, with no known precedent at that scale. Physical capacity is moving in parallel: Microsoft said it is partnering with Crusoe on a 900MW AI factory campus in Abilene, Texas, and Sam Altman said the first steel beams went up this week at the Michigan Stargate site with Oracle and Related Digital.

Why it matters: Strong revenue growth is still being translated into very large compute buildouts, which makes AI demand look durable rather than purely speculative.

The policy and distribution questions

Labor warnings are getting sharper, but the timeline is contested

Senator Mark Warner said recent college graduate unemployment could rise from about 9% to 30%, pointed to law firms pausing first-year associate hiring, slashing back-office headcount, and cutting internships, and backed bipartisan bills to report AI-driven job losses and generate policy responses. He said government and society are not ready for the next three to five years of disruption.

Jack Clark, by contrast, said he does not agree with Dario Amodei's forecast of 20% unemployment and the loss of half of entry-level white-collar roles within about five years; he argued big employment shifts usually take time, policy choices matter, and today's agents multiply productivity more than they fully replace people. Why it matters: The debate has shifted from whether AI will affect white-collar work to how fast the shock arrives and what policy response should be ready first.

Default distribution is becoming a strategic front

Perplexity said it is deepening its Samsung partnership to power AI in the browser preinstalled on more than 1B Samsung devices with 100M+ active users, extending existing work on Bixby and preload on Galaxy S26 devices alongside Gemini. Separately, Big Technology highlighted a report that Apple will open Siri to AI assistants from rival companies in iOS 27.

Why it matters: The contest is no longer just about model quality; placement inside browsers, assistants, and operating systems could decide who actually reaches users at scale.

Prototype-Literate PMs, Healthier Metrics, and Practical AI Workflows
Mar 28
10 min read
42 docs
Aakash Gupta
Tony Fadell
Elena Verna
+5
Why prototype-building is becoming part of the PM baseline, where AI actually creates leverage, and how to tighten execution with better metrics, post-mortems, and onboarding design. Also includes hiring signals and tool recommendations PMs can test now.

Big Ideas

1) Prototype literacy is becoming a core PM skill

"Instead of doing some case study and presentation, you need to be ready to build a full blown app as part of the interview."

Elena Verna argues that functional prototyping is becoming standardized across roles, not just PMs. She also says turning a PRD into an interactive artifact improves the PRD itself, helps sell the idea faster, and gives engineers and designers a clearer shared vision.

Why it matters: The expected PM artifact is expanding from documents to clickable experiences. In Verna's workflow, the written spec is intentionally kept to a one-pager, with more detail discovered through prototyping.

How to apply: Write the shortest useful spec, prototype it immediately, and use the gaps you find to tighten the hypothesis before engineering starts. Keep engineering in the ideation loop rather than treating the handoff as closed.

2) AI gives PMs leverage unevenly: strongest in critique, synthesis, analysis, and execution hygiene

The Exponent framework splits PM work into vision, strategy, design, and execution. In that model, AI is already useful for customer insight synthesis, AI-moderated interviews, natural-language data analysis, prototyping, meeting agendas and summaries, and critiquing an existing strategy. Verna frames the same pattern another way: AI can do the first 30-50% of baseline work across PRDs, prototypes, and marketing plans, so PMs react and refine instead of starting from a blank page.

Why it matters: The fastest gains are in compressing time-to-insight and time-to-artifact, not outsourcing the hardest judgment calls.

How to apply: Use AI to research, critique, summarize, query data, and draft prototypes. Do not outsource direct customer contact, product vision, or the final strategic bet.

3) North star metrics need a pressure test before they turn into local optimizers

Run the Business offers four meta-questions to pressure-test north star metrics: Would you be proud of the behavior in 18 months? Can the team explain in one sentence how the metric makes a customer's life better? If every team's north star sat on one page, would any of them compete? What happens if you 10x the metric—would that be incredible or terrible?

Why it matters: The framework is explicitly designed to catch anti-patterns before a metric starts driving the wrong behavior or creating cross-team conflict.

How to apply: Run these four questions in your next review. If the customer-value sentence is weak or a 10x outcome sounds bad, treat that as a metric-design problem, not just a reporting issue.

Tactical Playbook

1) Use AI as a strategy critic, not a strategy author

  1. Create a project with explicit devil's-advocate instructions and tell the model not to be nice.
  2. Load opinionated strategy best practices into project knowledge, such as course material or a strategy book summary.
  3. Paste in your strategy and ask for critique.
  4. Look for concrete issues like an audience that is too broad or a claimed moat that is not actually defensible.
  5. Rewrite the strategy yourself; AI can critique the bet, but it is not the owner of the bet.

Why it matters: The demo shows critique quality improves when the model is grounded in a specific standard of good work, not a generic prompt.

How to apply: Save this as a reusable review step before leadership reviews. If the output feels generic, add exemplars rather than more vague instructions.

2) Build a weekly insight loop from surveys, NPS, and customer data

  1. Upload raw survey or NPS data to Claude and ask for the basics first: score, promoter/passive/detractor mix, trends, and segment cuts.
  2. Add segmentation fields you already have, such as email type, plan type, or usage intensity.
  3. Check significance where the analysis provides it.
  4. Turn the output into an executive readout if needed, including an AI-generated deck in Gamma.
  5. Once you trust the workflow, move from occasional reporting to a recurring cadence.

Why it matters: In the example, work that previously took about a week became fast enough to support weekly reporting instead of quarterly review.

How to apply: Start with one recurring customer metric and three segments. Expand only after you can verify the numbers and logic.
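The score and promoter/passive/detractor step in the loop above is a fixed formula, which makes it easy to spot-check the AI's arithmetic before trusting a weekly cadence. A minimal sketch using the standard NPS buckets (the helper name is ours):

```python
def nps_summary(scores):
    # Standard NPS bucketing on 0-10 ratings:
    # promoters 9-10, passives 7-8, detractors 0-6.
    n = len(scores)
    promoters = sum(s >= 9 for s in scores)
    passives = sum(7 <= s <= 8 for s in scores)
    detractors = n - promoters - passives
    # NPS = % promoters minus % detractors, on a -100..100 scale.
    nps = round(100 * (promoters - detractors) / n)
    return {"n": n, "promoters": promoters, "passives": passives,
            "detractors": detractors, "nps": nps}

print(nps_summary([10, 9, 9, 8, 7, 6, 3, 10, 9, 5]))
# → {'n': 10, 'promoters': 5, 'passives': 2, 'detractors': 3, 'nps': 20}
```

Running the same function over each segment gives you the segment cuts to compare against whatever the model reports.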

3) Improve AI data analysis with example pairs

  1. Let PMs ask data questions in plain English and inspect the generated SQL when needed.
  2. Expect early errors when schemas are messy or outdated.
  3. Give the model past natural-language question and SQL pairs as project knowledge.
  4. Reuse the same exemplar pattern for other PM work, including interview guides and strategy critique.

Why it matters: Rekhi's example is not just faster query writing; it expands the number of questions a PM can realistically ask from a few to all of them.

How to apply: Build a small internal library of approved examples, then graduate from ad hoc queries to scheduled dashboards.

4) Make post-mortems blameless but operational

"Work the problem"

  1. Name the single core reason the launch failed.
  2. Add three supporting reasons or evidence showing how that problem appeared in data or feedback.
  3. Use screenshots or other specifics to show what happened.
  4. Model accountability by taking blame yourself so others can do the same.
  5. End with process changes for next time, without overreacting to one incident.

Why it matters: This was shared as a practical structure for leadership-facing post-mortems, where the goal is reuse and learning rather than blame.

How to apply: Use the structure as the spine, then connect any survey data or stakeholder feedback back to the core reason and its supporting evidence.

5) When something ships broken, own the conversation without creating a product-vs-engineering culture

PMs are often the first stop for bugs, delays, and quality issues from stakeholders and customers. The discussion emphasizes owning the conversation rather than deflecting, avoiding a product-vs-engineering culture, giving the team credit when things go well, and taking blame when they do not.

Why it matters: Public ownership builds trust and keeps momentum, while quality issues can also signal deeper team problems.

How to apply: Acknowledge the issue, state the recovery plan, protect the team externally, and then use the failure to inspect process or coordination problems internally.

Case Studies & Lessons

1) Onboarding by doing beat onboarding by explaining

One PM spent about 40% of initial development time on a polished onboarding with animations, progress indicators, and tooltips, yet day-1 retention was 21% because users skipped through the flow, reached the main app confused, and left. Rebuilding the first-use experience so users performed the core action lifted day-1 retention to 44%.

Lesson: Users can complete onboarding without learning anything. What mattered in the follow-up comment was core-task success on day one, not tutorial completion.

How to apply: Watch the first session, find the first meaningful action, and redesign onboarding around doing that action instead of explaining it first.

2) Faster analysis changed the team's learning cadence

In the NPS demo, a CSV upload produced score summaries, monthly trends, segment comparisons, and statistical significance checks. Because the analysis became much faster, the team could move from quarterly NPS review to continuous weekly reporting and surface fresh insights more often.

Lesson: AI changes operating cadence when the bottleneck is analysis time.

How to apply: Look for recurring insight work that is currently too slow or too rare, and automate the end-to-end workflow rather than just one step.

3) Spec, prototype, and GTM draft can now happen in parallel

Verna's nonprofit feature demo starts with a ChatGPT tech spec intended to get 30-50% of the way to a usable draft. She then turns it into a one-pager, generates a prototype in Lovable and spends the next couple of hours editing structure, content, and visuals, while ChatGPT works in parallel on ICP, distribution partners like TechSoup, and examples to tear down such as Google and Slack nonprofit programs. She says this gets her to spec, prototype, and GTM thinking in roughly three hours, with engineering pulled into ideation rather than a final handoff.

Lesson: The power is parallel drafting plus human reaction, not expecting a one-shot answer.

How to apply: Run product, UX, and GTM thinking in parallel, then use human taste to edit hard and involve engineering before priorities are locked.

Career Corner

1) The job market is improving, but not evenly

PM openings are at the highest levels seen in more than three years. At the same time, Bay Area importance is rising, remote opportunities are declining, and recruiter demand is surging as a leading indicator of sustained hiring demand.

Why it matters: Better hiring volume does not mean an easier search if your location or remote requirements are narrow.

How to apply: Broaden Bay Area-targeted searches if possible, reset expectations on remote-only roles, and watch recruiter activity as a sign that demand is holding.

2) The skills that still compound are not the ones AI can average away

Verna's non-automated list is direct customer interaction, setting the vision and destination, understanding marketing and distribution as software commoditizes, and building functional prototypes.

Why it matters: Her warning is explicit: if everyone uses AI to choose direction, products converge.

How to apply: Protect time for customer calls and social listening, build a stronger point of view on where the product should go, and learn enough GTM and prototyping to turn that point of view into something concrete.

3) AI-native experimentation is becoming a hiring signal

Verna recommends bringing AI-native employees into teams, often including new grads, because they already treat AI as a normal part of work. She also argues that teams need bottom-up tool adoption and repeated experimentation because model performance changes quickly.

Why it matters: The advantage is not one favorite tool; it is an operating habit of trying, judging, and retrying workflows as the tools improve.

How to apply: Show concrete AI workflows in your portfolio or resume, push for lightweight experimentation on your team, and revisit tasks that did not work a month ago.

4) Senior leadership means more delegation and more accountability

"When the team succeeds, it's their fault. When we fail, it's my fault."

Tony Fadell's management advice is to let go of doing the work yourself, trust the team, and give people room to be creative. The Reddit thread adds the complementary leadership behavior: be the circuit breaker first when something goes wrong.

Why it matters: Advancement is not just better judgment. It is creating space for the team to do great work while absorbing external pressure yourself.

How to apply: Delegate real ownership, resist the urge to re-do the work, and take the first uncomfortable conversation when outcomes disappoint.

Tools & Resources

1) Perplexity Computer for deliverables, not just answers

Aakash Gupta argues that Perplexity's Computer produces finished outputs: research reports with citations, deployed dashboards, cleaned datasets with charts, and launch kits with positioning docs and email drafts. He highlights cloud execution, parallel agents, and persistent memory as the main differences. His example: a 28-page Notion messaging audit across five criteria, benchmarked against Coda and Slite, with per-page recommendations in about 20 minutes.

Why it matters: This is positioned as a tool for bounded PM work where the output itself matters more than chat.

How to apply: Start with a constrained audit or launch-prep task, and use the full guide for six PM use cases, exact prompts, and the prompt spec that Gupta says cuts cost by 60%+.

2) Reusable Claude projects beat blank prompts

The same set of examples shows three high-value project templates for PMs: a strategy critic with devil's-advocate instructions, a customer-insight workflow that turns CSVs into reports, and a natural-language data analyst that answers plain-English questions with SQL, charts, and tables. The shared prompting rule is to give the model exemplars and a clear definition of good work.

Why it matters: PM leverage increases when the model has project-specific context instead of starting from zero every time.

How to apply: Save one reusable project per recurring workflow and feed each one examples from your own team rather than generic prompting advice.

3) Gamma can turn analysis into an executive-ready readout

In the NPS example, Gamma generated an executive summary deck, selected visuals, and structured the presentation automatically from the analysis.

Why it matters: It shortens the path from raw insight to stakeholder-ready communication.

How to apply: Pair it with a verification step on the underlying analysis so presentation speed does not outrun analysis quality.

4) Keep a north star metric review checklist handy

The four-question pressure test from Run the Business is simple enough to reuse as a standing template in roadmap, OKR, or quarterly business reviews.

Why it matters: It forces teams to connect metrics to customer value and cross-team alignment, not just target movement.

How to apply: Add the four questions as a required review section before approving a new north star.

Anthropic Leak, Compute Bottlenecks, and the Agent Playbook Take Center Stage
Mar 28
8 min read
608 docs
Tibo
Software Mansion
Zixuan Li
+35
The brief covers leaked Anthropic model details and the security fallout, tightening memory and power bottlenecks, the steady open-vs-closed model gap, and new research and product launches across agents, voice, vision, and chip design.

Top Stories

Why it matters: Four themes stood out: frontier-model security, physical infrastructure constraints, the economics of open vs. closed models, and a more formal operating model for AI agents.

Anthropic’s unreleased model leak became a security story

According to posts citing leaked materials, Anthropic has been testing a model called Mythos with select customers. Those posts described it as a new tier above Opus—later edited in one post to Capybara—with stronger results in coding, academic reasoning, and cybersecurity, plus a slow rollout because of compute intensity and security concerns. Fortune was separately cited for reporting that Anthropic left details of an unreleased model in an unsecured data trove.

Impact: Frontier-model competition is now tied not just to capability, but to selective access, cyber risk, and operational security.

Compute constraints are showing up in memory, power, and construction schedules

Epoch AI said the total memory bandwidth of AI chips shipped since 2022 has reached 70 million terabytes per second and is growing 4.1x per year, while AI inference is often bottlenecked by memory bandwidth rather than raw compute. It also said AI chips consumed more than 90% of total HBM production in 2025 and that HBM prices spiked in early 2026 as demand outpaced supply. At the same time, Microsoft said it is partnering with Crusoe on a 900MW AI factory in Abilene, Texas, OpenAI said steel beams went up this week at its Michigan Stargate site with Oracle and Related Digital, and NVIDIA said Vera Rubin + Groq 3 LPX can deliver up to 35x more performance per megawatt for trillion-parameter models and massive context workloads.

Impact: The competitive bottleneck is increasingly about watts, memory bandwidth, and buildout speed—not only model quality.

The open/closed gap is much smaller than it used to be, but the frontier is still closed

Arena said the gap between top open-source and proprietary text models has held at roughly 50-60 points for about 14 months, down from 100-150 points before mid-2024. It also said proprietary models currently occupy the first 20 places on the Text Arena leaderboard, while the leading open models are GLM-5 at #20, Kimi-K2.5-Thinking at #23, and Qwen3.5-397b-a17b at #27. In separate Arena analysis, GPT-5.4 High, Mini, and Nano behaved like scaled versions of the same model, suggesting price differences mainly reflect efficiency rather than different core capabilities.

Impact: Open models are closer than before, but the leading edge still sits with closed labs, and pricing is becoming more about efficiency per task than a simple proxy for intelligence.

The agent era is getting its own playbook

A new Google-linked report argues that intelligence explosions are social rather than individual, and that future progress may come from human-AI configurations and agent institutions rather than bigger monolithic models. In plain language, the argument is that groups of agents with roles, checks, and protocols may matter more than one ever-larger model.

Every prior intelligence explosion in human history was social, not individual.

IBM’s new survey on workflow optimization for LLM agents organizes agent systems by when workflow structure is set, what components are optimized, and which signals guide the optimization. Artificial Analysis also launched AA-AgentPerf, a hardware benchmark for the agent era that uses real coding-agent workloads and reports maximum concurrent users per accelerator, per kW, per dollar, and per rack.

Impact: The discussion is moving from which single model is best to how agent systems should be structured, evaluated, and deployed.

Research & Innovation

Why it matters: Research attention is shifting toward unified multimodal systems, better long-context reasoning, more stable world models, and more realistic evaluations.

  • Apple AToken: Apple introduced AToken, a shared tokenizer and encoder for images, video, and 3D objects in one framework. The post said it beats or rivals specialized models and allows knowledge transfer across media types.
  • SAGE: This closed-loop multi-agent training method co-evolves a Challenger, Planner, Solver, and Critic from one LLM backbone using just 500 seed examples. On Qwen-2.5-7B, it reportedly improved out-of-distribution performance by 4.2% while maintaining in-distribution accuracy.
  • Together Research’s divide-and-conquer approach: A Planner rewrites tasks for parallel Workers and a Manager combines their outputs. Together said Llama-3-70B and Qwen-72B using this setup can match or beat GPT-4o single-shot on long-context retrieval, QA, and summarization as context length grows, though the method still struggles when important clues are spread across distant chunks.
  • LeWorldModel: Yann LeCun’s team released LeWorldModel, described as a world model that avoids collapse by adding a SIGReg regularizer to its prediction loss. The post also claimed 15M parameters, training on one GPU in hours, 48x faster planning, and about 200x fewer tokens for encoding.
  • CursorBench: A new benchmark for coding agents uses real Cursor team coding sessions, evaluates more than functional correctness, emphasizes long-horizon tasks with a median 181 lines changed per task, and keeps the data refreshed with recent sessions.

Products & Launches

Why it matters: Product releases this cycle focused on deployability: lower-latency voice agents, faster video processing, more local execution, and tools that slot directly into agent workflows.

  • OpenAI gpt-realtime-1.5: OpenAI showed a clinic concierge demo for a Singapore health clinic. It speaks naturally with patients, collects the needed details, and books appointments in real time.
  • Meta SAM 3.1: Meta released SAM 3.1 as a drop-in update to SAM 3. Its core change is object multiplexing, which lets the model track up to 16 objects in one forward pass and doubles throughput from 16 to 32 FPS on a single H100 for medium-object videos. Meta said the point is to make high-performance video applications feasible on smaller, more accessible hardware.
  • Cohere Transcribe in the browser: Cohere’s multilingual speech recognition model can run entirely locally in a browser on WebGPU. A post said it can transcribe 1 hour of audio in 100 seconds, is fully private, free, and requires no installation.
  • LiteParse: LlamaIndex’s LiteParse is a model-free, open-source document parser for AI agents. It processes about 500 pages in 2 seconds on commodity hardware, supports 50+ file formats, and is designed to plug into agent tools, while the authors note it is not meant to replace OCR-heavy workflows for scanned documents.
  • Hermes Agent + Hugging Face: Hermes Agent is positioned as an open-source agent that remembers what it learns through a multi-level memory system and persistent machine access. Hugging Face is now a first-class inference provider inside Hermes, with 28 curated models in the picker and custom access to 100+ more.
  • Gemini video creation: Google added a Create video workflow in Gemini’s app and web experience, where users select the tool, describe the video, optionally upload a reference image or choose a template, and generate directly from the interface.

Industry Moves

Why it matters: Business activity keeps pointing to three battlegrounds: capital markets, distribution, and AI-shaped hardware.

  • Anthropic IPO talk is getting more concrete: A post citing reporting said Anthropic is eyeing a Q4 2026 IPO with a raise above $60 billion, that its annualized revenue more than doubled to $19 billion in the first two months of 2026, and that bankers think it could reach public markets before OpenAI because of its enterprise and developer focus plus a shorter projected path to profitability.
  • Perplexity expanded Samsung distribution: Perplexity said it now powers Samsung’s Browsing Assist in Samsung Browser on Galaxy Android and Windows. In a separate post, Aravind Srinivas said the broader partnership now reaches a browser pre-installed on more than 1 billion Samsung devices, extends prior work with Bixby, and includes pre-loading on Galaxy S26 devices alongside Gemini.
  • Microsoft added more physical capacity: Mustafa Suleyman said Microsoft is partnering with Crusoe on a 900MW AI factory in Abilene, Texas to add capacity to its AI fleet and support Microsoft AI infrastructure.
  • RicursiveAI is betting RL can compress chip design cycles: Lightspeed said it led RicursiveAI’s $300 million Series A in January. The company says its reinforcement-learning-based semiconductor design platform can compress chip development from years to weeks.

Policy & Regulation

Why it matters: Formal AI policy is still uneven, but courts, safety packs, and billing controls are increasingly shaping how models are deployed.

  • Anthropic won a major preliminary court ruling: A federal judge in California indefinitely blocked the Pentagon’s effort to label Anthropic a supply chain risk, though the ruling is temporary and a parallel case is still underway in Washington, D.C.
  • OpenAI published a teen safety policy pack: OpenAI released a set of prompt-based safety policies intended to create age-appropriate protections for teens, and published the repository publicly.
  • Gemini API billing is getting harder to overspend: Starting April 1, Gemini API billing tiers get a monthly spending cap, with API access pausing until the next month or a tier upgrade if the cap is hit. Users can also set per-project spend caps in AI Studio.

Quick Takes

Why it matters: These are smaller updates, but they show where tooling, benchmarks, and open-source ecosystems are moving next.

  • OpenAI launched a Codex use-case gallery with starter prompts that can open directly in the app, and separately reset Codex usage limits across all plans so users can experiment with newly launched plugins.
  • GLM-5.1 is now available to all GLM Coding Plan users, and a separate post said GLM-5.1 will be open source.
  • Epoch AI removed one FrontierMath: Open Problems item after GPT-5.2 Pro solved it, because the problem did not meet the benchmark’s minimum notability bar; it also updated sourcing guidelines afterward.
  • Hugging Face’s HF Papers CLI adds semantic search and markdown retrieval for arXiv papers, aimed at supporting auto-research workflows.
  • Strix packages multi-agent application pentesting with a built-in browser, proxy, terminal, and Python runtime, aiming to cut automated pentesting from weeks to hours.
  • React Native ExecuTorch v0.8.0 adds Vision Camera integration for real-time computer-vision inference on live camera feeds, including support for RF-DETR and Liquid AI’s vision-language models.
  • Qdrant is pushing sparse embeddings for e-commerce search, arguing they preserve exact matches and interpretability better than dense embeddings for product attributes such as SKU, size, and brand.
  • Huawei’s 950PR AI chip was priced at ¥70,000 with a 2H shipment target of 750,000 units, while one commenter argued it is not comparable to Nvidia’s H200 for training workloads.
Final RFS Volumes, Fertilizer Disruptions, and Acreage Bets Reset Ag Markets
Mar 28
9 min read
200 docs
Successful Farming
Foreign Ag Service
Secretary Brooke Rollins
+8
Final U.S. biofuel volumes, tightening fertilizer flows through the Middle East, and late-March acreage uncertainty are driving fresh moves in grains, livestock, and farm input markets. This brief also highlights fertilizer-saving field systems, manure-based nutrient strategies, and regional supply disruptions in Brazil and the U.S.

Market Movers

  • U.S. biofuels / oilseeds: USDA and allied agencies framed the final RFS volumes as creating $31 billion in 2026 value for U.S. corn and soybean oil and boosting net farm income by $3-4 billion, with more export opportunity for ethanol and co-products. Market reaction was more restrained: soybean oil had already rallied 16-17¢/lb since January, the ethanol mandate stayed at 15 billion gallons, and traders focused on the 70% SRE reallocation plus full RINs for foreign feedstocks and fuels, which some analysts said could cap soybean oil near 72-76¢/lb because Argentine oil is still 16-17¢ under Chicago and supplies remain ample.
  • U.S. soybeans: Midwest soybean basis improved about 25¢/bu in under two weeks, which market participants tied to strong domestic crush and farmer reluctance to sell after prior volatility. But demand expectations around China remain split heading into the May 14-15 Beijing visit; some market talk still hopes for bigger purchases, while another analyst cut expected old-crop business to 3 MMT from 8 MMT and said getting more than 3-4 MMT by late summer would be difficult.
  • Grain money / energy: Agricultural ETFs took in $149 million for the Invesco Ag Fund and $48 million for the Teucrium Corn ETF over five days, with more than $500 million entering the broader ag ETF category over the past month. At the same time, analysts argued grains remain cheap relative to the broader commodity complex, and noted that crude near $96/barrel has historically coincided with corn above $6 in many instances.
  • Wheat: Weather premium is building in the western Plains. Analysts cited ongoing dryness in western Kansas, Oklahoma, and Texas, with little rain in the next 10 days, while U.S. wheat has risen about 60¢/bu since March 1 versus about 8¢/bu in Russia. Separate market commentary said southern Plains wheat still needs rain soon, and next week could become more important if the coming event disappoints.
  • Livestock: Feeder cattle futures rose $10.78 on the week to $361.95/cwt, and April live cattle gained $4.70 to $238.75/cwt, even with fed steer cash nearly flat at $234.95/cwt. Drivers cited included strong retail beef and grilling demand, improving packer margins, a U.S. herd at 86.2 million head—the lowest in 75 years—and screw-worm concerns keeping roughly 120,000 Mexican feeders from crossing the border for now. In hogs, the Mar. 1 Hogs & Pigs report showed 74.3 million head total inventory with breeding inventory down 1.5%, which analysts characterized as slightly bullish.

Innovation Spotlight

  • U.S. strip-till nutrient placement: On-farm testing showed that banding fertilizer in a 10-12 inch zone below the seed in the root zone produced the same yields with 60% of applied fertilizer, a 40% reduction versus broadcast. Operators said those savings can help pay for strip-till equipment under tight margins and high input costs, and they are pairing strip till with Y-drops, in-furrow products, and 2x2 placement to keep nutrients near the row and root zone.

"we have made the same yields with 60% of applied fertilizer."

  • Mississippi Delta precision fertilizer: Variable-rate starter maps call for 8 gallons only where needed and shut off entirely in low-response areas. One farm said trusted data let it remove $330,000 from the fertilizer budget three years ago without sacrificing returns. The same operation also found that running 20 psi on raised beds pinched rows and caused 10-17 bu/acre losses under tractor tracks, reinforcing a strategy of stacking many 2-5 bushel gains rather than chasing a single large breakthrough.
  • U.S. row-crop/livestock integration: A Kentucky contract-hog operation uses manure to reduce purchased fertilizer, extend nutrients across an extra 200 acres each year, visibly improve soil health, and raise corn budgeting from 170 to 190 bu/acre. The business model also rests on a 10-year contract and steady monthly payments, which the operator said lowered financing risk and reduced exposure to market volatility.

Regional Developments

  • Brazil / Strait of Hormuz: Maritime tracking showed about 20 ships carrying roughly 782,000 tons of fertilizer waiting near the Strait of Hormuz; the estimate is considered conservative because some vessels disable AIS and the actual volume could be higher. Brazil has also arranged an alternative export route through Turkey for chicken, beef, sugar, and corn, but logistics costs were estimated around 350% higher and insurance about 10x normal levels.
  • Rio Grande do Sul, Brazil: Diesel shortages have spread to at least 170 municipalities, with 9 in 10 stations reporting supply problems. In Tupanciretã alone, roughly 150,000 ha of summer crops, including more than 141,000 ha of soybeans, are exposed to rationing or lack of fuel. Diesel has reached about R$8/liter, roughly R$2 above pre-war levels, putting soybean harvest and winter planting at risk.
  • South Brazil: The first 2026 heat wave is pushing temperatures into the 35-38°C range, with up to 40°C near the Paraguay border. Producers in Paraná, Santa Catarina, and southern Mato Grosso do Sul are being told to delay second-crop corn planting because soil temperatures are climbing, and forecast rain in key producer areas is not expected to repair moisture deficits.
  • United States: Roughly 75% of the lower 48 remains in drought, with central-U.S. soil moisture deficits still large enough to worry spring fieldwork, although heavier rain next week could help recharge some profiles. Planting is moving rapidly in the southern Delta because of heat, while much of the Midwest is still waiting on last frost and more moisture; dryness remains the bigger concern west of the Corn Belt and into the Plains.

Best Practices

  • Corn rootworm control (U.S. Corn Belt): Bt alone is not sufficient where rootworm pressure is high, because roots can be damaged before larvae die and secondary pests such as seed corn maggots, wireworms, white grubs, and seed corn beetles are not controlled. The recommendation is to use an insecticide at planting, and where Bt resistance is suspected, pair insecticide with SmartStax Pro or VT4 Pro to add RNAi protection.
  • Grain storage and marketing (U.S. Midwest): One Illinois producer said on-farm soybean storage captured a move from $9.75/bu at harvest to $11.50/bu in March. Bin monitoring systems were also credited with preventing spoilage and fire risk after a year in which at least four grain bins were reportedly lost locally, and with rehydrating beans from 8-9% moisture back toward 13%, which can add 5-10% saleable weight.
  • Manure as a soil program (U.S. row-crop/livestock farms): Rotating hog manure applications can extend fertility over more acres and materially change soil condition; one Kentucky farm described former white dirt turning darker with more earthworms after repeated applications. The same operation linked manure use to higher corn yield targets and lower dependence on commercial fertilizer.
  • Low-stress cattle handling (U.S. range cattle): On one Idaho ranch, redesigning facilities for counterclockwise cattle flow and working off the animals’ left-eye response cut preg-check throughput to 45-50 seconds per head and was associated with better grazing continuity; the operator cited a potential 1 body condition score difference, or about 85 lb, when cattle did not interrupt grazing under stress.

Input Markets

  • U.S. nitrogen: The U.S. still needs roughly 5.1 million tons of urea imports for the 2025-26 fertilizer year and had brought in about 3.8 million tons through March, leaving about 1 million tons to source in April-May. About half of typical imports come from the Middle East and 25% from Russia. NOLA urea moved from about $473/ton before the Strait crisis to $695/ton afterward, even though analysts still described domestic values as $60-70/ton below world-equivalent economics.
  • Global fertilizer availability: Europe is running at roughly 75% of normal nitrogen production, a 3.5 million ton annualized shortfall; China has halted urea exports until at least August 2026, removing another 5-5.5 million tons; and disruptions affecting Qatar, Iran, and Saudi Arabia put roughly 13.5 million tons of global nitrogen supply at risk. On phosphate, China’s usual 8-10 million ton export program remains sidelined, Saudi supply is blocked, and U.S. phosphate operating rates have hovered around 75% or below since 2021 while sulfur and anhydrous costs rise.
  • Farm response and acreage risk: High fertilizer prices are already pushing growers to rethink rotations, with examples of intended shifts from 50/50 corn-soy to 70/30 beans, reduced nitrogen and phosphate rates, and substitution toward cheaper anhydrous or UAN where possible. Brownfield also flagged the same dynamic as a feed-cost risk: fewer corn acres would raise feed prices for livestock into 2026. A proposed U.S. Fertilizer Transparency Act would require fertilizer price reporting to improve visibility for buyers and sellers.
  • Agricultural chemicals: Chemical risk is tightening even without new price data. Minnesota confirmed glufosinate-resistant waterhemp, narrowing control options across the Midwest. On the product side, BASF’s Surtain was promoted as a PPO residual herbicide for corn that can be used from pre-emergence through early post-emergence.

Forward Outlook

  • USDA reports are the next major decision point. Ahead of the March 29/31 data, trade estimates cluster around 94.4 million corn acres, with a 92-96 million range and one private estimate at 96.4 million; soybean acres are centered around 85.5-86.1 million. Quarterly corn stocks are expected to be roughly 1 billion bushels above last year, although one analyst argued feed and residual use may be overstated by about 250 million bushels, which would place ending stocks closer to 2.4 billion.
  • Acreage surprises are still possible. Analysts repeatedly said rising urea costs could shift more land from corn to soybeans, with one source floating a potential 6-7 million acre swing on the last, unpriced fertilizer volumes. At the same time, skepticism around USDA survey accuracy is elevated because of low mail response rates, and several sources expect meaningful revisions again by June depending on weather.
  • Soybean demand remains headline-sensitive. One analyst cut expected old-crop U.S. soybean business with China from 8 MMT to 3 MMT and said getting more than 3-4 MMT out the door by late summer looks difficult. Another source said U.S. soybean sales are already occurring and broader talks may extend to cotton, rice, and sugar. If the May meeting produces no new signal on Chinese demand, one market participant warned soybeans could "fall right back".
  • Positioning and inputs make the downside two-sided. Grain markets were described as carrying about 540,000 contracts of speculative length in Chicago, raising the risk of a cascading selloff if the war premium unwinds. Even if the Strait reopens, fertilizer backlogs may persist because damaged gas plants need repairs and Gulf ports are not designed to load a large backlog of ships at once.

Your time, back.

An AI curator that monitors the web nonstop, lets you control every source and setting, and delivers one verified daily brief.

Save hours

AI monitors connected sources 24/7—YouTube, X, Substack, Reddit, RSS, people's appearances and more—condensing everything into one daily brief.

Full control over the agent

Add/remove sources. Set your agent's focus and style. Auto-embed clips from full episodes and videos. Control exactly how briefs are built.

Verify every claim

Citations link to the original source and the exact span.

Discover sources on autopilot

Your agent discovers relevant channels and profiles based on your goals. You get to decide what to keep.

Multi-media sources

Track YouTube channels, Podcasts, X accounts, Substack, Reddit, and Blogs. Plus, follow people across platforms to catch their appearances.

Private or Public

Create private agents for yourself, publish public ones, and subscribe to agents from others.

Get your briefs in 3 steps

1

Describe your goal

Tell your AI agent what you want to track using natural language. Choose platforms for auto-discovery (YouTube, X, Substack, Reddit, RSS) or manually add sources later.

Stay updated on space exploration and electric vehicle innovations
Daily newsletter on AI news and research
Track startup funding trends and venture capital insights
Latest research on longevity, health optimization, and wellness breakthroughs
Auto-discover sources

2

Confirm your sources and launch

Your agent finds relevant channels and profiles based on your instructions. Review suggestions, keep what fits, remove what doesn't, add your own. Launch when ready—you can always adjust sources anytime.

Discovering relevant sources...
Sam Altman · Profile
3Blue1Brown · Channel
Paul Graham · Account
The Pragmatic Engineer · Newsletter · Gergely Orosz
r/MachineLearning · Community
Naval Ravikant · Profile
AI High Signal · List
Stratechery · RSS · Ben Thompson

3

Receive verified daily briefs

Get concise, daily updates with precise citations directly in your inbox. You control the focus, style, and length.

SwiftUI Agent Workflows Get Real as Cursor Pushes Demo-First Review
Mar 28
6 min read
91 docs
Romain Huet
Vadim
Cursor
+8
Simon Willison published the most copyable workflow of the day: small native SwiftUI tools built with Claude Opus 4.6 and GPT-5.4, Git checkpoints, and aggressive pattern reuse. Also inside: Cursor's million-commit signal, Codex starter prompts, Kody's secret-safe auth flow, and the eval checklist worth stealing.

🔥 TOP SIGNAL

Today's best practical signal is Simon Willison's fully documented SwiftUI loop: Claude Opus 4.6 and GPT-5.4 were good enough to build tiny macOS utilities as single-file apps without opening Xcode. His repeatable pattern is simple: start from a concrete monitoring question, ask for a native app in /tmp, checkpoint early with git init and git commit, then add narrow follow-up prompts until the app becomes a menu bar tool. The caveat is the useful part too: Willison says he barely reviewed the generated code and was not confident the reported system metrics were always correct, so treat this as a fast prototyping loop, not automatic ground truth.

🛠️ TOOLS & MODELS

  • SwiftUI is suddenly a live agent target. Simon says Claude Opus 4.6 and GPT-5.4 are both competent enough at SwiftUI to build single-file macOS apps without Xcode; separately, Codex now has a use-case gallery with starter prompts you can open directly in the app, including an iOS workflow with SwiftUI skills folded into the plugin.
  • Cursor is leaning into autonomy plus richer review. Michael Truell says Cursor cloud agents produced over a million commits in the last two weeks and that these were essentially all AI-generated with little human intervention because the agents run code on their own computers. Cursor also now shows demos, not diffs: agents can use the software they build and send back video of the result.
  • Kody is shipping a real secret-boundary pattern. Kent C. Dodds says Kody can now dynamically handle secrets for auth with any provider and any auth mechanism without giving the model access to the secret. He used that flow from Claude to build a Spotify control app with secure OAuth handling.
  • Claude Code got a power-user hook filter. Hooks now support an if field using permission-rule syntax, so you can run them on specific bash commands instead of every command.
  • Linear's agent numbers are getting hard to ignore. Theo highlighted Linear's reported stats: coding agents are installed in more than 75% of enterprise workspaces, agent-completed work volume grew 5x in the last three months, and agents authored nearly 25% of new issues. Linear also rolled out a workspace agent, Skills, and Automations.

💡 WORKFLOWS & TRICKS

  • Steal Simon's tiny-native-app loop. Ask the agent the underlying diagnostic question first; once it proves the data is accessible, prompt it to create a native SwiftUI app in /tmp; immediately commit a working baseline; ask the model to suggest next features; then request one concrete improvement at a time such as per-process stats, reverse DNS, layout changes, and a menu bar icon. For the next app, point the agent at the first repo and tell it to imitate the useful pieces. Transcript + code: Bandwidther, Gpuer, repo 1, repo 2.
  • Use Git as agent memory, not just backup. Willison's explicit move was git init and git commit what you have so far before branching into new features, and he argues repositories are core tooling for ambitious agent work because they let you inspect and reverse changes cleanly.
  • Teach the framework before asking for the app. Simon had Claude clone Starlette 1.0 and generate a skill markdown with feature examples, copied that into Claude skills, then started a fresh session asking for a task app. Claude wrote the app and manually tested it with TestClient, which is a very usable pattern for framework-heavy work. Repo: TaskFlow.
  • Keep secrets out of model context. Kent's concrete prompt was "Use Kody to create an app that can control my Spotify"; Claude then guided app setup, handled OAuth securely, and never needed the tokens or client secrets in-model. He says the same pattern works in any MCP-supporting client. Demo link: Spotify control app.
  • Eval coding agents like software, not vibes. LangChain's checklist is concrete: manually review 20-50 real traces before building eval infrastructure; define unambiguous pass/fail; separate capability evals from regression evals; start with full-turn evals that verify final response, trajectory, and actual state changes; for multi-turn, use N-1 testing; and run coding-agent trials in fresh containers or VMs before wiring regression evals into CI/CD gates. Full checklist: LangChain's guide.
  • Tune the tool surface before the prompt. LangChain's most reusable advice: tool interface design can eliminate entire classes of agent errors; their example is requiring absolute file paths so navigation mistakes become impossible.
  • Recurring pattern across operators: prototype first, spec second. Theo says his old Twitch loop was to build a scrappy version in 1-3 days to test UX and surface technical constraints, then either write a much better spec or just ship the first pass. He still thinks code is often the best planning artifact.
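The Git-as-memory move from the bullets above can be sketched as a short shell session; the directory name, file name, and commit messages are illustrative, not taken from Willison's transcripts:

```shell
# Scratch directory for the agent-built app (name is illustrative)
mkdir -p /tmp/bandwidth-app && cd /tmp/bandwidth-app

# The agent writes the single-file app here, e.g.:
touch BandwidthApp.swift

# Checkpoint the first working baseline before branching into features
git init
git config user.email "agent@example.com"   # throwaway identity for a scratch repo
git config user.name "Agent"
git add .
git commit -m "baseline: first working single-file app"

# After each narrow follow-up prompt, commit again so every agent
# change can be inspected or reversed with plain git
echo "// per-process stats from follow-up prompt" >> BandwidthApp.swift
git add .
git commit -m "add per-process stats"
git diff HEAD~1 -- BandwidthApp.swift   # review exactly what the agent changed
```

The point of one commit per prompt round is that `git diff` and `git revert` become the review and undo surface for agent output, instead of trusting the agent's own summary of what it did.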
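The tool-surface advice above can be illustrated with a minimal sketch; `read_file_tool` and its error message are hypothetical examples, not LangChain API: by rejecting relative paths at the tool boundary, a whole class of agent navigation mistakes fails loudly instead of silently reading the wrong file.

```python
import os

def read_file_tool(path: str) -> str:
    """Agent-facing file reader that requires absolute paths.

    Rejecting relative paths at the tool boundary means a stale working
    directory can never make the agent read the wrong file silently --
    the mistake surfaces as an error the agent can correct.
    """
    if not os.path.isabs(path):
        raise ValueError(
            f"path must be absolute, got {path!r}; "
            "resolve the full path before calling this tool"
        )
    with open(path, encoding="utf-8") as f:
        return f.read()
```

The same idea generalizes: any tool argument that leans on implicit state (current directory, "last" result, an ambient selection) is a place where a stricter interface contract can delete an error class outright.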

👤 PEOPLE TO WATCH

  • Simon Willison — still one of the best public labs for agentic coding because he publishes the prompts, transcripts, repos, and the warnings about what he does not trust yet.
  • Kent C. Dodds — worth following if you care about agent security that survives contact with real APIs; Kody's secret-handling and Spotify demo are concrete, not theoretical.
  • Romain Huet — good pulse on where Codex is getting operationally better: starter-prompt workflows today, and strong user anecdotes on bug-fixing performance. In one example he amplified, a user said Codex one-shotted a bug that Claude Code had spent four hours on, and Huet said he sees many cases like that.
  • Michael Truell — if Cursor's cloud agents are already pushing over a million essentially AI-generated commits in two weeks, that is a real scale signal for unattended code execution.
  • Theo — strong signal on workflow judgment: prototype-first building, skepticism of role-played org-chart skills, and a useful reminder that automations may be more valuable than many devs assume.

🎬 WATCH & LISTEN

  • 6:07-9:29 — Theo on prototype-first development. Worth the time for one specific loop: build a rough version fast, use it to discover UX and dependency truth, then decide whether to spec properly or just ship the prototype.
  • 17:06-20:31 — Theo on code as planning. The spicy part is optional; the useful part is his claim that role-played agent personas and bloated planning docs are often worse than letting the model do a first-pass build and learning from that artifact.

📊 PROJECTS & REPOS

  • Bandwidther — network bandwidth monitor built from minimal prompts. The real value is that the full transcript is public, so you can copy the exact build loop instead of guessing.
  • Gpuer — GPU/RAM monitor built by pointing the agent at Bandwidther and asking for a similar app. Best example in today's batch of repo-to-repo recombination as a workflow primitive.
  • TaskFlow — useful less as an app and more as a pattern: create a framework skill first, then generate and test a fuller app against that skill.
  • Kody PR #61 — secure secret routing for any provider and auth mechanism without exposing the secret to the model. This is one of the most important practical safety patterns in today's feed.

Editorial take: the sharpest teams today are not chasing maximum autonomy; they're building small reusable loops, keeping Git and evals close, and drawing hard boundaries around secrets and real-world state.

Works in Progress Leads Today’s High-Signal Founder Picks
Mar 28
2 min read
143 docs
arctotherium
Sam Bowman
Patrick Collison
+1
Today’s clearly organic recommendations are sparse but useful: Patrick Collison strongly endorses Works in Progress with a direct subscription link, while Marc Andreessen points readers to a thread on the 2015-2022 history of internet platform moderation.

Most compelling recommendation

Only clearly organic, non-self-promotional recommendations are included below.

Patrick Collison’s Works in Progress pick is the strongest recommendation in today’s set because it combines a direct endorsement with an exact, immediately usable link.

  • Title: Works in Progress
  • Content type: Publication subscription
  • Author/creator: Works in Progress
  • Link/URL: https://worksinprogress.co/print/
  • Who recommended it: Patrick Collison
  • Key takeaway: Collison says it "continues to get better with each issue" and explicitly recommends subscribing.
  • Why it matters: The upcoming issue is described as covering ASML, new Hindu megatemples, egg freezing, and Australia’s small boat crossings policy, which gives readers a concrete sense of the publication’s range.

"Works in Progress continues to get better with each issue. Highly recommend a subscription"

Also notable

  • Title: Master thread on the 2015-2022 closure of the Internet
  • Content type: X thread
  • Author/creator: @arctotherium42
  • Link/URL: https://x.com/arctotherium42/status/2037324942069342679
  • Who recommended it: Marc Andreessen
  • Key takeaway: Andreessen signals that the thread is something readers should not forget; the thread describes 2015-2022 as a period when major internet platforms moved from broad openness to stricter narrative enforcement, often with governments and NGOs involved in moderation.
  • Why it matters: It gives readers the exact thread Andreessen is elevating as historically important, along with a strong signal that he thinks it is worth revisiting.
Agents Stretch Into Multi-Day Work as Inference Takes Center Stage
Mar 28
4 min read
157 docs
Jack Clark
Jeff Dean
+7
Today’s clearest thread is duration: leading researchers say agents are starting to sustain hours- or days-long work, while chips and datacenters are being redesigned around inference-heavy demand. The same cycle is sharpening the labor debate and the fight for default distribution on major devices.

The big shift

Longer-running agents are moving from benchmark wins to extended workflows

Onstage at GTC, Jeff Dean said the clearest recent gains are on verifiable tasks such as math and coding, citing Gemini gold medals at the IMO and ICPC, and added that agent workflows now let models pursue tasks that run for hours or even days with less close supervision. Jack Clark described agents more plainly as language models that use tools over time, and said Anthropic now sees some research projects shrink from two to three weeks to one to two days.

A separate controlled experiment shared on /r/MachineLearning pointed in the same direction: a Claude Code agent with access to 2M+ CS papers beat an otherwise identical setup by 3.2% on TinyStories after 100 automated experiments. Why it matters: The newest gains are increasingly tied to sustained tool use, retrieval, and iteration, not just better one-turn answers.

The infrastructure response

Inference is overtaking training as the main systems problem

"Inference is the job now."

Bill Dally said more than 90% of datacenter power is already going to inference, with different hardware needs emerging for training, prefill, and decode; he said Nvidia is targeting low-latency architectures that could run relatively large models at 10,000-20,000 tokens per second per user. He also argued that moving data dominates energy cost: reading a value from external memory can be roughly 1,000x costlier than a multiply-add, which is why Nvidia is exploring SRAM-local computation and stacked DRAM.

The same session suggested AI is starting to reshape chip design itself. Nvidia said NBCell can port cell libraries overnight with results that match or exceed human designs, Prefix RL produced adders 20-30% better on size, power, and timing metrics, and Google said AlphaChip has already helped with multiple TPU generations. Why it matters: The center of gravity is moving from raw training scale toward inference latency, memory movement, and AI-assisted hardware design.

Demand is showing up in both revenue and concrete

In a Plain English interview, the host said analysts estimate Anthropic's annual recurring revenue more than doubled from $9B in December 2025 to more than $20B in March 2026, with no known precedent at that scale. Physical capacity is moving in parallel: Microsoft said it is partnering with Crusoe on a 900MW AI factory campus in Abilene, Texas, and Sam Altman said the first steel beams went up this week at the Michigan Stargate site with Oracle and Related Digital.

Why it matters: Strong revenue growth is still being translated into very large compute buildouts, which makes AI demand look durable rather than purely speculative.

The policy and distribution questions

Labor warnings are getting sharper, but the timeline is contested

Senator Mark Warner said recent college graduate unemployment could rise from about 9% to 30%, pointed to law firms pausing first-year associate hiring, slashing back-office headcount, and cutting internships, and backed bipartisan bills to report AI-driven job losses and generate policy responses. He said government and society are not ready for the next three to five years of disruption.

Jack Clark, by contrast, said he does not agree with Dario Amodei's forecast of 20% unemployment and the loss of half of entry-level white-collar roles within about five years; he argued big employment shifts usually take time, policy choices matter, and today's agents multiply productivity more than they fully replace people. Why it matters: The debate has shifted from whether AI will affect white-collar work to how fast the shock arrives and what policy response should be ready first.

Default distribution is becoming a strategic front

Perplexity said it is deepening its Samsung partnership to power AI in the browser preinstalled on more than 1B Samsung devices with 100M+ active users, extending existing work on Bixby and preload on Galaxy S26 devices alongside Gemini. Separately, Big Technology highlighted a report that Apple will open Siri to AI assistants from rival companies in iOS 27.

Why it matters: The contest is no longer just about model quality; placement inside browsers, assistants, and operating systems could decide who actually reaches users at scale.

Prototype-Literate PMs, Healthier Metrics, and Practical AI Workflows
Mar 28
10 min read
42 docs
Aakash Gupta
Tony Fadell
Elena Verna
+5
Why prototype-building is becoming part of the PM baseline, where AI actually creates leverage, and how to tighten execution with better metrics, post-mortems, and onboarding design. Also includes hiring signals and tool recommendations PMs can test now.

Big Ideas

1) Prototype literacy is becoming a core PM skill

"Instead of doing some case study and presentation, you need to be ready to build a full blown app as part of the interview."

Elena Verna argues that functional prototyping is becoming standardized across roles, not just PM. She also says turning a PRD into an interactive artifact improves the PRD itself, helps sell the idea faster, and gives engineers and designers a clearer shared vision.

Why it matters: The expected PM artifact is expanding from documents to clickable experiences. In Verna's workflow, the written spec is intentionally kept to a one-pager, with more detail discovered through prototyping.

How to apply: Write the shortest useful spec, prototype it immediately, and use the gaps you find to tighten the hypothesis before engineering starts. Keep engineering in the ideation loop rather than treating the handoff as closed.

2) AI gives PMs leverage unevenly: strongest in critique, synthesis, analysis, and execution hygiene

The Exponent framework splits PM work into vision, strategy, design, and execution. In that model, AI is already useful for customer insight synthesis, AI-moderated interviews, natural-language data analysis, prototyping, meeting agendas and summaries, and critiquing an existing strategy. Verna frames the same pattern another way: AI can do the first 30-50% of baseline work across PRDs, prototypes, and marketing plans, so PMs react and refine instead of starting from a blank page.

Why it matters: The fastest gains are in compressing time-to-insight and time-to-artifact, not outsourcing the hardest judgment calls.

How to apply: Use AI to research, critique, summarize, query data, and draft prototypes. Do not outsource direct customer contact, product vision, or the final strategic bet.

3) North star metrics need a pressure test before they turn into local optimizers

Run the Business offers four meta-questions to pressure-test north star metrics: Would you be proud of the behavior in 18 months? Can the team explain in one sentence how the metric makes a customer's life better? If every team's north star sat on one page, would any of them compete? And if you 10x the metric, would that be incredible or terrible?

Why it matters: The framework is explicitly designed to catch anti-patterns before a metric starts driving the wrong behavior or creating cross-team conflict.

How to apply: Run these four questions in your next review. If the customer-value sentence is weak or a 10x outcome sounds bad, treat that as a metric-design problem, not just a reporting issue.

Tactical Playbook

1) Use AI as a strategy critic, not a strategy author

  1. Create a project with explicit devil's-advocate instructions and tell the model not to be nice.
  2. Load opinionated strategy best practices into project knowledge, such as course material or a strategy book summary.
  3. Paste in your strategy and ask for critique.
  4. Look for concrete issues like an audience that is too broad or a claimed moat that is not actually defensible.
  5. Rewrite the strategy yourself; AI can critique the bet, but it is not the owner of the bet.

Why it matters: The demo shows critique quality improves when the model is grounded in a specific standard of good work, not a generic prompt.

How to apply: Save this as a reusable review step before leadership reviews. If the output feels generic, add exemplars rather than more vague instructions.
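The five steps above boil down to plain prompt assembly. A minimal sketch; the system instruction and reference standards below are illustrative stand-ins, not an actual template from the source:

```python
# Sketch of a reusable "strategy critic" project as prompt assembly.
# The wording of SYSTEM and KNOWLEDGE is hypothetical.
SYSTEM = (
    "You are a devil's-advocate strategy reviewer. Do not be nice. "
    "Judge the strategy strictly against the reference standards provided."
)

# Stand-ins for "opinionated strategy best practices" loaded as project knowledge.
KNOWLEDGE = [
    "Good strategy names a specific audience, a defensible moat, and a kill criterion.",
    "A moat competitors can copy within a quarter is not a moat.",
]

def build_review_request(strategy_text: str) -> str:
    """Assemble one critique request from the role, the knowledge, and the strategy."""
    standards = "\n".join(f"- {k}" for k in KNOWLEDGE)
    return (
        f"{SYSTEM}\n\nReference standards:\n{standards}\n\n"
        f"Strategy to critique:\n{strategy_text}\n\n"
        "List concrete weaknesses, e.g. an audience that is too broad "
        "or a claimed moat that is not defensible."
    )

request = build_review_request("We will win SMBs with a better UI.")
print(request)
```

Swapping in better exemplars (step 2) means editing KNOWLEDGE, not the instructions, which matches the advice above: add exemplars rather than vaguer prompts.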

2) Build a weekly insight loop from surveys, NPS, and customer data

  1. Upload raw survey or NPS data to Claude and ask for the basics first: score, promoter/passive/detractor mix, trends, and segment cuts.
  2. Add segmentation fields you already have, such as email type, plan type, or usage intensity.
  3. Check significance where the analysis provides it.
  4. Turn the output into an executive readout if needed, including an AI-generated deck in Gamma.
  5. Once you trust the workflow, move from occasional reporting to a recurring cadence.

Why it matters: In the example, work that previously took about a week became fast enough to support weekly reporting instead of quarterly review.

How to apply: Start with one recurring customer metric and three segments. Expand only after you can verify the numbers and logic.
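Step 1's basics, the score and the promoter/passive/detractor mix with a segment cut, are easy to recompute locally as a check on what the model reports. A minimal pandas sketch with made-up data and column names:

```python
import pandas as pd

# Hypothetical survey responses; the scores and "plan_type" segment are made up.
df = pd.DataFrame({
    "score":     [10, 9, 8, 6, 9, 3, 10, 7, 9, 2],
    "plan_type": ["pro", "free", "pro", "free", "pro",
                  "free", "pro", "pro", "free", "free"],
})

def bucket(score: int) -> str:
    # Standard NPS buckets: 9-10 promoter, 7-8 passive, 0-6 detractor.
    if score >= 9:
        return "promoter"
    return "passive" if score >= 7 else "detractor"

df["bucket"] = df["score"].map(bucket)

def nps(frame: pd.DataFrame) -> float:
    # NPS = share of promoters minus share of detractors, on a -100..100 scale.
    shares = frame["bucket"].value_counts(normalize=True)
    return 100 * (shares.get("promoter", 0.0) - shares.get("detractor", 0.0))

print(f"Overall NPS: {nps(df):.0f}")
for plan, group in df.groupby("plan_type"):  # segment cut (step 2)
    print(f"  {plan}: {nps(group):.0f}")
```

Verifying one number like this before trusting the generated readout is exactly the "verify the numbers and logic" gate above.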

3) Improve AI data analysis with example pairs

  1. Let PMs ask data questions in plain English and inspect the generated SQL when needed.
  2. Expect early errors when schemas are messy or outdated.
  3. Give the model past natural-language question and SQL pairs as project knowledge.
  4. Reuse the same exemplar pattern for other PM work, including interview guides and strategy critique.

Why it matters: Rekhi's example is not just faster query writing; it expands the number of questions a PM can realistically ask from a few to all of them.

How to apply: Build a small internal library of approved examples, then graduate from ad hoc queries to scheduled dashboards.
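One way to hand the model those exemplar pairs (step 3) is as a few-shot prompt. A minimal sketch; the questions, SQL, and table names below are hypothetical:

```python
# Sketch of grounding a natural-language data analyst with question -> SQL
# exemplar pairs. The example questions, SQL, and schema are made up.
EXEMPLARS = [
    ("How many active users signed up last week?",
     "SELECT COUNT(*) FROM users "
     "WHERE created_at >= DATE('now', '-7 days') AND status = 'active';"),
    ("What is average revenue per plan type?",
     "SELECT plan_type, AVG(mrr) FROM subscriptions GROUP BY plan_type;"),
]

def build_prompt(question: str) -> str:
    """Prefix the new question with the approved question -> SQL pairs."""
    parts = [
        "Translate plain-English product questions into SQL.",
        "Match the style of these approved examples exactly.",
        "",
    ]
    for q, sql in EXEMPLARS:
        parts += [f"Q: {q}", f"SQL: {sql}", ""]
    parts += [f"Q: {question}", "SQL:"]
    return "\n".join(parts)

print(build_prompt("Which segment churned most last month?"))
```

As the internal library of approved pairs grows, EXEMPLARS becomes the asset; the prompt scaffolding stays fixed.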

4) Make post-mortems blameless but operational

"Work the problem"

  1. Name the single core reason the launch failed.
  2. Add three supporting reasons or evidence showing how that problem appeared in data or feedback.
  3. Use screenshots or other specifics to show what happened.
  4. Model accountability by taking blame yourself so others can do the same.
  5. End with process changes for next time, without overreacting to one incident.

Why it matters: This was shared as a practical structure for leadership-facing post-mortems, where the goal is reuse and learning rather than blame.

How to apply: Use the structure as the spine, then connect any survey data or stakeholder feedback back to the core reason and its supporting evidence.

5) When something ships broken, own the conversation without creating a product-vs-engineering culture

PMs are often the first stop for bugs, delays, and quality issues from stakeholders and customers. The discussion emphasizes owning the conversation rather than deflecting, avoiding a product-vs-engineering culture, giving the team credit when things go well, and taking blame when they do not.

Why it matters: Public ownership builds trust and keeps momentum, while quality issues can also signal deeper team problems.

How to apply: Acknowledge the issue, state the recovery plan, protect the team externally, and then use the failure to inspect process or coordination problems internally.

Case Studies & Lessons

1) Onboarding by doing beat onboarding by explaining

One PM spent about 40% of initial development time on a polished onboarding with animations, progress indicators, and tooltips, yet day-1 retention was 21% because users skipped through the flow, reached the main app confused, and left. Rebuilding the first-use experience so users performed the core action lifted day-1 retention to 44%.

Lesson: Users can complete onboarding without learning anything. What mattered in the follow-up comment was core-task success on day one, not tutorial completion.

How to apply: Watch the first session, find the first meaningful action, and redesign onboarding around doing that action instead of explaining it first.

2) Faster analysis changed the team's learning cadence

In the NPS demo, a CSV upload produced score summaries, monthly trends, segment comparisons, and statistical significance checks. Because the analysis became much faster, the team could move from quarterly NPS review to continuous weekly reporting and get more fresh insights.

Lesson: AI changes operating cadence when the bottleneck is analysis time.

How to apply: Look for recurring insight work that is currently too slow or too rare, and automate the end-to-end workflow rather than just one step.

3) Spec, prototype, and GTM draft can now happen in parallel

Verna's nonprofit feature demo starts with a ChatGPT tech spec intended to get 30-50% of the way to a usable draft. She then turns it into a one-pager, generates a prototype in Lovable and spends the next couple of hours editing structure, content, and visuals, while ChatGPT works in parallel on ICP, distribution partners like TechSoup, and examples to tear down such as Google and Slack nonprofit programs. She says this gets her to spec, prototype, and GTM thinking in roughly three hours, with engineering pulled into ideation rather than a final handoff.

Lesson: The power is parallel drafting plus human reaction, not expecting a one-shot answer.

How to apply: Run product, UX, and GTM thinking in parallel, then use human taste to edit hard and involve engineering before priorities are locked.

Career Corner

1) The job market is improving, but not evenly

PM openings are at the highest levels seen in more than three years. At the same time, Bay Area importance is rising, remote opportunities are declining, and recruiter outreach is surging, a leading indicator of sustained hiring demand.

Why it matters: Better hiring volume does not mean an easier search if your location or remote requirements are narrow.

How to apply: Broaden Bay Area-targeted searches if possible, reset expectations on remote-only roles, and watch recruiter activity as a sign that demand is holding.

2) The skills that still compound are not the ones AI can average away

Verna's non-automated list is direct customer interaction, setting the vision and destination, understanding marketing and distribution as software commoditizes, and building functional prototypes.

Why it matters: Her warning is explicit: if everyone uses AI to choose direction, products converge.

How to apply: Protect time for customer calls and social listening, build a stronger point of view on where the product should go, and learn enough GTM and prototyping to turn that point of view into something concrete.

3) AI-native experimentation is becoming a hiring signal

Verna recommends bringing AI-native employees into teams, often including new grads, because they already treat AI as a normal part of work. She also argues that teams need bottom-up tool adoption and repeated experimentation because model performance changes quickly.

Why it matters: The advantage is not one favorite tool; it is an operating habit of trying, judging, and retrying workflows as the tools improve.

How to apply: Show concrete AI workflows in your portfolio or resume, push for lightweight experimentation on your team, and revisit tasks that did not work a month ago.

4) Senior leadership means more delegation and more accountability

"When the team succeeds, it's their fault. When we fail, it's my fault."

Tony Fadell's management advice is to let go of doing the work yourself, trust the team, and give people room to be creative. The Reddit thread adds the complementary leadership behavior: be the circuit breaker first when something goes wrong.

Why it matters: Advancement is not just better judgment. It is creating space for the team to do great work while absorbing external pressure yourself.

How to apply: Delegate real ownership, resist the urge to re-do the work, and take the first uncomfortable conversation when outcomes disappoint.

Tools & Resources

1) Perplexity Computer for deliverables, not just answers

Aakash Gupta argues that Perplexity's Computer produces finished outputs: research reports with citations, deployed dashboards, cleaned datasets with charts, and launch kits with positioning docs and email drafts. He highlights cloud execution, parallel agents, and persistent memory as the main differences. His example: a 28-page Notion messaging audit across five criteria, benchmarked against Coda and Slite, with per-page recommendations in about 20 minutes.

Why it matters: This is positioned as a tool for bounded PM work where the output itself matters more than chat.

How to apply: Start with a constrained audit or launch-prep task, and use the full guide for six PM use cases, exact prompts, and the prompt spec that Gupta says cuts cost by 60%+.

2) Reusable Claude projects beat blank prompts

The same set of examples shows three high-value project templates for PMs: a strategy critic with devil's-advocate instructions, a customer-insight workflow that turns CSVs into reports, and a natural-language data analyst that answers plain-English questions with SQL, charts, and tables. The shared prompting rule is to give the model exemplars and a clear definition of good work.

Why it matters: PM leverage increases when the model has project-specific context instead of starting from zero every time.

How to apply: Save one reusable project per recurring workflow and feed each one examples from your own team rather than generic prompting advice.

3) Gamma can turn analysis into an executive-ready readout

In the NPS example, Gamma generated an executive summary deck, selected visuals, and structured the presentation automatically from the analysis.

Why it matters: It shortens the path from raw insight to stakeholder-ready communication.

How to apply: Pair it with a verification step on the underlying analysis so presentation speed does not outrun analysis quality.

4) Keep a north star metric review checklist handy

The four-question pressure test from Run the Business is simple enough to reuse as a standing template in roadmap, OKR, or quarterly business reviews.

Why it matters: It forces teams to connect metrics to customer value and cross-team alignment, not just target movement.

How to apply: Add the four questions as a required review section before approving a new north star.

Anthropic Leak, Compute Bottlenecks, and the Agent Playbook Take Center Stage
Mar 28
8 min read
608 docs
Tibo
Software Mansion
Zixuan Li
+35
The brief covers leaked Anthropic model details and the security fallout, tightening memory and power bottlenecks, the steady open-vs-closed model gap, and new research and product launches across agents, voice, vision, and chip design.

Top Stories

Why it matters: Four themes stood out: frontier-model security, physical infrastructure constraints, the economics of open vs. closed models, and a more formal operating model for AI agents.

Anthropic’s unreleased model leak became a security story

According to posts citing leaked materials, Anthropic has been testing a model called Mythos with select customers. Those posts described it as a new tier above Opus (later edited in one post to Capybara) with stronger results in coding, academic reasoning, and cybersecurity, plus a slow rollout because of compute intensity and security concerns. Fortune was separately cited for reporting that Anthropic left details of an unreleased model in an unsecured data trove.

Impact: Frontier-model competition is now tied not just to capability, but to selective access, cyber risk, and operational security.

Compute constraints are showing up in memory, power, and construction schedules

Epoch AI said the total memory bandwidth of AI chips shipped since 2022 has reached 70 million terabytes per second and is growing 4.1x per year, while AI inference is often bottlenecked by memory bandwidth rather than raw compute. It also said AI chips consumed more than 90% of total HBM production in 2025 and that HBM prices spiked in early 2026 as demand outpaced supply. At the same time, Microsoft said it is partnering with Crusoe on a 900MW AI factory in Abilene, Texas, OpenAI said steel beams went up this week at its Michigan Stargate site with Oracle and Related Digital, and NVIDIA said Vera Rubin + Groq 3 LPX can deliver up to 35x more performance per megawatt for trillion-parameter models and massive context workloads.

Impact: The competitive bottleneck is increasingly about watts, memory bandwidth, and buildout speed, not only model quality.

The open/closed gap is much smaller than it used to be, but the frontier is still closed

Arena said the gap between top open-source and proprietary text models has held at roughly 50-60 points for about 14 months, down from 100-150 points before mid-2024. It also said proprietary models currently occupy the first 20 places on the Text Arena leaderboard, while the leading open models are GLM-5 at #20, Kimi-K2.5-Thinking at #23, and Qwen3.5-397b-a17b at #27. In separate Arena analysis, GPT-5.4 High, Mini, and Nano behaved like scaled versions of the same model, suggesting price differences mainly reflect efficiency rather than different core capabilities.

Impact: Open models are closer than before, but the leading edge still sits with closed labs, and pricing is becoming more about efficiency per task than a simple proxy for intelligence.

The agent era is getting its own playbook

A new Google-linked report argues that intelligence explosions are social rather than individual, and that future progress may come from human-AI configurations and agent institutions rather than bigger monolithic models. In plain language, the argument is that groups of agents with roles, checks, and protocols may matter more than one ever-larger model.

"Every prior intelligence explosion in human history was social, not individual."

IBM’s new survey on workflow optimization for LLM agents organizes agent systems by when workflow structure is set, what components are optimized, and which signals guide the optimization. Artificial Analysis also launched AA-AgentPerf, a hardware benchmark for the agent era that uses real coding-agent workloads and reports maximum concurrent users per accelerator, per kW, per dollar, and per rack.

Impact: The discussion is moving from which single model is best to how agent systems should be structured, evaluated, and deployed.

Research & Innovation

Why it matters: Research attention is shifting toward unified multimodal systems, better long-context reasoning, more stable world models, and more realistic evaluations.

  • Apple AToken: Apple introduced AToken, a shared tokenizer and encoder for images, video, and 3D objects in one framework. The post said it beats or rivals specialized models and allows knowledge transfer across media types.
  • SAGE: This closed-loop multi-agent training method co-evolves a Challenger, Planner, Solver, and Critic from one LLM backbone using just 500 seed examples. On Qwen-2.5-7B, it reportedly improved out-of-distribution performance by 4.2% while maintaining in-distribution accuracy.
  • Together Research’s divide-and-conquer approach: A Planner rewrites tasks for parallel Workers and a Manager combines their outputs. Together said Llama-3-70B and Qwen-72B using this setup can match or beat GPT-4o single-shot on long-context retrieval, QA, and summarization as context length grows, though the method still struggles when important clues are spread across distant chunks.
  • LeWorldModel: Yann LeCun’s team released LeWorldModel, described as a world model that avoids collapse by adding a SIGReg regularizer to its prediction loss. The post also claimed 15M parameters, training on one GPU in hours, 48x faster planning, and about 200x fewer tokens for encoding.
  • CursorBench: A new benchmark for coding agents uses real Cursor team coding sessions, evaluates more than functional correctness, emphasizes long-horizon tasks with a median 181 lines changed per task, and keeps the data refreshed with recent sessions.

Products & Launches

Why it matters: Product releases this cycle focused on deployability: lower-latency voice agents, faster video processing, more local execution, and tools that slot directly into agent workflows.

  • OpenAI gpt-realtime-1.5: OpenAI showed a clinic concierge demo for a Singapore health clinic. It speaks naturally with patients, collects the needed details, and books appointments in real time.
  • Meta SAM 3.1: Meta released SAM 3.1 as a drop-in update to SAM 3. Its core change is object multiplexing, which lets the model track up to 16 objects in one forward pass and doubles throughput from 16 to 32 FPS on a single H100 for medium-object videos. Meta said the point is to make high-performance video applications feasible on smaller, more accessible hardware.
  • Cohere Transcribe in the browser: Cohere’s multilingual speech recognition model can run entirely locally in a browser on WebGPU. A post said it can transcribe 1 hour of audio in 100 seconds, is fully private, free, and requires no installation.
  • LiteParse: LlamaIndex’s LiteParse is a model-free, open-source document parser for AI agents. It processes about 500 pages in 2 seconds on commodity hardware, supports 50+ file formats, and is designed to plug into agent tools, while the authors note it is not meant to replace OCR-heavy workflows for scanned documents.
  • Hermes Agent + Hugging Face: Hermes Agent is positioned as an open-source agent that remembers what it learns through a multi-level memory system and persistent machine access. Hugging Face is now a first-class inference provider inside Hermes, with 28 curated models in the picker and custom access to 100+ more.
  • Gemini video creation: Google added a Create video workflow in Gemini’s app and web experience, where users select the tool, describe the video, optionally upload a reference image or choose a template, and generate directly from the interface.

Industry Moves

Why it matters: Business activity keeps pointing to three battlegrounds: capital markets, distribution, and AI-shaped hardware.

  • Anthropic IPO talk is getting more concrete: A post citing reporting said Anthropic is eyeing a Q4 2026 IPO with a raise above $60 billion, that its annualized revenue more than doubled to $19 billion in the first two months of 2026, and that bankers think it could reach public markets before OpenAI because of its enterprise and developer focus plus a shorter projected path to profitability.
  • Perplexity expanded Samsung distribution: Perplexity said it now powers Samsung’s Browsing Assist in Samsung Browser on Galaxy Android and Windows. In a separate post, Aravind Srinivas said the broader partnership now reaches a browser pre-installed on more than 1 billion Samsung devices, extends prior work with Bixby, and includes pre-loading on Galaxy S26 devices alongside Gemini.
  • Microsoft added more physical capacity: Mustafa Suleyman said Microsoft is partnering with Crusoe on a 900MW AI factory in Abilene, Texas to add capacity to its AI fleet and support Microsoft AI infrastructure.
  • RicursiveAI is betting RL can compress chip design cycles: Lightspeed said it led RicursiveAI’s $300 million Series A in January. The company says its reinforcement-learning-based semiconductor design platform can compress chip development from years to weeks.

Policy & Regulation

Why it matters: Formal AI policy is still uneven, but courts, safety packs, and billing controls are increasingly shaping how models are deployed.

  • Anthropic won a major preliminary court ruling: A federal judge in California indefinitely blocked the Pentagon’s effort to label Anthropic a supply chain risk, though the ruling is temporary and a parallel case is still underway in Washington, D.C.
  • OpenAI published a teen safety policy pack: OpenAI released a set of prompt-based safety policies intended to create age-appropriate protections for teens, and published the repository publicly.
  • Gemini API billing is getting harder to overspend: Starting April 1, Gemini API billing tiers get a monthly spending cap, with API access pausing until the next month or a tier upgrade if the cap is hit. Users can also set per-project spend caps in AI Studio.

Quick Takes

Why it matters: These are smaller updates, but they show where tooling, benchmarks, and open-source ecosystems are moving next.

  • OpenAI launched a Codex use-case gallery with starter prompts that can open directly in the app, and separately reset Codex usage limits across all plans so users can experiment with newly launched plugins.
  • GLM-5.1 is now available to all GLM Coding Plan users, and a separate post said GLM-5.1 will be open source.
  • Epoch AI removed one FrontierMath: Open Problems item after GPT-5.2 Pro solved it, because the problem did not meet the benchmark’s minimum notability bar; it also updated sourcing guidelines afterward.
  • Hugging Face’s HF Papers CLI adds semantic search and markdown retrieval for arXiv papers, aimed at supporting autoresearch workflows.
  • Strix packages multi-agent application pentesting with a built-in browser, proxy, terminal, and Python runtime, aiming to cut automated pentesting from weeks to hours.
  • React Native ExecuTorch v0.8.0 adds Vision Camera integration for real-time computer-vision inference on live camera feeds, including support for RF-DETR and Liquid AI’s vision-language models.
  • Qdrant is pushing sparse embeddings for e-commerce search, arguing they preserve exact matches and interpretability better than dense embeddings for product attributes such as SKU, size, and brand.
  • Huawei’s 950PR AI chip was priced at ¥70,000 with a 2H shipment target of 750,000 units, while one commenter argued it is not comparable to Nvidia’s H200 for training workloads.
Final RFS Volumes, Fertilizer Disruptions, and Acreage Bets Reset Ag Markets
Mar 28
9 min read
200 docs
Successful Farming
Foreign Ag Service
Secretary Brooke Rollins
+8
Final U.S. biofuel volumes, tightening fertilizer flows through the Middle East, and late-March acreage uncertainty are driving fresh moves in grains, livestock, and farm input markets. This brief also highlights fertilizer-saving field systems, manure-based nutrient strategies, and regional supply disruptions in Brazil and the U.S.

Market Movers

  • U.S. biofuels / oilseeds: USDA and allied agencies framed the final RFS volumes as creating $31 billion in 2026 value for U.S. corn and soybean oil and boosting net farm income by $3-4 billion, with more export opportunity for ethanol and co-products. Market reaction was more restrained: soybean oil had already rallied 16-17¢/lb since January, the ethanol mandate stayed at 15 billion gallons, and traders focused on the 70% SRE reallocation plus full RINs for foreign feedstocks and fuels, which some analysts said could cap soybean oil near 72-76¢/lb because Argentine oil is still 16-17¢ under Chicago and supplies remain ample.
  • U.S. soybeans: Midwest soybean basis improved about 25¢/bu in under two weeks, which market participants tied to strong domestic crush and farmer reluctance to sell after prior volatility. But demand expectations around China remain split heading into the May 14-15 Beijing visit; some market talk still hopes for bigger purchases, while another analyst cut expected old-crop business to 3 MMT from 8 MMT and said getting more than 3-4 MMT by late summer would be difficult.
  • Grain money / energy: Agricultural ETFs took in $149 million for the Invesco Ag Fund and $48 million for the Teucrium Corn ETF over five days, with more than $500 million entering the broader ag ETF category over the past month. At the same time, analysts argued grains remain cheap relative to the broader commodity complex, and noted that crude near $96/barrel has historically coincided with corn above $6 in many instances.
  • Wheat: Weather premium is building in the western Plains. Analysts cited ongoing dryness in western Kansas, Oklahoma, and Texas, with little rain in the next 10 days, while U.S. wheat has risen about 60¢/bu since March 1 versus about 8¢/bu in Russia. Separate market commentary said southern Plains wheat still needs rain soon, and next week could become more important if the coming event disappoints.
  • Livestock: Feeder cattle futures rose $10.78 on the week to $361.95/cwt, and April live cattle gained $4.70 to $238.75/cwt, even with fed steer cash nearly flat at $234.95/cwt. Drivers cited included strong retail beef and grilling demand, improving packer margins, a U.S. herd at 86.2 million head—the lowest in 75 years—and screw-worm concerns keeping roughly 120,000 Mexican feeders from crossing the border for now. In hogs, the Mar. 1 Hogs & Pigs report showed 74.3 million head total inventory with breeding inventory down 1.5%, which analysts characterized as slightly bullish.

Innovation Spotlight

  • U.S. strip-till nutrient placement: On-farm testing showed that banding fertilizer in a 10-12 inch zone below the seed, in the root zone, produced the same yields with 60% of the applied fertilizer, a 40% reduction versus broadcast. Operators said those savings can help pay for strip-till equipment under tight margins and high input costs, and they are pairing strip-till with Y-drops, in-furrow products, and 2x2 placement to keep nutrients near the row and root zone.

"We have made the same yields with 60% of applied fertilizer."

  • Mississippi Delta precision fertilizer: Variable-rate starter maps call for 8 gallons only where needed and shut off entirely in low-response areas. One farm said trusted data let it cut $330,000 from its fertilizer budget three years ago without sacrificing returns. The same operation also found that running tires at 20 psi on raised beds pinched rows and caused 10-17 bu/acre losses under tractor tracks, reinforcing a strategy of stacking many 2-5 bushel gains rather than chasing a single large breakthrough.
  • U.S. row-crop/livestock integration: A Kentucky contract-hog operation uses manure to reduce purchased fertilizer, extend nutrients across an extra 200 acres each year, visibly improve soil health, and raise its corn yield budget from 170 to 190 bu/acre. The business model also rests on a 10-year contract and steady monthly payments, which the operator said lowered financing risk and reduced exposure to market volatility.
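
The variable-rate behavior described in the Delta item reduces to a simple threshold-and-cap rule: zero starter in low-response zones, a scaled rate up to 8 gal/acre elsewhere. A hypothetical sketch of that logic; the zone names, response scores, and the 0.3 cutoff are illustrative assumptions, not the farm's actual prescription system:

```python
# Hypothetical sketch of the variable-rate starter logic described above:
# shut off entirely in low-response zones, cap at 8 gal/acre elsewhere.
# Zone names, scores, and the threshold are illustrative, not farm data.
MAX_RATE_GAL_PER_ACRE = 8.0
RESPONSE_THRESHOLD = 0.3   # below this, the zone gets no starter at all

def starter_rate(response_score: float) -> float:
    """Map a 0-1 expected-response score to a starter rate in gal/acre."""
    if response_score < RESPONSE_THRESHOLD:
        return 0.0                      # shut off in low-response areas
    return round(MAX_RATE_GAL_PER_ACRE * response_score, 1)

zones = {"zone_a": 0.95, "zone_b": 0.45, "zone_c": 0.10}
prescription = {z: starter_rate(s) for z, s in zones.items()}
print(prescription)  # {'zone_a': 7.6, 'zone_b': 3.6, 'zone_c': 0.0}
```

The savings the farm reported come from the shut-off branch: zones that would not respond receive nothing rather than a flat blanket rate.
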

Regional Developments

  • Brazil / Strait of Hormuz: Maritime tracking showed about 20 ships carrying roughly 782,000 tons of fertilizer waiting near the Strait of Hormuz; the estimate is considered conservative because some vessels disable AIS and the actual volume could be higher. Brazil has also arranged an alternative export route through Turkey for chicken, beef, sugar, and corn, but logistics costs were estimated around 350% higher and insurance about 10x normal levels.
  • Rio Grande do Sul, Brazil: Diesel shortages have spread to at least 170 municipalities, with 9 in 10 stations reporting supply problems. In Tupanciretã alone, roughly 150,000 ha of summer crops, including more than 141,000 ha of soybeans, are exposed to rationing or lack of fuel. Diesel has reached about R$8/liter, roughly R$2 above pre-war levels, putting soybean harvest and winter planting at risk.
  • South Brazil: The first 2026 heat wave is pushing temperatures into the 35-38°C range, with up to 40°C near the Paraguay border. Producers in Paraná, Santa Catarina, and southern Mato Grosso do Sul are being told to delay second-crop corn planting because soil temperatures are climbing, and forecast rain in key producer areas is not expected to repair moisture deficits.
  • United States: Roughly 75% of the lower 48 remains in drought, with central-U.S. soil moisture deficits still large enough to worry spring fieldwork, although heavier rain next week could help recharge some profiles. Planting is moving rapidly in the southern Delta because of heat, while much of the Midwest is still waiting on last frost and more moisture; dryness remains the bigger concern west of the Corn Belt and into the Plains.

Best Practices

  • Corn rootworm control (U.S. Corn Belt): Bt alone is not sufficient where rootworm pressure is high, because roots can be damaged before larvae die and secondary pests such as seed corn maggots, wireworms, white grubs, and seed corn beetles are not controlled. The recommendation is to use an insecticide at planting, and where Bt resistance is suspected, pair insecticide with SmartStax Pro or VT4 Pro to add RNAi protection.
  • Grain storage and marketing (U.S. Midwest): One Illinois producer said on-farm soybean storage captured a move from $9.75/bu at harvest to $11.50/bu in March. Bin monitoring systems were also credited with preventing spoilage and fire risk after a year in which at least four grain bins were reportedly lost locally, and with rehydrating beans from 8-9% moisture back toward 13%, which can add 5-10% saleable weight.
  • Manure as a soil program (U.S. row-crop/livestock farms): Rotating hog manure applications can extend fertility over more acres and materially change soil condition; one Kentucky farm described former white dirt turning darker with more earthworms after repeated applications. The same operation linked manure use to higher corn yield targets and lower dependence on commercial fertilizer.
  • Low-stress cattle handling (U.S. range cattle): On one Idaho ranch, redesigning facilities for counterclockwise cattle flow and working off the animals’ left-eye response cut preg-check time to 45-50 seconds per head and was associated with better grazing continuity; the operator cited a potential one-point body condition score difference, or about 85 lb, when cattle did not interrupt grazing under stress.
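
The rehydration claim above can be sanity-checked with standard dry-matter arithmetic: the dry matter in stored beans is fixed, so raising moisture from m0 to m1 multiplies saleable weight by (1 - m0)/(1 - m1). A quick check of the brief's 8-9% to 13% range (the formula is textbook grain math; the moisture figures are the brief's own):

```python
# Dry-matter check on the rehydration claim: dry matter is conserved, so
# weight at moisture m is dry_matter / (1 - m), and the weight gain from
# rehydrating m_from -> m_to is (1 - m_from) / (1 - m_to) - 1.
def weight_gain_pct(m_from: float, m_to: float) -> float:
    """Percent weight gain when moisture rises from m_from to m_to."""
    return ((1 - m_from) / (1 - m_to) - 1) * 100

low = weight_gain_pct(0.09, 0.13)   # beans starting at 9% moisture
high = weight_gain_pct(0.08, 0.13)  # beans starting at 8% moisture
print(f"{low:.1f}% to {high:.1f}% more saleable weight")
```

This puts the pure moisture effect at roughly 4.6-5.7%, consistent with the low end of the quoted 5-10%; the upper end of that quoted range would have to come from something beyond moisture alone.
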

Input Markets

  • U.S. nitrogen: The U.S. still needs roughly 5.1 million tons of urea imports for the 2025-26 fertilizer year and had brought in about 3.8 million tons through March, leaving roughly 1.3 million tons to source in April-May. About half of typical imports come from the Middle East and 25% from Russia. NOLA urea moved from about $473/ton before the Strait crisis to $695/ton afterward, even though analysts still described domestic values as $60-70/ton below world-equivalent economics.
  • Global fertilizer availability: Europe is running at roughly 75% of normal nitrogen production, a 3.5 million ton annualized shortfall; China has halted urea exports until at least August 2026, removing another 5-5.5 million tons; and disruptions affecting Qatar, Iran, and Saudi Arabia put roughly 13.5 million tons of global nitrogen supply at risk. On phosphate, China’s usual 8-10 million ton export program remains sidelined, Saudi supply is blocked, and U.S. phosphate operating rates have hovered around 75% or below since 2021 while sulfur and anhydrous costs rise.
  • Farm response and acreage risk: High fertilizer prices are already pushing growers to rethink rotations, with examples of intended shifts from 50/50 corn-soy to 70/30 beans, reduced nitrogen and phosphate rates, and substitution toward cheaper anhydrous or UAN where possible. Brownfield also flagged the same dynamic as a feed-cost risk: fewer corn acres would raise feed prices for livestock into 2026. A proposed U.S. Fertilizer Transparency Act would require fertilizer price reporting to improve visibility for buyers and sellers.
  • Agricultural chemicals: Risk in agricultural chemicals is tightening even without new price data. Minnesota confirmed glufosinate-resistant waterhemp, narrowing control options across the Midwest. On the product side, BASF’s Surtain was promoted as a PPO residual herbicide for corn that can be used from pre-emergence through early post-emergence.

Forward Outlook

  • USDA reports are the next major decision point. Ahead of the March 29/31 data, trade estimates cluster around 94.4 million corn acres, with a 92-96 million range and one private estimate at 96.4 million; soybean acres are centered around 85.5-86.1 million. Quarterly corn stocks are expected to be roughly 1 billion bushels above last year, although one analyst argued feed and residual use may be overstated by about 250 million bushels, which would place ending stocks closer to 2.4 billion.
  • Acreage surprises are still possible. Analysts repeatedly said rising urea costs could shift more land from corn to soybeans, with one source floating a potential 6-7 million acre swing on the last, unpriced fertilizer volumes. At the same time, skepticism around USDA survey accuracy is elevated because of low mail response rates, and several sources expect meaningful revisions again by June depending on weather.
  • Soybean demand remains headline-sensitive. One analyst cut expected old-crop U.S. soybean business with China from 8 MMT to 3 MMT and said getting more than 3-4 MMT out the door by late summer looks difficult. Another source said U.S. soybean sales are already occurring and broader talks may extend to cotton, rice, and sugar. If the May meeting produces no new signal on Chinese demand, one market participant warned soybeans could "fall right back."
  • Positioning and inputs make the downside two-sided. Grain markets were described as carrying about 540,000 contracts of speculative length in Chicago, raising the risk of a cascading selloff if the war premium unwinds. Even if the Strait reopens, fertilizer backlogs may persist because damaged gas plants need repairs and Gulf ports are not designed to load a large backlog of ships at once.
