Hours of research in one daily brief, on your terms.
Tell us what you need to stay on top of. AI agents discover the best sources, monitor them 24/7, and deliver verified daily insights—so you never miss what's important.
Recent briefs
Your time, back.
An AI curator that monitors the web nonstop, lets you control every source and setting, and delivers one verified daily brief.
Save hours
AI monitors connected sources 24/7—YouTube, X, Substack, Reddit, RSS, people's appearances and more—condensing everything into one daily brief.
Full control over the agent
Add/remove sources. Set your agent's focus and style. Auto-embed clips from full episodes and videos. Control exactly how briefs are built.
Verify every claim
Citations link to the original source and the exact span.
Discover sources on autopilot
Your agent discovers relevant channels and profiles based on your goals. You get to decide what to keep.
Multi-media sources
Track YouTube channels, Podcasts, X accounts, Substack, Reddit, and Blogs. Plus, follow people across platforms to catch their appearances.
Private or Public
Create private agents for yourself, publish public ones, and subscribe to agents from others.
Get your briefs in 3 steps
Describe your goal
Tell your AI agent what you want to track using natural language. Choose platforms for auto-discovery (YouTube, X, Substack, Reddit, RSS) or manually add sources later.
Confirm your sources and launch
Your agent finds relevant channels and profiles based on your instructions. Review suggestions, keep what fits, remove what doesn't, add your own. Launch when ready—you can always adjust sources anytime.
Sam Altman
3Blue1Brown
Paul Graham
The Pragmatic Engineer
r/MachineLearning
Naval Ravikant
AI High Signal
Stratechery
Receive verified daily briefs
Get concise, daily updates with precise citations directly in your inbox. You control the focus, style, and length.
Andrej Karpathy
Sarah Guo
Addy Osmani
🔥 TOP SIGNAL
Andrej Karpathy says his day-to-day has already crossed from AI pair programming to operating a small fleet: he hasn't typed code since December, now delegates non-interfering features to parallel agents, and thinks in "macro actions" over repos instead of line edits. The bigger pattern is showing up from multiple angles: Addy Osmani argues the IDE is being de-centered into orchestration surfaces, and Theo says current editors break down because agentic work spans multiple projects, terminals, browsers, and worktrees at once.
🛠️ TOOLS & MODELS
- Cursor Composer 2: built on Kimi k2.5, which Cursor says was the strongest base on its perplexity-based evals. Cursor then did continued pretraining plus a 4x high-compute RL scale-up on top, using Fireworks for RL and inference; Aman Sanger says only about 1/4 of final-model compute came from the base, and full pretraining is planned later. Cursor also says it missed crediting Kimi in the initial blog and will fix that next time.
- The control plane is becoming the product: Osmani's current stack includes Conductor, Claude Code Web/Desktop, GitHub Copilot Agent, Jules, Vibe Kanban, and cmux; his framing is that the editor is still critical, but no longer the front door. He also flags Claude Code's new Swarm/agent-teams direction and notes that developer reaction to Cursor Glass was basically "this feels more like an agent orchestrator than an IDE."
- Task-level model notes, not universal benchmarks: Theo says Opus spent over an hour on a new feature and still got the implementation entirely wrong; Codex did the same feature correctly in 15 minutes. Karpathy, meanwhile, says Claude's coding agent has a better teammate-like personality while Codex feels dry, but his latest gripe is broader than model choice: agents still bloat abstractions, copy-paste, and ignore AGENTS.md style instructions.
💡 WORKFLOWS & TRICKS
Run repo work in macro-actions, not prompt-by-prompt
- Split work into non-interfering feature chunks.
- Hand separate chunks to parallel agents across checked-out repos/workspaces.
- Use other agents for planning and research in parallel.
- Review output proportionally to how much you care about that path.
Karpathy points to Peter Steinberger's setup with roughly 10 Codex agents as the visual form of this pattern; each high-effort task runs about 20 minutes, then you top them up and keep moving.
Treat unused quota as lost throughput: Karpathy says if one tool/provider hits quota, switch to another; his default when agents fail is not "the capability isn't there" but "bad instructions, memory, or tooling."
Set objective metrics and boundaries, then get out of the way: Karpathy's AutoResearch loop improved a nanoGPT repo overnight by finding weight-decay/value-embedding and Adam-beta interactions he had missed; his Program.md is just a markdown attempt to describe how the autoresearcher should search.
Design for async review: the stable loop across Osmani, Theo, and Copilot-style tooling is isolated workspaces/worktrees, task-state UIs, background execution, and attention routing so humans only re-enter when an agent actually needs them.
"specify intent → delegate → observe → review diffs → merge"
Use model progress to change product process: @_catwu's team now plans in short sprints, builds demos/evals instead of docs, revisits "too hard" features after each model release, and removes scaffolding once new models make it unnecessary. Also: keep agentic systems as simple as possible because failures compound with complexity.
Make self-checks cheap: Dreamer's coding loop does plan → build → test → fix, and David Singleton says TypeScript works especially well because compile-time errors give the agent loop immediate feedback on mistakes. Theo's Kernel demo shows the same philosophy on browser auth: one cloud-browser sign-in flow, including 2FA, can then be reused across agent instances for private GitHub access.
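The plan → build → test → fix loop lends itself to a tiny sketch. Everything below is invented for illustration: the stub `run_tests` and the string-rewrite `propose_fix` stand in for a real test harness and a model call. This is the shape of the loop, not Dreamer's implementation.

```python
def run_tests(code: str) -> list[str]:
    """Execute candidate code and return failure messages (empty list = pass)."""
    env: dict = {}
    try:
        exec(code, env)
        assert env["add"](2, 3) == 5, "add(2, 3) should be 5"
    except AssertionError as e:
        return [str(e)]
    return []

def propose_fix(code: str, failures: list[str]) -> str:
    # Stand-in for an agent/model call that rewrites code given test feedback.
    return code.replace("a - b", "a + b")

def agent_loop(code: str, max_iters: int = 5) -> str:
    """plan/build happen upstream; this sketch covers test -> fix -> retest."""
    for _ in range(max_iters):
        failures = run_tests(code)       # test
        if not failures:
            return code                  # converged
        code = propose_fix(code, failures)  # fix, driven by failure feedback
    raise RuntimeError("could not converge")

buggy = "def add(a, b):\n    return a - b\n"
fixed = agent_loop(buggy)
print("a + b" in fixed)  # True
```

The point of the sketch is the feedback path: failures flow directly into the next fix attempt, which is also why fast, precise error signals (such as TypeScript compile errors) make the loop converge quickly.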
👤 PEOPLE TO WATCH
- Andrej Karpathy — still the highest-signal operator feed in public: near-total delegation on real repos, strong views on memory/personality, and zero sugarcoating when code quality is bad.
- Addy Osmani — best current synthesis of the orchestration shift, because it's grounded in the actual tools he uses daily instead of a generic future-of-IDEs take.
- Theo — worth tracking for honest task-level comparisons and for pushing the "bigger IDE" framing from complaint into product experiments like T3 Code and Kernel demos.
- @_catwu — useful if your bottleneck is deciding what to ship in a world where model capability changes every release cycle.
🎬 WATCH & LISTEN
- 4:03-4:54 — Karpathy on 10-agent macro-actions. Best quick mental model for parallel feature delegation across multiple repos.
- 16:35-19:18 — Karpathy on AutoResearch. Watch this if you want the cleanest explanation of objective-metric loops and why the human becomes the bottleneck.
- 2:20-2:56 — Addy on orchestration as the skill to learn. Fast distillation of the move from one-agent chats to fleets, coordination, and context handoff.
- 12:38-15:14 — Theo's bigger-IDE thesis. Good segment if your workflow collapses the moment you run multiple agents across multiple projects.
📊 PROJECTS & REPOS
- OpenClaw / ClawHub — maintainer @magicseth says ClawHub now supports 1M weekly active users on Convex; Peter Steinberger says the next push is making plugins great. Notable adoption signal for an agent platform.
- lat.md — early agent integration for keeping spec files synced with implementation. Armin Ronacher finds it interesting, but explicitly wants proof on larger codebases before getting excited.
- Arena + EVO Skill — Sentient's new open competition for agent harnesses is using Office QA as its benchmark and aims to generate open feedback/data about where open harnesses still lag Claude Code. EVO Skill generates multiple candidate skills from eval feedback and keeps the best.
- Dreamer — not open source, but a project worth watching because its build loop is unusually explicit: Sidekick plans tools/data, builds, tests, exposes code/prompt internals, and exports via SDK/CLI. The platform also pays tool builders by usage and has a $10k prize for the best tool added by mid-April.
Editorial take: today's real edge wasn't a new chatbot tab — it was running more work in parallel, with cleaner isolation, explicit success metrics, and more skepticism about agent-written code.
Artificial Analysis
dax
Pierce Boggan
Top Stories
Why it matters: The leading stories were about how frontier capability is being assembled: open-model adaptation, smaller open reasoning systems, and increasingly autonomous research workflows.
1) Composer 2’s base model moved from rumor to public confirmation
Cursor launched Composer 2 while saying its in-house models generate more code than almost any other LLM in the world, and a developer quickly surfaced the model ID kimi-k2p5-rl-0317-s515-fast from the API response; Moonshot’s head of pretraining said the tokenizer matched Kimi’s.
Moonshot later said Kimi-k2.5 provides the foundation for Composer 2, with Cursor adding continued pretraining and high-compute RL, and said Cursor accesses Kimi through Fireworks’ hosted RL and inference platform under an authorized commercial partnership.
Cursor said Composer 2 started from an open-source base, that only about one quarter of the compute spent on the final model came from that base, and that it is following the license through its inference partner terms. Cursor also said not mentioning the Kimi base in the launch blog was a miss.
The debate has now shifted to disclosure and measurement: critics said public benchmark reporting still makes improvement over the base model hard to assess, while others argued the episode validates a broader shift toward adaptation, fine-tuning, and productization over training from scratch.
Impact: Open-model licensing and attribution are becoming product issues, not just legal footnotes, and the strongest coding products are increasingly being built by post-training on top of open bases.
2) Mistral Small 4 strengthened Mistral’s open model lineup
Mistral Small 4 is a 119B MoE with 6.5B active parameters, hybrid reasoning and non-reasoning modes, and native image input, scoring 27 on the Artificial Analysis Intelligence Index in reasoning mode. That is 12 points above Small 3.2 and above Mistral Large 3’s 23.
The model used about 52M output tokens on the index, scored 57% on MMMU-Pro, reached 871 Elo on GDPval-AA, and posted a -30 AA-Omniscience score, ahead of peers on hallucination even while trailing the top open-weight models of similar size on raw intelligence.
Mistral lists a 256K context window, Apache 2.0 licensing, pricing of $0.15 and $0.60 per 1M input and output tokens, and API availability through Mistral’s first-party API.
Impact: Small 4 improved Mistral’s position on efficiency, multimodality, and agentic evaluation, but the comparison set shows how competitive the open-weight 120B class has become.
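As a back-of-the-envelope check on the listed prices ($0.15 per 1M input tokens, $0.60 per 1M output tokens), a small Python helper. Only the per-token prices come from the brief; the 10M-input-token workload is a made-up example:

```python
# Listed Mistral Small 4 prices, USD per 1M tokens.
INPUT_PER_M = 0.15
OUTPUT_PER_M = 0.60

def cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate API cost for a workload at the listed prices."""
    return input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M

# e.g. the ~52M output tokens the model used on the index, paired with a
# hypothetical 10M input tokens:
print(round(cost_usd(10_000_000, 52_000_000), 2))  # 32.7
```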
3) NVIDIA compressed frontier-style reasoning into a much smaller open model
Nemotron-Cascade 2 is an open 30B MoE with 3B active parameters that NVIDIA says delivers best-in-class reasoning and strong agentic capabilities.
NVIDIA says it reached gold-medal-level performance on IMO 2025, IOI 2025, and ICPC World Finals 2025, matched capabilities previously associated with frontier proprietary or frontier-scale open models, and did so with 20x fewer parameters.
The model also reportedly outperforms recent Qwen 3.5 releases across math, code reasoning, alignment, and instruction following, and is built with Cascade RL plus multi-domain on-policy distillation.
It is already available on Hugging Face and can now be run locally through Ollama.
Impact: Open reasoning models are getting smaller without giving up top-tier tasks, which matters both for local deployment and for the pace of open-model iteration.
4) OpenAI put dates on its automated research roadmap
Notes from an interview with chief scientist Jakub Pachocki say OpenAI is targeting an automated AI research intern for September 2026 and a multi-agent automated AI researcher for 2028.
The 2028 system is described as a multi-agent setup that could tackle problems too large or complex for humans and, in theory, be applied to problems expressible in text, code, or whiteboard sketches across math, physics, biology, chemistry, business, and policy.
Pachocki also said OpenAI is getting close to models that can work indefinitely in a coherent way, like a whole research lab in a data center. At the same time, he does not expect systems to match humans in all ways by 2028, and another summary of the interview said current reasoning models and agent systems like Codex already show large productivity gains while still facing reliability and safety limits.
Impact: OpenAI is treating multi-agent research automation as a staged product roadmap, not just a long-range vision, while explicitly tying that roadmap to reliability and safety constraints.
5) DeepMind’s Aletheia added another fully autonomous math result
Aletheia, powered by an advanced version of Gemini Deep Think, has now contributed to eight math research papers, and its most recent result on the Hodge bundle was described as fully autonomous Level 2 publishable research.
In that case, mathematician Anand Patel had the intuition but could not assemble the proof; Aletheia produced the construction needed to complete it, and Google DeepMind released both the paper and the interaction transcript.
Earlier Aletheia work included solving 6 of 10 FirstProof challenge problems autonomously and helping resolve bottlenecks in 18 research problems across algorithms, machine learning, combinatorial optimization, information theory, and economics.
Impact: Claims about autonomous research are getting harder to dismiss as benchmark theater when they come with publishable outputs and public transcripts.
Research & Innovation
Why it matters: Several of the most useful technical advances were about training data strategy, specialized RL, and evaluation—areas that often matter more in practice than a single flagship model release.
Datology’s Finetuner’s Fallacy argues that standard pretrain-then-finetune domain adaptation leaves performance on the table. Mixing just 1-5% domain data into pretraining before finetuning produced better models across chemistry, symbolic music, and formal math proofs, including 1.75x fewer tokens to reach the same domain loss, a 1B model beating a 3B finetune-only model, +6 MATH points at 200B pretraining tokens, and less forgetting of general knowledge.
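The mixing idea reduces to a data-loading detail: interleave a small fraction of domain documents into the general pretraining stream. A toy sketch, assuming a fixed 1-in-33 interleave (roughly 3% domain) and invented document streams; Datology's actual pipeline is not described at this level:

```python
def mixed_stream(general, domain, domain_every=33):
    """Yield documents, taking one domain doc per `domain_every` yields
    (~3% domain mix for the default), until the general stream runs out."""
    general, domain = iter(general), iter(domain)
    i = 0
    while True:
        pool = domain if i % domain_every == 0 else general
        i += 1
        try:
            yield next(pool)
        except StopIteration:
            return  # stop when a stream is exhausted

# Invented streams: 1000 general docs, plenty of domain docs.
docs = list(mixed_stream((f"gen{i}" for i in range(1000)),
                         (f"chem{i}" for i in range(1000))))
frac = sum(d.startswith("chem") for d in docs) / len(docs)
print(len(docs), round(frac, 3))  # 1032 0.031
```

A real pipeline would sample stochastically and weight by tokens rather than documents; the deterministic schedule here just makes the mix ratio easy to see.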
Separate work on synthetic data argued that generated data can reduce loss on the real distribution as more tokens are produced. Treating generations as one long megadoc gave a further 1.8x data-efficiency gain, on top of a previously reported 5x gain from tuning, scaling, and ensembles.
Mantic said it RL-tuned gpt-oss-120b on judgmental forecasting and got a model that outperformed frontier models on event prediction. It also said the tuned model plus Grok were decorrelated from the other best models, making them especially useful in team settings.
Meituan released LongCat-Flash-Prover, an open-source theorem-proving model with a hybrid-experts trajectory-generation framework, the HisPO algorithm for long-horizon tool-integrated reasoning, and a verification stack using Lean4, AST checks, and legality detection. Reported results were 97.1% on MiniF2F-Test and 41.5% on PutnamBench.
CodeScout introduced an RL recipe for teaching code agents to search large codebases using only a terminal. The authors said it outperforms open-source models 18x larger, is comparable to proprietary models, and sets state of the art on SWE-Bench Verified, Pro, and Lite.
Products & Launches
Why it matters: Product teams kept turning model capability into concrete workflow features—especially around agents, multimodality, and developer control surfaces.
Google’s Gemini API now exposes Veo 3.1 video generation and Gemini image models through its OpenAI compatibility layer, with no SDK swap required. Google says developers can call /v1/videos for video and images.generate for images, stay compatible with the OpenAI Python and JS SDKs, and switch by changing three lines of code.
Cognition added scheduled Devins. A user can run a task once—such as feature-flag cleanup, release notes, or QA—and then make it recurring, so a one-off session becomes an automated workflow.
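A hedged sketch of what the three-line switch might look like with the OpenAI Python SDK. The base URL below is the one Google documents for its OpenAI compatibility layer; the key placeholder, model name, and client calls are assumptions from the announcement rather than verified details, so they are left as inert comments:

```python
# Google's documented base URL for the Gemini OpenAI-compatibility layer.
GEMINI_BASE_URL = "https://generativelanguage.googleapis.com/v1beta/openai/"

# The "three lines" in practice (commented out: requires the `openai`
# package and a real Gemini API key; model name is a placeholder):
#
# from openai import OpenAI               # import unchanged
# client = OpenAI(
#     api_key="YOUR_GEMINI_API_KEY",      # line 1: a Gemini key, not OpenAI's
#     base_url=GEMINI_BASE_URL,           # line 2: new base URL
# )
# img = client.images.generate(model="<gemini-image-model>",  # line 3: model
#                              prompt="a lighthouse at dusk")

print(GEMINI_BASE_URL.endswith("/openai/"))  # True
```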
Anthropic added Projects to Cowork, letting users keep tasks, files, and instructions together in one work area while keeping those files and instructions on the user’s computer.
Code Insiders now lets users control reasoning effort directly from the model picker, moving a previously settings-based control into the main interface.
OpenAI launched Codex for Students, offering U.S. and Canadian college students $100 in Codex credits to learn by building, breaking, and fixing things.
fal.ai’s new MCP server lets any AI coding assistant connect to 1,000+ generative AI models, part of a broader documentation overhaul with clearer structure and navigation.
Industry Moves
Why it matters: The industry signal was not just model launches. Labs are reorganizing around large-model execution, locking down power, and putting more capital behind robotics and long-term AI strategy.
Tencent shut down Tencent AI Lab and folded parts of it into Hunyuan, despite the lab’s earlier work on Juewu game AI, Miying medical imaging, protein folding, and drug discovery. One summary framed the move as part of a broader China shift toward fewer moonshot labs and more product-driven, model-centric execution.
Energy strategy is becoming a core AI infrastructure issue. One report said Meta and OpenAI are building private gas-powered plants directly connected to data centers to bypass grid delays, while Google said it has integrated 1 GW of flexible demand into long-term utility contracts so data centers can shift or reduce demand when utilities need it.
Unitree reported 2025 revenue of 1.708B RMB, up 335% year over year, and profit of 600M RMB, up 674%. The company said it delivered more than 5,500 humanoid robots, plans to raise 4.2B RMB from an IPO with 85% earmarked for R&D, and is targeting production of 75,000 humanoids and 115,000 quadrupeds.
Google DeepMind appointed Jas Sekhon as chief strategy officer. Demis Hassabis cited Sekhon’s experience as Bridgewater’s former chief scientist and head of AI, and a colleague described him as exceptionally thoughtful.
Policy & Regulation
Why it matters: Compliance questions are increasingly about attribution, access, and authorship as AI systems become easier to embed in products and workflows.
Kimi K2.5’s license became a live compliance issue after Composer 2 launched without naming its base model. One analysis said the modified MIT license requires products above $20M in monthly revenue to display Kimi K2.5 prominently in the UI, while Cursor later said it was following the license through Fireworks and promised better attribution in future launches.
The U.S. Copyright Office ruling in Zarya of the Dawn was cited as reaffirming that AI-generated images are not human-authored and therefore are not protected in the same way as the human-written story.
Anthropic’s control over third-party access to Claude also drew attention. opencode 1.3.0 said it stopped autoloading its Claude Max plugin after Anthropic sent lawyers, while T3 Code said users can still connect Claude if they have Claude Code CLI installed and signed in, and later said it had not heard from lawyers.
Quick Takes
Why it matters: These smaller updates show where the ecosystem is filling in: serving infrastructure, agent governance, benchmark culture, and next-wave open releases.
vLLM v0.18.0 shipped with 445 commits from 213 contributors, adding gRPC serving, GPU-less multimodal preprocessing, GPU NGram speculative decoding, ElasticEP Milestone 2, and hardware support spanning NVIDIA FA4 MLA prefill, AMD Quark W4A8, Intel XPU, and RISC-V.
GLM-5.1 is planned as an open-source release, with the ZAI organization’s Hugging Face page highlighted ahead of launch.
François Chollet said ARC-AGI-3 launches next week.
Grok 4.20 scored 6.0% on CritPt, about 2x DeepSeek V3.2 and nearly on par with Speciale, according to one benchmark update.
Okta introduced Okta for AI Agents, positioning agents as governed non-human identities with centralized access control and a kill switch for rogue agents.
Perplexity Computer now connects to Pitchbook, Statista, and CB Insights, and it also added inline document creation and editing so users can revise selected sections in place.
Sachin Rekhi
Big Ideas
1) AI is pushing product teams from "symphony" to "jazz"
Deb Liu argues that many product orgs were built like symphonies: defined roles, structured handoffs, detailed specifications, and careful preparation across product, engineering, design, analytics, research, and other functions. AI changes that operating model by making it possible to spin up a prototype quickly, blur role boundaries, and work inside guardrails with more real-time improvisation. Her leadership implication is equally important: leaders move from coordinating a fixed score to setting themes and guardrails for small, high-trust teams.
- Why it matters: The premium shifts from rigid orchestration toward faster adaptation and learning.
- How to apply: Ask whether your team is optimized for orchestration or adaptation, whether roles are too rigid, and whether PMs are waiting for engineering to build ideas that they could prototype first.
2) In B2B, product work naturally drifts away from business reality
"We have to force the work back into connection with reality."
Run the Business frames the gap between what product teams do and what the business needs as the most expensive dysfunction in B2B product companies, and argues that drift toward disconnection is the default. The fix is not just better execution; it is better connection. That means coherent strategy across legibility, synchronicity, composability, and affordability, plus a metric ladder that links growth, retention, and margins to strategic levers and then to team north stars.
- Why it matters: Without this ladder, teams slide into feature-factory behavior, reactive planning, or "south star" metrics that look good on a dashboard but pull the product in the wrong direction.
- How to apply: Check whether each team north star clearly maps to a strategic lever and then to business KPIs. Look for the known anti-patterns: direct revenue accountability for product teams, one-hop feature-to-revenue logic, and "orphan" teams that do not tie to any business outcome. The positive endpoint is creating users so successful they become advocates: "manufacturing champions."
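The ladder check can be made mechanical once the mapping is written down. A toy sketch with invented metric and lever names; the point is only that an orphan north star becomes trivially detectable:

```python
# Map: team north star -> (strategic lever, business KPI).
# All names here are invented for illustration.
LADDER = {
    "activation_rate":  ("onboarding_quality", "retention"),
    "weekly_champions": ("customer_success",   "expansion_revenue"),
    "dashboard_views":  (None,                 None),  # an orphan metric
}

def orphan_metrics(ladder):
    """Return north stars that do not tie to both a lever and a KPI."""
    return [m for m, (lever, kpi) in ladder.items() if not (lever and kpi)]

print(orphan_metrics(LADDER))  # ['dashboard_views']
```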
3) The strongest AI PM workflows are built on local context, not chatbot memory
Sachin Rekhi argues that Claude Code should be the primary AI productivity tool PMs focus on because it is built to generate PM artifacts such as documents and reports, works well with local markdown files, automates workflows through skills, agents, and commands, can run command-line tools, writes bespoke scripts, and keeps context portable instead of locked inside a proprietary chat history. He says he has already shifted the majority of his product work to it.
Dave Killeen shows what this looks like in practice: session-start hooks preload weekly priorities, quarterly goals, and working preferences; markdown files exist for each project, person, and company; meeting transcripts append automatically; mistakes get logged into a reusable file; and AI filters a large information diet down to novel, contrarian signals that matter.
- Why it matters: Context compounds. By day 30, the system can track relationships, commitments, and meeting history; by day 90, it can start surfacing patterns in the PM's own work.
- How to apply: Store reusable context locally, inject current priorities at session start, log repeated AI mistakes, and use AI to produce work products and workflows rather than only one-off answers.
4) AI raises the premium on direct user understanding
A recurring community theme is that AI gives PMs more data, but not necessarily more understanding. One poster argues that many AI tools act like dashboards: they explain what users did, not why they did it or what they actually need. Another makes the same point more directly: AI can summarize data, but it cannot replace the hard work of talking to users and validating what is real. In the strongest example, AI added value only after the research work was done, by helping identify a novel persona hidden in interview transcripts.
- Why it matters: Over-relying on AI as a substitute for research can create false confidence and widen the gap between PMs who do the work and PMs who look for a shortcut.
- How to apply: Keep user conversations human, then use AI on the back end for transcript summarization, clustering, and pattern-finding.
Tactical Playbook
1) Build a local-context PM operating system
A practical setup from the Claude Code examples:
- Create local markdown files for every active project, person, and company you need to track.
- Automatically append meeting transcripts into the right file so relationship history and commitments accumulate over time.
- Add session-start hooks that preload your weekly priorities, quarterly goals, and working preferences into every new session.
- Maintain a mistakes file and inject it at the start of future sessions so repeated AI errors become reusable guardrails.
- Point the system at newsletters, bookmarks, LinkedIn messages, and videos, and ask it to extract only the novel or contrarian signals that matter.
- Use the system to generate the artifacts PMs actually own: plans, reports, backlog outputs, and other deliverables.
Why this works: It turns context from something trapped in meetings and chat threads into a reusable operating layer. In Dave Killeen's example, that meant daily planning, backlog management, career tracking, and compressing 120 newsletters down to the 3% that mattered.
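The append-to-markdown step of this setup is simple enough to sketch. File names, headings, and the temp-directory root below are illustrative, not Killeen's actual layout:

```python
import datetime
import pathlib
import tempfile

def append_note(root: pathlib.Path, person: str, note: str,
                day: datetime.date) -> pathlib.Path:
    """Append a dated note to a per-person markdown file, creating it
    on first use, so relationship history accumulates over time."""
    path = root / f"{person}.md"
    with path.open("a", encoding="utf-8") as f:
        f.write(f"\n## {day.isoformat()}\n{note}\n")
    return path

# Illustrative usage in a throwaway directory:
root = pathlib.Path(tempfile.mkdtemp())
p = append_note(root, "dana", "Agreed to ship the beta Friday.",
                datetime.date(2026, 3, 2))
print("2026-03-02" in p.read_text())  # True
```

Because everything lands in plain markdown on disk, the same files can be preloaded by a session-start hook or grepped by any tool later.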
2) Use AI in discovery without outsourcing discovery
"If you don’t talk to your users yourself, you’ll have no idea what’s real."
A grounded AI-assisted discovery loop from the community discussion:
- Conduct the user conversations yourself.
- Use AI to summarize the transcript set and cluster themes.
- Ask AI to look for gaps, anomalies, or a missed persona in the material.
- Go back to the raw interviews to confirm what is real, why it happened, and what users actually need.
- Treat the model as an analysis aid, not a replacement for human interaction or judgment.
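The cluster-themes step can be approximated even without a model. A toy keyword counter over invented interview snippets; real synthesis would use an LLM or embeddings, but the shape of the output (recurring themes ranked by frequency) is the same:

```python
from collections import Counter

# Invented snippets and keyword list, purely for illustration.
SNIPPETS = [
    "exporting reports takes too long",
    "I export reports weekly and it times out",
    "onboarding was confusing for my team",
]
KEYWORDS = ["export", "report", "onboarding", "pricing"]

def theme_counts(snippets, keywords):
    """Count how many snippets mention each keyword theme."""
    c = Counter()
    for s in snippets:
        for k in keywords:
            if k in s.lower():
                c[k] += 1
    return c

print(theme_counts(SNIPPETS, KEYWORDS).most_common(2))
# [('export', 2), ('report', 2)]
```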
Why this works: It preserves respect for users and keeps the PM close to reality while still capturing some of AI's speed in synthesis.
3) Turn a difficult stakeholder into a predictable operating rhythm
Community advice on dealing with a hard-to-reach internal client converged on making access systematic instead of ad hoc:
- Set a recurring meeting cadence that matches how often you actually need decisions or clarification.
- Bring an agenda focused on feedback and requirements before the urgent moment arrives.
- Frame the meeting as a way to reduce last-minute requests and unplanned interruptions for the stakeholder, not just for you.
- Be explicit that you value their insight and clearly state what you need from the relationship.
- Keep a paper trail of outreach attempts so blockers are documented.
- If the stakeholder is critical, create face time through office visits or regular lunches when feasible.
- Escalate if non-response starts obstructing the work.
Why this works: It replaces reactive chasing with a clearer access model and gives you escalation cover if the relationship still fails.
Case Studies & Lessons
1) A non-coding PM turned a parked MCP idea into a production prototype
In sprint planning, a team dropped an MCP server idea for on-call troubleshooting because the estimate was 4-6 weeks of developer time. Later, a non-coding PM used GitHub Copilot with Opus 4.6 to generate a PRD, refine the design, scaffold the project, and work through five phases including testing, security audits, and bug fixes over 2-3 evenings. The team was impressed enough to start using the result in production.
- Lesson: Clear architecture patterns and first-principles decomposition can let PMs prototype ideas far earlier than a full delivery plan would suggest.
- Apply it: Use AI to produce a phased plan and rough implementation when an idea is strategically important but hard to fund upfront.
2) Dave Killeen turned Claude Code into a CPO operating layer
Dave Killeen uses Claude Code for daily planning, backlog management, and career tracking. The system is built from session hooks, auto-updating markdown files, a mistakes log, and AI-driven information filtering. One concrete outcome: he stopped reading 120 newsletters and instead reviewed the 3% that the system flagged as relevant.
- Lesson: The leverage came from durable context and continuous correction, not from a single prompt.
- Apply it: Treat AI setup as infrastructure. The more your files, hooks, and error logs improve, the better the next session starts.
3) A north-star metric can still point south
Run the Business highlights a Microsoft Windows Update example where a team targeted 90%+ of machines on the latest OS version. When the number stayed stuck around 80%, the team stopped asking users and forced updates without consent. The metric improved, but the user experience deteriorated into the familiar mid-sentence reboot.
- Lesson: A green metric can still represent bad product judgment if it is disconnected from customer value and business health.
- Apply it: Pressure-test every north star by asking what behavior it incentivizes, not just whether it is measurable.
Career Corner
1) Prototyping is becoming part of PM craft
Deb Liu's question is blunt: if you are a PM, are you waiting for engineering to build your ideas, or are you prototyping them yourself? Her broader argument is that PMs can now prototype and run research directly as role boundaries blur. The MCP example shows what that can look like in practice, even for a non-coding PM.
- Why it matters: The PM who can turn an idea into a testable first version creates faster learning and usually better conversations with engineering.
- How to apply: Use AI to draft phased plans, PRDs, and rough prototypes before asking the team for a full build commitment.
2) Learn an AI toolchain that produces real work, not just answers
Rekhi's case for Claude Code is career-relevant because it is about artifact generation, local context, automation, and portability, not just chatting with a model. He says he has moved the majority of his product work into that environment.
- Why it matters: PM leverage increasingly comes from compressing execution loops around documents, analysis, and workflow automation.
- How to apply: Build fluency with a toolchain that can operate on your files, preserve context, and generate deliverables PMs are accountable for.
3) Keep direct user understanding as your moat
The community view here is consistent: AI has not replaced the hard work. It can summarize data and surface patterns, but it should not replace direct conversations with users.
- Why it matters: In an environment where more PMs have access to the same models, first-hand understanding becomes more differentiating, not less.
- How to apply: Protect time for direct interviews and transcript review, then use AI after the fact to expand your analysis.
4) Before chasing a title, check whether the role gives you leverage
Run the Business argues that many "unimpactful" PMs never had control over any of the three real assets in a product company: product vision, R&D resources, or the P&L.
- Why it matters: Sometimes low impact is structural, not personal.
- How to apply: When evaluating roles, look past title and ask what decisions, resources, or business levers you will actually control.
Tools & Resources
- Claude Code for PMs — Rekhi's companion video to his argument that Claude Code is a better PM productivity layer than generic chatbots because it supports artifact generation, local markdown context, and workflow automation.
- pmskilltoolkit.com — a community-shared toolkit with 25 Claude/ChatGPT skills across discovery, roadmapping, competitive battle cards, win/loss debriefs, pricing strategy, and stakeholder politics. The creator says it draws on 1,500+ research sources and has reached 10k downloads.
- From Symphony to Jazz — useful if you are rethinking team shape, role boundaries, and leadership guardrails in AI-heavy product development.
- The Most Expensive Dysfunction in B2B Product Companies — a strong reference for reconnecting teams to business reality through KPI laddering and better north stars.
- Watchlist: real-time meeting copilots for internal knowledge retrieval — multiple community posts describe the same gap: typing questions into ChatGPT or Claude Code during technical meetings is too slow, conversations move on, and transcript review afterward often creates more follow-up questions.
Konwoo Kim
Andrej Karpathy
Sarah Guo
The main thread
Today's clearest pattern was the shift from AI as a chat surface to AI as a persistent operator. Andrej Karpathy described software work as increasingly about delegating macro actions to agents, while Dreamer launched as a consumer platform built around a personal Sidekick that helps users discover, build, and run agents.
Karpathy says the bottleneck has flipped from typing to orchestration
Karpathy said he has effectively stopped typing code since December and now works by delegating larger tasks across multiple agent sessions and repositories, treating the new constraint as less about raw model capability than about instruction quality, memory tooling, and token throughput. He also said an autonomous AutoResearch loop, given a task with clear objective criteria, found hyperparameter interactions in NanoGPT overnight that he had missed after years of manual tuning.
"I don't think I've typed like a line of code probably since December, basically"
Why it matters: This is a stronger claim than "AI helps me code." Karpathy is describing a workflow where humans define goals, metrics, and constraints, while persistent agents keep running outside the interactive loop.
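The loop Karpathy describes, where a human supplies only an objective and the search runs unattended, can be sketched in miniature. Everything below is illustrative: the toy objective stands in for an expensive run like "train NanoGPT and score validation loss," and none of the names come from his actual AutoResearch setup.

```python
import random

def objective(lr: float, wd: float) -> float:
    # Stand-in for an expensive evaluation such as training a model and
    # returning validation loss. The lr * wd term gives the two knobs an
    # interaction that a one-knob-at-a-time manual sweep would miss.
    return (lr - 0.01) ** 2 + (wd - 0.1) ** 2 + lr * wd

def auto_search(trials: int = 200, seed: int = 0):
    # Unattended random search: viable only because `objective` provides
    # a clear numeric criterion for ranking candidates overnight.
    rng = random.Random(seed)
    best_cfg, best_loss = None, float("inf")
    for _ in range(trials):
        lr, wd = rng.uniform(0.0, 0.05), rng.uniform(0.0, 0.5)
        loss = objective(lr, wd)
        if loss < best_loss:
            best_cfg, best_loss = (lr, wd), loss
    return best_cfg, best_loss

print(auto_search())
```

Seeding the RNG makes the run reproducible; the larger point mirrors the caveat in the episode: without a measurable objective, there is nothing for the loop to optimize.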
Dreamer launches a consumer-facing agent platform
Dreamer emerged from stealth as a consumer-first platform to discover, build, and use AI agents and agentic apps, centered on a personal Sidekick; the company was founded by David Singleton and Hugo Barra. The platform combines a gallery of community agents with SDK and CLI tooling, hosted databases, prompt management, serverless functions, and a tool ecosystem where builders can get paid based on usage.
Why it matters: Dreamer is one of the clearest attempts to push agents beyond developer tooling and into a general consumer product, while treating permissions, interoperability, and monetization as core platform features.
Policy and deployment signals
Washington releases a national AI framework
The White House released a national AI legislative framework meant to create "One Rulebook" after what it described as a patchwork of 50 state regimes that could stifle innovation and weaken U.S. leadership in AI. In its announcement, the administration said the framework is intended to protect children from online harm, shield communities from higher electric bills, protect First Amendment rights from AI censorship, and ensure Americans benefit from AI, and said it wants Congress to turn the principles into legislation.
Why it matters: This is a notable federal bid to define AI governance nationally rather than leave the field to state-by-state rulemaking.
Waymo publishes a larger safety benchmark
Sundar Pichai said new Waymo data covering more than 170 million autonomous miles through December 2025 shows the Waymo Driver was involved in 13 times fewer serious-injury crashes than human drivers in the same cities.
Why it matters: The update puts a major autonomy claim on measured safety outcomes at scale, not just demos or pilot deployments.
Research signal
Synthetic data keeps getting more attractive
Percy Liang said earlier work had already delivered a 5x data-efficiency gain through careful tuning, scaling, and ensembles, and that a rephraser model now adds another 1.8x gain for data-constrained pre-training. He added that synthetic data lowers loss on the real data distribution as more tokens are generated, and that treating the resulting generations as one long "megadoc" improves scaling further, with larger gains under more compute.
Why it matters: The result points to a future where useful data, not just compute, becomes a tighter constraint in model training.
Bottom line
Today's news was less about a single new frontier model and more about the systems forming around AI: continuous agent workflows, consumer agent platforms, federal rule-setting, and larger-scale deployment metrics.
Andrej Karpathy
Sarah Guo
Most compelling recommendation
Two organic recommendations surfaced today, both tied to the same No Priors conversation with Andrej Karpathy. The clearest one is Karpathy’s own book pick, because he explains the specific idea he found useful. A second signal comes from Jack Dorsey, who separately endorsed the episode itself.
Daemon
- Title: Daemon
- Content type: Book
- Author/creator: Not specified in the cited material
- Link/URL: None provided in the source material
- Who recommended it: Andrej Karpathy
- Key takeaway: Karpathy says the book is inspiring because it imagines an intelligence that "puppeteers" humanity, with humans acting as both its actuators and its sensors
- Why it matters: This is a concrete mental model for how AI systems and human labor could interlock, which makes the recommendation more useful than a generic reading-list mention
"In Daemon, the intelligence ends up puppeteering almost a little bit like humanity in a certain sense. And so humans are kind of like its actuators, but humans are also its sensors."
No Priors episode with Andrej Karpathy
- Title: No Priors episode with Andrej Karpathy
- Content type: Podcast
- Author/creator: No Priors
- Link/URL: https://x.com/saranormous/status/2035080458304987603
- Who recommended it: Jack Dorsey (@jack)
- Key takeaway: Jack’s endorsement is concise—he calls the episode "excellent"—and the shared description says it covers the phase shift in engineering, AI psychosis, claws, AutoResearch, a SETI-at-Home-like movement in AI, the model landscape, and second-order effects
- Why it matters: The posted topic list and chapter outline indicate a broad AI discussion spanning capability limits, coding-agent mastery, job-market data, open versus closed models, robotics, and agentic education
Why the book stands above the episode
Both items are useful, but Daemon is the higher-signal save because Karpathy attaches a specific framework to it. Jack’s recommendation is still worth keeping as the broader conversation that surfaces that framework and situates it alongside a wider AI agenda.
Brownfield Ag News
Grain Markets and Other Stuff
农业致富经 Agriculture And Farming
Market Movers
United States: Grain markets stayed tightly tied to energy and fertilizer risk. One market read had May corn at $4.70/bu, soybeans at $11.69, Chicago wheat at $6.08, and Kansas City wheat at $6.27 as oil and fertilizer concerns lifted prices; later Friday trade weakened on profit-taking and updated rain forecasts, with Chicago wheat down 2.01% to $5.95/bu.
United States: Soybeans also showed how fragile positioning remains. The market logged its first limit-down day since January 2009 after headlines around a delayed Trump-Xi summit, heavy fund length near 400,000 contracts, South American forecast changes, and Brazil's large crop; commentators later said the break looked overdone and noted trade firmed again with crude oil.
China/Brazil: China's soybean buying pattern shifted sharply in early 2026. Jan.-Feb. imports from the U.S. fell to 1.49 million metric tons (-84% year over year), while imports from Brazil rose to 6.5 million metric tons (+83%). At the same time, sources said China signaled willingness to buy 25 million tons of U.S. soybeans annually over the next three years, alongside poultry, beef, and non-soy crops.
Brazil: Brazil's 2026 soybean outlook still points to very deep demand. Production is projected at 178-180 million tons, with 114 million tons of exports and 60-62 million tons of domestic crushing, leaving roughly 175 million tons already committed to export or internal use. That supports liquidity, but not necessarily better pricing; the working price band is R$115-130 per 60-kg bag, with current southern quotes around R$124-125.
Animal protein: In the U.S., April feeder cattle futures closed the week at $351.18/cwt (+$8.08), while weekly cattle slaughter was 508,000 head, down 49,500 from a year earlier. USDA's 2026 outlook lowered beef production to 25.81 billion pounds and raised beef imports to 5.675 billion pounds. In Brazil, chilled chicken in Greater São Paulo averaged R$6.73/kg in March as weak domestic demand and Middle East uncertainty pressured prices.
Innovation Spotlight
United States - Iowa: A cover-crop-heavy strip-till/no-till system continues to produce measurable results. The Schleissman family recorded three 300+ bushel corn plots in 2025, including a 317-bushel contest-winning field where a cereal rye and rapeseed mix was grazed and terminated five days before planting. The farm uses cover crops on 100% of 5,000 acres, treating them as "full-width tillage" that loosens soil and improves water infiltration.
Brazil - Paraná: John Deere's current harvest package is being sold on quantified productivity, not just automation. Its S7/X9 combines use predictive speed automation from satellite imagery and cab-mounted cameras, with reported gains of up to 20% more hectares per day and 4.5% fuel savings. A new Brazilian-made CR corn header, available up to 27 rows, is reported to deliver up to 12% more hectares per day and 3x fewer losses.
China - Hebei: Saline-alkali wheat management is showing both rescue value and long-cycle payoff. Historical remediation through drainage, organic fertilizer, and straw return lifted yields from just over 100 jin/mu to 400+ jin/mu. In the current weak-seedling episode, experts recommended fast scarification for surface crusting, then light rolling and fertilizing in one pass using a 2-ton, multi-tire tractor to avoid crushing fragile seedlings. Without treatment, affected fields risk falling from 400-600 jin/mu to about 200 jin/mu, implying a potential 3.6 million jin loss in Huanghua alone.
China - Hebei: Breeding and field execution are also tightening. Salt-tolerant wheat selection in artificial climate chambers has shortened variety development to about 6-7 years, while the grain itself is described as having roughly 20% higher protein/gliadin content than ordinary wheat. Beidou-guided sowing is being used at around 2.5 cm line accuracy to improve land use efficiency.
Regional Developments
Brazil - Rio Grande do Sul: Diesel shortages are now disrupting harvest logistics across rice, soybeans, corn, and olives. Producers reported no fuel in some cities, rationing of 500-1,000 liters per CPF where diesel is still available, and prices near R$7.60/liter with further increases expected. Rice growers in Rio Grande had harvested only 20% of area and warned that harvest delays are already causing grain shattering and productivity losses; some municipalities declared emergency status.
Brazil - Soy complex: Brazil is still on track for a record 6.5 billion bushels of soybeans in 2025/26, but lower soybean prices, higher fertilizer and financing costs, and weak port premiums are pushing margins toward their weakest level in nearly 20 years. That is beginning to slow the country's long expansion in soybean acreage.
United States - Plains and Corn Belt: Recent rainfall improved drought conditions in parts of the Midwest, but it has not materially improved winter wheat development. The Central and Southern Plains are still dealing with frost, heat, and expanding dryness, and some analysts now see sub-95 million U.S. corn acres as realistic if fertilizer remains scarce and expensive.
United States - Nebraska: Wildfires have burned more than 800,000 acres and wiped out feed and forage for about 40,000 mother cows. USDA's standing relief tools include the Livestock Indemnity Program, ELAP for feed and livestock transport, CRP grazing flexibility, and the Emergency Conservation Program with up to 75% cost-share for rebuilding fences.
Best Practices
Grains
Brazil - Paraná: Use on-farm hybrid comparisons rather than relying only on company data. One high-performing corn operation said it runs side-by-side plots every year, then combines hybrid choice with liming, soil correction, service crops, and timely planting. In its region, early planting from August and short-cycle corn support a profitable rotation with soybeans and beans; first-crop yields this season were estimated at 13-14 t/ha despite a cool start.
Brazil - Soy marketing: One suggested soybean-selling framework was to commit one-third of expected production to cover production costs, keep one-third available for possible seasonal price spikes, and hold the final one-third in reserve for later opportunities.
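The one-third framework above is simple arithmetic, and a minimal sketch makes the allocation explicit (the function and bucket names here are illustrative, not from the source):

```python
def allocate_thirds(expected_production_tons: float) -> dict:
    # Split expected production into the three buckets the framework names:
    # commit to cover costs, keep liquid for seasonal spikes, hold in reserve.
    third = expected_production_tons / 3
    return {
        "commit_to_cover_costs": third,
        "sell_into_seasonal_spikes": third,
        "hold_in_reserve": third,
    }

print(allocate_thirds(900.0))  # each bucket gets 300.0 tons
```

The point of the framework is discipline, not precision: costs are locked in early while two-thirds of the crop stays exposed to upside.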
Dairy
United Kingdom: Where cash flow and storage allow, earlier fertilizer procurement can matter. One dairy farm locked in urea at roughly £515-540/t in January versus about £610/t later, targeted 80 units of nitrogen on first-cut grass, and used calibrated spreader settings before application.
United Kingdom / United States: Timing still matters as much as price. The same farm checked field trafficability before spreading because wet ground was delaying operations, while U.S. dairy producers now have a single searchable database through the Dairy Conservation Navigator to identify grants for conservation and on-farm improvements before committing capital.
Livestock
Brazil - Poultry: Controlled housing improves planning and weather resilience, but results still depend on close management. One producer highlighted the advantage of 60-day production cycles because budget visibility starts at bird placement, while a veteran granjeiro said flock performance depends on controlling house ambiance and watching bird behavior, not just completing routine chores.
United States - Wildfire response: After livestock disasters, document losses immediately. USDA officials specifically advised taking photos of dead or damaged animals, hay deliveries, and downed fences, then keeping receipts so LIP, ELAP, and ECP claims can move faster.
Soil management
United States - Row crops: Treat cover crops as a structural soil tool, not just a compliance step. One Iowa system uses cereal rye as the base, then adds oats where grazing is planned and rapeseed/radish where it is not, with the root system used as a replacement for full-width tillage to open soil and improve infiltration. On steep land in Wisconsin, terraces, waterways, contour strips, and alfalfa/grass on hills were credited with keeping erosion near zero.
China - Saline wheat: In crusted or salt-affected wheat fields, break the crust quickly, then combine light rolling with fertilizer in a single pass to reduce traffic damage. The Hebei case used a low-pressure, multi-tire tractor specifically because conventional passes were crushing weak seedlings.
Input Markets
United States - Fertilizer: The current squeeze is severe. The Iran conflict reduced Gulf nitrogen shipments and pushed fertilizer prices up more than 30% in recent weeks ahead of spring planting. Gulf urea for May reached $617/ton, up 74% from the Dec. 2025 low, while retail Corn Belt prices were reported above $700-800/ton where product was available.
"That fertilizer cost on an acre of wheat is about 40% of your production cost and that's going up 30% now."
United States - Farm-level impact: North Dakota producers said many farmers delayed fertilizer bookings while waiting on payments, and some local groups reported that not a single producer expected to break even in 2026. Fertilizer is being described as the central reason stronger commodity prices are not translating into healthier margins.
United States - Market structure and policy: Several sources now argue fertilizer pricing is behaving more like a crop-linked market than an energy-linked one. Since about 2010, nitrogen prices have tracked corn more closely than natural gas; the industry is described as dominated by three major players; DOJ is investigating possible collusion; lawmakers are pushing for weekly USDA fertilizer data; and farm groups are again pressing to remove duties on Moroccan phosphate imports. The White House also issued a 60-day Jones Act waiver and is seeking backup supply from Venezuela and Morocco.
United States/Brazil - Fuel: Diesel is now a direct margin variable in both hemispheres. U.S. diesel was reported at $5.06/gal, up $1.39 from a month earlier. In Rio Grande do Sul, diesel around R$7.60/liter is being rationed during harvest, and freight-sensitive crops such as olives are especially exposed because fruit must reach the processor the same day.
United States - Crop protection: New chemistry is still moving forward. Helena's Testament for corn and soybeans combines 3 active ingredients across 2 modes of action and can be used in spring burn-down, pre-emerge, or post-harvest timing. Sinister Nexus is a recently registered soybean pre-emerge product with 3 active ingredients. Testament took about 3 years to bring to market.
Forward Outlook
United States: Planning decisions remain highly weather-sensitive. Advisors are telling growers to stay close to both short- and long-range forecasts, keep a soybean fallback ready if corn planting is delayed, watch for insects in dry years and disease in wet ones, and assume acreage intentions can still move if fertilizer stays tight. Several market voices now see U.S. corn acreage below 94-95 million as plausible, but note that fast planting and a better corn/soy profitability spread could still pull acres back toward corn.
Brazil: Autumn planning should assume warmer-than-average temperatures across most of the country, better rain prospects in the South and parts of the Center-West through May, and stronger cold pulses starting in the second half of April. For soybeans, Brazil's 2026 crop appears liquid, but analysts still expect average domestic prices to look broadly similar to last year rather than break sharply higher.
Trade and policy watch: China is signaling more U.S. agricultural purchases, but recent buying has still favored Brazil. In Washington, momentum is building around another farm-aid package, with discussion centered near $15 billion on top of the prior $12 billion bridge program. In Brazil, policy debate is shifting toward a national parametric insurance model, a catastrophe fund, and broader debt restructuring; that matters because CPRs already total about R$560 billion, and recent issuance exceeded traditional rural credit volumes.
Discover agents
Subscribe to public agents from the community or create your own—private for yourself or public to share.
Coding Agents Alpha Tracker
Daily high-signal briefing on coding agents: how top engineers use them, the best workflows, productivity tips, high-leverage tricks, leading tools/models/systems, and the people leaking the most alpha. Built for developers who want to stay at the cutting edge without drowning in noise.
AI in EdTech Weekly
Weekly intelligence briefing on how artificial intelligence and technology are transforming education and learning - covering AI tutors, adaptive learning, online platforms, policy developments, and the researchers shaping how people learn.
Bitcoin Payment Adoption Tracker
Monitors Bitcoin adoption as a payment medium and currency worldwide, tracking merchant acceptance, payment infrastructure, regulatory developments, and transaction usage metrics
AI News Digest
Daily curated digest of significant AI developments including major announcements, research breakthroughs, policy changes, and industry moves
Global Agricultural Developments
Tracks farming innovations, best practices, commodity trends, and global market dynamics across grains, livestock, dairy, and agricultural inputs
Recommended Reading from Tech Founders
Tracks and curates reading recommendations from prominent tech founders and investors across podcasts, interviews, and social media