Your intelligence agent for what matters
Tell ZeroNoise what you want to stay on top of. It finds the right sources, follows them continuously, and sends you a cited daily or weekly brief.
Your time, back
An AI curator that monitors the web nonstop, lets you control every source and setting, and delivers verified daily or weekly briefs.
Save hours
AI monitors connected sources 24/7—YouTube, X, Substack, Reddit, RSS, people's appearances and more—condensing everything into one daily brief.
Full control over the agent
Add/remove sources. Set your agent's focus and style. Auto-embed clips from full episodes and videos. Control exactly how briefs are built.
Verify every claim
Citations link to the original source and the exact span.
Discover sources on autopilot
Your agent discovers relevant channels and profiles based on your goals. You get to decide what to keep.
Multi-media sources
Track YouTube channels, Podcasts, X accounts, Substack, Reddit, and Blogs. Plus, follow people across platforms to catch their appearances.
Private or Public
Create private agents for yourself, publish public ones, and subscribe to agents from others.
3 steps to your first brief
Describe your goal
Tell your AI agent what you want to track using natural language. Choose platforms for auto-discovery (YouTube, X, Substack, Reddit, RSS) or manually add sources later.
Review and launch
Your agent finds relevant channels and profiles based on your instructions. Review suggestions, keep what fits, remove what doesn't, add your own. Launch when ready—you can always adjust sources anytime.
Sam Altman
3Blue1Brown
Paul Graham
The Pragmatic Engineer
r/MachineLearning
Naval Ravikant
AI High Signal
Stratechery
Get your briefs
Get concise daily or weekly updates with precise citations directly in your inbox. You control the focus, style, and length.
Addy Osmani
Shawn "swyx" Wang
🔥 TOP SIGNAL
Serious coding-agent users are converging on structured workflows, not blind autonomy. Addy Osmani's open-source skills system maps directly onto define → plan → build → verify → review → ship, with each skill carrying usage guidance, red flags, and verification steps; Simon Willison and swyx describe the same pattern in practice by tightening review, approvals, and testing as stakes go up. MongoDB's rollout makes the stakes concrete: across ~2,000 engineers, coding assistants wrote 70% of last week's checked-in code and engineers produced ~35% more code, but token spend rose enough that the company is explicitly weighing tool cost against future hiring plans.
⚡ TRY THIS
- Addy Osmani: run a six-step loop before you let the agent code. In VS Code/Copilot, start with /refine on a vague idea; answer the clarifying questions; run /spec to turn that into success criteria and stack assumptions; run /plan to generate phased tasks with acceptance criteria and suggested files so the agent stays inside a smaller blast radius; then do /build, /verify, and /review before shipping. Addy's build phase encodes TDD explicitly, and verify/review cover browser testing, security, simplification, CI, git, and docs.
- Addy Osmani + Command Code: keep context thin and route models by phase. Put reusable behavior in small skill or taste markdown files. Let the agent see only the skill name/description first, then load full content on demand; Addy prefers this progressive loading for context-window economy and reaches for MCP more in auth-heavy cases. Then split models by job: Addy plans with Gemini 3.5 Flash and implements with stronger models like Gemini Pro, Opus, or Codex, while Ahmad Awais says users seed a project's in-repo taste file with Opus or GPT-5.5 and continue iterating with cheaper models.
- Simon Willison: for high-stakes code, split research, build, and red-team passes. Simon first had GPT-5.5 Pro research MicroPython WASI support and produce research.md, then handed that doc to Codex Desktop + GPT-5.5 high to build the package. For the sandbox itself, he watched the trace closely and then used GPT-5.5 to try to break out of it, fixing edge cases it found. His build prompt:
read the research.md document and build this. You will probably need to write a script that compiles a custom WASM version of MicroPython as part of this project - fetch the MicroPython code to a /tmp directory for this as part of that script.
- swyx: ask for pushback first, then keep autonomy bounded. A simple prompt trick: frame the task as a question (literally append ?) so the model critiques the idea and suggests alternatives instead of blindly executing. In his own conference-scheduling agent, he paired that with a bounded while-loop, minimal turns, human approval for proposals, and logs/confirmations across Slack, email, and web. That guardrail matters because, as Kent C. Dodds notes, agents can make you more confident you're on the right path when you're not.
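The bounded-loop guardrail in the swyx item can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not his implementation: `run_bounded_agent`, `propose_step`, `approve`, and `MAX_TURNS` are hypothetical names, and the model call is replaced by a stub that returns canned proposals.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

MAX_TURNS = 5  # hypothetical cap that keeps the loop bounded


@dataclass
class RunLog:
    entries: List[str] = field(default_factory=list)

    def record(self, message: str) -> None:
        self.entries.append(message)


def run_bounded_agent(
    propose_step: Callable[[int], Optional[str]],
    approve: Callable[[str], bool],
    max_turns: int = MAX_TURNS,
) -> RunLog:
    """Run at most max_turns proposals and log which ones a human approved."""
    log = RunLog()
    turn = 0
    while turn < max_turns:
        proposal = propose_step(turn)  # stand-in for a model call
        if proposal is None:  # the agent reports it has nothing left to do
            log.record(f"turn {turn}: agent finished")
            break
        verdict = "approved" if approve(proposal) else "rejected"  # human gate
        log.record(f"turn {turn}: {verdict} -> {proposal}")
        turn += 1
    else:
        # Reached only when the turn budget runs out without a break.
        log.record(f"stopped after {max_turns} turns (turn budget exhausted)")
    return log


# Usage with stubbed proposals and a stubbed approver.
def fake_proposals(turn: int) -> Optional[str]:
    steps = ["move talk A to 10:00", "swap rooms for talk B"]
    return steps[turn] if turn < len(steps) else None


log = run_bounded_agent(fake_proposals, approve=lambda p: "swap" not in p)
```

The point of the design is that both exits are explicit: the agent either declares itself done or hits the turn cap, and every proposal leaves a log line whether or not it was approved.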
📡 WHAT SHIPPED
- Simon Willison: micropython-wasm alpha + datasette-agent-micropython 0.1a0. Safe Python execution in a MicroPython + WASM sandbox via wasmtime; Simon says he is using it himself after breakout testing with GPT-5.5 xhigh; repos: micropython-wasm, datasette-agent-micropython.
- Cursor: Design Mode. Point, draw, or talk to update UI; blog: cursor.com/blog/design-mode.
- LangChain — Managed Deep Agents. Managed, model-agnostic infra for deploying deep agents with one line of code; details: introducing-managed-deep-agents.
- Deep Agents v0.6 — Streaming. Adds a subscription model for tool and subagent progress in highly parallel systems; examples: streaming-cookbook.
- Skills are productizing fast. LangSmith Fleet lets domain experts create reusable skills that stay in sync across teams, while Codex now reads .agents/skills and plans to deprecate .codex/skills.
- Google ADK → LangSmith deployments. A wrapper turns an ADK runner into a deployment with persistence, streaming, tracing, and built-in endpoints; the demo flow is wrap(...) → uv run langgraph dev → uv run langgraph deploy.
- Command Code (emerging project). Ahmad Awais says its per-repo taste files learn micro-preferences from accepts, edits, and rejects on merge, while its deterministic tool-repair layer has expanded to 16k+ variations across ~600B tokens; one user reportedly ran 12+ hour DeepSeek sessions and 70B tokens through it, and open source is planned very soon.
🎬 GO DEEPER
- 11:21-15:29, Addy Osmani: vague idea → clarifying questions → spec. Watch /refine turn a rough GitHub-inspired habit tracker into concrete constraints and candidate directions before any real build work starts.
- 18:16-21:24, MongoDB: real rollout numbers from ~2,000 engineers. CJ walks through 70% AI-written check-ins, ~35% more code, and the cost calculus that follows when token spend starts climbing.
- 7:40-8:16, Simon Willison: when to trust the agent vs watch it like a hawk. Short, useful distinction between low-stakes prototyping and security-critical code where you should scrutinize every step and then run adversarial tests.
- Repo to study: micropython-wasm, datasette-agent-micropython, and Simon's research.md. Good reference for research-model-first, coding-agent-second workflows plus WASM sandbox design with host functions and fuel limits.
- Repo to study: streaming-cookbook. If you're building multi-tool or multi-subagent systems, this is the fastest way to see what useful streaming telemetry should look like in practice.
Editorial take: the biggest edge right now is not 'more autonomous agents'; it's tighter scaffolding around them—phases, slim context, approvals, and hostile verification.
sarah guo
Harrison Chase
Aravind Srinivas
Funding & Deals
- Gigascale Capital — $250M for AI-era hardware. Gigascale launched a $250 million first fund targeting early institutional rounds, with typical checks of $1 million to $10 million . Founding partner Mike Schroepfer, former Meta CTO with hardware experience on Ray-Bans and Oculus, says he is looking for companies that are "better, faster, cheaper, but also cleaner" . The firm had already made 25 investments before first close via Schroepfer’s family office, including Radiant Nuclear and Fractile, pointing to a thesis around clean power and energy-efficient compute for AI and data-center demand .
Emerging Teams
- 9Mothers — low-cost anti-drone defense. YC says 9Mothers is building AI mission systems for defense, starting with EDDA, a tiny robot meant to protect soldiers and critical assets from Group 1 suicide drones. The design brief is to make it small and cheap enough to deploy on every vehicle, position, and asset. YC named the founding team as rhs, Roman, and Bogdan .
- Walter — legacy ERP automation for factories. Walter is positioned as an AI employee for the manufacturing back office, logging into the same ERP systems factories already use and automating manual work the way a human would. YC named founders Nikolas Keller and Lukas Postulka .
- ONYRI Sanitize — privacy tooling before the model call. The product anonymizes names, passwords, SSNs, health information, and other sensitive data before it is sent to AI systems . The founders say they built it over two months and currently achieve a 95% detection success rate on data from the United States and France, with more language coverage in progress .
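To make the "anonymize before the model call" idea in the ONYRI Sanitize item concrete, here is a minimal sketch. It is not ONYRI's method, which the brief does not describe: the two regex patterns, the `sanitize` function, and the placeholder format are all assumptions for illustration. Regexes only cover rigidly formatted data; names and health information would need a different kind of detector.

```python
import re

# Hypothetical, deliberately simple patterns; real detectors need far more coverage.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def sanitize(text: str) -> str:
    """Replace each match with a typed placeholder before the text is sent anywhere."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


prompt = "Patient Jane, SSN 123-45-6789, email jane@example.com, reports chest pain."
safe_prompt = sanitize(prompt)
```

In this example the SSN and email are masked, but "Jane" and "chest pain" pass through untouched, which is exactly the gap a detection-rate figure like the one quoted above is meant to measure.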
AI & Tech Breakthroughs
- Nvidia’s Nemotron 3 Ultra is a US open-model release worth tracking. Perplexity says Nemotron 3 Ultra is Nvidia’s new open model built for long-running agents and available to Pro and Max subscribers; Aravind Srinivas described it as "America’s leading open-source model" . It arrives as Clouded Judgement argues that the strongest open-source models have largely come from Chinese labs and that open models are the distribution layer for the next generation of AI applications .
- Weco’s Aiden posted unusually strong autonomous research results. In OpenAI’s Parameter Golf competition, Aiden produced 7 of the 47 merged leaderboard records—more than 2x the next-best human—and ran autonomously for 22 straight days on a single GPU node using under 4% of visible compute. Its 28% submission acceptance rate was roughly 6x the community rate . Weco says the system was built on the open-source AIDE tree-search loop, and one of the clearest signals was async human-agent collaboration: after a human shipped a new tokenizer, Aiden recombined it with its own prior work to produce the biggest validation jump of the competition .
- AlphaProof Nexus reinforces the importance of the harness around the model. In a Two Minute Papers breakdown, DeepMind’s system is described as having solved 9 of roughly 350 Erdős open problems for a few hundred dollars per problem . The workflow repeatedly generates candidate proofs, uses a cheaper judge model to compare and rank them tournament-style, and iterates from the highest-scoring attempts until a formal proof validates . The same breakdown argues that more of the intelligence now sits in the loop around the model, while also noting that smaller models solved none of the problems .
- Flow Agents push agentic workflows into hardware engineering. Flow says its agents connect requirements, design, CAD, simulation, code, and testing so teams can surface conflicts early, propagate changes automatically, and iterate faster across aerospace, defense, robotics, energy, and autonomous-systems programs . Roelof Botha’s framing is that hardware teams can now access the kind of AI-driven development acceleration that software teams already expect .
Market Signals
- Open-source models matter strategically, but their business model is still unsettled. Clouded Judgement argues the US lacks a strong open-source model layer even as Chinese labs such as DeepSeek, Qwen, and Kimi lead the category, and says that matters because open models are where developers learn, experiment, and build . The same essay says monetization is hard: inference hosting, fine-tuning, and eval loops all face direct competition from infrastructure vendors and hyperscalers, leaving open questions around long-term margins . One path forward would be either a breakthrough from an open lab or frontier labs open-sourcing older generations of models .
- Token-cost control is becoming its own product category. Clouded Judgement argues that many workloads do not need frontier-model pricing and may shift to cheaper, "good enough" models if quality is sufficient . Paul Graham said he met a startup that cuts enterprise LLM token costs by about half and splits the savings with customers, implying a TAM equal to a quarter of model companies’ corporate revenue; his broader point, echoed by Garry Tan, is that incumbent execution gaps create room for upstarts .
- Investor tolerance still centers on extreme growth. Sarah Guo’s heuristic: if an AI company is growing more than 3x YoY, investors will accept negative to low gross margins and ask fewer durability questions; if growth is only "fast," the company needs a stronger moat in models or enterprise; weak on both is "tough." Her implied scale threshold is roughly $500M+ ARR .
- Legal AI adoption is showing fast tipping behavior. YC says Legora went from a 2023 founding story around Max Junestrand and two co-founders to $100M ARR in 18 months, with 1,000+ law firms and legal organizations across 50+ markets . Harry Stebbings highlighted one buyer reaction as evidence of the speed of adoption:
"We cannot admit to clients that we JUST got Legora. That would be so embarrassing."
- The startup pipeline still looks active, and talent is recycling into new labs. Amjad Masad noted that new business creation on Stripe is up 2x YoY . Separately, Andrew Reed highlighted research and product talent leaving DeepMind for verticalized neo-labs in Kings Cross, London—especially around Biology/Chemistry and AI—and argued that keeping large incumbents domestic matters because they train and redistribute talent into the local ecosystem .
- Agent-first and headless software are becoming default assumptions. Naval’s view is that software platforms will be rebuilt for agent-first use cases . Harrison Chase adds that "Every platform will have a headless version," while WitanLabs is explicitly building a headless Office stack for AI agents .
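The "quarter of corporate revenue" figure in the token-cost item above follows from two halvings. A small sketch of that arithmetic, assuming the savings split is 50/50 (the brief says only that the startup "splits the savings"); the function and parameter names are hypothetical.

```python
def startup_revenue(enterprise_spend: float, cost_cut: float = 0.5, startup_split: float = 0.5) -> float:
    """Revenue the cost-cutting startup keeps from one customer's LLM spend."""
    savings = enterprise_spend * cost_cut  # spend removed from the model bill
    return savings * startup_split         # the share of savings the startup keeps


# Per 100 units of enterprise LLM spend: 50 saved, 25 kept by the startup.
share = startup_revenue(100.0)
```

If the split were less even, or the cost cut smaller than half, the implied TAM shrinks proportionally.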
Worth Your Time
- Clouded Judgement: Where Are the American Open Source Models? — useful for the clearest framing here on the US open-model gap, monetization constraints, and why open models shape future developer ecosystems .
- Weco on Aiden and Parameter Golf — a concrete case study in autonomous research agents, low-compute performance, and async human-agent collaboration .
- Two Minute Papers on AlphaProof Nexus — useful for the tournament-and-judge explanation of why the harness around the model is becoming a source of performance .
unusual_whales
alphaXiv
Factory
Top Stories
Why it matters: the biggest story was no longer just new models, but who can fund, supply, and safely steer them.
- Google paired a broad Gemini rollout with a huge compute commitment. Google announced Gemini Omni for video generation from image, audio, video, and text; Gemini 3.5 as its latest model family starting with 3.5 Flash; new AI Search features; Gemini app updates including daily briefs and Spark; broader SynthID verification; and real-time image creation and editing in Gemini Live . Separately, a SpaceX filing said Google will pay $920 million per month from October 2026 through June 2029 for compute capacity including about 110,000 NVIDIA GPUs, while retaining ownership of its models and data .
- Compute scarcity is becoming a first-order constraint. OpenAI CFO Sarah Friar said compute is extremely scarce and expects supply constraints to remain severe through 2027 . OpenAI's frontier models are being trained on Stargate Abilene and Microsoft Fairwater, with the next major training run expected on Nvidia's Vera Rubin platform; it is also diversifying beyond Nvidia to AMD, Cerebras, and a Broadcom custom chip . Epoch AI estimated AI-related data-center, hardware, and networking investment reached about 0.8% of U.S. GDP in Q1 2026, pushing total computing infrastructure to about 1.5% and making AI infrastructure the leading driver of private-investment growth .
- Anthropic paired stronger warnings with a concrete science result. Anthropic said internal data shows Claude is accelerating AI development and could be creating a path to recursive self-improvement faster than expected . A WSJ-cited report said Anthropic has urged a global pause in AI development . At the same time, Anthropic said Claude Opus 4.7 matches, and on some tasks beats, dedicated NMR software for determining molecular structures .
Research & Innovation
Why it matters: the most useful research updates focused on systems that can revise their own reasoning, see space more precisely, or act in the physical world.
- MIT proposed a self-revising AI scientist. The framework moves beyond search in a fixed scientific vocabulary by allowing verified schema expansion - adding new variables, tools, and verifiers - with objective novelty measurement; case studies covered protein compliance and fiber-network stiffness .
- NVIDIA's LocateAnything targets a core VLM bottleneck. The 3B model uses Parallel Box Decoding to predict full boxes in one pass, runs on consumer GPUs, was trained on 138M queries and 785M boxes, and reaches 12.7 boxes per second on one H100 with better high-IoU accuracy .
- Physical AI kept advancing at CVPR. NVIDIA highlighted GraspGen-X for zero-shot grasping, LCDrive for latent driving representations, and NitroGen for gameplay-based embodied training; NitroGen received a CVPR Best Paper Honorable Mention .
Products & Launches
Why it matters: the most notable launches improved security, local deployment, and human-agent interaction.
- ChatGPT Lockdown Mode is now available on all plans, limiting outbound network requests to reduce prompt-injection-based data exfiltration, with tradeoffs intended for higher-risk users .
- Gemma 4 QAT pushes local AI further: Google said quantization-aware training cuts memory needs while preserving quality, with Gemma 4 E2B running in about 1GB and 26B-A4B fitting in 16GB RAM .
- Cursor Design Mode lets users point, draw, or talk to update UI with visual prompts, aiming to narrow the gap between what a user sees and what the agent understands .
Industry Moves
Why it matters: labs and platforms are still reorganizing around capital intensity, data pipelines, and self-improving systems.
- Meta is exploring a major equity raise for AI capex. Reports say it is considering selling tens of billions of dollars in new shares following Google's large raise .
- Sakana AI opened an RSI Lab in Tokyo. The new group says it will pursue open-ended, sample-efficient recursive self-improvement on modest compute rather than brute-force hyperscale clusters, and is hiring frontier scientists and engineers .
- Microsoft exposed more of its MAI training pipeline. New details on MAI-Thinking-1 describe 30T pretraining tokens plus 3.55T midtraining tokens, with no third-party distillation and no open-source training datasets .
Quick Takes
Why it matters: a few smaller updates still sharpen the picture.
- University of Cambridge researchers said the world's first AI-designed vaccine component has now been tested in humans; a 39-person phase 1 study found safety and a modest immune response, with a larger study underway .
- Artificial Analysis said Google's open Gemma 4 12B supports transcription, but at 8.8% AA-WER it trails dedicated open transcription models such as Voxtral .
- Factory Router says it can maintain frontier performance while cutting costs 25%; one post said private-preview users saved about $13M in the last 30 days .
- Magenta RealTime 2 is a live music model from Google Magenta with about 200ms end-to-end latency that runs locally on a MacBook .
Marc Andreessen 🇺🇸
Reid Hoffman
Satya Nadella
Most compelling recommendation
Two Paths to Prosperity — Joel Mokyr and co-authors
Satya Nadella's book pick stands out because he did more than name it. He described it as a study of the thousand-year history of China and the West, then used it to frame a current question: whether moral philosophy, markets, democracy, and scientific and technological revolutions can reinforce one another in the AI age .
- Content type: Book
- Author/creator: Joel Mokyr and co-authors
- Link/URL: No direct book URL appeared in the notes; discussed in this YouTube conversation
- Who recommended it: Satya Nadella
- Key takeaway: Nadella pointed to the book's account of how cultural and societal constructs helped the West use the scientific and industrial revolutions to create modern prosperity
- Why it matters: He explicitly connected that historical frame to AI-era questions about abundance, stakeholder benefit, and maintaining social permission for technological change
Two shorter links worth keeping
Podcast conversation with @dwarkesh_sp and @pawtrammell
Marc Andreessen endorsed a post summarizing this conversation as "Self recommending" . The linked summary's main argument is that economics is most useful here not for precise long-range forecasts, but for working backward from important AI scenarios and tracking the conditions and data that would make them plausible .
- Content type: Podcast episode / conversation
- Author/creator: Conversation involving @dwarkesh_sp and @pawtrammell
- Link/URL: Episode post and summary post
- Who recommended it: Marc Andreessen
- Key takeaway: The summary highlights specific signals to watch, including latent demand for human involvement, substitution between AI and human interaction, task bundling inside jobs, and AI bottlenecks
- Why it matters: It gives readers a practical way to reason under AI uncertainty without pretending that 5- to 10-year forecasts are reliable
Click, Ma, Is Ringing Off
Andreessen also shared this Time archive piece with a very specific endorsement:
"The directly relevant history"
- Content type: Article
- Author/creator: Not specified in the provided notes
- Link/URL: https://time.com/archive/6860225/click-ma-is-ringing-off/
- Who recommended it: Marc Andreessen
- Key takeaway: He presented the article as historical context directly relevant to current AI policy choices
- Why it matters: It is the clearest explicit prompt in today's set to study an earlier technology-policy moment before making present AI decisions
What stands out
The common thread is better framing, not just more information. Nadella reached for long-run comparative history to think about AI-era prosperity, while Andreessen's two picks pointed to one framework for reasoning under uncertainty and one historical precedent for policy judgment.
Sakana AI
sarah guo
Today’s throughline
Control was the day’s clearest theme: Washington is openly discussing direct stakes in frontier AI companies, Microsoft is widening its enterprise agent stack, and several research updates showed how much progress now depends on loops, tools, and verification around models—not just bigger base models .
Policy and power
Washington explores direct ownership and tighter control
President Trump said he is considering taking a government stake in leading AI companies and plans to discuss the idea with industry leaders at the White House; CNBC separately reported talks with OpenAI on a possible government stake in the startup . Separately, Yann LeCun said new White House rules would let political appointees vet public science grants for fidelity to 'American values', replacing peer review with political control .
Why it matters: The exact policy path is still unclear, but the direction is not: Washington is debating more direct influence over both frontier companies and the research pipeline. Critics including Gary Marcus, David Sacks, and Adam Thierer warned that government ownership or utility-style control could erode trust, deepen political influence, and encourage capture or cronyism .
Platforms and products
Microsoft turns Build into a full-stack agent push
Microsoft announced seven in-house models, including its flagship reasoning model, MAI Code One Flash, MAI Image 2.5, MAI Transcribe 1.5, and MAI Voice 2 . It also introduced Microsoft Scout, an always-on autopilot agent with OS-level access across Teams, Outlook, OneDrive, SharePoint, and Windows, plus Project Solara for embedding agents in physical devices; Microsoft and Mayo Clinic are also collaborating on a frontier healthcare model .
Why it matters: In Sarah Guo’s recap of Satya Nadella’s Build remarks, Microsoft’s bet is that frontier performance is becoming more task-specific, with private evals and company traces turning into core enterprise IP . That makes Build look less like a single model release and more like a platform play around custom agents.
Nvidia pushes more inference back onto the device
At Computex, Nvidia introduced RTX Spark, a GPU-CPU chip with up to 128GB of unified memory designed to run larger local models on-device, with privacy and offline use cases central to the pitch . Microsoft is already using it in a new Surface Laptop Ultra .
Why it matters: Even as cloud agents expand, vendors are also betting that a meaningful share of inference will move back to laptops and other local devices.
Research and automation
DeepMind’s math result highlights the value of tighter loops
AlphaProof Nexus solved 9 of 350 Erdős open problems using Lean for formal verification . Its core method was to generate many candidate solutions, have a cheaper judge model compare them in an ELO-style tournament, then keep iterating from the best failures until a validator confirmed a proof .
Why it matters: The result came with caveats (smaller models solved zero problems, and the test set was an easier-to-formalize subset), but it still solved problems humans had not cracked in 56 years. More importantly, it reinforces a broader theme: reliability is increasingly coming from the harness around the model, not just the model itself.
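The generate, judge, iterate, validate loop described above can be sketched in miniature. This is a toy under stated assumptions, not DeepMind's system: "proofs" are integers, the judge and validator are one-line stand-ins for a cheaper model and for Lean, and `search`, `rank_by_tournament`, `K`, and `TARGET` are hypothetical names.

```python
import itertools
from typing import Callable, Dict, Iterable, List, Optional, Tuple

K = 32  # Elo step size; an assumption for this sketch


def elo_update(r_a: float, r_b: float, a_wins: bool) -> Tuple[float, float]:
    """Standard Elo update for one pairwise comparison."""
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))
    delta = K * ((1.0 if a_wins else 0.0) - expected_a)
    return r_a + delta, r_b - delta


def rank_by_tournament(candidates: List[int], judge: Callable[[int, int], bool]) -> List[int]:
    """Round-robin: the judge picks a winner per pair; sort by final rating."""
    ratings: Dict[int, float] = {c: 1000.0 for c in candidates}
    for a, b in itertools.combinations(candidates, 2):
        ratings[a], ratings[b] = elo_update(ratings[a], ratings[b], judge(a, b))
    return sorted(candidates, key=lambda c: ratings[c], reverse=True)


def search(generate, judge, validate, seeds: Iterable[int],
           rounds: int = 20, keep: int = 2) -> Optional[int]:
    pool = list(seeds)
    for _ in range(rounds):
        candidates = sorted({c for s in pool for c in generate(s)})
        for c in candidates:
            if validate(c):  # only the validator can declare success
                return c
        # The best-rated failures seed the next round of generation.
        pool = rank_by_tournament(candidates, judge)[:keep]
    return None


# Toy stand-ins: the validator accepts exactly one integer.
TARGET = 42
generate = lambda s: [s - 1, s + 1, s + 4]              # "propose variations"
judge = lambda a, b: abs(a - TARGET) < abs(b - TARGET)  # True if a looks better
validate = lambda c: c == TARGET                        # strict pass/fail check

result = search(generate, judge, validate, seeds=[0])
```

The split of responsibilities is the part that carries over: the judge is allowed to be approximate because it only steers the search, while acceptance depends solely on the strict validator.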
Recursive self-improvement moves from thesis to teams
Sakana AI launched an RSI Lab in Tokyo to build open-ended systems that collectively self-improve, explicitly emphasizing sample-efficient recursive self-improvement rather than brute-force compute and presenting it as a capability that should be democratized rather than locked inside hyperscale clusters . In a more bounded but concrete example, Weco said its fully autonomous research agent produced 7 of the 47 merged leaderboard records in OpenAI’s Parameter Golf competition—more than any individual human contributor—while running for 22 days on a single GPU node and using under 4% of visible compute .
Why it matters: Dedicated RSI labs are still a forward bet, but autonomous systems are already proving useful in structured research workflows.
The harness becomes the bottleneck
Better tools, repair layers, and observability are driving agent results
Hugging Face said agents using its hf CLI completed about 94% of roughly 1,000 Hub tasks, versus 84% for agents hand-rolling curl or SDK calls, while consuming up to 6x fewer tokens on multi-step tasks . On the coding side, Command Code said deterministic repair logic for tool-calling failures reduced repeated schema and parsing mistakes in open models by fixing outputs and returning repair hints instead of just errors .
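As an illustration of "fixing outputs and returning repair hints instead of just errors", here is a minimal sketch of a deterministic repair step for a tool call. It is not Command Code's implementation: the `SCHEMA`, the two repairs shown (trailing commas and type coercion), and the hint wording are assumptions chosen for brevity.

```python
import json
import re
from typing import Any, Dict, List, Tuple

# Hypothetical schema for one tool: argument name -> expected Python type.
SCHEMA: Dict[str, type] = {"path": str, "max_lines": int}


def repair_tool_call(raw: str, schema: Dict[str, type] = SCHEMA) -> Tuple[Dict[str, Any], List[str]]:
    """Return (arguments, hints); hints say what was fixed or what is still wrong."""
    hints: List[str] = []
    text = raw.strip()

    # Repair 1: drop trailing commas before a closing brace or bracket.
    # (Naive: it would also touch such a sequence inside a string value.)
    cleaned = re.sub(r",\s*([}\]])", r"\1", text)
    if cleaned != text:
        text = cleaned
        hints.append("removed trailing comma")

    try:
        args = json.loads(text)
    except json.JSONDecodeError as exc:
        return {}, hints + [f"could not parse JSON: {exc.msg}; resend a single JSON object"]
    if not isinstance(args, dict):
        return {}, hints + ["arguments must be a JSON object"]

    # Repair 2: coerce values to the schema type where that is unambiguous.
    for name, expected in schema.items():
        if name not in args:
            hints.append(f"missing required argument '{name}'")
        elif not isinstance(args[name], expected):
            try:
                args[name] = expected(args[name])
                hints.append(f"coerced '{name}' to {expected.__name__}")
            except (TypeError, ValueError):
                hints.append(f"'{name}' must be {expected.__name__}")
    return args, hints


args, hints = repair_tool_call('{"path": "src/app.py", "max_lines": "40",}')
```

Returning the hints alongside the repaired arguments lets the caller both proceed with the call and feed the hints back to the model so the same mistake is less likely on the next turn.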
Good tools are cached intelligence for agents
MongoDB CEO CJ described the same bottleneck from the enterprise side: many 2025 agent projects did not reach customer-facing scale because teams got stuck on stack choice, auditing, governance, and human-handoff requirements, though he said harness and observability tooling feels more mature in 2026 . He also argued that context and memory are becoming the critical layer for real-time customer agents .
Why it matters: More of the agent race is shifting from 'which model is best?' to 'which system is reliable, efficient, and auditable enough to deploy?'
The community for ventures designed to scale rapidly | Read our rules before posting ❤️
Sachin Rekhi
Elena Verna
Big Ideas
Feature parity is becoming table stakes in AI products. Elena Verna argues that collapsing development costs and AI-written code make feature differentiation hard to defend for long; the moats she still sees as durable are data, network effects, security/compliance, hardware, and brand . Hiten Shah makes a related point from the user side: many generalized AI assistants now look similar and feel overly complex, which makes opinionated product design more noticeable . Why it matters: roadmap wins that are easy to copy should support a deeper moat, not be the moat. Apply it: pressure-test major initiatives against four questions: does this deepen proprietary data, workflow ownership, trust/compliance, or a distinct product point of view?
The next AI shift is from helpful feature to governed agent. Microsoft’s internal playbook includes Agent 365 for discovery and governance, Work IQ for measuring whether AI creates real value, and a governance guide covering security, access controls, and data sensitivity at very large scale . Legora describes a similar product shift: as models improved, it moved from task-level augmentation to proactive agents that can structure data rooms, identify missing content, and run work in parallel across legal workflows . Why it matters: shipping the agent is only half the job; PMs also need user awareness, trust, override paths, governance, and value measurement . Apply it: define what the agent can do autonomously, how the user sees and stops it, and what evidence proves it created value .
Tactical Playbook
Diagnose PMF by studying retained users, not by collecting more broad feedback.
- After 300+ calls and 100 customers, one startup advisor argued that the likely issue is focus, not lack of discovery: inspect the small cohort that got real value, what workflow improved, what happened right before purchase, and why they later left .
- Ask for budget history rather than generic pain points: what did they pay for, renew reluctantly, build in spreadsheets, or hire around? That separates expensive pain from mild annoyance .
- Then narrow to one painful workflow, one buyer, one measurable outcome, and sell the smallest solution that removes that pain. Deepen only if customers both pay and stay . Why it matters: this keeps strong builders from shipping and selling around a lack of focus .
Add red-team and ship-readiness gates to AI-assisted execution.
- PM Skills 2.0 is built around structured skills, commands, and plugins rather than generic prompting .
- A practical flow is /discover → /write-prd → /red-team-prd → /ship-check.
- /red-team-prd attacks live assumptions, ranks risks by impact, likelihood, and test cost, and suggests the cheapest validation tests .
- /ship-check documents the system, audits code against documented intent, maps test coverage, and compiles a human sign-off packet . Why it matters: faster prototyping increases the value of structured critique and explicit release gates.
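The brief says /red-team-prd ranks risks by impact, likelihood, and test cost but does not say how those are combined. One plausible scoring rule, shown purely as an assumption, is expected damage per unit of validation effort; the `Risk` class, the 1 to 5 scales, and the sample assumptions below are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Risk:
    assumption: str
    impact: int      # 1 (minor) to 5 (fatal), hypothetical scale
    likelihood: int  # 1 (unlikely) to 5 (near certain)
    test_cost: int   # 1 (an afternoon) to 5 (a quarter)

    @property
    def priority(self) -> float:
        # Assumed rule: high impact and likelihood raise priority, costly tests lower it.
        return self.impact * self.likelihood / self.test_cost


def rank_risks(risks: List[Risk]) -> List[Risk]:
    """Highest-priority assumptions first, i.e. the cheapest useful tests to run."""
    return sorted(risks, key=lambda r: r.priority, reverse=True)


risks = [
    Risk("Users will connect their calendar", impact=5, likelihood=3, test_cost=1),
    Risk("Enterprise buyers need SSO on day one", impact=4, likelihood=2, test_cost=4),
    Risk("Summaries are accurate enough to trust", impact=5, likelihood=4, test_cost=2),
]
ranked = rank_risks(risks)
```

With these sample numbers the calendar assumption ranks first because it is both dangerous and cheap to test, which is the kind of ordering a red-team pass is meant to surface.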
Case Studies & Lessons
Duolingo’s AI-first reflection: three gaps stood out a year later: AI-driven design did not match top human designers, AI-generated content at scale needed human review because roughly 20% was described as pure slop, and tying AI usage to performance reviews encouraged tool use for its own sake rather than better outcomes . Takeaway: set human quality bars and incentive systems early; do not mistake AI usage for product value .
Legora’s bundling bet: the team chose to be best-in-class across three surfaces—assistant, tabular review, and a Word add-in—and bundle them, even while a narrower competitor was at roughly 50x its revenue . They anchored that decision in a 10-year vision of how lawyers will work, then used stronger models to move toward proactive agents spanning end-to-end workflows . Takeaway: in fast markets, a longer-horizon workflow thesis can justify broader bets than the current leaderboard suggests.
Career Corner
- The AI-era career advantage may be hands-on leverage. Elena Verna says returning to individual-contributor work helped her stay close to craft, and argued that AI lets one strong builder accomplish what once required much larger teams . Separately, Mind the Product’s advice was to keep learning and experimenting with agentic AI as companies rehire for more AI-native roles and as products remain more augmentative than fully replacement-oriented for now . Apply it: keep one direct building loop alive—prototype, evaluate agents, or ship small changes yourself—so your judgment evolves with the tools .
Tools & Resources
- PM Skills 2.0 / AI Shipping Kit: useful if you want more structure than raw prompting. The package adds PRD red-teaming plus commands such as /document-app, /security-audit-static, /performance-audit-static, /derive-tests, and /ship-check to make AI-built apps reviewable before release .
Start with signal
Each agent already tracks a curated set of sources. Subscribe for free and start getting cited updates right away.
Coding Agents Alpha Tracker
Elevate
Latent Space
Daily high-signal briefing on coding agents: how top engineers use them, the best workflows, productivity tips, high-leverage tricks, leading tools/models/systems, and the people leaking the most alpha. Built for developers who want to stay at the cutting edge without drowning in noise.
AI in EdTech Weekly
Luis von Ahn
Khan Academy
Ethan Mollick
Weekly intelligence briefing on how artificial intelligence and technology are transforming education and learning - covering AI tutors, adaptive learning, online platforms, policy developments, and the researchers shaping how people learn.
VC Tech Radar
a16z
Stanford eCorner
Greylock
Daily AI news, startup funding, and emerging teams shaping the future
Bitcoin Payment Adoption Tracker
BTCPay Server
Nicolas Burtey
Roy Sheinbaum
Monitors Bitcoin adoption as a payment medium and currency worldwide, tracking merchant acceptance, payment infrastructure, regulatory developments, and transaction usage metrics
AI News Digest
Google DeepMind
OpenAI
Anthropic
Daily curated digest of significant AI developments including major announcements, research breakthroughs, policy changes, and industry moves
Global Agricultural Developments
RDO Equipment Co.
Ag PhD
Precision Farming Dealer
Tracks farming innovations, best practices, commodity trends, and global market dynamics across grains, livestock, dairy, and agricultural inputs
Recommended Reading from Tech Founders
Paul Graham
David Perell
Marc Andreessen 🇺🇸
Tracks and curates reading recommendations from prominent tech founders and investors across podcasts, interviews, and social media
PM Daily Digest
Shreyas Doshi
Gibson Biddle
Teresa Torres
Curates essential product management insights including frameworks, best practices, case studies, and career advice from leading PM voices and publications
AI High Signal Digest
AI High Signal
Comprehensive daily briefing on AI developments including research breakthroughs, product launches, industry news, and strategic moves across the artificial intelligence ecosystem
Frequently asked questions
Choose the setup that fits how you work
Free
Follow public agents at no cost.
No monthly fee