Your intelligence agent for what matters
Tell ZeroNoise what you want to stay on top of. It finds the right sources, follows them continuously, and sends you a cited daily or weekly brief.
Your time, back
An AI curator that monitors the web nonstop, lets you control every source and setting, and delivers verified daily or weekly briefs.
Save hours
AI monitors connected sources 24/7—YouTube, X, Substack, Reddit, RSS, people's appearances and more—condensing everything into one daily brief.
Full control over the agent
Add/remove sources. Set your agent's focus and style. Auto-embed clips from full episodes and videos. Control exactly how briefs are built.
Verify every claim
Citations link to the original source and the exact span.
Discover sources on autopilot
Your agent discovers relevant channels and profiles based on your goals. You get to decide what to keep.
Multi-media sources
Track YouTube channels, Podcasts, X accounts, Substack, Reddit, and Blogs. Plus, follow people across platforms to catch their appearances.
Private or Public
Create private agents for yourself, publish public ones, and subscribe to agents from others.
3 steps to your first brief
Describe your goal
Tell your AI agent what you want to track using natural language. Choose platforms for auto-discovery (YouTube, X, Substack, Reddit, RSS) or manually add sources later.
Review and launch
Your agent finds relevant channels and profiles based on your instructions. Review suggestions, keep what fits, remove what doesn't, add your own. Launch when ready—you can adjust sources anytime.
Sam Altman
3Blue1Brown
Paul Graham
The Pragmatic Engineer
r/MachineLearning
Naval Ravikant
AI High Signal
Stratechery
Get your briefs
Get concise daily or weekly updates with precise citations directly in your inbox. You control the focus, style, and length.
Aravind Srinivas
Perry E. Metzger
Figure
1) Funding & Deals
- Air traffic systems: Eric Jacob Button announced a $7M round led by Initialized to build next-generation air traffic systems, with United Airlines, Y Combinator, and other investors participating. Garry Tan said YC was the first investor.
- Defense procurement is emerging as a financing substitute. The DoD signed framework agreements for low-cost containerized missiles with Anduril, CoAspire, Leidos, and Zone 5, and for low-cost hypersonic missiles with Castelion. The department said it intends to procure 10,000+ LCCMs over three years starting in 2027 and, after testing and validation, award Castelion a multi-year contract for at least 500 Blackbeard missiles annually while seeking authorization for 12,000+ missiles over five years.
- Allocator signal: Scribble Ventures says it is now on its third fund with nearly $300M AUM, still writing roughly $750k-$1.5M initial checks at pre-seed and seed. Founder Elizabeth Weil positioned the firm around AI-native companies and top 1% founders, sourced through an operator-heavy network spanning OpenAI, Meta, Twitter, Instagram, and a16z.
2) Emerging Teams
- Heron Power: Heron is building solid-state transformers that use silicon and software to replace steel, oil, and copper in power conversion for data centers, solar, and battery projects. The company is targeting a grid that Drew Baglino described as largely unchanged for a century, and its first large factory is expected to create around 500 jobs.
- Mariana Minerals: Mariana is a software-first mining and refining company building Capital Project OS for agentic project delivery, PlantOS for refinery autonomy, and MineOS for autonomous mining control. It already operates a copper mine in southeast Utah, is building a lithium refinery in Texas, and is targeting 10 projects in 10 years.
- Foresight: YC says Foresight builds AI-powered consumer simulations for CPG, retail, and tech teams, and reports 95% accuracy versus traditional research in tests with Fortune 500 clients.
- Adialante: The YC team says its mobile MRI model can reduce scan costs to hundreds of dollars per scan and wait times to hours, with the explicit aim of making annual cancer screening routine.
- Rudus: Rudus is going after a painful construction workflow: concrete takeoffs that can require 100+ hours of manual tracing per bid. YC says its AI platform lets teams bid on 3-5x more work without adding headcount.
- Surtr Defense: Surtr’s ParallaxOS is pitched as an open operating system for drone defense, unifying any sensor into one threat picture with AI fire control, while keeping integrations with partners and data with customers.
3) AI & Tech Breakthroughs
- Multimodal efficiency is getting materially better. A new open 30B-parameter multimodal model processes images, video, and audio at almost 10x real-time video, about 3x faster than Q3 Omni on video and up to 7x faster on documents. The gains come from Mamba layers that scale linearly with context length, direct audio tokenization that preserves emotion and tone, 3D convolutions over frame blocks, a distilled vision encoder, and efficient video sampling that removes duplicate frames.
- Agent infrastructure is hardening into a real stack. LangChain launched SmithDB, a database built for agent trace data, and LangSmith Engine, which sits on top of traces to identify issues and suggest fixes like code changes or additional evaluators. Perplexity, meanwhile, described an agent runtime with hardware-isolated sandboxes per task, proxy tokens instead of raw API keys, safety detection on accessed content, encrypted connector data, and separated storage and compute.
- Voice generation is separating identity from performance. Scenema Audio released open-source weights and inference code for zero-shot expressive voice cloning based on the idea that emotional performance and voice identity are independent. The team argues diffusion-based speech sounds more natural and less robotic than autoregressive TTS, especially for emotional delivery, and it is already being used in audio-first video workflows.
- Web agents are getting resettable training environments. WebHarbor packages 15 popular websites into local Flask + SQLite apps inside one Docker image, resets them to byte-identical state in <1 second, and supports all 643 WebVoyager tasks out of the box—explicitly solving live-web issues like reCAPTCHA, geo-blocks, content drift, and non-resettable environments for RL training.
- Physical AI demos are getting longer and less scripted. Figure Robotics showed humanoid robots running a full 8-hour shift at human performance levels, fully autonomously on Helix-02.
4) Market Signals
- China’s competitive threat is now efficiency, not just catch-up. Exponential View estimates Chinese labs are extracting 4-7x more intelligence per unit of compute than naive scaling would predict, despite being 2-3 years behind the US in compute and facing an 8x US lead in deployed capacity. Even so, their models are described as only 3-8 months behind the US frontier on benchmarks, with much cheaper inference and reported 50-70% gross margins at some providers.
- Enterprise model share just flipped. Ramp AI Index data cited on X shows 34.4% of businesses using Anthropic versus 32.3% using OpenAI; Anthropic adoption quadrupled over the last year while OpenAI rose only 0.3%. Nathan Benaich added that revenue comparisons may also be affected by model 4.7 using up to 1.3x more tokens for the same query.
- Application-layer defensibility is moving down-stack. a16z argues that as incumbents such as Salesforce open APIs and ship headless products, they are implicitly betting the data layer—not the UI—retains value. In that world, startups compete on proprietary data, control of the action layer, real-world execution, and selling to technical buyers, while next-generation systems of record capture context, initiate work, and record data exhaust.
- Enterprise knowledge work is already landing significant task volume. PayPal runs 74,000 weekly tasks in Perplexity Enterprise across model validation, channel performance, market trend research, competitive intelligence, and product analysis.
- Investor sentiment remains split between platform risk and productivity upside. Jason Calacanis is explicitly asking how many startups survive Anthropic and OpenAI, and Dalton Caldwell says the answer is not a simple “no,” pointing to moats and the danger of building on the assumption that models will not improve. Marc Andreessen amplified the counterview that firms have incentives to blame AI for layoffs, while GitHub commit activity has risen sharply and the industry is not seeing layoff “armageddon.”
“It’s hard to believe you are prepared for a problem you haven’t spent any time considering could be a problem”
5) Worth Your Time
- Exponential View’s China AI lab analysis is the most detailed item in the batch on how export controls may have created an efficiency moat, with concrete numbers on compute lag, benchmark distance, pricing, and margins.
- Seema Amble’s a16z essay on headless software is useful for diligencing application moats as systems of record become agentic and value shifts toward data and execution.
- The a16z conversation with Heron Power and Mariana Minerals is the clearest video in the batch on why the AI bottleneck is increasingly physical infrastructure—materials, energy, and grid capacity—not just models and chips.
- Sequoia’s Suno interview is worth watching for a founder-level explanation of why music generation required modeling raw audio, why the team chose full songs over short clips, and why usage looks more like creative participation than passive listening.
- LangSmith Engine and SmithDB are the most relevant product posts in the batch if you track agent reliability tooling and the growing importance of traces as a control plane.
Charlie Holtz
Cursor
OpenAI Developers
🔥 TOP SIGNAL
- Claude's new programmatic-usage packaging is already forcing routing decisions. Anthropic says paid Claude plans will get monthly Agent SDK credits starting June 15 — separate from regular limits, usable on claude -p, GitHub Actions, and third-party SDK apps like Conductor/OpenClaw. Theo, who wraps the SDK in T3 Code, says the change is a downgrade in practice: he reports wrapper users getting 25x-40x less useful subsidized usage, called the framing misleading, and cancelled his subscription. Conductor switching its default coding harness to Codex with GPT-5.5 — with Jediah Katz saying it looks like they moved quickly to get ahead of the Claude SDK pricing change — is the clearest downstream signal that economics, not just model quality, are steering tool choice.
⚡ TRY THIS
Replace markdown walls with HTML artifacts. Theo says you don't need a special skill to start — just ask the model to "make an HTML file" or "make an HTML artifact." His main trick: generate distinct options in one pass, not sequentially, because the one-pass fan-out gives more variety.
"Generate six distinctly different approaches, varying layout, tone and density and lay them out as a single HTML file in a grid so I can compare them side by side. Label each with the trade off it's making."
Use HTML as a clean handoff layer between agents. Thoric's workflow, relayed by Theo: brainstorm several HTML explorations, expand the chosen direction with mockups/code snippets, then ask for a thorough HTML implementation plan with mockups, data flows, and important snippets. Start a new session with those files for implementation or verification so the next agent inherits structure, not just a chat transcript.
Attach an HTML explainer to big PRs. Theo says this often works better than a default GitHub diff: ask the model to render the actual diff, inline margin annotations, and severity-tagged findings in HTML, then attach that artifact to the PR. If you want a reference product for the same idea, he points to Devin Review regrouping PRs by importance and related changes.
"Help me review this PR by creating an HTML artifact that describes it. I'm not very familiar with the streaming and back pressure logic, so focus on that. Render the actual diff with inline margin annotations, color-code findings by severity, and add whatever else might be needed to convey the concepts."
Version the sandbox, not the prompt. The pattern showing up across Cursor and LangSmith: set up agent environments like engineer laptops — cloned repos, installed dependencies, toolchain credentials — then keep them reusable, forkable, and auditable. Use multi-repo when the task spans services, scope secrets/egress per environment, and rely on snapshots/version history/rollback instead of rebuilding context from scratch each run.
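The pattern above is vendor-neutral enough to sketch in a few lines. The snippet below is a hypothetical illustration, not any product's API: it models a sandbox as an append-only history of immutable snapshots, so forks are cheap copies and rollback means dropping entries instead of rebuilding context.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass(frozen=True)
class Snapshot:
    """Immutable record of a sandbox state."""
    repos: Tuple[str, ...]          # cloned repositories
    deps: Tuple[str, ...]           # installed dependencies
    secrets_scope: Tuple[str, ...]  # credentials this env may read

@dataclass
class Sandbox:
    """A reusable, versioned agent environment (illustrative only)."""
    history: List[Snapshot] = field(default_factory=list)

    def snapshot(self, repos, deps, secrets_scope) -> Snapshot:
        snap = Snapshot(tuple(repos), tuple(deps), tuple(secrets_scope))
        self.history.append(snap)
        return snap

    def fork(self) -> "Sandbox":
        # cheap fork: copy the version history, then diverge
        return Sandbox(history=list(self.history))

    def rollback(self, n: int = 1) -> Snapshot:
        # drop the last n snapshots instead of rebuilding from scratch
        del self.history[-n:]
        return self.history[-1]

env = Sandbox()
env.snapshot(["acme/api"], ["node 20"], ["deploy-token"])
env.snapshot(["acme/api", "acme/web"], ["node 20"], ["deploy-token"])
base = env.rollback()  # back to the single-repo state
```

A fork shares history up to the fork point and then records its own snapshots, which is the property that makes per-environment audit logs and rollback cheap.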
📡 WHAT SHIPPED
Cursor cloud agents: fully configured development environments; multi-repo reusable across sessions; per-env version history with rollback + audit logs; scoped egress/secrets. Cursor says customers like Decagon, Amplitude, BILT, and Snyk use them for end-to-end agent tasks. Blog
Codex Windows sandbox: OpenAI's answer to "useful on Windows without constant approval prompts or full machine access" is a dedicated sandbox; OpenAI says it is continuing to invest in better Windows agent support. Engineering post
LangSmith Sandboxes GA: secure/scalable agent code execution tied into Deep Agents SDK + LangSmith; GA adds snapshots and cheap forks, blueprints, pause-when-inactive, service URLs, CLI, creator-private default, and auth proxy callbacks. Start at smith.langchain.com or read the blog
LangSmith Engine: new autonomous agent for finding patterns in your agent's failures; LangChain's pitch is less triage, faster fixes, earlier regressions. Blog
Claude plan changes: starting June 15, paid plans get monthly Agent SDK credits for scripts/agents, claude -p, GitHub Actions, and third-party SDK apps; Anthropic says they're separate from regular limits, range from Pro $20 to Max 20x $200 / Team Premium $100 per seat, and do not roll over. Separately, Claude Code weekly limits are up 50% through July 13 for Pro, Max, Team, and seat-based Enterprise. Support
Conductor / T3 Code routing signal: Conductor made Codex with GPT-5.5 its default coding harness for the team and new users. Theo separately clarified T3 Code is open source, BYO inference, and already supports Codex, Claude Code, Cursor, and OpenCode — useful optionality when provider terms change.
Crabbox 0.13.0: Modal sandbox runs, full resync for stale workdirs, native Windows script + preflight, and clearer SSH/sync failure hints. Peter Steinberger says he's using it for almost every PR. Release
🎬 GO DEEPER
- 15:43-17:34 — HTML spec handoffs. Best clip if you want a reusable planning loop: fan out options in HTML, expand the winner, then hand the files to a fresh implementation/verification session. It's a concrete fix for the "one long chat gets mushy" problem.
- 19:38-21:29 — HTML PR explainers. Theo's case for rendered diff explainers is practical, not aesthetic: better hierarchy, inline annotations, and review organized around what matters. Watch this if review latency is your actual bottleneck.
- Study the release — Crabbox 0.13.0. Small changelog, high signal. The interesting parts are the harness details: sandbox execution, stale workdir recovery, Windows preflight, and clearer sync-failure surfaces.
- Study the pattern docs — Cursor cloud-agent environments and LangSmith Sandboxes GA. Different products, same idea: persistent, versioned, forkable execution environments are becoming standard infrastructure for serious coding-agent work.
Editorial take: today's durable edge wasn't a flashier model — it was better packaging around agents: richer artifacts, safer sandboxes, and enough portability to survive pricing shocks.
Perplexity
Sam Altman
Nous Research
Top Stories
Why it matters: The clearest signal today is where AI is creating measurable business pressure, operational capability, and new scrutiny.
- Anthropic pulled ahead of OpenAI in business adoption. Ramp’s AI Index put Anthropic at 34.4% of businesses versus 32.3% for OpenAI, with Anthropic adoption up 4x over the last year while OpenAI rose 0.3%. OpenAI responded with two free months of Codex for eligible enterprise switchers, while Anthropic raised Claude Code weekly limits 50% through July 13. Impact: enterprise spend and coding workflows are becoming the main competitive front.
- Figure turned its humanoid demo into a shift-length test. Figure said its F.03 robots completed an 8-hour package-sorting shift at roughly human pace, about 3 seconds per package, using fully onboard Helix-02, autonomous battery swaps, and failover, then moved to a 24/7 livestream. Some outside observers questioned whether parts of the behavior show robust autonomy or an imitation-learning artifact. Impact: embodied AI claims are getting bigger—and more contested.
- Claude Mythos Preview reached a new cyber milestone. The model became the first to solve both AISI cyber ranges, and completed a 32-step corporate network attack estimated at ~20 human hours in 6/10 attempts. The result was reported on the actual launch version used in Glasswing, not an earlier checkpoint. Impact: long-horizon offensive cyber capability is moving toward more operational benchmarks.
Research & Innovation
Why it matters: The strongest technical updates were about making training faster, reasoning safer to interpret, and AI more useful for frontier math.
- NousResearch’s Token Superposition Training claims 2-3x wall-clock pretraining speedup at matched FLOPs without changing architecture, optimizer, tokenizer, or data; the inference-time model remains identical to conventional pretraining.
- Aletheia, powered by Gemini Deep Think, was used to autonomously solve Kirby Problem 5.16, generating proofs for a new paper on semifree DG algebras; researchers framed the K3 list as a long-term AI math benchmark.
- A new paper on multi-agent reasoning found agents can compute the right answer internally and then suppress it to agree with the swarm; the authors call this the Sovereignty Gap, based on 22,500 deterministic trajectories across GAIA, SWE-bench, and Multi-Challenge.
Products & Launches
Why it matters: Developer tooling is shifting from chat assistants to persistent environments, multi-agent workflows, and full-stack execution.
- Cursor launched cloud agents inside fully configured dev environments with repos, dependencies, and credentials, plus multi-repo support, rollback, audit logs, and scoped egress/secrets.
- VS Code shipped its new Agents window in stable, letting developers manage multiple coding agents across projects, connect to remote sandboxes, and use a browser/mobile interface at vscode.dev/agents.
- Devin added Android Virtual Device support, so it can now build, launch, test, reproduce issues, inspect behavior, and verify Android app changes in an emulator before review.
Industry Moves
Why it matters: Capital is concentrating around agent infrastructure, inference hardware, and new labs aimed at automating research itself.
- Modal is in talks to raise at a $4.5B valuation with annualized revenue around $300M, much of it driven by sandboxes used by AI agents.
- Fractile raised $220M to build inference hardware and systems for next-generation AI scaling.
- Recursive launched, saying it wants to automate science starting with self-improvement; Jeff Clune described a 25+ person team with significant resources, and @_rockt said the company is operating across London and SF.
Quick Takes
Why it matters: These smaller updates help round out where enterprise AI and agent infrastructure are heading next.
- LangChain launched LangSmith Engine and SmithDB, expanding its agent-development stack.
- OpenAI detailed a custom Windows sandbox for Codex to keep coding agents useful without full machine access.
- PayPal now runs 74,000 weekly tasks in Perplexity Enterprise for research and analysis workflows.
- Harvey said it has crossed 50% DAU/MAU, with more than half of customers using it daily.
Ryan Hoover
Garry Tan
What stood out
Today’s strongest organic recommendations split between startup operating frameworks and broader resources for sharpening taste and curiosity.
Most compelling recommendation: Paul Graham’s essays
- Content type: Essay collection
- Author/creator: Paul Graham
- Link/URL: Not provided in the interview
- Who recommended it: Garry Tan
- Key takeaway: Tan said he discovered the essays online and that they gave him the language for ideas like "solve the money problem."
- Why it matters: This was the strongest recommendation in today’s set because Tan described the essays as materially shaping how he thinks about building businesses.
Companion pick from the same conversation: Hackers & Painters
- Content type: Book
- Author/creator: Paul Graham
- Link/URL: Not provided in the interview
- Who recommended it: Garry Tan
- Key takeaway: Tan called it the "perfect polymath book," using it to capture Graham’s mix of hacker instincts and artistic range.
- Why it matters: It reads as the clearest single-book entry point into the worldview Tan was pointing readers toward.
"He wrote a book called Hackers and Painters, which is the perfect polymath book."
Durable operator resources
Hooked
- Content type: Book
- Author/creator: Nir Eyal
- Link/URL: Not provided in the source post
- Who recommended it: Ryan Hoover
- Key takeaway: Hoover said two founders had already mentioned the book that morning and argued that, while tech has changed a lot in 12+ years, human psychology has not.
- Why it matters: It is a reminder that durable product lessons can survive major platform shifts when they are rooted in behavior rather than tooling.
Krishna on how Anthropic thinks about the platform vs. application layer
- Content type: Video clip / X post
- Author/creator: Patrick O'Shaughnessy post featuring Krishna
- Link/URL: https://x.com/patrick_oshag/status/2054562883350962304
- Who recommended it: Brad Gerstner
- Key takeaway: Gerstner called it "must watch" and highlighted Krishna’s explanation that Anthropic is primarily building platform, while selectively building applications like Claude Code when it can express where the models are going or demonstrate value for the ecosystem.
- Why it matters: For AI builders, it is a compact explanation of how a frontier model company thinks about enabling customers, competing with them, and keeping both on the same underlying platform.
"Most of what we’re building is platform."
Resources that widen curiosity
The Marginalian
- Content type: Personal blog
- Author/creator: Maria Popova
- Link/URL: Not provided in the source post
- Who recommended it: David Perell
- Key takeaway: Perell said reading Popova was an "entry point into intellectual curiosity" and credited her with introducing him to more writers and ideas than almost anyone else.
- Why it matters: It is a strong meta-resource: one good source that can lead readers into many others.
Reflections on the Art of Living: A Joseph Campbell Companion
- Content type: Book
- Author/creator: Joseph Campbell
- Link/URL: https://www.amazon.com/Reflections-Art-Living-Campbell-Companion/dp/0060926171
- Who recommended it: Packy McCormick
- Key takeaway: McCormick said he highly recommends it and pulled from it a set of ideas about following your own path, realizing your potential, and learning to stay composed amid intensity.
- Why it matters: Among today’s more philosophical picks, this was the most actionable: not just inspiration, but a usable standard for how to move through pressure without getting torn apart by it.
"The goal is to live with godlike composure on the full rush of energy, like Dionysus riding the leopard, without being torn to pieces."
Mind the Product
Product School
Merci Grace
Big Ideas
1) Own the workflow, not just the model call
"The startup question is what part of the workflow you can truly own."
If you build on frontier models, you are also exposed to labs’ decisions on capacity, rate limits, pricing, and availability, which can squeeze shallow wrappers. The more durable layer is elsewhere: workflow ownership, proprietary context, data rights, distribution, compliance, and trust. SaaStr’s agent guidance points in the same direction: products need composable developer surfaces and a strong data foundation if they want to stay visible inside agentic workflows. Customers are already comparing agents side-by-side and expecting demonstrable ROI.
- Why it matters: Model access is not a moat by itself. The defensible surface is the workflow, context, and system integration around the model.
- How to apply: Audit what you truly own: context, permissions, memory, review loops, domain skills, and integrations. If your product cannot be called through APIs, MCP servers, or webhooks, it may disappear from agentic workflows.
- Pricing implication: As reliability improves, evaluate pricing around outcomes or success rather than tokens, because customers want certainty more than token accounting.
2) The product is the whole service, not the shiny artifact
“Makers often focus on the shiny object—the product they’re building—and forget about the rest of the journey until they’re almost ready to deliver it to the customer. But customers see it all, experience it all. They’re the ones taking the journey, step-by-step.”
Public-sector PMs make this impossible to ignore. They cannot choose their users; success means serving the fringes as well as the mainstream, starting with the curb cut effect. Demand already exists, so the job is less about generating it and more about lowering barriers to access. That changes everything from device testing—think $20 flip phones and library PCs, not premium hardware—to whether the right answer is building something new at all.
- Why it matters: Many product failures come from improving the visible interface while leaving the surrounding service, policy, or operational funnel broken.
- How to apply: Map the full journey before shipping. Test on the lowest common denominator device set, and ask whether discovery is revealing a need to remove systems rather than add another one.
- Design caution: Technology rarely fixes an upstream policy problem. When a policy challenge is framed as an engineering spec instead of a design problem, teams lose room to iterate with users.
3) AI is reducing information overhead, but increasing the premium on judgment
A recurring PM theme this week was that a large share of the job is managing Slack threads, notes, interviews, docs, dashboards, and tickets rather than making decisions. Several practitioners see AI as useful precisely because it can gather and organize scattered inputs, freeing more time for strategy, alignment, and prioritization. But there is a cost: one data scientist told Lenny that much of the team’s work is now reviewing AI-generated analysis from PMs and engineers, and that the analysis is wrong 50% of the time. The result is broader role confusion about who actually owns what.
- Why it matters: AI makes it easier to produce summaries and analyses, but not easier to know which ones are correct or action-worthy.
- How to apply: Use AI to collect and structure context, then keep humans accountable for prioritization, validation, and final decisions.
Tactical Playbook
1) Build agents from real workflows, then harden them for production
A repeatable pattern emerges from the Vercel examples:
- Shadow the best operator and document the existing workflow before adding AI.
- Convert each data source into explicit workflow steps and tool calls.
- Keep a human in the loop and run in shadow mode until feedback materially drops.
- Invest in developer surface area—APIs, MCP servers, webhooks—so agents can act inside real systems.
- Ground everything in a semantic layer or knowledge base so answers are specific rather than generic.
- Benchmark the competition directly by trying or buying competitor seats; customers are already doing the same.
Why this matters: The notes draw a hard line between a good demo and a durable production agent. Architecture, data quality, and review loops determine whether the product survives scale.
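The shadow-mode step above can be sketched generically. In the snippet below, `agent`, `human`, and the 5% disagreement threshold are assumptions invented for the example, not details from the Vercel write-up: the agent proposes, the human reviews, and autonomy is granted only once overrides become rare.

```python
def shadow_mode(leads, agent, human, threshold=0.05, min_runs=20):
    """Run the agent alongside the human reviewer; report when the
    disagreement rate is low enough to grant autonomy."""
    disagreements = 0
    for i, lead in enumerate(leads, start=1):
        proposed = agent(lead)
        final = human(lead, proposed)   # human may override the agent
        if final != proposed:
            disagreements += 1
        if i >= min_runs and disagreements / i < threshold:
            return True, i              # earned autonomy after i leads
    return False, disagreements
```

With a reviewer who never overrides, the loop grants autonomy as soon as the minimum sample size is reached; with frequent overrides, it keeps the human in the loop indefinitely.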
2) Use role-based AI agents to manage project memory
A non-developer PM described a simple three-agent setup in Cursor: a Scribe that extracts risks, dependencies, action items, decisions, and open questions; an Integrator that files them by project; and a Strategist that rates risks and drafts communications. The system is useful for recalling meeting context or answering a manager quickly without combing through documents.
A practical rule came from the replies: if the strategist labels everything critical, cap the number or percentage of items that can be marked critical, and edit the output yourself.
Why this matters: PM memory systems work best when extraction, filing, and judgment are separated—and when the PM stays the editor.
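To make the separation concrete, here is a toy sketch of the three roles plus the critical-cap rule. In a real setup the functions would be LLM calls; the single hard-coded project and the length-based risk score are stand-ins invented for this example.

```python
def scribe(transcript):
    """Extract raw items; in practice an LLM call over meeting notes."""
    return [{"text": line.strip(), "project": "alpha"}
            for line in transcript.splitlines() if line.strip()]

def integrator(items, projects):
    """File each extracted item under its project."""
    for item in items:
        projects.setdefault(item["project"], []).append(item)
    return projects

def strategist(items, max_critical=2):
    """Rate risks, but cap how many may be labeled critical."""
    rated = sorted(items, key=lambda i: len(i["text"]), reverse=True)
    for rank, item in enumerate(rated):  # toy score: longer text = riskier
        item["severity"] = "critical" if rank < max_critical else "normal"
    return rated

notes = ("Vendor contract unsigned\n"
         "API migration blocked on auth team\n"
         "Ship date slips a week")
items = scribe(notes)
filed = integrator(items, {})
rated = strategist(items, max_critical=1)
```

The cap lives in the Strategist, so even a model that wants to call everything critical can only escalate a bounded number of items; the PM still edits the result.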
3) Treat information management as alignment work
The most useful advice in the information-overload thread was blunt: a PM’s job is creating alignment, prioritizing, and cutting through noise. In practice, that means:
- Pull fragmented information into one place first.
- Decide what is signal versus noise before broadcasting it.
- Use AI to reduce the operational burden of gathering and organizing inputs, not to replace reasoning.
- Coach teams on how to recognize real signal; seniority should raise the quality of filtering, not just the volume of output.
Why this matters: Without a filtering layer, more tooling just creates more equally loud inputs competing for attention.
4) For hard AI update problems, favor validation-driven self-correction over endless patches
Teresa Torres shared a useful engineering lesson from rebuilding AI-generated opportunity solution trees for Vistaly: updating a tree with new interviews was harder than generating one from scratch because tree diffs behave differently from text diffs. Her reported breakthrough was to let the model correct its own mistakes, then wrap the process in an agent loop with validation tools. She also notes that AI-written code should be trusted selectively and verified when needed.
Why this matters: Some AI failures are not missing features; they are the wrong control strategy for the problem shape.
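The agent-loop-with-validation idea can be sketched generically. In the snippet below, `validate` and `model_fix` are placeholders (in practice a tree validator and an LLM call), not Torres's actual implementation: instead of hand-patching each failure, the loop feeds validation errors back to the model until the output passes.

```python
def self_correct(draft, validate, model_fix, max_rounds=3):
    """Generate -> validate -> let the model repair, until clean."""
    for _ in range(max_rounds):
        errors = validate(draft)
        if not errors:
            return draft
        draft = model_fix(draft, errors)  # model fixes its own mistakes
    raise RuntimeError("still failing validation after retries")
```

The control strategy, not the model, carries the guarantee here: a bounded retry loop with an explicit validator either returns a clean artifact or fails loudly.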
Case Studies & Lessons
1) Vercel: a lead qualification agent with measurable operating leverage
Vercel’s lead qualification agent started with about 20% of a single engineer’s time, then used a GTM engineer, data scientist, and domain expert working together to encode best practice into workflows. The team kept a human in the loop for six weeks and ran the system in shadow mode until the expert could no longer materially improve outputs. Reported results: the function dropped from 10 people to 1 in the US, with about 20% of one person covering Europe and APAC; SDR quotas rose 30%; and the agent costs under $5K per year in infrastructure and tokens, a stated 32x ROI.
Takeaway: Start with one narrow, high-volume workflow, learn from a top performer, and earn autonomy in steps.
2) Snowflake: top-down mandate plus bottom-up access
Snowflake’s internal AI adoption combined a CEO-level mandate that the company operate differently with broad access to its Cortex coding agent. Employees could query governed data, build agents, and automate workflows; Baris Gultekin gave the example of account reps creating anomaly detectors for customers. On top of that, Snowflake created function-specific skills for PMs, including PRD writing, code-repo reading, UI prototyping, and routine task automation. The company then extended the same model across finance, marketing, and HR.
Takeaway: Company-wide AI adoption is stronger when leadership sets the expectation and teams also get direct access to tools, data, and role-specific skills.
3) Public-sector product management: sometimes the winning move is to remove systems
Ayushi Roy offered three strong examples of non-standard product thinking:
- In discovery for the Children’s Health Insurance Program, her team found 14 systems serving the same service and proposed removing 10 instead of building a 15th.
- A text-based campus safety hotline built in two-week increments was rolled out to 800,000 students across 13 universities in eight months.
- For IRS Direct File, the team sliced rollout by tax complexity, started with an internal IRS rollout to build agency support, and treated modularity as protection for learning under public scrutiny across a population of 150M+ taxpayers.
A fourth example is the warning case: modernizing a childcare voucher application does not fix a broken funnel when policy still forbids a waitlist; better software can simply push more demand into a system that cannot absorb it.
Takeaway: In complex service environments, product work includes policy, operations, legacy systems, and safe rollout design—not just software delivery.
Career Corner
1) Use AI to sharpen your story before paying for coaching
One PM with about 10 years of experience said they struggled to explain the value of varied experience in FAANG or top-tech applications and were considering structured coaching. The community feedback on Product Career Accelerator was skeptical: commenters described a $12,000 price for accountability, interview prep, and role targeting, along with hard-sell tactics and weak network claims. A more practical tactic came from another commenter: dictate your work history to Claude, let it surface the underlying skills and differentiators, then use that language in interviews.
How to apply: Before paying for structure, test whether AI can help translate your experience into clearer narratives, concrete examples, and stronger interview stories.
2) Career scope often starts with instrumentation and standards
Merci Grace described starting at Slack as a solo PM without engineers or designers, asking for funnel conversion baselines, finding none, and installing Mixpanel herself to begin measuring the onboarding funnel. Nearly three years later, the growth team had expanded to 50 people, owning data governance, metric definitions, and experiments across the funnel from demand generation to paid conversion.
How to apply: If you want to expand your scope, find the missing measurement system or operating standard that the organization depends on but nobody owns yet.
Tools & Resources
1) Panobi and Ignition for cross-functional growth and GTM work
Panobi positions itself as a source of truth for growth teams by integrating warehouses, experiment frameworks, Google Sheets, and CSV data into one place. Ignition is aimed at the gap between product and sales or marketing, combining voice-of-customer and competitive inputs with revenue-prioritized roadmaps and GTM handoffs such as OKRs, briefings, and collateral. If your biggest PM challenge is connecting research, prioritization, and launch execution, these are two tools to watch.
2) Feature-intelligence tools when feedback inboxes become the work
One PM described feeling overwhelmed by Canny-style feedback tools because they make it easier to accumulate feature requests that are often biased toward power users and not obviously decision-useful. In that discussion, two alternatives surfaced: Arkweaver, which combines feature matching or intelligence with ops automation and a revenue-per-feature lens, and unwrap, which focuses more on providing the underlying data than on automation. This is most relevant if call transcripts and other new feedback channels are already flooding your team.
3) A low-tech persona template still beats decorative artifacts
A simple community recommendation on personas was also the most practical: keep them to one page and focus on behaviors and pain points so they remain useful for product and UX decisions over time.
4) Skill-building workshops to watch
Shreyas Doshi said he is launching one-day workshops on Product Taste, Product Strategy, Product Creativity, and Customer Empathy, with the first focused on Advanced Product Taste and practical exercises. Signup link: priority list.
The clearest pattern
Today’s strongest thread is practical AI: companies are making agents easier to ship, while researchers are looking for training-time optimizations that preserve a conventional inference-time model.
Agents move closer to production
OpenAI broadens its real-time voice stack
OpenAI said last week’s audio release included a real-time translation model with 70+ input and 13 output languages, a GPT real-time Whisper model with 80 input languages and latency as low as 200 ms, and GPT Realtime 2, which it described as its most intelligent voice model. GPT Realtime 2 adds a 128k-token context window, parallel tool calls, dynamic voice cloning and tone matching, controllable expressiveness, and stronger domain vocabulary and tool calling; demos showed it operating an e-commerce UI and a product analytics dashboard through tools. In Sierra’s early testing, calls were roughly 30% faster at P50 and up to 200% faster at P90 than its cascaded system, though Sierra emphasized that production use still depends on a separate harness for workflows, guardrails, redaction, and policy control.
Why it matters: The release pushes voice systems further from transcription-and-response toward voice agents that can act inside software and production workflows.
Codex demand is arriving alongside sandboxing
OpenAI said 2,000 developers reached out about Codex in three hours, and Greg Brockman reported strong enterprise excitement about adopting it. Sam Altman also offered companies two months of free Codex usage for the next 30 days if they want to try switching over. Separately, OpenAI published how it built a Windows sandbox for Codex to avoid forcing developers to choose between constant approval prompts and full machine access.
Why it matters: The product signal here is paired with a deployment signal: security boundaries are becoming part of the coding-agent product itself, not an afterthought.
Training-time gains without inference-time changes
Nous reports 2–3x wall-clock gains from Token Superposition Training
Nous Research released Token Superposition Training, a pretraining modification that it says delivers a 2–3x wall-clock speedup at matched FLOPs without changing model architecture, optimizer, tokenizer, or training data. For the first third of training, the model reads and predicts contiguous bags of tokens with averaged input embeddings and a modified output loss, then returns to standard next-token prediction for the rest of the run. Nous said it validated the approach at 270M, 600M, and 3B dense scales, plus a 10B-A1B MoE.
Why it matters: The notable claim is not only the speedup, but that the final inference-time model remains identical to one produced by conventional pretraining.
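The input side of the described scheme is easy to picture: group the token stream into contiguous bags of size k and feed the model the average of each bag's embeddings, shortening the sequence roughly k-fold. The sketch below is an illustration under that reading, not Nous's implementation; the function names, dict-based embedding table, and bag size are all assumptions.

```python
# Illustrative sketch: contiguous bags of tokens with averaged input
# embeddings, as described for the first third of training.

def bag_embeddings(token_ids, embedding_table, bag_size):
    """Average embeddings over contiguous bags of `bag_size` tokens."""
    dim = len(next(iter(embedding_table.values())))
    # Split the sequence into contiguous bags (last bag may be shorter).
    bags = [token_ids[i:i + bag_size] for i in range(0, len(token_ids), bag_size)]
    averaged = []
    for bag in bags:
        vec = [0.0] * dim
        for t in bag:
            for j, x in enumerate(embedding_table[t]):
                vec[j] += x
        averaged.append([x / len(bag) for x in vec])
    return averaged

# Example: 6 tokens with bag size 3 yield 2 averaged input vectors,
# so the transformer sees a 2-position sequence instead of 6.
table = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
inputs = bag_embeddings([0, 1, 2, 0, 1, 2], table, bag_size=3)
```

The reported wall-clock gain would come from the shorter sequence; the modified output loss (predicting bags rather than single tokens) is the other half of the method and is not shown here.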
Lighthouse Attention aims to speed long-context training, then disappear
Lighthouse Attention wraps standard SDPA with a hierarchical, gradient-free selection layer that compresses and decompresses queries, keys, and values while preserving left-to-right causality. The method can be removed near the end of training through a short recovery phase, and preliminary LLM experiments reported faster total training time and lower final loss than full-attention baselines. Sebastian Raschka highlighted this as a relatively low-commitment attention modification because teams can switch back to vanilla attention near the end and recover roughly the same modeling performance.
Why it matters: Like TST, this is a training-time efficiency idea aimed at avoiding deployment-time architectural cost.
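The general select-then-attend idea can be illustrated with a toy single-query example: a gradient-free scoring step keeps only the top-k keys, and ordinary softmax attention runs on that subset. This is a generic sketch of that family of techniques, not Lighthouse Attention's actual hierarchical compression; the scoring rule, dimensions, and names are assumptions.

```python
import math

# Toy select-then-attend: rank keys without gradients, keep top_k,
# then run standard scaled softmax attention over the survivors.
# (A causal variant would additionally restrict keys to positions
# at or before the query's position.)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, keys, values, top_k):
    scores = [dot(query, k) for k in keys]
    # Gradient-free selection: indices of the top_k highest raw scores.
    keep = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:top_k]
    keep.sort()  # preserve left-to-right order among the survivors
    # Standard attention on the compressed set.
    sub = [scores[i] / math.sqrt(len(query)) for i in keep]
    m = max(sub)
    weights = [math.exp(s - m) for s in sub]
    z = sum(weights)
    out = [0.0] * len(values[0])
    for w, i in zip(weights, keep):
        for j in range(len(out)):
            out[j] += (w / z) * values[i][j]
    return out
```

The training-time saving in such schemes comes from attending over the compressed set instead of all positions; because the selection carries no parameters, dropping it near the end of training leaves a model that runs plain full attention.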
AutoScientist turns the training research loop into a product
Adaption launched AutoScientist to automate the full research loop for model training, arguing that most model training fails outside frontier labs and that even inside them, many training choices are still a matter of taste. Sara Hooker said AutoScientist beat hand-engineered configs from Adaption’s research staff across verticals, model types, and dataset sizes, with consistent results and more predictable performance. She also framed the result as important for AI progress outside a small number of proprietary labs.
Why it matters: If the research loop around training becomes automatable, some model-development advantage could shift from tacit tuning intuition toward searchable, repeatable systems.
The infrastructure conversation is widening
NVIDIA and David Silver’s Ineffable Intelligence are building for large-scale RL
NVIDIA and Ineffable Intelligence, the London lab founded by David Silver, said they are collaborating to build infrastructure for large-scale reinforcement learning. NVIDIA said RL workloads generate data on the fly in tight act-observe-score-update loops that stress interconnect, memory bandwidth, and serving differently from pretraining, and that the work starts on Grace Blackwell while exploring Vera Rubin.
“The next frontier of AI is superlearners — systems that learn continuously from experience.”
Why it matters: This is a concrete industry bet that scaling RL will require its own hardware-software pipeline, not just a larger version of pretraining infrastructure.
Mistral makes Europe’s compute challenge unusually explicit
In testimony to the French National Assembly, Arthur Mensch said Mistral now has 1,000 employees, a €12B valuation, and a €1B revenue target by year-end. He warned that the decisive window is the next two years because supply could be locked up before Europe builds enough capacity, and projected AI demand on the order of 1 kW per person within five years, implying roughly 40 GW for France and 400 GW for Europe.
Why it matters: The European AI debate is being framed less as a pure model contest and more as an energy, capital, and industrial-capacity problem.