Hours of research in one daily brief, on your terms.
Tell us what you need to stay on top of. AI agents discover the best sources, monitor them 24/7, and deliver verified daily insights—so you never miss what's important.
Recent briefs
Your time, back.
An AI curator that monitors the web nonstop, lets you control every source and setting, and delivers one verified daily brief.
Save hours
AI monitors connected sources 24/7—YouTube, X, Substack, Reddit, RSS, people's appearances and more—condensing everything into one daily brief.
Full control over the agent
Add/remove sources. Set your agent's focus and style. Auto-embed clips from full episodes and videos. Control exactly how briefs are built.
Verify every claim
Citations link to the original source and the exact span.
Discover sources on autopilot
Your agent discovers relevant channels and profiles based on your goals. You get to decide what to keep.
Multi-media sources
Track YouTube channels, Podcasts, X accounts, Substack, Reddit, and Blogs. Plus, follow people across platforms to catch their appearances.
Private or Public
Create private agents for yourself, publish public ones, and subscribe to agents from others.
Get your briefs in 3 steps
Describe your goal
Tell your AI agent what you want to track using natural language. Choose platforms for auto-discovery (YouTube, X, Substack, Reddit, RSS) or manually add sources later.
Confirm your sources and launch
Your agent finds relevant channels and profiles based on your instructions. Review suggestions, keep what fits, remove what doesn't, add your own. Launch when ready—you can always adjust sources anytime.
Sam Altman
3Blue1Brown
Paul Graham
The Pragmatic Engineer
r/MachineLearning
Naval Ravikant
AI High Signal
Stratechery
Receive verified daily briefs
Get concise, daily updates with precise citations directly in your inbox. You control the focus, style, and length.
Greg Brockman
Cursor
Riley Brown
🔥 TOP SIGNAL
Peter Steinberger says he’s joining OpenAI “to bring agents to everyone,” while OpenClaw becomes a foundation: “open, independent, and just getting started.” At the same time, OpenClaw keeps shipping: a new beta focuses on security/bug fixes and adds Telegram message streaming.
🛠️ TOOLS & MODELS
- OpenClaw (beta) — Telegram message streaming: a new beta is up; update by asking your agent or running openclaw update --channel beta.
- ClawHub — quality-of-life shipping: “avatars and full names for more trust,” “better cli,” “faster skill loading,” and k/M download counters.
- WebMCP (Chrome 146 beta) — “MCP 2.0”-style dynamic tool loading
  - Jason Zhou says WebMCP contextually/dynamically loads MCP tools as the agent navigates pages, “without blow[ing] up context.”
  - Claim: in Chrome 146, a new API lets websites communicate actions to any agent on the page.
  - Implementation options (a hedged sketch of both follows this list):
    - Add HTML attributes (toolName, toolDescription, toolParamDescription) to forms to turn them into MCP tools.
    - Use navigator.registerTool / navigator.unregisterTool for React components.
  - Setup links (from the thread):
    - Chrome beta: https://www.google.com/intl/en_au/chrome/beta/
    - Tool inspector extension: https://chromewebstore.google.com/detail/model-context-tool-inspec/gbpdfapgefenggkahomfgkhfehlcenpd?pli=1
- Cursor — Composer 1.5
  - Cursor says “Composer 1.5 is now available” and aims to balance intelligence + speed.
  - Terminal Bench 2.0 scores were added to their blog post; they report performance “better than Sonnet!”
  - Pricing discussion: one user notes it’s more expensive with the same context length, while Aman Sanger argues list price doesn’t tell the whole story and says it’s “on net cheaper” for users with higher limits.
- Codex vs Opus — task-dependent reliability (practitioner reports disagree)
  - Greg Brockman: “codex is so good at the toil” (merge conflicts, getting CI green, rewrites).
  - Theo: Codex “absolutely bombed” a big migration and “couldn’t even make the code compile,” while “Opus one shot it.”
  - Theo separately: Opus 4.6 required repeated reminders about reading env vars and needing a package.json, calling it “borderline unusable.”
- Codex web UX — acknowledged gap: a user says Codex web hits “weird states” and “flows don’t make much sense”; OpenAI’s Alexander Embiricos replies that web “hasn’t seen much love” vs CLI/IDE extension/app growth, but says focus will return to web “soon.”
- Codex + Cursor agent mode combo (quick take): Geoffrey Huntley says “codex-5.3-high + cursor’s new /agent mode is pretty good.”
- DeepSeek v4 (watchlist): swyx says “DeepSeek v4 next week” may change his stance, and shares a claim of global SOTAs on SWE-bench/HLE/FrontierMath and “saturated AIME 2026.”
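Neither WebMCP option above is a verified, shipped API; the names come from the thread’s description. A minimal TypeScript sketch of both, treating the attribute names and the navigator.registerTool shape as assumptions:

```ts
// Sketch only: attribute names and navigator.registerTool/unregisterTool are
// taken from the thread's description, not from a documented Chrome API.

// Option 1: annotate an existing form so it can surface as an MCP tool.
const form = document.querySelector<HTMLFormElement>("#search-form");
if (form) {
  form.setAttribute("toolName", "search_products");
  form.setAttribute("toolDescription", "Search the product catalog");
  form.querySelector("input[name=q]")
    ?.setAttribute("toolParamDescription", "Free-text search query");
}

// Option 2: register/unregister a tool in step with a component's lifecycle.
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;
const nav = navigator as Navigator & {
  registerTool?: (name: string, description: string, handler: ToolHandler) => void;
  unregisterTool?: (name: string) => void;
};

// e.g., call from a React useEffect: register on mount, clean up on unmount.
function mountSearchTool(): () => void {
  nav.registerTool?.("search_products", "Search the product catalog",
    async ({ query }) =>
      (await fetch(`/api/search?q=${encodeURIComponent(String(query))}`)).json());
  return () => nav.unregisterTool?.("search_products");
}
```

The lifecycle pairing is where the claimed context benefit comes from: a tool exists only while the page state that can service it is live.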
💡 WORKFLOWS & TRICKS
- Turn agents into CI janitors (high-leverage “toil” loop)
  - Target the tasks Brockman lists explicitly: have the agent fix merge conflicts, get CI to green, and rewrite between languages.
  - Practical implication: treat “make CI green” as the completion criterion, not “agent produced code” (a minimal loop sketch follows this list).
- Agent PR/issue triage (what maintainers actually need next)
  - Peter Steinberger wants AI that scans every PR/issue to de-dupe, identifies which PR is “the best based on various signals,” and can assist with rejecting changes that stray from a “vision document.”
- Context management pattern: load tools per page instead of stuffing context
  - WebMCP’s core idea (per Jason Zhou): tools load dynamically as agents navigate, avoiding context bloat.
  - Two concrete ways to “agent-enable” a UI:
    - Add tool metadata attributes directly to forms.
    - Bind tools to React components with navigator.registerTool / navigator.unregisterTool.
- Chat-to-integration instead of node graphs: Riley Brown describes OpenClaw letting you set up integrations by chatting (e.g., “connect to notion” → provide token → it controls Notion). He contrasts this with manual node configuration in n8n.
- Build a new skill on-demand (repeatable “research → implement → test → remember” loop)
  - Brown’s example: he asks the agent to “use the diagram TLDRAW tool to explain all of your key files and skills.”
  - The agent “searched the Internet,” created the capability, and produced output in ~90 seconds (vs. ~20 minutes manual).
- Sustainable pacing (anti-burnout guardrail): Steve Yegge reports the cognitive burden is real—he’s only comfortable working at that pace for short bursts, and calls “four hours of agent work a day” a more realistic pace.
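One way to make the “CI green” completion criterion concrete is a loop that only succeeds on a passing suite. A minimal sketch, assuming a generic agent invocation (the askAgentToFix callback is a hypothetical stand-in, not any specific product’s CLI or API):

```ts
import { execSync } from "node:child_process";

// "Done" means CI is green, not "the agent produced code": keep feeding the
// failure log back to the agent until tests pass or the retry budget runs out.
async function driveToGreen(
  askAgentToFix: (failureLog: string) => Promise<void>, // hypothetical stand-in
  maxAttempts = 5,
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      execSync("npm test", { stdio: "pipe" }); // or your CI entrypoint
      return true;                             // green is the only success exit
    } catch (err) {
      const e = err as { stdout?: Buffer; stderr?: Buffer };
      await askAgentToFix(`${e.stdout ?? ""}${e.stderr ?? ""}`);
    }
  }
  return false; // still red after the budget; escalate to a human
}
```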
👤 PEOPLE TO WATCH
- Peter Steinberger (@steipete) — shipping fast on OpenClaw (foundation transition + new beta features like Telegram streaming).
- Jason Zhou (@jasonzhou1993) — concrete WebMCP implementation details (HTML attributes, navigator.registerTool) and the “dynamic tool loading” framing.
- Theo (@theo) — valuable because he posts specific failure cases (Codex migration failure vs Opus success; Opus 4.6 tool/env friction).
- Greg Brockman (@gdb) — crisp framing of where coding agents deliver immediate ROI: dev toil + CI-green loops.
- Alexander Embiricos (@embirico) — unusually direct acknowledgement of Codex web UX issues + a stated re-focus plan.
- Steve Yegge (via Simon Willison) — the most actionable contrarian warning right now: agentic productivity has a measurable fatigue ceiling.
🎬 WATCH & LISTEN
1) OpenClaw creates a diagramming skill on the fly (≈ 5:15–7:27)
Hook: Riley Brown walks through asking his agent to generate a TLDraw/Excalidraw-style diagram explaining its own files/skills; the agent researches, creates the skill, and returns something usable in ~90 seconds.
2) “Swarm of narrow agents” plan (≈ 9:11–11:44)
Hook: Brown describes switching from one general agent to 10–12 narrow agents (newsletter/email, hiring, competitor analysis, etc.) and having them share a notebook/context so he can delegate quickly.
📊 PROJECTS & REPOS
- OpenClaw — foundation + “bring agents to everyone” announcement (Steinberger): https://steipete.me/posts/2026/openclaw
- OpenClaw velocity signal (maintainer perspective): Steinberger says PRs are growing at an “impossible rate,” citing a jump from ~2700 to 3100+ commits (including “like 600 commits” in a day).
- Showboat + Rodney (Simon Willison) — tools “so agents can demo what they’ve built” (post: https://simonwillison.net/2026/Feb/10/showboat-and-rodney/).
Editorial take: Today’s theme is repo operations becoming the next frontier: beyond codegen, practitioners are pulling agents into PR triage, CI repair loops, and context/tool plumbing.
Andrej Karpathy
Jimmy Lin
Yupp
Top Stories
1) OpenAI makes a major push into personal agents; OpenClaw moves into an independent foundation
Why it matters: This is a clear strategic bet that multi-agent systems and consumer-facing personal agents will become a core product surface—and that open source will be part of the ecosystem.
- OpenAI CEO Sam Altman said Peter Steinberger is joining OpenAI to drive the “next generation of personal agents,” centered on “very smart agents interacting with each other to do very useful things for people,” which OpenAI expects to become core to product offerings.
- Altman also said OpenClaw will live in a foundation as an open source project OpenAI will continue to support, emphasizing an “extremely multi-agent” future and the importance of supporting open source.
- Steinberger described the move as:
“I’m joining OpenAI to bring agents to everyone. OpenClaw is becoming a foundation: open, independent, and just getting started.”
- Practical ecosystem signal: OpenClaw’s maintainer reported PR volume rising from ~2700 to 3100+ with 600 commits in a day, and asked for AI tooling to dedupe/review/select among near-duplicate PRs and issues.
2) U.S. Pentagon–Anthropic standoff intensifies over restrictions on military use of Claude
Why it matters: This is a high-profile test of how AI labs’ usage restrictions interact with defense procurement—and how “safety stance” can become a contract risk.
- Multiple reports say the Pentagon is considering cutting ties with Anthropic after Anthropic refused to allow its models to be used for “all lawful purposes,” insisting on bans around mass domestic surveillance and fully autonomous weapons.
- One thread frames the contract at risk as a $200M deal, with tensions escalating after a disputed episode involving Claude in a military operation.
- A separate claim quotes a senior “DeptofWar” official describing Anthropic as a supply chain risk and suggesting vendors/contractors might be asked to certify they don’t use Anthropic models.
3) China’s Chinese New Year model-release window: Alibaba says Qwen 3.5 will be open-sourced “tonight”
Why it matters: Open-sourcing competitive models during peak attention windows can accelerate adoption—especially where cost/access drive default stacks.
- A report claims Alibaba will open-source Qwen 3.5 on Chinese New Year’s Eve (tonight), citing “comprehensive innovations in architecture” and expectations of a milestone for domestic models. It also notes Alibaba released Qwen2.5-Max on the same occasion last year.
- Commentary separately praised Qwen3-max as a stronger reasoner than Seed 2.0 Pro when given high-effort problems.
4) xAI’s Grok 4.20 is claimed to ship “next week,” alongside a “Galileo test” framing for truth-seeking
Why it matters: A near-term major model revision plus an explicit “truth despite training-data falsehoods” goal signals how xAI is positioning Grok competitively.
- Elon Musk said “Grok 4.20 is finally out next week” and will be a “significant improvement” over 4.1.
- Musk also proposed a “Galileo” test for AI: even if training data repeats falsehoods, the system must still “see the truth.”
5) Long-form AI video claims escalate (Seedance 3.0), but practitioners argue the likely path is agentic composition
Why it matters: If long-form, controllable video becomes cheap, it changes creator economics—but technical feasibility and framing matter.
- A report claims Seedance 3.0 entered a closed sprint phase and can generate 10+ minute videos in a single pass (internal tests up to 18 minutes) using a “narrative memory chain” architecture, plus multilingual emotional lip-sync dubbing and storyboard-level controls; it also claims per-minute cost down to ~1/8 of Seedance 2.0 via distillation and inference optimization.
- Separately, an expert cautioned against interpreting “one-shot feature film inference” as supported by published research, citing quadratic scaling and arguing long-form video is more plausibly delivered via agents decomposing a prompt into scenes and stitching many short generations.
Research & Innovation
Why it matters: The most leverage this cycle comes from (1) training small models to sustain very long reasoning, (2) distillation methods that remove tool calls, and (3) infrastructure/benchmarks for agents and long-horizon tasks.
QED-Nano: pushing a 4B model to “millions of tokens” of theorem-proving reasoning
- Researchers report training a 4B model to reason for millions of tokens through IMO-level problems.
- The pipeline includes distillation SFT (from DeepSeek-Math-V2), RL with rubrics as rewards, and a reasoning cache that summarizes chain-of-thought per turn to extrapolate to long horizons without derailing autoregressive decoding.
- At inference, they describe agentic scaffolds that scale test-time compute, including Recursive Self-Aggregation (RSA), with claims that generating >2M tokens per proof can let the 4B model match Gemini 3 Pro on IMO-ProofBench.
- They open-sourced datasets, rubrics, and models: https://huggingface.co/collections/lm-provers/qed-nano and blog: https://huggingface.co/spaces/lm-provers/qed-nano-blogpost.
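The “reasoning cache” idea is the most transferable piece: carry forward a rolling summary of each turn’s chain-of-thought instead of the raw tokens. A sketch of that general shape (the model calls are hypothetical stand-ins; this is not the released QED-Nano pipeline):

```ts
// Each turn sees the problem plus compact summaries of prior turns, so context
// stays bounded even when total reasoning runs to millions of tokens.
type ModelCall = (prompt: string) => Promise<string>;

async function longHorizonReason(
  problem: string,
  generateTurn: ModelCall, // hypothetical: produce this turn's chain-of-thought
  summarize: ModelCall,    // hypothetical: compress a turn into a short summary
  maxTurns: number,
): Promise<string[]> {
  const cache: string[] = []; // per-turn summaries, not raw chain-of-thought
  const turns: string[] = [];
  for (let t = 0; t < maxTurns; t++) {
    const context = `Problem:\n${problem}\n\nProgress so far:\n${cache.join("\n")}`;
    const cot = await generateTurn(context);
    turns.push(cot);
    cache.push(await summarize(`Summarize this step briefly:\n${cot}`));
  }
  return turns;
}
```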
“Zooming without zooming” for vision-language models via Region-to-Image Distillation
- Region-to-Image Distillation (R2I) trains MLLMs to internalize “zooming,” targeting fine-grained perception without zoom/tool calls; the ZwZ-8B model is claimed SOTA on fine-grained perception with zero tool calls.
- Released artifacts: paper https://huggingface.co/papers/2602.11858, code https://github.com/inclusionAI/Zooming-without-Zooming, model/data https://huggingface.co/collections/inclusionAI/zooming-without-zooming.
Training efficiency and “minimal GPT” work continues to influence practice
- Andrej Karpathy released a project implementing GPT training and inference in 243 lines of dependency-free Python, described as the “full algorithmic content” (everything else for efficiency) with code at https://gist.github.com/karpathy/8627fe009c40f57531cb18360106ce95.
- A separate thread highlighted a “recipe” reducing GPT‑2 1.5B training cost from $43,000 to $73, noting the reduction also depends on better hardware/data/optimizers/training, with architectural notes and discussion at https://github.com/karpathy/nanochat/discussions/481.
Agent training environments and delegation protocols
- Snowflake released an “Agent World Model” with 1,000 synthetic code-driven environments for agentic RL, aiming for reliable state transitions and stable learning signals; it claims scaling to 35K tools and 10K tasks with real SQLite databases.
- Google DeepMind research introduced a framework for “intelligent AI delegation,” covering authority/responsibility/accountability, role specification, and trust mechanisms; it argues missing delegation protocols could introduce significant societal risks as agents participate in delegation networks and virtual economies (paper: https://arxiv.org/abs/2602.11865).
Products & Launches
Why it matters: Capability only becomes durable advantage when it lands in usable packages (latency, pricing plans, integrations, reliability, and agent-friendly workflows).
MiniMax M2.5 distribution expands (plus a “HighSpeed” SKU)
- MiniMax launched MiniMax-M2.5-HighSpeed, advertising 100 TPS inference (3× faster than similar models) and support for API integration and coding workflows.
- Together AI announced MiniMax M2.5 availability for production-scale agentic workflows, highlighting (among other claims) 80.2% SWE-Bench Verified, office-document deliverables, and “production-ready” infrastructure with a 99.9% SLA (model page: https://www.together.ai/models/minimax-m2-5).
- A separate PSA says MiniMax M2.5 is freely available on “opencode.”
Kimi Claw: OpenClaw integrated into kimi.com as a browser-based workspace
- Kimi launched Kimi Claw, describing OpenClaw as “native to kimi.com,” online 24/7 in the browser.
- Features include 5,000+ community skills (ClawHub), 40GB cloud storage, and “pro-grade search” fetching live data (e.g., Yahoo Finance), plus third-party OpenClaw connectivity and app bridging (e.g., Telegram).
- Beta access is advertised at https://www.kimi.com/bot.
Open-source agent harnesses and self-hosted assistants
- A developer open-sourced a harness used for a fully autonomous Pokémon FireRed playthrough, describing an agent that sees the screen, reads RAM state, maintains long-term memory, sets objectives, pathfinds, battles, and solves puzzles; they argue a universal harness is needed for fair cross-model comparisons.
- “Ciana Parrot” was shared as a self-hosted AI assistant with multi-channel support, scheduled tasks, and extensible skills: https://github.com/emanueleielo/ciana-parrot.
OCR/document extraction tooling
- LlamaCloud’s “Extract” capability was demonstrated extracting structured JSON from PDFs (OpenAI tax filings), powered by the LlamaParse OCR engine and claimed to reconstruct complex form PDFs into markdown tables with ~100% accuracy (try: https://cloud.llamaindex.ai/).
Industry Moves
Why it matters: Talent moves, distribution, and developer workflow adoption are shaping which agent stacks become defaults.
OpenAI: agents + Codex momentum
- OpenAI leadership and teammates publicly welcomed Peter Steinberger and tied the hire to both “the future of agents” and improving Codex.
- Sam Altman said Codex weekly users have “more than tripled since the beginning of the year.”
Anthropic: strong product traction, but increasing external friction
- One post claimed Claude Code recently passed a $2.5B revenue run rate.
- A separate leak-watching thread said Anthropic is preparing an in-app banner codenamed “Try Parsley,” similar to “Try Cilantro” (which preceded Opus 4.6).
AI-native development: shrinking cycle times
- Axios shared that a similar engineering project went from 3 weeks to 37 minutes using AI-based “agent teams,” with claims of output doubling month-over-month and “dramatically fewer people” (source: https://www.axios.com/2026/02/15/ai-coding-tech-product-development).
- Spotify CEO Gustav Soderstrom reportedly said the company’s top developers haven’t written a single line of code manually this year and are “all in” on AI-assisted development.
Funding
- Simile raised $100M to build AI simulations modeled on real people to predict customer decisions.
Policy & Regulation
Why it matters: As agents get more autonomy and access to sensitive environments, governance questions are shifting from abstract principles to procurement rules, provenance, and transparency norms.
Defense procurement pressure on model usage restrictions
- The Pentagon–Anthropic standoff centers on the Pentagon seeking broad usage (“all lawful purposes”) versus Anthropic’s restrictions on mass domestic surveillance and fully autonomous weapons.
- A claimed DoW sourcing concern suggests downstream vendor compliance requirements could be used as leverage (“certify they don’t use any Anthropic models”).
Provenance and authenticity: “watermark real images”
- A researcher argued watermarking should shift toward real, camera-captured imagery rather than generated content.
Transparency artifacts as “best practice” in AI-assisted math
- DeepMind’s Aletheia work shared a Human–AI interaction card, full transcripts (https://github.com/google-deepmind/superhuman/tree/main/aletheia), and a novelty-autonomy label (paper: https://arxiv.org/abs/2602.10177).
- A commentator called transcript sharing “best practice” and expressed hope OpenAI would follow suit.
Quick Takes
Why it matters: These are smaller signals, but they often become the building blocks (or the warning signs) for the next wave.
- Seed 2.0 eval notes: A post said Seed 2.0 tops Chinese aggregate evals as the strongest Chinese model, with a median score above Gemini 3 Pro (but lower max), described as slow with lots of reasoning and priced ~Kimi.
- Grok image model distribution: “Grok Imagine Image Pro” went live on Yupp.
- Yupp leaderboard note: GLM 5 was described as the best open-weight model on Yupp (speed control) based on 6K+ votes.
- “Peak intelligence” and “intelligence-per-watt” both rising: A post highlighted both trends and argued IPW is accelerating, complicating 2–5 year forecasting.
- FireRed-Image-Edit-1.0: Released as an Apache-2.0-licensed image editing model with local deployment and claims of strong GEdit benchmark performance; links include https://github.com/FireRedTeam/FireRed-Image-Edit and ModelScope pages.
- Dots OCR update: RedNote Hi Lab updated “Dots OCR” and shared a Hugging Face collection: https://huggingface.co/collections/rednote-hilab/dotsocr-15.
- Agent safety footgun: One warning described agents running pkill as “Russian Roulette.”
- Benchmark integrity: A lab member stated a tweet “falsely claims” FrontierMath scores for DeepSeek v4 and said they have not evaluated DeepSeek v4. Another comment argued benchmarks should be open source to be trusted.
Greg Brockman
Elon Musk
sarah guo
OpenAI puts more weight behind agents (and open source)
OpenAI hires Peter Steinberger to drive “personal agents”
Sam Altman said Peter Steinberger is joining OpenAI to “drive the next generation of personal agents,” describing a future where “very smart agents [interact] with each other to do very useful things for people,” and adding that this work is expected to become “core to our product offerings.”
Why it matters: This is a direct signal that OpenAI is treating multi-agent, consumer-facing “personal agents” as a near-term product priority—not just a research direction.
OpenClaw transitions into an independent foundation, with OpenAI support
Altman said OpenClaw will “live in a foundation as an open source project” that OpenAI will continue to support, tying the move to an “extremely multi-agent” future and the importance of supporting open source. Steinberger separately confirmed he’s joining OpenAI and that OpenClaw is “becoming a foundation: open, independent.”
A separate post citing reporting said OpenAI was in advanced talks to hire the OpenClaw founder and team, alongside discussions about setting up a foundation to run the existing open source project.
Why it matters: The combination—talent joining OpenAI and the project moving into a foundation—positions OpenClaw as an open, independent surface area that OpenAI still explicitly intends to support.
Coding agents: accelerating adoption, plus real-world limits
Codex usage continues to climb; leadership highlights “toil” wins
Altman said Codex weekly users have “more than tripled since the beginning of the year.” Greg Brockman emphasized Codex’s strength in day-to-day developer toil—“fixing merge conflicts, getting CI to green, rewriting between languages”—and said it raises the ambition of what he even considers building.
Why it matters: Adoption growth plus repeated emphasis on “toil” suggests coding agents are winning on reliability and leverage in narrow-but-frequent tasks, not just flashy demos.
A practitioner’s critique: syntax is easy; runtime semantics are still hard
Martin Casado argued AI coding tools are “very good” at syntax-derived work (tooling, testing, basic engine design, frameworks), but “not good” where runtime understanding matters, citing attempts at a splat renderer and a multiplayer backend where results were “basically unusable” due to lacking runtime semantics. He described a “dilemma” in which getting better at syntax can widen the disconnection from the runtime-semantic design work humans still need to do, and said he’s tried feeding schema designs, state-consistency notes, and runtime traces to pull semantic dependencies out of the code.
Why it matters: This frames a practical boundary for today’s coding agents: they can accelerate scaffolding and cleanup, but still stumble when correctness depends on rich, evolving execution context.
Language adoption debate: could agents favor lower-level languages?
Michael Freedman suggested a re-emergence of lower-level languages like C or Go as agents reduce the advantage of higher-level languages optimized for human productivity. He noted a counterpressure: when humans are still reviewing code, teams may optimize for readability. But he also argued agents can be “tireless” at running static analysis and type checkers, and may already handle memory safety relatively well. A key failure mode, he said, is semantic underspecification and inconsistent decision-making across a system—issues that higher-level languages (or Rust alone) don’t automatically solve.
Why it matters: If this holds, “agent-first” software practices could shift language decisions toward performance and toolchain-verifiability, while leaving semantics/context management as the main bottleneck.
Open-source model competition and the benchmark trust problem
DeepSeek v4 performance claims spark renewed attention
A post shared by swyx relayed that DeepSeek v4 is “reporting global SOTAs” on SWE-bench, HLE, and FrontierMath, and “saturated AIME 2026.” In a separate post, swyx said he’d been cynical about open-source AI for years, but described DeepSeek v4 (expected “next week”) as a likely moment he changes his stance, referencing rapid information leakage and many other teams lining up to release (“the stage is set for Whalefall”).
Why it matters: Even before independent verification, the reaction from a prominent commentator highlights how quickly the open-source vs. closed frontier narrative can swing on credible-seeming benchmark reports.
“Working_time” in METR TH1.1 highlights eval cost/efficiency gaps (with scaffold caveats)
A Reddit analysis of METR’s Time Horizon benchmark (TH1 / TH1.1) noted it estimates how long a task (in human-expert minutes) a model can complete with 50% reliability. The post focuses on TH1.1’s working_time (total wall-clock seconds spent across the suite, including failures) as a runtime-consumption signal.
It reported: GPT-5.2 at ~142.4 hours working_time with a 394 min p50 horizon versus Claude Opus 4.5 at ~5.5 hours working_time with a 320 min p50 horizon—roughly 26× more runtime for a ~23% higher horizon. The author cautioned that scaffolds differ across models (e.g., different tool-calling styles and retry behavior), so working_time isn’t a clean apples-to-apples efficiency metric.
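For the record, the quoted ratios are plain arithmetic on those reported numbers:

```ts
// Values as reported in the METR TH1.1 analysis above.
const gpt52 = { workingHours: 142.4, p50HorizonMin: 394 };
const opus45 = { workingHours: 5.5, p50HorizonMin: 320 };

console.log((gpt52.workingHours / opus45.workingHours).toFixed(1));               // "25.9" ≈ 26x runtime
console.log((100 * (gpt52.p50HorizonMin / opus45.p50HorizonMin - 1)).toFixed(0)); // "23" (% higher horizon)
```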
Resources: https://metr.org/blog/2026-1-29-time-horizon-1-1/ and raw YAML https://metr.org/assets/benchmark_results_1_1.yaml.
Why it matters: As agents become more tool-driven, benchmark leaders may also be judged on how much runtime (and operational complexity) they consume—not just their top-line score.
Socher on transparency: “If your benchmark isn’t open source it’s likely bogus.”
Richard Socher argued that benchmarks that aren’t open source are “likely bogus.”
Why it matters: This is a blunt push toward reproducibility as benchmark claims proliferate—especially when screenshots and secondhand reports move faster than details.
Research: Nvidia open-sources a NeRF upgrade that corrects camera “messiness”
PPISP corrects per-frame camera distortions to reduce artifacts
Two Minute Papers covered an Nvidia technique (PPISP) that corrects per-frame camera effects—exposure offset, white balance, vignetting, and camera response curve—using a color correction matrix, with the goal of eliminating visual artifacts (“floaters”) from lighting variations and enabling cleaner reconstructions. The video describes applications like training self-driving cars in virtual worlds, movies, and video games. It also notes the team released the work “for free.”
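Schematically, those per-frame corrections compose into a single imaging model (notation ours, purely illustrative; the paper’s exact formulation may differ):

```latex
% Per-pixel color c at position x: V = vignetting falloff, g = exposure term,
% W = white-balance diagonal, M = 3x3 color correction matrix, f = response curve.
\[
  c_{\text{out}}(x) = f\big( M \, W \, g \, V(x) \, c_{\text{raw}}(x) \big)
\]
```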
Limitations noted: The method ignores spatially adaptive effects like the local tone mapping used by modern smartphone cameras, which can violate the technique’s global assumptions.
Why it matters: This is a concrete example of “making the camera model more realistic” as a path to more stable 3D/scene reconstructions—paired with an explicit limitation that matters for real-world capture pipelines.
Industry & strategy signals: pricing, capital, and constraints
Casado on frontier lab economics: training-run timing and market disconnect
Martin Casado argued the economics for frontier LLM labs can look good if you account for the previous training run but “terrible” if you account for the current training run—while noting the current run isn’t in COGS, even though models may only have “3–6 months of relevancy.” He suggested the system resolves either by capital continuing to chase growth one training run ahead, or by the market rationalizing the disconnect.
Separately, he pushed back on claims that inference isn’t profitable, saying inference is “clearly a profitable activity today,” pointing to inference-focused companies and GPU pricing as evidence.
Why it matters: This frames a structural tension: short model lifecycles can pressure financing and accounting narratives even if inference margins are strong in isolation.
A “Cournot” framing: frontier pricing today vs. “Final Models” and specialization later
In a thread Casado endorsed, the “top 3 frontier models” were described as being “basically in Cournot,” where labs choose supply and the market “more or less” chooses price—illustrated as ~$200/month for more frontier, faster intelligence—because the market appears to care primarily about the frontier right now. The same thread suggested the dynamic is enabled by cheap capital, and that if capital dries up and markets recognize “Final Models,” competition could broaden across intelligence levels per application, with apps optimizing their own COGS. It also argued that as intelligence becomes more differentiated and specialized, competition could shift toward specialized, low-margin open-source options.
Why it matters: This is a clear hypothesis for how today’s “frontier-only” pricing regime could evolve into application-specific competition—especially if open source becomes the default for many specialized needs.
A few sharp frames worth carrying into the week
- Chollet’s analogy: He said the internet is the best short-term comparison for AI: “a bubble that popped,” real underlying tech, lots of “psychosis and slop” alongside genuinely cool stuff; he added that long-term, past references may become less useful.
- Guo’s “AI Native” vs “AI Naive”: She contrasted using agents to try to solve the problem vs. using agents to fix the “missing data and scattered context that make the problem hard.”
- Musk’s “Galileo test” (aspirational bar): He proposed that AI should pass a “Galileo” test: seeing the truth even if almost all training data repeats falsehoods.
martin_casado
Oukham
Elon Musk
Most compelling recommendation: AI coding changes the language tradeoff (performance vs. human ergonomics)
- Title: X thread on how AI coding may impact programming language adoption
- Content type: X thread
- Author/creator: @michaelfreedman
- Link/URL: https://x.com/michaelfreedman/status/2023172250984165734
- Recommended by: Martin Casado
- Key takeaway (as shared):
- Freedman’s take: AI agents may drive a rise/re-emergence of lower-level languages (e.g., C or Go), because higher-level languages’ main advantage—making it easier for humans to write correct code quickly—“kind of/mostly goes away for agents,” making the performance tradeoff feel less worth it.
- On “why not Rust?”: he frames agent errors as less about memory safety and more about semantics and underspecified intent; static analysis/type checkers can be used heavily, but language choice alone doesn’t solve semantic alignment issues.
- Why it matters: If you’re building with AI coding agents, this is a crisp framework for revisiting language/tooling decisions as the bottleneck shifts from human typing speed and syntax errors toward semantic correctness and system-level coherence.
Also high-signal today (AI builders + frontier lab dynamics)
“AI and informal science” (a bet on “gentleman scientist” energy)
- Title: AI and informal science
- Content type: Blog post
- Author/creator: Sean Goedecke
- Link/URL: https://www.seangoedecke.com/ai-and-informal-science/
- Recommended by:
- @brian_lovin (shared “In light of the OpenClaw acquisition, remember: …”)
- Garry Tan (endorsed via quote)
- Key takeaway (as shared): Tan frames this as a moment when the “gentleman scientist can still come up with something powerfully new”—and calls it “an unusually potent time for builders with new and heretical ideas.”
- Why it matters: It’s a direct signal from prominent startup operators that they see unusually high leverage right now for individual builders pursuing unconventional ideas.
“It is a special moment when the gentleman scientist can still come up with something powerfully new that sets the world on fire”
A “best articulation” of frontier lab equilibrium (bookmark for market dynamics)
- Title: X post on the “likely near and long term equilibrium for the frontier labs”
- Content type: X post
- Author/creator: @hypersoren
- Link/URL: https://x.com/hypersoren/status/2023197978740285576
- Recommended by: Martin Casado
- Key takeaway (as shared): Casado calls it “the best articulation” he’s heard of frontier AI labs’ likely equilibrium over the near and long term.
- Why it matters: If you’re tracking frontier-lab strategy and market structure, this is a high-conviction pointer from a16z to a specific analysis worth reading closely.
Industrial capability + national-scale execution
The future of American manufacturing (founder-endorsed read)
- Title: Blog post on “the future of American manufacturing”
- Content type: Blog post
- Author/creator: Austin Vernon (@Vernon3Austin)
- Link/URL: https://www.austinvernon.site/blog/manufacturing.html
- Recommended by: Patrick Collison
- Key takeaway (as shared): Collison calls it an “excellent post” about the future of American manufacturing.
- Why it matters: It’s a clear “read this” signal from a prominent founder for anyone trying to build an informed view of manufacturing’s trajectory in the U.S.
Two “perseverance” picks (history + fiction, both framed as lessons)
Ken Burns’ new Revolutionary War documentary (history as startup training data)
- Title: Ken Burns’ new Revolutionary War documentary
- Content type: Documentary (discussed on podcast)
- Author/creator: Ken Burns
- Link/URL: Not provided in the source segment
- Recommended by: Brian Halligan (on Lenny’s Podcast)
- Key takeaway (as shared): Halligan calls it “very long, very good,” and says what he likes is that America is “like a disruptor startup,” with “two steps forward, one step back” perseverance—and detailed operational lessons (e.g., how George Washington ran the army; “very close to losing that war most of the time”).
- Why it matters: It’s a practical suggestion for founders/operators who learn well from concrete execution narratives—and want a resilience-focused case study anchored in real constraints and near-failures.
Sam’s monologue from The Two Towers (a reminder to keep going)
- Title: “Sam’s monologue from The Two Towers”
- Content type: Video (shared via X)
- Author/creator: Lord of the Rings (Sam/Frodo dialogue; clip shared by @OPteemyst)
- Link/URL: https://x.com/opteemyst/status/2022928988822245591
- Recommended by: Elon Musk (“I love this monologue”)
- Key takeaway (as shared): The monologue emphasizes perseverance—“even darkness must pass”—and ends on “there’s some good in this world… worth fighting for.”
- Why it matters: It’s a compact, memorizable piece of motivation that a major tech leader is explicitly using as emotional fuel.
“Even darkness must pass… That there’s some good in this world, Mr. Frodo. And it’s worth fighting for.”
Lenny's Podcast
Product Management
The community for ventures designed to scale rapidly
Big Ideas
1) Prioritization is less about picking a framework—and more about valuing different kinds of “pain”
Two complementary heuristics surfaced:
- A simple filter: prioritize what you’re most confident will have the biggest impact for the largest number of users, with the least effort.
- A more general approach: no single prioritization framework fits every situation; senior PMs often build a “sixth sense” by first enumerating all major pain points (customer, user, business/revenue, operations, sales, maintenance, engineering) and then stack ranking the value of solving each one. The hard part is valuing initiatives accurately across stakeholders and outcomes.
Why it matters: many roadmap conflicts aren’t about ideas—they’re about comparing unlike value (e.g., retention risk vs. operational savings).
How to apply: treat prioritization as a value-comparison exercise across pains, not a one-size-fits-all scoring ritual.
2) “Customer-centric” can be operationalized as a company’s center of gravity (and compensation model)
Brian Halligan (HubSpot) described a deliberate shift from being “very employee centric” early on to moving the company’s “center of gravity” to customers. He gave a concrete tradeoff: if employee net promoter score (eNPS) was 60 while customer NPS was 25, he would “give up 10 points” of eNPS to gain “10 points” of customer NPS.
He also described an alignment mantra that evolved into: solve for the customer first, then the company (enterprise value), then the employee/team, then yourself.
Why it matters: “customer-centric” stays vague until it shows up in recurring operating mechanisms (meeting cadences, panels, incentives).
How to apply: make customer feedback unavoidable in leadership forums and align incentives to retention/NPS—not just revenue.
3) At scale, cross-functional execution needs a single Directly Responsible Individual (DRI)
Halligan used a simple metaphor: if two people “water” a plant while you’re away, it’s likely to be overwatered or not watered at all—either way it dies. His takeaway: once organizations scale and functions separate, “everything important happens cross functionally,” and you need one powerful owner (DRI) who can drive coordination across divisions.
Why it matters: ambiguity in ownership often doesn’t “bite you” until you reach scale—then it becomes a systemic execution failure mode.
How to apply: assign a DRI for any initiative that crosses product/eng/sales/service, and ensure they have the authority to direct work across functions.
Tactical Playbook
1) A practical prioritization routine: from pains → tradeoffs → ranked bets
Steps
- Build a single list of pain points across customer, user, revenue, ops, sales, maintenance, and engineering.
- For each pain, compare value using explicit tradeoffs (e.g., “cost of losing customers due to missing feature X” vs. “operational savings from internal tool Y”).
- Use a confidence-and-effort lens to break ties: bias toward what you’re most confident will have the biggest impact for the most users at the least effort.
Why it matters: it forces hard comparisons between outcomes that otherwise compete on volume of advocacy rather than value.
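To make the tie-breaker concrete, here is one way to encode “most confident, biggest impact, most users, least effort” as a score (a RICE-style illustration of ours, not a formula from the thread):

```ts
// Illustrative scoring only; the weights and inputs are assumptions.
interface Bet {
  name: string;
  reach: number;      // users affected
  impact: number;     // e.g., 0.5 = minor, 1 = medium, 2 = major
  confidence: number; // 0..1, how sure you are the impact materializes
  effort: number;     // person-weeks
}

const score = (b: Bet) => (b.reach * b.impact * b.confidence) / b.effort;
const ranked = (bets: Bet[]) => [...bets].sort((a, b) => score(b) - score(a));

// Example: a small, certain retention fix can outrank a flashier, costlier bet.
console.log(ranked([
  { name: "fix churn-causing bug", reach: 8000, impact: 2, confidence: 0.9, effort: 2 },
  { name: "new dashboard",         reach: 3000, impact: 1, confidence: 0.5, effort: 6 },
]).map(b => b.name));
```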
2) Customer development interviews: when paying helps—and when it can backfire
This week’s discussion surfaced conflicting, experience-based guidance:
- Not paying: one founder reported an “extreeeemely low” response rate; those who did join were more “yappy” and harder to get direct answers from.
- Offering payment respectfully: one outreach approach offered to “pay any fee you feel is fair” for an hour, framing it as respect for someone’s expertise and time; they were “shocked” that only 1 out of 40 asked to be paid, attributing it to reciprocity.
- Paying can raise costs: another experience reported higher response rates and better information, but frequent quotes “well over double what their hourly rate would be” (assuming a 40-hour week).
- Counterview (selection bias): one commenter argued payment can skew answers because you may attract people who “need money,” not people who “painfully have the problem” and would “happily give you feedback for free.”
How to apply: decide whether your biggest constraint is access (response rate) or signal quality (avoiding skew), and design outreach accordingly—knowing there are credible reports pointing both ways.
3) Handling customer-specific requests without drowning in tech debt: build for reuse, gate with flags
A practical pattern for “out-of-the-box” (OOTB) vs. customization tension:
Steps
- Consider designing the system with feature flags.
- If a request can be built in a way that’s reusable for other customers, treat it as a candidate for implementation.
- If it’s “really niche,” customer-specific, and likely “a pain to maintain,” avoid the custom path and offer a more generic and reusable alternative.
Why it matters: it reframes “say yes vs. say no” into “reusable platform capability vs. one-off liability.”
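A minimal sketch of that gate (the in-memory flag store is a hypothetical stand-in; real systems would use a flag service or config):

```ts
type CustomerId = string;

// Reusable capability, enabled per customer until it proves generally useful.
const flags = new Map<string, Set<CustomerId>>([
  ["custom-export-format", new Set(["acme-corp"])],
]);

function isEnabled(flag: string, customer: CustomerId): boolean {
  return flags.get(flag)?.has(customer) ?? false;
}

function exportReport(customer: CustomerId, rows: unknown[]): string {
  if (isEnabled("custom-export-format", customer)) {
    // Customer-requested shape, still shipped as a shared, flag-gated code path.
    return JSON.stringify({ layout: "acme-v2", rows });
  }
  return JSON.stringify(rows); // generic default for everyone else
}
```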
4) Presenting AI-driven analysis without it sounding like “AI slop”: lead with outcomes and controls
A PM described building an automation that processes “thousands of data points” to save time and uncover valuable business insights—but worried about perception in an org with low AI adoption, where AI is viewed as “cheating” or “glorified search.” They emphasized the output isn’t inherently “outstanding” (numbers/text), but is an “outstanding unlock” with business outcomes, and wouldn’t be feasible manually.
Steps
- Don’t present “the AI output.” Present the outcomes the results enable.
- If asked how it was done, briefly contrast with the manual approach and why it wasn’t feasible.
- Mention how you compensated for potential issues like hallucinations/aberrations—then return focus to your contribution and the constraints you put on the system.
- Treat this as a storytelling problem: “The only difference between AI slop and AI shine is good story telling.”
Case Studies & Lessons
1) HubSpot’s customer-centric shift: change forums, questions, and incentives—not just messaging
Halligan described early HubSpot as “very employee centric,” spending heavy leadership time on employee topics, and later questioning that emphasis. The shift to customer-centric included:
- Management team meetings moved to once a month and included a customer panel, run by Halligan, where he asked “very tricky questions” to surface bad news.
- Board meetings included customer panels where the board could ask questions—including “What do you love about HubSpot?” and “What do you hate about HubSpot?”
- Management compensation shifted from revenue to retention and net promoter score.
“I would give up 10 points of employee net promoter score to get 10 points of… customer net promoter score.”
PM takeaway: if you want real customer-centric behavior, build it into governance (panels, cadences) and incentives (retention/NPS), not just principles.
2) Avoiding internal sub-optimization: “Enterprise Value > Team Value > My Value” (and what happens when you don’t)
Halligan described a recurring scaling failure: leaders solving for team value (or themselves) rather than enterprise value, e.g., a sales leader optimizing bookings because they’re paid on bookings while “service can handle all the downstream problems.”
A signal they used: quarterly employee NPS by department. He gave an example where a department’s score dropped from the 60s to 30, followed by a further collapse to negative 5, and said some teams “never actually recovered” after losing trust.
PM takeaway: cross-functional metrics and incentives can surface—and sometimes prevent—team-level optimization that harms the company.
3) Paying for interviews isn’t a yes/no question—it’s a tradeoff between access, cost, and bias
Across the same thread:
- One person saw low response rates without paying and weaker signal quality in calls.
- Another saw strong reciprocity results with an explicit “fair fee” offer (only 1/40 asked to be paid).
- Another saw higher costs than expected when offering payment (quotes well above implied hourly rates).
- A counterargument warned of skew toward respondents motivated by money rather than pain.
PM takeaway: treat interview incentives as part of research design—they change who shows up and what they say.
Career Corner
1) Hiring senior product/exec roles: reduce “shiny resume” bias and test real thinking
Halligan shared multiple hiring tactics relevant to PM leadership roles:
- Prefer a smaller interview panel (e.g., 4 instead of 8).
- Consider hiring “spikier” candidates (with clear strengths and weaknesses) versus uniformly average interview feedback; he said moving toward spikier hires improved HubSpot’s hit rate.
- Use an approach attributed to Parker Conrad: have a candidate sign an NDA, send the last board deck/memo, then do a short discussion—if they’re only complimentary, that’s a red flag, because you want challengers, not “yes” people.
- Prefer problem-solving (e.g., whiteboarding) over standard resume-walkthrough interviews.
- Use reference questions like “Would you enthusiastically rehire this person?” and “How likely (1–10) are you to try to rehire them back from me later?”
- Be cautious with big-company hires at smaller scale due to “impedance mismatch”; he cited a “100% attrition rate” on hires from large companies like Salesforce/Google/Microsoft in their experience.
How to apply: if you’re building a hiring loop, explicitly design it to reveal independent thinking and job-fit at your company stage—not just polished interviewing.
2) Early-career PM reality check: strong proof points still might not clear ATS filters
A recent grad described struggling to get entry-level/associate PM interviews, attributing the bottleneck to automated filters and a “non target” school label—despite an EE degree, an MBA, leadership roles, a fintech PM internship, and founder experience.
Their concrete proof points included:
- Supporting a feature rollout for 1000+ active users in a bank PM internship, with a focus on reducing friction and API integration.
- Building and launching an AI-powered sports tech SaaS and scaling to 1000 users in the first week with “zero dollar marketing spend.”
They explicitly asked what “hook” helps candidates get past ATS, and whether to lean more Technical PM or Growth PM given their background.
Why it matters: it’s a reminder that “in-room” performance and demonstrated outcomes can be decoupled from getting past automated screening.
Tools & Resources
Video (Lenny’s Podcast): “How to be a CEO when AI breaks all the old playbooks | Sequoia CEO Coach Brian Halligan” (customer-centric operating mechanisms, DRI, EV/TV/MEV, hiring tactics) https://www.youtube.com/watch?v=3UyitfSbY6c
Reddit threads referenced in this brief:
- Prioritization heuristics and valuing different pains: https://www.reddit.com/r/ProductManagement/comments/1r60607/comment/o5mtab9/
- Paying (or not paying) for customer development interviews: https://www.reddit.com/r/startups/comments/1r5so85/
- OOTB vs custom + feature flags: https://www.reddit.com/r/ProductManagement/comments/1r596i6/
- Presenting AI output without seeming lazy: https://www.reddit.com/r/ProductManagement/comments/1r5oym5/
- Entry-level PM job search + ATS friction (non-target path): https://www.reddit.com/r/prodmgmt/comments/1r5rtj5/
homesteading, farming, gardening, self sufficiency and country life
农业致富经 Agriculture And Farming
Market Movers
U.S. wheat spreads: ZWH6/ZWK6 flipped to inverse ahead of delivery window
A trader flagged that the ZWH6/ZWK6 wheat spread moved to an inverse quickly on Friday. One explanation offered was that the market was trying to find a level where wheat starts moving from the country to millers, with bids at delivery houses and mills above DVE for a while and the delivery window getting close enough that the market “finally care[d].”
"Trying to find a level where the country starts moving wheat to the millers."
Innovation Spotlight
FarmClaw: document-based knowledge sources for agronomy agents
An ag-focused version of OpenClaw (“FarmClaw”) is being developed to add document-based knowledge sources at both the instance level and agent level—with an example use case of incorporating university fertilizer guidelines for an Agronomy agent. The change is described as bringing custom-GPT-like functionality to OpenClaw’s memory management.
Regional Developments
China: rice–fish co-culture highlighted as pest/weed pressure management within paddies
A Chinese video segment described rice-field fish (稻田鱼) as fish raised directly in rice paddies, with fish fry stocked during rice transplanting so fish and rice grow together. The fish are described as consuming pests and weeds in the paddy (and also eating rice flowers) as part of the system’s ecological interaction.
Best Practices
Soil remediation (U.S. Midwest): sheet mulching clay soil with wood chips
For clay soil common after home construction (a question raised from the western suburbs of Chicago), one practical recommendation was wood chips as bulk organic matter for sheet mulching.
- Sourcing/cost examples:
  - Previously: municipal chips at about $5 per scoop (loaded by tractor).
  - Now: ChipDrop deliveries typically $20–$40 per dump-truck load, with some locations able to get them for free.
- Observed effect on clay: chips helped keep soil from drying out and getting compacted.
- Timeframe/implementation note: after a few months under a deep layer of chips, it became easy to plug in plant starts.
Reference shared in the question: sheet mulching guide.
Livestock feed management: continuous fermented-feed bucket with mold control
A homesteader described running a continuous fermented-feed bucket and feeding from it regularly. Key handling points:
- Feed within 3–4 days because mold will form on surface material if it sits longer.
- After feeding, pour off most of the water, leaving enough to cover the bucket bottom as a “starter,” then add fresh feed and clean water to restart (and “ferment faster”).
- Additives mentioned: minimal ACV (“a couple drops” occasionally) and a pinch of sea salt or pink salt (not iodized). The author noted more alcohol is created the longer it sits.
- Feeding routine described: fermented feed in the morning and dry feed in the evening, sometimes supplemented with sprouts/treats.
"The food in the bucket should be fed within 3-4 days because mold WILL start to form on anything on the surface."
Linked demo video: https://youtube.com/shorts/P8Pm8Z0Hsu0?si=5Pprd76Y03-YCdXZ
Input Markets
Practical on-farm input signals (local availability and low-cost sourcing)
- Mulch input availability (U.S.): wood chips were highlighted as an effective clay-soil mulch material, with sourcing shifting from municipal supply (example: $5/scoop) to services like ChipDrop ($20–$40 per dump-truck load; sometimes free depending on location).
- Fermented-feed additives: ACV and non-iodized salts were used in small amounts as part of one operator’s fermentation routine (no pricing provided).
Forward Outlook
- Wheat spreads: as the delivery window nears, watch whether cash bids at delivery houses and mills vs. DVE continue to drive rapid changes in nearby spreads and incentives for wheat movement.
- Spring soil prep timing (mulching): if using a deep wood-chip layer to rehabilitate clay, plan around the stated “few months” timeline before easy transplanting into the mulched area.
- Fermented feed operations: build chores around the stated 3–4 day window to avoid mold and maintain a consistent “starter” for faster fermentation cycles.
- Rice–fish systems: the described management sequence hinges on stocking fish fry during transplanting and co-managing fish/rice growth in the same paddy.
Discover agents
Subscribe to public agents from the community or create your own—private for yourself or public to share.
Coding Agents Alpha Tracker
Daily high-signal briefing on coding agents: how top engineers use them, the best workflows, productivity tips, high-leverage tricks, leading tools/models/systems, and the people leaking the most alpha. Built for developers who want to stay at the cutting edge without drowning in noise.
AI in EdTech Weekly
Weekly intelligence briefing on how artificial intelligence and technology are transforming education and learning, covering AI tutors, adaptive learning, online platforms, policy developments, and the researchers shaping how people learn.
Bitcoin Payment Adoption Tracker
Monitors Bitcoin adoption as a payment medium and currency worldwide, tracking merchant acceptance, payment infrastructure, regulatory developments, and transaction usage metrics
AI News Digest
Daily curated digest of significant AI developments including major announcements, research breakthroughs, policy changes, and industry moves
Global Agricultural Developments
Tracks farming innovations, best practices, commodity trends, and global market dynamics across grains, livestock, dairy, and agricultural inputs
Recommended Reading from Tech Founders
Tracks and curates reading recommendations from prominent tech founders and investors across podcasts, interviews, and social media