Hours of research in one daily brief—on your terms.
Tell us what you need to stay on top of. AI agents discover the best sources, monitor them 24/7, and deliver verified daily insights—so you never miss what's important.
Recent briefs
Your time, back.
An AI curator that monitors the web nonstop, lets you control every source and setting, and delivers one verified daily brief.
Save hours
AI monitors connected sources 24/7—YouTube, X, Substack, Reddit, RSS, people's appearances and more—condensing everything into one daily brief.
Full control over the agent
Add/remove sources. Set your agent's focus and style. Auto-embed clips from full episodes and videos. Control exactly how briefs are built.
Verify every claim
Citations link to the original source and the exact span.
Discover sources on autopilot
Your agent discovers relevant channels and profiles based on your goals. You get to decide what to keep.
Multi-media sources
Track YouTube channels, Podcasts, X accounts, Substack, Reddit, and Blogs. Plus, follow people across platforms to catch their appearances.
Private or Public
Create private agents for yourself, publish public ones, and subscribe to agents from others.
Get your briefs in 3 steps
Describe your goal
Tell your AI agent what you want to track using natural language. Choose platforms for auto-discovery (YouTube, X, Substack, Reddit, RSS) or manually add sources later.
Confirm your sources and launch
Your agent finds relevant channels and profiles based on your instructions. Review suggestions, keep what fits, remove what doesn't, add your own. Launch when ready—you can always adjust sources anytime.
Sam Altman
3Blue1Brown
Paul Graham
The Pragmatic Engineer
r/MachineLearning
Naval Ravikant
AI High Signal
Stratechery
Receive verified daily briefs
Get concise, daily updates with precise citations directly in your inbox. You control the focus, style, and length.
Tibo
swyx
David Heinemeier Hansson (DHH)
🔥 TOP SIGNAL
The strongest signal today: coding agents are becoming the default starting point for real work, not just a side tool. DHH says all new customer work now starts with agents that he steers and calibrates, while OpenAI engineer Tibo Sottiaux says the Codex team is using Codex to help refactor an end-to-end systems rethink that would otherwise take months.
The practical edge is shifting away from raw model IQ and toward loop design: faster harnesses, fresher context, and safer deployment controls.
🛠️ TOOLS & MODELS
- Cursor Composer 2 — Cursor’s new code-only model is built from open-weight Kimi 2.5 and then heavily post-trained/RL-tuned on Cursor’s own data and harnesses. Theo says it beats Opus on multiple coding evals, including Terminal Bench 2 in the 4.5-4.6 range, while running at 80-100 tokens/sec and pricing at $0.50/M input and $2.50/M output. Cursor is going through Fireworks for inference, and Theo says that path handles the attribution/license requirements for large-scale commercial use of the Kimi base.
- GPT-5.4 — Theo’s field report is blunt: it’s an incredible model for coding, but still a generation behind on frontend design. His read on OpenAI’s frontend best-practices post: useful advice, but not proof the design gap is solved.
- Claude Skills / Claude Code for web — Simon Willison used Claude Skills to teach Claude the minor breaking changes in Starlette 1.0, because the model wasn’t familiar with the release yet. His caveat: Claude chat has an “add to skills” flow, but Claude Code for web apparently does not.
- Devin — swyx says Devin usage has grown >50% MoM every month this year. More important than the growth chart: his deployment note that enterprise rollouts need permission models that won’t terrify compliance and IT teams across 10,000s of engineers.
💡 WORKFLOWS & TRICKS
- Let the agent write the first draft. Keep yourself on steering. DHH says he writes no fresh customer code himself now; new work starts with agents, and he handles direction and calibration.
- Patch fresh-release blind spots with explicit context. If the framework version is newer than the model’s knowledge, write the breaking changes into a Skill or context file before asking for edits. Simon did this for Starlette 1.0. Read: Simon’s writeup.
- Use agents as ops translators, not just coders. DHH says he injects agents into Linux systems and uses them constantly to decode obscure error messages. If you know “some Linux” but not enough to debug quickly, this is a high-ROI use case.
- Speed up the harness before you scale the loop. Peter Steinberger focused on tests and cut OpenClaw’s harness runtime from roughly 10 minutes to 2 minutes. Faster evals mean more agent iterations per day and less dead time between runs.
- Split logic and tests into separate files/domains before you unleash automation. Geoffrey Huntley hit 50 open PRs from automation and says merge conflicts become a major source of waste if logic and tests are entangled.
- Give non-engineers agent access — but only with safe permissions. swyx argues designers should get direct access to coding agents, and extends that to PMs and analytics via Slack-style workflows. Pair that with enterprise-safe permission controls, not shortcut flags, if you expect the setup to survive real IT review.
"Give your designer access to your coding agent. It is imperative..."
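The fresh-release patch in the workflow list above can be sketched as a small prompt-assembly helper: keep the breaking changes you know about in a lookup table and prepend them to any edit request. This is a minimal illustration, not the actual Claude Skills API; the package names and change entries are placeholders.

```python
# Sketch: prepend a "fresh release" context block to a prompt so an agent
# sees breaking changes its training data predates. All entries below are
# hypothetical placeholders, not real Starlette 1.0 release notes.

BREAKING_CHANGES = {
    "starlette": {
        "1.0": [
            "example breaking change 1 (placeholder)",
            "example breaking change 2 (placeholder)",
        ],
    },
}

def build_context(package: str, version: str) -> str:
    """Format known breaking changes as a context/skill block."""
    changes = BREAKING_CHANGES.get(package, {}).get(version, [])
    if not changes:
        return ""
    lines = [f"# {package} {version} breaking changes (model may not know these):"]
    lines += [f"- {c}" for c in changes]
    return "\n".join(lines)

def patch_prompt(prompt: str, package: str, version: str) -> str:
    """Put the fresh-release context ahead of the actual request."""
    ctx = build_context(package, version)
    return f"{ctx}\n\n{prompt}" if ctx else prompt

print(patch_prompt("Upgrade this app to Starlette 1.0.", "starlette", "1.0"))
```

The point of the pattern is ordering: the corrective context has to arrive before the model starts editing, whether it lives in a Skill, a CLAUDE.md-style file, or a plain prompt prefix.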
👤 PEOPLE TO WATCH
- David Heinemeier Hansson — high-signal because this is long-time operator commentary, not bench-racing: agent-first for new customer code, heavy use in Linux ops, and experiments with hold-to-talk voice input in his Linux setup.
- Tibo Sottiaux — short post, big signal: the Codex team is using Codex to help refactor its own system during an end-to-end scalability rethink.
- Simon Willison — still one of the best examples of practical context management. Today’s lesson: when model knowledge lags a fresh OSS release, teach the model explicitly before trusting edits.
- Theo — worth tracking because he separates “strong coding model” from “strong frontend model,” and his Composer 2 breakdown added concrete cost/speed details instead of generic hype.
- swyx — useful pulse on where coding agents are spreading inside orgs: designers, PMs, analytics teams, and enterprise deployment staff — not just core engineers.
🎬 WATCH & LISTEN
- 15:25-15:57 — DHH on using agents as Linux translators. Short clip, practical point: this is the cleanest real-world case today for using an agent to decode obscure infra errors when you’re not a deep Linux expert.
- 49:54-50:20 — DHH on hold-to-talk voice prompting. He describes a voice-to-model flow inside his Linux setup where a button press turns dictation into clean text. Worth watching if your next bottleneck is input speed, not model quality.
📊 PROJECTS & REPOS
- OpenClaw — the strongest repo signal today was operational, not social: Peter Steinberger got the harness from ~10 min to ~2 min by focusing on tests. Separate signal: another user used Claude Code plus Google’s live browser control to interact with the OpenClaw web dashboard for debugging.
- Starlette 1.0 — not an agent project, but a useful OSS release case for agent users: Simon had to explicitly teach Claude the 1.0 breaking changes because the model lagged the release. Expect this pattern on newly shipped framework versions.
Editorial take: the edge is moving from “which model is smartest?” to “who has the tightest loop?” — fast tests, explicit fresh context, safe permissions, and agents in more hands.
Lenny's Podcast
April Underwood
andrew chen
Big Ideas
1) Influence is becoming a higher-leverage PM skill
Jessica Fain’s argument is that as execution gets easier, PM leverage shifts toward deciding what work survives and bringing others along. She frames influence, stakeholder management, and learning as the “10x skill”. Her core mental model is to treat executives like users, with the same curiosity and empathy you would apply in discovery.
- Why it matters: Executives are usually context-switching across budget, people, legal, and other issues before they enter your review. They have not been centered on your problem the way you have.
- How to apply: Build reviews around executive context: reset the room quickly, map your pitch to their incentives, and use feedback to learn rather than defend.
“People don’t realize that an executive’s calendar is like a strobe light going off.”
2) Founder-led coding is emerging as a zero-to-one pattern
Andrew Chen expects more “founder-led coding” as non-technical founders use AI code generation to build v1 products themselves. His analogy is founder-led sales: do the work yourself early, even if you are not yet great at it, because the point is learning and validating the product.
- Why it matters: At the earliest stage, the value is learning and validation, not polished process.
- How to apply: If you are working on a true v1, use AI-assisted building to test the idea, uncover constraints, and tighten the product before you scale delivery.
3) Repo access is becoming a PM reality-check layer
Community examples show PMs using GitHub repo access, often with AI, to understand architecture and business logic, baseline feature areas against current code, inspect event tags, write better tickets, and generate release notes.
- Why it matters: Multiple practitioners describe repo access as a way to start from reality rather than assumptions and to improve prioritization and roadmap conversations with engineering.
- How to apply: Treat the repo as a source of product context, not just engineering context, especially before writing a PRD or debating scope.
Tactical Playbook
1) Run a better executive review in 6 steps
- Start with a 30-60 second reset: why you’re here, where the last discussion ended, today’s goals, and how the meeting will run.
- Ask if there is anything else they hoped to cover so you can adapt before the conversation drifts.
- Match the format to how they process information—doc, design, customer story, dashboard, or experiment.
- Tie the proposal to how they are measured: goals, key metrics, OKRs, or current board pressure.
- Go in to learn, not to convince. Use questions like “That’s so interesting. What led you to believe that?” to surface the belief behind the feedback.
- Bring options, not just one answer. Fain’s “Stuart plus two more” method was to return with what the executive asked for plus two additional versions that created a real decision discussion; in one example, her team turned around a new document in two days after a weak review.
- Why it matters: Fain’s warning is that once you go much past 60 seconds at the top, you’ve lost them.
- How to apply: Use the opener to re-create context fast, then spend the rest of the meeting learning what the executive actually believes and needs.
2) Build trust the senior way
- Kill or deprioritize ideas that are not working.
- Make the decision criteria explicit and say when you’ll come back with a call.
- Shrink large changes into smaller experiments or proof-of-concepts to lower perceived risk.
- Follow feedback quickly; if you wait a week, the executive has often moved on.
- Why it matters: These moves signal aligned incentives with the company outcome rather than attachment to your own roadmap.
- How to apply: Define kill criteria, propose the smallest credible test, and turn around follow-ups while the conversation is still warm.
3) Use the repo before you write the spec
- Ask AI or Copilot to explain the current implementation in plain English.
- Check existing code paths for edge cases and scope assumptions before you write the PRD.
- Search comments and tags when you need to understand instrumentation, events, or legacy behavior.
- Use what you learn to write clearer tickets and summarize changes or PRs into release notes.
- If the team is already sharing AI workflows, version prompts, skills, or agents in GitHub so others can reuse them.
- Why it matters: Several commenters describe this as a direct way to save wasted spec writing and improve engineering conversations.
- How to apply: Do a repo pass before major scoping work, then bring the resulting constraints and edge cases into discovery and prioritization.
Case Studies & Lessons
1) Jessica Fain used PM skills to get inside the exec decision-making loop
In 2017, Fain approached Slack CPO April Underwood with problems she saw in how work was getting done, paired them with proposed solutions, and tried to understand Underwood and other senior leaders. Underwood says that approach “used PM skills on me” and helped land Fain the Chief of Staff role; she also credits Fain with helping Slack through IPO prep, competitive pressure, and a leadership transition. Fain later said she took the role because her product ideas kept dying and she wanted to see how executive decisions were actually made from the inside.
“She used her PM skills on me, the CPO - it was classic needs-finding and the best kind of selling”
- Lesson: Influence with executives starts before the meeting: identify the organizational problem, bring possible solutions, and show that you understand leadership constraints.
- How to apply: When a cross-functional problem keeps repeating, package it the way you would package a user problem: pain, evidence, candidate solutions, and the executive’s likely incentives.
2) Slack’s Customer Love Sprint turned quality work into a strategic signal
At Slack, the team paused normal engineering work for two weeks and let engineers pick fixes that would be good for users. PM, design, and support supplied ideas, but the only rule was that the work had to ship something good for users. The sprint produced 65 improvements and included a judging process that involved executives. Fain says it helped bring back a part of the culture the company had lost and aligned with what leaders believed differentiated Slack in the market.
- Lesson: A short execution sprint can be a product strategy tool, not just a morale exercise, if it ties visible user improvements to company-level differentiation.
- How to apply: If quality or craft is drifting, run a time-boxed improvement sprint with clear shipping criteria, broad idea input, and executive visibility.
3) Repo read access prevented wasted PRD work
One startup PM said repo read access let them inspect existing code paths before writing a PRD and discover that some edge cases were already handled, while some changes they expected to be a large lift were actually easy under the current architecture. Another PM said repository analysis with Claude Code gave their product team better information for prioritization and for discussions with IT on deadlines and roadmap changes.
- Lesson: Technical context can materially change the product plan before requirements are written.
- How to apply: Insert a codebase review step between discovery and spec drafting when the product area is complex or legacy-heavy.
Career Corner
1) In the AI era, influence looks like career leverage
Fain argues that as execution complexity drops, PM leverage shifts away from being the best note taker or experiment runner and toward deciding what work survives and getting people to buy into it.
- Why it matters: This is the part of PM work she frames as the enduring leverage point.
- How to apply: Deliberately practice stakeholder mapping, review prep, and post-meeting follow-up alongside core execution skills.
2) Show seniority by acting like a CPO
Fain’s advice for becoming more senior is to be the deepest domain expert in the room and “act like a CPO”. She pairs that with expanding your viewpoint from local optimization to global company thinking.
- Why it matters: Seniority here is framed as scope of judgment, not just scope of ownership.
- How to apply: In reviews, speak to company outcomes, not just team outputs, and bring solution-oriented thinking rather than only surfacing problems.
3) Use AI as a rehearsal partner for tough reviews
Fain describes a colleague who trained a model on past product review transcripts and expects PMs to run PRDs or pitches through it to identify likely pushback. She also suggests using AI against your own known weaknesses, such as thin data or thin UX thinking.
- Why it matters: This turns past organizational feedback into reusable coaching.
- How to apply: Before a major review, test whether your doc would survive the objections your executives typically raise.
Tools & Resources
- The art of influence: The single most important skill — the full conversation with Jessica Fain on executive influence and stakeholder management
- The art of influence: The single most important skill left that AI can't replace | Jessica Fain — video version with the meeting opener, trust-building tactics, and the Slack sprint example
- How do PMs use access to Github repos for their work — a practical thread on repo access for PRD baselining, release notes, metrics debugging, and prompt sharing
- GitHub Copilot — worth exploring if you need help turning legacy repos into plain-English docs, understanding recent changes, or drafting technical spikes
- Claude Code — PM teams describe using it to analyze repositories, understand architecture and business logic, and improve prioritization conversations with engineering and IT
- GitHub as a prompt/versioning layer — some PMs are sharing prompts, skills, agents, and queries there with engineers and other PMs
Chase Brower
Paul Calcraft
Vuk Rosić 武克
Top Stories
Why it matters: The clearest signals this cycle were about where AI competition is moving next: open weights, agent deployment infrastructure, model interpretability, and tougher standards for evaluating real-world usefulness.
1) MiniMax put a near-term open-weight release on the map
Posts tracking MiniMax said M2.7 open weights are coming in about two weeks, that the team is still iterating, and that a version updated yesterday was noticeably better on OpenClaw; MiniMax later confirmed the release was coming. A separate post also said multimodal MiniMax m3 is confirmed.
Impact: Another imminent open-weight release from a fast-moving lab would add pressure to the broader open-model field, especially as MiniMax models are already showing up in ambitious coding demos elsewhere in this cycle.
2) Anthropic-style interpretability work looks more operational
"LLMs are not the 'black box' you were promised"
A summary of Anthropic’s recent circuit tracing work described training a sparse replacement model to recreate MLP outputs, turning dense activations into human-interpretable features such as "Texas" or "the Olympics," then tracing them into causal circuits. The same summary pointed to multi-step chains like Dallas → Texas → Austin, poem planning via future rhyme candidates, and possible uses in steering, misbehavior detection, and better learning algorithms.
Impact: This is a meaningful shift from generic "black box" language toward tools that could make model behavior easier to inspect and control.
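The core idea in that summary — rewriting a dense activation as a sparse combination of named, human-interpretable feature directions — can be illustrated with a toy decomposition. This is a pure-Python sketch, not Anthropic's method: the feature dictionary, the vectors, and the greedy matching-pursuit loop are all made-up stand-ins for what a trained sparse replacement model would learn.

```python
# Toy illustration of the "sparse replacement" idea: approximate a dense
# activation vector as a sparse combination of named feature directions.
# The dictionary entries and vectors below are invented examples.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical unit-norm feature directions.
FEATURES = {
    "Texas":    [1.0, 0.0, 0.0],
    "Olympics": [0.0, 1.0, 0.0],
    "capital":  [0.0, 0.0, 1.0],
}

def sparse_decompose(activation, k=2):
    """Greedy matching pursuit: pick the k features that best explain
    the activation, returning (name, coefficient) pairs."""
    residual = list(activation)
    picked = []
    for _ in range(k):
        name, direction = max(FEATURES.items(),
                              key=lambda kv: abs(dot(residual, kv[1])))
        coef = dot(residual, direction)
        picked.append((name, round(coef, 3)))
        residual = [r - coef * d for r, d in zip(residual, direction)]
    return picked

# A made-up "Dallas"-like activation: mostly Texas, a bit of capital.
print(sparse_decompose([0.9, 0.0, 0.4]))  # → [('Texas', 0.9), ('capital', 0.4)]
```

Once activations are expressed in this sparse named basis, chains like Dallas → Texas → Austin become traceable as feature-to-feature edges rather than opaque weight matrices — which is what makes the approach feel operational rather than academic.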
3) Sakana AI showed a live AI-assisted intelligence workflow
Sakana AI and Yomiuri Shimbun said they analyzed 1.1 million social posts about anti-Japan criticism on SNS, extracting narratives from context and nuance rather than keywords, clustering them with an ensemble of three LLMs, and generating evidence-backed hypotheses for human review. Sakana said one hypothesis about a coordinated criticism campaign was later verified by journalists through interviews with government sources, and the company now explicitly frames defense and intelligence as a focus area alongside finance.
Impact: This is a concrete example of LLM systems being used for structured OSINT and intelligence analysis, not just summarization.
4) Benchmark confidence took another hit
METR researchers found that roughly half of SWE-bench Verified PRs that pass the automated grader would not actually be merged by maintainers, with automated scores averaging 24 points higher than maintainer merge rates. In a separate benchmark debate, EsoLang-Bench authors said their conclusions only applied to a 32k-token, no-tools setting, while follow-up testing showed Claude solving 20/20 hard problems when given a looser interface and more room to work.
Impact: Benchmark numbers are becoming less reliable as stand-alone proxies for production quality.
Research & Innovation
Why it matters: Several of the most useful technical ideas this cycle were about memory, model surgery, inference efficiency, and low-cost monitoring rather than a single headline model.
- Memory systems are moving beyond vector databases. Supermemory said it reached ~99% on LongMemEval_s with ASMR (Agentic Search and Memory Retrieval), replacing vector search and embeddings with parallel observer agents that extract structured knowledge across six vectors from raw multi-session histories; it also said the system uses specialized agents for direct facts, related context, and temporal reconstruction, with no vector database required, and will be open sourced in 11 days. In parallel, another proposal suggested spawning subagents to build a searchable "memory wiki" and querying it at inference time, though the author called the current implementation expensive.
- Low-compute model surgery produced a striking leaderboard result. A researcher said he topped the Hugging Face Open LLM Leaderboard without changing a single weight by duplicating seven middle layers of Qwen2-72B and stitching them back together; follow-up commentary said those layers were identified using evaluation on just two simple items, supporting a "denoising circuits" intuition.
- AttnRes is pushing on transformer efficiency. One technical note said AttnRes has a two-stage inference algorithm that can reduce per-layer memory reads from O(layers) to O(sqrt(layers)) by batching queries, unlike fully sequential mixing in mHC; separate commentary argued it could become a new canonical transformer design motif.
- Cheap API drift detection is becoming practical. Two new papers—Log Probability Tracking of LLM APIs and Token-Efficient Change Detection in LLM APIs—request only a single token from APIs, enabling unusually cheap monitoring of silent model changes. A commenter noted the current methods apply to API endpoints rather than chat interfaces.
- OpenAI’s compression challenge is surfacing fast architectural feedback. In 71 short experiments, Vuk Rosić found 4-expert MoE + leaky ReLU to be the clearest winner, saw gains from untied factored embeddings, and reported that depthwise convolution consistently hurt performance.
Products & Launches
Why it matters: The strongest product activity centered on making agents easier to deploy, teach, and integrate into existing workflows.
- Hermes Agent: NousResearch’s open-source agent hit 10,000 GitHub stars, and the broader ecosystem moved quickly: v0.3.0 shipped with 248 PRs, there is now a one-command migration from OpenClaw, and a recent hackathon drew 187 submissions. New additions highlighted this week included HermesHub with safety-checked skills, Pinokio 1-click launch, parallel web search and page extraction tools, x402 payments, a new Workspace UI, and Gemini AI Pro subscription support.
- LlamaParse Agent Skill: LlamaIndex released an official skill usable across 40+ agents for parsing complex documents, tables, charts, images, dense PDFs, and messy handwriting into agent-readable markdown.
- Hugging Face Protected Spaces with Public URLs: Hugging Face now lets teams keep a Space protected on-platform while exposing a public URL, a setup framed as useful for production demos or internal tools without exposing model weights, prompts, or proprietary logic.
- Claude “codebase to course”: A new Claude skill turns any codebase into an interactive course with visualizations, plain-English code translations, metaphors, and quizzes; Claude Code also suggested using HTML artifacts for deeper concept explanations.
- LangChain Academy: LangChain launched a free course, Building Reliable Agents, focused on taking agents from first run to production-ready systems through iterative improvement with LangSmith.
Industry Moves
Why it matters: Company behavior is revealing where demand looks real: background agents, intelligence workflows, large-scale data operations, and changing talent strategies.
- Cognition / Devin: swyx said Devin usage has grown more than 50% month over month every month this year. A separate post argued the market has finally caught up to Cognition’s earlier vision around tool-calling, harnesses, sandboxes, dev workspaces, and fully async background agents.
- Sakana AI strategy: Beyond the Yomiuri project itself, Sakana explicitly positioned defense and intelligence as a major focus alongside finance.
- Curator spend signal: Bespoke Labs said anonymized Curator users sometimes spend as much as $80,000 on tokens, a sign that some users are already operating large-scale data curation or generation workflows.
- Figure AI hiring thesis: Brett Adcock said he has been "batting .000" hiring senior people from big established companies, arguing instead for people who "really care" and warning that assembling elite stars "like 15 Tom Bradys" will not work.
Policy & Regulation
Why it matters: As AI moves into sensitive domains, the hard questions are increasingly about restricted use cases, user protection, and compliance controls.
- OpenAI’s proposed adult mode hit internal resistance. A WSJ-linked report said advisers warned about risks including emotional dependency, compulsive use, and even a "sexy suicide coach" scenario; separate commentary said technical flaws, including a roughly 12% age-verification error rate, helped delay launch despite growth and revenue incentives.
- Military use remains contested. Commentary on reporting around U.S. operations said Claude was used via Palantir in Iranian and Venezuelan operations even as Anthropic restricted more extreme military and surveillance uses and the administration had banned Anthropic products; the same thread said investigations were examining whether inaccurate targets were hit because of outdated or hallucinated model outputs. The post contrasted that with xAI’s direct military contracts.
- Enterprise compliance is becoming a gating factor for agents. swyx argued that serious deployment across organizations with tens of thousands of engineers requires controls that go far beyond casual dangerously-skip-permissions workflows.
Quick Takes
Why it matters: These smaller updates did not lead the cycle, but they help map where models and tools are getting stronger—or where they still break.
- Xiaomi’s MiMo-V2-Pro is a 1 trillion parameter flagship for an agent-oriented multi-model stack; commentary said it is strong in creative writing, document analysis, literature/history, and instruction following, but still weaker in coding and still prone to hallucinations.
- In an AMD-AGI kernel-optimization test, Claude beat Codex on gemm_bf16 at 1.19x vs 0.94x. Codex was faster, but the author said it produced no reinjectable optimizations; the work is expected to be open sourced soon.
- mbusigin reported that open-weight models one-shotted a bootable x86-64 OS in about three hours and later described a mostly working two-shot C compiler built with Pi operating MiniMax m2.7.
- Deedy Das said Karpathy’s Autoresearch pushed a vibecoded Rust chess engine to ELO 2718 after running 70+ autonomous experiments.
- GLM-5 was described as the only model currently beating the human baseline on predictionarena.ai, but replies cautioned that the sample window is short and strategy variance is wide.
- One practitioner said generic AI code review prompts succeed only about 13% of the time, while prompts grounded in specific deployment and scaling scenarios work much better.
- LTX 2.3 was described as a major improvement over LTX 2.0, and AI Toolkit now supports fine-tuning it.
swyx
LocalLLM
Jeremy Howard
What stood out
Today's strongest thread was not a single new frontier model. It was the push to make AI operational in real workflows—paired with a clearer picture of where autonomy still breaks.
Operational deployments
Sakana AI and Yomiuri test a human-AI workflow for information-campaign analysis
Sakana AI said its Narrative Intelligence system worked with The Yomiuri Shimbun to analyze more than 1.1 million social media posts, using an ensemble of three LLMs plus Novelty Search to extract narratives from context, cluster them hierarchically, and generate evidence-backed hypotheses. In one case, journalists then interviewed government sources and said they verified a hypothesis that China coordinated anti-Japan criticism after a politician's statement.
"Human journalists took the AI-generated hypotheses, interviewed real-world government sources, and verified the timeline of the coordinated campaign our system uncovered."
Why it matters: This is a more concrete deployment story than a generic "AI for analysis" claim. Sakana explicitly positioned defense and intelligence as a priority area alongside finance, suggesting this style of narrative mapping is moving toward real operational use.
More: sakana.ai/narrative-intelligence#en
Enterprise agent stacks are getting built around permissions, tools, and portability
In an interview, Composio CTO Karan Vaidya described a platform that gives agents access to more than 50,000 tools across 1,000+ apps through a single interface, with managed authentication, just-in-time tool discovery, execution sandboxes, logging, and a feedback loop that rewrites tools or turns agent traces into reusable skills. He said AWS, Zoom, Glean, and Airtable are building core agent products on top of Composio, and highlighted least-privilege controls, hooks for guardrails, SOC2, and self-hosting for enterprise use.
Vaidya also said Composio's three-person team running its internal agent-building pipeline spent about $100,000 last month on tokens—more than human payroll. Separately, swyx wrote that Devin usage has grown more than 50% month over month this year, while arguing that serious enterprise deployment needs safer permissioning than consumer-style shortcuts such as dangerously-skip-permissions.
Why it matters: The live question is shifting from whether agents can call tools to whether companies can trust, audit, and swap models under production conditions. Composio's emphasis on reusable skills and cross-model portability is aimed directly at reducing model lock-in as enterprise rollouts expand.
Reality checks on autonomy
More agents, more tokens, and more initiative are not automatically improvements
The paper Can AI Agents Agree? argues that current agent groups cannot reliably coordinate or reach consensus even in cooperative settings, and that larger groups fail more often by getting stuck or stopping altogether. Gary Marcus summarized the result bluntly: groups of agents do not magically solve the unreliability of individual agents.
"Groups of agents don’t magically sort out the unreliability of individual agents. Instead, they often get stuck."
Nathan Lambert made a similar macro argument in Lossy Self-Improvement, saying AI-assisted development is real but that narrow automatable research, diminishing returns from parallel agents, and organizational bottlenecks make fast takeoff unlikely. On the practitioner side, Jeremy Howard said Opus and Sonnet 4.6 have been too eager to take over instead of letting humans lead, while Martin Casado argued that beyond a baseline, higher token use is inversely correlated with competence in using AI.
Why it matters: These are different critiques, but they converge on the same deployment lesson: adding autonomy does not remove the need for structure, supervision, and guardrails.
Brief notes
- In a blog post, PyTorch announced TorchSpec, framed as speculative-decoding training at scale
- In an announcement, Arc Institute introduced BioReason-Pro, targeting the vast majority of proteins that still lack experimental annotations
Bottom line
The day pointed to a maturing AI stack: stronger workflow integration, clearer enterprise controls, and more concrete human verification loops—but also more evidence that autonomy remains brittle when coordination, oversight, or user control really matter.
Lenny's Podcast
Elon Musk
Most compelling recommendation
Jessica Fain's Switch mention is the strongest save today because it pairs a strong endorsement with a specific framework she still uses: "shrink the change." Her summary: turn an overwhelming initiative into a small experiment or one-week proof of concept, which builds momentum and trust with leaders.
Switch
- Title: Switch
- Content type: Book
- Author/creator: Not specified in the source material
- Link/URL: None provided in the source material
- Who recommended it: Jessica Fain
- Key takeaway: Break intimidating change into smaller experiments instead of asking for a long, uncertain commitment
- Why it matters: Fain says it is one of the only business books she has liked, and she ties it to a concrete mechanism for earning buy-in: smaller bets create momentum and trust
"If something seems scary and overwhelming, how do you make it so much smaller so that it's an experiment, it's a one week proof of concept."
Perspective-building fiction from Jessica Fain
Fain's other recommendations form a coherent cluster: historical fiction as a way to access unfamiliar worlds and experiences.
"Books are windows or mirrors for us... historical fiction... is really a window into someone else's experience."
Pachinko
- Title: Pachinko
- Content type: Book
- Author/creator: Not specified in the source material
- Link/URL: None provided in the source material
- Who recommended it: Jessica Fain
- Key takeaway: A multi-generational story of a Korean family that moves to Japan
- Why it matters: Fain presents it as historical fiction that opens a social world she does not already know well
Homegoing
- Title: Homegoing
- Content type: Book
- Author/creator: Not specified in the source material
- Link/URL: None provided in the source material
- Who recommended it: Jessica Fain
- Key takeaway: The book follows the impact of the slave trade from West Africa through two split branches of a family
- Why it matters: It fits Fain's broader preference for fiction that reveals unfamiliar histories through family-level consequences
History of Burning
- Title: History of Burning
- Content type: Book
- Author/creator: Not specified in the source material
- Link/URL: None provided in the source material
- Who recommended it: Jessica Fain
- Key takeaway: It tells the story of Indian indentured servants brought to Uganda to build the British railroad
- Why it matters: It extends the same perspective-building pattern into a different historical setting that readers may know less well
Additional low-context signal
Elon Musk shared a brief but direct video recommendation, calling it a good explanation of nihilist philosophy.
Video described as a good explanation of nihilist philosophy
- Title: Not specified in the source material
- Content type: Video
- Author/creator: Not specified in the source material
- Link/URL: https://video.twimg.com/amplify_video/2035834974327173120/vid/avc1/1280x720/4HnIR-oivyjo64B0.mp4?tag=21
- Who recommended it: Elon Musk
- Key takeaway: Musk's entire comment is that it is a "good explanation" of the topic
- Why it matters: The context is thin, but it is still a clear pointer to a resource he found useful as an explainer
homesteading, farming, gardening, self sufficiency and country life
Successful Farming
1) Market Movers
- United States - corn planting: Early activity is advancing in the South. Texas, Louisiana, and Mississippi were marked higher, while Arkansas was unchanged. As temperatures warm, planters are expected to move north, and the trade is watching acreage switches and planting delays closely.
- China / global fertilizer trade: GulfSentinel said it now tracks export restrictions and bans, including China’s fertilizer quota, keeping fertilizer availability on the watchlist.
- Farm-economics signal: Prairie Routes argues industrial grain production systems have run out of financial runway and that regenerative management requires shifting away from high-yield chasing toward biodiversity, marginal-land stewardship, riparian areas, and livestock care.
2) Innovation Spotlight
Fujian, China - kelp value-added model
In Lianjiang, harvesting immature kelp at under 1 meter lifted annual sales per mu from more than 10,000 yuan to more than 30,000 yuan. The model then extended value through salted cold storage for up to a year, hotpot-ready frozen product sold at 15 yuan per 50 g bag, and compressed second-harvest kelp that cuts shipping volume to one-tenth and rehydrates in under a minute. The operation also spread processing across seven primary sorting plants and created flexible factory jobs for about 50 local women earning roughly 2,000-3,000 yuan per month. Peak harvest is still highly manual: filling a 3-ton boat can require more than 100 bends and workers reported sleeping only four hours a day.
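A back-of-the-envelope check on the value-added math above. The per-mu sales figures, the 15-yuan 50 g retail bag, and the one-tenth shipping volume are from the source; the per-kilogram price and revenue multiple are derived here for illustration:

```python
# Rough value-added arithmetic for the Lianjiang kelp model.
# Source figures: per-mu annual sales rose from >10,000 to >30,000 yuan;
# hotpot-ready frozen kelp sells at 15 yuan per 50 g bag;
# compression cuts shipping volume to one-tenth.

baseline_sales_per_mu = 10_000   # yuan/year (lower bound, pre-model)
new_sales_per_mu = 30_000        # yuan/year (lower bound, with early harvest)
revenue_multiple = new_sales_per_mu / baseline_sales_per_mu

retail_price_per_kg = 15 / 0.050   # 15 yuan per 50 g bag -> yuan per kg
shipping_volume_ratio = 1 / 10     # compressed vs. uncompressed volume

print(f"Per-mu revenue multiple: {revenue_multiple:.0f}x")
print(f"Implied retail price: {retail_price_per_kg:.0f} yuan/kg")
print(f"Shipping volume after compression: {shipping_volume_ratio:.0%}")
```

The implied ~300 yuan/kg retail price for the hotpot product shows where the margin sits: in processing and packaging, not in raw tonnage.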
Beijing/Pinggu, China - oyster mushrooms with year-one payback
A bag-cultivation system using crushed rice hulls and corn cobs, high-temperature sterilization, and two-end fruiting was cited at about 45 days to full mycelial fill. Each shed held 5,000 bags, with output around 3 jin per bag and gross revenue of 45,000 yuan per season, including about 30,000 yuan net. Harvests come every 15-20 days for 4-5 flushes, and the source said two seasons per year can repay the investment in year one. The critical control point was contamination: fully cover the bag mouth with spawn, use smaller crushed spawn pieces, inoculate in a disinfected enclosure, and remove infected bags quickly.
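The shed economics above reduce to a simple unit-price and payback check. A sketch using only the quoted figures; the implied price per jin is derived, not stated in the source, and the setup cost itself is not given:

```python
# Per-shed oyster mushroom economics, from the cited Pinggu figures.
bags_per_shed = 5_000
yield_per_bag_jin = 3          # jin per bag per season
gross_per_season = 45_000      # yuan
net_per_season = 30_000        # yuan
seasons_per_year = 2

# Derived: what farm-gate price the gross figure implies.
implied_price_per_jin = gross_per_season / (bags_per_shed * yield_per_bag_jin)
annual_net = net_per_season * seasons_per_year

print(f"Implied farm-gate price: {implied_price_per_jin:.1f} yuan/jin")
print(f"Annual net per shed: {annual_net:,} yuan")
# The year-one payback claim implies setup cost per shed is at most
# the annual net (~60,000 yuan); the actual cost is not stated.
```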
Measuring soil function instead of debating labels
Water infiltration testing offers a low-cost, repeatable way to compare land-management systems: drive a small metal ring into the ground at a standard depth, pour in a fixed amount of water, time how long the puddle takes to disappear, and repeat across locations. The case for the method is simple: infiltration directly measures how well a field handles both heavy rain and drought.
"Cheap, replicable, visual, measurable and comparable - water infiltration testing says it all."
3) Regional Developments
- United States - South: Corn planting is advancing in Texas, Louisiana, and Mississippi, with Arkansas flat; the next regional shift is northward as temperatures rise.
- China - Anhui: In Shouxian, spring plowing started after the spring equinox "spring ox" ceremony. Wheat is at the jointing stage and receiving topdress fertilizer under humid, windy conditions. The same area illustrates grain-livestock integration: local corn and rice/wheat stalks are fed as roughage, manure is returned to wheat, corn, and soybean land, and annual meat-cattle output was cited above 20,000 head.
- China - Shanxi: A Datong farm is testing 20 daylily varieties from Shanghai, Hainan, other Chinese regions, and abroad for overwintering in cold, dry conditions. Management responses include extra straw mulching, heavy watering where salt-alkali patches appear, and QR-code tracking for daily field management.
- China - downstream R&D: Early work on fermenting opened daylily flowers reported higher active-component content and improved antioxidant performance, showing how specialty crops are being linked to bioprocessing rather than sold only as fresh raw material.
4) Best Practices
Grains and soil
- Use diverse cover crops and keep living roots in the field where possible. The reported effect is better downward water movement and better moisture holding under both heavy rain and drought.
- Standardize infiltration tests across fields and management systems so drainage and water-holding performance can be compared directly.
- Where soil horizons have recently loosened, one soil specialist said topsoil alone will not stabilize the site; deep-rooted species such as river birch, switchgrass, goldenrod, ninebark, coneflower, black-eyed Susan, and blanket flower were suggested.
Dairy and livestock
- A cited dairy research example reported that milk from cows on industrial TMR contained one-fifth the fat-soluble vitamins and far lower CLA than milk from cows on pasture and green feed.
- In cattle systems, local corn plus rice and wheat stalks can lower roughage costs, while manure can be cycled back into wheat, corn, and soybean fields.
- Adaptive rotational grazing remains a soil-building theme. Ranchers described systems such as "strategic chaos" and "grazing planning" as ways to use cattle to build resilient soils and vibrant grasslands.
- For integrated household poultry systems, one example used food scraps to close the loop between household waste and egg production.
Specialty crop operations
- In oyster mushrooms, contamination prevention starts at inoculation: crush spawn into smaller pieces, seal the full bag opening, work in a disinfected enclosure, and remove contaminated bags before spores spread through the shed.
- Harvest oyster mushrooms around 70-80% maturity, before heavy spore release reduces quality and raises allergy risk; an added water application was said to delay spore release for up to 24 hours.
5) Input Markets
- Fertilizer: GulfSentinel is now tracking export restrictions and bans, including China’s fertilizer quota, making quota policy a live availability watch.
- Feed: In Anhui, local corn and rice/wheat stalks are being used as cattle roughage to lower feed costs. At smaller scale, poultry keepers reported supplementing pellets with chickweed, microgreens, home-grown corn, and garden scraps to reduce purchased-feed dependence and add dietary variety.
- Water-management capital: Tile drainage was described as expensive, incomplete in monocrop systems, and prone to moving nutrients and chemicals into waterways. The lower-cost alternative highlighted here is biological water management through living roots plus simple infiltration testing.
6) Forward Outlook
- United States: The near-term watch is still northward planting progress and whether weather-driven delays force acreage switches.
- Regenerative systems are moving from philosophy to measurement: water infiltration, adaptive grazing, local feed substitution, and manure cycling are being presented as operational resilience tools rather than abstract sustainability claims.
- China spring management: Jointing-stage wheat nutrition, seedling care under cold-dry conditions, and targeted mulching/watering on saline ground remain immediate seasonal tasks.
- Margin expansion is increasingly downstream: The clearest return cases in today’s sources came from processing and handling changes - early kelp harvest plus product diversification, and disciplined mushroom contamination control - not just more raw output.