Hours of research in one daily brief, on your terms.
Tell us what you need to stay on top of. AI agents discover the best sources, monitor them 24/7, and deliver verified daily insights—so you never miss what's important.
Recent briefs
Your time, back.
An AI curator that monitors the web nonstop, lets you control every source and setting, and delivers one verified daily brief.
Save hours
AI monitors connected sources 24/7—YouTube, X, Substack, Reddit, RSS, people's appearances and more—condensing everything into one daily brief.
Full control over the agent
Add/remove sources. Set your agent's focus and style. Auto-embed clips from full episodes and videos. Control exactly how briefs are built.
Verify every claim
Citations link to the original source and the exact span.
Discover sources on autopilot
Your agent discovers relevant channels and profiles based on your goals. You get to decide what to keep.
Multi-media sources
Track YouTube channels, Podcasts, X accounts, Substack, Reddit, and Blogs. Plus, follow people across platforms to catch their appearances.
Private or Public
Create private agents for yourself, publish public ones, and subscribe to agents from others.
Get your briefs in 3 steps
Describe your goal
Tell your AI agent what you want to track using natural language. Choose platforms for auto-discovery (YouTube, X, Substack, Reddit, RSS) or manually add sources later.
Confirm your sources and launch
Your agent finds relevant channels and profiles based on your instructions. Review suggestions, keep what fits, remove what doesn't, add your own. Launch when ready—you can always adjust sources anytime.
Sam Altman
3Blue1Brown
Paul Graham
The Pragmatic Engineer
r/MachineLearning
Naval Ravikant
AI High Signal
Stratechery
Receive verified daily briefs
Get concise, daily updates with precise citations directly in your inbox. You control the focus, style, and length.
Boris Cherny
Armin Ronacher
Salvatore Sanfilippo
🔥 TOP SIGNAL
A strong pattern crystallized today: the best agent wins are benchmarked, not vibes-based. Tobias Lütke used Pi plus pi-autoresearch to run around 120 semi-autonomous experiments against Liquid, landing 93 commits that made parse+render 53% faster and cut allocations 61%. Simon Willison's reusable lesson is the setup: a benchmark script made "make it faster" actionable, and Liquid's 974-test suite made aggressive agent experimentation safe.
🛠️ TOOLS & MODELS
- OpenAI Automations — GA. You can now choose model and reasoning level, run in a worktree or existing branch, and reuse workflows via templates. OpenAI's own examples are recurring repo jobs: daily briefings, issue triage, and PR comment follow-up.
- CursorBench — new eval surface for coding agents. Cursor is publishing intelligence + efficiency scores for agentic coding, and says it combines offline benchmarks with online evals because public benchmarks are increasingly saturated. Jediah Katz frames this as a transparency push around real scores. cursor.com/blog/cursorbench
- Cursor's search stack is now a lot more legible in public. Via Turbopuffer, Cursor embeds the full codebase with a custom embedding model, uses semantic search plus grep, and increasingly fans out parallel queries inside an agent turn. Turbopuffer says the migration cut Cursor's costs 95% and fixed per-user economics.
- OpenClaw 2026.3.11 — behavior change worth checking today. Cron now enforces a stricter cron-owned delivery contract in isolated runs; jobs using delivery.mode='none' while sending ad hoc messages may now go silent. Fix: run openclaw doctor --fix, then move to explicit announce or webhook delivery.
- Gemini API spend caps. Simon Willison calls this immediately useful for CI and agent experiments where the main fear is an accidental bill spike.
- Actual model routing from a daily driver. Theo says he still prefers Claude for a lot of UI work, uses Codex alongside it inside T3 code/terminal workflows, and will spin up Gemini CLI quickly for UI tasks he cannot do in Codex.
💡 WORKFLOWS & TRICKS
Run benchmarked autoresearch, not random refactors.
- Create a prompt file plus a script that runs tests and benchmarks.
- Let the agent propose many micro-optimizations and test them one by one; Tobi's run hit around 120 experiments and 93 commits.
- Persist state in autoresearch.jsonl so the search can keep context across runs.
- Only do this on a repo with strong tests; Liquid had 974 unit tests.
Having a robust test suite is a massive unlock for working with coding agents.
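The loop above can be sketched in a few lines. The file name autoresearch.jsonl comes from the brief, but the helper names and default commands here are placeholders, not pi-autoresearch's actual interface:

```python
import json
import subprocess
import time
from pathlib import Path

STATE = Path("autoresearch.jsonl")  # one JSON object per experiment (name from the brief)

def run(cmd):
    """Run a shell command, return (ok, wall-clock seconds). Placeholder helper."""
    start = time.monotonic()
    result = subprocess.run(cmd, shell=True, capture_output=True)
    return result.returncode == 0, time.monotonic() - start

def record_experiment(name, test_cmd="true", bench_cmd="true"):
    """Gate a proposed change on the test suite, benchmark it, append the result."""
    tests_ok, _ = run(test_cmd)        # strong tests make aggressive experimentation safe
    _, bench_secs = run(bench_cmd)     # the benchmark makes "make it faster" actionable
    entry = {"experiment": name, "tests_ok": tests_ok, "bench_secs": round(bench_secs, 4)}
    with STATE.open("a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry

def history():
    """Reload past experiments so a restarted run keeps full context."""
    if not STATE.exists():
        return []
    return [json.loads(line) for line in STATE.read_text().splitlines() if line]
```

Append-only JSONL is the key design choice: each experiment is one line, so a restarted run can reload the full history before proposing the next micro-optimization.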
Add a no-code-first planning mode.
- Give users a way to ask for step-by-step architecture/planning without immediate code generation.
- Anthropic's implementation was basically one instruction: please don't code.
- This matters because users were already trying to force that behavior through the chat UI by hand.
Use fresh-context subagents, but over-spec the handoff.
- Keep a main agent in the loop and spawn subagents with clean context windows for bugs or research tasks.
- Let several of them work in parallel when the problem is ambiguous.
- Force the final message to return actual findings, not just done; Harrison says bad communication is the failure mode here.
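A minimal sketch of what over-specifying the handoff can look like; the types and field names are invented for illustration, not from any specific harness:

```python
from dataclasses import dataclass, field

@dataclass
class Handoff:
    """Structured report a subagent must return to the parent agent (hypothetical shape)."""
    task: str
    findings: list = field(default_factory=list)       # what was actually learned
    files_touched: list = field(default_factory=list)  # where it looked or edited
    open_questions: list = field(default_factory=list) # what it could not resolve

def accept(report: Handoff) -> bool:
    """Parent-side gate: a bare "done" with no findings is rejected."""
    return bool(report.findings)

def merge(reports):
    """Fan several subagents out in parallel, then keep only reports that communicate."""
    return [r for r in reports if accept(r)]
```

The parent simply refuses reports with empty findings, which operationalizes Harrison's point that communication back to the parent is the real hard part.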
Rebuild your workspace around projects, not panes.
- Put each project in its own sidebar/workspace entry with quick hotkeys.
- Inside that project, keep an agent terminal, a dev server, and a git terminal together.
- Offload long-running agents to SSH/TMUX on another machine so they keep working when the laptop is closed.
- Theo and Armin/Ben are converging on the same idea: less context switching, more parallel threads under light supervision.
👤 PEOPLE TO WATCH
- Tobias Lütke + Simon Willison — best public example today of agentic optimization with a measurable end state: benchmark script, test suite, lots of experiments, concrete win.
- Boris Cherny — high signal because he is sharing actual Claude Code product patterns: plan mode, multi-agent Mama Claude, and a Bitter Lesson-style refusal to overbuild around current model limits.
- Harrison Chase — still one of the clearest explainers of agent harness primitives: prompts, planning, subagents, filesystems, sandboxes, observability, and evals.
- Salvatore Sanfilippo — useful reality check: benchmark passes do not guarantee code you would ship, and operator skill still determines whether AI is a weak assistant or a 10-100x multiplier.
- ThePrimeagen — strongest contrarian take today: fast autocomplete plus skill may improve proficiency without the cognitive debt and codebase drift that full agents can cause.
🎬 WATCH & LISTEN
- 8:59-9:53 — Boris Cherny on plan mode. Great short clip because it shows how a valuable agent feature can come from a tiny harness change: users wanted thinking-first, not a code dump, and Anthropic shipped that behavior fast.
- 13:13-14:33 — Harrison Chase on subagents. Probably the cleanest explanation you'll hear this week of why subagents help and why communication back to the parent agent is the real hard part.
- 13:34-18:23 — Theo on Niri-style work hierarchies for agentic coding. Worth watching if terminal/IDE/browser context switching is frying your brain; this is a concrete sketch of a better project/task layout for Claude Code, Codex, dev servers, and git.
📊 PROJECTS & REPOS
- pi-autoresearch — Pi plugin used in Tobi's Liquid optimization. The signal is that it carried state via autoresearch.jsonl through around 120 experiments and 93 commits in a live performance PR.
- Shopify/liquid PR #2056 — public playbook for benchmarked agent optimization. Read it for concrete wins like String#byteindex, byte-level tag parsing, and cached small-integer strings.
- Seamux — open-source terminal built on LibGhostty that Theo says has already replaced Ghostty as his daily driver because the project/task hierarchy fits parallel agentic work better than TMUX alone.
- OpenClaw — if you automate recurring agent jobs, the 2026.3.11 release is the kind of operational change you want to catch early: stricter cron delivery rules plus a maintainer-provided migration path via doctor --fix.
Editorial take: the edge is moving away from raw model worship and toward measurable objectives, clean context boundaries, and workspaces that let one human supervise many agent threads.
Pushmeet Kohli
Demis Hassabis
Sam Altman
AI moved deeper into high-stakes domains
Microsoft launches Copilot Health; Limbic highlights specialist clinical performance
Microsoft introduced Copilot Health, a private health workspace for U.S. adults that can combine EHR records, lab results, and data from 50+ wearables to generate personalized insights and help users prepare for appointments; Microsoft said connected data stays user-controlled and is not used to train its models.
In a separate healthcare signal, Vinod Khosla pointed to a Nature Medicine study on Limbic Layer, saying it turns frontier LLMs into behavioral-health specialists and that 75% of its AI sessions ranked in the top 10% of human therapist sessions, with its CBT system rated above both human clinicians and the base LLMs.
Why it matters: Health AI is moving along two tracks at once: consumer-facing data integration and more tightly scaffolded, domain-specific systems.
Google puts urban flash-flood forecasting into production and opens the data
Google said it trained a new model to predict flash floods in urban areas up to 24 hours in advance. It also introduced Groundsource, a Gemini-based method that identified more than 2.6 million historical events across 150+ countries, and said the resulting dataset is being open-sourced while forecasts go live in Flood Hub.
Why it matters: This is a concrete example of frontier models being applied to public-safety forecasting rather than only consumer productivity.
Sakana AI moves further into defense
Sakana AI said Japan's Ministry of Defense selected it for a multi-year research contract focused on speeding observation, reporting, information integration, and resource allocation. The company said it will use small vision-language models and autonomous agents on edge devices such as drones, and that defense and intelligence are now a primary focus area alongside finance.
Why it matters: The line between commercial AI research and national-security deployment keeps narrowing, and governments are starting to fund domestic capability directly.
Frontier competition kept tightening
xAI pairs product momentum with an internal reset
According to DesignArena by Arcada Labs, Grok Imagine reached #1 on its Video Arena leaderboard at Elo 1336, with a 69.7% win rate across 15,590 battles; separately, an xAI beta post said Grok 4.20 improved hallucination, instruction following, and output speed over Grok 4.
"xAI was not built right first time around, so is being rebuilt from the foundations up."
Musk also said he and Baris Akis are revisiting earlier hiring decisions and reconnecting with promising candidates.
Why it matters: xAI is signaling two things at once: competitive progress on model performance and a willingness to reorganize its core engineering setup to keep pace.
Altman points to faster adoption in India and argues for "democratic AI"
Sam Altman said Codex usage in India grew 10x over a short period and described Indian startups and large companies as especially aggressive about AI adoption, with customers there seeming "a little further along" than in the U.S.
He also argued that if AI is becoming infrastructure that reshapes the economy and geopolitical power, its rules and limits should be set through democratic processes rather than by companies or governments alone.
"I think that this belongs to the will of the people working through the democratic process."
Why it matters: The competitive map is no longer just about model labs; it is also about where adoption is moving fastest and who gets to set the rules.
Research signal
DeepMind says AlphaEvolve improved five classical Ramsey bounds
Google DeepMind said AlphaEvolve established new lower bounds for five classical Ramsey numbers, a long-standing problem in extremal combinatorics where some previous best results were more than a decade old. Demis Hassabis said the system achieved this by discovering search procedures itself, rather than relying on bespoke human-designed algorithms.
Why it matters: The result extends the AI-for-maths story from solving known tasks toward automating parts of the search procedure itself.
Aakash Gupta
Product Management
Lenny Rachitsky
Big Ideas
1) The scarce skill shifted from building to deciding
Creation is easier; judgment is now the differentiator. Hiten Shah frames the new bottleneck as deciding what should exist, who it is for, what problem it solves, and whether it should be built at all. Rekhi makes the operational PM version: AI has sped up delivery enough that discovery and design can become the constraint, increasing the risk of feature factories if teams stop validating with customers.
The bottleneck moved.
- Why it matters: faster output does not automatically create better products.
- How to apply: raise the bar on problem selection before prompting or prototyping starts, and require a clear statement of what is being built and why.
2) AI should speed discovery, not replace product intuition
Rekhi's core caution is to stay in the loop. PMs still need to read customer feedback, sample interview recordings, and digest research themselves to build intuition and empathy. In his experience, end-to-end workflows that ask AI to synthesize feedback, invent the feature, and write the spec produce poor results; the better pattern is to let AI surface pain points and summaries, then apply human judgment to the solution.
- Why it matters: many PM decisions still happen without full information, so intuition remains a real operating asset.
- How to apply: ask for verbatims alongside summaries, and keep solutioning as a human step.
3) Discovery is becoming a continuous operating system
The notable change in Rekhi's workflow is not one tool but a connected system: continuous surveys, feedback rivers, AI-generated interview guides, interview synthesis, AI-moderated interviews, functional prototypes with analytics, and natural-language data analysis. He describes moving from quarterly NPS work with a marketing team to continuous collection and weekly automated reporting.
- Why it matters: PMs can keep pace with faster engineering without waiting on long research or analytics queues.
- How to apply: replace one-off studies with a recurring loop: collect, synthesize, validate, instrument, and revisit.
4) Some products may need to design for agent-to-agent use
Aakash Gupta argues that many teams still optimize for a single interaction model - human opens app, human types, AI responds - while the next surface may be agent-to-agent, where a user's assistant contacts the product directly. His suggested PM questions shift to what the product exposes to other agents, what permissions it grants, and where it escalates when an AI cannot answer.
- Why it matters: onboarding flows, nudges, and empty states assume a human is watching.
- How to apply: inventory which actions, permissions, and escalation paths would still work if another agent were the caller.
Tactical Playbook
1) Build a continuous signal stack
- Run surveys continuously. In Rekhi's example, Claude calculated an overall NPS of 41 from 2,600 responses, showed 58% promoters, visualized monthly trends, and supported deeper segmentation work.
- Automate the reporting layer. His agent checks the CSV, runs numerical analysis and verbatim themes, then generates both an HTML report and a Gamma presentation.
- Add a feedback river. Tools in this category pull from sources such as App Store, Google Play, G2, Zendesk, cancellation surveys, and NPS, then group feedback into themes with trend lines and counts. In his example, one theme showed 138 complaints about Evernote import issues.
- Keep the customer voice attached. Review exact verbatims and use the linked identities or contact records when you need follow-up conversations.
Why it matters: this turns customer signal from a quarterly project into a weekly operating rhythm.
Start here: automate one recurring survey report, then add one support or review source next.
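For reference, the NPS arithmetic behind a report like that is one line: percent promoters (scores 9-10) minus percent detractors (0-6). The score distribution below is invented, but it matches the 41 NPS / 58% promoters figures quoted above:

```python
def nps(scores):
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6); passives (7-8) ignored."""
    n = len(scores)
    promoters = sum(s >= 9 for s in scores)
    detractors = sum(s <= 6 for s in scores)
    return round(100 * (promoters - detractors) / n)

# Invented distribution: 58% promoters, 25% passives, 17% detractors -> NPS 41
scores = [10] * 58 + [7] * 25 + [3] * 17
```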
2) Upgrade interview work end to end
- Seed AI with strong interviewing principles. Rekhi uses a summary of actionable best practices from The Mom Test plus a research brief to generate better interview guides.
- Transcribe and summarize recordings against a template. His workflow extracts takeaways, pain points, workflow/tools, feature requests, and direct quotes from each interview.
- Ask for cross-interview patterns. The same workflow can cluster themes across a batch and count how often pain points appear across 10 interviews.
- Use AI-moderated interviews when speed matters. Rekhi says these tools can run async concept interviews, probe with follow-up questions, and return summarized results by the next morning.
- Still watch some interviews yourself. He explicitly keeps sampling recordings to build customer and product intuition.
Why it matters: scripting, transcription, and synthesis compress from hours or days into an overnight workflow.
Start here: standardize one interview template before automating anything else.
3) Use prototypes as research instruments
- Build a functional prototype, not just a mockup. Rekhi used Bolt to create a working Ask AI concept.
- Instrument it like a real product. He adds in-product surveys, retention tracking, session replay, and heatmaps through analytics tooling such as PostHog.
- Use synthetic users for fast usability feedback. In his example, synthetic feedback surfaced a hidden entry point, 5-20 second waits, missing example prompts, and the lack of multi-note querying.
- Then let real behavior settle design choices. Heatmaps showed users clicking the Ask AI button in the top right, not the bottom-right floating button.
- Do not confuse synthetic feedback with product-market fit. Rekhi says it is useful for usability-style feedback, not for determining product-market fit.
Why it matters: this gives teams pre-launch signal on usability and return behavior that mockups cannot provide.
Start here: add one survey question and one heatmap before you send a prototype out.
4) Make analytics self-serve - carefully
- Connect AI to your data safely. Rekhi uses MCP or a database dump with read-only access so the model can inspect schema and query data.
- Ask questions in plain English. His workflow reads tables and columns, writes SQL, groups results, and returns charts or recurring dashboards from natural-language prompts.
- Teach the system with examples. His advice is to add real question-and-SQL pairs because the model learns patterns from examples well.
- Document schema quirks explicitly. A simple instruction such as using canonical_source instead of source improved accuracy in his example.
- Audit early, then share. Rekhi says he audited a few dozen queries to build confidence, and recommends sharing the Claude Project so the whole team benefits from the learned context.
Why it matters: PMs can answer more of their own product questions without waiting on a data queue.
Start here: teach one high-value dataset first; do not trust a zero-context setup.
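One way to picture the "teach with examples" step: assemble schema notes and vetted question-and-SQL pairs into a few-shot prompt the model can imitate. Everything below except the canonical_source quirk is a hypothetical stand-in:

```python
SCHEMA_NOTES = [
    # The quirk mentioned in the text; everything else here is invented.
    "Use canonical_source instead of source for attribution queries.",
]

EXAMPLES = [  # vetted question -> SQL pairs (hypothetical tables and columns)
    ("How many active users last week?",
     "SELECT COUNT(DISTINCT user_id) FROM events WHERE ts >= date('now', '-7 days');"),
    ("Top signup channels?",
     "SELECT canonical_source, COUNT(*) FROM signups GROUP BY canonical_source ORDER BY 2 DESC;"),
]

def build_prompt(question: str) -> str:
    """Assemble a few-shot, read-only SQL prompt from notes and example pairs."""
    parts = ["You answer product questions by writing read-only SQL.", "Schema notes:"]
    parts += [f"- {note}" for note in SCHEMA_NOTES]
    for q, sql in EXAMPLES:
        parts += [f"Q: {q}", f"SQL: {sql}"]
    parts += [f"Q: {question}", "SQL:"]
    return "\n".join(parts)
```

Each audited query can be promoted into EXAMPLES, so the shared project accumulates the learned context Rekhi describes.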
Case Studies & Lessons
1) Tesla optimized for the moment of doubt
Tesla's Supercharger spacing is described here as a product decision, not a simple infrastructure rule. The cited explanation is that chargers were placed around where drivers typically hit 15-20% battery - when range anxiety begins and people start doing mental math - rather than at uniform intervals. The deeper move was optimizing for the user's emotional state at a key moment, even though that is harder to put in a dashboard than coverage or utilization.
- Why it worked: it reduced the moment where users begin to doubt the journey.
- How to apply: identify the point in your journey where users start calculating, hesitating, or seeking reassurance, and design around that moment first.
2) Notejoy's Ask AI concept shows layered discovery in practice
Synthetic users first surfaced usability issues in Rekhi's Ask AI prototype: the entry point was not obvious before a note was selected, response time felt too slow, example prompts were missing, and cross-note querying was absent. After that, prototype instrumentation showed that users preferred the top-right Ask AI button over the bottom-right alternative.
- Why it worked: different methods answered different questions - synthetic users for early usability friction, real usage for design preference.
- How to apply: use simulated feedback to narrow what to test, then use real telemetry to settle decisions.
3) LocalMind was interesting, but not frequent enough
Lenny Rachitsky describes LocalMind as an app on top of Foursquare that let people ask questions of users checked into places around the world, such as whether there was a long line at a location. His conclusion was that it solved a real problem, but only occasionally, which made it hard to support as a standalone business.
- Why it matters: novelty and utility do not guarantee repeated use.
- How to apply: pressure-test frequency of need early, not just whether the concept is clever or helpful.
Career Corner
1) Strong PM narratives still center on impact, judgment, and alignment
Lenny's shorthand for the role is impact, collaboration, judgment, and alignment, with coordination close behind. He also frames the PM job as delivering business impact by prioritizing and solving the most impactful business problems, while thinking the way a CEO would think about the product's success.
My words are impact, collaboration, judgment, alignment.
How to use it: in interviews and promotion narratives, explain the business problem, the decision you drove, the alignment work, and the outcome.
2) AI hiring signals are moving from prompts to systems
Aakash Gupta says PM interviews now commonly ask how candidates use AI in their workflow, and the answer interviewers want is a system, not a generic tool mention. His examples of strong setups include custom GPTs for PRD drafts, Claude Projects loaded with company design principles, Gemini Gems for competitive analysis, and Claude Code with context; he estimates roughly two hours of setup for 5+ hours of weekly return.
How to use it: build one reusable system for writing, one for analysis, and one for recurring tasks - and be ready to explain inputs, outputs, and guardrails.
3) Protect the human skills AI can erode
Rekhi's guidance is consistent across research and analytics: use AI to speed production, but keep empathy, pattern recognition, and solution judgment with the PM. He continues to watch interviews and read research himself even after automating much of the process.
How to use it: keep a weekly habit of reviewing raw customer material, not just summaries.
4) Classic PM strengths still compound
Lenny points to organization, a high bar for quality, and succinct communication as the traits that carried over from PM work into creator work. He also describes iterating heavily rather than shipping first drafts, sometimes revising a newsletter post dozens of times.
How to use it: treat memos, specs, and stakeholder updates the way you treat product flows - refine them until the point is easy to grasp.
Tools & Resources
- Feedback rivers: Reforge Insights, Interpret, Craftful, Birdie, Miro Insights, Unwrap, and Productboard for aggregating reviews, support, and survey data into themes and trends.
- Interview synthesis: NotebookLM and Claude for transcribing recordings, summarizing interviews, and finding patterns across batches.
- AI-moderated interviews: Reforge, Listen, Outset, and Maze for asynchronous concept testing with dynamic follow-up questions.
- Synthetic user testing: Reforge and Simile for persona-based usability feedback on prototypes.
- Prototype stack: Bolt for fast functional prototypes; PostHog for surveys, retention, heatmaps, and session replay.
- Analytics stack: Claude Projects or Claude Code plus MCP for natural-language SQL and dashboards, especially when paired with example queries and shared instructions.
- Reporting output: Gamma for auto-generated presentations from recurring analysis workflows.
- Interview prompt seed: The Mom Test as a best-practices source for AI-generated customer interview guides.
Chris Laub
Chamath Palihapitiya
aileenlee
Most compelling recommendation: Xiaoyu Ma and David Patterson on AI inference hardware
This stands out because the endorsement comes with a precise diagnosis and a concrete design agenda. Chamath frames the next AI silicon cycle around "cheap, abundant decode," while the underlying paper argues that inference, especially decode, is constrained by memory bandwidth and memory cost more than raw compute.
"The next phase of AI silicon is all about cheap, abundant decode. Groq was just the appetizer…This paper is a very good guide."
- Title: IEEE Computer 2026 paper by Xiaoyu Ma and David Patterson on AI inference hardware (official title not provided in the source material)
- Content type: Research paper
- Author/creator: Xiaoyu Ma and David Patterson
- Link/URL: Discussion post: https://x.com/ChrisLaubAI/status/2032035780189962292
- Who recommended it: Chamath Palihapitiya
- Key takeaway: GPU FLOPS have outpaced memory bandwidth, HBM costs per GB are rising, and decode is memory-bound; the paper's proposed shifts include high-bandwidth flash, processing-near-memory, 3D memory-logic stacking, and lower-latency interconnects
- Why it matters: The recommendation is paired with a clear thesis about what gets worse from here: MoE models, reasoning chains, multimodal inputs, long context windows, and RAG all increase pressure on inference hardware
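The memory-bound claim is easy to sanity-check with roofline-style arithmetic: in single-stream decode, every generated token must stream the active weights from memory, so bandwidth, not FLOPS, caps tokens per second. The numbers below are illustrative, not from the paper:

```python
def decode_tokens_per_sec(param_bytes: float, mem_bw_bytes: float) -> float:
    """Upper bound on single-stream decode throughput:
    each token must read all active weights from memory once."""
    return mem_bw_bytes / param_bytes

# Illustrative: a 70B-parameter model at 8 bits (~70 GB of weights)
# on ~3.35 TB/s of HBM bandwidth -> roughly 48 tokens/s, regardless of compute.
bound = decode_tokens_per_sec(70e9, 3.35e12)
```

Batching and MoE sparsity change the constants, but the shape of the bound is why the paper pushes on memory bandwidth and cost rather than FLOPS.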
Founder and operator playbooks
Long Strange Trip episode with Kaz Nejatian
- Title: Long Strange Trip episode with Kaz Nejatian of Opendoor (episode title not provided in the source material)
- Content type: Podcast episode
- Author/creator: Brian Halligan / Long Strange Trip
- Link/URL: https://x.com/bhalligan/status/2032132659384840642
- Who recommended it: Keith Rabois
- Key takeaway: Halligan's summary turns the conversation into a compact founder playbook: build "first derivative" businesses, reject inherited defaults, optimize for stewardship over status, write a user manual for yourself, and hold yourself responsible for outcomes rather than process
- Why it matters: The lessons are framed in the context of Kaz Nejatian helping refound a struggling public company in 16 days, giving the abstractions a concrete operating backdrop
"Hold yourself responsible for truth and outcomes, not processes."
The Founder's Dilemmas
- Title: The Founder's Dilemmas
- Content type: Book
- Author/creator: Noam Wasserman
- Link/URL: Source conversation: https://www.youtube.com/watch?v=ZuMctpPgTms
- Who recommended it: Josh Jones
- Key takeaway: Jones highlights the book's distinction between founders trying to "get rich" and founders trying to "be their own boss," and says co-founder teams can break when those motivations do not match
- Why it matters: He describes the book as the product of long-run founder survey data and says he wished he had read it before starting his first company
"People start companies for two reasons. To get rich or to be their own boss."
Kadampa Meditation
- Title: Kadampa Meditation
- Content type: Book
- Author/creator: Geshe Kelsang Gyatso
- Link/URL: Source conversation: https://www.youtube.com/watch?v=ZuMctpPgTms
- Who recommended it: Nihal Mehta
- Key takeaway: Mehta describes the practice as pursuing liberation through the alleviation of human suffering, and says he applies that idea to venture by trying to help people reach their potential
- Why it matters: It is a rare recommendation that explicitly connects inner practice to how an investor thinks about work, service, and happiness
"Serve people on this planet, help them reach their potential, help them alleviate their suffering."
Why today's list is useful
Across very different formats, the common thread is clearer diagnosis. The paper asks what is actually bottlenecking AI systems, the podcast asks what leaders should actually own, The Founder's Dilemmas asks what founders are actually optimizing for, and Kadampa Meditation asks what kind of service sits underneath ambition.
Mustafa Suleyman
Demis Hassabis
Anthropic
Top Stories
Why it matters: The biggest developments this cycle point to four durable themes: retrieval is getting more multimodal and more architecture-sensitive, math remains a serious testbed for machine reasoning, frontier AI is becoming an infrastructure business, and governments are moving AI closer to operational defense systems.
1) Mixedbread raises the bar in multimodal retrieval
Mixedbread introduced Wholembed v3, describing it as a new state-of-the-art retrieval model across all modalities and 100+ languages, with search support for text, audio, images, PDFs, and video. A benchmark comparison discussed in the notes said it beat the two-day-old Gemini Embedding 2 baseline by a median 14% and by as much as 91 points. @lateinteraction attributed the gap to scaling ColBERT and ColPali, and described this late-interaction approach as scoring many small vectors instead of forcing everything into one large dot product.
Impact: Multimodal search is no longer just about putting more file types into one vector space; retrieval architecture itself is becoming a key competitive variable.
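For readers new to late interaction, the scoring rule is small enough to show inline: each query token vector takes its best match (MaxSim) among the document token vectors, and those maxima are summed, rather than pooling each side into one vector for a single dot product. A minimal sketch with plain Python lists:

```python
def maxsim(query_vecs, doc_vecs):
    """ColBERT-style late interaction: each query token vector takes its
    best-matching document token vector (dot product); maxima are summed."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

def single_vector(query_vecs, doc_vecs):
    """Baseline: mean-pool each side into one vector, then one big dot product."""
    pool = lambda vecs: [sum(col) / len(vecs) for col in zip(*vecs)]
    q, d = pool(query_vecs), pool(doc_vecs)
    return sum(x * y for x, y in zip(q, d))
```

The toy case makes the difference visible: two orthogonal query vectors each find their exact match under MaxSim, while pooling blurs both sides before they ever interact.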
2) AI for math is gaining both research wins and financing
Google researchers' Aletheia, powered by Gemini 3 Deep Think, generates, verifies, and revises solutions to difficult mathematical problems. The system has already contributed to research papers and produced several novel solutions to long-standing Erdős problems. Separately, DeepMind's AlphaEvolve established new lower bounds for five classical Ramsey numbers in extremal combinatorics by automatically discovering search procedures that previously required bespoke human-designed algorithms, with some improvements arriving for the first time in 10+ years. On the company side, Axiom raised $200 million at a $1.6B+ valuation to extend its work in formal mathematics into Verified AI.
Impact: Math is becoming both a proving ground for reasoning systems and a commercialization path for verification-focused AI.
3) OpenAI is framing frontier AI as industrial infrastructure
OpenAI said it is scaling compute to tens of gigawatts and rethinking resilient supply chains, AI datacenters, chip, rack, cluster, and WAN design, inference efficiency, and global multi-gigawatt operations. Reporting cited in the notes said this buildout involves lining up trillions of dollars of AI compute and comes with new leadership focused on industrial compute. OpenAI is also hiring for these domains.
Impact: Frontier AI competition is increasingly about who can design, finance, and operate industrial-scale compute systems, not just who can train the next model.
4) Governments are moving AI deeper into defense workflows
Japan's Defense Innovation Technology Institute selected Sakana AI for a multi-year research contract covering observation, reporting, information integration, and resource allocation, using autonomous agents and small vision-language models on edge devices such as drones. Ukraine separately opened millions of annotated combat frames from thousands of missions to partners training AI for autonomous systems.
Impact: Public-sector AI activity is shifting from general interest to operational data pipelines, edge deployment, and command-and-control use cases.
Research & Innovation
Why it matters: This set of papers focused less on bigger models in the abstract and more on how to make reasoning, learning, and inference more efficient in practice.
Probes expose 'performative' reasoning and cut token use
Goodfire AI described a pattern it calls 'Reasoning Theater': models can continue producing chain-of-thought after they have effectively already decided on an answer. Using attention probes, forced answering, and chain-of-thought monitoring on DeepSeek-R1-671B and gpt-oss-120b, the team found that on easier tasks the final answer can often be decoded very early, while on harder GPQA-Diamond-style problems all methods improve at a similar rate, suggesting more genuine reasoning. The practical payoff is confidence-based early exit, which saved 68% of tokens on MMLU and 33% on GPQA-Diamond with little to no accuracy loss in their R1 experiments.
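The early-exit mechanism described above can be sketched in a few lines. This is an illustrative toy, not Goodfire's implementation: `probe_confidence` stands in for an attention-probe readout, and the threshold, chunking, and toy probe are arbitrary assumptions.

```python
# Toy sketch of confidence-based early exit over chain-of-thought.
# A probe scores, after each reasoning chunk, how confident the model
# already is in its final answer; generation stops once that confidence
# clears a threshold, saving the remaining chain-of-thought tokens.

def early_exit_generate(chunks, probe_confidence, threshold=0.9):
    """chunks: iterable of reasoning segments.
    probe_confidence: callable mapping the chunks seen so far to [0, 1]."""
    used = []
    for chunk in chunks:
        used.append(chunk)
        if probe_confidence(used) >= threshold:
            break  # answer is effectively decided; skip the rest
    return used

# Hypothetical probe whose confidence rises with reasoning length.
toy_probe = lambda seen: min(1.0, len(seen) / 4)
consumed = early_exit_generate([f"step {i}" for i in range(10)], toy_probe)
print(len(consumed))  # 4 of 10 chunks consumed before exit
```

On easy inputs a real probe would saturate sooner, which is where token savings like the reported 68% on MMLU would come from.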
OpenClaw-RL turns ordinary agent interactions into training data
OpenClaw-RL trains agents from the next state that follows each action, including user replies, tool outputs, terminal traces, GUI changes, and test results. The framework extracts two kinds of signal at once: scalar rewards via a PRM judge and token-level supervision via hindsight-guided on-policy distillation. In a personalization setup, the combined method improved the score from 0.17 to 0.81 after 16 update steps, outperforming binary RL or OPD alone.
Why it stands out: It treats deployment itself as a learning loop, pushing agent systems toward continuous improvement from real usage instead of periodic offline retraining.
Three efficiency ideas worth tracking
- Adaptive looping + memory banks: A new transformer design lets each block decide when to iteratively refine its hidden state and when to access stored knowledge. Looping improved mathematical reasoning, memory banks helped recover commonsense performance, and the combined system beat an iso-FLOP baseline with three times as many layers on math benchmarks.
- Synthetic pre-pre-training with neural cellular automata: Pre-pre-training transformers on fully synthetic neural cellular automata improved language modeling by up to 6%, sped convergence by 40%, and strengthened downstream reasoning; the authors said it even beat pre-pre-training on natural text.
- LatentMoE for cheaper MoE inference: Nemotron 3's LatentMoE down-projects activations into a smaller latent space before expert routing, reducing both all-to-all communication and expert-weight loading costs, while still showing benchmark gains.
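The LatentMoE saving is easy to see with back-of-envelope arithmetic. The dimensions below are assumptions for illustration, not Nemotron 3's actual widths: routing in a latent space shrinks the all-to-all payload in proportion to latent width over hidden width.

```python
# Rough dispatch-cost comparison for MoE all-to-all communication:
# payload scales with tokens x activation width x bytes per element.

def all_to_all_bytes(tokens, dim, bytes_per_elem=2):  # bf16 activations
    return tokens * dim * bytes_per_elem

tokens = 8192
hidden_dim, latent_dim = 4096, 1024  # hypothetical widths

baseline = all_to_all_bytes(tokens, hidden_dim)  # route full activations
latent = all_to_all_bytes(tokens, latent_dim)    # route down-projected ones
print(latent / baseline)  # 0.25: 4x less routing traffic at these widths
```

The same ratio applies to expert-weight loading when experts operate in the latent space, which is why both costs fall together.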
Products & Launches
Why it matters: Product work is moving from chat-only interfaces toward interactive UI generation, richer media APIs, personal data integration, and workflow-native agents.
Claude turns chat into a lightweight app surface
Anthropic says Claude can now build interactive charts and diagrams directly in chat, in beta on all plans, including free. Follow-on posts in the notes identify the feature as MCP-powered, and outside builders described the resulting generative UI as working very well. It is available at claude.ai.
OpenAI expands the Video API with Sora 2
OpenAI added new Video API capabilities powered by Sora 2, including custom characters and objects, 16:9 and 9:16 exports, clips up to 20 seconds, video continuation, and batch jobs. The features are now available to all developers and are positioned for studios, brands, and developers building campaign creative, storyboards, and user-generated-content workflows.
Microsoft launches Copilot Health
Copilot Health lets users bring EHR records, wearable data, and lab results into a personal profile so Copilot can generate personalized insights and proactive nudges. Microsoft says it can pull data from 50+ wearable devices and 50,000+ U.S. hospitals and health systems, help users prepare for doctor visits, and ground responses in credible sources such as Harvard Health. The company also says user data remains user-controlled and will not be used to train its AI models. It is launching first in the U.S. for adults over 18.
Together AI ships a one-cloud voice stack
Together AI launched a unified setup for real-time voice agents with speech-to-text, the language model, and text-to-speech running on one cloud. The company says this reduces handoffs, hosts Cartesia and Deepgram models natively, lets builders swap models without rebuilding integrations, and unifies billing and deployment.
Perplexity pushes Computer into Pro and Slack workflows
Perplexity Computer is now rolling out to Pro subscribers on web, giving access to 20+ models, prebuilt and custom skills, and hundreds of connectors. Perplexity also added direct Slack support, allowing teams to run Computer from Slack, use channel context in workflows, and sync work back to the web product.
Industry Moves
Why it matters: Funding and strategy updates show where investors and operators believe durable value will sit: verification, retrieval infrastructure, identity assurance, and product execution.
- Axiom raised a $200 million Series A at a $1.6B+ valuation, led by Menlo Ventures, to extend its formal mathematics work into Verified AI.
- Qdrant announced a $50 million Series B to accelerate what it calls composable vector search, arguing that storing embeddings and returning nearest neighbors is already solved and that the harder problem is what comes next in retrieval workflows.
- VeryAI raised $10 million to build infrastructure that distinguishes real humans from bots, deepfakes, and synthetic identities at internet scale.
- Meta delayed release of its Avocado model after internal testing reportedly showed it lagging rival models from Google, OpenAI, and Anthropic in reasoning, coding, and writing.
Policy & Regulation
Why it matters: Governance this cycle showed up as external risk review, direct defense procurement, strategic data sharing by governments, and tighter cost controls around API use.
External review of frontier-model risk reports is getting more formal
Anthropic said it had committed to publishing sabotage risk reports for future frontier models near its AI Safety Level 4 threshold. METR reviewed Anthropic's unredacted sabotage risk report for Claude Opus 4.6 and agreed that catastrophic sabotage risk is very low but not negligible, while also noting disagreements and missing information and commenting on the public redactions. METR said the additional transparency into those redactions was a major improvement in how developers engage outside reviewers.
Defense agencies are becoming direct AI buyers and data providers
Sakana AI's contract from Japan's defense research arm shows formal government procurement of autonomous-agent and edge-VLM systems for defense operations. Ukraine's release of millions of annotated battlefield frames shows a second governance pattern: governments treating real-world operational data as a strategic input for AI development.
Google adds hard spend caps to the Gemini API
Google AI Studio now lets users set project-level spend caps for the Gemini API through a dedicated dashboard. Google also noted that the controls are experimental, may take around 10 minutes to apply, and can still allow overages before taking effect, and that email notifications will be added later.
Quick Takes
Why it matters: These smaller items help fill in where performance is improving, where products are being operationalized, and where practical deployment is getting easier.
- Elicit said its latest systematic-review extraction model reached 98% accuracy, up from 90%, and that the remaining challenge is reliable scaling across thousands of papers; rollout to enterprise users is underway.
- Reka Edge is a 7B vision-language model for latency-sensitive use cases such as real-time video analysis and on-device deployment, with 98ms time to first token and 65% faster throughput than leading 8B models.
- Grok 4.20 Beta pairs a 2M-token context window with lower pricing, high speed, and a low hallucination rate, but still trails the current intelligence frontier and underperforms frontier peers on GDPval-AA.
- Google Maps is getting its biggest upgrade in over a decade, adding Ask Maps for conversational search and Immersive Navigation with vivid 3D route views and route-tradeoff guidance.
- LlamaParse from LlamaIndex applies multimodal reasoning, visual grounding, and self-correction loops to OCR, with 90-95%+ straight-through processing on new document formats without template setup.
- OpenJarvis launched as an open-source framework for on-device personal AI, combining a shared architecture, efficiency metrics such as energy and latency, and self-improvement loops for local assistants.
- Groundsource uses Gemini and Google Maps to turn public reports into a flood-event dataset and now supports urban flash-flood forecasts up to 24 hours ahead in Google's Flood Hub.
Grain Markets and Other Stuff
Successful Farming
Market Movers
United States / global grains: On March 12, May corn settled at $4.65, soybeans at $12.25, Chicago wheat at $6.25, KC wheat at $6.17 1/4, and spring wheat at $6.43. Sources tied the move mainly to stronger crude, Middle East risk, and soybean-oil-led biofuel buying; soybean oil was up about 44% from its December 2025 low and crush margins were near multi-year highs.
United States / China-Brazil soybeans: Soybeans pushed to new yearly highs and two-year highs as Cargill paused some Brazilian exports to China after stricter pest and weed inspections, and as U.S.-China trade talks kept soybeans at the center of market chatter. Chinese soybean and corn prices also made new contract highs, while one analyst estimated funds are net long more than 1 billion bushels of soybeans.
Soybean risk: The bullish case is not uncontested. One analyst said Brazil is still harvesting roughly 180 MMT of soybeans at about a $60/ton discount to U.S. beans, and another warned that without confirmed Chinese buying, nearby soybeans may be about $0.30 too high and new crop about $0.40 too high.
United States / corn and livestock: Corn joined the rally largely on sympathetic trade with soybean oil and crude, while farmer selling has been heavy enough that one market commentator said corn sales may be moving toward 60% sold. In cattle, cash traded around $235-236, above April futures near $228-230, while choice cutout approached $400 and supplies stayed tight despite more overweight cattle entering the market. In hogs, the cash index sat around $90-91 against $110-112 summer futures, and pork cutout was still described as weak.
United States / biofuels: Weekly ethanol production rose to an eight-week high of 1.13 million barrels per day, stocks fell to 25.58 million barrels, and Corn Belt ethanol margins stayed positive at roughly $0.10-$0.35 per gallon.
Innovation Spotlight
- Iowa / nitrogen ROI: Prevost Farms used 12-row replicated strips, 4-5 reps, and 700-1,300-foot trial lengths to compare fall hog manure alone against manure plus a 40-50 lb/acre sidedress pass of 32% UAN. The sidedress added about 5-6 bu/acre, but returns were weak: negative in 2023, about break-even in 2024, and roughly +$12 in 2025. The farm has dropped routine sidedressing, is lowering manure rates without yield loss, and is moving toward variable-rate manure.
"Variable rate manure is the future."
After 12 years of 100% no-till and cover crops, the same farm reported better water infiltration and improving soil health.
- Brazil / soil-first soy systems: In Chapecó, soybean producers said nearly 30 years of no-till plus winter cover crops lifted yields from roughly 30-50 sacks/ha to 75-90+ sacks/ha. They also linked the system to better soil quality and more resilience to weather swings, though they emphasized the early years involved difficult seeding and management adjustments.
- Machinery access: At Expodireto Cotrijal, SLC Máquinas presented an S4 harvester for small and medium farms, an S7 harvester that maps terrain 8.5 meters ahead and adjusts speed 3.6 seconds before reaching the crop, and a 1025E sprayer with 24, 27, or 30 meter booms and 1.60 meters of clearance. The company also said retrofit kits are making data-driven machinery more accessible across farm sizes.
- Trait pipeline: New soybean traits aimed at soybean cyst nematode and resistant weeds are positioned to protect yield and widen future control options, while Syngenta's DuraStack triple-Bt corn-rootworm package is targeted for the 2027 season.
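The sidedress ROI from the nitrogen trials above reduces to extra yield revenue versus pass cost. The sketch below uses assumed prices (corn, nitrogen per pound, application cost) purely for illustration; none of them are figures reported by the farm.

```python
# Sidedress return per acre = extra yield revenue - (N cost + application cost).

def sidedress_return(extra_bu, corn_price, lb_n, n_price_per_lb, app_cost):
    revenue = extra_bu * corn_price          # added bushels x corn price
    cost = lb_n * n_price_per_lb + app_cost  # nitrogen plus the trip cost
    return revenue - cost

# ~5.5 bu/acre response to a 45 lb/acre pass, with assumed prices.
net = sidedress_return(extra_bu=5.5, corn_price=4.65,
                       lb_n=45, n_price_per_lb=0.55, app_cost=10.0)
print(round(net, 2))  # negative at these prices: the pass loses money
```

At higher corn prices or cheaper nitrogen the same yield response flips positive, which is consistent with the farm's mixed 2023-2025 results.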
Regional Developments
Brazil / Mato Grosso: Soy harvest reached 89% and corn planting 93%, but diesel prices averaged about R$7.47/L versus R$5.80/L at the end of February, with reports as high as R$9.39/L for S10 in Alta Floresta. IMEA also warned that urea prices are up more than 30%, while fertilizer purchases were a little under 6% of planned volume, about 7 percentage points behind last year.
Brazil / Matupá logistics: In Matupá, more than 1,900 mm of rain fell in January-February, including 240 mm in one day. Producers estimated soybean losses of 5-10% across the region and 30-40% on some farms, with 8-10 bags/ha already lost from 75-80 bag/ha expectations. Road damage on MT-322 and flood risk on BR-163 are compounding freight and storage problems.
Brazil / exports: Brazilian agribusiness exports exceeded US$12 billion in February, up 7.4% year over year on 9% higher volume and representing 45.8% of total Brazilian exports for the month.
Brazil / animal protein: Repeated power disruptions in Paraná are now a production issue, not just a utility issue. A recent outage killed about 24,000 broilers, another recent event caused losses of 900,000 tilapia worth more than R$9 million, and one industry source put statewide losses in the hundreds of millions of reais. Producers said nearly all farms have generators, but voltage fluctuations can burn the control panels needed to activate them.
Best Practices
Corn / manure placement: The Iowa system plants green into rye using RTK-guided 20-inch skip rows so fall hog manure can be placed where the next corn row will go. The farm used replicated strips before changing its whole-farm nitrogen program.
Soybeans / cover-crop weed control: That same farm now plants soybeans as early as possible in April into living rye and often delays termination until late May or early June. The extra biomass improved weed control enough to eliminate an early herbicide pass, while soybean populations were kept around 150,000-160,000 plants per acre.
Integrated crop-livestock: On marginal acres, rye can serve as a cash crop and a forage bridge. Prevost Farms harvests rye grain, sells certified straw for erosion control, then seeds a 15-way cover mix for winter grazing; 150 beef cows required only eight hay bales for the winter under that system.
Dryland cotton / know your context: On a 4,000-acre organic farm in West Texas, multi-species covers are interseeded into cotton in September, followed by shallow incorporation for seed-soil contact and post-harvest grazing with stocker cattle. The operator's point was not to force a system beyond its rainfall limits: dryland cotton yields there have been flat for 20-30 years, recent droughts wiped out harvest on rainfed acres, and some land has been returned to grass.
Animal protein facilities / backup power: Paraná producers said even farms with generators remained exposed when voltage fluctuations burned the panels that trigger backup systems, suggesting that resilience planning needs panel protection and not just generator capacity.
Cattle feed formulation: A Kentucky backgrounding program uses 16% starter pellets with soy hulls and cottonseed hulls, reserves medicated feed for high-risk calves, adds mineral, and after grass turns on feeds about 4 lb/head every third day so pasture provides most of the gain.
Input Markets
Fertilizer: Price pressure remains broad. U.S. commentary linked the squeeze to Hormuz disruption just ahead of peak spring demand. In Brazil, analysts said fertilizer has jumped again after rising through January-March, and one source put fertilizers at roughly 30-40% of crop production cost. In Mato Grosso, urea alone is up more than 30%. In Turkey, the Trade Ministry halted transit and re-export of stored urea because of rising global supply and price risks.
Diesel: Brazil still imports about 25-30% of its diesel, leaving farm fuel exposed to external shocks. The federal government zeroed PIS/COFINS on imported diesel, a move officials said cuts distributor prices by R$0.64/L, and one analysis described an additional 30-centavo/liter subsidy. Even so, producers in places such as Água Boa and Rio Grande do Sul reported rationing, queues, or delivery delays during harvest.
Biodiesel policy: The meeting to discuss raising Brazil's biodiesel blend from 15% to 17% was delayed to next week, even as the industry said it has capacity to produce 16 billion liters per year versus roughly 10 billion currently.
Crop protection and equipment: Syngenta's DuraStack triple-Bt corn-rootworm stack is slated for the 2027 season, and presenters pegged corn rootworm losses at up to $1 billion per year. Used planter prices were also reported to be easing ahead of planting.
Forward Outlook
- Selling discipline: Market advisors are leaning toward incremental sales rather than all-or-nothing calls. One recommended selling 5-10% of remaining grain on rally days such as 7-8 cent moves in corn or 15-20 cent moves in soybeans.
"Don't ignore this rally."
- U.S. ethanol planning: Long term, the domestic ethanol market still looks challenged. Scott Irwin said U.S. ethanol demand is likely to decline over the next decade as gasoline use falls, with losses by 2035 ranging from about 650 million to more than 2 billion gallons under fixed 10.5% blend assumptions.
Offsetting that, exports topped 2 billion gallons in 2025, Japan's 10% gasoline goal implies more than 1 billion gallons of ethanol demand it cannot produce, and a Midwest corn ethanol plant is estimated to qualify for about $0.11/gal under 45Z in 2026 without CCS, rising to about $0.53/gal with 50% CCS and close to $1/gal at full sequestration. Irwin said the 11-cent credit is close to the historical average profit of an Iowa ethanol plant and helps explain current expansion announcements. Without a U.S. mandate, he said SAF is likely to remain a small niche market.
- E15 adoption: The summer RVP waiver has not yet been made permanent, and even with a permanent waiver E15 growth was expected to be slow because blendstocks, pipeline logistics, and retail pumps are not yet optimized for a full transition. Each 0.1 percentage-point increase in the blend rate adds about 130 million gallons of U.S. ethanol demand.
- Weather: Illinois' extreme drought rating improved from 13% to 2%, but dry, windy weather was also flagged from North Dakota into north Texas, and producers in Minnesota, Nebraska, and the western U.S. described short snow and moisture profiles heading toward planting. In Brazil, more than 100 mm of rain is expected in parts of center-south Mato Grosso, which may slow corn fieldwork, while western Bahia is expected to get stronger rains from March 18.
- Risk management window: U.S. crop insurance signup is only days away. Current program changes include higher premium subsidies, the ability to pair ARC with SCO up to 90% coverage, and additional subsidy for beginning farmers, making the deadline more consequential than in a typical year.
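The blend-rate sensitivity quoted above implies an underlying U.S. gasoline pool of roughly 130 billion gallons (130 million gallons per 0.1 percentage point); that pool size is an inference for illustration, not a figure from the source.

```python
# Ethanol demand added by raising the blend rate on a fixed gasoline pool.

GASOLINE_POOL_GAL = 130e9  # assumed annual U.S. gasoline consumption

def added_ethanol_demand_gal(blend_rate_increase_pp):
    """blend_rate_increase_pp: blend-rate increase in percentage points."""
    return GASOLINE_POOL_GAL * (blend_rate_increase_pp / 100)

print(added_ethanol_demand_gal(0.1) / 1e6)  # ~130 million gallons
# A full E10 -> E15 shift (5 pp) on this pool would add ~6.5 billion gallons.
```

The same arithmetic shows why slow E15 adoption matters: small blend-rate moves translate into hundreds of millions of gallons of demand.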
Discover agents
Subscribe to public agents from the community or create your own—private for yourself or public to share.
Coding Agents Alpha Tracker
Daily high-signal briefing on coding agents: how top engineers use them, the best workflows, productivity tips, high-leverage tricks, leading tools/models/systems, and the people leaking the most alpha. Built for developers who want to stay at the cutting edge without drowning in noise.
AI in EdTech Weekly
Weekly intelligence briefing on how artificial intelligence and technology are transforming education and learning - covering AI tutors, adaptive learning, online platforms, policy developments, and the researchers shaping how people learn.
Bitcoin Payment Adoption Tracker
Monitors Bitcoin adoption as a payment medium and currency worldwide, tracking merchant acceptance, payment infrastructure, regulatory developments, and transaction usage metrics
AI News Digest
Daily curated digest of significant AI developments including major announcements, research breakthroughs, policy changes, and industry moves
Global Agricultural Developments
Tracks farming innovations, best practices, commodity trends, and global market dynamics across grains, livestock, dairy, and agricultural inputs
Recommended Reading from Tech Founders
Tracks and curates reading recommendations from prominent tech founders and investors across podcasts, interviews, and social media