ZeroNoise

VC Tech Radar

Live Daily at 7:00 AM Agent time: 8:00 AM GMT+01:00 – Europe / London

by avergin 120 sources

Daily AI news, startup funding, and emerging teams shaping the future

Trajectory’s $15M Round, Robotics Simulation Progress, and New Vertical-AI Wedges
May 28
5 min read
921 docs
Machine Learning
sarah guo
Ronak Malde
+6
Trajectory’s continual-learning raise leads this investor brief, followed by emerging YC teams in compute, finance, labor, and nuclear, plus technical signals from robotics simulation, portable MoE inference, and AI-search infrastructure. The broader pattern is value moving toward vertical scaffolding, open-source positioning, and new distribution surfaces such as AI search.

Funding & Deals

  • Trajectory raised $15M from Conviction, Bessemer, Radical, Jeff Dean, Fei-Fei Li, and others. The company is building a continual-learning platform that uses product-usage signals to continuously post-train agentic models; its research team comes from DeepMind, OpenAI, Apple, Meta Superintelligence, Amazon AGI, and Scale AI, with product talent from Stripe and Figma, and it says partners including Clay, Harvey, Decagon, Mercor, and Rogo are already using the system, with some deployments in production

  • TechCrunch’s Equity podcast surfaced Scrunch’s capital path: a $4M seed in March and a $15M Series A in July. The company pairs AI-search visibility analytics with an "agent experience" product that strips non-semantic page elements and serves bot-optimized versions at the edge

Emerging Teams

  • InfinityOS / ProjectX_Cloud is a web-based OS that lets any device run Windows or Linux desktop apps, each with its own GPU but inside the same workspace and filesystem. YC’s demo claims included a Chromebook running Isaac Sim, an iPad rendering Blender in 4K, and a phone running 10 parallel agents; founders are Rounacc, Bishal, and runallapps

  • KelAI is an autonomous research engine for hedge funds and institutional investors, built by an experienced quant PM, that runs idea generation, validation, monitoring, and feedback in one agentic system

  • Apollo Atomics is building compact nuclear reactors with less than 24-month deployment timelines. Its wedge is a modified pressurized-water design that flips the steam generator to make the plant an order of magnitude smaller without reducing power; founders are Assil Halimi and Drew

  • Rentahuman and Eden both point toward AI-native labor models in the physical world. Rentahuman lets AI agents communicate with and pay humans for real-world tasks and frames the mission around creating jobs and coordinating workers at global scale; Eden launched Eden I, an industrial semi-humanoid robot available for hourly hire

AI & Tech Breakthroughs

  • Genesis World 1.0 is an open-sourced robotics simulation stack aimed at turning physical-world iteration into a compute problem. The release says one hour of real testing can become 100 days of simulation and describes a rebuilt stack including a GPU-accelerated cross-platform compiler, penetration-free multi-physics contact solvers, unified rigid and deformable physics, a photo-realistic Nyx renderer, and the Quadrants engine, which the team says delivers 10x faster launch and up to 4.6x runtime versus the prior Genesis release; it also reports near real-time dexterous manipulation across multiple embodiments with a lower sim-to-real gap

  • TritonMoE is a portable MoE inference kernel worth watching. A new preprint describes it as written entirely in OpenAI Triton for NVIDIA and AMD portability without vendor-specific code; the authors report a fused gate+up GEMM that eliminates 35% of global memory traffic, 89-131% of Megablocks throughput at inference batch sizes up to 512 tokens on A100, identical execution on MI300X, and limitations at 2048+ tokens or with 64+ experts under extreme routing skew

  • Scrunch’s "agent experience" layer reframes technical SEO for LLMs. The company says key pages can be reduced by 98-99% in token count—for example from roughly 100k tokens to 1-2k—by stripping code, JavaScript, image tags, and other non-semantic content, then serving the optimized version only to bots while humans keep the normal page
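The Scrunch item above describes stripping non-semantic markup before serving a page to bots. A minimal sketch of that idea, using only Python's standard library, is below. It is an illustration under assumptions, not Scrunch's implementation: the `SKIP_TAGS` list, `strip_for_bots`, and `reduction_pct` are hypothetical names, and bot detection and edge serving are not shown.

```python
from html.parser import HTMLParser

# Assumed list of tags whose contents carry no semantic text for a model.
SKIP_TAGS = {"script", "style", "noscript", "svg"}


class TextOnly(HTMLParser):
    """Collects visible text; markup, scripts, styles, and image tags fall away."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self.skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def strip_for_bots(html: str) -> str:
    """Return the text content of a page, one text run per line."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def reduction_pct(before_tokens: int, after_tokens: int) -> float:
    """Percentage of tokens removed."""
    return 100.0 * (before_tokens - after_tokens) / before_tokens


# The quoted range: roughly 100k tokens down to 1-2k.
print(reduction_pct(100_000, 2_000))  # 98.0
print(reduction_pct(100_000, 1_000))  # 99.0
```

The arithmetic at the end matches the claim as stated: going from about 100k tokens to 1-2k is a 98-99% reduction.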

Market Signals

  • Startup formation and monetization look faster than the last SaaS cycle. A circulated Stripe datapoint set claimed new business creation was up 2x YoY in March, 20% of startups charged their first customer within 30 days versus 8% in 2020, time to $1M/$10M/$100M ARR is down about 2x, and average revenue per business is still up despite 2x more company creation

  • The app-layer thesis is shifting toward vertical scaffolding, not just better models. a16z argues that in complex verticals, value comes less from raw model capability than from the surrounding infrastructure that makes output trustworthy, compliant, and operational; horizontal categories like code generation improve directly with pre-training spend, while vertical problems require industry-specific systems

  • Founder edge may be moving toward domain expertise and distribution. Another circulated take argued for vertical SaaS in under-softwared industries, said domain experts can outperform traditional Silicon Valley pedigrees, and pointed to a teacher-built MagicSchool AI as an example of how quickly AI-native companies can scale

"the moat in software was never the code
it was always coordination, distribution, and knowing what to build"

  • AI search is becoming a distinct distribution surface. Scrunch says many customer sites now see more AI bot traffic than human traffic and that the gap is widening month over month; it also says AI referrals convert 400% higher than traditional organic search, while the broader backdrop is rapid growth in Google AI Overviews, AI Mode, Gemini, and longer open-ended queries

  • Open source positioning remains a real wedge in AI dev tools. Pragmatic Engineer reports OpenCode rose from roughly 650k MAU to nearly 8M and almost 1M DAU in a few months, argues open source positioning was a major reason it captured the category, and also flags GPU demand as a system-wide bottleneck

Worth Your Time

TechCrunch Equity on AI search: watch here. Useful background on how Google AI Overviews, AI Mode, and Gemini are changing discovery flows, plus a founder-level explanation of bot-optimized pages

Trajectory announcement thread: X post for the clearest primary source on the round, team pedigree, and early partner list

OpenCode episode: Pragmatic Engineer, if you want a tighter view on open-source positioning, GPU bottlenecks, and the shift toward AI coding agents

TritonMoE: paper and code for a concise read on cross-platform MoE inference portability

InfinityOS and Apollo Atomics: InfinityOS launch and Apollo Atomics launch for quick primary-source product pages

Perceptic’s $12M Biopharma Round, Lucis’ Series A, and AI’s Shift Toward Trust Layers
May 27
5 min read
799 docs
r/SideProject - A community for sharing side projects
Serena Ge (Datacurve)
Eric Wu
+9
Two health-focused AI financings lead this brief, alongside early traction in agent infrastructure, enterprise workflow automation, and regulated operations. The broader pattern is AI moving toward trusted deployment in compliance and business systems while compute, memory, and inference speed remain hard constraints.

Funding & Deals

  • Perceptic announced a $12M round led by Air Street and Accel, with participation from Elder Gull and angels from AI labs, to operationalize frontier AI systems for biopharma. The founding team includes CEO Tilman after a seven-year stint at Palantir, plus Palantir AIP veterans Martin Copes and Zaki Trache. The company says it is already live in top-20 pharma accounts speeding up critical workflows, and Nathan Benaich framed the timing around biopharma’s increased appetite for frontier AI and AI labs turning more attention toward science.
  • Lucis raised a $20M Series A to expand its AI-driven preventive health platform in Europe. YC says the company has served 10,000+ customers across France, the UK, Ireland, and Portugal, delivered 1M+ biomarker tests, and that 75% of users who completed a six-month follow-up improved three or more markers without medication.

Emerging Teams

  • Superset is an open-source IDE for developers to run hundreds of agents in parallel. YC says it has grown 30% week over week for the last four months and is helping engineers ship 10x more PRs.
  • Appnigma is attacking Salesforce implementation pain by generating native managed packages from natural language. The founders say it compresses 3-6 months of development into days and outputs 100% native Salesforce metadata. They are ex-Salesforce, including 3.5 years on the AppExchange review team, and say customers include Pylon, Warmly, and UserEvidence, with roughly 85% of customers being YC companies.
  • Alchemize is building an AI-native customs brokerage that promises importers real-time regulatory clarity and shipment clearance in minutes instead of days. YC highlighted the launch from founders Samuel and Robert.
  • NavigateAI gives Eric Wu a new wedge in labor-constrained physical work. The company says it wants to give every field worker an AI copilot to address skilled-worker shortages so crews can build faster and better. Keith Rabois amplified the launch, calling it “Super cool” and emphasizing faster, lower-cost building.

AI & Tech Breakthroughs

  • DeepSWE is positioned as a new benchmark for agentic coding. Its creators say it shows where top models actually diverge in realistic developer workflows, and Garry Tan called it the new standard for engineering evals.
  • GBrain + ActiveGraphAI outline a more auditable agent architecture. The system pairs a durable markdown/git knowledge substrate with an event-sourced runtime, then wraps operations as typed events, projects retrieved evidence into graph state with stable citation tokens, adds an explicit propose/approve/apply flow, and forks before mutation so agents can trace recommendations to specific evidence and runs.
  • StableBrowse is a concrete attempt to improve agent browsing economics. YC says its browser layer lets agents navigate the web with 70% fewer tokens and 3-4x faster execution.
  • YourMemory is a notable open-source experiment in persistent AI memory. Its author says the system adds time-aware retrieval and memory decay based on a modified Ebbinghaus forgetting curve, tested it on LongMemEval, and runs it locally with a visualization dashboard.
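For readers unfamiliar with the forgetting-curve idea behind the YourMemory item, the sketch below shows one way time-aware retrieval could work. It uses the textbook Ebbinghaus form R = exp(-t / S); the source does not describe YourMemory's modification, so the `stability_days` parameter, the similarity-times-retention score, and the function names are all assumptions for illustration.

```python
import math


def retention(age_days: float, stability_days: float) -> float:
    """Textbook Ebbinghaus-style retention: R = exp(-t / S)."""
    return math.exp(-age_days / stability_days)


def rank_memories(memories, now_day, stability_days=30.0):
    """Rank memories by similarity weighted by retention.

    memories: list of (text, similarity, created_day) tuples.
    Returns the texts, best decayed score first.
    """
    scored = [
        (similarity * retention(now_day - created_day, stability_days), text)
        for text, similarity, created_day in memories
    ]
    scored.sort(reverse=True)
    return [text for _, text in scored]


# An older memory with a slightly better match loses to a recent one.
memories = [("old preference", 0.9, 0), ("recent preference", 0.8, 90)]
print(rank_memories(memories, now_day=100))
```

The design point is that recency and relevance are combined into one score, so stale memories fade out of retrieval without being deleted.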

Market Signals

  • Compute still looks supply-constrained rather than speculative. In a 20VC interview, Cerebras’ CEO said data centers cannot be built fast enough, the company has a $25B backlog, and Nvidia, AMD, and others also have backlogs. He argued the industry is building behind demand, not ahead of it. In the same interview, he said HBM shortages could last several years because fab capacity is lumpy and slow to add, and that electricity may become the longer-term limiting factor.
  • Compliance is starting to look like a real AI software category. a16z says AI may finally be moving from good enough to pilot to good enough to trust in compliance, noting that many LLMs now score 80-100% on LegalBench’s 162 legal reasoning tasks. The firm argues compliance is essentially applied legal reasoning and affects every dollar moving through a business.

“As AI clears the ‘good enough to trust’ bar and sales cycles speed up, there may finally be an opening for startups.”

  • AI is beginning to replace legacy SaaS, not just sit on top of it. Harry Stebbings highlighted a CEO saying the company replaced a $600K Salesforce contract with a vibe-coded CRM built in three weeks, expects to eliminate 80% of its internal SaaS stack, and would not change Anthropic usage even if pricing doubled.
  • On-device inference is moving from theory to production design choice in B2B SaaS. A founding team at an on-device AI infrastructure company says its system handles 60-80% of requests locally with cloud fallback, especially for bounded tasks like transcription, summarization, and classification on modern hardware.
  • Agent governance is hardening into an infrastructure layer. GBrain + ActiveGraphAI describe replay, stable citations, controlled write-back, and fork-before-mutation workflows. Synapsor is built around governed memory, staged writes, replay, permissions, and audit trails for agents touching business systems. Nexus Synapse makes a similar argument that runtime governance, not just the model, is the missing layer around AI systems.
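The governance pattern in the last item (staged writes, an explicit approval step, fork-before-mutation, and an audit trail) can be sketched in a few lines. This is a generic illustration, not the API of GBrain, ActiveGraphAI, Synapsor, or Nexus Synapse; `GovernedStore` and its method names are hypothetical.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class GovernedStore:
    state: dict = field(default_factory=dict)
    events: list = field(default_factory=list)   # append-only audit trail
    pending: dict = field(default_factory=dict)  # proposal_id -> (fork, evidence)

    def propose(self, proposal_id, key, value, evidence):
        """Stage a change on a fork; live state is untouched."""
        fork = copy.deepcopy(self.state)  # fork before mutation
        fork[key] = value
        self.pending[proposal_id] = (fork, evidence)
        self.events.append(("proposed", proposal_id, key, evidence))

    def approve_and_apply(self, proposal_id, approver):
        """Promote the fork to live state and record who approved it."""
        fork, evidence = self.pending.pop(proposal_id)
        self.state = fork
        self.events.append(("applied", proposal_id, approver, evidence))


store = GovernedStore()
store.propose("p1", "account_owner", "alice", evidence="doc#12")
print(store.state)   # {} until a human approves
store.approve_and_apply("p1", approver="bob")
print(store.state)   # {'account_owner': 'alice'}
```

Each applied change stays traceable to its evidence and approver through the event list. A real system would also need to merge or reject concurrent proposals, which this sketch does not handle.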

Worth Your Time

  • 20VC x Cerebras CEO: useful primary source on AI infrastructure bottlenecks, HBM scarcity, and the argument that slow inference has no real market. YouTube interview
  • a16z on compliance: useful if you are mapping where AI may move from pilots into trusted production software. Essay
  • DeepSWE release thread: worth reviewing if you diligence coding agents or developer infra and want a benchmark tied to realistic workflows. X thread
  • Superset launch page: useful background on an open-source IDE that YC says has grown 30% week over week for four months. YC launch
  • GBrain x ActiveGraphAI thread: useful diligence material if you care about evidence-linked, replayable agent systems. X thread
Campfire’s Series B, BioStack’s Revenue Jump, and Verification as AI’s Next Layer
May 26
4 min read
724 docs
Machine Learning
Artificial Intelligence (AI)
Sathya
+9
Campfire provides the clearest financing signal, while BioStack, Callab_AI, and Mount show where early-stage AI companies are finding product wedges. Across the set, the stronger pattern is a move toward verifiable AI, local inference, and control layers around autonomous systems.

Funding & Deals

  • Campfire recently raised a Series B led by Accel and Ribbit Capital. The thesis is an AI-native ERP for high-growth tech companies that automates accounting, taxes, and investor reporting; YC also said Campfire has more than doubled ARR each quarter since Q4 2024 and now has 100+ employees after closing a $35 million Series A in June 2025 with 12 people.
  • Conifer says it has funding to build an open-source local inference runtime for Apple Silicon. The five-person Princeton team is building it in Rust with handwritten kernels, says it is ahead of llama/mlx on small models, and is using a 100-user free beta to surface bugs and tool needs.

Emerging Teams

  • BioStack is the strongest early traction signal in the set. It builds simulation environments where healthcare AI models practice on real clinical data, converting messy records, lab tests, notes, and long-horizon outcomes into data, evals, rewards, and benchmarks; YC said revenue moved from six figures to seven figures in just the last few weeks. YC identified the founders as @sanatmishra7 and @patwa_parth.
  • Callab_AI is attacking a large legacy-integration wedge. The company connects AI voice agents directly to on-prem PBX systems such as Cisco UCM and Mitel, avoiding migration in a market where 58% of the $400B call center industry still runs on-prem. YC identified the founders as @haithemkchaou and @chehir_dh.
  • Mount is notable because it turns AI-agent risk into an insurable product. Its pitch is to secure autonomous-agent workflows, measure residual risk, and transfer that risk through insurance built specifically for AI agents so companies can use agents without carrying the full downside alone. YC identified the founders as @johnbachm and @fabeamherd.

AI & Tech Breakthroughs

  • Delta Attention Residuals is the clearest research signal in this batch. Instead of routing over cumulative hidden states, it routes over deltas, which the authors say avoids routing collapse in deep layers and produces 1.8x sharper cross-layer routing. Reported results include 1.7-8.2% lower validation PPL from 220M to 7.6B, drop-in fine-tuning of pretrained models that beats baseline on 8 benchmarks, and 0.008% parameter overhead at 8B.
  • Small local models are getting more practical. Garry Tan said Qwen2.5-7B Instruct is at GPT-3.5-turbo level and argued that even if local models are not the default, every device will need one as a fallback when connectivity fails. Conifer is building toward that future with a runtime for fully local agents that can access files and apps under OS kernel enforcement.
  • AI-guided gene editing remains a frontier category. Nathan Benaich highlighted ProfluentBio's work on designing large gene insertions and fine-scale editing with AI.

Market Signals

  • Verification, governance, and risk transfer are emerging as a distinct AI layer. Vinod Khosla called autoformalization the next critical frontier and said founders should work on areas where AI is weak. In the same direction, Orygent is building a governed enterprise work layer around trust, approvals, audit trails, role-based authority, and verifiable AI, while Mount focuses on security plus insurance for autonomous agents.
  • Local inference is shifting from niche feature to resilience layer. Garry Tan said local models will not be the default, but every device will need one as an emergency generator when connectivity drops. Conifer's funding and beta around Apple Silicon local inference is one startup expression of that view.
  • The software labor debate is becoming more explicit. Bindu Reddy argued that engineers are producing 10-100x more code, that layoffs will continue at companies with large engineering teams, and that the current status quo is unsustainable because of resulting instability in large codebases and teams.

Worth Your Time

  • Delta Attention Residuals: paper and code. The work reports 1.7-8.2% lower validation PPL from 220M to 7.6B with 0.008% parameter overhead at 8B.
  • World-model explainer: drops.mts.now/world-model. It covers what world models are, how they work, and what DreamZero and Agora-1 are building; Marc Andreessen amplified it on X.
  • Campfire founder thread: X post. YC says the discussion covers launching the first paying version as a Google Sheet, pulling customers off NetSuite with four employees, and founder-led sales through Series A.
  • BioStack launch page: YC launch. BioStack says it turns messy clinical data into post-training loops for healthcare AI, and YC says revenue moved from six figures to seven figures in weeks.
  • Conifer beta and feedback: site and waitlist. The team says it is building an open-source Apple Silicon runtime and is taking 100 users into a free beta.
Amilabs’ World-Model Bet, Memory-Led AI Infrastructure, and Earned-Insight Startups
May 25
6 min read
587 docs
Paul Graham
Yann LeCun
+7
Amilabs’ outsized world-model round is the clearest capital signal in this batch, while the strongest early teams are solving narrow, painful workflows in support, job search, agent memory, and retrieval. The broader pattern is a market shifting toward inference economics, grounded tooling, and founder-led products built from earned insight.

Funding & Deals

  • Amilabs: €890M world-model round. Yann LeCun says Amilabs launched after his Meta departure became official on 31 Dec 2025, and the company raised an oversubscribed ~€890M round at roughly €3-3.5B pre-money. CEO Alexandre Le Brun previously sold a startup to Facebook, later led engineering at the Paris research lab, then founded Nabla; Laurence Olly also joined from Meta’s Europe operations. The thesis is JEPA/world models for real-world understanding, planning, robotics, industrial control, and predictive maintenance. Nabla is already described as a privileged partner for healthcare applications.

  • Regulated AI is still being financed like infrastructure, not SaaS. A renewable-grid startup is pre-MVP and pursuing HORIZON/EIC grants before building inside an EU-approved sandbox to meet AI Act requirements, after framing congestion and curtailment as a multi-billion-dollar European problem. The core pushback in the thread was about liability and trust: start with prediction and suggested actions, get TSO feedback, and only later ask for live control.

Emerging Teams

  • Arbyn: founder-market-fit in Shopify support. The founder cites seven years across ecommerce support environments, detailed ticket economics of roughly $2.70-$5.60 per ticket, and direct experience with why earlier AI support tools failed. Arbyn handles email, chat, Instagram DMs, and Facebook Messenger from one inbox, can take Shopify actions inside the conversation, trains on a merchant’s actual sent emails for voice, uses a 50-conversation calibration phase, and prices at $99/month flat with unlimited conversations.

  • Ninelayer: retrieval infra that only became useful once latency dropped. Early users said the product gave agents better context, more grounded responses, and citations that made outputs easier to trust. The first version sometimes took ~40 seconds, so the team rebuilt retrieval and brought the same flow down to about 1.5 seconds for agent reasoning and planning workflows.

  • Hiro: immigration-aware job search wedge. Built by an ML engineer after his own OPT job-search experience, Hiro aggregates 550K active jobs from 52 sources, scores each role semantically against a user profile, and layers 8.4M USCIS H-1B sponsor records on top. The stack uses Next.js, GCP Cloud Run, Cloud SQL with pgvector, Vertex AI embeddings, and Gemini for the agent layer, and the product is already live.

  • XTrace: managed memory API with a differentiated view of agent state. Its xmem SDK extracts facts, episodes, and artifacts from multi-turn conversations and uses AGM-style belief revision so changed preferences or corrected facts supersede old memories instead of accumulating as noise. The system runs on PostgreSQL + pgvector with HNSW indexing, Redis caching, and multi-tenant isolation, and ships with an open-source TypeScript SDK plus docs.
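The XTrace item turns on one behavior: a corrected fact should retire the old one rather than sit beside it. The sketch below illustrates that supersede-on-revision behavior in plain Python. It is not the xmem SDK and makes no claim about how XTrace implements AGM-style revision; `Fact`, `MemoryStore`, and `revise` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Fact:
    subject: str
    attribute: str
    value: str
    turn: int
    superseded_at: Optional[int] = None  # turn at which a newer fact replaced this one


class MemoryStore:
    def __init__(self):
        self.facts = []

    def revise(self, subject, attribute, value, turn):
        """Add a fact, retiring any active fact it contradicts."""
        for fact in self.facts:
            same_slot = (fact.subject, fact.attribute) == (subject, attribute)
            if same_slot and fact.superseded_at is None:
                if fact.value == value:
                    return fact  # already believed, nothing to revise
                fact.superseded_at = turn  # keep history, drop from active set
        new_fact = Fact(subject, attribute, value, turn)
        self.facts.append(new_fact)
        return new_fact

    def active(self):
        """Facts the agent should currently act on."""
        return [f for f in self.facts if f.superseded_at is None]


store = MemoryStore()
store.revise("user", "diet", "vegetarian", turn=1)
store.revise("user", "diet", "vegan", turn=5)
print([f.value for f in store.active()])  # ['vegan']
```

The old fact stays in `facts` with a `superseded_at` marker, so state history remains available while retrieval sees only current beliefs.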

AI & Tech Breakthroughs

  • Inference economics are now a memory problem. vLLM’s PagedAttention improved KV-cache utilization, batching, and throughput by borrowing OS paging concepts rather than assuming contiguous memory. The broader point is that modern LLM inference is memory-bandwidth bound: KV cache scales dynamically with users, batch size, and context length, and a 70B model can require hundreds of GB to multiple TB of KV cache at scale. That is why the stack is shifting toward HBM, NVLink, unified memory, compression, quantization, and smarter cache management.

  • World models / JEPA are re-emerging as a post-chatbot thesis. LeCun describes JEPA as a non-generative architecture that predicts in an abstract representation space instead of reconstructing every detail, and describes world models as systems that predict the effects of actions so they can plan toward goals. He explicitly says he believes 2026 will be “the year of the World Model”. Amilabs is commercializing that direction into robotics and complex industrial systems.

  • Local inference keeps getting more practical. Clement Delangue highlighted llama.cpp with MTP support moving Qwen3.6-27B dense generation on an A10G from 25 tok/s to 45 tok/s, a 78% speedup that makes local models more plausible as daily-driver tools.

  • AI math is being framed as novel idea generation, not just faster search. One highlighted case describes an OpenAI model solving the Erdős unit-distance conjecture by connecting algebraic number theory to geometry, with a Princeton mathematician refining the result and Tim Gowers indicating the proof could meet Annals of Mathematics standards. The significance, as framed in the source, is AI doing mathematics differently rather than merely faster.
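The KV-cache claim in the first item of this section can be checked with back-of-envelope arithmetic. The sketch below uses the standard sizing formula (keys plus values, per layer, per KV head, per position). The model shape is an assumption: 80 layers and head dimension 128 are typical of 70B-class models, with 8 KV heads for grouped-query attention and 64 for full multi-head attention, all in 16-bit precision.

```python
GIB = 1024 ** 3


def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, bytes_per_value=2):
    """Keys and values (the factor of 2) for every layer, KV head, position, and sequence."""
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_value


# Assumed 70B-class shape with grouped-query attention, fp16 cache.
per_token = kv_cache_bytes(80, 8, 128, seq_len=1, batch=1)
gqa_fleet = kv_cache_bytes(80, 8, 128, seq_len=4096, batch=256)
# Same shape without grouped-query attention (64 KV heads).
mha_fleet = kv_cache_bytes(80, 64, 128, seq_len=4096, batch=256)

print(per_token / 1024)   # 320.0 KiB per token
print(gqa_fleet / GIB)    # 320.0 GiB for 256 concurrent 4k-token sequences
print(mha_fleet / GIB)    # 2560.0 GiB, i.e. 2.5 TiB
```

Under these assumptions, 256 concurrent 4k-token sequences need about 320 GiB of cache with grouped-query attention and about 2.5 TiB without it, which is consistent with the "hundreds of GB to multiple TB" range quoted above.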

Market Signals

  • Earned insight remains the cleanest founder filter. One founder-quality test circulating on X argues that the best companies come from a specific, earned insight rather than generic AI for X pitches, while weak teams often build something nobody asked for and avoid direct user truth. Paul Graham separately argued that founders who start too early often have not had time to develop that earned insight. Several teams in this batch are grounded in explicit lived pain: Arbyn in ecommerce support, Hiro in OPT job search, and DriftWatch in day-to-day data engineering problems inside finance settings.

  • The near-term agent opportunity is the ‘cerebellum,’ not the ‘prefrontal cortex.’ Garry Tan’s framing is that routine tasks should become reflexive automation, and that most agent frameworks will fail by treating all cognition as high cognition. The commercial examples in this batch skew that way: Shopify support actions, managed memory layers, and faster retrieval for grounded responses.

  • Latency, grounding, and memory are hardening into distinct infra layers. Ninelayer only became usable to agents after retrieval fell from ~40 seconds to ~1.5 seconds. XTrace is packaging memory so developers do not have to build vector stores, dedup logic, and session state themselves. The vLLM discussion points to the same conclusion one layer lower: memory, not raw FLOPs, is becoming the economic bottleneck in inference.

  • Regulated verticals will likely enter through decision support, not full autonomy. In the renewable-grid thread, feedback centered on liability, TSO trust, and the need for human-approved suggested actions before live control. The founder’s immediate next step is stakeholder discovery with Romania’s TSO and EU research centers, not deployment.

  • Technical sophistication alone is not a go-to-market strategy. Xipen’s team combines a bioinformatician and two math PhDs, has a live product, working Stripe integration, daily updates, and institutional-style modeling for 12,000+ stocks, yet reports only four paid users at €10/month.

Worth Your Time

  • LeCun on why world models now. Best primary-source explanation in this batch of JEPA, world models, and why they matter for planning, robotics, and industrial systems. YouTube interview

“In my opinion, 2026 will be the year of the World Model.”

  • The vLLM/PagedAttention essay. Useful if you want a compact argument for why long-context serving is becoming a memory-architecture problem, not just a model-size problem. Reddit post

  • Garry Tan’s cerebellum post. A crisp framework for sorting durable agent products from planning-heavy demos: the winning systems may be the ones that make boring tasks reflexive first. X post

  • Arbyn’s operator essay on ecommerce support. Worth reading for concrete support economics, why earlier AI support tools failed, and what product choices matter in this vertical. Reddit post

  • XTrace’s memory SDK and docs. Useful diligence material if you are evaluating managed-memory infrastructure for agents, especially around contradiction handling and state history. GitHub · Docs

ZeroEntropy’s GBrain Win, Agent Guardrails, and an AI GTM Compensation Bubble
May 24
4 min read
718 docs
Future(s) Studies
r/SideProject - A community for sharing side projects
Thinking Machines
+6
Commercial validation led this cycle, with ZeroEntropy, SPRYT, and Avarieux surfacing as the clearest near-term deal signals. The deeper themes were agent control infrastructure, privacy-first/on-device AI, and a 20VC discussion that framed AI sales hiring and pricing as a growing market distortion.

Funding & Deals

  • ZeroEntropy: GBrain now ships ZeroEntropy as its recommended default embedding and re-ranking option over OpenAI and Voyage AI. That commercial placement is paired with Garry Tan's public endorsement of the company as a six-person team building task-specific AI models described as 4-8x faster than offerings from OpenAI or Anthropic, with 500K Hugging Face downloads.
  • SPRYT: The UK company says its Asa patient-engagement agent is backed by the NVIDIA Inception programme and partnered with Optum and NHS trusts. The reported pilot results are better than traditional reminder baselines, particularly for patient cohorts that are historically harder to reach.
  • Avarieux: The company came out of stealth this week and opened a waitlist around a finance-research product that verifies numeric AI claims against public sources before delivery. The product is explicitly framed to operate as a publisher rather than an adviser.

Emerging Teams

  • ZeroEntropy: The signal is compact but notable: six people, 500K Hugging Face downloads, and third-party distribution through GBrain's default embedding/reranking slot.
  • Avarieux: The founder is a recent MS Data Science graduate who built several public MCP servers and has two pull requests under review in Anthropic's official modelcontextprotocol/servers repo. The product adds a real-time verifier between model and user, and every analysis becomes a timestamped, citable URL.
  • Cirano: The product runs 100% on-device on iOS with no cloud and uses chat history for what the company calls Personal Digital Intelligence. After 14 months of development and a late pivot from cloud to on-device architecture, the four-person team across Minnesota, Uzbekistan, and Vietnam reported its first four paying subscribers and $52 MRR.
  • MCP policy SaaS: A solo founder is building cloud-managed allow/deny/audit policies across roughly 3,800 MCP servers and can push signed policy bundles to enrolled machines in under 60 seconds. A local Claude Code plugin enforces those policies, keeps a SHA-256 chained audit log, and blocks prompt injection attempts before Claude sees the call.
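The "SHA-256 chained audit log" in the MCP policy item refers to a general tamper-evidence technique: each entry's hash covers the previous entry's hash, so editing any record breaks every hash after it. A minimal sketch with Python's standard library follows. The record fields, the `GENESIS` value, and the function names are assumptions for illustration, not the product's actual format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def _entry_hash(prev_hash, record):
    """Hash the previous entry's hash together with a canonical form of the record."""
    payload = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log, record):
    """Append a record whose hash commits to everything before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev_hash,
                "hash": _entry_hash(prev_hash, record)})


def verify(log):
    """Recompute the chain; any edited or reordered entry makes this False."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, {"tool": "filesystem.read", "decision": "allow"})
append_entry(log, {"tool": "shell.exec", "decision": "deny"})
print(verify(log))                      # True
log[1]["record"]["decision"] = "allow"  # tamper with history
print(verify(log))                      # False
```

The chain only detects tampering. Preventing an attacker from rewriting the whole log would need an external anchor, such as periodically publishing or signing the latest hash.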

AI & Tech Breakthroughs

  • Arc Sentry: The product detects multi-turn jailbreaks by monitoring internal model state instead of prompt text. On a USENIX Security 2025 example, the score moved from 0.031 at Turn 2 to 0.232 at Turn 3; the post says LLM Guard scored 0/8 because it evaluates prompts independently, while Arc Sentry blocked before any model response was generated.
  • Thinking Machines: The company says it is building AI for real-time, human-like collaboration and shared its approach and early results. Garry Tan separately argued that fast, usable multimodal systems could enable personal AI after quickly fine-tuning his own Qwen3.5-397B model.
  • Runtime control for agents: The MCP policy product above enforces tool policies locally on the developer machine and keeps data local, while Arc Gate uses the same geometric monitoring layer as a hosted runtime governance proxy for agents using APIs.
  • DeepSeek Flash: Bindu Reddy highlighted the model as almost 100x cheaper, effective for small tasks, and well-suited to operating on large amounts of data or records.

Market Signals

  • 20VC participants described an AI sales compensation bubble. In the episode, operators said Anthropic and OpenAI are offering $10M-$30M+ packages for top sales talent, and one speaker said Anthropic is inflating a bubble the rest of the market is trying to keep up with.

"It's a bubble... Anthropic is inflating that bubble."

  • API-first AI selling is getting more technical. The xAI example in the same discussion emphasizes technically astute reps who can run their own demos.
  • International expansion is moving earlier. The same operators said companies are opening EMEA and APAC in parallel instead of waiting to fully mature North America, increasing the premium on leaders with global experience.
  • Consumption pricing is spreading as per-seat SaaS comes under pressure. The 20VC discussion argues that vendors are being forced to add consumption components and tie some sales compensation to actual usage after the initial booking event.

Worth Your Time

Nano Claw's Seed, Anthropic's Stack Moves, and New Vertical AI Wedges
May 23
5 min read
847 docs
Jerry Liu
Greg Lukianoff
Simon Eskildsen
+7
Nano Co's seed, Anthropic's Stainless acquisition, and 20VC's UsePrelude bet anchor this brief. It also surfaces promising teams in health, legal, robotics, and agent commerce, plus the clearest market signals on vector stores, AI cost control, and small-team leverage.

Funding & Deals

  • Nano Co / Nano Claw: chose a $12M seed over a $20M buyout. The company, founded by brothers, positions Nano Claw as a secure alternative to OpenClaw. The round was led by Valley Capital Partners, with the Hugging Face CEO participating as an angel after reaching out over social media. TechCrunch also notes that Andrej Karpathy's support helped draw attention and investment to the company.

  • 20VC wrote a $15M check into UsePrelude. Harry Stebbings frames the bet around founders Matias Berny and @Zibra_, a market shaped by more apps and more security threats, and fund-returning upside. He also says the company has signed one of the largest social networks and e-commerce players and is already doing many millions in ARR.

  • Anthropic bought Stainless for a reported ~$300M and hired Andrej Karpathy. TechCrunch describes Stainless as automated SDK/API tooling that every AI lab wants when scaling agents. Anthropic had already been using the product internally, making the acquisition a strong signal that key agent-stack tooling is being pulled in-house.

Emerging Teams

  • Juno is one of the clearest early distribution signals in this batch. The company is building an AI personal health assistant for chronic illness. Founders @isaactolley_ and @marshalljgould grew up with chronic conditions themselves, and YC says Juno is already supporting 80,000+ people globally six months in.

  • Synphony is attacking agricultural labor with a robotics wedge that already looks economically legible. YC says the company deploys robots to pick strawberries in a California market worth $3B, where labor is 60% of cost and the workforce is shrinking. The company says robots have now reached the crossover point with field labor, with strawberries as an entry point into a $15B berry market. Founders are Sean Wu and Saichi Fujimoto.

  • A new law-firm research assistant is a credible vertical AI wedge to watch. The founder is an ex-bigtech engineer who started building after a layoff. The product searches a firm's own document library from plain-English questions, returns answers with exact citations, weights legal authority, surfaces conflicting sources, and lets senior lawyers add durable annotations. There are no clients yet, but repeated law-firm conversations are validating the pain point, and the product is being designed for local hosting because security matters to attorneys.

  • YC's latest launches also show a widening agent-commerce stack. Allowance lets AI agents make purchases with one-time virtual cards and built-in guardrails. HessianHQ forward deploys into businesses to map work before building, operating, and scaling agents. Amboras says its end-to-end ecommerce automation is already producing 80%+ conversion-rate lifts for early merchants.

AI & Tech Breakthroughs

  • Guardrails are becoming standalone agent infrastructure. Nano Claw focuses on safer agent execution as a secure alternative to OpenClaw, while Allowance focuses on safe agent payments through one-time cards and purchase guardrails. The common pattern is notable: autonomy is creating new products around operational control, not just smarter models.

  • Replication Radar is an ambitious attempt to use AI for knowledge verification. Built by Rhea Karty at Harvard's lab, the system is designed to crawl papers, books, claims, citations, replications, retractions, old debates, and buried null results to check what actually holds up. It is supported by Cosmos Institute and FIRE, and Marc Andreessen separately flagged the project as interesting.

Does this actually hold up?

  • Abinitio Bio is applying foundation-model thinking to biomanufacturing. YC says the company turns 6-18 month process decisions into hours of compute, with pharma economics measured against $100M+ per month of delay on blockbusters.

Market Signals

  • turbopuffer is the strongest contrarian infra datapoint in this batch. The company crossed a $100M run-rate in March, 19 months after $1M, is profitable, and raised less than $1M. Customers include Cursor, Anthropic, Notion, Cognition, Harvey, Bridgewater, Ramp, Linear, Legora, Superhuman, Atlassian, and Granola. Jerry Liu's takeaway is that even in a commoditized vector-store market, a better product can still win if it makes the right technical bet, in this case optimizing cost through object storage.

  • AI cost observability is turning into an immediate budget-control category. One startup's SDK for tracking AI spend inside apps passed 1k+ npm downloads and 100+ paying users within days. The feedback clustered quickly around per-user and per-feature cost tracking, Slack alerts, incident replay, and kill switches to stop runaway spend.

  • AI is compressing team-size assumptions. One 4-person team says it is running multiple products at roughly 600K€ ARR and 35% EBITDA, arguing that AI lets tiny teams do work that used to require 100 people. Separately, a solo founder reports 1,060 registered users, 640 monthly actives, 18 paying subscribers, and 247 AI calls per day with no advertising spend.

  • Some investors are explicitly looking for markets where adoption friction is near zero. Garry Tan highlighted 9 Mothers, a counter-drone defense company in the YC Spring 2026 batch, as a case where there is no viable close-quarters alternative.
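
For a sense of scale on the turbopuffer datapoint above, the implied compound monthly growth can be worked out from the two run-rate figures. This is a back-of-envelope sketch, not something the source reports: it assumes both figures are annualized run-rates and that growth was constant month to month, and the function name is illustrative.

```python
def implied_monthly_growth(start: float, end: float, months: int) -> float:
    """Constant month-over-month rate that takes `start` to `end` in `months` steps."""
    if start <= 0 or end <= 0 or months <= 0:
        raise ValueError("start, end, and months must all be positive")
    return (end / start) ** (1.0 / months) - 1.0


# Figures from the brief: $1M to $100M run-rate in 19 months.
rate = implied_monthly_growth(1_000_000, 100_000_000, 19)
print(f"Implied compound monthly growth: {rate:.1%}")  # about 27.4%
```

Under those assumptions the run-rate would have compounded at roughly 27% per month for the whole period.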

Worth Your Time

  • TechCrunch's Equity Podcast: best single watch here for context on Nano Claw, Stainless, and Anthropic's willingness to buy critical agent-stack tooling.

World Models, Agent Sandboxes, and New Vertical AI Wedges
May 22
6 min read
857 docs
Sarah Guo
Yann LeCun
Elad Gil
+18
The clearest signals this cycle are a concrete shift toward world models, strong traction in agent infrastructure, and a fresh set of vertical AI startups in fraud, pathology, and human-agent coordination. The brief also covers new financing structures, frontier-model economics, and a short list of source material worth reading or watching.

Funding & Deals

  • OpenAI's YC token-for-equity program is still the clearest financing experiment in this batch. Sam Altman offered $2M in OpenAI tokens to every YC startup in the current batch in exchange for equity; YC separately said the offer covers the spring and summer batches and extended the summer deadline to May 25. External commentary framed the tokens as compute credits that can de-risk early product work and may lift valuations at the margin.

  • Round mechanics are lagging company growth. Harry Stebbings said founders are agreeing terms around $3M ARR and reaching $15M-$20M ARR by the time legals finish, with company progression outpacing legal completion.

  • A small angel round came with a clear product lesson. An AI video editor founder said they raised $30K two weeks earlier, then learned from 10 beta users that the real pain was workflow speed, not output fidelity. They responded by cutting validation from 18 gates to 5, limiting retries, and moving to a preview-first flow.

Emerging Teams

  • Daytona: stateful compute for agents. CEO Ivan Burazin previously co-founded CodeAnywhere, used by about 3 million people, and later ran developer experience at InfoBip. After a January 2025 pivot from human dev environments to agent sandboxes, Daytona reported 74% month-over-month growth; one customer runs about 850K sandboxes a day; RL/eval workloads moved from 0% to roughly 50% of usage. The system runs on bare metal with its own scheduler, using local NVMe snapshots to start one sandbox in about 60 ms or 50,000 in about 75 seconds.

  • Incandor: behavioral intelligence for bank fraud. YC says the product links behavior across accounts, making fraud rings, mule handoffs, and banned operators visible. Founders are Matthew Yekell and Luc Rosenzweig.

  • Limrun: mobile development infrastructure for cloud agents. The product provides remote Xcode plus iOS and Android simulators so cloud agents can build mobile software; YC says customers already include Replit, Rork, and Momentic AI. Founder: @muvaff.

  • Voquill: voice AI for pathologists. Voquill listens while pathologists work and drafts sign-out-ready reports in real time, targeting a workflow where many pathologists spend more time writing reports than diagnosing. Founders are @HenryHabibAI, @josiahsrc, and Michael.

  • Human-agent coordination is becoming a software layer. Pentagon, launched by @edgarpavlovsky, argues that agents are already doing coding, research, ops, and customer work but still operate in isolation, turning humans into middleware. Lightsprint is attacking the adjacent problem with a platform for visual planning, parallel cloud agents, live previews, and more reliable shipping.

AI & Tech Breakthroughs

  • World models are moving from research rhetoric into startup formation. Yann LeCun said he founded Advanced Machine Intelligence to pursue world models and physical AI beyond LLMs, predicted 2026 will be "the year of the world model," and argued that LLM-style next-token architectures do not work for video, sensor, or biological data because there are infinitely many plausible next states. Fei-Fei Li said World Labs is building foundation models for spatial intelligence, with world models and world action models that learn from pixels to generate states, policies, and actions for robots and physical systems. Bioptimists is applying similar beyond-language ideas to biology with multimodal, multiscale models aimed at drug discovery and rational medicine design.

"I think 2026 is going to be the year of the world model"

  • OpenAI's unit-distance result is a real symbolic milestone. An OpenAI model discovered a new family of constructions for the planar unit distance problem, outperforming square-grid-based approaches and disproving a belief held since Erdős posed the problem in 1946. Multiple sources framed it as the first time AI autonomously solved a prominent open problem central to a field of mathematics; one account said the model connected geometry to deep number theory, and experts including Noga Alon, Melanie Wood, and Tim Gowers called it "a milestone in AI mathematics."

  • Runway is productizing a stronger video-editing primitive. Aleph 2.0 lets users edit a single frame, preview the change, and propagate that edit through the rest of the video inside the web-based Edit Studio. Cristóbal Valenzuela said Aleph 1.0 had already changed editing workflows and positioned 2.0 as a new standard for the category.

  • Fast inference remains one of the few infrastructure advantages users immediately feel. Cerebras said its wafer-scale AI systems are 15-20x faster than GPUs at inference and are built around a 46,000 square millimeter chip. CEO Andrew Feldman said demand accelerated in 2025 once models became useful enough for everyday work, and argued that speed opens new business models rather than just marginal efficiency gains.

Market Signals

  • AI adoption is now showing measurable GTM leverage. ICONIQ/SaaStr data says companies with AI fully embedded in GTM generate roughly 2x the net new revenue per FTE of medium and low adopters. AI-heavy pipelines also show better top-of-funnel conversion: new lead to MQL is 38% versus 27%, and MQL to SQL is 37% versus 29%. Daily AI use passed 50% in marketing, SDR/BDR, and RevOps.

  • Returns still appear concentrated at the frontier, and the supporting stack is expensive. In recent conversations cited by Patrick O'Shaughnessy, Anthropic's Krishna, Dylan Patel, and Gavin Baker all argued that frontier models capture most economic returns at the model layer; Krishna said customers spend heavily on newer models because frontier intelligence drives meaningful ROI. Sarah Guo added that this is a capex-intensive cycle, that Nvidia is 2-5 years ahead in areas like neoclouds and inference cloud, and that startups still want frontier chip performance because it enables products such as current coding agents.

  • The geopolitics of open versus closed models are shifting. Fei-Fei Li said the 2026 AI Index shows the US-China capability gap has closed for the first time; she added that China now leads in open LLMs, video models, and even world models, while the US is closing models.

  • Efficiency gains are real, but the energy map will get more complicated. Fei-Fei Li said inference costs for language models fell about 280x in the last two to three years through distillation, quantization, and newer chips. At the same time, she said AI's current power buildout is being driven by training and inference on language models, while embodied AI will eventually add a much more distributed pattern of on-machine compute and energy demand.

  • Early-stage distribution is being subsidized with tokens, and that may invite backlash. Harry Stebbings said token spend is becoming a core marketing line item, with founders willing to give away $20K-$50K per month in tokens to drive usage and temporarily out-hustle incumbents. He also warned that layoffs and capital shifts into machines are creating a political and social backlash the tech industry is underestimating.
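
The ICONIQ/SaaStr stage conversions quoted above compound, so the end-to-end gap is wider than either stage suggests. A minimal worked example, assuming the two stages are independent and multiply directly and that the second figure in each pair belongs to the lower-adoption comparison group (the brief does not spell out either point):

```python
def lead_to_sql_rate(lead_to_mql: float, mql_to_sql: float) -> float:
    """Share of new leads that reach SQL, given the two stage conversion rates."""
    return lead_to_mql * mql_to_sql


ai_heavy = lead_to_sql_rate(0.38, 0.37)    # AI-heavy pipelines
comparison = lead_to_sql_rate(0.27, 0.29)  # assumed comparison group
ratio = ai_heavy / comparison

print(f"{ai_heavy:.1%} vs {comparison:.1%}, ratio {ratio:.2f}x")
# prints: 14.1% vs 7.8%, ratio 1.80x
```

On these numbers, an AI-heavy pipeline would turn about 1.8x as many new leads into SQLs, which is in the same range as the roughly 2x revenue-per-FTE figure cited in the same bullet.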

Worth Your Time

  • OpenAI's planar unit distance thread — Primary-source summary of the math result and what changed relative to long-standing square-grid intuition.

  • Runway Aleph 2.0 demo — Quick product demo of a potentially important editing primitive: change one frame, then propagate the edit across the clip.

Blank Bio's Seed, Exa's Search Bet, and the Agent-Native Infrastructure Shift
May 21
5 min read
724 docs
Exa
Aidan Gomez
Cohere
+12
Blank Bio's seed and Exa's search financing framed the capital signals, while YC launches and new commentary from Baseten, Railway, and Cohere sharpened the investment case around agent-native software, post-training, and compute economics.

Funding & Deals

  • Blank Bio: Blank Bio raised a $7.2M seed with a strategic collaboration from PacBio. The company is training foundation models on bulk RNA-seq to help pharma design better clinical trials by learning patient heterogeneity and building prognostic and predictive biomarkers from tumor transcriptomes. Announcement

  • Exa: Exa raised $250M at a $2.2B valuation in a Series C led by a16z. Not seed-stage, but still a clear thesis-confirming financing: Exa is positioning as search infrastructure for AI agents, especially on long-tail, high-alpha queries where traditional engines fail, and a16z says developers and agents are already reaching for it first. The founders started building years before ChatGPT, betting transformers would change how information is accessed.

Emerging Teams

  • Lab0: Lab0 is building an AI forward deployment engineer for enterprise software, automating client process discovery, configuration, testing, and go-live. The key datapoint is implementation speed: YC says deployment cycles fall from six months to ten days. Founders: Onkar Borade, tokenaware, and Sujay Sriv.

  • InLoopRobotics: InLoopRobotics is selling warehouse automation as a monthly service rather than capex: packing, kitting, and fulfillment with no integrators and no 6-month PoC. Paid pilots are already live at 300+ picks per hour. Founders: FeduniakS, Zakariea_sh, and Pasha Rizali.

  • Armature: Armature is an early signal that "agent experience" may become its own software category. It runs real agent workflows to monitor and optimize how AI agents experience products, with a focus on improving MCP or CLI surfaces. Founders: Totzenberger and Louis Scremin.

  • AI code-review tooling is starting to cluster: YC-backed Stage is a guided code-review platform for understanding AI-generated code and claims faster review than GitHub, while Prix AI independently pitches AI as the first reviewer on GitHub PRs, flagging repetitive issues such as edge cases, logic mistakes, performance, security, and style problems before humans step in. The overlap suggests a real wedge is forming around QA for AI-written software.

AI & Tech Breakthroughs

  • Baseten's "owned intelligence" thesis is getting production proof points: Baseten describes its stack as production-grade inference for companies moving from rented to owned intelligence by post-training models on their own application data. It cited Abridge, Decagon, OpenEvidence, Cursor, and Intercom as companies already adopting this pattern, and its technical work is pushing toward continual learning for long-horizon agentic tasks where models evolve with real-time data, tools, and specialized evals.

  • Cohere Command A+: Cohere said Command A+ is its most powerful LLM yet, optimized to run on minimal hardware and released as the company's first fully open-source Apache 2 model. For investors, it is a clean signal that efficient open models are still improving at the high end.

  • Context control is turning into real infrastructure: Compresh reports roughly 60% fewer input tokens on long agent sessions by keeping the last four rounds raw and compressing older context into a partitioned memory view; in separate architecture writing, an "Adaptive Agent Architecture" proposes state-driven micro-agents, hard retry limits, and reflection anchors, with a claimed reduction from 15,000-50,000 tokens per task to 3,000-7,000. The broader takeaway is that memory and retry control are becoming first-class product surfaces.

  • Efficient-model research keeps moving: A BitNet 1.58 writeup highlighted a ternary-weight approach using {-1, 0, +1} instead of FP16/FP32 weights, trading precision for higher dimensionality to preserve output quality while reducing memory and compute demands.
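
To make the ternary-weight idea concrete, here is a small sketch of absmean quantization, the rounding rule described in the BitNet b1.58 paper: scale the weight matrix by its mean absolute value, round, and clip to {-1, 0, +1}. The writeup cited above is not quoted on this detail, so treat the function name, the pure-Python layout, and the sample matrix as illustrative assumptions rather than the writeup's implementation.

```python
def ternary_quantize(weights, eps=1e-8):
    """Quantize a 2-D list of floats to {-1, 0, +1} using absmean scaling.

    Returns (quantized, scale); multiplying a quantized entry by `scale`
    gives a coarse reconstruction of the original weight.
    """
    magnitudes = [abs(w) for row in weights for w in row]
    if not magnitudes:
        raise ValueError("weights must not be empty")
    scale = sum(magnitudes) / len(magnitudes)  # mean absolute value
    quantized = [
        # Python's round() rounds exact halves to the nearest even integer.
        [max(-1, min(1, round(w / (scale + eps)))) for w in row]
        for row in weights
    ]
    return quantized, scale


# Hypothetical weights, chosen only for illustration.
sample = [[0.9, -0.05, -1.2], [0.1, 0.4, -0.7]]
q, scale = ternary_quantize(sample)
print(q)      # [[1, 0, -1], [0, 1, -1]]
print(scale)  # about 0.558
```

Each weight then needs about 1.58 bits (log2 of 3 states) instead of 16 or 32, which is where the memory savings come from.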

Market Signals

  • The competitive layer is moving above the base model: Railway argues agent workloads need tighter control over network, compute, storage, orchestration, versioning, observability, and branching at 1,000x human scale; Armature is explicitly measuring how agents experience products; and a Reddit discussion around Google's enterprise agent platform framed the shift as moving from model hosting toward orchestration, governance, and multi-agent tooling.

"Pull request is definitely dying."

  • Compute scarcity is creating infra moats: Railway says its own bare-metal data centers deliver roughly three-month payback and ~70% margins, while cloud bursting across five providers helps avoid compute bottlenecks. Baseten says capacity constraints are worse than most outsiders think and has responded by distributing inference across 15-20 clouds and 80-100 regions.

  • Search and distribution are being rebuilt for AI agents: Exa's financing rests on the idea that agent-first search wins hard, long-tail queries, while Georion is building a growth dashboard around AI visibility scanning, prompt tracking, AI crawler logs, and revenue attribution across engines such as ChatGPT, Claude, and Perplexity.

  • Capital structure may matter more than many app founders expect: Gavin Baker argued that disaggregating prefill and decode could extend GPU useful lives from 3-4 years to 10-15 years, lowering financing rates and helping fund the AI buildout; in the same discussion, he said TSMC's capacity decisions are the key indicator for whether AI infrastructure turns into an overbuild.

  • Policy risk is rising around frontier releases: Bindu Reddy flagged a planned White House executive order requiring frontier models to be reviewed 90 days before release, and argued it would boost China and open-source AI.

Worth Your Time

  • GBrain thread and follow-up: Quick read on open-source agent memory infrastructure, benchmarked long-memory performance, and context-engineering-driven idea generation.

  • Blank Bio seed announcement: Short, useful read if you want the cleanest primary-source framing for the RNA-seq foundation-model thesis in clinical trials.

OpenAI's YC Token Offer, Enterprise AI Traction, and the Agent-Web Buildout
May 20
6 min read
859 docs
Sam Altman
Patrick Collison
+14
OpenAI's batch-wide YC financing offer was the clearest capital signal, but the deeper read is broader: enterprise AI teams are showing real traction, agent-era web infrastructure is emerging, and local or verification-first AI architectures are getting more investable.

Funding & Deals

  • OpenAI made the clearest financing move in this batch: Sam Altman said OpenAI offered $2M in tokens to every startup in the current YC batch in exchange for equity. Outside observers compared it to Yuri Milner's old practice of offering to invest across YC, while Altman and Garry Tan framed the upside as seeing what "tokenmaxxing" founders build.
  • The structure also sharpens platform-risk questions: Jason Calacanis warned YC founders that taking the tokens carries a non-zero risk OpenAI studies their product and ships adjacent functionality into its own free offering.
  • Seed pipeline worth a look: a vertical AI company selling finance workflow automation to mid-market CPG brands said it has 7 paying customers, just over $10K MRR, zero churn after seven months, and is raising a seed round. Its founders said win rates improved once they sold outcomes rather than AI.

Emerging Teams

  • Serval is one of the strongest traction signals here: the AI-native enterprise service management company is two years old and already serves 100+ customers, from AI-native startups to enterprises with hundreds of thousands of employees. Its architecture keeps workflows-on-databases as the core abstraction, but uses AI codegen to create and maintain workflows from natural language, split across an admin agent and a help-desk agent with approvals and permissions. Serval says it uses OpenAI for end-user interactions, Anthropic for automation/codegen, and benefits from strong economics because it is not reselling tokens.
  • p0 / Index looks like an early infrastructure play for agent traffic. Parag Agrawal said p0 launched Index so content owners can understand how AI agents use their work and earn revenue from it; he said the thesis is that agents will use the web 1000x more than humans, and that agents are already scaling on p0's infrastructure. Early partners include The Atlantic, Fortune, PR Newswire, PitchBook, ZoomInfo, Tracxn, RocketReach, and several creators.
  • Compute and physical-AI infrastructure keep producing new YC companies: General Instinct helps robotics teams run frontier models offline and with low latency on constrained devices including Jetsons, mobile NPUs, and ARM CPUs, while Zibra Labs says its HPC clusters let quantitative trading firms run 100x more backtests across massively parallel spot workloads on hyperscalers and neoclouds.
  • Regulated and industrial wedges continue to surface: Panacea_Bio pairs FDA regulatory consultants with an AI platform to speed and lower the cost of biotech and medtech approvals, while Andustry says its AI-native brokerage saves manufacturers 30% and cuts sourcing time in half.

AI & Tech Breakthroughs

  • Verification-first AI is becoming a real design pattern: Aurora exposes deterministic quantitative tools such as aurora_run, aurora_findings, aurora_verify, and aurora_what_if, keeps the LLM as a language layer around structured outputs, and uses a verifier so quantitative claims must be grounded or flagged as uncertain. The system runs locally, is Apache 2.0, and now includes 24+ methods, causal inference, streaming connectors, and signed bundles.
  • Local inference looks increasingly plausible as product architecture: Andrew Chen argued that a very large share of LLM queries are simple enough for smaller local models, noted that consumer hardware can already run good models, highlighted privacy-sensitive categories, and pointed to browser/webGPU delivery as a zero-install way to cut compute costs. He also noted that growing global compute supply should keep pushing cloud token costs down.
  • Distributed training is now a governance problem, not just an infrastructure problem: a cited paper claims GPT-4-scale training could be done over consumer internet, on hardware below proposed compute-governance thresholds, for under $100M, and focuses on how to detect and stop that path.
  • Agent architectures are getting more operationally opinionated: the GBrain framework argues for parameterized skills, a thin harness, explicit resolvers, markdown-based memory, and a hard split between latent judgment and deterministic code. Garry Tan's follow-up frames the resulting moat as "process power," and he called just-in-time, markdown-defined dynamic skills one of the most powerful ideas in personal AI.
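
The verification-first pattern described for Aurora above (quantitative claims must be grounded or flagged) can be illustrated with a toy checker. This is not Aurora's code or API: `check_claims`, the regex, and the sample data are hypothetical, and a real verifier would need to handle units, rounding, and derived quantities.

```python
import re

# Matches integers and decimals, e.g. "18" or "4.2".
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def check_claims(draft: str, grounded: dict[str, float], tol: float = 1e-6):
    """Split the numbers found in `draft` into (grounded, flagged) lists.

    A number counts as grounded only if it matches, within `tol`, some
    value produced by a deterministic tool and passed in via `grounded`.
    """
    known = list(grounded.values())
    ok, flagged = [], []
    for token in NUMBER.findall(draft):
        value = float(token)
        if any(abs(value - k) <= tol for k in known):
            ok.append(value)
        else:
            flagged.append(value)
    return ok, flagged


# Hypothetical structured output and LLM-written summary.
tool_output = {"churn_pct": 4.2, "orders": 1310.0}
summary = "Churn rose to 4.2 percent while revenue grew 18 percent."
ok, flagged = check_claims(summary, tool_output)
print(ok, flagged)  # [4.2] [18.0]
```

The design point is the split in responsibilities: deterministic tools produce the numbers, the LLM only phrases them, and anything it says that the tools did not produce is surfaced as uncertain.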

Market Signals

  • Outcome-first selling is hardening into the new B2B AI playbook. One founder selling into CPG finance said demo conversion rose once the pitch changed from "we use AI agents" to "we recover deductions," and argued that "depth of workflow coverage" is now the wedge as generic AI claims commoditize. The same founder said early vertical AI companies should target mid-market rather than enterprise because buyers are also users and cycles close in 3-6 weeks instead of stalling for months.

"The AI part is implementation detail and not the value prop we thought it was going into it."

  • The web's agent layer is starting to look like a new distribution and monetization surface. Parag Agrawal said agents will use the web 1000x more than humans and that p0 is already seeing agent traffic scale on its infrastructure. Separately, a SaaS founder said AI traffic jumped 12x the day after shipping agent-friendly site changes including llms.txt, server-side rendering, structured data, and allowlisting major AI bots.
  • Retention is getting tougher in SaaS even when acquisition is still available. Founders described rising churn pressure from subscription fatigue, AI saturation, cheaper clones, and poor onboarding, and said the focus is shifting toward churn, reactivation, and actual LTV rather than top-line MRR screenshots.
  • AI is widening the founder aperture but not removing team-quality filters. Sam Altman said he now wants to fund some non-technical founders who deeply understand users, but also reiterated that shared history and deep mutual respect between co-founders remain one of the strongest predictors of success.
  • Fundraising velocity still looks materially better in the US than Europe for some AI startups. One European company said it is adding about 6 customers per day yet still faces slow, repetitive diligence across around 10 EU VC conversations, while multiple US founders told the team it would likely fund faster in SF/NY.

Worth Your Time

Sam Altman in conversation with Patrick Collison

Best in this batch for how AI changes founder selection and the ceiling for science and small teams. Altman says models are already helping excellent scientists find better ideas and make small discoveries, calls material science especially underappreciated, and describes seeing a small company run much of its work from a single Slack channel with agents.

Sequoia's interview with Serval CEO Jake Stauch

Useful diligence material on AI-native enterprise software. The key segment is the argument that "the product is the boundaries": enterprise adoption depends on permissions, approvals, audits, logs, and scoped integrations, not just raw model capability.

YC's self-improving AI-native company talk

A compact framework for recursive improvement loops, "burn tokens, not headcount," making everything legible to AI, and where humans still matter.

GBrain architecture thread and Garry Tan's follow-up

The clearest material here on skills, thin harnesses, resolvers, deterministic layers, memory, and "process power" as a moat for AI-native startups.

Defense Autonomy, World Models, and the New AI ROI Bar
May 19
6 min read
803 docs
SaaStr
clem 🤗
Cursor
+10
The strongest signals in this batch are Josh Browder’s pre-seed investing playbook, Yaroslav Azhnyuk’s defense-autonomy stack, Odyssey’s move from world models to shared simulations, and a wave of agent-native infrastructure startups. The market backdrop is clearer too: buyers want provable ROI, higher agent utilization, and tighter evidence that founders are authentic operators.

Funding & Deals

  • Joshua Browder is the clearest emerging-manager signal in this batch. Harry Stebbings said Browder would be his pick for a sub-$50M emerging manager and said 12 founders rated him 9.2/10 on average. Browder said his latest fund has made 33 investments at a $5M median entry valuation, with deals ranging from $1.5M to $21M, and that he is concentrating on “real” enterprise AI businesses rather than crypto or consumer hardware. His stated allocation view is to deploy hard at pre-seed rather than save reserves for later rounds, and his operating thesis is that pre-seed companies usually fail by running out of money, hope, or team cohesion.

  • Valar Atomics is a notable hard-tech financing signal. The Information described the company as backed by Trump allies and Palantir-linked investors, with a deregulation tailwind in Washington, while pursuing a faster, “brute-force” path to bringing a reactor online. Suhail said the company in question was @isaiah_p_taylor’s, giving investors a likely founder reference point.

Emerging Teams

  • The Fourth Law / Odd Systems: Yaroslav Azhnyuk, an applied-math-trained serial founder who previously built Petcube, said he moved from consumer IoT into defense tech after Russia’s invasion. He now runs The Fourth Law for on-drone autonomy alongside Odd Systems for thermal cameras, with the two companies moving toward a merge. He said the group sells cameras and autonomy modules to 200+ Ukrainian drone manufacturers and sells drones directly to the Ukrainian armed forces.

  • Monrow: Built after a retry bug projected about $6k/day in Claude spend across multiple app instances, Monrow says it catches runaway AI costs before the next call fires. The team says it launched publicly two days ago, reached 1k+ npm installs, keeps a fully local free tier, and prices Pro at $99 with cross-server detection, alerting, margin intelligence, and kill-switch controls.

  • AgentMail: The YCS25 startup is making email a native surface for agents. Its agent-first signup flow lets an agent arrive via curl, receive markdown instructions, provision a restricted inbox, and ask a human to complete OTP claim. The founder said the product was modified for agents with single-column CLI formatting and shorter message IDs to reduce parsing issues and hallucinated completions.

  • YC launch watchlist: Transload measures freight dimensions in motion using existing CCTV at logistics sites. InsForge is positioning itself as backend infrastructure for coding agents, covering servers, databases, LLM gateways, and frontend deployment. Prism calls itself an AI-native recruiting agency and says its people search scores 21+ points ahead of published competitors on the leading benchmark. Deep Interactions says 95% of AI pilots fail because teams cannot build in sync, and is pitching a collaborative AI builder that ships products in an afternoon.

  • Devlens: Founder-reported traction is still early but worth noting: 50+ waitlist signups in 60 days for the cloud version, despite a free open-source tool already existing. The product uses AST parsing to build an exact map of a JavaScript/React/Next.js codebase, then layers a graph-aware AI chat on top so architectural questions stay grounded in the repo structure.
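
Monrow's pitch above, stopping runaway spend before the next call fires, amounts to a pre-call budget check with a kill switch. The sketch below shows that general pattern only; it is not Monrow's SDK, and `SpendGuard`, `BudgetExceeded`, `simulate_retry_loop`, and all dollar figures are hypothetical.

```python
from collections import deque


class BudgetExceeded(RuntimeError):
    """Raised when a call would push spend past the configured limit."""


class SpendGuard:
    """Rolling-window spend limit that is checked before each model call."""

    def __init__(self, limit_usd: float, window_s: float):
        self.limit_usd = limit_usd
        self.window_s = window_s
        self._events = deque()  # (timestamp, cost_usd), oldest first
        self.tripped = False    # kill switch: stays on once triggered

    def _spent(self, now: float) -> float:
        # Drop events that have aged out of the window, then total the rest.
        while self._events and now - self._events[0][0] >= self.window_s:
            self._events.popleft()
        return sum(cost for _, cost in self._events)

    def authorize(self, estimated_cost_usd: float, now: float) -> None:
        if self.tripped:
            raise BudgetExceeded("kill switch already tripped")
        if self._spent(now) + estimated_cost_usd > self.limit_usd:
            self.tripped = True
            raise BudgetExceeded("call would exceed the spend limit")

    def record(self, cost_usd: float, now: float) -> None:
        self._events.append((now, cost_usd))


def simulate_retry_loop(guard: SpendGuard, cost_per_call: float, attempts: int) -> int:
    """Pretend a buggy retry loop fires once per second; return calls allowed."""
    allowed = 0
    for t in range(attempts):
        try:
            guard.authorize(cost_per_call, now=float(t))
        except BudgetExceeded:
            break
        guard.record(cost_per_call, now=float(t))
        allowed += 1
    return allowed


guard = SpendGuard(limit_usd=1.0, window_s=60.0)
print(simulate_retry_loop(guard, cost_per_call=0.4, attempts=10))  # 2
```

Checking before the call rather than after is the design choice that matters here: the third $0.40 attempt is refused instead of being billed and then reported.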

AI & Tech Breakthroughs

  • Odyssey pushed world models from passive video toward interactive simulation. Starchild-1 is described as the first real-time multimodal world model that can generate interactive simulations with audio . Odyssey also introduced Agora-1, a multi-agent world model where multiple human or AI participants can interact inside the same simulated world in real time, with a playable research preview built around a multiplayer GoldenEye deathmatch .

  • The Fourth Law is building a full autonomy stack, not a single drone feature. Azhnyuk described five autonomy levels ranging from terminal guidance to autonomous takeoff and landing . He also said the company builds autonomy modules across day/night conditions, terrains, and platforms, plus its own simulation, training school, and planned semiconductor plants for thermal-camera sensors .

  • Self-optimizing inference stacks are starting to look viable at the edge. One builder reported tracing every request, clustering similar calls with embeddings, and fine-tuning a 7B model on production traces, claiming 95% agreement with GPT-5.1 at 2% of the cost. The same post said spend fell from $420/month to $73/month in three months, with additional reductions as bad outputs were recycled into negative training examples and good ones into positive data.

  • Cursor is still pushing model capability inside the product layer. The company introduced Composer 2.5 as its “most powerful model yet,” describing it as better at sustained work on long-running tasks and more reliable on complex instructions.
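
The self-optimizing inference item above describes a loop of tracing, clustering, and measuring agreement. The sketch below illustrates that loop under loud assumptions: the builder's actual pipeline is not published in the item, the embeddings are toy 2-D vectors, the greedy cosine-similarity clustering is a stand-in for whatever method was used, and every function name is hypothetical. The only figures taken from the source are the $420 and $73 monthly spend, which work out to a reduction of roughly 83%.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_traces(embeddings, threshold=0.9):
    """Greedy clustering: a trace joins the first cluster whose seed
    (first member) is at least `threshold` similar, else starts a new one."""
    clusters = []
    for i, vec in enumerate(embeddings):
        for members in clusters:
            if cosine(embeddings[members[0]], vec) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def agreement_rate(small_outputs, reference_outputs):
    """Fraction of calls where the small model matches the reference model."""
    matches = sum(s == r for s, r in zip(small_outputs, reference_outputs))
    return matches / len(reference_outputs)


def spend_reduction(before, after):
    """Fractional drop in spend between two periods."""
    return (before - after) / before


# Toy embeddings: the first two requests are near-duplicates, the third is not
toy_clusters = cluster_traces([[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]])
# Monthly spend figures quoted in the item above
reported_reduction = spend_reduction(420, 73)
```

In a setup like this, clusters with a high agreement rate are the candidates for routing to the cheaper fine-tuned model, while low-agreement clusters stay on the larger one.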

Market Signals

  • The buyer bar in B2B AI is now ROI plus utilization, not generic model access. SaaStr said B2B + AI companies with provable ROI are growing 60%+ this year, while those without clear ROI are being churned out of budgets. It also argued that hallucinations are no longer the frontier buyer conversation when grounding, tool use, and model choice are handled correctly; the harder problem is getting agents to do materially more work inside the customer workflow.

“It’s not dead. It bifurcated. If you have AI ROI you can prove in a customer’s QBR deck, you are growing 60%+ this year. If you don’t, you’re getting churned out of the budget cycle.”

  • Pre-seed diligence is getting more behavioral. Browder says he looks first for founders with deep problem connection and first-customer credibility, citing Owner.com’s origin in Adam Guild building for his mother’s dog grooming business. Stebbings highlighted filters such as late-night pitch meetings, rapid-fire questioning, and live verification of revenue claims. Browder also warned about “fake founders” and AI-assisted narrative engineering, particularly around summer projects where commitment is hard to read.

  • Agents are becoming operational workers, which is creating a new infrastructure layer. In this batch alone, startups were building backend infrastructure for coding agents through InsForge, dedicated email inboxes for agents through AgentMail, telephony rails for any agent through Patter, and cost guardrails through Monrow.

  • Defense tech is being framed as software-defined systems plus manufacturing depth. Azhnyuk said drones matter because software updates can change battlefield capability in a step change, and he tied the opportunity to a wider Western gap versus China in drone manufacturing and autonomy systems.

  • Model ownership is increasingly being treated as strategy, not research vanity. Cursor introduced a new in-house model iteration, and Clement Delangue argued that serious AI companies will want to train their own models on open-source bases rather than outsource via APIs.

Worth Your Time