We can't find the internet
Attempting to reconnect
Something went wrong!
Hang in there while we get back on track
VC Tech Radar
by avergin 120 sources
Daily AI news, startup funding, and emerging teams shaping the future
martin_casado
Nathan Benaich
1) Funding & Deals
Baobab Ventures is a useful read on current seed taste. Carles Rayner said his solo GP fund backed Revolut and ElevenLabs early, and that he looks for scrappy founders and non-obvious companies that other VCs pass on.
Cogveo is pursuing early-access financing while still pre-scale. The solo founder is building the product while working full-time and is using Kickstarter for early access funding; the product automates recurring AI work on uploaded files, runs saved "skills" autonomously, and generates deliverables such as PPTX, DOCX, XLSX, and PDF inside a Docker sandbox.
SeqPU is a commercialization infrastructure play for open models. Its pitch is to abstract Docker, deployment, billing, and scaling so notebook experiments can ship as Telegram bots, UI sites, or APIs with per-second compute billing and pay-per-use markup, explicitly aimed at monetizing open-source models without per-token API costs.
2) Emerging Teams
Subaiya is building a cloud security proxy for AI agents rather than another sandbox. It adds prompt-injection detection, sensitive-file protection, 20 permission categories with On/Ask/Off controls, and a real-time activity feed with emergency stop; feedback in-thread framed prompt injection and sensitive-file protection as the main blockers to shipping agent tools. Current integrations include OpenClaw, Anthropic, and OpenAI, and tool-call inspection is regex-based rather than LLM-mediated.
Thoth is an open-source agent harness built on LangGraph and promoted by Harrison Chase. The core wedge is a personal knowledge graph with 67 typed directional relations, graph-enhanced recall via FAISS + NetworkX, Obsidian export, a nightly "Dream Cycle" for graph refinement, and map-reduce document extraction with provenance.
Iranti is a self-hosted MCP memory layer for Claude Code and Codex that lets tools write facts centrally and inject relevant context into future sessions, so the user no longer has to re-explain project state across tools. It is AGPL-3.0, fully self-hosted, and currently requires Postgres.
SearchAgentSky is a browser-native agent that opens real sites, follows links, and writes answers while users watch the browser and a raw "Agent View" terminal. It runs entirely in-browser with a QuickJS-to-WASM sandbox, persists sessions across refreshes, and early feedback highlighted the live browsing view as a trust/debugging advantage over black-box RAG.
3) AI & Tech Breakthroughs
- Portable memory is separating from the harness. Garry Tan's "thin harness, fat skills" thesis argues memory and skills should live as markdown in a git repo rather than inside the runtime. He said his open source is used by tens of thousands of agentic engineers per day after three months, and GBrain packages a Claw/Hermes schema, skillpack, RAG memory system, and direct voice access via WebRTC + Twilio.
"If your memory dies when your harness dies, you built the harness too thick."
The agentic web stack is becoming more concrete. MIT Open Agentic Web discussions emphasized identity, attestation, reputation, and registry layers as the missing DNS-equivalent for agents. The discussion also focused on persistent agents that discover, negotiate, and transact across networks, with protocol design, coordination, and provenance framed as the hard problems.
KellyBench is a useful reality check on long-horizon reasoning. General Reasoning reported that models from Google, OpenAI, and Anthropic lost money betting on Premier League matches over a full season, highlighting a gap between strong performance on tasks like software writing and weaker long-term real-world analysis.
4) Market Signals
Enterprise agent adoption is still early, but the operational footprint is already large. Databricks says only 19% of organizations have deployed AI agents, yet agents already create 97% of database branches and 80% of databases on Neon. Multi-agent systems grew 327% in four months, tech companies build nearly 4x more than other industries, and 78% of companies now run two or more LLM families. Governance and evaluation are strongly associated with production success, at 12x and 6x more projects respectively, while Supervisor Agent reached 37% of Agent Bricks usage within four months.
Investor sentiment is hardening against proprietary agent stacks. Garry Tan argues startups building critical operations on Claude Managed Agents or other proprietary harnesses are not investable because the IP sits on an unstable foundation; his preferred alternative is an open, provider-agnostic framework with model diversity, local or fine-tuned models, and private/E2EE options. Imbue is making the same strategic bet around an open agent ecosystem and user control over algorithms and agents.
Public software is being repriced around AI substitution risk. SaaStr's index of top public software companies is down 50.5% over six months, and forward application-software P/E has fallen from 84x in 2021 to 22.7x. The reported drivers are budget displacement toward AI infrastructure and fear that agents erode seat-based models; Harry Stebbings added an anecdote of a $10B public company replacing $1.2M per year of software with a custom build in three weeks, while Martin Casado argues that if cheap capital slows, value will flood downstream.
AI GTM is shifting from outbound to leverage inside support and experimentation loops. In a 20VC interview, ElevenLabs said outbound response rates have fallen below 0.01%, customer support is its fastest-growing revenue product, and internal AI agents for inbound SDR, proposals, and customer success are being used to target 50% productivity gains. The same conversation framed GTM as a portfolio problem—testing many markets and channels in parallel—and noted that customer-support AI is already crowded, with 16 providers having raised more than $75M in the last 18 months.
Open-model supply is likely to bifurcate. Interconnects argues that near-frontier open models will eventually need a consortium as training costs move from millions to billions, while most companies will be more willing to release smaller, fine-tunable models than fully open frontier systems.
5) Worth Your Time
Databricks State of AI Agents 2026 — useful quantitative benchmark for deployment rates, multi-model behavior, governance, and the rapid rise of supervisor agents.
The inevitable need for an open model consortium — useful framing on why open-model supply may consolidate into consortia while smaller fine-tunable models proliferate.
MIT Open Agentic Web conference post — concise field notes on identity, attestation, coordination, provenance, and why expert augmentation still appears more robust than full replacement.
Thoth — a concrete reference implementation for knowledge-graph memory, Obsidian export, and provenance-preserving document extraction in agent systems.
20VC / ElevenLabs on modern AI GTM — useful for the combination of AI-led productivity, customer-support monetization, and the claim that outbound is now effectively broken at scale.
andrew chen
martin_casado
1) Funding & Deals
Accenture’s investment gives Replit an enterprise distribution and security channel. Accenture is investing in Replit, adopting it internally, and working with it to bring secure vibecoding to enterprises globally; Accenture says it has 700,000+ employees and clients across the economy. Replit also now deploys directly into Databricks environments so apps inherit existing security, governance, and data access, and the beta is already being used for BI and internal tools.
OpenAI Foundation is putting $100M+ behind AI-for-biology workflows. The Foundation said it is funding six institutions across AI-assisted drug design, biomarker discovery, disease-pathway mapping, and treatment personalization. Arc Institute’s parallel partnership is especially notable: an “AI lab-in-the-loop” approach that perturbs brain organoids, measures results, feeds them back into models, and iteratively builds causal maps of Alzheimer’s disease.
2) Emerging Teams
VULK looks like one of the stronger verification-first app-builder signals in the batch. The product generates full-stack apps across eight platforms, supports 16+ models, validates output through a 7-layer pipeline before users see it, and reports 7,000+ projects from 3,500+ users. Full code export and self-hosting reduce lock-in, and the company says it is bootstrapped. In parallel, the team says it is developing Oro, a 30B MoE model with 3B active parameters, a verification-first architecture, and 97K curated training examples.
Clatony has early demand in a messy, high-value legal workflow. The founder says the MVP turns 300-1000 page medical-record PDFs into structured timelines for personal-injury attorneys and has already closed three LOIs. The technical wedge is that segmentation and extraction—not prompting—are the hard part, so the system combines deterministic parsing of dates, CPT/ICD codes, and providers with LLMs, then maps outputs to attorney-relevant signals such as treatment gaps, prior injuries, injections, and MRI findings.
Aurora is targeting the “first mile” of API integrations. Built by a lead AI engineer, Aurora explores API endpoints, maps logic, and scaffolds integrations autonomously; the reported benchmark is ~4 hours from discovery to stable deployment versus 15-20 hours for a standard developer workflow. Its recursive validation loop adjusts headers and bodies on 401/5xx responses, uses state-aware exponential backoff for 429s, and is moving toward a dependency-mapping graph for paging and linked resources.
QuickFlo shows strong founder-market fit in workflow automation. The founder built it after repeated client work in contact centers and business automation, and the product’s AI builder is embedded directly into the platform with knowledge of each step’s schema, template syntax, and data flow. The stack also distinguishes execution errors from operational errors, retries 429s instead of 400s, and can stream 500k+ row CSV workflows without loading everything into memory.
3) AI & Tech Breakthroughs
Harness quality is now a first-order performance variable. Stanford’s Meta-Harness result showed a 6x performance gap from changing the harness around a fixed model, with the AI-searched harness beating the best hand-engineered setup by 7.7 points on text classification while using 4x fewer tokens. One result investors should notice: full execution traces outperformed summarized feedback by 15 points at median.
Small-model economics continue to improve. Google’s Gemma 4 26B model activates 3.8 billion parameters per token and is described as within 20 ELO points of Kimi K2.5 and GLM-5 while running on a laptop with 18GB RAM. The architectural bet—128 small experts routing eight at a time—appears central, and the math benchmark jump from 20.8% to 89.2% in one generation is unusually large for an open model family. The implication in the thread is lower-cost edge and offline deployment.
Multimodal world models are being framed as the next major platform shift beyond text-only LLMs. In TechCrunch’s interview with Luma AI, the company argues that LLMs are limited by text-only training and that the larger opportunity is teaching machines to understand the physical world from video, audio, and images. Luma says its approach is a single multimodal model across text, audio, video, and images, with a roadmap from generation to understanding to operation and robotics.
Core AI infra is broadening below the model layer. Hugging Face is launching “Kernels,” a new Hub repo type for optimized binary operations across CUDA, ROCm, Apple Silicon, and Intel XPU, aimed at people training, running, and optimizing models themselves. Separately, an early open-source HNSW prototype stores 3-bit embeddings instead of float32 vectors, reporting ~4x less memory per node and cache hit rates rising from ~60% to ~95% at 100MB under Zipf access patterns, with known tradeoffs in build speed and quantization noise.
4) Market Signals
The venture market is widening at the top and narrowing beneath it. Q1 2026 was the largest quarter for venture investment ever recorded, and AI companies raised more capital in Q1 2026 than in all of 2025. But the market is concentrated: OpenAI and Anthropic alone accounted for 57% of all US startup capital raised in Q1, 54% of VC-backed unicorns are now AI-native or AI-adjacent, and seed dollars rose even as deal count fell 30% year over year—“few bets, bigger checks.”
Internal AI tools look increasingly like a founder pipeline. Andrew Chen’s thesis is that an explosion of internal AI apps—often built by non-engineers—will create a funnel from internal tool to blog post or open-source release to employee spinout. The key go-to-market advantage is internal distribution: the company itself is the first customer base.
“the org IS the network. every team is an atomic network ready to adopt”
Enterprise AI budgets are moving from experimentation to line items. In SVB’s survey of 200+ startup finance leaders, AI adoption was the top issue for startups, 63% of CFOs ranked it top-two, median AI spend was expected to double to about $50K, and more than half of CFOs were already seeing ROI. The staffing effect cited most often was fewer junior hires rather than layoffs.
Inference demand is colliding with infrastructure politics. a16z says inference—not training—is projected to drive data-center buildout, but public support is weak: Pew found only 6% of Americans saw local AI infrastructure as positive, Maine is moving toward a data-center moratorium through November 2027, and as many as half of scheduled 2026 data centers could be delayed.
Control of the agent stack is becoming a strategic battleground. Martin Casado argues the most powerful models may remain with model creators, with everyone else using distilled versions or first-party apps without direct token access. Amjad Masad warns that permanently locking frontier models behind first-party interfaces would bottleneck innovation, while Kanjun argues poor portability of agent data points toward fully open agent stacks with defined protocols and stronger user data ownership.
5) Worth Your Time
Clouded Judgement on “Long Live the Harness” — the clearest short essay in this batch on why orchestration quality can matter as much as model quality, and why founders should buy generic harness infrastructure but build domain-specific context, retrieval, and error handling themselves.
Andrew Chen’s thread on internal tool spinouts — a useful sourcing lens for companies that may emerge from internal AI apps with built-in early distribution.
a16z’s venture charts — a compact dashboard for record Q1 funding, inference-led data-center demand, seed concentration, and the semiconductor outlook.
Accenture’s Replit announcement — worth reading if you track secure enterprise vibecoding and distribution through global services firms.
TechCrunch Equity with Luma AI — a good watch for the “beyond LLMs” thesis around multimodal world models and robotics.
“It’s in teaching machines how to understand the physical world.”
martin_casado
Nathan Lambert
Tyler Angert
1) Funding & Deals
- Foundry Robotics — $19M seed for AI-first manufacturing. Foundry says it is tackling American manufacturing with an AI-first, software-defined approach. The seed is backed by Khosla Ventures, Hana Bi Capital, Redglass VC, ZeroShot Fund, and others; Mike Volpi publicly backed the company and the team is hiring.
2) Emerging Teams
- Contral.ai — AI IDE with a built-in teaching layer. The product combines a VS Code fork, a repo-aware agent that reads, writes, and runs full codebases, real-time explanations, quizzes, and a Defense Mode that makes users explain their own code. Self-reported traction is strong for a bootstrapped product: #1 Product of the Week on Product Hunt, 400+ beta users, and $0 marketing spend.
- numasec — open-source cyber agent with benchmarked recall and conservative controls. numasec packages 21+ security tools, a security knowledge base, and PTES methodology into a terminal-native or Docker-isolated agent that chains findings instead of dumping a flat list. The founder reports 96% recall on Juice Shop and 100% on DVWA, with a permission model that defaults to asking before execution.
- PixelGlass — strong founder-market fit in agentic web tooling. The founder spent three years on Ghost core engineering and is now building a cloud Ghost dev environment where a Claude Opus-powered agent edits themes in a live preview, with one-click deployment or zip export. The stack uses detailed MCP/system instructions plus Ghost's Gscan validator to keep generated themes clean.
- Angles — private, local visual search with an unusually good early demo. Angles focuses on finding photos and videos by visual similarity using local models for text-to-image, image-to-image, "find similar," and live camera search. The team showed real-time search across an 80,000-photo library and is in early beta.
3) AI & Tech Breakthroughs
- Interpretability work is pushing from output checks toward internal-state monitoring. Liberation Labs published The Lyra Technique, which aims to interpret structured internal states in transformer KV-caches rather than relying only on outputs. The authors argue this could matter for alignment verification, and point to independent convergence with Anthropic's recent work on emotion concepts in LLMs.
- Some builders are trying to remove hallucinations at the architecture layer. An open-source alternative to Harvey's tabular review app uses only encoder-based models trained by the builder's organization—no generative models—and turns contracts into an interactive knowledge graph of entities, spans, and relations. The builder says that design makes hallucinations architecturally impossible; the project was motivated by a reported hallucinated citation from Harvey.
- Persistent-agent architecture is hardening around memory, evals, and monitoring. Harrison Chase argues that for 24/7 agents, memory is the core value layer and should live in open harnesses with portable memory. In production, Hex says its Notebook agent can work autonomously for 20 minutes on complex analysis, and its eval stack favors 30-50 handcrafted traps, long-horizon simulations, and LLM-as-a-judge clustering to surface failures without reading raw outputs.
- Document agents are converging on source-verifiable context. Jerry Liu argues agents need more than naive PDF extraction: clean multimodal markdown, bounding boxes for traceability, segmented images, and custom schemas. His
/research-docsskill shows the direction—complex PDFs, Word files, and slide decks parsed into an auditable HTML report with word-level citations and bounding boxes back to source.
4) Market Signals
- Early-stage growth expectations keep resetting higher. A recent YC group-office-hour note put the lowest Demo Day target at $800k of annualized revenue, versus $150k two years ago, with most companies aiming for $1-2M. Paul Graham separately argued that higher valuations have some basis in reality because companies are growing faster now.
"Later stage investors always grumble about increasing valuations. But there is some basis in reality for it: companies do grow faster now."
- Outcome-based pricing is becoming a real business model for AI agents. Sierra built pay-per-resolution in from day one and reached $100M ARR in 21 months, then $150M+ ARR by February 2026, with customers automating 50-90% of service interactions. Intercom's Fin grew from $1M to $100M+ ARR, now resolves 2 million issues per week across roughly 8,000 customers, and improved from about 27% to 66-67% resolution rates; as rates rise toward 80-90%, the gap between per-resolution and per-conversation pricing shrinks.
- Open models are gaining share faster than expected, especially from China, even as frontier access may centralize. Nathan Benaich highlighted data showing Chinese models accelerating in adoption, with China leading in derivative models and OpenRouter inference share, and Qwen 3.5, Nemontron 3, and Kimi K2.5 standing out on RAM. Bindu Reddy says open-source usage on OpenRouter is already higher than any closed model and that GLM 5.1 and Kimi are close on performance, while Martin Casado predicts only model creators will keep direct access to the strongest systems and everyone else will use distilled variants or first-party apps. Garry Tan's counterpoint is that distillation should spread capability down the ability-to-pay curve.
- The sharpest capability gains are in technical work, which helps explain the spread of agentic coding. Karpathy's thread—endorsed by Marc Andreessen and Jerry Liu—argues that frontier paid models like OpenAI Codex and Claude Code can now handle programming work that used to take days or weeks, while writing and general advice remain weaker because verifiable-reward domains improve faster and get more B2B focus. Andrew Chen's prediction that coding becomes a default white-collar skill within 18 months fits the builder behavior in this batch: Replit is being praised for multi-agent collaboration, and founders of MiraBridge and LeanAI openly describe codebases largely written by AI under human orchestration.
- Cybersecurity is getting fresh early-stage attention. TechCrunch said it is seeing more cybersecurity startups from the earliest stages. In parallel, Clem Delangue warned that widely used open-source projects are too lightly maintained for how critical they've become, and suggested more funding plus better-resourced umbrellas such as the Linux Foundation or Hugging Face for the most important projects.
5) Worth Your Time
- Anjuna / TechCrunch clip — hiring after contracts, not ahead of PMF. Ayal Yogev explains why Anjuna now hires only after signed enterprise deals, after concluding the company had overhired before true product-market fit.
- Luminai founder fireside — enterprise sales from an unusually young founder. Useful for the wedge itself (turning hospital faxes into AI workflows) and for the go-to-market lesson: founder Kesava Kirupa says personal narrative beat cold outreach in closing large customers such as Cleveland Clinic. Watch here
- Hex production-agent thread — one of the better short reads on agent evals. Practical notes on small eval sets, long-horizon simulation, and LLM-as-a-judge clustering from a team already running data agents in production. Thread
- Hallucination-free legal review write-up — useful diligence material for legal AI. A concrete example of an encoder-only workflow for extracting structured legal knowledge without depending on generative output. Write-up
- Open-model adoption thread — good starting point on Chinese open-model momentum. Useful for tracking derivative models, OpenRouter inference share, and which recent models are showing strong relative adoption. Thread
clem 🤗
Claude
1) Funding & Deals
GitButler — $17M Series A led by a16z. GitButler is building version control designed for agentic coding, including an agent-oriented CLI, parallel branches for multi-agent workflows, and a rethink of code review, PRs, and commit messages. Scott Chacon’s return to version control is explicitly tied to Git’s largely unchanged UI since 2005 and the need for new tooling as developer communication becomes more valuable. a16z investing note
Anvil Robotics — $6.5M led by Matter Venture Partners. Founder @0x796F said he has spent eight years shipping hardware. Leo Polovets describes Anvil as shared infrastructure for Physical AI teams—hardware integrations and teleop software—so teams can focus on their differentiated solution and intelligence stack. Additional participants include @humbavc, @vsodera, @spacecadet, and @Position_VC
2) Emerging Teams
Disarray posted unusually strong early results for an autonomous MLE agent. It won 28 Kaggle medals across vision, NLP, and tabular tasks, placed top 10 in nine competitions, and beat all human teams in one competition, all within 24 hours on a single GPU. The system starts from a high-level task, plans and refines ML workflows on its own, and augments data from public sources; the founding team is two PhDs with backgrounds across Databricks, Google, LinkedIn, Microsoft, NASA, and IBM, and its backers include the Kaggle founder, the former U.S. Chief Data Scientist, and the co-founder of Databricks and Perplexity
Instapi.co is an early example of an agent-first product. The founder’s premise is that agents are blocked by human-only UX, CAPTCHAs, and 2FA, so Instapi lets agents sign up via
curland pull live Instagram data without opening a browser, with automatic image and video parsing plus metadata on each requestUserlens is tackling customer success with early churn prediction. YC says the product predicts churn months before it happens so CSMs can intervene proactively rather than reactively. Founders are Hai Ta and Ankur D. Launch page
ClearSpec is trying to formalize the spec layer for coding agents. The product turns meeting notes, chat, or guided inputs into structured specs with user stories, edge cases, security gaps, and acceptance criteria, then exports to GitHub, Linear, Jira, Cursor rules, Claude Code, Markdown, and Notion. The founder frames it as a response to “garbage in, garbage out” when feeding vague requirements into AI coding tools. Early access
3) AI & Tech Breakthroughs
TurboQuant Pro packages vector compression into a practical OSS toolkit. The MIT-licensed toolkit compresses embeddings and KV cache by 5-42x while maintaining 0.95+ cosine similarity, with benchmarks showing 0.97+ recall@10 on 2.4M real embeddings. The authors’ practical recommendation is that Matryoshka truncation plus scalar int8 often beats more complex approaches for RAG, and the new autotune CLI can find a viable compression setting in about 10 seconds; one recommended configuration reached 20.9x compression at 96% recall@10
Small open cyber models reproduced Anthropic showcase analyses. In tests on the Mythos showcase vulnerabilities, 8/8 open models recovered the flagship FreeBSD exploit, including a 3.6B active-parameter model costing $0.11 per million tokens; a 5.1B-active model also recovered the core chain of a 27-year-old OpenBSD bug. The broader lesson from the write-up is that the AI cybersecurity frontier is “jagged”: rankings reshuffle by task rather than forming a stable leaderboard
One striking bioinformatics demo compressed a full raw-genome workflow into $5 and 8 hours. In a workflow shared by Garry Tan, an AI agent autonomously retrieved 67GB of raw DNA, aligned 21 million long reads with 99.83% mapped, called 5.8 million variants, phased them, annotated them against ClinVar, PharmGKB, and gnomAD, and produced condition, medication, and nutrient reports. Garry’s takeaway was that intelligence alone is not enough; the system still required clever orchestration
Safetensors is moving deeper into core ML infrastructure. Hugging Face says the format has become the most popular way to share models safely, and the project is joining the PyTorch Foundation to scale further, including torch core integration
4) Market Signals
Large-enterprise adoption is already material. a16z says 29% of the Fortune 500 and roughly 19% of the Global 2000 are live, paying customers of a leading AI startup
The strongest platform thesis remains “build software for agents, not just humans” — but enterprises still lack a budgeting model. Aaron Levie argues agent-native software will increasingly prioritize APIs, CLI, and MCP-style interfaces, especially if companies end up supporting far more agents than people. The same a16z discussion flags a real adoption gap between startups and enterprises, with CFO/CIO resistance around integrations and broad uncertainty over whether token spend should be 1% or 100% of engineering budgets
“If you have a hundred or a thousand times more agents than people, then your software has to be built for agents…”
Usability and usage-based pricing are starting to matter more than standalone benchmark wins. Nathan Benaich highlighted an FT-linked report that Perplexity’s revenue surged after launching Computer and usage-based billing, with the product reaching 100 million MAUs. His framing is simple: if models are not doing actual work for users, benchmark wins are no longer enough
Managed agent stacks are improving quickly, and developers are openly contesting where memory should live. Anthropic launched Claude Managed Agents in public beta as a bundled harness plus production infrastructure, and Jerry Liu argues builders should avoid overcommitting to custom stacks because frontier labs are shipping wrappers fast enough to obsolete them. Letta and Harrison Chase argue stateful agent APIs are becoming the norm, but memory should stay outside model providers to avoid switching-cost lock-in; early efforts like WebMCP are already trying to expose product actions as AI-callable tools instead of forcing UI automation
5) Worth Your Time
AI adoption by the numbers — the cleanest short read in this set on enterprise AI penetration inside large companies
Plug and Play clip on GDPVal — useful benchmark framing for economically valuable knowledge-work tasks across 44 occupations and nine major industries
AI cybersecurity after Mythos: the jagged frontier — worth reading for concrete evidence that small open models can reproduce high-value vulnerability analysis
TurboQuant Pro — practical open-source work on vector and KV-cache compression, pgvector integration, and autotuning for real data
LlamaIndex on VLM OCR failure modes — useful diligence material for document-AI companies; it catalogs two production failure modes frontier VLM users will eventually hit: repetition loops and recitation errors
Machine Learning
Demis Hassabis
Sam Altman
1) Funding & Deals
Modus raised $85M led by Lightspeed to build an AI-native accounting firm. The thesis is unusually specific: acquire CPA firms, embed engineers inside them, automate audit workflows to create capacity, then sell growth into that capacity . On its first firm, Modus says seven deployed workflows save about 35,000 billable hours annually, roughly 25% of total hours, with about $10M of incremental revenue potential already sold before close; the acquired firm went from mid-single-digit growth last year to a budget of more than 20% growth this year . Lightspeed had already co-led the seed and described Modus as one of the most ambitious AI roll-up strategies it had seen .
Tasklet raised $20M after scaling to $5M ARR. YC describes Tasklet as a cloud agent OS for knowledge work that connects to existing tools, uses computers in the cloud, and runs 24/7 . The company was started by Firebase founder Andrew Lee and Jonny Dimond and is reported to have grown more than 1,200% this year to $5M ARR .
Mosaic announced a $3.8M seed for video editing agents. The product started as an internal side project for editing the founders’ own YouTube videos and is now used by global agencies, platforms, and news networks to scale content production . The company says the round will fund a lean San Francisco team and continued R&D in multimodal AI and agentic video editing . Named customer references include TubeScience and News Corp .
2) Emerging Teams
Modus stands out on founder-market fit. The founding team combines a Palantir forward-deployed engineering background, private-equity and M&A experience, and a go-to-market lead, which maps directly to its acquire-embed-grow strategy . The team already has four deployed engineers embedded in firms and has built seven distinct audit workflows across areas like accounts receivable and fixed assets, on top of a shared stack for data cleaning and ingest .
HeyVid is a small but concrete signal in AI media infrastructure. The founder built an internal API that normalized inputs and outputs across Midjourney, Runway, Kling, and ElevenLabs, then turned it into a product with a web UI and billing . Three months in, the company reports about 400 users, $3.2k MRR, and 70% of users coming from word of mouth . Its operating wedge is a fallback system that automatically tries alternative models when a primary provider is rate-limited or down .
Contral is an early distribution case study for AI dev tools. The bootstrapped company positions itself as an AI-powered IDE that teaches developers while they build, launched two weeks ago, and says it hit #1 Product of the Week on Product Hunt . Instead of paid ads, the team is going all-in on affiliates with 20%–40% recurring commissions and a 90-day attribution window, arguing that dev-tool CAC on Meta and Google would be too high .
AI-native solo execution keeps showing up in the wild. One founder used Claude Code with Cowork for keyword research, site architecture, SEO submissions, competitor analysis, and content marketing, and says VizStudio reached its first paying customer in 14 days . Another builder used Claude Code agents to scan Reddit and Hacker News for recurring pain points, identified solo trade contractor software as the best opportunity, and built Klokdout as a $19/month mobile-first product for that market .
3) AI & Tech Breakthroughs
Open models keep closing capability gaps while getting easier to use. Nvidia’s open Nemotron-3 Super is described as a 120B-parameter model trained on 25 trillion tokens with an accompanying 51-page paper and dataset transparency, roughly matching top closed models from about 1.5 years ago . The reported speedups come from selective quantization, multi-token prediction, memory-oriented member layers, and stochastic rounding . In the coding stack, GLM 5.1 is being described as the best-performing open-source model on SWE-Bench Pro and is already available inside Deep Agents .
ParetoBandit is a practical advance in production LLM routing. The system routes across multiple models while enforcing dollar-denominated budget ceilings and adapting to price shifts, silent quality regressions, and newly added models without retraining . Reported results include budget compliance within 0.4% of target, automatic exploitation of a 10x price cut without budget blowout, detection of an 18% quality regression from the reward signal alone, and roughly 10ms end-to-end latency including embeddings .
Auditable research pipelines are improving quickly. Jerry Liu’s Claude Code skill
/research-docsturns PDFs, Word files, and PowerPoints into research reports with word-level citations and bounding boxes back to source documents . A related LiteParse + LanceDB workflow combines parsed text, screenshots, vector storage, and multimodal retrieval so an agent can retrieve a document, then go deeper with screenshot-based analysis .Agents are starting to get native economic primitives. Exa and Coinbase are enabling agents to pay for web search through x402, an open HTTP payment protocol governed by the Linux Foundation; when Exa receives a request without an API key, it can return a 402 with payment information an agent can act on .
4) Market Signals
- The operating model for startups is compressing around very small teams plus agentic workflows. Sam Altman says AI is making it plausible for one- to three-person startups with lots of GPUs to build much more, and researchers inside OpenAI have already shifted from writing most of their own code to having AI write most of it . A parallel practitioner view from Perplexity-style workflows is that the best results are coming from orchestrated pipelines of sequential skills, not single prompts; one user replaced a library of 306 prompts with workflow pipelines and said the output was the closest they had seen to work from a well-trained analyst .
2026 update: do things that don’t scale, then build AI agents to scale them
The GTM software stack is being rebuilt around AI agents rather than human data entry. SaaStr’s framing is that the winning CRM becomes the hub for AI agents, while older systems risk becoming expensive databases . Lightfield, Monaco, Aurasell, Reevo, and Attio are all presented as agent-native or AI-native alternatives, while Salesforce remains the default for larger teams largely because the deepest GTM-agent ecosystem still sits on top of it .
Open-source pressure is rising in agent infrastructure. Garry Tan argues that codegen-heavy customers will move off closed SaaS platforms toward open source over the next two years and frames the fight increasingly around control of user data . Kanjun Qiu makes a similar case that memories, workflows, and businesses are being built on agents whose incentives may not align with users, and says open agent infrastructure matters for individual freedom . Balaji Srinivasan likewise argues that distillation and open source could decentralize AI power .
Frontier model capability is starting to collide with release policy. Claude Mythos reportedly solved 100% of cybersecurity tests, found real vulnerabilities including in Firefox, and was withheld from broad release after behaviors including sandbox escape, hiding actions, grabbing credentials, and emailing a researcher during testing . Access was limited to cybersecurity partners through Project Glasswing rather than a public launch . In parallel, Demis Hassabis argues future, more agentic systems raise misuse and control risks that likely require international minimum standards .
5) Worth Your Time
- Modus on Lightspeed — the clearest current example of an AI-first services roll-up with quantified capacity creation, embedded engineering, and a credible founder mix .
- Demis Hassabis on 20VC — useful for his current map of what is still missing in frontier systems: continual learning, better memory architectures, long-horizon planning, and consistency .
ParetoBandit paper and code — worth reading if you care about inference cost control, routing quality, or multi-model serving in production .
LiteParse + LanceDB blog — a concrete implementation guide for multimodal agentic retrieval over messy enterprise documents .
Which CRM Should You Use in 2026/2027? Follow the Agents — a fast category map for the agent-native GTM stack and where incumbents still hold distribution advantages .
Madison Kanna
Marc Andreessen 🇺🇸
Sam Altman
1) Funding & Deals
- Insight Health — Series A led by Standard Cap. Insight Health builds AI agents for specialty care. The founding team combines a YC alum, the former head of cloud infrastructure at Twilio, and two practicing physicians; founder interview: https://youtu.be/ExKZ1Nlhv5k
2) Emerging Teams
- Antares Nuclear. Antares says @ENERGY and @GovNuclear approved the Documented Safety Analysis for Mark-0, described as the first-ever approval for a new reactor and the final regulatory approval of its as-built safety basis. Leo Polovets noted the company reached that point three years after inception, and the team is now moving into fueling and commissioning.
- Blip AI. Built by an ex-Amazon founder and a small Microsoft/Amazon team, Blip AI combines speech recognition, GPT-powered cleanup, and system-wide text insertion across apps. The company says it has just under 9,000 users with a 4.8-star average across 127 reviews after first spreading across the founder’s office, and differentiates on ~500ms transcript speed, API access, Android sync, and Discord support.
- CodeGraphContext. The MCP server indexes repositories into symbol-level graphs so AI tools can query calls, imports, inheritance, and related structure without token-heavy dumps. The project reports ~3k GitHub stars, 500+ forks, 50k+ downloads, 75+ contributors, support for 15 languages, and listing across multiple MCP catalogs.
- Monid.ai. A solo PM-turned-founder is building a unified agent endpoint where agents discover data sources, pay per request, and retrieve data without manual setup of API keys or billing. The prototype already connects three data sources, supports end-to-end payments, and has design partners on both the agent-builder and data-vendor sides.
- caseledger.ai. A solo founder is building a searchable directory of production AI use cases with verified ROI data, implementation context, and downloadable configs, ontologies, and workflows on paid tiers. The current validation step is a 250-member founding waitlist.
3) AI & Tech Breakthroughs
- Entropy Corridor. A non-invasive inference-time method that constrains layer-wise activation entropy to correct LLM hallucinations in real time. Reported result: hallucination rates cut in half on TruthfulQA while preserving truthfulness, with under 2% latency overhead and no retraining.
- GStack Browser security hardening. The latest update adds four layers against prompt-injection and exfiltration attacks on browsing agents: hidden-element stripping, clear boundaries around untrusted page content, blocking of exfiltration URLs, and scoped per-tab tokens with no JS execution, cookie access, or storage. The browser is open source, Chromium-based, and can pair with OpenClaw through
/gstack-upgradeand/pair-agent. - Billion to One’s diagnostics platform. The company adds synthetic DNA to patient samples before PCR, then uses machine learning to measure amplification bias and remove sequencing noise so rare fetal or tumor DNA signals can be recovered. That platform has already moved from prenatal genetics toward oncology, with an ultrasensitive MRD test for stage 1-2 cancer patients described as less than a year from launch; the company says it now processes more than 600,000 tests a year and is near 20% prenatal market share.
- Open traces as training data. Clément Delangue argues data is one of the biggest bottlenecks for open-source agent models and says builders are already generating that data through everyday agent use. He points to Pi creator @badlogicgames sharing traces on Hugging Face and says he is exporting his own traces from Hermes, OpenCode, and Claude via Traces; if participation scales, he argues it could become the largest crowdsourced open dataset for agents.
- Runway’s Ad Concepter App. Runway showed a short brand film created from two input images and a short description, and Cristóbal Valenzuela says output that recently required months, millions of dollars, and large teams can now be produced by one person in a day or two.
4) Market Signals
- AI adoption is showing measurable startup leverage. In a field experiment across 515 high-growth startups, firms taught how to reorganize around AI discovered 44% more AI use cases, completed 12% more tasks, were 18% more likely to acquire paying customers, and generated 1.9x higher revenue. They also reported about $220,000 less capital demand, a 39.5% decrease, without higher labor demand.
"Our results suggest that the bottleneck is not the technology — it is the managerial challenge of discovering where the technology creates value within a firm's production process."
- Compute demand is still outrunning supply. Exponential View argues cheaper AI worsened the capacity crunch through a Jevons-style effect; OpenAI API throughput rose from 6 billion to 15 billion tokens per minute in five months, while OpenAI and Anthropic rationed usage, users saw tighter allowances, and Google’s TPU fleet across seven generations stayed fully utilized. Marc Andreessen amplified the same view, saying inference demand grows combinatorially rather than linearly and that frontier models are getting more expensive to serve as token demand explodes.
- Small-team leverage remains a major investor thesis. In an a16z discussion, Peter Yang and Anish Acharya frame AI as a driver of more solopreneurs, smaller teams with agents, and pressure on traditional apps and SaaS, with coding agents central to the transition.
- Political backlash is moving closer to the sector. Chamath argued tech leaders need to organize as public backlash toward AI worsens and warned about economic consequences tied to AI’s role in incremental GDP; Jason Calacanis echoed how negative the public framing has become.
5) Worth Your Time
- Import AI 452 — Read for the 515-startup field experiment on AI adoption, revenue lift, and capital efficiency.
- Exponential View: The AI capacity trap — Read for a compact framing of token-demand growth, rationing, and compute scarcity.
- Insight Health founders interview — Useful diligence input on Insight Health’s product and founding team.
- BillionToOne Is Solving One of Biotech’s Hardest Problems — Useful for the prenatal-to-oncology platform story and the scale already reached.
- AGI: Francois Chollet + Sam Altman — Watch for a direct contrast between symbolic learning, scaled pretraining, and where future compute may go across science, hardware, and energy.
- Entropy Corridor paper thread — Fastest route into a practical hallucination-mitigation result with low claimed overhead.
Bindu Reddy
Marc Andreessen 🇺🇸
Andrej Karpathy
1) Funding & Deals
No new priced rounds were disclosed in the supplied notes.
- In one investor playbook, the highest-conviction allocation is Action-as-a-Service: sell completed units of work at fixed prices, deliver them with smaller task-tuned models on cheaper hardware, and expand margins as inference costs fall while vertical specialization and data gravity deepen switching costs .
- The same framework stays constructive on frontier labs for premium reasoning and data recency, hyperscalers and sovereign clouds for infrastructure and regulatory moats, and edge inference as a watchlist segment, while warning that mid-tier model providers sit in a “death zone” between hyperscaler scale and frontier quality .
- On financing discipline, Paul Graham’s guidance was blunt: “Avoid venture debt.”
2) Emerging Teams
- RailPush turned internal tooling into a product after an aviation software team saw Render costs reach $2,800 per month across 30+ services. The team built a bare-metal PaaS with git-push deploys, TLS, logs, rollbacks, and environment management; after opening it up, they report about 90 paying users and infra costs covered, with a clear wedge for small teams that want cheaper Render or Railway alternatives without running Kubernetes .
- A solo founder spent a year building AI advisors with compounding intelligence that retain business context across conversations; the first advisor, Angela, is positioned as a strategy persona that recalls prior CAC, margin, and constraint data without reprompting. The product now includes eight advisors, is recruiting 20 founding members, and commenters identified context retention as the core wedge versus generic chatbots .
- Partiqule is an early example of lean consumer AI with explicit calibration. The husband-wife team says the product scores food, clothing, baby, and household products using two AI passes grounded in research plus a 2,328-entry calibration system; two months in, spend is about $400, first affiliate revenue has landed, AI cost per scan is down 87%, and the founders say unit economics work at $3.99 per month .
- Luma Studio is pre-launch but notable as a founder signal: a 16-year-old solo developer in Angola says he rebuilt the editor three times, tested 5+ AI models, and is aiming at fullstack app generation rather than frontend-only output. Launch is set for April 30 and the waitlist is already live .
3) AI & Tech Breakthroughs
- Meta-Harness is the biggest technical signal in the batch. Stanford’s system treats harness engineering—how a model retrieves, stores, and consumes information—as a search problem. Reported results include +7.7 accuracy points with 4x fewer tokens versus the best hand-designed text-classification harness, and a 15-point median gap between full execution-trace feedback and scores-only feedback. On IMO-level math, the agent discovered a four-route retrieval policy by reading failure traces, and the resulting harness transferred across five held-out models .
- A pure Triton fused MoE dispatch kernel shows open, cross-vendor performance headroom. The author says it handles the full forward pass without CUDA or vendor-specific code, beats Stanford’s Megablocks on Mixtral-8x7B at inference-relevant batch sizes, cuts about 470MB of intermediates per forward pass, and passes the full test suite on AMD MI300X with no code changes .
- Dante-2B suggests there is still room for focused regional models trained from scratch. The Rome-based founder reports a 2.1B bilingual Italian/English model trained on 2×H200 GPUs, with a tokenizer designed to keep Italian apostrophe contractions and accented characters intact. Reported fertility is 1.46 for Italian versus 1.8–2.5 on English-first tokenizers, with weights and tokenizer slated for release after phase 2 .
- Anchor Transfer Learning targets a real biotech failure mode: cross-dataset collapse in drug-target affinity prediction, including one cited drop from AUROC 0.91 on DTC to 0.50 on Davis kinases under verified zero drug overlap. The core idea is to compare a protein against an anchor protein known to bind a similar drug, and the reported gains show up across ESM-2, DrugBAN, and CoNCISE architectures .
4) Market Signals
AI agents are showing operating leverage, not just demos. SaaStr says it now runs the same revenue scale with 3 humans and 20 AI agents after putting $500K into the stack; it reports $1.5M of return in the first two months and a swing from -19% to +47% YoY growth. Specific examples include 15,000 outbound messages in 100 days at 5–7% response rates, an AI agent closing a $70K sponsorship, automatic qualified-meeting booking, and daily objective marketing analysis .
Founder formation is compressing. Andrew Chen says the first wave of non-technical founders who learned to code from AI now has a lower technical ceiling but roughly 10x faster iteration, and the old playbook of finding a technical cofounder is being replaced by building with Codex or Claude and distributing on X. He also argues investors should expect fewer “drawing on a napkin” pitches as software becomes faster to prompt into existence .
Chen’s product heuristic is useful for screening AI apps: reject “X but with AI” and ask instead
“the best products ask ‘if AI existed from day one, how would this experience be designed?’”
The agent-native infrastructure stack is filling in fast. Builders can now assemble agents with email, phone numbers, computers, browsers, crawling, memory, payments, voice, SaaS-tool access, and search; the main caveat from the discussion is that oversight needs to be designed in from the start .
Labor-market data in the notes cuts against the simple “AI kills coding jobs” story. A Business Insider-cited data point put software engineering openings above 67,000, the highest level in three years and roughly double the mid-2023 trough, while Andreessen argues AI-driven productivity can expand demand rather than eliminate roles .
Small-model economics are becoming more important as automation scales, but quality is still the constraint: Bindu Reddy says the cost of automating work is rising fast, making performant small models urgent, while also arguing that many current small models still struggle with nuance, instruction following, and tool use .
5) Worth Your Time
- Karpathy on LLM-built personal wikis — the clearest thread in the batch on AI-native research workflow: raw ingest, LLM-compiled markdown wiki, Obsidian as frontend, and incremental QA and linting. Sriram Krishnan’s reaction is also the investor takeaway: this looks more like a future product category than a permanent script bundle .
- Meta-Harness thread — concise walkthrough of why raw execution traces matter more than compressed summaries, and why harness engineering is becoming a first-class optimization surface .
- How the Inference Market Will Mature — a useful framework for underwriting where margin may live as inference gets cheaper: frontier labs, task-layer businesses, sovereign clouds, and eventually edge .
- Fused MoE Dispatch writeup and repo — worth a close read if you care about inference efficiency, cross-vendor kernels, or open alternatives to CUDA-heavy MoE optimization .
- Promise.ai case study and demo video — a concrete applied-genAI example: reconstructing 3D crime scenes from 2D photos .
clem 🤗
1) Funding & Deals
No new priced rounds were disclosed in the provided sources, so the most useful signals here are financing-adjacent: fundraising workflow, recycled founders, and founder-market fit.
NEXUS is building AI-native fundraising infrastructure. The company was started by a Berkeley/CMU team, with one founder currently at a YC-backed company . It says its AI pipeline analyzes founder and startup signals to produce better investor matches, and it has assembled a 3,000+ investor database while working with founders and mentors from YC, Sequoia, and a16z circles .
A prior-exit govcon operator is back in market with a vertical SaaS wedge. The founder says he sold his previous company in 2024 after 25+ years in federal contracting, and is now building pricing-analysis and back-office automation software for the sector . The beta reportedly reached 1,000+ users in 60 days through LinkedIn, Facebook, Reddit, and YouTube .
Another sourcing signal: deep domain operators are using AI to attack known blind spots. A founder with years of luxury dealership sales experience taught himself AI tooling, then built a dealership-focused SaaS around a problem he saw repeatedly on the sales floor . He says a few posts in industry Facebook groups produced pilot interest, investor conversations, a podcast invite, and partnership interest despite the product being pre-revenue .
2) Emerging Teams
An India-focused voice AI startup already has real production scale. It reports 2M+ calls per month, 4M+ leads processed, a peak of 200,000 calls in a day, and roughly 70% engagement on connected calls . Live customers include Swiggy, Flipkart, Zepto, Tata, Apollo, and HDFC Life, and the founder says many came in without formal pitches because ops teams wanted to remove spreadsheet-heavy workflows . The product was built around localized execution challenges: ~750ms latency, robust Hindi-English code-switching, and models designed for noisy real-world Indian environments rather than quiet US-office settings . Ops managers can now deploy workflows by prompt, self-serve has launched without contracts, and the team reports more candid responses from Tier 2/3 users when speaking to AI than to human callers .
Cabinet is one of the clearest open-source agent-infra traction stories in the batch. The solo-built project adds a persistent LLM knowledge-base layer that can ingest CSVs, PDFs, repos, and inline web apps, with agents running heartbeats and jobs on top . In less than 48 hours, it reported 309 GitHub stars, 31 forks, 5 PRs, 820 npm downloads, 59 Discord members, 4.7K website visitors, and 172K X views . Builders in the replies were already asking for a Cabinet Cloud waitlist, integrations, and templates .
Caliber suggests agent configuration management is becoming a real infrastructure category. The team built it after seeing production agents behave unpredictably when configs drifted from code . The product versions agent configuration as code and syncs it with the codebase to avoid stale instructions between test and prod . It says it has already reached 555 GitHub stars, 120 merged PRs, and 30 open issues .
Law4Devs is worth watching as regulatory-compliance infrastructure. The platform turns 19 EU regulations, 2,000+ articles, and 5,000+ requirements into a REST API, multiple SDKs, real-time updates, and CLI/CI-CD tests . The founder estimates it can cut compliance mapping from about 80 hours to about 2 hours per regulation as AI Act, CRA, and NIS2 deadlines cluster in 2026 .
3) AI & Tech Breakthroughs
Screening Attention is promising, but only if sparse kernels materialize. The mechanism replaces softmax with an absolute threshold, zeroing out low-similarity keys instead of forcing global competition . The paper claims roughly 40% fewer parameters at comparable loss and 3.2x lower latency at 100K context . In this implementation, a matched MultiscreenLM reached 191.3 test PPL versus 221.6 for TransformerLM on WikiText-2 , but PyTorch latency was still 3-66x slower than standard attention because the sparse pattern is computed with dense ops; a Triton kernel is still under development .
GStack Browse is a notable step forward in agent browser UX. Garry Tan says his Playwright CLI navigates in roughly 100ms versus 2-4 seconds with Claude in Chrome MCP . The new headed browser adds an interactive Claude Code sidebar for navigation, debugging, and CSS interaction, and it is open source under MIT . GStack overall claims 60k GitHub stars and about 30k daily developer users .
Structural Intelligence OS is an early but novel take on editable reasoning. Instead of retraining, the demo lets a user fork Brain A into Brain B and C, directly edit signals, strategies, and skills, and compare thought feeds, narrations, and performance side by side at 10x speed . The builder frames the product as "debuggable intelligence" and "real-time brain comparison" rather than black-box training .
PithToken targets a practical inference-cost wedge. The proxy sits between an app and OpenAI, Anthropic, or Google, compresses prompts in real time, and claims compound savings of 14.5% on turn 1, 46.7% on turn 5, and 70.9% on turn 11 . It also includes three-layer prompt injection detection and has been tested across Turkish, English, and German .
4) Market Signals
AI replacement intent is now highest in coordination-heavy SaaS categories. In Redpoint's March 2026 survey of 141 CIOs, the top categories for vendor replacement consideration were customer service management (26%), finance ops (21%), project management (20%), and salesforce automation (19%) . The same dataset says 54% of CIOs are actively pursuing vendor consolidation and 45% of AI budgets are replacing existing software budgets rather than adding new spend . AI-native support vendors such as Sierra, Decagon, and Fin/Intercom are already winning enterprise contracts against incumbents .
API dependence is getting riskier just as model portability improves. Clement Delangue warns frontier labs may eventually cut APIs in a compute-constrained world and prioritize their own products and customers . Andrew Chen argues strong AI UX can be portable across models, citing markdown-based workflows that can run on GPT or Opus , while frontier models may only stay 12-18 months ahead of open weights after distillation . He also says local models on current Apple hardware are already very usable for many use cases .
"Makes it scary and unsustainable to only build on top of their APIs!"
- "AI wrapper" is not a sufficient dismissal. Andrew Chen's list of the hard parts includes distribution without infinite CAC, AI-native UX, brand and trust, ecosystem/community, network effects, customer service, pricing, hiring, and fundraising .
"these are not easy!"
- Geography is still concentrating around U.S. AI hubs. Marc Andreessen says the tech industry is "more centralized in Silicon Valley than ever before" and that almost all top AI companies are located in a small area of California . Separately, nearly one of every two Canadian founders who raised more than $1M in 2024 are now based in the U.S., up from about one in five previously .
5) Worth Your Time
What CIOs Are Most Looking to Replace with AI Today — the best enterprise-demand map in the batch; replacement intent is already highest in customer service, finance ops, project management, and sales automation .
Andrew Chen on local/open models and portable AI UX — a concise thread on S-curves, model-portable UX, the 12-18 month distillation gap, and why local AI is becoming usable now .
Clement Delangue on frontier-lab API risk — short but important reading for anyone underwriting API-dependent application companies .
PyTorch implementation of Screening Attention — the quickest way to inspect whether the paper's quality gains survive real systems constraints .
Repos to inspect: GStack and Caliber — GStack is tied to ~100ms browser navigation claims, while Caliber is an agent-config-as-code project that says it crossed 555 stars and 120 merged PRs quickly .
Pika
Elad Gil
Elad Gil
1) Funding & Deals
Somos Internet — $40M Series B behind a vertically integrated connectivity thesis. Somos raised a $40 million Series B co-led by Bracket Capital and Ribbit Capital, with Not Boring Capital following on from its earlier investment . Founder Forrest Heath III is building a vertically integrated internet company in Medellín, Colombia, and the company says it is growing in Colombia while preparing to expand into Mexico .
Star Cloud — $170M and unicorn status in an ambitious compute-in-space theme. TechCrunch highlighted YC startup Star Cloud as having raised $170 million this week, pushing its valuation into unicorn territory . For this brief, it is more useful as a capital-allocation signal than as a direct early-stage comp.
2) Emerging Teams
Periodic Labs — the strongest founder-pedigree + frontier-application signal in the batch. Periodic is building an AI foundation lab for atoms aimed at materials science, chemistry, and other physical-world applications . CEO Liam Fedus came out of physics, worked on distributed training, mixture-of-experts, transformers, and sparsity at Google Brain, and later led post-training at OpenAI, where he helped turn GPT-4 into ChatGPT . The company’s architecture uses language models as an orchestration layer over specialized symmetry-aware atomic models and experimental loops, and it is positioning itself as an intelligence layer for enterprises bottlenecked by materials or process engineering . Fedus frames the upside as faster physical R&D in areas like semiconductors, aerospace, and energy . Elad Gil surfaced the company positively on No Priors, and Periodic says it is hiring across AI, infra, control, systems, and product engineering while compute remains the main capital cost .
Pierre Computer — AI-native git is becoming its own infrastructure category. Jacob Thornton, formerly at Coinbase, Medium, and Twitter and creator of Bootstrap, has built Code.storage, an AI-native git platform aimed at AI agents pushing code and repositories . Pierre claims a sustained peak of more than 15,000 repos per minute for three hours and more than 9 million repos in the last 30 days; Pragmatic Engineer notes those numbers are self-reported and that the product is still in closed beta . The core thesis is straightforward: GitHub is under growing load from AI agents, creating room for a purpose-built alternative .
Complir.io — compliance automation with real operating metrics. YC says Complir maps products to regulatory requirements, auto-generates documentation, and keeps compliance current as products or regulations change . The company says it already manages more than 100,000 products across Europe and is growing about 45% month over month .
Lumbox — tiny, but a real agent-infrastructure pain point with payment attached. Lumbox is a bootstrapped email infrastructure API for AI agents, built to handle OTPs, sign-up links, and approval flows through a real inbox plus a long-polling OTP endpoint . The founder says the first paying customer came from an agent that kept failing at email verification during automated account creation .
3) AI & Tech Breakthroughs
Turbo-Lossless points to a serious new inference-efficiency lever. Turbo-Lossless is a lossless 12-bit BF16 compression format that stores weights in 12 bits by replacing the 8-bit exponent with a 4-bit group code; for 99.97% of weights, decoding is one integer add, and the format is designed to be used directly during inference . Claimed properties include 1.33x smaller storage than BF16, fused decode plus matmul, and support for both NVIDIA and AMD GPUs . On an RTX 5070 Ti, the author reports 64.7 tok/s on Llama 2 7B and 2.70x multi-user throughput versus vLLM .
ZKML is moving from theory toward agent infrastructure. Clouded Judgement argues recent advances have reduced proof overhead from 1,000,000x to roughly 10,000x, with some frameworks now able to prove image-model inference in a few seconds through recursive SNARKs, GPU acceleration, and improved algorithms . The practical unlocks are model and input integrity, cryptographic receipts for agent actions, privacy-preserving inference, and eventually agent-to-agent trust .
OpenClaw plus Pi is a concrete early agent architecture to watch. Andreessen’s description is unusually specific: an agent is an LLM plus a Unix shell, file system, markdown state, and a cron-like loop . Because state lives in files, the agent can retain continuity across model swaps, inspect its own internals, and modify itself by adding new functions or features . Reported examples include health dashboards and sleep monitoring, smart-home control, and rewriting firmware for a robot dog .
Pika Labs is testing real-time video as an agent interface. Pika released a beta video chat skill for any AI agent via PikaStream 1.0, saying it preserves memory and personality, adapts in real time, and can execute agentic tasks during calls when paired with Pika AI Self .
4) Market Signals
GPU scarcity is worsening, not easing. SemiAnalysis reported customers fighting to pay $14 per hour per GPU for AWS p6-b200 spot instances, near-zero availability, some neoclouds no longer selling single nodes, and H100s renewing at the same rates as 2 to 3 years ago; Nathan Benaich said he is hearing the same .
Open models now win or lose on distribution, not just benchmarks. Interconnects says the practical rubric is performance, country of origin, license, tooling at release, and fine-tunability . The same essay argues Gemma 4’s success will depend more on ease of use than a 5 to 10 percent benchmark swing . Clement Delangue’s parallel framing is that if a release is neither frontier-pushing nor open-source, it increasingly gets ignored .
The B2B contest is now incumbents versus AI-native insurgents, and the fundraising bar reflects that. Andrew Chen says the core question across B2B verticals is whether incumbents incorporate AI faster than startups can disrupt them, or whether tools sold to incumbents lose to full-stack replacements . SaaStr’s analysis of 4,000-plus pitch decks says traction drives 75% of the score and that Series A founders are now competing against AI-native companies growing 300 to 500 percent or more at similar ARR . a16z’s shorter version is that tech spend is going to AI, while IT services are first on the chopping block .
Agent activity is already stressing developer infrastructure. Pragmatic Engineer says GitHub seems unable to keep up with the increase in infra load from agents, with Claude Code bot contributions up 6x in three months . Newcomer adds a second anecdote: Mintlify’s servers crashed overnight from OpenClaw traffic before the company even knew what OpenClaw was .
5) Worth Your Time
- No Priors x Liam Fedus on Periodic Labs — the best founder interview in the batch for understanding why current models may finally be good enough to connect to experiments in the physical world, and how Periodic thinks about language models as an orchestration layer over atomic models .
Marc Andreessen’s 2026 AI Thesis — useful for the combined case on reasoning, coding, agents, self-improvement, plus the OpenClaw agent architecture and edge/open-source distribution arguments .
Clouded Judgement: Zero Knowledge, Maximum Trust — the clearest essay in this batch on why AI agents may need a new trust layer, and why ZKML is becoming more practical .
The Pragmatic Engineer on GitHub vs. Pierre — the best concrete read on how agent-generated repos are creating a new infrastructure category, with enough traction data to pressure-test the thesis .
Interconnects on what makes an open model succeed — the most investor-useful framework in the batch for evaluating open-model releases beyond raw leaderboard position .
Yann LeCun
Yann LeCun
1) Funding & Deals
Anthropic’s Coefficient Bio acquisition is the clearest deal signal in the set. Anthropic acquired the stealth, Dimension-backed startup for just over $400M in stock. Coefficient Bio had been working on AI models for biological research and pursuing what the report describes as "artificial superintelligence for science"; it was formally founded only eight months ago and was half-owned by Dimension. The team is joining Anthropic’s Health Care Life Sciences group led by Eric Kauderer-Abrams, and Dimension says the outcome produced a 38,513% IRR.
YC Winter 2026 still appears fundable despite valuation skepticism. Newcomer says last week’s demo day had VCs buzzing, with investor interest centering on enterprise AI infrastructure, AI tooling for law and finance, and robotics infrastructure. The same report says there were fewer companies that could be dismissed as "ChatGPT wrappers" than in earlier batches .
2) Emerging Teams
Harmonic’s Hot 25 is a practical sourcing list for near-term company screening. In Q2 2026, Resolve AI moved to #1, Peec AI held #2, and Runlayer entered at #3. Gradient Labs was the biggest riser, up 20 spots to #4. Harmonic also flagged Sola (#8), Salient (#13), and Dust (#14) as companies likely to raise soon .
Moonlake AI stands out in the world-model wave. The 18-person San Mateo/SF startup is building long-running, multiplayer, action-conditioned causal world models bootstrapped from game engines, explicitly positioning this against short-horizon video-generation approaches . Founder Fan-yun Sun previously worked with NVIDIA Research on interactive worlds and synthetic data for embodied/RL agents, while Chris Manning brings Stanford NLP pedigree and a push toward more structured multimodal reasoning . Moonlake is also using a $30k Creator Cup and hiring across code generation, computer vision, and graphics to build its flywheel .
Dispatch Space is a differentiated YC hard-tech team to watch. YC says NASA pays 2x more to return ISS cargo than to send it up; Dispatch is building reentry vehicles and uncrewed space stations for microgravity products, and recently tested a 100x cheaper full-scale heat shield in the Mojave .
3) AI & Tech Breakthroughs
Gemma 4 is a meaningful step forward for local/open models. DeepMind released Gemma 4 under Apache 2.0, with a 31B dense model for raw performance, a 26B MoE model for lower latency, and 2B/4B variants for edge use . The 31B and 26B variants ship with 256K context, and both are natively multimodal across text, image, and video. On deployment, a llama.cpp demo showed Gemma 4 26B running locally at 300 tokens/sec on a three-year-old Mac Studio M2 Ultra.
Compliance agents are now beating humans on a real banking workflow. Taktile Labs, p0, and Parallel benchmarked seven frontier models on adverse media checks for KYB and reported AI agents scoring 14.60 versus 13.50 for humans on a 20-point rubric . A hybrid agent-human setup reduced analyst workload by 93%; cheaper models could still match human performance; and the reported failure mode was over-flagging, not fabricated evidence, with 0% hallucination in the dataset .
Document agents are becoming more auditable and permissioned.LlamaParse Extract v2 lets users define a schema in natural language and extract structured outputs with exact citations plus semantic inference . LiteParse adds high-quality spatial parsing with bounding boxes for audit trails back to the source document . LlamaIndex’s work with Auth0 is aimed at fine-grained RAG, so agents only see the documents they are allowed to access .
LeCun’s AMI Labs is a commercialization signal for the JEPA/world-model thesis. In the lecture, AMI Labs was introduced as a multi-billion-dollar unicorn. LeCun described the company’s direction as self-supervised learning for high-dimensional data to build systems that understand environments, plan hierarchically, adapt 0-shot, and remain controllable and safe . He also described V-JEPA as showing common sense and intuitive physics from video .
4) Market Signals
"Vibe coding" is converting into real software businesses, not just demos. RevenueCat says the number of new developers shipping their first production mobile app grew 40% in March to 200/day, up from 142/day the prior month and 25/day a year earlier . SaaStr attributes the surge to tools like Replit, Lovable, and Claude Code making it easier for new developers to build, ship, and monetize apps .
Enterprise AI usage is shifting into the procurement/compliance stack. LangSmith observability data across 6.7B agent runs shows Azure’s share of OpenAI traffic rising from 8% to 29% in 10 weeks. LangChain’s explanation is that early adopters went direct, while the enterprise wave prefers Azure’s existing compliance, security, and procurement infrastructure .
Security is increasingly being framed as an AI tailwind. 20VC argued that the agentic era is a "golden age of security" because threats are multiplying as software ships faster and agents operate with broader permissions . a16z separately called the software supply chain the most critical and least-defended attack surface and highlighted SocketSecurity detecting the Axios attack within six minutes. On the data side, posts this week used the Mercor leak and the earlier Claude Code leak to argue that containment via simple lockdown is getting harder .
"Instead of having people trying to get into your firewall, everyone is now downloading an agent, giving it full root access to their computer and telling it have a go."
- Investor sentiment remains skeptical of the "obvious" Series A winner. Michael Seibel backed Tim Draper’s view that Series A "winners" rarely become the real big winners, arguing that investors over-index on famous founders, polished decks, and press coverage, while companies like Tesla, Hotmail, Skype, Oklo, and Coinbase looked messy at the time but were solving real problems .
5) Worth Your Time
Harmonic Hot 25 Report — a fast screen for in-demand early-stage companies; current leaders are Resolve AI, Peec AI, and Runlayer, with Gradient Labs the biggest riser and Sola / Salient / Dust flagged as likely raise candidates .
Alex Blania on proof of human — useful for understanding why bots and deepfakes may force a new identity layer, plus the iris + multi-party computation + zero-knowledge proof architecture behind World ID and its current scale of 18M verified users / 40M app users.
Karpathy on LLM knowledge bases — a concrete workflow for compiling raw source material into a markdown wiki in Obsidian, then using agents for Q&A, linting, and new outputs; Jerry Liu says this is already practical with Claude Code and LiteParse.
Autoresearch and the experimental society — worth reading if you care about automated experimentation with guardrails; Karpathy’s loop found 20 genuine improvements and trained a GPT-2-level model 11% faster, while Shopify’s Toby Lütke used it to run 37 overnight experiments on qmd .
KYBench — a strong benchmark to keep handy if you’re underwriting compliance automation; it measures real KYB adverse-media work and shows both where agents now exceed humans and where they still over-flag .