Hours of research in one daily brief, on your terms.
Tell us what you need to stay on top of. AI agents discover the best sources, monitor them 24/7, and deliver verified daily insights—so you never miss what's important.
Recent briefs
Your time, back.
An AI curator that monitors the web nonstop, lets you control every source and setting, and delivers one verified daily brief.
Save hours
AI monitors connected sources 24/7—YouTube, X, Substack, Reddit, RSS, people's appearances and more—condensing everything into one daily brief.
Full control over the agent
Add/remove sources. Set your agent's focus and style. Auto-embed clips from full episodes and videos. Control exactly how briefs are built.
Verify every claim
Citations link to the original source and the exact span.
Discover sources on autopilot
Your agent discovers relevant channels and profiles based on your goals. You get to decide what to keep.
Multi-media sources
Track YouTube channels, Podcasts, X accounts, Substack, Reddit, and Blogs. Plus, follow people across platforms to catch their appearances.
Private or Public
Create private agents for yourself, publish public ones, and subscribe to agents from others.
Get your daily briefs in 3 steps
Describe your goal
Tell your AI agent what you want to track using natural language. Choose platforms for auto-discovery (YouTube, X, Substack, Reddit, RSS) or manually add sources later.
Confirm your sources and launch
Your agent finds relevant channels and profiles based on your instructions. Review suggestions, keep what fits, remove what doesn't, add your own. Launch when ready—you can always adjust sources anytime.
Sam Altman
3Blue1Brown
Paul Graham
The Pragmatic Engineer
r/MachineLearning
Naval Ravikant
AI High Signal
Stratechery
Receive verified daily briefs
Get concise, daily updates with precise citations directly in your inbox. You control the focus, style, and length.
Product Marketing
Big Ideas
1. AI is squeezing out transactional PM work and elevating judgment
Several practitioners expect AI to automate much of the transactional side of product work: running sprint ceremonies, generic project management, and writing uniform user stories. They argue that PMs who mostly do this will be fairly replaceable, while good PMs will be the ones making ideas happen, driving alignment, solving tough problems, and showing results. One commenter generalizes that AI will shrink many roles down to maintaining relationships and making strategic decisions, warning that non-decision-making roles, including PMs acting as project managers, are unsafe. Another summarizes product management as roughly 80% people and 20% technical, and notes that if there are no developers, there may be no need for PMs.
AI is also expected to standardize user stories across organizations so developers can focus on coding instead of reverse engineering what the PM meant, which raises the bar for PMs to develop deep business understanding and communicate concise technical requirements. World class PMs are described as the ones with sound business judgment, such as deciding whether their product should implement a Spotify Wrapped style feature.
Good PMs will be the ones who can make the ideas happen, driving alignment, solving the tough problems and showing results.
Why it matters
If AI can write specs, normalize user stories, and handle basic coordination, PMs who define themselves by rituals and ticket hygiene will be easiest to automate. The durable leverage sits in judgment, prioritization, and relationships.
How to apply it
- Review a recent week on your calendar and classify time as coordination mechanics versus framing problems, making trade offs, and aligning stakeholders. Intentionally shift one recurring meeting at a time toward higher judgment work.
- For each major initiative, explicitly document the core decision you are making and the business rationale. Treat that decision as the product, not the Jira board.
- When you are asked to add AI, steer the conversation back to which decisions, outcomes, or workflows AI would materially improve.
2. Full stack builders vs specialization: how far can teams collapse roles?
Several voices advocate for leaner, more builder centric organizations. Surge AI's CEO claims that product managers and data scientists do not belong on founding teams and that founders themselves should be the PM. Another commenter expects smaller companies to trend toward engineering led product decisions and suggests that, with the right mechanisms, organizations could eliminate engineering lead, project or program manager, and product manager roles and distribute those responsibilities across engineers and senior business stakeholders, leaving only a smaller set of product skills at the business end.
LinkedIn's recent decision to combine design and engineering into product, enabled by sophisticated LLMs, is cited as an example of this thinking. The move is framed as a way to reduce R&D headcount by two and to raise the bar for full stack builder roles in an era of AI driven vibe coding, though it also drew attention because the Head of Product announced it shortly before departing, which leaves open whether their successor will continue or reverse the vision.
Other practitioners push back strongly. They point out that teams have already tried relying on full stack individuals who code, design, and make product decisions, and that product managers emerged because this model was not working. One commenter notes that the most effective teams they know, which build useful and profitable products, still maintain distinct roles rather than expecting everyone to do everything. They frame the logical end point of the everyone full stack argument as every individual being their own company, responsible for coding, design, research, and sales, and argue that humans discovered the need for specialization long before we had language for jack of all trades.
They also describe practical problems at engineering dominated companies, including frequent rewrites in new languages, side projects customers never asked for, and significant issues that stem from not letting product lead. Others argue that the best outcomes come from a three legged stool where product, engineering, and business each counterbalance the others' biases: engineering's pull toward new technology, business's tilt toward short term impact, and product's role in holding customer and company outcomes together.
Why it matters
Choices about whether to collapse or specialize roles affect quality, speed, and burnout. AI and vibe coding are being used both to justify more ambitious builder roles and, in some cases, as cover for cutting headcount.
How to apply it
- Make your current model explicit: in your product area, who is expected to lead decisions, who is expected to build, and who is expected to sell? Use recent projects to illustrate where this worked and where it did not.
- In early stage startups where founders act as PMs, watch for signs that conflicting incentives, such as shipping fast versus protecting the platform versus solving customer problems, justify adding dedicated product leadership.
- If you work in an engineering led environment, track concrete issues such as unnecessary rewrites, unvalidated features, or lack of customer input. Use these as data points when arguing for clearer product ownership.
3. Customer proximity is still the core PM moat in a vibe coding world
A senior PM describes the most important part of the job as deep, consistent communication with users: sitting with them, watching how they use the product, asking questions, and identifying problems to solve. When PMs do not do this and instead decide priorities without user contact, almost any other random person can do their job, and engineers justifiably do not trust their decisions because there is no proof they are good. They emphasize that the primary PM skill of quickly building relationships and uncovering pain points is very different from the solitary, heads down mindset required to write complex code.
At the same time, AI assisted vibe coding is making it easier for anyone on a team to build prototypes or internal tools. One commenter imagines future agile like product teams where everyone codes or vibe codes to some degree. Another argues that vibe coding is effective for simple startups or internal tools but does not handle the complexity created by unique business requirements outside a model's training data, and that businesses cannot be scaled using vibe coding alone. Others warn that this approach risks producing low quality products that users do not want and is often used as an excuse to replace people and squeeze more from each remaining employee, rather than as a genuine empowerment tool.
Why it matters
AI lowers the cost of building something, but not of choosing the right thing to build. As tools make it easier to ship more software with fewer people, your advantage as a PM shifts even more toward how well you understand users and can translate their behavior into clear product decisions.
How to apply it
- Protect regular direct contact with customers as non negotiable PM work: interviews, shadowing sessions, and day in the life observations.
- Treat vibe coded prototypes as discovery artifacts for understanding workflows and value, not as proof that you can scale without dedicated design, research, or engineering expertise.
- When AI or vibe coding is positioned primarily as a way to reduce headcount, reframe the conversation around how these tools can free time for higher value discovery and decision work.
4. All in one positioning is usually a confession, not a differentiator
Product marketers caution against positioning B2B SaaS products as all in one solutions. To customers, this often signals that you have not decided what you are actually great at. It also expands your competitive set: if your accounting tool also handles inventory and payroll, you are now competing not only with accounting tools but also with ERPs and HR and payroll software.
They argue that customers do not buy all in one; they buy a solution to their single most pressing problem, and upsell comes only after you have solved that urgent need. The recommendation is to pick the most important problem you solve and lead with that. Later, once you reach sufficient depth, you can up level your messaging to own a broader space, such as being the finance solution for companies of a specific size, industry, and geography, while still driving your strongest products as best in class for your target ICP. The companies that win are described as those that craft a compelling story that forces prospects to lean in or walk away.
Why it matters
All in one positioning dilutes your story, attracts more competitors, and makes it harder for buyers to remember why they should care. Clear focus on one high stakes problem and ICP helps product, marketing, and sales align.
How to apply it
- Identify the single most important problem your product currently solves best and the segment that gets outsized value, and make that the lead in your homepage and sales materials.
- Define your ICP along dimensions like industry, company size, and geography, and refine your positioning to claim a specific space instead of generic all in one messaging.
- Maintain a list of adjacent problems you could solve, but treat them as expansion paths unlocked after you have clearly won the first one.
Tactical Playbook
1. Run disciplined discovery and define your ICP
A PM who ran 115 interviews in five months highlights several practices for higher quality discovery.
- Get good at distinguishing behavioural feedback from direct feedback; interviewees say many things, and it is the PM's job to infer what they truly want from what they actually do.
- Learn to filter through the sea of feedback and prioritize the most important features, bugs, and design changes, rather than reacting to every comment.
- Always account for time constraints and startup resources so you focus on work with the highest impact per unit of effort.
- Use structured questions such as asking for a rating from 1 to 10, then probing why they chose that score, to uncover richer insight than yes or no answers.
In startup contexts, founders are reminded to do customer discovery before building anything: validate that the problem is real, that many customers urgently want to solve it today, that they are already trying to solve it, and that they would pay for a solution. Commenters note that when teams do not invest enough effort in identifying their ICP and pain points and instead build on gut feel, they often struggle later to market and sell. Discovery work that looks for people who experience the problem so acutely that they are eager to use or even pay for a prototype can yield both early customers and stronger validation.
How to apply it
- For every new feature or product, write down your current hypothesis about the ICP and their top three pains, then use interviews to confirm or update each point.
- In interviews, always ask for recent concrete examples of the problem and follow with a 1 to 10 intensity rating and a why question.
- Treat willingness to pay or integrate a rough prototype into a workflow as a stronger signal than verbal enthusiasm.
2. Turn discovery into your first B2B customers
Founders share scrappy, concrete tactics for converting early discovery into revenue.
- One team used founder led cold outreach to ideal customer profiles, starting with graduates from their own colleges, asking for 1 to 1 feedback calls. These conversations both surfaced feedback and helped identify contacts who genuinely valued the product and later became customers.
- Another startup built an MVP in eight weeks, then iterated on outbound messaging for three months. Their first customer came from cold email, the second from an event, and the third from another cold email. When the first interested prospect requested an integration and workflow that did not yet exist, the team quickly built a prototype and demoed it from localhost, winning a few weeks to finish the work as part of the prospect's buying and validation process.
- They deliberately avoided free POCs or trials because they wanted to see whether the problem was acute enough that prospects would pay without extensive validation. In their case, this approach worked.
- Another company landed its first customer through a co founder's existing relationship. The customer initially subscribed more for consulting from the co founder, who was a domain expert, than for the product itself. As the team rapidly improved the product and added value, the customer did not churn. They now use their own software while still doing consulting, have reached 100k ARR, and plan eventually to replace parts of this manual work with AI agents and workflows. The founder notes that they likely could not have done this without the co founder's deep expertise in the space, a classic example of founder market fit.
How to apply it
- During discovery, look for prospects who are willing to spend time integrating your prototype into their workflow or let you observe a day in the life; these are strong candidates for early customers.
- Plan for an intensive period of outbound, event presence, and fast iteration on messaging after your first MVP, and use paid commitments rather than free pilots as your main validation signal where feasible.
- If you lack deep domain expertise, consider partnering with someone who has it; their authority can make early sales and discovery significantly easier.
3. Kill ideas cleanly and resurrect them intentionally
After pivoting twice, one founding team noticed that features they had killed for good reasons, such as bad timing, lack of resources, or an unready market, kept resurfacing. New hires would excitedly propose exactly the same ideas, triggering long meetings where the team tried to remember why they said no. They also realized that sometimes the market had shifted enough that they should have revisited these ideas, but the concepts were buried in a Notion graveyard that nobody checked.
To address this, they introduced quarterly zombie reviews in which they systematically review previously killed features and ask whether each one is still dead or just sleeping. They report finding a few promising ideas this way when market conditions changed. The thread also raises questions about how teams document why an idea was killed, whether they ever revisit old ideas systematically versus at random, and how they handle situations where new hires reinvent previously rejected ideas.
How to apply it
- When you decide not to pursue an idea, record a brief reason for the decision alongside it, not just the status.
- Schedule recurring zombie reviews where you quickly scan this list and categorize each idea as still dead or worth revisiting given new evidence or market shifts.
- When new teammates suggest ideas that match old ones, use the log to either share past context or explicitly agree that conditions have changed enough to justify a new attempt.
4. Update your experimentation toolkit: A/B tests, Statsig, and evals
On experimentation, practitioners call for both discipline and better tooling.
- The PM who ran 115 interviews notes that you should stay consistent with A/B testing but also know when enough is enough; in fast moving startups, clinging to an underperforming test for too long can stall progress.
- For teams where feature testing and experimentation are central, one PM recommends Statsig. With a product analyst overseeing configuration, they found Statsig very powerful for testing and analysing product changes and specific feature rollouts, and for measuring impact in ways they had not achieved with other tools, reducing guesswork and improving product decisions.
- For nondeterministic products such as many AI features, one commenter advises focusing on evals, arguing that repeated evaluations can almost replace traditional acceptance criteria.
How to apply it
- For each experiment, define clear start and stop conditions, including minimum sample size and a maximum runtime after which you will stop even if results are inconclusive.
- If experimentation is a core capability and you have analytical support, consider tools like Statsig and explicitly assign ownership for setup and ongoing configuration to a product analyst.
- For AI features, design eval suites that capture desired behaviours across representative scenarios and run them continuously as you change prompts or models, instead of relying solely on deterministic acceptance criteria.
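To make the last bullet concrete, here is a minimal sketch of what a continuously run eval suite can look like. The scenario format, the stand-in call_feature function, and the crude lexical scoring rule are illustrative assumptions, not any particular eval framework:

```python
# A minimal sketch of an eval suite for a nondeterministic AI feature,
# replacing one-off acceptance criteria with repeated scenario checks.
# `call_feature`, the scenarios, and the scoring rule are illustrative
# stand-ins, not any specific product's API.

SCENARIOS = [
    # Each case: user input plus phrases a good answer should contain.
    {"input": "Summarize this refund policy for a customer",
     "must_include": ["30 days", "refund"]},
    {"input": "Draft a reply declining a feature request politely",
     "must_include": ["thank", "roadmap"]},
]

def call_feature(prompt: str) -> str:
    """Stand-in for the real model call (prompt + model + tools)."""
    return "Thank you! We track requests on our roadmap. Refunds: 30 days, full refund."

def score(output: str, must_include: list[str]) -> bool:
    """Crude lexical check; swap in a rubric or an LLM judge as needed."""
    return all(phrase.lower() in output.lower() for phrase in must_include)

def run_evals() -> float:
    passed = [score(call_feature(c["input"]), c["must_include"]) for c in SCENARIOS]
    return sum(passed) / len(passed)

if __name__ == "__main__":
    # Re-run on every prompt or model change and track the pass rate over time.
    print(f"pass rate: {run_evals():.0%}")
```

In practice the phrase check would be replaced by rubric grading or an LLM judge, but the shape stays the same: fixed scenarios, repeated runs, and a pass rate you track as prompts and models change.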
5. Make accessibility a product priority with clear ownership
One PM argues that accessibility is primarily a UX responsibility, similar to how code quality sits with developers and test quality with QA, but emphasizes that PMs still need to ensure accessibility requirements are given sufficient attention so the product does not accumulate accessibility debt.
How to apply it
- Treat accessibility as part of the product's quality bar and ensure it receives time in planning discussions, while making clear that UX and engineering own implementation details.
- Track accessibility gaps alongside other forms of tech debt so they are not permanently deprioritized.
Case Studies and Lessons
1. Design resourcing, ROI, and the reality of constrained budgets
A product owner working closely with managing directors describes how, outside of well funded AI efforts, there is simply less money for many launched, day to day products, shifting the focus squarely to ROI. In this environment, their company uses part time UX contractors for interfaces, apps, and websites, and finds that this can be sufficient without full time hires. They contrast this with a previous full time designer who gold plated everything and felt disconnected from business priorities, indirectly driving developers to work on less important tasks and contributing to a company in shambles.
Another digital product director overseeing a suite of interdependent products offers a complementary view. They describe having lived through having a designer, losing that designer due to budget constraints, fighting to regain design support, and finally getting it back. In their experience, everyone, especially executives, wants frictionless, seamless, easy to use experiences but often does not understand how those are created. Product owners mocking up what they think will work can get projects off the ground in the short term, but when there is an ecosystem of experiences that must share design elements and patterns, a dedicated design partner working across teams is critical. Without that, they saw experiences that were incohesive at best and clunky or unusable at worst.
Takeaways for PMs
- In constrained environments, part time or contract design can work, but only if prioritization is tightly tied to business value and teams avoid gold plating that diverts effort from critical work.
- For products that form an ecosystem, cross product design leadership is not a luxury; it is often the difference between a coherent, usable experience and a fragmented one.
2. Debating whether PM roles can be eliminated
A separate debate questions whether dedicated PM roles can be removed altogether. Some argue that, especially in smaller companies and early IPO stages, engineering led product decisions will become more common and that, with the right mechanisms, organizations could distribute product and project responsibilities among engineers and senior business stakeholders, reducing the need for PMs.
Counterarguments stress that giving engineers unchecked decision power can lead to constant rewrites in the latest language and side projects nobody asked for. Others note that while good engineers should think long term and understand customers, in practice many have little desire to step into the customer mindset. One PM argues that removing roles like product manager or engineering lead may save headcount but will hurt quality, and reiterates the value of a three legged stool where product, tech, and business each balance the others. Another practitioner from an engineering dominated company many readers would recognize reports significant issues that stem from not letting product lead.
Takeaways for PMs
- Be prepared to articulate and demonstrate your unique contribution, especially in environments flirting with engineering only or business only decision models.
- Use concrete examples of costly rewrites, unused features, or misaligned incentives to show why dedicated product leadership improves outcomes.
Career Corner
1. Building AI savvy without becoming an AI only PM
Multiple PMs describe using AI heavily as a tool, even when they do not manage AI products directly. Examples include:
- Using LLMs to condense long industry articles into key points relevant to their field and to scan for breaking competitor news or industry pattern shifts, saving about an hour per day on research and allowing deeper dives where needed.
- Using Figma AI to quickly mock up wireframes that illustrate ideas to stakeholders and provide a high level view for UX specialists to refine.
- Having LLMs do much of the data parsing and metric creation work, then double checking the math.
- When they have access to AI coding tools, using them to create shell prototypes and working prototypes, and in some cases spending a massive amount of time coding with AI help.
- Using AI to explore and understand code repositories, improving their technical grounding.
At the same time, one PM questions why they personally need to become an AI PM when they already use AI for analysis and do not necessarily need to ship AI features. Another commenter suggests that for PMs working on AI products, the focus should be on learning how to build AI based services, understanding customer needs, and mastering build trade offs rather than chasing specific tool stacks. They recommend going through tutorials on building AI agents using concepts like RAG and MCP with both low code and full code approaches, studying Andrew Ng's DeepLearning.AI content, especially the RAG modules, and ensuring you can clearly explain common modern AI concepts at a high level.
How to apply it
- Use AI as leverage in your current role for research, analysis, prototyping, and communication, even if your product is not marketed as an AI product.
- If you are moving into AI heavy products, invest in understanding fundamental concepts and build trade offs, and practice by building small agents end to end using both low code and code oriented tools.
2. Technical depth, non technical PMs, and role convergence
Some commentators believe the future is not bright for non technical PMs, expecting much more convergence between PM and software engineering roles. One says that if they were 20 years old today, they would get a computer science degree, work as a software engineer, and then transfer into a PM style role. Another expects the PM role to change drastically over time, with more technically capable PMs emerging from technical backgrounds.
At the same time, others argue that there is a bigger problem in engineers not understanding what they are building than in PMs not coding, and that PMs are already more likely than engineers to upskill across disciplines such as consumer behaviour, analytics, and marketing. They describe engineers as lego builders who need space to build and iterate, while PMs focus on strategizing, vetting ideas from engineering, sales, and marketing against user insights and feasibility, and mediating conflicts of interest between building for customers and maintaining a sustainable platform.
How to apply it
- If you enjoy technical work, deepening your coding skills and system understanding is a strong hedge and can open PM roles that expect full stack builders.
- Regardless of your background, continue building fluency in both user behaviour and business levers so you can play the coordinator role between engineering and commercial teams.
3. Navigating mid career meaning and a tough market
A therapist and product leader with 18 years in tech, currently at IBM, describes a period of three to four years operating without a sense of meaning, during which they lost clients and nearly went bankrupt. They note that this loss of meaning often hits mid career when people have done the same job long enough that there is little challenge left or have reached their definition of success and feel empty. For many high achievers, impostor syndrome fuels effort; when goals are met, they look around and ask whether this is all there is. In highly competitive SaaS environments, they have also seen people burn out because everyone is playing on hard mode all the time, with rare successes and constant exhaustion.
In their own case, the process of making back the lost money and doing more one on one therapy work gave them meaning and drive again. They recommend not making big decisions such as quitting jobs or starting businesses while still in crisis. Instead, they suggest trying minimum viable versions of potential changes, such as running a business on the side, sending out resumes while still employed, or testing hobbies, while you still have stability. They also describe structured reflection exercises to identify what makes you feel whole and the importance of getting some distance and a mirror, such as a therapist, to see yourself clearly.
Separately, in a discussion about the current PM job market, another commenter advises simply making as much money as you can in your current role because the market is difficult and the direction of change is uncertain, and notes that when no one is hiring, there is often little you can do beyond investing your salary and growing your net worth.
How to apply it
- If you feel a loss of meaning, avoid drastic moves while in crisis; instead, test minimum viable versions of changes you are considering while maintaining stability.
- Use reflection and external mirrors to identify work that makes you feel whole, and be honest about whether your current role or industry supports that.
- Given the current market, maintain financial resilience by maximising earnings where you are and treating savings and investments as part of your risk management as a PM.
Tools and Resources
1. Statsig for feature experimentation
For teams where feature testing and experimentation are a strong pull, Statsig is recommended as a particularly strong tool. In one team, a product analyst oversaw the configuration with a focus on experimentation; the tool proved powerful for testing and analysing product changes and specific feature rollouts. They highlight that Statsig allowed them to measure the impact of changes in ways they had not been able to achieve with other tools, helping ensure they were making the right changes instead of guessing, and driving better product decisions.
How to use it
- If experimentation is central in your team and you have analytical capacity, pilot Statsig on a few key features, with a product analyst owning setup and guardrails.
2. Git guide for vibe coding PMs
A PM has created a concise Git guide aimed specifically at product managers. The guide explains Git concepts using product friendly examples and shows how PMs can use Git to vibe code, explore repositories, understand changes, and communicate better with engineers. It responds to the rise of vibe coding tools and the resulting shift of PM roles toward more collaborative builder work, arguing that PMs need to understand branches, pull requests, code reviews, and how to collaborate effectively with dev teams.
How to use it
- If you are increasingly using AI coding tools or contributing to prototypes, use this guide to build enough Git fluency to collaborate confidently with engineers and manage your own small changes.
3. Customer discovery playbook for first B2B customers
A commenter recommends Lenny's article on how to win your first 10 B2B customers and shares how they used similar approaches: hand-to-hand, founder-led sales, cold outreach for 1 to 1 interviews, and iterating messaging until it resonated. The linked resource distils patterns for turning discovery into paying customers.
How to use it
- Read the guide at https://www.lennysnewsletter.com/p/how-to-win-your-first-10-b2b-customers?utm_source=publication-search and adapt the outreach and validation tactics to your own ICP and product stage.
4. AI and cloud learning resources
For PMs investing in AI skills, one commenter recommends Andrew Ng's DeepLearning.AI content, particularly the RAG modules, as a strong resource. Combined with hands on tutorials building AI agents with RAG and MCP using both low code and full code tools, this can give PMs a grounded understanding of AI based services.
On the infrastructure side, cloud certifications are described as nice to have rather than essential. One practitioner notes that experience and time in role matter more than certifications, especially if you are paying out of pocket, but suggests the AWS Associate Solutions Architect certification as a good option if you do pursue one. Others add that cloud certifications are most relevant when you work on products closely related to cloud ecosystems or platforms, and that some technical product owner and technical product manager roles list them as nice to have requirements to signal general understanding of the technology and better communication with engineers.
How to use it
- For AI, pick one structured resource such as the DeepLearning.AI RAG content and pair it with at least one small agent you build yourself, so the concepts stick.
- For cloud, consider certifications mainly if your product is cloud centric or you are targeting technical PM roles that list them, and weigh the cost against gaining direct experience on relevant products.
5. Claude for product specs
One PM notes that Claude is already excellent at writing product specs and that this is allowing fewer product managers to do more work, a pattern they expect to become standard in the future.
How to use it
- Use tools like Claude to draft specs from structured inputs such as discovery notes and acceptance criteria, then refine and verify the output yourself to maintain quality and context.
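As one possible shape for that workflow, here is a hedged sketch using the Anthropic Python SDK (pip install anthropic); the model ID, prompt template, and discovery inputs are placeholders, not a prescribed setup:

```python
# Sketch: draft a product spec from structured discovery inputs, then
# have the PM review and edit. The model ID below is a placeholder;
# check current Anthropic model names before using.
import anthropic

discovery_notes = "Users abandon checkout when shipping costs appear late..."
acceptance_criteria = "- Shipping cost visible on the cart page\n- No extra clicks"

prompt = f"""You are drafting a one-page product spec.

Discovery notes:
{discovery_notes}

Acceptance criteria:
{acceptance_criteria}

Write: problem statement, proposed solution, success metrics, open questions."""

client = anthropic.Anthropic()  # expects ANTHROPIC_API_KEY in the environment
message = client.messages.create(
    model="claude-sonnet-4-5",  # placeholder model ID
    max_tokens=2000,
    messages=[{"role": "user", "content": prompt}],
)

draft = message.content[0].text
print(draft)  # the PM, not the model, still owns the decision and final wording
```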
LocalLLM
Nathan Lambert
Big-picture risk, safety, and sovereignty
Hinton warns of 10–20% extinction risk and calls open weights dangerous
Geoffrey Hinton says there is a significant chance advanced AI systems will become smarter than humans within about 20 years and then wipe us out, and a host summarizing his view puts the odds of human extinction at 10–20%. He anticipates major job disruption within roughly five years, noting that AI is already making it harder for paralegals and junior lawyers to find work and is likely to outperform poorly paid, lightly trained call‑center workers, with unclear alternatives for them. Hinton argues AI is not a technological bubble—systems are rapidly improving—but warns that unshared productivity gains could drive huge social disruption even if some investors never recoup their bets.
He contrasts labs’ priorities, saying DeepMind’s founders and Anthropic are genuinely focused on long‑term safety (Anthropic having been founded by researchers who left OpenAI over safety concerns), while OpenAI has shed safety researchers, devoted fewer resources to safety, and shifted toward winning a competitive race for the best chatbot after moving to a for‑profit structure. Hinton distinguishes beneficial open source code from what he calls dangerous open weights: releasing traditional software lets others find bugs, but releasing large model weights enables adversaries to fine‑tune systems into powerful tools for cyberattacks, bomb‑making, or other misuse that original developers tried to prevent. As a potential path to coexistence with smarter‑than‑human AI, he proposes building systems whose nature makes them care more about humans than themselves—analogous to a mother’s instinct—and expects unusually strong international alignment on preventing AI takeover, comparing this to US–Soviet cooperation to avoid nuclear war.
I just think there’s a significant chance these things will get smarter than us and wipe us out.
Poland’s PLLuM project shows a sovereign path with small localized models
Poland’s National Information Processing Institute is pursuing AI sovereignty with PLLuM, a family of small, open‑source large language models adapted specifically to Polish language and culture. Project lead Marek Kozlowski describes these as localized LLMs: models tailored to a particular language or domain that can match the quality of much larger frontier systems in that niche while being roughly an order of magnitude smaller, and designed to be open, transparent, secure, and grounded in organic data.
Because they lack the roughly 1 trillion tokens needed to stably pre‑train an 8B‑parameter model from scratch, the team performs language adaptation by continuing pre‑training of Llama‑ and Mistral‑based models on a curated Polish corpus of a few hundred billion tokens after deduplication, then fine‑tuning and aligning the resulting base models. They document their methods in a nearly 100‑page arXiv paper on the PLLuM family, publish cookbook‑style recipes and samples of training data and preferences, and emphasize that open source should include not just open weights but also example data and detailed training procedures. The project relies heavily on human‑crafted instructions and preference data, maintaining dozens or hundreds of annotators to produce high‑quality, organic supervision and sharing representative samples while keeping most of this data as a strategic asset.
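For a feel of the mechanics, the two-stage recipe (continued pre-training on a curated corpus, then instruction tuning) can be sketched roughly as follows with Hugging Face tooling; the base model, file names, and hyperparameters are illustrative assumptions, not PLLuM's actual configuration:

```python
# Schematic sketch of language adaptation by continued pre-training.
# Base model, corpus path, and hyperparameters are illustrative only.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from datasets import load_dataset

base = "mistralai/Mistral-7B-v0.3"  # stand-in for the Llama/Mistral bases used
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.pad_token or tok.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Stage 1: continue pre-training on a deduplicated monolingual corpus.
corpus = load_dataset("text", data_files="curated_polish_corpus.txt")["train"]
corpus = corpus.map(lambda b: tok(b["text"], truncation=True, max_length=2048),
                    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="adapted-base", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=corpus,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
# Stage 2 (instruction fine-tuning on ~1,000-3,000 curated task examples)
# and preference alignment would follow the same Trainer pattern on labeled data.
```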
Kozlowski reports that for many government and enterprise deployments, where there are perhaps 10–20 well‑defined use cases, small locally hosted models fine‑tuned on 1,000–3,000 task‑specific instructions can reach similar or even higher quality than large cloud LLMs used in zero‑ or few‑shot mode, while saving energy and enabling on‑prem control and compliance. For very large organizations, he describes domain adaptation projects where continuing pre‑training on at least 10 billion tokens of internal text can yield substantial gains on financial and other specialized tasks, though he notes that only a limited number of European companies possess corpora of this size. At the same time, he has observed that newer generations of Anthropic and OpenAI models sometimes perform worse, not better, on Polish language and cultural benchmarks as labs refocus on capabilities like software‑developer assistance, creating real risk for countries that depend solely on external frontier models.
The work is funded by Poland’s Ministry of Digital Affairs and implemented by a consortium of institutes and universities as a public initiative focused on legal compliance, transparency, organic data, and suitability for on‑prem deployments in the public sector. Kozlowski says EU instruments such as the AI Act and stricter authorship‑rights rules can remove up to 80% of potential training data and that some major external models, including Llama 3.4 and Kimi, explicitly forbid use in the EU, pushing European builders toward carefully curated local datasets. He also argues that frontier scaling is beginning to plateau because big labs have already collected nearly all available organic internet and book data, making improvements from models like GPT‑4 to GPT‑5 more incremental and reinforcing the value of smaller, well‑adapted local systems.
Grok’s expanding real-world footprint
Grok credited with catching near-ruptured appendix after ER misdiagnosis
A 49‑year‑old man says xAI’s Grok saved his life after an emergency room doctor misdiagnosed his near‑ruptured appendix as acid reflux and sent him home. After 24 hours of severe pain, he entered his symptoms into Grok, which flagged either a perforated ulcer or atypical appendicitis and told him to return immediately and insist on a CT scan; doctors then discovered an inflamed appendix on the verge of rupture and removed it within six hours, leaving him pain‑free when he awoke.
Fearing that clinicians would dismiss advice coming from an AI, he told them a sister who is a nurse had urged the scan instead. Coverage of the case went viral, with some commenters calling it proof that AI can spot what overworked doctors miss and saying they would welcome AI doctors if it meant better care. Elon Musk separately highlighted that Grok had saved a man’s life in Norway and replied ‘Cool’ to one popular thread, whose author framed the episode as an example of his long‑standing prediction that AI‑powered medicine would arrive sooner than many expect.
Grok moves deeper into coding tools and scales usage ahead of Grok 4.20
Grok now powers Junie, JetBrains’ coding agent, which its promoters say can explain messy code, gather context across a codebase, write professional‑grade code, and debug faster than a human team, while the Grok Code Fast model is designed to plug directly into IDEs with minimal friction. Musk amplified the launch with a succinct invitation to developers to ‘Try it out.’
On the usage side, new OpenRouter statistics show Grok 4.1 Fast processing 1.16 trillion tokens in a single week, topping the platform’s leaderboard ahead of Grok Code Fast 1, Claude Sonnet 4.5, Gemini 3 Pro, and DeepSeek V3. Musk also announced that Grok 4.20 is scheduled for release in ‘3 or 4 weeks,’ suggesting a rapid iteration cadence alongside growing deployment in both general assistance and specialized coding workflows.
Enterprise agents and infrastructure
AWS unveils Frontier Agents, Trainium 3, and an on-prem AI factory
AWS introduced Frontier Agents, a set of AI agents positioned as extensions of software development teams: the Kiro Autonomous Agent, a software‑development agent for code generation, an AWS Security Agent to help build more secure applications, and an AWS DevOps Agent that aims to reduce alert noise via always‑on incident triage, guided resolution, and recommendations to improve reliability and performance. Amazon also released Trainium 3, a new AI chip described as a training accelerator designed for cost‑efficient model training tightly integrated with AWS services and a custom dataflow architecture, in contrast to general‑purpose Nvidia GPUs and Google TPUs.
Complementing its cloud offerings, AWS announced an ‘AI factory’ program that brings AI infrastructure—including Nvidia GPUs, Trainium chips, and AWS networking, storage, and database services—directly into customers’ own data centers, enabling organizations to deploy dedicated, on‑prem AI stacks rather than relying solely on public‑cloud GPUs. Together, these moves underline Amazon’s push to serve both cloud‑native and regulated or latency‑sensitive customers looking for managed yet locally hosted AI capabilities.
Google’s Workspace Studio turns Gmail and Chat workflows into Gemini agents
Google launched Workspace Studio, an AI agent builder for business users that lets them describe tasks for Gemini—such as routing any email containing a question into a ‘to respond’ label and sending a chat ping—and automatically turns those descriptions into multi‑step automations. In Google’s example, Workspace Studio generates a Zapier‑like workflow: a trigger for incoming email, a decision step to check whether the message contains a question using the email subject and body, conditional labeling, a Gemini call to extract all questions, and a final step that posts a summary into chat.
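A rough Python reconstruction of that generated workflow might look like the sketch below; every helper here (gemini, add_label, post_to_chat) is a hypothetical stand-in, not Workspace Studio's actual interface:

```python
# Schematic reconstruction of the workflow Google describes:
# trigger -> question check -> label -> extract questions -> chat ping.
# All helpers are hypothetical stand-ins, not a real Workspace API.

def gemini(prompt: str) -> str:
    """Stand-in for a Gemini call; in reality this would hit the Gemini API."""
    if prompt.lower().startswith("does this email ask a question"):
        return "yes"
    return "- When will my order ship?"

def add_label(email_id: str, label: str) -> None:
    print(f"[gmail] label {email_id} as {label!r}")  # stand-in for Gmail labeling

def post_to_chat(message: str) -> None:
    print(f"[chat] {message}")  # stand-in for a Chat ping

def handle_incoming_email(email: dict) -> None:
    """Trigger: runs when a new email arrives."""
    text = email["subject"] + "\n" + email["body"]

    # Decision step: does the subject or body contain a question?
    verdict = gemini("Does this email ask a question? Answer yes or no.\n" + text)
    if not verdict.strip().lower().startswith("yes"):
        return

    add_label(email["id"], "to respond")  # conditional labeling

    # Gemini call: extract all questions from the message.
    questions = gemini("List every question asked in this email:\n" + text)

    # Final step: post a summary into chat.
    post_to_chat(f"Email from {email['sender']} needs a reply:\n{questions}")

handle_incoming_email({"id": "123", "sender": "customer@example.com",
                       "subject": "Order status?",
                       "body": "When will my order ship?"})
```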
Commentators describe it as essentially Zapier built with AI for the Google ecosystem of Gmail, Drive, Chat, and Calendar, with rollout to Workspace business customers planned over the coming weeks. If successful, it could move many organizations from simple email filters toward richly agentic workflows without requiring explicit coding skills.
Open models and low-latency speech
Mistral 3 family offers Apache-licensed 3B–14B models competitive with DeepSeek
French lab Mistral released its Mistral 3 family of language models, including 3‑billion, 8‑billion, and 14‑billion‑parameter variants. Benchmarks reviewed in one analysis show these smaller models performing roughly on par with DeepSeek 3.1 and Kimi K2 across a range of tasks, while all are released under the permissive Apache 2.0 license, making them fully open source for commercial as well as research use. This combination of competitive performance and a liberal license strengthens the toolkit for teams looking to deploy capable models entirely under their own control.
VITA-Audio cuts voice assistant latency with multi-token audio prediction
A recent paper on VITA‑Audio tackles the problem of lag in AI voice assistants by introducing Multiple Cross‑Modal Token Prediction (MCTP), which predicts up to 10 audio tokens in a single forward pass instead of generating them strictly one by one. The authors describe a four‑stage progressive training pipeline: first aligning audio and text using ASR, TTS, and text‑only data; then training a single MCTP module with gradient detachment; scaling to multiple MCTP modules via progressive convergence; and finally performing supervised fine‑tuning on speech QA datasets.
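To illustrate the core idea, here is a toy sketch of multi-token prediction heads emitting several audio tokens from a single forward pass; the dimensions, the GRU backbone, and the head design are illustrative assumptions, not the paper's architecture:

```python
# Toy sketch of the MCTP idea: predict several future audio tokens from one
# forward pass instead of one token per pass. Sizes are illustrative; the
# paper's four-stage training pipeline is not shown.
import torch
import torch.nn as nn

class ToyMCTPHead(nn.Module):
    def __init__(self, d_model=512, vocab=1024, n_future=10):
        super().__init__()
        # One projection per future position (token t+1 ... t+n_future).
        self.heads = nn.ModuleList(nn.Linear(d_model, vocab) for _ in range(n_future))

    def forward(self, hidden):  # hidden: [batch, d_model]
        # Each head emits logits for one future audio token, in parallel.
        return torch.stack([h(hidden) for h in self.heads], dim=1)  # [B, n_future, vocab]

backbone = nn.GRU(input_size=512, hidden_size=512, batch_first=True)
mctp = ToyMCTPHead()

x = torch.randn(2, 16, 512)          # a batch of 16-step feature sequences
_, h_last = backbone(x)              # h_last: [1, batch, 512]
logits = mctp(h_last.squeeze(0))     # [2, 10, 1024]
next_tokens = logits.argmax(dim=-1)  # 10 audio tokens per forward pass
print(next_tokens.shape)             # torch.Size([2, 10])
```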
Experiments show only about a 9% performance drop when moving from speech‑to‑text to speech‑to‑speech mode, while significantly reducing first‑token latency and overall inference time and preserving strong cross‑modal understanding. Commenters note that such latency improvements are particularly relevant for real‑time applications like live translation and accessibility tools, where every fraction of a second in response time affects usability.
Research integrity, few-shot learning, and eval culture
GPTZero flags 50 hallucinated citations in ICLR 2026 submissions
GPTZero reports that, after scanning 300 ICLR 2026 submissions, it detected 50 hallucinated citations—fabricated references that did not correspond to real sources. Some of the affected papers are described as top‑tier, with reviewer scores of 8 or higher and likely oral presentations, and the fake citations were missed by all three to four human reviewers assigned to each.
A more detailed account is available in GPTZero’s write‑up. The findings highlight emerging challenges as researchers increasingly rely on generative tools for literature review and drafting, and they raise questions about how peer review and conference tooling might need to evolve to reliably catch AI‑fabricated references.
NeurIPS highlights: radical single-example few-shot learning and eval-first thinking
At NeurIPS, the paper CompressARC—winner of a 3rd‑place paper award—is being praised by practitioners as the most interesting and novel ML work they have read all year for demonstrating pure few‑shot learning on the ARC‑AGI benchmark from a single example, with no dataset and no pretraining. A blog post linked by one commenter provides further technical details on how the method accomplishes this highly data‑frugal form of learning.
Allen AI researcher Nathan Lambert is also giving a NeurIPS talk titled ‘The story of Olmo 3 (post‑training), told through evals,’ emphasizing that good researchers obsess over evaluations and sharing public slides that walk through how evals shaped Olmo 3’s post‑training trajectory. Together, these threads underscore a growing emphasis on alternative learning recipes that rely less on massive pretraining and more on careful task design, as well as an emerging culture that treats rigorous evaluation as central to model development rather than an afterthought.
Elon Musk
Brent Mayo
Bill Gurley
Overview
Today’s organic recommendations span quantified health experimentation, industrial creation processes, business history, and a current podcast conversation. Bryan Johnson’s detailed sauna thread, amplified by Garry Tan, stands out for its combination of explicit protocol and quantified outcomes across multiple biomarkers.
Standout recommendation: Bryan Johnson’s quantified sauna protocol thread
"Sauna is one of the most effective health protocols I’ve done. Here is everything I’ve learned; it’s the most robust characterization ever produced."
Garry Tan shares Bryan Johnson’s long X thread on sauna with the comment:
"Guys I told you sauna is awesome
Haters just ignore this, sauna isn’t for you"
Resource details
- Title/description: Sauna protocol and personal-results thread (opening line quoted above).
- Content type: X thread / article-length post.
- Author/creator: Bryan Johnson (@bryan_johnson).
- Link: https://x.com/bryan_johnson/status/1997403290171330638
- Recommended by: Garry Tan (@garrytan).
- Recommender takeaway: Tan reinforces his view that "sauna is awesome" and tells skeptics to ignore the thread if sauna "isn’t for you," signaling strong personal conviction about its value.
- Why it matters: The thread lays out a concrete dry-sauna protocol (176–212°F, very low humidity, 20 minutes, 4–7 times per week) alongside measured changes in 2,4-D levels, microplastics, fertility markers, and vascular age, giving readers an unusually detailed look at one person’s response to consistent sauna use.
Notable data points from Johnson’s experiment
- Reports a 65% drop in 2,4-D levels.
- States that sauna eliminated 85% of microplastics from his ejaculate, with counts decreasing from 165 to 20 particles/mL between Nov 2024 and July 2025, and a nearly identical drop in his blood from 70 to 10 particles/mL over a similar period.
- Notes that sauna without cooling his testicles "devastated" his fertility markers, leading him to recommend icing the testicles during sessions using non-toxic reusable ice packs placed between cotton boxers and shorts.
- Claims his vascular function improved by the equivalent of ten years, stating he now has "the vascular age of an elite 18–early 20s," and lists detailed metrics such as central systolic blood pressure of 96 mmHg and traditional blood pressure of 107/75 mmHg.
- Specifies a dry sauna protocol: 176–212°F (he uses 200°F), 5–20% relative humidity, 20‑minute sessions, 4–7 times per week, and cautions beginners to start at the lower end of the temperature range to avoid side effects like headaches and severe dryness.
- Emphasizes hydration and electrolytes, suggesting 16–32 oz (0.5–1 L) of fluid after a session and sharing that, in his own testing, a 20‑minute, 200°F sauna led to 18 oz of sweat containing 25–39 mg/oz of sodium—about 450–700 mg of sodium lost per session.
For readers who value quantified self-experiments and clear protocols, this thread provides both granular parameters and concrete before/after numbers rather than generic lifestyle advice.
Chinese innovation and the car business: Lei Jun’s creation‑process video
Bill Gurley highlights a video where Lei Jun walks through their creation process, recommending viewers start at 0:30 and calling it a "must watch" for anyone involved in the U.S. car business or doubtful about Chinese innovation. He links directly to the YouTube Live recording.
"In this video (start 0:30) Lei Jun walks through their creation process. Must watch. Especially if you (1) have anything to do with the US car business, or (2) are doubtful about Chinese innovation. Clear eyes."
Resource details
- Title/description: Lei Jun creation‑process walkthrough (start at 0:30).
- Content type: Video (YouTube Live).
- Author/creator: Lei Jun.
- Link: https://www.youtube.com/live/l5f3wvLwLXY?si=C1EdG8yOWBLf4aQm
- Recommended by: Bill Gurley.
- Recommender takeaway: Gurley frames it as essential viewing—"must watch"—specifically for people in the U.S. car business and for those who are skeptical of Chinese innovation, urging them to look at it with "clear eyes."
- Why it matters: By explicitly calling out both the U.S. car industry and doubts about Chinese innovation, Gurley is signaling that Lei Jun’s walkthrough offers first‑hand context on how product creation is happening in that environment, making it a valuable reference point if you’re benchmarking against or learning from Chinese manufacturers.
Historical capitalism and self‑education: Andrew Carnegie (Part 1)
Brian Armstrong recommends an episode by Ben Wilson on Andrew Carnegie, replying to Wilson’s post with a simple endorsement: "Great episode" and linking back to it. The episode outline spans Carnegie’s early life and inspirations, coming to America, the power of self‑education, early investments such as Adams Express, his rapid rise in the railroad industry, his role in the Civil War, and his business philosophy and networking, concluding with final reflections and takeaways.
Resource details
- Title: Andrew Carnegie (Part 1): The rise of Andrew Carnegie from a poor Scottish weaver’s boy to becoming an American millionaire.
- Content type: Podcast / video episode.
- Author/creator: Ben Wilson (@BenWilsonTweets).
- Link: https://x.com/BenWilsonTweets/status/1997090187244151190
- Recommended by: Brian Armstrong (@brian_armstrong).
- Recommender takeaway: Armstrong’s summary is concise—"Great episode"—but the signal is clear: he found the story and analysis compelling enough to highlight to his audience.
- Why it matters: The structured outline emphasizes themes like the power of self‑education and Carnegie’s business philosophy and networking, alongside concrete milestones in his rise—from early investments to his railroad career and Civil War role—making the episode a dense case study in how Carnegie built wealth and influence.
Founder culture and team praise: All‑In Podcast clip on xAIMemphis
Brent (@BrentM_SpaceX) shares a clip from the All‑In Podcast, writing "Podcast is fire @theallinpod" and adding "No lies told here. The @xAIMemphis team is killing it!" Elon Musk replies to this post with a one‑word affirmation—"True"—and links back to Brent’s tweet.
Resource details
- Title/description: All‑In Podcast clip praising the xAIMemphis team.
- Content type: Podcast (shared via X video clip).
- Author/creator: All‑In Podcast (@theallinpod).
- Link: https://x.com/brentm_spacex/status/1997159851194307053
- Recommended by: Brent (@BrentM_SpaceX), with additional endorsement from Elon Musk (@elonmusk).
- Recommender takeaway: Brent describes the podcast as "fire" and says the xAIMemphis team is "killing it," while Musk’s "True" publicly aligns him with that assessment.
- Why it matters: The clip captures casual, strongly positive commentary about both the All‑In Podcast and the xAIMemphis team from Brent and Elon Musk, pointing readers toward a conversation they considered worth amplifying.
Kimi.ai
Kling AI
Ashish Vaswani
Top Stories
1. Nex‑N1 shows environment scaling can rival model scaling for agents
Why it matters: Most agent training still relies on static demonstrations; Nex‑N1 makes the environment itself the main scaling axis and reports large gains across coding and tool‑use benchmarks.
The Nex‑N1 framework argues that powerful agentic AI requires richer interactive environments, not just better reasoning models. Instead of collecting more static trajectories, it builds infrastructure that automatically generates diverse agent architectures and workflows from natural‑language specs and lets agents learn through interaction.
Nex‑N1 has three main components:
- NexAU (Agent Universe): a universal framework that can spin up complex agent hierarchies from simple configurations.
- NexA4A (Agent for Agent): synthesizes diverse agent architectures directly from natural language.
- NexGAP: integrates real‑world tools to bridge the simulation–reality gap for grounded trajectories.
On benchmarks, building Nex‑N1 on top of existing frontier models yields large gains:
- τ2‑bench: DeepSeek‑V3.1 jumps from 42.8 → 80.2 when wrapped in Nex‑N1.
- SWE‑bench Verified: Qwen3‑32B improves from 12.9% → 50.5%.
- BFCL v4 tool use: Nex‑N1 reaches 65.3, beating GPT‑5’s 61.6.
- In human evaluation over 43 real coding scenarios, Nex‑N1 wins or ties Claude Sonnet 4.5 in 64.5% of cases and GPT‑5 in ~70%.
A deep‑research agent built on Nex‑N1 scores 47.0% on the Deep Research Benchmark and can generate visualized reports such as slides and research posters. The full paper, “Nex‑N1,” is available on arXiv.
2. Claude Opus 4.5 + Agent SDK framed as the first real “year of agents”
Why it matters: Practitioners report that Opus 4.5 paired with the Claude Agent SDK reliably completes long‑horizon tasks, and some see this combo as economically transformative for agent use‑cases.
One agent builder describes Claude Opus 4.5 as “the best model for both code and for agents, and it’s not close,” arguing that 2025’s “year of agents” is being made real by pairing Opus 4.5 with Anthropic’s Claude Agent SDK. They compare Opus 4.5 to Waymo: you specify “take me from A to B” and it reliably executes multi‑step workflows, after which “you’ll never work the same way again.”
The same commentary emphasizes that an agent’s harness matters almost as much as its model: with the Claude Agent SDK, teams get a production‑grade control loop out of the box instead of building one from scratch, which is framed as unlocking a new “unhobbling”—an order of magnitude of latent economic value trapped in current workflows.
On the business side, the author claims Anthropic’s revenue has grown 10× per year for three years—from $1M → $100M in 2023, $100M → $1B in 2024, and $1B → $10B in 2025 (projected)—and speculates that the company could surpass OpenAI in valuation by early 2027, while noting Anthropic’s strong enterprise focus. These numbers and expectations are presented as the author’s interpretation rather than official guidance.
At the same time, new evidence highlights limits of standard benchmarks. Independent testing finds that Opus 4.5 appears to have memorized the Hendrycks MATH dataset (including its test set) and all prior AIME exams, achieving 80–90% accuracy by directly outputting the answer without any visible reasoning. This makes it “annoying” to experiment on math datasets and has forced some researchers to source new problems from more obscure competitions.
Together, these reports suggest Opus 4.5 is extremely capable for real agents, while also illustrating how saturated many public benchmarks have become at the frontier.
3. Human–AI collaboration is a social skill, not a prompt hack
Why it matters: A new study finds that being good at working with AI is a distinct ability driven by social cognition (Theory of Mind), with little correlation to solo problem‑solving skill.
In “Quantifying Human‑AI Synergy,” researchers analyzed how 600+ people solved problems alone versus with an AI partner. Using statistical modeling, they separated each participant’s solo problem‑solving ability from their AI‑collaboration ability, and found the two are barely correlated.
The key predictor of success with AI was Theory of Mind (ToM)—the capacity to model another agent’s beliefs, goals, and perspective. People with higher ToM were better at:
- Anticipating where the AI might be confused
- Supplying missing context
- Clarifying goals (e.g., “explain this like I’m 15”)
Crucially, a person’s ToM score strongly predicted performance when working with AI but had zero correlation with their solo performance, and even moment‑to‑moment increases in perspective‑taking on a single prompt improved that turn’s outputs. The authors argue that the core skill for using LLMs effectively is cognitive empathy for a non‑human mind, not technical prompt tricks. One commentator summarizes this as socially fluent people using LLMs better because they can transfer their ideas and spot where the model struggles.
The paper also argues that today’s benchmarks are misaligned with real value: chasing the highest score on static tests like MMLU is like judging a point guard only on free‑throw percentage. What matters is collaborative uplift—how much smarter the human+AI team becomes.
These findings align with large production studies showing that organizations deploy simple, tightly scoped agents and lean heavily on humans for evaluation: 68% of agents execute at most 10 steps before human intervention, 80% use structured control flow, 70% rely on prompting off‑the‑shelf models without fine‑tuning, and 74% depend primarily on human evaluation. A separate synthesis of 2025 enterprise AI reports finds that most internal agents never make it past pilots, employees often resist AI pilots, and the systems that succeed prioritize reliability over raw capability.
4. Inference demand is outrunning supply as new models gain share
Why it matters: Multiple providers report being capacity‑constrained, while usage data shows new winners emerging beyond the usual US labs.
One observer reports that every inference provider they spoke with—including those on custom chips—cannot keep up with demand and is “rushing to build out globally,” adding that this problem is likely to worsen.
On OpenRouter, Grok 4.1 Fast just broke platform records by processing 1.16 trillion tokens in a week, taking the #1 spot ahead of Grok Code Fast 1, Claude Sonnet 4.5, Gemini 3 Pro, and DeepSeek V3. The Grok team notes their model is now “standing alongside the best peers.”
Another chart highlights the surprisingly large share of traffic going to MiniMax M2, a model some practitioners say they rarely hear discussed despite its usage footprint. MiniMax comments that “one month in LLM feels like years elsewhere,” underscoring how quickly the landscape is evolving.
The combination of supply constraints and shifting usage suggests the inference market is expanding fast, with room for aggressive challengers that can offer both capability and throughput.
Research & Innovation
Why it matters: Under the surface of new products, labs are re‑architecting attention and memory, tackling hard training regimes like MoEs, and rethinking how we train and evaluate models.
Hybrid and long‑context architectures: sparse attention, neural memory, and linear attention
DeepSeek Sparse Attention (DSA). DeepSeek’s new V3.2 model introduces DeepSeek Sparse Attention, reducing attention complexity from O(L²) to O(L·k) with fixed k. A lightweight Lightning Indexer with a few FP8 heads scores which tokens matter for each query and selects the top‑2048 key/value entries per query, regardless of total context length. This confines expensive attention to a small subset while leaving overall context length unchanged. At 128K tokens, reported costs drop from about $0.65 → $0.35 per million tokens for prefilling and $2.4 → $0.8 for decoding, while maintaining or improving long‑context benchmark performance.
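The mechanism is easy to picture in toy form: a cheap scorer ranks keys for each query, and exact attention runs only over the selected top‑k. The PyTorch sketch below illustrates indexer‑guided top‑k attention in the spirit of DSA, not DeepSeek’s implementation; it omits causal masking, FP8, and indexer training, and the sizes are placeholders.

```python
# Toy sketch of indexer-guided top-k sparse attention (DSA-style):
# a lightweight scorer picks k keys per query, and exact attention is
# computed only over that subset, giving O(L*k) instead of O(L^2).
import torch
import torch.nn.functional as F

def sparse_attention(q, k, v, idx_q, idx_k, top_k=4):
    # q, k, v: (L, d) for one head; idx_q, idx_k: (L, d_idx) cheap
    # low-dimensional projections used only for scoring (the "indexer").
    scores = idx_q @ idx_k.T                      # (L, L) cheap relevance scores
    sel = scores.topk(top_k, dim=-1).indices      # (L, top_k) chosen keys per query
    k_sel, v_sel = k[sel], v[sel]                 # (L, top_k, d) gathered subsets
    att = torch.einsum("ld,lkd->lk", q, k_sel) / q.shape[-1] ** 0.5
    w = F.softmax(att, dim=-1)                    # attend only within the subset
    return torch.einsum("lk,lkd->ld", w, v_sel)

L, d, d_idx = 16, 64, 8
q, k, v = (torch.randn(L, d) for _ in range(3))
out = sparse_attention(q, k, v, torch.randn(L, d_idx), torch.randn(L, d_idx))
print(out.shape)  # torch.Size([16, 64])
```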
Google’s Titans & MIRAS. Google Research’s Titans architecture replaces the fixed‑state memory of linear RNNs with a deep MLP memory module that is updated at test time based on a “surprise metric” derived from input gradients, while MIRAS generalizes this by treating memory as an optimization problem with customizable loss and regularization. Insights from this work include (see the sketch after this list):
- Attention excels at short‑term memory but becomes inefficient for long‑term storage due to quadratic cost.
- Deep MLP memory structures outperform the compression used in linear RNNs like Mamba.
- Memory updates are most effective when driven by high‑gradient “surprise” events.
- Titans reportedly outperform GPT‑4 on “Needle in a Haystack” tasks with 2M+ token contexts, despite having fewer parameters.
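A rough sketch of the surprise‑driven update referenced above: treat memory as a small MLP that maps keys to values and update it online with one gradient step on its prediction error. This is a simplification for intuition only (Titans also uses momentum and forgetting terms); the class name, loss, and hyperparameters are illustrative assumptions.

```python
# Sketch of a surprise-driven neural memory (Titans-style intuition):
# the memory is a small MLP updated at test time by one gradient step
# on its prediction error; a large error means a "surprising" input.
import torch
import torch.nn as nn

class NeuralMemory(nn.Module):
    def __init__(self, d=32, hidden=64, lr=0.1):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(d, hidden), nn.SiLU(), nn.Linear(hidden, d))
        self.lr = lr

    def write(self, key, value):
        surprise = ((self.mlp(key) - value) ** 2).mean()  # gradient-based surprise
        grads = torch.autograd.grad(surprise, list(self.mlp.parameters()))
        with torch.no_grad():
            for p, g in zip(self.mlp.parameters(), grads):
                p -= self.lr * g                          # one online update
        return surprise.item()

    def read(self, key):
        with torch.no_grad():
            return self.mlp(key)

mem = NeuralMemory()
key, value = torch.randn(32), torch.randn(32)
print(mem.write(key, value))  # high surprise on first exposure
print(mem.write(key, value))  # lower after the memory update
```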
Kimi Linear. Moonshot AI’s Kimi Linear proposes a hybrid linear attention architecture that they claim outperforms full attention while acting as a drop‑in replacement. It introduces Kimi Delta Attention, a hardware‑efficient linear mechanism refining the gated delta rule, and reports up to 75% reduction in KV‑cache usage and 6× decoding throughput at 1M context length. The team has open‑sourced kernels (KDA), vLLM integration, and checkpoints, with a 48B Instruct model released on Hugging Face.
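For intuition about what a “gated delta rule” refines, here is the plain (ungated) delta rule as a naive recurrence: a fixed‑size matrix state is corrected toward each new key/value pair, which is why linear‑attention models need no growing KV cache. This is a generic textbook sketch, not Kimi’s KDA or its hardware‑efficient chunked kernels.

```python
# Naive delta-rule linear attention: S_t = S_{t-1}(I - b_t k_t k_t^T)
# + b_t v_t k_t^T, with output o_t = S_t q_t. The state S is fixed-size
# regardless of context length, so no KV cache accumulates.
import torch

def delta_rule(q, k, v, beta):
    # q, k, v: (L, d); beta: (L,) per-step write strengths in [0, 1]
    L, d = q.shape
    S = torch.zeros(d, d)                  # associative memory state
    outs = []
    for t in range(L):
        kt, vt, bt = k[t], v[t], beta[t]
        # Erase the state's current association for kt, then write vt.
        S = S - bt * torch.outer(S @ kt, kt) + bt * torch.outer(vt, kt)
        outs.append(S @ q[t])
    return torch.stack(outs)

L, d = 8, 16
out = delta_rule(torch.randn(L, d), torch.randn(L, d), torch.randn(L, d), torch.rand(L))
print(out.shape)  # torch.Size([8, 16])
```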
AMD’s Zebra‑Llama hybrids. AMD’s “Zebra‑Llama: Towards Extremely Efficient Hybrid Models” introduces 1B, 3B, and 8B hybrid LLMs combining MLA and Mamba, composed from pre‑trained Transformers without full retraining. An external expert notes that this Mamba‑2+MLA hybrid is post‑trained from Llama 3, calling it technically impressive compared to from‑scratch hybrids. Code, models, a blog, and the NeurIPS paper are all public.
Taken together, these efforts point toward a future where dense full attention is increasingly complemented—or replaced—by sparse, linear, and neural‑memory mechanisms optimized for very long contexts.
Modern MoE training on limited hardware
A detailed practitioner write‑up examines why training frontier‑quality small MoEs (<20B parameters) is difficult and lays out practical solutions. Key challenges are:
- FLOPs and efficiency: Ultra‑sparse MoEs require large HBM to host many experts, leading to many idle GPUs and low MFUs; classic sharding schemes from dense training under‑utilize these “stranded FLOPs.”
- Load balancing / router stability: Mixed‑precision experts (fp8, nvfp4) make gradients so small that routers fail to learn, collapsing to a few experts.
- Data quality and quantity: Off‑the‑shelf mixtures like OLMo‑3’s were found “pretty dirty” compared to curated baselines.
Proposed fixes include:
- A novel expert‑parallel sharding topology to keep all GPUs busy.
- Mixed‑precision experts with fp8/nvfp4 despite extra HBM overhead.
- Router‑stability tricks inspired by DeepSeek‑V3 but adapted for small batches: muP embedding scaling (10.66) and logit scaling (0.125), removal of gradient clipping, and a “bungee” virtual scalar (2.0) before output norm, which reduces the FP8–BF16 loss gap from ~0.8 to <0.1 at 3k steps.
- A frontier‑style data pipeline: heuristic pre‑filters; SeqIO‑style dynamic mixtures; and model‑based grading distilling large oracle models (e.g., gpt‑oss 120B) into a 20B classifier with cheap probe and more expensive judge heads plus early‑exit, saving ~15% compute while filtering aggressively (keeping ~30% of web data and 50% of code/math/science from OLMo‑3).
On this setup, the author reports that a 7B‑2‑active‑expert proxy runs on a single B200 GPU and a 16B‑4‑active‑expert model runs on a single 8×B200 node, both reaching 30–40k tokens/sec/GPU with predictable 1→8 GPU scaling. They plan to release training repos, data‑grading weights, visualization tools, and inference engines to the community. Other practitioners call it a “very high alpha write‑up on modern MoE training” and note that large clusters can actually make some stability problems easier to solve once parallelism is configured correctly.
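To make the router‑collapse failure mode concrete, here is a generic top‑k router with the standard Switch‑Transformer‑style load‑balancing auxiliary loss: if that term is too weak, or its gradients vanish in low precision, traffic concentrates on a few experts. This is textbook routing for intuition, not the write‑up’s specific recipe.

```python
# Generic top-k MoE router with a Switch-style load-balancing loss.
# Without the auxiliary term (or with vanishing low-precision gradients),
# the softmax tends to concentrate traffic on a handful of experts.
import torch
import torch.nn.functional as F

def route(x, w_router, top_k=2, aux_weight=0.01):
    # x: (tokens, d); w_router: (d, n_experts)
    n_experts = w_router.shape[1]
    probs = F.softmax(x @ w_router, dim=-1)      # (tokens, E) router probabilities
    top_p, top_i = probs.topk(top_k, dim=-1)     # chosen experts per token
    # f_e: fraction of tokens whose top-1 choice is expert e;
    # P_e: mean router probability for expert e (Switch Transformer loss).
    f = F.one_hot(top_i[:, 0], n_experts).float().mean(0)
    P = probs.mean(0)
    aux_loss = aux_weight * n_experts * (f * P).sum()
    return top_i, top_p, aux_loss

x, w = torch.randn(128, 32), torch.randn(32, 8)
idx, gate, aux = route(x, w)
print(idx.shape, gate.shape, aux.item())  # torch.Size([128, 2]) torch.Size([128, 2]) ...
```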
Context engineering for multi‑agent systems (Google ADK)
Google’s Agent Development Kit (ADK) argues that context engineering, not context window size, is the bottleneck for multi‑agent systems. The default “stuff everything into the prompt” approach leads to cost blowups, lost‑in‑the‑middle failures, and hallucinations as agents misattribute actions across a system.
ADK treats context as a compiled view over a stateful system, with four layers:
- Working Context: ephemeral, per‑invocation.
- Session: a durable event log of messages, tool calls, and control signals.
- Memory: searchable, long‑lived knowledge that outlives sessions.
- Artifacts: large binary payloads (e.g., 5MB CSVs) referenced by handles rather than embedded inline.
A context‑compilation pipeline (LLM Flows) selects relevant events, transforms them into structured Content objects, and injects them into the LLM request. ADK also implements prefix caching by splitting prompts into stable prefixes (instructions, identity, summaries) and variable suffixes (recent turns, tool outputs); a static_instruction primitive keeps system prompts immutable for cache validity.
For large payloads, ADK uses a handle pattern: big artifacts live in storage; agents see only lightweight references and call LoadArtifactsTool when raw data is needed. A MemoryService supports both reactive recall (agents explicitly search when they notice gaps) and proactive recall (pre‑processors run similarity search and inject snippets automatically).
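The handle pattern itself is framework‑agnostic and fits in a few lines: payloads live in a store, the prompt sees only a small reference, and bytes are fetched only when a tool asks. The names below (ArtifactStore, put, load) are hypothetical illustrations, not ADK’s API.

```python
# Sketch of the artifact "handle" pattern: large payloads stay in a
# store; only a small reference dict is ever injected into the prompt.
# ArtifactStore and its methods are hypothetical names, not ADK API.
import uuid

class ArtifactStore:
    def __init__(self):
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes, mime: str) -> dict:
        handle = str(uuid.uuid4())
        self._blobs[handle] = data
        # This lightweight dict is what the agent's context sees.
        return {"artifact_id": handle, "mime": mime, "bytes": len(data)}

    def load(self, handle: str) -> bytes:
        # Called by a tool only when the agent truly needs the raw data.
        return self._blobs[handle]

store = ArtifactStore()
ref = store.put(b"col_a,col_b\n1,2\n" * 100_000, "text/csv")
print(ref)  # e.g. {'artifact_id': '9f1c...', 'mime': 'text/csv', 'bytes': 1600000}
```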
In multi‑agent setups, ADK defines Agents‑as‑tools (specialized agents treated as callables with short prompts) and Agent Transfer (sub‑agents inheriting session views with controlled include_contents flags), plus conversation translation so that prior assistant messages become narrative context with attribution tags, preventing hallucinations about which agent did what. The author notes these patterns are broadly applicable beyond ADK.
New training and evaluation paradigms
Feedback Descent. A Stanford blog introduces Feedback Descent, a learning paradigm that uses rich textual feedback instead of scalar rewards to iteratively improve solutions in domains like molecular design and prompt optimization. Rather than a single numeric reward, models receive detailed language feedback and update proposals accordingly. The authors argue this can capture more nuanced preferences while remaining compatible with gradient‑based optimization; details are in the accompanying blog post.
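The paradigm’s loop is simple to state in code: propose, collect a textual critique, and condition the next revision on it. The sketch below is a paraphrase of that idea under stated assumptions, not the paper’s algorithm; llm is a stand‑in for any text‑in/text‑out model call, and the prompts are placeholders.

```python
# Abstract sketch of feedback-descent-style iteration: a textual
# critique, rather than a scalar reward, steers each revision step.
# `llm` is a stand-in for any chat-completion call.
def feedback_descent(task: str, llm, steps: int = 5) -> str:
    solution = llm(f"Propose a solution to: {task}")
    for _ in range(steps):
        critique = llm(
            f"Task: {task}\nCandidate: {solution}\n"
            "Give specific, actionable criticism in plain language."
        )
        solution = llm(
            f"Task: {task}\nCandidate: {solution}\nCritique: {critique}\n"
            "Revise the candidate to address every point in the critique."
        )
    return solution
```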
Rosetta Stone for AI benchmarks. Epoch AI proposes a “Rosetta Stone” framework that statistically stitches together around 40 different AI benchmarks into a unified space for tracking long‑run capability trends and forecasting algorithmic progress. The goal is to compare and aggregate heterogeneous tasks more rigorously rather than chasing isolated leaderboard numbers. A summary is available on the Epoch blog.
OpenThoughts‑Agent for TerminalBench. The OpenThoughts team released OpenThoughts‑Agent v1, described as the first TerminalBench agent trained on fully open, curated SFT and RL environments. The resulting OpenThinker‑Agent‑v1 is reported as the strongest model of its size on TerminalBench and sets a new bar on the associated OpenThoughts‑TB‑Dev benchmark, with the RL environments also being released openly.
Products & Launches
Why it matters: New open models and tools are making advanced coding, research, and financial analysis workflows accessible to a wider set of practitioners.
Rnj‑1: open 8B coding/STEM model rolls out across platforms
Essential AI’s Rnj‑1 family—8B base and instruct models—continues to roll out across the ecosystem.
The team positions Rnj‑1 as advancing American state‑of‑the‑art open‑source AI, with SWE‑bench performance close to GPT‑4o, tool use outperforming comparable open‑source models, and mathematical reasoning on AIME’25 nearly on par with a GPT‑OSS MoE 20B model. Another release thread describes Rnj‑1 as “the best USA open‑source LLM in the 8B category,” fully pretrained, mid‑trained, and post‑trained from scratch on large AMD and TPU clusters.
Technically, Rnj‑1 was pretrained on 8.4T tokens with 8k context, followed by 380B tokens of mid‑training and a 150B‑token SFT stage for the instruct variant, using the Muon optimizer. A separate post notes that the total dataset exposure is about 8.7T tokens and that performance has not yet saturated. The architecture is reported as a Gemma 3‑style model without SWA, with final logit soft‑capping and training using YaRN with a higher‑than‑usual beta_fast (64 vs ~32).
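For readers unfamiliar with logit soft‑capping, it bounds final logits smoothly with a scaled tanh. The sketch below shows the usual formulation; the cap value of 30.0 is Gemma 2’s final‑logit setting used as a placeholder, not Rnj‑1’s reported constant.

```python
# Logit soft-capping as used in Gemma-style models: squash logits
# smoothly into (-cap, +cap) so no single token's logit can explode.
import torch

def soft_cap(logits: torch.Tensor, cap: float = 30.0) -> torch.Tensor:
    return cap * torch.tanh(logits / cap)

print(soft_cap(torch.tensor([1.0, 50.0, 500.0])))  # approx [1.0, 27.9, 30.0]
```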
For accessibility:
- Essential is releasing both base and post‑trained checkpoints on Hugging Face, and suggests running them via llama.cpp or transformers on laptops, vLLM or SGLang in custom infra, and IDE/agent integrations in VS Code or Cursor via the Cline extension or Claude Code Router.
- Rnj‑1‑Instruct is live on Yupp.ai, where quick tests show it tracing a database configuration error, generating a gradient SVG circle, solving a classic “heavier coin” puzzle, and writing a REST API for a to‑do list app.
- Together AI exposes rnj‑1‑instruct via its APIs and playground as an open‑source coding model.
Ashish Vaswani adds that post‑training was done using only SFT (no DPO or RL), making Olmo‑SFT the appropriate comparison point and offering a reminder to start post‑training experiments with SFT before reaching for RL.
Open‑source agents and tool‑calling from the LangChain community
The LangChain community released several open‑source agents aimed at production use:
Programmatic Tool Calling Agent. Built on LangChain DeepAgent, this open agent executes code in sandboxes and reports 85–98% token reduction on data‑heavy tasks by converting MCP tools to Python and running them programmatically. It is LangGraph/LangSmith‑ready and available on GitHub. The design is inspired by Anthropic’s work on code execution with MCP.
AI‑Powered Bank Statement Analysis. Another community project automates PDF bank‑statement analysis via natural language, built on LangChain RAG with advanced retrieval, custom YOLO detection for structure, and local LLMs for privacy‑sensitive insights. The full repository is public.
Event Deep Research. This tool researches historical figures and generates JSON timelines, using LangGraph’s supervisor pattern for multi‑agent coordination and LangSmith Studio for visualization, with support for multiple LLM providers. Code is available on GitHub.
LangChain is also promoting a webinar on “Deep Agents”, covering how to observe and measure long‑running multi‑task agents using LangSmith.
Multimodal generation: Kling VIDEO 2.6 native audio
Kling AI’s VIDEO 2.6 model can now generate both visuals and perfectly aligned audio—including natural voices, sound effects, and ambient sounds—in a single pass. All visuals and audio in the launch video were produced by the model. Demos showcase:
- An AI actress delivering five different emotions from a single line of dialogue.
- Pets “speaking” in their own voices.
- Household appliances given personalities via Native Audio.
Kling 2.6 was released as part of an “Omni Launch Week” alongside the O1 model.
Industry Moves
Why it matters: Corporate strategies, research “schools,” and robotics deployments signal how AI is diffusing into physical systems and where labs are placing their bets.
“Ideological labs” and pretraining‑first philosophies
Commentary around Rnj‑1 situates Essential AI, DeepSeek, and Prime Intellect as “ideological labs”: groups pursuing distinct, principled research programs for open intelligence rather than simply following prevailing trends. Essential emphasizes first‑principles pretraining and SFT‑only post‑training, explicitly positioning itself against a field that is “swaying towards RL” before mastering pretraining. Prime Intellect focuses on environment scaling and RL environments, while DeepSeek pushes architectural innovations like ultra‑sparse MoEs and MLA/DSA.
A follow‑up thread suggests labs such as Pleias (small‑model maximalism with a different RL approach) and Zyphra (mathematical rigor and algorithmic optimizations) also fit this pattern of distinct research schools.
An upcoming Open Frontier event aims to bring together around 100 leading open‑research scientists for a one‑day, livestreamed gathering dedicated to the “open frontier of AI.” Organizers frame it as giving open research a home and emphasize that the frontier only moves forward “if we work together.”
Prime Intellect is also running an Environments Bounty Program for RL environments, open worldwide, with eval‑based compensation of $1,000–$5,000+ and public attribution via an Environments Hub.
Robotics and “physical AI” in factories
Hyundai and Boston Dynamics. Hyundai plans to deploy tens of thousands of robots, including Atlas humanoids, Spot quadrupeds, and Stretch trailer‑unloading robots, across its manufacturing and logistics operations. Spot is already in use for industrial inspection and predictive maintenance, and Atlas is next for physically demanding, repetitive factory work. Hyundai will integrate its manufacturing capabilities to boost robot production as part of a broader push into robotics and “physical AI.” Some observers are excited about the scale‑up of humanoids, while others question whether this deployment is unambiguously positive for Boston Dynamics.
Midea’s six‑armed humanoid. Appliance manufacturer Midea has introduced a six‑armed “super‑humanoid” robot designed for complex, multi‑step tasks, effectively acting as a self‑contained workstation. It is an upgraded version of the earlier Miro wheeled humanoid and has already been deployed at Midea’s Jingzhou factory alongside AMRs, single‑arm four‑wheel robots, KUKA arms, and human workers. The six‑arm variant is expected to further boost intelligent automation and operational efficiency once fully integrated.
Safety and alignment hiring
At EPFL, a lab announced a postdoctoral position focused on safe AI, alignment, LLMs, NLP, and mechanistic interpretability, with the stated goal of building “safe AI that truly cares about humans” by raising models aligned “from token 1 onward,” rather than relying on post‑hoc alignment patches. The position is highlighted as an attractive opportunity for mechanistic interpretability researchers.
Policy & Regulation
Why it matters: As AI systems permeate research and industry, norms from ethics and economics are shaping how the field talks about its responsibilities.
Pope’s NeurIPS keynote on human dignity in the age of AI
In a keynote address linked to NeurIPS, the Pope emphasized that human beings are “co‑workers in the work of creation, not merely passive consumers of content generated by artificial technology.” He grounded human dignity in our ability to reflect, choose freely, love unconditionally, and enter authentic relationships, and argued that recognizing and safeguarding these characteristics is essential for designing frameworks to manage AI’s consequences. The full speech is available on the Vatican website.
Concerns about AI discourse and young researchers
AI pioneer Michael I. Jordan criticizes the prevailing 2025 AI discourse—often framed as “superintelligence vs. extinction”—as demoralizing for young researchers and lacking economic thinking. He warns that such extreme narratives can “snuff out” promising careers. Another commentator endorses these remarks as “wise words.”
Quick Takes
Why it matters: Smaller updates highlight emerging practices, tools, and conceptual shifts.
Pretraining’s “death” overstated. At a NeurIPS social, one attendee quipped that “rumors of pretraining’s death have been greatly exaggerated,” pushing back against a perceived over‑rotation to RL‑only narratives.
Kimi K2 report praised for depth. A practitioner calls the Kimi K2 technical report a “work of art” for its dense information and strong references, while wishing for more detail on its large‑scale agentic data synthesis pipeline for tool‑use learning and noting that the infrastructure section is hard to parse.
Implementation details dominate the path to production. Reflecting on Yejin Choi’s NeurIPS keynote, one researcher highlights her point that “minor” implementation details matter a lot, especially at NVIDIA‑scale, and that theory‑oriented people often under‑value this. Another adds that moving from research to production takes about 90% of the total effort.
Prompt caching as the top cost optimization. A blog post argues that prompt caching is the highest‑leverage optimization for LLM workflows and agents, offering detailed tips on hitting cache more consistently and explaining how caching works under the hood.
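The practical rule is to keep a long, byte‑identical prefix across calls and push anything that changes to the end. Below is a hedged illustration using Anthropic’s explicit cache_control marker (other providers cache shared prefixes implicitly); the model id and prompt contents are placeholders.

```python
# Cache-friendly prompt layout with Anthropic's cache_control marker:
# the long system block is byte-identical across calls, so the provider
# can reuse its cached prefix and only recompute the short variable suffix.
import anthropic

LONG_STABLE_INSTRUCTIONS = "You are a support agent.\n" + ("Policy clause...\n" * 300)

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
resp = client.messages.create(
    model="claude-sonnet-4-5",  # placeholder model id
    max_tokens=512,
    system=[{
        "type": "text",
        "text": LONG_STABLE_INSTRUCTIONS,        # identical on every call
        "cache_control": {"type": "ephemeral"},  # cache everything up to here
    }],
    messages=[{"role": "user", "content": "How do I reset my password?"}],
)
print(resp.content[0].text)
```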
New GPU kernels and Helion. One engineer reports “playing around with Helion” and finding it “pretty neat,” noting decent speedups despite already‑optimized PyTorch backends—a sign that carefully designed fused kernels can still yield wins. Separately, another practitioner shows that simply fusing operations on GPUs can deliver significant speedups even on consumer RTX 4060 hardware, though high‑end Hopper GPUs benefit more.
Brain–computer interfaces at NeurIPS. At the Brain and Body Foundation Model workshop, a participant tried the Kernel headset, sharing both an image and a “brain activity reveal” video.
ChatGPT ads and trust. After users complained about confusing ad‑like UI in ChatGPT, a commentator criticized the Head of ChatGPT’s response as dismissive and trust‑eroding, especially given strong competitors like Gemini. In an ironic twist, ChatGPT 5.1 Thinking itself labeled one such screenshot as an ad, saying it “walks, talks, and smells like an ad.”
World models and infinite VR games. One observer predicts that world models capable of generating infinite games will be “huge for VR,” eliminating the need to wait years for clones or ports of popular titles.
Lean‑startup optimism from AI coding tools. Another commentator expects a coming “golden age of lean startups” if AI coding tools perform as hoped, enabling much smaller teams to bootstrap products far faster than before.
Human meaning of “reasoning” remains unsettled. A researcher asks what “reasoning” should mean for LLMs—ranging from exact formal deduction, to quasi‑formal natural‑language chains, to any helpful natural‑language steps, to “any token barf that helps get the answer”—underscoring that even basic conceptual definitions are still contested.
Mechanistic interpretability workshop at NeurIPS. Neel Nanda announced a mechanistic interpretability workshop (room 30A‑E) featuring an opening talk, “15 Years of Interp Research in 15 Mins” by Been Kim, and a program organized around different visions of interpretability, with details at mechinterpworkshop.com.
For links to the underlying papers, tools, and events, see the span citations embedded throughout.
Successful Farming
农业致富经 Agriculture And Farming
Market Movers
Policy and financial signals – United States
U.S. Ag Secretary Brooke Rollins indicated that an economic assistance package for agriculture could arrive as soon as next week, which would directly affect near‑term farm cash flow once details are released. Deputy Ag Secretary Stephen Vaden has discussed the plan on Agri‑Pulse Newsmakers, offering hints about its potential size and scope. Producers should monitor final package design for impacts on commodity programs, disaster aid, and working‑capital conditions.
Processing and value‑chain investments – Canada (Ontario soybeans)
In Ontario, Canada, new investments are expanding domestic soybean processing:
- A soy powder plant is being developed to process more local beans within the province rather than shipping them out as raw commodities.
- The provincial government is investing $24 million in a non‑GMO soymilk processing plant, targeting value‑added, identity‑preserved markets.
These projects signal strengthening demand for non‑GMO soybeans and greater capture of processing margins within Ontario.
Land markets – U.S. ranch sale
In southeast Oklahoma, the 8,488‑acre Hilseweck Ranch changed hands for the first time in roughly 80 years, selling for just over $15 million, or about $1,811 per acre. While a single transaction, it underscores ongoing buyer interest in large, contiguous ranch properties in the U.S. Southern Plains.
Cost structure intelligence – crop machinery (U.S. row crops)
Successful Farming highlighted a new analysis of crop machinery costs that:
- Compares cost structures for small vs. large farms,
- Breaks costs out by net‑return categories, and
- Tracks corn and soybean machinery‑cost trends over time.
The full article provides benchmarks that U.S. row‑crop growers can use when evaluating machinery purchases, custom work vs. ownership, and replacement timing.
Innovation Spotlight
Circular tea–shiitake system in Leizhou, China
In Leizhou, China, shiitake growers have built a high‑margin, circular system around tea branches and mushroom sticks (菌棒):
- Facing shortages and high prices for wood shavings, producer Yu Haiyan began using pruned tea branches as the main substrate component in shiitake sticks, after earlier local attempts had failed.
- With expert guidance and more than three years of trials, she identified a workable tea‑branch ratio that supports strong shiitake growth while replacing most purchased woodchips.
- Because pruned tea branches are effectively free, each mu of greenhouse area saves about 20,000 yuan in substrate costs compared with traditional wood‑based sticks.
- Tea polyphenols in the branches help lower contamination rates, and the substrate’s water‑holding capacity allows one extra flush: instead of discarding sticks after three harvests, Yu returned healthy‑looking sticks to the racks and obtained a fourth shiitake flush.
Product and revenue optimisation are layered on top:
- Caps are allowed to fully open, producing large shiitake averaging 2–3 liang (≈100–150 g) each, which local markets reward.
- A single 0.3‑mu greenhouse can reach peak daily sales of about 18,000 yuan when mushrooms are flushing heavily.
- Stems are separated from caps; stems are sold for 16–17 yuan per jin as raw material for mushroom pastes and seasoning packets, adding value to what would otherwise be waste.
- Overall, the new system raises profit by roughly 30,000 yuan per mu of greenhouse compared with the previous method.
Spent sticks are not discarded. They are collected after harvest, stacked and composted into organic fertiliser, then returned to nearby tea gardens. Tea prunings feed shiitake production; spent sticks fertilise tea; and both mushrooms and fertiliser provide cash income, forming a “tea–mushroom” circular wealth chain.
Stereoscopic black‑skin chicken‑fir mushrooms in Shandong, China
In Liangshan, Shandong, growers of black‑skin chicken‑fir mushrooms (黑皮鸡枞菌) have adopted four‑layer stereoscopic cultivation to increase output:
- Traditional practice was ground planting in greenhouses, which local farmers judged to have low economic returns.
- After consulting experts, one grower installed multi‑tier racks (“like four‑storey buildings inside the house”), allowing mushrooms to fruit on several vertical levels instead of only on the ground.
- The resulting stereoscopic system produces roughly four times the yield of conventional ground‑planted houses.
- At an average price of 18 yuan per jin, a 0.3‑mu greenhouse using the new system can reach peak daily sales around 18,000 yuan, approximately four times previous revenue levels.
Upstream innovation supports this expansion:
- A dedicated spawn‑stick factory now produces more than 10,000 black‑skin chicken‑fir sticks per day, described by the owner as equivalent to a “small vault of gold,” reflecting their cash‑flow importance for both the factory and contract growers.
- To avoid genetic fatigue and declining performance, teams regularly collect wild black‑skin chicken‑fir mushrooms in nearby mountains, photographing and video‑recording finds, then using these diverse wild strains for breeding and hybridisation.
- Industry participants note that breeding new, higher‑yielding, better‑shaped, higher‑value strains requires long timeframes and substantial investment, and is now a major competitive focus in this mushroom segment.
High‑tech floriculture under AIF – Hooghly, West Bengal, India
In Hooghly District, West Bengal, a planned 1.5‑acre project (Bhagat Flower Farm Pvt. Ltd.) illustrates the economics and risk profile of climate‑controlled polyhouse floriculture for roses, gerbera and carnations under India’s Agriculture Infrastructure Fund (AIF):
- Production will occur in GI‑frame polyhouses equipped with pad‑and‑fan cooling and high‑pressure foggers, targeting an internal temperature band of 22–28°C even during hot periods.
- A 2–4°C cold room is included to store cut flowers for 3–7 days, avoiding distress sales and enabling deliveries into wedding‑season price spikes.
- Climate control and irrigation are backed by a 7 kW solar system with diesel‑generator backup.
Financial assumptions for a 1‑acre unit are explicit:
- Projected annual operating expenditure (OPEX) is ₹48.9 lakh, including labour and inputs.
- Within this, about ₹6.6 lakh is budgeted for fertilisers and crop‑protection chemicals, and ₹4.2 lakh for annual maintenance contracts (AMC) on equipment.
- Revenue projections assume Grade‑A roses (60–70 cm stems) sell at roughly ₹4.50 per stem on a two‑year average. For long‑term planning, the promoter is considering a more conservative ₹3.50 per stem after commissions and transport.
Key technical risks identified for May–July, when heat and humidity peak in eastern India, include:
- Higher‑than‑modelled electricity consumption,
- System downtime in the cooling and fogging systems,
- Drops in stem length and overall flower quality, and
- Elevated disease pressure from Botrytis and Downy Mildew.
These details provide a concrete benchmark for other South Asian growers considering high‑tech floriculture under subsidy schemes.
Startup‑driven innovation in Mato Grosso’s swine sector (Brazil)
In Mato Grosso, Brazil—the country’s leading grain‑producing state—suinocultura (pig farming) is expanding, benefiting from relatively low feed costs due to abundant local grain. To address structural challenges, AgriHub’s “Sementes da Inovação” (Seeds of Innovation) program, in partnership with state pork association ACRISMAT, has focused its third edition on swine:
- The team conducted listening sessions with more than 100 pig producers in key hubs Sorriso and Campo Verde, mapping over 100 distinct problems.
- Producers helped prioritise 10 high‑urgency, high‑severity challenges, including:
- Process management,
- Operational management,
- Animal traceability,
- Waste management, and
- Animal commercialisation and marketing.
- AgriHub then scouted regional, national and international ag‑tech startups for solutions, subjected them to a technical vetting process, and organised in‑person events in Sorriso and Campo Verde where startups presented directly to producers.
According to AgriHub, these connections aim to deliver better management, improved processes, higher sustainability and productivity for Mato Grosso’s swine sector. A comprehensive study and report summarising the mapped challenges, selected solutions and outcomes is scheduled for release between January and February next year.
Low‑cost biomass heating inspired by Roman designs (global small‑scale systems)
A “Roman‑style tank heater” concept shared in the permaculture community offers a low‑tech approach to heating water systems or small greenhouses:
- The design uses a 50/50 styrocrete mix to build an insulated tank cheaply (except for the bottom, which requires special treatment).
- Heat pipes made of copper or iron transfer heat from a small combustion chamber to the tank; sealing methods are adapted from historical Roman techniques to prevent metal contamination of fish if used in aquaculture.
- The system is intended to heat a tank or even an entire greenhouse with roughly one bundle of wood per day.
- Fuel would come from coppiced hazelnut trees, cut on a five‑year rotation, which also provide several years of nut production between coppice cycles.
- Commenters note that the same heating concept could be applied inside a greenhouse to raise temperatures and expand food production in cold periods.
While still conceptual, this approach illustrates how on‑farm biomass and simple materials could support low‑cost energy solutions where conventional heating is expensive or unreliable.
Regional Developments
China: Harvest logistics, forage quality and mechanisation
On Xiertala Farm in Hulunbuir, Inner Mongolia, managers faced the task of harvesting 320,000 mu of grain before autumn rains caused wheat to lodge and sprout, threatening both yield and quality. Local machinery capacity was insufficient, so they contracted a mechanised team from Zhangjiakou Nongken, more than 1,000 km away, which has built a fleet of nearly 1,000 advanced farm machines valued at about 3.4 billion yuan for “socialised service” across regions.
In the Xiertala operation and related projects:
- Zhangjiakou’s team helped harvest about 60,000 mu of pasture grass and 25,000 mu of wheat and rapeseed at Xiertala, contributing to the successful completion of harvest across all 320,000 mu within roughly a month despite repeated rain events.
- A mechanical fault on one combine—an issue with a small spring connecting the sieve—caused 7–8 jin of wheat loss per mu, equating to more than 40 jin per machine per day at current workloads, and potentially hundreds of tons of grain losses if left uncorrected over a full season. After three hours of troubleshooting, replacing the spring eliminated the leakage.
In Ulanqab, Inner Mongolia, a 5,000‑mu silage‑corn (青贮玉米) field experienced repeated harvest delays due to rain. As leaves and stalks dried while ears matured, the farmer calculated a daily drop in dry‑matter content that translated into about 120,000 yuan in lost feed value across the 5,000 mu if harvest was further postponed. By bringing in two additional harvesters (for a total of four), operators compressed what remained of a five‑day job into two days of work, completing the harvest within three days overall, protecting feed quality and enabling the farmer to realise nearly 100,000 yuan in profit from the season.
For forage safety, oat hay (燕麦草) produced in the region must be dried to below 14% moisture before storage. Above this threshold, producers warn that bales can self‑heat (“burn”) within about three days, risking the loss of an entire stack or even a full warehouse. Dedicated crews use moisture meters and field tedders to flip and dry hay rapidly; readings of 9.8% and 6.9% moisture are cited as safely “able to pass the summer.”
Soil preparation teams in the same operations emphasise deep tillage to 45 cm before winter freeze‑up. Inspections found some fields initially tilled only to 40 cm; managers required operators to correct this, arguing that the final 5 cm of depth plays an outsized role in water storage and root development in the following season. Given that local winter temperatures can reach –40°C, they noted that once soils are fully frozen, both fuel use and wear on machinery increase sharply and tillage quality declines, reinforcing the need to finish deep tillage on schedule.
Brazil: Expanding suinocultura on a grain powerhouse
In Mato Grosso, where abundant grain production reduces feed costs, pig farming is expanding but faces structural challenges, including long distances to major consumer markets and export channels. The Sementes da Inovação program described above is one response, using producer‑driven problem mapping and startup engagement to modernise process management, traceability, waste handling and marketing in the state’s swine industry.
These efforts are intended to enhance management quality, productivity and sustainability of Mato Grosso’s pig farms, potentially improving their competitiveness relative to producers in regions closer to end‑markets.
India: Controlled‑environment floriculture cluster in Hooghly
The Hooghly polyhouse project in West Bengal represents a move toward high‑value, infrastructure‑intensive horticulture in eastern India. With AIF backing, the integrated design—climate‑controlled polyhouses, on‑site cold room, and solar‑plus‑diesel power—aims to supply Grade‑A roses, gerbera and carnations into volatile domestic and export markets.
Given the explicitly modelled OPEX of ₹48.9 lakh per acre per year and substantial maintenance and input budgets, financial viability will depend on achieving assumed stem prices (₹3.50–4.50) and maintaining quality during May–July heat and humidity, when both power reliability and disease pressure are expected to be most challenging.
North America: Soy processing and land consolidation
In Ontario, the new soy powder plant and $24 million non‑GMO soymilk facility point toward a more domestically integrated soy value chain, with a greater share of local beans processed within the province. This may influence contracting dynamics and quality specifications for Ontario soybean growers.
In the U.S., the Hilseweck Ranch sale in southeast Oklahoma at about $1,811 per acre illustrates continued consolidation of large ranch properties and provides a fresh benchmark for extensive grazing land values in that region.
Best Practices
Harvest timing, storage moisture and deep tillage (northern China)
Lessons from the Xiertala and Ulanqab cases offer concrete guidance for grain and forage producers in cold, continental climates:
- Protect silage‑corn quality with timely harvest. The Ulanqab farmer’s experience shows how repeated rain delays can erode silage value; he estimated that quality losses across 5,000 mu of silage corn could reach about 120,000 yuan if harvest were postponed further. Bringing in additional harvesters cut harvest time from five to two days and preserved nearly 100,000 yuan in profit.
- Monitor combine losses continually. At Xiertala, a single worn spring at the sieve connection caused 7–8 jin of wheat per mu to spill back onto the field—over 40 jin per machine per day, and potentially hundreds of tons of grain over a season. Stopping to diagnose and replace minor parts prevented large cumulative losses.
- Dry hay below 14% moisture for safe storage. Oat hay producers in the region treat 14% moisture as the upper safety limit; above that, “burning” or self‑heating can occur within three days, endangering entire stacks or warehouses. Crews use specialised tedding and moisture testing to achieve readings like 9.8% or 6.9%, which are considered safe for long‑term storage.
- Use deep tillage strategically before freeze‑up. Agronomists overseeing Xiertala insist on 45 cm tillage depth; inspections found 40 cm insufficient, with managers noting that the last 5 cm are critical for water storage and root penetration in the following crop. Completing this work before soils freeze (down to about –40°C in the region) avoids excessive fuel use and poor soil fracture associated with late, frozen‑soil tillage.
Dairy: Integrating feed management and forage production (U.S.)
At Holdgrafer Dairy, a 400‑cow U.S. operation, long‑time employee Travis Ties demonstrates an integrated approach to feeding and forage management:
- He serves as head of feed operations, mixing feedstuffs, executing daily feeding and related management, and explicitly targeting high milk‑component levels rather than only volume.
- Ties checks the milk chart every day when punching in or out, using it as a continuous performance scoreboard to see whether the herd is “kept up there” on production.
- Beyond the bunk, he also manages tillage, planting and harvesting for the farm’s forage crops, including corn silage and alfalfa, aligning field operations directly with ration needs.
- Herd nutritionist Samantha Reigard notes that Ties is effectively more than just the feeder—he is her primary counterpart for questions on inventories, feed and cow performance because he “knows everything” from cows to stocks.
For other dairies, key takeaways are to:
- Designate a single accountable feed manager who oversees both rations and forage production,
- Track daily milk and component data and use them in real‑time decision‑making, and
- Maintain tight collaboration with a nutritionist so ration adjustments are grounded in both data and practical knowledge of cows and inventories.
Mulching and soil building in diversified systems (global permaculture)
Permaculture practitioners shared several practical mulching strategies, along with cautions, that are broadly applicable:
1. Thick, targeted hay/straw mulch on beds
- One grower reports maintaining a vegetable bed with 10 inches of Timothy hay for five years, experiencing “zero weeds or pests” under the thick mulch. They refresh the surface periodically with new hay and plant garlic and onions in winter to help deter mice and rats.
- Multiple contributors stress applying straw or hay only on growing spaces, not walkways or near house foundations, to avoid mess and rodent issues. One gardener found that straw placed close to the house encouraged mice incursions into the building.
2. Wetting and inoculating straw to keep it in place
- Simply watering straw as it is applied helps it “glue together” into a wind‑resistant mat; within a month, some gardeners observe a white fungal mycelium binding the straw layer.
- Another practitioner applies straw at least 6 inches deep across entire beds, first soaking it in a barrel of water containing aged compost and a handful of local soil to inoculate it with microbes before spreading. The weight of wet straw and the developing mycelial network help keep it in place.
- Direct seeding into heavy straw proved difficult because insects ate seedlings. Instead, growers pull back straw to create “bird nests” for transplants, which has worked well even in areas previously dominated by Bermuda grass, with that grass reportedly declining each year as soil improves.
3. Sourcing low‑cost, local mulch materials
- Given how quickly mulch decomposes, one commenter advises first identifying free local sources of biomass rather than buying inputs repeatedly. Examples include:
- Sweeping autumn leaves from neighbours’ driveways for winter bed mulch,
- Cutting tall grass behind fences to use as straw while also keeping paths accessible,
- Chopping wind‑fallen Cordyline fronds for path cover,
- Regularly trimming a pond‑side sedge for mulch and growth control, and
- Using “waste” from seed harvesting (e.g., Nigella and Plantago chaff) to mulch pots.
4. Risk management: weed seeds and herbicide residues
- Several gardeners warn that any imported straw or hay can contain weed seeds, citing outbreaks of Bermuda grass, wild violets and stiltgrass that were worst where straw had been applied. Even straw purchased from “reputable” stores is not guaranteed seed‑free.
- Straw placed close to homes can encourage rodent problems, as mice used it as cover to access buildings.
- A critical caution concerns carryover herbicides: straw or hay cut from fields treated with Grazon can introduce residues strong enough to damage or ruin garden soils for years. Gardeners advise confirming source fields have not been treated with Grazon before applying straw mulches.
Together, these practices highlight the importance of mulch depth, moisture, biological activity, sourcing strategy and input screening in low‑input soil‑building systems.
Mediterranean urban food forest design (Fremantle, Western Australia)
A 30‑year‑old urban food forest in Fremantle, Western Australia offers a mature example of intensive, climate‑appropriate design:
- Raised wicking beds are used for annual crops, providing consistent moisture in a Mediterranean climate with dry summers.
- Understory layers beneath fruit trees host perennial vegetables, medicinal herbs, dye plants and flowers, maximising use of vertical space.
- Regular applications of tree mulch help build soil structure and moisture‑holding capacity on originally hydrophobic sand.
- The food forest includes a wide range of fruit species—from mulberries, apricots, quince and pomegranate to fig, dragon fruit, banana, plantain, blood orange, yuzu, olive, persimmon, grapes, jujube, guavas, ice‑cream bean, acerola, crab apple, feijoa, carob, nectarine, peach, apple and loquat—demonstrating high species diversity within a compact urban block.
Recent losses of older fruit trees led to a “mini permablitz” replanting effort, showing how even long‑established systems require periodic renewal. For growers in Mediterranean or similar climates, the case underscores the value of wicking beds, deep mulching and multi‑layer planting in managing water scarcity and building resilient urban food production.
Input Markets
Fertiliser, chemicals and service contracts in high‑tech floriculture (India)
The Hooghly polyhouse project provides rare, detailed benchmarks on input costs for controlled‑environment floriculture in eastern India:
- ₹6.6 lakh per acre per year is allocated to fertilisers and chemicals, and ₹4.2 lakh to annual maintenance contracts for climate‑control and other technical systems, out of a total OPEX of ₹48.9 lakh per acre.
- These figures imply that fertiliser/chemical inputs and technical service contracts together account for more than 20% of annual operating costs, even before labour and energy are counted, highlighting the importance of robust supplier relationships and preventive maintenance.
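As a quick back‑of‑envelope check of that share, using the budget figures above:

\[
\frac{\text{inputs} + \text{AMC}}{\text{OPEX}} = \frac{6.6 + 4.2}{48.9} = \frac{10.8}{48.9} \approx 22\%.
\]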
Given the identified risks of higher‑than‑expected electricity use and potential downtime in the hottest, most humid months, budgeting for both power and contingency maintenance appears critical.
Machinery and custom‑work capacity (China)
The Zhangjiakou Nongken group’s investment of about 3.4 billion yuan in nearly 1,000 advanced farm machines reflects a substantial capital commitment to machinery pools that serve both their own lands and external clients like Xiertala Farm.
By offering custom tillage, planting, spraying and harvesting services across large distances, such machinery pools effectively function as regional input providers, giving farms access to high‑capacity equipment without full ownership. The Xiertala experience—where outside machinery enabled timely harvest of 60,000 mu of pasture grass and 25,000 mu of wheat and rapeseed—illustrates the value of this service model under tight weather windows.
Organic fertiliser and substrate from crop residues (China)
The tea–shiitake system in Leizhou shows how on‑farm residues can substitute for commercial substrates and fertilisers:
- Pruned tea branches replace purchased woodchips in shiitake sticks, reducing substrate costs by about 20,000 yuan per mu of greenhouse.
- After three to four flushes, spent sticks are composted into organic fertiliser and returned to tea gardens, closing nutrient loops and creating an additional product stream.
This model reduces dependence on external inputs while upgrading agricultural “waste” into revenue‑generating products.
Crop machinery cost benchmarks (U.S. row crops)
The machinery‑cost study highlighted by Successful Farming specifically examines corn and soybean machinery costs across farm sizes and net‑return categories, giving growers a data‑driven basis for evaluating whether to own, lease or outsource field operations. With machinery often the second‑largest cost category after land, such benchmarks are an important adjunct to traditional fertiliser and chemical budgeting.
Full details are available in the online article.
Mulch and hay as inputs: cost and contamination risks (global)
In many small‑scale and regenerative systems, mulch is a major input in both cost and risk terms:
- Practitioners note that purchased mulch “doesn’t last long enough” to justify repeated buying, advocating a focus on free local biomass sources like leaves, path‑side grasses, tree trimmings and seed‑cleaning residues.
- Straw and hay sourced from outside fields can import problematic weed seeds (e.g., Bermuda grass, wild violets, stiltgrass) and rodents, and may have been treated with persistent herbicides such as Grazon that can damage garden soils for years.
Before purchasing or accepting straw/hay, growers are advised to verify production practices—especially herbicide history—and weigh the long‑term soil and weed implications of each mulch source.
Forward Outlook
Specialty systems and controlled environments
Recent developments highlight steady movement toward specialty, high‑value production systems and controlled environments:
- In China, circular shiitake–tea systems and stereoscopic mushroom houses demonstrate how targeted innovation can significantly raise margins on small footprints—e.g., 0.3‑mu greenhouses generating up to 18,000 yuan per day at peak in both shiitake and black‑skin chicken‑fir production.
- In India’s Hooghly District, detailed budgeting for a 1.5‑acre polyhouse floriculture unit under AIF shows the capital and operating intensity involved in year‑round rose, gerbera and carnation production, with returns hinging on both climate‑control reliability and premium‑season pricing.
Producers considering similar systems should expect high fixed costs, sophisticated climate and disease management, and a need for robust market access to absorb premium‑quality output.
Data, startups and integrated services in livestock
In livestock, the Mato Grosso case illustrates how structured problem mapping and startup engagement are becoming standard tools for sectors facing scale and sustainability pressures. With more than 100 producers contributing over 100 identified challenges, and 10 priority issues ranging from traceability to waste management, the upcoming Sementes da Inovação report (due January–February) will likely frame future investment and policy discussions in Brazil’s swine sector.
At the farm level, operations like Holdgrafer Dairy, where a single manager oversees both feed mixing and forage production while collaborating closely with a nutritionist, show how integrated roles and daily data use can support performance in an increasingly margin‑tight dairy environment.
Domestic processing and regional value capture
Investments in Ontario’s soy powder and non‑GMO soymilk plants suggest a gradual shift toward greater domestic processing and value capture in some commodity chains. For producers, this may translate into new contracting opportunities, tighter quality specifications and potential premiums for identity‑preserved crops.
Similarly, large‑scale mechanisation and cross‑regional machinery service fleets in northern China, funded through multi‑billion‑yuan investments, point to a future in which access to high‑capacity custom work becomes as important an input as fertiliser or seed—especially under increasingly narrow harvest windows and stringent feed‑quality requirements.
Policy and risk‑management watchpoints
In the United States, the anticipated economic assistance package for agriculture, combined with evolving crop‑insurance and machinery‑cost information, will shape producers’ risk‑management and capital‑allocation decisions into the next season.
In Brazil, the forthcoming Sementes da Inovação report for Mato Grosso’s swine industry may influence how public and private actors prioritise investments in process control, traceability and waste management, with downstream effects on feed demand and environmental compliance.
Low‑input resilience strategies
At smaller scales, the shared permaculture and food‑forest experiences point toward a continued emphasis on low‑input resilience:
- Using on‑farm or local biomass as mulch to cut cash costs and build soil,
- Carefully screening imported straw/hay to avoid weed seed and herbicide contamination, and
- Exploring simple biomass‑based heating systems, such as the proposed Roman‑style tank/greenhouse heater fueled by coppiced hazel, to reduce dependence on expensive energy inputs.
Across scales—from 0.3‑mu mushroom houses to 8,488‑acre ranches and mechanised mega‑farms—the common thread is a search for more value from each unit of land, labour and input, whether through circular resource use, precision machinery services, or new processing and marketing channels.
Airbtc
Brindon⚡
Calabar Bitcoin Club
Major Adoption News
Nigeria (Calabar): Political Party Secretariat Renovated with Bitcoin
In Calabar, Cross River State, the African Democratic Congress (ADC) state secretariat headquarters was renovated and repainted with the entire project paid for in Bitcoin, including payments to both the paint manufacturer and the painter via peer‑to‑peer transactions. This has been described by local advocates as a "first-of-its-kind moment in Nigeria." Supporters are invited to send additional Bitcoin directly to the merchant at chrisomueze@blink.sv, indicating ongoing Bitcoin-based commercial relationships around this work.
"A first-of-its-kind moment in Nigeria."
Nigeria (Anambra State): 40+ Businesses in a Growing Circular Economy
Bitcoin Anambra reports that more than forty businesses in Anambra State now accept Bitcoin payments, following meetups, trainings, merchant outreach, and community events. Merchants in this circular economy are said to enjoy faster payments and new customers, highlighting perceived business benefits of accepting Bitcoin. The group seeks support to reach more people, open more doors, and strengthen the circular economy across Anambra, the broader South East, and Africa.
As a concrete example within this ecosystem, restaurant BurgerChickenCo in Anambra promotes how seamless it feels to pay for meals with Bitcoin, framing each purchase as strengthening the circular economy "one purchase at a time." The venue shares both a BTC Map listing and a zap address (burgerchickenandco@blink.sv) for receiving sats.
Ghana (Akatsi): Homage Pub & Grill Accepts Sats for Food and Drink
At Homage Pub & Grill in Akatsi, Ghana, a customer is shown taking a shot of whiskey and wele while settling the bill in sats using the Blink app, confirming that the venue accepts Bitcoin payments. The merchant is listed on BTC Map and publishes a pay code (beyourownbossgh@blink.sv), simplifying direct sats payments from customers.
Dominican Republic (El Valle): Beach Villa Bookable with Bitcoin
Airbtc highlights Fire Villa 2 in El Valle, Dominican Republic, as a "Bitcoin stay pick," directing users to a property page where guests are invited to "explore the stay & book with Bitcoin." This positions Bitcoin as a primary way to pay for accommodation at this beach-adjacent villa via the Airbtc platform.
Curaçao: Video Project Demonstrates Living on Bitcoin
Content creator Joe Nakamoto asks, "Can you live on Bitcoin in Curaçao?" alongside a video clip, and follows up with a post answering "Yes" and linking to a YouTube video on the subject. This indicates an attempt to cover day‑to‑day expenses on the island using only Bitcoin.
Payment Infrastructure
Merchant Pay Codes and Zap Addresses in Nigeria and Ghana
Across several examples, merchants and service providers are sharing simple pay codes and zap-style addresses to receive sats directly from customers.
- BurgerChickenCo in Anambra publishes burgerchickenandco@blink.sv as an address where customers can "zap" sats.
- The merchant who handled the ADC Cross River State Secretariat renovation shares chrisomueze@blink.sv for receiving Bitcoin payments and tips, with posts explicitly inviting supporters to send sats there.
- Homage Pub & Grill in Akatsi lists its BTC Map entry and a Blink pay code (beyourownbossgh@blink.sv), providing standardized details for Bitcoin payments.
These shared payment identifiers function as lightweight infrastructure for recurring or remote payments to local merchants and contractors.
Bitcoin Payment Service Tando Emphasizes Fee Savings
Bitcoin payment service Tando reports that in one recent month it saved users a total of Ksh 135,500 in transaction costs, described as nearly 1,000,000 sats returned to users' pockets. In another post, the service cites 179,951 KES saved in transaction fees across all users and characterizes this as "almost 180,000 reasons" to switch one's preferred currency to Bitcoin, underscoring fee reduction as a core part of its value proposition.
Social Tipping via Sat-Denominated Micro-Payments
A user tags @bitcoinarusha to say they have tipped 21 sats and that the funds can be claimed via @bitbit_bot, demonstrating the use of very small Bitcoin payments for social tipping, with a bot-based flow for claiming the funds. The @bitcoinarusha account responds affirmatively—"FACTS! @bitbit_bot"—along with an image, reinforcing that this path is active and in use.
Regulatory Landscape
China: Central Bank Reaffirms Ban on Digital Assets
China’s central bank (PBoC) has reaffirmed its ban on digital assets, stating that virtual currencies, including stablecoins, are illegal and do not constitute legal tender. Following a November 28 meeting with key government and regulatory bodies, the PBoC highlighted risks such as money laundering and fraud and pledged to continue cracking down on crypto-related activities. This stance maintains a highly restrictive environment for any formal use of Bitcoin as a payment method within China.
Usage Metrics
Key Reported Figures
Anambra State (Nigeria) – Merchant Count: Bitcoin Anambra reports that more than forty businesses now accept Bitcoin payments as a result of its local meetups, trainings, and merchant outreach.
Bitcoin Payment Service Fee Savings (Ksh/KES): Tando states that in one month, its users collectively saved Ksh 135,500 in transaction fees, which it describes as nearly 1,000,000 sats returned to users. In another post, it reports 179,951 KES saved in transaction fees across all users, framing each saved unit as a "reason" to switch to Bitcoin.
Micro-Tipping Amount: A tip of 21 sats was sent to the @bitcoinarusha account and made claimable via a bot, providing a concrete example of a very small-value Bitcoin transfer used for social tipping.
Emerging Markets
Bitcoin payment activity in this period is concentrated in everyday commerce, services, and travel across several jurisdictions.
Nigeria (Anambra and Calabar): In Anambra State, advocates describe a circular economy where more than forty businesses accept Bitcoin, and merchants report faster payments and new customers. Within this network, BurgerChickenCo encourages customers to pay for meals with Bitcoin, emphasizing that each purchase strengthens the circular economy and providing its BTC Map listing and zap address. In Calabar, the full renovation of the ADC Cross River State Secretariat, paid entirely in Bitcoin to both manufacturer and painter, is highlighted as a first-of-its-kind project, demonstrating Bitcoin use for larger service contracts.
Ghana (Akatsi): Homage Pub & Grill accepts sats for food and drink via the Blink app and shares a pay code for customers, signaling Bitcoin penetration into bar and restaurant payments in a smaller town.
Uganda (Students and Juice Rewards): In Uganda, students receive Bitcoin rewards that can be redeemed for juice, tying Bitcoin to youth-focused activities and low-ticket consumer items in an educational setting.
Bitcoin Arusha Community: A 21-sat tip to the @bitcoinarusha account, claimable via @bitbit_bot, illustrates how very small Bitcoin payments can circulate socially within an online community; the account name points to Arusha, Tanzania, though the post itself does not state a location.
El Valle, Dominican Republic: On the Airbtc platform, Fire Villa 2 in El Valle is marketed as a Bitcoin-focused accommodation where guests are invited to book with Bitcoin, extending Bitcoin payments into the tourism and short‑term rental sector.
Curaçao: Joe Nakamoto’s "Can you live on Bitcoin in Curaçao?" project answers its own question with a "Yes" and links to a YouTube video, indicating a practical experiment in using Bitcoin to cover day-to-day expenses in this jurisdiction.
Adoption Outlook
Across these sources, Bitcoin’s role as a payment method is being pushed forward mainly by community advocates, service providers, and niche platforms rather than top-down policy initiatives. In Nigeria and Ghana, organizers describe circular economies that encompass dozens of merchants, a full political party headquarters renovation, and hospitality venues that share explicit Bitcoin payment details and report benefits such as faster settlement and new customers. Education-linked rewards in Uganda and small social tips via services like @bitbit_bot show Bitcoin being used for low-value, everyday transfers and engagement. Payment service Tando emphasizes transaction-fee savings in local currency terms, while Airbtc and the Curaçao video project position Bitcoin as a way to pay for travel accommodation and potentially all living expenses.
In contrast, China’s central bank reiterates that virtual currencies remain illegal and not legal tender, and signals ongoing crackdowns based on concerns such as money laundering and fraud, underscoring how regulatory environments can sharply limit Bitcoin’s payment use in some jurisdictions. Overall, the picture is one of continued bottom-up experimentation and growth in Bitcoin payments—especially in community-led, service-based, and travel-related contexts—set against a backdrop of uneven and sometimes restrictive regulation.