# Gemini Goes OS-Level, AI Cyberattacks Cross a Line, and Benchmarks Reset

*By AI High Signal Digest • May 13, 2026*

Google moved Gemini deeper into Android while its threat team disclosed the first known AI-developed zero-day used in the wild. The brief also covers benchmark saturation, key research advances in math and RL infrastructure, and major product and funding moves across consumer AI, biotech, and enterprise software.

## Top Stories

*Why it matters: The biggest news today points to AI moving deeper into operating systems, crossing a new cyber risk threshold, and quickly outgrowing existing evals.*

- **Google pushed Gemini deeper into Android.** Gemini Intelligence adds multi-step task automation across apps, one-tap form fill, polished dictation, and custom widgets, starting on Galaxy and Pixel this summer and later expanding to watches, cars, glasses, and laptops. Google also previewed an AI-enabled pointer that understands what is under the cursor and combines pointing with speech, reinforcing its push to turn Android into an "intelligence system." [^1][^2][^3][^4][^5]
- **Google reported the first known AI-developed zero-day used in the wild.** Its Threat Intelligence Group said the attackers planned a wide-scale strike, though proactive counter-discovery may have stopped it. [^6] *Impact:* AI cyber risk is no longer just hypothetical.
- **Frontier benchmarks are already being reset.** GPT-5.5 high/xhigh solved the first ProgramBench task, and xhigh outperformed Opus 4.7 xhigh across all metrics; separately, GPT-5.5 solved the last unsolved MathArena Apex problem and pushed Apex Shortlist above 90% accuracy. Benchmark builders are now creating newer tests and deprecating some final-answer competitions because models have become too strong for the old format. [^7][^8][^9][^10]

## Research & Innovation

*Why it matters: The most useful technical progress focused on expert workflows, agent training efficiency, and lower-cost compute paths.*

- **DeepMind’s AI Co-Mathematician** reached 48% on FrontierMath Tier 4, a new high among evaluated AI systems. The system is an asynchronous, stateful workbench for ideation, literature discovery, computation, theorem verification, and knowledge development; early sessions reportedly solved open problems and surfaced overlooked citations. [^11]
- **PrimeIntellect’s Renderers** fix the mismatch between token-based RL trainers and message-based environments, which had been corrupting sampled tokens and wasting compute on agentic turns. PrimeIntellect says the change unlocks more than 3x throughput on popular open models. [^12]
- **Sakana AI and NVIDIA’s TwELL** uses a new sparse format plus custom CUDA kernels to exploit >95% sparsity in transformer feedforward layers, translating that into >20% faster training and inference on H100s alongside memory and energy savings. [^13][^14]
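
For readers unfamiliar with the token/message mismatch in the Renderers item, here is a minimal toy sketch of one way it could arise. It assumes the failure mode is a text round trip: a message-based environment hands back decoded text, and re-encoding that text does not reproduce the token IDs the policy sampled. The vocabulary and the `encode`/`decode` helpers are hypothetical and are not PrimeIntellect's code or API.

```python
# Hypothetical three-entry vocabulary, chosen so that two different
# token sequences decode to the same string.
VOCAB = {0: "a", 1: "b", 2: "ab"}


def decode(token_ids):
    """Join the text pieces for a list of token IDs."""
    return "".join(VOCAB[t] for t in token_ids)


def encode(text):
    """Greedy longest-match tokenizer over the toy vocabulary."""
    # Try longer pieces first, as many real tokenizers effectively do.
    pieces = sorted(VOCAB.items(), key=lambda kv: -len(kv[1]))
    ids = []
    i = 0
    while i < len(text):
        for token_id, piece in pieces:
            if text.startswith(piece, i):
                ids.append(token_id)
                i += len(piece)
                break
        else:
            raise ValueError(f"cannot tokenize {text[i:]!r}")
    return ids


# The policy samples "a" then "b" as two separate tokens.
sampled = [0, 1]

# A message-based environment only sees and returns text.
text = decode(sampled)

# Re-encoding the text merges the two tokens into the single "ab" token.
retokenized = encode(text)

# The trainer would now score token IDs the policy never sampled.
round_trip_ok = retokenized == sampled
print(sampled, retokenized, round_trip_ok)
```

In this toy case the round trip turns `[0, 1]` into `[2]`, so a token-based trainer that relies on the re-encoded IDs would be working with tokens that differ from the ones actually sampled.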

## Products & Launches

*Why it matters: New releases kept pushing down cost, widening access, and making multimodal AI more usable.*

- **DeepSeek-v4-Flash** was described in leaderboard comparisons as essentially equal to, and sometimes stronger than, v4-Pro while being faster and about 10x cheaper. [^15][^16]
- **Microsoft rolled out MAI-Image-2-Efficient** globally in Bing Image Creator, now free for everyone, with sharper detail, richer color, better text rendering, and more accurate prompt following than v1. [^17]
- **Meta launched Voice Conversations in Meta AI** powered by Muse Spark, with interruptions, topic switches, multilingual speech, real-time image generation, recommendations, and a live AI camera mode; Meta says it is available today. [^18][^19]

## Industry Moves

*Why it matters: Capital and M&A continue to cluster around domain-specific deployment and AI-enabled software businesses.*

- **Isomorphic Labs raised $2.1B** to accelerate AI drug discovery, building on AlphaFold; Demis Hassabis called improving human health AI’s top application. [^20]
- **Anthropic is reportedly in talks to acquire Stainless for $300M+**, a move that would put a developer-tools supplier used by OpenAI and Google under a competitor's ownership. [^21]
- **Notion said AI now accounts for 60% of its business.** The company closed Q1 with revenue accelerating for the seventh straight quarter and said it is cash-flow positive. [^22]

## Quick Takes

*Why it matters: These smaller updates sharpen the picture on data, voice agents, medical evals, and robotics.*

- Hugging Face crossed **1,000,000 public datasets**, with the total doubling in the last eight months. [^23]
- Artificial Analysis launched **τ-Voice**; **Grok Voice Think Fast 1.0** led at **52.1%**, and even the strongest speech-to-speech models resolved only about half of realistic customer-service scenarios. [^24]
- **Medmarks v1.0** expanded the largest open-source automated medical LLM benchmark suite to **30 benchmarks** and **61 models**. [^25][^26]
- Figure said its **F.04** humanoid reached **design lock** and has started shipping parts. [^27]

---

### Sources

[^1]: [𝕏 post by @Google](https://x.com/Google/status/2054263722353569880)
[^2]: [𝕏 post by @sundarpichai](https://x.com/sundarpichai/status/2054255861158338713)
[^3]: [𝕏 post by @GoogleDeepMind](https://x.com/GoogleDeepMind/status/2054246125524095027)
[^4]: [𝕏 post by @GoogleDeepMind](https://x.com/GoogleDeepMind/status/2054246128221143399)
[^5]: [𝕏 post by @TheRundownAI](https://x.com/TheRundownAI/status/2054306653302858207)
[^6]: [𝕏 post by @NewsFromGoogle](https://x.com/NewsFromGoogle/status/2054187628702888435)
[^7]: [𝕏 post by @KLieret](https://x.com/KLieret/status/2054215545663144217)
[^8]: [𝕏 post by @OfirPress](https://x.com/OfirPress/status/2054229348341661784)
[^9]: [𝕏 post by @Chenhao3564](https://x.com/Chenhao3564/status/2054211432628236746)
[^10]: [𝕏 post by @j_dekoninck](https://x.com/j_dekoninck/status/2054212074511774070)
[^11]: [𝕏 post by @dair_ai](https://x.com/dair_ai/status/2054224343551639958)
[^12]: [𝕏 post by @PrimeIntellect](https://x.com/PrimeIntellect/status/2054347134821154841)
[^13]: [𝕏 post by @SakanaAILabs](https://x.com/SakanaAILabs/status/2052787226136990029)
[^14]: [𝕏 post by @hardmaru](https://x.com/hardmaru/status/2052787980344099293)
[^15]: [𝕏 post by @teortaxesTex](https://x.com/teortaxesTex/status/2054259996376903770)
[^16]: [𝕏 post by @j_dekoninck](https://x.com/j_dekoninck/status/2054162277675282724)
[^17]: [𝕏 post by @JordiRib1](https://x.com/JordiRib1/status/2054248649547255828)
[^18]: [𝕏 post by @MetaNewsroom](https://x.com/MetaNewsroom/status/2054205287515484397)
[^19]: [𝕏 post by @jhyuxm](https://x.com/jhyuxm/status/2054312924014154072)
[^20]: [𝕏 post by @demishassabis](https://x.com/demishassabis/status/2054197462101889277)
[^21]: [𝕏 post by @steph_palazzolo](https://x.com/steph_palazzolo/status/2054344812053008541)
[^22]: [𝕏 post by @ivanhzhao](https://x.com/ivanhzhao/status/2054273186699636975)
[^23]: [𝕏 post by @ClementDelangue](https://x.com/ClementDelangue/status/2054219141653921794)
[^24]: [𝕏 post by @ArtificialAnlys](https://x.com/ArtificialAnlys/status/2054234919887573292)
[^25]: [𝕏 post by @SophontAI](https://x.com/SophontAI/status/2054270239387627927)
[^26]: [𝕏 post by @iScienceLuvr](https://x.com/iScienceLuvr/status/2054272656988410209)
[^27]: [𝕏 post by @adcock_brett](https://x.com/adcock_brett/status/2054392873685340287)