AI Daily — 2026-03-24

Covering 39 AI news items

🔥 Top Stories

1. LiteLLM PyPI Supply-Chain Attack Exfiltrated Secrets

An attacker uploaded a poisoned version of litellm to PyPI. A plain pip install of the package was enough to exfiltrate SSH keys, cloud credentials, Kubernetes configs, git credentials, environment variables, and other secrets. LiteLLM sees about 97 million downloads per month, and many projects pull it in as a transitive dependency (e.g., dspy), so the potential exposure is vast. The tainted package was live for under an hour. It was caught only because of a bug: an MCP plugin in Cursor pulled in litellm, which then exhausted RAM. Without that bug, detection could have been delayed for days or weeks. Source-x
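Because the exposure described above comes largely from transitive installs, one quick local check is to ask which installed packages declare litellm as a requirement. The following is a minimal sketch using only the Python standard library. The helper names (requirement_name, find_dependents, installed_version) are hypothetical, the report does not say which release was poisoned, and this check only shows whether and why litellm is present in the current environment, not whether it is compromised.

```python
from importlib import metadata
from typing import Dict, Optional
import re


def requirement_name(requirement: str) -> str:
    """Return the normalized project name at the start of a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    name = match.group(1) if match else ""
    # PEP 503 normalization: lowercase, runs of '-', '_' and '.' become one '-'.
    return re.sub(r"[-_.]+", "-", name).lower()


def find_dependents(target: str) -> Dict[str, str]:
    """Map each installed distribution that lists `target` as a requirement
    to that distribution's version. Optional extras are counted as well."""
    wanted = requirement_name(target)
    dependents: Dict[str, str] = {}
    for dist in metadata.distributions():
        for req in dist.requires or []:
            if requirement_name(req) == wanted:
                name = dist.metadata.get("Name") or "<unknown>"
                dependents[name] = dist.version
                break
    return dependents


def installed_version(target: str) -> Optional[str]:
    """Return the installed version of `target`, or None if it is absent."""
    try:
        return metadata.version(target)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("litellm")
    if version is None:
        print("litellm is not installed in this environment")
    else:
        print(f"litellm {version} is installed")
        for name, dep_version in sorted(find_dependents("litellm").items()):
            print(f"  required by {name} {dep_version}")
```

Run it inside each virtual environment you care about. Anything that turns up should then be compared against the maintainers' advisory rather than against this script.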

2. OpenAI Foundation to spend $1B on AI resilience and safety

The OpenAI Foundation plans to invest at least $1 billion over the next year to advance AI-enabled science and address societal risks. The initiative will focus on novel biological threats, rapid economic change, and complex emergent effects of capable models. Wojciech Zaremba will become Head of AI Resilience. Source-x

3. MiniMind: Train 26M-parameter GPT in 2 Hours

An open-source project, MiniMind, claims to train a 26-million-parameter GPT from scratch in just two hours on a single NVIDIA 3090, at a cost of around $3. The project provides end-to-end code for large-language-model workflows, including MoE, data cleaning, pretraining, supervised fine-tuning, LoRA, direct preference optimization (DPO), RL training (RLAIF: PPO/GRPO), and model distillation, all implemented in PyTorch without third-party abstractions. It also extends to a multimodal version, MiniMind-V, and aims to democratize AI development. Source-github

📰 Featured

LLM / Open Source / Multimodal

MolmoWeb-8B Outperforms SoM Open Models in Multimodal Tasks — Open multimodal MolmoWeb-8B achieves state-of-the-art results, outperforming open-weight models and even SoM agents built on larger models, signaling progress for open architectures. Source-reddit
LongCat-Flash-Prover Advances Native Formal Reasoning in Lean4 — A 560-billion-parameter Mixture-of-Experts model advances native formal reasoning in Lean4 by auto-formalization, sketching, and proving, and proposes a Hybrid-Experts Iteration Framework to expand high-quality task trajectories within Lean4. Source-huggingface

AI Efficiency / LLM Caching

TurboQuant: 6x LLM cache reduction, 8x speed, zero accuracy loss — Google’s TurboQuant compresses LLM key-value caches, delivering at least 6x memory reduction and up to 8x speedups with no accuracy loss, boosting deployment efficiency. Source-x
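To put the 6x figure in concrete terms, here is a rough back-of-the-envelope sketch of key-value cache size. It uses the standard size formula (keys plus values, per layer, per KV head, per token) and does not model how TurboQuant itself works. The model dimensions (32 layers, 8 KV heads, head dimension 128, 16-bit values, 32,768-token context) are hypothetical and chosen only for illustration.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch=1, bytes_per_value=2):
    """Size of an uncompressed KV cache in bytes.

    The leading factor of 2 accounts for storing both keys and values.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_value


# Hypothetical model shape, not taken from the TurboQuant announcement.
baseline = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=32768)
compressed = baseline / 6  # the "at least 6x" reduction claimed above

print(f"{baseline / 2**30:.2f} GiB -> {compressed / 2**30:.2f} GiB")
# prints "4.00 GiB -> 0.67 GiB"
```

Under these assumed dimensions, a single 32K-token sequence drops from 4 GiB of cache to roughly two thirds of a GiB, which is the kind of headroom that allows longer contexts or larger batches on the same hardware.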

Embodied AI / Robotics

Fully autonomous robot reasons from pixels to drive 30 motors — An autonomous robot reasons directly from camera pixels to compute torques for controlling over 30 motors, signaling strong progress in embodied AI and industry interest. Source-x

Benchmark / World Models

Omni-WorldBench Reframes World Models with 4D Evaluation — The paper argues for 4D evaluation of video-based world models, jointly modeling spatial structure and temporal evolution, and introduces Omni-WorldBench to assess dynamic, multimodal performance. Source-huggingface

Design Tools / AI Agents

Figma launches use_figma MCP to design with AI agents — Figma opens beta for use_figma MCP, enabling AI agents to design directly on the canvas and learn design skills for AI-assisted workflows. Source-x

AI Safety / Multi-Agent / Tools

Anthropic Extends Claude with Multi-Agent Harness for Frontend Design — Anthropic details a multi-agent harness to enhance Claude’s frontend design capabilities and long-running autonomous software engineering, emphasizing reliability and steerability. Source-x

⚡ Quick Bites

Moda raises $7.5M for design agent with taste — Seed funding for an AI-assisted design agent. Source-x
Claude Drowns in Minecraft After Wood-Collecting Request — An AI assistant struggles with a simple in-game task, highlighting the limits of current agents. Source-x
daVinci-MagiHuman: Single-Stream Audio-Video Generative Model — New single-stream AV generative model introduced. Source-huggingface
High-Res Crops for Efficient Vision-Language Models — High-resolution crops improve vision-language model efficiency. Source-huggingface
OpenResearcher: Fully Open Pipeline for Long-Horizon Research Trajectories — Open pipeline for long-horizon AI research trajectories released. Source-huggingface
Hypura: Storage-Tier Aware LLM Inference Scheduler for Apple Silicon — Storage-tier-aware scheduling of LLM inference on Apple Silicon explored. Source-github
Gemini Embedding 2 Enables Sub-Second Video Search — Faster video search via Gemini embeddings. Source-github
Hermes Agent: Self-Improving AI with Built-In Learning Loop — Self-improving AI framework with built-in loop. Source-github
ProofShot enables AI coding agents to verify UI live — UI verification tooling for AI-driven coding agents. Source-github
Reka AI Hosts AMA on Edge Model and Research Direction — AMA on edge models and research directions. Source-reddit
GigaChat-3.1 Ultra 702B and Lightning 10B Open Weights Released — Open weights for large chat models released. Source-reddit
Omnicoder v2 Released with Early Performance Boost — Early performance improvements for Omnicoder v2. Source-reddit
Call for in-depth resources to build AI agents from scratch — Community call for comprehensive resources to build agents. Source-reddit
Hark: New AI lab building the world’s most advanced personal intelligence — New AI lab focusing on personal intelligence. Source-x
Anthropic Index: Claude usage improves with user tenure — Claude usage improves with user tenure. Source-x
Claude Code auto mode handles permissions with safeguards — Auto mode in Claude Code with safeguards for permissions. Source-x
Sakana Chat Free Public Release in Japan — Sakana Chat goes public in Japan. Source-x
Cursor AI Releases Tech Report on Composer 2 Training — Tech report on training Composer 2 released. Source-x
Disney Exits OpenAI Deal After AI Giant Shuts Sora — Disney ends deal after Sora shutdown. Source-rss
tinygrad: A Tiny DL Stack Between PyTorch and Micrograd — A small deep-learning stack positioned between PyTorch and micrograd. Source-github
LLM Neuroanatomy II: Modern LLM Hacking and Hints of a Universal Language? — Exploration of LLM hacking and universal language hints. Source-rss
CQ Aims to Be Stack Overflow for AI Coding Agents — CQ aims to be Stack Overflow for AI agents. Source-rss
SillyTavern Extension Brings NPCs to Life in Any Game — Extensions making NPCs more lifelike. Source-reddit
First AI-assisted pull request created — First AI-assisted PR demonstrated. Source-rss
ChatGPT 5.2 Fails to Explain German Word Geschniegelt — Language model struggles with a German word. Source-reddit
LM Studio Malware False Alarm Confirmed by Microslops — False malware alarm around LM Studio debunked. Source-reddit
Search for AI model beating Claude Opus on 32MB VRAM — Quest to beat Claude Opus on ultra-low VRAM. Source-reddit
LM Studio prompts manual steps; user seeks automated local run — User seeks automation for LM Studio workflows. Source-reddit
Jensen Huang says Meek Mill will use AI to take jobs — Industry claim on AI impact on jobs. Source-x

Generated by AI News Agent | 2026-03-24