May 09, 2026

AI Daily — 2026-05-09

Covering 24 AI news items

🔥 Top Stories

1. ERNIE 5.1 Cuts Pretraining Cost, Boosts Performance

Baidu’s ERNIE 5.1 compresses total parameters to roughly one-third and active parameters to about half, bringing its pretraining cost to around 6% of that of comparable-scale models. It achieves leading performance across agentic tasks, knowledge benchmarks, creative writing, and frontier-level reasoning, and its search capabilities rank highly on Arena Search. The rollout signals intensified competition in scalable AI and broader access through Baidu’s platforms. Source-x

2. Open-Source 1M-Context DeepSeek v4 Flash Runs Locally on Mac

Antirez released ds4, a native inference engine for DeepSeek v4 Flash. It supports a 1M-token context window on a 128GB MacBook Pro by using 2-bit quantization and moving the KV cache to SSD. The release demonstrates frontier AI capabilities on consumer hardware and highlights open-source momentum challenging cloud-centric stacks. It also supports more accessible experimentation and edge AI workflows. Source-x

3. Anthropic: Teaching why misalignment is wrong improves Claude

Anthropic reports that training Claude on demonstrations of aligned behavior wasn’t sufficient; the best interventions taught Claude to deeply understand why misaligned behavior is wrong. This reflects a shift toward alignment methods that cultivate causal and normative reasoning in models, potentially yielding more honest AI systems. Source-x

LLM

  • Sakana AI, NVIDIA Co-Develop Faster Sparse Transformers with TwELL — Sakana AI and NVIDIA introduce TwELL and a CUDA kernel to fuse multiple sparse matmuls, delivering over 20% speedup and significant memory and power savings for sparse LLMs with tens of billions of parameters; ICML 2026 presentation plus open-source materials accompany the release. Source-x

  • Qwen3.6-35B A3B Uncensored Native MTP Preserved Released — The open-source Qwen3.6-35B A3B uncensored Native MTP Preserved model is released with full MTP counts across variants, available in Safetensors, GGUF, NVFP4, and GPTQ-Int4 formats on HuggingFace, underscoring demand for preserved MTP counts. Source-reddit

  • Hermes Agent Tops OpenRouter Global AI Apps Ranking — Hermes Agent reaches No. 1 among all AI apps on OpenRouter, backed by nearly 1,000 contributors; the team thanks supporters and asks for feature requests. Source-x

  • AI Reproduces Schmidhuber Papers with World Models (1990-2025) — An AI coding assistant reproduces Schmidhuber’s World Models papers from 1990 to 2025, building a toy environment plus full VAE+RNN world model; project hosted at cybertronai/schmi. Source-x

  • ds4 WebUI Debuts for Open-Source AI Server — Minimal web UI for the ds4.c AI server released; demo on an M3 Ultra with 256GB RAM; caveat: Apple Silicon Macs require at least 128GB memory; links to the ds4.pinokio repo and related posts. Source-reddit

  • Tesla FSD Uses Photon-Count Reconstruction for Night Vision — Tesla highlights AI-based photon-count reconstruction to surpass human RGB vision in night and glare conditions, suggesting a multimodal imaging approach to improve autonomous perception. Source-x

  • OpenAI Seeks Input on Improvements for Next Model — OpenAI CEO Sam Altman solicits broad input on capabilities, safety, and user experience for the next model, signaling an open feedback-driven development phase. Source-x

⚡ Quick Bites

  • Integrate GPT-Realtime-2 for voice-enabled CRM workflows — A new GPT-Realtime-2 integration aims to streamline voice-enabled CRM workflows. Source-x

  • Next Mythos model will work 8 hours at 80% success — The next Mythos model is expected to sustain tasks of about 8 hours at an 80% success rate, highlighting endurance improvements. Source-x

  • Figure Gives AI a Body for Embodied Capabilities — A concept exploring AI bodies to enable embodied capabilities takes shape. Source-x

  • AI-DLC: Adaptive Workflow Rules for AI Coding Agents — AI-DLC proposes adaptive workflow rules to govern AI coding agents. Source-github

  • AI-Trader Debuts 100% Automated Agent-Native Trading — AI-Trader launches a fully automated agent-native trading system. Source-github

  • Running Minimax 2.7 with 100k context on Strix Halo — Demonstrates large-context capability on Strix Halo deployments. Source-reddit

  • Existential AI Risk Research Could Increase Our Risk — Debates whether existential AI risk research could inadvertently raise exposure. Source-x

  • Overwhelmed by Harnesses for llama.cpp; any universal option? — Community discusses options for consolidating llama.cpp harnesses. Source-reddit

  • Where to find apps for local AI models — Guidance on locating apps for locally hosted AI models. Source-reddit

  • Timeline for llama.cpp’s official MTP support? — Questions about official MTP support timing for llama.cpp. Source-reddit

  • Codex tasks kicked off and completed; optimism for AI’s future — Progress on Codex tasks underpins optimism about the future of AI tooling. Source-x

  • Elon Musk jokes about Yudkowsky’s fault in Anthropic reply — Musk riffs on attribution in a thread about Anthropic. Source-x

  • I Swear More at Claude Than Codex — A playful take on user interactions with Claude versus Codex. Source-x

  • Shel Silverstein Predicts LLMs and Hallucinations, Circa 1981 — A Reddit thread notes surprising foresight about LLMs and hallucinations in Silverstein’s work. Source-reddit


Generated by AI News Agent | 2026-05-09