AI Daily — 2026-05-28

Covering 34 AI news items

🔥 Top Stories

1. Claude Opus 4.8 Released with Sharper Judgment and Independence

Claude Opus 4.8 arrives with fixes and enhancements that address feedback from 4.7, delivering sharper judgment, more transparent progress reporting, and longer independent operation. The update strengthens its appeal for coding, knowledge work, and end-to-end task handling, while a benchmark table positions Opus 4.8 against its predecessor and peers across coding, agentic skills, reasoning, and practical tasks. Source-x

2. Anthropic to roll out Claude Mythos amid safety concerns, boosted inference compute

Anthropic plans to roll out Claude Mythos in the coming weeks, framing the move as addressing safety concerns, and has reportedly secured tens of billions in inference compute. The rollout comes amid ongoing discussion of the model’s cyber capabilities and their safety implications for enterprise use. Source-x

3. Zai Deploys ZCube to Boost GLM-5.1 Inference Performance

Zai replaced the network topology on a thousand-GPU GLM-5.1 cluster with ZCube, which removes the Spine layer. The change yields roughly 33% lower switch costs, 15% higher GPU throughput, and a 40.6% drop in P99 first-token latency. Source-reddit
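
For readers unfamiliar with the latency metric in the ZCube item: P99 first-token latency is the value that 99% of requests come in at or under, and a percentage drop is the relative change between two such values. The sketch below is illustrative only. The function names and the sample latencies are hypothetical and are not taken from Zai's report.

```python
def p99(latencies_ms):
    """Nearest-rank 99th percentile of a non-empty list of latencies."""
    ordered = sorted(latencies_ms)
    # Integer ceiling of 0.99 * n gives the 1-based nearest rank.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def relative_drop_pct(before, after):
    """Percentage reduction going from `before` to `after`."""
    return (before - after) / before * 100.0


# Hypothetical first-token latencies in milliseconds (not measured data).
old_topology = list(range(1, 101))              # 1..100 ms
new_topology = [x * 0.5 for x in old_topology]  # every request twice as fast

drop = relative_drop_pct(p99(old_topology), p99(new_topology))
print(f"P99 drop: {drop:.1f}%")
```

Because P99 tracks the slow tail rather than the average, a reported drop in it says that the worst-case requests improved, which is usually what users notice first.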

📰 Featured

LLM

LFM2.5-8B-A1B Debuts as Device-Optimized AI with 128K Context — A device-friendly 8B MoE with 128K context targets edge devices; trained on 38T tokens with large-scale RL and supports fast tool calling with single-GPU customization, all under an open-weight license. Source-x
Claude Code Adds Dynamic Workflows for Multi-Agent Orchestration — Claude Code can generate a strict orchestration plan when prompted with “workflow,” enabling reliable, end-to-end multi-agent automation across hundreds of agents. Source-x
Bidirectional Evolutionary Search Enables Self-Improving LMs — Proposes a Bidirectional Evolutionary Search to broaden exploration and improve post-training sample generation and inference for agentic systems; hosted on Hugging Face. Source-huggingface

AI

AlphaGo inspires math breakthroughs, mirrors Go skill gains — Just as human Go play improved after AlphaGo, human performance in mathematics appears to be accelerating through AI-assisted methods tied to the unit distance conjecture. Source-x

Industry

Spielberg: Don’t Use AI as Final Word on Creativity — The filmmaker argues AI should be a tool within a broader production toolkit, not the final authority on dialogue, camera directions, or set design. Source-x

RL

ProRL Advances RL for Proactive Recommendations with Rectified Policy Gradient — ProRL presents a reinforcement learning approach with a rectified policy gradient to improve long-term performance in proactive recommender systems. Source-huggingface

Multimodal

GEM: Generative Supervision Bridges Embodied Vision-Language Gaps — GEM targets Vision-Language-Action tasks to close gaps between high-level semantic pre-training and low-level spatial knowledge for embodied robotics. Source-huggingface

⚡ Quick Bites

MoneyPrinterTurbo: One-Click AI Video Generator — A one-click AI video generator enabling rapid production of AI-generated videos. Source-github
Claude Code Harness Enables Plan-Work-Review Cycle — Harness enables a plan-work-review cycle for traceability in Claude Code workflows. Source-github
LiquidAI Unveils LFM2.5-8B-A1B On-Device Hybrid LLM — LiquidAI releases an on-device hybrid LLM designed for edge devices. Source-reddit
Mimo 2.5 Pro Delivers 83 t/s on 8x GB10 Cluster — Reported throughput of 83 tokens per second on an 8x GB10 cluster signals strong on-device scalability. Source-reddit
Western Open-Weight SOTA: Gemma4-31B vs Nemotron3-Super-120B — Open-weight SOTA discussion comparing models across benchmarks. Source-reddit
Qwen-Image-Bench Launches Q-Judger for Image Quality — Q-Judger introduced to assess image quality in Qwen-Image-Bench. Source-reddit
vLLM 5x Faster Than Llama; Quantization Status Unclear — vLLM reports 5x speedups over Llama; quantization status remains under discussion. Source-reddit
Vulnerability Found in Framework Used by vLLM and LLM Tools — Security flaw identified in a framework underpinning vLLM and LLM tooling; patching guidance forthcoming. Source-reddit
PaddleOCR-VL-1.6 Released by PaddlePaddle — PaddleOCR-VL-1.6 adds features and fixes to PaddlePaddle’s OCR suite. Source-reddit
SpaceX nears V1.0 in-house AI stack, claims 10x speedup vs JAX — SpaceX touts a major speedup with an in-house AI stack versus JAX. Source-x
Red-teaming helps improve new AI models before release — Red-teaming is highlighted as a key safety step before model release. Source-x
Revealed: The Prompt Behind Every SWE-Bench Test — Details the prompt design behind SWE-Bench tests. Source-x
Explorative Policy Optimization for Multimodal Agentic Reasoning — Proposes explorative policy optimization for multimodal agentic reasoning. Source-huggingface
DenoiseRL: Bootstrapping Reasoning to Recover from Noisy Prefixes — DenoiseRL improves reasoning under prefix noise. Source-huggingface
obra/superpowers: Agentic coding skills framework and methodology — Repository outlining an agentic coding skills framework and methodology. Source-github
Reachy Mini Goes Fully Local for Voice Agents — Reachy Mini moves to fully local voice-agent capabilities. Source-reddit
Upgrade path from 4x RTX 3090s for LLM hosting — Practical upgrade guidance for LLM hosting hardware. Source-reddit
Qwen3.6 35B: TXT vs Markdown vs HTML vs HTML+CSS — Comparisons across text representations for Qwen3.6 35B. Source-reddit
Claude CLI 2.1.154 breaks vLLM; patch fixes roles — Patch fixes role handling issues between Claude CLI and vLLM. Source-reddit
HF models page adds ‘Base only’ filter — Hugging Face models page gains a base-only filter for easier browsing. Source-reddit
CHAEWON Uses Gemini on Android for Spotify Mood Recs — Gemini powers mood recommendations on Android via CHAEWON. Source-x
Poll: How much VRAM for local AI models? — Community poll exploring VRAM requirements for local models. Source-reddit
Anthropic Claims Cure for Laziness — Anthropic proclaims a “cure for laziness” in AI workflows; claims require scrutiny. Source-x
Home Office Radiator: 4 RTX Pro Max-Q Overheating Setup — Home office rig with 4 RTX Pro Max-Q GPUs runs hot. Source-reddit

Generated by AI News Agent | 2026-05-28