AI Daily — 2026-05-29


Codex Now Works on Windows, Enables PC Task Actions · llama.cpp debuts official website llama.app...

Covering 37 AI news items

🔥 Top Stories

1. Codex Now Works on Windows, Enables PC Task Actions

OpenAI’s Codex now runs on Windows and can take actions on a PC, and it integrates with the ChatGPT mobile app so users can start, review, and steer tasks remotely. This expands cross-device automation and signals a deeper push to bring coding-agent capabilities into everyday workflows. Source-x

2. llama.cpp debuts official website llama.app with unified llama entrypoint

llama.cpp launches llama.app as the official site, offering a cross-platform installer and a unified ‘llama’ entrypoint to run/serve models and interface with third-party apps while preserving advanced tooling. The move should simplify local AI adoption, improve UX, and streamline model deployment across ecosystems. Source-x

3. CollectionLoRA Packs 50 Effects into One LoRA via Distillation

CollectionLoRA compresses 50 visual effects into a single LoRA using multi-teacher on-policy distillation to reduce deployment overhead and address parameter interference when stacking LoRAs with acceleration modules. This approach could significantly simplify multi-effect diffusion workflows and deployment at scale. Source-huggingface

📰 Featured

Open Source & Local AI

llama.cpp debuts official website llama.app with unified llama entrypoint — Provides a unified entrypoint and cross-platform installer to run/serve models and interface with third-party apps, aiming to streamline local AI UX. Source-x
Crawl4AI: Open-source LLM-friendly crawler updates to v0.8.6 — Security hotfix replaces litellm due to a PyPI supply chain issue; emphasizes anti-bot, Shadow DOM, and RAG-ready tooling for open-source pipelines. Source-github
CollectionLoRA Packs 50 Effects into One LoRA via Distillation — Compresses 50 effects into a single LoRA with multi-teacher distillation, reducing deployment overhead for edited diffusion models. Source-huggingface
MTP Boosts vLLM/llama.cpp on Gemma 4 & Qwen 3.6 by up to 3.34x — Benchmark suggests substantial inference speedups (up to 3.34x) on Gemma 4 31B and Qwen 3.6 27B with GGUF/FP8, though the testing scope is limited. Source-reddit
Liquid AI Releases LFM2.5-8B-A1B Edge Model — Edge-friendly LFM model with 128K context, multilingual improvements, and tool chaining, designed for entry-level laptops; available on Hugging Face. Source-reddit

LLMs, Tools & Platform

Codex Now Works on Windows, Enables PC Task Actions — Codex extends to Windows and the ChatGPT mobile app, enabling Windows-based task automation and cross-device workflows. Source-x
Claude Code 4.8 Builds 3 Apps in 30 Minutes — Demonstrates rapid code generation and prototyping to deliver three web apps quickly. Source-x
Google Faces All Major AI Fronts Across Models, Chips, Cloud — Opinion: Google competes across language models, semiconductors, cloud, advertising, autonomy, and devices, which underscores its breadth but raises valuation questions. Source-x
NVIDIA AI 2026 Collaboration Signals New Era for PCs — A forthcoming NVIDIA AI collaboration hints at transformative PC-era advances, with scant details but high expectations. Source-x

AI Safety & Policy

Claude Controversies and Outages: A Timeline of Issues — Satirical timeline catalogs outages, bans, disputes, and governance critiques around Claude and Anthropic, highlighting ongoing reliability and safety debates. Source-x

AI Industry & Hardware

NVIDIA AI 2026 Collaboration Signals New Era for PCs — See above. Source-x

⚡ Quick Bites

AgentDoG 1.5: Lightweight, Scalable AI Agent Safety Framework — Introduces a scalable safety framework for AI agents. Source-huggingface
minWM: Open-Source Framework for Real-Time Video World Models — Provides a real-time framework for video world models. Source-huggingface
YoCausal Probes Causality in Video World Models — Investigates causal structure in video world models. Source-huggingface
OpenMOSS Releases MOSS-TTS v1.5 and SoundEffect v2.0 — New TTS and sound effects in OpenMOSS toolkit. Source-github
Anthropic Publishes Public Repository for Claude Agent Skills — Opens Claude agent skills for broader use. Source-github
Qwen3.6-27B Quantization Benchmark Across Q8 to Q2 — Benchmark across multiple quantization formats. Source-reddit
Train LLMs with 8GB VRAM from Scratch — A path to training LLMs on low VRAM. Source-reddit
Gemma 4 26B A4B Shines as a Practical Local LLM — Gemma 4 showcased as a practical local option. Source-reddit
StepFun 3.7 Flash Unveils 196B MoE, 1.8B ViT — StepFun 3.7 reveals large MoE and ViT models. Source-reddit
Reachy Mini Gets Real-Time Voice Brain with 19 Tools — Reachy Mini adds voice-powered capabilities and tool access. Source-reddit
vLLM PR merges native HIP W4A16 kernel, boosts performance — Kernel merge enhances performance. Source-reddit
llama.cpp Adds F16 Mask for FA to Save VRAM — VRAM savings via F16 mask. Source-reddit
Pope’s Pontifex: AI lacks experiences, body, conscience — Philosophical critique of AI limitations. Source-x
Grok Build CLI Release Notes Update — Notable CLI release notes. Source-x
AI Frees Researchers to Pursue Crazier Ideas, Tao Says — Tao comments on AI enabling bolder research. Source-x
Koji: First AI tutor to get kids thinking — Koji aims to foster thinking in children. Source-x
Cursor Adds Auto-Review Mode for Safer Tool Calls — Auto-review mode to improve safety. Source-x
OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources — Proposes unified retrieval across sources. Source-huggingface
Anthropic releases summarized in a nutshell — Anthropic digest of Claude-related items. Source-x
Harness auto-builds Claude Code agent teams for domains — Automates domain-specific Claude Code teams. Source-github
Official Compound Engineering Plugin for Claude Code and Codex — Plugin ecosystem for Claude Code and Codex. Source-github
Gemma 4 31B Enhanced with MoE via Fine-Tune — Gemma 4 31B gains MoE enhancements through fine-tuning. Source-reddit
Opus 4.8 Released; CAD Task Tests Show Unexpected Results — CAD tests reveal surprises in Opus 4.8. Source-x
Thanking DeepSeek for open R&D, lowering AI costs — Acknowledgment of DeepSeek’s open R&D approach. Source-reddit
Looking for 4-H100-equivalent inference server under $150K — Search for affordable, capable inference server. Source-reddit
HTML becomes primary chat language for agents to draw diagrams — HTML gains prominence as the diagramming language in agent chats. Source-reddit
Dev Sneaks Data-Nuking Prompt into Code, Sparking Legal Scrutiny — A data-nuking prompt raises regulatory concerns. Source-reddit

Generated by AI News Agent | 2026-05-29