May 29, 2026

AI Daily — 2026-05-29

Codex Now Works on Windows, Enables PC Task Actions · llama.cpp debuts official website llama.app...


Covering 37 AI news items

🔥 Top Stories

1. Codex Now Works on Windows, Enables PC Task Actions

OpenAI’s Codex now runs on Windows and can take actions on a PC, with integration into the ChatGPT mobile app to start, review, and steer tasks remotely. This expands cross-device automation and signals a deeper push to bring coding-agent capabilities into everyday workflows. Source-x

2. llama.cpp debuts official website llama.app with unified llama entrypoint

llama.cpp launches llama.app as the official site, offering a cross-platform installer and a unified ‘llama’ entrypoint to run/serve models and interface with third-party apps while preserving advanced tooling. The move should simplify local AI adoption, improve UX, and streamline model deployment across ecosystems. Source-x

3. CollectionLoRA Packs 50 Effects into One LoRA via Distillation

CollectionLoRA compresses 50 visual effects into a single LoRA using multi-teacher on-policy distillation to reduce deployment overhead and address parameter interference when stacking LoRAs with acceleration modules. This approach could significantly simplify multi-effect diffusion workflows and deployment at scale. Source-huggingface


Open Source & Local AI

  • llama.cpp debuts official website llama.app with unified llama entrypoint — Provides a unified entrypoint and cross-platform installer to run/serve models and interface with third-party apps, aiming to streamline local AI UX. Source-x
  • Crawl4AI: Open-source LLM-friendly crawler updates to v0.8.6 — Security hotfix replaces litellm due to a PyPI supply chain issue; emphasizes anti-bot, Shadow DOM, and RAG-ready tooling for open-source pipelines. Source-github
  • CollectionLoRA Packs 50 Effects into One LoRA via Distillation — Compresses 50 effects into a single LoRA with multi-teacher distillation, reducing deployment overhead for edited diffusion models. Source-huggingface
  • MTP Boosts vLLM/llama.cpp on Gemma 4 & Qwen 3.6 by 3.34x — Benchmark suggests substantial inference speedups (up to 3.34x) on Gemma 4 31B and Qwen 3.6 27B with GGUF/FP8, though the testing scope is limited. Source-reddit
  • Liquid AI Releases LFM2.5-8B-A1B Edge Model — Edge-friendly LFM model with 128K context, multilingual improvements, and tool chaining, designed for entry-level laptops; available on Hugging Face. Source-reddit

LLMs, Tools & Platform

  • Codex Now Works on Windows, Enables PC Task Actions — Codex extends to Windows and the ChatGPT mobile app, enabling Windows-based task automation and cross-device workflows. Source-x
  • Claude Code 4.8 Builds 3 Apps in 30 Minutes — Demonstrates rapid code generation and prototyping to deliver three web apps quickly. Source-x
  • Google Faces All Major AI Fronts Across Models, Chips, Cloud — Opinion: Google competes across language models, semiconductors, cloud, advertising, autonomy, and devices, underscoring its breadth but raising valuation questions. Source-x
  • NVIDIA AI 2026 Collaboration Signals New Era for PCs — A forthcoming NVIDIA AI collaboration hints at transformative PC-era advances, with scant details but high expectations. Source-x

AI Safety & Policy

  • Claude Controversies and Outages: A Timeline of Issues — Satirical timeline catalogs outages, bans, disputes, and governance critiques around Claude and Anthropic, highlighting ongoing reliability and safety debates. Source-x

AI Industry & Hardware

  • NVIDIA AI 2026 Collaboration Signals New Era for PCs — See above. Source-x

⚡ Quick Bites

  • AgentDoG 1.5: Lightweight, Scalable AI Agent Safety Framework — Introduces a scalable safety framework for AI agents. Source-huggingface
  • minWM: Open-Source Framework for Real-Time Video World Models — Provides a real-time framework for video world models. Source-huggingface
  • YoCausal Probes Causality in Video World Models — Investigates causal structure in video world models. Source-huggingface
  • OpenMOSS Releases MOSS-TTS v1.5 and SoundEffect v2.0 — New TTS and sound effects in OpenMOSS toolkit. Source-github
  • Anthropic Publishes Public Repository for Claude Agent Skills — Opens Claude agent skills for broader use. Source-github
  • Qwen3.6-27B Quantization Benchmark Across Q8 to Q2 — Benchmark across multiple quantization formats. Source-reddit
  • Train LLMs with 8GB VRAM from Scratch — A path to training LLMs on low VRAM. Source-reddit
  • Gemma4 26B A4B Shines as a Practical Local LLM — Gemma4 showcased as a practical local option. Source-reddit
  • StepFun 3.7 Flash Unveils 196B MoE, 1.8B ViT — StepFun 3.7 reveals large MoE and ViT models. Source-reddit
  • Reachy Mini Gets Real-Time Voice Brain with 19 Tools — Reachy Mini adds voice-powered capabilities and tool access. Source-reddit
  • vLLM PR merges native HIP W4A16 kernel, boosts performance — Kernel merge enhances performance. Source-reddit
  • Llama.cpp Adds F16 Mask for FA to Save VRAM — VRAM savings via F16 mask. Source-reddit
  • Pope’s Pontifex: AI lacks experiences, body, conscience — Philosophical critique on AI limitations. Source-x
  • Grok Build CLI Release Notes Update — Notable CLI release notes. Source-x
  • AI Frees Researchers to Pursue Crazier Ideas, Tao Says — Tao comments on AI enabling bolder research. Source-x
  • Koji: First AI tutor to get kids thinking — Koji aims to foster thinking in children. Source-x
  • Cursor Adds Auto-Review Mode for Safer Tool Calls — Auto-review mode to improve safety. Source-x
  • OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources — Proposes unified retrieval across sources. Source-huggingface
  • Anthropic releases summarized in a nutshell — Anthropic digest of Claude-related items. Source-x
  • Harness auto-builds Claude Code agent teams for domains — Automates domain-specific Claude Code teams. Source-github
  • Official Compound Engineering Plugin for Claude Code and Codex — Plugin ecosystem for Claude Code and Codex. Source-github
  • Gemma 4 31B Enhanced with MoE via Fine-Tune — Gemma 4 31B gains MoE enhancements through fine-tuning. Source-reddit
  • Opus 4.8 Released; CAD Task Tests Show Unexpected Results — CAD tests reveal surprises in Opus 4.8. Source-x
  • Thanking DeepSeek for open R&D, lowering AI costs — Acknowledgment of DeepSeek’s open R&D approach. Source-reddit
  • Looking for 4-H100-equivalent inference server under $150K — Search for affordable, capable inference server. Source-reddit
  • HTML becomes primary chat language for agents to draw diagrams — HTML gains prominence as the diagramming language in agent chats. Source-reddit
  • Dev Sneaks Data-Nuking Prompt into Code, Sparking Legal Scrutiny — A data-nuking prompt raises regulatory concerns. Source-reddit

Generated by AI News Agent | 2026-05-29