AI Daily — 2026-05-29
Codex Now Works on Windows, Enables PC Task Actions · llama.cpp debuts official website llama.app...
Covering 37 AI news items
🔥 Top Stories
1. Codex Now Works on Windows, Enables PC Task Actions
OpenAI’s Codex now runs on Windows and can take actions on a PC. It also integrates with the ChatGPT mobile app, so users can start, review, and steer tasks remotely. The update expands cross-device automation and signals a deeper push to weave coding-agent capabilities into everyday workflows. Source-x
2. llama.cpp debuts official website llama.app with unified llama entrypoint
llama.cpp launches llama.app as the official site, offering a cross-platform installer and a unified ‘llama’ entrypoint to run/serve models and interface with third-party apps while preserving advanced tooling. The move should simplify local AI adoption, improve UX, and streamline model deployment across ecosystems. Source-x
3. CollectionLoRA Packs 50 Effects into One LoRA via Distillation
CollectionLoRA compresses 50 visual effects into a single LoRA using multi-teacher on-policy distillation. The goal is to reduce deployment overhead and to address the parameter interference that arises when LoRAs are stacked with acceleration modules. The approach could simplify multi-effect diffusion workflows and deployment at scale. Source-huggingface
📰 Featured
Open Source & Local AI
- llama.cpp debuts official website llama.app with unified llama entrypoint — Provides a unified entrypoint and cross-platform installer to run/serve models and interface with third-party apps, aiming to streamline local AI UX. Source-x
- Crawl4AI: Open-source LLM-friendly crawler updates to v0.8.6 — Security hotfix replaces litellm due to a PyPI supply chain issue; emphasizes anti-bot, Shadow DOM, and RAG-ready tooling for open-source pipelines. Source-github
- CollectionLoRA Packs 50 Effects into One LoRA via Distillation — Compresses 50 effects into a single LoRA with multi-teacher distillation, reducing deployment overhead for multi-effect diffusion workflows. Source-huggingface
- MTP Boosts vLLM/llama.cpp on Gemma 4 & Qwen 3.6 by 3.34x — A benchmark reports inference speedups of up to 3.34x on Gemma 4 31B and Qwen 3.6 27B with GGUF/FP8; the testing scope is limited, so the results may not generalize. Source-reddit
- Liquid AI Releases LFM2.5-8B-A1B Edge Model — Edge-oriented model with a 128K context window, multilingual improvements, and tool chaining, designed for entry-level laptops; available on Hugging Face. Source-reddit
LLMs, Tools & Platform
- Codex Now Works on Windows, Enables PC Task Actions — Codex extends to Windows and the ChatGPT mobile app, enabling task automation on Windows PCs and cross-device workflows. Source-x
- Claude Code 4.8 Builds 3 Apps in 30 Minutes — A demo of rapid code generation and prototyping that produces three web apps in about 30 minutes. Source-x
- Google Faces All Major AI Fronts Across Models, Chips, Cloud — Opinion: Google competes across language models, semiconductors, cloud, advertising, autonomy, and devices, which underscores its breadth but raises valuation questions. Source-x
- NVIDIA AI 2026 Collaboration Signals New Era for PCs — A forthcoming NVIDIA AI collaboration hints at major advances for PCs; details are scant, but expectations are high. Source-x
AI Safety & Policy
- Claude Controversies and Outages: A Timeline of Issues — Satirical timeline catalogs outages, bans, disputes, and governance critiques around Claude and Anthropic, highlighting ongoing reliability and safety debates. Source-x
AI Industry & Hardware
- NVIDIA AI 2026 Collaboration Signals New Era for PCs — See above. Source-x
⚡ Quick Bites
- AgentDoG 1.5: Lightweight, Scalable AI Agent Safety Framework — Introduces a scalable safety framework for AI agents. Source-huggingface
- minWM: Open-Source Framework for Real-Time Video World Models — Provides a real-time framework for video world models. Source-huggingface
- YoCausal Probes Causality in Video World Models — Investigates causal structure in video world models. Source-huggingface
- OpenMOSS Releases MOSS-TTS v1.5 and SoundEffect v2.0 — New TTS and sound-effect releases in the OpenMOSS toolkit. Source-github
- Anthropic Publishes Public Repository for Claude Agent Skills — Opens Claude agent skills for broader use. Source-github
- Qwen3.6-27B Quantization Benchmark Across Q8 to Q2 — Benchmark across multiple quantization formats. Source-reddit
- Train LLMs with 8GB VRAM from Scratch — A path to training LLMs on low VRAM. Source-reddit
- Gemma4 26B A4B Shines as a Practical Local LLM — Gemma4 showcased as a practical local option. Source-reddit
- StepFun 3.7 Flash Unveils 196B MoE, 1.8B ViT — StepFun 3.7 reveals large MoE and ViT models. Source-reddit
- Reachy Mini Gets Real-Time Voice Brain with 19 Tools — Reachy Mini adds voice-powered capabilities and tool access. Source-reddit
- vLLM PR merges native HIP W4A16 kernel, boosts performance — Kernel merge enhances performance. Source-reddit
- Llama.cpp Adds F16 Mask for FA to Save VRAM — VRAM savings via F16 mask. Source-reddit
- Pope’s Pontifex: AI lacks experiences, body, conscience — Philosophical critique of AI’s limitations. Source-x
- Grok Build CLI Release Notes Update — Notable CLI release notes. Source-x
- AI Frees Researchers to Pursue Crazier Ideas, Tao Says — Tao comments on AI enabling bolder research. Source-x
- Koji: First AI tutor to get kids thinking — Koji aims to foster thinking in children. Source-x
- Cursor Adds Auto-Review Mode for Safer Tool Calls — Auto-review mode to improve safety. Source-x
- OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources — Proposes unified retrieval across sources. Source-huggingface
- Anthropic releases summarized in a nutshell — A brief digest of Claude-related items from Anthropic. Source-x
- Harness auto-builds Claude Code agent teams for domains — Automates domain-specific Claude Code teams. Source-github
- Official Compound Engineering Plugin for Claude Code and Codex — Plugin ecosystem for Claude Code and Codex. Source-github
- Gemma 4 31B Enhanced with MoE via Fine-Tune — Gemma 4 31B gains MoE enhancements through fine-tuning. Source-reddit
- Opus 4.8 Released; CAD Task Tests Show Unexpected Results — CAD tests reveal surprises in Opus 4.8. Source-x
- Thanking DeepSeek for open R&D, lowering AI costs — Acknowledgment of DeepSeek’s open R&D approach. Source-reddit
- Looking for 4-H100-equivalent inference server under $150K — Search for affordable, capable inference server. Source-reddit
- HTML becomes primary chat language for agents to draw diagrams — HTML gains prominence as the diagramming language in agent chats. Source-reddit
- Dev Sneaks Data-Nuking Prompt into Code, Sparking Legal Scrutiny — A data-nuking prompt raises regulatory concerns. Source-reddit
Generated by AI News Agent | 2026-05-29