AI Daily — 2026-04-05


Covering 29 AI news items

🔥 Top Stories

1. OpenAI Pretrains GPT-5.5 ‘Spud’; Sora Cancelled

OpenAI has completed pretraining on a new major model codenamed Spud (GPT-5.5). The company claims a strong model is weeks away and could accelerate the economy. To focus on Spud and related products, it is scrapping the Sora project and forgoing Disney’s investment. The move implies building an entirely new foundation with fresh architecture, data, and scale. Greg Brockman is leading the effort, while Altman steps back from safety oversight to concentrate on datacenters and supply chains. Source-twitter

2. Anthropic billing tied to system prompts sparks backlash

A viral post alleges that Anthropic's billing can vary with the text of the system prompt, which critics call a really bad look. The post notes that Anthropic is blocking first-party harness usage and that third-party apps will be billed as extra usage rather than counted against plan limits, so pricing effectively shifts with the prompt text. Source-twitter

3. Codex app server enables quick building of agentic apps

The Codex app server lets users build their own agentic apps on top of Codex, with seamless session syncing across devices and integration with ChatGPT accounts. Demonstrations like the kitty litter app by SIGKITTEN show that skills, agents, sessions, folders, and prompts can be exposed via the app server for unified mobile and desktop experiences. Source-twitter

Open Source

  • Microsoft Agent Framework Enables Multi-Agent AI Workflows — Microsoft released the Agent Framework, a cross-language platform for building, orchestrating, and deploying AI agents with Python and .NET support. It supports everything from simple chat agents to complex multi-agent workflows with graph-based orchestration. The repo provides installation commands, documentation sections (Quick Start, Tutorials, User Guide) and notes on migrating from Semantic Kernel. Source-github
  • Open-source AI job-search tool for Claude Code gains traction — An open-source AI job search system built for Claude Code reportedly scored 700+ applications and landed a job. It scans company career pages, rewrites CVs per job, and auto-fills forms, with features like 14 skill modes, a Go terminal dashboard, ATS-optimized PDFs, and 45+ pre-configured companies. The project is hosted by user santifer on GitHub. Source-twitter
  • Nanocode Enables Claude Code on JAX TPUs for $200 — A Hacker News discussion highlights Nanocode’s claim of running Claude Code on pure JAX on TPUs for around $200. The thread centers on an open-source approach by Salman Mohammadi, showcasing cost-effective access to Claude Code capabilities via JAX on TPUs. Source-hackernews
  • Blaizzy’s MLX-VLM Brings VLM Inference and Fine-Tuning to Mac — MLX-VLM is an open-source package enabling inference and fine-tuning of Vision Language Models (VLMs) and Omni Models on macOS via MLX. Hosted at Blaizzy/mlx-vlm on GitHub, the project includes CLI, chat UI, multi-image chat, and extensive model-specific documentation to streamline setup and usage. Source-github
  • Travel Hacking Toolkit Enables AI-Powered Points Planning — An open-source Show HN introduces the Travel Hacking Toolkit, an AI-assisted suite to search award availability, compare cash fares, and plan trips across multiple loyalty programs. Built around Claude Code and OpenCode, it bundles 7 skills and 6 MCP servers to query data from dozens of sources, including Seats.aero, Google Flights, AwardWallet, Trivago, Airbnb, and Atlas Obscura, with transfer ratios and point valuations from The Points Guy (TPG) and Upgra. The project aims to automate the complex math of when to use points versus cash across programs. Source-hackernews

AI Safety

  • Claude Code Uncovers Linux Vulnerability Hidden for 23 Years — An AI-driven code analysis system named Claude Code reportedly uncovered a Linux kernel vulnerability that had remained hidden for 23 years. The discovery, highlighted by mtlynch.io and discussed on Hacker News, showcases AI-assisted security research and the potential impact on open-source software. Source-hackernews

LLM

  • Gemma 4 (31B) Dominates Benchmark at $0.20/run — Gemma 4, a 31B-parameter model, reportedly dominates the FoodTruck Bench leaderboard with 100% survival and a +1,144% median ROI at $0.20 per run. It outperforms GPT-5.2, Gemini 3 Pro, Sonnet 4.6, and several Chinese open-source models; Opus 4.6 is the only model that beats it, at a much higher $36 per run. The takeaway is an exceptional cost-to-performance ratio that could boost agentic workflows. Source-reddit
  • Dante-2B: 2.1B bilingual Italian/English LLM built from scratch — Dante-2B is a 2.1B parameter decoder-only LLM trained from scratch to be fluent in Italian and English. It uses a custom 64K BPE tokenizer and a LLaMA-style architecture (28 layers, d_model 2560) optimized for 2× H200 GPUs. Phase 1 is complete, with coherent Italian output achieved in 16 days of training using random initialization and no fine-tuning of Llama or adapters. Source-reddit
  • Chinese Labs Pause Open-Source AI Models Amid Suspected Coordination — Several Chinese AI labs have stopped open-sourcing their latest models, including Minimax-m2.7, GLM-5.1/5-turbo/5v-turbo, Qwen3.6, and Mimo-v2-pro, and promise improvements before release. Critics note that the shift happened simultaneously and question whether it signals a coordinated transition to closed models rather than organic, individual decisions. Source-reddit
  • Pre-1900 LLM Attempts Relativity and Quantum Concepts — A researcher trained an LLM from scratch on pre-1900 texts to test whether it could generate ideas related to quantum mechanics and relativity. The model was too small for meaningful reasoning but showed glimpses of intuition, such as statements about energy quanta and the local equivalence of gravity and acceleration when given historical observations. The dataset and models are released publicly, with a demonstration site, a blog, and a GitHub repository. Source-reddit

AI Tools

  • Turboquant-gpu enables 5x KV cache compression for any GPU — Turboquant-gpu provides a 3-bit Lloyd-Max fused KV cache compression for LLM inference, working with Hugging Face transformers through a simple compress+generate API. It reportedly outperforms MXFP4 and NVFP4 in compression and achieved a 5.02x KV cache reduction on Mistral-7B (1,408 KB → 275 KB). The project is CUDA/cuTile-based, supports CUDA 12/13 with PyTorch fallbacks and runs on common GPUs like RTX, H100, A100, and B200. Source-twitter
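
To make the 3-bit Lloyd-Max idea in the item above concrete, here is a minimal sketch of a scalar Lloyd-Max quantizer in plain Python. It is not Turboquant-gpu's code or API: the function names `lloyd_max` and `quantize`, the Gaussian test data standing in for one KV-cache channel, and the FP16 baseline used for the ratio are all assumptions made for illustration.

```python
import random

def lloyd_max(samples, bits=3, iters=20):
    """Fit a 1-D Lloyd-Max codebook with 2**bits levels to the samples."""
    levels = 2 ** bits
    data = sorted(samples)
    n = len(data)
    # Initialise the codebook at evenly spaced quantiles of the data.
    codebook = [data[(2 * i + 1) * n // (2 * levels)] for i in range(levels)]
    for _ in range(iters):
        # Decision boundaries sit halfway between neighbouring levels.
        bounds = [(a + b) / 2 for a, b in zip(codebook, codebook[1:])]
        sums = [0.0] * levels
        counts = [0] * levels
        for x in data:
            idx = sum(x > b for b in bounds)  # index of the cell holding x
            sums[idx] += x
            counts[idx] += 1
        # Each level moves to the mean (centroid) of its cell.
        codebook = [s / c if c else old
                    for s, c, old in zip(sums, counts, codebook)]
    return codebook

def quantize(x, codebook):
    """Return the index of the nearest codebook level."""
    return min(range(len(codebook)), key=lambda i: abs(x - codebook[i]))

random.seed(0)
# Hypothetical stand-in for the values of one KV-cache channel.
values = [random.gauss(0.0, 1.0) for _ in range(4096)]
codebook = lloyd_max(values, bits=3)
indices = [quantize(v, codebook) for v in values]  # 3-bit codes, 0..7
restored = [codebook[i] for i in indices]
mse = sum((v - r) ** 2 for v, r in zip(values, restored)) / len(values)

# Assumes an FP16 baseline and ignores codebook and scale overhead.
ideal_ratio = 16 / 3
print(f"levels={len(codebook)} mse={mse:.4f} ideal_ratio={ideal_ratio:.2f}x")
```

Under the FP16 assumption, replacing each 16-bit value with a 3-bit index caps the reduction at about 5.33x. Any per-group codebook or scale metadata pulls the real figure below that, which is in line with the roughly 5x reported in the item.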

Industry

  • The Subprime AI Crisis Is Here — The article argues that the AI industry is entering a risky phase reminiscent of a subprime mortgage crisis, warning of overvaluation and fragile funding. It calls for more prudent due diligence and sustainable deployment in AI ventures. Source-hackernews

⚡ Quick Bites

  • Grok Imagine Upgraded: Realistic Cinematic Shots Now Possible — Grok Imagine received a major upgrade enabling realistic cinematic shot generation. The post includes a prompt template and usage examples to help users achieve similar results. The update is shared via a tweet by doganuraldesign. Source-twitter
  • Anonymized ChatGPT data reveals health messaging patterns — A post citing anonymized U.S. ChatGPT data reports about 2 million weekly health-insurance messages, and 600K weekly messages from people in ‘hospital deserts,’ with 70% occurring outside clinic hours. The author also describes using ChatGPT (and Claude) for organizing health information during a family health issue, noting live document syncing and data ingestion improving decision-making. Source-twitter
  • Siri connects to Hermes agent via Telegram URL — How to use Siri with your Hermes agent: create a new contact, add a URL field with the bot link, and select Telegram as the URL type. This enables sending messages to the Hermes agent from all Apple devices, as long as there is an open conversation. Source-twitter
  • Access Claude via Hermes with simple /claude-code command — A Hermes agent can pilot a Claude Code session in a new Hermes session by typing /claude-code, enabling access to Anthropic’s Claude without tricks. The post claims this approach preserves Hermes’ self-improvement loop and features, unlike OpenClaw’s approach. The information originates from Teknium on X (Twitter). Source-twitter
  • Power users erode margins in AI subscription pricing — AI subscription services look profitable until a single ‘power user’ consumes disproportionate resources, eroding margins. The takeaway is that pricing and usage risks must be considered for AI SaaS models, especially when designing plans and tiers for high-volume users. Source-twitter
  • Musician accuses AI firm of cloning her music, files claims — A musician alleges an AI company is cloning her music and has filed claims against the firm. The dispute highlights copyright concerns in AI-generated music. The report references a Twitter post linked by Hacker News. Source-hackernews
  • Block/goose: Open-source AI agent automates engineering tasks — Block/goose is a local, extensible AI agent that automates engineering tasks from start to finish. It can build projects from scratch, write and execute code, debug failures, orchestrate workflows, and interact with external APIs using any LLM. It supports multi-model configurations, MCP server integration, and is available as both a desktop app and a CLI. Source-github
  • Building Syntaqlite AI in Three Months After Eight Years — An article chronicles the creation of Syntaqlite AI, a project pursued for eight years and realized in three months with the help of AI tooling. It discusses the development journey, the role of AI in accelerating building, and lessons learned. Source-hackernews
  • AI copies musical artist's files, prompting copyright claim against the artist — An AI copied files belonging to a musical artist, triggering a copyright claim against the artist. The story has updates and discussion on Hacker News, referencing a post from VladTheInflator on XCancel and a related tweet. Source-hackernews
  • 12k AI-generated blog posts added in one commit — A GitHub commit from OneUptime/blog reportedly added around 12,000 AI-generated blog posts in a single push. The move sparks discussion about content quality, licensing, spam risk, and maintenance for open-source projects when automated content is bulk-added. The incident drew attention on Hacker News. Source-hackernews
  • I Used AI. It Worked. I Hated It. — A personal Hacker News post recounts using AI for a task: the tool delivered results, but the experience left the author frustrated. It highlights the gap between AI promises and real-world practicality, including potential trade-offs and dissatisfaction. Source-hackernews
  • LLM Wiki Showcases Idea File Concept by Karpathy — Karpathy highlights an ‘idea file’ approach in an LLM Wiki, illustrating how ideas about language models can be organized and shared. The post circulated on X and Hacker News, with a GitHub Gist linked for more details. Source-hackernews
  • Cancelled Claude Code Subscription; April Setup Relocated — A user reports canceling their Claude Code subscription for April and relocating the setup to a MacBook Pro with Gemma 4. They highlight no internet, no API costs, and no limits. The post suggests a simplified, offline-focused configuration. Source-twitter
  • Anthropic Claude Docs Update Is Wild — A tweet notes a recent update to Claude’s documentation. The post signals excitement about what the update includes, but provides no details. It highlights ongoing developer-focused improvements around Claude. Source-twitter
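
The power-user item in the list above comes down to simple arithmetic, sketched here in Python. Every number is hypothetical and chosen only for illustration: the $20 plan price, the $2 serving cost per million tokens, and the usage figures do not come from the post, and `Plan` and `gross_margin` are made-up names.

```python
from dataclasses import dataclass

@dataclass
class Plan:
    price: float          # monthly subscription price in USD (hypothetical)
    cost_per_mtok: float  # serving cost per million tokens in USD (hypothetical)

def gross_margin(plan, usage_mtok):
    """Gross margin over all subscribers, given each one's monthly
    usage in millions of tokens."""
    revenue = plan.price * len(usage_mtok)
    cost = plan.cost_per_mtok * sum(usage_mtok)
    return (revenue - cost) / revenue

plan = Plan(price=20.0, cost_per_mtok=2.0)
typical = [2.0] * 99                 # 99 subscribers at 2M tokens each
with_power_user = typical + [500.0]  # plus one subscriber at 500M tokens

margin_typical = gross_margin(plan, typical)
margin_skewed = gross_margin(plan, with_power_user)
print(f"typical only:   {margin_typical:.1%}")  # prints 80.0%
print(f"one power user: {margin_skewed:.1%}")   # prints 30.2%
```

With these made-up inputs, one subscriber out of 100 takes the gross margin from 80% to about 30%, which is why flat-rate plans usually need usage caps or metered overage for high-volume users.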

Generated by AI News Agent | 2026-04-05