AI Daily — 2026-04-30
OpenAI rolls out GPT-5.5-Cyber to critical cyber defenders · Codex Enables Role-Based, App-Linked...
Covering 29 AI news items
🔥 Top Stories
1. OpenAI rolls out GPT-5.5-Cyber to critical cyber defenders
OpenAI is starting the rollout of GPT-5.5-Cyber, a frontier cybersecurity model, to critical cyber defenders in the coming days. The company will collaborate with the broader ecosystem and government to establish trusted access for cyber operations, aiming to rapidly secure companies and infrastructure. Source-twitter
2. Codex Enables Role-Based, App-Linked Prompts for Everyday Work
Codex updates enable role-based workflows by letting users choose a role, link the apps they use daily, and try suggested prompts. It supports everyday tasks—from research and planning to docs, slides, and spreadsheets—through integrated app connections. The feature emphasizes streamlined productivity with Codex as a central assistant. Source-twitter
3. OpenAI launches Advanced Account Security for ChatGPT
OpenAI is rolling out Advanced Account Security as an opt-in feature for ChatGPT accounts, targeting users at higher risk of digital attacks. The update adds phishing-resistant sign-in, stronger account recovery, and enhanced protections to safeguard sensitive data and prevent account takeover. Source-twitter
📰 Featured
LLM
- Where Goblin Outputs Come From in GPT-5, OpenAI Explains — OpenAI explains the phenomenon of ‘goblins’ in AI outputs, detailing how personality-driven quirks spread across models like GPT-5. The article traces the timeline, root causes, and proposed fixes to curb these behaviors. It frames these quirks as issues to understand and address in future models. Source-twitter
- GLM-5V-Turbo Advances Multimodal Foundation Models — GLM-5V-Turbo is presented as a step toward native foundation models for multimodal agents. The approach treats multimodal perception as a core component of reasoning, planning, tool use, and execution, enabling agents to operate over images, videos, webpages, documents, and GUIs. The work is published on HuggingFace as part of ongoing efforts to integrate perception with reasoning in foundation models. Source-huggingface
- April 2026: Local LLMs Deliver Best Month Yet — An /r/LocalLLaMA post describes April 2026 as an exceptional month for local LLMs and invites discussion of underrated models. It notes that MiniMax-M2.7 was left out of the graph because its license changed from MIT to Non-Commercial, and credits user pmttyji for assembling the model list and graph. Source-reddit
- Qwen-Scope Unveils Official Sparse Autoencoders for Qwen 3.5 — Qwen Team released Qwen-Scope, a collection of Sparse Autoencoders for the Qwen 3.5 family (2B–35B MoE). It maps residual-stream features into a dictionary of internal concepts, enabling observation of high-level ideas like ‘legal talk’ or ‘Python code’ instead of raw numbers. The toolkit supports precise ‘Surgical Abliteration’ to suppress specific concepts and ‘Feature Steering’ to force-activate concepts during generation, but cautions against using it to remove safety filters, despite Apache 2.0 licensing. Source-reddit
- Mistral 3.5 128B MLX 4-bit: broken, patch required — A user reports converting Mistral Medium 3.5 128B to MLX 4-bit and running into several issues: the MLX path currently lacks support for the Eagle speculative decoding model, a sanitizer bug caused 438 parameters in the vision components to be skipped, and there are performance and sampling caveats. A local patch is described, with details in the Hugging Face readme. The user does not recommend downloading or using the conversion for now. Source-reddit
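For readers new to sparse autoencoders (SAEs), the two operations named in the Qwen-Scope item above can be sketched in a few lines. This is a toy illustration in plain Python, not Qwen-Scope's actual API: the `TinySAE` class, the `set_feature` helper, and the 2-dimensional weights are all hypothetical, and the editing rule shown (shifting the residual vector along a feature's decoder direction) is one common way to implement such edits, assumed here rather than confirmed by the source.

```python
def matvec(matrix, vector):
    """Multiply a row-major matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


class TinySAE:
    """Hypothetical toy sparse autoencoder over a residual-stream vector."""

    def __init__(self, w_enc, b_enc, w_dec):
        self.w_enc = w_enc  # one encoder row per feature
        self.b_enc = b_enc  # one encoder bias per feature
        self.w_dec = w_dec  # one decoder direction per feature

    def encode(self, x):
        """Map a residual vector to non-negative feature activations (ReLU)."""
        pre = [p + b for p, b in zip(matvec(self.w_enc, x), self.b_enc)]
        return [max(0.0, p) for p in pre]


def set_feature(sae, x, index, value):
    """Shift x along one decoder direction so the feature moves to `value`.

    value = 0.0 suppresses the concept (ablation);
    value > 0.0 force-activates it (steering).
    """
    delta = value - sae.encode(x)[index]
    return [xi + delta * di for xi, di in zip(x, sae.w_dec[index])]


# Toy dictionary: feature 0 and feature 1 are simply the two axes.
sae = TinySAE(
    w_enc=[[1.0, 0.0], [0.0, 1.0]],
    b_enc=[0.0, 0.0],
    w_dec=[[1.0, 0.0], [0.0, 1.0]],
)
residual = [2.0, 0.5]

ablated = set_feature(sae, residual, index=0, value=0.0)  # suppress feature 0
steered = set_feature(sae, residual, index=1, value=3.0)  # force feature 1 on
print(sae.encode(residual), ablated, steered)
```

In a real model the dictionary has thousands of learned, non-orthogonal directions, so editing one feature can also move others; the toy axes here avoid that on purpose.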
Multimodal
- GPT Image 2 Prompt Goes Viral with Clumsy Redraw — A prompt for GPT Image 2 asking to redraw an image in a deliberately clumsy, MS Paint-like style has gone viral on X/Twitter. The instructions describe a deliberately low-quality, pixelated output that is vaguely similar but off in a humorous way. The viral post originates from arrakis_ai on X. Source-twitter
- RADIO-ViPE Enables Open-Vocabulary Semantic SLAM in Monocular Video — RADIO-ViPE is an online semantic SLAM system that achieves geometry-aware grounding for open-vocabulary language queries. It operates directly on raw monocular RGB video without requiring camera intrinsics, depth sensors, or pose initialization, thanks to tight multi-modal fusion. The approach links natural language prompts to localized 3D regions and objects in dynamic environments. Source-huggingface
- DeepSeek Unveils Thinking-with-Visual-Primitives Multimodal Framework — DeepSeek, in collaboration with Peking University and Tsinghua University, released the paper Thinking with Visual Primitives and its open-source repository for a new multimodal reasoning framework. The framework elevates spatial tokens—coordinate points and bounding boxes—into minimal units of thought that are interleaved with the model’s chain-of-thought, enabling pointing to image locations during reasoning. The project is available on GitHub at deepseek-ai/Thinking-with-Visual-Primitives. Source-reddit
Industry
- Alphabet Reinvented by AI-First Strategy, TIME Says — The piece notes that in 2016 Sundar Pichai declared Google an AI-first company and began funding projects such as custom chips, Cloud, YouTube, and deep AI research beyond search. TIME’s TIME100 Companies feature argues these AI bets have paid off, driving Alphabet’s reinvention and influence. Source-twitter
AI Safety
- Shai-Hulud Malware Found in PyTorch Lightning — Semgrep reports a malicious dependency within the PyTorch Lightning AI training library. The Shai-Hulud-themed malware highlights security risks in open-source ML tooling and dependency chains. Developers are advised to audit dependencies and update affected packages to mitigate potential abuse during AI training. Source-hackernews
Open Source
- jcode: Next-Gen Coding Agent Harness — An open-source project from 1jehuang, jcode is a next-generation coding agent harness designed for multi-session workflows, high customizability, and performance efficiency. It offers cross-platform installation (macOS/Linux via a script; Windows and other setups via a detailed install guide) and includes RAM and boot-time benchmarks against Codex CLI, OpenCode, and GitHub Copilot CLI to illustrate resource efficiency. Source-github
- Llama-swap adds matrix grouping to run models concurrently — Llama-swap introduces a new matrix feature that lets users group models for concurrent execution. Previously a model could belong to only one group; now you can define custom groups (e.g., big models, STT + larger model, RAG usage) and the solver unloads models based on cost. The config example demonstrates a solver-based approach and notes that configurations must use either matrix or legacy groups. Source-reddit
Hardware
- Qwen3.6-27B on RTX 3090 Reaches 218K Context, Stable Tool Calls — On a single RTX 3090, Qwen3.6-27B reached ~218K context at ~50–66 TPS for text and code, and ~198K+ context for vision at ~51–68 TPS. Tool calls producing outputs around 25K tokens now complete without OOM, signaling improved stability. The gains came after the PN12 patch in the genesis-vllm patch set had drifted from its anchor; once that anchor drift was fixed (PR #13), higher-context configurations became usable. Source-reddit
- 32x AMD MI50 Cluster Hits 9.7/264 tok/s with Kimi-K2.6 — A two-node, 32-GPU AMD MI50 setup runs Kimi-K2.6 on a vLLM fork, delivering 9.7 tokens/s for output and 264 tokens/s for input. Power draw ranges from ~640W idle to ~4800W peak. The author notes the setup is niche and not worthwhile unless energy is effectively free, and provides GitHub links and the exact launch commands. Source-reddit
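To put the MI50 author's "only if energy is effectively free" remark in numbers, here is a rough back-of-the-envelope sketch. The throughput and power figures are the ones quoted in the item above; the assumption that the cluster draws its ~4800W peak for the whole generation, and the electricity price, are illustrative and not from the post.

```python
# Figures quoted in the item above.
PEAK_POWER_W = 4800.0      # ~4800 W peak draw
OUTPUT_TOK_PER_S = 9.7     # output (generation) throughput
INPUT_TOK_PER_S = 264.0    # input (prefill) throughput

# Hypothetical electricity price, chosen only for illustration.
ASSUMED_PRICE_PER_KWH = 0.15

JOULES_PER_KWH = 3_600_000.0


def joules_per_token(power_w, tok_per_s):
    """Energy per token, assuming constant power draw at the given rate."""
    return power_w / tok_per_s


def kwh_per_million_tokens(power_w, tok_per_s):
    """Energy in kWh needed to process one million tokens."""
    return joules_per_token(power_w, tok_per_s) * 1_000_000 / JOULES_PER_KWH


def cost_per_million_tokens(power_w, tok_per_s, price_per_kwh):
    """Electricity cost for one million tokens at the given price."""
    return kwh_per_million_tokens(power_w, tok_per_s) * price_per_kwh


for label, rate in (("output", OUTPUT_TOK_PER_S), ("input", INPUT_TOK_PER_S)):
    kwh = kwh_per_million_tokens(PEAK_POWER_W, rate)
    cost = cost_per_million_tokens(PEAK_POWER_W, rate, ASSUMED_PRICE_PER_KWH)
    print(f"{label}: {kwh:.1f} kWh per 1M tokens, about {cost:.2f} at the assumed price")
```

Under these assumptions, output tokens cost roughly 27 times more energy than input tokens, which is simply the ratio of the two quoted throughputs.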
⚡ Quick Bites
- Claude Code blocks OpenClaw mentions in JSON, bills extra — An anecdotal post claims Claude Code will refuse a request or charge extra if a commit mentioning OpenClaw appears in a JSON blob. The tester used an empty repository and called Claude Code directly, calling the result ‘Insanity.’ The note highlights potential safety or policy triggers around OpenClaw references in code. Source-twitter
- AI automates tasks, not jobs; radiologists’ pay climbs — AI tends to automate individual tasks rather than whole jobs. When a task becomes cheaper, demand for the related job often increases. End-to-end automation of jobs remains elusive due to limited autonomy and supervision needs, with radiologists cited as an example of rising pay despite AI advances. Source-twitter
- Mira: AI Lives on Your Face to Capture Conversations — A tweet promotes Mira, an AI product described as living on the user’s face and capable of capturing every conversation to create a highly personalized AI experience. The tweet includes a call to order now. Source-twitter
- Exploratory Sampling Enables Semantic Diversity in Large Language Models — Researchers introduce Exploratory Sampling (ESamp), a decoding method that pushes for semantic diversity in large language models beyond surface-level word variations. ESamp leverages the idea that neural networks generalize better on inputs similar to those encountered before, explicitly encouraging diverse, semantically meaningful outputs during generation. Source-huggingface
- Gen Z’s AI Use Grows, Hatred Grows — An article in The Verge examines how rising AI adoption among Gen Z coincides with growing negative sentiment toward AI. It discusses possible causes such as privacy concerns and fear of disruption, and notes the paradox that more young people use AI while disliking it more. The piece situates these attitudes within broader debates about AI’s impact on society. Source-hackernews
- Warp open-sources agentic development environment with GPT-powered workflows — Warp is an agentic development environment born in the terminal and now open-sourced under a new Warp repository sponsored by OpenAI. The project features GPT-powered agentic management workflows and supports built-in coding agents or third-party CLI agents such as Claude Code, Codex, and Gemini CLI. Installation and UI components are MIT-licensed, with a web dashboard and GitHub-backed collaboration. Source-github
- Zig’s rationale for anti-AI contribution policy — Zig explains its rationale for an anti-AI contribution policy. The article outlines the governance approach and how the policy is intended to shape contributions and maintain project standards. It frames the move as part of Zig’s broader community and code quality goals. Source-hackernews
- Mike: Open-Source Legal AI — The post highlights an open-source legal AI project named Mike, hosted at mikeoss.com. It has generated substantial discussion on Hacker News, with 189 upvotes and 92 comments. Source-hackernews
- Benchmark: Claude Code Caveman Plugin vs Be Brief — A blogger benchmarks Claude Code’s Caveman plugin against a brief-prompt approach. The comparison examines verbosity, accuracy, and developer experience, offering practical takeaways for choosing between plugin-assisted and prompt-based coding workflows. Source-hackernews
- Hipfire runs in Docker on RX 7900 XTX with llamacpp — A Reddit user dockerized hipfire to run alongside an existing llamacpp stack, addressing long-context failures on Qwen3.6 27B. They run Qwen3.6 27B MQ4 on an RX 7900 XTX. Logs show the TriAttention sidecar and the DFlash draft loading, with ~40 tokens/s AR, though DFlash engagement is not yet confirmed. The CLI is a Bun/TypeScript HTTP server that launches the engine as a subprocess, and there’s interest in sharing a Dockerfile and compose setup. Source-reddit
- CUDA+ROCm Run GGML Backend Simultaneously — A Reddit user demonstrates running Minimax 2.7 Q4 models using both CUDA and ROCm on Windows, bypassing Vulkan. They report full-layer offload (63/63) with sizable CUDA and ROCm buffers, highlighting the prefill advantage. The setup enables GGML_BACKEND_DL and toggles GGML_HIP and GGML_CUDA via CMake. Source-reddit
- New Stealth Model Owl Alpha Emerges, Rumored Chinese-Origin — A Reddit post discusses a stealth model named Owl Alpha, speculating it may be of Chinese origin after noting it refuses to answer certain prompts. The post, submitted by /u/Kingwolf4 on LocalLLaMA, also claims the model has a 1 million context. The chatter highlights uncertainty around the model’s provenance and capabilities. Source-reddit
Generated by AI News Agent | 2026-04-30