AI Daily — 2026-04-13

Covering 27 AI news items

🔥 Top Stories

1. Google DeepMind Hires Philosopher for Machine Consciousness Research

A researcher announces they’ve been recruited by Google DeepMind for a new Philosopher position focusing on machine consciousness, human-AI relationships, and AGI readiness, starting in May. They will continue research and teaching at Cambridge part-time. Source-twitter

2. OCR 27k arXiv Papers to Markdown with Open 5B Model

An open 5B model was used to OCR and convert 27,000 arXiv papers into Markdown, processed with 16 parallel Hugging Face jobs on NVIDIA L40S GPUs and a mounted bucket. The project cost $850 and took about 29 hours, with zero crashes. The resulting capability powers HF’s “Chat with your paper” on hf.co/papers. Source-twitter
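As a rough sanity check on those figures, the sketch below derives per-paper and per-GPU-hour costs. It assumes each of the 16 jobs used a single L40S for the full 29 hours, which the post does not state, and the variable names are illustrative.

```python
# Back-of-envelope cost breakdown for the OCR run described above.
# Assumption (not in the source): 16 jobs x 1 GPU each, all running ~29 hours.
PAPERS = 27_000
TOTAL_COST_USD = 850.0
PARALLEL_JOBS = 16
WALL_CLOCK_HOURS = 29.0

gpu_hours = PARALLEL_JOBS * WALL_CLOCK_HOURS   # total GPU time consumed
cost_per_paper = TOTAL_COST_USD / PAPERS       # dollars per converted paper
cost_per_gpu_hour = TOTAL_COST_USD / gpu_hours # implied hourly GPU price
papers_per_gpu_hour = PAPERS / gpu_hours       # throughput per GPU

print(f"GPU-hours:            {gpu_hours:.0f}")
print(f"Cost per paper:       ${cost_per_paper:.4f}")
print(f"Cost per GPU-hour:    ${cost_per_gpu_hour:.2f}")
print(f"Papers per GPU-hour:  {papers_per_gpu_hour:.1f}")
```

Under that assumption the run works out to roughly three cents per paper. If the jobs finished at different times, the real per-GPU-hour figure would differ.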

3. Apr 2026 Best Local LLMs: Qwen3.5, Gemma4, GLM-5.1

Reddit’s Best Local LLMs Megathread spotlights new releases such as Qwen3.5 and Gemma4, GLM-5.1’s claimed SOTA performance, MiniMax-M2.7 as an accessible ‘Sonnet at home,’ and PrismML Bonsai 1-bit models that reportedly work in practice. The post invites detailed, setup-rich comparisons of open-weight LLMs, covering usage scale, tools, prompts, and user preferences. Source-reddit

📰 Featured

AI

Google open-sources Magika: AI-powered file type detector — Google built Magika, an AI-powered tool that detects true file content types. It has been used internally across Gmail, Google Drive, and Safe Browsing, processing hundreds of billions of files weekly. The project is open-sourced at google/magika on GitHub, claiming 200+ content types, 99% accuracy, and 5 ms per file, trained on 100 million files. Source-twitter
GLM-5.1 vs Claude Code: Open-Source Racing AI Demo — An AI-focused tweet compares GLM-5.1 with Claude Code (Opus 4.6) by showcasing a Three.js racing game built to evaluate AI code. It highlights one-shot car physics with drifting, 531-line racing AI with four personalities, and self-iterating debugging tools, including 20+ Bun.WebView tools to drive the car and read game state. The post praises the open-source demo and predicts it will be widely discussed. Source-twitter

LLM

Uncensored Gemma-4 26B Released, Smokes Regular Gemma-4 — A new uncensored version of the Gemma-4 26B model, SuperGemma4-26B-Uncensored GGUF v2, has been released and is trending on HuggingFace. Reportedly offering zero refusals and improved tool-calling, it claims to outperform the standard Gemma-4 26B. The release highlights ongoing competition in open-source LLMs and uncensoring debates in AI communities. Source-twitter
Open Agents: Open-source cloud coding agent that writes its own code — 3 months after starting, the creator built a cloud-based coding agent that has written every line of their shipped code, including its own. They have open-sourced the project, named Open Agents. The release highlights ongoing advances in autonomous coding agents and AI-assisted software development. Source-twitter
Trained 125M LM from scratch, releases weights and SFT framework — They trained a 12-layer, 125M-parameter causal LM from scratch with a custom 16k BPE tokenizer, achieving ~6.19 validation perplexity on WikiText-103 after ~92k steps. A conversational variant using LoRA on DailyDialog was released, with both base and instruct checkpoints on HuggingFace, along with an SFT framework for others to fine-tune their own variants. The effort emphasizes open-source experimentation rather than competing with 1B+ models. Source-reddit
Local MiniMax M2.7 Benchmarks GTA-Like 3D Web Experience — A Reddit post shows a locally run MiniMax M2.7 generating a GTA-like 3D experience in a single web page. The author compares it with GLM 5, noting better aesthetics and the ability to add trees and boids. Testing was done in OpenWebUI artifacts and OpenCode with an IQ2_XXS quantization for speed, and the author reports coherent behavior despite some control quirks. Source-reddit
DFlash Speculative Decoding on Apple Silicon Open-Sourced, 4x Speedups — A native MLX implementation of DFlash has been open-sourced with an updated benchmark methodology. A small draft model proposes 16 tokens in parallel via block diffusion, and the target model verifies the whole block in a single forward pass, so every emitted token is checked by the target. On Apple Silicon (M5 Max, 64GB) using MLX 0.31.1, results show roughly 4x speedups across Qwen3.5 variants, with full results available in the repo. Source-reddit
Critique: MiniMax-M2.7-GGUF Broken in UD-Q4_K_XL Quantization — A Reddit post accuses unsloth and others of rushing out new models without adequate testing. It claims the UD-Q4_K_XL quantization of MiniMax-M2.7-GGUF produces NaN values in perplexity (PPL) measurements, whereas comparable quantized models from aessedai and ubergarm reportedly do not. The author asserts the problem is not caused by backend kernels or CUDA 13.2 and urges readers to avoid this quantization. Source-reddit
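Two items above quote perplexity: the 125M model's ~6.19 on WikiText-103 and the NaN results in the MiniMax critique. As a minimal sketch of the standard definition (the exponential of the mean negative log-likelihood per token), the helper below shows how a score is computed and why one NaN log-probability poisons the whole measurement. The function name and inputs are illustrative and not taken from either project.

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood), natural log, per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)

# A model that assigns probability 0.5 to every token has perplexity 2.
uniform_half = perplexity([math.log(0.5)] * 4)

# A reported perplexity of ~6.19 corresponds to this mean loss in nats per token.
loss_for_reported_ppl = math.log(6.19)

# A single NaN log-probability propagates through the sum and the exponential.
poisoned = perplexity([-1.2, float("nan"), -0.7])

print(uniform_half, loss_for_reported_ppl, math.isnan(poisoned))
```

Because the mean runs over every evaluated token, one bad activation anywhere in the evaluation set is enough to turn the final score into NaN.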
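The draft-then-verify idea in the DFlash item above can be sketched in a few lines. This is a toy greedy variant in which both models are plain Python callables. It is not DFlash's block-diffusion drafter or any MLX API, and `speculative_step`, `generate`, and the callables are hypothetical names. The acceptance rule shown (keep the longest draft prefix the target agrees with) is an assumption about one common scheme.

```python
from typing import Callable, List, Sequence

Token = int
NextTokenFn = Callable[[Sequence[Token]], Token]
DraftBlockFn = Callable[[Sequence[Token], int], List[Token]]

def speculative_step(prefix: Sequence[Token],
                     draft_block: Sequence[Token],
                     target_next: NextTokenFn) -> List[Token]:
    """Verify one drafted block against the target model (greedy variant).

    Returns the longest draft prefix the target agrees with, followed by
    one token chosen by the target itself.
    """
    emitted: List[Token] = []
    context = list(prefix)
    for proposed in draft_block:
        # A real system gets all of these predictions from one batched
        # forward pass; the toy version calls the target once per position.
        expected = target_next(context)
        if proposed != expected:
            emitted.append(expected)  # correct the first mismatch, then stop
            return emitted
        emitted.append(proposed)
        context.append(proposed)
    emitted.append(target_next(context))  # whole block accepted: bonus token
    return emitted

def generate(prompt: Sequence[Token],
             draft_block_fn: DraftBlockFn,
             target_next: NextTokenFn,
             max_new_tokens: int,
             block_size: int = 16) -> List[Token]:
    """Repeat draft-and-verify until max_new_tokens have been produced."""
    out = list(prompt)
    while len(out) - len(prompt) < max_new_tokens:
        block = draft_block_fn(out, block_size)
        out.extend(speculative_step(out, block, target_next))
    return out[: len(prompt) + max_new_tokens]
```

Every emitted token is either a draft token the target agreed with or the target's own choice, so under greedy decoding the output matches what the target alone would produce. The speedup comes from accepting several tokens per target pass.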

Multimodal

WildDet3D Scales Promptable 3D Detection in the Wild — WildDet3D explores scaling promptable 3D detection from monocular RGB images, aiming to recover object size, location, and orientation. It emphasizes open-world generalization beyond closed-set categories, support for diverse prompts, and leveraging geometric cues when available. The article discusses bottlenecks of current methods that are designed for a single prompt modality, outlining a path toward practical, broad-spectrum 3D detection in unconstrained environments. Source-huggingface

AI Safety

Claude Mythos Preview First to Complete AISI Cyber Range End-to-End — The AI Security Institute (AISI) reports that Claude Mythos Preview completed an AISI cyber range end-to-end, reportedly the first model to do so. The milestone highlights progress in safety-focused evaluation and cybersecurity testing of large language models. Source-twitter

⚡ Quick Bites

Video Demystifies Claude Code and Agent Harnesses, Builds One — An author on t3.gg explains Claude Code and other agent harnesses, arguing they’re not black magic. The video helps viewers understand how these harnesses work and makes them easier to use. The creator even builds a harness to demonstrate the concept firsthand. Source-twitter
$200/Month Buys H100 GPU for 6 Hours/Workday — A tweet claims that $200 per month is enough to access an H100 GPU for six hours on each workday. This highlights hardware economics and potential affordability of high-end AI accelerators. If accurate, the claim could influence budgeting for AI projects and independent research. Source-twitter
Opus 4.7 and Sonnet 4.8 Release Soon — A tweet reminds readers that new Claude model releases, Opus 4.7 and Sonnet 4.8, are expected imminently. Developers should watch for official release notes and announcements. Source-twitter
MegaStyle: Scalable, Consistent Text-to-Image Style Dataset — MegaStyle presents a scalable data curation pipeline that builds an intra-style consistent and inter-style diverse style dataset by exploiting consistent text-to-image style mappings from large generative models. The framework curates a large prompt gallery with 170,000 style prompts and 400,000 content prompts to support high-quality style data for model training. Source-huggingface
FORGE: Fine-Grained Multimodal Evaluation for Manufacturing — FORGE introduces a high-quality multimodal dataset that combines real-world 2D images and 3D point clouds to evaluate Multimodal Large Language Models in manufacturing. It addresses data scarcity and the lack of fine-grained domain semantics, enabling more rigorous assessments of MLLMs operating in real-world manufacturing environments. Source-huggingface
Claude-Mem: Persistent memory for Claude Code sessions — An open-source Claude Code plugin that automatically captures your coding tool usage, compresses it with AI via Claude’s agent-sdk, and injects semantic context into future sessions. It preserves context across sessions to help Claude maintain continuity of knowledge during coding work. Source-github
Ralph: Autonomous AI Agent Repeats Until PRD Complete — Ralph is an autonomous AI agent loop that repeatedly runs AI coding tools (Amp CLI by default, with Claude Code as an alternative) until all PRD items are complete. Each iteration starts with a fresh context, while memory is preserved through git history, progress.txt, and prd.json. The workflow follows Geoffrey Huntley’s Ralph pattern, and setup involves adding Ralph files and tool prompts to a project, plus prerequisites like installed tools, authentication, jq, and a git repository. Source-github
Merged Two RTX PRO 6000 Towers into One Workstation — An enthusiast explains merging two RTX PRO 6000 towers into a single high-end workstation. The build uses AMD Threadripper PRO 7965WX, ASUS Pro WS WRX90E-SAGE SE, 128 GB ECC RAM, and two RTX PRO 6000 Blackwell GPUs (96 GB each, 192 GB total VRAM), with a 1600W titanium PSU and Corsair 9000D case. They invite tips and questions. Source-reddit
Ram-air and window vent cools 1100W AI box — A Reddit user describes using ram-air cooling and a window vent to dissipate heat from a high-power AI box rated around 1100W. The setup reportedly vents about 90% of heat out the window, achieving cooling comparable to an open case. The post aims to inspire others to explore similar cooling solutions. Source-reddit
MiniMax API-Centric License Might Expand to Regular Users — Ryan Lee of MiniMax posted an article arguing the license is aimed mainly at API providers that poorly served M2.1/M2.5. He indicated that MiniMax may update the license to cover regular users. The post suggests potential policy changes affecting non-API users. Source-reddit
Optimizing Nano Banana Generations: Story, Subject, Style Tips — The post outlines a structured prompt framework to maximize Nano Banana image generations by defining key elements: story, subject, composition, action, location, and style. It also provides direct editing instructions for image modification and cites a Twitter post from GeminiApp as the source. Source-twitter
I built an agent harness to prove it’s not magic — An AI-focused post argues that agent harnesses aren’t magical, and the author built one to demonstrate how they work. Source-twitter
Kimi K2.6 Imminent Release — A Reddit post by user Deep-Vermicelli-4591 on r/LocalLLaMA hints that Kimi K2.6 is imminent, though it offers no details on features or timing. No official announcement is provided. Source-reddit
New Weight Class Emerges, Potential Trend Starts — A Reddit post hints at a new weight class, suggesting a possible trend in AI model weights. The message is speculative and gives no concrete details, expressing cautious optimism about future developments. Source-reddit
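For the H100 item above, the implied hourly rate is easy to work out. The sketch assumes 22 workdays per month, a figure the tweet does not give, and the variable names are illustrative.

```python
# Implied hourly price behind "$200/month for 6 hours per workday".
MONTHLY_BUDGET_USD = 200.0
HOURS_PER_WORKDAY = 6
WORKDAYS_PER_MONTH = 22  # assumption: not specified in the source

gpu_hours_per_month = HOURS_PER_WORKDAY * WORKDAYS_PER_MONTH
implied_hourly_rate = MONTHLY_BUDGET_USD / gpu_hours_per_month

print(f"GPU-hours per month: {gpu_hours_per_month}")
print(f"Implied rate:        ${implied_hourly_rate:.2f}/hour")
```

With 22 workdays the claim implies about $1.50 per H100-hour. Fewer workdays in a month push the implied rate up.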
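The loop described in the Ralph item above can be approximated as follows. This is a sketch of the pattern only, not Ralph's actual script. The `prd.json` layout used here (an `items` list with `id` and `done` fields) is a hypothetical schema, and `run_tool` stands in for whatever launches the coding tool.

```python
import json
from pathlib import Path
from typing import Callable

def remaining_items(prd_path: Path) -> list:
    """Return PRD items not yet marked done (hypothetical schema)."""
    data = json.loads(prd_path.read_text())
    return [item for item in data["items"] if not item.get("done", False)]

def ralph_loop(prd_path: Path,
               run_tool: Callable[[dict], None],
               max_iterations: int = 10) -> bool:
    """Run the tool on the next open item until the PRD is complete.

    Each call to run_tool is a fresh invocation with no carried-over
    context; all state lives on disk, and the tool is expected to update
    prd.json itself. Returns True when no open items remain.
    """
    for _ in range(max_iterations):
        todo = remaining_items(prd_path)
        if not todo:
            return True
        run_tool(todo[0])
    return not remaining_items(prd_path)
```

Re-reading the file on every iteration is what lets each run start with a clean context, and the iteration cap guards against a tool that never marks an item complete.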

Generated by AI News Agent | 2026-04-13