
AI Daily — 2026-02-26




Covering 29 AI news items

🔥 Top Stories

1. Google’s Nano Banana 2 Debuts as Real-Time Multimodal Image Model

Google unveils Nano Banana 2, its latest image model built on Gemini’s world understanding and powered by real-time web data. It can reflect live conditions with high fidelity, demonstrated by the Window Seat demo that renders more accurate views with live weather in 2K/4K. Nano Banana 2 rolls out as the default in the Gemini app and Search across 141 countries, as well as in Flow, with previews in Google AI Studio and Vertex AI and availability on Antigravity. Source-twitter

2. Pentagon Makes Final Offer to Anthropic for Unrestricted AI Use

The U.S. Pentagon reportedly issued a final offer pressuring Anthropic to permit unrestricted military use of its AI capabilities, with a deadline approaching. Anthropic is said to resist allowing its AI to be used for surveillance of Americans and for lethal military missions, even as it recently won a $200 million government contract. The standoff underscores high-stakes debates over AI warfare and governance. Source-twitter

3. Apple Launches Python SDK for On-Device Mac LLMs

Apple released a Python SDK enabling access to on-device language models on Mac via the Foundation Models framework. The repository, apple/python-apple-fm-sdk, provides Python bindings to interact with the on-device model. This enables local AI inference without cloud dependency and expands developers’ tooling for Apple hardware. Source-twitter

LLM

  • LFM2-24B-A2B Runs 2x Faster on Strix Halo — A Reddit post claims LFM2-24B-A2B delivers roughly twice the speed of gpt-oss-20b when run on Strix Halo with ROCm and Lemonade v9.4.0. The author expresses optimism about potential applications and invites others to share their use cases. Source-reddit
  • Test-Time Training with KV Binding Is Secretly Linear Attention — New analysis argues that test-time training with key-value binding is not mere memorization. It shows that many TTT architectures can be reformulated as learned linear attention, reframing online meta-learning as a linear-attention process. The work revisits TTT formulations and helps explain previously puzzling model behavior. Source-huggingface
  • Learn Claude Code: Nano Claude Code‑like Agent from 0 to 1 — An open-source project from shareAI-lab outlining a Claude Code‑like agent built from a minimal loop. It introduces the AGENT PATTERN and 12 progressive sessions, each adding one mechanism to evolve the agent from a simple loop toward isolated autonomous execution. The design emphasizes planning, process isolation, and load-on-demand knowledge via tool results, with the repository hosted on GitHub. Source-github
  • Plano: AI-native proxy and data plane for agentic apps — Plano is an AI-native proxy and data plane that offloads routing, orchestration, signals, safety filters, and LLM routing from agentic apps. It aims to decouple developers from brittle framework abstractions and speed production delivery across languages and AI frameworks, centralizing governance and observability for agentic workflows. Source-github
  • NVIDIA Unveils Megatron-LM Core for Large-Scale Transformer Training — NVIDIA’s Megatron-LM project introduces two components: Megatron-LM, a research-focused reference with pre-configured training scripts, and Megatron Core, a GPU-optimized library of transformer building blocks with advanced parallelism and mixed-precision support. Megatron Core supports tensor (TP), pipeline (PP), data (DP), expert (EP), and context (CP) parallelism, along with FP16, BF16, FP8, and FP4, enabling custom training pipelines, while Megatron-LM provides ready-to-run configurations for rapid experimentation. The suite also includes Megatron Bridge for bidirectional Hugging Face ↔ Megatron checkpoint conversion, with quick-start installation via pip. Source-github
  • Qwen3.5-35B-A3B Q4 Quantization Comparison — A Q4 quantization sweep across Qwen3.5-35B-A3B evaluates faithfulness to the BF16 baseline using various quantizers and recipes. The study reports KL Divergence and Perplexity to measure information loss and model confidence, aiming to guide data-driven file selection rather than guessing. Source-reddit
  • Training a 144M Spiking Neural Network for Text Generation from Scratch — A 144M-parameter spiking neural network language model was trained from scratch on FineWeb-Edu for about $10 using a rented NVIDIA RTX A5000, with no transformer teacher or distillation. The model naturally achieves 97-98% inference sparsity and demonstrates stronger topic coherence than GPT-2 Small on the same prompts. Spike-rate analysis reveals interpretable processing, with Block 4 the most active (9.8%) and Block 0 filtering noise (0.6%), plus online learning capabilities. Source-reddit
  • DualPath Breaks Storage Bandwidth Bottleneck in Agentic LLM Inference — A joint team from Peking University, Tsinghua University, and DeepSeek-AI released a paper introducing DualPath, an inference system designed to address KV-Cache storage I/O bandwidth bottlenecks in agentic LLM workloads. The work targets improving LLM inference throughput by optimizing storage access patterns and data flow for agentic scenarios. Source-reddit
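The TTT-as-linear-attention result above can be illustrated with a minimal NumPy sketch (all names, dimensions, and the simple Hebbian write rule here are illustrative stand-ins, not the paper's exact formulation): an online key-value "binding" write into a fast-weight matrix produces exactly the outputs of causal, unnormalized linear attention.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4          # key/value dimension (toy)
T = 6          # sequence length
K = rng.normal(size=(T, d))   # keys
V = rng.normal(size=(T, d))   # values
Q = rng.normal(size=(T, d))   # queries

# View 1: test-time "training" as an online key-value binding.
# A fast-weight matrix W is updated as each (k, v) pair arrives,
# and the output at step t reads the memory with query q_t.
W = np.zeros((d, d))
out_ttt = np.empty((T, d))
for t in range(T):
    W += np.outer(V[t], K[t])      # bind value to key (one online write)
    out_ttt[t] = W @ Q[t]          # read the memory with the query

# View 2: causal, unnormalized linear attention.
# o_t = sum_{s<=t} (q_t . k_s) v_s -- the same computation in attention form.
scores = Q @ K.T                    # (T, T) dot products q_t . k_s
mask = np.tril(np.ones((T, T)))     # causal mask: only s <= t contribute
out_lin = (scores * mask) @ V

assert np.allclose(out_ttt, out_lin)
print("fast-weight TTT outputs match causal linear attention")
```

The equivalence falls out algebraically: after step t, W is the sum of outer products v_s k_sᵀ for s ≤ t, so reading it with q_t is the masked score-weighted sum of values. The paper's stronger claim is that this holds for many *learned* TTT update rules, not just this plain Hebbian one.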
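The quantization-sweep methodology above — scoring each quantized file by KL divergence and perplexity against the BF16 baseline — can be sketched in a few lines of NumPy. The logits below are synthetic, and modeling quantization as small logit noise is an assumption for illustration, not the Reddit author's actual recipe.

```python
import numpy as np

def softmax(logits):
    # numerically stable softmax over the last axis
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mean_kl(base_logits, quant_logits):
    """Mean per-token KL(base || quant), in nats; 0 means identical distributions."""
    p, q = softmax(base_logits), softmax(quant_logits)
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)))

def perplexity(logits, targets):
    """exp of the mean negative log-likelihood of the target tokens."""
    logp = np.log(softmax(logits))
    nll = -logp[np.arange(len(targets)), targets]
    return float(np.exp(nll.mean()))

rng = np.random.default_rng(1)
T, V = 128, 50                                  # tokens, vocab size (toy)
base = rng.normal(size=(T, V))                  # stand-in for BF16 logits
quant = base + 0.05 * rng.normal(size=(T, V))   # quantization as small logit noise
targets = rng.integers(0, V, size=T)

kl = mean_kl(base, quant)
print(f"KL(base||quant): {kl:.5f} nats")
print(f"baseline ppl:  {perplexity(base, targets):.1f}")
print(f"quantized ppl: {perplexity(quant, targets):.1f}")
```

In a real sweep you would replace the synthetic arrays with logits collected from the BF16 and quantized models on the same evaluation text, then pick the quant file with the lowest KL at an acceptable size.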
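The "inference sparsity" figure in the spiking-network item above is just the fraction of neuron-timesteps that emit no spike. A minimal leaky integrate-and-fire sketch shows how such a number is measured; the layer, thresholds, and input statistics here are toy assumptions, not the 144M model's architecture.

```python
import numpy as np

def lif_layer(inputs, threshold=1.0, decay=0.9):
    """Leaky integrate-and-fire neurons: the membrane potential leaks by
    `decay`, accumulates input, and hard-resets after emitting a spike."""
    T, n = inputs.shape
    v = np.zeros(n)
    spikes = np.zeros((T, n))
    for t in range(T):
        v = decay * v + inputs[t]
        fired = v >= threshold
        spikes[t, fired] = 1.0
        v[fired] = 0.0                 # reset fired neurons
    return spikes

rng = np.random.default_rng(2)
x = rng.uniform(0, 0.3, size=(100, 64))   # weak drive -> sparse firing
s = lif_layer(x)
sparsity = 1.0 - s.mean()                 # fraction of silent neuron-steps
print(f"inference sparsity: {sparsity:.1%}")
```

The post's per-block spike rates (9.8% for Block 4, 0.6% for Block 0) are the complement of this quantity computed per layer, which is what makes the "97-98% sparsity" claim directly measurable.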

Hardware

  • Prime Intellect Launches AI Compute Infra with H100 GPUs — Prime Intellect unveils state-of-the-art AI compute infrastructure featuring H100 GPUs, InfiniBand, and 24/7 support, and promotes comparing 15+ cloud providers to pick the best GPUs. The roundup also links a tweet by Johannes Hagemann and notes Hieu Pham’s departure from OpenAI due to burnout. Source-twitter

AI in Finance

  • Perplexity AI Demo Runs Full Fund Workflow in a Box — An AI-driven system from Perplexity AI claims to credibly run a small fund’s core workflow with just 1-2 humans, instead of a full desk of 10 analysts. The project reportedly involves over 4,500 lines of code and a fully working web app. The creator frames it as a cheaper alternative to a Bloomberg terminal and teases a thread detailing the build. Source-twitter

AI

  • HyTRec Unveils Hybrid Attention for Long-Behavior Recommendations — HyTRec introduces a Hybrid Attention architecture that decouples long-term stable user preferences from short-term signals in long-behavior sequential recommendation. The approach aims to overcome the efficiency limitations of linear attention while reducing the computational burden of softmax attention, improving retrieval precision for long sequences. It positions HyTRec as a scalable solution for long-range user modeling in recommender systems. Source-huggingface
  • MolHIT Advances Molecular-Graph Diffusion with Hierarchical Discrete Models — MolHIT introduces a framework for molecular graph generation using hierarchical discrete diffusion, aiming to overcome long-standing limitations of prior graph diffusion approaches. The work targets improved chemical validity and better alignment with target properties, addressing challenges relative to 1D modeling. This development advances AI-driven drug discovery and materials science by strengthening molecular graph generation capabilities. Source-huggingface

AI Safety

  • How to fix OpenClaw’s internal reasoning leaks — A tweet asks how to mitigate leakage of OpenClaw’s internal reasoning. It highlights safety concerns about exposing chain-of-thought in an AI agent and seeks practical fixes. The post underscores ongoing debates around safeguarding internal reasoning in open-source AI systems. Source-twitter

Multimodal

  • DreamID-Omni: Unified, Controllable Audio-Video Generation — The paper argues current methods treat reference-based audio-video generation (R2AV), video editing (RV2AV), and audio-driven video animation (RA2V) as separate tasks. It introduces DreamID-Omni, a unified framework aimed at providing precise, disentangled control over multiple character identities and voice timbres within a single system. Source-huggingface

Open Source

  • Open-source AI guide: AI knowledge base and Vibe Coding tutorials — An open-source AI resource hub by programmer liyupi, ai-guide aggregates a free AI knowledge base, tutorials, prompts, and monetization guidance. It covers model options (DeepSeek, GPT, Gemini, Claude), tools (Cursor, Claude Code, OpenClaw, TRAE, Lovable, Agent Skills), and development frameworks (Spring AI, LangChain), plus industry news. The project is freely accessible, aims to close information gaps, and has since evolved into the Fish AI Navigation site. Source-github
  • Hello-Agents: Datawhale’s AI-native Agent Tutorial — Datawhale launches Hello-Agents, a comprehensive open-source tutorial guiding learners from fundamentals to practical development of AI-native agents. The project contrasts AI-native agents with workflow-driven approaches and provides hands-on building of multi-agent applications using the HelloAgents framework built on OpenAI’s API, including context, memory, protocols, and evaluation. It features real-world projects like a travel assistant and a cyber town, plus AI-agent job-interview prep. Source-github

AI Hardware

  • Ubuntu 26.04 LTS Optimizes Local AI with Auto Drivers — Ubuntu 26.04 LTS is expected to ship with automatic hardware-aware AI drivers, including out-of-the-box NVIDIA CUDA and AMD ROCm support. It will also introduce Inference Snaps, ready-to-use sandboxed AI inference containers, and features for sandboxed AI agents. The updates aim to enhance local AI capabilities for developers and researchers on Ubuntu. Source-reddit

Industry

  • DeepSeek grants Huawei early access to V4; Nvidia/AMD wait — DeepSeek has granted early access to its V4 AI model update to domestic suppliers such as Huawei, aimed at optimizing software and ensuring efficient performance on their hardware. Meanwhile, Nvidia and AMD have not received access, highlighting continued gatekeeping and geopolitical frictions in AI hardware ecosystems. Source-reddit

⚡ Quick Bites

  • Anthropic CEO Dario Amodei Addresses Department of War Discussions — Anthropic CEO Dario Amodei released a statement regarding the company’s discussions with the Department of War. The message outlines considerations around national security uses of AI and emphasizes responsible, safe AI deployment in government contexts. More details are available at anthropic.com/news/statement. Source-twitter
  • Nontraditional Talent Sought for AGI Research Recruiting — AGI researchers are promoting research recruiting as a way for non-technical people to contribute to the field. Tifa (@tifafafafa) is seeking exceptional recruiters from nontraditional backgrounds, with a preference for former founders. The aim is to build teams with context and forward-looking taste to push the frontier rather than simply fill roles. Source-twitter
  • ByteDance Unveils DeerFlow 2.0 Open-Source Super Agent Harness — DeerFlow 2.0 is a ground-up rewrite of ByteDance’s open-source super-agent harness that coordinates sub-agents, memory, and sandboxes to tackle complex tasks. The 1.x branch remains for the original Deep Research framework, with active development now on 2.0. The project is hosted on GitHub and has demos and documentation at deerflow.tech. Source-github
  • Closed US AI models vs open Chinese models sparks industry tension — The post discusses how customers requiring offline AI in closed environments clash with the limited availability of capable US models. It argues that older, less capable models force a lag behind modern LLMs and considers options like switching to Chinese models or lobbying for more open weights from OpenAI, amid concerns about national security and data leakage. Source-reddit
  • Dual AMD Instinct MI50 AI Rig with 64GB VRAM and Custom Shroud — A hobbyist built a local AI server featuring two AMD Instinct MI50 GPUs (64GB total) on a Gigabyte X399 DESIGNARE motherboard with a Threadripper 2990WX. The rig runs Ubuntu 24.04 LTS with ROCm 6.3 and llama.cpp, achieving around 50 t/s on GLM 4.7 Q8_0, though performance drops under load. To keep noise down, they designed a 3D-printed, MIT-licensed shroud (three parts) for a single 92mm fan, and documented it as an open-source modular cooling solution. Source-reddit
  • AI shines in small-TAM custom software markets — Balaji Srinivasan argues that AI is especially powerful for custom software serving very small markets because such markets typically can’t support the costs of traditional software development. He suggests smaller markets yield greater AI-driven value by reducing development expenses. The post frames AI as a cost-saving enabler for niche software projects. Source-twitter
  • Top 10 Trending Hugging Face Models — Reddit user jacek2023 asks for conclusions about the top 10 trending models on Hugging Face. The post signals interest in which HF models are gaining traction, but provides no specific model names or details. Source-reddit
  • OpenAI employee leaves over burnout, plans break to Vietnam — An AI professional announces they are leaving OpenAI, after previously working at xAI. They praise the people and the work on advanced AI but cite burnout and deteriorating mental health, prompting them to take a break from frontier AI labs and relocate with family to Vietnam to pursue new opportunities and treatment. Source-twitter

Generated by AI News Agent | 2026-02-26