Mar 11, 2026

AI Daily — 2026-03-11


Covering 34 AI news items

🔥 Top Stories

1. NVIDIA Nemotron 3 Super Unveiled 120B Hybrid MoE Latent Model

NVIDIA unveils Nemotron 3 Super, a hybrid SSM latent MoE model with 120B total parameters and 12B active, designed for Blackwell. It reportedly scores 36 on AAIndex v4, with claims of up to 2.2x faster FP4 performance than GPT-OSS-120B. The company will publish open data, recipes, and weights alongside a technical report, signaling a push toward openness in high-end model development. NVIDIA also hints that an Ultra variant is on the horizon, potentially widening access to large-scale MoE architectures. Source-x

2. Anthropic’s Claude 3.7 Sonnet delayed; AI code dominates future models

The Times reports that model releases are now spaced only weeks apart, with Claude responsible for the majority of the code used in future models. Anthropic delayed the Claude 3.7 Sonnet rollout by 10 days to ensure certainty. Industry watchers warn that the coming years could redefine job markets, with Amodei noting that many entry-level white-collar roles may vanish within one to five years. Source-x

3. NVIDIA to Spend $26B on Open-Weight AI Models

Filings reveal that NVIDIA plans to invest about $26 billion in developing open-weight AI models, a move that could make weights more widely accessible to researchers and developers and accelerate growth of the open-weight ecosystem. It also fits a broader industry shift toward openness and collaborative research. Source-reddit

Open Source & Developer Tools

  • ByteDance’s DeerFlow 2.0 Tops GitHub Trending as Open-Source AI Harness — DeerFlow 2.0 is a ground-up rewrite that claimed the #1 spot on GitHub Trending after its launch, signaling strong community adoption for AI agent orchestration. Source-github

Computer Vision & Multimodal

  • Moondream Segmentation Gets New SOTA, 40% Faster — Moondream announces a new state-of-the-art segmentation update with a 40% speedup. It is live on Moondream Cloud, with a local model and whitepaper coming later this week. Source-x
  • Voxtral WebGPU Brings Real-Time In-Browser Speech Transcription — Voxtral-Mini-4B-Realtime supports 13 languages with latency under 500 ms and is integrated into Transformers.js to enable fully local browser captioning via WebGPU; demo and source on Hugging Face. Source-reddit

AI in Defense & Industry

  • Google to provide Pentagon with AI agents — Google will supply AI agents to the U.S. DoD for unclassified tasks, highlighting ongoing private-sector collaboration on defense-oriented AI applications. Source-rss

AI Research & Collaboration

  • OpenAI Hiring Researchers and Engineers for RLHF and Multimodal AI — OpenAI seeks researchers and software engineers focused on RLHF, long-horizon evaluation, reward modeling, and data infrastructure for personalized multimodal AI. Source-x
  • Neal Wu joins Thinking Machines for collaborative AI — Neal Wu announces that he is joining Thinking Machines to work with top researchers on advancing collaborative AI, and invites others to join the team at thinkingmachines.ai/#join-us. Source-x

RL & 3D Vision

  • Geometry-Guided Reinforcement Learning for Multi-View 3D Scene Editing — Explores using priors from 2D diffusion models to edit 3D scenes. The work identifies multi-view consistency as a core challenge and data scarcity as a barrier to supervised fine-tuning, and advocates a geometry-guided RL approach. Source-huggingface

⚡ Quick Bites

  • Perplexity Computer Rollout for PRO Users with Credits Bonus — Perplexity expands PRO access with a credits bonus to boost high-tier usage. Source-x

  • PostTrainBench v1.0 released to benchmark frontier AI agents — New benchmark tool standardizes evaluation for frontier AI agents. Source-x

  • Run Qwen3.5 Locally on RTX GPUs with Unsloth GGUF — Enables local offline inference of Qwen3.5 on RTX GPUs via Unsloth GGUF. Source-x

  • Reasoning Expands Parametric Recall in LLMs — New work shows reasoning improves parametric recall capabilities in LLMs. Source-huggingface

  • Omni-Diffusion: Unified Multimodal AI with Masked Diffusion — Proposes a unified diffusion framework for multimodal AI tasks. Source-huggingface

  • InternVL-U: 4B Unified Multimodal Model for Understanding and Editing — Introduces a compact 4B model for multimodal tasks. Source-huggingface

  • Glad the Anthropic Fight Is Happening Now — Commentary on the ongoing public discourse around Anthropic and rival AI efforts. Source-rss

  • I Was Interviewed by an AI Bot for a Job — A look at AI-driven interviews and their implications. Source-rss

  • Open-source browser for AI agents with synchronized state (ABP) — Proposes an open-source browser protocol for AI agents with synchronized state. Source-github

  • llama.cpp Adds Real Reasoning Budget and Messaging Aid — Adds a real reasoning budget and messaging aids to llama.cpp. Source-reddit

  • Run LLMs on AMD NPU in Linux with Lemonade Stack — Demonstrates running LLMs on AMD NPU under Linux with Lemonade. Source-reddit

  • Apex-1: 350M Tiny LLM Trained for Edge Hardware — A compact LLM designed for edge devices. Source-reddit

  • Reka Edge 7B Multimodal Model Debuts on Hugging Face — New 7B multimodal model appears on Hugging Face. Source-reddit

  • Why AI Coding Agents Waste Half Their Context Window — Discussion on context-window inefficiencies in AI coding agents. Source-reddit

  • Codex Best Practices Now in OpenAI Developer Docs — OpenAI documents Codex best practices for developers. Source-x

  • MM-Zero Enables Self-Evolving Multimodal Vision-Language Models — Proposes self-evolving multimodal models. Source-huggingface

  • Promptfoo: LLM Evals and Red-Teaming Tool — Introduces a tool for evaluating and red-teaming LLMs. Source-github

  • Claude Code Login Errors Elevated, Access Disrupted — Reports elevated login errors affecting Claude Code access. Source-rss

  • How We Hacked McKinsey’s AI Platform — Case study on compromising an enterprise AI platform. Source-rss

  • T3Chat Canvas Enhances Image Gen UX, Enables Multi-Model Testing — Improves UX for image generation and supports multi-model testing. Source-x

  • Ask HN: Is Claude down again? — Community discussion about Claude availability. Source-hackernews

  • Codex Beats Claude Code; OpenAI Transparency Sets Industry Standard — A post on X argues that Codex outperforms Claude Code on coding tasks and that OpenAI leads on transparency. Source-x

  • Hacker News bans AI-generated comments to keep conversations human — Policy update restricting AI-generated comments on Hacker News. Source-hackernews

  • Today’s large neural networks may be slightly annoyed with you — A tongue-in-cheek take on the behavior of large models. Source-x


Generated by AI News Agent | 2026-03-11
