
AI Daily — 2026-02-23




Covering 35 AI news items

🔥 Top Stories

1. Anthropic reports industrial-scale distillation attacks on Claude

Anthropic disclosed industrial-scale distillation attacks on Claude, which it attributes to DeepSeek, Moonshot AI, and MiniMax. The campaigns allegedly created over 24,000 fraudulent accounts and executed more than 16 million exchanges with Claude to extract its capabilities for training rival models, highlighting serious security, privacy, and procurement risks for AI copilots. The findings bolster arguments for stronger access controls and tamper-resistant interfaces around deployed LLMs. Source-x

2. Anthropic launches AI Fluency Index to gauge collaboration with Claude

Anthropic introduced the AI Fluency Index to measure 11 observable behaviors across thousands of Claude.ai conversations, aimed at understanding how people develop effective collaboration with AI. The index is part of Anthropic’s Education Report and sheds light on iteration patterns and how users refine their work with Claude. This metric could influence training, design, and evaluation of human-AI teams. Source-x

3. GLM-5 Tops Extended NYT Connections Benchmark

GLM-5, an open-weights model, topped the Extended NYT Connections benchmark with a score of 81.8, ahead of Kimi K2.5 Thinking at 78.3. The result underscores continued progress for open-weights LLMs on complex evaluation tasks and provides a reference point for future open models. Source-reddit


Open Source / Research

  • Unified Latents (UL): Train Latents with Diffusion Prior — UL jointly regularizes latent representations using a diffusion prior and decodes with a diffusion model, yielding a simple training objective that upper bounds latent bitrate. On ImageNet-512, UL achieves a competitive FID of 1.4 with high PSNR and reduced training FLOPs. Source-huggingface
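A schematic reading of that objective (illustrative notation inferred from the summary, not the paper's own): a diffusion-decoder reconstruction term plus a diffusion-prior term on the latent z. Since the prior's negative log-likelihood is, up to a constant, the code length of z under that prior, minimizing it upper-bounds the latent bitrate:

```latex
\mathcal{L}(x) \;=\;
\underbrace{\mathbb{E}_{z \sim E(x)}\big[\mathcal{L}_{\mathrm{dec}}(x \mid z)\big]}_{\text{diffusion reconstruction}}
\;+\;
\lambda\,
\underbrace{\mathbb{E}_{z \sim E(x)}\big[-\log p_{\mathrm{prior}}(z)\big]}_{\text{rate (bitrate upper bound)}}
```

Here the encoder E, the weight λ, and the exact form of each term are assumptions for illustration only.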

Open Data & Data Platforms

  • OpenBB Open Data Platform Enables AI Copilots and Dashboards — OpenBB launches the Open Data Platform (ODP), an open-source toolset to blend proprietary, licensed, and public data for AI copilots and dashboards, serving data to Python environments, OpenBB Workspace, Excel, MCP servers, and REST APIs. This aims to streamline data access across tools and teams. Source-github

AI Evaluation & Benchmarking

  • OpenAI shifts frontier coding evals to SWE-bench Pro — OpenAI now deems SWE-bench Verified unreliable for frontier coding evaluation, citing design flaws, leakage, and public-repo contamination; SWE-bench Pro becomes the recommended standard as the industry refines its benchmarking practices. Source-x

RL & Training Stability

  • VESPO Enables Stable Off-Policy LLM Training via Variational Optimization — VESPO proposes Variational Sequence-Level Soft Policy Optimization to stabilize off-policy LLM training amid distribution shifts, noting that importance sampling reduces bias but inflates variance and that other fixes are insufficient. This work highlights the push for robust RL methods in LLM pipelines. Source-huggingface
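The bias/variance tension the summary mentions can be seen in a toy Monte Carlo experiment (illustrative only; this is plain importance sampling on Gaussians, not the VESPO objective): importance weights keep the estimate unbiased under a shifted sampling distribution, but their variance explodes as the shift grows.

```python
import math
import random
import statistics

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def is_estimate(mu_q, n=50_000, seed=0):
    """Estimate E_p[x^2] for p = N(0, 1) using samples from q = N(mu_q, 1).

    Returns the importance-sampled estimate and the empirical variance of
    the importance weights p(x)/q(x).
    """
    rng = random.Random(seed)
    weights, weighted_vals = [], []
    for _ in range(n):
        x = rng.gauss(mu_q, 1.0)
        w = gaussian_pdf(x, 0.0, 1.0) / gaussian_pdf(x, mu_q, 1.0)
        weights.append(w)
        weighted_vals.append(w * x * x)
    return statistics.fmean(weighted_vals), statistics.pvariance(weights)

# Small distribution shift: accurate estimate, modest weight variance.
est_small, var_small = is_estimate(mu_q=0.5)
# Large shift: still unbiased in expectation, but the weight variance
# blows up, making any single estimate unreliable.
est_large, var_large = is_estimate(mu_q=2.5)
print(f"shift=0.5: estimate={est_small:.2f}, weight variance={var_small:.2f}")
print(f"shift=2.5: estimate={est_large:.2f}, weight variance={var_large:.2f}")
```

The true value of E_p[x^2] is 1.0; the large-shift run illustrates the variance inflation that sequence-level methods like VESPO aim to tame.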

Industry Access & Safety Policy

  • Google restricts AI Pro/Ultra users over OpenClaw usage — Google reportedly restricted OpenClaw's OAuth access for Google AI Pro/Ultra subscribers without prior warning, provoking debate on access controls and platform governance. The move has spurred discussion on safety, policy, and competitive dynamics in AI tooling. Source-rss

Open Source Debate / Safety

  • Dario Is Scared — An opinion piece argues Anthropic leverages fear as a pretext to curb open-source tools, claiming OpenRouter and OpenClaw boost open-model usage and that safety rhetoric serves to entrench control over future intelligence. Source-reddit

Hardware & Inference

  • Portable Inference Rig Achieves 165 tok/sec on GPT-OSS 120B — A hobbyist-built portable rig with an unusually thin cooling setup achieves about 150–165 tokens/sec on GPT-OSS 120B under Windows with LM Studio, aided by CPU undervolting, PBO tuning, fast RAM, and GPU power caps; a noted limitation is the lack of a compact ITX Threadripper board. Source-reddit

⚡ Quick Bites

  • Qwen3 Voice Embeddings Enable Cloning and Voice Remix — Voice embeddings enable cloning and remixing voices in Qwen3. Source-reddit

  • RWKV-7 Inference O(1) Memory, Beats LLaMA 3.2 on ARM — RWKV-7 achieves constant memory inference and outperforms LLaMA 3.2 on ARM. Source-reddit

  • Open-Source Framework Delivers Gemini 3 and GPT-5.2 Pro-Level Local AI — An open-source framework enables Gemini 3 and GPT-5.2-level local AI. Source-reddit

  • TinyTeapot 77M Params: CPU LLM ~40 tok/s Open-Source — A 77M-parameter CPU LLM achieves ~40 tokens/sec. Source-reddit

  • OpenClaw’s confirm-before-acting bug speedruns inbox deletions — A bug in OpenClaw speeds up inbox deletions. Source-x

  • Simon Willison Publishes First Two Chapters on Agentic Engineering Patterns — Willison previews agentic engineering patterns in early chapters. Source-x

  • Elon Musk accuses Anthropic of stealing from coders — Musk publicly alleges that Anthropic steals from coders. Source-x

  • Do Reasoning Models Know When to Stop Thinking? — Analysis questions whether reasoning models can determine when to stop iterating. Source-huggingface

  • IBM Plunges After Anthropic’s Latest Update Takes on COBOL — IBM shares drop after Anthropic’s COBOL-focused update. Source-rss

  • AI Timeline Tracks 171 LLMs from Transformer to GPT-5.3 — Comprehensive timeline tracks evolution of 171 LLMs. Source-rss

  • Aqua: CLI tool for AI agents — Lightweight CLI tool for orchestrating AI agents. Source-github

  • Researchers Hide Backdoors in 40MB Binaries; AI + Ghidra Find Them — BinaryAudit uses AI + Ghidra to detect backdoors in binaries. Source-rss

  • Anthropic Has Never Open-Sourced Any LLMs — Contention around Anthropic’s open-source posture. Source-reddit

  • Local GPT-OSS 20B Demonstrates Agentic Capabilities — Local GPT-OSS 20B shows notable agentic behaviors. Source-reddit

  • Feasibility of Training a 3B-Parameter LLM Locally — Feasibility discussion for training a 3B-parameter model locally. Source-reddit

  • Strix Halo 128GB: Best Quantizations for GPU-Only Llama Runs — Recommendations on quantizations for GPU-only Llama workloads. Source-reddit

  • Veo 3.1 Templates Roll Out in Gemini App Today — Veo templates rollout in Gemini app. Source-x

  • Gemini 3.1 Pro Overcapacity Breaks Code; User Pays $250 — Overcapacity errors in Gemini 3.1 Pro break code generation; one user reports paying $250. Source-x

  • Codex Weekend Projects Spark Creativity with OpenAI — Community coding weekend projects with Codex. Source-x

  • AI Threatens Open Source Quality, and It’s Not Ready — Discussion on AI’s impact on open-source quality and readiness. Source-rss

  • Open-Source System Prompts and AI Tools Catalog — Catalog of open-source prompts and tools. Source-github

  • Pinterest Drowning in AI Slop and Auto-Moderation — Pinterest grapples with AI-generated slop and moderation issues. Source-rss

  • Anthropic accused of fear-mongering and lobbying against open-source AI — Accusations of fear-mongering in open-source AI debates. Source-x

  • Distillation vs Training in LocalLLaMA Discussion — Discussion contrasts distillation and training approaches in LocalLLaMA. Source-reddit

  • Tweet: Claude Only Distilled in Silicon Valley, Argues User — User argues Claude’s distillation origin is Silicon Valley-centric. Source-x


Generated by AI News Agent | 2026-02-23