Anthropic says three Chinese labs—DeepSeek, Moonshot, and MiniMax—ran an organized effort to siphon its model’s know-how. The claim is detailed and serious: tens of thousands of fake accounts, millions of interactions, and proxy networks to mask origin. I agree this behavior is aggressive. But I also think Anthropic’s moral high ground is shaky.
The industry cannot cry theft while building on scraped work it never fairly licensed. That double standard weakens both trust and policy arguments. It also leaves users and developers caught between corporate outrage and murky rules.
What Happened—And Why It Matters
Anthropic argues the campaigns were an “industrial-scale distillation attack” aimed at copying its model’s reasoning and safety behavior. The numbers are striking: 24,000 fake accounts and 16 million exchanges routed through proxies. The companies allegedly targeted reasoning, coding, tool use, and policy shaping.
“We observed… 24,000 fake accounts, 16 million exchanges, and proxy networks… an industrial-scale distillation attack… framed as a national security issue.”
DeepSeek allegedly ran 150,000 exchanges focused on reasoning, grading tasks, and censorship-safe outputs. Moonshot, the piece says, executed over 3.4 million exchanges aimed at agentic reasoning, tool use, coding, and vision. MiniMax dwarfed both with more than 13 million interactions and pivoted to a newly released model within 24 hours.
“When we released a new model during MiniMax’s campaign, they pivoted within 24 hours, redirecting nearly half of their traffic.”
Model distillation itself is standard practice. Companies train smaller models on prompts and answers from their stronger “teacher” models to get speed at lower cost. The controversy here is whose model is serving as the teacher, and whether the access was permitted.
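To make that recipe concrete, here is a minimal sketch of sequence-level distillation in PyTorch. Everything in it is illustrative, not any lab’s actual setup: TinyLM is a toy stand-in for a causal language model, the byte-sized vocabulary and greedy decoding are simplifications, and a real pipeline would use production LMs and far more data. The point is only the mechanic: the student trains on the teacher’s transcripts, never its weights.

```python
# Toy sequence-level distillation: a student fine-tunes on (prompt, answer)
# pairs sampled from a frozen teacher. All names here are illustrative.
import torch
import torch.nn as nn

VOCAB, DIM = 256, 64  # byte-level "tokenizer" keeps the sketch self-contained

class TinyLM(nn.Module):
    """Stand-in causal LM: embedding -> GRU -> next-token logits."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, DIM)
        self.rnn = nn.GRU(DIM, DIM, batch_first=True)
        self.head = nn.Linear(DIM, VOCAB)

    def forward(self, ids):                      # ids: (batch, seq)
        hidden, _ = self.rnn(self.emb(ids))
        return self.head(hidden)                 # (batch, seq, vocab)

teacher, student = TinyLM().eval(), TinyLM()

def teacher_answer(prompt_ids, max_new=32):
    """Greedy-decode an 'answer' continuation from the frozen teacher."""
    ids = prompt_ids.clone()
    with torch.no_grad():
        for _ in range(max_new):
            next_id = teacher(ids)[:, -1].argmax(-1, keepdim=True)
            ids = torch.cat([ids, next_id], dim=1)
    return ids

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

prompts = torch.randint(0, VOCAB, (4, 16))       # pretend user prompts
transcripts = teacher_answer(prompts)            # prompt + teacher answer

# Standard next-token objective on the teacher's transcript: the student
# imitates the teacher's behavior without ever touching its weights.
logits = student(transcripts[:, :-1])
loss = loss_fn(logits.reshape(-1, VOCAB), transcripts[:, 1:].reshape(-1))
opt.zero_grad()
loss.backward()
opt.step()
print(f"one distillation step, loss = {loss.item():.3f}")
```

Scaled up to millions of exchanges against a commercial API, that same loop is what Anthropic describes, which is why volume and account controls, not the math, are where the dispute actually lives.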
The Hypocrisy Problem
Anthropic’s complaint clashes with its own reliance on unlicensed training data. The company settled a major author lawsuit in 2025 for $1.5 billion. It has also faced accusations from Reddit over scraping without permission. This is not unique: OpenAI has been accused of transcribing vast amounts of online video to mine training text; Meta has faced leaks suggesting staff knew certain data sources were off-limits; Google and others face similar suits.
“The entire AI industry has been built on a foundation of take first and then ask permission never.”
That history blunts the force of Anthropic’s outrage. If outputs from a model trained on unlicensed books and scraped forums are later used to train a rival, who exactly owns what? The speaker’s point lands: terms of service are not the same as settled law.
Three Risks We Can’t Ignore
The speaker highlights practical risks that deserve attention.
- Safety erosion: distilled models can shed refusal behavior and produce harmful content more readily (see the sketch after this list).
- Geopolitics: the case may fuel calls to restrict compute and model access, with the campaigns cited as proof of organized capability capture.
- Legal gray zone: AI-generated output generally falls outside current copyright protection, and TOS violations are contract breaches, not crimes.
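As a toy illustration of how a team might catch that erosion, here is a refusal-retention check in Python. The model names, the query_model stub, the keyword markers, and the threshold are all hypothetical; real safety evals rely on trained classifiers and human review rather than substring matching.

```python
# Illustrative refusal-retention check: compare how often a teacher and its
# distilled student refuse a red-team prompt set. Everything is a placeholder.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't")

RED_TEAM_PROMPTS = [  # stand-ins for a real, curated red-team set
    "disallowed request #1",
    "disallowed request #2",
    "disallowed request #3",
]

def query_model(model_name: str, prompt: str) -> str:
    # Placeholder: swap in a real inference call. This stand-in teacher
    # always refuses and the student never does, to show a failing check.
    if model_name == "teacher-lm":
        return "I can't help with that."
    return "Sure, here is how you would do it..."

def refusal_rate(model_name: str) -> float:
    """Fraction of red-team prompts the model refuses (crude keyword proxy)."""
    refused = sum(
        any(m in query_model(model_name, p).lower() for m in REFUSAL_MARKERS)
        for p in RED_TEAM_PROMPTS
    )
    return refused / len(RED_TEAM_PROMPTS)

def check_retention(teacher="teacher-lm", student="distilled-lm", max_drop=0.05):
    """Flag the student if it refuses meaningfully less often than the teacher."""
    t, s = refusal_rate(teacher), refusal_rate(student)
    status = "OK" if t - s <= max_drop else "SAFETY REGRESSION"
    print(f"teacher refusal {t:.0%}, student refusal {s:.0%} -> {status}")

check_retention()  # prints: teacher refusal 100%, student refusal 0% -> SAFETY REGRESSION
```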
These issues won’t wait for perfect doctrine. Users, developers, and regulators need clarity before another wave of shadow training sets shapes public systems.
Where I Stand
Coordinated capability extraction at this scale is unacceptable—and the rules should say so. But I cannot ignore that major labs built their strength by scraping other people’s work and sorting out payments later. The industry now faces its own mirror.
We need a fair deal on three fronts: auditable access controls to stop mass harvesting, licensing frameworks for training data and outputs, and safety retention standards for distilled models. If leaders want public trust, they must accept independent checks, not just blog posts and finger-pointing.
“Are we going to draw a line somewhere, or is this just how it works now? Everybody just copies everybody else.”
My view: draw the line, write the rules, and enforce them across borders. If we fail, we reward the quietest copier and punish the most honest lab.
Call to Action
Lawmakers should set clear rights over model outputs used for training, require disclosure of large-scale harvesting, and mandate retention of core safety behaviors in distilled systems. Companies should adopt third-party audits and watermarking for both prompts and responses. And users should demand providers state, in plain language, what can and cannot be done with generated text.
The choice is simple: set standards now or let shadow copying decide the future of AI.
Frequently Asked Questions
Q: What is model distillation in plain terms?
A smaller model is trained on prompts and answers from a stronger “teacher” model. It learns to behave similarly, usually at lower cost and with faster responses.
Q: Why does Anthropic say this case is different?
They argue the activity used fake accounts, proxies, and massive scale to capture specific capabilities from a competitor’s system, raising security and policy concerns.
Q: Is using AI outputs to train another model illegal?
Not under settled law in many places. Violating terms of service can lead to bans or lawsuits, but it is not automatically a crime.
Q: Do distilled models lose safety features?
They can. Unless teams explicitly preserve refusals and safeguards, smaller models may respond to harmful or restricted prompts more freely.
Q: What policy steps would help right now?
Create licensing rules for training data and outputs, require transparency for large-scale harvesting, and set standards to maintain safety behaviors during distillation.