Distillation attacks expose hidden risk in enterprise AI


Sometimes imitation is more theft than flattery.

Anthropic recently published a blog post describing how three AI laboratories used a particular technique to extract Claude’s capabilities and enrich their own models. Meet the distillation attack.

Essentially, a distillation attack teaches one AI model to mimic a more capable one. By flooding the targeted AI with prompts, attackers can collect its responses and train their own models on the cheap. Distillation itself is not inherently nefarious: Anthropic points out that developers of highly advanced, or “frontier,” AI models use distillation to create smaller versions for their customers.

“You can think of it as a teacher model and a student model that is still learning,” said Shatabdi Sharma, CIO at Capacity, a third-party logistics fulfillment company. 
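Sharma’s teacher-student framing maps directly onto how distillation is implemented. Below is a minimal PyTorch sketch of the legitimate version, in which the student is trained to match the teacher’s softened output distribution; the function name and temperature value are illustrative, not drawn from any particular codebase.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """Train the student to match the teacher's softened output
    distribution (classic knowledge distillation)."""
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between the two distributions, scaled by T^2
    # as in the standard formulation.
    return F.kl_div(log_student, soft_teacher,
                    reduction="batchmean") * temperature ** 2

# Legitimate distillation has access to the teacher's raw logits.
# An extraction attack sees only the victim API's text completions,
# so the attacker fine-tunes the student on those sampled outputs
# (hard labels) rather than on the soft distribution above.
```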

DeepSeek, Moonshot and MiniMax took the distillation method to an industrial scale, leveraging thousands of fraudulent accounts and proxy services to extract capabilities from Claude, according to Anthropic. OpenAI has also accused DeepSeek of distillation attacks. 


Anthropic emphasized that the lack of safeguards in distilled models poses national security risks. Distilled models are also significantly cheaper, threatening the competitive advantage of Anthropic and other frontier model developers.

The average AI user may not be at risk from distillation, but that doesn’t mean distillation attacks shouldn’t be on CIOs’ radar. Distillation raises questions about model provenance, data leakage and safeguarding intellectual property. 

Who is at risk of distillation attacks?

Distillation attacks are a tool competitors might reach for: it can be less expensive and more efficient to distill an existing model than to build your own.

Enterprises with high-value intellectual property used to build proprietary models may be targets for competitors — including nation-state actors or other rivals — looking for a shortcut. 

“If somebody has a particularly good model that they develop in a certain vertical, whether it’s legal or healthcare, et cetera, then certainly [they] can be open to attacks, for somebody to do it better, faster, cheaper,” said Tony Garcia, chief information and security officer at Infineo, a company focused on modernizing life insurance infrastructure. 

Users of illicitly distilled models may eventually find themselves at risk as well, whether they chose the model because it is cheaper or simply don’t know it is distilled. Distilled models may lack safeguards, as Anthropic pointed out. CIOs must think about what that means for the enterprise data going into those models: Is it at risk of being leaked or used in a way that puts the enterprise at risk?


“There’s going to be legal risk to organizations that are using pirated LLM models,” said John Bruggeman, consulting CISO at CBTS, an IT services company. 

How CIOs can safeguard their enterprises

As enterprises throw themselves into the AI race, many see being left behind as the biggest risk. But moving quickly to deploy AI without considering the security and legal ramifications is a mistake.

“Everybody wants to be on the bandwagon at this point without being left behind,” said Garcia. “I think that is probably causing us to eat more risk than we probably understand.”

For enterprises using frontier models, CIOs must assume distillation attacks will be ongoing. Data governance, as always, is critical. 

“You have to take the risk that somebody could distill from that model and potentially get something out of that you don’t want,” said Garcia. “If you’re a CIO or a CISO, you have to look at trying to minimize that by anonymizing data.”
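Garcia’s point about anonymization can be made concrete. Below is a minimal sketch of pattern-based scrubbing before a prompt leaves the enterprise boundary; the patterns and placeholder labels are illustrative, and real deployments would lean on dedicated redaction tooling (NER-based detection, for example) rather than regex alone.

```python
import re

# Illustrative PII patterns only; a production scrubber needs far
# broader coverage than this.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def anonymize(prompt: str) -> str:
    """Replace recognizable PII with placeholders before sending the
    prompt to a third-party model."""
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt

print(anonymize("Reach Jane at jane.doe@example.com or 555-867-5309."))
# -> Reach Jane at [EMAIL] or [PHONE].
```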

As AI models proliferate, CIOs and other key decision-makers need to ask vendors questions about model provenance and safeguards against distillation. 


“Are there any watermarks that … exist so that we can confirm the lineage of the model and make sure that it isn’t a result of a distillation attack?” asked Sharma.

Enterprises developing their own proprietary models at risk of distillation can also take measures to protect that valuable IP. Bruggeman described rate limiting as a first line of defense. 

“You’ve got to make sure you have a rate limit in place to say ‘only this many queries can be done in a one-minute period or a 10-minute period or one day,'” he said. While that cannot account for threat actors that have thousands of accounts working on a distillation campaign, it is a useful safeguard. 
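A fixed-window counter is the simplest version of what Bruggeman describes. The sketch below is illustrative (the threshold is not a recommendation); in practice the limit would be enforced at the API gateway and paired with per-account and per-IP anomaly detection, since distillation campaigns spread queries across many accounts.

```python
import time
from collections import defaultdict

WINDOW_SECONDS = 60           # "this many queries in a one-minute period"
MAX_QUERIES_PER_WINDOW = 30   # illustrative threshold

# account id -> [window start time, query count in window]
_counters: dict[str, list] = defaultdict(lambda: [0.0, 0])

def allow_request(account_id: str) -> bool:
    """Fixed-window rate limit: admit a query only if the account has
    budget left in the current window."""
    now = time.time()
    window_start, count = _counters[account_id]
    if now - window_start >= WINDOW_SECONDS:
        _counters[account_id] = [now, 1]   # start a fresh window
        return True
    if count >= MAX_QUERIES_PER_WINDOW:
        return False                       # over budget: reject or queue
    _counters[account_id][1] = count + 1
    return True
```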

Watermarking is another potential strategy for protecting IP. The Open Worldwide Application Security Project (OWASP) is developing a watermarking project aimed at reducing unauthorized use and enabling verification of model authenticity.
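OWASP’s design is still in progress, so as a hedged illustration of the general idea, one scheme from the academic literature seeds a pseudorandom “green list” of tokens from each preceding token and a secret key, nudges generation toward green tokens, and later tests whether a suspect model’s outputs are statistically over-green. The constants and key below are hypothetical.

```python
import hashlib
import random

VOCAB_SIZE = 50_000
SECRET_KEY = "model-owner-secret"  # hypothetical key held by the model owner

def green_list(prev_token_id: int) -> set[int]:
    """Pseudorandomly mark half the vocabulary 'green', keyed on the
    previous token; generation gives green tokens a small logit boost."""
    seed = hashlib.sha256(f"{SECRET_KEY}:{prev_token_id}".encode()).digest()
    rng = random.Random(seed)
    ids = list(range(VOCAB_SIZE))
    rng.shuffle(ids)
    return set(ids[: VOCAB_SIZE // 2])

def green_fraction(token_ids: list[int]) -> float:
    """Detection: unwatermarked text lands near 0.5; text from a
    watermarked model (and, the hope is, from models trained on its
    outputs) skews higher."""
    hits = sum(cur in green_list(prev)
               for prev, cur in zip(token_ids, token_ids[1:]))
    return hits / max(len(token_ids) - 1, 1)
```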

Bruggeman also pointed to The Glaze Project, an initiative out of the University of Chicago, which develops tools that make unauthorized AI training more difficult. 

A distillation attack is like any other supply chain risk. However CIOs and their enterprises opt to address that risk, they need a foundation of AI and data governance from which to start. 

“Calculate the value of the data. Do a business impact assessment to say, ‘What’s it going to cost if this data gets away?'” Bruggeman said. “What controls do I have to put around it to make sure that it’s protected in the same way that I would protect any other asset?”


