Head like a hole

Claude’s new AI file-creation feature ships with security risks built in

Expert calls security advice “unfairly outsourcing the problem to Anthropic’s users.”

Benj Edwards – Sep 9, 2025 4:55 pm | 50

On Tuesday, Anthropic launched a new file-creation feature for its Claude AI assistant that enables users to generate Excel spreadsheets, PowerPoint presentations, and other documents directly within conversations on the web interface and in the Claude desktop app. While the feature may be handy for Claude users, the company’s support documentation also warns that it “may put your data at risk” and details how the AI assistant can be manipulated to transmit user data to external servers.

The feature, awkwardly named “Upgraded file-creation and analysis,” is basically Anthropic’s version of ChatGPT’s Code Interpreter and an upgraded version of Anthropic’s “analysis” tool. It’s currently available as a preview for Max, Team, and Enterprise plan users, with Pro users scheduled to receive access “in the coming weeks,” according to the announcement.

The security issue comes from the fact that the new feature gives Claude access to a sandbox computing environment, which enables it to download packages and run code to create files. “This feature gives Claude Internet access to create and analyze files, which may put your data at risk,” Anthropic writes in its blog announcement. “Monitor chats closely when using this feature.”

According to Anthropic’s documentation, “a bad actor” manipulating this feature could potentially “inconspicuously add instructions via external files or websites” that manipulate Claude into “reading sensitive data from a claude.ai connected knowledge source” and “using the sandbox environment to make an external network request to leak the data.”

This describes a prompt injection attack, where hidden instructions embedded in seemingly innocent content can manipulate the AI model’s behavior—a vulnerability that security researchers first documented in 2022. These attacks represent a pernicious, unsolved security flaw of AI language models, since both data and instructions in how to process it are fed through as part of the “context window” to the model in the same format, making it difficult for the AI to distinguish between legitimate instructions and malicious commands hidden in user-provided content.

Claude file-creation demo video by Anthropic.

The company states in its security documentation that it identified these theoretical vulnerabilities through threat modeling and security testing before release, though an Anthropic representative told Ars Technica that its red-teaming exercises have not yet demonstrated actual data exfiltration.

Anthropic’s recommended mitigation for users is to “monitor Claude while using the feature and stop it if you see it using or accessing data unexpectedly,” although this places the burden of security on the user. Independent AI researcher Simon Willison, reviewing the feature today on his blog, noted that Anthropic’s advice to “monitor Claude while using the feature” amounts to “unfairly outsourcing the problem to Anthropic’s users.”

Anthropic’s mitigations

Anthropic is not completely ignoring the problem, however, and it has implemented several security measures for the file-creation feature. The company has implemented a classifier that attempts to detect prompt injections and stop execution if they are detected. In addition, for Pro and Max users, Anthropic disabled public sharing of conversations that use the file-creation feature. For Enterprise users, the company implemented sandbox isolation so that environments are never shared between users. The company also limited task duration and container runtime “to avoid loops of malicious activity.”

Anthropic provides an allowlist of domains Claude can access for all users, including api.anthropic.com, github.com, registry.npmjs.org, and pypi.org. Team and Enterprise administrators have control over whether to enable the feature for their organizations

Anthropic’s documentation states the company has “a continuous process for ongoing security testing and red-teaming of this feature.” The company encourages organizations to “evaluate these protections against their specific security requirements when deciding whether to enable this feature.”

Prompt injections galore

Even with Anthropic’s security measures, Willison says he’ll be cautious. “I plan to be cautious using this feature with any data that I very much don’t want to be leaked to a third party, if there’s even the slightest chance that a malicious instruction might sneak its way in,” he wrote on his blog.

We covered a similar potential prompt-injection vulnerability with Anthropic’s Claude for Chrome, which launched as a research preview last month. For enterprise customers considering Claude for sensitive business documents, Anthropic’s decision to ship with documented vulnerabilities suggests competitive pressure may be overriding security considerations in the AI arms race.

That kind of “ship first, secure it later” philosophy has caused frustrations among some AI experts like Willison, who has extensively documented prompt-injection vulnerabilities (and coined the term). He recently described the current state of AI security as “horrifying” on his blog, noting that these prompt-injection vulnerabilities remain widespread “almost three years after we first started talking about them.”

In a prescient warning from September 2022, Willison wrote that “there may be systems that should not be built at all until we have a robust solution.” His recent assessment in the present? “It looks like we built them anyway!”

This story was updated on September 10, 2025 at 9:50 AM to correct information about Anthropic’s red-teaming efforts and to add detail to Anthropic’s mitigation measures.

Benj Edwards Senior AI Reporter

Benj Edwards is Ars Technica's Senior AI Reporter and founder of the site's dedicated AI beat in 2022. He's also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

50 Comments