GuardFall Research Reveals Shell Injection Risks Across Open Source AI Coding Agents

Adversa AI has disclosed a new security research project called GuardFall, revealing that a decades-old shell injection technique can bypass the safety mechanisms of many popular open source AI coding agents. According to the researchers, the technique worked against 10 of the 11 coding and computer-use agents evaluated, with Continue emerging as the only platform that effectively resisted the bypass in its default editor mode. The findings highlight how existing command filtering methods can fail when interacting with shell environments, allowing malicious commands to execute even when security checks are in place. Since many AI coding agents operate with the same permissions as the user running them, successful exploitation could expose sensitive files, SSH keys, cloud credentials, and other confidential data stored within a user’s environment. Adversa described the research as a demonstration conducted under controlled laboratory conditions and stated that there is currently no evidence of public exploitation.

The research explains that the weakness does not stem from a single software flaw but from a broader design convention found across multiple AI coding agents. Many of these tools rely on blocklists that inspect shell commands as plain text before allowing execution. Bash, however, processes a command before running it: it removes quotes, expands expressions, and rewrites parts of the command. As a result, the security filter and the shell see different versions of the same command. One example highlighted by the researchers shows how a dangerous command such as rm can be disguised as r""m: a simple text-based filter looking for rm finds no match, yet Bash strips the empty quotes and runs rm. Similar techniques can hide commands using Base64 encoding or turn common utilities such as find and dd into destructive tools when combined with specific parameters. Because the issue originates from the way shell environments process commands rather than from a coding error in a single application, Adversa noted that there is no single CVE associated with GuardFall and that expanding blocklists alone cannot fully address the problem.
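The mismatch described above can be reproduced in a few lines. The following is a minimal sketch, not code from any of the tested agents: the blocklist contents and both function names are hypothetical, and Python's shlex module only approximates Bash quote removal (it performs no expansions).

```python
import shlex

# Hypothetical blocklist, for illustration only.
BLOCKLIST = {"rm", "dd", "mkfs"}


def naive_filter_allows(command: str) -> bool:
    """Text-based check: allow unless a whitespace-separated word is blocklisted."""
    return not any(word in BLOCKLIST for word in command.split())


def shell_view(command: str) -> list[str]:
    """Approximate the words a POSIX shell sees after quote removal."""
    return shlex.split(command, posix=True)


disguised = 'r""m -rf ./build'
print(naive_filter_allows(disguised))  # True: the filter sees the word r""m
print(shell_view(disguised))           # ['rm', '-rf', './build']
```

The filter and the shell disagree about the first word, which is the gap the researchers describe.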

According to the report, a successful GuardFall attack requires two conditions that are commonly present in automated development environments. First, the AI agent must generate the malicious command, often by reading poisoned content embedded within software repositories, build files, or documentation that appears legitimate. While direct requests for destructive commands are typically blocked by AI models, the same commands may still be generated when they arrive as hidden instructions disguised as routine development tasks. Second, the AI coding agent must be operating with automatic command execution enabled or outside a secure container sandbox, configurations that are frequently used in automated software pipelines to reduce manual intervention. The researchers conducted their testing using Claude Sonnet 4.6 and found that the vulnerability affected opencode, Goose, Cline, Roo Code, Aider, Plandex, Open Interpreter, OpenHands, SWE-agent, and Hermes. Adversa also noted that the issue was initially identified within Hermes and is documented in the project’s own issue tracker. Collectively, the tested tools accounted for approximately 548,000 GitHub stars as of May 2026, an indication of their widespread adoption within the developer community.

Continue was the only tested platform that consistently resisted GuardFall because it analyzes commands using the same parsing logic as Bash before determining whether execution should be allowed. This approach enables the platform to inspect the command that will actually run instead of relying solely on the original text entered by the user or generated by the AI. Continue also maintains a strict list of destructive commands that remain blocked even after shell parsing takes place. Although Adversa observed that a limited number of payloads could bypass protections in Continue’s command-line auto-run mode, the most dangerous commands remained blocked. The researchers estimate that implementing a similar protection model in other projects would require approximately two days of work for an experienced engineer. As temporary safeguards, Adversa recommends running AI coding agents with isolated home directories so sensitive files such as SSH keys and cloud credentials remain inaccessible, disabling automatic execution options unless absolutely necessary, preventing agents from processing pull requests submitted from forks, and treating repository configuration files such as .aider.conf.yml as untrusted content. The research follows several AI security findings reported this year, including TrustFall, AutoJack, Agentjacking, and other bypass techniques, all of which show how untrusted text can reach a shell environment before security controls fully understand how Bash will interpret the command.
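The parse-then-check idea can be sketched as follows. This is an illustrative approximation and not Continue's implementation: the DENY and INTERPRETERS sets, the EXPANSION_MARKERS tuple, and the function names are all assumptions, and shlex stands in for a real Bash parser. Because shlex does not evaluate expansions, the sketch falls back to asking the user whenever expansion syntax appears.

```python
import os
import shlex

# Hypothetical policy sets, for illustration only.
DENY = frozenset({"rm", "dd", "mkfs", "shred"})
INTERPRETERS = frozenset({"sh", "bash", "zsh", "eval"})
# Syntax shlex does not evaluate; if present, the final command is unknown.
EXPANSION_MARKERS = ("$", "`", "<(", ">(")


def parsed_words(command: str) -> list[str]:
    """Tokenize roughly as a POSIX shell would, after quote removal."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    lexer.commenters = ""  # conservative: never discard text as a comment
    return list(lexer)


def decide(command: str) -> str:
    """Return 'allow', 'deny', or 'ask' (require human approval)."""
    if any(marker in command for marker in EXPANSION_MARKERS):
        return "ask"
    try:
        words = parsed_words(command)
    except ValueError:  # for example, unbalanced quotes
        return "ask"
    names = {os.path.basename(word) for word in words}
    if names & DENY:
        return "deny"
    if names & INTERPRETERS:
        return "ask"  # text piped into a shell cannot be inspected here
    return "allow"


print(decide('r""m -rf ./build'))  # deny
print(decide("ls -la"))            # allow
print(decide("echo $(whoami)"))    # ask
```

The check is deliberately strict: it matches any parsed word rather than only command positions, so a harmless command such as echo rm is also denied. A production filter would need a full shell parser and would still belong behind the isolation measures listed above.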
