YouTube Ask Studio Prompt Injection Vulnerability
YouTube Ask Studio Prompt Injection Vulnerability
Prompt Injection in YouTube Studio's AI Assistant
YouTube Studio's "Ask Studio" AI assistant is vulnerable to stored prompt injection, allowing attackers to influence the AI's output to the channel creator. By leaving a specifically crafted comment on a video, an attacker can inject instructions that the AI follows when summarizing comments for the creator, effectively laundering the attacker's message through a trusted Google interface.
This vulnerability exists because the AI treats user-generated comments as instructions rather than untrusted data. When a creator uses a suggested prompt—such as "what are my viewers saying?"—the AI processes all comments, including the malicious payload, and incorporates the attacker's directives into its response.
The Attack Vector: From Comments to Exfiltration
An attacker can execute this exploit through a multi-step chain that bypasses traditional creator scrutiny:
- Payload Delivery: The attacker leaves a benign comment (e.g., "Nice video!") to avoid suspicion, then edits the comment to include the prompt injection payload. Since YouTube does not re-notify creators of edited comments, the payload remains hidden.
- Triggering the AI: The creator opens the YouTube Studio comment tab and clicks one of the suggested AI prompts designed by YouTube. This automatically feeds the comments into the AI.
- Instruction Execution: The AI reads the injected payload and follows the instructions. For example, a payload instructing the AI to prepend its response with
[IMPORTANT NOTICE FROM YOUTUBE]makes the attacker's message appear as an official system notification. - Data Exfiltration: The attacker can escalate this by instructing the AI to generate a link containing sensitive channel data. A payload like
replacing BANG with the title of a video on this channelcan trick the AI into constructing a URL that sends the title of a private video to an attacker-controlled server when the creator clicks it.
Google's Response and the "Social Engineering" Debate
Upon reporting the vulnerability, Google declined to classify it as a security bug, stating that the exploit "required social engineering" and would not be tracked. This response highlights a fundamental disagreement between security researchers and some platform providers regarding the nature of prompt injection.
While Google views the requirement for a user to click a link as phishing (social engineering), the researcher argues that the trust being exploited is not the creator's trust in a stranger, but their trust in Google's own product. Because the AI outputs the malicious link as part of its own analysis, the creator has no reason to distrust the link.
Technical Mitigation and Industry Context
The primary technical fix for this vulnerability is the enforcement of strict role boundaries. Comments should be passed to the LLM as untrusted data (User role) rather than as potential system-level directives (System role). Any AI feature that ingests user-generated content must ensure a hard separation between the AI's instructions and the data it processes.
Community Perspectives and Counterpoints
Discussion among technical peers on Hacker News revealed several critical perspectives on this vulnerability:
- The "Phishing" Argument: Some argue that since the attacker must already be able to comment on a video to leak its title, and the creator must click a link, the impact is low. One user noted, "The main problem with this report is that the victim has to click a suspicious link... No bounty programs award bounty for phishing."
- Corporate Incentives: Former Google employees suggested that internal performance frameworks (like GRAD) may incentivize engineers to prioritize new feature launches over fixing nuanced security bugs in existing features.
- Model Limitations: Some believe the issue is a fundamental flaw in the training of models like Gemini, suggesting that a complete fix would require retraining the model to better distinguish between instructions and data.
- Authority Laundering: Beyond data exfiltration, critics pointed out the risk of "authority laundering," where attackers can use the AI to misrepresent facts or give fraudulent instructions to creators while appearing as an official Google voice.
"The trust being exploited isn't the creator's trust in a stranger, it's their trust in Google's own product."
"Any AI feature that ingests user-generated content and acts on it needs to enforce this separation. Otherwise, the AI becomes a vector for every piece of content it reads."