Sunday, April 13, 2025
HomeCyber Security NewsChatGPT Operator Prompt Injection Exploit Leaks Private Data

ChatGPT Operator Prompt Injection Exploit Leaks Private Data

Published on

SIEM as a Service

Follow Us on Google News

According to recent findings by cybersecurity researcher Johann Rehberger, OpenAI’s ChatGPT Operator, an experimental agent designed to automate web-based tasks, faces critical security risks from prompt injection attacks that could expose users’ private data.

In a demonstration shared exclusively with OpenAI last month, Rehberger showcased how malicious actors could hijack the AI agent to extract sensitive personal information—including email addresses, phone numbers, and physical addresses—from authenticated accounts. 

The exploits highlight vulnerabilities in AI-driven automation tools, even as companies like OpenAI implement multi-layered safeguards to reduce risks.

- Advertisement - Google News

Exploit Methodology and Demonstration

Rehberger’s attack exploited ChatGPT Operator’s tendency to follow hyperlinks and interact with text fields without rigorous scrutiny.

He tricked the AI into navigating to a third-party webpage designed to capture keystrokes in real-time by hosting a prompt injection payload on a GitHub issue. 

Hosting the Prompt Injection Payload
Hosting the Prompt Injection Payload

This “sneaky data leakage” tool transmitted typed information directly to an attacker-controlled server without requiring form submissions or button clicks, bypassing OpenAI’s confirmation protocols.

Shows some of the key step for YC Hacker News demo
Shows some of the key step for YC Hacker News demo

In one test, the agent accessed a Y Combinator Hacker News account’s private settings page, copied the user’s admin-only email address, and pasted it into the malicious text field—sending the data to Rehberger’s server. 

Operator grabs the user’s email address (which is not public)
Operator grabs the user’s email address (which is not public)

A similar attack targeted Booking.com, leaking a user’s home address and contact details. While these demonstrations relied on authenticated sessions, they underscore the risks of AI agents interacting with sensitive websites while logged in.

Personal Details Exposed
Personal Details Exposed

OpenAI’s Mitigation Strategies

OpenAI’s current defenses include user monitoring prompts, which alert users to scrutinize the agent’s actions; inline confirmation requests before critical steps like button clicks; and out-of-band confirmations for cross-website operations. 

However, Rehberger noted inconsistencies in these safeguards. During initial tests, the agent sometimes executed actions like updating a user’s status without seeking approval.

The company also employs backend “prompt injection monitoring” systems that analyze HTTP traffic for suspicious patterns1. Yet, these probabilistic defenses often trigger only at the final stage of an attack, focusing on blocking harmful actions rather than preventing the initial exploitation. 

As Rehberger observed, “Mitigations reduce but don’t eliminate risks. Agents could become akin to malicious insiders in corporate environments”.

This research amplifies longstanding concerns about prompt injection vulnerabilities, which remain a “party pooper” for autonomous AI systems. 

While tools like ChatGPT Operator promise efficiency in tasks like travel booking or grocery shopping, their inability to reliably distinguish adversarial instructions poses existential challenges.

Rehberger argues that until robust solutions emerge, human-AI collaboration will require continuous oversight.

OpenAI has not disclosed specific countermeasures following Rehberger’s disclosure but emphasizes ongoing improvements to its threat detection models.

Independent experts urge platforms to consider embedding AI-specific identifiers—such as unique user-agent strings—to help websites block unauthorized agent access. 

For now, users are advised to restrict ChatGPT Operator’s access to sensitive accounts and monitor its activity closely during high-risk operations.

Investigate Real-World Malicious Links & Phishing Attacks With Threat Intelligence Lookup - Try for Free

Divya
Divya
Divya is a Senior Journalist at GBhackers covering Cyber Attacks, Threats, Breaches, Vulnerabilities and other happenings in the cyber world.

Latest articles

Threat Actors Manipulate Search Results to Lure Users to Malicious Websites

Cybercriminals are increasingly exploiting search engine optimization (SEO) techniques and paid advertisements to manipulate...

Hackers Imitate Google Chrome Install Page on Google Play to Distribute Android Malware

Cybersecurity experts have unearthed an intricate cyber campaign that leverages deceptive websites posing as...

Dangling DNS Attack Allows Hackers to Take Over Organization’s Subdomain

Hackers are exploiting what's known as "Dangling DNS" records to take over corporate subdomains,...

HelloKitty Ransomware Returns, Launching Attacks on Windows, Linux, and ESXi Environments

Security researchers and cybersecurity experts have recently uncovered new variants of the notorious HelloKitty...

Resilience at Scale

Why Application Security is Non-Negotiable

The resilience of your digital infrastructure directly impacts your ability to scale. And yet, application security remains a critical weak link for most organizations.

Application Security is no longer just a defensive play—it’s the cornerstone of cyber resilience and sustainable growth. In this webinar, Karthik Krishnamoorthy (CTO of Indusface) and Phani Deepak Akella (VP of Marketing – Indusface), will share how AI-powered application security can help organizations build resilience by

Discussion points


Protecting at internet scale using AI and behavioral-based DDoS & bot mitigation.
Autonomously discovering external assets and remediating vulnerabilities within 72 hours, enabling secure, confident scaling.
Ensuring 100% application availability through platforms architected for failure resilience.
Eliminating silos with real-time correlation between attack surface and active threats for rapid, accurate mitigation

More like this

Threat Actors Manipulate Search Results to Lure Users to Malicious Websites

Cybercriminals are increasingly exploiting search engine optimization (SEO) techniques and paid advertisements to manipulate...

Hackers Imitate Google Chrome Install Page on Google Play to Distribute Android Malware

Cybersecurity experts have unearthed an intricate cyber campaign that leverages deceptive websites posing as...

Dangling DNS Attack Allows Hackers to Take Over Organization’s Subdomain

Hackers are exploiting what's known as "Dangling DNS" records to take over corporate subdomains,...