Friday, February 21, 2025
Homecyber securityHackers Compromised ChatGPT Model with Indirect Prompt Injection

Hackers Compromised ChatGPT Model with Indirect Prompt Injection

Published on

SIEM as a Service

Follow Us on Google News

ChatGPT quickly gathered more than 100 million users just after its release, and the ongoing trend includes newer models like the advanced GPT-4 and several other smaller versions.

LLMs are now widely used in a multitude of applications, but flexible modulation through natural prompts creates vulnerability. As this flexibility makes them vulnerable to targeted adversarial attacks like Prompt Injection attacks letting attackers bypass instructions and controls.

Beyond direct user prompts, LLM-Integrated Apps blur the data instruction line. Indirect Prompt Injection lets adversaries exploit apps remotely by injecting prompts into retrievable data.

Recently at the Black Hat event, the following cybersecurity researchers demoed how they compromised the chatGPT model with indirect prompt injection:-

  • Kai Greshake from Saarland University and Sequire Technology GmbH
  • Sahar Abdelnabi from CISPA Helmholtz Center for Information Security
  • Shailesh Mishra from Saarland University
  • Christoph Endres from Sequire Technology GmbH
  • Thorsten Holz from CISPA Helmholtz Center for Information Security
  • Mario Fritz from CISPA Helmholtz Center for Information Security
LLM-integrated applications (Source – BlackHat)

Indirect Prompt Injection

Indirect Prompt Injection challenges LLMs, blurring data-instruction lines, as adversaries can remotely manipulate systems via injected prompts. 

Retrieval of such prompts indirectly controls the models, raising concerns about recent incidents revealing behaviors that are unwanted.

This shows that how adversaries could deliberately alter LLM behavior in apps, impacting millions of users. 

The unknown attack vector brings diverse threats, prompting the development of a comprehensive taxonomy to assess these vulnerabilities from a security perspective.

Indirect prompt injection threats to LLM-integrated applications (Source – BlackHat)

The Prompt injection (PI) attacks threaten LLM security, and traditionally on individual instances, integrating LLMs exposes them to untrusted data and new threats ‘indirect prompt injections.’ 

The introduction of ‘indirect prompt injections,’ could enable the delivery of targeted payloads and breach of the security boundaries with a single search query.

Injection Methods

Here below we have mentioned all the injection methods that are identified by the researchers:-

  • Passive Methods
  • Active Methods
  • User-Driven Injections
  • Hidden Injections

Mitigations

LLMs spark broad ethical concerns, heightened with their widespread use in applications. Researchers responsibly disclosed ‘indirect prompt injection’ vulnerabilities to OpenAI and Microsoft. 

However, apart from this, from the security standpoint, the novelty is debatable, given LLMs’ prompt sensitivity.

GPT-4 aimed to curb jailbreaks with safety-oriented RLHF intervention. Real-world attacks continue despite fixes, resembling a “Whack-A-Mole” pattern. 

The impact of RLHF on attacks remains uncertain; theoretical work questions full defense. The practical interplay between attacks, defenses, and their implications remains unclear.

RLHF and undisclosed real-world app defenses can counter attacks. Bing Chat’s success with additional filtering raises questions about evasion with stronger obfuscation or encoding in future models.

The defenses like input processing to filter instructions raise difficulties. Balancing less general models to avoid traps and complex input detection is challenging. 

As the Base64 encoding experiment needed explicit instructions, future models may automate the decoding with self-encoded prompts.

Keep informed about the latest Cyber Security News by following us on GoogleNews, Linkedin, Twitter, and Facebook.

Gurubaran
Gurubaran
Gurubaran is a co-founder of Cyber Security News and GBHackers On Security. He has 10+ years of experience as a Security Consultant, Editor, and Analyst in cybersecurity, technology, and communications.

Latest articles

SPAWNCHIMERA Malware Exploits Ivanti Buffer Overflow Vulnerability by Applying a Critical Fix

In a recent development, the SPAWNCHIMERA malware family has been identified exploiting the buffer...

Sitevision Auto-Generated Password Vulnerability Lets Hackers Steal Signing Key

A significant vulnerability in Sitevision CMS, versions 10.3.1 and earlier, has been identified, allowing...

NSA Allegedly Hacked Northwestern Polytechnical University, China Claims

Chinese cybersecurity entities have accused the U.S. National Security Agency (NSA) of orchestrating a...

ACRStealer Malware Abuses Google Docs as C2 to Steal Login Credentials

The ACRStealer malware, an infostealer disguised as illegal software such as cracks and keygens,...

Supply Chain Attack Prevention

Free Webinar - Supply Chain Attack Prevention

Recent attacks like Polyfill[.]io show how compromised third-party components become backdoors for hackers. PCI DSS 4.0’s Requirement 6.4.3 mandates stricter browser script controls, while Requirement 12.8 focuses on securing third-party providers.

Join Vivekanand Gopalan (VP of Products – Indusface) and Phani Deepak Akella (VP of Marketing – Indusface) as they break down these compliance requirements and share strategies to protect your applications from supply chain attacks.

Discussion points

Meeting PCI DSS 4.0 mandates.
Blocking malicious components and unauthorized JavaScript execution.
PIdentifying attack surfaces from third-party dependencies.
Preventing man-in-the-browser attacks with proactive monitoring.

More like this

SPAWNCHIMERA Malware Exploits Ivanti Buffer Overflow Vulnerability by Applying a Critical Fix

In a recent development, the SPAWNCHIMERA malware family has been identified exploiting the buffer...

Sitevision Auto-Generated Password Vulnerability Lets Hackers Steal Signing Key

A significant vulnerability in Sitevision CMS, versions 10.3.1 and earlier, has been identified, allowing...

NSA Allegedly Hacked Northwestern Polytechnical University, China Claims

Chinese cybersecurity entities have accused the U.S. National Security Agency (NSA) of orchestrating a...