Sunday, April 13, 2025
HomeAIResearchers Jailbreak 17 Popular LLM Models to Reveal Sensitive Data

Researchers Jailbreak 17 Popular LLM Models to Reveal Sensitive Data

Published on

SIEM as a Service

Follow Us on Google News

In a recent study published by Palo Alto Networks’ Threat Research Center, researchers successfully jailbroke 17 popular generative AI (GenAI) web products, exposing vulnerabilities in their safety measures.

The investigation aimed to assess the effectiveness of jailbreaking techniques in bypassing the guardrails of large language models (LLMs), which are designed to prevent the generation of harmful or sensitive content.

Vulnerabilities Exposed

The researchers employed both single-turn and multi-turn strategies to manipulate the LLMs into producing restricted content or leaking sensitive information.

- Advertisement - Google News

Single-turn strategies, such as “storytelling” and “instruction override,” were found to be effective in certain scenarios, particularly for data leakage goals.

However, multi-turn strategies, including “crescendo” and “Bad Likert Judge,” proved more successful in achieving AI safety violations.

LLM Models
Malicious repeated token attack and the response.

These multi-turn approaches often involve gradual escalation of prompts to bypass safety measures, leading to higher success rates in generating harmful content like malware or hateful speech.

The study revealed that all tested GenAI applications were susceptible to jailbreaking in some capacity, with the most vulnerable to multiple strategies.

While single-turn attacks showed moderate success for safety violations, multi-turn strategies significantly outperformed them, achieving success rates up to 54.6% for certain goals.

This disparity highlights the need for robust security measures to counter advanced jailbreaking techniques.

LLM Models
 Overall jailbreak results with single-turn and multi-turn strategies.

Implications

The findings underscore the importance of implementing comprehensive security solutions to monitor and mitigate the risks associated with LLM use.

Organizations can leverage tools like the Palo Alto Networks portfolio to enhance cybersecurity while promoting AI adoption.

The study emphasizes that while most AI models are safe when used responsibly, the potential for misuse necessitates vigilant oversight and the development of more robust safety protocols.

The researchers note that their study focuses on edge cases and does not reflect typical LLM use scenarios.

However, the results provide valuable insights into the vulnerabilities of GenAI applications and the need for ongoing research to improve their security.

As AI technology continues to evolve, addressing these vulnerabilities will be crucial to ensuring the safe and ethical deployment of LLMs in various applications.

Collect Threat Intelligence on the Latest Malware and Phishing Attacks with ANY.RUN TI Lookup -> Try for free

Aman Mishra
Aman Mishra
Aman Mishra is a Security and privacy Reporter covering various data breach, cyber crime, malware, & vulnerability.

Latest articles

Threat Actors Manipulate Search Results to Lure Users to Malicious Websites

Cybercriminals are increasingly exploiting search engine optimization (SEO) techniques and paid advertisements to manipulate...

Hackers Imitate Google Chrome Install Page on Google Play to Distribute Android Malware

Cybersecurity experts have unearthed an intricate cyber campaign that leverages deceptive websites posing as...

Dangling DNS Attack Allows Hackers to Take Over Organization’s Subdomain

Hackers are exploiting what's known as "Dangling DNS" records to take over corporate subdomains,...

HelloKitty Ransomware Returns, Launching Attacks on Windows, Linux, and ESXi Environments

Security researchers and cybersecurity experts have recently uncovered new variants of the notorious HelloKitty...

Resilience at Scale

Why Application Security is Non-Negotiable

The resilience of your digital infrastructure directly impacts your ability to scale. And yet, application security remains a critical weak link for most organizations.

Application Security is no longer just a defensive play—it’s the cornerstone of cyber resilience and sustainable growth. In this webinar, Karthik Krishnamoorthy (CTO of Indusface) and Phani Deepak Akella (VP of Marketing – Indusface), will share how AI-powered application security can help organizations build resilience by

Discussion points


Protecting at internet scale using AI and behavioral-based DDoS & bot mitigation.
Autonomously discovering external assets and remediating vulnerabilities within 72 hours, enabling secure, confident scaling.
Ensuring 100% application availability through platforms architected for failure resilience.
Eliminating silos with real-time correlation between attack surface and active threats for rapid, accurate mitigation

More like this

Threat Actors Manipulate Search Results to Lure Users to Malicious Websites

Cybercriminals are increasingly exploiting search engine optimization (SEO) techniques and paid advertisements to manipulate...

Hackers Imitate Google Chrome Install Page on Google Play to Distribute Android Malware

Cybersecurity experts have unearthed an intricate cyber campaign that leverages deceptive websites posing as...

Dangling DNS Attack Allows Hackers to Take Over Organization’s Subdomain

Hackers are exploiting what's known as "Dangling DNS" records to take over corporate subdomains,...