
Study reveals poetic prompts could jailbreak AI



Well, AI is joining the ranks of many, many people: It doesn’t really understand poetry.

Research from Italy’s Icaro Lab found that poetry can be used to jailbreak AI and skirt safety protections.

In the study, researchers wrote 20 prompts in Italian and English, each opening with a short poetic vignette and ending with a single explicit instruction to produce harmful content. They tested these prompts on 25 large language models from Google, OpenAI, Anthropic, DeepSeek, Qwen, Mistral AI, Meta, xAI, and Moonshot AI. The researchers said the poetic prompts often worked.

“Poetic framing achieved an average jailbreak success rate of 62% for hand-crafted poems and approximately 43% for meta-prompt conversions (compared to non-poetic baselines), substantially outperforming non-poetic baselines and revealing a systematic vulnerability across model families and safety training approaches,” the study reads. “These findings demonstrate that stylistic variation alone can circumvent contemporary safety mechanisms, suggesting fundamental limitations in current alignment methods and evaluation protocols.”


Of course, the jailbreaks did not work equally well across the different LLMs. OpenAI’s GPT-5 nano didn’t respond with harmful or unsafe content at all, while Google’s Gemini 2.5 Pro responded with harmful or unsafe content every single time, the researchers reported.

The researchers concluded that “these findings expose a significant gap” in benchmark safety tests and regulatory efforts such as the EU AI Act.

“Our results show that a minimal stylistic transformation can reduce refusal rates by an order of magnitude, indicating that benchmark-only evidence may systematically overstate real-world robustness,” the paper stated.

Great poetry is not literal — and LLMs are literal to the point of frustration. The study reminds me of how it feels to listen to Leonard Cohen’s song “Alexandra Leaving,” which is based on C.P. Cavafy’s poem “The God Abandons Antony.” We know it’s about loss and heartbreak, but it would be a disservice to the song and the poem it’s based on to try to “get it” in any literal sense — and that’s what LLMs will try to do.


Disclosure: Ziff Davis, Mashable’s parent company, in April filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.


