How attackers are jailbreaking LLMs with CTF framing and how to catch them

June 16, 2026, 11:48 a.m.

Description

Threat actors are bypassing AI model safety guardrails by framing exploit requests as legitimate security research, such as capture-the-flag (CTF) challenges or CVE-hunting exercises. The framing manipulates upstream LLMs into generating working exploit code, which the attackers then deploy against real targets. Multiple independent operators have been observed targeting five applications (PraisonAI, LiteLLM, FastGPT, Open-WebUI, and Gotenberg) with CVE-templated User-Agent strings, and the same framing appears in other fields, including passwords and AWS session names. The jailbreak framing leaks into every LLM-generated field because the model carries the prompt context into its output. This pattern marks a shift from manually written scanners to LLM-assisted exploit generation, and it leaves detectable fingerprints in request headers, account aliases, and IAM session names that legitimate traffic rarely exhibits.
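
The fingerprint described above could be hunted for with a simple keyword check across the affected fields. The following is a minimal sketch, not a vetted detection rule: this advisory does not publish the exact strings observed, so the keyword list, the field names (user_agent, password, aws_session_name), and the placeholder CVE identifier are all assumptions chosen for illustration.

```python
import re

# ASSUMPTION: illustrative keywords only. The advisory does not list the exact
# strings seen in the wild, so tune this pattern against your own telemetry.
FRAMING_PATTERN = re.compile(
    r"cve-\d{4}-\d{4,}"                      # CVE-templated identifiers
    r"|(?<![a-z0-9])ctf(?![a-z0-9])"         # "ctf" as a standalone token
    r"|capture[-_ ]?the[-_ ]?flag"
    r"|security[-_ ]?research",
    re.IGNORECASE,
)


def find_framing(fields: dict[str, str]) -> dict[str, list[str]]:
    """Return the fields that contain research/CTF framing and the matched text.

    `fields` maps a hypothetical field name (for example "user_agent") to the
    raw value taken from a request log or a CloudTrail-style record.
    """
    hits: dict[str, list[str]] = {}
    for name, value in fields.items():
        matches = [m.group(0) for m in FRAMING_PATTERN.finditer(value or "")]
        if matches:
            hits[name] = matches
    return hits


if __name__ == "__main__":
    # Hypothetical record; CVE-0000-00000 is a placeholder, not a real CVE.
    record = {
        "user_agent": "CVE-0000-00000-scanner/1.0",
        "password": "hunter2",
        "aws_session_name": "ctf-exploit-session",
    }
    print(find_framing(record))
```

Because the framing reportedly appears in several fields at once, a match in two or more fields of the same session is likely a stronger signal than a single hit, which could come from a legitimate scanner or researcher.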

Indicators

  • 115.171.80.253
  • 38.181.81.164
  • 68.77.201.89
  • 212.107.30.69
  • 103.142.140.246
  • 103.142.140.238
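
As a starting point for triage, the addresses listed above can be matched against source IPs in access logs. This is a minimal sketch: the function name and the idea of passing in a bare source-address string are assumptions about the log pipeline, and an IP match on its own is weak evidence because addresses are reassigned and shared.

```python
import ipaddress

# Indicator addresses copied from the list in this advisory.
INDICATOR_IPS = frozenset(
    ipaddress.ip_address(ip)
    for ip in (
        "115.171.80.253",
        "38.181.81.164",
        "68.77.201.89",
        "212.107.30.69",
        "103.142.140.246",
        "103.142.140.238",
    )
)


def is_indicator(source_ip: str) -> bool:
    """Return True if source_ip parses as an address on the indicator list.

    Values that are not valid IP addresses are treated as non-matching rather
    than raising, since log fields are often malformed.
    """
    try:
        return ipaddress.ip_address(source_ip.strip()) in INDICATOR_IPS
    except ValueError:
        return False


if __name__ == "__main__":
    # 192.0.2.1 is from a documentation-only range and should not match.
    for candidate in ("103.142.140.238", "192.0.2.1", "not-an-ip"):
        print(candidate, is_indicator(candidate))
```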

Linked vulnerabilities