Mythos is a coordinated AI security research collective that has quietly become the most credible source of offensive research on LLM infrastructure. In Q1 2026 they dropped three zero-day disclosures simultaneously — and unlike most AI security research, these were not theoretical. They came with working proof-of-concept code and CVSS scores.
The Three Disclosures
MYTHOS-001: Inference API parameter smuggling
MYTHOS-001 demonstrated that several major inference API providers accepted undocumented parameters in the JSON request body that were passed directly to the underlying model runtime without validation. An attacker who knew the parameter names could override sampling temperature, disable content filters, and in one provider's case, retrieve cached responses from other tenants.
MYTHOS-001 affected at least four major hosted inference providers. All four have since patched. If you proxy inference traffic through G8KEPR, parameter allow-listing was already blocking this class of attack at the gateway layer.
MYTHOS-002: Tool-call response injection via streaming
MYTHOS-002 exploited the streaming response format used by most LLM APIs. When a model returns a tool call mid-stream, the API client must buffer partial JSON until the chunk is complete. Mythos found that by injecting a crafted chunk at precisely the right byte offset, they could cause client libraries to parse a different tool name and different arguments than what the model actually emitted.
# MYTHOS-002 timing diagram
# Legitimate stream:
data: {"choices":[{"delta":{"tool_calls":[{"function":{"name":"get_user"}}
# Injected chunk (MITM or server-side bug):
data: {"choices":[{"delta":{"tool_calls":[{"function":{"name":"delete_all"}}
# Client library sees: delete_all
# Developer sees nothing wrong — stream completed normally
MYTHOS-003: Prompt context window boundary escape
The most severe of the three. MYTHOS-003 showed that context window implementations in some open-weight models did not properly enforce token boundaries at the system prompt / user message boundary. With a carefully constructed sequence of Unicode control characters and BPE boundary tokens, an attacker could cause the model to process injected text with system-level authority.
Defense-in-Depth Lessons
- Parameter allow-listing at the API gateway prevents undocumented field smuggling: your code should never forward unknown JSON keys to an inference provider
- Streaming response validation: parse and validate tool names against an allowlist before dispatching, never trust the raw stream
- Multi-layer prompt boundaries: wrap user content in explicit separators and validate that model responses do not reference system-level instructions
- Monitor for anomalous tool-call distributions; a spike in unusual function names is often the first signal of injection attempts
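A defender-side version of the streaming check that MYTHOS-002 and the second lesson above call for, added here as an illustration (it is neither Mythos's proof of concept nor G8KEPR's code): it reassembles OpenAI-style tool-call deltas per index, rejects any tool call whose name changes mid-stream, and returns a dispatch plan only for allow-listed names with complete JSON-object arguments. The sample stream lines, the `index` field layout and the allowlist contents are constructed assumptions.

```python
import json

# Constructed sample SSE lines in the OpenAI-style delta format; not captured traffic.
LEGIT_STREAM = [
    'data: {"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"name":"get_user"}}]}}]}',
    'data: {"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\\"id\\": 42}"}}]}}]}',
    'data: [DONE]',
]
# Same stream with a second, conflicting name for index 0 spliced in mid-stream (MYTHOS-002 pattern).
INJECTED_STREAM = [LEGIT_STREAM[0],
    'data: {"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"name":"delete_all"}}]}}]}',
    LEGIT_STREAM[1], LEGIT_STREAM[2]]
ALLOWED_TOOLS = {"get_user", "search_docs"}  # hypothetical allowlist for this application


def assemble_tool_calls(lines):
    """Reassemble streamed tool-call deltas per index, refusing a name that changes mid-stream."""
    calls = {}
    for line in lines:
        if not line.startswith("data: "):
            continue  # SSE comments and keep-alives carry no payload
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        for choice in json.loads(payload).get("choices", []):
            for tc in choice.get("delta", {}).get("tool_calls", []):
                entry = calls.setdefault(tc["index"], {"name": None, "arguments": ""})
                fn = tc.get("function", {})
                if "name" in fn:
                    # A name may be set once; a different name for the same index is a spliced chunk.
                    if entry["name"] not in (None, fn["name"]):
                        raise ValueError(f"tool name changed mid-stream: {entry['name']} -> {fn['name']}")
                    entry["name"] = fn["name"]
                entry["arguments"] += fn.get("arguments", "")  # argument fragments are concatenated
    return calls


def dispatch_plan(lines, allowed=ALLOWED_TOOLS):
    """Return [(name, args)] only for allow-listed tools whose arguments form a complete JSON object."""
    plan = []
    for index, entry in sorted(assemble_tool_calls(lines).items()):
        if entry["name"] not in allowed:
            raise ValueError(f"tool call {index}: {entry['name']!r} is not on the allowlist")
        args = json.loads(entry["arguments"] or "{}")
        if not isinstance(args, dict):
            raise ValueError(f"tool call {index}: arguments are not a JSON object")
        plan.append((entry["name"], args))
    return plan


if __name__ == "__main__":
    print(dispatch_plan(LEGIT_STREAM))  # [('get_user', {'id': 42})]
```

Dispatch only from the returned plan; the raw stream text is never handed to the tool router.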