LLM Red Teaming with the STAR Framework: Structured Threat Assessment for AI

Most LLM red teaming is informal — a security researcher tries prompts until they find something interesting. This approach finds obvious issues but misses systematic vulnerabilities. The STAR (Systematic Threat Assessment and Review) framework provides a structured methodology that maps the full LLM attack surface before testing begins.

The STAR Framework: Four Threat Domains

Domain 1: Model Behavior (S — Safety and Alignment)

▸Content policy bypass: direct instruction following that violates safety guidelines
▸Jailbreaking via roleplay, encoding, and narrative framing
▸Many-shot override of safety training
▸System prompt extraction and inversion attacks
▸Competing objectives exploitation

Domain 2: Tool and Action Layer (T — Tool Security)

▸Tool namespace collision and hijacking
▸Privilege escalation via tool chaining
▸Indirect prompt injection through tool outputs
▸Tool argument injection and format confusion
▸Unauthorized tool invocation via prompt manipulation

Domain 3: Application Infrastructure (A — API and Auth)

▸API parameter smuggling and injection
▸Authentication bypass and session manipulation
▸Rate limit bypass and quota abuse
▸Tenant isolation and data boundary violations
▸Response integrity attacks on streaming channels

Domain 4: Retrieval and Context (R — RAG and Memory)

▸RAG poisoning via malicious document injection
▸Embedding space attacks that manipulate retrieval relevance
▸Memory persistence attacks and belief injection
▸Context window manipulation and priority override
▸Cross-user memory leakage via retrieval

Running a STAR Assessment

1.Scope: define which STAR domains apply to your system — not all systems have all four attack surfaces
2.Asset inventory: enumerate all models, tools, APIs, and data stores in scope
3.Threat mapping: for each domain, map the specific threats to your concrete implementation
4.Prioritization: score each threat by likelihood and impact — focus testing effort on high-priority combinations
5.Testing: execute structured test cases for each prioritized threat
6.Documentation: record findings with clear reproduction steps and severity ratings
7.Remediation tracking: create tickets for each finding with clear owner and deadline

G8KEPR can serve as a monitoring layer during red team exercises. Enabling verbose logging during assessments gives you a precise record of which attack patterns triggered detections and which slipped through.

ShareX / Twitter LinkedIn

LLM Red Teaming with the STAR Framework: Structured Threat Assessment for AI

The STAR Framework: Four Threat Domains

Domain 1: Model Behavior (S — Safety and Alignment)

Domain 2: Tool and Action Layer (T — Tool Security)

Domain 3: Application Infrastructure (A — API and Auth)

Domain 4: Retrieval and Context (R — RAG and Memory)

Running a STAR Assessment

Related Articles

G8KEPR Red Team Run 4: What We Found and What We Fixed

MCP Security in 2026: How to Sandbox AI Tool Calls

What Is Model Context Protocol (MCP) and Why Does It Need Security?

Ready to secure your AI stack?