Most LLM red teaming is informal — a security researcher tries prompts until they find something interesting. This approach finds obvious issues but misses systematic vulnerabilities. The STAR (Systematic Threat Assessment and Review) framework provides a structured methodology that maps the full LLM attack surface before testing begins.
The STAR Framework: Four Threat Domains
Domain 1: Model Behavior (S — Safety and Alignment)
- ▸Content policy bypass: direct instruction following that violates safety guidelines
- ▸Jailbreaking via roleplay, encoding, and narrative framing
- ▸Many-shot override of safety training
- ▸System prompt extraction and inversion attacks
- ▸Competing objectives exploitation
Domain 2: Tool and Action Layer (T — Tool Security)
- ▸Tool namespace collision and hijacking
- ▸Privilege escalation via tool chaining
- ▸Indirect prompt injection through tool outputs
- ▸Tool argument injection and format confusion
- ▸Unauthorized tool invocation via prompt manipulation
Domain 3: Application Infrastructure (A — API and Auth)
- ▸API parameter smuggling and injection
- ▸Authentication bypass and session manipulation
- ▸Rate limit bypass and quota abuse
- ▸Tenant isolation and data boundary violations
- ▸Response integrity attacks on streaming channels
Domain 4: Retrieval and Context (R — RAG and Memory)
- ▸RAG poisoning via malicious document injection
- ▸Embedding space attacks that manipulate retrieval relevance
- ▸Memory persistence attacks and belief injection
- ▸Context window manipulation and priority override
- ▸Cross-user memory leakage via retrieval
Running a STAR Assessment
- 1.Scope: define which STAR domains apply to your system — not all systems have all four attack surfaces
- 2.Asset inventory: enumerate all models, tools, APIs, and data stores in scope
- 3.Threat mapping: for each domain, map the specific threats to your concrete implementation
- 4.Prioritization: score each threat by likelihood and impact — focus testing effort on high-priority combinations
- 5.Testing: execute structured test cases for each prioritized threat
- 6.Documentation: record findings with clear reproduction steps and severity ratings
- 7.Remediation tracking: create tickets for each finding with clear owner and deadline
G8KEPR can serve as a monitoring layer during red team exercises. Enabling verbose logging during assessments gives you a precise record of which attack patterns triggered detections and which slipped through.
