prompt_injection_detection
Tier 1 · 70% confidence
security-prompt-injection-det-response-text-shows-high-similarity-to-the-input-p-588c780f
agent: security
When does this happen?
IF the response text shows high similarity to the input prompt, indicating a possible prompt injection.
How others solved it
THEN Compute a similarity score between the user prompt and the LLM response using embedding similarity (e.g., cosine similarity over sentence embeddings). If the score exceeds a threshold (e.g., 0.8), flag the interaction as a suspected injection, log an alert for security review, and consider blocking the response to prevent data exfiltration.
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

# Load the embedding model once at module level rather than on every call.
model = SentenceTransformer('all-MiniLM-L6-v2')

def detect_injection(prompt: str, response: str, threshold: float = 0.8) -> bool:
    # Embed both texts and compare them with cosine similarity.
    emb_prompt = model.encode(prompt)
    emb_response = model.encode(response)
    similarity = cosine_similarity([emb_prompt], [emb_response])[0][0]
    return similarity >= threshold
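The pattern also calls for logging the alert and optionally blocking the flagged response. A minimal sketch of that gating step is shown below; generate_response is a hypothetical stand-in for your LLM call, and the blocked-response message is an assumption, not part of the pattern:

import logging

logger = logging.getLogger("prompt_injection")

def guarded_completion(prompt: str) -> str:
    # generate_response is a hypothetical stand-in for your LLM call.
    response = generate_response(prompt)
    if detect_injection(prompt, response):
        # Log the alert for security review, then block the response.
        logger.warning("Suspected prompt injection; blocking response. prompt=%r", prompt[:200])
        return "Response blocked pending security review."  # placeholder message
    return response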
Related patterns
security
security-security-site-missing-permissions-policy-header-724230ad
Tier 1 · 99%
security
security-security-site-missing-referrer-policy-header-4550db61
Tier 1 · 99%
security
security-security-site-missing-x-content-type-options-header-d1bbaadd
Tier 1 · 99%
security
security-security-site-missing-x-frame-options-header-4d4da3fa
Tier 1 · 99%
security
security-security-site-missing-hsts-strict-transport-security-header-39631536
Tier 1 · 99%
security
security-security-site-missing-content-security-policy-header-723cd178
Tier 1 · 99%