Security Policies
Build layered policy rules for injection prevention, data loss prevention, excessive agency detection, tool allowlisting, and tool drift monitoring.
1. Prompt Injection Prevention
ShieldAgent uses an ML classifier to detect prompt injection in tool call arguments. Block calls above a confidence threshold:
{
  "tenantId": "<tenant-id>",
  "agentId": null,
  "toolName": "*",
  "action": "deny",
  "conditions": [
    {
      "type": "ml_injection_score_above",
      "threshold": 0.85
    }
  ]
}
A threshold of 0.85 gives a <0.1% false positive rate on production workloads. Lower to 0.70 for higher security; raise to 0.95 to reduce false positives.
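To make the threshold trade-off concrete, the following minimal Python sketch shows how one classifier score would be handled at each of the three thresholds mentioned above. It is an illustration only, not ShieldAgent's implementation: the function names are hypothetical, and it assumes that "above" means a strict greater-than comparison.

```python
def is_denied(injection_score: float, threshold: float) -> bool:
    """Return True when the score is strictly above the threshold.

    Assumption: "above" is a strict comparison. The real service may
    treat a score exactly equal to the threshold differently.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return injection_score > threshold


def decisions(injection_score: float, thresholds=(0.70, 0.85, 0.95)) -> dict:
    """Map each candidate threshold to the resulting decision."""
    return {
        t: "deny" if is_denied(injection_score, t) else "allow"
        for t in thresholds
    }


if __name__ == "__main__":
    # A hypothetical tool call that the classifier scored at 0.80:
    # denied at 0.70, allowed at 0.85 and 0.95.
    print(decisions(0.80))
```

A borderline score like this is where the choice of threshold matters: the lower setting blocks it, while the default and the higher setting let it through.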
2. Data Loss Prevention
Prevent agents from exfiltrating secrets, PII, or financial data through tool calls:
{
  "tenantId": "<tenant-id>",
  "toolName": "send_email",
  "action": "deny",
  "conditions": [
    {
      "type": "param_matches_pattern",
      "param": "arguments.body",
      "pattern": "-----BEGIN (RSA|EC|OPENSSH) PRIVATE KEY-----"
    }
  ]
}
{
  "tenantId": "<tenant-id>",
  "toolName": "write_file",
  "action": "shadow",
  "conditions": [
    {
      "type": "param_contains_pii",
      "sensitivity": "high"
    }
  ]
}
3. Excessive Agency Detection
Catch agents that call destructive tools at abnormal frequency within a session:
{
  "tenantId": "<tenant-id>",
  "toolName": "bash",
  "action": "deny",
  "conditions": [
    {
      "type": "session_call_count_above",
      "threshold": 50,
      "window": "1h"
    }
  ]
}
Agency detection also fires automatically when an agent's excessive agency score exceeds the configured risk threshold; no per-tool policy is required.
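To make the rate rule above concrete, here is a minimal Python sketch of how a `session_call_count_above` condition could be evaluated with a sliding window. It is not ShieldAgent's implementation: the class name, the strict "above" comparison, and the choice to count denied calls toward the total are all assumptions made for illustration.

```python
from collections import deque


class SessionCallCounter:
    """Counts calls per (session, tool) inside a sliding time window."""

    def __init__(self, threshold: int = 50, window_seconds: float = 3600.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        # Maps (session_id, tool_name) to a deque of call timestamps.
        self._calls = {}

    def record(self, session_id: str, tool_name: str, now: float) -> str:
        """Record one call at time `now` (in seconds); return "allow" or "deny"."""
        calls = self._calls.setdefault((session_id, tool_name), deque())
        # Drop timestamps that have fallen out of the window.
        while calls and now - calls[0] >= self.window_seconds:
            calls.popleft()
        # Assumption: denied calls still count toward the total.
        calls.append(now)
        # Assumption: "above" is strict, so call 51 is the first one denied.
        return "deny" if len(calls) > self.threshold else "allow"


if __name__ == "__main__":
    counter = SessionCallCounter(threshold=50, window_seconds=3600)
    results = [counter.record("session-1", "bash", now=float(i)) for i in range(51)]
    print(results[49], results[50])
```

With a threshold of 50, the 50th call in the window is still allowed and the 51st is denied. Once the older calls age out of the one-hour window, new calls are allowed again.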
4. Tool Allowlist Pattern
The safest policy pattern is to explicitly allow only the tools an agent needs and deny everything else. Combine a tenant-wide deny-all with per-agent allow rules:
# Step 1: Tenant-wide deny-all (a policy with no agentId applies to every agent).
# Implicit deny is already the default, so no request is needed here;
# this step is listed for documentation only.
# Step 2: Allow specific tools per agent
for TOOL in read_file list_files search_web; do
  curl -X POST https://api.shieldagent.io/policies \
    -H 'Content-Type: application/json' \
    -H 'Authorization: Bearer <admin-key>' \
    -d "{
      \"tenantId\": \"<tenant-id>\",
      \"agentId\": \"<agent-id>\",
      \"toolName\": \"$TOOL\",
      \"action\": \"allow\"
    }"
done
5. Tool Drift Monitoring
Tool drift occurs when an agent starts calling tools outside its established baseline. Review recent drift events and configure how ShieldAgent responds:
# View tool drift events for an agent
curl "https://api.shieldagent.io/agents/<agent-id>/tool-drift?limit=20" \
  -H 'Authorization: Bearer <admin-key>'
# Block new tools until explicitly approved
curl -X PATCH "https://api.shieldagent.io/agents/<agent-id>" \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer <admin-key>' \
  -d '{"blockToolDiscovery": true}'
Example tool drift event:
{
  "eventType": "tool_drift_detected",
  "agentId": "agt_01HXYZ...",
  "toolName": "delete_database",
  "baselineCallCount": 0,
  "sessionCallCount": 3,
  "driftScore": 0.94,
  "action": "blocked",
  "timestamp": "2026-04-16T14:23:00Z"
}
Policy Templates
ShieldAgent ships with pre-built policy templates for common security scenarios. Apply them via the dashboard or API:
# List available templates
curl https://api.shieldagent.io/policy-templates \
  -H 'Authorization: Bearer <admin-key>'
# Apply the "EU AI Act high-risk agent" template
# (the URL is quoted so the shell does not treat < and > as redirections)
curl -X POST "https://api.shieldagent.io/policy-templates/<template-id>/apply" \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer <admin-key>' \
  -d '{
    "tenantId": "<tenant-id>",
    "agentId": "<agent-id>"
  }'
Available templates:
OWASP Top 10 for LLMs: Covers the OWASP LLM risks, including injection, data poisoning, and supply chain
EU AI Act High-Risk: Full Annex IV evidence collection + human review triggers
Minimal footprint: Read-only tool allowlist + strict session limits
DevOps agent: Controlled bash/git access with destructive command blocklist
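For teams that apply templates from scripts rather than the shell, the apply call from this section can be sketched in Python using only the standard library. The endpoint path, headers, and body fields are taken from the curl example; the function name and the sample identifiers are hypothetical, and the response format is not covered here, so the sketch builds the request without sending it.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.shieldagent.io"  # base URL from the curl examples


def build_apply_template_request(template_id, tenant_id, agent_id, admin_key):
    """Build (but do not send) the POST request that applies a template."""
    body = json.dumps({"tenantId": tenant_id, "agentId": agent_id}).encode("utf-8")
    # Percent-encode the template ID so it is safe as a single path segment.
    safe_id = urllib.parse.quote(template_id, safe="")
    return urllib.request.Request(
        url=f"{API_BASE}/policy-templates/{safe_id}/apply",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {admin_key}",
        },
    )


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    req = build_apply_template_request("tpl_example", "tenant_1", "agent_1", "admin-key")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the request. Keeping construction separate from sending makes the payload easy to inspect or unit test first.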