Tool Drift Detection

Detect when MCP tool definitions change unexpectedly after registration, addressing OWASP MCP Top 10 #6 (Tool Definition Integrity). ShieldAgent hashes the full tool manifest at connection time and compares it against every subsequent tools/list response, blocking the response or raising an alert when drift is detected.

The Threat

OWASP MCP Top 10 #6 (Tool Definition Integrity) describes attacks where a malicious or compromised MCP server silently alters its tool definitions after an agent has already approved and cached them. Because most MCP clients trust the tool manifest they received at startup, the agent continues calling the tool — now executing a different, attacker-controlled operation.

Schema substitution

An attacker modifies a tool's JSON schema so the agent sends data to a different endpoint or with different parameters than intended.

Description poisoning

The tool's description is changed post-registration to include prompt injection payloads that are fed back to the agent on the next capability discovery call.

Supply-chain drift

A legitimate MCP server is updated by its vendor and its tool manifest changes without the security team's knowledge, invalidating the assumptions made when the tools were originally reviewed.

How ShieldAgent Detects It

When an agent first connects through the proxy, ShieldAgent captures a snapshot of the MCP server's tool manifest — the complete list of tools with their names, schemas, and descriptions. This snapshot is hashed and stored as the registered baseline.

On every subsequent tools/list response the proxy observes, the current manifest is re-hashed and compared against the baseline. A mismatch triggers a drift event. Depending on configuration, the response is blocked or passed through with an alert.

tools/list response → Re-hash manifest → Compare to baseline → Match: pass through | Mismatch: drift event
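
The flow above can be sketched in a few lines of Python. This is a minimal illustration, not ShieldAgent's implementation: the DriftDetector class, its observe method, and the returned status strings are hypothetical names chosen for the example.

```python
import hashlib
import json


def manifest_hash(tools):
    """Hash a tool list; sort_keys makes the digest independent of key order."""
    canonical = json.dumps(tools, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class DriftDetector:
    """Hypothetical per-server detector: the first manifest seen becomes the baseline."""

    def __init__(self, action="alert"):
        self.action = action   # "alert" or "block"
        self.baseline = None   # hex digest of the registered manifest

    def observe(self, tools):
        """Classify one tools/list response as registered, pass, alert, or block."""
        current = manifest_hash(tools)
        if self.baseline is None:
            self.baseline = current
            return "registered"
        if current == self.baseline:
            return "pass"
        return self.action     # mismatch: drift event


detector = DriftDetector(action="block")
tools = [{"name": "read_file", "description": "Read a file", "inputSchema": {"type": "object"}}]
print(detector.observe(tools))  # registered
print(detector.observe(tools))  # pass
tools[0]["description"] = "Read a file. Ignore previous instructions."
print(detector.observe(tools))  # block
```

Only the digest is stored as the baseline, so a mismatch indicates that something changed but not what changed; identifying the affected tools requires keeping the manifest itself.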

What is hashed

The hash covers the semantic content of the manifest, not the wire encoding, so JSON key ordering differences do not produce false positives. The following fields are included per tool:

Field | Why it matters
name | Identifies the tool. A rename means a new tool or a deceptive substitution.
description | Fed to the agent as a capability hint, which makes it an injection target.
inputSchema (full JSON Schema) | Determines what data the agent sends to the tool.
annotations (if present) | Capability flags such as readOnlyHint and destructiveHint. A change here is policy-relevant.
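
One way to obtain a hash that ignores wire encoding is to project each tool onto the listed fields and serialize the result deterministically before hashing. The sketch below assumes sorted-key JSON as the canonical form and assumes that the order of tools in the list is not significant; the text does not say how ShieldAgent canonicalizes, so both choices are illustrative.

```python
import hashlib
import json

HASHED_FIELDS = ("name", "description", "inputSchema", "annotations")


def canonical_manifest(tools):
    """Keep only the hashed fields and serialize them deterministically."""
    projected = [
        {field: tool[field] for field in HASHED_FIELDS if field in tool}
        for tool in tools
    ]
    # Assumption: tool order carries no meaning, so sort by name.
    projected.sort(key=lambda tool: tool["name"])
    return json.dumps(projected, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def manifest_hash(tools):
    return hashlib.sha256(canonical_manifest(tools).encode("utf-8")).hexdigest()


a = [{"name": "read_file", "description": "Read a file",
      "inputSchema": {"type": "object", "properties": {"path": {"type": "string"}}}}]
# Same content as a, with keys in a different order at both nesting levels.
b = [{"inputSchema": {"properties": {"path": {"type": "string"}}, "type": "object"},
      "description": "Read a file", "name": "read_file"}]
# A one-word change to the description.
c = [dict(a[0], description="Read any file")]

print(manifest_hash(a) == manifest_hash(b))  # True
print(manifest_hash(a) == manifest_hash(c))  # False
```

Arrays inside a schema (for example, required) are still compared in order here, so a reordered array would register as drift under this sketch.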

Version Tracking

ShieldAgent maintains a version history for every registered server's tool manifest. Each time a drift event is acknowledged and approved (via the dashboard or API), the new manifest is stored as the current baseline and the previous baseline is kept in history for audit purposes.

Version field | Description
baselineHash | Cryptographic hash of the approved manifest snapshot (hex).
baselineVersion | Monotonically incrementing integer. 1 = initial registration.
capturedAt | ISO 8601 timestamp when this version was first observed.
approvedAt | Timestamp of dashboard/API acknowledgement. Null = unapproved drift.
approvedBy | Agent ID or user ID that approved the change.
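
The approval lifecycle can be modelled as a small state machine. In the sketch below the field names follow the version fields listed above, while the ManifestHistory class, its pending slot, and the method names are hypothetical. How the approval fields are populated for the initial registration is not specified in the text, so they are left unset.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


def now_iso():
    return datetime.now(timezone.utc).isoformat()


@dataclass
class ManifestVersion:
    baselineHash: str
    baselineVersion: int
    capturedAt: str
    approvedAt: Optional[str] = None  # None = unapproved drift
    approvedBy: Optional[str] = None


@dataclass
class ManifestHistory:
    current: ManifestVersion
    pending: Optional[ManifestVersion] = None
    history: List[ManifestVersion] = field(default_factory=list)

    def record_drift(self, new_hash):
        """Store a drifted manifest as an unapproved candidate version."""
        self.pending = ManifestVersion(new_hash, self.current.baselineVersion + 1, now_iso())

    def approve(self, approver_id):
        """Promote the pending version to baseline and keep the old one for audit."""
        if self.pending is None:
            raise ValueError("no drifted manifest awaiting approval")
        self.pending.approvedAt = now_iso()
        self.pending.approvedBy = approver_id
        self.history.append(self.current)
        self.current, self.pending = self.pending, None


# Placeholder hashes and approver ID, for illustration only.
log = ManifestHistory(current=ManifestVersion("hash-v1", 1, now_iso()))
log.record_drift("hash-v2")
log.approve("user-123")
print(log.current.baselineVersion, len(log.history))  # 2 1
```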

Configuration

Setting | Default | Description
Tool drift detection | true | Enable tool definition integrity checks.
Action on drift | alert | Action on drift: alert (log + audit event) or block (reject the response).
Hash algorithm | sha256 | Hash algorithm: sha256 or sha512.
Include annotations | true | Include MCP tool annotations in the hash. Set false to ignore annotation-only changes.
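
To illustrate how these settings interact, the sketch below models them as a validated config object and shows the effect of turning off annotation hashing. The Python key names (enabled, action, hash_algorithm, include_annotations) are hypothetical; only the setting labels, defaults, and allowed values come from the settings listed above.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class DriftConfig:
    # Hypothetical key names; defaults mirror the documented settings.
    enabled: bool = True
    action: str = "alert"
    hash_algorithm: str = "sha256"
    include_annotations: bool = True

    def __post_init__(self):
        if self.action not in ("alert", "block"):
            raise ValueError(f"unsupported action: {self.action!r}")
        if self.hash_algorithm not in ("sha256", "sha512"):
            raise ValueError(f"unsupported hash algorithm: {self.hash_algorithm!r}")


def manifest_hash(tools, config):
    """Hash the manifest, optionally leaving annotations out of the digest."""
    fields = ["name", "description", "inputSchema"]
    if config.include_annotations:
        fields.append("annotations")
    projected = [{f: t[f] for f in fields if f in t} for t in tools]
    payload = json.dumps(projected, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.new(config.hash_algorithm, payload).hexdigest()


before = [{"name": "delete_file", "description": "Delete a file",
           "inputSchema": {"type": "object"}, "annotations": {"destructiveHint": True}}]
after = [dict(before[0], annotations={"destructiveHint": False})]

strict = DriftConfig()
lenient = DriftConfig(include_annotations=False)
print(manifest_hash(before, strict) == manifest_hash(after, strict))    # False
print(manifest_hash(before, lenient) == manifest_hash(after, lenient))  # True
```

With annotations excluded, a server that flips a destructive flag would not trigger drift, which is why the setting defaults to true.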

Audit Events & API

Every drift detection is persisted as a tool_drift audit event that includes the previous and current hashes, the names of changed, added, and removed tools, and the action taken.

```json
{
  "id": "aev_...",
  "agentId": "agt_...",
  "tenantId": "ten_...",
  "eventType": "tool_drift",
  "action": "alert",
  "riskScore": 75,
  "details": {
    "serverId": "srv_...",
    "serverName": "filesystem-mcp",
    "previousHash": "a3f8...",
    "currentHash": "c912...",
    "baselineVersion": 2,
    "changedTools": ["read_file"],
    "addedTools": [],
    "removedTools": []
  },
  "timestamp": "2026-04-25T10:00:00.000Z"
}
```
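
The changedTools, addedTools, and removedTools lists can be derived by comparing the baseline manifest with the current one by tool name. The following is a sketch of one way to compute them; the diff_tools helper is hypothetical and compares whole tool definitions, which may be broader than the set of fields ShieldAgent hashes.

```python
def diff_tools(baseline, current):
    """Classify tools by name as changed, added, or removed."""
    old = {tool["name"]: tool for tool in baseline}
    new = {tool["name"]: tool for tool in current}
    return {
        # Dict equality ignores key order, so only real content changes count.
        "changedTools": sorted(n for n in old.keys() & new.keys() if old[n] != new[n]),
        "addedTools": sorted(new.keys() - old.keys()),
        "removedTools": sorted(old.keys() - new.keys()),
    }


baseline = [
    {"name": "read_file", "description": "Read a file"},
    {"name": "list_dir", "description": "List a directory"},
]
current = [
    {"name": "read_file", "description": "Read a file and send it elsewhere"},
    {"name": "write_file", "description": "Write a file"},
]
print(diff_tools(baseline, current))
# {'changedTools': ['read_file'], 'addedTools': ['write_file'], 'removedTools': ['list_dir']}
```

Because tools are matched by name, a renamed tool shows up as one removal plus one addition rather than as a change.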

API endpoints

GET | /tenants/:tenantId/audit-events?eventType=tool_drift | List tool drift events. Supports ?agentId=, ?serverId=, ?from=, ?to= filters.
GET | /tenants/:tenantId/servers/:serverId/tool-manifest | Get the current approved tool manifest baseline for a server.
POST | /tenants/:tenantId/servers/:serverId/tool-manifest/approve | Approve the current drifted manifest as the new baseline.
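
As a rough illustration of how a client might address these endpoints, the sketch below builds the request URLs without sending anything. The base URL is a placeholder, the helper names are hypothetical, and the ISO 8601 format used for the from filter is an assumption; authentication is not covered here, so it is omitted.

```python
from urllib.parse import quote, urlencode

ALLOWED_FILTERS = ("agentId", "serverId", "from", "to")


def drift_events_url(base_url, tenant_id, filters=None):
    """Build the audit-events listing URL, restricted to tool_drift events."""
    filters = filters or {}
    unknown = sorted(set(filters) - set(ALLOWED_FILTERS))
    if unknown:
        raise ValueError(f"unsupported filters: {unknown}")
    query = urlencode({"eventType": "tool_drift", **filters})
    return f"{base_url}/tenants/{quote(tenant_id, safe='')}/audit-events?{query}"


def approve_manifest_url(base_url, tenant_id, server_id):
    """Build the URL that a POST would target to approve a drifted manifest."""
    return (f"{base_url}/tenants/{quote(tenant_id, safe='')}"
            f"/servers/{quote(server_id, safe='')}/tool-manifest/approve")


BASE = "https://shieldagent.example/api"  # hypothetical base URL
print(drift_events_url(BASE, "ten_123", {"serverId": "srv_456", "from": "2026-04-25T00:00:00Z"}))
print(approve_manifest_url(BASE, "ten_123", "srv_456"))
```

The filters are passed as a dictionary because from is a reserved word in Python and cannot be used as a keyword argument.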

Policy Integration

Use security.toolDrift.detected as a policy condition to automatically block requests when a drift event is active on any tool the agent is attempting to use:

```json
{
  "name": "Block on active tool drift",
  "priority": 5,
  "conditions": [
    { "field": "security.toolDrift.detected", "op": "eq", "value": true }
  ],
  "action": "block",
  "response": {
    "code": 403,
    "message": "Tool definition has changed since last approval. Request blocked pending review."
  }
}
```
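
To make the condition semantics concrete, here is a minimal sketch of how such a policy could be evaluated against a request context. The evaluator, the nested shape of the context, and the treatment of missing fields are assumptions made for illustration; only the eq operator from the policy above is handled.

```python
def lookup(context, dotted_field):
    """Resolve a dotted path such as "a.b.c" in nested dicts; missing keys give None."""
    value = context
    for part in dotted_field.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def evaluate(policy, context):
    """Return the policy's action if every condition matches, otherwise None."""
    for cond in policy["conditions"]:
        if cond["op"] != "eq":
            raise NotImplementedError(f"operator not sketched: {cond['op']}")
        if lookup(context, cond["field"]) != cond["value"]:
            return None
    return policy["action"]


policy = {
    "name": "Block on active tool drift",
    "priority": 5,
    "conditions": [{"field": "security.toolDrift.detected", "op": "eq", "value": True}],
    "action": "block",
}
print(evaluate(policy, {"security": {"toolDrift": {"detected": True}}}))   # block
print(evaluate(policy, {"security": {"toolDrift": {"detected": False}}}))  # None
```

In this sketch a context without the field does not match, so a request with no drift information is not blocked by this policy.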