AI agent website accessibility testing is now a single tool call away. Here is how OverlayRiskWitness exposes its two-pass axe-core witness over the Model Context Protocol — and what a real tool response looks like.
Most accessibility checks live outside the developer toolchain. A QA engineer runs a scanner in a browser extension, exports a report, pastes findings into a ticket. The agent doing the rest of the build has no visibility into any of it. The Model Context Protocol changes the integration surface: if the accessibility engine speaks MCP, any agent that can call a tool can run the check, read the findings, and act on them — without leaving its own context window.
OverlayRiskWitness exposes its witness as an MCP server. The witness loads a public page twice in a real browser — once with the overlay active, once with it blocked — runs axe-core on each pass, and diffs the results per rule. That full contract is now available as a tool call. This post walks through the transport options, the tool shape, and what the response actually contains.
Why MCP is the right integration surface for accessibility checks
An accessibility scanner has three properties that make it a good MCP tool: it is read-only, it operates on public URLs, and its output is structured enough that an agent can reason over it without parsing freeform text. The witness never mutates the target site — it only loads a public page and reports what axe-core observed. That makes it safe to hand to an autonomous agent as a side-effect-free tool. The worst it can do is load a page twice and return findings.
The MCP wrapper also keeps the heavy machinery — Browserbase sessions, axe-core rule evaluation, claim extraction — server-side. The agent never needs to know that a real browser loaded the page or that a WCAG rule engine ran against the DOM. It asks a question, gets a structured answer, and can act on the finding states without understanding the scanner implementation.
Two transports: stdio and hosted Streamable-HTTP
The server ships on two transports. For local development and desktop AI clients like Claude Desktop or Cursor, the stdio transport is the standard path: you point the client config at the Node binary and the server process starts on demand. For hosted clients and agent pipelines that cannot manage local processes, the hosted Streamable-HTTP endpoint at POST /mcp on overlayrisk.com is the alternative. Both transports expose the same tool with the same schema.
The hosted endpoint is stateless — each request carries its own full context and the server holds no session between calls. It is guarded the same way the public /api/witness route is: per-IP rate limiting, a global kill-switch, and same-URL response caching to avoid redundant browser sessions. Because the endpoint is stateless, it is safe to call from serverless functions and from agent orchestration frameworks that do not guarantee sticky connections.
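As a rough sketch of what a call to the hosted transport could look like from TypeScript, the snippet below builds the JSON-RPC 2.0 envelope that MCP wraps around a tools/call request and posts it to the endpoint. The endpoint path and the `witness` tool name come from this post; the helper names (`buildWitnessCall`, `callWitness`), the request id, and the assumption that the server answers a bare tools/call with a single JSON body (rather than an SSE stream, or only after an initialize handshake) are illustrative and should be checked against the live server.

```typescript
// Hypothetical helper names; only the endpoint path and tool name come from the post.
const MCP_ENDPOINT = "https://overlayrisk.com/mcp";

interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: { name: string; arguments: Record<string, unknown> };
}

// Build the JSON-RPC 2.0 envelope for a tools/call request.
function buildWitnessCall(targetUrl: string, id = 1): JsonRpcRequest {
  // Reject anything that is not an absolute http(s) URL before spending a scan on it.
  const parsed = new URL(targetUrl);
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`witness expects a public http(s) URL, got ${parsed.protocol}`);
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "witness", arguments: { url: targetUrl } },
  };
}

// Send the call. Assumes (unverified) that the server replies with one JSON body.
async function callWitness(targetUrl: string): Promise<unknown> {
  const res = await fetch(MCP_ENDPOINT, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Streamable HTTP clients advertise both response types.
      Accept: "application/json, text/event-stream",
    },
    body: JSON.stringify(buildWitnessCall(targetUrl)),
  });
  if (!res.ok) throw new Error(`MCP endpoint returned HTTP ${res.status}`);
  return res.json();
}
```

Because the request is built by a pure function, the envelope can be unit-tested without touching the network; only `callWitness` performs I/O.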
Local stdio setup
To configure the stdio transport in a desktop MCP client, add the server entry to the client config file. The server process needs no persistent daemon — the client spawns it on the first tool call and manages the process lifetime.
{
  "mcpServers": {
    "overlayrisk-witness": {
      "command": "node",
      "args": ["./app/bin/start-mcp.js"],
      "env": {
        "APP_URL": "https://overlayrisk.com"
      }
    }
  }
}
The server is also published to npm under the package name in the official MCP registry listing, so clients that resolve servers by package name rather than local path can install it without cloning the repo. The Glama registry listing covers clients that pull from that index. The hosted Streamable-HTTP endpoint on overlayrisk.com/mcp is the Smithery integration point.
Tool call shape and response structure
The witness tool takes one parameter: a public URL. The server runs the two-pass scan — overlay blocked, then overlay active — and returns a single JSON payload. The free tier returns the overlay vendor detected, the count of claims tested, the first finding in full detail, and a count of additional findings locked behind the Risk Packet. Unlocking the full packet requires a $49 one-time purchase; the Drift Monitor subscription re-runs the witness on a schedule and alerts on state changes.
// Request
{
  "method": "tools/call",
  "params": {
    "name": "witness",
    "arguments": {
      "url": "https://example.com"
    }
  }
}
// Response (free tier)
{
  "overlayVendor": "accessiBe",
  "claimsTested": 12,
  "firstFinding": {
    "rule": "color-contrast",
    "wcagCriteria": "1.4.3",
    "state": "didNotHoldUp",
    "overlayOff": { "violations": 6 },
    "overlayOn": { "violations": 6 },
    "transition": "no_effect",
    "claim": "This site meets WCAG 2.1 AA color contrast requirements."
  },
  "lockedFindingCount": 9,
  "packetUrl": "https://overlayrisk.com/pricing"
}
To see how those fields are used outside the MCP payload, read "How to document website accessibility evidence that holds up". It covers the page-level packet structure that the tool response is meant to feed: exact URL, timestamp, snapshot hash, quoted claim, and first broken step.
The three finding states — held up, did not hold up, not testable — describe what axe-core observed on a specific page at a specific moment. They are timestamped evidence. Whether a gap between a public claim and an observed finding has legal significance is a question for counsel. The witness provides the evidence; it does not provide a compliance certificate.
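To make the payload easier to consume from agent code, here is one possible TypeScript typing of the free-tier response, with a small runtime guard. The field names are taken from the sample response above; the `heldUp` and `notTestable` spellings are guesses extrapolated from `didNotHoldUp`, and `isWitnessResponse` is a hypothetical helper, not part of the published server.

```typescript
// State spellings other than "didNotHoldUp" are assumed, not confirmed by the post.
type FindingState = "heldUp" | "didNotHoldUp" | "notTestable";

interface Finding {
  rule: string;
  wcagCriteria: string;
  state: FindingState;
  overlayOff: { violations: number };
  overlayOn: { violations: number };
  transition: string;
  claim: string;
}

interface WitnessResponse {
  overlayVendor: string;
  claimsTested: number;
  firstFinding: Finding;
  lockedFindingCount: number;
  packetUrl: string;
}

const KNOWN_STATES: readonly string[] = ["heldUp", "didNotHoldUp", "notTestable"];

// Runtime guard: validates only the fields an agent is likely to branch on,
// so it stays tolerant of extra or renamed fields elsewhere in the payload.
function isWitnessResponse(value: unknown): value is WitnessResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const f = v.firstFinding as Record<string, unknown> | null | undefined;
  return (
    typeof v.overlayVendor === "string" &&
    typeof v.claimsTested === "number" &&
    typeof v.lockedFindingCount === "number" &&
    typeof f === "object" && f !== null && f !== undefined &&
    typeof f.rule === "string" &&
    typeof f.state === "string" &&
    KNOWN_STATES.indexOf(f.state) !== -1
  );
}
```

Guarding at the boundary means the rest of the agent code can branch on `firstFinding.state` without re-checking the shape on every access.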
What the two-pass engine is actually doing
Understanding the response is easier if you understand the scan mechanics. The witness uses Browserbase — a hosted browser service — to load the target page in a real Chromium instance. This matters: overlay scripts run JavaScript that relies on a real DOM, real CSS computed styles, and real browser rendering. A headless sandbox that does not fully execute the overlay script would produce misleading results.
Pass one blocks the overlay script at the network layer, so axe-core sees the site exactly as it ships, without any runtime augmentation. Pass two loads the same URL with the overlay active and waits for it to inject its changes. Navigation waits for the domcontentloaded event rather than the full load event, because heavy sites on cold Browserbase sessions often never fire load within the timeout budget. Both passes run the same axe-core rule set. The diff is computed rule by rule: a rule that still fails with the overlay on is the one that matters.
- held up — the overlay did not introduce new violations on this rule and the claim is consistent with what axe-core saw.
- did not hold up — violations persist with the overlay active, or the overlay introduced new failures; the public claim is inconsistent with the observation.
- not testable — the rule could not be evaluated in one or both passes; this is a gap in evidence, not a pass.
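The state definitions above can be read as a small decision procedure over the two passes. The following is a minimal sketch of that reading, not the engine's actual code. It assumes each pass has been reduced to a map from axe-core rule id to violation count, with a missing entry meaning the rule could not be evaluated in that pass. The names `classifyRule` and `diffPasses` and the camelCase state strings are illustrative.

```typescript
type FindingState = "heldUp" | "didNotHoldUp" | "notTestable";

// Violation counts per axe-core rule id for one pass. A missing key means the
// rule could not be evaluated in that pass (an assumption of this sketch).
type PassResult = Record<string, number>;

function classifyRule(off: number | undefined, on: number | undefined): FindingState {
  // No observation in one or both passes: a gap in evidence, not a pass.
  if (off === undefined || on === undefined) return "notTestable";
  // Any violation with the overlay active means it either persisted or was introduced.
  if (on > 0) return "didNotHoldUp";
  return "heldUp";
}

function diffPasses(overlayOff: PassResult, overlayOn: PassResult): Record<string, FindingState> {
  const states: Record<string, FindingState> = {};
  // Union of rule ids, so a rule seen in only one pass is still reported.
  const rules = Object.keys(overlayOff).concat(Object.keys(overlayOn));
  for (const rule of rules) {
    states[rule] = classifyRule(overlayOff[rule], overlayOn[rule]);
  }
  return states;
}
```

For example, `diffPasses({ "color-contrast": 6, "image-alt": 2, "label": 0 }, { "color-contrast": 6, "image-alt": 0 })` classifies `color-contrast` as `didNotHoldUp`, `image-alt` as `heldUp`, and `label` as `notTestable` because it has no overlay-on observation.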
Agent workflow patterns
Because the witness is a standard MCP tool, it composes with other tools in an agent's toolchain without any custom integration work. A few patterns that make practical sense:
- Pre-deploy audit: agent calls witness on the staging URL before a deploy approval step, surfaces any did-not-hold-up findings as blocking observations.
- Vendor evaluation: agent runs witness on three competitor sites and a prospect's own site, compares lockedFindingCount and firstFinding state across all four before a sales call.
- Regression triage: agent detects a deploy event via webhook, calls witness on the affected pages, and posts a summary of state changes to a Slack channel — no manual QA step in the middle.
- Drift alerting: Drift Monitor subscription covers up to 20 pages on a schedule ($99/mo); agent consumes the webhook payload and routes findings to the relevant on-call channel.
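As an illustration of the pre-deploy audit pattern, the sketch below turns a batch of witness results into a block-or-proceed decision. It assumes the agent has already called the tool for each staging URL and collected the free-tier payloads. `gateDeploy`, the `PageResult` shape, the `heldUp` spelling, and the choice to treat anything other than a held-up or did-not-hold-up state as a warning rather than a blocker are hypothetical policy decisions, not behavior of the witness itself.

```typescript
// Minimal slice of the free-tier payload that the gate needs, plus the scanned URL.
interface PageResult {
  url: string;
  firstFinding: { rule: string; state: string };
  lockedFindingCount: number;
}

interface GateDecision {
  block: boolean;
  blocking: string[]; // human-readable reasons to stop the deploy
  warnings: string[]; // gaps in evidence worth a manual look
}

function gateDeploy(results: PageResult[]): GateDecision {
  const blocking: string[] = [];
  const warnings: string[] = [];
  for (const r of results) {
    const f = r.firstFinding;
    if (f.state === "didNotHoldUp") {
      blocking.push(`${r.url}: ${f.rule} did not hold up (${r.lockedFindingCount} more findings locked)`);
    } else if (f.state !== "heldUp") {
      // Covers a not-testable state and any state string this sketch does not know.
      warnings.push(`${r.url}: ${f.rule} is ${f.state}`);
    }
  }
  return { block: blocking.length > 0, blocking, warnings };
}
```

Keeping warnings separate from blockers reflects the point made earlier that a not-testable result is a gap in evidence rather than a pass or a failure.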
The witness tool never writes to the target site. It loads a public URL, runs axe-core, and returns observations. An agent with tool-call autonomy can call it without a human approval gate — there is no mutation risk. The same property that makes it safe to hand to an agent also makes it audit-friendly: every call is logged with the input URL, the timestamp, and the result.
The hosted endpoint enforces a per-IP rate limit and a same-URL cache, so a misconfigured agent that fires the same URL in a tight loop will not exhaust Browserbase session capacity or produce unbilled scan volume. Both guards fail open: if the cache layer errors, the scan runs normally; if the rate-limit store is unavailable, the request is allowed through rather than blocked.
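Fail-open behavior is easy to get subtly wrong, so here is a minimal sketch of the idea for the rate-limit side. It is not the server's implementation: the `CounterStore` interface, the `allowRequest` name, and the synchronous in-memory store without a time window are assumptions made to keep the example self-contained.

```typescript
// Hypothetical store interface: increments and returns the hit count for a key.
interface CounterStore {
  increment(key: string): number;
}

// Returns true when the request may proceed.
function allowRequest(store: CounterStore, ip: string, limit: number): boolean {
  try {
    return store.increment(`rl:${ip}`) <= limit;
  } catch {
    // Fail open: an unavailable store lets the request through instead of blocking it.
    return true;
  }
}

// In-memory store for local experiments. Counts never expire here; a real
// limiter would reset them per time window.
class MemoryStore implements CounterStore {
  private counts: Record<string, number> = {};

  increment(key: string): number {
    this.counts[key] = (this.counts[key] || 0) + 1;
    return this.counts[key];
  }
}
```

The trade-off is availability over protection: when the store is down, the limiter stops limiting, so a fail-open guard is usually paired with some independent backstop.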
The MCP server is read-only, stateless on the hosted transport, published to the official MCP registry, Glama, and npm, and has a hosted Streamable-HTTP endpoint for Smithery. If your agent can call a tool, it can run an accessibility witness on any public URL without any browser automation code on the agent side. The engine, the browser sessions, and the axe-core evaluation all stay server-side.
The OverlayRiskWitness engineering team builds the two-pass witness runner, the axe-core diff pipeline, and the Risk Packet composer. Every post is grounded in what the engine actually observes on live pages.