A Google Lighthouse accessibility score can summarize automated findings, but it is not a WCAG verdict or dated page-level evidence from a live route. Here is what the number can show, what it can hide, and where proof begins.
A Google Lighthouse accessibility score feels decisive because it collapses a page into one number. That can be useful for triage. It can also make teams think the number itself is the evidence. It isn’t.
Lighthouse can summarize automated findings on the rendered page, but it does not tell you whether one real route held up on one date. If your site runs an accessibility overlay, the narrower question is still the one that matters: on one named page, with the overlay on and then blocked, what changed, what stayed broken, and which public accessibility claim did that observation touch?
What a Google Lighthouse accessibility score usually measures
Most accessibility scores come from automated checks. Lighthouse is the familiar example: it runs automated audits against the rendered page, weights each pass-or-fail result, and reports the weighted average as a single number. Vendor dashboards may layer their own scorecard logic on top of similar rule checks, but the result is still an automated summary rather than route-level proof.
- The score can show whether an automated engine found obvious issues such as missing alt attributes, contrast failures, or label gaps on that pass.
- The score can show whether a page improved or regressed compared with a prior run under the same scoring method.
- The score can help a team prioritize which pages deserve a closer review first.
- The score does not become stronger just because it is formatted as a 0–100 result instead of a raw issue list.
A score is a summary of scanner output. It can be operationally useful without being a neutral record of what held up on one live route.
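To make "a summary of scanner output" concrete, here is a minimal sketch of how a weighted score can be recomputed from pass-or-fail audit results. It assumes a report shaped like Lighthouse's JSON output (`categories.accessibility.auditRefs` carrying weights, `audits` carrying scores). The function name, the sample data, and the weights are illustrative assumptions, not values taken from a real run.

```python
from typing import Any


def accessibility_summary(report: dict[str, Any]) -> tuple[int, list[str]]:
    """Return (score out of 100, ids of failing audits) for a Lighthouse-shaped report."""
    refs = report["categories"]["accessibility"]["auditRefs"]
    audits = report["audits"]

    weighted_total = 0.0
    weight_sum = 0.0
    failing: list[str] = []
    for ref in refs:
        score = audits[ref["id"]]["score"]
        if score is None:
            # Manual or not-applicable audits have no numeric score; skip them.
            continue
        weighted_total += ref["weight"] * score
        weight_sum += ref["weight"]
        if score < 1:
            failing.append(ref["id"])

    percent = round(100 * weighted_total / weight_sum) if weight_sum else 0
    return percent, failing


# Hypothetical sample data; the weights are illustrative, not Lighthouse's own.
sample_report = {
    "categories": {
        "accessibility": {
            "auditRefs": [
                {"id": "image-alt", "weight": 10},
                {"id": "color-contrast", "weight": 7},
                {"id": "label", "weight": 7},
                {"id": "document-title", "weight": 7},
                {"id": "focus-traps", "weight": 0},  # manual check, never scored
            ]
        }
    },
    "audits": {
        "image-alt": {"score": 1},
        "color-contrast": {"score": 0},
        "label": {"score": 1},
        "document-title": {"score": 1},
        "focus-traps": {"score": None},
    },
}

score, failing_audits = accessibility_summary(sample_report)
print(score, failing_audits)  # 77 ['color-contrast']
```

The manual focus-trap check contributes nothing to the number either way, which is one reason the score alone cannot speak to keyboard behavior.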
Why a Lighthouse score is not a WCAG verdict
A good Lighthouse score can mean the page passed a useful set of automated checks. It does not mean the page conforms to WCAG in full, and it does not prove that the site’s public accessibility statement held up on a named route. Lighthouse is one lens on one rendered state, not a legal conclusion and not a conformance certificate.
- Lighthouse can surface automated findings without testing every interaction path a person depends on.
- A 100 does not prove that menus, modals, support forms, or cart steps stayed usable through a full task.
- The score does not quote the site’s own accessibility statement back and test whether that exact claim held up on the route.
- The score is not the same thing as dated evidence tied to one page, one route step, and one public claim.
What a single score hides
A single number compresses too much context. It flattens severity, hides route-specific behavior, and can blur the difference between a page that looks fine to an automated engine and a page that still fails a real keyboard path or interactive flow.
- Severity tradeoffs: a score may improve even while one high-friction issue remains on checkout, login, or another critical route.
- Route-specific failures: one page can score well while a cart drawer, modal, menu, or embedded form still breaks when the user tries to complete a task.
- Keyboard continuity and modal recovery: a score may look reassuring while focus gets lost, trapped, or restarted once the user opens a drawer or modal.
- Task completion gaps: the number does not prove that a shopper reached cart completion or that a support form could be finished end to end.
- Overlay behavior: a score usually does not explain what the overlay changed, what it left untouched, or whether the overlay itself introduced a new issue.
- Public-claim alignment: a number does not quote the site’s own accessibility statement back and ask whether that exact language held up on this page.
What a strong score still does not prove
A strong score is still not the same as proof that a page supported a task, a keyboard path, or a public statement. WAVE and axe can diagnose issues, but their output is not neutral before-and-after proof of an overlay customer’s public claims. The same boundary applies to scorecards built on top of those engines.
- It does not prove that a user could complete the real task on the route being evaluated.
- It does not prove that keyboard continuity held up across menus, drawers, modals, or checkout steps.
- It does not prove that the page matched the site’s own accessibility statement on the date tested.
- It does not prove that a monitoring tool or dashboard supplied by the overlay vendor is independent of the vendor being evaluated.
The risky leap is turning a high score into a verdict about what happened on this page. A score can summarize. It cannot witness.
Where page-level evidence starts
Page-level evidence starts when the record becomes specific enough to stand on its own: exact URL, exact timestamp, exact task step, and the first broken interaction or rule state that matters. For overlay questions, it also means checking the same page twice: once with the overlay active and once with it blocked.
- The tested page path, not just the site as a whole.
- The overlay vendor observed on the page, such as accessiBe or UserWay.
- The browser condition in each pass: overlay active versus overlay blocked.
- The automated rule output beneath the summary score.
- The public accessibility statement or claim being checked, quoted back when available.
- A timestamp and snapshot reference so the result is tied to a moment rather than a marketing claim.
If you need a concrete packet structure rather than another score summary, see “How to document website accessibility evidence that holds up” for the exact URL, timestamp, snapshot-hash, claim-quote, and first-broken-step fields that turn a score conversation into a reviewable record.
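The fields named above can be sketched as a small record type. This is a hypothetical structure for illustration only: `EvidenceRecord`, `build_record`, and the field names are assumptions, not the actual Risk Packet schema, and the URL and snapshot are placeholder data.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class EvidenceRecord:
    url: str                          # exact page path, not the site as a whole
    observed_at: str                  # ISO 8601 timestamp in UTC
    overlay_state: str                # "active" or "blocked"
    snapshot_sha256: str              # hash of the captured page snapshot
    claim_quote: Optional[str]        # public statement quoted back, if available
    first_broken_step: Optional[str]  # first interaction that failed, if any


def build_record(
    url: str,
    snapshot_html: str,
    overlay_state: str,
    observed_at: datetime,
    claim_quote: Optional[str] = None,
    first_broken_step: Optional[str] = None,
) -> EvidenceRecord:
    if overlay_state not in ("active", "blocked"):
        raise ValueError("overlay_state must be 'active' or 'blocked'")
    if observed_at.tzinfo is None:
        raise ValueError("observed_at must be timezone-aware")
    digest = hashlib.sha256(snapshot_html.encode("utf-8")).hexdigest()
    return EvidenceRecord(
        url=url,
        observed_at=observed_at.astimezone(timezone.utc).isoformat(),
        overlay_state=overlay_state,
        snapshot_sha256=digest,
        claim_quote=claim_quote,
        first_broken_step=first_broken_step,
    )


# Placeholder values for illustration; none of this is observed data.
record = build_record(
    url="https://example.com/checkout",
    snapshot_html="<html><body>placeholder snapshot</body></html>",
    overlay_state="blocked",
    observed_at=datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc),
    first_broken_step="focus lost after opening the cart drawer",
)
print(json.dumps(asdict(record), indent=2))
```

Hashing the snapshot ties the record to the exact bytes that were captured, so a later reviewer can check that the stored page is the one the record describes.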
How to use scores, checker output, and witness-style evidence together
These artifacts do different jobs. A Lighthouse score can point you toward trouble. A checker output can name the rules and nodes involved. A manual review can test the parts automation cannot reach. A witness-style record can show what changed on the same page when the overlay was present versus blocked and compare that result to the page’s public claim.
- Use a score to prioritize which pages or templates deserve attention first.
- Use raw checker output to see the exact rules, nodes, and issue counts behind the score.
- Use manual review to test keyboard paths, focus order, and other interaction details that a score cannot settle.
- Use dated page-level evidence when the real question is what happened on this page with this overlay and this public claim.
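The overlay-present versus overlay-blocked comparison comes down to a set difference over checker findings. The sketch below assumes each finding is reduced to a `(rule_id, selector)` pair. The function name, the bucket names, and the sample findings are hypothetical, and this is not a description of any vendor's actual pipeline.

```python
Finding = tuple[str, str]  # (rule id, CSS selector of the affected node)


def diff_passes(
    overlay_active: set[Finding], overlay_blocked: set[Finding]
) -> dict[str, list[Finding]]:
    """Bucket findings by how they differ between the two passes."""
    return {
        # Present without the overlay, gone with it.
        "resolved_with_overlay": sorted(overlay_blocked - overlay_active),
        # Present in both passes.
        "still_failing": sorted(overlay_active & overlay_blocked),
        # Absent without the overlay, present with it.
        "introduced_with_overlay": sorted(overlay_active - overlay_blocked),
    }


# Hypothetical findings for one page; not taken from a real site.
blocked_pass = {
    ("image-alt", "img.hero"),
    ("label", "#email"),
    ("color-contrast", ".cta"),
}
active_pass = {
    ("label", "#email"),
    ("color-contrast", ".cta"),
    ("aria-hidden-focus", "#overlay-widget"),
}

result = diff_passes(active_pass, blocked_pass)
for bucket, findings in result.items():
    print(bucket, findings)
```

Keeping the three buckets separate preserves what a single score would merge: in this sample one issue is resolved, two remain, and one appears only when the overlay is running.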
OverlayRiskWitness is not a score vendor, not the overlay vendor’s dashboard, and not a legal conclusion. It runs the page with the overlay on and off, records what changed, and helps turn a broad score conversation into one page-specific evidence record. Start with the free one-page witness, then use a Risk Packet when the route or statement actually matters.
Frequently asked questions
- What is a Google Lighthouse accessibility score?
- It is a 0–100 summary based on automated accessibility checks run against the rendered page. It can help with triage and prioritization, but it is still an automated score rather than a page-level witness record.
- Does a good Lighthouse accessibility score prove accessibility?
- No. A good score is not a WCAG verdict, not a legal conclusion, and not dated evidence that a named route held up under real use. It shows what the automated pass found, not everything a user could experience on the page.
- Can a high accessibility score still hide serious issues?
- Yes. A page can score well while a keyboard path, modal flow, cart step, support form, or route-specific interaction still breaks in ways a single score does not explain clearly.
- Is OverlayRiskWitness another accessibility score vendor?
- No. OverlayRiskWitness is an independent witness. It compares the same page with the overlay active and blocked, then records what held up, what did not hold up, and what was not testable. Evidence, not legal advice.
The OverlayRiskWitness engineering team builds the two-pass witness runner, the axe-core diff pipeline, and the Risk Packet composer. Every post is grounded in what the engine actually observes on live pages.