Most scanners are graded on coverage. Mara is graded on conviction rate: of every finding we surface, what fraction did the validator actually exploit?
The validator is a deterministic Python module — no LLM. Per finding class:
- Reflected/DOM XSS → headless Chromium intercepting alert().
- Stored XSS → same, after a re-fetch from a second session.
- Blind SQLi → time-based statistical delta (sketched below).
- SSRF → out-of-band callback (interactsh).
- IDOR → two-account replay + diff.
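A minimal sketch of what that per-class dispatch could look like, with the blind-SQLi check fleshed out as an example of the statistical delta. The class names, function signatures, payload, and the 80%-of-sleep threshold are assumptions for illustration, not Mara's actual interface.

```python
import statistics
import time
from typing import Callable

import requests  # assumed HTTP client for the sketch


def validate_time_based_sqli(url: str, param: str,
                             payload: str = "' AND SLEEP(5)-- -",
                             samples: int = 5, delay: float = 5.0) -> bool:
    """Hypothetical time-based SQLi validator: confirm only if injected
    requests are consistently slower than the baseline by roughly `delay`."""
    def timed(value: str) -> float:
        start = time.monotonic()
        requests.get(url, params={param: value}, timeout=delay * 3)
        return time.monotonic() - start

    baseline = [timed("1") for _ in range(samples)]
    injected = [timed(payload) for _ in range(samples)]
    # Require the median injected latency to exceed baseline by most of the sleep time.
    return statistics.median(injected) - statistics.median(baseline) > delay * 0.8


# Hypothetical dispatch table: one deterministic validator per finding class.
VALIDATORS: dict[str, Callable[..., bool]] = {
    "sqli_blind": validate_time_based_sqli,
    # "xss_reflected": validate_reflected_xss,  # headless Chromium, alert() hook
    # "ssrf": validate_ssrf_oob,                # interactsh callback
    # "idor": validate_idor_replay,             # two-account replay + diff
}
```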
If a hypothesis class has no validator, it stays in the audit log but does not surface as a finding. We'd rather miss a real bug than ship a false positive.
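One way that gate could be expressed, assuming a hypothetical Hypothesis record and the VALIDATORS registry from the sketch above (stubbed empty here so the snippet stands alone):

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical registry, as in the sketch above: one deterministic check per class.
VALIDATORS: dict[str, Callable[..., bool]] = {}


@dataclass
class Hypothesis:
    finding_class: str        # e.g. "sqli_blind", "xss_stored"
    target: str
    details: dict = field(default_factory=dict)


def triage(hypothesis: Hypothesis, audit_log: list) -> bool:
    """Every hypothesis is logged; only validated ones surface as findings."""
    audit_log.append(hypothesis)
    validator = VALIDATORS.get(hypothesis.finding_class)
    if validator is None:
        return False          # no validator for this class: audit log only
    return validator(hypothesis.target, **hypothesis.details)
```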