Hypotheses

How CodeWall uses hypotheses to drive targeted, methodical vulnerability discovery.

Hypotheses are the core reasoning mechanism behind CodeWall's penetration testing. Rather than running a static checklist of scans, CodeWall's agent formulates hypotheses about potential vulnerabilities in your target — then designs specific tests to confirm or reject each one.

What is a hypothesis?

A hypothesis is a structured statement about a suspected vulnerability. Each hypothesis includes:

Field            Description
Statement        A clear, testable claim, e.g. "The /api/users endpoint is vulnerable to IDOR via predictable user IDs"
Severity         The expected impact if confirmed (Critical, High, Medium, Low)
Family           The vulnerability class: auth, injection, XSS, misconfiguration, memory-safety, etc.
Confidence       How likely the agent believes the hypothesis is to be true (0–100%)
Rationale        Why the agent suspects this vulnerability exists
Preconditions    What must be true for the vulnerability to be exploitable
Proposed checks  The specific tests the agent plans to run
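
To make the field list concrete, here is a minimal Python sketch of how such a record could be modelled. It is illustrative only and is not CodeWall's API or export format: the class name, attribute names, validation rules, and the example severity, confidence, rationale, preconditions, and checks are all assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    CRITICAL = "Critical"
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


@dataclass
class Hypothesis:
    """Hypothetical model of the fields listed above (not CodeWall's schema)."""
    statement: str
    severity: Severity
    family: str
    confidence: int  # percentage, 0 to 100
    rationale: str = ""
    preconditions: list[str] = field(default_factory=list)
    proposed_checks: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Assumed validation: a statement must exist and confidence is a percentage.
        if not self.statement.strip():
            raise ValueError("statement must be a non-empty, testable claim")
        if not 0 <= self.confidence <= 100:
            raise ValueError("confidence must be between 0 and 100")


# Example values other than the statement are invented for illustration.
idor = Hypothesis(
    statement="The /api/users endpoint is vulnerable to IDOR via predictable user IDs",
    severity=Severity.HIGH,
    family="auth",
    confidence=70,
    rationale="User IDs in API responses appear to be sequential integers",
    preconditions=["An authenticated low-privilege account is available"],
    proposed_checks=["Request another user's record by changing the ID in the path"],
)
```

Keeping preconditions and proposed checks as lists mirrors the idea that one hypothesis can depend on several conditions and be tested by several checks.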

Why hypotheses matter

Traditional scanners work by firing hundreds of generic payloads and checking for known signatures. CodeWall works differently — it reasons about your application's specific architecture, technology stack, and behaviour to form targeted hypotheses.

This approach has several advantages:

  • Fewer false positives: the agent tests only what it has reason to suspect, rather than spraying payloads
  • Deeper coverage: hypothesis-driven testing catches business logic flaws and chained attack paths that signature-based scanners miss
  • Transparency: you can see what the agent suspects and why, not just a list of CVEs
  • Efficiency: the agent spends its budget on the most promising attack vectors rather than on exhaustive enumeration

The hypothesis lifecycle

  1. Formulation: during the analysis phase, the agent reviews reconnaissance data and formulates hypotheses about potential vulnerabilities
  2. Prioritisation: hypotheses are ranked by severity and confidence, so the most impactful and most likely vulnerabilities are tested first
  3. Validation: during the validate phase, the agent runs the proposed checks for each hypothesis
  4. Outcome: each hypothesis is marked as verified (vulnerability confirmed), rejected (not exploitable), not tested (skipped because of budget limits or unmet prerequisites), or error (the test failed to execute)

Verified hypotheses become findings with full proof-of-concept evidence and remediation guidance.
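
The prioritisation and outcome steps can be sketched in a few lines of Python. The documentation only says hypotheses are ranked by severity and confidence, so the exact rule used here (severity first, then higher confidence) is an assumption, as are all names and the example hypotheses and scores.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

# Assumed ordering: a lower rank is tested earlier.
SEVERITY_RANK = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


class Outcome(Enum):
    VERIFIED = "verified"
    REJECTED = "rejected"
    NOT_TESTED = "not tested"
    ERROR = "error"


@dataclass
class Hypothesis:
    statement: str
    severity: str
    confidence: int  # percentage, 0 to 100
    outcome: Optional[Outcome] = None  # unset until validation has run


def prioritise(hypotheses):
    """Sort by severity first, then by descending confidence (assumed rule)."""
    return sorted(hypotheses, key=lambda h: (SEVERITY_RANK[h.severity], -h.confidence))


def findings(hypotheses):
    """Only verified hypotheses become findings."""
    return [h for h in hypotheses if h.outcome is Outcome.VERIFIED]


# Invented example queue.
queue = [
    Hypothesis("Verbose error pages leak stack traces", "Low", 90),
    Hypothesis("Login form is vulnerable to SQL injection", "Critical", 40),
    Hypothesis("The /api/users endpoint is vulnerable to IDOR via predictable user IDs", "High", 70),
    Hypothesis("Session tokens are not invalidated on logout", "High", 85),
]

ranked = prioritise(queue)

# Simulated validation results, one per possible outcome.
ranked[0].outcome = Outcome.REJECTED
ranked[1].outcome = Outcome.VERIFIED
ranked[2].outcome = Outcome.NOT_TESTED
ranked[3].outcome = Outcome.ERROR

for h in findings(ranked):
    print(h.statement)
```

Under this assumed rule a Critical hypothesis with low confidence is still tested before a Low hypothesis with high confidence; a real ranking might weigh the two fields differently.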

Adding your own hypotheses

When approval gates are enabled, you can inject your own hypotheses into a running test before the analysis phase completes. This is useful for:

  • Domain knowledge: you know your application better than any scanner, so if you suspect a specific endpoint is vulnerable, tell the agent to test it
  • Regression testing: add hypotheses for previously fixed vulnerabilities to verify they haven't regressed
  • Compliance checks: inject hypotheses for specific compliance requirements your organisation must meet
  • Red team scenarios: guide the agent toward specific attack paths you want validated

How to add a hypothesis

  1. Navigate to your running test
  2. Switch to the Hypotheses tab
  3. Fill in the statement, severity, family, and optional rationale
  4. Click Add Hypothesis

Your hypothesis is queued alongside the agent's own hypotheses and will be tested during the validate and exploit phases. The agent treats operator-submitted hypotheses with the same rigour as its own — designing specific test cases, executing them, and reporting the outcome.

When you can add hypotheses

You can add hypotheses while the test is in the preflight, recon, or analysis phases. If approval gates are enabled, you can also add them while the test is awaiting approval for the recon or analysis phase. Once the agent moves into the validate phase, the hypothesis list is locked.
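
The timing rule above can be expressed as a small predicate. This is a sketch of the rule as described, not CodeWall's implementation: the enum, function name, and parameters are hypothetical, and the model assumes a test is either running a phase or waiting for approval to start one.

```python
from enum import Enum


class Phase(Enum):
    PREFLIGHT = "preflight"
    RECON = "recon"
    ANALYSIS = "analysis"
    VALIDATE = "validate"
    EXPLOIT = "exploit"


# Phases during which the hypothesis list is still open.
OPEN_PHASES = {Phase.PREFLIGHT, Phase.RECON, Phase.ANALYSIS}

# Phases whose approval wait also accepts new hypotheses.
APPROVAL_WINDOWS = {Phase.RECON, Phase.ANALYSIS}


def can_add_hypothesis(phase: Phase, awaiting_approval: bool = False,
                       gates_enabled: bool = False) -> bool:
    """Return True if an operator hypothesis can still be queued.

    `phase` is the phase currently running or, when `awaiting_approval`
    is True, the phase whose approval is pending (an assumed state model).
    """
    if awaiting_approval:
        return gates_enabled and phase in APPROVAL_WINDOWS
    return phase in OPEN_PHASES


print(can_add_hypothesis(Phase.RECON))
print(can_add_hypothesis(Phase.ANALYSIS, awaiting_approval=True, gates_enabled=True))
print(can_add_hypothesis(Phase.VALIDATE))
```

The three calls cover a running recon phase, a pending approval for analysis with gates enabled, and the locked validate phase.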

Retesting hypotheses

If a hypothesis was marked as not tested or error, you can trigger a targeted retest directly from the Hypotheses tab. This launches a new, focused test that validates only that hypothesis, without re-running the full engagement.
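
As a rough illustration of which hypotheses qualify for a retest, the following Python sketch filters a set of results by outcome. The data shape, the function name, and the example statements are assumptions made for the example; in the product, the retest is triggered from the Hypotheses tab.

```python
# Outcomes that qualify for a targeted retest, per the rule described above.
RETESTABLE_OUTCOMES = {"not tested", "error"}


def retest_candidates(results: dict[str, str]) -> list[str]:
    """Return the statements whose outcome allows a targeted retest.

    `results` maps a hypothesis statement to its outcome string
    (a hypothetical shape, chosen for brevity).
    """
    return [statement for statement, outcome in results.items()
            if outcome in RETESTABLE_OUTCOMES]


# Invented example results.
results = {
    "Login form is vulnerable to SQL injection": "rejected",
    "Session tokens are not invalidated on logout": "verified",
    "Admin panel accepts default credentials": "not tested",
    "File upload allows path traversal": "error",
}

for statement in retest_candidates(results):
    print(statement)
```

Each returned statement would correspond to one focused retest, since a retest validates a single hypothesis.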
