NecessityWorks

NecessityWorks AI-Native SAST

OWASP Benchmark v1.2 Scorecard — AI-Powered Code Security Analysis
OWASP BenchmarkJava v1.2 • 11 CWE Categories • Phase 1 Results
Youden Index (TPR − FPR): +1.00
F1 Score (harmonic mean of precision and recall): 100%
Cost per Review (per test case): $0.47
True Positive Rate: 100%
False Positive Rate: 0%
Precision: 100%
CWE Categories: 11/11

Youden Index scale: −1.0 (inverted), 0.0 (random), +1.0 (perfect)

Youden Index: +1.00

The Youden Index (TPR − FPR) measures how well a tool flags vulnerable code while leaving safe code unflagged. It ranges from −1.00 to +1.00. A score of +1.00 means every vulnerable test case was flagged and no safe test case was, in each of the 11 OWASP Benchmark CWE categories tested.
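The headline metrics can be reproduced from four confusion-matrix counts. The sketch below is illustrative only: the names `ConfusionCounts` and `scorecard` are hypothetical, and the 11 vulnerable / 11 safe split is an assumption inferred from 22 cases spread over 11 categories, not a figure stated in the scorecard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # vulnerable cases that were flagged
    fn: int  # vulnerable cases that were missed
    fp: int  # safe cases that were flagged
    tn: int  # safe cases that were left unflagged


def _ratio(numerator: float, denominator: float) -> float:
    # Return 0.0 instead of raising when a denominator is empty.
    return numerator / denominator if denominator else 0.0


def scorecard(c: ConfusionCounts) -> dict:
    tpr = _ratio(c.tp, c.tp + c.fn)        # true positive rate (recall)
    fpr = _ratio(c.fp, c.fp + c.tn)        # false positive rate
    precision = _ratio(c.tp, c.tp + c.fp)
    f1 = _ratio(2 * precision * tpr, precision + tpr)  # harmonic mean of P and R
    return {
        "tpr": tpr,
        "fpr": fpr,
        "youden": tpr - fpr,
        "precision": precision,
        "f1": f1,
    }


# Assumed split: one vulnerable and one safe case per category (11 + 11 = 22).
perfect = scorecard(ConfusionCounts(tp=11, fn=0, fp=0, tn=11))
# A tool that flags half of everything sits at the "random" point of the scale.
coin_flip = scorecard(ConfusionCounts(tp=5, fn=5, fp=5, tn=5))
# A tool that flags only safe code sits at the "inverted" end.
inverted = scorecard(ConfusionCounts(tp=0, fn=11, fp=11, tn=0))
```

Under these assumed counts, `perfect["youden"]` is +1.0, `coin_flip["youden"]` is 0.0, and `inverted["youden"]` is −1.0, matching the three points on the scale.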

Per-CWE Detection Results
Vulnerability Category         CWE       Detected   FP Avoided   Result
Command Injection              CWE-78    Yes        Yes          PASS
SQL Injection                  CWE-89    Yes        Yes          PASS
Cross-Site Scripting           CWE-79    Yes        Yes          PASS
Path Traversal                 CWE-22    Yes        Yes          PASS
LDAP Injection                 CWE-90    Yes        Yes          PASS
XPath Injection                CWE-643   Yes        Yes          PASS
Weak Cryptographic Algorithm   CWE-327   Yes        Yes          PASS
Weak Hash Algorithm            CWE-328   Yes        Yes          PASS
Weak Random Number Generator   CWE-330   Yes        Yes          PASS
Trust Boundary Violation       CWE-501   Yes        Yes          PASS
Insecure Cookie                CWE-614   Yes        Yes          PASS

Beyond the Benchmark — 4 Additional Verified Findings

NecessityWorks identified 4 additional real security issues in OWASP "safe" test cases that the benchmark does not score. Human analysts verified each one. These findings suggest the analysis can surface vulnerabilities outside the CWE a test case targets, which pattern-matching SAST tools may miss.

Multi-Agent Analysis Pipeline
AST Index: code structure
Attack Paths: data flow tracing
Code Intel: tiered resolution
Controls: auth, validation
Semgrep: OWASP rules
AI Specialists: 12 OWASP agents

Preprocessing enriches context before AI analysis — each specialist receives AST data, call graphs, reachability maps, and SAST findings alongside the code diff.
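One way to picture "enrich, then fan out" is a shared context object that preprocessing steps fill in before each specialist reads it. This is a minimal sketch under stated assumptions, not the NecessityWorks implementation: `ReviewContext`, `enrich`, `dispatch`, and both stub functions are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ReviewContext:
    # Hypothetical bundle of what each specialist receives.
    code_diff: str
    ast_data: dict = field(default_factory=dict)
    call_graph: dict = field(default_factory=dict)    # caller -> list of callees
    reachability: dict = field(default_factory=dict)  # symbol -> reachable from an entry point?
    sast_findings: list = field(default_factory=list)


def enrich(code_diff: str, preprocessors: list) -> ReviewContext:
    # Run every preprocessing step before any specialist sees the context.
    ctx = ReviewContext(code_diff=code_diff)
    for step in preprocessors:
        step(ctx)
    return ctx


def dispatch(ctx: ReviewContext, specialists: dict) -> dict:
    # Hand the same enriched context to each specialist and collect reports by name.
    return {name: agent(ctx) for name, agent in specialists.items()}


def stub_sast(ctx: ReviewContext) -> None:
    # Stand-in for a real scanner: flag queries built by string concatenation.
    if "executeQuery(" in ctx.code_diff and "+" in ctx.code_diff:
        ctx.sast_findings.append("possible SQL injection (CWE-89)")


def injection_specialist(ctx: ReviewContext) -> list:
    # Stand-in for an AI specialist: here it only filters the SAST findings.
    return [f for f in ctx.sast_findings if "CWE-89" in f]


ctx = enrich('stmt.executeQuery("SELECT * FROM t WHERE id=" + id)', [stub_sast])
reports = dispatch(ctx, {"injection": injection_specialist})
```

Keeping the context in one object means every specialist sees the same evidence, and adding a preprocessing step does not require changing specialist signatures.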

Industry Comparison — OWASP Benchmark Youden Index
Tool               Youden Index
NecessityWorks     +1.00
Fortify SCA        +0.67
Checkmarx CxSAST   +0.62
Coverity SAST      +0.58
SonarQube          +0.51
Veracode SAST      +0.48
Semgrep OSS        +0.31
FindBugs           +0.22

Competitor scores are sourced from OWASP Benchmark published results (owasp.org/www-project-benchmark). The NecessityWorks score is a preliminary Phase 1 result based on 22 test cases. All scores are Youden Index values (TPR − FPR) on BenchmarkJava v1.2.

Methodology

Tested against OWASP BenchmarkJava v1.2, the industry-standard test suite for static application security testing (SAST) tools. Each test case was submitted as an independent code review through the NecessityWorks multi-agent analysis engine. The pipeline performs AST indexing, call graph construction, entry point identification, reachability analysis, and static analysis before routing to 12 specialized security agents aligned to the OWASP Top 10 2025 taxonomy. Scoring uses per-CWE matching: a false positive is only counted when the tool flags the specific CWE being tested on code that is safe for that CWE. Findings of different, legitimate security issues on "safe" test cases are counted as bonus detections, not false positives.
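The per-CWE matching rule described above can be sketched as a small scoring function. The names `BenchmarkCase` and `score_case` are hypothetical. The sketch also assumes that extra findings on vulnerable cases are simply ignored, which the methodology does not specify, and it only lists bonus candidates because confirming that a finding is legitimate is a human step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkCase:
    name: str
    tested_cwe: int      # the CWE this case is designed to exercise
    is_vulnerable: bool  # ground truth for tested_cwe only


def score_case(case: BenchmarkCase, reported_cwes: set) -> tuple:
    """Return (outcome, bonus_candidates) under per-CWE matching."""
    flagged = case.tested_cwe in reported_cwes
    if case.is_vulnerable:
        outcome = "TP" if flagged else "FN"
        bonus = []  # assumption: extra findings on vulnerable cases are not scored
    else:
        # Only a report of the tested CWE counts as a false positive.
        outcome = "FP" if flagged else "TN"
        # Other CWEs on a "safe" case are bonus candidates pending human review.
        bonus = sorted(reported_cwes - {case.tested_cwe})
    return outcome, bonus


# Illustrative cases; the names are placeholders, not real benchmark IDs.
safe_sqli = BenchmarkCase("safe-sqli-example", tested_cwe=89, is_vulnerable=False)
vuln_sqli = BenchmarkCase("vuln-sqli-example", tested_cwe=89, is_vulnerable=True)

clean_with_bonus = score_case(safe_sqli, {327})   # weak crypto reported, SQLi not
false_alarm = score_case(safe_sqli, {89})
hit = score_case(vuln_sqli, {89, 327})
miss = score_case(vuln_sqli, set())
```

Here the safe SQL injection case with a reported CWE-327 scores as a true negative with one bonus candidate, while reporting CWE-89 on the same case scores as a false positive.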