LLM AppSecRegressionEvidenceHardening

LLM AppSec Harness

Deterministic regression harness for comparing baseline and hardened LLM application behavior with stable reports and explicit limits.

Problem / Scope

The project is a deterministic regression harness for AI application security testing. The point is not to claim broad safety guarantees; the point is to show repeatable comparisons between a baseline target and a hardened target, with clear artifacts describing what changed and what still fails.

Architecture

  • Fixed attack corpus and reason codes
  • Baseline and hardened targets evaluated against the same prompts
  • Stable markdown diff reports copied into the hub evidence pack
  • Repo permalinks used to tie the public code state to the published snapshot

Key Workflows / What It Proves

  • Run the corpus against both targets and compare outcomes
  • Inspect category-level regressions and improvements
  • Publish the diff report and run evidence as stable hub artifacts

Limitations

  • It does not prove real-world completeness
  • It does not replace manual review or broader evaluation coverage
  • Results are bounded by the chosen corpus, policies, and target implementations

Evidence Pack

E-APPSEC-001

Diff report snapshot

Stable snapshot of the latest baseline-versus-hardened report copied into the hub repo.

E-APPSEC-002

Corpus taxonomy snapshot

Published excerpt showing how prompt categories and reason codes are organized.

E-APPSEC-003

Run output screenshot

Terminal capture proving a real run produced the snapshot artifacts.

E-APPSEC-004

Repo commit permalink

Pinned public source reference for the snapshot version.

E-APPSEC-005

Known gaps note

Published note describing categories or mitigations not yet covered by the deterministic harness.