Risk
Adversarial Evasion
Description
Attackers craft inputs designed to evade model-level or runtime guardrails, causing the model to misclassify content or allowing abusive requests to bypass detection filters.
Example
An adversary submits obfuscated harmful input (for example, using character substitution or encoding tricks) that escapes detection and is processed by the model.
Assets Affected
Model Inference Endpoint
Model Response
Mitigation
- Adversarial training to harden the model against perturbed inputs
- Input filtering and normalization (a minimal sketch follows this list)
- Continuous adversarial testing and red-teaming
- Regularly updated abuse detection mechanisms
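Input filtering is typically paired with input normalization, so that obfuscated variants of blocked content are canonicalized before abuse detection runs. The sketch below is a minimal illustration of that idea, not a complete filter; the `abuse_classifier` callable is a hypothetical placeholder for whatever detection mechanism the serving stack uses.

```python
import re
import unicodedata

# Zero-width and byte-order-mark characters commonly inserted to break up
# blocked keywords while keeping the text human-readable.
ZERO_WIDTH = re.compile(r"[\u200b\u200c\u200d\u2060\ufeff]")


def normalize_input(text: str) -> str:
    """Canonicalize user input before it reaches abuse detection.

    NFKC normalization folds many full-width and compatibility variants
    into their plain forms, and stripping zero-width characters removes a
    common obfuscation trick.
    """
    text = unicodedata.normalize("NFKC", text)
    text = ZERO_WIDTH.sub("", text)
    # Collapse runs of whitespace so spacing-based obfuscation is reduced
    # to a single separator before classification.
    text = re.sub(r"\s+", " ", text).strip()
    return text


def is_allowed(text: str, abuse_classifier) -> bool:
    """Run abuse detection on the normalized form, not the raw input."""
    return not abuse_classifier(normalize_input(text))
```

Normalization of this kind only narrows the evasion surface; it should be layered with adversarial training and continuous testing rather than relied on alone.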
Standards Mapping
- ISO 42001: A.6.2.6, A.9.4
- NIST AI RMF: MEASURE 2.6, MEASURE 2.7
- DASF v2: MODEL SERVING 9.2