Risk
Undocumented Data Flows and Lineage
Description
The pathways by which data enters, is processed within, and exits AI systems (including RAG sources) are not fully mapped or understood. Without that map, potential data leakage points and compliance violations can go unnoticed.
Example
An AI system is discovered, but no one can establish where its training data originated or where its output data is being sent, which hinders any privacy impact assessment.
Assets Affected
- Dataset / RAG
- AI App
- Pipeline Job
- 3rd-party AI integration
Mitigation
- Map data flows (inputs, processing steps, and outputs, including RAG sources) for all discovered AI systems
- Implement data lineage tracking tools and processes
- Document data provenance and data management processes for all identified data resources
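The mitigations above can be illustrated with a minimal sketch of a data flow inventory. It is not a reference to any particular lineage tool: the class names (DataFlow, LineageInventory), the asset names, and the data category labels are all hypothetical, and a real deployment would more likely rely on a dedicated lineage or catalog product. The sketch records flows as directed edges, reports discovered AI systems that have no documented inbound or outbound flow, and traces the upstream provenance of an asset.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataFlow:
    """One directed edge: data moves from `source` to `target`."""
    source: str
    target: str
    data_category: str  # free-text label; values below are illustrative only


@dataclass
class LineageInventory:
    """Hypothetical in-memory inventory of AI systems and their data flows."""
    systems: set[str] = field(default_factory=set)
    flows: list[DataFlow] = field(default_factory=list)

    def register_system(self, name: str) -> None:
        """Record a discovered AI system, even if its flows are unknown."""
        self.systems.add(name)

    def add_flow(self, source: str, target: str, data_category: str) -> None:
        """Document one data flow between two assets."""
        self.flows.append(DataFlow(source, target, data_category))

    def undocumented(self) -> dict[str, list[str]]:
        """Per registered system, list the directions with no recorded flow."""
        gaps: dict[str, list[str]] = {}
        for system in sorted(self.systems):
            missing = []
            if not any(f.target == system for f in self.flows):
                missing.append("inbound")
            if not any(f.source == system for f in self.flows):
                missing.append("outbound")
            if missing:
                gaps[system] = missing
        return gaps

    def upstream(self, asset: str) -> set[str]:
        """All assets whose data reaches `asset`, directly or transitively."""
        seen: set[str] = set()
        frontier = [asset]
        while frontier:
            current = frontier.pop()
            for flow in self.flows:
                if flow.target == current and flow.source not in seen:
                    seen.add(flow.source)
                    frontier.append(flow.source)
        return seen


# Illustrative data: asset names are made up for this sketch.
inv = LineageInventory()
inv.register_system("support-chatbot")
inv.register_system("resume-screener")  # discovered, but nothing mapped yet
inv.add_flow("crm-export", "kb-vector-index", "customer tickets")
inv.add_flow("kb-vector-index", "support-chatbot", "retrieved passages")
inv.add_flow("support-chatbot", "chat-logs-bucket", "conversation transcripts")

print(inv.undocumented())
# {'resume-screener': ['inbound', 'outbound']}
print(sorted(inv.upstream("support-chatbot")))
# ['crm-export', 'kb-vector-index']
```

In this sketch, a system that appears in undocumented() corresponds to the situation in the Example field: it has been discovered, but its data origin or destination has not been recorded, so it is a candidate for follow-up before any privacy impact assessment.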
Standards Mapping
- ISO 42001: A.7.5, A.4.3
- NIST AI RMF: MAP 1.6, MAP 4.2
- DASF v2: RAW DATA 1.6, GOVERNANCE 4.1