RAGulating Compliance: A Multi-Agent Knowledge Graph for Regulatory QA
- Nikita Silaech

By Bhavik Agarwal, Hemant Sunil Jomraj, Simone Kaplunov, Jack Krolick, Viktoria Rojkova
Published: Aug 2025 | arXiv:2508.09893
Overview
This paper introduces a multi-agent architecture that combines an ontology-free Knowledge Graph (KG) with Retrieval-Augmented Generation (RAG) to answer complex regulatory compliance questions. The system extracts subject–predicate–object (SPO) triplets from regulatory documents, embeds them alongside their original text, and retrieves them for grounded, traceable answers. Unlike standard LLMs, which often generate content without source verification, this approach links every fact to its original regulation, providing a verifiable audit trail.
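To make the idea concrete, here is a minimal sketch of a triplet that carries its source sentence, plus threshold-based retrieval over it. Everything in it is an assumption for illustration: the `Triplet` class, the `source_id` field, the sample sentences, and the toy bag-of-words `embed` function all stand in for whatever the authors actually use, and a real system would rely on a learned embedding model.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    subject: str
    predicate: str
    obj: str
    source_text: str  # sentence the triplet was extracted from
    source_id: str    # hypothetical citation key used for the audit trail

    def as_text(self) -> str:
        return f"{self.subject} {self.predicate} {self.obj}"


def embed(text: str) -> Counter:
    # Toy bag-of-words vector; a stand-in for a learned embedding model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, triplets: list[Triplet], threshold: float):
    """Return (score, triplet) pairs scoring at or above the threshold, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(t.as_text())), t) for t in triplets]
    hits = [(s, t) for s, t in scored if s >= threshold]
    return sorted(hits, key=lambda pair: pair[0], reverse=True)


# Invented example sentences, not taken from any real regulation.
triplets = [
    Triplet("manufacturer", "must report", "adverse events",
            "The manufacturer must report adverse events within 15 days.",
            "REG-A-1"),
    Triplet("sponsor", "shall retain", "study records",
            "The sponsor shall retain study records for two years.",
            "REG-B-4"),
]

hits = retrieve("who must report adverse events", triplets, threshold=0.5)
for score, t in hits:
    print(f"{score:.2f} {t.as_text()} [{t.source_id}] :: {t.source_text}")
```

Because each hit keeps its `source_text` and `source_id`, an answer built from it can quote the originating sentence, which is the traceability property the paper emphasizes.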
Why it matters
In regulated industries such as healthcare and pharmaceuticals, the accuracy and traceability of information are essential. Errors or hallucinations in AI-generated responses can have serious compliance and safety consequences. By grounding answers in verifiable triplets and improving navigation between related rules, this system addresses both accuracy and explainability. The flexible, ontology-free design also enables fast adaptation to evolving regulatory frameworks without manual schema updates.
Evaluation Highlights
Higher retrieval precision: At a 0.75 similarity threshold, triplet retrieval reached a precision of 0.2888, compared with 0.1684 for text-only retrieval.
Improved navigation: Denser graph connections (average degree 1.60 vs 1.29) and shorter path length (1.33 vs 2.02).
Slight accuracy improvement: 4.73 vs 4.71 on a 1–5 factual correctness scale.
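For readers unfamiliar with the metrics above, the following sketch shows one common way to compute them. The definitions are assumptions, since this summary does not spell out the paper's exact formulas: precision is taken as the share of items at or above the similarity threshold that are relevant, average degree as the mean number of neighbors in an undirected graph, and path length as the mean breadth-first-search distance between connected node pairs. The sample scores and graphs are made up.

```python
from collections import deque


def precision_at_threshold(scored, relevant, threshold):
    """Fraction of items scoring >= threshold that appear in `relevant`."""
    retrieved = [item for item, score in scored if score >= threshold]
    if not retrieved:
        return 0.0
    return sum(item in relevant for item in retrieved) / len(retrieved)


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def average_degree(adj):
    return sum(len(neighbors) for neighbors in adj.values()) / len(adj)


def average_shortest_path(adj):
    """Mean BFS distance over ordered pairs of distinct, connected nodes."""
    total, pairs = 0, 0
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Made-up similarity scores and relevance labels.
scored = [("t1", 0.91), ("t2", 0.80), ("t3", 0.76), ("t4", 0.40)]
precision = precision_at_threshold(scored, relevant={"t1", "t3"}, threshold=0.75)

# Two toy graphs with the same number of edges but different shapes.
chain = build_adjacency([("a", "b"), ("b", "c"), ("c", "d")])
star = build_adjacency([("hub", "a"), ("hub", "b"), ("hub", "c")])

print(f"precision: {precision:.3f}")
print(f"chain: degree={average_degree(chain):.2f}, path={average_shortest_path(chain):.2f}")
print(f"star:  degree={average_degree(star):.2f}, path={average_shortest_path(star):.2f}")
```

The chain and star have the same average degree, yet the star has shorter paths, which is why the two navigation numbers are reported separately.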
Core Contributions
Ontology-free knowledge graph built from regulatory text without rigid schemas, enabling adaptability.
Triplet extraction and embedding with direct links to original source sentences for verification.
Multi-agent pipeline coordinating ingestion, extraction, cleaning, indexing, retrieval, and grounded answer generation.
Subgraph visualizations for intuitive mapping of connections between related regulations.
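The pipeline stages listed above can be sketched as a chain of small functions passing along a shared state. This is only an illustration of the modular shape, not the authors' implementation: the function names, the `state` dictionary, the regex-based extractor (the paper's system uses LLM agents), and the sample text are all hypothetical.

```python
import re

# Toy pattern: "<subject> must|shall <verb> <object>." (illustrative only).
PATTERN = re.compile(
    r"^(?:the\s+)?(?P<s>[\w ]+?)\s+(?P<p>(?:must|shall)\s+\w+)\s+(?P<o>.+?)\.?$",
    re.IGNORECASE,
)


def ingest(state):
    # Split the raw document into sentences at full stops.
    state["sentences"] = [
        s.strip() for s in re.split(r"(?<=\.)\s+", state["raw"]) if s.strip()
    ]
    return state


def extract(state):
    # Emit (subject, predicate, object, source sentence) tuples.
    triplets = []
    for sentence in state["sentences"]:
        m = PATTERN.match(sentence)
        if m:
            triplets.append((m["s"], m["p"], m["o"], sentence))
    state["triplets"] = triplets
    return state


def clean(state):
    # Lowercase, trim, and drop duplicate triplets (first source wins).
    seen, cleaned = set(), []
    for s, p, o, src in state["triplets"]:
        key = (s.lower().strip(), p.lower().strip(), o.lower().strip())
        if key not in seen:
            seen.add(key)
            cleaned.append((*key, src))
    state["triplets"] = cleaned
    return state


def index(state):
    # Group triplets by subject for lookup.
    idx = {}
    for t in state["triplets"]:
        idx.setdefault(t[0], []).append(t)
    state["index"] = idx
    return state


def retrieve(state):
    # Match single-word subjects that appear in the question.
    words = set(re.findall(r"\w+", state["question"].lower()))
    state["hits"] = [
        t for subj, ts in state["index"].items() if subj in words for t in ts
    ]
    return state


def answer(state):
    # Compose an answer that quotes the source sentence for each fact.
    if not state["hits"]:
        state["answer"] = "No supporting triplet found."
    else:
        state["answer"] = "; ".join(
            f'{s} {p} {o} (source: "{src}")' for s, p, o, src in state["hits"]
        )
    return state


PIPELINE = [ingest, extract, clean, index, retrieve, answer]


def run(raw, question):
    state = {"raw": raw, "question": question}
    for agent in PIPELINE:
        state = agent(state)
    return state


raw = (
    "The sponsor shall retain study records. "
    "The sponsor shall retain study records. "
    "The manufacturer must report adverse events."
)
state = run(raw, "What must the sponsor do?")
print(state["answer"])
```

Since each stage only reads and writes keys on the shared state, any single stage can be swapped out without touching the others, which mirrors the independent-update benefit the paper claims for its agents.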
Key Benefits
Traceability: Every answer linked to authoritative regulatory text.
Reduced hallucinations: Structured triplets ground LLM outputs in facts.
Better navigation: Reveals hidden relationships between rules.
Scalability: Modular agents can be updated independently.
Main Challenges
Vocabulary fragmentation without a fixed ontology.
Domain-specific complexity leading to occasional extraction errors.
Computational demands for large-scale embedding and retrieval.
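Vocabulary fragmentation is easy to see with predicates: without a fixed ontology, "shall retain", "must retain", and "is required to retain" become three different edge labels. The sketch below shows one simple way a cleaning step might group such variants. It is an assumption for illustration, not a technique described in the paper, and the prefix list and function names are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical list of modal/auxiliary prefixes to strip; not from the paper.
MODAL_PREFIX = re.compile(
    r"^(?:must|shall|should|is required to|are required to)\s+"
)


def canonical_predicate(predicate: str) -> str:
    """Lowercase, collapse whitespace, and strip a leading modal phrase."""
    p = re.sub(r"\s+", " ", predicate.strip().lower())
    return MODAL_PREFIX.sub("", p)


def group_predicates(predicates):
    """Map each canonical form to the set of surface forms that produced it."""
    groups = defaultdict(set)
    for p in predicates:
        groups[canonical_predicate(p)].add(p)
    return dict(groups)


raw = ["shall retain", "Must retain", "is required to  retain", "must report"]
groups = group_predicates(raw)
for canonical, variants in sorted(groups.items()):
    print(canonical, "<-", sorted(variants))
```

Note the trade-off: stripping modals discards the difference between "must" and "should", which can matter legally, so a production system would likely keep the modality as a separate attribute rather than drop it.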
Path Forward
The authors propose adding multi-step reasoning capabilities, integrating human-in-the-loop feedback to improve triplet quality, enabling incremental updates for changing regulations, and extending the framework to domains such as finance, intellectual property, and environmental law.
Read the full paper: arXiv:2508.09893