
RAGulating Compliance: A Multi-Agent Knowledge Graph for Regulatory QA

  • Writer: Nikita Silaech
  • 2 days ago
  • 2 min read

By Bhavik Agarwal, Hemant Sunil Jomraj, Simone Kaplunov, Jack Krolick, Viktoria Rojkova

Published: Aug 2025 | arXiv:2508.09893


Overview

This paper introduces a multi-agent architecture that combines an ontology-free Knowledge Graph (KG) with Retrieval-Augmented Generation (RAG) to answer complex regulatory compliance questions. The system extracts subject–predicate–object (SPO) triplets from regulatory documents, embeds them alongside their original text, and retrieves them for grounded, traceable answers. Unlike standard LLMs, which often generate content without source verification, this approach links every fact to its original regulation, providing a verifiable audit trail.
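As a rough illustration of the idea, the Python sketch below stores each triplet with the sentence it came from and returns only the triplets whose similarity to a question clears a threshold. Everything in it is an assumption made for illustration: the `Triplet` class, the bag-of-words `embed` function (a stand-in for a real embedding model), the two example sentences, and the `EX-1`/`EX-2` labels are invented here and are not the paper's implementation.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    subject: str
    predicate: str
    obj: str
    source_text: str  # the original sentence the triplet was extracted from
    source_id: str    # hypothetical citation label, used for the audit trail

    def as_text(self) -> str:
        return f"{self.subject} {self.predicate} {self.obj}"


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, triplets: list, threshold: float = 0.3) -> list:
    """Return (score, triplet) pairs at or above the threshold, best first."""
    q = embed(query)
    scored = [
        # Embed the triplet together with its source sentence.
        (cosine(q, embed(t.as_text() + " " + t.source_text)), t)
        for t in triplets
    ]
    hits = [pair for pair in scored if pair[0] >= threshold]
    return sorted(hits, key=lambda pair: pair[0], reverse=True)


# Invented example data, not taken from any real regulation.
TRIPLETS = [
    Triplet("manufacturer", "must retain", "batch records",
            "The manufacturer must retain batch records for one year.", "EX-1"),
    Triplet("sponsor", "shall report", "adverse events",
            "The sponsor shall report adverse events within 15 days.", "EX-2"),
]

for score, t in retrieve("who must retain batch records", TRIPLETS):
    # Each hit carries its source sentence, so the answer can cite it.
    print(f"{score:.2f} {t.as_text()} [{t.source_id}] {t.source_text}")
```

Because every hit keeps its `source_text` and `source_id`, an answer built from the hits can quote the sentence that supports it. The 0.3 threshold suits only this toy embedding and is unrelated to the 0.75 threshold reported in the evaluation.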


Why it matters

In regulated industries such as healthcare and pharmaceuticals, the accuracy and traceability of information are essential. Errors or hallucinations in AI-generated responses can have serious compliance and safety consequences. By grounding answers in verifiable triplets and improving navigation between related rules, this system addresses both accuracy and explainability. The flexible, ontology-free design also enables fast adaptation to evolving regulatory frameworks without manual schema updates.


Evaluation Highlights

  • Higher retrieval precision: At a similarity threshold of 0.75, triplet-based retrieval reached a precision of 0.2888 vs 0.1684 for text-only retrieval.

  • Improved navigation: Denser graph connections (average degree 1.60 vs 1.29) and shorter path length (1.33 vs 2.02).

  • Slight accuracy improvement: 4.73 vs 4.71 on a 1–5 factual correctness scale.
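To make the navigation metrics concrete, here is a small Python sketch that computes average degree and average shortest-path length for an undirected graph. How the paper defines and computes these metrics is not stated in this summary, so the definitions below (degree averaged over nodes, path length averaged over reachable node pairs) and the four-node example graphs are assumptions for illustration only.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def average_degree(graph):
    """Mean number of neighbours per node."""
    return sum(len(neighbours) for neighbours in graph.values()) / len(graph)


def bfs_distances(graph, start):
    """Shortest hop count from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def average_path_length(graph):
    """Mean shortest-path length over all reachable ordered node pairs."""
    lengths = []
    for node in graph:
        dist = bfs_distances(graph, node)
        lengths.extend(d for other, d in dist.items() if other != node)
    return sum(lengths) / len(lengths) if lengths else 0.0


# Hypothetical graphs: a chain of four nodes, and the same chain with one extra link.
chain = build_graph([("A", "B"), ("B", "C"), ("C", "D")])
cycle = build_graph([("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")])

print(average_degree(chain), average_path_length(chain))
print(average_degree(cycle), average_path_length(cycle))
```

Adding a single edge raises the average degree and shortens the average path, which is the direction of change the highlights describe: more connections mean fewer hops between related rules.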


Core Contributions

  • Ontology-free knowledge graph built from regulatory text without rigid schemas, enabling adaptability.

  • Triplet extraction and embedding with direct links to original source sentences for verification.

  • Multi-agent pipeline coordinating ingestion, extraction, cleaning, indexing, retrieval, and grounded answer generation.

  • Subgraph visualizations for intuitive mapping of connections between related regulations.
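The pipeline stages listed above can be pictured as a chain of small agents that pass a shared state along. The Python sketch below is a toy under stated assumptions: each "agent" is a plain function, extraction uses a naive regular expression where the paper's system would rely on an LLM, and the function names, the `state` dictionary, and the sample sentences are all hypothetical.

```python
import re


def ingest(state):
    # Split raw text into sentences.
    parts = re.split(r"(?<=[.!?])\s+", state["raw_text"])
    state["sentences"] = [p.strip() for p in parts if p.strip()]
    return state


def extract(state):
    # Toy pattern: "<subject> must|shall <verb> <object>."
    pattern = re.compile(
        r"^(?:The\s+)?(\w+)\s+((?:must|shall)\s+\w+)\s+(.+?)\.?$", re.IGNORECASE
    )
    state["triplets"] = []
    for sentence in state["sentences"]:
        m = pattern.match(sentence)
        if m:
            state["triplets"].append(
                {"s": m.group(1), "p": m.group(2), "o": m.group(3), "source": sentence}
            )
    return state


def clean(state):
    # Lowercase the triplet fields and drop exact duplicates.
    seen, cleaned = set(), []
    for t in state["triplets"]:
        key = (t["s"].lower(), t["p"].lower(), t["o"].lower())
        if key not in seen:
            seen.add(key)
            cleaned.append({"s": key[0], "p": key[1], "o": key[2], "source": t["source"]})
    state["triplets"] = cleaned
    return state


def index(state):
    # Index triplets by subject for lookup.
    state["index"] = {}
    for t in state["triplets"]:
        state["index"].setdefault(t["s"], []).append(t)
    return state


def retrieve(state):
    # Return triplets whose subject appears in the question.
    words = set(re.findall(r"\w+", state["question"].lower()))
    state["hits"] = [t for subj, ts in state["index"].items() if subj in words for t in ts]
    return state


def answer(state):
    # Compose an answer that quotes the source sentence for each fact.
    state["answer"] = [
        f'{t["s"]} {t["p"]} {t["o"]} [source: "{t["source"]}"]' for t in state["hits"]
    ]
    return state


PIPELINE = [ingest, extract, clean, index, retrieve, answer]


def run(raw_text, question):
    state = {"raw_text": raw_text, "question": question}
    for agent in PIPELINE:
        state = agent(state)
    return state


# Invented sample text, not taken from any real regulation.
SAMPLE = (
    "The sponsor shall report adverse events. "
    "The sponsor shall report adverse events. "
    "The manufacturer must retain batch records."
)
result = run(SAMPLE, "What must the manufacturer do?")
print(result["answer"])
```

Keeping each stage behind the same `state in, state out` interface means one stage can be swapped (for example, replacing the regex extractor with a model call) without touching the others.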


Key Benefits

  • Traceability: Every answer linked to authoritative regulatory text.

  • Reduced hallucinations: Structured triplets ground LLM outputs in facts.

  • Better navigation: Reveals hidden relationships between rules.

  • Scalability: Modular agents can be updated independently.


Main Challenges

  • Vocabulary fragmentation without a fixed ontology.

  • Domain-specific complexity leading to occasional extraction errors.

  • Computational demands for large-scale embedding and retrieval.
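Vocabulary fragmentation is easiest to see with an example: without a fixed ontology, "adverse event", "Adverse Events", and "adverse-event" can end up as three separate graph nodes. The Python sketch below shows one simple way such labels could be merged, using string similarity from the standard library. This is not described in the summary as the authors' method; the `canonicalize` function, the 0.85 cutoff, and the sample labels are assumptions for illustration.

```python
import difflib


def canonicalize(labels, cutoff=0.85):
    """Map each label to the first sufficiently similar label seen before it."""
    canonical = []  # normalized labels chosen as representatives so far
    mapping = {}
    for label in labels:
        # Normalize case, hyphens, and whitespace before comparing.
        norm = " ".join(label.lower().replace("-", " ").split())
        match = difflib.get_close_matches(norm, canonical, n=1, cutoff=cutoff)
        if match:
            mapping[label] = match[0]
        else:
            canonical.append(norm)
            mapping[label] = norm
    return mapping


# Invented labels showing three spellings of one concept plus an unrelated one.
LABELS = ["adverse event", "Adverse Events", "adverse-event", "batch record"]
print(canonicalize(LABELS))
```

String similarity only catches surface variants. Synonyms with different spellings would need embedding-based matching or human review, which is where the human-in-the-loop feedback mentioned in the path forward could help.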


Path Forward

The authors propose adding multi-step reasoning capabilities, integrating human-in-the-loop feedback to improve triplet quality, enabling incremental updates for changing regulations, and extending the framework to domains such as finance, intellectual property, and environmental law.

Read the full paper: arXiv:2508.09893


 
 
 
