International Journal of Innovative Research in Computer and Communication Engineering

ISSN Approved Journal | Impact factor: 8.771 | ESTD: 2013 | Follows UGC CARE Journal Norms and Guidelines



TITLE A Privacy-Preserving Multi-Agent Architecture for Intelligent Query Routing over Heterogeneous Data Sources
ABSTRACT This paper presents a privacy-preserving multi-agent system for natural language query processing over heterogeneous data sources, including documents and relational databases. The central problem is the indiscriminate transmission of sensitive organizational data to cloud-hosted Large Language Models (LLMs), which violates data governance obligations in regulated domains such as healthcare, finance, and enterprise operations. The proposed Intelligent Routing System (IRS) classifies each incoming query at parse time using lightweight user-supplied source tags and enforces a strict locality policy: tagged (sensitive) queries are processed entirely by a locally deployed LLM, while untagged general queries are forwarded to a cloud LLM. This routing logic is implemented by a five-agent architecture comprising an Orchestrator, a Routing Agent, a Document Agent (RAG over ChromaDB), a Database Agent (NL→SQL over PostgreSQL), and a General Agent. A schema context generation mechanism bootstraps local NL→SQL accuracy with a single cloud LLM call at database registration; after that call, the Database Agent operates fully offline. Evaluation across healthcare, financial services, and enterprise domains shows 93.9% combined accuracy against a full-cloud baseline of 94.1%, 100% local execution of sensitive workloads, and a mean end-to-end latency of 1.97 s, supporting the architecture as a deployable solution for privacy-conscious AI-assisted data access.
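The tag-based locality policy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tag syntax (`@doc`, `@db`), the agent names, and the `Route` type are hypothetical, chosen only to show how a query could be classified from its tag alone, without sending any text to a model.

```python
import re
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    LOCAL_LLM = "local"
    CLOUD_LLM = "cloud"


@dataclass(frozen=True)
class Route:
    agent: str      # which agent handles the query
    target: Target  # where inference runs
    query: str      # query text with the source tag stripped


# Assumed tag syntax: a leading "@doc" or "@db" marks a sensitive source.
TAG_PATTERN = re.compile(r"^\s*@(doc|db)\b\s*", re.IGNORECASE)

# Hypothetical agent names for the two sensitive source types.
SENSITIVE_AGENTS = {"doc": "document_agent", "db": "database_agent"}


def route(raw_query: str) -> Route:
    """Classify a query from its source tag alone, without calling any model."""
    match = TAG_PATTERN.match(raw_query)
    if match is None:
        # Untagged queries are treated as general and may go to the cloud.
        return Route("general_agent", Target.CLOUD_LLM, raw_query.strip())
    tag = match.group(1).lower()
    # Tagged queries are always bound to the local model.
    return Route(SENSITIVE_AGENTS[tag], Target.LOCAL_LLM,
                 raw_query[match.end():].strip())
```

For example, `route("@db total claims paid in March")` selects the database agent with a local target, while `route("What is RAG?")` falls through to the general agent and the cloud target. Because the decision depends only on the tag, the privacy guarantee in this sketch rests on users tagging sensitive queries correctly.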
AUTHOR JAIDATT RAOSAHEB KALE, HARSHITA SANJAY JAIN, ATHARVA RAHUL KORWAR, KANCHAN ASHISH JOJARE; UG Student, Dept. of Computer Engineering, Dhole Patil College of Engineering, Pune, Maharashtra, India; Professor, Dept. of Computer Engineering, Dhole Patil College of Engineering, Pune, Maharashtra, India
VOLUME 184
DOI 10.15680/IJIRCCE.2026.1405103
PDF pdf/103_A Privacy-Preserving Multi-Agent Architecture for Intelligent Query Routing over Heterogeneous Data Sources.pdf
KEYWORDS
Copyright © IJIRCCE 2020. All rights reserved.