Research Article | Open Access
Volume 74 | Issue 3 | Year 2026 | Article Id. IJCTT-V74I3P101 | DOI : https://doi.org/10.14445/22312803/IJCTT-V74I3P101

Artificial Intelligence: Leveraging Privacy and Security on AI Models and Small Language Model Thrive
Sivamurugan Perumal
| Received | Revised | Accepted | Published |
|---|---|---|---|
| 19 Jan 2026 | 25 Feb 2026 | 10 Mar 2026 | 28 Mar 2026 |
Citation:
Sivamurugan Perumal, "Artificial Intelligence: Leveraging Privacy and Security on AI Models and Small Language Model Thrive," International Journal of Computer Trends and Technology (IJCTT), vol. 74, no. 3, pp. 1-6, 2026. Crossref, https://doi.org/10.14445/22312803/IJCTT-V74I3P101
Abstract
Today, we are experiencing rapid technological growth in artificial intelligence, in which privacy and security are primary concerns. With the variety of language models on the current market, the data on which a model is trained to produce relevant, detailed responses plays a vital role in how privacy and security can be upheld without compromise, for example for Personally Identifiable Information (PII) [1] and data governed by the Health Insurance Portability and Accountability Act (HIPAA) [2]. In this paper, we review privacy and security measures that small, domain-specific businesses can implement with a Small Language Model (SLM). We also survey which models are available on the market and how they can be leveraged, considering common factors that align with business affordability.
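As a simple illustration of the kind of safeguard this review covers, the following Python sketch (our illustration, not taken from the paper or any cited work) masks common PII patterns in a prompt before it is handed to a locally hosted SLM. The regexes and placeholder labels are simplified assumptions, not a complete PII or HIPAA compliance solution.

```python
import re

# Illustrative sketch only: mask common PII patterns in a prompt
# before it is sent to a locally hosted SLM. These regexes are
# simplified assumptions and will miss many real-world PII formats.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def scrub(prompt: str) -> str:
    """Replace recognized PII spans with typed placeholders."""
    for label, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt

if __name__ == "__main__":
    raw = "Patient reachable at john.doe@example.com or 555-123-4567, SSN 123-45-6789."
    print(scrub(raw))
    # -> "Patient reachable at [EMAIL] or [PHONE], SSN [SSN]."
```

In a deployment of this pattern, the scrubber would sit between the application and the model endpoint, so the SLM never sees the raw identifiers; stronger designs would pair it with a named-entity recognizer rather than regexes alone.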
Keywords
Artificial Intelligence, Cost Savings, Computing, Financial Firms, Healthcare, SLM (Small Language Model).
References
[1] Nicholas Carlini et al., “Extracting Training Data from Large Language Models,” Proceedings of the 30th USENIX Security Symposium (USENIX Security ’21), 2021.
[Google Scholar] [Publisher Link]
[2] Vivying S.Y. Cheng, and Patrick C.K. Hung, “Health Insurance Portability and Accountability Act (HIPAA) Compliant Access Control Model for Web Services,” International Journal of Healthcare Information Systems and Informatics, vol. 1, no. 1, 2006.
[CrossRef] [Google Scholar] [Publisher Link]
[3] Arthur Jochems et al., “Developing and Validating a Survival Prediction Model for NSCLC Patients through Distributed Learning Across 3 Countries,” International Journal of Radiation Oncology, Biology, Physics, vol. 99, no. 2, pp. 344-352, 2017.
[CrossRef] [Google Scholar] [Publisher Link]
[4] Arthur Jochems et al., “Distributed Learning: Developing a Predictive Model based on Data from Multiple Hospitals without Data Leaving the Hospital - A Real Life Proof of Concept,” Radiotherapy and Oncology, vol. 121, no. 3, pp. 459-467, 2016.
[CrossRef] [Google Scholar] [Publisher Link]
[5] Brendan McMahan et al., “Communication-efficient Learning of Deep Networks from Decentralized Data,” Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017.
[Google Scholar] [Publisher Link]
[6] Tom Brown et al., “Language Models are Few-shot Learners,” Advances in Neural Information Processing Systems, vol. 33, 2020.
[Google Scholar] [Publisher Link]
[7] Daniel M. Ziegler et al., “Fine-tuning Language Models from Human Preferences,” arXiv preprint arXiv:1909.08593, 2019.
[CrossRef] [Google Scholar] [Publisher Link]
[8] Malin Jansson et al., “Online Question and Answer Sessions: How Students Support Their Own and Other Students’ Processes of Inquiry in a Text-based Learning Environment,” The Internet and Higher Education, vol. 51, 2021.
[CrossRef] [Google Scholar] [Publisher Link]
[9] Rodrigo Pedro et al., “From Prompt Injections to SQL Injection Attacks: How Protected is your LLM-Integrated Web Application?,” arXiv preprint arXiv:2308.01990, 2023.
[CrossRef] [Google Scholar] [Publisher Link]
[10] Matthew Fredrikson et al., “Privacy in Pharmacogenetics: An End-to-end Case Study of Personalized Warfarin Dosing,” 23rd USENIX Security Symposium, 2014.
[Google Scholar] [Publisher Link]
[11] Shuai Zhao et al., “A Survey of Backdoor Attacks and Defenses on Large Language Models: Implications for Security Measures,” Authorea Preprints, pp. 1-21, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[12] Yanzhou Li et al., “BadEdit: Backdooring Large Language Models by Model Editing,” Cryptography and Security, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[13] Vinu Sankar Sadasivan et al., “Fast Adversarial Attacks on Language Models in One GPU Minute,” arXiv preprint, pp. 1-20, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[14] Vyas Raina, Adian Liusie, and Mark Gales, “Is LLM-as-a-Judge Robust? Investigating Universal Adversarial Attacks on Zero-shot LLM Assessment,” Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pp. 7499-7517, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[15] Siwon Kim et al., “ProPILE: Probing Privacy Leakage in Large Language Models,” Advances in Neural Information Processing Systems, 2023.
[Google Scholar] [Publisher Link]
[16] Niloofar Mireshghallah et al., “Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory,” arXiv:2310.17884, 2023.
[CrossRef] [Google Scholar] [Publisher Link]
[17] Ali Naseh et al., “Stealing the Decoding Algorithms of Language Models,” Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, pp. 1835-1849, 2023.
[CrossRef] [Google Scholar] [Publisher Link]
[18] Ashwinee Panda et al., “Teach LLMs to Phish: Stealing Private Information from Language Models,” arXiv:2403.00871, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[19] Michael Duan et al., “Do Membership Inference Attacks Work on Large Language Models?,” arXiv:2402.07841, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[20] Robin Staab et al., “Beyond Memorization: Violating Privacy via Inference with Large Language Models,” arXiv:2310.07298, 2023.
[CrossRef] [Google Scholar] [Publisher Link]
[21] Shenao Yan et al., “An LLM-assisted Easy-to-trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection,” 33rd USENIX Security Symposium, 2024.
[Google Scholar] [Publisher Link]
[22] Manli Shu et al., “On the Exploitability of Instruction Tuning,” Advances in Neural Information Processing Systems, 2023.
[Google Scholar] [Publisher Link]
[23] Zhaohan Xi et al., “Defending Pre-trained Language Models as Few-shot Learners Against Backdoor Attacks,” Advances in Neural Information Processing Systems, 2023.
[Google Scholar] [Publisher Link]
[24] Xi Li et al., “Chain-of-Scrutiny: Detecting Backdoor Attacks for Large Language Models,” Findings of the Association for Computational Linguistics, pp. 7705-7727, 2025.
[CrossRef] [Google Scholar] [Publisher Link]
[25] Xilie Xu et al., “An LLM can Fool Itself: A Prompt-based Adversarial Attack,” arXiv:2310.13345, 2023.
[CrossRef] [Google Scholar] [Publisher Link]
[26] Aounon Kumar et al., “Certifying LLM Safety against Adversarial Prompting,” arXiv:2309.02705, 2023.
[CrossRef] [Google Scholar] [Publisher Link]
[27] Hannah Brown et al., “Self-evaluation as a Defense against Adversarial Attacks on LLMs,” arXiv preprint, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[28] Yukun Zhao et al., “Improving the Robustness of Large Language Models via Consistency Alignment,” arXiv preprint, pp. 1-11, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[29] Mathias Humbert et al., “Addressing the Concerns of the Lacks Family: Quantification of Kin Genomic Privacy,” Proceedings of the 2013 ACM SIGSAC Conference on Computer and Communications Security, pp. 1141-1152, 2013.
[CrossRef] [Google Scholar] [Publisher Link]
[30] Mathias Humbert et al., “Quantifying Interdependent Risks in Genomic Privacy,” ACM Transactions on Privacy and Security, vol. 20, no. 1, pp. 1-31, 2017.
[CrossRef] [Google Scholar] [Publisher Link]
[31] Reza Shokri et al., “Quantifying Location Privacy,” 2011 IEEE Symposium on Security and Privacy, 2011.
[CrossRef] [Google Scholar] [Publisher Link]
[32] Badhan Chandra Das, M. Hadi Amini, and Yanzhao Wu, “Privacy Risks Analysis and Mitigation in Federated Learning for Medical Images,” 2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM ’23), 2023.
[CrossRef] [Google Scholar] [Publisher Link]
[33] Wenqi Wei et al., “A Framework for Evaluating Client Privacy Leakages in Federated Learning,” Computer Security - ESORICS 2020, pp. 545-566, 2020.
[CrossRef] [Google Scholar] [Publisher Link]
[34] Jonas Geiping et al., “Inverting Gradients - How Easy is it to Break Privacy in Federated Learning?,” Advances in Neural Information Processing Systems, 2020.
[Google Scholar] [Publisher Link]
[35] Oliver Cartwright, Harriet Dunbar, and Theo Radcliffe, “Evaluating Privacy Compliance in Commercial Large Language Models - ChatGPT, Claude, and Gemini,” Research Square, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[36] Vaidehi Patil, Peter Hase, and Mohit Bansal, “Can Sensitive Information be Deleted from LLMs? Objectives for Defending Against Extraction Attacks,” arXiv:2309.17410, 2023.
[CrossRef] [Google Scholar] [Publisher Link]
[37] Jie Huang et al., “Are Large Pre-trained Language Models Leaking Your Personal Information?,” Findings of the Association for Computational Linguistics, 2022.
[CrossRef] [Google Scholar] [Publisher Link]
[38] Mohammad Raeini, “Privacy-preserving Large Language Models (PPLLMs),” SSRN, 2023.
[CrossRef] [Google Scholar] [Publisher Link]
[39] Ruihan Wu et al., “Learning to Invert: Simple Adaptive Attacks for Gradient Inversion in Federated Learning,” Proceedings of the 39th Conference on Uncertainty in Artificial Intelligence, 2023.
[Google Scholar] [Publisher Link]
[40] Minxin Du et al., “DP-Forward: Fine-tuning and Inference on Language Models with Differential Privacy in Forward Pass,” Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, pp. 2665-2679, 2023.
[CrossRef] [Google Scholar] [Publisher Link]
[41] Dingfan Chen, Ning Yu, and Mario Fritz, “RelaxLoss: Defending Membership Inference Attacks without Losing Utility,” arXiv:2207.05801, 2022.
[CrossRef] [Google Scholar] [Publisher Link]
[42] Paul-Edouard Sarlin et al., “SuperGlue: Learning Feature Matching with Graph Neural Networks,” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4938-4947, 2020.
[Google Scholar] [Publisher Link]
[43] Pranav Rajpurkar et al., “SQuAD: 100,000+ Questions for Machine Comprehension of Text,” Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 2383-2392, 2016.
[Google Scholar] [Publisher Link]
[44] Mandar Joshi et al., “TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension,” Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, pp. 1601-1611, 2017.
[CrossRef] [Google Scholar] [Publisher Link]
[45] Siva Reddy, Danqi Chen, and Christopher D. Manning, “CoQA: A Conversational Question Answering Challenge,” Transactions of the Association for Computational Linguistics, vol. 7, pp. 249-266, 2019.
[CrossRef] [Google Scholar] [Publisher Link]
[46] Tom Kwiatkowski et al., “Natural Questions: A Benchmark for Question Answering Research,” Transactions of the Association for Computational Linguistics, vol. 7, pp. 453-466, 2019.
[CrossRef] [Google Scholar] [Publisher Link]
[47] Deepak Narayanan et al., “Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs,” arXiv:2306.02440, 2023.
[CrossRef] [Google Scholar] [Publisher Link]
[48] Simran Arora et al., “Simple Linear Attention Language Models Balance the Recall-throughput Tradeoff,” arXiv:2402.18668, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[49] Xiaoqi Jiao et al., “TinyBERT: Distilling BERT for Natural Language Understanding,” Findings of the Association for Computational Linguistics: EMNLP 2020, pp. 4163-4174, 2020.
[CrossRef] [Google Scholar] [Publisher Link]
[50] Chen Liang et al., “Less is More: Task-aware Layer-wise Distillation for Language Model Compression,” Proceedings of the 40th International Conference on Machine Learning, 2023.
[Google Scholar] [Publisher Link]
[51] Atreya Shankar et al., “PrivacyGLUE: A Benchmark Dataset for General Language Understanding in Privacy Policies,” Applied Sciences, vol. 13, no. 6, 2023.
[CrossRef] [Google Scholar] [Publisher Link]
[52] Alex Havrilla, and Maia Iyer, “Understanding the Effect of Noise in LLM Training Data with Algorithmic Chains of Thought,” arXiv:2402.04004, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[53] Jovan Stojkovic et al., “DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency,” 2025 IEEE International Symposium on High Performance Computer Architecture, 2025.
[CrossRef] [Google Scholar] [Publisher Link]
[54] Pratyush Patel et al., “Characterizing Power Management Opportunities for LLMs in the Cloud,” Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, vol. 3, pp. 207-222, 2024.
[CrossRef] [Google Scholar] [Publisher Link]