Vol. 5 No. 11 (2025)
Articles

Orchestrating Elasticity: A Comparative Analysis Of AI-Driven Predictive Scaling Versus Reactive Auto-Scaling In Microservices Architectures

Siddharth V. Menon
Independent Researcher, Artificial Intelligence & Cloud Architecture

Published 2025-11-27

Keywords

  • Cloud Computing
  • Microservices
  • Kubernetes
  • Predictive Scaling

How to Cite

Siddharth V. Menon. (2025). Orchestrating Elasticity: A Comparative Analysis Of AI-Driven Predictive Scaling Versus Reactive Auto-Scaling In Microservices Architectures. Stanford Database Library of American Journal of Applied Science and Technology, 5(11), 144–150. Retrieved from https://oscarpubhouse.com/index.php/sdlajast/article/view/21

Abstract

As cloud computing paradigms shift towards microservices and containerized architectures, the efficiency of resource allocation remains a critical challenge. Traditional reactive auto-scaling mechanisms, which rely on threshold-based metrics such as CPU and memory utilization, often fail to address sudden workload spikes, leading to service degradation and "cold start" latency. This study presents a comparative analysis of standard reactive scaling, Ansible-based dynamic scaling on Azure PaaS, and a novel AI-driven predictive scaling framework. Drawing on recent developments in Artificial Intelligence and Infrastructure as Code (IaC), we evaluate these approaches using a synthesized workload representative of complex industrial scenarios, such as refinery turnarounds and high-velocity e-commerce transactions. Our methodology deploys a Long Short-Term Memory (LSTM) neural network to forecast workload demand 10 minutes in advance, triggering proactive scaling actions. We contrast this with standard Kubernetes Horizontal Pod Autoscaling (HPA) and rule-based Ansible automation. The results demonstrate that the AI-driven predictive model reduces 95th percentile latency by approximately 34% compared to reactive approaches and mitigates cold-start latency by 90%. Furthermore, while the predictive model incurs a marginal computational overhead, it lowers overall cloud expenditure by 18% through reduced over-provisioning during idle periods. The findings suggest that integrating AI into the orchestration layer is essential for the next generation of cost-efficient, high-performance cloud architectures.
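The contrast the abstract draws between reactive and predictive scaling can be illustrated with a minimal decision-logic sketch. All names, thresholds, and the naive trend-based forecaster below are illustrative assumptions, not the paper's implementation; the study itself uses an LSTM to forecast demand 10 minutes ahead. The reactive path mirrors the published Kubernetes HPA formula (desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric)).

```python
import math
from collections import deque

CPU_TARGET = 0.70        # per-replica utilization target (HPA-style)
FORECAST_HORIZON = 10    # minutes ahead, matching the study's horizon

def reactive_replicas(current_replicas: int, cpu_utilization: float) -> int:
    """Threshold-based scaling using the Kubernetes HPA formula:
    desired = ceil(current * currentMetric / targetMetric)."""
    return max(1, math.ceil(current_replicas * cpu_utilization / CPU_TARGET))

def naive_forecast(history: deque, horizon: int) -> float:
    """Stand-in for the LSTM: extrapolate the recent linear trend."""
    if len(history) < 2:
        return float(history[-1])
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return max(0.0, history[-1] + slope * horizon)

def predictive_replicas(per_replica_capacity: float, history: deque) -> int:
    """Proactive scaling: provision for forecast demand before it arrives."""
    demand = naive_forecast(history, FORECAST_HORIZON)
    return max(1, math.ceil(demand / per_replica_capacity))

# Requests/min climbing steadily: the reactive controller sees only the
# present utilization, while the predictive path provisions for the
# extrapolated demand 10 minutes out, avoiding cold-start lag.
history = deque([100, 150, 200, 250, 300], maxlen=5)
print(reactive_replicas(current_replicas=3, cpu_utilization=0.9))     # 4
print(predictive_replicas(per_replica_capacity=100, history=history)) # 8
```

Because the predictive controller scales before the spike lands, new replicas have time to warm up, which is the mechanism behind the cold-start mitigation the abstract reports.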

References

  1. Sai Nikhil Donthi. (2025). Ansible-Based End-To-End Dynamic Scaling on Azure PaaS for Refinery Turnarounds: Cold-Start Latency and Cost–Performance Trade-Offs. Frontiers in Emerging Computer Science and Information Technology, 2(11), 01–17. https://doi.org/10.64917/fecsit/Volume02Issue11-01
  2. P. Murthy and S. Bobba. (2025). AI-Powered Predictive Scaling in Cloud Computing: Enhancing Efficiency through Real-Time Workload Forecasting. International Research Journal of Engineering and Technology, 5(11), Issue 1. http://ijsrcseit.com
  3. Mouna Reddy Mekala. (2025). Int. J. Sci. Res. Comput. Sci. Eng. Inf. Technol., 11(1), 1147-1157.
  4. K. Chouhan et al. (2021). Comprehensive Analysis of Artificial Intelligence with Human Resources Management. ResearchGate. https://www.researchgate.net/publication/353807927
  5. Newman, S. (2015). Building Microservices: Designing Fine-Grained Systems. O'Reilly Media.
  6. Burns, B., Grant, B., Oppenheimer, D., Brewer, E., & Wilkes, J. (2016). Borg, Omega, and Kubernetes. Communications of the ACM, 59(5), 50–57.
  7. Leitner, P., Wittern, E., Spillner, J., & Hummer, W. (2016). Challenging the cloud: distributed computing as a continuum. IEEE Internet Computing, 20(5), 64-73.
  8. Cockcroft, A. (2014). Microservices. Retrieved from https://www.slideshare.net/adriancockcroft/microservices-38641045
  9. Kubernetes Documentation. (n.d.). Retrieved from https://kubernetes.io/docs/home/
  10. Borg: The predecessor to Kubernetes. (n.d.). Retrieved from https://research.google/pubs/pub43438/
  11. Namiot, D., & Sneps-Sneppe, M. (2014). Cloud computing: principles and paradigms. John Wiley & Sons.
  12. Castro, P., & Rowstron, A. (2002). Towards an architecture for internet-scale overlay services. In Proceedings of the 2nd international workshop on Peer-to-peer systems (pp. 44-55).
  13. Google Cloud. (n.d.). Kubernetes Engine. Retrieved from https://cloud.google.com/kubernetes-engine