A Multicriteria Method for Optimizing IT Service Management of a Virtual Provider Based on Deep Reinforcement Learning

Authors

Oleksandr Rolik, Kyrylo Znova

DOI:

https://doi.org/10.18372/1990-5548.88.20971

Keywords:

virtual service provider, service-resource model, proximal policy optimization, multicriteria optimization, IT service management, information systems

Abstract

This article addresses the problem of optimizing IT service management for the B2B segment under dynamic workloads and probabilistically unreliable service operators. It proposes an architecture for a Virtual Service Provider (VSP) management system that automates service processes for corporate customers. The core of the system is a hybrid translation module that mathematically transforms abstract business intents and client context into a deterministic Service-Resource Model with specified technical, financial, and time constraints. To orchestrate the generated tasks efficiently, a multicriteria optimization algorithm, PPO-VSP, was developed on the basis of deep reinforcement learning with an Actor-Critic architecture. A reputation assessment module allows the system to identify unreliable service providers and avoid overloading them. Simulation experiments confirmed stable convergence of the algorithm. The trained optimization agent maintained compliance with service level agreements, measured by the Quality of Experience metric, at 95.4% under limited resource conditions. Generating the decomposition matrix took 18 ms on average, which supports rapid management decision-making in high-load infrastructures without system downtime.
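For orientation, the clipped surrogate objective of standard proximal policy optimization, the method family named in the keywords, can be sketched as below. This is the generic PPO formulation only, not the authors' PPO-VSP algorithm: the function name `ppo_clipped_objective`, its parameters, and the default clipping range of 0.2 are illustrative assumptions, and the multicriteria reward and reputation terms described in the abstract are not modeled here.

```python
import math


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of generic PPO (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy; advantages: advantage estimates,
    e.g. derived from the critic. All names here are hypothetical.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - eps, 1 + eps] to limit the policy step.
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

When the two policies agree, the ratio is 1 and the objective reduces to the mean advantage. Clipping only takes effect once the updated policy moves away from the one that collected the data.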

Author Biographies

Oleksandr Rolik, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”

Doctor of Science

Professor

Head of the Department of Information Systems and Technologies

Faculty of Informatics and Computer Engineering 

Kyrylo Znova, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”

Postgraduate Student

Department of Information Systems and Technologies

Faculty of Informatics and Computer Engineering

Published

2026-04-19

How to Cite

Rolik, O., & Znova, K. (2026). A Multicriteria Method for Optimizing IT Service Management of a Virtual Provider Based on Deep Reinforcement Learning. Electronics and Control Systems, 2(88), 93–101. https://doi.org/10.18372/1990-5548.88.20971

Section

INFORMATION SYSTEMS AND TECHNOLOGIES