A Method for Preparing Convolutional Neural Networks for Edge Deployment

DOI:

https://doi.org/10.18372/1990-5548.87.20773

Keywords:

artificial intelligence, machine learning, neural networks, Edge deployment, pruning

Abstract

This study addresses the critical problem of accuracy loss during the compression of deep neural networks for mobile platforms. The research focuses on optimizing convolutional neural networks for operation under constrained hardware resources and the Memory Wall effect. An innovative Edge-deployment preparation method is proposed, which, unlike traditional sequential approaches, integrates structured pruning, post-training quantization, and a fine-tuning stage into a single iterative cycle. This approach provides a synergistic effect, minimizing accuracy degradation while achieving maximum parameter compression. Comparative analysis results confirm that the developed method meets strict latency and power consumption constraints, which are vital for mobile diagnostics in medical applications. Future research prospects involve adapting this method to other machine learning architectures.
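The iterative cycle described above (structured pruning, post-training quantization, and fine-tuning combined in one loop) can be illustrated with a minimal sketch. This is a toy implementation on a linear model, not the paper's actual pipeline: the paper prunes whole CNN filters, whereas this sketch prunes individual weights, and all function names here are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights."""
    k = int(len(weights) * sparsity)
    smallest = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k]
    pruned = list(weights)
    for i in smallest:
        pruned[i] = 0.0
    return pruned

def fake_quantize(weights, bits=8):
    """Simulate post-training quantization: snap weights onto a symmetric
    integer grid and map them back (quantize-dequantize)."""
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / (2 ** (bits - 1) - 1)
    return [round(w / scale) * scale for w in weights]

def fine_tune_step(weights, xs, ys, lr=0.05):
    """One gradient-descent step for y = w . x on mean squared error,
    keeping already-pruned weights at zero (the sparsity mask persists)."""
    n = len(xs)
    grads = [0.0] * len(weights)
    for x, y in zip(xs, ys):
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for j, xi in enumerate(x):
            grads[j] += 2.0 * err * xi / n
    return [0.0 if w == 0.0 else w - lr * g for w, g in zip(weights, grads)]

def compress(weights, xs, ys, schedule=(0.25, 0.5), bits=8, ft_steps=20):
    """One iterative cycle: gradually raise sparsity, quantize, then
    recover accuracy with a few fine-tuning steps before the next round."""
    for sparsity in schedule:
        weights = magnitude_prune(weights, sparsity)
        weights = fake_quantize(weights, bits)
        for _ in range(ft_steps):
            weights = fine_tune_step(weights, xs, ys)
    return weights
```

In a real deployment flow a final quantization pass would follow the last fine-tuning round so that shipped weights stay on the integer grid; the sketch omits that step for brevity.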

Author Biography

Dmytro Prochukhan, Kharkiv National University of Radio Electronics

Postgraduate student

References

K. He, X. Zhang, S. Ren, J. Sun, “Deep Residual Learning for Image Recognition”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770–778. https://doi.org/10.1109/CVPR.2016.90

G. Litjens et al., “A survey on deep learning in medical image analysis”, Medical Image Analysis, 2017, vol. 42, pp. 60–88. https://doi.org/10.1016/j.media.2017.07.005

V. Sze, Y.-H. Chen, T.-J. Yang, J. S. Emer, “Efficient Processing of Deep Neural Networks: A Tutorial and Survey”, Proceedings of the IEEE, 2017, vol. 105, no. 12, pp. 2295–2329. https://doi.org/10.1109/JPROC.2017.2761740

T. Elsken, J. H. Metzen, F. Hutter, “Neural Architecture Search: A Survey”, Journal of Machine Learning Research, 2019, vol. 20, no. 55, pp. 1–21. https://doi.org/10.1007/978-3-030-05318-5_11

L. Deng et al., “Model Compression and Hardware Acceleration for Neural Networks: A Comprehensive Survey”, Proceedings of the IEEE, 2020, vol. 108, no. 4, pp. 485–532. https://doi.org/10.1109/JPROC.2020.2976475

A. G. Howard et al., “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications”, arXiv preprint arXiv:1704.04861, 2017.

S. Han, H. Mao, W. J. Dally, “Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding”, International Conference on Learning Representations (ICLR), 2016.

W. Wen et al., “Learning Structured Sparsity in Deep Neural Networks”, Advances in Neural Information Processing Systems (NeurIPS), 2016, vol. 29.

B. Jacob et al., “Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 2704–2713. https://doi.org/10.1109/CVPR.2018.00286

A. Gholami et al., “A Survey of Quantization Methods for Efficient Neural Network Inference”, arXiv preprint arXiv:2103.13630, 2021. https://doi.org/10.1201/9781003162810-13

M. S. Abdelfattah et al., “Best of Both Worlds: AutoML Codesign of a CNN and its Hardware Accelerator”, Proceedings of the 57th ACM/EDAC/IEEE Design Automation Conference (DAC), 2020. https://doi.org/10.1109/DAC18072.2020.9218596

C. Wu, “TFLite: Optimizing Mobile AI with Core ML Integration”, Presentation Slides from Google IO, 2018.

P. Micikevicius et al., “Mixed Precision Training for Deep Neural Networks”, International Conference on Learning Representations (ICLR), 2018.

M. Sandler et al., “MobileNetV2: Inverted Residuals and Linear Bottlenecks”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 4510–4520. https://doi.org/10.1109/CVPR.2018.00474

K. Hwang, J. H. Lee, “Survey of Hardware Accelerators for Deep Neural Networks”, IEEE Access, 2018, vol. 6, pp. 48259–48280.

D. V. Prochukhan, “Fundus-oriented hybrid neural network with spatial-frequency processing and channel attention mechanism”, Information Processing Systems, 2025, no. 3(182), pp. 70–75. https://doi.org/10.30748/soi.2025.182.07

D. V. Prochukhan, “Class-oriented Method of Fundus Images Augmentation”, Visnyk of VPI, no. 5, Oct. 2025, pp. 140–145. https://doi.org/10.31649/1997-9266-2025-182-5-140-145

Published

2026-02-11

How to Cite

Prochukhan, D. (2026). A Method for Preparing Convolutional Neural Networks for Edge Deployment. Electronics and Control Systems, 1(87), 9–13. https://doi.org/10.18372/1990-5548.87.20773

Issue

Section

COMPUTER SCIENCES AND INFORMATION TECHNOLOGIES