A Method for Preparing Convolutional Neural Networks for Edge Deployment
DOI: https://doi.org/10.18372/1990-5548.87.20773

Keywords: artificial intelligence, machine learning, neural networks, Edge deployment, pruning

Abstract
This study addresses the critical problem of accuracy loss during the compression of deep neural networks for mobile platforms. The research focuses on optimizing convolutional neural networks to operate under constrained hardware resources and the memory-wall effect. An innovative Edge-deployment preparation method is proposed which, unlike traditional sequential approaches, integrates structured pruning, post-training quantization, and a fine-tuning stage into a single iterative cycle. This integration yields a synergistic effect, minimizing accuracy degradation while maximizing parameter compression. Comparative analysis confirms that the developed method meets the strict latency and power-consumption constraints that are vital for mobile diagnostics in medical applications. Future research prospects involve adapting this method to other machine learning architectures.
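The iterative cycle the abstract describes — structured pruning followed by post-training quantization, with fine-tuning closing the loop — can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the layer shape, the L1-norm channel-selection criterion, the 25% pruning ratio, and the 8-bit symmetric quantization scheme are all illustrative assumptions, and the fine-tuning step is only indicated by a comment.

```python
import numpy as np

def structured_prune(weights, ratio):
    """Zero out the output channels (rows) with the smallest L1 norms."""
    norms = np.abs(weights).sum(axis=1)      # one L1 norm per output channel
    k = int(len(norms) * ratio)              # number of channels to remove
    pruned = weights.copy()
    if k > 0:
        idx = np.argsort(norms)[:k]          # indices of the weakest channels
        pruned[idx, :] = 0.0
    return pruned

def quantize(weights, bits=8):
    """Symmetric uniform post-training quantization, simulated in float."""
    scale = np.abs(weights).max() / (2 ** (bits - 1) - 1)
    if scale == 0:
        return weights.copy()
    return np.round(weights / scale) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 128))               # a toy 64-channel weight matrix

# One iteration of the cycle: prune, quantize; a fine-tuning pass on the
# remaining weights would follow here to restore accuracy before repeating.
w = structured_prune(w, ratio=0.25)
w = quantize(w, bits=8)

zero_channels = int((np.abs(w).sum(axis=1) == 0).sum())
print(zero_channels)  # prints 16: one quarter of the 64 channels pruned
```

Because whole channels are zeroed rather than scattered individual weights, the pruned rows can be physically removed from the deployed model, which is what makes structured (as opposed to unstructured) pruning attractive on edge hardware.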
References
K. He, X. Zhang, S. Ren, J. Sun, “Deep Residual Learning for Image Recognition”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770–778. https://doi.org/10.1109/CVPR.2016.90
G. Litjens et al., “A survey on deep learning in medical image analysis”, Medical Image Analysis, 2017, vol. 42, pp. 60–88. https://doi.org/10.1016/j.media.2017.07.005
V. Sze, Y.-H. Chen, T.-J. Yang, J. S. Emer, “Efficient Processing of Deep Neural Networks: A Tutorial and Survey”, Proceedings of the IEEE, 2017, vol. 105, no. 12, pp. 2295–2329. https://doi.org/10.1109/JPROC.2017.2761740
T. Elsken, J. H. Metzen, F. Hutter, “Neural Architecture Search: A Survey”, Journal of Machine Learning Research, 2019, vol. 20, no. 55, pp. 1–21. https://doi.org/10.1007/978-3-030-05318-5_11
L. Deng et al., “Model Compression and Hardware Acceleration for Neural Networks: A Comprehensive Survey”, Proceedings of the IEEE, 2020, vol. 108, no. 4, pp. 485–532. https://doi.org/10.1109/JPROC.2020.2976475
A. G. Howard et al., “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications”, arXiv preprint arXiv:1704.04861, 2017.
S. Han, H. Mao, W. J. Dally, “Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding”, International Conference on Learning Representations (ICLR), 2016.
W. Wen et al., “Learning Structured Sparsity in Deep Neural Networks”, Advances in Neural Information Processing Systems (NeurIPS), 2016, vol. 29.
B. Jacob et al., “Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 2704–2713. https://doi.org/10.1109/CVPR.2018.00286
A. Gholami et al., “A Survey of Quantization Methods for Efficient Neural Network Inference”, arXiv preprint arXiv:2103.13630, 2021. https://doi.org/10.1201/9781003162810-13
M. S. Abdelfattah et al., “Best of Both Worlds: AutoML Codesign of a CNN and its Hardware Accelerator”, Proceedings of the 57th ACM/EDAC/IEEE Design Automation Conference (DAC), 2020. https://doi.org/10.1109/DAC18072.2020.9218596
C. Wu, “TFLite: Optimizing Mobile AI with Core ML Integration”, Presentation Slides from Google IO, 2018.
P. Micikevicius et al., “Mixed Precision Training for Deep Neural Networks”, International Conference on Learning Representations (ICLR), 2018.
M. Sandler et al., “MobileNetV2: Inverted Residuals and Linear Bottlenecks”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 4510–4520. https://doi.org/10.1109/CVPR.2018.00474
K. Hwang, J. H. Lee, “Survey of Hardware Accelerators for Deep Neural Networks”, IEEE Access, 2018, vol. 6, pp. 48259–48280.
D. V. Prochukhan, “Fundus-oriented hybrid neural network with spatial-frequency processing and channel attention mechanism”, Information Processing Systems, 2025, no. 3(182), pp. 70–75. https://doi.org/10.30748/soi.2025.182.07
D. V. Prochukhan, “Class-oriented Method of Fundus Images Augmentation”, Visnyk of VPI, no. 5, Oct. 2025, pp. 140–145. https://doi.org/10.31649/1997-9266-2025-182-5-140-145
License
Copyright (c) 2026 Electronics and Control Systems

This work is licensed under a Creative Commons Attribution 4.0 International License.
The scientific journal “Electronics and control systems” adheres to the principles of Open Access and provides free, immediate, and permanent access to all published materials without financial, technical, or legal barriers for readers.
All articles are published in Open Access under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
Copyright
Authors who publish their works in the journal “Electronics and control systems”:
- retain the copyright to their publications;
- grant the journal the right of first publication of the article;
- agree to the distribution of their materials under the CC BY 4.0 license;
- have the right to reuse, archive, and distribute their works (including in institutional and subject repositories), provided that proper reference is made to the original publication in the journal.